Subgraph matching is a fundamental building block of graph analysis systems. In real-world applications, graph query engines need to handle a large number of subgraph matching queries. Moreover, these queries usually share common subqueries. Selecting these common subqueries as view patterns and materializing their results can enable computation sharing and significantly enhance the efficiency of processing subsequent similar queries. However, existing methods for view materialization either have high memory consumption or cannot provide enough acceleration. In this paper, we propose a new view-based subgraph matching algorithm, MAVIS. It partitions view patterns into connected subgraphs called super-nodes and leverages a super-node-oriented materialization that strikes a balance between memory consumption and acceleration effect. Moreover, we propose a tree-based super-node partitioning method, which can avoid generating invalid candidates in materialization. We further propose a query answering algorithm that leverages these materialized views to expedite processing. Extensive experiments on real-world datasets confirm the effectiveness of MAVIS.
Our project is built on cmake. Under the root directory of the project, execute the following commands to compile the source code.
mkdir build
cd build
cmake ..
make -jAfter compiling the source code, you can find the binary file 'mavis' under the 'build' directory.
There are three types of commands in mavis:
- Test: The following command is used to test the basic function of mavis.
build/mavis -test- View Construction:
The following command is used to build materialized views. The parameter
-dataspecifies the path of the data graph. The parameter-viewspecifies the paths of view patterns. If there are multiple views that need materializing, seperating them by ",". The parameter-mv_pathspecifies the folder in which to the materialized views. Make sure the folder exists before running the command.
build/mavis -data [data_graph_path] -view [view_1,...,view_n] -mv_path [matview_folder]- Answer Query:
The following command is used to answer a query. Here,
-dataand-queryspecify the path of the data graph and the query graph, respectively.-mv_pathspecifies the folder that stores the materialized views. The program will accelerate the query processing by the materialized views when they are available.
build/mavis -data [data_graph_path] -query [query_graph_path] -mv_path [matview_folder]There are two view patterns and a query graph in dataset/test/example. They are the subgraphs of the data graph located in dataset/test/data_graph/HPRD.graph. These views are also the subgraphs of the query graph. We can use them to answer the query graph.
- Materialiation Execute the following command to materialize the views:
build/mavis -data dataset/test/data_graph/HPRD.graph -view dataset/test/example/view_pattern_0.graph,dataset/test/example/view_pattern_1.graph -mv_path materialization- Answer Query
build/mavis -data dataset/test/data_graph/HPRD.graph -query dataset/test/example/query_graph.graph -mv_path materializationOur project utilizes the source code from RapidMatch. This is its github link: [https://github.com/RapidsAtHKUST/RapidMatch][https://github.com/RapidsAtHKUST/RapidMatch]. And its related paper is "Shixuan Sun, Xibo Sun, Yulin Che, Qiong Luo, and Bingsheng He. RapidMatch: A Holistic Approach to Subgraph Query Processing. VLDB 2021."