CN111984833A - GPU-based high-performance graph mining method and system

Info

Publication number: CN111984833A
Application number: CN202011078543.7A
Authority: CN
Inventors: 谭光明; 林志恒; 张春明; 段勃
Original assignee: Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Current assignee: Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority date: 2020-05-18
Filing date: 2020-10-10
Publication date: 2020-11-24
Anticipated expiration: 2040-10-10
Also published as: CN111984833B; CN111625691A; CN111831864A

Abstract

The invention discloses a GPU-based high-performance graph mining method and system. The method adopts a GPU and CPU cooperative computing framework: GPU multithreading performs the graph mining operations to improve search efficiency, while CPU memory stores the large number of intermediate subgraphs generated during graph mining. The system architecture is described in terms of a Grow-Cull execution model. During operation, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which judges the relationship between each subgraph and a vertex or edge, and the generated candidate subgraphs are copied to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the Cull operation on them, the qualified subgraphs are stored in CPU main memory, and the system repeats this iteration. Borrowing the idea of a pipeline, CPU computation and GPU computation can execute simultaneously during iteration, and data can be copied in both directions simultaneously, so that computation and transfer latency are hidden.

Description

GPU-based high-performance graph mining method and system

Technical Field

The invention relates to a high-performance graph mining method and system based on a GPU.

Background

A graph data structure expresses the relationships between entities well, whereas traditional data structures cannot express them efficiently. This advantage gives graph data a crucial role in fields such as traffic networks, social networks, human brain programs, and biological genes. With the development of the internet, large amounts of graph data are generated in more and more fields, and the scale of graph data grows year by year. Analyzing and processing this huge volume of graph data is becoming increasingly important. In addition, as hardware develops, the computing power of computers keeps increasing, and devices such as GPUs and FPGAs are available to assist the CPU in computing. In recent years some researchers have begun to investigate graph data analysis and processing using the computing power provided by this wide variety of hardware resources. A typical graph data analysis process first extracts the relationships between entities from real-world data (social networks, biological information, road networks, and other fields), abstracts them into graph data, and then processes the graph data to obtain a result; common processing modes include graph computation, graph mining, and graph database storage. Taking breadth-first search (BFS) in graph computation as an example, the BFS algorithm performs a global traversal of the graph data, iteratively visits the neighbors of vertices starting from a root vertex, and finally marks a state for each vertex. The goal of maximal clique enumeration (MCE) in graph mining, by contrast, is to mine all maximal cliques in the graph. The graph database field is concerned with graph storage and query. The goals of different graph processing algorithms therefore differ widely, and the result may be a change in graph state, a set of subgraphs meeting a condition, and so on.

In the field of graph computation, Google's Pregel graph computation system introduced a vertex-centric ("think like a vertex") programming abstraction. However, its design does not take the characteristics of graph mining algorithms into account: graph mining works at a coarser granularity, because the object being processed is a subgraph rather than a vertex. Existing graph computation models are therefore difficult to apply directly to graph mining applications. In addition, graph mining algorithms have difficulty handling medium-scale and large-scale graphs, because an exponential number of candidate patterns and subgraphs may be generated during large-scale graph mining, causing explosive growth in computation and in intermediate state storage. Well-established algorithms and frameworks exist in the field of graph computation, but high-performance graph mining systems lack a reasonable algorithm abstraction, and most research focuses only on optimizing a single graph mining application. More importantly, GPU-based graph mining systems have received little research attention.

One important factor affecting the efficiency of graph mining algorithms is parallelism. Although multi-core CPUs have been developed for a long time, the number of concurrent threads is still very limited, usually up to 16 or 20. The limited parallelism capability has become a bottleneck for these CPU-based algorithms. In contrast, high-end GPUs have the ability to execute thousands or more threads simultaneously, which makes them suitable for applications involving processing large amounts of data. GPUs also have a very high memory bandwidth compared to CPUs. Therefore, in order to further improve the efficiency of the graph mining algorithm, the GPU is a good solution.

Disclosure of Invention

The invention aims to provide a high-performance graph mining method and system based on a GPU (graphics processing unit) so as to solve the problem of low efficiency of the existing graph mining algorithm.

In order to solve the technical problem, the invention provides a high-performance graph mining method based on a GPU, which is characterized by comprising the following steps:

constructing a corresponding search space according to different graph mining applications;

selecting a plurality of vertices or edges in the search space as candidates according to subgraph information provided by a user, and constructing an initial candidate subgraph set;

taking the search space and the candidate subgraph set as the input of a Grow-Cull execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening qualified subgraphs from the intermediate subgraph set through the Cull operation to obtain a new candidate subgraph set;

judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set containing all subgraphs required by the user; and if not, replacing the previous round's candidate subgraph set with the new candidate subgraph set as the input of the Grow operation, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are found.

Further, the step of expanding the candidate subgraph set through Grow operation to obtain an intermediate subgraph set specifically includes:

performing vertex extension or edge extension on the subgraphs in the candidate subgraph set;

checking the validity of the vertex extension or the edge extension;

and deleting the subgraphs whose extension is not valid, and generating an intermediate subgraph set from the subgraphs whose extension is valid.

Further, the GPU is used to execute the Grow operation and the CPU is used to execute the Cull operation; after executing the Grow operation, the GPU copies the obtained intermediate subgraph set to the CPU for the Cull operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation.

In addition, the invention also provides a high-performance graph mining system based on the GPU, which comprises the GPU and the CPU; the GPU and the CPU realize graph mining through the following steps:

constructing, by the CPU, a corresponding search space according to the graph mining application; selecting a plurality of vertices or edges in the search space as candidates according to subgraph information provided by a user, and constructing an initial candidate subgraph set;

copying the candidate subgraph set to the GPU, executing the Grow operation on the GPU, and expanding the candidate subgraph set to obtain an intermediate subgraph set;

copying the intermediate subgraph set to the CPU for storage, executing the Cull operation on the CPU to screen qualified subgraphs from the intermediate subgraph set, obtaining a new candidate subgraph set, and storing the new candidate subgraph set in CPU main memory;

and replacing the candidate subgraph set with the new candidate subgraph set obtained by the CPU, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are obtained.

Further, a user programming interface is also included; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied to the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to the Cull phase; wherein,

the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed;

the TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search;

the TOCHECK interface is used for taking the subgraph as input and judging the relationship between the newly added vertex and all the vertices in the current subgraph;

the FOREDGE interface is used for customizing a unified edge relation;

and the TOCOMPARE interface is used for judging whether the candidate subgraph and the query subgraph are isomorphic.

Further, the user programming interface further comprises:

the IS_CONNECT interface is used for judging whether two vertices are connected by an edge;

the COMMON_NEIGHBOR interface is used for finding the common neighbors of two vertices.

Further, data can be copied bidirectionally between the CPU and the GPU, and CPU calculation and GPU calculation can be executed simultaneously.

The invention has the following beneficial effects: by taking the subgraph as the center and adopting a Grow-Cull execution model, the graph mining problem is abstracted and well described, and the method can be applied to common graph mining applications; and by designing the graph mining system around CPU and GPU cooperative computing, the computing resources of both the CPU and the GPU are fully used to accelerate the graph mining process and relieve memory pressure.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a schematic diagram of an execution flow of a graph mining system according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of a system providing a user programming interface according to one embodiment of the invention.

FIG. 3 is a diagram illustrating an abstraction of a graph mining application, in accordance with an embodiment of the present invention.

FIG. 4 is a system architecture diagram according to an embodiment of the present invention.

FIG. 5 is a system execution pipeline diagram of one embodiment of the present invention.

Detailed Description

The high-performance graph mining method based on the GPU shown in FIG. 1 comprises the following steps:

S1: constructing a corresponding search space according to the graph mining application; because graph mining applications differ, their search spaces differ as well. In the general case, the search space of an application is the whole large graph, and the system mines the required subgraphs from that large graph.

S2: selecting a plurality of vertices or edges in the search space as candidates according to subgraph information provided by a user, and constructing an initial candidate subgraph set;

S3: taking the search space and the candidate subgraph set as the input of a Grow-Cull execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening qualified subgraphs from the intermediate subgraph set through the Cull operation to obtain a new candidate subgraph set; during the Cull operation, whether the whole subgraph meets the condition must be strictly checked, and each qualified subgraph is inserted into the new candidate subgraph set;

S4: judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set containing all subgraphs required by the user; and if not, replacing the previous round's candidate subgraph set with the new candidate subgraph set as the input of the Grow operation, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are found.
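The iteration of steps S1 to S4 can be sketched as a small driver loop. The following is a minimal single-threaded Python sketch, not the implementation of the invention: the names `grow_cull`, `grow`, `cull`, `done`, `grow_by_neighbor` and `keep_all` are hypothetical, and a subgraph is represented simply as a frozen set of vertex ids.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set

Graph = Dict[int, Set[int]]   # adjacency sets: the search space (S1)
Subgraph = FrozenSet[int]     # a subgraph, identified here by its vertex set


def grow_cull(
    graph: Graph,
    seeds: Iterable[Subgraph],
    grow: Callable[[Graph, Set[Subgraph]], Set[Subgraph]],
    cull: Callable[[Graph, Set[Subgraph]], Set[Subgraph]],
    done: Callable[[Set[Subgraph]], bool],
) -> Set[Subgraph]:
    """Repeat Grow then Cull until the user-specified condition holds (S3, S4)."""
    candidates = set(seeds)                      # initial candidate set (S2)
    while candidates and not done(candidates):
        intermediate = grow(graph, candidates)   # Grow: extend every candidate
        candidates = cull(graph, intermediate)   # Cull: keep qualified subgraphs
    return candidates


# Hypothetical usage: enumerate the connected 3-vertex subgraphs of a path 0-1-2-3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}


def grow_by_neighbor(graph, candidates):
    # Add one neighbor of any vertex already in the subgraph.
    return {sub | {v} for sub in candidates for u in sub for v in graph[u] if v not in sub}


def keep_all(graph, intermediate):
    # Trivial Cull for this example: every connected subgraph qualifies.
    return set(intermediate)


result = grow_cull(
    path,
    [frozenset({v}) for v in path],
    grow_by_neighbor,
    keep_all,
    done=lambda cands: all(len(s) == 3 for s in cands),
)
```

In this sketch the termination test of S4 is passed in as `done`, so the same loop could serve applications with different stopping conditions.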

The step of expanding the candidate subgraph set through Grow operation to obtain an intermediate subgraph set specifically includes:

S31: performing vertex extension or edge extension on the subgraphs in the candidate subgraph set;

S32: checking the validity of the vertex extension or the edge extension;

S33: and deleting the subgraphs whose extension is not valid, and generating an intermediate subgraph set from the subgraphs whose extension is valid.
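Steps S31 to S33 can be illustrated for the vertex-extension case. This is a minimal Python sketch under stated assumptions: the function name `grow`, the `is_valid` callback and the example rule `larger_than_min` are all hypothetical and only stand in for whatever validity check an application defines.

```python
from typing import Callable, Dict, FrozenSet, Set

Graph = Dict[int, Set[int]]
Subgraph = FrozenSet[int]


def grow(
    graph: Graph,
    candidates: Set[Subgraph],
    is_valid: Callable[[Graph, Subgraph, int], bool],
) -> Set[Subgraph]:
    """Vertex extension with a validity check (sketch of S31 to S33)."""
    intermediate: Set[Subgraph] = set()
    for sub in candidates:
        for u in sub:
            for v in graph[u]:                 # S31: try to extend by a neighbor v
                if v in sub:
                    continue                   # v is already part of the subgraph
                if is_valid(graph, sub, v):    # S32: check the extension
                    intermediate.add(sub | {v})
                # S33: an invalid extension is dropped by never being added
    return intermediate


def larger_than_min(graph: Graph, sub: Subgraph, v: int) -> bool:
    # Hypothetical validity rule: only accept vertices with an id above the
    # smallest id already in the subgraph.
    return v > min(sub)


triangle_with_tail = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
grown = grow(triangle_with_tail, {frozenset({1})}, larger_than_min)
```

Here vertex 1 has neighbors 0 and 2, and the example rule rejects 0, so only the extension by vertex 2 reaches the intermediate set.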

The method uses a GPU and CPU cooperative computing architecture to execute graph mining. The GPU executes the Grow operation and the CPU executes the Cull operation; after executing the Grow operation, the GPU copies the obtained intermediate subgraph set to the CPU for the Cull operation, and the intermediate subgraph set is stored in CPU memory. When the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation. Because the Grow operation runs on the GPU and the Cull operation runs on the CPU, computation and transfer latency can be hidden by pipelining.

In addition, as shown in fig. 4, the invention also provides a high-performance graph mining system based on the GPU, which comprises the GPU and the CPU; the GPU and the CPU realize graph mining through the following steps:

S1: constructing, by the CPU, a corresponding search space according to the graph mining application; selecting a plurality of vertices or edges in the search space as candidates according to subgraph information provided by a user, and constructing an initial candidate subgraph set;

S2: copying the candidate subgraph set to the GPU, executing the Grow operation on the GPU, and expanding the candidate subgraph set to obtain an intermediate subgraph set;

S3: copying the intermediate subgraph set to the CPU for storage, executing the Cull operation on the CPU to screen qualified subgraphs from the intermediate subgraph set, obtaining a new candidate subgraph set, and storing the new candidate subgraph set in CPU main memory;

S4: replacing the candidate subgraph set with the new candidate subgraph set obtained by the CPU, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are obtained.

According to steps S2 and S3, the invention can execute the whole graph mining process serially, round by round, and finally find all subgraphs. Under this serial scheme, round n+1 is executed only after steps S2 and S3 of round n have completed. In this mode, however, the CPU sits in a wait state while the GPU performs the Grow operation, because it has not yet received any data to process; likewise, the GPU idles while the CPU performs the Cull operation.

To prevent memory overflow, the system copies only a portion of the current round's data to the GPU for extension at a time, because the GPU generates a large number of candidates when performing the Grow operation. Therefore, if the data of round n is divided into k pieces, steps S2 and S3 must be performed k times before round n+1 can begin. Using the idea of a pipeline, the four steps of these k passes within round n (the host-to-device (H2D) transfer, the Grow operation, the device-to-host (D2H) transfer, and the Cull operation) can be overlapped. That is, in round n, while the CPU checks the i-th piece of data (the Cull operation), the GPU can receive the (i+1)-th piece and extend it (the Grow operation). After these two steps finish, thanks to the dual copy engines of the GPU, the D2H transfer of the extended (i+1)-th candidate data and the H2D transfer of the (i+2)-th piece to be extended can also overlap. The execution pipeline for a round of iteration is shown in FIG. 5. The first piece of initial data must first be transferred to the GPU, which performs the Grow operation to generate a set of candidate data. At this point a barrier is required until the Grow operation finishes. When that data is transferred back to the CPU, the pipeline starts and the second piece of initial data begins transferring to the GPU; because computation can start only after the transfer completes, a barrier is also needed until the transfer finishes. Once the CPU and the GPU have each received the data they need to process, the two devices execute the Cull operation and the Grow operation respectively, and the next transfer cannot start until the slower of the two devices finishes computing. With this pipeline, the system makes full use of the computing power of both the CPU and the GPU, and the simultaneous transfers in both directions save data transfer time.
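The overlap of Grow on piece i+1 with Cull on piece i can be mimicked on a CPU alone. The sketch below is only an illustration of the scheduling idea, under these assumptions: a single worker thread stands in for the GPU, `grow` and `cull` are arbitrary callables, and the names `split_into_chunks` and `pipelined_round` are hypothetical. A real system would rely on the GPU's asynchronous copies and kernels rather than Python threads, and the H2D and D2H transfers are not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_chunks(items: Sequence, k: int) -> List[Sequence]:
    """Divide one round's data into at most k pieces of near-equal size."""
    size = max(1, -(-len(items) // k))           # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def pipelined_round(
    chunks: List[Sequence],
    grow: Callable[[Sequence], list],
    cull: Callable[[list], list],
) -> list:
    """Run one round: Grow of piece i+1 overlaps with Cull of piece i."""
    survivors: list = []
    with ThreadPoolExecutor(max_workers=1) as gpu:   # stand-in for the GPU
        pending = gpu.submit(grow, chunks[0]) if chunks else None
        for i in range(len(chunks)):
            grown = pending.result()                 # barrier: wait for Grow(i)
            if i + 1 < len(chunks):
                pending = gpu.submit(grow, chunks[i + 1])  # start Grow(i+1)
            survivors.extend(cull(grown))            # Cull(i) runs meanwhile
    return survivors


# Hypothetical usage with toy operations on integers.
pieces = split_into_chunks([1, 2, 3, 4, 5], 3)
kept = pipelined_round(
    pieces,
    grow=lambda chunk: [x * 10 + d for x in chunk for d in (0, 1)],
    cull=lambda grown: [y for y in grown if y % 2 == 0],
)
```

The call to `pending.result()` plays the role of the barrier described above: the Cull of a piece cannot begin until its Grow has finished, while the next Grow is already in flight.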

FIG. 2 is a schematic diagram of the user-customizable programming interfaces that the graph mining system provides to a user, in accordance with an embodiment of the present invention. The user can develop a graph mining application simply by overriding several interfaces in the Grow and Cull phases. The user programming interface comprises a TOEXPAND interface and a TOADD interface applied to the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to the Cull phase; wherein:

The TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed; the default extension behavior is typically to access all neighbors of each qualifying vertex.

The TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search; by default it performs only the most basic isomorphism detection and no pruning.

The TOCHECK interface takes the subgraph as input and judges the relationship between the newly added vertex and all the vertices in the current subgraph; the user can describe the shape of the subgraph through TOCHECK.

The FOREDGE interface is used for customizing a unified edge relation; in some cases, it is only necessary to determine whether the newly added vertex and the other vertices satisfy a consistent relationship.

The TOCOMPARE interface is mainly intended for applications that have a query subgraph; it judges whether the candidate subgraph and the query subgraph are isomorphic.

Additionally, the user may also invoke the IS_CONNECT interface and the COMMON_NEIGHBOR interface to determine how the vertices in the input graph are connected.
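One way to picture these interfaces is as overridable hooks on an application class. The sketch below is an assumption-laden illustration in Python: the patent names the interfaces but does not give their signatures, so every argument list, the `MiningApp` and `CliqueApp` classes, the `step` helper and the ascending-id pruning rule are hypothetical.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]
Sub = Tuple[int, ...]          # vertices in the order they were added


def is_connect(graph: Graph, u: int, v: int) -> bool:
    """IS_CONNECT: are u and v joined by an edge?"""
    return v in graph[u]


def common_neighbor(graph: Graph, u: int, v: int) -> Set[int]:
    """COMMON_NEIGHBOR: vertices adjacent to both u and v."""
    return graph[u] & graph[v]


class MiningApp:
    # Grow-phase hooks
    def to_expand(self, graph: Graph, sub: Sub):
        return sub                      # default: visit neighbors of every vertex

    def to_add(self, graph: Graph, sub: Sub, v: int) -> bool:
        return v not in sub             # default: basic duplicate check, no pruning

    # Cull-phase hooks
    def for_edge(self, graph: Graph, u: int, v: int) -> bool:
        return True                     # uniform relation between old and new vertex

    def to_check(self, graph: Graph, sub: Sub) -> bool:
        *old, new = sub                 # relate the newly added vertex to the rest
        return all(self.for_edge(graph, u, new) for u in old)

    def to_compare(self, graph: Graph, sub: Sub) -> bool:
        return True                     # only meaningful with a query subgraph


class CliqueApp(MiningApp):
    def to_add(self, graph, sub, v):
        return v > sub[-1]              # hypothetical pruning: ascending ids only

    def for_edge(self, graph, u, v):
        return is_connect(graph, u, v)  # new vertex must touch every old vertex


def step(app: MiningApp, graph: Graph, candidates: Set[Sub]) -> Set[Sub]:
    """One Grow pass followed by one Cull pass, driven by the hooks."""
    grown = {sub + (v,) for sub in candidates
             for u in app.to_expand(graph, sub)
             for v in graph[u] if app.to_add(graph, sub, v)}
    return {s for s in grown if app.to_check(graph, s) and app.to_compare(graph, s)}


g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
edges = step(CliqueApp(), g, {(v,) for v in g})
triangles = step(CliqueApp(), g, edges)
```

In this sketch only two hooks are overridden to obtain a clique-finding application, which reflects the stated goal that a user rewrites a few interfaces rather than the whole search.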

The data can be copied bidirectionally between the CPU and the GPU simultaneously, the CPU calculation and the GPU calculation can be executed simultaneously, and the calculation resources of the CPU and the GPU can be fully utilized to accelerate the graph mining process and relieve the memory pressure.

FIG. 3 is a diagram illustrating an abstraction of a graph mining application, in accordance with an embodiment of the present invention. The following describes how three common graph mining applications, namely triangle counting, maximal clique enumeration, and subgraph matching, are abstracted using the Grow and Cull interfaces. Assume vertex extension is used. For triangle counting, Grow mainly accesses the neighbors of all vertices in the current subgraph, without considering the labels or attributes of those neighbors, while Cull filters out all subgraphs that are not triangles. For maximal clique enumeration, the Grow operation is similar to that of triangle counting, and the Cull operation selects all fully connected subgraphs. In the subgraph matching example, besides the connectivity of the newly added vertex, Grow considers whether the vertex label is consistent with the query graph and whether the vertex's connections are consistent with those of the corresponding vertex in the query graph. In Cull, it is necessary to determine whether the candidate subgraph is isomorphic to the query graph.
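The triangle counting abstraction can be made concrete with a short Python sketch. It assumes an undirected graph stored as adjacency sets and uses hypothetical function names (`grow`, `cull_fully_connected`, `count_triangles`); it runs on the CPU only and is meant to show the roles of Grow and Cull, not the performance-oriented implementation.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set

Graph = Dict[int, Set[int]]
Subgraph = FrozenSet[int]


def grow(graph: Graph, candidates: Set[Subgraph]) -> Set[Subgraph]:
    # Grow: visit the neighbors of every vertex in the subgraph, ignoring labels.
    return {sub | {v} for sub in candidates for u in sub for v in graph[u] if v not in sub}


def cull_fully_connected(graph: Graph, intermediate: Set[Subgraph]) -> Set[Subgraph]:
    # Cull: keep only subgraphs in which every pair of vertices is adjacent.
    return {sub for sub in intermediate
            if all(b in graph[a] for a, b in combinations(sub, 2))}


def count_triangles(graph: Graph) -> int:
    candidates = {frozenset({v}) for v in graph}   # start from single vertices
    for _ in range(2):                             # two rounds reach 3 vertices
        candidates = cull_fully_connected(graph, grow(graph, candidates))
    return len(candidates)


triangle_with_tail = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
```

Because the Cull step keeps fully connected subgraphs of any size, running more rounds of the same loop would yield larger cliques, which is consistent with the statement that the Grow operation for maximal cliques resembles that of triangle counting.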

The method adopts a GPU and CPU cooperative computing architecture: GPU multithreading performs the graph mining operations to improve search efficiency, while CPU memory stores the large number of intermediate subgraphs generated during graph mining. The system architecture is described in terms of the Grow-Cull execution model. During operation, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which judges the relationship between each subgraph and a vertex or edge, and the generated candidate subgraphs are copied to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the Cull operation on them, the qualified subgraphs are stored in CPU main memory, and the system repeats this iteration. Borrowing the idea of a pipeline, CPU computation and GPU computation can execute simultaneously during iteration, and data can be copied in both directions simultaneously, so that computation and transfer latency are hidden.

Finally, the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all of them should be covered in the claims of the present invention.

Claims

1. A high-performance graph mining method based on a GPU is characterized by comprising the following steps:

2. The GPU-based high performance graph mining method of claim 1, wherein the expanding the candidate subgraph set by Grow operation to obtain an intermediate subgraph set specifically comprises:

checking the validity of the point extension or the edge extension;

3. The GPU-based high-performance graph mining method according to claim 1 or 2, characterized in that a GPU is used for executing the Grow operation and a CPU is used for executing the Cull operation; after executing the Grow operation, the GPU copies the obtained intermediate subgraph set to the CPU for the Cull operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation.

4. A high-performance graph mining system based on a GPU is characterized by comprising the GPU and a CPU; the GPU and the CPU realize graph mining through the following steps:

5. A GPU-based high-performance graph mining system according to claim 4, further comprising a user programming interface; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied to the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to the Cull phase; wherein,

the FOREDGE interface is used for customizing a unified edge relation;

6. A GPU-based high performance graph mining system in accordance with claim 5, wherein said user programming interface further comprises:

7. A GPU-based high-performance graph mining system according to any of claims 4-6, characterized in that data can be copied bidirectionally between the CPU and the GPU simultaneously, and CPU calculations and GPU calculations can be performed simultaneously.