CN111984833B - High-performance graph mining method and system based on GPU

Info

Publication number: CN111984833B
Application number: CN202011078543.7A
Authority: CN
Inventors: 谭光明; 林志恒; 张春明; 段勃
Original assignee: Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Current assignee: Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority date: 2020-05-18
Filing date: 2020-10-10
Publication date: 2023-08-01
Anticipated expiration: 2040-10-10
Also published as: CN111831864A; CN111984833A; CN111625691A

Abstract

The invention discloses a GPU-based high-performance graph mining method and system. The invention adopts a cooperative GPU and CPU computing architecture, so that graph mining operations can be carried out with GPU multithreading to improve search efficiency, while CPU memory is used to store the large number of intermediate subgraphs generated during graph mining. The system architecture is described in combination with a Grow-cut execution model: while the system runs, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, the relationship between each subgraph and the vertices/edges is judged, and the generated candidate subgraphs are copied to CPU memory; to check the validity of the candidate subgraphs, CPU multithreading is used to execute the cut operation on them, the qualified subgraphs are stored in CPU main memory, and the system iterates this process. Borrowing the idea of a pipeline, CPU computation and GPU computation can be executed simultaneously during iteration, and bidirectional data copies can also be executed simultaneously, so that computation and transfer latency is hidden.

Description

High-performance graph mining method and system based on GPU

Technical Field

The invention relates to a high-performance graph mining method and system based on a GPU.

Background

The graph data structure expresses relationships among entities well, whereas traditional data structures cannot express such relationships efficiently. This advantage gives graph data a vital role in fields such as traffic networks, social networks, brain science, and biological genetics. With the development of the internet, a large amount of graph data is generated in more and more fields, and the scale of graph data is increasing year by year. Analyzing and processing this vast amount of graph data is becoming increasingly important. In addition, with the development of hardware, the computing power of computers keeps growing, and devices such as GPUs and FPGAs have appeared to assist CPU computation. In recent years, researchers have begun to leverage the computing power of this wide variety of hardware resources for graph data analysis and processing. The common flow of graph data analysis first extracts the relationships between entities from real-world data (social networks, biological information, road networks and the like), abstracts them into graph data, and then processes the graph data (common graph data processing includes graph computation, graph mining, graph database storage and the like) to obtain a result. Taking the breadth-first search (BFS) algorithm in graph computation as an example, BFS performs a global traversal of the graph data: starting from a root vertex, it iteratively accesses the neighbors of visited vertices and finally marks a state for each vertex. The goal of the maximal clique enumeration (MCE) algorithm in graph mining, by contrast, is to mine all maximal cliques in the graph. The field of graph databases focuses on the storage and querying of graphs. The goals of different graph processing algorithms differ greatly, and the result obtained may be a change of graph state, a set of subgraphs meeting a condition, and the like.

In the field of graph computation, Google's Pregel graph computation system proposed a vertex-centric programming abstraction ("think like a vertex"). However, its design does not consider the characteristics of graph mining algorithms, whose processing granularity is coarser: the processing object is a subgraph rather than a vertex. Therefore, existing graph computation models are difficult to apply directly to graph mining applications. In addition, graph mining algorithms have difficulty handling large-scale graphs, because an exponential number of candidate patterns and subgraphs may be generated during large-scale graph mining, resulting in explosive growth of computation and intermediate state storage. The field of graph computation has mature algorithms and frameworks, but high-performance graph mining systems lack a reasonable algorithmic abstraction, and most research focuses on optimizing a single graph mining application. More importantly, GPU-based graph mining systems have so far received little research attention.

An important factor affecting the efficiency of graph mining algorithms is parallelism. Although multi-core CPUs have been developed for a long time, the number of concurrent threads is still very limited, typically at most 16 or 20. This limited parallelism has become a bottleneck for CPU-based algorithms. In contrast, high-end GPUs can execute thousands of threads or more simultaneously, which makes them suitable for applications that process large amounts of data. GPUs also have much higher memory bandwidth than CPUs. Therefore, using a GPU is a good way to further increase the efficiency of graph mining algorithms.

Disclosure of Invention

The invention aims to provide a high-performance graph mining method and system based on a GPU (graphics processing unit), which are used for solving the problem of low efficiency of the conventional graph mining algorithm.

In order to solve the technical problems, the invention provides a high-performance graph mining method based on a GPU, which is characterized by comprising the following steps:

constructing corresponding search spaces according to different graph mining applications;

selecting a plurality of vertices or edges in the search space as candidates according to the sub-graph information provided by the user, and constructing an initial candidate sub-graph set;

the search space and the candidate sub-graph set are used as input of a Grow-cut execution model, the candidate sub-graph set is expanded through Grow operation to obtain an intermediate sub-graph set, and qualified sub-graphs are screened out from the intermediate sub-graph set through cut operation to obtain a new candidate sub-graph set;

judging whether the new candidate sub-graph set meets the user specified condition, if so, ending the operation, wherein the new candidate sub-graph set comprises all sub-graphs required by the user; otherwise, overwriting the candidate sub-graph set of the previous round with the new candidate sub-graph set, taking it as input of the Grow operation, and iteratively executing the Grow operation and the cut operation until all sub-graphs meeting the user specified conditions are found.

Further, the expanding the candidate sub-graph set through the Grow operation to obtain an intermediate sub-graph set specifically includes:

performing point expansion or edge expansion on the subgraphs in the candidate subgraph set;

checking the validity of the point expansion or the edge expansion;

deleting the subgraphs whose expansion is not valid, and generating an intermediate subgraph set from the subgraphs whose expansion is valid.

Further, executing the Grow operation by using a GPU, executing the cut operation by using a CPU, copying the obtained intermediate sub-graph set to the CPU to execute the cut operation after the GPU executes the Grow operation, and storing the intermediate sub-graph set in a CPU memory; when the Grow operation and the cut operation are executed iteratively, only the new candidate sub-graph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation.

In addition, the invention also provides a high-performance graph mining system based on the GPU, which comprises the GPU and a CPU; the GPU and the CPU realize graph mining by the following steps:

constructing corresponding search spaces by the CPU according to different graph mining applications; selecting a plurality of vertices or edges in the search space as candidates according to the sub-graph information provided by the user, and constructing an initial candidate sub-graph set;

copying the candidate sub-graph set to the GPU, executing Grow operation by the GPU, and expanding the candidate sub-graph set to obtain an intermediate sub-graph set;

copying the intermediate sub-graph set to the CPU for storage, and performing a cut operation by the CPU, screening qualified sub-graphs from the intermediate sub-graph set to obtain a new candidate sub-graph set, and storing the new candidate sub-graph set on a CPU main memory;

and replacing the candidate sub-graph set by using the new candidate sub-graph set obtained by the CPU, and iteratively executing the Grow operation and the cut operation until all sub-graphs meeting the user specified conditions are obtained.

Further, a user programming interface is included; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied to a Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to a cut stage; wherein:

the TOEXPAND interface is used for determining a certain vertex of the subgraph to be processed and how to access the neighbors of the vertex;

a TOADD interface for eliminating isomorphic redundancy of subgraphs, and describing pruning operations to accelerate searches;

the TOCHECK interface is used for taking the subgraph as input and judging the relation between newly added vertexes and all vertexes in the current subgraph;

the FOREDGE interface is used for customizing a uniform edge relationship;

the TOCOMPARE interface is used for judging whether the candidate subgraph and the query subgraph are isomorphic.

Further, the user programming interface further comprises:

the IS_CONNECT interface is used for judging whether two vertices are connected by an edge;

the COMMON_NEIGHBOR interface is mainly used for finding the common neighbors of two vertices.

Further, data can be copied between the CPU and the GPU in two directions at the same time, and CPU calculation and GPU calculation can be executed at the same time.

The beneficial effects of the invention are as follows: by taking the subgraph as the center and adopting a Grow-cut execution model, the graph mining problem is abstracted in a more expressive way that suits common graph mining applications; and by designing a graph mining system based on cooperative CPU and GPU computing, the computing resources of both the CPU and the GPU can be fully utilized to accelerate the graph mining process and relieve memory pressure.

Drawings

The accompanying drawings, where like reference numerals refer to identical or similar parts throughout the several views and which are included to provide a further understanding of the present application, are included to illustrate and explain illustrative examples of the present application and do not constitute a limitation on the present application. In the drawings:

FIG. 1 is a schematic diagram of the execution flow of the graph mining system according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of a system providing a user programming interface in accordance with one embodiment of the present invention.

FIG. 3 is a schematic diagram of graph mining application abstraction in accordance with one embodiment of the present invention.

FIG. 4 is a system architecture diagram of an embodiment of the present invention.

FIG. 5 is a schematic diagram of a system execution pipeline according to one embodiment of the present invention.

Detailed Description

The high-performance graph mining method based on the GPU shown in fig. 1 comprises the following steps:

S1: constructing corresponding search spaces according to different graph mining applications; because graph mining applications differ, their search spaces also differ. In the usual case, the search space of an application is the entire large graph, and the system then mines the required sub-graphs from that large graph.

S2: selecting a plurality of vertices or edges in the search space as candidates according to the sub-graph information provided by the user, and constructing an initial candidate sub-graph set;

S3: the search space and the candidate sub-graph set are used as input of the Grow-cut execution model, the candidate sub-graph set is expanded through the Grow operation to obtain an intermediate sub-graph set, and qualified sub-graphs are screened out from the intermediate sub-graph set through the cut operation to obtain a new candidate sub-graph set; during the cut operation, each sub-graph as a whole is strictly checked against the conditions, and qualified sub-graphs are inserted into the new candidate sub-graph set;

S4: judging whether the new candidate sub-graph set meets the user specified condition, if so, ending the operation, wherein the new candidate sub-graph set comprises all sub-graphs required by the user; otherwise, overwriting the candidate sub-graph set of the previous round with the new candidate sub-graph set, taking it as input of the Grow operation, and iteratively executing the Grow operation and the cut operation until all sub-graphs meeting the user specified conditions are found.
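Steps S1 to S4 can be illustrated with a minimal host-only sketch. It is not the patented implementation: the function names `grow`, `cut` and `mine`, the `keep` and `done` callbacks, and the representation of a subgraph as a frozen set of vertex IDs are all assumptions made for illustration. The example mines "open wedges" (three connected vertices with exactly two edges) from a small graph.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, Set

Graph = Dict[int, Set[int]]   # search space as adjacency sets (assumed layout)
Subgraph = FrozenSet[int]     # a subgraph identified by its vertex set


def grow(graph: Graph, candidates: Set[Subgraph]) -> Set[Subgraph]:
    """Grow: extend every candidate by one adjacent vertex."""
    return {sub | {n}
            for sub in candidates
            for v in sub
            for n in graph[v]
            if n not in sub}


def cut(graph: Graph, intermediate: Set[Subgraph],
        keep: Callable[[Graph, Subgraph], bool]) -> Set[Subgraph]:
    """cut: keep only the intermediate subgraphs that pass the user's check."""
    return {sub for sub in intermediate if keep(graph, sub)}


def mine(graph: Graph, seeds: Iterable[Subgraph],
         keep: Callable[[Graph, Subgraph], bool],
         done: Callable[[Set[Subgraph]], bool]) -> Set[Subgraph]:
    """Iterate Grow and cut until the user-specified condition holds (S3, S4)."""
    candidates = set(seeds)                      # S2: initial candidate set
    while candidates and not done(candidates):
        # The new candidate set overwrites the previous round's set.
        candidates = cut(graph, grow(graph, candidates), keep)
    return candidates


def edge_count(graph: Graph, sub: Subgraph) -> int:
    """Number of edges of the graph that lie inside the subgraph."""
    return sum(1 for a, b in combinations(sub, 2) if b in graph[a])


# S1: the search space is the whole (small) graph: a triangle 0-1-2 plus edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
seeds = [frozenset({v}) for v in graph]
wedges = mine(
    graph,
    seeds,
    keep=lambda g, s: len(s) < 3 or edge_count(g, s) == 2,
    done=lambda c: all(len(s) == 3 for s in c),
)
```

Here the triangle {0, 1, 2} is removed by the cut step, leaving the two wedges centered on vertex 2.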

Expanding the candidate sub-graph set through the Grow operation to obtain an intermediate sub-graph set specifically includes:

S31: performing point expansion or edge expansion on the subgraphs in the candidate subgraph set;

S32: checking the validity of the point expansion or the edge expansion;

S33: deleting the subgraphs whose expansion is not valid, and generating an intermediate subgraph set from the subgraphs whose expansion is valid.
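A small sketch of S31 to S33 for point expansion is given below. The function `grow_by_vertex`, the `is_valid` callback and the rule `larger_id_only` are hypothetical names introduced here; the validity rule is chosen only to keep the example small (it lists each edge exactly once) and is not claimed to be the check used by the invention.

```python
from typing import Callable, Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]
Subgraph = Tuple[int, ...]  # vertices in the order they were added


def grow_by_vertex(
    graph: Graph,
    candidates: List[Subgraph],
    is_valid: Callable[[Graph, Subgraph, int], bool],
) -> List[Subgraph]:
    """Point expansion with a validity check, following S31 to S33."""
    intermediate: List[Subgraph] = []
    for sub in candidates:
        # S31: point expansion, collect every neighbor of the subgraph's vertices.
        neighbors = {n for v in sub for n in graph[v]}
        for n in sorted(neighbors):
            # S32: check the validity of this expansion.
            if n in sub or not is_valid(graph, sub, n):
                continue  # S33: discard the invalid expansion
            intermediate.append(sub + (n,))  # S33: keep the valid expansion
    return intermediate


def larger_id_only(graph: Graph, sub: Subgraph, new_vertex: int) -> bool:
    """Assumed validity rule: only add a vertex with a larger ID than the last one."""
    return new_vertex > sub[-1]


# A triangle: growing the three single-vertex subgraphs yields each edge once.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
edges = grow_by_vertex(graph, [(0,), (1,), (2,)], larger_id_only)
```

Without the validity rule, the same call would emit every edge twice, once from each endpoint, which is the kind of redundancy the check is meant to remove.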

The invention uses a cooperative GPU and CPU computing architecture to execute graph mining: the Grow operation is executed by the GPU and the cut operation by the CPU. After the GPU executes the Grow operation, the resulting intermediate sub-graph set is copied to the CPU, which executes the cut operation, and the intermediate sub-graph set is stored in CPU memory; when the Grow operation and the cut operation are executed iteratively, only the new candidate sub-graph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation. Computation and transfer latency is hidden by pipelining.

In addition, as shown in fig. 4, the invention also provides a high-performance graph mining system based on the GPU, which comprises the GPU and a CPU; the GPU and the CPU realize graph mining by the following steps:

S1: the CPU constructs a corresponding search space according to the graph mining application, selects a plurality of vertices or edges in the search space as candidates according to the sub-graph information provided by the user, and constructs an initial candidate sub-graph set;

S2: copying the candidate sub-graph set to the GPU, executing the Grow operation by the GPU, and expanding the candidate sub-graph set to obtain an intermediate sub-graph set;

S3: copying the intermediate sub-graph set to the CPU for storage, performing the cut operation by the CPU to screen qualified sub-graphs from the intermediate sub-graph set and obtain a new candidate sub-graph set, and storing the new candidate sub-graph set in CPU main memory;

S4: replacing the candidate sub-graph set with the new candidate sub-graph set obtained by the CPU, and iteratively executing the Grow operation and the cut operation until all sub-graphs meeting the user specified conditions are obtained.

According to steps S2 and S3, the invention can execute the whole graph mining process serially and finally find all subgraphs. Under this serial approach, the (n+1)-th round of iteration is executed only after steps S2 and S3 of the nth round are completed. In this mode, however, the CPU waits while the GPU performs the Grow operation, because the CPU has not yet received data and has nothing to process; likewise, the GPU idles while the CPU performs the cut operation.

To prevent memory overflow, the system copies only a portion of the current round's data to the GPU for expansion at a time, because the GPU generates a large number of candidates when performing the Grow operation. Therefore, in the nth round, assuming that the data of this round is divided into k parts, steps S2 and S3 are performed k times before proceeding to round n+1. Borrowing the idea of a pipeline, the k steps of the nth round can be overlapped. That is, in the nth round, while the CPU checks the i-th part of the data (performing the cut operation), the GPU can receive the (i+1)-th part and expand it (performing the Grow operation). When both steps have finished, the data transfers can also be overlapped: thanks to the dual-copy engine of the GPU, the D2H (device-to-host) transfer of the candidate data produced by expanding the (i+1)-th part and the H2D (host-to-device) transfer of the (i+2)-th part to be expanded proceed simultaneously. The execution pipeline for the i-th iteration is shown in FIG. 5. The first part of the data is first transferred to the GPU, which performs the Grow operation to generate candidate data. A barrier is required at this point until the Grow operation finishes. When this data is transferred back to the CPU, the pipeline starts: the second part of the initial data is transferred to the GPU at the same time, and because computation can start only after the transfer is complete, a barrier is placed before the end of the transfer. Once the CPU and the GPU have both received the data to be processed, the GPU starts the Grow operation and the CPU starts the cut operation, and the next round of transfers cannot begin until the slower of the two devices finishes its computation. With this pipeline, the system can make full use of the computing power of both the CPU and the GPU, while the simultaneous bidirectional transfers save data transfer time.
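The overlap within one round can be simulated on the host with two worker threads, one standing in for the GPU (Grow) and one for the CPU (cut). This is only a sketch of the scheduling idea: `split`, `pipelined_round` and the toy `toy_grow` and `toy_cut` stages are assumed names, the data are plain integers rather than subgraphs, and no actual device transfer takes place.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split(items: Sequence[int], k: int) -> List[List[int]]:
    """Divide the current round's data into at most k non-empty parts."""
    size = max(1, -(-len(items) // k))  # ceiling division
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def pipelined_round(
    parts: List[List[int]],
    grow: Callable[[List[int]], List[int]],
    cut: Callable[[List[int]], List[int]],
) -> List[int]:
    """Overlap Grow of part i+1 with cut of part i, as in the pipeline above."""
    results: List[int] = []
    if not parts:
        return results
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Prologue: the first part must be grown before any cut can start.
        grown = pool.submit(grow, parts[0]).result()
        for next_part in parts[1:]:
            grow_future = pool.submit(grow, next_part)  # "GPU": part i+1
            cut_future = pool.submit(cut, grown)        # "CPU": part i
            results.extend(cut_future.result())
            grown = grow_future.result()  # barrier: wait for the slower stage
        # Epilogue: the last grown part still has to be checked.
        results.extend(cut(grown))
    return results


def toy_grow(part: List[int]) -> List[int]:
    """Stand-in for Grow: every item produces two candidates."""
    return [x * 10 + d for x in part for d in (0, 1)]


def toy_cut(candidates: List[int]) -> List[int]:
    """Stand-in for cut: keep only the even candidates."""
    return [c for c in candidates if c % 2 == 0]


parts = split([1, 2, 3, 4, 5], k=2)
survivors = pipelined_round(parts, toy_grow, toy_cut)
```

The two `result()` calls inside the loop play the role of the barrier: the next part is not submitted until both the Grow of part i+1 and the cut of part i have finished.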

FIG. 2 is a schematic diagram of the user-customizable programming interfaces provided in accordance with an embodiment of the present invention, illustrating the programming interfaces that the graph mining system provides to the user; the user can develop a graph mining application merely by overriding a few interfaces in Grow and cut; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied to the Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to the cut stage; wherein:

The TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how to access the neighbors of that vertex; the default expansion behavior is typically to access all neighbors of the eligible vertices.

The TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search; by default, only the most basic isomorphism check is performed and no pruning is applied.

The TOCHECK interface is used for taking the subgraph as input and judging the relation between newly added vertexes and all vertexes in the current subgraph; the user may describe the shape of the subgraph through TOCHECK.

The FOREDGE interface is used for customizing a uniform edge relationship; in some cases, it is only necessary to determine whether the newly added vertex and each of the other vertices satisfy the same relationship.

The TOCOMPARE interface is mainly intended for applications that have a query subgraph, and is used for judging whether the candidate sub-graph and the query sub-graph are isomorphic.

In addition, the user may also invoke the IS_CONNECT interface and the COMMON_NEIGHBOR interface to determine how vertices in the input graph are connected.
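One way to picture these interfaces is as overridable methods on a base class. The text does not give signatures, so everything below is an assumption: the class names `MiningApp` and `CliqueApp`, the lower-case method names that mirror the interfaces described above, the argument lists, and the defaults (in particular, the default `to_add` only rejects vertices already in the subgraph, a simplification of the basic redundancy check).

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]
Subgraph = Tuple[int, ...]  # vertices in insertion order; the last one is newest


class MiningApp:
    """Hypothetical base class; a user overrides only the methods that differ."""

    def __init__(self, graph: Graph) -> None:
        self.graph = graph

    # Helper queries on the input graph.
    def is_connect(self, u: int, v: int) -> bool:
        return v in self.graph[u]

    def common_neighbor(self, u: int, v: int) -> Set[int]:
        return self.graph[u] & self.graph[v]

    # Grow stage.
    def to_expand(self, sub: Subgraph) -> List[int]:
        """Default: visit all neighbors of every vertex in the subgraph."""
        return sorted({n for v in sub for n in self.graph[v]} - set(sub))

    def to_add(self, sub: Subgraph, v: int) -> bool:
        """Default (simplified): accept any vertex not already present."""
        return v not in sub

    # cut stage.
    def for_edge(self, new_v: int, old_v: int) -> bool:
        """Default: no constraint between the new vertex and an existing one."""
        return True

    def to_check(self, sub: Subgraph) -> bool:
        """Default: apply for_edge between the newest vertex and all others."""
        new_v = sub[-1]
        return all(self.for_edge(new_v, old_v) for old_v in sub[:-1])

    def to_compare(self, sub: Subgraph, query: Graph) -> bool:
        """Only applications with a query subgraph need to override this."""
        raise NotImplementedError


class CliqueApp(MiningApp):
    """Example override: every new vertex must be adjacent to all others."""

    def to_add(self, sub: Subgraph, v: int) -> bool:
        return v > sub[-1]  # pruning: build each clique in increasing ID order

    def for_edge(self, new_v: int, old_v: int) -> bool:
        return self.is_connect(new_v, old_v)


app = CliqueApp({0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}})
```

With this layout a clique application overrides two short methods and inherits the rest, which matches the stated goal that a user only needs to rewrite a few interfaces.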

The CPU and the GPU can simultaneously copy data in two directions, the CPU calculation and the GPU calculation can be simultaneously executed, and the calculation resources of the CPU and the GPU can be fully utilized to accelerate the graph mining process and relieve the memory pressure.

FIG. 3 is a schematic diagram of graph mining application abstraction in accordance with an embodiment of the present invention. The following describes how three common graph mining applications are abstracted using the Grow and cut interfaces: triangle counting, maximal clique enumeration, and sub-graph matching. Assuming that the point expansion mode is adopted, for triangle counting, Grow mainly accesses the neighbors of all vertices in the current subgraph and does not need to consider the label or attribute information of the neighbors, while cut needs to filter out all subgraphs that are not triangles. For maximal clique enumeration, the Grow operation is similar to that of triangle counting, and the cut operation selects all complete (fully connected) subgraphs. In the sub-graph matching example, Grow considers, in addition to the connection relationship of the newly added vertex, whether the vertex label is consistent with the query graph and whether the connections of the vertex are consistent with those of the corresponding vertex in the query graph. In cut, it is necessary to determine whether the candidate subgraph is isomorphic to the query graph.
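As a concrete instance of the triangle counting abstraction, the sketch below grows single vertices twice by point expansion and lets the cut step discard every subgraph whose vertices are not pairwise adjacent. The function names and the set-based subgraph representation are assumptions for illustration, and everything runs on the host.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set

Graph = Dict[int, Set[int]]
Subgraph = FrozenSet[int]


def grow(graph: Graph, candidates: Set[Subgraph]) -> Set[Subgraph]:
    """Grow: visit the neighbors of all vertices in each subgraph, ignoring labels."""
    return {sub | {n}
            for sub in candidates
            for v in sub
            for n in graph[v]
            if n not in sub}


def cut_non_triangles(graph: Graph, intermediate: Set[Subgraph]) -> Set[Subgraph]:
    """cut: drop subgraphs in which some pair of vertices is not adjacent."""
    return {sub for sub in intermediate
            if all(b in graph[a] for a, b in combinations(sub, 2))}


def count_triangles(graph: Graph) -> int:
    """Two Grow and cut rounds take single vertices to edges, then to triangles."""
    candidates: Set[Subgraph] = {frozenset({v}) for v in graph}
    for _ in range(2):
        candidates = cut_non_triangles(graph, grow(graph, candidates))
    return len(candidates)


# Two triangles sharing the edge 1-2: {0, 1, 2} and {1, 2, 3}.
graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
triangle_total = count_triangles(graph)
```

Storing each subgraph as a frozen set means the same triangle reached along different expansion orders is counted once; running more rounds with the same cut would enumerate larger cliques in the same way.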

The invention adopts a cooperative GPU and CPU computing architecture, can use GPU multithreading to carry out graph mining operations and improve search efficiency, and at the same time uses CPU memory to store the large number of intermediate subgraphs generated during graph mining. The system architecture is described in combination with the Grow-cut execution model: while the system runs, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, the relationship between each subgraph and the vertices/edges is judged, and the generated candidate subgraphs are copied to CPU memory; to check the validity of the candidate subgraphs, CPU multithreading is used to execute the cut operation on them, the qualified subgraphs are stored in CPU main memory, and the system iterates this process. Borrowing the idea of a pipeline, CPU computation and GPU computation can be executed simultaneously during iteration, and bidirectional data copies can also be executed simultaneously, so that computation and transfer latency is hidden.

Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims

1. The high-performance graph mining method based on the GPU is characterized by comprising the following steps of:

judging whether the new candidate sub-graph set meets the user specified condition, if so, ending the operation, wherein the new candidate sub-graph set comprises all sub-graphs required by the user; otherwise, overwriting the candidate sub-graph set of the previous round with the new candidate sub-graph set, taking it as input of a Grow operation, and iteratively executing the Grow operation and the cut operation until all sub-graphs meeting the user specified conditions are found;

executing the Grow operation by using a GPU, executing the cut operation by using a CPU, copying the obtained intermediate sub-graph set to the CPU to execute the cut operation after the GPU executes the Grow operation, and storing the intermediate sub-graph set in a CPU memory; when the Grow operation and the cut operation are executed iteratively, only the new candidate sub-graph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation.

2. The GPU-based high performance graph mining method according to claim 1, wherein said expanding the candidate sub-graph set by a Grow operation to obtain an intermediate sub-graph set specifically comprises:

checking the validity of the point expansion or the edge expansion;

3. The high-performance graph mining system based on the GPU is characterized by comprising the GPU and a CPU; the GPU and the CPU realize graph mining by the following steps:

4. The GPU-based high performance graph mining system of claim 3, further comprising a user programming interface; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied to a Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied to a cut stage; wherein:

the FOREDGE interface is used for customizing a uniform edge relationship;

5. The GPU-based high performance graph mining system according to claim 4, wherein the user programming interface further comprises:

6. The GPU-based high performance graph mining system according to any of claims 3-5, wherein data is bi-directionally copied between the CPU and the GPU simultaneously, and CPU computations and GPU computations are performed simultaneously.