CN111984833B - High-performance graph mining method and system based on GPU - Google Patents


Publication number
CN111984833B
CN111984833B CN202011078543.7A CN202011078543A CN111984833B CN 111984833 B CN111984833 B CN 111984833B CN 202011078543 A CN202011078543 A CN 202011078543A CN 111984833 B CN111984833 B CN 111984833B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011078543.7A
Other languages
Chinese (zh)
Other versions
CN111984833A (en)
Inventor
谭光明
林志恒
张春明
段勃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Original Assignee
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Publication of CN111984833A
Application granted
Publication of CN111984833B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a GPU-based high-performance graph mining method and system. The invention adopts a GPU-CPU cooperative computing architecture, so that graph mining operations can be carried out with GPU multithreading to improve search efficiency, while CPU memory holds the large number of intermediate subgraphs generated during mining. The system architecture is described in terms of the Grow-cut execution model: at run time, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which evaluates the relationship between each subgraph and the candidate vertices/edges, and the generated candidate subgraphs are copied back to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the cut operation on them, the qualified subgraphs are stored in CPU main memory, and the system iterates this process. Borrowing the idea of pipelining, CPU computation and GPU computation can run concurrently during iteration, and data can be copied in both directions at the same time, so that computation and transfer latency are hidden.

Description

High-performance graph mining method and system based on GPU
Technical Field
The invention relates to a high-performance graph mining method and system based on a GPU.
Background
A graph data structure expresses the relationships among entities well, whereas traditional data structures cannot express such relationships efficiently. This advantage gives graph data a vital role in fields such as traffic networks, social networks, brain projects, and biological genetics. With the development of the internet, large amounts of graph data are generated in more and more fields, and the scale of graph data grows year by year, so analyzing and processing it is becoming increasingly important. In addition, as hardware has developed, computing power has risen, and devices such as GPUs and FPGAs have appeared to assist CPU computation. In recent years, researchers have begun to use the computing power of this wide variety of hardware for graph data analysis and processing. The usual workflow first extracts the relationships between entities from real-world data (social networks, bioinformatics, road networks, and so on), abstracts them into graph data, and then processes the graph data (common kinds of processing include graph computation, graph mining, and graph database storage) to obtain a result. Taking the breadth-first search (BFS) algorithm in graph computation as an example, BFS traverses the graph globally: starting from a root vertex, it iteratively visits the neighbors of already visited vertices and finally marks a state for each vertex. In graph mining, by contrast, the goal of the maximal clique enumeration (MCE) algorithm is to mine all maximal cliques in the graph. The field of graph databases focuses on the storage and querying of graphs.
The goals of different graph processing algorithms differ greatly; the result may be a change in the state of the graph, a set of subgraphs that satisfy a condition, and so on.
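As an illustration of the "graph state" kind of result mentioned above, the following minimal Python sketch runs BFS over an adjacency-list graph and marks each reachable vertex with its level. The function name `bfs_levels` and the sample graph are assumptions made for this example only; they do not come from the patent.

```python
from collections import deque

def bfs_levels(adj, root):
    """Return {vertex: level} for every vertex reachable from root."""
    level = {root: 0}              # the per-vertex state that BFS produces
    frontier = deque([root])
    while frontier:
        v = frontier.popleft()
        for w in adj[v]:           # visit the neighbors of a visited vertex
            if w not in level:
                level[w] = level[v] + 1
                frontier.append(w)
    return level

# Hypothetical sample graph, stored as adjacency lists.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = bfs_levels(adj, 0)
```

The output here is one value per vertex. A graph mining algorithm such as MCE instead returns a collection of subgraphs, which is why its intermediate state can grow much faster than the graph itself.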
In the field of graph computation, Google's Pregel graph computation system proposes a vertex-centric programming abstraction ("think like a vertex"). Its design, however, does not consider the characteristics of graph mining algorithms, whose processing granularity is coarser: the object being processed is a subgraph rather than a vertex. Existing graph computation models are therefore difficult to apply directly to graph mining applications. In addition, graph mining algorithms have difficulty handling large-scale graphs, because an exponential number of candidate patterns and subgraphs may be generated during mining, causing explosive growth in computation and in intermediate-state storage. Graph computation has mature algorithms and frameworks, but high-performance graph mining systems lack a reasonable algorithmic abstraction, and most research focuses on optimizing a single graph mining application. More importantly, GPU-based graph mining systems have so far received little study.
An important factor affecting the efficiency of graph mining algorithms is parallelism. Although multi-core CPUs have been developed for a long time, the number of concurrent threads they offer is still very limited, typically at most 16 or 20. This limited parallelism has become a bottleneck for CPU-based algorithms. In contrast, high-end GPUs can execute thousands of threads or more simultaneously, which makes them suitable for applications that process large amounts of data. GPUs also have much higher memory bandwidth than CPUs. Using a GPU is therefore a good way to further increase the efficiency of graph mining algorithms.
Disclosure of Invention
The invention aims to provide a high-performance graph mining method and system based on a GPU (graphics processing unit), in order to solve the low efficiency of existing graph mining algorithms.
To solve this technical problem, the invention provides a GPU-based high-performance graph mining method, characterized by comprising the following steps:
constructing a corresponding search space according to the graph mining application;
selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
using the search space and the candidate subgraph set as input of a Grow-cut execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening qualified subgraphs out of the intermediate subgraph set through the cut operation to obtain a new candidate subgraph set;
judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all subgraphs required by the user; otherwise, overwriting the candidate subgraph set of the previous round with the new candidate subgraph set, using it as input of the Grow operation, and iteratively executing the Grow operation and the cut operation until all subgraphs meeting the user-specified condition are found.
Further, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set specifically includes:
performing vertex expansion or edge expansion on the subgraphs in the candidate subgraph set;
checking the validity of the vertex expansion or edge expansion;
deleting the subgraphs whose expansion is not valid, and generating the intermediate subgraph set from the subgraphs whose expansion is valid.
Further, the Grow operation is executed by a GPU and the cut operation is executed by a CPU; after the GPU executes the Grow operation, the resulting intermediate subgraph set is copied to the CPU, which executes the cut operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the cut operation are executed iteratively, only the new candidate subgraph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation.
In addition, the invention also provides a GPU-based high-performance graph mining system, which comprises a GPU and a CPU; the GPU and the CPU implement graph mining through the following steps:
the CPU constructs a corresponding search space according to the graph mining application, selects a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructs an initial candidate subgraph set;
the candidate subgraph set is copied to the GPU, and the GPU executes the Grow operation, expanding the candidate subgraph set to obtain an intermediate subgraph set;
the intermediate subgraph set is copied to the CPU for storage, and the CPU executes the cut operation, screening qualified subgraphs out of the intermediate subgraph set to obtain a new candidate subgraph set, which is stored in CPU main memory;
the candidate subgraph set is replaced by the new candidate subgraph set obtained by the CPU, and the Grow operation and the cut operation are executed iteratively until all subgraphs meeting the user-specified condition are obtained.
Further, a user programming interface is included; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the cut stage; wherein:
the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed;
the TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search;
the TOCHECK interface takes the subgraph as input and judges the relationship between the newly added vertex and all vertices in the current subgraph;
the FOREDGE interface is used for customizing a uniform edge relationship;
the TOCOMPARE interface is used for judging whether a candidate subgraph is isomorphic to the query subgraph.
Further, the user programming interface further comprises:
the IS_CONNECT interface, which is used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface, which is mainly used for finding the common neighbors of two vertices.
Further, data can be copied between the CPU and the GPU in two directions at the same time, and CPU calculation and GPU calculation can be executed at the same time.
The beneficial effects of the invention are as follows: by taking the subgraph as the center and adopting the Grow-cut execution model, the graph mining problem is abstracted in a way that is highly descriptive and suitable for common graph mining applications; and by designing a graph mining system based on cooperative CPU and GPU computing, the computing resources of both the CPU and the GPU are fully utilized to accelerate the graph mining process and relieve memory pressure.
Drawings
The accompanying drawings, in which like reference numerals refer to identical or similar parts throughout the several views, are included to provide a further understanding of the present application; they illustrate and explain illustrative examples of the present application and do not limit it. In the drawings:
FIG. 1 is a schematic diagram of the execution flow of the graph mining system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the user programming interfaces provided by the system according to an embodiment of the present invention.
FIG. 3 is a diagram of graph mining application abstraction according to an embodiment of the present invention.
FIG. 4 is a system architecture diagram of an embodiment of the present invention.
FIG. 5 is a schematic diagram of a system execution pipeline according to one embodiment of the present invention.
Detailed Description
The GPU-based high-performance graph mining method shown in FIG. 1 comprises the following steps:
S1: constructing a corresponding search space according to the graph mining application; because graph mining applications differ, their search spaces are not the same. In the usual case, the search space of an application is the entire large graph, and the system mines the required subgraphs from it.
S2: selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
S3: using the search space and the candidate subgraph set as input of the Grow-cut execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening qualified subgraphs out of the intermediate subgraph set through the cut operation to obtain a new candidate subgraph set; during the cut operation, each subgraph as a whole is strictly checked against the conditions, and qualified subgraphs are inserted into the new candidate subgraph set;
S4: judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all subgraphs required by the user; otherwise, overwriting the candidate subgraph set of the previous round with the new candidate subgraph set, using it as input of the Grow operation, and iteratively executing the Grow operation and the cut operation until all subgraphs meeting the user-specified condition are found.
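Steps S1 to S4 can be sketched as a short single-threaded Python loop. This is only an illustrative sketch: representing each subgraph as a `frozenset` of vertex ids, the function names `grow`, `cut` and `mine`, and the `keep` and `done` callbacks are all assumptions introduced for the example, and no GPU is involved.

```python
def grow(graph, candidates):
    """Grow: extend every candidate subgraph by one adjacent vertex."""
    intermediate = set()
    for sub in candidates:
        for v in sub:
            for w in graph[v]:
                if w not in sub:
                    intermediate.add(sub | {w})   # frozenset | set -> frozenset
    return intermediate

def cut(graph, intermediate, keep):
    """cut: retain only the subgraphs that pass the user-supplied check."""
    return {sub for sub in intermediate if keep(graph, sub)}

def mine(graph, seeds, keep, done):
    """Iterate Grow and cut until the user-specified condition holds (S3, S4)."""
    candidates = set(seeds)
    while candidates and not done(candidates):
        intermediate = grow(graph, candidates)
        candidates = cut(graph, intermediate, keep)  # overwrite previous round
    return candidates

# S1: the search space is the whole (hypothetical) graph.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
# S2: every vertex is an initial candidate subgraph.
seeds = [frozenset({v}) for v in graph]
# Example task: enumerate all connected subgraphs with three vertices.
result = mine(graph, seeds,
              keep=lambda g, s: True,
              done=lambda cands: all(len(s) == 3 for s in cands))
```

With this sample graph, `result` holds the vertex sets {0, 1, 2}, {0, 2, 3} and {1, 2, 3}. A stricter `keep` callback, for example a clique test, would turn the same loop into a different mining application.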
Expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set specifically includes:
S31: performing vertex expansion or edge expansion on the subgraphs in the candidate subgraph set;
S32: checking the validity of the vertex expansion or edge expansion;
S33: deleting the subgraphs whose expansion is not valid, and generating the intermediate subgraph set from the subgraphs whose expansion is valid.
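The three sub-steps S31 to S33 can be separated as in the Python sketch below, which uses vertex expansion. The validity rule in `is_valid` (the new vertex must not already be in the subgraph and must have a larger id than the subgraph's first vertex) is a hypothetical example of a check, not a rule stated in the patent; the tuple representation and function names are likewise assumptions.

```python
def expand(graph, sub):
    """S31: propose every (subgraph, neighbor) vertex expansion."""
    for v in sub:
        for w in graph[v]:
            yield sub, w

def is_valid(sub, w):
    """S32: hypothetical validity rule for an expansion (an assumption)."""
    return w not in sub and w > sub[0]

def grow(graph, candidates):
    """S33: drop invalid expansions and build the intermediate set."""
    intermediate = []
    for sub in candidates:
        for s, w in expand(graph, sub):
            if is_valid(s, w):
                intermediate.append(s + (w,))
    return intermediate

# Hypothetical path graph 0 - 1 - 2; subgraphs are tuples of vertex ids.
graph = {0: [1], 1: [0, 2], 2: [1]}
level2 = grow(graph, [(0,), (1,), (2,)])
level3 = grow(graph, level2)
```

Here `level2` is `[(0, 1), (1, 2)]` and `level3` is `[(0, 1, 2)]`: the rule discards expansions such as `(1,)` with vertex 0, so each edge is generated from only one of its endpoints.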
The invention uses a GPU-CPU cooperative computing architecture to execute graph mining: the Grow operation is executed by the GPU and the cut operation is executed by the CPU; after the GPU executes the Grow operation, the resulting intermediate subgraph set is copied to the CPU, which executes the cut operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the cut operation are executed iteratively, only the new candidate subgraph set obtained in the current round of iteration is copied to the GPU each time to execute the Grow operation. With the GPU executing the Grow operation and the CPU executing the cut operation, computation and transfer latency are hidden by pipelining.
In addition, as shown in FIG. 4, the invention also provides a GPU-based high-performance graph mining system, which comprises a GPU and a CPU; the GPU and the CPU implement graph mining through the following steps:
S1: the CPU constructs a corresponding search space according to the graph mining application, selects a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructs an initial candidate subgraph set;
S2: the candidate subgraph set is copied to the GPU, and the GPU executes the Grow operation, expanding the candidate subgraph set to obtain an intermediate subgraph set;
S3: the intermediate subgraph set is copied to the CPU for storage, and the CPU executes the cut operation, screening qualified subgraphs out of the intermediate subgraph set to obtain a new candidate subgraph set, which is stored in CPU main memory;
S4: the candidate subgraph set is replaced by the new candidate subgraph set obtained by the CPU, and the Grow operation and the cut operation are executed iteratively until all subgraphs meeting the user-specified condition are obtained.
Following steps S2 and S3, the invention can execute the whole graph mining process serially and eventually find all subgraphs. Under this serial scheme, round n+1 begins only after steps S2 and S3 of round n have completed. In this mode, however, the CPU waits while the GPU performs the Grow operation, because the CPU has not yet received data and has nothing to process. Likewise, the GPU idles while the CPU performs the cut operation.
To prevent memory overflow, the system copies only a portion of the current round's data to the GPU for expansion at a time, because the GPU generates a large number of candidates when performing the Grow operation. Therefore, in round n, assuming the round's data is divided into k parts, operations S2 and S3 are performed k times before round n+1 begins. Borrowing the idea of pipelining, the k steps of round n can be overlapped: while the CPU checks the i-th part (performing the cut operation), the GPU can receive and expand the (i+1)-th part (performing the Grow operation). When both steps have finished, the data transfers can also be overlapped: thanks to the GPU's dual copy engines, the device-to-host (D2H) transfer of the candidates produced by expanding the (i+1)-th part can proceed at the same time as the host-to-device (H2D) transfer of the (i+2)-th part to be expanded. The execution pipeline for the i-th iteration is shown in FIG. 5. The first part is first transferred to the GPU, which performs the Grow operation to generate a candidate data set. A barrier is required at this point until the Grow operation finishes. When that data is transferred back to the CPU, the pipeline starts: the second part of the initial data is transferred to the GPU at the same time, and computation can begin only when the transfers are complete, so execution blocks until they finish. Once the CPU and the GPU have each received the data to process, the GPU executes the Grow operation and the CPU executes the cut operation concurrently, and the next round of transfers cannot begin until the slower of the two devices finishes computing. With this pipeline, the system makes full use of the computing capacity of both the CPU and the GPU, and the simultaneous transfers in both directions save data transfer time.
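The benefit of the overlap can be estimated with a simple timing model. The Python sketch below is an assumption-laden reading of the pipeline described above, not a measurement: it takes hypothetical per-part times for the H2D copy, the Grow operation, the D2H copy and the cut operation, treats each overlapped pair as costing the maximum of the two, and places a barrier after every transfer phase and every compute phase.

```python
def serial_time(h2d, grow, d2h, cut):
    """Every part runs H2D -> Grow -> D2H -> cut with no overlap."""
    return sum(h2d) + sum(grow) + sum(d2h) + sum(cut)

def pipelined_time(h2d, grow, d2h, cut):
    """Overlap D2H(i) with H2D(i+1), then cut(i) with Grow(i+1)."""
    k = len(grow)
    total = h2d[0] + grow[0]                   # fill: first part, then barrier
    for i in range(k - 1):
        total += max(d2h[i], h2d[i + 1])       # dual copy engines
        total += max(cut[i], grow[i + 1])      # wait for the slower device
    total += d2h[k - 1] + cut[k - 1]           # drain: last part
    return total

# Hypothetical times (arbitrary units) for k = 3 parts.
h2d, grow, d2h, cut = [1, 1, 1], [4, 4, 4], [2, 2, 2], [3, 3, 3]
t_serial = serial_time(h2d, grow, d2h, cut)
t_pipe = pipelined_time(h2d, grow, d2h, cut)
```

With these made-up numbers the serial schedule takes 30 units and the pipelined one 22. The saving grows with k and is largest when the Grow and cut times are balanced, since each compute phase costs as much as the slower device.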
FIG. 2 is a schematic diagram of the user-customizable programming interfaces provided according to an embodiment of the present invention, illustrating the programming interfaces that the graph mining system offers to the user; a user can develop a graph mining application simply by overriding several interfaces in the Grow and cut stages; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the cut stage; wherein:
The TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed; the default expansion behavior is to visit all neighbors of the eligible vertices.
The TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search; by default, only the most basic isomorphism detection is performed and no pruning is done.
The TOCHECK interface takes the subgraph as input and judges the relationship between the newly added vertex and all vertices in the current subgraph; the user may describe the shape of the subgraph through TOCHECK.
The FOREDGE interface is used for customizing a uniform edge relationship; in some cases, it is only necessary to determine whether the newly added vertex and every other vertex satisfy one consistent relationship.
The TOCOMPARE interface is mainly intended for applications that have a query subgraph; it judges whether a candidate subgraph is isomorphic to the query subgraph.
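One way to picture these five hooks is as overridable methods on a base class. The Python sketch below is hypothetical throughout: the patent names the interfaces but gives no signatures, so the method names, parameters, default bodies and the `one_round` driver are assumptions chosen to mirror the defaults described above.

```python
class GrowCutApp:
    """Hypothetical rendering of the five user hooks (signatures assumed)."""

    # ---- Grow stage ----
    def to_expand(self, graph, sub):
        """Default: visit all neighbors of every vertex in the subgraph."""
        return [w for v in sub for w in graph[v] if w not in sub]

    def to_add(self, sub, w, seen):
        """Default: basic redundancy check on the vertex set, no pruning."""
        key = frozenset(sub) | {w}
        if key in seen:
            return False
        seen.add(key)
        return True

    # ---- cut stage ----
    def for_edge(self, graph, u, w):
        """Uniform relation between new vertex w and existing vertex u."""
        return True

    def to_check(self, graph, sub, w):
        """Relate the new vertex to all vertices of the current subgraph."""
        return all(self.for_edge(graph, u, w) for u in sub)

    def to_compare(self, graph, sub):
        """Only meaningful for applications with a query subgraph."""
        return True


class CliqueApp(GrowCutApp):
    def for_edge(self, graph, u, w):
        return w in graph[u]        # the new vertex must touch every old one


def one_round(app, graph, candidates):
    """One Grow pass followed by one cut pass, on a single thread."""
    seen, kept = set(), []
    for sub in candidates:
        for w in app.to_expand(graph, sub):
            if not app.to_add(sub, w, seen):
                continue
            if app.to_check(graph, sub, w) and app.to_compare(graph, sub + (w,)):
                kept.append(sub + (w,))
    return kept


graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}   # hypothetical input
app = CliqueApp()
level2 = one_round(app, graph, [(v,) for v in graph])
level3 = one_round(app, graph, level2)
```

In this sketch only `for_edge` is overridden, which matches the remark that some applications need just one consistent relationship between the new vertex and the others. The two rounds yield four edges and then the single triangle on vertices 0, 1 and 2.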
In addition, the user may call the IS_CONNECT interface and the COMMON_NEIGHBOR interface to determine how vertices in the input graph are connected.
The IS_CONNECT interface is used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface is mainly used for finding the common neighbors of two vertices.
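For a graph stored as a dictionary of neighbor sets, these two helpers reduce to a membership test and a set intersection. The lower-case Python names, the signatures and the storage format below are assumptions for illustration; the patent does not specify them.

```python
def is_connect(graph, u, v):
    """True if an edge joins u and v."""
    return v in graph[u]

def common_neighbor(graph, u, v):
    """Vertices adjacent to both u and v."""
    return graph[u] & graph[v]

# Hypothetical undirected graph as {vertex: set of neighbors}.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

# Usage: each triangle is seen once from each of its three edges.
edges = [(u, v) for u in graph for v in graph[u] if u < v]
triangles = sum(len(common_neighbor(graph, u, v)) for u, v in edges) // 3
```

For the sample graph, `is_connect(graph, 0, 3)` is false, `common_neighbor(graph, 0, 1)` is `{2}`, and `triangles` evaluates to 1.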
Data can be copied between the CPU and the GPU in both directions at the same time, and CPU computation and GPU computation can be executed concurrently, so the computing resources of both the CPU and the GPU are fully utilized to accelerate the graph mining process and relieve memory pressure.
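The control flow of this overlap can be imitated on a CPU alone. In the Python sketch below a single worker thread stands in for the GPU: while it runs `grow` on the next part, the main thread runs `cut` on the part that was just grown. This is an analogy under stated assumptions, not the patented implementation; the names `run_round`, `demo_grow` and `demo_cut` are invented for the example, and Python threads do not provide true parallelism for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def run_round(chunks, grow, cut):
    """Overlap cut(part i) with grow(part i+1) inside one round."""
    kept = []
    if not chunks:
        return kept
    with ThreadPoolExecutor(max_workers=1) as device:   # stand-in for the GPU
        pending = device.submit(grow, chunks[0])
        for nxt in chunks[1:]:
            grown = pending.result()             # barrier: wait for Grow
            pending = device.submit(grow, nxt)   # device starts the next part
            kept.extend(cut(grown))              # host checks the previous part
        kept.extend(cut(pending.result()))       # drain the last part
    return kept

# Toy stand-ins for the real operations (assumptions for the demo).
def demo_grow(chunk):
    return [x * 2 for x in chunk]

def demo_cut(items):
    return [x for x in items if x % 4 == 0]

result = run_round([[1, 2], [3, 4], [5, 6]], demo_grow, demo_cut)
```

Results are collected in part order, so `result` is `[4, 8, 12]` regardless of how the two threads interleave.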
FIG. 3 is a diagram of graph mining application abstraction according to an embodiment of the present invention. The following describes how three common graph mining applications are abstracted using the Grow and cut interfaces: triangle counting, maximal clique enumeration, and subgraph matching. Assuming vertex expansion is used, for triangle counting, Grow mainly visits the neighbors of all vertices in the current subgraph and does not need to consider the labels or attributes of those neighbors, while cut filters out all subgraphs that are not triangles. For maximal clique enumeration, the Grow operation is similar to that of triangle counting, and the cut operation keeps only the complete (fully connected) subgraphs. In subgraph matching, Grow considers not only the connectivity of the newly added vertex but also whether its label is consistent with the query graph and whether its connections are consistent with those of the corresponding vertex in the query graph. In cut, it is necessary to determine whether the candidate subgraph is isomorphic to the query graph.
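The three cut predicates can be compared side by side in Python. The sketch below is a simplified illustration under several assumptions: candidate subgraphs are vertex-induced, the isomorphism test is a brute-force search over permutations, and the data graph, labels, query pattern and function names are all made up for the example. All three-vertex subsets stand in for the output of the Grow stage.

```python
from itertools import combinations, permutations

def is_clique(graph, sub):
    """cut for clique mining: every pair of vertices is adjacent."""
    return all(b in graph[a] for a, b in combinations(sub, 2))

def is_triangle(graph, sub):
    """cut for triangle counting: a clique on exactly three vertices."""
    return len(sub) == 3 and is_clique(graph, sub)

def matches_query(graph, labels, sub, q_edges, q_labels):
    """cut for subgraph matching: brute-force labeled isomorphism test."""
    if len(sub) != len(q_labels):
        return False
    n = len(q_labels)
    for perm in permutations(sub):       # perm[i] plays query vertex i
        if any(labels[perm[i]] != q_labels[i] for i in range(n)):
            continue
        induced = {(i, j) for i, j in combinations(range(n), 2)
                   if perm[j] in graph[perm[i]]}
        if induced == q_edges:
            return True
    return False

# Hypothetical labeled data graph.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
labels = {0: "a", 1: "b", 2: "c", 3: "a"}
candidates = list(combinations(sorted(graph), 3))   # stand-in for Grow output

triangle_count = sum(is_triangle(graph, s) for s in candidates)

# Query: a path a - c - a, with edges given as (i, j) pairs where i < j.
q_labels, q_edges = ["a", "c", "a"], {(0, 1), (1, 2)}
matches = [s for s in candidates
           if matches_query(graph, labels, s, q_edges, q_labels)]
```

On this input there is one triangle, and the only match for the query path is the vertex set (0, 2, 3). A practical system would also apply the label check during Grow, as the text describes, so that most non-matching candidates are never generated.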
The invention adopts a GPU-CPU cooperative computing architecture, so that graph mining operations can be carried out with GPU multithreading to improve search efficiency, while CPU memory holds the large number of intermediate subgraphs generated during mining. The system architecture is described in terms of the Grow-cut execution model: at run time, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which evaluates the relationship between each subgraph and the candidate vertices/edges, and the generated candidate subgraphs are copied back to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the cut operation on them, the qualified subgraphs are stored in CPU main memory, and the system iterates this process. Borrowing the idea of pipelining, CPU computation and GPU computation can run concurrently during iteration, and data can be copied in both directions at the same time, so that computation and transfer latency are hidden.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made to the technical solution without departing from its spirit and scope, and all such changes are intended to be covered by the claims of the present invention.

Claims (6)

1. A GPU-based high-performance graph mining method, characterized by comprising the following steps:
constructing a corresponding search space according to the graph mining application;
selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
using the search space and the candidate subgraph set as input of a Grow-cut execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening qualified subgraphs out of the intermediate subgraph set through the cut operation to obtain a new candidate subgraph set;
judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all subgraphs required by the user; otherwise, overwriting the candidate subgraph set of the previous round with the new candidate subgraph set, using it as input of the Grow operation, and iteratively executing the Grow operation and the cut operation until all subgraphs meeting the user-specified condition are found;
executing the Grow operation with a GPU and the cut operation with a CPU; after the GPU executes the Grow operation, copying the resulting intermediate subgraph set to the CPU to execute the cut operation, and storing the intermediate subgraph set in CPU memory; when the Grow operation and the cut operation are executed iteratively, copying only the new candidate subgraph set obtained in the current round of iteration to the GPU each time to execute the Grow operation.
2. The GPU-based high-performance graph mining method according to claim 1, wherein expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set specifically comprises:
performing vertex expansion or edge expansion on the subgraphs in the candidate subgraph set;
checking the validity of the vertex expansion or edge expansion;
deleting the subgraphs whose expansion is not valid, and generating the intermediate subgraph set from the subgraphs whose expansion is valid.
3. A GPU-based high-performance graph mining system, characterized by comprising a GPU and a CPU; the GPU and the CPU implement graph mining through the following steps:
the CPU constructs a corresponding search space according to the graph mining application, selects a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructs an initial candidate subgraph set;
the candidate subgraph set is copied to the GPU, and the GPU executes the Grow operation, expanding the candidate subgraph set to obtain an intermediate subgraph set;
the intermediate subgraph set is copied to the CPU for storage, and the CPU executes the cut operation, screening qualified subgraphs out of the intermediate subgraph set to obtain a new candidate subgraph set, which is stored in CPU main memory;
the candidate subgraph set is replaced by the new candidate subgraph set obtained by the CPU, and the Grow operation and the cut operation are executed iteratively until all subgraphs meeting the user-specified condition are obtained.
4. The GPU-based high-performance graph mining system of claim 3, further comprising a user programming interface; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow stage, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the cut stage; wherein:
the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed;
the TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search;
the TOCHECK interface takes the subgraph as input and judges the relationship between the newly added vertex and all vertices in the current subgraph;
the FOREDGE interface is used for customizing a uniform edge relationship;
the TOCOMPARE interface is used for judging whether a candidate subgraph is isomorphic to the query subgraph.
5. The GPU-based high performance graph mining system according to claim 4, wherein the user programming interface further comprises:
the IS_CONNECT interface is used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface is used for finding the common neighbors of two vertices.
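The two helper interfaces can be sketched over sorted adjacency lists as follows. The storage layout, the function names `is_connect` and `common_neighbor`, and their signatures are assumptions; the claims state only what each interface decides or returns.

```python
from bisect import bisect_left
from typing import Dict, List

# Assumption: each adjacency list is stored in ascending order.
SortedAdj = Dict[int, List[int]]


def is_connect(adj: SortedAdj, u: int, v: int) -> bool:
    """Return True if an edge joins u and v (binary search in u's list)."""
    row = adj[u]
    i = bisect_left(row, v)
    return i < len(row) and row[i] == v


def common_neighbor(adj: SortedAdj, u: int, v: int) -> List[int]:
    """Return the vertices adjacent to both u and v (sorted-list merge)."""
    a, b = adj[u], adj[v]
    i = j = 0
    shared: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            shared.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
```

With this example graph, `is_connect(adj, 0, 2)` holds, `is_connect(adj, 0, 3)` does not, and vertices 0 and 1 share the single neighbor 2.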
6. The GPU-based high performance graph mining system according to any one of claims 3-5, wherein data is bi-directionally copied between the CPU and the GPU simultaneously, and CPU computations and GPU computations are performed simultaneously.
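The overlap of CPU and GPU work can be pictured with a small software pipeline. This is only an analogy under stated assumptions: a single worker thread stands in for the GPU, the host/device copies are not modeled, and `pipeline`, `demo_grow` and `demo_cut` are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def pipeline(batches: Iterable[List[int]],
             grow: Callable[[List[int]], List[int]],
             cut: Callable[[List[int]], List[int]]) -> List[List[int]]:
    """Overlap the grow step of batch i with the cut step of batch i-1."""
    results: List[List[int]] = []
    with ThreadPoolExecutor(max_workers=1) as device:   # stands in for the GPU
        pending: Optional[Future] = None
        for batch in batches:
            current = device.submit(grow, batch)        # device starts batch i
            if pending is not None:
                # Host cuts batch i-1 while the device works on batch i.
                results.append(cut(pending.result()))
            pending = current
        if pending is not None:
            results.append(cut(pending.result()))       # drain the last batch
    return results


def demo_grow(batch: List[int]) -> List[int]:
    """Toy grow step: each item yields itself and one extension."""
    return [x for v in batch for x in (v, v + 10)]


def demo_cut(items: List[int]) -> List[int]:
    """Toy cut step: keep even items only."""
    return [x for x in items if x % 2 == 0]


out = pipeline([[1, 2], [3, 4], [5]], demo_grow, demo_cut)
```

Results are returned in batch order because each future is consumed in the order it was submitted.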
CN202011078543.7A 2020-05-18 2020-10-10 High-performance graph mining method and system based on GPU Active CN111984833B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020104241110 2020-05-18
CN202010424111.0A CN111625691A (en) 2020-05-18 2020-05-18 GPU-based high-performance graph mining method and system

Publications (2)

Publication Number Publication Date
CN111984833A CN111984833A (en) 2020-11-24
CN111984833B true CN111984833B (en) 2023-08-01

Family

ID=72259804

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202010424111.0A Pending CN111625691A (en) 2020-05-18 2020-05-18 GPU-based high-performance graph mining method and system
CN202010866763.XA Withdrawn CN111831864A (en) 2020-05-18 2020-08-25 GPU-based high-performance graph mining method and system
CN202011078543.7A Active CN111984833B (en) 2020-05-18 2020-10-10 High-performance graph mining method and system based on GPU

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202010424111.0A Pending CN111625691A (en) 2020-05-18 2020-05-18 GPU-based high-performance graph mining method and system
CN202010866763.XA Withdrawn CN111831864A (en) 2020-05-18 2020-08-25 GPU-based high-performance graph mining method and system

Country Status (1)

Country Link
CN (3) CN111625691A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112934758B (en) * 2020-12-14 2022-10-14 Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences Coal sorting hand-dialing control method based on image recognition
CN114661757B (en) * 2020-12-22 2024-04-19 East China Normal University Subgraph matching method and system based on heterogeneous computer FPGA

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559016A (en) * 2013-10-23 2014-02-05 Jiangxi University of Science and Technology Frequent subgraph excavating method based on graphic processor parallel computing
CN108170519A (en) * 2018-01-25 2018-06-15 Shanghai Jiao Tong University Optimize the systems, devices and methods of expansible GPU virtualization

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9208535B2 (en) * 2013-11-25 2015-12-08 Xerox Corporation Method and apparatus for graphical processing unit (GPU) accelerated large-scale web community detection
SG11201903858XA (en) * 2016-10-28 2019-05-30 Illumina Inc Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing
US10235182B2 (en) * 2017-06-20 2019-03-19 Palo Alto Research Center Incorporated System and method for hybrid task management across CPU and GPU for efficient data mining

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559016A (en) * 2013-10-23 2014-02-05 Jiangxi University of Science and Technology Frequent subgraph excavating method based on graphic processor parallel computing
CN108170519A (en) * 2018-01-25 2018-06-15 Shanghai Jiao Tong University Optimize the systems, devices and methods of expansible GPU virtualization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
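A GPU based maximum common subgraph algorithm for drug discovery application; P. B. Jayaraj; 《2016 IEEE International Parallel and Distributed Processing Symposium Workshops》; 580-588 *
Research on key technologies of large-scale graph data mining based on GPU heterogeneous architecture; Yang Bo; 《China Doctoral Dissertations Full-text Database (Information Science and Technology)》 (No. 02); I138-62 *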

Also Published As

Publication number Publication date
CN111831864A (en) 2020-10-27
CN111984833A (en) 2020-11-24
CN111625691A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
US9535975B2 (en) Parallel programming of in memory database utilizing extensible skeletons
EP2743845A1 (en) Graph traversal operator inside a column store
US10445323B2 (en) Association rule mining with the micron automata processor
CN111984833B (en) High-performance graph mining method and system based on GPU
Serafini et al. Scalable graph neural network training: The case for sampling
CN106446134B (en) Local multi-query optimization method based on predicate specification and cost estimation
CN106569896B (en) A kind of data distribution and method for parallel processing and system
Souravlas et al. Hybrid CPU-GPU community detection in weighted networks
CN113419861A (en) Graph traversal mixed load balancing method facing GPU card group
CN106777065A (en) The method and system that a kind of Frequent tree mining is excavated
Elshawi et al. Big graph processing systems: State-of-the-art and open challenges
Huang et al. A robust ECO engine by resource-constraint-aware technology mapping and incremental routing optimization
Wang et al. RDF partitioning for scalable SPARQL query processing
CN110851178B (en) Inter-process program static analysis method based on distributed graph reachable computation
Hu et al. A gpu-based graph pattern mining system
Zeng et al. DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search
CN108009099B (en) Acceleration method and device applied to K-Mean clustering algorithm
Shao et al. Large-scale Graph Analysis: System, Algorithm and Optimization
CN115687707A (en) Acceleration subgraph matching method based on CPU-FPGA hybrid platform
Sevim et al. FUDJ: Flexible User-Defined Distributed Joins
Zheng et al. A novel method to generate frequent itemsets in distributed environment
Wan et al. An order sampling processing-in-memory architecture for approximate graph pattern mining
Jradi et al. A GPU-based parallel algorithm for enumerating all chordless cycles in graphs
Srinivasan et al. A Distributed Algorithm for Identifying Strongly Connected Components on Incremental Graphs
Luo et al. Partitioning Mesh for Workload Balance According to the Capability of Each Computing Node

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant