CN111984833A - GPU-based high-performance graph mining method and system - Google Patents

Publication number
CN111984833A
Authority
CN
China
Prior art keywords
gpu
subgraph
cpu
candidate
subgraph set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011078543.7A
Other languages
Chinese (zh)
Other versions
CN111984833B (en)
Inventor
谭光明
林志恒
张春明
段勃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Original Assignee
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Publication of CN111984833A
Application granted
Publication of CN111984833B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a GPU (graphics processing unit)-based high-performance graph mining method and system. The method adopts a cooperative GPU and CPU (central processing unit) computing architecture: GPU multithreading performs the graph mining operations to improve search efficiency, while CPU memory stores the large number of intermediate subgraphs generated during graph mining. The system architecture is described in terms of the Grow-Cull execution model. While the system runs, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which evaluates the relationship between each subgraph and a vertex or edge, and the generated candidate subgraphs are copied to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the Cull operation on them, the qualified subgraphs are stored in CPU main memory, and the system repeats this iteration. Borrowing the idea of a pipeline, CPU computation and GPU computation can run simultaneously during iteration, and the two directions of data copying can also run simultaneously, so that computation and transfer latency are hidden.

Description

GPU-based high-performance graph mining method and system
Technical Field
The invention relates to a high-performance graph mining method and system based on a GPU.
Background
Graph data structures express the relationships between entities well, whereas traditional data structures cannot express such relationships efficiently. This advantage gives graph data a crucial role in fields as varied as traffic networks, social networks, human brain networks, and biological genes. With the development of the internet, more and more fields generate large amounts of graph data, and the scale of that data grows year by year, so analyzing and processing it is becoming increasingly important. Meanwhile, as hardware develops, the computing power of computers keeps rising, and devices such as GPUs and FPGAs are available to assist the CPU in computation. In recent years some researchers have begun to study graph data analysis and processing using the computing power of these varied hardware resources. A typical graph data analysis process first extracts the relationships between entities from real-world data (social networks, biological information, road networks, and other fields), abstracts them into graph data, and then processes the graph data (common forms of processing include graph computation, graph mining, and graph database storage) to obtain a result. Taking the breadth-first search algorithm (BFS) in graph computation as an example, BFS performs a global traversal of the graph data, iteratively visiting the neighbors of vertices starting from a root vertex, and finally marks a state for each vertex. The goal of the maximal clique enumeration algorithm (MCE) in graph mining, by contrast, is to mine all the maximal cliques in the graph. The graph database field is concerned with graph storage and query. The goals of different graph processing algorithms differ greatly, and the result may be a change in the state of the graph, a set of subgraphs meeting some condition, and so on.
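By way of illustration only, the BFS behaviour described above (visiting neighbors iteratively from a root vertex and marking a state for each vertex) can be sketched as follows. The function name `bfs_levels`, the adjacency-dictionary representation, and the choice of "distance from the root" as the marked state are assumptions made for this sketch, not part of the invention.

```python
from collections import deque


def bfs_levels(adj, root):
    """Return {vertex: level} for every vertex reachable from root."""
    level = {root: 0}            # the per-vertex state that BFS marks
    frontier = deque([root])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:         # visit the neighbors of v
            if u not in level:   # first visit fixes the level
                level[u] = level[v] + 1
                frontier.append(u)
    return level


# Small undirected example graph (hypothetical data).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = bfs_levels(adj, 0)
```

The result is a state per vertex, which contrasts with graph mining, where the result is a set of subgraphs.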
In the field of graph computation, Google's Pregel graph computation system introduced a vertex-centric ("think like a vertex") programming abstraction. However, its design does not take the characteristics of graph mining algorithms into account: graph mining works at a coarser granularity, in that the object being processed is a subgraph rather than a vertex. Existing graph computation models are therefore difficult to apply directly to graph mining applications. In addition, graph mining algorithms struggle with medium-scale and large-scale graphs, because an exponential number of candidate patterns and subgraphs may be generated during large-scale graph mining, causing explosive growth in computation and in intermediate state storage. Mature algorithms and frameworks exist in the field of graph computation, but high-performance graph mining systems lack a reasonable algorithmic abstraction, and most research focuses only on optimizing a single graph mining application. More importantly, GPU-based graph mining systems have received little research attention.
One important factor affecting the efficiency of graph mining algorithms is parallelism. Although multi-core CPUs have been developed for a long time, the number of concurrent threads is still very limited, usually up to 16 or 20. The limited parallelism capability has become a bottleneck for these CPU-based algorithms. In contrast, high-end GPUs have the ability to execute thousands or more threads simultaneously, which makes them suitable for applications involving processing large amounts of data. GPUs also have a very high memory bandwidth compared to CPUs. Therefore, in order to further improve the efficiency of the graph mining algorithm, the GPU is a good solution.
Disclosure of Invention
The invention aims to provide a high-performance graph mining method and system based on a GPU (graphics processing unit) so as to solve the problem of low efficiency of the existing graph mining algorithm.
In order to solve the above technical problem, the invention provides a GPU-based high-performance graph mining method, characterized by comprising the following steps:
constructing a corresponding search space according to the graph mining application;
selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
taking the search space and the candidate subgraph set as the input of a Grow-Cull execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening the qualified subgraphs from the intermediate subgraph set through the Cull operation to obtain a new candidate subgraph set;
judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all the subgraphs required by the user; if not, overwriting the previous round's candidate subgraph set with the new candidate subgraph set as the input of the Grow operation, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are found.
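By way of illustration only, the iteration described in the steps above can be sketched as a small driver loop. All names here (`grow_cull`, `grow_by_vertex`, `done`) and the representation of a subgraph as a `frozenset` of vertex ids are assumptions made for this sketch; the Cull predicate in the example accepts everything because the toy task has nothing to reject.

```python
def grow_cull(graph, candidates, grow, cull, done):
    """Repeat Grow then Cull until `done` accepts the candidate set."""
    while True:
        intermediate = grow(graph, candidates)                    # expand
        candidates = [s for s in intermediate if cull(graph, s)]  # screen
        if done(candidates) or not candidates:
            return candidates


def grow_by_vertex(graph, candidates):
    """Add one neighboring vertex to every candidate; a set removes repeats."""
    return {sub | {u} for sub in candidates for v in sub
            for u in graph[v] if u not in sub}


# Path graph 0-1-2-3; the user wants every connected 3-vertex subgraph.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
initial = [frozenset([v]) for v in path]          # initial candidate set
found = grow_cull(path, initial, grow_by_vertex,
                  cull=lambda g, s: True,         # nothing to reject here
                  done=lambda c: all(len(s) == 3 for s in c))
```

In each pass the new candidate set replaces the previous one, so only one round of candidates needs to be kept as input for the next Grow operation.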
Further, expanding the candidate subgraph set through the Grow operation to obtain the intermediate subgraph set specifically includes:
performing vertex extension or edge extension on the subgraphs in the candidate subgraph set;
checking the validity of the vertex extension or edge extension;
and deleting the subgraphs whose extension is invalid, and generating the intermediate subgraph set from the subgraphs whose extension is valid.
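By way of illustration only, a Grow operation using vertex extension with a pluggable validity check might look like the sketch below. The text does not specify the validity rule, so `larger_than_all` is a hypothetical rule chosen for brevity: it produces each vertex set at most once and is complete for clique-like patterns such as triangles, but a general application would need a proper canonical-form test instead.

```python
def grow_vertex(adj, candidates, is_valid):
    """One Grow pass: extend each subgraph (a tuple of vertex ids) by one vertex."""
    intermediate = []
    for sub in candidates:
        # Vertex extension: every vertex adjacent to the subgraph but not in it.
        neighbors = {u for v in sub for u in adj[v]} - set(sub)
        for u in sorted(neighbors):
            if is_valid(adj, sub, u):            # validity check
                intermediate.append(sub + (u,))  # keep valid extensions only
    return intermediate


def larger_than_all(adj, sub, u):
    """Hypothetical rule: accept u only if its id exceeds every id in sub."""
    return u > max(sub)


# Triangle 0-1-2 with a pendant vertex 3 attached to 2 (hypothetical data).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
round1 = grow_vertex(adj, [(0,), (1,), (2,), (3,)], larger_than_all)
round2 = grow_vertex(adj, round1, larger_than_all)
```

Invalid extensions are never materialised here, which has the same effect as generating them and deleting them afterwards.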
Further, the GPU executes the Grow operation and the CPU executes the Cull operation; after executing the Grow operation, the GPU copies the resulting intermediate subgraph set to the CPU, which executes the Cull operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current iteration is copied to the GPU each time to execute the Grow operation.
In addition, the invention also provides a GPU-based high-performance graph mining system, which comprises a GPU and a CPU; the GPU and the CPU perform graph mining through the following steps:
constructing, by the CPU, a corresponding search space according to the graph mining application; selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
copying the candidate subgraph set to the GPU, executing the Grow operation on the GPU, and expanding the candidate subgraph set to obtain an intermediate subgraph set;
copying the intermediate subgraph set to the CPU for storage, and having the CPU execute the Cull operation, which screens the qualified subgraphs from the intermediate subgraph set to obtain a new candidate subgraph set and stores it in CPU main memory;
and replacing the candidate subgraph set with the new candidate subgraph set obtained by the CPU, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are obtained.
Further, the system also includes a user programming interface; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the Cull phase; wherein,
the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed;
the TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search;
the TOCHECK interface is used for taking the subgraph as input and judging the relationship between the newly added vertex and all the vertices in the current subgraph;
the FOREDGE interface is used for customizing a uniform edge relation;
and the TOCOMPARE interface is used for judging whether the candidate subgraph and the query subgraph are isomorphic.
Further, the user programming interface further comprises:
the IS_CONNECT interface, used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface, used mainly for finding the common neighbors of two vertices.
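By way of illustration only, the two helper interfaces just listed could be realised over sorted adjacency lists as sketched below. The lower-case function names, their signatures, and the sorted-list representation are assumptions made for this sketch; the text does not specify how the interfaces are implemented.

```python
from bisect import bisect_left


def is_connect(adj, u, v):
    """True if an edge joins u and v (binary search in u's sorted neighbor list)."""
    row = adj[u]
    i = bisect_left(row, v)
    return i < len(row) and row[i] == v


def common_neighbor(adj, u, v):
    """Vertices adjacent to both u and v (two-pointer merge of sorted lists)."""
    a, b = adj[u], adj[v]
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Undirected example graph with sorted neighbor lists (hypothetical data).
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
shared = common_neighbor(adj, 0, 2)
```

Keeping neighbor lists sorted makes both operations cheap without extra memory, which matters when many subgraphs query the same input graph.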
Further, data can be copied bidirectionally between the CPU and the GPU, and CPU calculation and GPU calculation can be executed simultaneously.
The invention has the following beneficial effects: by taking the subgraph as the center and adopting the Grow-Cull execution model, the graph mining problem is abstracted and well described, and the method can be applied to common graph mining applications; and by designing the graph mining system around cooperative CPU and GPU computing, the computing resources of both the CPU and the GPU are fully used to accelerate the graph mining process and relieve memory pressure.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of an execution flow of a graph mining system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a system providing a user programming interface according to one embodiment of the invention.
FIG. 3 is a diagram illustrating an abstraction of a graph mining application, in accordance with an embodiment of the present invention.
FIG. 4 is a system architecture diagram according to an embodiment of the present invention.
FIG. 5 is a system execution pipeline diagram of one embodiment of the present invention.
Detailed Description
The GPU-based high-performance graph mining method shown in FIG. 1 comprises the following steps:
S1: constructing a corresponding search space according to the graph mining application; different graph mining applications have different search spaces. In the general case, the search space of an application is the whole large graph, and the system mines the required subgraphs from that graph.
S2: selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
S3: taking the search space and the candidate subgraph set as the input of a Grow-Cull execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening the qualified subgraphs from the intermediate subgraph set through the Cull operation to obtain a new candidate subgraph set; during the Cull operation, whether the whole subgraph meets the condition must be checked strictly, and each qualified subgraph is inserted into the new candidate subgraph set;
S4: judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all the subgraphs required by the user; if not, overwriting the previous round's candidate subgraph set with the new candidate subgraph set as the input of the Grow operation, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are found.
Expanding the candidate subgraph set through the Grow operation to obtain the intermediate subgraph set specifically includes:
S31: performing vertex extension or edge extension on the subgraphs in the candidate subgraph set;
S32: checking the validity of the vertex extension or edge extension;
S33: deleting the subgraphs whose extension is invalid, and generating the intermediate subgraph set from the subgraphs whose extension is valid.
The method uses a cooperative GPU and CPU computing architecture to execute graph mining: the GPU executes the Grow operation and the CPU executes the Cull operation; after executing the Grow operation, the GPU copies the resulting intermediate subgraph set to the CPU, which executes the Cull operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current iteration is copied to the GPU each time to execute the Grow operation. Computation and transfer latency are hidden by pipelining.
In addition, as shown in FIG. 4, the invention also provides a GPU-based high-performance graph mining system, which comprises a GPU and a CPU; the GPU and the CPU perform graph mining through the following steps:
S1: constructing, by the CPU, a corresponding search space according to the graph mining application; selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
S2: copying the candidate subgraph set to the GPU, executing the Grow operation on the GPU, and expanding the candidate subgraph set to obtain an intermediate subgraph set;
S3: copying the intermediate subgraph set to the CPU for storage, and having the CPU execute the Cull operation, which screens the qualified subgraphs from the intermediate subgraph set to obtain a new candidate subgraph set and stores it in CPU main memory;
S4: replacing the candidate subgraph set with the new candidate subgraph set obtained by the CPU, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are obtained.
Following steps S2 and S3, the invention can execute the whole graph mining process serially, round by round, and finally find all the subgraphs. Under this serial approach, the (n+1)-th round of iteration is executed only after steps S2 and S3 of the n-th round are complete. In this mode, however, while the GPU is performing the Grow operation the CPU is in a waiting state, because it has not yet received data and has nothing to process. Likewise, the GPU is idle while the CPU is performing the Cull operation.
To prevent memory overflow, the system copies only a portion of the current round's data to the GPU for expansion at a time, because the GPU generates a large number of candidates when performing the Grow operation. Therefore, if the data of the n-th round is divided into k pieces, steps S2 and S3 must be executed k times before the (n+1)-th round can begin. Borrowing the idea of a pipeline, the steps of these k passes within the n-th round can be overlapped. That is, in the n-th round, while the CPU checks the i-th piece of data (the Cull operation), the GPU can receive the (i+1)-th piece and expand it (the Grow operation). After these two steps finish, thanks to the GPU's dual copy engines, the D2H (device-to-host) transfer of the expanded candidates of piece i+1 can overlap with the H2D (host-to-device) transfer of piece i+2, which is still to be expanded. The execution pipeline for one round of iteration is shown in FIG. 5. The first piece of initial data must first be transferred to the GPU, which performs the Grow operation on it to generate candidate data. A barrier is required here until the Grow operation finishes. When that data is transferred back to the CPU, the pipeline starts and the second piece of initial data begins its transfer to the GPU; because computation can start only after the transfer completes, a barrier is also required until the transfer finishes. Once the CPU and the GPU have each received the data they need to process, the two devices execute the Cull operation and the Grow operation respectively, and the next transfer cannot start until the slower of the two has finished computing.
With this pipelining, the system makes full use of the computing power of both the CPU and the GPU, and transfer time is saved because the two copy directions proceed simultaneously.
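By way of illustration only, the overlap between the Grow stage and the Cull stage described above can be sketched as a two-stage producer-consumer pipeline. This is a structural sketch under stated assumptions: a worker thread stands in for the GPU, a bounded queue stands in for the device-to-host transfer buffer, and the names `split`, `pipelined_round`, `grow_piece`, and `cull_piece` are hypothetical. Python threads do not run CPU-bound code in parallel, so the sketch shows the scheduling structure rather than a real speedup; an actual system would rely on GPU kernels and asynchronous copies.

```python
import queue
import threading


def split(items, k):
    """Split items into k nearly equal contiguous pieces."""
    n = len(items)
    return [items[i * n // k:(i + 1) * n // k] for i in range(k)]


def pipelined_round(candidates, k, grow_piece, cull_piece):
    """Run one round: Grow on piece i+1 proceeds while Cull handles piece i."""
    handoff = queue.Queue(maxsize=1)       # stands in for the D2H buffer

    def grow_stage():                      # plays the role of the GPU
        for piece in split(candidates, k):
            handoff.put(grow_piece(piece))
        handoff.put(None)                  # end-of-round marker

    worker = threading.Thread(target=grow_stage)
    worker.start()
    survivors = []
    while True:                            # plays the role of the CPU
        grown = handoff.get()
        if grown is None:
            break
        survivors.extend(cull_piece(grown))
    worker.join()
    return survivors


# Toy stages (hypothetical): Grow makes two candidates per item, Cull keeps even ones.
survivors = pipelined_round(
    list(range(8)), k=4,
    grow_piece=lambda piece: [x * 10 + d for x in piece for d in (0, 1)],
    cull_piece=lambda grown: [s for s in grown if s % 2 == 0],
)
```

The bounded queue plays the part of the barrier: the Grow stage cannot run more than one piece ahead of the Cull stage, which also bounds how much intermediate data is in flight.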
FIG. 2 is a schematic diagram, in accordance with an embodiment of the present invention, of the user-customizable programming interfaces that the graph mining system provides; the user can develop a graph mining application merely by overriding a few interfaces in Grow and Cull; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the Cull phase; wherein:
the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed; the default extension behavior is to access all neighbors of every qualifying vertex.
The TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing pruning operations that accelerate the search; by default it performs only the most basic isomorphism detection and no pruning.
The TOCHECK interface is used for taking the subgraph as input and judging the relationship between the newly added vertex and all the vertices in the current subgraph; the user can describe the shape of the subgraph through TOCHECK.
The FOREDGE interface is used for customizing a uniform edge relation; in some cases it is only necessary to determine whether the newly added vertex and the other vertices satisfy one consistent relationship.
The TOCOMPARE interface is intended mainly for applications that have a query subgraph, and is used for judging whether the candidate subgraph and the query subgraph are isomorphic.
Additionally, the user may invoke the IS_CONNECT interface and the COMMON_NEIGHBOR interface to determine how the vertices in the input graph are connected.
The IS_CONNECT interface is used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface is used mainly for finding the common neighbors of two vertices.
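By way of illustration only, the five Grow and Cull interfaces described above could be exposed as overridable hooks with default behavior, as in the sketch below. Everything here is an assumption made for the sketch: the class name `MiningApp`, the method signatures, the `step` driver, the convention that the last element of a subgraph tuple is the newly added vertex, the use of a "same vertex set already seen" test as a simplified stand-in for basic isomorphism detection, and the choice to let the default `to_check` apply `for_edge` uniformly.

```python
class MiningApp:
    """Hypothetical base class; hook names mirror the interfaces in the text."""

    # Grow-phase hooks
    def to_expand(self, adj, sub):
        """Default: every neighbor of every vertex in the subgraph."""
        return sorted({u for v in sub for u in adj[v]} - set(sub))

    def to_add(self, adj, sub, u, seen):
        """Default: drop repeats of an already produced vertex set, no pruning."""
        key = frozenset(sub + (u,))
        if key in seen:
            return False
        seen.add(key)
        return True

    # Cull-phase hooks
    def for_edge(self, adj, new, old):
        """Uniform relation between the new vertex and one older vertex."""
        return True

    def to_check(self, adj, sub):
        """Default: apply for_edge between the new vertex and all others."""
        return all(self.for_edge(adj, sub[-1], v) for v in sub[:-1])

    def to_compare(self, adj, sub, query):
        """Default: no query subgraph, so accept."""
        return True

    def step(self, adj, candidates, query=None):
        """One Grow pass followed by one Cull pass."""
        seen = set()
        grown = [sub + (u,) for sub in candidates
                 for u in self.to_expand(adj, sub)
                 if self.to_add(adj, sub, u, seen)]
        return [s for s in grown
                if self.to_check(adj, s) and self.to_compare(adj, s, query)]


class FullyConnectedApp(MiningApp):
    """Overrides one hook: the new vertex must be adjacent to every older one."""

    def for_edge(self, adj, new, old):
        return old in adj[new]


adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2, 4}, 4: {3}}
app = FullyConnectedApp()
pairs = app.step(adj, [(v,) for v in sorted(adj)])
triples = app.step(adj, pairs)
```

Only `for_edge` is overridden in the example, which reflects the stated goal that an application is developed by rewriting a small number of interfaces.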
The data can be copied bidirectionally between the CPU and the GPU simultaneously, the CPU calculation and the GPU calculation can be executed simultaneously, and the calculation resources of the CPU and the GPU can be fully utilized to accelerate the graph mining process and relieve the memory pressure.
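By way of illustration only, the benefit of overlapping the two copy directions and the two computations can be estimated with a simple timing model that follows the pipeline schedule described earlier in this description: the first piece fills the pipeline, each later step costs the slower of the two overlapped transfers plus the slower of the two overlapped computations, and the last piece drains the pipeline. The function names and all timing figures are hypothetical and are not measurements.

```python
def serial_time(h2d, grow, d2h, cull):
    """Every transfer and computation of every piece runs one after another."""
    return sum(h2d) + sum(grow) + sum(d2h) + sum(cull)


def pipelined_time(h2d, grow, d2h, cull):
    """Overlap D2H(i) with H2D(i+1), and Cull(i) with Grow(i+1)."""
    k = len(grow)
    t = h2d[0] + grow[0]                  # fill: first piece, nothing to overlap
    for i in range(k - 1):
        t += max(d2h[i], h2d[i + 1])      # both copy directions at once
        t += max(cull[i], grow[i + 1])    # CPU Cull and GPU Grow at once
    t += d2h[k - 1] + cull[k - 1]         # drain: last piece
    return t


# Four pieces with identical, made-up costs in arbitrary time units.
k = 4
h2d, grow, d2h, cull = [1] * k, [4] * k, [2] * k, [3] * k
serial = serial_time(h2d, grow, d2h, cull)        # 4 * (1 + 4 + 2 + 3) = 40
pipelined = pipelined_time(h2d, grow, d2h, cull)  # 5 + 3 * (2 + 4) + 5 = 28
```

Under this model each steady-state step is bounded by the slower device, so the gain is largest when the Grow and Cull costs per piece are balanced.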
FIG. 3 is a diagram illustrating an abstraction of graph mining applications in accordance with an embodiment of the present invention. The following describes how three common graph mining applications are abstracted using the Grow and Cull interfaces: triangle counting, maximal clique enumeration, and subgraph matching. Assume vertex extension is used. For triangle counting, Grow mainly accesses the neighbors of all vertices in the current subgraph, without considering the labels or attributes of the neighbors, while Cull filters out every subgraph that is not a triangle. For maximal clique enumeration, the Grow operation is similar to that of triangle counting, and the Cull operation keeps only the fully connected subgraphs. For subgraph matching, Grow considers, in addition to the connectivity of the newly added vertex, whether the vertex label is consistent with the query graph and whether the vertex's connections are consistent with those of the corresponding vertex in the query graph. In Cull, it is necessary to determine whether the candidate subgraph is isomorphic to the query graph.
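By way of illustration only, two of the three applications above can be written with one shared Grow function and different Cull predicates, as sketched below. The function names, the `frozenset` representation of a subgraph, and the rule "a clique is maximal when none of its extensions survives Cull" are assumptions made for this sketch; subgraph matching is omitted because it additionally needs a label check and an isomorphism test.

```python
from itertools import combinations


def fully_connected(adj, sub):
    """Cull predicate: every pair of vertices in sub is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(sub, 2))


def grow(adj, candidates):
    """Vertex extension: add each neighboring vertex; the set removes repeats."""
    return {sub | {u} for sub in candidates
            for u in {w for v in sub for w in adj[v]} - sub}


def count_triangles(adj):
    cands = {frozenset([v]) for v in adj}
    for _ in range(2):                      # two extensions: 1 -> 3 vertices
        cands = grow(adj, cands)
    return sum(1 for s in cands if fully_connected(adj, s))   # Cull


def maximal_cliques(adj):
    cands = {frozenset([v]) for v in adj}
    maximal = []
    while cands:
        grown = {s for s in grow(adj, cands) if fully_connected(adj, s)}  # Cull
        # A clique is maximal when none of its extensions survived Cull.
        maximal += [s for s in cands if not any(s < g for g in grown)]
        cands = grown
    return maximal


# Two triangles sharing edge 1-2, plus a pendant edge 3-4 (hypothetical data).
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2, 4}, 4: {3}}
n_triangles = count_triangles(adj)
cliques = sorted(sorted(c) for c in maximal_cliques(adj))
```

Triangle counting applies Cull once at the end, whereas clique enumeration applies it every round, so non-cliques are discarded before they can be extended further.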
The method adopts a cooperative GPU and CPU computing architecture: GPU multithreading performs the graph mining operations to improve search efficiency, while CPU memory stores the large number of intermediate subgraphs generated during graph mining. The system architecture is described in terms of the Grow-Cull execution model. While the system runs, a portion of the subgraphs is copied to the GPU each time to execute the Grow operation, which evaluates the relationship between each subgraph and a vertex or edge, and the generated candidate subgraphs are copied to CPU memory. To check the validity of the candidate subgraphs, CPU multithreading executes the Cull operation on them, the qualified subgraphs are stored in CPU main memory, and the system repeats this iteration. Borrowing the idea of a pipeline, CPU computation and GPU computation can run simultaneously during iteration, and the two directions of data copying can also run simultaneously, so that computation and transfer latency are hidden.
Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions are covered by the claims of the present invention.

Claims (7)

1. A GPU-based high-performance graph mining method, characterized by comprising the following steps:
constructing a corresponding search space according to the graph mining application;
selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
taking the search space and the candidate subgraph set as the input of a Grow-Cull execution model, expanding the candidate subgraph set through the Grow operation to obtain an intermediate subgraph set, and screening the qualified subgraphs from the intermediate subgraph set through the Cull operation to obtain a new candidate subgraph set;
judging whether the new candidate subgraph set meets the user-specified condition; if so, ending the operation, the new candidate subgraph set then containing all the subgraphs required by the user; if not, overwriting the previous round's candidate subgraph set with the new candidate subgraph set as the input of the Grow operation, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are found.
2. The GPU-based high-performance graph mining method according to claim 1, wherein expanding the candidate subgraph set through the Grow operation to obtain the intermediate subgraph set specifically comprises:
performing vertex extension or edge extension on the subgraphs in the candidate subgraph set;
checking the validity of the vertex extension or edge extension;
and deleting the subgraphs whose extension is invalid, and generating the intermediate subgraph set from the subgraphs whose extension is valid.
3. The GPU-based high-performance graph mining method according to claim 1 or 2, characterized in that a GPU executes the Grow operation and a CPU executes the Cull operation; after executing the Grow operation, the GPU copies the resulting intermediate subgraph set to the CPU, which executes the Cull operation, and the intermediate subgraph set is stored in CPU memory; when the Grow operation and the Cull operation are executed iteratively, only the new candidate subgraph set obtained in the current iteration is copied to the GPU each time to execute the Grow operation.
4. A GPU-based high-performance graph mining system, characterized by comprising a GPU and a CPU; the GPU and the CPU perform graph mining through the following steps:
constructing, by the CPU, a corresponding search space according to the graph mining application; selecting a plurality of vertices or edges in the search space as candidates according to the subgraph information provided by the user, and constructing an initial candidate subgraph set;
copying the candidate subgraph set to the GPU, executing the Grow operation on the GPU, and expanding the candidate subgraph set to obtain an intermediate subgraph set;
copying the intermediate subgraph set to the CPU for storage, and having the CPU execute the Cull operation, which screens the qualified subgraphs from the intermediate subgraph set to obtain a new candidate subgraph set and stores it in CPU main memory;
and replacing the candidate subgraph set with the new candidate subgraph set obtained by the CPU, and iteratively executing the Grow operation and the Cull operation until all subgraphs meeting the user-specified condition are obtained.
5. The GPU-based high-performance graph mining system according to claim 4, further comprising a user programming interface; the user programming interface comprises a TOEXPAND interface and a TOADD interface applied in the Grow phase, and a FOREDGE interface, a TOCHECK interface and a TOCOMPARE interface applied in the Cull phase; wherein,
the TOEXPAND interface is used for determining which vertex of the subgraph is to be processed and how the neighbors of that vertex are accessed;
the TOADD interface is used for eliminating isomorphic redundancy among subgraphs and for describing the pruning operations that accelerate the search;
the TOCHECK interface takes a subgraph as input and judges the relationship between the newly added vertex and all the vertices in the current subgraph;
the FOREDGE interface is used for customizing a unified edge relation;
and the TOCOMPARE interface is used for judging whether the candidate subgraph and the query subgraph are isomorphic.
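How such hooks might plug into the two phases can be illustrated with a small Python sketch for clique enumeration. The claim names the interfaces but this section gives no signatures, so every signature below is hypothetical, and `to_expand`, `to_add`, `to_check`, `grow`, `cull` and `mine_cliques` are illustrative names only. FOREDGE and TOCOMPARE are not modelled here.

```python
def to_expand(sub):
    """Hypothetical TOEXPAND: pick the vertices whose neighbors are visited (here: the last one)."""
    return (sub[-1],)

def to_add(sub, v):
    """Hypothetical TOADD: canonical-order pruning, accept v only if it exceeds the last vertex.
    This generates each vertex set once instead of once per ordering."""
    return v > sub[-1]

def to_check(graph, sub, v):
    """Hypothetical TOCHECK: the newly added vertex must be adjacent to every vertex in sub."""
    return all(v in graph[u] for u in sub)

def grow(graph, candidates):
    """Grow phase: expand each candidate tuple by one vertex, guided by TOEXPAND and TOADD."""
    out = []
    for sub in candidates:
        for u in to_expand(sub):
            for v in sorted(graph[u]):
                if v not in sub and to_add(sub, v):
                    out.append(sub + (v,))
    return out

def cull(graph, intermediate):
    """Cull phase: keep subgraphs whose newest vertex passes TOCHECK against the rest."""
    return [sub for sub in intermediate if to_check(graph, sub[:-1], sub[-1])]

def mine_cliques(graph, k):
    candidates = [(v,) for v in sorted(graph)]
    for _ in range(k - 1):
        candidates = cull(graph, grow(graph, candidates))
    return candidates
```

The design point the sketch tries to show is the split of responsibilities: redundancy is pruned while growing, and structural checks on the new vertex are deferred to the filtering phase.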
6. The GPU-based high-performance graph mining system according to claim 5, wherein the user programming interface further comprises:
the IS_CONNECT interface, used for judging whether two vertices are connected by an edge;
the COMMON_NEIGHBOR interface, used for finding the common neighbors of two vertices.
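The two helper interfaces of claim 6 correspond to standard adjacency queries. A minimal Python sketch follows, assuming the graph is stored as sorted adjacency lists; that storage format, and the function names `is_connect` and `common_neighbor`, are assumptions for illustration, since the claim does not specify a data layout.

```python
from bisect import bisect_left

def is_connect(adj, u, v):
    """Edge test: binary search for v in the sorted neighbor list of u."""
    nbrs = adj[u]
    i = bisect_left(nbrs, v)
    return i < len(nbrs) and nbrs[i] == v

def common_neighbor(adj, u, v):
    """Common neighbors: merge-style intersection of two sorted neighbor lists."""
    a, b = adj[u], adj[v]
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out
```

With sorted lists, the edge test costs a logarithmic number of comparisons in the degree of `u`, and the intersection is linear in the sum of the two degrees.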
7. The GPU-based high-performance graph mining system according to any one of claims 4-6, characterized in that data can be copied in both directions between the CPU and the GPU simultaneously, and CPU computation and GPU computation can be performed simultaneously.
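The overlap described in claim 7 can be modelled as a two-stage pipeline. The Python sketch below is only a scheduling illustration under stated assumptions: the candidate set is assumed to be split into independent batches (the claim does not say how work is partitioned), a single worker thread stands in for the GPU, and `copy_to_gpu`, `copy_to_cpu`, `gpu_grow` and `cpu_cull` are hypothetical placeholders operating on integers rather than subgraphs. Python threads do not give true parallelism for CPU-bound work, so this shows the ordering of operations, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def copy_to_gpu(batch):
    """Placeholder for a host-to-device copy."""
    return list(batch)

def copy_to_cpu(batch):
    """Placeholder for a device-to-host copy."""
    return list(batch)

def gpu_grow(batch):
    """Placeholder Grow: each item yields two expanded items."""
    return [y for x in batch for y in (x * 10, x * 10 + 1)]

def cpu_cull(batch):
    """Placeholder Cull: keep the even items."""
    return [x for x in batch if x % 2 == 0]

def grow_on_gpu(batch):
    return gpu_grow(copy_to_gpu(batch))

def pipeline(batches):
    results = []
    with ThreadPoolExecutor(max_workers=1) as gpu:  # one worker models the GPU
        in_flight = None
        for batch in batches:
            # Queue the next batch for upload and Grow before handling the previous result,
            # so the worker can process it while the main thread copies back and culls.
            submitted = gpu.submit(grow_on_gpu, batch)
            if in_flight is not None:
                results.append(cpu_cull(copy_to_cpu(in_flight.result())))
            in_flight = submitted
        if in_flight is not None:
            results.append(cpu_cull(copy_to_cpu(in_flight.result())))
    return results
```

On NVIDIA hardware, for example, this kind of overlap is commonly obtained with CUDA streams and asynchronous copies from pinned host memory; whether the patented system does so is not stated in this section.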
CN202011078543.7A 2020-05-18 2020-10-10 High-performance graph mining method and system based on GPU Active CN111984833B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010424111.0A CN111625691A (en) 2020-05-18 2020-05-18 GPU-based high-performance graph mining method and system
CN2020104241110 2020-05-18

Publications (2)

Publication Number Publication Date
CN111984833A (en) 2020-11-24
CN111984833B (en) 2023-08-01

Family

ID=72259804

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202010424111.0A Pending CN111625691A (en) 2020-05-18 2020-05-18 GPU-based high-performance graph mining method and system
CN202010866763.XA Withdrawn CN111831864A (en) 2020-05-18 2020-08-25 GPU-based high-performance graph mining method and system
CN202011078543.7A Active CN111984833B (en) 2020-05-18 2020-10-10 High-performance graph mining method and system based on GPU

Country Status (1)

Country Link
CN (3) CN111625691A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661757B (en) * 2020-12-22 2024-04-19 East China Normal University Subgraph matching method and system based on heterogeneous computer FPGA

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559016A (en) * 2013-10-23 2014-02-05 Jiangxi University of Science and Technology Frequent subgraph mining method based on graphics processor parallel computing
US20150145866A1 (en) * 2013-11-25 2015-05-28 Xerox Corporation Method and apparatus for graphical processing unit (gpu) accelerated large-scale web community detection
US20180121601A1 (en) * 2016-10-28 2018-05-03 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing
US20180365019A1 (en) * 2017-06-20 2018-12-20 Palo Alto Research Center Incorporated System and method for hybrid task management across cpu and gpu for efficient data mining
CN108170519A (en) * 2018-01-25 2018-06-15 Shanghai Jiao Tong University Systems, devices and methods for optimizing scalable GPU virtualization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
P. B. Jayaraj: "A GPU based maximum common subgraph algorithm for drug discovery application", 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, pages 580 - 588 *
Yang Bo: "Research on key technologies of large-scale graph data mining based on GPU heterogeneous architecture", China Doctoral Dissertations Full-text Database (Information Science and Technology), no. 02, pages 138 - 62 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
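CN112934758A (en) * 2020-12-14 2021-06-11 Western Institute of Advanced Technology, Institute of Computing, Chinese Academy of Sciences Coal sorting hand-dialing control method based on image recognition
CN112934758B (en) * 2020-12-14 2022-10-14 Western Institute of Advanced Technology, Institute of Computing, Chinese Academy of Sciences Coal sorting hand-dialing control method based on image recognition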

Also Published As

Publication number Publication date
CN111984833B (en) 2023-08-01
CN111625691A (en) 2020-09-04
CN111831864A (en) 2020-10-27

Similar Documents

Publication Publication Date Title
Pradhan et al. Finding all-pairs shortest path for a large-scale transportation network using parallel Floyd-Warshall and parallel Dijkstra algorithms
Serafini et al. Scalable graph neural network training: The case for sampling
Geist et al. Major computer science challenges at exascale
US20130290919A1 (en) Selective execution for partitioned parallel simulations
Yuan et al. PathGraph: A path centric graph processing system
CN111984833B (en) High-performance graph mining method and system based on GPU
US11456946B2 (en) Regular path queries (RPQS) for distributed graphs
Souravlas et al. Hybrid CPU-GPU community detection in weighted networks
Zheng et al. Linked data processing for human-in-the-loop in cyber–physical systems
Nath et al. Massively parallel algorithms for computing TIN DEMs and contour trees for large terrains
Liu et al. Closeness centrality on uncertain graphs
Hu et al. A gpu-based graph pattern mining system
Kaepke et al. A comparative evaluation of big data frameworks for graph processing
Wu et al. A Comprehensive Survey on GNN Characterization
Yasar et al. PGAbB: A Block-Based Graph Processing Framework for Heterogeneous Platforms
Shao et al. Large-scale Graph Analysis: System, Algorithm and Optimization
Zhang et al. Bring orders into uncertainty: enabling efficient uncertain graph processing via novel path sampling on multi-accelerator systems
Zhang et al. Bigflow: A General Optimization Layer for Distributed Computing Frameworks
Madduri A high-performance framework for analyzing massive complex networks
Li et al. Concurrent hybrid breadth-first-search on distributed powergraph for skewed graphs
Grossman et al. HOOVER: Leveraging OpenSHMEM for High Performance, Flexible Streaming Graph Applications
Cai et al. PimPam: Efficient Graph Pattern Matching on Real Processing-in-Memory Hardware
CN115687707A (en) Acceleration subgraph matching method based on CPU-FPGA hybrid platform
Zhang et al. A high-performance dataflow-centric optimization framework for deep learning inference on the edge
Wan et al. An order sampling processing-in-memory architecture for approximate graph pattern mining

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant