CN116974745A

CN116974745A - Implementation method of graph mining algorithm and computer equipment

Info

Publication number: CN116974745A
Application number: CN202310410551.4A
Authority: CN
Inventors: 邹磊; 胡琳
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2023-04-17
Filing date: 2023-04-17
Publication date: 2023-10-31

Abstract

The disclosure provides a method for implementing a graph mining algorithm and a computer device, and belongs to the technical field of computers. The method comprises the following steps: determining, based on a preset graph mining algorithm, the reference nodes required by the iteration processing corresponding to the current iteration number; acquiring the adjacency lists corresponding to the plurality of reference nodes stored in a memory; performing the iteration processing corresponding to the current iteration number based on the preset graph mining algorithm and the adjacency lists corresponding to the plurality of reference nodes to obtain an intermediate output subgraph; sending the intermediate output subgraph to the memory for storage; and determining whether an iteration end condition is met, and if so, determining the intermediate output subgraph corresponding to the current iteration number as output data, and if not, adding one to the current iteration number and returning to the step of determining the reference nodes required by the iteration processing corresponding to the current iteration number. With the disclosure, the scale of the data graph that the GPU can process is increased, and the application range of the graph mining algorithm is widened.

Description

Implementation method of graph mining algorithm and computer equipment

Technical Field

The disclosure relates to the technical field of computers, and in particular to a method for implementing a graph mining algorithm and a computer device.

Background

Graph mining algorithms are an important class of graph algorithms and include subgraph matching, frequent pattern mining and the like. The purpose of such algorithms is to find required information in a given data graph: for example, subgraph matching finds all subgraphs of the data graph that are isomorphic to a given query graph, and frequent pattern mining finds subgraphs whose number of occurrences in the data graph exceeds a threshold. Graph mining algorithms play a key role in fields such as chemical pharmaceuticals, spam detection and community discovery.

Graph mining algorithms are computationally intensive and take a long time to run. The multi-core parallelism of a CPU (Central Processing Unit) is limited and cannot bring a good performance improvement to such algorithms, so a GPU (Graphics Processing Unit) with high parallel capacity is often used to implement them.

However, each iteration performed while executing a graph mining algorithm generates a large number of intermediate results, namely a large number of intermediate output subgraphs. The size of the intermediate results is often tens of times that of the data graph or more, while the video memory of the GPU is relatively small. This limits the scale of the data graph that the GPU can process and greatly reduces the application range of the graph mining algorithm.

Disclosure of Invention

The embodiments of the disclosure provide a method for implementing a graph mining algorithm, which can solve the technical problem in the prior art that the application range of the graph mining algorithm is reduced due to the large data volume of the intermediate output subgraphs.

In a first aspect, a method for implementing a graph mining algorithm is provided, where the method includes:

determining a reference node required by iterative processing corresponding to the current iteration times from a plurality of nodes included in an intermediate output subgraph corresponding to the previous iteration times based on a preset graph mining algorithm;

accessing a memory, and acquiring adjacency lists corresponding to a plurality of reference nodes included in a data graph stored in the memory, wherein the adjacency list corresponding to the reference nodes includes other nodes with edge relations with the reference nodes in the data graph;

performing iteration processing corresponding to the current iteration times based on the preset graph mining algorithm and the adjacency lists corresponding to the plurality of reference nodes to obtain an intermediate output subgraph corresponding to the current iteration times;

transmitting the intermediate output subgraph corresponding to the current iteration times to the memory for storage;

and determining whether an iteration end condition is met; if so, determining the intermediate output subgraph corresponding to the current iteration times as output data; if not, adding one to the current iteration times and returning to the step of determining the reference nodes required by the iteration processing corresponding to the current iteration times.

In one possible implementation, the memory includes a plurality of memory pages, the memory pages being used to store the data graph and the intermediate output data;

before the memory is accessed and the adjacency list corresponding to the plurality of reference nodes included in the data graph stored in the memory is obtained, the method further comprises:

determining the expected access times of the adjacency list corresponding to each reference node in the iteration process corresponding to the current iteration times;

determining a plurality of target memory pages where the adjacency lists corresponding to the plurality of reference nodes are located;

determining, based on the expected access times corresponding to each reference node, the access heat of each target memory page where the adjacency lists corresponding to the plurality of reference nodes are located;

determining a target access mode of the adjacency list corresponding to each reference node from a plurality of preset access modes based on the access heat of each target memory page and a preset heat threshold;

the accessing the memory and obtaining the adjacency list corresponding to the plurality of reference nodes included in the data graph stored in the memory includes:

and acquiring the adjacency list corresponding to each reference node stored in the memory according to the target access mode of the adjacency list corresponding to each reference node.

In one possible implementation manner, the determining, based on the expected access times corresponding to each reference node, the access heat of each target memory page where the adjacency tables corresponding to the plurality of reference nodes are located includes:

for each reference node stored in each target memory page, determining the product of the data volume of an adjacency list corresponding to the reference node stored in the target memory page and the expected access frequency corresponding to the reference node as the expected access data volume corresponding to the current iteration frequency of the reference node stored in the target memory page;

for each target memory page, determining the sum of expected access data amounts corresponding to at least one reference node in the target memory page as the expected access data amount corresponding to the current iteration number of the target memory page;

and determining the access heat of the target memory page based on the expected access data amount corresponding to the current iteration number of the target memory page.
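The per-page computation described above (product per reference node, then sum per page) can be sketched in Python as follows. The function name `expected_page_volume`, the dictionary-based inputs and the sample numbers are assumptions made for illustration, and the sketch further assumes that each adjacency list lies entirely within one memory page.

```python
def expected_page_volume(page_of, adj_bytes, expected_accesses):
    """Expected access data amount of each target memory page for one iteration.

    page_of[node]            -> page holding the node's adjacency list (assumed single page)
    adj_bytes[node]          -> data volume of the node's adjacency list, in bytes
    expected_accesses[node]  -> expected access times of that adjacency list
    """
    volume = {}
    for node, count in expected_accesses.items():
        page = page_of[node]
        # Product of adjacency-list data volume and expected access times ...
        node_volume = adj_bytes[node] * count
        # ... summed over all reference nodes stored in the same page.
        volume[page] = volume.get(page, 0) + node_volume
    return volume


# Hypothetical reference nodes 1, 2 (page 0) and 3 (page 1).
page_of = {1: 0, 2: 0, 3: 1}
adj_bytes = {1: 40, 2: 16, 3: 400}
expected_accesses = {1: 3, 2: 5, 3: 1}
volumes = expected_page_volume(page_of, adj_bytes, expected_accesses)
print(volumes)
```

In this example page 0 accumulates 40 * 3 + 16 * 5 = 200 bytes of expected traffic and page 1 accumulates 400 bytes.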

In one possible implementation manner, the determining the access heat of the target memory page based on the expected access data amount corresponding to the current iteration number of the target memory page includes:

determining the amount of data accessed in the target memory page during each iteration process before the current iteration times;

and determining the access heat of the target memory page based on the expected access data amount corresponding to the current iteration number of the target memory page and the access data amount corresponding to each iteration process of the target memory page before the current iteration number.
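The text does not state how the expected access data amount and the historical access data amounts are combined. One possible combination, shown here purely as an assumption, is an exponentially decayed sum in which more recent iterations weigh more. The function name `access_heat` and the `decay` parameter are hypothetical.

```python
def access_heat(expected_now, past_volumes, decay=0.5):
    """Combine the expected volume of the current iteration with past volumes.

    past_volumes is ordered oldest first. The decay weighting is an assumption
    of this sketch; the source only says that both quantities are used.
    """
    heat = float(expected_now)
    weight = decay
    for volume in reversed(past_volumes):  # most recent iteration first
        heat += weight * volume
        weight *= decay
    return heat


# Page accessed 100 bytes two iterations ago, 200 bytes last iteration,
# and expected to serve 400 bytes in the current iteration.
heat = access_heat(400, [100, 200])
print(heat)
```

With `decay=0.5` the example evaluates to 400 + 0.5 * 200 + 0.25 * 100 = 525.0.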

In one possible implementation manner, the determining, based on the access heat of each target memory page and a preset heat threshold, a target access manner of the adjacency table corresponding to each reference node from a plurality of preset access manners includes:

for each target memory page, when the access heat of the target memory page is greater than or equal to the preset heat threshold, acquiring the target memory page from the memory to a video memory for caching, and acquiring an adjacency list corresponding to at least one reference node stored in the target memory page from the video memory;

and when the access heat of the target memory page is smaller than the preset heat threshold, acquiring an adjacency list corresponding to at least one reference node stored in the target memory page from the memory.
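The threshold rule above can be sketched as a small selection function. The enum `AccessMode`, the function `choose_access_modes` and the sample heat values are assumptions introduced for this example; the actual transfer mechanisms between memory and video memory are not modelled.

```python
from enum import Enum


class AccessMode(Enum):
    CACHE_PAGE_IN_VIDEO_MEMORY = "cache"  # copy the whole page to video memory, then read it there
    READ_FROM_HOST_MEMORY = "direct"      # read the adjacency list straight from memory


def choose_access_modes(page_heat, page_of, heat_threshold):
    """Pick an access mode for each reference node's adjacency list."""
    modes = {}
    for node, page in page_of.items():
        if page_heat[page] >= heat_threshold:
            modes[node] = AccessMode.CACHE_PAGE_IN_VIDEO_MEMORY
        else:
            modes[node] = AccessMode.READ_FROM_HOST_MEMORY
    return modes


# Hypothetical heats: page 0 is cold, page 1 is hot.
page_heat = {0: 200, 1: 400}
page_of = {1: 0, 2: 0, 3: 1}
modes = choose_access_modes(page_heat, page_of, heat_threshold=300)
for node in sorted(modes):
    print(node, modes[node].name)
```

Caching only hot pages keeps the limited video memory for data that is read repeatedly, while rarely used adjacency lists are fetched on demand.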

In one possible implementation manner, the GPU is provided with a plurality of working thread groups, and the working thread groups correspond one-to-one to the plurality of intermediate output subgraphs corresponding to the previous iteration times;

the step of performing iteration processing corresponding to the current iteration number based on the preset graph mining algorithm and the adjacency list corresponding to the plurality of reference nodes to obtain an intermediate output subgraph corresponding to the current iteration number, including:

and for each working thread group, performing iterative processing corresponding to the current iteration times by the working thread group based on the preset graph mining algorithm and an adjacency list corresponding to each reference node in the intermediate output subgraph corresponding to the working thread group, so as to obtain the intermediate output subgraph corresponding to the working thread group.

In one possible implementation manner, the obtaining the adjacency list corresponding to the plurality of reference nodes included in the data graph stored in the memory includes:

for each working thread group, the working thread group acquires intersection data of the adjacency lists corresponding to a plurality of reference nodes in the intermediate output subgraph corresponding to the working thread group;

the step of performing iteration processing corresponding to the current iteration times based on the preset graph mining algorithm and the adjacency list corresponding to the plurality of reference nodes to obtain an intermediate output subgraph corresponding to the current iteration times of the working thread group, including:

and for each working thread group, performing, by the working thread group, the iteration processing corresponding to the current iteration times based on the preset graph mining algorithm and the intersection data of the adjacency lists corresponding to the plurality of reference nodes in the intermediate output subgraph corresponding to the working thread group, to obtain the intermediate output subgraph corresponding to the working thread group.
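A sequential Python sketch of the intersection that one working thread group would compute is given below. It assumes the adjacency lists are stored sorted by node identity, which the text does not state; the function names `intersect_sorted` and `intersect_all` are hypothetical, and the parallel execution on the GPU is not modelled.

```python
def intersect_sorted(a, b):
    """Two-pointer intersection of two sorted lists of node identities."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_all(adjacency_lists):
    """Intersect the adjacency lists of all reference nodes of one subgraph."""
    if not adjacency_lists:
        raise ValueError("at least one adjacency list is required")
    # Starting from the shortest list keeps the running result small.
    ordered = sorted(adjacency_lists, key=len)
    result = ordered[0]
    for other in ordered[1:]:
        result = intersect_sorted(result, other)
        if not result:
            break  # no common neighbour is possible any more
    return result


# Adjacency lists of three hypothetical reference nodes.
common = intersect_all([[1, 2, 3, 5, 8], [2, 3, 5, 7], [0, 2, 5, 8]])
print(common)
```

Only nodes that appear in every list (here 2 and 5) can be adjacent to all reference nodes, so only they need to be considered as candidates for the node to be added.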

In one possible implementation manner, the GPU is further provided with a scheduling thread, and the memory includes a plurality of memory blocks;

the step of sending the intermediate output subgraph corresponding to the current iteration times to the memory for storage includes:

for each working thread group, the working thread group sends an intermediate output subgraph of the current iteration number corresponding to the working thread group to a memory block corresponding to the working thread group for storage;

when the memory blocks corresponding to the working thread group are full, the scheduling thread allocates new memory blocks for the working thread group.
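The block-based storage described above can be simulated on the host as follows. The classes `SchedulingThread` and `WorkerGroup`, the fixed block capacity and the sample subgraphs are assumptions of this sketch, which allocates a new block lazily at the next write after a block fills up.

```python
class SchedulingThread:
    """Hypothetical stand-in for the scheduling thread: hands out fixed-size blocks."""

    def __init__(self, block_capacity):
        self.block_capacity = block_capacity
        self.blocks = []  # (group_id, block) pairs in allocation order

    def allocate(self, group_id):
        block = []
        self.blocks.append((group_id, block))
        return block


class WorkerGroup:
    """Hypothetical working thread group that writes its subgraphs into its own block."""

    def __init__(self, group_id, scheduler):
        self.group_id = group_id
        self.scheduler = scheduler
        self.block = scheduler.allocate(group_id)

    def store(self, subgraph):
        # When the current block is full, ask the scheduler for a new one.
        if len(self.block) >= self.scheduler.block_capacity:
            self.block = self.scheduler.allocate(self.group_id)
        self.block.append(subgraph)


scheduler = SchedulingThread(block_capacity=2)
group0 = WorkerGroup(0, scheduler)
group1 = WorkerGroup(1, scheduler)
for sub in [(0, 1), (0, 3), (0, 5), (0, 7), (0, 9)]:
    group0.store(sub)
group1.store((4, 1))
block_sizes = [(gid, len(block)) for gid, block in scheduler.blocks]
print(block_sizes)
```

Because each group writes only into blocks it owns, groups that produce many subgraphs simply receive more blocks and do not need to coordinate with other groups on every write.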

In one possible implementation, the preset graph mining algorithm includes at least one of an extension operator, an aggregation operator, and a filtering operator.

In a second aspect, a computer device is provided, the computer device comprising a GPU and a memory, the GPU to:

accessing the memory, and acquiring adjacency lists corresponding to a plurality of reference nodes included in a data graph stored in the memory, wherein the adjacency list corresponding to the reference nodes includes other nodes with edge relations with the reference nodes in the data graph;

The technical scheme provided by the embodiment of the disclosure has the beneficial effects that: according to the scheme, iteration processing of a preset graph mining algorithm can be performed in the GPU, the data graph and the middle output subgraph corresponding to each iteration processing are stored in the memory, when the next iteration processing is performed, the adjacency list corresponding to the reference node required by the next iteration processing is acquired from the memory, and as the memory space is tens of times or more of the video memory space, the size of the data graph which can be processed by the GPU is increased, and the application range of the graph mining algorithm is increased.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort for a person of ordinary skill in the art.

FIG. 1 is a flow chart of a method for implementing a graph mining algorithm provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of a method for determining a target access manner according to an embodiment of the present disclosure.

Detailed Description

To make the objects, technical solutions and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.

The embodiment of the disclosure provides a method for realizing a graph mining algorithm, which can be realized by a GPU in computer equipment. The computer devices may be terminals, servers, etc., the terminals may be desktop computers, notebook computers, tablet computers, cell phones, etc., the servers may be single servers or clusters of servers, etc.

The staff can build a corresponding graph mining algorithm according to the requirements, for example, when sub-graph matching is needed, a graph mining algorithm corresponding to the sub-graph matching can be built, when frequent pattern mining is needed, a graph mining algorithm corresponding to the frequent pattern mining can be built, and the like.

The output data of the graph mining algorithm may be a plurality of subgraphs that satisfy the conditions. For example, when the graph mining algorithm performs subgraph matching, the output data may be at least one subgraph of the data graph that is isomorphic to the query graph; when the graph mining algorithm performs frequent pattern mining, the output data may be a plurality of subgraphs of the data graph that share the same pattern, where the number of subgraphs sharing the pattern is greater than or equal to a preset number.

The data graph may include a plurality of nodes and an adjacency list corresponding to each node, where the adjacency list corresponding to a node includes the other nodes that have an edge relationship with the node in the data graph, so that the data graph reflects the edge relationships among the nodes. For example, in a social network data graph, each node may represent a social media user, and an edge relationship between two nodes may indicate that the two users follow each other.

Each node may include its own identity and label. Identities are unique: different nodes have different identities. Labels may be of several types, for example at least one of a person, a city, a gender and an academy. Labels are not unique: different nodes may have the same label or different labels.
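As a rough illustration of the data graph described above, the following Python sketch stores one identity, one label and one adjacency list per node. The class name `DataGraph`, its methods and the sample social network data are assumptions made for this example; the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class DataGraph:
    """Hypothetical data graph: identity -> label and identity -> adjacency list."""
    labels: dict = field(default_factory=dict)
    adjacency: dict = field(default_factory=dict)

    def add_node(self, node_id, label):
        # Identities are unique; labels may repeat across nodes.
        if node_id in self.labels:
            raise ValueError(f"duplicate identity: {node_id}")
        self.labels[node_id] = label
        self.adjacency[node_id] = []

    def add_edge(self, u, v):
        # An undirected edge puts each endpoint in the other's adjacency list.
        for a, b in ((u, v), (v, u)):
            if b not in self.adjacency[a]:
                self.adjacency[a].append(b)
                self.adjacency[a].sort()


graph = DataGraph()
for node_id, label in [(1, "person"), (2, "person"), (3, "city")]:
    graph.add_node(node_id, label)
graph.add_edge(1, 2)  # users 1 and 2 follow each other
graph.add_edge(1, 3)  # user 1 is linked to city 3
print(graph.adjacency[1])
```

Keeping each adjacency list sorted is a design choice of the sketch; it makes later intersections of adjacency lists straightforward.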

In order to obtain the plurality of sub-graphs meeting the conditions, the implementation flow of the graph mining algorithm is as follows:

A first iteration process is performed as follows: each node in the data graph is traversed and, based on the condition set by the graph mining algorithm for first-level nodes, the nodes that satisfy this condition are determined from the plurality of nodes in the data graph. These nodes are the first-level nodes, and each first-level node is an intermediate output subgraph output by the first iteration process.

Then, a second iteration process is performed as follows: based on the condition set by the graph mining algorithm for second-level nodes, the first-level node that has an edge relationship with the second-level node is determined in each intermediate output subgraph corresponding to the first iteration process; this first-level node is the reference node required by the second iteration process. The adjacency lists corresponding to the reference nodes are obtained, and the nodes that satisfy the condition for second-level nodes, namely the second-level nodes, are determined from these adjacency lists. Each intermediate output subgraph of the first iteration process for which a second-level node was determined is updated to include the first-level node and the second-level node, and each intermediate output subgraph of the first iteration process for which no second-level node was determined is deleted. The result is the set of intermediate output subgraphs corresponding to the second iteration process, each comprising a first-level node and a second-level node.

Then, the third iteration process or even more iteration processes may be performed until the iteration end condition is satisfied, that is, the output data is obtained. The flow of the iterative process of the third time and thereafter is similar to that of the above-described second iterative process.

For example, the third iteration process may be as follows: based on the condition set by the graph mining algorithm for third-level nodes, the reference nodes that have an edge relationship with the third-level node are determined in each intermediate output subgraph corresponding to the second iteration process. A reference node may be a node of any level in the intermediate output subgraph: the first-level node, the second-level node, or both. The adjacency lists corresponding to the reference nodes are then obtained, and the nodes that satisfy the condition for third-level nodes, namely the third-level nodes, are determined from these adjacency lists. Each intermediate output subgraph of the second iteration process for which a third-level node was determined is updated to include the first-level node, the second-level node and the third-level node, and each intermediate output subgraph of the second iteration process for which no third-level node was determined is deleted. The result is the set of intermediate output subgraphs corresponding to the third iteration process.

After N iteration processes, the intermediate output subgraphs corresponding to the Nth iteration process are obtained; if the iteration end condition is met, these intermediate output subgraphs are the output data.

Taking sub-graph matching as an example, the implementation flow of the graph mining algorithm is introduced:

The query graph may include four nodes with different labels, every two of which have an edge relationship. The four nodes are designated as the first-level node, the second-level node, the third-level node and the fourth-level node, with labels A, B, C and D respectively. The flow by which the graph mining algorithm determines the subgraphs of the data graph that are isomorphic to the query graph may then be as follows:

The condition for the first-level node may include: the label is A. The first iteration process is performed as follows: each node in the data graph is traversed, and the nodes whose label is A are determined from the plurality of nodes in the data graph. These nodes are first-level nodes, and each first-level node is an intermediate output subgraph output by the first iteration process.

The condition for the second-level node may include: the label is B, and the node has an edge relationship with the first-level node. The second iteration process performs the following operation on each intermediate output subgraph corresponding to the first iteration process. Because the second-level node has an edge relationship with the first-level node in the query graph, the first-level node is the reference node required by the second iteration process, and the adjacency list corresponding to the first-level node in the intermediate output subgraph is obtained. A node whose label is B is determined from the adjacency list corresponding to the first-level node as the second-level node of the intermediate output subgraph, and the intermediate output subgraph is updated to obtain a new intermediate output subgraph comprising the first-level node and the second-level node. When no node whose label is B is found in the adjacency list corresponding to the first-level node of an intermediate output subgraph of the first iteration process, that intermediate output subgraph does not satisfy the condition and may be deleted.

The condition for the third-level node may include: the label is C, and the node has an edge relationship with both the first-level node and the second-level node. The third iteration process performs the following operation on each intermediate output subgraph corresponding to the second iteration process. Because the third-level node has an edge relationship with both the first-level node and the second-level node in the query graph, the first-level node and the second-level node are both reference nodes required by the third iteration process, and the adjacency list corresponding to the first-level node and the adjacency list corresponding to the second-level node in the intermediate output subgraph are obtained. A node that has an edge relationship with the first-level node, has an edge relationship with the second-level node, and whose label is C is determined from these adjacency lists as the third-level node of the intermediate output subgraph, and the intermediate output subgraph is updated to obtain a new intermediate output subgraph comprising the first-level node, the second-level node and the third-level node. When no such node is found in the adjacency lists corresponding to the first-level node and the second-level node of an intermediate output subgraph of the second iteration process, that intermediate output subgraph does not satisfy the condition and may be deleted.

The condition for the fourth-level node may include: the label is D, and the node has an edge relationship with the first-level node, the second-level node and the third-level node. The fourth iteration process performs the following operation on each intermediate output subgraph corresponding to the third iteration process. Because the fourth-level node has an edge relationship with the first-level node, the second-level node and the third-level node in the query graph, these three nodes are all reference nodes required by the fourth iteration process, and the adjacency lists corresponding to the first-level node, the second-level node and the third-level node in the intermediate output subgraph are obtained. A node that has an edge relationship with the first-level node, with the second-level node and with the third-level node, and whose label is D, is determined from these adjacency lists as the fourth-level node of the intermediate output subgraph, and the intermediate output subgraph is updated to obtain a new intermediate output subgraph comprising the first-level node, the second-level node, the third-level node and the fourth-level node. When no such node is found in the adjacency lists corresponding to the first-level node, the second-level node and the third-level node of an intermediate output subgraph of the third iteration process, that intermediate output subgraph does not satisfy the condition and may be deleted.

In this way, at least one intermediate output subgraph that satisfies the conditions and comprises a first-level node, a second-level node, a third-level node and a fourth-level node is obtained. Each such subgraph is isomorphic to the query graph and is output as output data.
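The four-level example can be traced with a small Python sketch. It assumes a query graph in which every two nodes are connected, so that all previously matched nodes are reference nodes, and it creates one new intermediate output subgraph per qualifying candidate. The function name `match_complete_query` and the six-node data graph are hypothetical.

```python
def match_complete_query(labels, adjacency, level_labels):
    """Level-by-level matching of a fully connected query graph.

    level_labels[k] is the label required at level k + 1. Each intermediate
    output subgraph is a tuple of node identities, one per matched level.
    """
    # Iteration 1: every node with the first label is a one-node subgraph.
    partial = [(v,) for v in sorted(labels) if labels[v] == level_labels[0]]
    for wanted in level_labels[1:]:
        extended = []
        for sub in partial:
            # All matched nodes are reference nodes; intersect their adjacency lists.
            candidates = set(adjacency[sub[0]])
            for ref in sub[1:]:
                candidates &= set(adjacency[ref])
            for v in sorted(candidates):
                if labels[v] == wanted and v not in sub:
                    extended.append(sub + (v,))
        # Subgraphs with no qualifying candidate are dropped (deleted).
        partial = extended
    return partial


labels = {0: "A", 1: "B", 2: "C", 3: "D", 4: "A", 5: "B"}
adjacency = {
    0: [1, 2, 3],
    1: [0, 2, 3, 4],
    2: [0, 1, 3],
    3: [0, 1, 2],
    4: [1, 5],
    5: [4],
}
matches = match_complete_query(labels, adjacency, ["A", "B", "C", "D"])
print(matches)
```

In this data graph, node 4 (label A) is extended to B-labelled neighbours in the second iteration, but those subgraphs are deleted in the third iteration because no C-labelled node is adjacent to both of their nodes; only the subgraph on nodes 0, 1, 2 and 3 survives.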

In the above iterative process, each iteration outputs a large number of intermediate output subgraphs. To solve the technical problem in the prior art that the application range of the graph mining algorithm is reduced because the data volume of the intermediate output subgraphs is large and the video memory of the GPU is small, the embodiments of the disclosure provide a method for implementing a graph mining algorithm. The method is applied to a GPU and, referring to FIG. 1, includes:

101. Determining, based on a preset graph mining algorithm, the reference nodes required by the iteration processing corresponding to the current iteration number from a plurality of nodes included in the intermediate output subgraph corresponding to the previous iteration number.

The intermediate output subgraph corresponding to the previous iteration number is an intermediate output subgraph obtained during the iteration processing corresponding to the previous iteration number. For example, if the current iteration number is 5, the intermediate output subgraph corresponding to the previous iteration number is the intermediate output subgraph obtained during the fourth iteration process.

In implementation, a worker can build a graph mining algorithm based on requirements, and the graph mining algorithm can be built by using a plurality of preset operators, so that the preset graph mining algorithm is obtained.

In one possible implementation, the preset operators may include at least one of an extension operator, an aggregation operator, and a filtering operator.

The extension operator, based on the adjacency lists corresponding to the nodes in an intermediate output subgraph corresponding to the previous iteration number, increases the number of nodes in the intermediate output subgraph by one; that is, it splices one node onto the intermediate output subgraph to obtain a new intermediate output subgraph.

The aggregation operator performs statistics on the overall information of the intermediate output subgraphs, for example, statistics on the occurrence times of each subgraph mode in the plurality of intermediate output subgraphs, types of different subgraph modes, and the like.

The filtering operator filters out, that is deletes, the intermediate output subgraphs that do not satisfy the conditions after the extension operation and the aggregation operation.

The embodiments of the present disclosure do not limit the type and number of preset operators included in the preset graph mining algorithm.
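A minimal Python sketch of how the three operators might be composed for one round of frequent pattern mining is shown below. The function names, the representation of a subgraph as a sorted tuple of node identities, and the use of the sorted node labels as a simplified "pattern" are all assumptions of this example; a real implementation would need a proper isomorphism-aware pattern key.

```python
from collections import Counter


def extend(subgraphs, adjacency):
    """Extension: splice one neighbouring node onto each subgraph."""
    out = set()
    for sub in subgraphs:
        for ref in sub:
            for v in adjacency[ref]:
                if v not in sub:
                    out.add(tuple(sorted(sub + (v,))))
    return sorted(out)


def pattern_of(sub, labels):
    # Simplified pattern key (assumption): the sorted labels of the nodes.
    return tuple(sorted(labels[v] for v in sub))


def aggregate(subgraphs, labels):
    """Aggregation: count how often each pattern occurs."""
    return Counter(pattern_of(sub, labels) for sub in subgraphs)


def filter_frequent(subgraphs, labels, counts, min_count):
    """Filtering: delete subgraphs whose pattern occurs too rarely."""
    return [s for s in subgraphs if counts[pattern_of(s, labels)] >= min_count]


# Hypothetical path-shaped data graph 0-1-2-3-4.
labels = {0: "A", 1: "B", 2: "A", 3: "B", 4: "C"}
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
seeds = [(v,) for v in sorted(labels)]
extended = extend(seeds, adjacency)
counts = aggregate(extended, labels)
kept = filter_frequent(extended, labels, counts, min_count=2)
print(kept)
```

Here the pattern with labels A and B occurs three times and is kept, while the single subgraph with labels B and C is deleted by the filtering step.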

When the preset graph mining algorithm is executed, multiple iteration processes may be performed. In any iteration process, the reference nodes required by the iteration processing corresponding to the current iteration number are determined from the plurality of nodes included in the intermediate output subgraph corresponding to the previous iteration number. A reference node is a node that has an edge relationship with the node to be added by extension in the current iteration. For example, if the previous iteration number is 2 and the current iteration number is 3, the reference nodes required by the iteration processing corresponding to the current iteration number are the nodes, in the intermediate output subgraph corresponding to the 2nd iteration process, that have an edge relationship with the third-level node to be added.

It will be appreciated that the current number of iterations is a positive integer greater than 1.

102. Accessing the memory, and acquiring adjacency lists corresponding to a plurality of reference nodes included in the data graph stored in the memory.

The adjacency list corresponding to the reference node comprises other nodes with edge relation with the reference node in the data graph.

In implementation, the data graph is stored in a memory with larger space so as to avoid occupying the video memory space of the GPU, and when the iteration processing of the current iteration number is performed, the GPU can access the memory and acquire the adjacency list corresponding to the node to be accessed from the memory.

103. And performing iteration processing corresponding to the current iteration times based on a preset graph mining algorithm and an adjacency list corresponding to a plurality of reference nodes to obtain an intermediate output subgraph corresponding to the current iteration times.

In implementation, for each intermediate output subgraph corresponding to the previous iteration number, the extension operator may be used to splice one node onto the intermediate output subgraph based on the adjacency lists corresponding to the plurality of reference nodes, yielding a spliced intermediate output subgraph. The filtering operator may then be used to delete the spliced intermediate output subgraphs that do not satisfy the condition. The remaining spliced intermediate output subgraphs, which satisfy the condition, are the intermediate output subgraphs corresponding to the current iteration number.

It can be understood that, because the spliced intermediate output subgraphs which do not meet the condition are deleted, the number of the intermediate output subgraphs corresponding to the current iteration number is smaller than or equal to the number of the intermediate output subgraphs corresponding to the previous iteration number.

104. Sending the intermediate output subgraph corresponding to the current iteration number to the memory for storage.

In implementation, after each round of iterative processing, the GPU can send the plurality of intermediate output subgraphs corresponding to the current iteration number to the memory of the CPU for storage. Because the memory space is tens of times the video memory space or more, the scale of the data graph that the GPU can process is increased, and the application range of the graph mining algorithm is broadened.

105. Determining whether an iteration end condition is met; if yes, determining the intermediate output subgraph corresponding to the current iteration number as output data; if not, adding one to the current iteration number and returning to the step of determining the reference nodes required by the iterative processing corresponding to the current iteration number.

In implementation, after each round of iterative processing is completed, it can be determined whether the iteration end condition is currently met. The iteration end condition may be any reasonable condition and may be set according to the preset graph mining algorithm or based on the query graph. For example, the iteration end condition may be that the current iteration number equals the number of nodes in the query graph. The embodiments of the present disclosure do not limit this.

With the present disclosure, the iterative processing of the preset graph mining algorithm can be performed in the GPU, while data with a larger volume, such as the data graph and the intermediate output subgraphs corresponding to each round of iterative processing, is stored in the memory. When the next round of iterative processing is performed, the adjacency lists corresponding to the reference nodes required by that round are acquired from the memory. Because the memory space is tens of times the video memory space or more, the scale of the data graph that the GPU can process is increased, and the application range of the graph mining algorithm is broadened.
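Steps 101 to 105 can be put together in a host-side simulation such as the one below. It is a sketch under stated assumptions rather than the disclosed GPU implementation: the dictionary `host_memory` stands in for the memory, the query graph is matched one query node per iteration, every query node after the first is assumed to have at least one earlier neighbor, and the end condition is the example given above (the iteration number equals the number of query nodes). The function name `mine` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

Subgraph = Tuple[int, ...]


def mine(adj: Dict[int, List[int]], query_edges: List[Tuple[int, int]],
         num_query_nodes: int) -> List[Subgraph]:
    """query_edges holds (earlier, later) query-node pairs; num_query_nodes >= 2."""
    host_memory = {"graph": adj, "intermediate": {}}
    # Assumed seed for iteration 1: every data node matches query node 0.
    host_memory["intermediate"][1] = [(v,) for v in sorted(adj)]
    iteration = 2
    while True:
        new_pos = iteration - 1
        ref_positions = [a for a, b in query_edges if b == new_pos]
        previous = host_memory["intermediate"][iteration - 1]
        # Step 101: determine the reference nodes for this iteration.
        reference_nodes = {sub[p] for sub in previous for p in ref_positions}
        # Step 102: fetch their adjacency lists from the (simulated) memory.
        fetched = {v: host_memory["graph"][v] for v in reference_nodes}
        # Step 103: extend each previous subgraph by one node and filter.
        current: List[Subgraph] = []
        for sub in previous:
            candidates = set(fetched[sub[ref_positions[0]]])
            for p in ref_positions[1:]:
                candidates &= set(fetched[sub[p]])
            current.extend(sub + (c,) for c in sorted(candidates - set(sub)))
        # Step 104: store the intermediate output subgraphs in the memory.
        host_memory["intermediate"][iteration] = current
        # Step 105: check the end condition, otherwise go to the next iteration.
        if iteration == num_query_nodes:
            return current
        iteration += 1


# Hypothetical data graph (triangle 0-1-2 plus node 3) and a triangle query.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
matches = mine(adj, query_edges=[(0, 1), (0, 2), (1, 2)], num_query_nodes=3)
```

Each match is reported once per node ordering, so the single triangle in the data graph appears as six tuples.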

In the embodiments of the present disclosure, the access manner may be set according to the amount of data accessed, so as to reduce the amount of data transmitted and the time spent on data transmission as much as possible. The access manner of the GPU in the embodiments of the present disclosure is described in more detail below.

The memory may include a plurality of memory pages for storing the adjacency lists corresponding to the nodes and the intermediate output data.

Before step 102, the access heat corresponding to each memory page may be determined. The data amount of each access in different access modes is different, so that the access mode of the data can be determined according to the access heat of different memory pages, the transmitted data amount is reduced, and the time spent for data transmission is reduced.

Referring to fig. 2, the method for determining the access manner may be as follows:

201. Determining the expected number of accesses to the adjacency list corresponding to each reference node in the iterative processing corresponding to the current iteration number.

In implementation, after the iteration process corresponding to the previous iteration number is completed, before the iteration process corresponding to the current iteration number is performed, the GPU may determine the expected number of accesses to the adjacency list corresponding to the reference node in the iteration process corresponding to the current iteration number.

For example, when sub-graph matching is performed, each time the adjacency list corresponding to a reference node is needed during the iterative processing corresponding to the current iteration number, the GPU needs to access the memory page where that adjacency list is located. The expected number of accesses to the adjacency list corresponding to a reference node in the iterative processing corresponding to the current iteration number may therefore be determined as follows. Each intermediate output subgraph corresponding to the previous iteration number includes at least one reference node required by the iterative processing corresponding to the current iteration number. The reference nodes included in all the intermediate output subgraphs corresponding to the previous iteration number are collected, with duplicates retained, to obtain a reference node set. The reference node set may contain multiple reference nodes with the same identifier as well as reference nodes with different identifiers. The number of reference nodes with the same identifier in the reference node set can then be determined as the expected number of accesses for that reference node.

The specific method for determining the expected access times of the adjacency list corresponding to the reference node may be set differently according to different preset graph mining algorithms, which is not limited in the embodiment of the present disclosure.
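For the sub-graph matching example above, the counting can be sketched as follows. Subgraphs are modeled as tuples of node IDs, and `ref_positions` (a hypothetical parameter) names the tuple positions holding the reference nodes for the current iteration; both are illustrative assumptions.

```python
from collections import Counter
from typing import List, Tuple


def expected_access_counts(previous_subgraphs: List[Tuple[int, ...]],
                           ref_positions: List[int]) -> Counter:
    """Count how often each reference node occurs across all intermediate
    output subgraphs of the previous iteration."""
    counts: Counter = Counter()
    for sub in previous_subgraphs:
        for p in ref_positions:
            counts[sub[p]] += 1
    return counts


# Hypothetical previous-iteration subgraphs; position 0 holds the reference node.
counts = expected_access_counts([(0, 1), (0, 2), (1, 2)], ref_positions=[0])
```

Here node 0 occurs as a reference node twice, so its adjacency list is expected to be accessed twice in the current iteration.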

202. Determining a plurality of target memory pages where the adjacency lists corresponding to the plurality of reference nodes are located.

In implementation, the memory includes memory pages. The adjacency lists corresponding to different reference nodes may be located in the same memory page or in different memory pages. Similarly, the adjacency list corresponding to one reference node may be located entirely in one memory page or may span multiple memory pages.

The GPU needs to determine at least one target memory page where the adjacency list corresponding to each reference node is located.
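One way to locate the target memory pages is sketched below. It assumes, purely for illustration, that adjacency lists are stored back to back as 4-byte node IDs starting at a page boundary, and that a memory page is 4096 bytes; `pages_of_list` and the `offsets` array are hypothetical names.

```python
from typing import Dict, List

PAGE_SIZE = 4096   # assumed page size in bytes
ID_BYTES = 4       # assumed size of one node ID in bytes


def pages_of_list(offsets: List[int], v: int) -> Dict[int, int]:
    """Return {page index: number of entries of l(v) stored in that page}.

    offsets[v] and offsets[v + 1] delimit the adjacency list of v, counted
    in entries from the start of the storage area.
    """
    pos = offsets[v] * ID_BYTES
    end = offsets[v + 1] * ID_BYTES
    spans: Dict[int, int] = {}
    while pos < end:
        page = pos // PAGE_SIZE
        page_end = min(end, (page + 1) * PAGE_SIZE)
        spans[page] = (page_end - pos) // ID_BYTES
        pos = page_end
    return spans


# Hypothetical layout: three adjacency lists of 1000, 100 and 1000 entries.
offsets = [0, 1000, 1100, 2100]
```

In this layout the list of node 0 fits in page 0, while the lists of nodes 1 and 2 each span two pages.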

203. Determining, based on the expected number of accesses corresponding to each reference node, the access heat of each target memory page where the adjacency lists corresponding to the plurality of reference nodes are located.

In implementation, there may be various ways to determine the access heat of the target memory page based on the expected access times corresponding to the reference node, for example, the expected access times corresponding to one reference node may be directly used as the access heat of the target memory page where the reference node is located, or the expected access times corresponding to the reference node may be used only as one of the factors affecting the access heat of the target memory page, and may be considered with other factors to determine the access heat of the target memory page.

Other factors may be any reasonable factors that affect the access heat of the target memory page, which is not limited in the embodiment of the present disclosure, and one of them is described below as an example:

The other factor may be the amount of data, in each target memory page, of the adjacency list corresponding to the reference node.

When the access heat of each target memory page needs to be calculated, the following processing may be performed for each target memory page: for each reference node stored in the target memory page, the product of the data amount of the adjacency list corresponding to the reference node stored in the target memory page and the expected access frequency corresponding to the reference node can be determined as the expected access data amount corresponding to the current iteration frequency of the reference node stored in the target memory page. And then, determining the sum of the expected access data amounts corresponding to at least one reference node in the target memory page as the expected access data amount corresponding to the current iteration number of the target memory page. And determining the access heat of the target memory page based on the expected access data volume corresponding to the current iteration times of the target memory page.

In implementation, each target memory page may store only one adjacency list corresponding to a reference node, or may store adjacency lists corresponding to a plurality of reference nodes, and each target memory page may store a complete adjacency list corresponding to a reference node, or may store a part of data in the adjacency list corresponding to a reference node.

Therefore, for each target memory page, the data amount of the adjacency list corresponding to all the reference nodes stored in the target memory page can be determined, where the data amount can be the length of the adjacency list corresponding to all the reference nodes stored in the target memory page. Multiplying the data volume of the adjacency list corresponding to each reference node by the expected access times corresponding to the reference node, and obtaining a numerical value, namely the expected access data volume of the adjacency list corresponding to the reference node in the target memory page in the iterative processing process corresponding to the current iteration times.

And then, the expected access data quantity corresponding to the plurality of reference nodes in the target memory page can be added, so that the expected access data quantity in the iterative processing process corresponding to the current iteration number of the target memory page is obtained.
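The per-page computation described above amounts to SpatialLoc_i(p) = sum over the adjacency lists l(v) stored in p of times_i(l(v)) * |l(v)|. A minimal sketch follows; the input structures `page_contents` and `expected_counts` are hypothetical and measure the data amount in adjacency-list entries.

```python
from typing import Dict


def expected_access_amount(page_contents: Dict[int, Dict[int, int]],
                           expected_counts: Dict[int, int]) -> Dict[int, int]:
    """page_contents maps page -> {node: entries of l(node) stored in that page}.
    Returns page -> expected access data amount for the current iteration."""
    return {
        page: sum(length * expected_counts.get(v, 0) for v, length in lists.items())
        for page, lists in page_contents.items()
    }


# Hypothetical pages and expected access counts.
page_contents = {0: {0: 1000, 1: 24}, 1: {1: 76, 2: 948}, 2: {2: 52}}
expected_counts = {0: 2, 1: 3}
amounts = expected_access_amount(page_contents, expected_counts)
```

Nodes that are not reference nodes in the current iteration contribute nothing, which is why page 2 ends up with an expected amount of zero.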

At this time, the predicted access data amount corresponding to the target memory page may be directly used as the access heat of the target memory page, or other factors may be further integrated to determine the access heat of the target memory page, and one possible description is presented below:

The access data amount corresponding to each round of iterative processing of the target memory page before the current iteration number is determined. The access heat of the target memory page is then determined based on the expected access data amount corresponding to the current iteration number of the target memory page and the access data amounts corresponding to the rounds of iterative processing of the target memory page before the current iteration number.

In implementation, the access data amount of the target memory page in the previous iteration process has a larger influence on the access data amount corresponding to the target memory page in the iteration process corresponding to the current iteration number and the access data amount corresponding to the target memory page in the subsequent iteration process, so that the access data amount corresponding to each iteration process of the target memory page before the current iteration number can also be used as one influence factor of the access heat of the target memory page.

The access data amount corresponding to each round of iterative processing of the target memory page before the current iteration number may be determined in multiple ways. For example, when the access heat of the target memory page needs to be determined, the expected access data amount corresponding to the current iteration number and the access data amount corresponding to each earlier round of iterative processing may each be calculated. Alternatively, after each round of iterative processing, the GPU may store the access data amount of each target memory page for that round, so that when the access heat of a target memory page needs to be determined, the stored access data amounts of the target memory page for each round can be obtained directly. Other reasonable ways of determination may also be used.

After determining the expected access data amount corresponding to the current iteration times of the target memory page and the access data amount corresponding to each iteration process of the target memory page before the current iteration times, different weights can be allocated to the expected access data amount and the access data amount to obtain a weighted sum, and the weighted sum is used as the access heat of the target memory page.

In the embodiment of the present disclosure, the access heat of each target memory page may be calculated according to the following formula:

where SpatialLoc_i(p) is the expected access data amount of the target memory page p corresponding to the current iteration number; i is the current iteration number; j is a variable; v is a node; l(v) is the adjacency list of node v; times_i(l(v)) is the expected number of accesses to the adjacency list of v in the target memory page during the i-th round of iterative processing; |l(v)| is the data amount of the adjacency list of v in the target memory page; tempLoc_i(p) is the access data amount corresponding to the rounds of iterative processing of the target memory page p before the current iteration number; and A_i is the data accessed in the target memory page during the i-th round of iterative processing.

Of course, the weights of the two may be other settings, which are not limited by the embodiments of the disclosure.
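The weighted combination of the expected and historical access data amounts can be sketched as below. The equal weights and the plain sum over earlier iterations are assumptions chosen only for illustration; the disclosure leaves the weights open, and `access_heat`, `w_spatial` and `w_temporal` are hypothetical names.

```python
from typing import Sequence


def access_heat(expected_amount: float, history: Sequence[float],
                w_spatial: float = 0.5, w_temporal: float = 0.5) -> float:
    """Weighted sum of the expected access data amount for the current
    iteration and the access data amounts recorded for earlier iterations.
    The default weights are placeholders, not values from the disclosure."""
    return w_spatial * expected_amount + w_temporal * sum(history)


# Hypothetical page: 2072 entries expected now, 100 and 300 accessed earlier.
heat = access_heat(2072, [100, 300])
```

Raising `w_temporal` makes pages that were busy in earlier iterations stay hot for longer, while raising `w_spatial` makes the heat follow the current iteration more closely.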

204. Determining, from a plurality of preset access manners, a target access manner of the adjacency list corresponding to each reference node based on the access heat of each target memory page and a preset heat threshold.

In implementation, for each target memory page, a target access mode of an adjacency list corresponding to each reference node in the target memory page can be determined according to the access heat of the target memory page, and then, for each adjacency list corresponding to the reference node, the GPU obtains the adjacency list corresponding to the reference node stored in the memory in the target access mode of the adjacency list corresponding to the reference node, so that data transmission is reduced, resource waste is reduced, and additional time required for data transmission is reduced.

In one possible implementation, the target access manner of the adjacency table corresponding to each reference node may be determined based on the following method:

When the access heat of the target memory page is greater than or equal to the preset heat threshold, unified memory may be used as the target access manner. That is, the target memory page is fetched from the memory into the video memory for caching, and the adjacency list corresponding to at least one reference node stored in the target memory page is then obtained from the video memory.

In implementation, because the access heat of the target memory page is high, this access manner transmits the target memory page to the GPU in units of 4 KB (kilobytes) at a time and caches it in the video memory of the GPU. During the iterative processing corresponding to the current iteration number, whenever the GPU needs the adjacency list corresponding to a reference node stored in the target memory page, it can obtain the adjacency list directly from the video memory without accessing the memory.

When the access heat of the target memory page is less than the preset heat threshold, zero-copy memory may be used as the target access manner. That is, the adjacency list corresponding to at least one reference node stored in the target memory page is obtained directly from the memory.

In implementation, because the access heat of the target memory page is low, the GPU only needs to fetch the adjacency list corresponding to a reference node stored in the target memory page in real time when that adjacency list is needed, and the data is transmitted to the GPU in units of 128 bytes at a time.

Of course, the target access manner in the embodiments of the present disclosure may also be any other reasonable access manner, which is not limited by the embodiments of the present disclosure.
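The threshold rule of step 204 can be sketched as a simple per-page decision. The sketch only models the choice between the two access manners; it does not perform any GPU transfer, and the names `AccessMode` and `choose_access_modes` as well as the threshold value are hypothetical.

```python
from enum import Enum
from typing import Dict


class AccessMode(Enum):
    UNIFIED_MEMORY = "unified"   # page cached in video memory, 4 KB transfers
    ZERO_COPY = "zero_copy"      # read from the memory on demand, 128-byte transfers


def choose_access_modes(page_heat: Dict[int, float],
                        threshold: float) -> Dict[int, AccessMode]:
    """Pick the target access manner of every target memory page."""
    return {
        page: AccessMode.UNIFIED_MEMORY if heat >= threshold else AccessMode.ZERO_COPY
        for page, heat in page_heat.items()
    }


# Hypothetical heat values and threshold.
modes = choose_access_modes({0: 2072, 1: 228, 2: 1000}, threshold=1000)
```

A page whose heat equals the threshold is treated as hot, matching the "greater than or equal to" condition above.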

In the embodiments of the present disclosure, in order to process the plurality of intermediate output subgraphs, the GPU may be provided with a plurality of working thread groups that correspond one-to-one to the plurality of intermediate output subgraphs corresponding to the previous iteration number. That is, when the iterative processing corresponding to the current iteration number is performed, a working thread group may be allocated to each intermediate output subgraph corresponding to the previous iteration number.

For each working thread group, the working thread group can perform iteration processing corresponding to the current iteration times based on a preset graph mining algorithm and an adjacency list corresponding to each reference node in the intermediate output subgraph corresponding to the working thread group, so as to obtain the intermediate output subgraph corresponding to the working thread group.

In implementation, each working thread group carries out iterative processing on the intermediate output subgraph which is responsible for the working thread group, so that the intermediate output subgraph corresponding to the current iteration times which accords with the condition setting is obtained.

In one possible implementation, for each worker thread group, it may perform the following iterative process:

The working thread group acquires intersection data of the adjacency lists corresponding to the plurality of reference nodes included in the intermediate output subgraph corresponding to the working thread group, and then performs the iterative processing corresponding to the current iteration number based on the preset graph mining algorithm and the intersection data, so as to obtain the intermediate output subgraph corresponding to the working thread group.

In implementation, when a working thread group performs iterative processing on an intermediate output subgraph that includes multiple reference nodes, the node newly added to the resulting intermediate output subgraph must have an edge relationship with each of those reference nodes. Therefore, before the iterative processing is performed on the adjacency list corresponding to each reference node, the nodes included in the adjacency lists corresponding to the multiple reference nodes of the intermediate output subgraph may be intersected to obtain intersection data including at least one node. Each node in the intersection data has an edge relationship with every reference node in the intermediate output subgraph.

After the intersection data of the adjacency lists corresponding to the plurality of reference nodes in the intermediate output subgraph is obtained, the iterative processing corresponding to the current iteration number can be performed according to the preset graph mining algorithm and the intersection data, so as to obtain the intermediate output subgraph corresponding to the current iteration number. Using the intersection data avoids much of the additional memory overhead and redundant computation that would otherwise be incurred by processing the adjacency list of each reference node separately.
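The intersection step can be sketched with a two-pointer merge over sorted adjacency lists. Sorted lists and the helper names `intersect_sorted` and `candidate_nodes` are assumptions made for this illustration; the disclosure does not specify how the intersection is computed.

```python
from typing import Dict, List, Tuple


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Intersect two ascending lists with a two-pointer merge."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def candidate_nodes(sub: Tuple[int, ...], ref_positions: List[int],
                    adj: Dict[int, List[int]]) -> List[int]:
    """Nodes adjacent to every reference node of the subgraph, excluding
    nodes that are already part of the subgraph."""
    lists = sorted((adj[sub[p]] for p in ref_positions), key=len)
    result = lists[0]
    for other in lists[1:]:          # start from the shortest list
        result = intersect_sorted(result, other)
    return [v for v in result if v not in sub]


# Hypothetical data graph: triangle 0-1-2 plus node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
```

Starting from the shortest list keeps the running intersection small, so later merges touch fewer entries.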

In one possible implementation, the GPU may also be provided with a scheduling thread, and the memory may comprise a plurality of memory blocks.

During the iterative processing corresponding to the current iteration number, each time a working thread group determines an intermediate output subgraph corresponding to the current iteration number, it can send that intermediate output subgraph to the memory block corresponding to the working thread group for storage. When the scheduling thread detects that the memory block corresponding to a working thread group is full, it can allocate a new memory block to that working thread group.

With this storage manner, suitable memory space can be allocated for the generated intermediate output subgraphs according to their data volume, without introducing additional time or space overhead.
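The block-based storage can be modeled with the single-threaded simulation below. It is a sketch only: the class `BlockPool` is hypothetical, block capacity is counted in subgraphs rather than bytes, and the new block is allocated lazily on the next write instead of by a separate scheduling thread that monitors the blocks.

```python
from typing import Dict, List, Tuple

Subgraph = Tuple[int, ...]


class BlockPool:
    """Simulated memory divided into fixed-capacity blocks."""

    def __init__(self, block_capacity: int) -> None:
        self.block_capacity = block_capacity
        self.blocks: List[List[Subgraph]] = []   # every block allocated so far
        self.current: Dict[int, int] = {}        # group id -> its current block

    def _allocate(self, group: int) -> None:
        # Stands in for the scheduling thread handing out a new block.
        self.blocks.append([])
        self.current[group] = len(self.blocks) - 1

    def write(self, group: int, subgraph: Subgraph) -> None:
        """Store one intermediate output subgraph for a working thread group."""
        if group not in self.current:
            self._allocate(group)
        if len(self.blocks[self.current[group]]) >= self.block_capacity:
            self._allocate(group)                # current block is full
        self.blocks[self.current[group]].append(subgraph)


pool = BlockPool(block_capacity=2)
pool.write(0, (0, 1, 2))
pool.write(0, (0, 2, 1))
pool.write(1, (1, 0, 2))
pool.write(0, (2, 0, 1))   # group 0's first block is full, so a new one is used
```

Because blocks are handed out only as they fill up, the space consumed follows the number of subgraphs each working thread group actually produces.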

Any combination of the above-mentioned optional solutions may be adopted to form an optional embodiment of the present disclosure, which is not described herein in detail.

According to this scheme, the iterative processing of the preset graph mining algorithm can be performed in the GPU, while the data graph and the intermediate output subgraphs corresponding to each round of iterative processing are stored in the memory. When the next round of iterative processing is performed, the adjacency lists corresponding to the reference nodes required by that round are acquired from the memory. Because the memory space is tens of times the video memory space or more, the scale of the data graph that the GPU can process is increased, and the application range of the graph mining algorithm is broadened.

Embodiments of the present disclosure provide a computer device including a GPU and a memory, the GPU configured to:

the GPU is further configured to:

In one possible implementation, the GPU is configured to:

The GPU is configured to:

In one possible implementation, the GPU is configured to:

the GPU is configured to:

It should be noted that the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals (including but not limited to signals transmitted between a user terminal and other devices, etc.) involved in the present disclosure are all authorized by the user or fully authorized by the relevant parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the "data graphs" referred to in this disclosure are all obtained with sufficient authorization.

The foregoing description of the preferred embodiments of the present disclosure is provided for illustration only and is not intended to limit the disclosure to the particular embodiments disclosed. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the disclosure.

Claims

1. A method for implementing a graph mining algorithm, the method being applied to a graphics processor GPU, the method comprising:

2. The method of claim 1, wherein the memory comprises a plurality of memory pages for storing the data graph and the intermediate output data;

3. The method of claim 2, wherein determining the access heat of each target memory page where the adjacency list corresponding to the plurality of reference nodes is located based on the expected number of accesses corresponding to each reference node comprises:

4. The method of claim 3, wherein determining the access heat of the target memory page based on the expected access data amount corresponding to the current iteration number of the target memory page comprises:

5. The method of claim 4, wherein determining, from a plurality of preset access modes, the target access mode of the adjacency list corresponding to each reference node based on the access heat of each target memory page and a preset heat threshold, comprises:

6. The method according to claim 1, wherein the GPU is provided with a plurality of working thread groups, each working thread group corresponding to a plurality of intermediate output subgraphs corresponding to the previous iteration number one-to-one;

7. The method of claim 6, wherein the obtaining the adjacency lists corresponding to the plurality of reference nodes included in the data graph stored in the memory includes:

8. The method of claim 6, wherein the GPU is further provided with a scheduling thread, the memory comprising a plurality of memory blocks;

9. The method of claim 1, wherein the preset graph mining algorithm comprises at least one of an extension operator, an aggregation operator, and a filtering operator.

10. A computer device comprising a GPU and a memory, the GPU configured to: