WO2022016833A1

WO2022016833A1 - Graph computing method and apparatus, and device and storage medium

Info

Publication number: WO2022016833A1
Application number: PCT/CN2021/071205
Authority: WO
Inventors: 胡克坤; 董刚; 赵雅倩
Original assignee: 苏州浪潮智能科技有限公司
Priority date: 2020-07-24
Filing date: 2021-01-12
Publication date: 2022-01-27
Also published as: CN111858059A

Abstract

A graph computing method and apparatus, and a device and a storage medium. The steps of the method comprise: compiling statistics on computing resource proportions between processing nodes; acquiring a graph to be operated; on the basis of the computing resource proportions, dividing said graph into sub-graphs to be operated of each processing node, wherein the proportions between task loads of said sub-graphs allocated to each processing node are consistent with the computing resource proportions; and allocating each of said sub-graphs to a corresponding processing node, such that each processing node performs a graph computing operation in parallel. By means of this method, the overall efficiency of performing graph computing by a plurality of processing nodes is relatively ensured. In addition, further provided are a graph computing apparatus, a device and a storage medium, which have the same beneficial effects as described above.

Description

A graph computing method, apparatus, device and storage medium

This application claims priority to the Chinese patent application No. 202010724722.7, filed on July 24, 2020 and entitled "A Graph Computing Method, Apparatus, Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the field of cloud computing, and in particular, to a graph computing method, apparatus, device and storage medium.

Background Art

A graph is an abstract data structure used to represent the relationships between objects. It is described by vertices and edges: vertices represent objects, and edges represent the relationships between objects. Data that can be abstracted into graphs is graph data.

Graph computing is the process of modeling, analyzing and solving real problems using graphs as the data model. In actual scenarios, graphs are often large, and parallel distributed processing of a graph by multiple processing nodes is currently the main solution for graph computing. Before a graph is processed by a parallel computing system composed of multiple processing nodes, the graph needs to be divided into subgraphs; each processing node then performs operations on its corresponding subgraph, and the nodes jointly produce the graph computing result. In actual computing scenarios, however, the computing performance of the processing nodes often differs, and the overall efficiency of graph computing is determined by the processing node that is last to finish computing its subgraph. Current graph partitioning algorithms often assume that the parallel computing system is homogeneous, so the partitioning schemes they generate can hardly ensure the overall efficiency of graph computing performed by multiple processing nodes.

Therefore, providing a graph computing method that relatively ensures the overall efficiency of graph computing performed by multiple processing nodes is a problem to be solved by those skilled in the art.

Summary of the Invention

The purpose of the present application is to provide a graph computing method, apparatus, device and storage medium to relatively ensure the overall efficiency of graph computing performed by multiple processing nodes.

In order to solve the above-mentioned technical problems, the present application provides a graph calculation method, including:

Counting the computing resource ratio among the processing nodes;

Obtaining the graph to be computed;

Dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio, wherein the ratio between the task loads of the subgraphs to be computed allocated to the processing nodes is consistent with the computing resource ratio;

Allocating each subgraph to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel.

Preferably, dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio includes:

While acquiring the graph to be computed, dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio.

Preferably, dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio includes:

Calculating the total task load corresponding to the graph to be computed;

Calculating the task load expectation of each processing node according to the total task load and the computing resource ratio;

Dividing the graph to be computed into the corresponding subgraphs to be computed according to the task load expectation of each processing node.

Preferably, the task load expectations include computing load expectations and communication load expectations.

Preferably, calculating the total task load corresponding to the graph to be computed includes:

Calculating the total task load corresponding to the graph to be computed in the target algorithm scenario;

Correspondingly, allocating each subgraph to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel, includes:

Allocating each subgraph to be computed to the corresponding processing node, so that each processing node performs graph computing operations based on the target algorithm.

Preferably, calculating the total task load corresponding to the graph to be computed in the target algorithm scenario, including:

Use the preset model data set to calculate the total task load corresponding to the graph to be calculated in the target algorithm scenario;

The generation process of the model dataset includes:

Obtain a graph sample set and a sample algorithm set;

Establish the combination relationship between the graph samples in the graph sample set and the sample algorithms in the sample algorithm set;

The sample task load of each combination relationship is counted to obtain the model data set.

Preferably, the computing resource ratio includes a computing rate ratio of the computing resources.

In addition, the present application also provides a graph computing apparatus, comprising:

The proportion statistics module is used to calculate the proportion of computing resources among the processing nodes;

The graph obtaining module is used to obtain the graph to be computed;

The graph dividing module is used to divide the graph to be computed into subgraphs to be computed of each processing node based on the ratio of computing resources; wherein, the ratio between the task loads of the subgraphs to be computed allocated to each processing node is consistent with the ratio of computing resources;

The sub-graph allocation module is used for allocating each sub-graph to be calculated to the corresponding processing node, so that each processing node can perform the graph computing operation in parallel.

Preferably, the graph partitioning module includes:

The total task load calculation module is used to calculate the total task load corresponding to the graph to be calculated;

The expected task load calculation module is used to calculate the task load expectation of each processing node according to the total task load and the proportion of computing resources;

The dividing module is used for dividing the graph to be operated into corresponding subgraphs to be operated according to the task load expectation of each processing node.

In addition, the present application also provides a graph computing device, including:

A memory for storing a computer program;

A processor configured to implement the steps of the above graph computing method when executing the computer program.

In addition, the present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above graph computing method are implemented.

In the graph computing method provided by the present application, the computing resource ratio among the processing nodes used for graph computing is first counted and the graph to be computed is obtained. The graph to be computed is then divided, based on the computing resource ratio, into a subgraph to be computed for each processing node, where the ratio between the task loads of the subgraphs allocated to the processing nodes is consistent with the computing resource ratio. Each subgraph to be computed is then allocated to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel. Because the method divides the graph into subgraphs whose task loads follow the computing resource ratio among the processing nodes, each processing node receives a subgraph whose load matches its computing performance, which relatively ensures the overall efficiency of graph computing performed by multiple processing nodes. In addition, the present application also provides a graph computing apparatus, device and storage medium, which have the same beneficial effects as described above.

Description of Drawings

FIG. 1 is a flowchart of a graph computing method disclosed in an embodiment of the present application;

FIG. 2 is a flowchart of a graph computing method disclosed in an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a graph computing apparatus disclosed in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, but not all of the embodiments. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative work fall within the protection scope of the present application.

The core of the present application is to provide a graph computing method that relatively ensures the overall efficiency of graph computing performed by multiple processing nodes.

In order to make those skilled in the art better understand the solution of the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1 , an embodiment of the present application discloses a graph computing method, including:

Step S10: Counting the computing resource ratio among the processing nodes.

It should be noted that the processing nodes in this step are the nodes that jointly perform distributed graph computing on the graph to be computed. The computing resource ratio counted in this step is essentially the ratio between the computing resources of the processing nodes. Computing resources refer to the quantity of hardware resources that a processing node can call when performing graph computing, and are an index reflecting the computing performance of the processing node. The computing resource ratio obtained in this step is therefore equivalent to the proportional relationship between the computing performances of the processing nodes.

Step S11: Obtain the graph to be calculated.

The graph to be computed obtained in this step is the graph on which the processing nodes perform distributed graph computing in the subsequent steps; it can be regarded as a data model in which specific data content is recorded. In addition, there is no required order between the step of obtaining the graph to be computed and the step of counting the computing resource ratio among the processing nodes. In this embodiment, the execution order of step S10 and step S11 is therefore not fixed, and the two steps may also be performed simultaneously, depending on the actual situation, which is not specifically limited here.

Step S12: Divide the graph to be computed into subgraphs to be computed of each processing node based on the ratio of computing resources.

Here, the ratio between the task loads of the subgraphs to be computed that are allocated to the processing nodes is consistent with the computing resource ratio.

It should be noted that, after the computing resource ratio among the processing nodes and the graph to be computed have been obtained, this step divides the graph to be computed into the subgraphs to be computed of the processing nodes according to the computing resource ratio. That is, the graph is divided into subgraphs corresponding to the processing nodes such that the task load ratio of the subgraphs among the processing nodes is consistent with the computing resource ratio among the processing nodes. The task load ratio refers to the proportional relationship between the task loads generated when the respective subgraphs are processed. In other words, this embodiment divides out of the graph to be computed, for each processing node, a subgraph whose share of the task load corresponds to that node's share of the computing power of all processing nodes, so that a target processing node with a larger share of computing resources performs graph computing operations on a subgraph with a larger task load.

Step S13: Allocate each subgraph to be computed to a corresponding processing node, so that each processing node can perform graph computing operations in parallel.

After the graph to be computed has been divided into the subgraphs to be computed of the processing nodes based on the computing resource ratio, the correspondence between the subgraphs and the processing nodes is established. Each subgraph to be computed is then allocated to its corresponding processing node, so that each processing node performs graph computing operations on the subgraph it receives and the processing nodes cooperatively process the graph to be computed.

In the graph computing method provided by the present application, the computing resource ratio among the processing nodes used for graph computing is first counted and the graph to be computed is obtained. The graph to be computed is then divided, based on the computing resource ratio, into a subgraph to be computed for each processing node, where the ratio between the task loads of the subgraphs allocated to the processing nodes is consistent with the computing resource ratio. Each subgraph to be computed is then allocated to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel. Because the method divides the graph into subgraphs whose task loads follow the computing resource ratio among the processing nodes, each processing node receives a subgraph whose load matches its computing performance, which relatively ensures the overall efficiency of graph computing performed by multiple processing nodes.
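Steps S10 to S13 can be illustrated with a minimal sketch. It is not the method of the application: it assumes that the number of edges is an adequate proxy for task load, that the graph is an in-memory edge list, and that relative node speeds are given as integers. The function name `split_by_ratio` and the node names are hypothetical.

```python
def split_by_ratio(edges, speeds):
    """Split an edge list into contiguous chunks sized in proportion to `speeds`.

    Assumption: edge count stands in for task load; the application itself
    uses a richer load model.
    """
    total = sum(speeds)
    if total <= 0:
        raise ValueError("total speed must be positive")
    chunks, start, cumulative = [], 0, 0
    for speed in speeds:
        cumulative += speed
        # Cumulative boundaries keep rounding errors from accumulating.
        end = round(len(edges) * cumulative / total)
        chunks.append(edges[start:end])
        start = end
    return chunks


# S10: computing resource ratio (hypothetical relative speeds of three nodes).
speeds = [1, 1, 2]
# S11: graph to be computed (toy path graph with 8 edges).
edges = [(i, i + 1) for i in range(8)]
# S12: divide the graph according to the ratio.
subgraphs = split_by_ratio(edges, speeds)
# S13: allocate each subgraph to its processing node.
assignment = {f"node{i}": sg for i, sg in enumerate(subgraphs)}
print([len(sg) for sg in subgraphs])  # [2, 2, 4]
```

The node with twice the speed receives twice as many edges, so under the stated proxy all three nodes would be expected to finish at about the same time.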

On the basis of the above embodiment, as a preferred implementation, dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio includes:

While acquiring the graph to be computed, dividing the graph to be computed into subgraphs to be computed of each processing node based on the computing resource ratio.

It should be noted that, in actual graph computing scenarios, the data volume of the graph to be computed may be relatively large. To further improve the efficiency of processing the subgraphs to be computed, the focus of this embodiment is that the step of dividing the graph into subgraphs based on the computing resource ratio is performed in parallel with the acquisition of the graph, so that the graph to be computed is divided in a streaming manner. Further, the division performed while acquiring the graph may specifically consist of cyclically dividing out, based on the computing resource ratio, subgraphs with the corresponding task load proportions for the processing nodes, until the whole graph to be computed has been divided. This embodiment further ensures the overall execution efficiency of the division process, thereby relatively ensuring the overall efficiency of the graph computing process.
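One way to picture the cyclic, streaming division is a weighted round-robin over the incoming edges. The sketch below is only an illustration under assumptions that are not in the source: the ratio is given as small positive integers, each edge contributes the same load, and the hypothetical generator `read_edges` stands in for whatever mechanism acquires the graph.

```python
from itertools import cycle


def stream_partition(edge_stream, ratio):
    """Assign edges to nodes as they arrive, cycling through nodes by weight.

    Assumption: `ratio` holds positive integers, e.g. [1, 1, 2].
    """
    # A ratio of [1, 1, 2] yields the repeating schedule 0, 1, 2, 2.
    schedule = cycle([node for node, weight in enumerate(ratio) for _ in range(weight)])
    parts = [[] for _ in ratio]
    for edge, node in zip(edge_stream, schedule):
        parts[node].append(edge)
    return parts


def read_edges():
    """Hypothetical stand-in for acquiring the graph piece by piece."""
    for i in range(8):
        yield (i, i + 1)


parts = stream_partition(read_edges(), ratio=[1, 1, 2])
print([len(p) for p in parts])  # [2, 2, 4]
```

Because the edges are consumed from a generator, division proceeds while the graph is still being read, and no node waits for the whole graph to be loaded.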

Referring to FIG. 2, an embodiment of the present application discloses a graph computing method, including:

Step S20: Counting the computing resource ratio among the processing nodes.

Step S21: Obtain the graph to be calculated.

Step S22: Calculate the total task load corresponding to the graph to be calculated.

It should be noted that, in this step, after the graph to be calculated is obtained, the total task load corresponding to the graph to be calculated is further calculated, which is equivalent to calculating the overall task load occupied by the graph to be calculated.

Step S23: Calculate the task load expectation of each processing node according to the total task load and the computing resource ratio.

After the total task load corresponding to the graph to be computed is obtained, this step calculates the task load expectation of each processing node according to the total task load and the computing resource ratio. The task load expectation here refers to the load criterion that must be met when a subgraph to be computed is divided out for a processing node. That is, when a subgraph is divided out for a target processing node, the task load corresponding to that subgraph should match the task load expectation of the target processing node.

Step S24: Divide the graph to be computed into the corresponding subgraphs to be computed according to the task load expectation of each processing node.

It should be noted that, after the task load expectation of each processing node has been calculated according to the total task load and the computing resource ratio, this step divides the graph to be computed into the corresponding subgraphs according to those expectations. This achieves the purpose of dividing the graph to be computed into the subgraphs of the processing nodes based on the computing resource ratio among the processing nodes.

Step S25: Allocate each subgraph to be computed to a corresponding processing node, so that each processing node can perform graph computing operations in parallel.

In this embodiment, the total task load corresponding to the graph to be computed is calculated, the task load expectation of each processing node is then calculated according to the total task load and the computing resource ratio, and finally the graph to be computed is divided into the corresponding subgraphs according to the task load expectation of each processing node. This further ensures the overall accuracy of dividing the graph to be computed into the subgraphs of the processing nodes based on the computing resource ratio, and thus further ensures the overall efficiency of graph computing performed by multiple processing nodes.
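Steps S22 and S23 reduce to a simple proportional calculation, sketched below. The sketch assumes that the computing resource ratio is expressed as one relative speed per node and that the total task load is already known; the function names `expected_loads` and `load_gaps` are hypothetical.

```python
def expected_loads(total_load, speeds):
    """Task load expectation of each node: its speed share times the total load."""
    total_speed = sum(speeds)
    if total_speed <= 0:
        raise ValueError("total speed must be positive")
    return [total_load * speed / total_speed for speed in speeds]


def load_gaps(actual_loads, expected):
    """How far each node's current load is below (+) or above (-) its expectation."""
    return [e - a for a, e in zip(actual_loads, expected)]


expected = expected_loads(800, [1, 2, 5])   # hypothetical load and speeds
print(expected)                              # [100.0, 200.0, 500.0]
print(load_gaps([90.0, 210.0, 500.0], expected))  # [10.0, -10.0, 0.0]
```

A division step (S24) can then aim to drive every gap toward zero.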

On the basis of the foregoing embodiment, as a preferred implementation manner, the task load expectation includes the computing load expectation and the communication load expectation.

It should be noted that, in this embodiment, the task load expectation further includes a computing load expectation and a communication load expectation. The computing load expectation is the criterion for the computing resource load that the subgraph to be computed generates in the processing node when the processing node performs graph computing operations on it. The communication load expectation is the criterion for the communication resource load generated by network communication between the processing node and other processing nodes when the processing node performs graph computing operations on the subgraph. By refining the task load expectation in this way, this embodiment further ensures the accuracy of dividing the graph to be computed into the subgraphs corresponding to the processing nodes.

Furthermore, leaving aside the algorithm followed by the graph computing operation, the computing resource load that a subgraph generates in its processing node is jointly determined by the number of edges and the number of vertices of the subgraph, while the communication resource load generated by network communication between the processing node and other processing nodes is proportional to the number of cut edges of the subgraph. On this basis, dividing the graph to be computed into the corresponding subgraphs according to the task load expectation of each processing node may specifically consist of dividing out of the graph to be computed, for each processing node, a subgraph with the corresponding numbers of edges, vertices and cut edges.
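The three quantities that drive the computing and communication loads can be counted directly from a candidate division. The following sketch assumes an undirected edge list and a dictionary `part_of` that maps each vertex to the index of its subgraph; both representations and the function name `subgraph_stats` are assumptions made for illustration.

```python
from collections import Counter


def subgraph_stats(edges, part_of, k):
    """Return (vertex count, internal edge count, cut edge count) per subgraph."""
    vertices = Counter(part_of.values())
    internal, cut = Counter(), Counter()
    for u, v in edges:
        pu, pv = part_of[u], part_of[v]
        if pu == pv:
            internal[pu] += 1      # both endpoints in the same subgraph
        else:
            cut[pu] += 1           # a cut edge is counted for both sides
            cut[pv] += 1
    return [(vertices[i], internal[i], cut[i]) for i in range(k)]


edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
part_of = {0: 0, 1: 0, 2: 1, 3: 1}
print(subgraph_stats(edges, part_of, 2))  # [(2, 1, 3), (2, 1, 3)]
```

Vertex and internal edge counts feed the computing load, and the cut edge count feeds the communication load.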

On the basis of the above embodiment, as a preferred implementation, calculating the total task load corresponding to the graph to be computed includes:

Calculating the total task load corresponding to the graph to be computed in the target algorithm scenario;

Correspondingly, allocating each subgraph to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel, includes:

Allocating each subgraph to be computed to the corresponding processing node, so that each processing node performs graph computing operations based on the target algorithm.

It should be noted that the total task load generated by a graph often differs depending on the algorithm with which graph computing operations are performed on it. The focus of this embodiment is therefore to calculate the total task load corresponding to the graph to be computed in a specific target algorithm scenario, which further ensures the accuracy of the total task load. After the subgraphs have been divided out of the graph to be computed, each subgraph is allocated to the corresponding processing node so that each processing node performs graph computing operations based on the target algorithm, which further ensures the overall efficiency of graph computing performed by multiple processing nodes.

Further, as a preferred embodiment, calculating the total task load corresponding to the graph to be computed in the target algorithm scenario includes:

Using a preset model data set to calculate the total task load corresponding to the graph to be computed in the target algorithm scenario;

The generation process of the model data set includes:

Obtaining a graph sample set and a sample algorithm set;

Establishing the combination relationships between the graph samples in the graph sample set and the sample algorithms in the sample algorithm set;

Counting the sample task load of each combination relationship to obtain the model data set.

It should be noted that the focus of this embodiment is to use a preset model data set to calculate the total task load corresponding to the graph to be computed in the target algorithm scenario. The model data set is generated in advance from a graph sample set and a sample algorithm set, where the graph sample set is a set of graph samples and the sample algorithm set is a set of graph algorithms. After the graph sample set and the sample algorithm set are obtained, the combination relationships between the graph samples and the sample algorithms are established. A combination relationship is essentially a correspondence between a graph sample and a sample algorithm; it may be one-to-one or one-to-many, as determined by actual needs. After the combination relationships are established, this embodiment counts the sample task load generated by executing the sample algorithm of each combination relationship on the graph sample of that relationship, so as to obtain a model data set containing the combination relationships and their corresponding sample task loads. On this basis, using the preset model data set to calculate the total task load may specifically consist of obtaining from the model data set the target sample task load corresponding to the combination of the graph to be computed and the target algorithm, and setting that target sample task load as the total task load.
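The model data set can be pictured as a lookup table keyed by (graph, algorithm) combinations. The sketch below is a toy illustration: it pairs every graph sample with every sample algorithm, and the hypothetical `measure_load` callback with its made-up edge counts and per-edge costs stands in for actually running each algorithm and recording its load.

```python
def build_model_dataset(graph_samples, sample_algorithms, measure_load):
    """Record the sample task load of every (graph, algorithm) combination."""
    return {
        (graph, algorithm): measure_load(graph, algorithm)
        for graph in graph_samples
        for algorithm in sample_algorithms
    }


def total_task_load(dataset, graph, algorithm):
    """Look up the load recorded for the combination of graph and target algorithm."""
    return dataset[(graph, algorithm)]


# Toy inputs (made-up values, for illustration only).
graphs = {"g_small": 10, "g_large": 100}        # name -> edge count
algorithms = {"bfs": 1.0, "pagerank": 20.0}     # name -> cost per edge


def measure_load(graph, algorithm):
    """Hypothetical stand-in for measuring a real execution."""
    return graphs[graph] * algorithms[algorithm]


dataset = build_model_dataset(graphs, algorithms, measure_load)
print(total_task_load(dataset, "g_large", "pagerank"))  # 2000.0
```

In this sketch a graph that is absent from the table raises `KeyError`; how the application handles graphs outside the sample set is not stated in this passage.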

This embodiment further ensures the overall efficiency and accuracy of calculating the total task load.

On the basis of the above series of embodiments, as a preferred embodiment, the computing resource ratio includes the computing rate ratio of the computing resources.

It should be noted that the computing rate of a computing resource corresponds to the amount of data the computing resource can process per unit time, and therefore reflects the computing performance of a processing node relatively accurately. The computing rate ratio between computing resources in turn accurately reflects the relationship between the computing performances of the processing nodes. This embodiment can therefore further improve the overall accuracy of dividing the graph to be computed into the subgraphs of the processing nodes based on the computing resource ratio, and further ensures the overall efficiency of graph computing performed by multiple processing nodes.

In addition, in specific scenarios, the computing resources may be of multiple types, and correspondingly the computing resource ratio includes the computing rate ratio of the computing resources of those types. For example, if the computing resources include one or more of a CPU (central processing unit), a GPU (graphics processing unit) and an FPGA (field programmable gate array), the computing resource ratio specifically includes the computing rate ratio among one or more of the CPU, GPU and FPGA computing resources.

In order to increase the understanding of the above embodiments of the present application, a specific scenario embodiment is used for further description below.

In this scenario embodiment, the graph computing operation relies mainly on two technical parts: (1) quantitative analysis of the graph task load; (2) design of streaming heuristic rules for arbitrary proportions. The former provides the measurement basis for the division; the latter provides the basis for formulating the division rules.

(1) Quantitative analysis of the graph task load. In a broad sense, the graph task load refers to the workload of the algorithm's operations when a graph algorithm Alg that solves a graph theory problem is executed on the graph to be computed G. Specifically, it is the sum of the workload of the computing operations on the graph vertices and the workload of the data transmission operations on the connecting edges. The graph task load Load(G, Alg) is therefore inseparable from the topology of G and from the graph algorithm Alg executed on it. Drawing on the theory of graph algorithm complexity analysis, and based on observation and summary, Load(G, Alg) can be measured by the following formula:

$$\mathrm{Load}(G, Alg) = a\,(|V| - r)^{\alpha} + b\,(|E| - s)^{\beta} + c \log\big((|V| - r)\cdot(|E| - s) + 1\big) + d \tag{1}$$

Here a, b, c, d, r, s, α and β are unknown real-valued parameters whose values are closely related to the topological features of the graph and the execution behavior of the graph algorithm. These parameters can be obtained by multivariate nonlinear regression analysis, a machine learning method. The training data set is generated as follows: download graph data sets from open-source graph data collections, select a typical graph algorithm, execute on a single computer the operations specified by the graph algorithm on each large graph in order to analyze and mine it, record the execution time of the graph algorithm, and multiply the CPU computing speed of the processing node by that execution time to obtain the graph task load. The number of vertices, the number of edges and the graph task load of each graph constitute one training sample. Multiple samples are obtained by repeating this procedure, and all samples together constitute the training data set.
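Formula (1) and the construction of one training sample can be written out as a short sketch. It assumes that the logarithm is the natural logarithm (the source does not state the base) and that |V| - r and |E| - s are positive; the function names are hypothetical, and the regression step that fits the parameters is omitted.

```python
import math


def graph_task_load(num_vertices, num_edges, a, b, c, d, r, s, alpha, beta):
    """Evaluate formula (1) for a graph with the given vertex and edge counts.

    Assumptions: natural logarithm; num_vertices > r and num_edges > s.
    """
    v = num_vertices - r
    e = num_edges - s
    return a * v ** alpha + b * e ** beta + c * math.log(v * e + 1) + d


def training_sample(num_vertices, num_edges, cpu_speed, execution_time):
    """One training sample: (|V|, |E|, load), with load = CPU speed x execution time."""
    return (num_vertices, num_edges, cpu_speed * execution_time)


# Toy parameter values chosen only to make the arithmetic easy to follow.
print(graph_task_load(4, 6, a=1, b=1, c=0, d=2, r=0, s=0, alpha=1, beta=1))  # 12.0
print(training_sample(4, 6, cpu_speed=2.0, execution_time=3.0))  # (4, 6, 6.0)
```

A nonlinear least-squares routine could then be fitted to a list of such samples to estimate the eight parameters.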

For any subgraph to be computed $G_i$ of the graph to be computed $G$, with $i \in [1, k]$ where $k$ is the number of subgraphs, the task load of executing the graph algorithm Alg on it consists of two parts: the computing load and the communication load. Denote the task load, the computing load and the communication load by $\mathrm{Load}(G_i, Alg)$, $\mathrm{Load}_{IN}(G_i, Alg)$ and $\mathrm{Load}_{OUT}(G_i, Alg)$ respectively; then:

$$\mathrm{Load}(G_i, Alg) = \mathrm{Load}_{IN}(G_i, Alg) + \mathrm{Load}_{OUT}(G_i, Alg) \tag{2}$$

Here, $\mathrm{Load}_{IN}(G_i, Alg)$ can be obtained by analogy with formula (1):

[Formula (3) is not reproduced in the source text.]

where [symbol not reproduced] denotes the set of interior edges of the subgraph to be computed $G_i$. The communication load between $G_i$ and the other subgraphs is proportional to the total number of cut edges between them. If the set of cut edges between $G_i$ and the other subgraphs is denoted [symbol not reproduced], then:

[Formula (4) is not reproduced in the source text.]

Here g, h and γ are unknown real-valued parameters, which can likewise be obtained by multivariate nonlinear regression analysis.

(2) Flow heuristic rule design technology of arbitrary scale. The flow heuristic rule of any proportion is the core of the arbitrary proportion streaming partition method, and it is the key to realize the arbitrary proportion partition. The following describes the general flow of the heuristic rule design for any proportional flow: (a) Suppose the parallel computing system consists of k processing nodes p ₁ , p ₂ ,..., p _k connected through an interconnecting network, and their processing speeds are respectively denoted as sv ₁ ,sv ₂ ,...,sv _k , and satisfy sv ₁ ≤sv ₂ ≤...≤sv _k ; count the number of processing nodes in the target parallel computing system and the computing speed of the CPU of each processing node, and take their ratio as any proportional relationship

(b) Given a graph G to be operated on and a graph algorithm Alg, the graph task load Load(G, Alg) of executing Alg on G is calculated according to formula (1). Ideally, the task load of the subgraph G_i assigned to processing node p_i matches the computing capacity of p_i. This task load may be called the expected task load of p_i or of G_i, denoted Load_E(G_i, Alg); then:

Load_E(G_i, Alg) = γ_i · Load(G, Alg)    (5)
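Formula (5) can be illustrated numerically. The sketch below assumes that the share γ_i of node p_i equals its processing speed divided by the sum of all processing speeds, which is one natural reading of the proportional relationship. The function name `expected_loads` and the values are hypothetical.

```python
def expected_loads(speeds, total_load):
    """Expected task load per node: Load_E(G_i) = gamma_i * Load(G).

    Assumption: gamma_i = sv_i / (sv_1 + ... + sv_k).
    """
    total_speed = sum(speeds)
    return [total_load * sv / total_speed for sv in speeds]


# Hypothetical speeds sv_1 <= sv_2 <= sv_3 and a total load of 800 units.
speeds = [1.0, 2.0, 5.0]
loads = expected_loads(speeds, 800.0)
```

With these values the fastest node is expected to carry five eighths of the total task load.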

At each moment during the division, the actual task load Load(G_i, Alg) of the subgraph G_i already allocated to node p_i can be calculated according to formulas (2) to (4). Based on these two pieces of information, the expected task load and the actual task load, streaming heuristic rules are formulated to specify which subgraph each vertex should flow to, so that the difference between the actual task load and the expected task load of each subgraph is continuously reduced during the division. At the end of the division, a graph partition is obtained in which the task loads of the subgraphs satisfy the arbitrary proportional relationship. This partition ensures that the graph task load assigned to each processing node matches its computing speed, which improves parallel processing efficiency.
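One possible streaming rule of this kind is sketched below. It is an assumption rather than the rule defined by the present application: each arriving vertex is sent to the subgraph whose actual load falls furthest below its expected load in relative terms, and each vertex is assumed to carry a known load contribution instead of a load derived from formulas (2) to (4). The names `stream_partition` and `shares` are hypothetical.

```python
def stream_partition(vertex_stream, shares, total_load):
    """Greedy streaming partition toward target load shares.

    vertex_stream: iterable of (vertex, vertex_load) pairs.
    shares:        target fraction of the total load per subgraph.
    total_load:    estimated task load of the whole graph.
    Returns (assignment, actual_loads).
    """
    expected = [s * total_load for s in shares]
    actual = [0.0] * len(shares)
    assignment = {}
    for vertex, vertex_load in vertex_stream:
        # Assumed rule: choose the subgraph with the largest relative
        # shortfall between expected and actual load.
        target = max(
            range(len(shares)),
            key=lambda i: (expected[i] - actual[i]) / expected[i],
        )
        assignment[vertex] = target
        actual[target] += vertex_load
    return assignment, actual


# Eight unit-load vertices split between two nodes in a 1:3 ratio.
stream = [(v, 1.0) for v in range(8)]
assignment, actual = stream_partition(stream, [0.25, 0.75], 8.0)
```

In this example the final loads are 2 and 6 units, matching the 1:3 ratio. A rule that also accounts for cut edges would additionally need the neighbors of each arriving vertex.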

Referring to FIG. 3, an embodiment of the present application provides a graph computing apparatus, including:

a proportion statistics module 10, configured to compile statistics on the computing resource ratio between the processing nodes;

a graph obtaining module 11, configured to obtain the graph to be computed;

a graph dividing module 12, configured to divide the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, wherein the ratio between the task loads of the subgraphs to be computed allocated to the processing nodes is consistent with the computing resource ratio;

a subgraph allocation module 13, configured to allocate each subgraph to be computed to its corresponding processing node, so that the processing nodes perform graph computing operations in parallel.
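The role of the subgraph allocation module can be illustrated with a small sketch in which worker threads stand in for processing nodes. This is an assumption made for illustration only: a real deployment would dispatch subgraphs to separate machines, and the `run_in_parallel` helper is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(subgraphs, compute):
    """Apply compute(subgraph) to every subgraph concurrently.

    Each worker thread stands in for one processing node; results are
    returned in the same order as the input subgraphs.
    """
    with ThreadPoolExecutor(max_workers=len(subgraphs)) as pool:
        return list(pool.map(compute, subgraphs))


# Hypothetical subgraphs given as edge lists, sized in a 1:2:3 ratio.
subgraphs = [[(0, 1)], [(1, 2), (2, 3)], [(3, 4), (4, 5), (5, 0)]]
edge_counts = run_in_parallel(subgraphs, len)
```

Here `len` is a placeholder for the graph computing operation executed on each node.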

Further, as a preferred embodiment, the graph division module includes:

The graph computing apparatus provided by the present application first compiles statistics on the computing resource ratio between the processing nodes used to perform graph computing and obtains the graph to be computed. It then divides the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, where the ratio between the task loads of the subgraphs allocated to the processing nodes is consistent with the computing resource ratio. Finally, each subgraph to be computed is allocated to its corresponding processing node, so that the processing nodes perform graph computing operations in parallel. Because the apparatus divides the graph into subgraphs whose task loads follow the computing resource ratio between the processing nodes, each processing node receives a subgraph whose load matches its computing performance, which relatively ensures the overall efficiency of graph computing across multiple processing nodes.

In addition, an embodiment of the present application also provides a graph computing device, including:

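a memory for storing a computer program;

a processor for implementing the steps of the above-mentioned graph computing method when executing the computer program.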

The graph computing device provided by the present application first compiles statistics on the computing resource ratio between the processing nodes used to perform graph computing and obtains the graph to be computed. It then divides the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, where the ratio between the task loads of the subgraphs allocated to the processing nodes is consistent with the computing resource ratio. Finally, each subgraph to be computed is allocated to its corresponding processing node, so that the processing nodes perform graph computing operations in parallel. Because the device divides the graph into subgraphs whose task loads follow the computing resource ratio between the processing nodes, each processing node receives a subgraph whose load matches its computing performance, which relatively ensures the overall efficiency of graph computing across multiple processing nodes.

In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above-mentioned graph computing method are implemented.

The computer-readable storage medium provided by the present application first compiles statistics on the computing resource ratio between the processing nodes used to perform graph computing and obtains the graph to be computed. It then divides the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, where the ratio between the task loads of the subgraphs allocated to the processing nodes is consistent with the computing resource ratio. Finally, each subgraph to be computed is allocated to its corresponding processing node, so that the processing nodes perform graph computing operations in parallel. Because the graph is divided into subgraphs whose task loads follow the computing resource ratio between the processing nodes, each processing node receives a subgraph whose load matches its computing performance, which relatively ensures the overall efficiency of graph computing across multiple processing nodes.

A graph computing method, apparatus, device, and storage medium provided by the present application have been described in detail above. The various embodiments in the specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments, and the same and similar parts between the various embodiments can be referred to each other. As for the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant part can be referred to the description of the method. It should be pointed out that for those of ordinary skill in the art, without departing from the principles of the present application, several improvements and modifications can also be made to the present application, and these improvements and modifications also fall within the protection scope of the claims of the present application.

It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprising", "including" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or device that includes a list of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article or device that includes the element.

Claims

1. A graph computing method, comprising:

compiling statistics on a computing resource ratio between processing nodes;

obtaining a graph to be computed;

dividing the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, wherein the ratio between the task loads of the subgraphs to be computed allocated to the processing nodes is consistent with the computing resource ratio; and

allocating each of the subgraphs to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel.
2. The graph computing method according to claim 1, wherein dividing the graph to be computed into the subgraphs to be computed of each processing node based on the computing resource ratio comprises:

while the graph to be computed is being obtained, dividing the graph to be computed into the subgraphs to be computed of each processing node based on the computing resource ratio.
3. The graph computing method according to claim 1, wherein dividing the graph to be computed into the subgraphs to be computed of each processing node based on the computing resource ratio comprises:

calculating a total task load corresponding to the graph to be computed;

calculating a task load expectation of each of the processing nodes according to the total task load and the computing resource ratio; and

dividing the graph to be computed into the corresponding subgraphs to be computed according to the task load expectation of each of the processing nodes.
4. The graph computing method according to claim 3, wherein the task load expectation includes a computation load expectation and a communication load expectation.
5. The graph computing method according to claim 3, wherein calculating the total task load corresponding to the graph to be computed comprises:

calculating the total task load corresponding to the graph to be computed in a target algorithm scenario;

and wherein allocating each of the subgraphs to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel, comprises:

allocating each of the subgraphs to be computed to the corresponding processing node, so that each of the processing nodes performs a graph computing operation based on the target algorithm.
6. The graph computing method according to claim 5, wherein calculating the total task load corresponding to the graph to be computed in the target algorithm scenario comprises:

using a preset model data set to calculate the total task load corresponding to the graph to be computed in the target algorithm scenario;

wherein the generation process of the model data set comprises:

obtaining a graph sample set and a sample algorithm set;

establishing combination relationships between the graph samples in the graph sample set and the sample algorithms in the sample algorithm set; and

compiling statistics on the sample task load of each of the combination relationships to obtain the model data set.
7. The graph computing method according to any one of claims 1 to 6, wherein the computing resource ratio comprises a computing rate ratio of computing resources.
8. A graph computing apparatus, comprising:

a proportion statistics module, configured to compile statistics on a computing resource ratio between processing nodes;

a graph obtaining module, configured to obtain a graph to be computed;

a graph dividing module, configured to divide the graph to be computed into a subgraph to be computed for each processing node based on the computing resource ratio, wherein the ratio between the task loads of the subgraphs to be computed allocated to the processing nodes is consistent with the computing resource ratio; and

a subgraph allocation module, configured to allocate each of the subgraphs to be computed to the corresponding processing node, so that the processing nodes perform graph computing operations in parallel.
9. A graph computing device, comprising:

a memory for storing a computer program; and

a processor configured to implement the steps of the graph computing method according to any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the graph computing method according to any one of claims 1 to 6 are implemented.