CN113326125A - Large-scale distributed graph calculation end-to-end acceleration method and device - Google Patents
Large-scale distributed graph calculation end-to-end acceleration method and device
- Publication number
- CN113326125A (application number CN202110552903.0A)
- Authority
- CN
- China
- Prior art keywords
- algorithm
- graph
- load balancing
- task
- radix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a large-scale distributed graph computation end-to-end acceleration method and device. The method comprises: performing task division on distributed graph computation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task; selecting a corresponding information flow mode to complete the model selection task; dividing the vertices into different graph partitions according to an end-to-end partition index, and then distributing the vertices through a streaming block partitioning algorithm with an optimal threshold; and expanding the load balancing radix sorting algorithm to obtain a NUMA-aware load balancing radix sorting algorithm, through which the edge array in the underlying graph data format is converted into an adjacency linked list by a distributed sorting algorithm. This acceleration scheme, which takes end-to-end time as the optimization target, can greatly accelerate end-to-end graph processing performance.
Description
Technical Field
The invention relates to the technical field of distributed computing, and in particular to an end-to-end acceleration method and device for large-scale distributed graph computation.
Background
In the big data era, applications such as social networks, the Internet of Things and e-commerce generate a great deal of data. This data is generally organized in graph form and keeps growing, having already reached the TB level. To process such large-scale graph data efficiently, a large number of distributed graph computing systems have been proposed.
The processing flow of a distributed graph computing system generally includes two phases. The first is the preprocessing phase: naturally occurring graphs are large and irregular, and require preprocessing before a particular graph algorithm can be executed. In the preprocessing phase, the format of the input graph is converted and the graph is partitioned across different machines. The second is the algorithm execution phase, in which a specific graph algorithm is executed on the preprocessed graph. Most graph computing systems mainly optimize the efficiency of the algorithm execution phase and ignore the performance of the preprocessing phase, resulting in very long end-to-end processing times.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide an end-to-end acceleration method for large-scale distributed graph computation, which takes end-to-end time as the optimization target and can greatly accelerate end-to-end graph processing performance.
Another objective of the present invention is to provide an end-to-end acceleration apparatus for large-scale distributed graph computation.
In order to achieve the above object, an embodiment of an aspect of the present invention provides an end-to-end acceleration method for large-scale distributed graph computation, including:
performing task division on the distributed graph calculation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task;
selecting a corresponding information flow mode to complete the model selection task;
dividing the vertexes into different graph divisions according to the end-to-end division indexes, and then distributing the vertexes through a streaming block division algorithm with an optimal threshold value;
and expanding the load balancing radix sorting algorithm to obtain a NUMA-aware load balancing radix sorting algorithm, and converting the edge array in the underlying graph data format into an adjacency linked list by a distributed sorting algorithm through the NUMA-aware load balancing radix sorting algorithm.
According to the large-scale distributed graph computation end-to-end acceleration method of the embodiment of the invention, end-to-end time is taken as the optimization target: a static mode is proposed based on theoretical analysis to reduce the data preprocessing workload; a more balanced end-to-end partition index and a streaming block partitioning algorithm are provided; and a faster, more efficient distributed sorting algorithm accelerates the sorting process.
In addition, the large-scale distributed graph computation end-to-end acceleration method according to the above embodiment of the present invention may further have the following additional technical features:
further, the information flow mode comprises a push mode and a pull mode, wherein the push mode is that each vertex pushes the updated information to a target vertex through an outgoing edge; the pull mode is that each vertex pulls the updated information from the source vertex to itself through an incoming edge.
Further, the end-to-end division index is:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
Further, the optimal threshold value of the algorithm is found by binary search.
Further, expanding the load balancing radix sorting algorithm to obtain the NUMA-aware load balancing radix sorting algorithm includes:
using shared memory communication, and allocating memory for different threads on specific NUMA nodes.
In order to achieve the above object, another embodiment of the present invention provides an end-to-end acceleration apparatus for large-scale distributed graph computation, including:
the division module is used for carrying out task division on the distributed graph calculation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task;
the selection module is used for selecting the corresponding information flow mode to complete the model selection task;
the distribution module is used for dividing the vertexes into different graph divisions according to the end-to-end division indexes and distributing the vertexes through a streaming block division algorithm of an optimal threshold value;
the building module is used for expanding the load balancing radix sorting algorithm to obtain the NUMA-aware load balancing radix sorting algorithm, through which the edge array in the underlying graph data format is converted into the adjacency linked list by a distributed sorting algorithm.
The large-scale distributed graph calculation end-to-end accelerating device takes end-to-end time as an optimization target, and reduces data preprocessing flow by proposing a static mode based on theoretical analysis; providing a more balanced end-to-end partition index and a streaming block partition algorithm; a faster and more efficient distributed sorting algorithm accelerated sorting process is provided.
In addition, the large-scale distributed graph computation end-to-end acceleration apparatus according to the above embodiment of the present invention may further have the following additional technical features:
further, the information flow mode comprises a push mode and a pull mode, wherein the push mode is that each vertex pushes the updated information to a target vertex through an outgoing edge; the pull mode is that each vertex pulls the updated information from the source vertex to itself through an incoming edge.
Further, the end-to-end division index is:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
Further, the optimal threshold value of the algorithm is found by binary search.
Further, expanding the load balancing radix sorting algorithm to obtain the NUMA-aware load balancing radix sorting algorithm includes:
using shared memory communication, and allocating memory for different threads on specific NUMA nodes.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of a large-scale distributed graph computation end-to-end acceleration method according to one embodiment of the invention;
FIG. 2 is a block partitioning algorithm for optimal threshold according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a NUMA-aware load balancing radix sorting algorithm according to one embodiment of the invention;
FIG. 4 is a schematic diagram of a distributed NUMA-aware load balancing radix sorting algorithm according to one embodiment of the invention;
FIG. 5 is a block diagram of a large-scale distributed graph computation end-to-end acceleration apparatus according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a method and an apparatus for end-to-end acceleration of large-scale distributed graph computation according to an embodiment of the present invention with reference to the accompanying drawings.
First, a proposed large-scale distributed graph computation end-to-end acceleration method according to an embodiment of the present invention will be described with reference to the accompanying drawings.
FIG. 1 is a flow diagram of a large-scale distributed graph computation end-to-end acceleration method according to one embodiment of the invention.
As shown in fig. 1, the large-scale distributed graph computation end-to-end acceleration method includes the following steps:
and step S1, performing task division on the distributed graph calculation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task.
Distributed graph computation typically involves two phases of preprocessing and algorithm execution, and the lack of attention to the preprocessing phase in existing schemes results in very long end-to-end processing times. The pre-processing stage can be divided into three tasks: the method comprises a model selection task, a vertex distribution task and an adjacency linked list construction task.
In step S2, a corresponding information flow mode is selected to complete the model selection task.
In the task of model selection, the characteristics of an algorithm and an information flow mode are fully considered, and a static mode is provided to reduce the workload of preprocessing.
The first task of the preprocessing is to select the information flow pattern used by the algorithm. Two information flow modes are used: one is the push mode, in which each vertex pushes its updated information to the destination vertices through its outgoing edges; the other is the pull mode, in which each vertex pulls updated information from the source vertices to itself through its incoming edges. Existing graph algorithms can be divided into two types: always-active-style graph algorithms and traversal-style graph algorithms. It can be proved that for always-active-style graph algorithms, the time of the push mode is strictly less than that of the pull mode, while for traversal-style graph algorithms, the time of the pull mode is strictly less than that of the push mode. Based on this theorem, a specific mode can be statically chosen for a specific algorithm.
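The static selection rule above can be sketched as a small dispatch function. This is an illustrative assumption based only on the theorem stated in this paragraph; the names and the two-way classification are not an API from the invention itself.

```python
# Hypothetical sketch of static mode selection. Per the theorem above:
# push is strictly faster for always-active algorithms (e.g. PageRank),
# pull is strictly faster for traversal algorithms (e.g. BFS).
ALWAYS_ACTIVE = "always-active"
TRAVERSAL = "traversal"

def select_mode(algorithm_style: str) -> str:
    """Choose the information flow mode without any runtime profiling."""
    if algorithm_style == ALWAYS_ACTIVE:
        return "push"
    if algorithm_style == TRAVERSAL:
        return "pull"
    raise ValueError(f"unknown algorithm style: {algorithm_style!r}")
```

Because the choice depends only on the algorithm class, it can be made once before preprocessing begins, which is what removes the mode-selection work from the preprocessing flow.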
And step S3, dividing the vertexes into different graph divisions according to the end-to-end division indexes, and then distributing the vertexes through a streaming block division algorithm of an optimal threshold value.
In the vertex distribution task, a more representative division index is provided according to the characteristics of an end-to-end task; and then a streaming block partitioning algorithm with theoretical guarantee is provided to enable end-to-end task partitioning to be more balanced.
The second task is to divide the vertices onto different graph partitions, making the workload on each slice as equal as possible. First, a more balanced partitioning formula is proposed as follows:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
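As an illustration, the partition index can be transcribed directly into code; representing a partition as a vertex list plus a degree map is an assumption made for this sketch, not the invention's data layout.

```python
def partition_cost(vertices, degree, eta, theta, k):
    """End-to-end partition index: (1 + eta + theta*(K-1)) * E(P_i) + eta*(K-1) * V(P_i)."""
    e_pi = sum(degree[v] for v in vertices)  # E(P_i): edges of all vertices in the partition
    v_pi = len(vertices)                     # V(P_i): number of vertices in the partition
    return (1 + eta + theta * (k - 1)) * e_pi + eta * (k - 1) * v_pi
```

A balanced vertex distribution would then aim to make this cost roughly equal across all K partitions.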
This formula takes into account the communication load and the computation load of both the preprocessing phase and the algorithm execution phase. A streaming block partitioning algorithm with an optimal threshold is then proposed. The chunk-based (chunking) partitioning algorithm is the partitioning algorithm with the lowest known preprocessing cost, but existing chunking algorithms suffer from load imbalance. As shown in fig. 2, the optimal partitioning strategy is found by searching for the optimal threshold. It can be shown that the objective of this search is a non-decreasing function and that the optimal threshold is exactly the point where the function value changes. Based on these two properties, the optimal threshold can be found efficiently with a binary search.
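The threshold search can be sketched as a standard binary search, assuming (as the text states) that the objective is non-decreasing and the optimum is the first point where its value changes; the predicate `changed` below is a hypothetical stand-in for that change test.

```python
def find_optimal_threshold(lo: int, hi: int, changed) -> int:
    """Binary-search the first threshold t in [lo, hi] where changed(t) is True.

    Relies on the monotonicity property stated above: changed is False
    below the optimal threshold and True at or beyond it.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if changed(mid):
            hi = mid       # the change point is at mid or to its left
        else:
            lo = mid + 1   # still before the change point
    return lo
```

This makes the search logarithmic in the size of the threshold range instead of linear, which keeps the partitioning step cheap.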
And step S4, the load balancing radix sorting algorithm is expanded to obtain a NUMA-aware load balancing radix sorting algorithm, through which the edge array in the underlying graph data format is converted into an adjacency linked list by a distributed sorting algorithm.
In the task of constructing the adjacency linked list, a load balancing radix sorting algorithm is used. The algorithm is then extended to the distributed scenario by exploiting the characteristics of graph computation, which greatly reduces the distributed communication overhead.
The third task is to convert the edge array in the underlying graph data format into the adjacency linked list using a distributed sorting algorithm; this is the most time-consuming task in the preprocessing stage. For intra-machine sorting, the existing load balancing radix sorting algorithm is extended into a NUMA-aware load balancing radix sorting algorithm, as shown in fig. 3, so that it better fits the NUMA architecture of current servers. The extension has two aspects: first, shared memory communication is used; second, memory is allocated for different threads on specific NUMA nodes. Then, for inter-machine sorting, the NUMA-aware load balancing radix sorting algorithm is extended to the distributed scenario, as shown in fig. 4. A characteristic of graph computation, namely that part of the sorting result is known before sorting, is fully exploited: the data can be partially sorted before transmission to reduce the amount of data transmitted.
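The core of this task can be illustrated by a single-threaded counting/scatter pass in the spirit of one radix-sort digit keyed on the source vertex ID, which sorts the edge array and emits CSR-style adjacency lists. The NUMA pinning, multi-threaded load balancing and inter-machine exchange described above are deliberately omitted; this is a minimal sketch of the sort-then-build structure only.

```python
def edges_to_adjacency(num_vertices, edges):
    """Sort an edge array by source vertex and emit CSR-style adjacency lists.

    This is one counting/scatter pass of a radix sort keyed on the source
    vertex; a full radix sort would repeat such passes per digit of the ID.
    """
    # Counting pass: out-degree of every source vertex.
    counts = [0] * num_vertices
    for src, _dst in edges:
        counts[src] += 1
    # Prefix sums give each vertex's slot range in the sorted edge array.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + counts[v]
    # Scatter pass: place every destination into its source vertex's range.
    cursor = offsets[:-1].copy()
    neighbors = [0] * len(edges)
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors
```

In a NUMA-aware version, the counts and scatter targets for each thread would be allocated on that thread's NUMA node, and in the distributed version each machine would pre-sort its local range before exchanging data, as the text describes.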
According to the large-scale distributed graph calculation end-to-end acceleration method provided by the embodiment of the invention, end-to-end time is taken as an optimization target, and a static mode is provided based on theoretical analysis to reduce a data preprocessing flow; providing a more balanced end-to-end partition index and a streaming block partition algorithm; a faster and more efficient distributed sorting algorithm accelerated sorting process is provided.
Next, a description is given, with reference to the drawings, of a large-scale distributed graph computation end-to-end acceleration apparatus proposed according to an embodiment of the present invention.
FIG. 5 is a block diagram of a large-scale distributed graph computation end-to-end acceleration apparatus according to an embodiment of the present invention.
As shown in fig. 5, the large-scale distributed graph computation end-to-end acceleration apparatus includes: a partitioning module 501, a selection module 502, an assignment module 503, and a construction module 504.
The partitioning module 501 is configured to perform task partitioning on the distributed graph computation, so as to obtain a model selection task, a vertex allocation task, and an adjacency linked list construction task.
The selection module 502 is configured to select the corresponding information flow mode to complete the model selection task.
The allocating module 503 is configured to divide the vertex into different graph partitions according to the end-to-end division index, and allocate the vertex through a streaming block division algorithm with an optimal threshold.
The building module 504 is configured to expand the load balancing radix sorting algorithm to obtain a NUMA-aware load balancing radix sorting algorithm, and to convert the edge array in the underlying graph data format into the adjacency linked list using a distributed sorting algorithm through the NUMA-aware load balancing radix sorting algorithm.
Furthermore, the information flow mode comprises a push mode and a pull mode, wherein the push mode is that each vertex pushes the updated information to the target vertex through an outgoing edge; the pull mode is that each vertex pulls the updated information from the source vertex to itself through an incoming edge.
Further, the end-to-end division index is:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
Further, the optimal threshold value of the algorithm is found by binary search.
Further, expanding the load balancing radix sorting algorithm to obtain the NUMA-aware load balancing radix sorting algorithm includes:
using shared memory communication, and allocating memory for different threads on specific NUMA nodes.
It should be noted that the foregoing explanation of the method embodiment is also applicable to the apparatus of this embodiment, and is not repeated herein.
According to the large-scale distributed graph calculation end-to-end accelerating device provided by the embodiment of the invention, end-to-end time is taken as an optimization target, and a static mode is provided based on theoretical analysis to reduce a data preprocessing flow; providing a more balanced end-to-end partition index and a streaming block partition algorithm; a faster and more efficient distributed sorting algorithm accelerated sorting process is provided.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (10)
1. An end-to-end acceleration method for large-scale distributed graph computation is characterized by comprising the following steps:
performing task division on the distributed graph calculation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task;
selecting a corresponding information flow mode to complete the model selection task;
dividing the vertexes into different graph divisions according to the end-to-end division indexes, and then distributing the vertexes through a streaming block division algorithm with an optimal threshold value;
and expanding the load balancing radix sorting algorithm to obtain a NUMA-aware load balancing radix sorting algorithm, and converting the edge array in the underlying graph data format into an adjacency linked list by a distributed sorting algorithm through the NUMA-aware load balancing radix sorting algorithm.
2. The method of claim 1, wherein the information flow modes comprise a push mode and a pull mode, wherein in the push mode each vertex pushes its updated information to the destination vertices through its outgoing edges, and in the pull mode each vertex pulls updated information from the source vertices to itself through its incoming edges.
3. The method of claim 1, wherein the end-to-end partition index is:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
4. The method of claim 1, wherein the optimal threshold value of the algorithm is found by binary search.
5. The method of claim 1, wherein expanding the load balancing radix sorting algorithm into the NUMA-aware load balancing radix sorting algorithm comprises:
using shared memory communication, and allocating memory for different threads on specific NUMA nodes.
6. An end-to-end acceleration apparatus for large-scale distributed graph computation, comprising:
the division module is used for carrying out task division on the distributed graph calculation to obtain a model selection task, a vertex distribution task and an adjacency linked list construction task;
the selection module is used for selecting the corresponding information flow mode to complete the model selection task;
the distribution module is used for dividing the vertexes into different graph divisions according to the end-to-end division indexes and distributing the vertexes through a streaming block division algorithm of an optimal threshold value;
the building module is used for expanding the load balancing radix sorting algorithm to obtain the NUMA-aware load balancing radix sorting algorithm, through which the edge array in the underlying graph data format is converted into the adjacency linked list by a distributed sorting algorithm.
7. The apparatus of claim 6, wherein the information flow modes comprise a push mode and a pull mode, wherein in the push mode each vertex pushes its updated information to the destination vertices through its outgoing edges, and in the pull mode each vertex pulls updated information from the source vertices to itself through its incoming edges.
8. The apparatus of claim 6, wherein the end-to-end partition index is:
(1 + η + θ(K-1)) * E(P_i) + η(K-1) * V(P_i)
where η is a variable parameter balancing the weights of preprocessing and algorithm execution, θ is the communication ratio in the distributed sorting algorithm, K is the number of partitions the whole graph is divided into, E(P_i) is the number of edges of all vertices in partition P_i, and V(P_i) is the number of vertices in partition P_i.
9. The apparatus of claim 6, wherein the optimal threshold value of the algorithm is found by binary search.
10. The apparatus of claim 6, wherein expanding the load balancing radix sorting algorithm into the NUMA-aware load balancing radix sorting algorithm comprises:
using shared memory communication, and allocating memory for different threads on specific NUMA nodes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110552903.0A CN113326125B (en) | 2021-05-20 | 2021-05-20 | Large-scale distributed graph calculation end-to-end acceleration method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110552903.0A CN113326125B (en) | 2021-05-20 | 2021-05-20 | Large-scale distributed graph calculation end-to-end acceleration method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113326125A true CN113326125A (en) | 2021-08-31 |
CN113326125B CN113326125B (en) | 2023-03-24 |
Family
ID=77416134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110552903.0A Active CN113326125B (en) | 2021-05-20 | 2021-05-20 | Large-scale distributed graph calculation end-to-end acceleration method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113326125B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130097321A1 (en) * | 2011-10-17 | 2013-04-18 | Yahoo! Inc. | Method and system for work load balancing |
CN104780213A (en) * | 2015-04-17 | 2015-07-15 | 华中科技大学 | Load dynamic optimization method for principal and subordinate distributed graph manipulation system |
CN104954823A (en) * | 2014-03-31 | 2015-09-30 | 华为技术有限公司 | Image calculation pretreatment device, method thereof and system thereof |
CN105787020A (en) * | 2016-02-24 | 2016-07-20 | 鄞州浙江清华长三角研究院创新中心 | Graph data partitioning method and device |
CN109919826A (en) * | 2019-02-02 | 2019-06-21 | 西安邮电大学 | A kind of diagram data compression method and figure computation accelerator for figure computation accelerator |
US20190278760A1 (en) * | 2008-11-14 | 2019-09-12 | Georgetown University | Process and Framework For Facilitating Information Sharing Using a Distributed Hypergraph |
CN110245135A (en) * | 2019-05-05 | 2019-09-17 | 华中科技大学 | Large-scale streaming graph data update method based on NUMA architecture |
CN111209106A (en) * | 2019-12-25 | 2020-05-29 | 北京航空航天大学杭州创新研究院 | Streaming graph partitioning method and system based on cache mechanism |
CN111581443A (en) * | 2020-04-16 | 2020-08-25 | 南方科技大学 | Distributed graph calculation method, terminal, system and storage medium |
US20210081347A1 (en) * | 2019-09-17 | 2021-03-18 | Huazhong University Of Science And Technology | Graph processing optimization method based on multi-fpga accelerator interconnection |
2021
- 2021-05-20 CN CN202110552903.0A patent/CN113326125B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190278760A1 (en) * | 2008-11-14 | 2019-09-12 | Georgetown University | Process and Framework For Facilitating Information Sharing Using a Distributed Hypergraph |
US20130097321A1 (en) * | 2011-10-17 | 2013-04-18 | Yahoo! Inc. | Method and system for work load balancing |
CN104954823A (en) * | 2014-03-31 | 2015-09-30 | 华为技术有限公司 | Graph computation preprocessing device, method and system |
CN104780213A (en) * | 2015-04-17 | 2015-07-15 | 华中科技大学 | Dynamic load optimization method for a master-slave distributed graph processing system |
CN105787020A (en) * | 2016-02-24 | 2016-07-20 | 鄞州浙江清华长三角研究院创新中心 | Graph data partitioning method and device |
CN109919826A (en) * | 2019-02-02 | 2019-06-21 | 西安邮电大学 | Graph data compression method for a graph computation accelerator, and graph computation accelerator |
CN110245135A (en) * | 2019-05-05 | 2019-09-17 | 华中科技大学 | Large-scale streaming graph data update method based on NUMA architecture |
US20210081347A1 (en) * | 2019-09-17 | 2021-03-18 | Huazhong University Of Science And Technology | Graph processing optimization method based on multi-fpga accelerator interconnection |
CN111209106A (en) * | 2019-12-25 | 2020-05-29 | 北京航空航天大学杭州创新研究院 | Streaming graph partitioning method and system based on cache mechanism |
CN111581443A (en) * | 2020-04-16 | 2020-08-25 | 南方科技大学 | Distributed graph calculation method, terminal, system and storage medium |
Non-Patent Citations (3)
Title |
---|
殷晓波 (Yin Xiaobo) et al., "Research on a Relaxed and Optimized Balanced Streaming Graph Partitioning Algorithm", Computer Science (《计算机科学》) *
王童童 (Wang Tongtong) et al., "A Survey of Distributed Graph Processing Systems", Journal of Software (《软件学报》) *
罗冬梅 (Luo Dongmei), "Balanced Graph Partitioning Algorithms for Distributed Graph Computing", Information & Computer (Theory Edition) (《信息与电脑(理论版)》) *
Also Published As
Publication number | Publication date |
---|---|
CN113326125B (en) | 2023-03-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108566659B (en) | 5G network slice online mapping method based on reliability | |
Nabi et al. | Resource assignment in vehicular clouds | |
CN106250233B (en) | MapReduce performance optimization system and optimization method | |
Schlag et al. | Scalable edge partitioning | |
CN112148492A (en) | Service deployment and resource allocation method considering multi-user mobility | |
CN114418127A (en) | Machine learning calculation optimization method and platform | |
Xu et al. | Computational experience with a software framework for parallel integer programming | |
Koh et al. | MapReduce skyline query processing with partitioning and distributed dominance tests | |
CN111538867A (en) | Method and system for dividing bounded incremental graph | |
Badri et al. | A sample average approximation-based parallel algorithm for application placement in edge computing systems | |
CN113326125B (en) | Large-scale distributed graph calculation end-to-end acceleration method and device | |
WO2015055502A2 (en) | Method of partitioning storage in a distributed data storage system and corresponding device | |
Choo et al. | Reliable vehicle selection algorithm with dynamic mobility of vehicle in vehicular cloud system | |
CN116303763A (en) | Distributed graph database incremental graph partitioning method and system based on vertex degree | |
Abdolazimi et al. | Connected components of big graphs in fixed mapreduce rounds | |
Guinand et al. | Sensitivity analysis of tree scheduling on two machines with communication delays | |
Herrera et al. | Dynamic and hierarchical load-balancing techniques applied to parallel branch-and-bound methods | |
Menouer et al. | Towards a parallel constraint solver for cloud computing environments | |
CN113157431A (en) | Computing task copy distribution method for edge network application environment | |
CN110188925A (en) | Task allocation method for time-domain continuous spatial crowdsourcing | |
CN111737531B (en) | Application-driven graph division adjusting method and system | |
Karanik et al. | Edge Service Allocation Based on Clustering Techniques | |
CN117349031B (en) | Distributed super computing resource scheduling analysis method, system, terminal and medium | |
Shahin | Using heavy clique base coarsening to enhance virtual network embedding | |
Cavallo et al. | Fragmenting Big Data to boost the performance of MapReduce in geographical computing contexts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||