CN111538867A - Method and system for dividing bounded incremental graph - Google Patents


Info

Publication number
CN111538867A
Authority
CN
China
Prior art keywords
sub
graph structure
graph
edges
expansion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010294991.4A
Other languages
Chinese (zh)
Other versions
CN111538867B (en)
Inventor
樊文飞
田超
许瑞琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Computing Sciences
Original Assignee
Shenzhen Institute of Computing Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Computing Sciences filed Critical Shenzhen Institute of Computing Sciences
Priority to CN202010294991.4A priority Critical patent/CN111538867B/en
Priority to PCT/CN2020/087707 priority patent/WO2021208147A1/en
Publication of CN111538867A publication Critical patent/CN111538867A/en
Application granted granted Critical
Publication of CN111538867B publication Critical patent/CN111538867B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Abstract

The invention discloses a method and a system for bounded incremental graph partitioning. The method comprises the following steps: a coordinator divides an initial graph structure into a plurality of first sub-graphs, obtains a corresponding plurality of first sub-partitions, and distributes the first sub-partitions to a plurality of services; each service iteratively expands the first sub-partition it receives, checks during the iterative expansion whether the first sub-partition has reached a preset balance upper bound, and stops expanding the first sub-partition once the bound is reached; the coordinator checks whether update data exists; if update data exists, the update data is merged with the initial graph structure to obtain the partial graph structure affected by the update, the partial graph structure is divided into a plurality of second sub-graphs with corresponding second sub-partitions, the second sub-partitions are distributed to the services, and each service that receives a second sub-partition iteratively expands it. The invention can reduce the computational overhead of distributed graph partitioning and make the partitioning result more balanced.

Description

Method and system for dividing bounded incremental graph
Technical Field
The invention relates to the field of distributed graph partitioning, in particular to a bounded incremental graph partitioning method and a bounded incremental graph partitioning system.
Background
A graph is a network of vertices and the edges between them. Graph partitioning divides a graph into subgraphs such that the sizes of the different subgraphs are approximately equal and the resulting partitioning cost (cut edges or cut vertices) is as small as possible. Graph partitioning falls into two kinds: vertex partitioning, which divides the vertex set of the graph, and edge partitioning, which divides the edge set of the graph. Graph partitioning problems are prevalent in many areas of computer science and technology, such as image segmentation, data clustering, large-scale integrated circuit design, and distributed parallel computing systems. Moreover, many practical problems can be modeled as graphs, such as knowledge graphs.
In recent years, with the development of the internet, graph data has grown explosively, which poses great challenges for traditional graph computation, such as the computation and storage of large-scale graph data. Graph data at big-data scale cannot fit in the memory of a single machine, so the graph must be partitioned and stored on different computing nodes for distributed computing. A distributed system is a computing system consisting of a plurality of independent computers and a communication network among them, and each computing node has its own CPU, memory address space and storage space. Distributed graph computation needs to partition large-scale graph data into a plurality of subgraphs, store the subgraphs in the memories or disks of different nodes, have all nodes compute simultaneously, and coordinate the computation through network communication to complete the computation task. Whether a distributed computing system can operate efficiently depends on the computational performance of each node, the system bandwidth, and the quality of the graph partitioning. One important indicator of efficiency is the response time of the distributed system, i.e., the total time from submitting a computation task to obtaining the computation result.
Two indicators need to be considered when partitioning a graph. The first is load balancing: when the load is unevenly distributed, the computing node with the highest load becomes a computing bottleneck and seriously delays the response time. Assuming that all computing nodes have equal amounts of computing resources, the more balanced the graph partitioning, the shorter the total response time. One indicator of graph partitioning is therefore balance. The second is communication overhead: communication between nodes over the network also increases response time. Communication is caused by the boundaries of the partitioned graph and is generated whenever a computation needs to cross a partition boundary. Therefore, the sparser the boundary of the graph partitioning, the smaller the total amount of communication, and the less time communication takes.
Widely used graph partitioning systems include METIS (a software package for serial graph partitioning), XtraPuLP (a graph partitioning tool), etc., which can generate a partition of graph data for a static graph. However, in practical applications, most graph data is dynamic and frequently updated, and the updated part is often only a small proportion of the whole graph. Static graph partitioning methods and systems need to recompute the partition of the whole graph, which incurs huge computational overhead and takes a long time. For example, partitioning static graph data of about 20 GB with XtraPuLP takes 10 minutes or more. This calls for incremental partitioning, i.e., dynamically computing a new graph partition from the updated portion of the graph data and the existing partitioning result. When the update is small, the change in the partitioning result is generally small as well, so incremental partitioning can quickly return a new partitioning result.
The existing graph partitioning methods all have certain disadvantages. Non-incremental vertex partitioning and edge partitioning must recompute everything even for a small update, which increases computational overhead; unbounded incremental vertex partitioning yields unbalanced partitions and still incurs a large computational cost for a small update; unbounded incremental edge partitioning likewise incurs a relatively large computational overhead for a small update; bounded incremental vertex partitioning cannot achieve a balanced result because it partitions the graph by vertices. That is, each of these distributed graph partitioning methods fails, to a greater or lesser extent, to satisfy the two indicators that need to be considered when partitioning a graph.
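The two indicators can be made concrete for an edge partition with a small sketch. It assumes, for illustration only, that each sub-partition is a set of (u, v) edge tuples; the function name partition_metrics and the balance ratio used here are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def partition_metrics(partitions):
    """partitions: list of edge sets, one per sub-partition; an edge is a (u, v) tuple."""
    sizes = [len(part) for part in partitions]
    total_edges = sum(sizes)
    k = len(partitions)
    # Load balance: largest part relative to the ideal size |E|/k (1.0 is perfectly balanced).
    balance = max(sizes) / (total_edges / k)
    # Record which parts each vertex appears in.
    parts_of_vertex = defaultdict(set)
    for index, part in enumerate(partitions):
        for u, v in part:
            parts_of_vertex[u].add(index)
            parts_of_vertex[v].add(index)
    # A vertex held by more than one part lies on a partition boundary and causes communication.
    cut_vertices = {v for v, parts in parts_of_vertex.items() if len(parts) > 1}
    return balance, cut_vertices

# A 4-cycle split into two parts of two edges each.
balance, cut_vertices = partition_metrics([{(1, 2), (2, 3)}, {(3, 4), (4, 1)}])
```

A lower balance ratio means a smaller bottleneck node, and a smaller cut-vertex set means less cross-node communication.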
Disclosure of Invention
The embodiment of the invention provides a method and a system for dividing a bounded incremental graph, aiming at reducing the calculation overhead of graph division and enabling the graph division result to be more balanced.
In a first aspect, an embodiment of the present invention provides a bounded increment graph partitioning method, where the method includes:
the coordinator divides the initial graph structure into a plurality of first sub-graphs, correspondingly obtains a plurality of first sub-divisions, and distributes the plurality of first sub-graphs and the corresponding first sub-divisions to a plurality of services;
each service acquires a first sub-partition corresponding to a respective first sub-graph, performs iterative expansion on the first sub-partition, and judges whether the first sub-partition reaches a preset first equilibrium upper bound or not in the iterative expansion process, if the first sub-partition reaches the preset first equilibrium upper bound, stops expanding the first sub-partition, and if the first sub-partition does not reach the preset first equilibrium upper bound, continues expanding the first sub-partition;
when all the services complete respective corresponding expansion, feeding back information of the current iteration completion to the coordinator;
after receiving the information of the current iteration completion, the coordinator determines whether an unallocated edge exists or not;
if the unallocated edges exist, the coordinator informs the service of performing iterative expansion on the unallocated edges until all the edges are allocated;
if the unallocated edge does not exist, determining whether the update data exists;
if the updated data exists, the coordinator firstly merges the updated data and the initial graph structure to obtain a partial graph structure corresponding to the updated data, then divides the partial graph structure into a plurality of second subgraphs, obtains a second subdivision corresponding to the second subgraph, distributes the plurality of second subdivisions to the plurality of services, and carries out iterative expansion by the service receiving the second subdivision;
if there is no update data, the graph division processing is ended.
Further, the obtaining, by each service, a first sub-partition corresponding to the respective first sub-graph, and iteratively expanding the first sub-partition includes:
the service acquires a derivative vertex set in the first subdivision, acquires a core vertex set in the initial graph structure, selects a vertex with a priority greater than a preset level threshold value from a difference set of the derivative vertex set and the core vertex set as an expansion vertex set, and then expands all expansion vertices in the expansion vertex set.
Further, the expanding all the expansion vertexes in the expansion vertex set includes:
acquiring unallocated adjacent edges corresponding to all the expansion vertexes, and allocating the adjacent edges to the first subdivision;
updating the derived vertex set according to the newly allocated edges;
judging whether the updated derived vertex set has the condition that two endpoints corresponding to adjacent edges are both in the derived vertex set or not;
and if the two endpoints corresponding to the adjacent edges are both in the derived vertex set, distributing the corresponding adjacent edges to the first subdivision.
Further, the determining whether the first sub-partition reaches a preset first balance upper bound includes:
first, according to the formula
[Formula image BDA0002451846900000031]
calculating the first balance upper bound and the number of edges in the first sub-partition, then judging whether the calculated number of edges reaches the first balance upper bound, and if so, determining that the first sub-partition has reached the preset first balance upper bound;
in the formula, k is the total number of first sub-partitions, and |E| is the total number of edges of the initial graph structure.
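The formula itself is not reproduced in this text, so the following is only a sketch of the balance check under a loudly stated assumption: the bound is taken to be the ceiling of alpha * |E| / k, where alpha is a hypothetical slack factor (default 1.0) that does not appear in the source. Only k and |E| come from the description above.

```python
import math

def balance_upper_bound(num_edges, k, alpha=1.0):
    # ASSUMED form of the bound: ceil(alpha * |E| / k). The patent's actual formula
    # is an image that is not available here; alpha is a hypothetical parameter.
    return math.ceil(alpha * num_edges / k)

def reached_bound(partition_edges, num_edges, k, alpha=1.0):
    # A sub-partition stops expanding once its edge count reaches the bound.
    return len(partition_edges) >= balance_upper_bound(num_edges, k, alpha)

# With 10 edges and 3 sub-partitions, the assumed bound is 4 edges per sub-partition.
bound = balance_upper_bound(10, 3)
```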
Further, the expanding all the expansion vertexes in the expansion vertex set further includes:
and when all adjacent edges in the derived vertex set are distributed, randomly selecting a core vertex from the core vertex set, and distributing the core vertex to the derived vertex set.
Further, if there is update data, the coordinator first merges the update data with the initial graph structure to obtain a partial graph structure corresponding to the update data, then divides the partial graph structure into a plurality of second subgraphs, obtains a second subdivision corresponding to the second subgraph, distributes the plurality of second subdivisions to the plurality of services, and performs iterative expansion by the service receiving the second subdivision, including:
merging the updated data and the initial graph structure to obtain an updated graph structure and a partial graph structure corresponding to the updated data;
dividing the partial graph structure to obtain a plurality of second subgraphs and second subdivisions corresponding to the second subgraphs;
calculating the total number of edges of the updated graph structure, and calculating a second balance upper bound according to the total number of edges of the updated graph structure and the total number of all the first sub-partitions;
moving out partial edges in a second sub-partition that reaches the second upper equilibrium bound such that the second sub-partition satisfies the second upper equilibrium bound;
and acquiring and removing redundant vertexes in the second subdivision and derivative vertexes of which the number of corresponding adjacent edges is smaller than a second preset edge number value.
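The rebalancing steps above might be sketched as follows. This is a simplified illustration, not the patented procedure: it assumes the second balance upper bound has the form ceil(|E'| / k), assumes inserted edges arrive already tagged with a target sub-partition (the `inserted` mapping is hypothetical), and evicts arbitrary surplus edges instead of applying the patent's vertex-removal criteria.

```python
import math

def rebalance_after_update(partitions, inserted, deleted):
    """partitions: list of edge sets (modified in place).
    inserted: dict mapping a partition index to a set of new edges (hypothetical pre-assignment).
    deleted: set of edges removed by the update."""
    for part in partitions:
        part -= deleted                      # apply deletions
    for index, edges in inserted.items():
        partitions[index] |= edges           # apply insertions
    total_edges = sum(len(part) for part in partitions)
    # ASSUMED form of the second balance upper bound: ceil(|E'| / k).
    bound = math.ceil(total_edges / len(partitions))
    unassigned = set()
    for part in partitions:
        while len(part) > bound:             # move surplus edges out of overfull parts
            unassigned.add(part.pop())
    return bound, unassigned

parts = [{(1, 2), (2, 3), (3, 4)}, {(4, 5)}]
bound, unassigned = rebalance_after_update(parts, {0: {(1, 3), (1, 4)}}, {(4, 5)})
```

The edges returned in `unassigned` would then be handed back to the services for another round of iterative expansion.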
Further, if there is update data, the coordinator first merges the update data with the initial graph structure, and obtains a partial graph structure corresponding to the update data, then divides the partial graph structure into a plurality of second subgraphs, and obtains a second subdivision corresponding to the second subgraph, and distributes the plurality of second subdivisions to the plurality of services, and performs iterative expansion by the service receiving the second subdivision, and the method further includes:
and the coordinator distributes the second plurality of sub-partitions in a broadcast distribution mode.
In a second aspect, an embodiment of the present invention further provides a bounded incremental graph partitioning system, including a coordinator and a plurality of services;
the coordinator includes:
a first dividing unit, configured to divide an initial graph structure into a plurality of first sub-graphs, correspondingly obtain a plurality of first sub-partitions, and distribute the first sub-graphs and the corresponding first sub-partitions to a plurality of services;
the first confirming unit is used for confirming whether an unallocated edge exists or not after receiving the information that the current iteration is completed;
a notification unit, configured to notify the service to perform iterative expansion on the unallocated edges until all the edges are allocated;
a second confirming unit configured to confirm whether there is update data if there is no unallocated edge;
a second dividing unit, configured to, if there is update data, merge the update data with the initial graph structure, obtain a partial graph structure corresponding to the update data, divide the partial graph structure into multiple second subgraphs, obtain second subdivisions corresponding to the second subgraphs, distribute the multiple second subdivisions to the multiple services, and perform iterative expansion by the services receiving the second subdivisions;
and an ending unit configured to end the graph division processing if there is no update data.
Each service comprises:
the iterative expansion unit is used for acquiring first sub-partitions corresponding to respective sub-graphs, performing iterative expansion on the first sub-partitions, judging whether the first sub-partitions reach a preset first equilibrium upper bound or not in the iterative expansion process, stopping the expansion of the first sub-partitions if the first sub-partitions reach the preset first equilibrium upper bound, and continuing to expand the first sub-partitions if the first sub-partitions do not reach the preset first equilibrium upper bound;
and the feedback unit is used for feeding back information of finishing current iteration to the coordinator when all the services finish respective corresponding expansion.
Further, the bounded incremental graph partitioning system further comprises an IO controller;
the IO controller is used for receiving external update data of the initial graph structure and forwarding the update data to the coordinator.
Further, each service further includes:
the first distribution unit is used for acquiring the unallocated adjacent edges corresponding to all the expansion vertexes and distributing the adjacent edges to the first subdivision;
an updating unit, configured to update the derived vertex set according to the newly allocated edge;
the judging unit is used for judging whether the updated derived vertex set has the condition that two end points corresponding to the adjacent edges are both in the derived vertex set;
and the second distribution unit is used for distributing the corresponding adjacent edges to the first subdivision if the two end points corresponding to the adjacent edges are both in the derived vertex set.
The embodiment of the invention provides a method and a system for dividing a bounded incremental graph. The method comprises the steps that a coordinator divides an initial graph structure into a plurality of first sub-graphs, correspondingly obtains a plurality of first sub-divisions, and distributes the plurality of first sub-graphs and the corresponding first sub-divisions to a plurality of services; each service acquires a first sub-partition corresponding to a respective first sub-graph, performs iterative expansion on the first sub-partition, and judges whether the first sub-partition reaches a preset first equilibrium upper bound or not in the iterative expansion process, if the first sub-partition reaches the preset first equilibrium upper bound, stops expanding the first sub-partition, and if the first sub-partition does not reach the preset first equilibrium upper bound, continues expanding the first sub-partition; when all the services complete respective corresponding expansion, feeding back information of the current iteration completion to the coordinator; after receiving the information of the current iteration completion, the coordinator determines whether an unallocated edge exists or not; if the unallocated edges exist, the coordinator informs the service of performing iterative expansion on the unallocated edges until all the edges are allocated; if the unallocated edge does not exist, determining whether the update data exists; if the updated data exists, the coordinator firstly merges the updated data and the initial graph structure to obtain a partial graph structure corresponding to the updated data, then divides the partial graph structure into a plurality of second subgraphs, obtains a second subdivision corresponding to the second subgraph, distributes the plurality of second subdivisions to the plurality of services, and carries out iterative expansion by the service receiving the second subdivision; if there is no update data, the graph division processing is ended. The embodiment of the invention can reduce the calculation cost in the division of the distributed graph and make the division result more balanced.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a bounded delta graph partitioning method according to an embodiment of the present invention;
FIG. 2 is a sub-flow diagram illustrating a bounded delta graph partitioning method according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of a bounded delta graph partitioning method according to an embodiment of the present invention;
FIG. 4 is a schematic view of another sub-flow of a bounded delta graph partitioning method according to an embodiment of the present invention;
FIG. 5 is a basic architecture diagram of a bounded incremental graph partitioning method according to an embodiment of the present invention;
FIG. 6 is a basic flowchart of a bounded delta graph partitioning method according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating a method for partitioning a bounded delta graph according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a bounded incremental graph partitioning method according to an embodiment of the present invention, which specifically includes steps S101 to S108.
S101, the coordinator divides the initial graph structure G into a plurality of first sub-graphs G_i, correspondingly obtains a plurality of first sub-partitions P_i, and distributes the plurality of first sub-graphs G_i and the corresponding first sub-partitions P_i to a plurality of services;
S102, each service acquires the first sub-partition P_i corresponding to its first sub-graph G_i, performs iterative expansion on P_i, and judges during the iterative expansion whether P_i has reached a preset first balance upper bound; if P_i has reached the preset first balance upper bound, expansion of P_i stops; if P_i has not reached the preset first balance upper bound, expansion of P_i continues;
S103, when all the services complete their respective expansion, feeding back information that the current iteration is complete to the coordinator;
S104, after receiving the information that the current iteration is complete, the coordinator confirms whether an unallocated edge exists; if an unallocated edge exists, go to step S105; if no unallocated edge exists, go to step S106;
S105, the coordinator notifies the services to perform iterative expansion on the unallocated edges until all the edges are allocated;
S106, confirming whether update data ΔG exists; if the update data ΔG exists, go to step S107; if the update data ΔG does not exist, go to step S108;
S107, the coordinator first merges the update data ΔG with the initial graph structure G to obtain a partial graph structure ΔG' corresponding to the update data, then divides the partial graph structure ΔG' into a plurality of second sub-graphs G_i', obtains the second sub-partitions P_i' corresponding to the second sub-graphs, and distributes the plurality of second sub-partitions P_i' to the plurality of services, and the services receiving the second sub-partitions P_i' perform iterative expansion;
S108, ending the graph partitioning processing.
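The control flow of steps S101 to S105 can be sketched as a single-process simulation. Everything named here (the Service class, coordinate, the seeding rule, the greedy choice of edges) is a hypothetical illustration: the patent runs the services in parallel on separate nodes and selects expansion vertices by priority, and the update handling of steps S106 and S107 is omitted. The sketch assumes the bound satisfies bound * k >= |E|, which guarantees termination.

```python
class Service:
    """Hypothetical worker that holds one sub-partition as a set of edges."""
    def __init__(self, seed_vertex):
        self.vertices = {seed_vertex}
        self.edges = set()

    def expand(self, unallocated, bound):
        # S102: take unallocated edges touching local vertices until the bound is reached.
        for edge in sorted(unallocated):
            if len(self.edges) >= bound:
                break
            if edge[0] in self.vertices or edge[1] in self.vertices:
                unallocated.discard(edge)
                self.edges.add(edge)
                self.vertices.update(edge)

def coordinate(edges, services, bound):
    unallocated = set(edges)
    rounds = 0
    while unallocated:                      # S104: unallocated edges remain
        before = len(unallocated)
        for service in services:            # S105: ask every service to expand again
            service.expand(unallocated, bound)
        rounds += 1                         # S103: all services reported completion
        if unallocated and len(unallocated) == before:
            # No progress: give the least-loaded service a new starting vertex.
            idle = min(services, key=lambda s: len(s.edges))
            idle.vertices.add(min(unallocated)[0])
    return rounds

path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
workers = [Service(1), Service(5)]
rounds = coordinate(path_edges, workers, bound=2)
```

In this run each worker ends up with two of the four path edges, so neither exceeds the bound.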
In this embodiment, when distributed computation is performed on the initial graph structure G, on the one hand, a first balance upper bound is preset on each service, so that a service does not expand its first sub-partition P_i without limit during iterative expansion but stops once P_i reaches the first balance upper bound, which keeps the partitioning result balanced. On the other hand, when update data ΔG exists for the initial graph structure G, the updated graph structure G ⊕ ΔG does not need to be expanded in full; only the updated part ΔG' needs to be iteratively expanded, which achieves incremental partitioning.
In this embodiment, the sub-partitions are distributed by the coordinator, and the iterative expansion of the different sub-partitions is carried out by the multiple services. The coordinator does not store any data structure related to the graph structure; it stores only a small number of temporary variables. Besides distributing each sub-partition to a service, the coordinator monitors the progress of the whole graph partitioning through a global variable, decides the next operation according to the progress of each service, and forwards the communication among the services. The communication content is generally used to synchronize the data of each sub-partition and can in principle be any serialized data structure. Every service in this embodiment has its own CPU and memory space, stores a sub-graph G_i and the first sub-partition P_i on that sub-graph, and performs iterative expansion independently. In addition, upon receiving the update data ΔG distributed by the coordinator (i.e., the second sub-partition P_i'), each service still performs the iterative expansion independently, i.e., the second sub-partition is computed locally. It should be noted that the update data ΔG in this embodiment includes insertion data Δ⁺G and deletion data Δ⁻G of the graph structure.
Existing incremental graph partitioning systems, such as ParMETIS (an MPI-based parallel library with built-in algorithms for partitioning and repartitioning unstructured graphs and meshes) and Hermes (a large data real-time multidimensional analysis platform), can use the previous partitioning result to speed up computation, but they are not sensitive to the update size and still take a long time to compute even when the update is small. The reason is that these algorithms are not incrementally bounded, i.e., the cost of the incremental computation cannot be bounded by an expression in the update size. This calls for a bounded incremental algorithm, so that the graph partitioning system can quickly return a new partitioning result when the update is small. Furthermore, existing incremental graph partitioning systems are designed around the vertex partitioning model. Compared with vertex partitioning, edge partitioning can divide graph data more evenly and produce a better partitioning result, yet incremental graph partitioning algorithms for the edge partitioning model are still lacking.
This embodiment provides a distributed bounded incremental graph partitioning method under the edge partitioning model. Importantly, it is the first bounded incremental method in the field of distributed graph partitioning: it realizes bounded incremental computation on a graph structure and fills this technical gap. The advantage of a bounded incremental method over other methods is that its computational overhead is determined solely by the update data, so the overhead is at most constant time when the update data ΔG is empty. When the update data ΔG is very small, e.g., only 1% of the full graph, the overhead of computing the new graph partition is correspondingly small, so the method is suitable for frequent small updates.
In an embodiment, each service acquiring the first sub-partition P_i corresponding to its first sub-graph G_i and performing iterative expansion on P_i comprises:
the service acquires the derived vertex set S_i in the first sub-partition P_i, acquires the core vertex set C in the initial graph structure G, selects from the difference set S_i \ C of the derived vertex set S_i and the core vertex set C the vertices whose priority is greater than a preset level threshold as expansion vertices to form an expansion vertex set X_i, and then expands all the expansion vertices in X_i.
In this embodiment, the core vertex set C is the set of vertices whose adjacent edges have all been allocated to the first sub-partitions P_i. The derived vertex set S_i is the set in which every vertex has at least one adjacent edge allocated to the first sub-partition P_i to which S_i belongs, and both endpoints of every edge in P_i belong to the derived vertex set S_i of that sub-partition.
Note that the difference set S_i \ C of the derived vertex set S_i and the core vertex set C consists of the vertices that remain after the vertices of the core vertex set C are removed from the derived vertex set S_i; that is, a vertex selected from S_i \ C belongs to S_i but not to C. The difference set S_i \ C can therefore be understood as follows: every vertex in it has at least one adjacent edge allocated to the first sub-partition P_i and at least one adjacent edge that is not yet allocated. In other words, each selected expansion vertex has at least one adjacent edge that has not been allocated to the first sub-partition P_i corresponding to S_i, so these expansion vertices can be iteratively expanded in the next step, i.e., their unallocated adjacent edges are allocated to the corresponding first sub-partition P_i.
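The two vertex sets and their difference can be computed from an edge allocation as in the sketch below. The function name core_and_derived and the data layout (edges as tuples, one edge set per sub-partition, no self-loops) are assumptions made for illustration.

```python
from collections import defaultdict

def core_and_derived(all_edges, partitions):
    """all_edges: iterable of (u, v) edges of G; partitions: list of edge sets P_i."""
    allocated = set().union(*partitions)
    degree = defaultdict(int)          # number of adjacent edges per vertex
    allocated_degree = defaultdict(int)  # how many of those edges are already allocated
    for edge in all_edges:
        for vertex in edge:
            degree[vertex] += 1
            if edge in allocated:
                allocated_degree[vertex] += 1
    # Core set C: every adjacent edge of the vertex has been allocated.
    core = {v for v in degree if allocated_degree[v] == degree[v]}
    # Derived set S_i: endpoints of the edges held by sub-partition P_i.
    derived = [{v for edge in part for v in edge} for part in partitions]
    return core, derived

graph_edges = [(1, 2), (2, 3), (3, 4)]
core, derived = core_and_derived(graph_edges, [{(1, 2)}, {(3, 4)}])
frontier_0 = derived[0] - core   # S_0 \ C: candidates for expansion in sub-partition 0
```

Here vertex 2 is in the frontier of sub-partition 0 because edge (1, 2) is allocated while edge (2, 3) is not.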
In this embodiment, the priority of a vertex is determined by the number of its unallocated adjacent edges: the fewer unallocated adjacent edges a vertex has, the higher its priority, and vice versa. That is, vertices with fewer unallocated adjacent edges are preferentially selected as expansion vertices. The advantage is that, as far as possible, all adjacent edges of a vertex are allocated to the same subdivision. In a specific application scenario, the priorities of all the vertices are calculated, and the vertices whose priority ranks in the top 10% are taken as the expansion vertices.
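As a non-limiting illustration, the selection rule described above could be sketched as follows. The function name `select_expansion_vertices`, the dictionary `unallocated_count`, the tie-break by vertex identifier and the rounding of the 10% quota are assumptions made for this sketch and are not specified by the embodiment.

```python
import math

def select_expansion_vertices(derived, core, unallocated_count, top_fraction=0.10):
    """Pick expansion vertices from the difference set (derived minus core).

    derived, core: sets of vertex ids
    unallocated_count: dict vertex -> number of still-unallocated adjacent edges
    """
    # Vertices of the difference set have at least one unallocated adjacent edge
    candidates = [v for v in derived - core if unallocated_count.get(v, 0) > 0]
    # Fewer unallocated adjacent edges means higher priority (ties: smaller id first)
    candidates.sort(key=lambda v: (unallocated_count[v], v))
    if not candidates:
        return []
    # Assumed rounding rule: keep at least one vertex, round the quota up
    quota = max(1, math.ceil(len(candidates) * top_fraction))
    return candidates[:quota]
```

With four candidates and the default fraction of 10%, the sketch keeps only the single vertex that has the fewest unallocated adjacent edges.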
In addition, in distributed computing, a vertex of the graph may exist in several subgraphs held by different services. Therefore, after one service has selected its expansion vertex set X_i, that set must be synchronized with the other services. In other words, if any service selects a vertex as an expansion vertex and adds it to its expansion vertex set X_i, the vertex must also be added to the corresponding expansion vertex sets on all other services that hold the vertex.
In one embodiment, as shown in FIG. 2, expanding all the expansion vertices in the expansion vertex set X_i comprises steps S201 to S204.
S201, obtaining the unallocated adjacent edges of all the expansion vertices, and allocating these adjacent edges to the first subdivision P_i;
S202, updating the derived vertex set S_i according to the newly allocated edges;
S203, judging whether the updated derived vertex set S_i contains both endpoints of any adjacent edge;
S204, if both endpoints of an adjacent edge are in the derived vertex set S_i, allocating that adjacent edge to the first subdivision P_i.
In this embodiment, when an adjacent edge is allocated to the corresponding first subdivision, the other endpoint of that edge (i.e., the non-expansion vertex) also satisfies the definition of the derived vertex set S_i (i.e., it has at least one adjacent edge allocated to the first subdivision P_i to which S_i belongs). The derived vertex set S_i therefore needs to be updated after the allocation of adjacent edges is complete. All adjacent edges of the vertices newly added to the derived vertex set S_i are then checked, and every edge that can be allocated without adding a new vertex to S_i is allocated. That is, when the updated derived vertex set S_i contains both endpoints of an adjacent edge, that edge is allocated to the corresponding first subdivision P_i.
In addition, when an adjacent edge is claimed by several different first subdivisions P_i during expansion, the edge is randomly allocated to one of those first subdivisions P_i.
For example, as shown in FIG. 3, in the iterative expansion of the first subdivision P_0, a boundary vertex u is selected as the expansion vertex, and the unallocated adjacent edges (u, v) of u are allocated to the first subdivision P_0. Then all adjacent edges of each vertex newly added to the derived vertex set S_0 are checked, and every edge that can be allocated without introducing a new vertex into S_0 is allocated.
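By way of example only, steps S201 to S204 for a single subdivision could be sketched as below. The function name `expand`, the adjacency dictionary and the use of `frozenset` to represent undirected edges are assumptions of this sketch; synchronization between services and the random resolution of competing subdivisions are deliberately left out.

```python
def expand(expansion_vertices, adjacency, allocated, partition_edges, derived):
    """One expansion round for a single subdivision (steps S201 to S204).

    adjacency: dict vertex -> set of neighbouring vertices (undirected graph)
    allocated: set of frozenset edges already allocated to any subdivision (mutated)
    partition_edges: set of frozenset edges of this subdivision (mutated)
    derived: derived vertex set of this subdivision (mutated)
    """
    newly_derived = set()
    # S201: allocate every unallocated adjacent edge of each expansion vertex
    for u in expansion_vertices:
        for v in adjacency[u]:
            edge = frozenset((u, v))
            if edge not in allocated:
                allocated.add(edge)
                partition_edges.add(edge)
                # S202: the other endpoint now has an edge in this subdivision
                if v not in derived:
                    derived.add(v)
                    newly_derived.add(v)
    # S203/S204: allocate edges whose two endpoints are both derived vertices,
    # which never introduces a new vertex into the derived vertex set
    for v in newly_derived:
        for w in adjacency[v]:
            edge = frozenset((v, w))
            if w in derived and edge not in allocated:
                allocated.add(edge)
                partition_edges.add(edge)
    return partition_edges
```

On a graph with edges (u, v), (u, w), (v, w) and (v, x), expanding u allocates (u, v) and (u, w) in step S201 and then (v, w) in step S204, while (v, x) stays unallocated because x is not a derived vertex.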
In one embodiment, judging whether the first subdivision P_i reaches the preset first balance upper bound comprises:
firstly according to the formula
Figure BDA0002451846900000111
the first balance upper bound is calculated and the number of edges in the first subdivision P_i is counted; it is then judged whether the counted number of edges reaches the first balance upper bound, and if so, the first subdivision P_i is determined to have reached the preset first balance upper bound;
where k is the total number of first subdivisions P_i, and |E| is the total number of edges of the initial graph structure G.
In this embodiment, a first balance upper bound is preset for all first subdivisions P_i so that the final partition does not become unbalanced during iterative expansion. When the number of edges in a first subdivision P_i reaches the first balance upper bound, no further adjacent edges are allocated to it, i.e., that first subdivision P_i stops iterative expansion; any other first subdivision P_i that has not yet reached the first balance upper bound may continue iterative expansion.
It should be noted that ε in the formula (the symbol shown in Figure BDA0002451846900000112) is a preset small quantity greater than 0, and the formula can be understood as follows: the size of the largest first subdivision in the graph structure may not exceed (1+ε) times the size of a perfectly uniform subdivision (a 1/k share of the edges).
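Read together with the definitions of k and |E| given above, the balance condition described in this passage would presumably take the following form. This is a reconstruction from the prose, not a transcription of the formula image, and the symbol ε for the preset small quantity is an assumed name.

```latex
\max_{1 \le i \le k} |P_i| \;\le\; (1+\varepsilon)\,\frac{|E|}{k}, \qquad \varepsilon > 0
```

For instance, with the hypothetical values |E| = 1000, k = 4 and ε = 0.1, a perfectly uniform subdivision would hold 250 edges and the bound would be 275 edges.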
In one embodiment, expanding all the expansion vertices in the expansion vertex set X_i further comprises:
when all adjacent edges of the vertices in the derived vertex set S_i have been allocated, randomly selecting a core vertex from the core vertex set C and adding it to the derived vertex set S_i.
In this embodiment, when all adjacent edges of the vertices in the derived vertex set S_i have been allocated, the corresponding first subdivision P_i stops expanding. If at that moment the first subdivision P_i has not yet reached the preset first balance upper bound while other first subdivisions P_i have, the final partition may be unbalanced. Therefore, to ensure that the partition has better locality (balance), the derived vertex set S_i is actively expanded: vertices from the core vertex set C are added to the derived vertex set S_i so that the first subdivision P_i can continue to expand.
In one embodiment, as shown in FIG. 4, step S107 comprises steps S401 to S405.
S401, merging the update data ΔG with the initial graph structure G to obtain an updated graph structure G ⊕ ΔG and a partial graph structure ΔG' corresponding to the update data;
S402, dividing the partial graph structure ΔG' to obtain a plurality of second subgraphs G_i' and the second subdivisions P_i' corresponding to the second subgraphs;
S403, calculating the total number of edges of the updated graph structure G ⊕ ΔG, and calculating a second balance upper bound from that total number of edges and the total number of first subdivisions P_i;
S404, removing some of the edges from each second subdivision P_i' that reaches the second balance upper bound, so that the second subdivision P_i' satisfies the second balance upper bound;
S405, obtaining and removing the redundant vertices in the second subdivision P_i' and the derived vertices whose number of adjacent edges is less than a second preset edge-number value.
In this embodiment, when there is update data ΔG for the initial graph structure G, merging the update data ΔG with the initial graph structure G (i.e., G ⊕ ΔG) yields the updated graph structure G ⊕ ΔG and the partial graph structure ΔG' corresponding to the update data ΔG; dividing ΔG' yields a plurality of second subgraphs G_i' and the corresponding plurality of second subdivisions P_i'. Since the total number of edges may change in the updated graph structure, the total number of edges |E'| of the updated graph structure must be recalculated, and the second balance upper bound is recalculated from |E'|. Once the second balance upper bound has been obtained, the second subdivisions P_i' are processed accordingly so that each second subdivision P_i' satisfies the second balance upper bound.
A redundant vertex in this embodiment is a vertex that has no adjacent edge in the corresponding subdivision. Removing some edges from a second subdivision P_i' may leave redundant vertices in it; promptly removing these redundant vertices, together with the derived vertices that have few adjacent edges, gives the second subdivision P_i' a lower communication overhead.
In addition, after the above operations are completed, the core vertex set C and all derived vertex sets S_i' also need to be updated.
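As a non-limiting illustration, steps S403 to S405 could be sketched as follows. The bound used here, floor((1 + epsilon) * |E'| / k), is an assumption reconstructed from the description of the first balance upper bound; the choice of which edges to move out (the first ones in sorted order), the dictionary layout and the name `rebalance` are likewise hypothetical. Only the removal of redundant vertices is shown for step S405.

```python
import math

def rebalance(partitions, total_edges, epsilon=0.1):
    """Trim subdivisions to the recomputed bound and drop redundant vertices.

    partitions: dict id -> {"edges": set of (u, v) tuples, "vertices": set}
    total_edges: number of edges |E'| of the updated graph structure
    Returns (bound, removed_edges); removed edges become unallocated again.
    """
    k = len(partitions)
    # S403 (assumed form of the bound): (1 + epsilon) times the uniform share
    bound = math.floor((1 + epsilon) * total_edges / k)
    removed = []
    for part in partitions.values():
        excess = len(part["edges"]) - bound
        # S404: move out edges until the subdivision satisfies the bound
        for edge in sorted(part["edges"])[:max(excess, 0)]:
            part["edges"].discard(edge)
            removed.append(edge)
        # S405 (redundant vertices only): keep vertices that still have an edge here
        covered = {v for edge in part["edges"] for v in edge}
        part["vertices"] &= covered
    return bound, removed
```

The removed edges are returned rather than discarded so that a later expansion round can allocate them to a subdivision that still has capacity.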
In an embodiment, step S107 further comprises:
the coordinator distributes the plurality of second subdivisions P_i' by broadcast.
In this embodiment, during the partitioning of the initial graph structure G, some first subdivisions P_i may already have reached the preset first balance upper bound and need no further expansion. Therefore, when update data ΔG exists, the coordinator distributes the second subdivisions P_i' by broadcast, and each service decides whether to accept the second subdivision P_i' distributed by the coordinator; in this way the final partition can be made more uniform.
In a specific embodiment, as shown in FIG. 5, the embodiment of the present invention comprises an IO controller, a coordinator and a plurality of services (workers). When the graph structure is updated externally, the IO controller receives the external update data ΔG for the graph structure (comprising insertion data Δ+G and deletion data Δ−G) and sends it to the coordinator. The coordinator divides the received update data ΔG to obtain a plurality of subdivisions P_i' and distributes them to the services, and each service that receives a subdivision P_i' performs iterative expansion independently.
In one embodiment, as shown in FIG. 6, the embodiment of the present invention mainly consists of two stages: a partial allocation stage and a rebalancing stage. Partial allocation (Partial Allocation) expands the existing graph partition until all edges have been allocated. If the IO controller then detects update data ΔG for the graph structure, the rebalancing (ReBalance) stage processes the update data ΔG so that the resulting partial partition of the current graph can again be processed by the partial allocation stage.
Specifically, in the Partial Allocation stage, the input comprises a graph structure G, a subgraph G', and the subdivisions P_i on the subgraph G'
Figure BDA0002451846900000131
The output comprises: the larger subgraph G'' and the expanded subdivisions P_i
Figure BDA0002451846900000132
At this stage, the partition on the subgraph G' is expanded to a partition on a larger subgraph G''; for the specific steps, refer to steps S201 to S204.
In the rebalancing stage, the input comprises the graph G, the subdivisions P_i and the update data ΔG; the output comprises the updated graph G ⊕ ΔG and the subdivisions P_i, where
Figure BDA0002451846900000133
At this stage, the graph update data Δ G is incorporated into the existing graph structure G and a new partial graph partition P (G') is generated such that this graph partition meets the updated balance constraint, i.e. the number of edges in the largest subdivision does not exceed the updated balance constraint
Figure BDA0002451846900000134
For the specific steps of rebalancing, refer to steps S401 to S405.
In another embodiment, as shown in FIG. 7, the processes with the prefix PMIC_ are executed on the coordinator, and the remaining processes are executed in parallel on the services. The input update data ΔG is handled by the IO controller. Partial Allocation corresponds to the main loop from Filter to PMIC_ExpandS, in which the services allocate a portion of the edges in each iteration until all edges have been allocated. When update data ΔG is received, it is preprocessed by the PMIC_Update and ReBalance processes, and control then returns to the main loop for iterative expansion. It should be noted that the services run in parallel under the BSP (bulk synchronous parallel) model: each stage is executed independently on a service or on the coordinator without communicating with other nodes (other services or the coordinator), and once the local computation of all services has finished, global communication is performed to exchange information.
Specifically, in the Filter stage, the input comprises: the difference set S_i \ C of the derived vertex set S_i and the core vertex set C, and the priority f(v) of each vertex v in S_i \ C; the output comprises: the expansion vertex set X_i and the communication messages used to synchronize it with the corresponding vertices on other services. At this stage, the vertices v of the local difference set S_i \ C are sorted by f(v) from high to low, and the vertices with higher priority are selected as the expansion vertex set X_i (e.g., the top 10% by priority). Finally, each service communicates with the other services through the coordinator so that the vertices v in X_i are synchronized into the expansion vertex sets of the other services.
In the PMIC_Filter stage, the coordinator relays communication messages and estimates the total number of edges that the subsequent ExpandC stage would allocate to P_i by expanding X_i; if that total would exceed the preset balance upper bound, some vertices are removed from X_i.
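By way of example only, the trimming performed in the PMIC_Filter stage could be sketched as below. The embodiment does not state which vertices are dropped; this sketch assumes that vertices are considered in priority order and that any vertex whose edges would push the projected load past the bound is skipped. The name `trim_expansion_set` and its parameters are hypothetical.

```python
def trim_expansion_set(expansion_vertices, unallocated_count, current_load, bound):
    """Drop expansion vertices whose edges would exceed the balance upper bound.

    expansion_vertices: list of vertices, highest priority first
    unallocated_count: dict vertex -> number of unallocated adjacent edges
    current_load: number of edges already allocated to the subdivision
    bound: preset balance upper bound (maximum number of edges)
    """
    kept, projected = [], current_load
    for v in expansion_vertices:
        extra = unallocated_count[v]
        # The projection is an upper estimate: an edge shared by two expansion
        # vertices is counted twice here but would be allocated only once
        if projected + extra > bound:
            continue  # assumption: skip this vertex, later ones may still fit
        kept.append(v)
        projected += extra
    return kept, projected
```

Because the projection never underestimates the load, a subdivision trimmed this way cannot exceed the bound in the following ExpandC stage, at the cost of occasionally postponing a vertex that would in fact have fitted.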
In the ExpandC stage, the input comprises: the subgraphs G_i, the partial subdivisions P_i and the expansion vertex sets X_i, i = 1, 2, ..., k; the output comprises: the subdivisions P_i expanded by ExpandC, the updated core vertex set C, derived vertex sets S_i and expansion vertex sets X_i, i = 1, 2, ..., k, and the communication messages used to synchronize the corresponding sets on other services.
Specifically, in the ExpandC stage, for each subdivision P_i, all local vertices of the expansion vertex set X_i are expanded, i.e., all their locally visible adjacent edges are allocated to the subdivision P_i. For example, if the vertex v is in X_0, all its adjacent edges are allocated to P_0. If one of the allocated edges is (v, u), the vertex u is added to the corresponding derived vertex set S_0 (updating S_0). When one edge is claimed by several subdivisions at the same time, one of them is chosen at random to receive it.
In the PMIC_ExpandC stage, the coordinator relays communication messages and updates the global allocated load. Here, the allocated load is the number of edges that have already been allocated to each subdivision P_i.
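As a non-limiting illustration, the random resolution of an edge claimed by several subdivisions, together with the update of the allocated load, could be sketched as follows. Performing the resolution on the coordinator, the name `resolve_claims` and the `claims` dictionary are assumptions of this sketch; the embodiment only states that one subdivision is chosen at random and that the coordinator tracks the allocated load.

```python
import random

def resolve_claims(claims, load, rng=random):
    """Assign each claimed edge to exactly one subdivision and update the load.

    claims: dict edge -> list of subdivision ids that tried to take the edge
    load: dict subdivision id -> number of edges allocated so far (mutated)
    rng: object providing choice(), e.g. the random module or random.Random(seed)
    """
    owner = {}
    for edge, claimants in claims.items():
        # A single claimant wins directly; otherwise pick one at random
        chosen = claimants[0] if len(claimants) == 1 else rng.choice(claimants)
        owner[edge] = chosen
        load[chosen] = load.get(chosen, 0) + 1
    return owner
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which can help when comparing partitioning runs.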
In the ExpandS stage, the input comprises: the subgraphs G_i, the subdivisions P_i and the newly added part ΔS_i of S_i, i = 1, 2, ..., k; the output comprises: the subdivisions P_i expanded by ExpandS, the updated local core vertex set C, derived vertex sets S_i and expansion vertex sets X_i, i = 1, 2, ..., k, and the synchronization messages.
Specifically, in the ExpandS stage, for each subdivision P_i, all still-unallocated adjacent edges of the vertices in the updated derived vertex set S_i are scanned; if the other endpoint of such an edge is also in the derived vertex set S_i, the edge is allocated to the subdivision P_i. At the same time, the vertex information of the updated derived vertex set is synchronized with the other services.
In the PMIC_ExpandS stage, the coordinator relays communication messages and judges whether any unallocated edge exists. If an unallocated edge exists, the next iteration begins; if not, the coordinator judges whether update data ΔG exists: if update data ΔG exists, the PMIC_Update stage begins, and if not, the process ends.
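By way of example only, the decision taken by the coordinator at the end of each iteration could be sketched as below. The enumeration `NextStep` and the function `coordinator_decide` are hypothetical names introduced for this sketch.

```python
from enum import Enum

class NextStep(Enum):
    ITERATE = "iterate"      # run another Filter / ExpandC / ExpandS round
    UPDATE = "pmic_update"   # enter the PMIC_Update and ReBalance stages
    FINISH = "finish"        # graph partitioning is complete

def coordinator_decide(unallocated_edge_count, pending_updates):
    """Choose the next stage from the number of unallocated edges and pending updates."""
    if unallocated_edge_count > 0:
        return NextStep.ITERATE
    if pending_updates:
        return NextStep.UPDATE
    return NextStep.FINISH
```

Unallocated edges take precedence over pending update data, so update data is only processed once the current partition is complete.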
In the PMIC_Update stage, the coordinator divides the update data ΔG into a plurality of subdivisions P_i', distributes the P_i' to the individual services, and computes the updated balance upper bound.
The ReBalance stage is the same as the rebalancing stage described above and is not described again here.
The bounded incremental graph partitioning method provided by this embodiment can partition graphs efficiently and quickly. In a specific application scenario, when the update data amounts to 10% of the full graph, the method achieves a speedup of 7.9 times over repartitioning with a static method; when the update data amounts to 50% of the full graph, the speedup is still 3.9 times.
The bounded incremental graph partitioning method provided by this embodiment can also achieve partitioning quality equal to or even better than that of static graph partitioning. In a specific application scenario, the communication overhead of the partition produced by the method is about 10% lower than that of other static edge-partitioning methods.
The bounded incremental graph partitioning method provided by this embodiment has very strong parallel scalability. In a specific application scenario, with 128 services, partitioning a graph structure of 5.8 billion edges into 128 parts takes only 51 seconds.
The bounded incremental graph partitioning method provided by this embodiment takes less time than other existing incremental graph partitioning methods. In a specific application scenario, the response time of the method is at least 6.4 times shorter than that of ParMETIS and at least 2.2 times shorter than that of Hermes.
The embodiment of the invention also provides a bounded incremental graph partitioning system, which comprises a coordinator and a plurality of services;
the coordinator includes:
a first dividing unit, configured to divide an initial graph structure into a plurality of subgraphs, obtain a corresponding plurality of first subdivisions, and distribute the subgraphs and the corresponding first subdivisions to a plurality of services;
a first confirming unit, configured to confirm, after receiving the information that the current iteration is complete, whether an unallocated edge exists;
a notification unit, configured to notify the service to perform iterative expansion on the unallocated edges until all the edges are allocated;
a second confirming unit configured to confirm whether there is update data if there is no unallocated edge;
a second dividing unit, configured to, if there is update data, merge the update data with the initial graph structure, obtain a partial graph structure corresponding to the update data, divide the partial graph structure into multiple second subgraphs, obtain second subdivisions corresponding to the second subgraphs, distribute the multiple second subdivisions to the multiple services, and perform iterative expansion by the services receiving the second subdivisions;
and an ending unit configured to end the graph division processing if there is no update data.
Each service comprises:
an iterative expansion unit, configured to obtain the first subdivision corresponding to the respective subgraph, perform iterative expansion on the first subdivision, and judge during the iterative expansion whether the first subdivision reaches a preset first balance upper bound; if the first subdivision reaches the preset first balance upper bound, expansion of the first subdivision is stopped, and if not, expansion of the first subdivision continues;
and a feedback unit, configured to feed back the information that the current iteration is complete to the coordinator when all the services have completed their respective expansions.
In an embodiment, the bounded delta graph partitioning system further comprises an IO controller;
the IO controller is used for receiving external update data of the initial graph structure and forwarding the update data to the coordinator.
In an embodiment, each of the services further comprises:
a first distribution unit, configured to obtain the unallocated adjacent edges of all the expansion vertices and allocate these adjacent edges to the first subdivision;
an updating unit, configured to update the derived vertex set according to the newly allocated edges;
a judging unit, configured to judge whether the updated derived vertex set contains both endpoints of any adjacent edge;
a second distribution unit, configured to allocate an adjacent edge to the first subdivision if both endpoints of that adjacent edge are in the derived vertex set S_i.
Since the embodiment of the system part corresponds to the embodiment of the method part, the embodiment of the system part is described with reference to the embodiment of the method part, and is not repeated here.
Embodiments of the present invention also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the steps provided by the above embodiments can be implemented. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present invention further provides a computer device, which may include a memory and a processor, where the memory stores a computer program, and the processor may implement the steps provided in the above embodiments when calling the computer program in the memory. Of course, the computer device may also include various network interfaces, power supplies, and the like.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A bounded delta graph partitioning method, comprising:
the coordinator divides the initial graph structure into a plurality of first sub-graphs, correspondingly obtains a plurality of first sub-divisions, and distributes the plurality of sub-graphs and the corresponding first sub-divisions to a plurality of services;
each service acquires a first sub-partition corresponding to a respective first sub-graph, performs iterative expansion on the first sub-partition, and judges whether the first sub-partition reaches a preset first equilibrium upper bound or not in the iterative expansion process, if the first sub-partition reaches the preset first equilibrium upper bound, stops expanding the first sub-partition, and if the first sub-partition does not reach the preset first equilibrium upper bound, continues expanding the first sub-partition;
when all the services complete respective corresponding expansion, feeding back information of the current iteration completion to the coordinator;
after receiving the information of the current iteration completion, the coordinator determines whether an unallocated edge exists or not;
if the unallocated edges exist, the coordinator informs the service of performing iterative expansion on the unallocated edges until all the edges are allocated;
if the unallocated edge does not exist, determining whether the update data exists;
if the updated data exists, the coordinator firstly merges the updated data and the initial graph structure to obtain a partial graph structure corresponding to the updated data, then divides the partial graph structure into a plurality of second subgraphs, obtains a second subdivision corresponding to the second subgraph, distributes the plurality of second subdivisions to the plurality of services, and carries out iterative expansion by the service receiving the second subdivision;
if there is no update data, the graph division processing is ended.
2. The method according to claim 1, wherein the obtaining a first sub-partition corresponding to a first sub-graph of each service and iteratively expanding the first sub-partition comprises:
the service acquires a derivative vertex set in the first subdivision, acquires a core vertex set in the initial graph structure, selects a vertex with a priority greater than a preset level threshold value from a difference set of the derivative vertex set and the core vertex set as an expansion vertex set, and then expands all expansion vertices in the expansion vertex set.
3. The method of bounded delta graph partitioning according to claim 2, wherein said expanding all of said set of expanded vertices comprises:
acquiring unallocated adjacent edges corresponding to all the expansion vertexes, and allocating the adjacent edges to the first subdivision;
updating the derived vertex set according to the newly allocated edges;
judging whether the updated derived vertex set has the condition that two endpoints corresponding to adjacent edges are both in the derived vertex set or not;
and if the two endpoints corresponding to the adjacent edges are both in the derived vertex set, distributing the corresponding adjacent edges to the first subdivision.
4. The method according to claim 1, wherein said determining whether the first sub-division reaches a preset first equilibrium upper bound comprises:
firstly according to the formula
Figure FDA0002451846890000021
Calculating a first balance upper bound and the number of edges in the first sub-division, then judging whether the calculated number of edges reaches the first balance upper bound, and if so, judging that the first sub-division reaches a preset first balance upper bound;
in the formula, k is the total number of all the first subdivisions, and | E | is the total number of edges of the initial graph structure.
5. The method of bounded delta graph partitioning according to claim 3, wherein said expanding all of said set of expanded vertices further comprises:
and when all adjacent edges in the derived vertex set are distributed, randomly selecting a core vertex from the core vertex set, and distributing the core vertex to the derived vertex set.
6. The method according to claim 4, wherein if there is update data, the coordinator merges the update data with the initial graph structure to obtain a partial graph structure corresponding to the update data, then divides the partial graph structure into a plurality of second subgraphs to obtain second sub-partitions corresponding to the second subgraphs, and distributes the second sub-partitions to the services, and performs iterative expansion by the service that receives the second sub-partitions, and the method includes:
merging the updated data and the initial graph structure to obtain an updated graph structure and a partial graph structure corresponding to the updated data;
dividing the partial graph structure to obtain a plurality of second subgraphs and second subdivisions corresponding to the second subgraphs;
calculating the total number of edges of the updated graph structure, and calculating a second balance upper bound according to the total number of edges of the updated graph structure and the total number of all the first sub-partitions;
moving out partial edges in a second sub-partition that reaches the second upper equilibrium bound such that the second sub-partition satisfies the second upper equilibrium bound;
and acquiring and removing redundant vertexes in the second subdivision and derivative vertexes of which the number of corresponding adjacent edges is smaller than a second preset edge number value.
7. The method according to claim 6, wherein if there is update data, the coordinator merges the update data with the initial graph structure to obtain a partial graph structure corresponding to the update data, then divides the partial graph structure into a plurality of second subgraphs to obtain second sub-partitions corresponding to the second subgraphs, distributes the second sub-partitions to the services, and performs iterative expansion by the services receiving the second sub-partitions; if there is no update data, ending the graph partitioning process, further comprising:
and the coordinator distributes the plurality of second sub-partitions by broadcast.
8. A bounded delta graph partitioning system comprising a coordinator and a plurality of services:
the coordinator includes:
a first dividing unit, configured to divide an initial graph structure into a plurality of subgraphs, obtain a corresponding plurality of first sub-divisions, and distribute the subgraphs and the corresponding first sub-divisions to a plurality of services;
a first confirming unit, configured to confirm, after receiving the information that the current iteration is complete, whether an unallocated edge exists;
a notification unit, configured to notify the service to perform iterative expansion on the unallocated edges until all the edges are allocated;
a second confirming unit configured to confirm whether there is update data if there is no unallocated edge;
a second dividing unit, configured to, if there is update data, merge the update data with the initial graph structure, obtain a partial graph structure corresponding to the update data, divide the partial graph structure into multiple second subgraphs, obtain second subdivisions corresponding to the second subgraphs, distribute the multiple second subdivisions to the multiple services, and perform iterative expansion by the services receiving the second subdivisions;
and an ending unit configured to end the graph division processing if there is no update data.
each service comprises:
an iterative expansion unit, configured to acquire the first sub-partition corresponding to its subgraph, perform iterative expansion on the first sub-partition, and determine during the iterative expansion whether the first sub-partition has reached a preset first equilibrium upper bound, stopping the expansion of the first sub-partition if the bound has been reached and continuing the expansion otherwise;
and a feedback unit, configured to feed back to the coordinator information that the current iteration is complete when all the services have finished their respective expansions.
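The interplay between coordinator and services in claim 8 can be sketched as a single-process Python simulation. This is a simplified illustration only: the function name, the `alpha` slack factor, the formula used for the equilibrium upper bound, and the rule of taking one edge per service per round are all assumptions, not details taken from the claims.

```python
def partition_edges(edges, k, alpha=1.0):
    """Simulate coordinator rounds over k services.

    Assumption: the equilibrium upper bound is alpha * |E| / k edges
    per sub-partition.
    """
    unallocated = {frozenset(e) for e in edges}
    bound = alpha * len(unallocated) / k
    parts = [set() for _ in range(k)]      # edges held by each service
    covered = [set() for _ in range(k)]    # vertices covered by each service

    while unallocated:                     # coordinator: any unallocated edge left?
        progressed = False
        for i in range(k):                 # each service performs one expansion
            if not unallocated or len(parts[i]) >= bound:
                continue                   # bound reached: stop expanding
            candidates = sorted(unallocated, key=sorted)
            # Prefer an edge adjacent to the current sub-partition,
            # otherwise fall back to the first remaining edge as a seed.
            edge = next((e for e in candidates if e & covered[i]), candidates[0])
            unallocated.remove(edge)
            parts[i].add(edge)
            covered[i] |= edge
            progressed = True
        if not progressed:                 # every service is at its bound
            break
    return parts, unallocated


path = [(1, 2), (2, 3), (3, 4), (4, 5)]
parts, leftover = partition_edges(path, k=2)
```

Each pass of the `while` loop plays the role of one iteration: the services expand, the coordinator checks for unallocated edges, and the loop repeats until none remain or every sub-partition has reached its bound.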
9. The bounded delta graph partitioning system according to claim 8, further comprising an IO controller, wherein
the IO controller is configured to receive external update data for the initial graph structure and forward the update data to the coordinator.
10. The bounded delta graph partitioning system according to claim 8, wherein each service further comprises:
a first distribution unit, configured to acquire the unallocated adjacent edges of all expansion vertices and allocate those adjacent edges to the first sub-partition;
an updating unit, configured to update the derived vertex set according to the newly allocated edges;
a judging unit, configured to determine whether, for the updated derived vertex set, there is any adjacent edge whose two endpoints are both in the derived vertex set;
a second distribution unit, configured to, if both endpoints of an adjacent edge are in the derived vertex set S_i, allocate that adjacent edge to the first sub-partition.
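The four units of claim 10 amount to one expansion step on a single sub-partition. The Python sketch below is a minimal illustration under assumed data structures (an adjacency dictionary and `frozenset` edges); the function and parameter names are hypothetical and do not come from the patent.

```python
def expand_once(adjacency, unallocated, partition, derived, expansion_vertices):
    """Perform one expansion step for one sub-partition.

    adjacency: dict mapping each vertex to the set of its neighbours
    unallocated: set of frozenset edges not yet allocated (mutated)
    partition: set of edges in this sub-partition (mutated)
    derived: derived vertex set S_i of this sub-partition (mutated)
    expansion_vertices: vertices chosen for expansion in this step
    """
    # First distribution unit: take every unallocated edge adjacent
    # to an expansion vertex.
    new_edges = set()
    for v in expansion_vertices:
        for u in adjacency.get(v, ()):
            edge = frozenset((u, v))
            if edge in unallocated:
                new_edges.add(edge)
    unallocated -= new_edges
    partition |= new_edges

    # Updating unit: add the endpoints of the newly allocated edges to S_i.
    for edge in new_edges:
        derived |= edge

    # Judging and second distribution units: any remaining edge whose
    # endpoints both lie in S_i also goes to this sub-partition.
    closed = {edge for edge in unallocated if edge <= derived}
    unallocated -= closed
    partition |= closed
    return new_edges | closed


# Triangle 1-2-3 with a pendant edge 3-4, expanded from vertex 1.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
unallocated = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
partition, derived = set(), set()
added = expand_once(adjacency, unallocated, partition, derived, {1})
```

In this example the edge between 2 and 3 is not adjacent to the expansion vertex, but it is picked up by the second allocation because both of its endpoints already belong to the derived vertex set.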
CN202010294991.4A 2020-04-15 2020-04-15 Method and system for dividing bounded incremental graph Active CN111538867B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010294991.4A CN111538867B (en) 2020-04-15 2020-04-15 Method and system for dividing bounded incremental graph
PCT/CN2020/087707 WO2021208147A1 (en) 2020-04-15 2020-04-29 Bounded increment graph partitioning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010294991.4A CN111538867B (en) 2020-04-15 2020-04-15 Method and system for dividing bounded incremental graph

Publications (2)

Publication Number Publication Date
CN111538867A (en) 2020-08-14
CN111538867B (en) 2021-06-15

Family

ID=71952266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010294991.4A Active CN111538867B (en) 2020-04-15 2020-04-15 Method and system for dividing bounded incremental graph

Country Status (2)

Country Link
CN (1) CN111538867B (en)
WO (1) WO2021208147A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634290A (en) * 2020-12-30 2021-04-09 Guangzhou Nanyang Polytechnic College Graph segmentation method based on clustering interaction
CN113190720A (en) * 2021-05-17 2021-07-30 Shenzhen Institute of Computing Sciences Graph compression-based graph database construction method and device and related components
WO2022262007A1 (en) * 2021-06-18 2022-12-22 Shenzhen Institute of Computing Sciences Graph algorithm autoincrement method and apparatus, device, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116166846B (en) * 2023-04-13 2023-08-01 Guangdong Guangyu Technology Development Co., Ltd. Distributed multidimensional data processing method based on cloud computing

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW473587B (en) * 2000-02-28 2002-01-21 James Hardie Res Pty Ltd Surface groove system for building sheets
US20040215904A1 (en) * 2003-04-22 2004-10-28 International Business Machines Corporation System and method for assigning data collection agents to storage area network nodes in a storage area network resource management system
CN101315699A (en) * 2008-07-10 2008-12-03 Harbin Institute of Technology Incremental variation level set fast medical image partition method
CN102165515A (en) * 2008-09-30 2011-08-24 Sony Computer Entertainment Inc. Image processing device and image processing method
CN102521332A (en) * 2011-12-06 2012-06-27 Beihang University Graph pattern matching method, device and system based on strong simulation
EP2581862A1 (en) * 2011-10-14 2013-04-17 Palo Alto Research Center Incorporated System and method for parallel edge partitioning in and/or graph search
US20130218789A1 (en) * 2012-02-21 2013-08-22 University Of South Carolina Systematic Approach to Enforcing Contiguity Constraint in Trajectory-based Methods for Combinatorial Optimization
CN103699606A (en) * 2013-12-16 2014-04-02 Huazhong University of Science and Technology Large-scale graph partitioning method based on vertex cut and community detection
US9477532B1 (en) * 2015-10-06 2016-10-25 Oracle International Corporation Graph-data partitioning for workload-balanced distributed computation with cost estimation functions
US20170032064A1 (en) * 2014-01-03 2017-02-02 Schlumberger Technology Corporation Graph Partitioning To Distribute Wells In Parallel Reservoir Simulation
CN107193896A (en) * 2017-05-09 2017-09-22 Huazhong University of Science and Technology Cluster-based graph data partitioning method
CN108804226A (en) * 2018-05-28 2018-11-13 National University of Defense Technology Graph segmentation and division method for distributed graph computation
CN109165325A (en) * 2018-08-27 2019-01-08 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, equipment and computer readable storage medium for cutting graph data
CN109918199A (en) * 2019-02-28 2019-06-21 Suzhou Institute for Advanced Study, University of Science and Technology of China GPU-based distributed graph processing system
CN110110157A (en) * 2019-04-26 2019-08-09 Northeastern University Hypergraph iteration method based on two-hop graphs and its application
CN110232689A (en) * 2018-03-06 2019-09-13 Adobe Inc. Semantic class localization digital environment
CN110569244A (en) * 2019-08-30 2019-12-13 Shenzhen Institute of Computing Sciences Hamming space approximate query method and storage medium
CN110688610A (en) * 2019-09-27 2020-01-14 Alipay (Hangzhou) Information Technology Co., Ltd. Weight calculation method and device for graph data and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10198834B2 (en) * 2013-04-29 2019-02-05 Microsoft Technology Licensing, Llc Graph partitioning for massive scale graphs
CN108319698B (en) * 2018-02-02 2021-01-15 华中科技大学 Game-based flow graph dividing method and system

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
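DONG DAI et al.: "IOGP: An Incremental Online Graph Partitioning Algorithm for Distributed Graph Databases", HPDC'17: Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing *
FAN ZHANG et al.: "Dynamic feedback synchronization of Lur'e networks via incremental sector boundedness", IEEE Transactions on Automatic Control *
UPA GUPTA et al.: "Distributed Incremental Graph Analysis", 2016 IEEE International Congress on Big Data (BigData Congress) *
WENFEI FAN et al.: "Incremental Graph Computations: Doable and Undoable", SIGMOD'17: Proceedings of the 2017 ACM International Conference on Management of Data *
WENFEI FAN et al.: "Incrementalization of graph partitioning algorithms", Proceedings of the VLDB Endowment *
JI ANMING: "Research and Implementation of Incremental Graph Partitioning Technology in a Distributed Environment", China Master's Theses Full-text Database, Information Science and Technology *
ZHANG XIAOYUAN et al.: "Domain-Based Dynamic Partitioning Algorithm for Large-Scale Graph Data", Computer Systems & Applications *
WANG ZHIGANG: "Research and Implementation of Incremental Iterative Processing Technology for Large-Scale Graphs", China Master's Theses Full-text Database, Information Science and Technology *
GUO PENGFEI: "Improvement of FENNEL, a Streaming Partitioning Algorithm for Distributed Large-Scale Graph Data", China Master's Theses Full-text Database, Basic Sciences *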

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634290A (en) * 2020-12-30 2021-04-09 Guangzhou Nanyang Polytechnic College Graph segmentation method based on clustering interaction
CN112634290B (en) * 2020-12-30 2023-09-19 Guangzhou Nanyang Polytechnic College Graph segmentation method based on clustering interaction
CN113190720A (en) * 2021-05-17 2021-07-30 Shenzhen Institute of Computing Sciences Graph compression-based graph database construction method and device and related components
WO2022262007A1 (en) * 2021-06-18 2022-12-22 Shenzhen Institute of Computing Sciences Graph algorithm autoincrement method and apparatus, device, and storage medium

Also Published As

Publication number Publication date
CN111538867B (en) 2021-06-15
WO2021208147A1 (en) 2021-10-21

Similar Documents

Publication Publication Date Title
CN111538867B (en) Method and system for dividing bounded incremental graph
Meyerhenke et al. Parallel graph partitioning for complex networks
Pearce et al. Faster parallel traversal of scale free graphs at extreme scale with vertex delegates
CN102663801B (en) Method for improving three-dimensional model rendering performance
US20130151535A1 (en) Distributed indexing of data
WO2015196911A1 (en) Data mining method and node
CN110147407B (en) Data processing method and device and database management server
JP6429262B2 (en) Load balancing for large in-memory databases
JP5427640B2 (en) Decision tree generation apparatus, decision tree generation method, and program
CN108804383B (en) Support point parallel enumeration method and device based on measurement space
US9485309B2 (en) Optimal fair distribution among buckets of different capacities
CN108089918B (en) Graph computation load balancing method for heterogeneous server structure
CN115391023A (en) Computing resource optimization method and device for multitask container cluster
Sakouhi et al. Dynamicdfep: A distributed edge partitioning approach for large dynamic graphs
CN112699134A (en) Distributed graph database storage and query method based on graph subdivision
Touheed et al. A comparison of some dynamic load-balancing algorithms for a parallel adaptive flow solver
CN116910061A (en) Database splitting and table splitting method, device and equipment and readable storage medium
CN115174582B (en) Data scheduling method and related device
Bae et al. Label propagation-based parallel graph partitioning for large-scale graph data
CN113111351A (en) Test method, test device and computer-readable storage medium
CN116303763A (en) Distributed graph database incremental graph partitioning method and system based on vertex degree
CN111737531B (en) Application-driven graph division adjusting method and system
CN112948087A (en) Task scheduling method and system based on topological sorting
CN112988367A (en) Resource allocation method and device, computer equipment and readable storage medium
Tzovas et al. Distributing sparse matrix/graph applications in heterogeneous clusters-an experimental study

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant