WO2021000435A1 - Large-scale dynamic graph division method based on sliding window - Google Patents

Large-scale dynamic graph division method based on sliding window

Info

Publication number
WO2021000435A1
Authority
WO
WIPO (PCT)
Prior art keywords
vertex
vertices
edge
take
partition
Prior art date
Application number
PCT/CN2019/108136
Other languages
French (fr)
Chinese (zh)
Inventor
崔焕庆
荣炫宇
贾瑞生
魏永山
张峰
徐强
Original Assignee
山东科技大学
Priority date
Filing date
Publication date
Application filed by 山东科技大学
Publication of WO2021000435A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Definitions

  • The number of cut edges after division is 3.
  • The number of cut edges is reduced by 2.
  • According to step 1.12, delete vertices v_8 and v_9 from W_vertex.
  • The division of v_8 and v_9 in state A shown in Figure 5 is now complete; go to step 1.2.
  • N is 2; add the vertices v_12 and v_13 in S_vertex to W_vertex to reach state B in Figure 5.
  • The second case: adding edges.
  • Steps 2.1 to 2.3 and 2.11 are similar to the vertex case: edges are moved from S_edge into W_edge; when the number of edges in W_edge reaches L_edge, or W_edge is not empty, a vertex is selected for transfer and its related edges are divided into the corresponding partitions; when W_edge is empty, the division result is output and the process ends.
  • Steps 2.4 to 2.10 operate on the window W_edge, with L_edge specified as 3.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A large-scale dynamic graph partitioning method based on a sliding window, belonging to the technical field of computers. When vertices are added, the method preferentially selects higher-degree vertices from the sliding window for partitioning, so that lower-degree vertices gather around higher-degree ones and as many vertices as possible are placed into appropriate partitions in each round. This achieves load balancing and reduces the number of cut edges, which greatly reduces the communication cost of graph computation. When edges are added, the method preferentially selects from the sliding window the vertices with the most adjacent edges, which avoids frequent vertex migration and places as many adjacent vertices as possible into appropriate partitions in each round. This greatly reduces the number of vertex migrations, improves partitioning efficiency, achieves load balancing, and minimizes the number of cut edges.

Description

A large-scale dynamic graph partitioning method based on a sliding window
Technical field
The invention belongs to the field of computer technology and specifically relates to a large-scale dynamic graph partitioning method based on a sliding window.
Background
As an abstract data structure, a graph can express complex structures and rich semantics, and graphs are widely used in fields such as social networks, communications, and scientific computing. As data volumes have kept growing in recent years, graph data can only be analyzed and processed with the help of distributed graph computing systems.
Graph partitioning is the technique of distributing large-scale graph data across a distributed computing system made up of many computing nodes, and it is the basis of distributed graph computing. In graph partitioning, if the two vertices of an edge are assigned to different computing nodes, the edge is called a cut edge. Graph partitioning should minimize the number of cut edges and balance the load among computing nodes.
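The two goals named above can be made concrete with a small sketch. The function names, the toy graph, and the imbalance ratio used here are illustrative assumptions, not definitions taken from the patent.

```python
from collections import Counter

def cut_edges(edges, part_of):
    """Return the edges whose two endpoints lie in different partitions."""
    return [(u, v) for u, v in edges if part_of[u] != part_of[v]]

def load_imbalance(part_of, k):
    """Largest partition size divided by the ideal size n / k (1.0 is perfect)."""
    sizes = Counter(part_of.values())
    return max(sizes.values()) / (len(part_of) / k)

# Hypothetical 4-vertex cycle split across two computing nodes.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
part_of = {"a": 0, "b": 0, "c": 1, "d": 1}

print(cut_edges(edges, part_of))   # [('b', 'c'), ('a', 'd')]
print(load_imbalance(part_of, 2))  # 1.0
```

A partitioner trades these two numbers against each other: placing every vertex on one node gives zero cut edges but the worst possible balance.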
At present, the graph data in many application scenarios changes frequently, for example when users and their relationships are added to or deleted from a social network. Such graphs are called dynamic graphs. Most existing graph partitioning algorithms target static graphs and must load all graph data into memory before partitioning, so they tend to incur very large computational overhead when applied to dynamic graphs.
Summary of the invention
To address the above technical problems in the prior art, the present invention proposes a large-scale dynamic graph partitioning method based on a sliding window. The design is reasonable, overcomes the shortcomings of the prior art, and achieves good results.
To achieve the above objective, the present invention adopts the following technical solution:
A large-scale dynamic graph partitioning method based on a sliding window comprises the following steps:
Step 1: Add vertices. This specifically comprises the following steps:
Input: the set S_vertex of vertices to be added, and the vertex sets of the current K partitions P_i (i = 1, 2, …, K);
Step 1.1: Set W_vertex = ∅ and specify L_vertex as the upper limit of |W_vertex|;
where W_vertex is the set of candidate vertices to be partitioned, whose vertices come from S_vertex;
Step 1.2: Take N = min{|S_vertex|, L_vertex - |W_vertex|}, i.e. the smaller of |S_vertex| and L_vertex - |W_vertex|; add the first N vertices of S_vertex to W_vertex and delete them from S_vertex;
Step 1.3: If W_vertex = ∅, output the partitioning result and end the process; otherwise go to step 1.4;
Step 1.4: Take v = argmax{d_u | u ∈ W_vertex}, where d_u is the degree of vertex u, i.e. the number of vertices adjacent to u. That is, take the vertex v with the largest degree in W_vertex; if several vertices share the largest degree, choose any one of them;
Step 1.5: Take V = Q = R = {v}; V is the set of vertices selected to be placed into one partition, [formula image PCTCN2019108136-appb-000003, not reproduced in this text]; Q is the vertex queue; R is the set of all vertices adjacent to the vertices in V;
Step 1.6: If Q = ∅, go to step 1.8; otherwise take the first vertex u out of Q and delete it from Q;
Step 1.7: Take Q = Q ∪ {w | (u, w) is an edge of the graph, and [condition in formula image PCTCN2019108136-appb-000005, not reproduced]} and R = R ∪ {w | (u, w) is an edge of the graph}, then go to step 1.6;
Step 1.8: Take V = V ∪ {w | w ∈ R, and [condition in formula image PCTCN2019108136-appb-000006, not reproduced]};
Step 1.9: For each partition P_i (i = 1, 2, …, K), compute C_i [formula images PCTCN2019108136-appb-000007 and PCTCN2019108136-appb-000008, not reproduced],
where C_i is the cost of placing the vertices or edges into the i-th partition; max_{j=1,…,K} |P_j| (formula image PCTCN2019108136-appb-000009) is the number of vertices in the partition with the most vertices; α is the weight coefficient between partition load and number of cut edges when computing C_i, with 0 < α < 1; the term in formula image PCTCN2019108136-appb-000010 measures the partition load, and the term in formula image PCTCN2019108136-appb-000011 measures the number of cut edges;
Step 1.10: Take m = argmin{C_i | i = 1, 2, …, K}, i.e. C_m is the minimum of all {C_i | i = 1, 2, …, K};
Step 1.11: Place all vertices in V into partition P_m, where P_m is the partition corresponding to the minimum value C_m;
Step 1.12: Take W_vertex = W_vertex - V, then go to step 1.2;
Step 2: Add edges. This specifically comprises the following steps:
Input: the set S_edge of edges to be added, and the vertex sets of the current K partitions P_i (i = 1, 2, …, K);
Prerequisite: all vertices of the edges in S_edge have already been partitioned;
Step 2.1: Set W_edge = ∅ and specify L_edge as the upper limit of |W_edge|;
Step 2.2: Take N = min{|S_edge|, L_edge - |W_edge|}, i.e. the smaller of |S_edge| and L_edge - |W_edge|; add the first N edges of S_edge to W_edge and delete them from S_edge;
Step 2.3: If W_edge = ∅, output the partitioning result and end the process; otherwise go to step 2.4;
Step 2.4: For each vertex v in W_edge, take E_v = {u | (u, v) ∈ W_edge}; E_v is the set of vertices adjacent to v through edges that belong to W_edge;
Step 2.5: Take v = argmax{|E_v|}, i.e. the vertex v with the most adjacent vertices in W_edge; if several vertices qualify, choose any one of them;
Step 2.6: Take T = {w | (v, w) is an edge of the graph}; T is the set of all vertices associated with a given vertex of W_edge;
Step 2.7: For each partition P_i (i = 1, 2, …, K): if v ∈ P_i, then [formula images PCTCN2019108136-appb-000014 and PCTCN2019108136-appb-000015, not reproduced]; otherwise [formula image PCTCN2019108136-appb-000016, not reproduced];
where max_{j=1,…,K} |P_j| is the number of vertices in the partition with the most vertices;
Step 2.8: Take m = argmin{C_i | i = 1, 2, …, K}, i.e. C_m is the minimum of all {C_i | i = 1, 2, …, K};
Step 2.9: Move v to partition P_m;
Step 2.10: For each edge (u, v) in E_v with u ∈ P_i and v ∈ P_j: if i ≠ j, assign (u, v) to both P_i and P_j; otherwise assign (u, v) to P_i;
Step 2.11: Take W_edge = W_edge - E_v, then go to step 2.2.
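The control flow of Step 1 (adding vertices) can be sketched as below. This is only an illustration under stated assumptions: the cost formula C_i and the membership conditions in steps 1.7 and 1.8 are given as images in the source and are not reproduced here, so `assumed_cost` is a hypothetical stand-in, and the test `w in window and w not in R` is an assumed reading. All function and variable names are invented for the example.

```python
from collections import deque

def gather_group(v, adj, window):
    """Steps 1.5-1.8: breadth-first search from v through window vertices."""
    V, R, Q = {v}, {v}, deque([v])
    while Q:                                   # step 1.6: stop when Q is empty
        u = Q.popleft()
        for w in adj.get(u, ()):               # step 1.7: scan u's neighbours
            if w in window and w not in R:     # ASSUMED enqueue condition
                Q.append(w)
            R.add(w)
    V |= {w for w in R if w in window}         # step 1.8 (assumed: unassigned members of R)
    return V, R

def assumed_cost(i, R, parts, alpha):
    """Hypothetical stand-in for C_i, NOT the patent's formula."""
    biggest = max(len(p) for p in parts) or 1
    load = len(parts[i]) / biggest                     # balance term
    assigned = [w for w in R if any(w in p for p in parts)]
    outside = sum(1 for w in assigned if w not in parts[i])
    cut = outside / len(assigned) if assigned else 0.0  # cut-edge term
    return alpha * load + (1 - alpha) * cut

def add_vertices(stream, adj, parts, limit, alpha=0.5):
    S, W = deque(stream), set()                          # step 1.1
    while True:
        for _ in range(min(len(S), limit - len(W))):     # step 1.2
            W.add(S.popleft())
        if not W:                                        # step 1.3
            return parts
        # Step 1.4; sorting only makes tie-breaking reproducible.
        v = max(sorted(W), key=lambda u: len(adj.get(u, ())))
        V, R = gather_group(v, adj, W)
        m = min(range(len(parts)),                       # steps 1.9-1.10
                key=lambda i: assumed_cost(i, R, parts, alpha))
        parts[m] |= V                                    # step 1.11
        W -= V                                           # step 1.12

# Toy graph: x and y arrive; x is linked to a and b, which live in partition 0.
adj = {"x": ["a", "b", "y"], "y": ["x"], "a": ["x", "b"], "b": ["x", "a"], "c": []}
parts = [{"a", "b"}, {"c"}]
result = add_vertices(["x", "y"], adj, parts, limit=2)
print(result)
```

In this toy run the high-degree vertex x is picked first, its window neighbour y is gathered into the same group, and both are placed with a and b, so no new cut edge is created.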
Beneficial technical effects of the present invention:
When adding vertices, the present invention preferentially selects higher-degree vertices in the sliding window for partitioning. This lets low-degree vertices gather around high-degree vertices and places as many vertices as possible into suitable partitions in each round, which reduces the number of cut edges while achieving load balancing and thereby greatly reduces the communication cost of graph computation.
When adding edges, the present invention preferentially selects the vertices with the most adjacent edges in the sliding window for partitioning. This avoids frequent vertex migration and places as many adjacent vertices as possible into suitable partitions in each round, which greatly reduces the number of vertex migrations, improves partitioning efficiency, achieves load balancing, and minimizes the number of cut edges.
Description of the drawings
Figure 1 is a flowchart of adding vertices.
Figure 2 is a flowchart of adding edges.
Figure 3 is a schematic diagram of the window structure when adding vertices.
Figure 4 is a schematic diagram of the window structure when adding edges.
Figure 5 is a schematic diagram of the sliding window model when adding vertices.
Figure 6 shows an example of adding vertices.
Figures 6(a)-(d) respectively show: the information in the vertex window corresponding to state A in Figure 5; the partition state of the graph data already partitioned before the vertices are added; the partitioning result after adding v_8 and v_9 with a streaming graph partitioning algorithm; and the partitioning result after adding v_8 and v_9 with the algorithm proposed by the present invention.
Figure 7 shows an example of adding edges.
Figures 7(a)-(d) respectively show: the window information when adding edges; the partition state of the graph data already partitioned before the edges are added; the partitioning result after adding (v_1, v_3) with a streaming graph partitioning algorithm; and the partitioning result after adding v_1 and its associated edges with the algorithm proposed by the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and specific embodiments:
A large-scale dynamic graph partitioning method based on a sliding window comprises the following steps:
Step 1: Add vertices. The process is shown in Figure 1 and specifically comprises the following steps:
Input: the set S_vertex of vertices to be added, and the vertex sets of the current K partitions P_i (i = 1, 2, …, K);
Step 1.1: Set W_vertex = ∅ and specify L_vertex as the upper limit of |W_vertex|;
where W_vertex is the set of candidate vertices to be partitioned, whose vertices come from S_vertex;
Step 1.2: Take N = min{|S_vertex|, L_vertex - |W_vertex|}, i.e. the smaller of |S_vertex| and L_vertex - |W_vertex|; add the first N vertices of S_vertex to W_vertex and delete them from S_vertex;
Step 1.3: If W_vertex = ∅, output the partitioning result and end the process; otherwise go to step 1.4;
Step 1.4: Take v = argmax{d_u | u ∈ W_vertex}, where d_u is the degree of vertex u, i.e. the number of vertices adjacent to u. That is, take the vertex v with the largest degree in W_vertex; if several vertices share the largest degree, choose any one of them;
Step 1.5: Take V = Q = R = {v}; V is the set of vertices selected to be placed into one partition, [formula image PCTCN2019108136-appb-000019, not reproduced in this text]; Q is the vertex queue; R is the set of all vertices adjacent to the vertices in V;
Step 1.6: If Q = ∅, go to step 1.8; otherwise take the first vertex u out of Q and delete it from Q;
Step 1.7: Take Q = Q ∪ {w | (u, w) is an edge of the graph, and [condition in formula image PCTCN2019108136-appb-000021, not reproduced]} and R = R ∪ {w | (u, w) is an edge of the graph}, then go to step 1.6;
Step 1.8: Take V = V ∪ {w | w ∈ R, and [condition in formula image PCTCN2019108136-appb-000022, not reproduced]};
Step 1.9: For each partition P_i (i = 1, 2, …, K), compute C_i [formula images PCTCN2019108136-appb-000023 and PCTCN2019108136-appb-000024, not reproduced],
where C_i is the cost of placing the vertices or edges into the i-th partition; max_{j=1,…,K} |P_j| (formula image PCTCN2019108136-appb-000025) is the number of vertices in the partition with the most vertices; α is the weight coefficient between partition load and number of cut edges when computing C_i, with 0 < α < 1; the term in formula image PCTCN2019108136-appb-000026 measures the partition load, and the term in formula image PCTCN2019108136-appb-000027 measures the number of cut edges;
Step 1.10: Take m = argmin{C_i | i = 1, 2, …, K}, i.e. C_m is the minimum of all {C_i | i = 1, 2, …, K};
Step 1.11: Place all vertices in V into partition P_m, where P_m is the partition corresponding to the minimum value C_m;
Step 1.12: Take W_vertex = W_vertex - V, then go to step 1.2;
Step 2: Add edges. The process is shown in Figure 2 and specifically comprises the following steps:
Input: the set S_edge of edges to be added, and the vertex sets of the current K partitions P_i (i = 1, 2, …, K);
Prerequisite: all vertices of the edges in S_edge have already been partitioned;
Step 2.1: Set W_edge = ∅ and specify L_edge as the upper limit of |W_edge|;
Step 2.2: Take N = min{|S_edge|, L_edge - |W_edge|}, i.e. the smaller of |S_edge| and L_edge - |W_edge|; add the first N edges of S_edge to W_edge and delete them from S_edge;
Step 2.3: If W_edge = ∅, output the partitioning result and end the process; otherwise go to step 2.4;
Step 2.4: For each vertex v in W_edge, take E_v = {u | (u, v) ∈ W_edge}; E_v is the set of vertices adjacent to v through edges that belong to W_edge;
Step 2.5: Take v = argmax{|E_v|}, i.e. the vertex v with the most adjacent vertices in W_edge; if several vertices qualify, choose any one of them;
Step 2.6: Take T = {w | (v, w) is an edge of the graph}; T is the set of all vertices associated with a given vertex of W_edge;
Step 2.7: For each partition P_i (i = 1, 2, …, K): if v ∈ P_i, then [formula images PCTCN2019108136-appb-000030 and PCTCN2019108136-appb-000031, not reproduced]; otherwise [formula image PCTCN2019108136-appb-000032, not reproduced];
where max_{j=1,…,K} |P_j| is the number of vertices in the partition with the most vertices;
Step 2.8: Take m = argmin{C_i | i = 1, 2, …, K}, i.e. C_m is the minimum of all {C_i | i = 1, 2, …, K};
Step 2.9: Move v to partition P_m;
Step 2.10: For each edge (u, v) in E_v with u ∈ P_i and v ∈ P_j: if i ≠ j, assign (u, v) to both P_i and P_j; otherwise assign (u, v) to P_i;
Step 2.11: Take W_edge = W_edge - E_v, then go to step 2.2.
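Step 2 (adding edges) can be sketched in the same spirit. The cost used in step 2.7 is given only as formula images in the source, so `assumed_cost` below is a hypothetical stand-in; the per-partition edge bookkeeping of step 2.10 is omitted, and all names are invented for the example.

```python
from collections import defaultdict, deque

def window_adjacency(window):
    """Step 2.4: E_v = vertices joined to v by an edge currently in the window."""
    E = defaultdict(set)
    for u, v in window:
        E[u].add(v)
        E[v].add(u)
    return E

def assumed_cost(i, v, T, parts, alpha):
    """Hypothetical stand-in for C_i, NOT the patent's formula."""
    biggest = max(len(p) for p in parts) or 1
    size = len(parts[i]) + (0 if v in parts[i] else 1)  # size of P_i if it holds v
    cut = sum(1 for w in T if w not in parts[i])        # neighbours left outside P_i
    return alpha * size / biggest + (1 - alpha) * cut / max(len(T), 1)

def add_edges(stream, adj, parts, limit, alpha=0.5):
    S, W = deque(stream), []                             # step 2.1
    while True:
        for _ in range(min(len(S), limit - len(W))):     # step 2.2
            u, v = S.popleft()
            W.append((u, v))
            adj[u].add(v)
            adj[v].add(u)
        if not W:                                        # step 2.3
            return parts
        E = window_adjacency(W)                          # step 2.4
        # Step 2.5; sorting only makes tie-breaking reproducible.
        v = max(sorted(E), key=lambda x: len(E[x]))
        T = adj[v]                                       # step 2.6
        m = min(range(len(parts)),                       # steps 2.7-2.8
                key=lambda i: assumed_cost(i, v, T, parts, alpha))
        for p in parts:                                  # step 2.9: move v to P_m
            p.discard(v)
        parts[m].add(v)
        W = [e for e in W if v not in e]                 # step 2.11: drop v's window edges

# Toy graph: a-b, b-c in partition 0 and d-e in partition 1; edges a-d, a-e arrive.
adj = defaultdict(set)
for a, b in [("a", "b"), ("b", "c"), ("d", "e")]:
    adj[a].add(b)
    adj[b].add(a)
parts = [{"a", "b", "c"}, {"d", "e"}]
result = add_edges([("a", "d"), ("a", "e")], adj, parts, limit=3)
print(result)
```

In this toy run, vertex a has two window edges and is handled once: moving it next to d and e leaves one cut edge (a, b) instead of the two that would result from keeping it in place.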
The symbols used in the above method and their meanings are listed in Table 1.
Table 1. Main symbols and their meanings
[Table 1 is provided as images PCTCN2019108136-appb-000033 and PCTCN2019108136-appb-000034, not reproduced in this text]
The method requires a sliding window to be constructed. The window structure when adding vertices is shown in Figure 3. The sliding window for adding vertices consists of L_vertex vertices sorted by degree, and each vertex has three fields:
(1) Primary Key: each vertex to be partitioned in the sliding window corresponds to one primary key.
(2) Assigned adjacent vertices (Secondary Key): the list of vertices adjacent to the primary key that have already been placed in a partition.
(3) Unassigned adjacent vertices (Unassigned Key): the list of vertices in the sliding window that are adjacent to the primary key and have not yet been partitioned.
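One possible in-memory layout for this window is sketched below. The class and field names are invented for illustration, and the degree is approximated as the total length of the two neighbour lists, which is an assumption rather than something the text specifies.

```python
from dataclasses import dataclass, field

@dataclass
class VertexEntry:
    primary_key: str                                            # field (1)
    assigned_neighbours: list = field(default_factory=list)     # field (2)
    unassigned_neighbours: list = field(default_factory=list)   # field (3)

    @property
    def degree(self):
        # Assumption: count only neighbours recorded in the two lists.
        return len(self.assigned_neighbours) + len(self.unassigned_neighbours)

def build_vertex_window(candidates, adj, partition_of):
    """Build one entry per candidate vertex and sort entries by degree, highest first."""
    window = []
    for v in candidates:
        entry = VertexEntry(v)
        for w in adj.get(v, ()):
            if w in partition_of:          # neighbour already placed in a partition
                entry.assigned_neighbours.append(w)
            elif w in candidates:          # neighbour still waiting in the window
                entry.unassigned_neighbours.append(w)
        window.append(entry)
    window.sort(key=lambda e: e.degree, reverse=True)
    return window

# Hypothetical data: x and y are waiting; a and b already live in partition 0.
adj = {"x": ["a", "b", "y"], "y": ["x"]}
window = build_vertex_window(["x", "y"], adj, {"a": 0, "b": 0})
for entry in window:
    print(entry.primary_key, entry.assigned_neighbours, entry.unassigned_neighbours)
```

Keeping the entries sorted means the highest-degree candidate is always at the front of the window.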
The window structure when adding edges is shown in Figure 4. The sliding window for adding edges consists of L_edge edges organized as an adjacency list, whose head vertices are sorted by their number of adjacent vertices in W_edge. It contains:
(1) Primary Key: each vertex in the sliding window corresponds to one primary key.
(2) Adjacent vertices (Secondary Key): the other vertices joined to the primary key by edges in the sliding window.
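A minimal sketch of such an edge window follows, assuming the window is held as a list of (head vertex, neighbour list) pairs. The function name and the three sample edges are hypothetical and are not the contents of Figure 7.

```python
from collections import defaultdict

def build_edge_window(edges):
    """Group window edges into an adjacency list sorted by neighbour count, largest first."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)   # v is a secondary key of primary key u
        adjacency[v].append(u)   # and u is a secondary key of primary key v
    return sorted(adjacency.items(), key=lambda kv: len(kv[1]), reverse=True)

# Hypothetical window holding three edges.
edge_window = build_edge_window([("v1", "v3"), ("v1", "v4"), ("v2", "v4")])
for head, neighbours in edge_window:
    print(head, neighbours)
```

The first entry is then the vertex with the most adjacent vertices in the window, which is the one step 2.5 selects.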
The graphs in many application scenarios change dynamically. The implementation of the invention for adding vertices and adding edges is further explained below with reference to the drawings and specific examples.
Case 1: adding vertices.
The sliding window model for adding vertices is shown in Figure 5. S_vertex is the vertex stream; A and B denote the states after W_vertex has been filled with vertices (by sliding the window to the right).
根据步骤1.1,首先初始化窗口W vertex,指定L vertex为4。 According to step 1.1, first initialize the window W vertex and specify L vertex as 4.
根据步骤1.2,取N=min{|S vertex|,L vertex-|W vertex|},将S vertex中的前N个顶点增加到W vertex中,达到A状态,此时W vertex中的顶点为{v 8,v 9,v 10,v 11},窗口信息如图6(a)所示。图6(b)表示在增加顶点前,已经划分好的图结构数据的分区状态,其中虚线圆P 1和P 2为2个 分区,空心圆v 0、v 1、v 2、v 3、v 4、v 5、v 6和v 7表示图结构数据的8个顶点,顶点之间的实线表示图结构数据中的边。 According to step 1.2, take N=min{|S vertex |,L vertex -|W vertex |}, add the first N vertices in S vertex to W vertex to reach the A state, at this time the vertices in W vertex are {v 8 ,v 9 ,v 10 ,v 11 }, the window information is shown in Figure 6(a). Figure 6(b) shows the partition status of the graph structure data that has been divided before adding vertices. The dotted circles P 1 and P 2 are two partitions, and the hollow circles v 0 , v 1 , v 2 , v 3 , v 4 , v 5 , v 6 and v 7 represent 8 vertices of the graph structure data, and the solid lines between the vertices represent edges in the graph structure data.
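The refill rule of step 1.2 can be sketched as follows. This is a minimal Python illustration under the assumption that the stream is a queue and the window is a list; `refill_window` is a hypothetical name.

```python
from collections import deque


def refill_window(stream, window, capacity):
    """Move N = min(|stream|, capacity - |window|) items into the window."""
    n = min(len(stream), capacity - len(window))
    for _ in range(n):
        window.append(stream.popleft())  # also removes the item from the stream
    return n


stream = deque([8, 9, 10, 11, 12, 13])   # vertex ids v_8 ... v_13
window = []
first_n = refill_window(stream, window, capacity=4)   # fills to state A

# After v_8 and v_9 have been partitioned they leave the window.
window = [v for v in window if v not in (8, 9)]
second_n = refill_window(stream, window, capacity=4)  # fills to state B
```

Here the second refill moves 2 vertices, matching the value of N reported in the example once v_8 and v_9 are removed.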
According to steps 1.3 to 1.5, take the vertex with the largest degree from the window. Here v_8 has degree 4, so select v=v_8 and add it to V, Q and R.
According to steps 1.6 to 1.8, Q={v_8} is not empty, so take the first vertex u=v_8 from Q and delete it from Q. Traverse the neighbors v_1, v_3, v_5, v_9 of u, add v_9 to Q, and add v_1, v_3, v_5, v_9 to R. Repeat these steps until Q is empty. Finally, add the unpartitioned vertices in R to V. At this point V={v_8, v_9} and R={v_0, v_1, v_3, v_5, v_7}.
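The queue-driven expansion of steps 1.5 to 1.8 can be sketched as a breadth-first traversal. This is a hedged Python illustration: the exact membership conditions of steps 1.7 and 1.8 are given only as equation images, so the sketch assumes that a neighbor is queued when it is in the window and not yet queued, and that V finally collects the window vertices found in R. The adjacency of v_9 is likewise assumed, chosen to be consistent with the sets reported above.

```python
from collections import deque


def expand_group(seed, adjacency, window):
    """Collect the window vertices reachable from `seed` and their neighbors."""
    V = {seed}            # vertices that will be assigned together
    R = {seed}            # every vertex adjacent to the group (plus the seed)
    Q = deque([seed])     # vertices still to be expanded
    queued = {seed}
    while Q:
        u = Q.popleft()
        for w in adjacency.get(u, ()):
            if w in window and w not in queued:   # assumed condition (step 1.7)
                Q.append(w)
                queued.add(w)
            R.add(w)
    V |= {w for w in R if w in window}            # assumed condition (step 1.8)
    return V, R


adjacency = {8: [1, 3, 5, 9], 9: [8, 0, 7]}       # assumed toy adjacency
V, R = expand_group(8, adjacency, window={8, 9, 10, 11})
partitioned_neighbors = R - V
```

In this sketch R also keeps the group members themselves, so the already-partitioned neighbors listed in the example correspond to `R - V`.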
According to steps 1.9 to 1.11, compute the costs C_1 and C_2 of adding the vertices v_8 and v_9 in V to P_1 and P_2 respectively. Since C_1>C_2, the target partition is P_m=P_2, and v_8 and v_9 are added to P_2. The result is shown in Figure 6(d); the number of cut edges after partitioning is 3, which is 2 fewer than the result of the streaming graph partitioning algorithm (Figure 6(c)).
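The selection of the target partition can be sketched as a weighted cost followed by an argmin. The cost formula itself is available here only as equation images, so the form below is an assumption: a load term |P_i| / max_j |P_j| weighted by α, plus a cut term (the share of the group's edges to already-partitioned vertices that would be cut) weighted by 1-α. The names `partition_costs` and `choose_partition` and the toy graph are hypothetical.

```python
def partition_costs(group, adjacency, partitions, alpha=0.5):
    """Assumed cost: alpha * load term + (1 - alpha) * cut-edge term."""
    largest = max(len(p) for p in partitions) or 1
    assigned = set().union(*partitions)
    # Edges from the group to vertices that already have a partition.
    external = [(v, w) for v in group
                for w in adjacency.get(v, ()) if w in assigned]
    costs = []
    for p in partitions:
        cut = sum(1 for _, w in external if w not in p)
        cut_term = cut / len(external) if external else 0.0
        load_term = len(p) / largest
        costs.append(alpha * load_term + (1 - alpha) * cut_term)
    return costs


def choose_partition(costs):
    """Index m of the smallest cost (step 1.10)."""
    return min(range(len(costs)), key=costs.__getitem__)


adjacency = {8: [1, 5, 6, 9], 9: [8, 7]}          # toy graph, not Figure 6
partitions = [{0, 1, 2, 3}, {4, 5, 6, 7}]         # P_1 and P_2
costs = partition_costs({8, 9}, adjacency, partitions, alpha=0.5)
m = choose_partition(costs)
```

With these toy inputs the first cost exceeds the second, so the group goes to the second partition, mirroring the C_1>C_2 outcome of the example without reproducing its actual numbers.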
According to step 1.12, delete v_8 and v_9 from W_vertex. The vertices v_8 and v_9 of state A in Figure 5 are now partitioned; go to step 1.2. N is now 2, so the vertices v_12 and v_13 of S_vertex are added to W_vertex, reaching state B in Figure 5. Repeat the above steps until every vertex in S_vertex has been assigned to the graph, then output the partitioning result and end.
Case 2: adding edges.
For edge addition, steps 2.1 to 2.3 and step 2.11 are similar to the vertex case: edges are continually moved from S_edge into W_edge; when the number of edges in W_edge reaches L_edge, or W_edge is not empty, a vertex is selected for migration and the related edges are assigned to the corresponding partitions; when W_edge is empty, the partitioning result is output and the process ends. This part mainly describes how, within the window W_edge, a vertex is selected for migration and the related edges are added (steps 2.4 to 2.10), with L_edge set to 3.
According to steps 2.4 to 2.5, select from W_edge the vertex v with the largest number of adjacent vertices. From the window contents in Figure 7(a), v=v_1.
According to step 2.6, put all vertices adjacent to v_1 into T. Figure 7(b) shows the partition state of the graph data before the edges are added, so all vertices adjacent to v_1 in Figures 7(a) and 7(b), namely v_2, v_5, v_3, v_4 and v_6, are put into T. At this point T={v_2, v_3, v_4, v_5, v_6}.
According to steps 2.7 to 2.10, compute the costs C_1 and C_2 of moving v_1 to P_1 and P_2. Since C_1>C_2, the target partition is P_m=P_2; because v_1 is already in P_2, it is not moved. After the edges in E_v={(v_1,v_3), (v_1,v_4), (v_1,v_6)} are assigned to partitions, the partitioning result is as shown in Figure 7(d).
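The edge placement rule of step 2.10 can be sketched as follows: an edge whose endpoints lie in different partitions is recorded in both, otherwise in the single shared partition. This is a hedged Python illustration; `assign_edges` is a hypothetical name, and the partition membership used below is assumed for the sake of the example rather than read from Figure 7.

```python
def assign_edges(edges, partition_of):
    """Map each edge to the set of partitions that must store it."""
    placement = {}
    for u, v in edges:
        i, j = partition_of[u], partition_of[v]
        # A cut edge (i != j) is stored in both partitions.
        placement[(u, v)] = {i} if i == j else {i, j}
    return placement


partition_of = {1: 2, 3: 1, 4: 2, 6: 2}           # assumed membership
placement = assign_edges([(1, 3), (1, 4), (1, 6)], partition_of)
cut_edges = [e for e, parts in placement.items() if len(parts) == 2]
```

Because the vertex is placed once per window rather than once per arriving edge, the placement is computed after the vertex has settled, which is where the saving in migrations comes from.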
Most existing streaming graph partitioning methods use the number of cut edges as the main criterion for migrating a vertex. After (v_1,v_3) is added to Figure 7(b), such a method moves v_1 to P_1 to reduce the number of cut edges (Figure 7(c)). After (v_1,v_4) and (v_1,v_6) are then added, v_1 moves back to its original partition P_2. The final partitioning is the same as that of the present invention, but it incurs the overhead of 2 extra vertex migrations, which lowers partitioning efficiency.
Of course, the above description does not limit the present invention, and the present invention is not limited to the above examples. Changes, modifications, additions or substitutions made by those skilled in the art within the essential scope of the present invention also fall within its scope of protection.

Claims (1)

  1. A large-scale dynamic graph division method based on a sliding window, characterized in that it comprises the following steps:
    Step 1: Add vertices, specifically comprising the following steps:
    Input: the set S_vertex of vertices to be added, and the vertex sets of the current K partitions P_i (i=1,2,…,K);
    Step 1.1: Set
    Figure PCTCN2019108136-appb-100001
    and specify the upper limit of |W_vertex| as L_vertex;
    where W_vertex is the set of candidate vertices to be partitioned, whose vertices come from S_vertex;
    Step 1.2: Take N=min{|S_vertex|, L_vertex-|W_vertex|}, i.e. the minimum of |S_vertex| and L_vertex-|W_vertex|; add the first N vertices of S_vertex to W_vertex and delete these vertices from S_vertex;
    Step 1.3: If
    Figure PCTCN2019108136-appb-100002
    then output the partitioning result and end the partitioning process; otherwise, go to step 1.4;
    Step 1.4: Take v=argmax{d_u | u∈W_vertex}, where d_u is the degree of vertex u, i.e. the number of vertices adjacent to u; that is, take the vertex v with the largest degree in W_vertex; if several vertices share the largest degree, choose any one of them;
    Step 1.5: Take V=Q=R={v}; V is the set of vertices selected to be assigned to a partition,
    Figure PCTCN2019108136-appb-100003
    Q is the vertex queue; R is the set of all vertices adjacent to the vertices in V;
    Step 1.6: If
    Figure PCTCN2019108136-appb-100004
    then go to step 1.8; otherwise, take the first vertex u from Q and delete it from Q;
    Step 1.7: Take Q=Q∪{w | (u,w) is an edge of the graph, and
    Figure PCTCN2019108136-appb-100005
    R=R∪{w | (u,w) is an edge of the graph}, then go to step 1.6;
    Step 1.8: Take V=V∪{w | w∈R, and
    Figure PCTCN2019108136-appb-100006
    Step 1.9: For each partition P_i (i=1,2,…,K), calculate
    Figure PCTCN2019108136-appb-100007
    Figure PCTCN2019108136-appb-100008
    where C_i is the cost of assigning the vertices or edges to the i-th partition;
    Figure PCTCN2019108136-appb-100009
    is the number of vertices in the partition with the most vertices; α is the weight coefficient between partition load and the number of cut edges when calculating C_i, 0<α<1;
    Figure PCTCN2019108136-appb-100010
    is used to measure the partition load, and
    Figure PCTCN2019108136-appb-100011
    is used to measure the number of cut edges;
    Step 1.10: Take m=argmin{C_i | i=1,2,…,K}, i.e. C_m is the minimum of all {C_i | i=1,2,…,K};
    Step 1.11: Assign all vertices in V to partition P_m; P_m is the partition corresponding to the minimum value C_m;
    Step 1.12: Take W_vertex=W_vertex-V, then go to step 1.2;
    Step 2: Add edges, specifically comprising the following steps:
    Input: the set S_edge of edges to be added, and the vertex sets of the current K partitions P_i (i=1,2,…,K);
    Precondition: all endpoints of the edges in S_edge have already been partitioned;
    Step 2.1: Set
    Figure PCTCN2019108136-appb-100012
    and specify the upper limit of |W_edge| as L_edge;
    Step 2.2: Take N=min{|S_edge|, L_edge-|W_edge|}, i.e. the minimum of |S_edge| and L_edge-|W_edge|; add the first N edges of S_edge to W_edge and delete these edges from S_edge;
    Step 2.3: If
    Figure PCTCN2019108136-appb-100013
    then output the partitioning result and end the partitioning process; otherwise, go to step 2.4;
    Step 2.4: For each vertex v in W_edge, take E_v={u | (u,v)∈W_edge}; E_v is the set of vertices adjacent to vertex v and belonging to W_edge;
    Step 2.5: Take v=argmax{|E_v|}, i.e. take the vertex v with the largest number of adjacent vertices in W_edge; if several vertices satisfy the condition, choose any one of them;
    Step 2.6: Take T={w | (v,w) is an edge of the graph}; T is the set of all vertices associated with a vertex in W_edge;
    Step 2.7: For each partition P_i (i=1,2,…,K), if v∈P_i, then
    Figure PCTCN2019108136-appb-100014
    Figure PCTCN2019108136-appb-100015
    otherwise
    Figure PCTCN2019108136-appb-100016
    where max_{j=1,2,…,K}{|P_j|} is the number of vertices in the partition with the most vertices;
    Step 2.8: Take m=argmin{C_i | i=1,2,…,K}, i.e. C_m is the minimum of all {C_i | i=1,2,…,K};
    Step 2.9: Move v to partition P_m;
    Step 2.10: For each edge (u,v) in E_v, with u∈P_i and v∈P_j: if i≠j, assign (u,v) to both P_i and P_j; otherwise assign (u,v) to P_i;
    Step 2.11: Take W_edge=W_edge-E_v, then go to step 2.2.
PCT/CN2019/108136 2019-07-01 2019-09-26 Large-scale dynamic graph division method based on sliding window WO2021000435A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910583072.6 2019-07-01
CN201910583072.6A CN110309371A (en) 2019-07-01 2019-07-01 A kind of extensive Dynamic Graph division methods based on sliding window

Publications (1)

Publication Number Publication Date
WO2021000435A1 true WO2021000435A1 (en) 2021-01-07

Family

ID=68078081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/108136 WO2021000435A1 (en) 2019-07-01 2019-09-26 Large-scale dynamic graph division method based on sliding window

Country Status (2)

Country Link
CN (1) CN110309371A (en)
WO (1) WO2021000435A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7742906B2 (en) * 2007-03-06 2010-06-22 Hewlett-Packard Development Company, L.P. Balancing collections of vertices in a network
CN103699606A (en) * 2013-12-16 2014-04-02 华中科技大学 Large-scale graphical partition method based on vertex cut and community detection
CN104361578A (en) * 2014-10-20 2015-02-18 北京大学 Hierarchical grid partition method under multi-scale precision control
CN106780697A (en) * 2016-12-07 2017-05-31 珠海金山网络游戏科技有限公司 It is a kind of based on normal direction, geometry, uv factors lattice simplified method
CN109325976A (en) * 2018-09-28 2019-02-12 深圳大学 It is divided and simplified figure diameter algorithm and system based on K scattergram of dynamic

Also Published As

Publication number Publication date
CN110309371A (en) 2019-10-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19936228

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19936228

Country of ref document: EP

Kind code of ref document: A1