CN107122248A

CN107122248A - A storage-optimized distributed graph processing method

Info

Publication number: CN107122248A
Application number: CN201710301095.4A
Authority: CN
Inventors: 施展; 冯丹; 单玉祥; 李君浩; 毛艳; 张芸怡; 方交凤
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2017-05-02
Filing date: 2017-05-02
Publication date: 2017-09-01
Anticipated expiration: 2037-05-02
Also published as: CN107122248B

Abstract

The invention discloses a storage-optimized distributed graph processing method in the field of graph computing. The method comprises: partitioning the data in a preprocessing phase; distributing the graph partitions; starting iterative data processing; passing update messages; making a worker-node scale-out decision; and ending data processing. The invention partitions and stores graph data using a consistent hashing algorithm and implements a distributed graph processing system based on the external-storage (out-of-core) model. Using a dynamic storage optimization strategy, it adjusts the partitioned storage of the graph according to load, which balances the graph processing load and speeds up processing. This addresses the load imbalance of the prior art, in which hotspots arising during graph processing degrade overall performance, and thereby improves graph processing performance.

Description

A storage-optimized distributed graph processing method

Technical field

The invention belongs to the field of graph computing and, more specifically, relates to a storage-optimized distributed graph processing method.

Background art

A graph is a classical data structure that expresses complex data relationships through vertices and edges. Graphs are widely used in many fields, including social data analysis and mining on the Internet, protein interaction in chemistry, prediction of disease outbreak paths in medicine, and citation relationships among publications in academia. Many important algorithms have been derived from them, including PageRank, shortest path, connected components, and maximal independent set. Because graph data is so important and requires a large amount of computation, a variety of graph processing systems have emerged.

The first kind is distributed in-memory graph processing systems, including Pregel and GraphLab. These systems load all graph information into memory before processing begins. This approach executes quickly but is expensive, and the challenge becomes increasingly significant as graph applications continue to grow in scale. Moreover, the amount of memory a single machine can hold is relatively limited, so such a system can only scale out by adding machines. This inevitably increases the number of graph partitions, which in turn increases the number of cut edges, raises inter-machine communication pressure, and aggravates network I/O latency. These costs offset the parallelism gained from scaling out and drag down graph processing performance.

In response to the contradictions of scaling out, a group of single-machine out-of-core graph processing systems designed around scaling up has emerged, including GraphChi and X-Stream. They exploit the fact that external storage is cheaper than memory and easier to expand in capacity: most of the graph data resides in external storage, and only a small amount is loaded into memory when a computation depends on it. Graph information is obtained mainly through disk access, which reduces the dependence on inter-machine communication and makes it possible to process graphs with acceptable performance on ordinary machines whose memory and other resources are highly limited. However, the performance of such systems is severely affected by disk I/O.

In the era of big data, graph data keeps growing in scale, and the requirements for scalability and parallelism keep rising. Whether a graph processing system is architected to scale up on a single machine or to scale out across a cluster, it faces its own limitations. On a single machine, resource expansion falls short in computing power, memory, and I/O bandwidth. In a distributed architecture, reasonable partitioning of the graph has long been a classical challenge. A good partitioning balances the computing load and reduces communication overhead, thereby accelerating processing, but finding such a partitioning is itself an NP-hard problem. Even approximation algorithms often need a large amount of time and resources for preprocessing, so the cost outweighs the gain. For this reason, the prior art still uses only simple graph partitioning, such as the hash-based partitioning of Pregel and the contiguous range-based partitioning of Gemini. With such simple partitioning, load imbalance is hard to avoid during distributed graph processing. It produces dynamically changing processing hotspots, which become the bottleneck of the whole iterative computation and degrade overall graph processing performance.

Summary of the invention

In view of the above shortcomings of the prior art, the invention provides a storage-optimized distributed graph processing method that partitions and stores graph data and balances I/O, so as to balance the graph processing load and speed up processing. It addresses the load imbalance of the prior art, in which hotspots arising during graph processing degrade overall performance.

To achieve the above object, the invention provides a storage-optimized distributed graph processing method comprising the following steps:

(1) Initialization. The processor nodes of the graph processing system are divided into one master node and multiple worker nodes. Each worker node carries out the basic graph processing procedure and implements the graph computation model; the master node controls the worker nodes.

The master node generates an initial message routing table from the user's initial configuration, which may be a file containing information about each node. Every worker node keeps a copy of the message routing table and updates it synchronously. The message routing table records the routing information between worker nodes. The master node controls the execution of the whole graph processing system, and the worker nodes carry out the basic graph processing procedure. The routing information is used for passing update messages between worker nodes and for communication between the master node and each worker node.

The master node partitions the graph data according to the message routing table. That is, the large-scale data is represented as a graph structure and partitioned by vertex id. The number of partitions equals the number of worker nodes in the message routing table, and the vertices are divided evenly among the partitions according to the total vertex count. All vertices of the graph form a ring in which the minimum vertex id is adjacent to the maximum vertex id. After partitioning, each vertex partition takes one of two forms: it covers either one contiguous range of ids or two contiguous ranges of ids. The graph is thus split into segments of "contiguous" vertices, where "contiguous" means that a graph partition is contiguous on the ring described above. This is a block-based partitioning method. The system does not deliberately optimize the initial partitioning of the graph data and does not guarantee that the partitions are balanced; it only distributes the vertices evenly across the graph partitions, and each graph partition owns all outgoing edges of its vertices.
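The even, ring-based partitioning described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `RingPartition`, `partition_ring`, and `offset` are hypothetical, vertex ids are assumed to be the integers 0 to `num_vertices - 1`, and `num_vertices` is assumed to be at least `num_workers`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RingPartition:
    pid: int     # partition id
    start: int   # first vertex id of the partition (inclusive)
    count: int   # number of vertices, walking clockwise from start

    def segments(self, num_vertices: int) -> List[Tuple[int, int]]:
        """Return the strictly contiguous id ranges (inclusive) of this partition."""
        end = self.start + self.count - 1
        if end < num_vertices:
            return [(self.start, end)]            # one contiguous range
        # The partition wraps past the largest id back to id 0: two ranges.
        return [(self.start, num_vertices - 1), (0, end - num_vertices)]


def partition_ring(num_vertices: int, num_workers: int, offset: int = 0) -> List[RingPartition]:
    """Divide the vertex ring evenly into one partition per worker node."""
    base, extra = divmod(num_vertices, num_workers)
    parts, start = [], offset % num_vertices
    for pid in range(num_workers):
        count = base + (1 if pid < extra else 0)  # spread the remainder
        parts.append(RingPartition(pid, start, count))
        start = (start + count) % num_vertices
    return parts
```

With a non-zero `offset`, exactly one partition can wrap around the ring, which corresponds to the "two contiguous ranges" case in the text.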

(2) Distribution of graph data. The master node sends each vertex partition obtained in step (1), together with the partition's metadata, to the corresponding worker node in the message routing table according to the consistent hashing algorithm. The metadata includes the number of edges in the whole graph, the number of vertices in the whole graph, the type of the graph, the number of edges in each partition, the id of each partition, the starting vertex id of each partition, the ending vertex id of each partition, and the split-point vertex id within the graph partition. This information is used when a partition is split again later.

(3) Iteration count check. Each worker node starts iterative graph processing under the control of the master node. Before each iteration, the master node checks whether the iteration count has reached the preset value; if so, go to step (6); otherwise, go to step (4). Here the master node acts as a barrier: during graph processing, a worker node must wait until all worker nodes have completed the current iteration before it can start the next one.
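The barrier behaviour of step (3) can be illustrated with a small single-process sketch. It is only an analogy: worker nodes are modelled as threads, and Python's `threading.Barrier` stands in for the master node. The function name `run_iterations` and the `work` callback are hypothetical, and `work` is assumed not to raise.

```python
import threading
from typing import Callable, List, Tuple


def run_iterations(num_workers: int, max_iterations: int,
                   work: Callable[[int, int], None]) -> List[Tuple[int, int]]:
    """Run max_iterations rounds; no worker starts round i+1 before all finish round i."""
    barrier = threading.Barrier(num_workers)   # plays the role of the master node
    lock = threading.Lock()
    log: List[Tuple[int, int]] = []            # (iteration, worker id) completion order

    def worker(wid: int) -> None:
        for iteration in range(max_iterations):   # preset iteration count
            work(wid, iteration)                  # one round of graph processing
            with lock:
                log.append((iteration, wid))
            barrier.wait()                        # wait for all workers to finish the round

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because every worker records its completion before waiting at the barrier, all entries for one iteration appear in the log before any entry for the next.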

(4) Update message passing. Each worker node performs the MGA computation, which includes:

First, a Map operation is performed on every edge of the graph partition. In PageRank, for example, each vertex divides its weight equally among its outgoing edges. While the Map operation runs, each vertex produces an update message, which is sent to the corresponding destination address. Update messages are produced by the specific application; an update message contains the data to be passed to the neighboring vertex and a destination address, namely the address of the neighboring vertex. Second, each vertex performs a Gather operation, which collects all update messages sent to that vertex. Third, each vertex performs an Apply operation, which modifies the vertex's data using the collected update messages. Update messages are passed only between worker nodes and are sent to the corresponding worker node according to the message routing table. MGA is short for the Map, Gather, and Apply operations above; in each iteration of graph processing, every vertex goes through these three phases.

In this step, each worker node follows the MGA (Map-Gather-Apply) computation model. It performs a Map operation on every edge of its graph partition, which produces an update message according to the specific application; the update message contains the data to be passed to the neighboring vertex and a destination address, and it is sent to that address. Each vertex then performs a Gather operation, which collects all update messages sent to the vertex, followed by an Apply operation, which updates the vertex state according to those messages. In every iteration, therefore, a worker node must send the update messages it produces to the corresponding worker nodes. These messages are passed only between worker nodes and are routed according to the distributed message routing table. All worker nodes and the master node keep a distributed message routing table; the table kept by each worker node is a copy of the one kept by the master node. When the master node's table is updated, the master node synchronizes the change to the message routing tables of all worker nodes.
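One MGA round for PageRank can be sketched as below. This is a single-process illustration under stated assumptions: the function name `mga_iteration` is hypothetical, routing between worker nodes is omitted, and the damping factor of 0.85 is the conventional PageRank choice rather than something the source specifies.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mga_iteration(edges: Iterable[Tuple[int, int]],
                  rank: Dict[int, float],
                  out_degree: Dict[int, int],
                  damping: float = 0.85) -> Dict[int, float]:
    """One Map-Gather-Apply round of PageRank over a stream of (src, dst) edges."""
    # Map: one update message per edge, (destination address, data for the neighbor).
    updates = ((dst, rank[src] / out_degree[src]) for src, dst in edges)

    # Gather: each vertex collects every update message addressed to it.
    inbox: Dict[int, List[float]] = defaultdict(list)
    for dst, value in updates:
        inbox[dst].append(value)

    # Apply: each vertex rewrites its own data from the collected messages.
    n = len(rank)
    return {v: (1.0 - damping) / n + damping * sum(inbox.get(v, []))
            for v in rank}
```

In a distributed run, the Map output would be sent to the worker node that owns `dst`, looked up through the message routing table, before the Gather phase.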

(5) Scale-out handling. The master node determines from the collected worker-node running status whether the load of the worker nodes is balanced:

If so, no splitting or scale-out is needed; go to step (3) for the next iteration.

Otherwise, the graph partition data of the most heavily loaded worker node is split, that is, the graph data it processes is divided, and the system then scales out to eliminate the hotspot, i.e., the node that takes the longest to process its data. The consistent hashing algorithm is used to assign graph partition data to the worker nodes, which achieves load regulation. The message routing table is then updated, and the method goes to step (3) for the next iteration. Splitting means dividing the vertices of a partition into two parts. Scaling out means adding one worker node to process the graph data that has been split off. The consistent hashing algorithm assigns data to the worker nodes, and the assignment is recomputed each time a worker node is added. Scaling out is the key method for balancing load and an important means of speeding up each graph processing iteration.
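The balance check in step (5) can be sketched as follows. The source does not state how imbalance is measured, so the rule used here (the slowest worker exceeds `tolerance` times the mean iteration time) and the names `pick_hotspot` and `tolerance` are assumptions made purely for illustration.

```python
from typing import Dict, Optional


def pick_hotspot(iteration_seconds: Dict[str, float],
                 tolerance: float = 1.2) -> Optional[str]:
    """Return the hotspot worker to split, or None if the load looks balanced."""
    mean = sum(iteration_seconds.values()) / len(iteration_seconds)
    # The hotspot is the worker that took longest in this iteration.
    worker, slowest = max(iteration_seconds.items(), key=lambda kv: kv[1])
    # Assumed rule: only scale out when the slowest worker is clearly above average.
    return worker if slowest > tolerance * mean else None
```

A master node using such a rule would split the returned worker's partition, add a worker node, and update the message routing table before the next iteration.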

(6) Graph data processing ends and the computation result is output.

Further, the MGA computation in step (4) reads the graph data as a stream, which ensures sequential access to the storage device and thus makes the fullest use of external storage I/O.
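Streaming edge access can be illustrated with a small generator. The on-disk layout is an assumption: the source does not describe a file format, so this sketch uses a hypothetical binary file of little-endian unsigned 32-bit `(src, dst)` pairs, and the names `EDGE`, `stream_edges`, and `chunk_edges` are illustrative.

```python
import struct
from typing import Iterator, Tuple

EDGE = struct.Struct("<II")  # assumed record layout: (src, dst) as two uint32 values


def stream_edges(path: str, chunk_edges: int = 4096) -> Iterator[Tuple[int, int]]:
    """Yield edges in file order, reading the file front to back in fixed-size chunks."""
    with open(path, "rb") as f:
        while True:
            buf = f.read(EDGE.size * chunk_edges)  # sequential read, no seeking
            if not buf:
                break
            # Assumes the file length is a whole number of edge records.
            yield from EDGE.iter_unpack(buf)
```

Because the file is only ever read forward, the Map phase can consume edges without holding the whole partition in memory.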

Further, the worker-node running status collected in step (5) includes disk I/O, network I/O, and computation cost.

Further, the hotspot in step (5) is the worker node that runs slowest in an iteration. The principle for splitting the hotspot's graph partition is to make the load costs of the two resulting partitions as close as possible. The load cost is:

COST = α|V| + |E|

Here, for graph data processing, α is the average in-degree of the graph, α|V| represents the update messages a graph partition has to receive, and |E| represents the update messages a graph partition has to send; COST measures the load of a graph partition. The purpose of splitting is to find a vertex on the graph partition to be split such that the two sub-partitions it produces have roughly equal load cost. When a scale-out decision is made, such a vertex can always be found on the partition to be split.
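The search for a split vertex can be sketched with prefix sums over the cost formula. This is an illustrative reading, assuming the partition's vertices are listed in ring order with known out-degrees and that the partition has at least two vertices; the function names `partition_cost` and `find_split` are hypothetical.

```python
from itertools import accumulate
from typing import Sequence


def partition_cost(num_vertices: int, num_edges: int, alpha: float) -> float:
    """COST = alpha * |V| + |E| for one graph partition."""
    return alpha * num_vertices + num_edges


def find_split(out_degrees: Sequence[int], alpha: float) -> int:
    """Return k so that the first k vertices and the remaining vertices have the closest costs."""
    per_vertex = [alpha + d for d in out_degrees]   # each vertex adds alpha plus its out-edges
    total = sum(per_vertex)
    best_k, best_gap = 1, float("inf")
    for k, left in enumerate(accumulate(per_vertex), start=1):
        if k == len(per_vertex):
            break                                   # both sub-partitions must be non-empty
        gap = abs(total - 2 * left)                 # |cost(left) - cost(right)|
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k
```

For out-degrees `[1, 1, 1, 1, 8]` and `alpha = 2`, the split lands after the fourth vertex, giving sub-partition costs of 12 and 10.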

The invention partitions data using a storage-optimized method. The basic idea of the partitioning strategy in step 1 is to partition simply by range at the start; later, during graph processing, the system re-partitions by range according to the load of the worker nodes. All vertices of the graph form a ring in which the minimum vertex id is adjacent to the maximum vertex id. The graph is split into segments of "contiguous" vertices, where "contiguous" means that a graph partition is contiguous on this ring.

The MGA computation model of step 4 is shown in Fig. 4. The MGA process reads the graph data as a stream, which ensures sequential access to the disk and thus makes the fullest use of external storage I/O.

In general, compared with the prior art, the above technical solution conceived by the invention has the following beneficial effects:

When processing graph data, the invention stores the graph data using consistent hashing and then applies a dynamic storage optimization strategy that adjusts the partitioned storage of the graph according to load. This enables dynamic scale-out of graph partitions, eliminates "hotspot" worker nodes, balances I/O, improves graph processing performance, and greatly increases the speed at which the system processes data.

Brief description of the drawings

Fig. 1 is the execution flow chart of the storage-optimized distributed graph processing system of the invention;

Fig. 2 is the overall architecture diagram of the storage-optimized distributed graph processing system of the invention;

Fig. 3 is the partitioning strategy diagram of the storage-optimized distributed graph processing system of the invention;

Fig. 4 is a schematic diagram of the MGA computation model of the storage-optimized distributed graph processing system of the invention;

Figs. 5, 6, and 7 are schematic diagrams of dynamic partition scale-out in the storage-optimized distributed graph processing system of the invention.

Detailed description

To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below may be combined with one another as long as they do not conflict.

Fig. 2 is the system architecture diagram of the storage-optimized distributed graph processing system of the invention. The system consists of two parts, a master node and worker nodes. The master node controls the execution of the whole graph processing system; the worker nodes carry out the basic graph processing procedure and implement the graph computation model. Fig. 1 is the flow chart of data processing in the system, which includes the following steps:

Step 1: Partitioning the graph:

The master node first generates an initial message routing table from the user's initial configuration and then partitions the graph data according to that table. A partition takes one of two forms: it covers either one contiguous range of ids or two contiguous ranges of ids. This is a block-based partitioning method. The system does not deliberately optimize the initial partitioning of the graph data and does not guarantee that the partitions are balanced; it only distributes the vertices evenly across the graph partitions.

Step 2: Distributing the graph data:

The master node sends each graph partition and its metadata to the corresponding worker node. The metadata includes the number of edges in the whole graph, the number of vertices in the whole graph, the type of the graph, the number of edges in the partition, the id of the partition, the starting vertex id of the partition, the ending vertex id of the partition, and the split-point vertex id within the graph partition. This information is used when a partition is split again.

Step 3: Starting iterative data processing:

The master node controls the start and end of each graph processing iteration and limits the number of iterations according to a preset value. Here the master node acts as a barrier: during graph processing, a worker node must wait until all worker nodes have completed the current iteration before it can start the next one. When all iterations are judged complete, step 6 is performed.

Step 4: Passing update messages:

In every iteration, each vertex goes through the three phases of the MGA (Map-Gather-Apply) computation model. The graph data is read from disk as a stream so that it can be processed in parallel. A Map operation is performed on every edge and produces an update; the update structure contains a destination address, and each update is sent to that address. Each vertex has a Gather operation, and once all updates have been collected, each vertex performs an Apply operation. A worker node must send the update messages it produces to the corresponding worker nodes. These messages are passed only between worker nodes and are routed according to the distributed message routing table.

Step 5: Scale-out decision:

The scale-out decision is the key method for balancing load and an important means of speeding up each graph processing iteration. During each iteration, the master node uses the worker-node running status it has collected to decide whether to scale out the system and which worker node to "split" in order to eliminate the hotspot, thereby regulating the load.

Step 6: Graph data processing ends and the computation result is output.

The invention provides an embodiment that uses a ring-shaped hash space [0, 2³² - 1], four vertices Vid1, Vid2, Vid3, and Vid4, and three graph partitions, with Vid2 and Vid4 in the same partition, as shown in Fig. 5. The invention is described in detail through the following steps:

Step 1: Partitioning the graph:

As shown for example in Fig. 3, the vertices form a ring, joined end to end, and are divided into four partitions. The vertex ids in partitions one, two, and three are strictly contiguous, whereas partition four is contiguous only on the ring: it spans the first and last vertices of the graph and actually consists of two strictly contiguous vertex segments. In distributed graph processing, this block-based method reduces the cost of mapping global vertex ids to partition-local vertex ids, because each graph partition only needs to maintain its boundary information to convert quickly. When a graph partition is split again during load transfer, it can be shown that searching along the ring always finds a vertex at which the partition can be divided in two such that the loads of the two parts are equal or differ minimally.
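The boundary-based id conversion described above can be sketched as follows. The sketch assumes vertex ids are the integers 0 to `num_vertices - 1` and that a partition is described by its starting id and vertex count on the ring; the function names `to_local` and `to_global` are hypothetical.

```python
def to_local(vid: int, start: int, count: int, num_vertices: int) -> int:
    """Map a global vertex id to its offset inside a ring partition."""
    offset = (vid - start) % num_vertices   # clockwise distance from the partition start
    if offset >= count:
        raise KeyError(f"vertex {vid} is not in this partition")
    return offset


def to_global(local: int, start: int, num_vertices: int) -> int:
    """Map a partition-local offset back to the global vertex id."""
    return (start + local) % num_vertices
```

The modulo arithmetic handles the wrap-around partition and the strictly contiguous partitions with the same two boundary values, so no per-vertex lookup table is needed.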

Step 2: Distributing the graph data:

The master node sends each graph partition and its metadata to the corresponding worker node. The worker nodes are first mapped onto the ring formed by the vertices. Assuming three worker nodes, the consistent hashing algorithm yields a key (KEY) value for each, i.e., the position of the worker node on the ring.

Hash(W1) = KEY1

Hash(W2) = KEY2

Hash(W3) = KEY3

The key values are then placed on the ring, as shown in Fig. 6.

Then, using the consistent hashing algorithm and moving clockwise, all vertices (Vid) of each graph partition are mapped to the worker node nearest to that partition; that is, the graph partition is assigned to that worker node for computation.
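The clockwise lookup on the hash ring can be sketched as below. The source does not name a hash function or say how a partition's position on the ring is derived, so the MD5-based `ring_hash`, the `HashRing` class, and the choice to pass an integer ring position to `lookup` are all assumptions for illustration.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List

RING_SIZE = 2 ** 32  # ring-shaped hash space [0, 2^32 - 1]


def ring_hash(key: str) -> int:
    """Place a key on the ring (assumed hash: first 4 bytes of MD5)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, workers: Iterable[str] = ()) -> None:
        self._keys: List[int] = []        # sorted worker positions on the ring
        self._owner: Dict[int, str] = {}  # position -> worker name
        for w in workers:
            self.add_worker(w)

    def add_worker(self, worker: str) -> None:
        key = ring_hash(worker)
        bisect.insort(self._keys, key)
        self._owner[key] = worker

    def lookup(self, position: int) -> str:
        """Return the first worker at or clockwise after the given ring position."""
        i = bisect.bisect_left(self._keys, position % RING_SIZE)
        return self._owner[self._keys[i % len(self._keys)]]  # wrap past the top of the ring
```

Adding a worker with `add_worker` only takes over positions on the arc just before the new worker's key; every other position keeps its previous owner, which is what makes the scheme suitable for scaling out.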

Step 3: Starting one round of iterative data processing:

Step 4: Passing update messages:

Step 5: Scale-out decision:

The scale-out decision is the key method for balancing load and an important means of speeding up each graph processing iteration. During each iteration, the master node uses the worker-node running status it has collected to decide whether to scale out the system and which worker node to "split" in order to eliminate the hotspot, and it redistributes the graph data using the consistent hashing algorithm described in step 2, thereby regulating the load. Suppose the hotspot appears in partition two, which is processed by the third worker node W3. Partition two then needs to be "split" into two sub-partitions, and a fourth worker node W4 is added to process one of them, as shown in Fig. 7.

When performing graph processing, the invention quantitatively evaluates the load of each worker node, splits the partition data of the "hotspot" worker node evenly, and implements dynamic scale-out of partitions through the consistent hashing algorithm. Tests on synthetic graph data and real social network graph data confirm that storage optimization reduces the imbalance of graph processing and accelerates it.

Those skilled in the art will readily understand that the above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims

1. A storage-optimized distributed graph processing method, characterized by comprising the following steps:

(1) Initialization. The processor nodes of the graph processing system are divided into one master node and multiple worker nodes. Each worker node carries out the basic graph processing procedure and implements the graph computation model; the master node controls the worker nodes.

The master node generates an initial message routing table from the user's initial configuration, and every worker node keeps a copy of the message routing table and updates it synchronously. The message routing table records the routing information between worker nodes. The master node controls the execution of the whole graph processing system, and the worker nodes carry out the basic graph processing procedure. The routing information is used for passing update messages between worker nodes and for communication between the master node and each worker node.

The master node partitions the graph data according to the message routing table, partitioning by vertex id. The number of partitions equals the number of worker nodes in the message routing table, and the vertices are divided evenly among the partitions according to the total vertex count. All vertices of the graph form a ring-shaped space in which the minimum vertex id is adjacent to the maximum vertex id. After partitioning, each vertex partition takes one of two forms: it covers either one contiguous range of ids or two contiguous ranges of ids.

(2) Distribution of graph data. The master node sends each vertex partition obtained in step (1), together with the partition's metadata, to the corresponding worker node in the message routing table according to the consistent hashing algorithm. The metadata includes the number of edges in the whole graph, the number of vertices in the whole graph, the type of the graph, the number of edges in each partition, the id of each partition, the starting vertex id of each partition, the ending vertex id of each partition, and the split-point vertex id within the graph partition.

(3) Iteration count check. Each worker node starts iterative graph processing under the control of the master node. Before each iteration, the master node checks whether the iteration count has reached the preset value; if so, go to step (6); otherwise, go to step (4).

(4) Update message passing. Each worker node performs the MGA computation, which includes:

First, a Map operation is performed on every edge of the graph partition; while the Map operation runs, each vertex produces an update message, which is sent to the corresponding destination address.

Second, each vertex performs a Gather operation, which collects all update messages sent to that vertex.

Third, each vertex performs an Apply operation, which modifies the vertex's data using the collected update messages. Update messages are passed only between worker nodes and are sent to the corresponding worker node according to the message routing table.

MGA is short for the Map, Gather, and Apply operations above; in each iteration of graph processing, every vertex goes through these three phases.

(5) Scale-out handling. The master node determines from the collected worker-node running status whether the load of the worker nodes is balanced:

If so, go to step (3) for the next iteration. Otherwise, the graph partition data of the most heavily loaded worker node is split, that is, the graph data it processes is divided, and the system then scales out to eliminate the hotspot, i.e., the node that takes the longest to process its data. The consistent hashing algorithm is used to assign graph partition data to the worker nodes, which achieves load regulation. The message routing table is then updated, and the method goes to step (3) for the next iteration.

(6) Graph data processing ends and the computation result is output.

2. The method of claim 1, characterized in that the MGA computation in step (4) reads the graph data as a stream, which ensures sequential access to the storage device and thus makes the fullest use of external storage I/O.

3. The method of claim 1, characterized in that the worker-node running status collected in step (5) includes disk I/O, network I/O, and computation cost.

4. The method of claim 1, characterized in that the hotspot in step (5) is the worker node that runs slowest in an iteration, and the principle for splitting the hotspot's graph partition is to make the load costs of the two resulting partitions as close as possible, where the load cost is:

COST = α|V| + |E|

Here, for graph data processing, α is the average in-degree of the graph, α|V| represents the update messages a graph partition has to receive, and |E| represents the update messages a graph partition has to send; COST measures the load of a graph partition. The purpose of splitting is to find a vertex on the graph partition to be split such that the two sub-partitions it produces have roughly equal load cost.

5. A distributed graph processing method based on storage optimization, characterized in that the system consists of two parts, a master node and worker nodes; the master node controls the execution of the whole graph processing system, and the worker nodes carry out the basic graph processing procedure and implement the graph computation model.