CN104618153B - P2P-based dynamic fault-tolerance method and system for distributed parallel graph processing - Google Patents
- Publication number: CN104618153B (application CN201510026680.9A; also published as CN104618153A)
- Authority: CN (China)
- Prior art keywords: processor node, node, data unit, copy, adjacent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention relates to a P2P-based dynamic fault-tolerance method and system for distributed parallel graph processing. The method comprises: defining the data unit of the distributed graph processing problem, so that data can be restored completely during dynamic fault tolerance; arranging the processor nodes in a ring, dividing the input graph data into several partitions, assigning each partition to a processor node, and having each processor node back up a copy of its own data units in its adjacent processor nodes; after each processor node has executed its own data units, incrementally updating the copies it has placed in the adjacent processor nodes; and, when a processor node fails or goes offline because of a network error, assigning its adjacent nodes to replace the original data units with the data copies and complete the corresponding operations, thereby restoring normal execution of the graph processing. The invention allows a graph processing job to recover from anomalies such as node failures and network errors, ensuring that the job executes correctly.
Description
Technical field
The present invention relates to the technical field of computer networks, and in particular to a method and system for dynamic fault tolerance in distributed parallel graph processing in an open, dynamic network environment.
Background
In recent years, with the spread and development of technologies such as social networks and contract networks, the volume of data on the Internet has kept growing, which poses new challenges for analyzing these data. In scenarios such as social networks and contract networks, data items may be associated with one another, and related research commonly describes such data with a graph structure. A vertex in the graph records the attributes of a data item, and an edge represents an association between the corresponding data items. In this way, the analysis of network data is converted into graph processing. However, such graphs usually have millions or tens of millions of vertices and hundreds of millions of edges, which puts great pressure on the memory of an ordinary computer. Worse, graph processing generates intermediate results whose volume is proportional to the size of the graph, so a single machine, limited by its memory, can hardly process the graph normally.
Therefore, processing graphs in parallel on a distributed cluster has become the main approach to analyzing network graph data. Taking the Pregel framework proposed by Google as a typical representative, most graph processing systems divide the graph data into multiple subgraphs (or graph partitions), assign them to several machines, and compute on the graph data in parallel. The parallel computation mainly uses the BSP (Bulk Synchronous Parallel) model, in which the graph data is processed iteratively. In each iteration step, every vertex in the graph receives the messages from its adjacent vertices, updates its own state, and broadcasts the new state to its adjacent vertices. As shown in Fig. 1, the operations of all vertices must be synchronized between iteration steps: the next iteration step is triggered only when all vertices have completed the operations of the current step. When all vertices have obtained their results, the graph computation is complete.
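The iteration described above can be sketched in a few lines of Python. This is a single-process illustration only, not the Pregel API: the function name `bsp_run`, the dictionary-based graph, and the minimum-label `compute` function in the usage lines are all assumptions made for the example. The swap of `outbox` into `inbox` plays the role of the synchronization barrier between iteration steps.

```python
from collections import defaultdict


def bsp_run(graph, init_state, compute, max_steps=50):
    """Run a vertex-centric BSP loop on one machine (illustrative only).

    graph:      dict mapping each vertex to the list of its adjacent vertices
    init_state: dict mapping each vertex to its initial state
    compute:    function (state, messages) -> new state
    """
    state = dict(init_state)
    # Before the first step, every vertex announces its initial state.
    inbox = defaultdict(list)
    for v, neighbors in graph.items():
        for n in neighbors:
            inbox[n].append(state[v])

    steps = 0
    while inbox and steps < max_steps:
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            new_state = compute(state[v], messages)
            if new_state != state[v]:
                state[v] = new_state
                # Broadcast the new state; it is delivered in the next step.
                for n in graph[v]:
                    outbox[n].append(new_state)
        inbox = outbox  # barrier: the next step sees only this step's messages
        steps += 1
    return state, steps


# Usage: label every vertex with the smallest vertex id in its component.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels, _ = bsp_run(graph, {v: v for v in graph},
                    lambda s, msgs: min([s] + msgs))
```

The loop stops when no vertex sends a message, which corresponds to the condition that all vertices have obtained their results.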
In an open network environment, a distributed cluster changes dynamically: some nodes in the cluster may fail, and the network connecting the cluster may malfunction. Such anomalies affect the normal execution of graph processing jobs, so distributed parallel graph processing must consider fault tolerance during computation. Existing graph processing systems generally achieve fault tolerance through a checkpoint mechanism. When an error is detected, this mechanism requires the graph processing to stop the current computation, reload the most recent checkpoint data from disk, and restart the computation from the iteration recorded in that checkpoint. This fault-tolerance mechanism is static and its recovery cost is relatively high, so it is not well suited to an open network environment.
Summary of the invention
The object of the present invention is to provide a P2P (peer-to-peer) based method for dynamic fault tolerance in distributed parallel graph processing in an open, dynamic network environment, so that a graph processing job can recover from anomalies such as node failures and network errors and execute correctly.
To achieve this object, the present invention adopts the following technical solution:
A P2P-based dynamic fault-tolerance method for distributed parallel graph processing, comprising the following steps:
1) defining the data unit of the distributed graph processing problem, so that data can be restored completely during dynamic fault tolerance;
2) arranging the processor nodes in a ring, dividing the input graph data into several partitions, assigning each partition to a processor node, and having each processor node back up a copy of its own data units in its adjacent processor nodes;
3) in each iteration step of the computation, after each processor node has executed its own data units, incrementally updating the copies it has placed in the adjacent processor nodes;
4) when a processor node fails or goes offline because of a network error, assigning its adjacent nodes to replace the original data units with the data copies and complete the corresponding operations, thereby restoring normal execution of the graph processing.
Further, the data unit of the distributed graph processing problem defined in step 1) is the 2-tuple (P_j, InMsg(P_j)), where P_j is one of the graph partitions into which the graph structure is divided, that is, a subgraph; InMsg(P_j) is the set of messages received in a given iteration step by all vertices contained in subgraph P_j, and is the empty set when the computation starts; 1 ≤ j ≤ m, where m is the number of partitions obtained by partitioning the graph.
Further, after step 2) divides the input graph data into several partitions, they are assigned to the processor nodes in the form of 2-tuples (P_j, empty message set).
Further, in step 2), for the data units assigned to it, a processor node computes on each vertex in the data unit based on the BSP model; for the data unit copies of adjacent nodes, the processor node only stores them and, after each iteration step, waits for the adjacent nodes to send the data that updates these copies.
Further, step 3) performs a copy update after every iteration step, or performs a copy update once every certain number of iteration steps, weighing the frequency of system errors against the time cost of updating the copies.
Further, in step 3), in each iteration step of graph processing, each vertex first processes the set of messages received in the previous iteration step, updates its own state according to these messages, and then propagates its new state to its adjacent vertices; the messages of a vertex need not accumulate across iteration steps: once an iteration step has computed the updated value of a vertex, the messages received before that step are no longer needed.
A distributed parallel graph processing system using the above method comprises a controller and processor nodes. The controller is responsible for partitioning the input graph data, assigning the partitions to the processor nodes, and monitoring the running status of each processor node. The processor nodes form a ring; each processor node backs up a copy of its own data units in its adjacent processor nodes and, after executing its own data units, incrementally updates the copies it has placed in the adjacent processor nodes. When a processor node fails or goes offline because of a network error, the controller uses the data copies in its adjacent nodes to replace the original data units and restores normal execution of the graph processing.
Compared with the prior art, the beneficial effects of the present invention are as follows:
(1) The traditional checkpoint mechanism generally has to read and write disk, whereas the present invention caches the copies needed for fault tolerance in memory. Compared with the traditional checkpoint mechanism, reading and writing memory makes copy recording and error recovery faster;
(2) The P2P-based assignment of copies avoids a single hot spot during copy recording and error recovery, so network traffic is spread more evenly across the distributed processors, which reduces the overall communication time in each iteration step;
(3) The present invention supports dynamic fault tolerance: when an anomaly occurs during graph computation, the computation can resume from the point of the anomaly using the most recent data unit copy, without recomputing data that has already been computed correctly and without re-executing the entire graph processing application from the beginning.
Description of the drawings
Fig. 1 is a workflow diagram of distributed parallel graph processing based on the BSP model.
Fig. 2 is a schematic diagram of P2P-based generation of data unit copies.
Fig. 3 is a schematic diagram of two adjacent processor nodes updating copies across adjacent iteration steps.
Fig. 4 is a schematic diagram of error recovery and the rebuilding of P2P copies.
Fig. 5 is a schematic diagram of the UniAS architecture in the specific example.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the invention is further described below through specific embodiments and the accompanying drawings.
The P2P-based dynamic fault-tolerance method for distributed parallel graph processing of the present invention comprises the following steps:
(1) To ensure that data can be restored completely during dynamic fault tolerance, the present invention defines the data unit of distributed graph processing.
In graph processing, the parameters needed to compute each vertex include not only the vertex's own prior state but also the messages propagated by its adjacent vertices. Therefore, the definition of the data unit of the graph processing problem must consider both the data dependence on the vertex state from the preceding iteration and the dependence on messages from adjacent vertices. To describe the data unit more intuitively, the present invention defines the graph structure as G = (V, E), where V denotes the set of vertices {v_1, v_2, v_3, …, v_n} and E denotes the set of directed edges. The initial graph structure is divided into a set of graph partitions, denoted P = {P_1, P_2, P_3, …, P_m}, which are assigned to the processor nodes at run time. For any vertex v_i, the set of messages it receives in a given iteration step is InMsg(v_i). Further, for a subgraph P_j, the set of messages received in a given iteration step by all the vertices it contains is InMsg(P_j) = {InMsg(v_i) | v_i ∈ P_j}. The present invention then defines the data unit of distributed graph processing as the 2-tuple (P_j, InMsg(P_j)), where 1 ≤ j ≤ m; when the computation starts, InMsg(P_j) is the empty set. The fault-tolerance technique described below requires the 2-tuple to be treated as a whole and copied or restored together.
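As a rough illustration of this definition, the 2-tuple (P_j, InMsg(P_j)) can be modeled as a small Python record. The class name `DataUnit`, its field names, and the choice to store only vertex states (omitting edges) are assumptions made for brevity; the patent does not prescribe a concrete layout. The `snapshot` method reflects the requirement that both parts are copied together.

```python
from dataclasses import dataclass, field


@dataclass
class DataUnit:
    """Hypothetical model of the 2-tuple (P_j, InMsg(P_j))."""
    partition_id: int                            # the index j
    vertex_state: dict                           # P_j: vertex -> state
    in_msgs: dict = field(default_factory=dict)  # InMsg(P_j): vertex -> messages

    def snapshot(self):
        """Copy the partition state and the message set as one unit."""
        return DataUnit(
            self.partition_id,
            dict(self.vertex_state),
            {v: list(msgs) for v, msgs in self.in_msgs.items()},
        )


def make_initial_units(partitions):
    """Wrap each partition as (P_j, empty message set), with j starting at 1."""
    return [DataUnit(j, dict(p)) for j, p in enumerate(partitions, start=1)]


units = make_initial_units([{"a": 0, "b": 0}, {"c": 0}])
copy_of_first = units[0].snapshot()
```

Keeping the message set inside the same record means that a copy can never hold a partition state from one iteration step and messages from another.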
(2) The processor nodes are arranged in a ring, and each node backs up a copy of its own data units in its adjacent processor nodes.
To make fault tolerance efficient, the present invention does not store the copies of the computation on disk but records them in memory. However, because the graphs in distributed graph processing are large, if the graph copies after every iteration were all kept in one particular processor node, that node could become a data transmission hot spot and degrade the efficiency of the whole computation. Therefore, the present invention distributes the storage of copies across the processor nodes to balance the volume of data transmitted while copies are generated. The present invention numbers all processor nodes that take part in the computation and arranges them in a ring. Following the definition of the data unit in (1), the input graph data is divided into several partitions, which are assigned to the processor nodes in the form of 2-tuples (P_j, empty message set). After assignment, each data unit sends a copy of itself to the two processor nodes with adjacent numbers for storage.
Therefore, each processor node both holds the data units assigned to it (hereinafter, compute units) and caches copies of the data units of its adjacent nodes (hereinafter, mirror units), as shown in Fig. 2, where A to H denote compute units and A′ to H′ denote mirror units. For the data units assigned to it, a processor computes on each vertex in the data unit based on the BSP model; for the data unit copies of adjacent nodes, the processor only stores them and, after each iteration step, waits for the adjacent nodes to send the data that updates these copies.
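The placement of compute units and mirror units on the ring can be sketched as follows. The helper names `ring_neighbors` and `place_units` are hypothetical, and the round-robin assignment of units to nodes is an assumption for the example; the text only requires that each unit is mirrored on the two nodes with adjacent numbers.

```python
def ring_neighbors(node_ids, node):
    """Return the left and right neighbors of `node` in the ring."""
    i = node_ids.index(node)
    n = len(node_ids)
    return node_ids[(i - 1) % n], node_ids[(i + 1) % n]


def place_units(node_ids, unit_ids):
    """Assign units round-robin and mirror each one on both ring neighbors."""
    compute = {n: [] for n in node_ids}
    mirrors = {n: [] for n in node_ids}
    for k, unit in enumerate(unit_ids):
        owner = node_ids[k % len(node_ids)]
        compute[owner].append(unit)
        # dict.fromkeys removes a duplicate neighbor in a two-node ring.
        for nb in dict.fromkeys(ring_neighbors(node_ids, owner)):
            if nb != owner:
                mirrors[nb].append(unit + "'")
    return compute, mirrors


# Four nodes and eight units named A to H, in the spirit of Fig. 2.
compute, mirrors = place_units([0, 1, 2, 3], list("ABCDEFGH"))
# compute[0] holds A and E; mirrors[0] holds B', D', F', H'.
```

Because every node stores mirrors only for its two neighbors, copy traffic is spread evenly around the ring instead of converging on one backup node.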
(3) During the computation, after each processor node has executed its own data units, it incrementally updates the copies it has placed in the adjacent processor nodes.
In a distributed graph processing system, every vertex of the graph is continually updated as the iteration proceeds, so the copies set up in step (2) must be continually updated as well. Moreover, because computing each vertex requires the messages its adjacent vertices propagated in the previous iteration step, data recovery must take into account the relationship between the iteration steps at which the various data unit copies stand.
To simplify this problem and keep every copy update consistent with the iteration step the copy represents, the present invention, building on the BSP computation model, adds a copy update phase between two iteration steps. In this phase, every data unit in the graph processing is required to update its copies to its current values. The copy update phase can take place after every iteration step, or once every certain number of iteration steps, weighing the frequency of system errors against the time cost of updating the copies.
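The trade-off behind the update interval can be made concrete with a small sketch. Both functions and the parameter `interval` are hypothetical names, and the recovery cost model (a failed node redoes every step since its copies were last synchronized) is an assumption used only to illustrate why a longer interval saves update time but makes recovery more expensive.

```python
def replica_update_steps(total_steps, interval=1):
    """Iteration steps after which a copy update phase runs."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return [k for k in range(1, total_steps + 1) if k % interval == 0]


def steps_to_redo(failed_at, interval=1):
    """Steps to recompute if a node fails during step `failed_at`.

    Assumes the copies reflect the state at the end of the most recent
    step that was followed by an update phase.
    """
    last_synced = ((failed_at - 1) // interval) * interval
    return failed_at - last_synced


update_points = replica_update_steps(6, interval=2)  # after steps 2, 4 and 6
cost_every_step = steps_to_redo(5, interval=1)       # only the current step
cost_every_third = steps_to_redo(5, interval=3)      # steps 4 and 5
```

With an interval of 1 the adjacent node only has to finish the current iteration step, which matches the recovery algorithm described later in the text.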
To reduce the time and network resources consumed by each copy transfer, and to reduce the memory occupied by the copy cache, the present invention proposes incremental copy updates. Each update of the graph partition state in a copy transmits only the part that has changed; the adjacent processor node merges the changed part into its data unit copy, which updates the graph partition state in the copy. Furthermore, in each iteration step of graph processing, each vertex first processes the set of messages received in the previous iteration step, updates its own state according to these messages, and then propagates its new state to its adjacent vertices. Therefore, the messages of a vertex need not accumulate across iteration steps: once an iteration step has computed the updated value of a vertex, the messages received before that step are no longer needed. Take a graph partition P_j as an example. Initially, a copy of (P_j^0, null) is cached in the adjacent processor nodes. When the 1st iteration step completes, the state of the partition is updated to P_j^1; let the partition increment ΔP_1 be the difference between P_j^1 and P_j^0. The processor node hosting P_j then sends (ΔP_1, InMsg(P_j)[1]) to the adjacent processor nodes. Each adjacent processor node accumulates ΔP_1 into the copy (P_j^0, null) to produce P_j^1 and replaces the message set with InMsg(P_j)[1]. In general, when the k-th iteration step completes, let the partition increment ΔP_k be the difference between P_j^k and P_j^(k-1). In the copy update phase, (ΔP_k, InMsg(P_j)[k]) is sent to the adjacent processor nodes. Each adjacent processor node accumulates ΔP_k into the copy (P_j^(k-1), InMsg(P_j)[k-1]) to produce P_j^k, deletes InMsg(P_j)[k-1], and replaces the message set with InMsg(P_j)[k]. Fig. 3 illustrates how two adjacent processor nodes update copies across adjacent iteration steps.
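A minimal sketch of the incremental update follows. It represents a partition state as a dictionary from vertex to state and assumes that vertices are never removed between steps; the names `partition_delta` and `apply_update` are hypothetical. The sender computes ΔP_k, and the receiver accumulates it into the mirror and swaps in the new message set.

```python
def partition_delta(prev_state, curr_state):
    """Return the increment: only the vertices whose state changed."""
    return {v: s for v, s in curr_state.items() if prev_state.get(v) != s}


def apply_update(mirror, delta, in_msgs):
    """Accumulate the increment into a mirror and replace its message set.

    mirror is the pair (partition state, message set) held by a neighbor.
    The old message set is dropped, since earlier messages are not needed.
    """
    state, _old_msgs = mirror
    new_state = dict(state)
    new_state.update(delta)
    return new_state, in_msgs


p0 = {"a": 1, "b": 2, "c": 3}      # P_j^0
p1 = {"a": 1, "b": 5, "c": 3}      # P_j^1 after the first iteration step
delta_1 = partition_delta(p0, p1)  # only vertex "b" is transmitted
mirror = apply_update((p0, {}), delta_1, {"a": [7]})
```

After the update the mirror equals (P_j^1, InMsg(P_j)[1]), yet only one vertex state crossed the network.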
(4) When a processor node fails or goes offline because of a network error, its adjacent nodes are assigned to replace the original data units with the data copies and complete the corresponding operations, thereby restoring normal execution of the graph processing.
When graph processing has reached some iteration step k and a processor node fails or goes offline because of a network error, the data units in the offline node cannot continue computing, but their copies (P_j, InMsg(P_j)) are cached in the adjacent nodes. Because the copy data is maintained incrementally as described in step (3), the present invention can restore the lost data units from the copies in the adjacent nodes, as shown in Fig. 4.
To this end, the present invention designs the following dynamic fault-tolerance algorithm:
Recovery_scheduling(ID of the offline node, set Set of data units in the offline node)
{
    Get the two adjacent nodes L and R of the offline node;
    if (L is online && R is online) {
        Divide Set into two subsets SetL and SetR according to the loads of nodes L and R;
        Node L turns its cached mirror units corresponding to SetL into compute units and completes the current iteration step for all vertices in SetL;
        Node R turns its cached mirror units corresponding to SetR into compute units and completes the current iteration step for all vertices in SetR;
        Set nodes L and R to be adjacent nodes;
        Compare the compute units in nodes L and R; for each data unit that has no mirror unit in the peer node, generate a copy and send it to the peer node;
        Resume normal execution;
    } else if (L is online || R is online) {
        N = the online node;
        Node N turns its cached mirror units corresponding to Set into compute units and completes the current iteration step for all vertices in Set;
        Find the adjacent node of N;
        Compare the compute units in node N and its adjacent node; for each data unit that has no mirror unit in the peer node, generate a copy and send it to the peer node;
        Resume normal execution;
    }
}
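The scheduling part of the algorithm above can be sketched in Python. This is an illustration under stated assumptions rather than the patented implementation: the function name `recovery_scheduling`, its parameters, and the greedy least-loaded split of the lost units are all hypothetical, and the sketch covers only the reassignment and ring repair, not the recomputation of the current iteration step or the rebuilding of missing mirror units.

```python
def recovery_scheduling(ring, online, load, failed, lost_units):
    """Decide which surviving neighbor takes over each lost data unit.

    ring:       list of node ids in ring order
    online:     set of node ids that are currently online
    load:       dict mapping node id to its number of compute units
    failed:     id of the node that went offline
    lost_units: data units that were computing on the failed node
    Returns (new_ring, assignment), where assignment maps each surviving
    neighbor to the mirror units it promotes to compute units.
    """
    i = ring.index(failed)
    left, right = ring[i - 1], ring[(i + 1) % len(ring)]
    # Dropping the failed node makes its neighbors adjacent to each other.
    new_ring = [n for n in ring if n != failed]

    survivors = [n for n in dict.fromkeys((left, right))
                 if n in online and n != failed]
    if not survivors:
        raise RuntimeError("no online neighbor holds mirrors of the lost units")

    assignment = {n: [] for n in survivors}
    projected = {n: load.get(n, 0) for n in survivors}
    for unit in lost_units:
        # Greedy split by load: give the unit to the less loaded survivor.
        target = min(survivors, key=lambda n: projected[n])
        assignment[target].append(unit)
        projected[target] += 1
    return new_ring, assignment


new_ring, plan = recovery_scheduling(
    ring=[0, 1, 2, 3], online={0, 2, 3},
    load={0: 2, 2: 4, 3: 2}, failed=1, lost_units=["B", "F"])
```

In this usage example node 0 is less loaded than node 2, so it promotes both B′ and F′; if only one neighbor were online, it would take over the whole set, as in the second branch of the algorithm.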
Because the above scheme is an extension of the original BSP model, it does not affect the execution of the graph processing system under normal conditions. With the scheme provided by the present invention, fault tolerance no longer requires the graph processing system to stop the current computation, roll everything back to the most recent checkpoint, and read copy data from disk. When a processor node goes offline, the graph processing system only needs to adjust and recover the nodes local to the error, and the results already computed by the other, normal nodes can continue to be used in the next iteration step.
This completes the dynamic fault tolerance in distributed parallel graph processing according to the present invention.
An implementation case is given below in which the present invention is used to build distributed parallel graph processing that supports dynamic fault tolerance on the distributed parallel computing platform UniAS.
UniAS is a distributed parallel computing platform developed independently by the Software Institute of the School of Information Science and Technology, Peking University. It currently supports big data processing applications in several modes, including batch processing, graph processing, and stream processing. The implementation of the present invention is described below with reference to the graph processing framework in UniAS.
As shown in Fig. 5, the graph processing framework in UniAS is built on a master-slave structure. The controller is responsible for partitioning the graph, assigning the partitions to processor nodes, and monitoring the running status of each node. Each processor node has a data unit queue module, which is responsible for maintaining copies and for dynamic fault tolerance at run time. Following the technical points of the present invention, distributed graph processing with dynamic fault tolerance is implemented in the following steps:
1. Start the processor nodes and register them with the controller. The controller numbers all processor nodes that have come online and arranges them in a ring;
2. In the initialization phase, the controller partitions the input graph structure and assigns the partitions to the processor nodes. After assignment, it triggers the copy generation operation on each node, and each node sends copies of its own data units to its adjacent nodes;
3. Start the graph processing job. As required by technical point 3 of the present invention (step (3) above), when each iteration step completes, compute for each vertex the difference between its result in the previous iteration step and its result in the current one. If the state of a vertex has changed, add the difference to the partition increment; if the vertex did not change in the current iteration, do not add it. Send the partition increment of each partition, together with its message set, to the adjacent nodes, where it is used to update their copies;
4. Each processor node periodically sends heartbeat messages to the controller. When the controller fails to receive the heartbeat message of a processor node in time, it concludes that the processor node is offline and enters the error recovery phase;
5. The controller invokes the algorithm designed in technical point 4 of the present invention (step (4) above), restores the lost data from the copies in the adjacent processors, and resets the adjacency between processor nodes, thereby restoring normal operation of the program;
6. When, in some iteration step, all vertices have been processed and no data unit needs further updating, the whole graph processing job completes successfully.
This completes the construction of a dynamic fault-tolerance mechanism for the distributed graph processing framework on the UniAS platform using the present invention.
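The heartbeat check in step 4 can be illustrated with a small sketch. The class `HeartbeatMonitor`, its `timeout` parameter, and the use of explicit timestamps are assumptions for the example; the text does not describe how UniAS implements heartbeats. Passing the current time as an argument keeps the sketch deterministic and easy to test.

```python
class HeartbeatMonitor:
    """Hypothetical controller-side record of processor node heartbeats."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated per node
        self.last_seen = {}     # node id -> time of the latest heartbeat

    def beat(self, node, now):
        """Record a heartbeat received from `node` at time `now`."""
        self.last_seen[node] = now

    def offline_nodes(self, now):
        """Return the nodes whose latest heartbeat is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("node-1", now=0.0)
monitor.beat("node-2", now=0.0)
monitor.beat("node-1", now=4.0)
suspects = monitor.offline_nodes(now=5.0)  # node-2 has been silent too long
```

Each node returned by `offline_nodes` would then be handed to the recovery algorithm of step 5.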
The above embodiments merely illustrate the technical solution of the present invention and do not limit it. A person of ordinary skill in the art may modify the technical solution of the present invention or replace it with equivalents without departing from the spirit and scope of the present invention, and the scope of protection of the present invention shall be defined by the claims.
Claims (6)
1. A P2P-based dynamic fault-tolerance method for distributed parallel graph processing, comprising the following steps:
1) defining the data unit of the distributed graph processing problem, so that data can be restored completely during dynamic fault tolerance; the data unit is the 2-tuple (P_j, InMsg(P_j)), where P_j is one of the graph partitions into which the graph structure is divided, that is, a subgraph; InMsg(P_j) is the set of messages received in a given iteration step by all vertices contained in subgraph P_j, and is the empty set when the computation starts; 1 ≤ j ≤ m, where m is the number of partitions obtained by partitioning the graph;
2) numbering the processor nodes and arranging them in a ring, dividing the input graph data into several partitions according to the definition of the data unit, assigning each partition to a processor node, and having each processor node back up a copy of its own data units in the two processor nodes with adjacent numbers; the copies are not stored on disk but recorded in memory;
3) in each iteration step of the computation, after each processor node has executed its own data units, incrementally updating the copies it has placed in the adjacent processor nodes, wherein each update of the graph partition state in a copy transmits only the part that has changed, and the adjacent processor node merges the changed part into its data unit copy, which updates the graph partition state in the copy; in each iteration step of graph processing, each vertex first processes the set of messages received in the previous iteration step, updates its own state according to these messages, and then propagates its new state to its adjacent vertices; the messages of a vertex need not accumulate across iteration steps: once an iteration step has computed the updated value of a vertex, the messages received before that step are no longer needed;
4) when a processor node fails or goes offline because of a network error, assigning its adjacent nodes to replace the original data units with the data copies and complete the corresponding operations, thereby restoring normal execution of the graph processing.
2. The method of claim 1, characterized in that: after step 2) divides the input graph data into several partitions, they are assigned to the processor nodes in the form of 2-tuples (P_j, empty message set).
3. The method of claim 1, characterized in that: in step 2), for the data units assigned to it, a processor node computes on each vertex in the data unit based on the BSP model; for the data unit copies of adjacent nodes, the processor node only stores them and, after each iteration step, waits for the adjacent nodes to send the data that updates these copies.
4. The method of claim 1, characterized in that: step 3) performs a copy update after every iteration step, or performs a copy update once every certain number of iteration steps, weighing the frequency of system errors against the time cost of updating the copies.
5. A distributed parallel graph processing system using the method of claim 1, characterized in that it comprises a controller and processor nodes; the controller is responsible for partitioning the input graph data, assigning the partitions to the processor nodes, and monitoring the running status of each processor node; the processor nodes form a ring; each processor node backs up a copy of its own data units in its adjacent processor nodes and, after executing its own data units, incrementally updates the copies it has placed in the adjacent processor nodes; when a processor node fails or goes offline because of a network error, the controller uses the data copies in its adjacent nodes to replace the original data units and restores normal execution of the graph processing.
6. The system of claim 5, characterized in that: each processor node periodically sends heartbeat messages to the controller; when the controller fails to receive the heartbeat message of a processor node in time, the controller concludes that the processor node is offline and enters the error recovery phase.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510026680.9A CN104618153B (en) | 2015-01-20 | 2015-01-20 | P2P-based dynamic fault-tolerance method and system for distributed parallel graph processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510026680.9A CN104618153B (en) | 2015-01-20 | 2015-01-20 | P2P-based dynamic fault-tolerance method and system for distributed parallel graph processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104618153A CN104618153A (en) | 2015-05-13 |
CN104618153B true CN104618153B (en) | 2018-08-03 |
Family ID: 53152444
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510026680.9A Expired - Fee Related CN104618153B (en) | P2P-based dynamic fault-tolerance method and system for distributed parallel graph processing | 2015-01-20 | 2015-01-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104618153B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104954477B (en) * | 2015-06-23 | 2018-06-12 | 华中科技大学 | One kind is based on concurrent improved large-scale graph data streaming division methods and system |
US10635562B2 (en) * | 2015-09-04 | 2020-04-28 | Futurewei Technologies, Inc. | Fault tolerance in distributed graph processing networks |
US20170160962A1 (en) * | 2015-12-03 | 2017-06-08 | Mediatek Inc. | System and method for processor mapping |
CN108241553B (en) * | 2016-12-23 | 2022-04-08 | 中科星图股份有限公司 | Data backup control method |
CN109213592B (en) * | 2017-07-03 | 2023-07-18 | 北京大学 | Graph calculation method based on automatic selection of duplicate factor model |
CN107908476B (en) * | 2017-11-11 | 2020-06-23 | 许继集团有限公司 | Data processing method and device based on distributed cluster |
CN107943918B (en) * | 2017-11-20 | 2021-09-07 | 合肥亚慕信息科技有限公司 | Operation system based on hierarchical large-scale graph data |
CN110232087B (en) * | 2019-05-30 | 2021-08-17 | 湖南大学 | Big data increment iteration method and device, computer equipment and storage medium |
CN114756714A (en) * | 2022-03-23 | 2022-07-15 | 腾讯科技(深圳)有限公司 | Graph data processing method and device and storage medium |
CN115630003B (en) * | 2022-11-16 | 2023-07-21 | 苏州浪潮智能科技有限公司 | Mirror image method, device, equipment and medium for cache data |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101414277A (en) * | 2008-11-06 | 2009-04-22 | 清华大学 | Need-based increment recovery disaster-containing system and method based on virtual machine |
CN102281321A (en) * | 2011-04-25 | 2011-12-14 | 程旭 | Data cloud storage partitioning and backup method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9723074B2 (en) * | 2011-11-15 | 2017-08-01 | Alcatel Lucent | Method and apparatus for in the middle primary backup replication |
Non-Patent Citations (4)
Title |
---|
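Pregel: A System for Large-Scale Graph Processing; Grzegorz Malewicz et al.; 《Proceedings of the SIGMOD, 2010》; 20101231; full text * |
Large-Scale Graph Data Processing Techniques in Cloud Computing Environments (云计算环境下的大规模图数据处理技术); Yu Ge et al.; 《Chinese Journal of Computers》 (计算机学报); 20111031; Vol. 34, No. 10; main text pp. 1-15, Figs. 1-4 * |
Research and Implementation of Large-Scale Incremental Iterative Processing Technology (大规模增量迭代处理技术的研究与实现); Wang Zhigang; 《China Master's Theses Full-text Database》 (中国优秀硕士学位论文全文数据库); 20140715; full text * |
Research and Implementation of Database Cluster Failover Technology (数据库集群故障切换技术的研究与实现); Liang Yong; 《China Master's Theses Full-text Database》 (中国优秀硕士学位论文全文数据库); 20110115; main text pp. 1-76 * |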
Also Published As
Publication number | Publication date |
---|---|
CN104618153A (en) | 2015-05-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180803 |