WO2013091162A1 - Method, device, and system for recovering distributed storage data - Google Patents

Info

Publication number
WO2013091162A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
operation sequence
sequence set
data operation
list
Prior art date
Application number
PCT/CN2011/084219
Other languages
French (fr)
Chinese (zh)
Inventor
王志用
杨德平
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN201180003086.8A priority Critical patent/CN103262042B/en
Priority to PCT/CN2011/084219 priority patent/WO2013091162A1/en
Publication of WO2013091162A1 publication Critical patent/WO2013091162A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying

Definitions

  • the present invention belongs to the field of information technology, and in particular, to a distributed storage data recovery method, apparatus and system.
  • the distributed cluster system consists of a large number of traditional nodes, and the overall powerful processing capability is presented externally by sharing the processing power to each node. Each node needs to collaborate through shared data to complete processing tasks.
  • the distributed block storage system is a distributed storage system that uses data blocks as storage units to meet the massive storage requirements, and presents a powerful storage capability.
  • handling node failures in a distributed storage system, and quickly recovering data when a faulty node rejoins the cluster, are key to providing high-quality services.
  • a node maintains a snapshot file, and the snapshot file stores all data backups of the node.
  • the faulty node sends a snapshot file to the node that provides data recovery.
  • the node that provides the recovery data receives the snapshot file sent by the faulty node, compares it with the snapshot file stored, and returns the difference data to the faulty node.
  • the inventors have found that, because the snapshot file stores a data backup, data recovery requires sending the failed node's data backup so that it can be compared with the data backup of the node providing the recovery data. The amount of data to be transmitted is therefore very large, which seriously wastes network bandwidth.
  • the embodiment of the invention provides a method for recovering distributed storage data, including:
  • the local node receives a data operation sequence set list sent by the target node according to the version value of the local node's data operation sequence set;
  • the embodiment of the invention further provides a distributed storage data recovery method, including:
  • the embodiment of the invention further provides a node, including:
  • a receiving unit configured to receive a data operation sequence set list sent by the target node according to the version value of the data operation sequence set of the node, and receive data corresponding to the data operation sequence set list sent by the target node;
  • the updating unit updates the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the embodiment of the invention further provides a node, including:
  • a receiving unit configured to receive a version value of a local node data operation sequence set
  • a sending unit configured to send, according to the version value, the data operation sequence set list and the data operation sequence set list corresponding data to the local node.
  • the embodiment of the invention further provides a distributed storage data recovery system, comprising:
  • a local node configured to receive a data operation sequence set list sent by the target node according to a version value of the data operation sequence set of the local node, and receive data corresponding to the data operation sequence set list sent by the target node, And being further configured to update data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list;
  • the target node is configured to receive a version value of a data operation sequence set of the local node and, according to the version value, to send to the local node a data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the distributed storage data recovery method, device and system provided by the embodiment of the present invention update the data of the local node by receiving the data operation sequence set list sent by the target node according to the version value of the data operation sequence set of the local node, so that when data is recovered, the amount of data transmitted is reduced and network bandwidth is saved.
  • FIG. 1 is a schematic diagram of a data operation sequence and a data operation sequence set
  • FIG. 2 is a schematic diagram of a process of generating an operation record file
  • FIG. 3 is a schematic diagram of a process of combining data operation sequences
  • FIG. 4 is a schematic flow chart of a first embodiment of the present invention.
  • Figure 5 is a schematic flow chart of a second embodiment of the present invention.
  • FIG. 6 is a schematic flow chart of a third embodiment of the present invention.
  • FIG. 7 is a schematic flow chart of a fourth embodiment of the present invention.
  • FIG. 8 is a schematic flow chart of a fifth embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a node according to a sixth embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a node according to a seventh embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a node according to an eighth embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of a node according to a ninth embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a node according to a tenth embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of a node according to an eleventh embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of a node according to a twelfth embodiment of the present invention.
  • FIG. 16 is a schematic structural diagram of a system according to a thirteenth embodiment of the present invention.
  • the local node in this embodiment refers to a node that needs to perform data recovery, and may also be referred to as a fault node.
  • the target node refers to a node that provides recovery data, and may also be referred to as a recovery data target node.
  • the same data needs to be stored on different nodes to form a backup.
  • the corresponding storage data on the node where the backup relationship exists should be consistent.
  • the data operation sequence is represented by obj-id: <offset, size>, where obj-id is a data identifier used to represent one type of data.
  • offset represents the initial value of the data operation sequence, and size represents the offset of the data operation sequence from the initial value.
  • the data operation sequence is recorded in the node cache, and when the node status changes, the data operation sequence in the cache is flushed to the operation record file.
  • the operation log file may include, but is not limited to, a check-point file.
  • the sequence of data operations that are flushed into the operation log file at a time is called a collection of data manipulation sequences, and each data manipulation sequence set has a version number.
  • a collection of data manipulation sequences contains a version value and one or more rows of data manipulation sequences.
  • the data operation sequence set version value is a monotonically increasing non-negative integer. Of course, other symbols that can represent the order relationship can also be used.
  • the data operation sequence set is stored in the order of the version values, forming a list of data operation sequence sets.
  • a data manipulation sequence collection list contains one or more collections of data manipulation sequences.
  • the node may also include a log data operation sequence list.
  • the log data operation sequence list is used to store all data operation sequences during the current version of the data operation sequence set.
  • the log data operation sequence list consists of all data operation sequences generated during the current version of the data operation sequence set.
  • the data manipulation sequence in all embodiments of the present invention records only the operation of the node to write data, and the data manipulation sequence does not include the written data itself.
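
The records described above can be pictured with a small data model. This is only an illustrative sketch: the class and field names (`DataOperationSequenceSet`, `NodeRecords`, `record_write`) are assumptions made for the example and do not appear in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (offset, size): a write that starts at `offset` and covers `size` units.
Extent = Tuple[int, int]


@dataclass
class DataOperationSequenceSet:
    """One flush of the cache: a version value plus extents per obj-id."""
    version: int
    sequences: Dict[str, List[Extent]] = field(default_factory=dict)


@dataclass
class NodeRecords:
    """Hypothetical per-node bookkeeping; names are illustrative only."""
    # Operation record file: sets stored in order of increasing version value.
    operation_record: List[DataOperationSequenceSet] = field(default_factory=list)
    # Log list: every write of the current version, not vector-merged.
    log_list: List[Tuple[str, Extent]] = field(default_factory=list)

    def record_write(self, obj_id: str, offset: int, size: int) -> None:
        # Only the operation is recorded; the written data itself is not.
        self.log_list.append((obj_id, (offset, size)))


records = NodeRecords()
records.operation_record.append(
    DataOperationSequenceSet(version=0, sequences={"A": [(1, 7)]})
)
records.record_write("A", 10, 2)
```

Keeping the written data out of these records is what keeps the later recovery exchange small: the list that travels first names ranges only.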
  • the update flow of the operation record file, which may include but is not limited to a check-point file, includes the following steps:
  • Step 201 The node starts, caches the initialization data operation sequence, and reads the current version value V of the node data operation sequence set.
  • Step 202 Set the value of the working variable V' to the current version value V of the node data operation sequence set.
  • Step 203 Set the initial value of the data operation sequence of the log data operation sequence list to 0.
  • Step 204 Read a data operation sequence from the initial value of the data operation sequence of the log data operation sequence list, and the data operation sequence length is recorded as len.
  • Step 205 Determine whether the read data operation sequence reaches the end of the data.
  • If reading reaches the end of the data, all data from the initial value of the data operation sequence of the log data operation sequence list has been read, and the process proceeds to step 206a. If a data operation sequence is read and the end of the data has not been reached, the process proceeds to step 206b.
  • Step 206a Determine whether the working variable V' is equal to the current version value V of the data operation sequence set. If V' is equal to V, the node state has not changed, and the process proceeds to step 209a to enter sleep waiting. If V' is not equal to V, the state of the node has changed and the version value of the data operation sequence set has changed with it, and step 209b is executed.
  • Step 206b Determine whether the data identifier is already in the cache.
  • Determine whether a data operation sequence with the same data identifier exists in the cache. If so, proceed to step 207a to perform a vector merge operation; if not, proceed to step 207b to add the data operation sequence to the cache.
  • Step 207a The data operation sequence is vector merged.
  • the new data operation sequence is vector-combined with the data operation sequence having the same data identifier already existing in the cache to obtain a new sequence, and the original data operation sequence in the cache is updated.
  • Step 207b Add a sequence of data operations to the cache.
  • the newly read sequence of data operations is added to the cache intact.
  • Step 208 The initial value of the data operation sequence of the log data operation sequence list is shifted backward by len length.
  • the initial value is shifted back by len length to skip the data sequence that has just been read and moves to the new data operation sequence.
  • the next data operation sequence is read.
  • Step 209a Enter sleep waiting.
  • the node sleeps for a time T, which can be set according to the state of the system and serves as the timing interval for checking whether the data operation sequence set should be flushed.
  • Step 209b The merged data operation sequences in the cache are flushed to the operation record file.
  • the state of the node changes, causing the data operation sequence set version value V to increase, and the merged data operation sequence set in the cache is flushed to the operation record file.
  • Step 210 Clear the cache and log data operation sequence list.
  • the cache is cleared for storing and vector-merging new data operation sequences, a new log data operation sequence list is created to store all data operation sequences, the old log data operation sequence list is deleted, and the process proceeds to step 202 to start a new round of data operation sequence merging.
  • the data operation sequence set in the operation record file is flushed from the cache to the operation record file when the node status changes.
  • during the current version of the data operation sequence set, because the node state has not changed, the merged data operation sequences are still in the cache and have not been flushed into the operation record file, so the current version of the data operation sequence set is missing from the operation record file.
  • the log data operation sequence list can record all data operation sequences in the current version, and the data operation sequence for the same data identification is not vector-combined. Thus, when the operation record file lacks the current version data operation sequence set, all data operation sequences of the current version of the data operation sequence set can be searched from the log data operation sequence list.
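
Steps 201 to 210 can be sketched as a small single-threaded loop body. This is a simplified illustration under stated assumptions, not the patented implementation: the `Checkpointer` class, its field names, and the in-memory `record_file` list are hypothetical, and labelling a flushed set with the version value V' under which it was collected is an assumption the source does not spell out.

```python
from typing import Dict, List, Tuple

Extent = Tuple[int, int]  # (offset, size)


def merge_extent(extents: List[Extent], new: Extent) -> List[Extent]:
    """Vector-merge `new` into `extents`, keeping offsets strictly increasing."""
    start, end = new[0], new[0] + new[1]
    kept = []
    for off, size in extents:
        if off + size <= start or off >= end:  # no overlap: keep unchanged
            kept.append((off, size))
        else:                                  # overlap: widen the new extent
            start, end = min(start, off), max(end, off + size)
    return sorted(kept + [(start, end - start)])


class Checkpointer:
    """Hypothetical stand-in for the update flow of the operation record file."""

    def __init__(self, version: int = 0) -> None:
        self.version = version          # V, increased when the node state changes
        self.working_version = version  # V' (step 202)
        self.log: List[Tuple[str, Extent]] = []   # log data operation sequence list
        self.read_pos = 0                         # step 203
        self.cache: Dict[str, List[Extent]] = {}  # merged sequences per obj-id
        self.record_file: List[Tuple[int, Dict[str, List[Extent]]]] = []

    def poll(self) -> str:
        # Steps 204-208: read unread log entries and merge them into the cache.
        while self.read_pos < len(self.log):
            obj_id, extent = self.log[self.read_pos]
            self.cache[obj_id] = merge_extent(self.cache.get(obj_id, []), extent)
            self.read_pos += 1
        # Step 206a: has the version value changed since V' was taken?
        if self.working_version == self.version:
            return "sleep"  # step 209a
        # Step 209b: flush the merged set (assumed to carry the version V').
        self.record_file.append((self.working_version, self.cache))
        # Step 210: clear the cache and the log, then return to step 202.
        self.cache, self.log, self.read_pos = {}, [], 0
        self.working_version = self.version
        return "flushed"


cp = Checkpointer()
cp.log += [("A", (1, 5)), ("A", (2, 6)), ("B", (0, 4))]
first = cp.poll()   # no state change yet
cp.version += 1     # simulate a node state change
second = cp.poll()
```

In this sketch the caller would sleep for the configured time T whenever `poll` returns `"sleep"`.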
  • the data operation sequence with the same data identifier is vector-merged, and the operation is as shown in FIG. 3, which specifically includes the following steps:
  • Step 301 A sequence of data operations to be merged.
  • Step 302 Determine whether there is an overlap interval between the data operation sequence to be merged and the existing operation sequence.
  • the data operation sequence to be merged is compared with the already merged data operation sequence. If there is overlap, the process proceeds to step 303a to perform a vector merge operation. Otherwise, the process proceeds to step 303b to insert the sequence into the existing space according to the initial value of the data operation sequence.
  • Step 303a Combine the existing data operation sequence with the interval of the data operation sequence to be merged.
  • for example, the data operation sequence to be merged is A: <2, 6>.
  • the existing data operation sequence A: <1, 5> has an overlapping interval with the data operation sequence to be merged A: <2, 6>, and the result after merging is A: <1, 7>.
  • Step 303b Insert the data operation sequence to be merged into the existing data operation sequence.
  • the data operation sequence to be merged is inserted into the already existing data operation sequence.
  • the insertion principle is to ensure that, after the insertion, all data operation sequences are strictly increasing by initial value. If the existing data operation sequence is A: <1, 4> and the data operation sequence to be merged is A: <6, 3>, there is no overlap between the two data operation sequences, so according to the principle that the initial value of the data operation sequence is strictly increasing, the result is expressed as A: <1, 4><6, 3>.
  • Step 304 The merge is completed.
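
The vector merge of steps 301 to 304 amounts to merging one interval into a sorted list of intervals. The sketch below is one way to write it, assuming that <offset, size> covers the half-open range from offset to offset + size (which reproduces both worked examples above) and that sequences which merely touch are not treated as overlapping. The function name `vector_merge` is illustrative.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (offset, size)


def vector_merge(existing: List[Extent], new: Extent) -> List[Extent]:
    """Merge `new` into `existing`, the extents recorded for one obj-id."""
    new_start, new_end = new[0], new[0] + new[1]
    result: List[Extent] = []
    for offset, size in existing:
        start, end = offset, offset + size
        if end <= new_start or start >= new_end:
            # Step 303b case: no overlap, the existing extent is kept as is.
            result.append((offset, size))
        else:
            # Step 303a case: overlap, so grow the merged interval.
            new_start = min(new_start, start)
            new_end = max(new_end, end)
    result.append((new_start, new_end - new_start))
    # Keep the initial values strictly increasing.
    result.sort()
    return result


overlapping = vector_merge([(1, 5)], (2, 6))
disjoint = vector_merge([(1, 4)], (6, 3))
```

Here `overlapping` corresponds to the A: <1, 5> and A: <2, 6> example, and `disjoint` to the A: <1, 4> and A: <6, 3> example.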
  • the embodiment of the present invention performs data recovery based on the operation record file, which stores all data operation sequence sets (without data) before the current version, and on the log file, which stores all data operation sequences of the current version of the data operation sequence set.
  • the operation record file stores a set of data operation sequences of all versions prior to the current version of the data operation sequence set, wherein if there is an overlap interval between the data operation sequences having the same data identifier in each data operation sequence set, vector merging is performed.
  • the log data operation sequence list stores all data operation sequences on the node during the current version of the data operation sequence set, and data operation sequences having the same data identifier are not vector-merged.
  • when a node in a distributed storage system fails, other storage nodes that store the same data are still working properly. The failure of a node that stores the same data backup changes the node status, which causes the nodes that store the same data backup to flush the current version of the data operation sequence set from the node cache to their operation record files; the cache is then used to store the data operation sequence set with the new, incremented version value. The failed node cannot perform data operations until it recovers from the failure. After fault recovery, only the list of data operation sequence sets that are related to the local node and newer than the current data operation sequence set version value in the local node operation record file needs to be sent to the local node for data recovery.
  • the target node also sends the data corresponding to that data operation sequence set list to the local node, and the data is updated to the local node, so that the data corresponding to the operation record file can be recovered.
  • the log data operation sequence list composed of the target node's log data operation sequences is sent to the local node, together with the data corresponding to the log data operation sequence list, and the local node updates the data corresponding to the log data operation sequence list. After recovery, the node therefore has an operation record file, a log data operation sequence list, and data.
  • the operation record file stores a list of data operation sequence sets composed of the data operation sequence sets of all versions before the current version of the data operation sequence set. The data operation sequences in these sets correspond to the data written before the current version of the data operation sequence set, and each data write operation is recorded.
  • the log data operation sequence list stores all data operation sequences during the current version of the data operation sequence set. These data operation sequences correspond to the data written during the current version of the data operation sequence set, and each data write operation is recorded.
  • the first embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 4, the method includes:
  • Step 401 The local node receives a list of data operation sequence sets sent by the target node according to the version value of the data operation sequence set of the local node.
  • the version value of the data operation sequence set of the local node is sent by the local node to the target node.
  • the version value of the data operation sequence set of the local node may also be sent by the master node to the target node.
  • Step 402 Receive data corresponding to the data operation sequence set list sent by the target node.
  • the method further includes: the local node performing a vector merging operation on the data operation sequence set list, and sending the data operation sequence set list after the vector merging operation to the target node;
  • and receiving the data corresponding to the data operation sequence set list sent by the target node is: receiving the data, sent by the target node, corresponding to the data operation sequence set list after the vector merging operation.
  • the method further includes:
  • the local node performs, on the data operation sequence set list, an operation of merging random input/output (Input/Output, hereinafter abbreviated as IO) into sequential IO, and sends the data operation sequence set list after this operation to the target node;
  • if the ratio of the hole value between data operation sequences having the same identifier to the size of the continuous space spanned by the merged sequence is less than a set percentage, or the hole value between data operation sequences having the same identifier is less than a set threshold, the operation of merging random IO into sequential IO is performed on the data operation sequence set list.
  • receiving data corresponding to the data operation sequence set list sent by the target node is: receiving the data, sent by the target node, corresponding to the data operation sequence set list after the target node performs a vector merge operation on the data operation sequence set list.
  • receiving data corresponding to the data operation sequence set list sent by the target node is: receiving the data, sent by the target node, corresponding to the data operation sequence set list after the target node performs the operation of merging random IO into sequential IO on the data operation sequence set list.
  • Step 403 Update data of the local node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the local node updates the received list of data operation sequence sets to an operation record file of the local node.
  • the method further includes:
  • the local node receives a log operation sequence list sent by the target node
  • the distributed storage data recovery method provided by the first embodiment of the present invention sends the version value of the data operation sequence set of the local node to the target node, receives the data operation sequence set list sent by the target node according to that version value, and uses the list to update the data of the local node, thereby reducing the amount of data transmission during the data recovery process and saving network bandwidth.
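
The optional step of merging random IO into sequential IO, described in the first embodiment, can be sketched as joining neighbouring extents of the same identifier when the hole between them is small. The function name and the parameters `max_hole` and `max_hole_ratio` are illustrative assumptions standing in for the "set threshold" and "set percentage" in the text; the source gives no concrete values.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (offset, size)


def coalesce_random_io(
    extents: List[Extent], max_hole: int, max_hole_ratio: float
) -> List[Extent]:
    """Join neighbouring extents of one obj-id when the hole between them is small."""
    result: List[Extent] = []
    for offset, size in sorted(extents):
        if result:
            prev_off, prev_size = result[-1]
            prev_end = prev_off + prev_size
            # Hole: unwritten space between the previous extent and this one.
            hole = max(0, offset - prev_end)
            # Continuous space the merged sequence would span.
            span = max(prev_end, offset + size) - prev_off
            small_hole = hole < max_hole
            small_ratio = span > 0 and hole / span < max_hole_ratio
            if small_hole or small_ratio:
                result[-1] = (prev_off, span)
                continue
        result.append((offset, size))
    return result


joined = coalesce_random_io([(1, 4), (6, 3)], max_hole=2, max_hole_ratio=0.0)
kept_apart = coalesce_random_io([(1, 4), (6, 3)], max_hole=1, max_hole_ratio=0.1)
```

The design trade-off is that a joined extent also covers the hole, so a little unchanged data is transferred in exchange for fewer, larger sequential reads and writes.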
  • a second embodiment of the present invention provides a method for recovering distributed storage data. As shown in FIG. 5, the method specifically includes:
  • Step 501 Receive a version value of a data operation sequence set of the local node.
  • the version value of the local node's data manipulation sequence set may be received by the target node.
  • the current version value of the local node data operation sequence set received by the target node may be sent by the local node.
  • the current version value of the local node data operation sequence set received by the target node may also be sent by the master node.
  • Step 502 Send a data operation sequence set list to the local node according to the version value.
  • a data operation sequence set associated with the local node is selected using a hash algorithm under a distributed hash table architecture or an allocation table algorithm in a metadata service architecture.
  • Step 503 Send, according to the data operation sequence set list, data corresponding to the data operation sequence set list to the local node.
  • the method further includes: performing a vector merging operation on the data operation sequence set list;
  • the method further includes: performing, on the data operation sequence set list, an operation of merging random input/output (IO) into sequential IO; and sending the data corresponding to the data operation sequence set list to the local node is: sending the data corresponding to the data operation sequence set list after the operation of merging random IO into sequential IO.
  • before the sending, according to the data operation sequence set list, of the data corresponding to the data operation sequence set list to the local node, the method further includes: receiving, from the local node, the data operation sequence set list after the local node has performed a vector merge operation on it;
  • before the sending, according to the data operation sequence set list, of the data corresponding to the data operation sequence set list to the local node, the method further includes: receiving, from the local node, the data operation sequence set list after the local node has performed the operation of merging random input/output (IO) into sequential IO on it;
  • the method further includes receiving a log data operation sequence request of the local node;
  • the distributed storage data recovery method provided by the second embodiment of the present invention receives the version value of the local node's data operation sequence set, sends a data operation sequence set list according to that version value, and sends the data corresponding to the data operation sequence set list, so as to update the data of the local node, thereby reducing the amount of data transmission during data recovery and saving network bandwidth.
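
On the target node, steps 501 and 502 come down to two filters: keep only the sets newer than the version value received from the local node, and keep only the sequences whose data identifier is placed on the local node. The sketch below assumes a toy hash placement (`hash_placement`) in place of a real distributed hash table or allocation table; both function names, the replica count, and the choice to drop sets that end up empty are illustrative assumptions.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Extent = Tuple[int, int]                           # (offset, size)
SequenceSet = Tuple[int, Dict[str, List[Extent]]]  # (version value, obj-id -> extents)
Owners = Callable[[str], List[str]]                # obj-id -> nodes holding a copy


def hash_placement(nodes: List[str], replicas: int = 2) -> Owners:
    """Toy placement: hash the obj-id, then take `replicas` consecutive nodes."""
    def owners(obj_id: str) -> List[str]:
        digest = hashlib.sha256(obj_id.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(nodes)
        return [nodes[(first + i) % len(nodes)] for i in range(replicas)]
    return owners


def select_sets_for(
    record_file: List[SequenceSet],
    local_version: int,
    local_node: str,
    owners: Owners,
) -> List[SequenceSet]:
    """Build the data operation sequence set list to send to `local_node`."""
    selected: List[SequenceSet] = []
    for version, sequences in record_file:
        if version <= local_version:
            continue  # the local node already has these sets
        related = {
            obj_id: extents
            for obj_id, extents in sequences.items()
            if local_node in owners(obj_id)
        }
        if related:  # assumption: sets with nothing for this node are dropped
            selected.append((version, related))
    return selected


record_file: List[SequenceSet] = [
    (3, {"A": [(1, 7)]}),
    (4, {"A": [(0, 2)], "B": [(0, 4)]}),
    (5, {"B": [(8, 1)]}),
]
placement = hash_placement(["n1", "n2", "n3"])
to_send = select_sets_for(record_file, 3, "n1", placement)
```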
  • a third embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 6, the method specifically includes the following steps:
  • Step 601 Start recovering data.
  • the data recovery process begins.
  • when the local node fails, other nodes that store the same data backup still work normally and continue to perform data storage operations.
  • after the local node recovers from the failure, it needs to be consistent with the data backups stored on the other nodes. Therefore, the data that could not be stored on the local node during the failure must be restored.
  • Step 602 Send a data operation sequence set current version value to the target node.
  • the local node reads the current version of the data operation sequence set stored in the operation log file before the failure, and sends the current version value to the target node.
  • when the local node fails, the state of the node changes, and the version value of the data operation sequence set on the other nodes storing the same data backup increases. Therefore, the current version value of the data operation sequence set in the local node operation record file must be sent to the target node, so that the data operation sequence sets added by the target node during the failure can be found.
  • the current version value of the local node data operation sequence set received by the target node may also be sent by the master node.
  • Step 603 Receive the version value of the data operation sequence set of the local node and find the data operation sequence sets in the operation record file whose version values are greater than the local node version value.
  • the target node finds, from the target node operation record file, all data operation sequence sets whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Step 604 Generate a list of data operation sequence sets related to the local node.
  • the data operation sequence sets stored in the target node operation record file whose version values are greater than the current version value of the data operation sequence set in the local node operation record file correspond to all data stored on the target node. They therefore include data not related to the local node, meaning that data operation sequences not related to the local node are also stored. Therefore, after the target node generates the list of data operation sequence sets whose version values are greater than the current version value of the data operation sequence set in the local node operation record file, it needs to determine, according to the data placement algorithm, whether the data identifier of each data operation sequence in that list belongs to the local node, and to remove from those data operation sequence sets the data operation sequences that do not belong to the local node.
  • commonly used data placement algorithms include a distributed hash table algorithm in a distributed hash table architecture and an allocation table algorithm in a metadata service architecture. Based on the data identifier of each data operation sequence, the algorithm determines whether a data operation sequence in the sets whose version values are greater than the current version value in the local node operation record file belongs to the local node.
  • Step 605 Send a list of data operation sequence sets related to the local node.
  • the target node sends to the local node the list of data operation sequence sets whose version values are greater than the current version value of the data operation sequence set in the local node operation record file. The data operation sequences in this list are used to recover the corresponding data.
  • Step 606 The local node receives a list of data operation sequence sets related to the local node.
  • the target node sends the list of data operation sequence sets that it generated during the local node failure, and the local node receives this list, whose version values are greater than the current version value of the data operation sequence set in the local node operation record file, for use in updating the data corresponding to the list. Because each data operation sequence in the list records a data write operation, when restoring data, the data corresponding to each data operation sequence record must be updated according to that data operation sequence.
  • the list of data operation sequence sets whose version values are greater than the current version value of the data operation sequence set in the local node operation record file, and which is related to the local node, may be stored in the operation record file of the local node. The list stored in this way has not undergone vector merging and/or merging of random IO into sequential IO. Storing this list in the operation record file of the local node is optional and is not a necessary step of the embodiment of the present invention.
  • Step 607 Find the data corresponding to the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • The target node searches for and reads the corresponding data, one data operation sequence at a time, according to the data operation sequences in that data operation sequence set list.
  • The data read in this way forms the data corresponding to the data operation sequence set list, which needs to be sent to the local node.
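  • The lookup in step 607 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the operation record file is modelled as an in-memory dictionary from version value to data operation sequences, each sequence is read as a data identifier plus an <offset, length> pair, and the names `newer_sequence_sets` and `read_data` are hypothetical. Filtering for sequences "related to the local node" is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory model (an assumption, not from the source):
# a data operation sequence is (data identifier, offset, length), and the
# operation record file maps a version value to the sequences of that set.
Sequence = Tuple[int, int, int]
RecordFile = Dict[int, List[Sequence]]


def newer_sequence_sets(record_file: RecordFile, local_version: int) -> RecordFile:
    """Keep only the sets whose version value is greater than local_version."""
    return {v: record_file[v] for v in sorted(record_file) if v > local_version}


def read_data(store: Dict[int, bytes], sets: RecordFile) -> List[Tuple[int, int, bytes]]:
    """Read the data for every sequence, one by one, in version order."""
    result = []
    for version in sorted(sets):
        for data_id, offset, length in sets[version]:
            payload = store[data_id][offset:offset + length]
            result.append((data_id, offset, payload))
    return result


# Usage: the local node is at version 3, so only versions 4 and 5 are read.
record_file = {3: [(0x123, 0, 4)], 4: [(0x123, 2, 3)], 5: [(0x321, 0, 2)]}
store = {0x123: b"abcdefgh", 0x321: b"XYZ"}
newer = newer_sequence_sets(record_file, local_version=3)
payloads = read_data(store, newer)
```

  • Only the sequences recorded after the local node's version are read, which is where the reduction in transferred data comes from.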
  • Step 607 is performed after step 604, and may be performed simultaneously with step 605, or may be performed after step 605.
  • Step 608 The target node sends data corresponding to the list of data operation sequence sets related to the local node.
  • Step 609 Receive data corresponding to the data operation sequence set list.
  • The local node receives the data corresponding to the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Step 610 Update the data.
  • The data of the local node is updated according to the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file, together with the data corresponding to that list.
  • After receiving the data corresponding to that data operation sequence set list, the local node updates the data to the local node according to the list.
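  • The update in step 610 can be sketched as replaying each received write at its recorded position. The sketch below assumes each received item is a (data identifier, offset, payload) triple and that local data is held in byte buffers; the function name `apply_updates` and this representation are illustrative assumptions, not taken from the source.

```python
from typing import Dict, Iterable, Tuple


def apply_updates(local_store: Dict[int, bytearray],
                  updates: Iterable[Tuple[int, int, bytes]]) -> None:
    """Write each received payload at the offset its sequence records."""
    for data_id, offset, payload in updates:
        buf = local_store.setdefault(data_id, bytearray())
        end = offset + len(payload)
        if len(buf) < end:
            # Grow the buffer with zero bytes so the write fits.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[offset:end] = payload


# Usage: the local copy of 0x123 is stale in bytes 2..4; 0x321 is missing.
local = {0x123: bytearray(b"abXXXfgh")}
apply_updates(local, [(0x123, 2, b"cde"), (0x321, 0, b"XY")])
```

  • Replaying in the order of the received list matters when later sequences overwrite the same range as earlier ones.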
  • Step 611 Send a log data operation sequence request.
  • The data operation sequence sets recorded in the operation record file of the target node are all the data operation sequence sets that precede the current version of the target node's data operation sequence set. The data operation sequence set of the current version is held in the node cache and has not yet been flushed to the operation record file of the target node. Therefore, to recover all the data, the data corresponding to the current version of the target node's data operation sequence set must also be restored to the local node. Because the current version of the data operation sequence of the target node is stored in the cache and is flushed to the target node operation record file only when the node status changes, it cannot be read from that file at this point. The log data operation sequence of the target node is therefore required, and the data corresponding to the log data operation sequence of the target node is used for data recovery. In this step, the log data operation sequence request of the local node may also be sent by the control node.
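  • The split between flushed and cached sequences can be sketched as below. The class `TargetNodeLog`, its method names, and the rule that the version value increases by one at each flush are assumptions made only for illustration; the source states only that the current version stays in the cache until the node status changes.

```python
from typing import Dict, List, Tuple

Sequence = Tuple[int, int, int]  # (data identifier, offset, length), assumed layout


class TargetNodeLog:
    """Hypothetical model of a target node's operation records."""

    def __init__(self, version: int = 1) -> None:
        self.version = version                          # current version value
        self.record_file: Dict[int, List[Sequence]] = {}  # flushed sets
        self.cache: List[Sequence] = []                 # current version, not flushed

    def record_write(self, data_id: int, offset: int, length: int) -> None:
        """A write is first recorded only in the cache."""
        self.cache.append((data_id, offset, length))

    def on_status_change(self) -> None:
        """Flush the cached set to the record file (assumed: version += 1)."""
        self.record_file[self.version] = self.cache
        self.cache = []
        self.version += 1

    def log_sequences(self) -> List[Sequence]:
        """Answer a log data operation sequence request from the cache."""
        return list(self.cache)


# Usage: before a status change the write is visible only through the log.
target = TargetNodeLog()
target.record_write(0x123, 0, 1024)
before_flush = (dict(target.record_file), target.log_sequences())
target.on_status_change()
after_flush = (dict(target.record_file), target.log_sequences())
```

  • This is why recovery needs a second round (steps 611 to 618): the record file alone misses the writes of the current version.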
  • Step 612 The target node receives the log data operation sequence request and searches for a list of log data operation sequences related to the local node.
  • Step 613 The target node sends a log data operation sequence list.
  • Step 614 The local node receives and updates the log data operation sequence list.
  • After receiving the list of log data operation sequences sent by the target node, the local node updates its local log data operation sequence.
  • Step 615 Find data corresponding to the log data operation sequence list related to the local node.
  • The target node searches for the corresponding data one by one according to the data operation sequences in the log data operation sequence list related to the local node.
  • Step 616 The target node sends data corresponding to the log data operation sequence list.
  • Step 615 may be performed after step 612, may be performed concurrently with step 613, or may be performed after step 613.
  • Step 617 The local node receives the data corresponding to the log data operation sequence list.
  • Step 618 The local node updates the local node data according to the log operation sequence list and the data corresponding to the log operation sequence list.
  • Data recovery is performed by sending the current version value of the data operation sequence set of the local node to the target node and receiving the data operation sequence set list that the target node sends according to that version value. This reduces the amount of data transferred during recovery and saves network bandwidth.
  • a fourth embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 7, the specific embodiment includes:
  • Step 701 to step 706 are the same as step 601 to step 606 of the third embodiment of the present invention, and details are not described herein again.
  • Step 707 The local node performs vector merging on the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • After receiving that data operation sequence set list, the local node performs vector merging on it.
  • Specifically, within this list, data operation sequences with the same data identifier are vector-merged, regardless of whether the version values of the data operation sequence sets they belong to are the same. The principle of merging is to merge all overlapping parts of the intervals.
  • For two data operation sequences whose data identifier is 0x123, namely 0x123: <0, 1024> <2000, 1024> and 0x123: <500, 4096>, the merged sequence is 0x123: <0, 4596>.
  • For three data operation sequences whose data identifier is 0x321, namely 0x321: <0, 512> <1024, 1024>, 0x321: <1500, 2000>, and 0x321: <4096, 10240>, the merged operation sequence is 0x321: <0, 512> <1024, 2476> <4096, 10240>.
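  • The vector merging described above can be sketched as an interval union per data identifier. The sketch reads each pair as <offset, length>, which is the interpretation under which both examples above work out; the function name `vector_merge` and the tuple layout are illustrative assumptions. Extents that merely touch are also merged here, which goes slightly beyond "overlapping" and is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def vector_merge(sequences: Iterable[Tuple[int, int, int]]) -> Dict[int, List[Tuple[int, int]]]:
    """Merge overlapping <offset, length> extents that share a data identifier.

    sequences: (data identifier, offset, length) triples from any version.
    Returns: data identifier -> sorted list of merged (offset, length) pairs.
    """
    spans = defaultdict(list)
    for data_id, offset, length in sequences:
        spans[data_id].append((offset, offset + length))  # half-open [start, end)

    merged: Dict[int, List[Tuple[int, int]]] = {}
    for data_id, intervals in spans.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:          # overlaps or touches the last extent
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[data_id] = [(s, e - s) for s, e in out]
    return merged


# Usage: the two examples from the text.
result = vector_merge([
    (0x123, 0, 1024), (0x123, 2000, 1024), (0x123, 500, 4096),
    (0x321, 0, 512), (0x321, 1024, 1024), (0x321, 1500, 2000), (0x321, 4096, 10240),
])
```

  • With the inputs above, `result[0x123]` is `[(0, 4596)]` and `result[0x321]` is `[(0, 512), (1024, 2476), (4096, 10240)]`, matching the merged sequences given in the text.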
  • Step 708 Merge random IO into sequential IO for the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • The degree of random distribution is evaluated by a statistical algorithm, and random IO is merged into sequential IO to reduce node and network overhead and achieve optimal recovery performance.
  • A common statistical algorithm totals the space occupied by the data operation sequences in the data operation sequence set list (space with the same data identifier is not counted repeatedly) and compares it with the space occupied by the merged sequence to calculate a percentage value.
  • If the percentage value is less than the value set by the system, the data operation sequences with the same data identifier can be merged.
  • The percentage value set by the system can be set and adjusted as needed.
  • This embodiment uses 20% as an example, which is not a limitation of the present invention and serves only to explain the embodiment more clearly.
  • The operation sequence after the merging in step 707 is 0x321: <0, 512> <1024, 2476> <4096, 10240>.
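  • One possible reading of the statistical check in step 708 is sketched below. The source does not define the percentage exactly, so the formula used here is an assumption: the share of the single merged extent that is not covered by any original extent (the extra bytes that would be read) is compared with the system threshold. The function name `merge_random_to_sequential` and the 0.20 default are illustrative.

```python
from typing import List, Tuple


def merge_random_to_sequential(extents: List[Tuple[int, int]],
                               threshold: float = 0.20) -> List[Tuple[int, int]]:
    """Collapse non-overlapping <offset, length> extents of one data identifier
    into a single sequential extent when the wasted share is below threshold.

    Assumed formula: wasted = (span - used) / span, where span is the size of
    the single covering extent and used is the total size of the inputs.
    """
    if not extents:
        return []
    start = min(offset for offset, _ in extents)
    end = max(offset + length for offset, length in extents)
    span = end - start
    used = sum(length for _, length in extents)
    if span == 0:
        return sorted(extents)
    wasted = (span - used) / span
    if wasted < threshold:
        return [(start, span)]       # one sequential read instead of several
    return sorted(extents)           # too sparse: keep the random reads


# Usage with the sequence produced by step 707 for identifier 0x321:
# used = 512 + 2476 + 10240 = 13228, span = 14336, wasted is about 7.7%.
dense = merge_random_to_sequential([(0, 512), (1024, 2476), (4096, 10240)])
sparse = merge_random_to_sequential([(0, 512), (100000, 512)])
```

  • Under this assumed formula the 0x321 sequence collapses to the single extent <0, 14336>, while two small extents far apart are left unmerged because almost all of the covering extent would be wasted transfer.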
  • The embodiment of the present invention may perform only step 707 or only step 708, or may perform both step 707 and step 708.
  • In step 709 and the steps after step 709, the result is uniformly referred to as the data operation sequence set list after the processing operation, that is, the processed list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Step 707, or step 708, or step 707 and step 708 together process the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the local node data operation sequence set, but this operation does not affect the data operation sequence set list that the local node has already received in step 706.
  • Step 709 Send a list of data operation sequence sets after the processing operation to the target node.
  • Step 710 The target node receives and searches for the corresponding data according to the data operation sequence set list after the processing operation.
  • The target node receives the data operation sequence set list after the processing operation, searches for and reads the corresponding data one by one according to the data operation sequences in that list, and generates the data corresponding to the data operation sequence set list.
  • Step 711 Send data corresponding to the data operation sequence set list.
  • The data corresponding to the processed data operation sequence set list, generated in step 710, is sent to the local node as the data corresponding to the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Step 712 The local node receives the data corresponding to the data operation sequence set list after the processing operation.
  • Step 713 Update the data.
  • The local node updates its data according to the unprocessed data operation sequence set list and the received data corresponding to the data operation sequence set list after the processing operation.
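  • One way to read step 713 is that the local node keeps the unprocessed list as the record of which ranges were written, and cuts those ranges out of the larger merged payloads it received. The sketch below illustrates that reading only; the function name `slice_for_original` and the triple layouts are assumptions, not taken from the source.

```python
from typing import List, Tuple


def slice_for_original(original: List[Tuple[int, int, int]],
                       merged_payloads: List[Tuple[int, int, bytes]]
                       ) -> List[Tuple[int, int, bytes]]:
    """For each unprocessed (data identifier, offset, length) sequence, cut the
    matching bytes out of the merged payload that covers it."""
    out = []
    for data_id, offset, length in original:
        for m_id, m_offset, payload in merged_payloads:
            covers = (m_id == data_id and m_offset <= offset
                      and offset + length <= m_offset + len(payload))
            if covers:
                rel = offset - m_offset
                out.append((data_id, offset, payload[rel:rel + length]))
                break
        else:
            raise ValueError(f"no merged payload covers {data_id:#x} <{offset}, {length}>")
    return out


# Usage: three overlapping writes were merged into one extent <0, 10>.
unprocessed = [(0x123, 0, 4), (0x123, 6, 4), (0x123, 2, 6)]
received = [(0x123, 0, b"0123456789")]
pieces = slice_for_original(unprocessed, received)
```

  • Each range is transferred once even though the unprocessed list mentions overlapping bytes several times, which is the saving the merge is meant to provide.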
  • Steps 714 to 715 refer to steps 611 to 614 of the third embodiment of the present invention.
  • Step 716 Perform vector merging on the log data operation sequence list.
  • Step 717 Merge random IO in the log data operation sequence list into sequential IO.
  • The random IO in the log data operation sequence list is merged into sequential IO; the specific merging method is the same as in step 708.
  • The embodiment of the present invention may perform only step 716 or only step 717, or may perform both step 716 and step 717.
  • In step 718 and the steps after step 718, the result is collectively referred to as the log data operation sequence list after the processing operation.
  • The processing of the log data operation sequence list in step 716, or step 717, or step 716 and step 717 together does not affect the log data operation sequence list that the local node has already received in step 715.
  • Step 718 The local node sends a list of log data operation sequences after the processing operation.
  • Step 719 The target node receives and searches for the corresponding data according to the log data operation sequence list after the processed operation.
  • the target node receives the log data operation sequence list after the processed operation, and searches for the corresponding data one by one according to the data operation sequence in the log data operation sequence list after the processing operation, and forms data corresponding to the log data operation sequence list.
  • Step 720 Send data corresponding to the log data operation sequence list.
  • Step 721 The local node receives the data corresponding to the log data operation sequence list after the processing operation.
  • Step 722 Update the data corresponding to the log data operation sequence list after the processing operation.
  • The data corresponding to the log data operation sequence list after the processing operation is updated to the local node according to the unprocessed log data operation sequence list and the data corresponding to the log data operation sequence list after the processing operation.
  • Steps 615 to 618 in the third embodiment of the present invention may also be executed directly, without performing steps 716 to 722.
  • the distributed storage data recovery method provided by the embodiment of the invention can further reduce the transmission of duplicate data, save network bandwidth, and reduce the load of the target node.
  • a fifth embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 8, the specific method includes:
  • Steps 801 to 806 are the same as steps 601 to 606 of the third embodiment of the present invention, and are not described again.
  • Step 807 Perform vector merging on the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • For the method of performing vector merging on this data operation sequence set list, refer to step 707 in the fourth embodiment of the present invention.
  • Step 808 Merge random IO into sequential IO for the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • For the method of merging random IO into sequential IO for this data operation sequence set list, refer to step 708 in the fourth embodiment of the present invention.
  • The embodiment of the present invention may perform only step 807 or only step 808, or may perform both step 807 and step 808.
  • In step 809 and the steps after step 809, the result is collectively referred to as the data operation sequence set list after the processing operation, that is, the processed list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Step 807, or step 808, or step 807 and step 808 together process the data operation sequence set list that is related to the local node and whose version value is greater than the current version value of the data operation sequence set in the local node operation record file, but this operation does not affect the data operation sequence set list that the local node has already received in step 806.
  • Step 809 Find corresponding data according to the data operation sequence set list after the processed operation.
  • The target node searches for and reads the corresponding data one by one according to the data operation sequences in the data operation sequence set list after the processing operation, forming the data corresponding to the data operation sequence set list.
  • This data is used to restore, on the local node, the data corresponding to the data operation sequence set list whose version value is greater than the current version value of the data operation sequence set in the local node operation record file.
  • Steps 807 to 809 may be performed after step 805 and before step 806, may be performed simultaneously with step 806, or may be performed after step 806.
  • Step 810 The target node sends the data corresponding to the data operation sequence set list after the processing operation.
  • Step 811 The local node receives the data corresponding to the data operation sequence set list after the processing operation.
  • Step 812 The local node updates its data with the data corresponding to the data operation sequence set list after the processing operation.
  • Steps 813 to 816 refer to the description of steps 611 to 614 of the third embodiment of the present invention.
  • Step 817 The target node performs vector merging of the log data operation sequence list.
  • The target node vector-merges all data operation sequences with the same data identifier in the log data operation sequence list.
  • For the specific method, refer to step 807.
  • Step 818 The target node merges random IO in the log data operation sequence list into sequential IO. The specific method is the same as the step
  • The embodiment of the present invention may perform only step 817 or only step 818, or may perform both step 817 and step 818.
  • In step 819 and the steps after step 819, the result is collectively referred to as the log data operation sequence list after the processing operation.
  • The processing of the log data operation sequence list in step 817, or step 818, or step 817 and step 818 together does not affect the log data operation sequence list that the local node has already received in step 816.
  • Step 819 Find the corresponding data according to the log data operation sequence list after the processing operation. According to the data operation sequences in the log data operation sequence list after the processing operation, the corresponding data is read one by one to form the data corresponding to the log data operation sequence list.
  • Step 820 Send data corresponding to the log data operation sequence list.
  • Step 821 The local node receives the data corresponding to the log data operation sequence list after the processing operation.
  • Step 822 Update the data corresponding to the log data operation sequence list after the processing operation.
  • The data corresponding to the log data operation sequence list after the processing operation is updated to the local node according to the unprocessed log data operation sequence list and the data corresponding to the log data operation sequence list after the processing operation.
  • Steps 615 to 618 in the third embodiment of the present invention may also be executed directly, without performing steps 817 to 822.
  • the distributed storage data recovery method provided by the embodiment of the invention reduces the transmission of duplicate data, further reduces the data that needs to be restored, and saves network bandwidth.
  • A sixth embodiment of the present invention provides a node, as shown in FIG. 9, specifically including a receiving unit 901 and an updating unit 902.
  • the receiving unit 901 is configured to receive a data operation sequence set list sent by the target node according to the version value of the node data operation sequence set, and receive data corresponding to the data operation sequence set list sent by the target node.
  • the updating unit 902 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the receiving unit 901 is further configured to receive a log operation sequence list sent by the target node, and further configured to receive data corresponding to the log operation sequence list sent by the target node.
  • the updating unit 902 is further configured to update the node data according to the log operation sequence list received by the receiving unit 901 and the data corresponding to the log operation sequence list.
  • The data corresponding to the data operation sequence set list that the receiving unit 901 receives from the target node may be the data corresponding to the data operation sequence set list after a vector merging operation has been performed on the data operation sequence set list.
  • the updating unit 902 is specifically configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list after the vector combining operation.
  • The data corresponding to the data operation sequence set list that the receiving unit 901 receives from the target node may be the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO in the data operation sequence set list.
  • The updating unit 902 is specifically configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list after the random IO merge operation.
  • the update unit 902 can also be used to update the list of received data manipulation sequence sets to the operation log file.
  • In the node provided by the embodiment of the present invention, the receiving unit receives the data operation sequence set list sent by the target node according to the version value of the data operation sequence set of the node, and the updating unit updates the data of the node, thereby reducing the amount of data transferred in the network during data recovery and saving network bandwidth.
  • A seventh embodiment of the present invention provides a node, as shown in FIG. 10, specifically including a sending unit 1001, a receiving unit 1002, and an updating unit 1003.
  • the sending unit 1001 is configured to send, to the target node, a version value of the data operation sequence set of the node.
  • the receiving unit 1002 is configured to receive a data operation sequence set list sent by the target node according to the version value of the node data operation sequence set, and receive data corresponding to the data operation sequence set list sent by the target node.
  • the updating unit 1003 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • The updating unit 1003 can also be used to update the received data operation sequence set list to the operation log file.
  • For further description of the receiving unit 1002, refer to the receiving unit 901 of the sixth embodiment; details are not described again.
  • For further description of the updating unit 1003, refer to the updating unit 902 of the sixth embodiment; details are not described again.
  • The node provided by the embodiment of the present invention sends the version value of the data operation sequence set of the node to the target node by using the sending unit; the receiving unit receives the data operation sequence set list sent by the target node according to that version value; and the updating unit updates the data of the node, thereby reducing the amount of data transferred during data recovery and saving network bandwidth.
  • The operation log file of the node can be further updated.
  • An eighth embodiment of the present invention provides a node, as shown in FIG. 11, specifically including a sending unit 1101, a receiving unit 1102, a vector merging unit 1103, and an updating unit 1104.
  • The sending unit 1101 is configured to send, to the target node, a version value of the data operation sequence set of the node.
  • the receiving unit 1102 is configured to receive a data operation sequence set list that is sent by the target node according to a version value of the node data operation sequence set.
  • The vector merging unit 1103 is configured to perform a vector merging operation on the data operation sequence set list received by the receiving unit 1102.
  • The sending unit 1101 is further configured to send the data operation sequence set list after the vector merging operation to the target node, and the receiving unit 1102 is further configured to receive the data corresponding to the data operation sequence set list after the vector merging operation.
  • The updating unit 1104 updates the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the updating unit 1104 can also be configured to update the received data operation sequence set list to the operation record file.
  • For further description of the receiving unit 1102, refer to the receiving unit 901 of the sixth embodiment; details are not described again.
  • For further description of the updating unit 1104, refer to the updating unit 902 of the sixth embodiment; details are not described again.
  • The node provided by the embodiment of the present invention sends the current version value of its data operation sequence set to the target node and performs a vector merging operation on the data operation sequence set list sent by the target node. This further reduces the amount of recovery data transferred, reduces the load on the target node during data recovery, and saves network bandwidth.
  • a ninth embodiment of the present invention provides a node, as shown in FIG. 12, specifically including a transmitting unit 1201, a receiving unit 1202, a merging unit 1203, and an updating unit 1204.
  • the sending unit 1201 is configured to send, to the target node, a version value of the data operation sequence set of the node.
  • the receiving unit 1202 is configured to receive a data operation sequence set list that is sent by the target node according to a version value of the node data operation sequence set.
  • The merging unit 1203 is configured to perform, on the data operation sequence set list received by the receiving unit 1202, the operation of merging random IO into sequential IO.
  • The sending unit 1201 is further configured to send, to the target node, the data operation sequence set list after random IO has been merged into sequential IO, and the receiving unit 1202 is further configured to receive the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO.
  • the updating unit 1204 updates the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
  • the update unit 1204 can also be configured to update the received list of data manipulation sequence sets to the operation log file.
  • The vector merging unit 1103 and the merging unit 1203 may both be included, so that both vector merging and the merging of random IO into sequential IO are performed on the data operation sequence set list sent by the target node.
  • For further description of the receiving unit 1202, refer to the receiving unit 901 of the sixth embodiment; details are not described again.
  • For further description of the updating unit 1204, refer to the updating unit 902 of the sixth embodiment; details are not described again.
  • The node provided by the embodiment of the present invention sends the current version value of its data operation sequence set to the target node and performs the merge operation on the data operation sequence set list sent by the target node, thereby effectively reducing the amount of recovery data transferred, saving network bandwidth, and reducing the load on the target node.
  • In the node provided in the sixth to ninth embodiments of the present invention, the receiving unit may further be configured to receive a log operation sequence list sent by the target node, and to receive data corresponding to the log operation sequence list sent by the target node.
  • The updating unit may further be configured to update the data of the node according to the log operation sequence list received by the receiving unit and the data corresponding to the log operation sequence list, thereby restoring all data of the node while reducing the amount of data transferred.
  • a tenth embodiment of the present invention provides a node, as shown in FIG. 13, including a receiving unit 1301 and a transmitting unit 1302.
  • the receiving unit 1301 is configured to receive a version value of the local node data operation sequence set.
  • the sending unit 1302 is configured to send, according to the version value received by the receiving unit 1301, data corresponding to the data operation sequence set list and the data operation sequence set list to the local node.
  • The receiving unit 1301 is further configured to receive the data operation sequence set list, sent by the local node, on which the vector merging operation has been performed. In this case, the data corresponding to the data operation sequence set list sent by the sending unit 1302 is specifically the data corresponding to the data operation sequence set list after the vector merging operation.
  • The receiving unit 1301 is further configured to receive the data operation sequence set list, sent by the local node, on which random IO has been merged into sequential IO. In this case, the data corresponding to the data operation sequence set list sent by the sending unit 1302 is specifically the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO.
  • The node provided by the embodiment of the present invention can provide the local node with the data operation sequence set list determined by the current version value of the local node's data operation sequence set, together with the data corresponding to that list, for data recovery of the local node.
  • The receiving unit receives the version value of the data operation sequence set of the local node, and the sending unit sends, according to the version value received by the receiving unit, the data operation sequence set list and the data corresponding to the data operation sequence set list to the local node.
  • These are used to update the data of the local node, thereby reducing the amount of data transferred during data recovery and saving network bandwidth.
  • An eleventh embodiment of the present invention provides a node, as shown in FIG. 14, comprising: a receiving unit 1401, a transmitting unit 1402, and a vector combining unit 1403.
  • the receiving unit 1401 is configured to receive a version value of the local node data operation sequence set.
  • The sending unit 1402 is configured to send, according to the version value received by the receiving unit 1401, a data operation sequence set list to the local node.
  • The vector merging unit 1403 is configured to perform vector merging on the data operation sequence set list, and the sending unit 1402 is further configured to send, to the local node, the data corresponding to the data operation sequence set list after the vector merging operation.
  • For further description of the receiving unit 1401, refer to the receiving unit 1301 of the tenth embodiment; details are not described again.
  • For further description of the sending unit 1402, refer to the sending unit 1302 of the tenth embodiment; details are not described again.
  • The node provided by the embodiment of the present invention can provide the local node with the data operation sequence set list whose version value is greater than the current version value of the local node's data operation sequence set, together with the data corresponding to the data operation sequence set list after vector merging, for data recovery of the local node, further reducing the amount of data transferred.
  • A twelfth embodiment of the present invention provides a node, as shown in FIG. 15, comprising: a receiving unit 1501, a sending unit 1502, and a merging unit 1503.
  • the receiving unit 1501 is configured to receive a version value of the local node data operation sequence set.
  • The sending unit 1502 is configured to send, according to the version value received by the receiving unit 1501, a data operation sequence set list to the local node.
  • The merging unit 1503 is configured to merge random IO into sequential IO for the data operation sequence set list, and the sending unit 1502 is further configured to send, to the local node, the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO.
  • For further description of the receiving unit 1501, refer to the receiving unit 1301 of the tenth embodiment; details are not described again.
  • For further description of the sending unit 1502, refer to the sending unit 1302 of the tenth embodiment; details are not described again.
  • The vector merging unit 1403 and the merging unit 1503 may both be included, so that both vector merging and the merging of random IO into sequential IO are performed on the data operation sequence set list. This can reduce the overhead of data transmission and of the nodes.
  • The node provided by the embodiment of the present invention can provide the local node with the data operation sequence set list whose version value is greater than that of the local node's data operation sequence set, together with the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO, for data recovery of the local node while reducing node overhead.
  • the receiving unit is further configured to receive a log data operation sequence request of the local node.
  • the sending unit is further configured to: according to the log data operation sequence request, send a log data operation sequence list to the local node, and send the data corresponding to the log data operation sequence list to the local node according to the log data operation sequence list.
  • The node provided by the tenth to twelfth embodiments of the present invention may further include a search generating unit, configured to search the operation record file of the target node, according to the version value of the local node's data operation sequence set, for the data operation sequence sets whose version values are greater than that of the local node.
  • a thirteenth embodiment of the present invention provides a distributed storage data recovery system, as shown in FIG. 16, including a local node 1601 and a target node 1602.
  • The local node 1601 is configured to receive a data operation sequence set list that the target node sends according to the version value of the local node's data operation sequence set, to receive the data corresponding to the data operation sequence set list sent by the target node, and to update the data of the local node according to the data operation sequence set list and the data corresponding to that list.
  • The target node 1602 is configured to receive the version value of the local node's data operation sequence set and to send, according to the version value, a data operation sequence set list and the data corresponding to that list to the local node.
  • The system provided by this embodiment performs data recovery by transmitting the version number of the local node's data operation sequence set, which reduces the amount of data transmitted and saves network bandwidth.
  • For details of the nodes provided in the sixth to twelfth embodiments and the system provided in the thirteenth embodiment, refer to the description of the method embodiments of the present invention.
  • The distributed storage data recovery system provided by this embodiment further reduces the amount of data transmitted, and therefore the network bandwidth consumed, by vector-merging the data operation sequence set list and the log data operation sequence list and merging their random IO into sequential IO. At the same time, the load on the target node is reduced.
  • the disclosed systems, devices, and methods can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • For example, the division into units is only a division by logical function.
  • There may be other ways of dividing; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • The coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or of another form.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • The integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer-readable storage medium. Based on this understanding, the part of the technical solution of the present invention that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium. The software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a method, device, and system for recovering distributed storage data. The method comprises: a local node receiving a data operation sequence set list sent, according to a version value of a data operation sequence set of the local node, by a destination node; receiving data corresponding to the data operation sequence set list sent by the destination node; and according to the data operation sequence set list and the data corresponding to the data operation sequence set list, updating data of the local node. The method, device, and system for recovering distributed storage data provided in the embodiments of the present invention are able to recover data by sending the version number of a data operation sequence set of a local node, thereby reducing data transmission volumes and saving network bandwidth.

Description

Distributed storage data recovery method, device and system
Technical field
The present invention belongs to the field of information technology and, in particular, relates to a distributed storage data recovery method, device, and system.
Background
With the development of information technology, massive data processing has challenged traditional data processing methods, and various large-scale distributed cluster systems have emerged in response. A distributed cluster system consists of a large number of conventional nodes and presents strong overall processing capability externally by distributing the processing load across the nodes. The nodes collaborate through shared data to complete processing tasks.
A distributed block storage system is a distributed storage system that uses data blocks as its storage unit to meet massive storage requirements, and it presents strong storage capability externally. When a node in a distributed storage system fails, how quickly the failed node can recover its data on rejoining the cluster is key to providing high-quality service.
In the prior art, a node maintains a snapshot file that stores a backup of all data on the node. During data recovery, the failed node sends its snapshot file to the node that provides recovery data; that node receives the snapshot file, compares it with its own stored snapshot file, and returns the differing data to the failed node. The inventors found that, because the snapshot file holds a data backup, comparing the failed node's backup with the providing node's backup requires transmitting a very large amount of data, which seriously wastes network bandwidth.
Summary of the invention
A brief summary of the present invention is given below to provide a basic understanding of certain aspects of the invention. This summary is not an exhaustive overview of the invention. It is not intended to identify key or critical parts of the invention, nor to limit its scope. Its purpose is only to present some concepts in simplified form as a prelude to the more detailed description given later.
An embodiment of the present invention provides a distributed storage data recovery method, including:
a local node receiving a data operation sequence set list that a target node sends according to the version value of the local node's data operation sequence set;
receiving the data corresponding to the data operation sequence set list sent by the target node; and
updating the data of the local node according to the data operation sequence set list and the data corresponding to that list.
An embodiment of the present invention further provides a distributed storage data recovery method, including:
receiving the version value of a local node's data operation sequence set;
sending a data operation sequence set list to the local node according to the version value; and
sending, according to the data operation sequence set list, the data corresponding to that list to the local node.
An embodiment of the present invention further provides a node, including:
a receiving unit, configured to receive a data operation sequence set list that a target node sends according to the version value of the node's data operation sequence set, and to receive the data corresponding to the data operation sequence set list sent by the target node; and
an updating unit, configured to update the data of the node according to the data operation sequence set list and the data corresponding to that list.
An embodiment of the present invention further provides a node, including:
a receiving unit, configured to receive the version value of a local node's data operation sequence set; and
a sending unit, configured to send, according to the version value, a data operation sequence set list and the data corresponding to that list to the local node.
An embodiment of the present invention further provides a distributed storage data recovery system, including:
a local node, configured to receive a data operation sequence set list that a target node sends according to the version value of the local node's data operation sequence set, to receive the data corresponding to the data operation sequence set list sent by the target node, and to update the data of the local node according to the data operation sequence set list and the data corresponding to that list; and
the target node, configured to receive the version value of the local node's data operation sequence set and to send, according to the version value, a data operation sequence set list and the data corresponding to that list to the local node.
In the distributed storage data recovery method, device, and system provided by the embodiments of the present invention, the local node updates its data using the data operation sequence set list that the target node sends according to the version value of the local node's data operation sequence set. This reduces the amount of data transmitted during data recovery and saves network bandwidth.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a data operation sequence and a data operation sequence set;
FIG. 2 is a schematic flowchart of generating the operation record file;
FIG. 3 is a schematic flowchart of merging data operation sequences;
FIG. 4 is a schematic flowchart of a first embodiment of the present invention;
FIG. 5 is a schematic flowchart of a second embodiment of the present invention;
FIG. 6 is a schematic flowchart of a third embodiment of the present invention;
FIG. 7 is a schematic flowchart of a fourth embodiment of the present invention;
FIG. 8 is a schematic flowchart of a fifth embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a node according to a sixth embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a node according to a seventh embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a node according to an eighth embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a node according to a ninth embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a node according to a tenth embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a node according to an eleventh embodiment of the present invention;
FIG. 15 is a schematic structural diagram of a node according to a twelfth embodiment of the present invention;
FIG. 16 is a schematic structural diagram of a system according to a thirteenth embodiment of the present invention.
Specific embodiments
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. For clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that many implementation-specific decisions must be made in developing any such actual embodiment in order to achieve the developer's specific goals, and that these decisions may vary from one implementation to another.
In this embodiment, the local node is the node that needs to perform data recovery and may also be called the failed node; the target node is the node that provides recovery data and may also be called the recovery data target node.
In a distributed storage system, the same data is stored on different nodes to form backups, and the stored data on nodes in a backup relationship should remain consistent. As shown in FIG. 1, each write to data on a node generates a corresponding data operation sequence that records that one write operation. A data operation sequence is written obj-id: <offset, size>, where obj-id is a data identifier that denotes one type of data, offset is the initial value of the data operation sequence, and size is the extent of the data operation sequence relative to that initial value.
Data operation sequences are recorded in the node cache. When the node state changes, the data operation sequences in the cache are flushed to the operation record file, which may be, but is not limited to, a check-point file. The data operation sequences flushed to the operation record file in one flush are called a data operation sequence set, and each data operation sequence set has a version number: a set contains one version value and one or more lines of data operation sequences. The version value is a monotonically increasing non-negative integer, although any other symbol that can express an ordering may be used. In the operation record file, data operation sequence sets are stored in version-value order, forming a data operation sequence set list. A data operation sequence set list contains one or more data operation sequence sets.
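The relationship between a data operation sequence, a data operation sequence set, and the operation record file can be pictured with a few small types. This is only an illustrative sketch: the names `DataOp`, `DataOpSet`, and `OperationRecordFile` are hypothetical, and the text does not prescribe any in-memory or on-disk layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataOp:
    """One data operation sequence, written obj-id: <offset, size>."""
    obj_id: str
    offset: int  # initial value of the operation
    size: int    # extent relative to the initial value


@dataclass
class DataOpSet:
    """One data operation sequence set: a version value plus its operations."""
    version: int
    ops: list[DataOp] = field(default_factory=list)


class OperationRecordFile:
    """Hypothetical holder for sets stored in version-value order."""

    def __init__(self) -> None:
        self.sets: list[DataOpSet] = []

    def append(self, op_set: DataOpSet) -> None:
        # Version values are monotonically increasing non-negative integers.
        if op_set.version < 0:
            raise ValueError("version value must be non-negative")
        if self.sets and op_set.version <= self.sets[-1].version:
            raise ValueError("version values must increase")
        self.sets.append(op_set)


record = OperationRecordFile()
record.append(DataOpSet(version=0, ops=[DataOp("A", 1, 5)]))
record.append(DataOpSet(version=1, ops=[DataOp("A", 6, 3), DataOp("B", 0, 4)]))
print([s.version for s in record.sets])
```

Note that the types hold only the description of each write, never the written bytes themselves.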
In addition to the operation record file, a node may also hold a log data operation sequence list. The log data operation sequence list stores all data operation sequences generated during the current version of the data operation sequence set.
In all embodiments of the present invention, a data operation sequence records only the operation of writing data on the node; it does not include the written data itself.
FIG. 2 shows the update flow of the operation record file, which may be, but is not limited to, a check-point file. The flow includes the following steps:
Step 201: The node starts, initializes the data operation sequence cache, and reads the current version value V of the node's data operation sequence set.
Step 202: Set the working variable V' to the current version value V of the node's data operation sequence set.
Step 203: Set the initial value of the data operation sequence in the log data operation sequence list to 0.
Step 204: Read one data operation sequence from the log data operation sequence list, starting at the initial value, and record its length as len.
Step 205: Determine whether the read has reached the end of the data.
If the read reached the end of the data, all data in the log data operation sequence list has been read; go to step 206a. If this read was unsuccessful or did not reach the end of the data, go to step 206b.
Step 206a: Determine whether the working variable V' equals the current version value V of the data operation sequence set.
If V' equals the current version value, the node state has not changed; go to step 209a and sleep. If V' does not equal the current version value V, the node state has changed and the version value of the data operation sequence set has changed with it; go to step 209b.
Step 206b: Determine whether the data identifier already exists in the cache.
If the cache already holds a data operation sequence with the same data identifier, go to step 207a and perform a vector merge. If it does not, go to step 207b and add the data operation sequence to the cache.
Step 207a: Vector-merge the data operation sequence.
Vector-merge the new data operation sequence with the data operation sequence in the cache that has the same data identifier to obtain a new sequence, and replace the original sequence in the cache with it.
Step 207b: Add the data operation sequence to the cache.
Add the newly read data operation sequence to the cache unchanged.
Step 208: Advance the initial value of the data operation sequence in the log data operation sequence list by len.
Advancing the initial value by len skips the data operation sequence just read and moves to the start of the next one; return to step 204 to read the next data operation sequence.
Step 209a: Sleep.
Sleep for time T, then go to step 204 to merge the data operation sequences of the most recent period T. T can be set according to the system state and serves as the interval at which the node periodically checks whether to flush the data operation sequence set.
Step 209b: Flush the merged data operation sequences in the cache to the operation record file.
The node state has changed, which increases the data operation sequence set version value V; the merged data operation sequence set in the cache is flushed to the operation record file.
Step 210: Clear the cache and the log data operation sequence list.
The cleared cache is used to store and vector-merge new data operation sequences. A new log data operation sequence list is created to store all data operation sequences, and the old list is deleted. Go to step 202 to start a new round of data operation sequence merging.
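Steps 201 to 210 can be condensed into a small in-memory sketch. Several simplifications here are assumptions rather than part of the described flow: the log is a Python list, so step 208 advances by one entry instead of by a byte length `len`; the caller is responsible for sleeping for T between calls; and the flushed set is stored under the working variable V'. The names `Node`, `poll`, and `merge_range` are hypothetical.

```python
def merge_range(ranges, new):
    """Merge (offset, size) `new` into sorted, non-overlapping `ranges`."""
    start, end = new[0], new[0] + new[1]
    kept = []
    for off, size in ranges:
        if off < end and start < off + size:  # overlapping interval
            start, end = min(start, off), max(end, off + size)
        else:
            kept.append((off, size))
    return sorted(kept + [(start, end - start)])


class Node:
    def __init__(self):
        self.version = 0       # current version value V (step 201)
        self.log = []          # log list: every (obj_id, offset, size), unmerged
        self.cache = {}        # obj_id -> merged [(offset, size), ...]
        self.record_file = []  # flushed sets: (version, {obj_id: ranges})
        self._working = self.version  # working variable V' (step 202)
        self._pos = 0                 # read position in the log (step 203)

    def write(self, obj_id, offset, size):
        self.log.append((obj_id, offset, size))

    def poll(self):
        """One pass over steps 204-210. Returns True if a set was flushed."""
        while self._pos < len(self.log):              # steps 204-205
            obj_id, offset, size = self.log[self._pos]
            if obj_id in self.cache:                  # step 206b, then 207a
                self.cache[obj_id] = merge_range(self.cache[obj_id], (offset, size))
            else:                                     # step 207b
                self.cache[obj_id] = [(offset, size)]
            self._pos += 1                            # step 208
        if self._working == self.version:             # step 206a
            return False                              # step 209a: caller sleeps for T
        self.record_file.append((self._working, self.cache))  # step 209b
        self.cache, self.log = {}, []                 # step 210
        self._working, self._pos = self.version, 0    # back to steps 202-203
        return True


node = Node()
node.write("A", 1, 5)
node.write("A", 2, 6)
node.write("B", 0, 4)
print(node.poll())     # version unchanged, so nothing is flushed
node.version += 1      # simulate a node state change
print(node.poll())
print(node.record_file)
```

In this sketch the log keeps every write as issued, while the cache holds the vector-merged view that is eventually flushed.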
As the generation flow above shows, the data operation sequence sets in the operation record file are flushed from the cache when the node state changes. During the current version of the data operation sequence set, the node state has not yet changed, so the merged data operation sequences are still in the cache and have not been flushed to the operation record file; the operation record file therefore lacks the current version's data operation sequence set. The log data operation sequence list, by contrast, records all data operation sequences of the current version and does not vector-merge sequences that share a data identifier. When the operation record file lacks the current version's data operation sequence set, all data operation sequences of the current version can therefore be found in the log data operation sequence list.
Data operation sequences that share a data identifier are vector-merged as shown in FIG. 3. The operation includes the following steps:
Step 301: Data operation sequence to be merged.
Step 302: Determine whether the data operation sequence to be merged has an overlapping interval with the existing data operation sequences.
Compare the data operation sequence to be merged with the already merged data operation sequences. If they overlap, go to step 303a and perform a vector merge; otherwise go to step 303b and insert the sequence into the existing space in order of initial value.
Step 303a: Vector-merge the existing data operation sequences with the overlapping data operation sequence to be merged.
All parts of the already merged data operation sequences that overlap the sequence to be merged are merged with it. For example, if the data operation sequence A: <1, 5> already exists and the sequence to be merged is A: <2, 6>, the two have an overlapping interval and merge to A: <1, 7>.
Step 303b: Insert the data operation sequence to be merged into the existing data operation sequences.
The insertion rule is that, after insertion, all data operation sequences are in strictly increasing order of initial value. For example, if the existing data operation sequence is A: <1, 4> and the sequence to be merged is A: <6, 3>, the two do not overlap, and under the strictly increasing rule the result is written A: <1, 4><6, 3>.
Step 304: The merge is complete.
The embodiments of the present invention perform data recovery based on the operation record file, which stores all data operation sequence sets before the current version (without the data itself), and the log file, which stores all data operation sequences of the current version of the data operation sequence set. The operation record file stores the data operation sequence sets of all versions before the current version; within each set, data operation sequences that share a data identifier are vector-merged if they have an overlapping interval. The log data operation sequence list stores all data operation sequences on the node for the current data operation sequence set version value, and sequences in it that share a data identifier are not vector-merged.
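The vector merge of steps 301 to 304 amounts to interval merging over <offset, size> pairs for a single data identifier. The sketch below is a minimal illustration; the function name `vector_merge` is hypothetical, and it assumes that ranges which merely touch (one ends exactly where the next begins) are left separate, because the text only speaks of overlapping intervals.

```python
def vector_merge(existing, new):
    """Merge `new` = (offset, size) into `existing`.

    `existing` is a list of (offset, size) pairs for one data identifier,
    sorted by offset and free of overlaps. A new sorted list is returned.
    """
    start, end = new[0], new[0] + new[1]
    result = []
    for off, size in existing:
        o_end = off + size
        if off < end and start < o_end:   # step 302: overlapping interval
            start = min(start, off)       # step 303a: grow the merged range
            end = max(end, o_end)
        else:
            result.append((off, size))    # untouched existing range
    result.append((start, end - start))   # step 303b when nothing overlapped
    result.sort()                         # keep offsets strictly increasing
    return result


# A: <1, 5> merged with A: <2, 6> overlaps and becomes A: <1, 7>.
print(vector_merge([(1, 5)], (2, 6)))
# A: <1, 4> and A: <6, 3> do not overlap and stay as A: <1, 4><6, 3>.
print(vector_merge([(1, 4)], (6, 3)))
```

Both calls reproduce the worked examples given in steps 303a and 303b.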
When a node in the distributed storage system fails, the other nodes that store backups of the same data continue to work normally. The failure of a node that stores a backup of the same data is a node state change. It causes each node storing that backup to flush the current version of the data operation sequence set held in its cache to its own operation record file; the cache is then used to store a new data operation sequence set with an incremented version value. The failed node cannot perform data operations until it recovers. When the recovered local node performs data recovery, the target node only needs to send the local node a data operation sequence set list, composed of the data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file, together with the data corresponding to that list. Updating that data on the local node recovers the data corresponding to the operation record file. To recover all of the data, the target node additionally sends the log data operation sequence list, composed of the target node's log data operation sequences, and the data corresponding to it; the local node then updates the data corresponding to the log data operation sequence list, after which all data is up to date.
Each node therefore stores an operation record file, a log data operation sequence list, and data. The operation record file stores a data operation sequence set list composed of the data operation sequence sets of all versions earlier than the current version; the data operation sequences in these sets correspond to the data written before the current version and record every data write operation. The log data operation sequence list stores all data operation sequences of the current version of the data operation sequence set; these sequences correspond to the data written during the current version and record every data write operation.
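The bookkeeping described here can be pictured with a small data model. The following is a minimal sketch under stated assumptions, not the implementation of the invention: the names `DataOpSequence`, `OpSequenceSet`, `Node`, `write`, and `on_state_change` are hypothetical, each write is recorded as an (offset, length) pair, and the log list is assumed to restart when the version value is incremented. Vector merging of the flushed set is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataOpSequence:
    data_id: int                    # data identifier, for example 0x123
    extents: List[Tuple[int, int]]  # (offset, length) pairs, one per write


@dataclass
class OpSequenceSet:
    version: int
    sequences: List[DataOpSequence] = field(default_factory=list)


@dataclass
class Node:
    # Sets of all versions earlier than the current one (no data, only records).
    record_file: List[OpSequenceSet] = field(default_factory=list)
    # Current-version set, held in the node cache until the next state change.
    cache: OpSequenceSet = field(default_factory=lambda: OpSequenceSet(version=1))
    # Current-version sequences, kept unmerged (assumption: reset on each flush).
    log: List[DataOpSequence] = field(default_factory=list)

    def write(self, data_id: int, offset: int, length: int) -> None:
        """Record one data write in both the cached set and the log list."""
        seq = DataOpSequence(data_id, [(offset, length)])
        self.cache.sequences.append(seq)
        self.log.append(seq)

    def on_state_change(self) -> None:
        """Flush the cached set to the record file and start a new version."""
        self.record_file.append(self.cache)
        self.cache = OpSequenceSet(version=self.cache.version + 1)
        self.log = []
```

In this model, a recovering node only needs the entries of a peer's `record_file` whose `version` exceeds its own, plus the peer's `log`.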
The first embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 4, the method includes:
Step 401: The local node receives a data operation sequence set list that the target node sends according to the version value of the local node's data operation sequence set.
Optionally, the version value of the local node's data operation sequence set is sent to the target node by the local node.
Optionally, the version value of the local node's data operation sequence set may instead be sent to the target node by the master node.
Step 402: Receive the data, sent by the target node, that corresponds to the data operation sequence set list.
Optionally, after the local node receives the data operation sequence set list that the target node sends according to the version value of the local node's data operation sequence set, the method further includes: the local node performs a vector merge operation on the data operation sequence set list and sends the vector-merged data operation sequence set list to the target node;
in that case, receiving the data corresponding to the data operation sequence set list sent by the target node is specifically: receiving the data, sent by the target node, that corresponds to the vector-merged data operation sequence set list.
Optionally, after the local node receives the data operation sequence set list that the target node sends according to the version value of the local node's data operation sequence set, the method further includes:
the local node performs, on the data operation sequence set list, an operation that merges random input/output (Input/Output, hereinafter IO) into sequential IO, and sends the resulting data operation sequence set list to the target node;
in that case, receiving the data corresponding to the data operation sequence set list sent by the target node is specifically: receiving the data, sent by the target node, that corresponds to the data operation sequence set list on which random IO has been merged into sequential IO.
Specifically, random IO is merged into sequential IO on the data operation sequence set list when, for data operation sequences in the list that have the same identifier, the ratio of the hole size between the sequences to the size of the contiguous space spanned by the merged sequence is less than a set percentage, or the hole size between the sequences is less than a set threshold.
Optionally, receiving the data corresponding to the data operation sequence set list sent by the target node is specifically: receiving the data corresponding to the data operation sequence set list after the target node has performed a vector merge operation on that list.
Optionally, receiving the data corresponding to the data operation sequence set list sent by the target node is specifically: receiving the data corresponding to the data operation sequence set list after the target node has merged random IO into sequential IO on that list.
Step 403: Update the data of the local node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
Optionally, the local node writes the received data operation sequence set list to the operation record file of the local node.
Optionally, the method further includes:
the local node receives a log operation sequence list sent by the target node;
receives the data, sent by the target node, that corresponds to the log operation sequence list; and
updates the data of the local node according to the log operation sequence list and the data corresponding to the log operation sequence list.
In the distributed storage data recovery method provided by the first embodiment of the present invention, the version value of the local node's data operation sequence set is sent to the target node, and the data operation sequence set list that the target node sends according to that version value is received and used to update the data of the local node. This reduces the amount of data transmitted during data recovery and saves network bandwidth.
The second embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 5, the method specifically includes:
Step 501: Receive the version value of the local node's data operation sequence set.
The version value of the local node's data operation sequence set may be received by the target node.
Specifically, the current version value of the local node's data operation sequence set received by the target node may be sent by the local node.
Specifically, the current version value of the local node's data operation sequence set received by the target node may also be sent by the master node.
Step 502: Send a data operation sequence set list to the local node according to the version value.
Optionally, search the operation record file of the target node for data operation sequence sets whose version values are greater than the version value of the local node's data operation sequence set, and select the data operation sequence sets related to the local node to generate the data operation sequence set list;
send the data operation sequence set list to the local node.
Specifically, the data operation sequence sets related to the local node are selected by using the hash algorithm of a distributed hash table architecture or the allocation table algorithm of a metadata service architecture.
Step 503: According to the data operation sequence set list, send the data corresponding to the data operation sequence set list to the local node.
Before the data corresponding to the data operation sequence set list is sent to the local node according to the list, the method further includes: performing a vector merge operation on the data operation sequence set list;
in that case, sending the data corresponding to the data operation sequence set list to the local node is specifically: sending the local node the data corresponding to the vector-merged data operation sequence set list.
Before the data corresponding to the data operation sequence set list is sent to the local node according to the list, the method further includes: merging random IO into sequential IO on the data operation sequence set list;
in that case, sending the data corresponding to the data operation sequence set list to the local node is specifically: sending the local node the data corresponding to the data operation sequence set list on which random IO has been merged into sequential IO.
Optionally, before the data corresponding to the data operation sequence set list is sent to the local node according to the list, the method further includes: receiving, from the local node, the data operation sequence set list on which the local node has performed a vector merge operation;
in that case, sending the data corresponding to the data operation sequence set list to the local node is specifically: sending the local node the data corresponding to the vector-merged data operation sequence set list.
Optionally, before the data corresponding to the data operation sequence set list is sent to the local node according to the list, the method further includes: receiving, from the local node, the data operation sequence set list on which the local node has merged random IO into sequential IO;
in that case, sending the data corresponding to the data operation sequence set list to the local node is specifically: sending the local node the data corresponding to the data operation sequence set list on which random IO has been merged into sequential IO.
Optionally, the method may further include: receiving a log data operation sequence request from the local node;
sending a log data operation sequence list to the local node according to the log data operation sequence request; and
sending the data corresponding to the log data operation sequence list to the local node according to the log data operation sequence list.
In the distributed storage data recovery method provided by the second embodiment of the present invention, the version value of the local node's data operation sequence set is received, a data operation sequence set list is sent, and the data corresponding to the list is sent according to the list, so that the data of the local node is updated. This reduces the amount of data transmitted during data recovery and saves network bandwidth.
The third embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 6, the method specifically includes the following steps:
Step 601: Start data recovery.
After the local node recovers from its failure, it starts the data recovery process. While the local node is down, the other nodes that store backups of the same data continue to work normally and perform data storage operations. After recovering from the failure, the local node must be made consistent with the data backups stored on the other nodes, so the data that the local node failed to store after the failure must be recovered.
Step 602: Send the current version value of the data operation sequence set to the target node.
The local node reads the current version value of the data operation sequence set that was stored in its operation record file before the failure, and sends that value to the target node. When the local node fails, the node state changes, and the version value of the data operation sequence set on the other nodes storing backups of the same data is incremented. The current version value of the data operation sequence set in the local node's operation record file therefore needs to be sent to the target node, so that the data operation sequence sets added on the target node during the failure can be identified. In this step, the current version value of the local node's data operation sequence set received by the target node may alternatively be sent by the master node.
Step 603: Receive the version value of the local node's data operation sequence set, and search the operation record file for data operation sequence sets whose version values are greater than the local node's version value.
The target node searches its own operation record file for all data operation sequence sets whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file. During the local node's failure, the other nodes storing backups of the same data may undergo further state changes, so all such sets need to be found, not only the most recent one. These sets record all write operations performed on the data backup after the local node failed.
Step 604: Generate a list of data operation sequence sets related to the local node.
The data operation sequence sets in the target node's operation record file whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file cover all data stored on the target node. Some of that data is unrelated to the local node, so these sets also contain data operation sequences that are unrelated to the local node. Therefore, after the target node generates the list of data operation sequence sets with the greater version values, it uses a data placement algorithm to determine whether the data identifier of each data operation sequence in the list belongs to the local node, and deletes from the sets in the list the data operation sequences that do not belong to the local node. Commonly used data placement algorithms include the distributed hash table algorithm of a distributed hash table architecture and the allocation table algorithm of a metadata service architecture. Using the data identifiers of the data operation sequences, the algorithm identifies the data operation sequences in the list that belong to the local node, and the list of data operation sequence sets related to the local node is generated from them.
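Steps 603 and 604 amount to a version filter followed by a placement filter. The sketch below illustrates this under stated assumptions: the function names `build_set_list` and `hash_owner`, the tuple-and-dictionary layout of the operation record file, and the single-owner hash rule are all hypothetical. A real system would use its own distributed hash table or allocation table lookup in place of `is_local`.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Extents = List[Tuple[int, int]]               # (offset, length) pairs
SequenceSet = Tuple[int, Dict[int, Extents]]  # (version value, {data id: extents})


def hash_owner(data_id: int, nodes: List[str]) -> str:
    """Hypothetical placement rule: hash the data identifier onto one node."""
    digest = hashlib.sha256(str(data_id).encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def build_set_list(
    record_file: List[SequenceSet],
    local_version: int,
    is_local: Callable[[int], bool],
) -> List[SequenceSet]:
    """Keep sets newer than local_version, restricted to the local node's data."""
    result = []
    for version, sequences in record_file:
        if version <= local_version:
            continue  # the local node already holds this version
        relevant = {d: ext for d, ext in sequences.items() if is_local(d)}
        if relevant:
            result.append((version, relevant))
    return result
```

With a hash-based rule, the predicate could be passed as `lambda d: hash_owner(d, nodes) == "local"`, where `nodes` is the cluster membership list.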
Step 605: Send the list of data operation sequence sets related to the local node.
To allow the local node to recover the data it failed to store during the failure, the data operation sequences generated by the target node during the local node's failure must first be obtained. These are the data operation sequences in the data operation sequence set list, referred to in this embodiment, whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file. These sequences are used to recover the corresponding data.
Step 606: The local node receives the list of data operation sequence sets related to the local node.
The target node sends the list of data operation sequence sets that it generated during the local node's failure. The local node receives this list, in which every version value is greater than the current version value of the data operation sequence set in the local node's operation record file, and uses it to update the corresponding data. Because each data operation sequence in the list records one data write operation, recovering the data requires updating, for each data operation sequence, the data that the sequence records. The list may be stored in the local node's operation record file; the list stored there is the one that has not undergone vector merging and/or merging of random IO into sequential IO. Storing the list in the local node's operation record file is optional and is not a required step of this embodiment of the present invention.
Step 607: Look up the data corresponding to the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
The target node reads the corresponding data according to this list. After the list has been sent to the local node, the data corresponding to the list also needs to be sent to the local node. In this embodiment, that data is produced by looking up and reading the corresponding data for each data operation sequence in the list, one by one.
Step 607 is performed after step 604; it may be performed at the same time as step 605 or after step 605.
Step 608: The target node sends the data corresponding to the list of data operation sequence sets related to the local node.
Step 609: Receive the data corresponding to the data operation sequence set list.
The local node receives the data corresponding to the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
Step 610: Update the data.
The data of the local node is updated according to the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file, and according to the data corresponding to that list.
After receiving the data corresponding to the list, the local node writes that data to the local node according to the list.
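The update in step 610 can be pictured as replaying each recorded write against local storage. The following is a minimal sketch only: the function name `apply_recovery`, the in-memory `bytearray` store, and the payload keyed by (data identifier, offset, length) are hypothetical choices made for illustration, and each recorded pair is assumed to mean (offset, length).

```python
from typing import Dict, List, Tuple

Extents = List[Tuple[int, int]]               # (offset, length) pairs
SequenceSet = Tuple[int, Dict[int, Extents]]  # (version value, {data id: extents})
Payload = Dict[Tuple[int, int, int], bytes]   # (data id, offset, length) -> bytes


def apply_recovery(
    local_data: Dict[int, bytearray],
    set_list: List[SequenceSet],
    payload: Payload,
) -> Dict[int, bytearray]:
    """Write the received bytes into local storage, one recorded write at a time."""
    for _version, sequences in sorted(set_list, key=lambda item: item[0]):
        for data_id, extents in sequences.items():
            buf = local_data.setdefault(data_id, bytearray())
            for offset, length in extents:
                chunk = payload[(data_id, offset, length)]
                if len(chunk) != length:
                    raise ValueError("payload size does not match recorded length")
                end = offset + length
                if len(buf) < end:
                    buf.extend(b"\x00" * (end - len(buf)))  # grow to fit the write
                buf[offset:end] = chunk
    return local_data
```

Only the extents named in the list are touched, which is why the transfer is smaller than copying every data object in full.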
The following are optional steps of this embodiment of the present invention (not shown in FIG. 6):
Step 611: Send a log data operation sequence request.
When the local node performs data recovery, the data operation sequence sets recorded in the target node's operation record file are all the sets of versions earlier than the current version of the target node's data operation sequence set. The target node's current-version data operation sequence set is in the node cache and has not been flushed to the target node's operation record file. To recover all of the data, the data corresponding to the target node's current-version data operation sequence set must therefore also be restored to the local node. Because that set is stored in the cache and cannot be flushed to the target node's operation record file before the node state changes, it cannot currently be read. The target node's log data operation sequences are therefore needed, and the data corresponding to them is used for data recovery. In this step, the sending of the local node's log data operation sequence request may also be controlled by the master node.
Step 612: The target node receives the log data operation sequence request, and looks up and generates the list of log data operation sequences related to the local node.
The log data operation sequence list stored on the target node contains all data operation sequences of the current version of the target node's data operation sequence set. Therefore, in response to the log data operation sequence request sent by the local node, the target node needs to send the local node only the log data operation sequence list related to the local node. The log data operation sequences related to the local node are selected in the same way as in step 604.
Step 613: The target node sends the log data operation sequence list.
Step 614: The local node receives and updates the log data operation sequence list.
After receiving the log data operation sequence list sent by the target node, the local node updates its local log data operation sequences.
Step 615: Look up the data corresponding to the list of log data operation sequences related to the local node.
To recover the data corresponding to the list of log data operation sequences related to the local node, the target node looks up and reads the corresponding data for each data operation sequence in that list, one by one.
Step 616: The target node sends the data corresponding to the log data operation sequence list.
Step 615 is performed after step 612; it may be performed at the same time as step 613 or after step 613.
Step 617: The local node receives the data corresponding to the log data operation sequence list.
Step 618: The local node updates the local node data according to the log operation sequence list and the data corresponding to the log operation sequence list.
In this embodiment of the present invention, the current version value of the local node's data operation sequence set is sent to the target node, and the data operation sequence set list that the target node sends according to that version value is received and used for data recovery. This reduces the amount of data transmitted and saves network bandwidth.
The fourth embodiment of the present invention provides a distributed storage data recovery method. As shown in FIG. 7, the method specifically includes:
Steps 701 to 706 are the same as steps 601 to 606 of the third embodiment of the present invention and are not described again.
Step 707: The local node performs vector merging on the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
To reduce the amount of data that must be transmitted for recovery, the local node performs vector merging on the list after receiving it, that is, on the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file. Specifically, within the list, data operation sequences that have the same data identifier are vector-merged regardless of whether the version values of their data operation sequence sets are the same; the merging principle is to merge all intervals that overlap. For example, the two data operation sequences with data identifier 0x123, namely 0x123: <0, 1024> <2000, 1024> and 0x123: <500, 4096>, are merged into 0x123: <0, 4596>. As another example, the three data operation sequences with data identifier 0x321, namely 0x321: <0, 512> <1024, 1024>, 0x321: <1500, 2000>, and 0x321: <4096, 10240>, are merged into 0x321: <0, 512> <1024, 2476> <4096, 10240>.
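The vector merge in step 707 behaves like a standard interval merge. The sketch below is one possible way to write it, assuming that each pair is read as (offset, length), which is the reading consistent with the two worked examples. The function name `vector_merge` is hypothetical, and leaving extents that merely touch unmerged is a choice of this sketch, since the text speaks only of overlapping intervals.

```python
from typing import List, Tuple


def vector_merge(extents: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping (offset, length) extents that share one data identifier."""
    merged: List[List[int]] = []  # each entry is [start, end), end exclusive
    for offset, length in sorted(extents):
        end = offset + length
        if merged and offset < merged[-1][1]:
            # Overlaps the previous interval: extend it if this one reaches further.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# The two worked examples for identifiers 0x123 and 0x321:
print(vector_merge([(0, 1024), (2000, 1024), (500, 4096)]))
print(vector_merge([(0, 512), (1024, 1024), (1500, 2000), (4096, 10240)]))
```

Sorting first means each extent only has to be compared with the most recently merged interval, so the merge is a single pass after the sort.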
Step 708: Merge random IO into sequential IO for the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
A statistical algorithm is used to evaluate how randomly the ranges in that list are distributed, and random IO is merged into sequential IO in order to reduce node and network overhead and obtain the best recovery performance. Several statistical algorithms are possible. A common one sums the space occupied by the holes in the data operation sequences in the list (the space that is not covered by any of the data operation sequences with the same data identifier), compares it with the size of the contiguous space spanned by the merged sequence, and computes a percentage. If the percentage is smaller than the value set by the system, the data operation sequences with the same data identifier can be merged. The percentage set by the system can be configured and adjusted as required. This embodiment uses 20% as an example; this is not a limitation of the present invention and serves only to describe the embodiment more clearly. Take the merged operation sequence from step 707, 0x321: <0, 512> <1024, 2476> <4096, 10240>, as an example. For these three data operation sequences, the hole size is [1024 - (0 + 512)] + [4096 - (1024 + 2476)] = 1108, the spanned range is 4096 + 10240 - 0 = 14336, and the hole percentage is approximately 7.7%, which is less than 20%, so the result after merging into sequential IO is 0x321: <0, 14336>. Alternatively, whether to merge can be decided by whether the hole between data operation sequences is greater than a threshold; the threshold can be set and adjusted according to actual needs.
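The hole-percentage test described above can be sketched as follows. This is an illustration only: the names `hole_size` and `merge_to_sequential` and the parameter `max_hole_ratio` are hypothetical, each pair is assumed to be <offset, length>, and the 20% default simply mirrors the example value used in this embodiment.

```python
def hole_size(ranges):
    """Total size of the gaps between (offset, length) ranges."""
    holes = 0
    cur_end = None
    for offset, length in sorted(ranges):
        if cur_end is not None and offset > cur_end:
            holes += offset - cur_end  # gap before this range
        end = offset + length
        cur_end = end if cur_end is None else max(cur_end, end)
    return holes


def merge_to_sequential(ranges, max_hole_ratio=0.20):
    """Collapse the ranges into one sequential range when holes are a small share of the span."""
    if len(ranges) < 2:
        return list(ranges)
    start = min(offset for offset, _ in ranges)
    end = max(offset + length for offset, length in ranges)
    span = end - start
    if span > 0 and hole_size(ranges) / span < max_hole_ratio:
        return [(start, span)]
    return sorted(ranges)  # too sparse: keep the individual ranges


if __name__ == "__main__":
    ranges_0x321 = [(0, 512), (1024, 2476), (4096, 10240)]
    print(hole_size(ranges_0x321), merge_to_sequential(ranges_0x321))
```

For the 0x321 example the holes total 1108 of a 14336 span, well under the 20% example value, so the three ranges collapse into <0, 14336>. The alternative criterion mentioned in the text, comparing each hole against an absolute threshold, would replace the ratio test with a per-gap comparison.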
In this embodiment of the present invention, only step 707 or only step 708 may be performed, or both step 707 and step 708 may be performed. From step 709 onward, the result is collectively referred to as the processed list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
Step 707, step 708, or both steps 707 and 708 process the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file. This processing does not, however, affect the list that was already received in step 706.
Step 709: Send the processed data operation sequence set list to the target node.
Step 710: The target node receives the processed data operation sequence set list and looks up and reads the corresponding data according to it.
The target node receives the processed data operation sequence set list, looks up and reads the corresponding data one by one according to the data operation sequences in that list, and generates the data corresponding to the data operation sequence set list.
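A minimal sketch of the lookup performed by the target node is given below. The in-memory `store` dictionary, the function name `read_ranges`, and the shape of the returned records are all hypothetical stand-ins for whatever storage interface the node actually uses; each pair is again assumed to be <offset, length>.

```python
def read_ranges(store, processed_list):
    """Read the bytes named by each data operation sequence, one by one.

    store:          {data_id: bytes}, a stand-in for the target node's storage
    processed_list: {data_id: [(offset, length), ...]}
    Returns a list of (data_id, offset, chunk) records.
    """
    result = []
    for data_id, ranges in processed_list.items():
        blob = store[data_id]
        for offset, length in ranges:
            result.append((data_id, offset, blob[offset:offset + length]))
    return result


if __name__ == "__main__":
    store = {0x321: bytes(range(256))}
    for record in read_ranges(store, {0x321: [(0, 4), (10, 2)]}):
        print(record)
```

Because the processed list contains fewer and larger ranges than the unprocessed one, the loop issues fewer reads, which is the effect the merging steps aim for.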
Step 711: Send the data corresponding to the data operation sequence set list.
The data generated in step 710, which corresponds to the processed data operation sequence set list, is sent to the local node as the data corresponding to the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
Step 712: The local node receives the data corresponding to the processed data operation sequence set list.
Step 713: Update the data.
The local node updates its data according to the unprocessed data operation sequence set list and the received data corresponding to the processed data operation sequence set list.
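One possible reading of this update step is sketched below: the local node writes only the ranges named in the unprocessed list, taking the bytes from the larger chunks fetched for the processed list. This interpretation, the function name `apply_update`, and the data layout are assumptions made for illustration; the embodiment does not spell out the mechanism.

```python
def apply_update(local, original_ranges, fetched):
    """Copy only the originally recorded ranges out of the fetched chunks.

    local:           bytearray holding the local node's copy of one data item
    original_ranges: [(offset, length), ...] from the unprocessed list
    fetched:         [(offset, chunk), ...] returned for the processed list
    """
    for offset, length in original_ranges:
        for chunk_offset, chunk in fetched:
            if chunk_offset <= offset and offset + length <= chunk_offset + len(chunk):
                rel = offset - chunk_offset
                local[offset:offset + length] = chunk[rel:rel + length]
                break
        else:
            raise ValueError(f"range <{offset}, {length}> not covered by fetched data")
    return local


if __name__ == "__main__":
    local = bytearray(b"aaaaaaaaaa")
    print(apply_update(local, [(0, 2), (5, 3)], [(0, b"0123456789")]))
```

Under this reading, bytes that fall into holes bridged during the merge into sequential IO are transferred but left untouched locally, which would explain why the unprocessed list is kept after steps 707 and 708.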
The following are optional steps of this embodiment of the present invention (not shown in FIG. 7):
For steps 714 to 715, refer to steps 611 to 614 of the third embodiment of the present invention.
Step 716: Perform vector merging on the log data operation sequence list.
All data operation sequences with the same data identifier in the log data operation sequence list are vector-merged; the merging method is the same as in step 707.
Step 717: Merge random IO into sequential IO for the log data operation sequence list.
Random IO in the log data operation sequence list is merged into sequential IO; the merging method is the same as in step 708.
In this embodiment of the present invention, only step 716 or only step 717 may be performed, or both step 716 and step 717 may be performed. From step 718 onward, the result is collectively referred to as the processed log data operation sequence list.
The processing of the log data operation sequence list in step 716, step 717, or both steps 716 and 717 does not affect the log data operation sequence list that the local node already received in step 715.
Step 718: The local node sends the processed log data operation sequence list.
Step 719: The target node receives the processed log data operation sequence list and looks up and reads the corresponding data according to it.
The target node receives the processed log data operation sequence list, looks up and reads the corresponding data one by one according to the data operation sequences in that list, and forms the data corresponding to the log data operation sequence list.
Step 720: Send the data corresponding to the log data operation sequence list.
Step 721: The local node receives the data corresponding to the processed log data operation sequence list.
Step 722: Update the data corresponding to the processed log data operation sequence list.
According to the unprocessed log data operation sequence list and the data corresponding to the processed log data operation sequence list, the data corresponding to the processed log data operation sequence list is updated to the local node.
Alternatively, steps 716 to 722 may be skipped, and the steps that are the same as steps 615 to 618 of the third embodiment of the present invention may be performed directly.
The distributed storage data recovery method provided in this embodiment of the present invention further reduces the transmission of duplicate data, saves network bandwidth, and also reduces the load on the target node.
The fifth embodiment of the present invention provides a distributed storage data recovery method which, as shown in FIG. 8, specifically includes:
Steps 801 to 806 are the same as steps 601 to 606 of the third embodiment of the present invention and are not described again here.
Step 807: Perform vector merging on the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
For the method of vector merging this list, refer to step 707 in the fourth embodiment of the present invention.
Step 808: Merge random IO into sequential IO for the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
For the method used in step 808 to merge random IO into sequential IO for this list, refer to step 708 in the fourth embodiment of the present invention.
In this embodiment of the present invention, only step 807 or only step 808 may be performed, or both step 807 and step 808 may be performed. In step 809, the result is collectively referred to as the processed list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
Step 807, step 808, or both steps 807 and 808 process the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file. This processing does not, however, affect the list that was already received in step 806.
Step 809: Look up the corresponding data according to the processed data operation sequence set list.
The target node looks up and reads the corresponding data one by one according to the data operation sequences in the processed list, forming the data corresponding to the data operation sequence set list. This data is used to restore, on the local node, the data corresponding to the list of data operation sequence sets that are related to the local node and whose version values are greater than the current version value of the data operation sequence set in the local node's operation record file.
Steps 807 to 809 may be performed after step 805 and before step 806, at the same time as step 806, or after step 806.
Step 810: The target node sends the data corresponding to the processed data operation sequence set list.
Step 811: The local node receives the data corresponding to the processed data operation sequence set list.
Step 812: Update the data of the local node with the data corresponding to the processed data operation sequence set list.
The following are optional steps of this embodiment of the present invention (not shown in FIG. 8):
For steps 813 to 816, refer to the description of steps 611 to 614 of the third embodiment of the present invention.
Step 817: The target node performs vector merging on the log data operation sequence list.
The target node vector-merges all data operation sequences with the same data identifier in the log data operation sequence list; for the specific method, refer to step 807.
Step 818: The target node merges random IO into sequential IO for the log data operation sequence list.
Random IO in the log data operation sequence list is merged into sequential IO; the specific method is the same as in step 808.
In this embodiment of the present invention, only step 817 or only step 818 may be performed, or both step 817 and step 818 may be performed. From step 819 onward, the result is collectively referred to as the processed log data operation sequence list.
The processing of the log data operation sequence list in step 817, step 818, or both steps 817 and 818 does not affect the log data operation sequence list that the local node already received in step 816.
Step 819: Look up the corresponding data according to the processed log data operation sequence list.
The corresponding data is looked up and read one by one according to the data operation sequences in the processed log data operation sequence list, forming the data corresponding to the log data operation sequence list.
Step 820: Send the data corresponding to the log data operation sequence list.
Step 821: The local node receives the data corresponding to the processed log data operation sequence list.
Step 822: Update the data corresponding to the processed log data operation sequence list.
According to the unprocessed log data operation sequence list and the data corresponding to the processed log data operation sequence list, the data corresponding to the processed log data operation sequence list is updated to the local node.
Alternatively, steps 817 to 822 may be skipped, and the steps that are the same as steps 615 to 618 of the third embodiment of the present invention may be performed directly.
The distributed storage data recovery method provided in this embodiment of the present invention reduces the transmission of duplicate data, further reduces the amount of data that needs to be recovered, and saves network bandwidth.
The sixth embodiment of the present invention provides a node which, as shown in FIG. 9, specifically includes a receiving unit 901 and an updating unit 902.
The receiving unit 901 is configured to receive the data operation sequence set list that the target node sends according to the version value of the node's data operation sequence set, and to receive the data corresponding to the data operation sequence set list sent by the target node.
The updating unit 902 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list. The receiving unit 901 may further be configured to receive the log operation sequence list sent by the target node and to receive the data corresponding to the log operation sequence list sent by the target node. The updating unit 902 is further configured to update the data of the node according to the log operation sequence list received by the receiving unit 901 and the data corresponding to the log operation sequence list.
Optionally, the data corresponding to the data operation sequence set list that the receiving unit 901 receives from the target node may be the data corresponding to the data operation sequence set list after the target node has performed the vector merging operation on that list. In this case, the updating unit 902 is specifically configured to update the data of the node according to the data operation sequence set list and the data corresponding to the vector-merged data operation sequence set list.
Optionally, the data corresponding to the data operation sequence set list that the receiving unit 901 receives from the target node may be the data corresponding to the data operation sequence set list after the target node has merged random IO into sequential IO for that list. In this case, the updating unit 902 is specifically configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO.
The updating unit 902 may further be configured to update the received data operation sequence set list to the operation record file.
In the node provided in this embodiment of the present invention, the receiving unit receives the data operation sequence set list that the target node sends according to the version value of the node's data operation sequence set, and the updating unit updates the data of the node. This reduces the amount of data transferred during data recovery and saves network bandwidth.
The seventh embodiment of the present invention provides a node which, as shown in FIG. 10, specifically includes a sending unit 1001, a receiving unit 1002, and an updating unit 1003.
The sending unit 1001 is configured to send the version value of the node's data operation sequence set to the target node. The receiving unit 1002 is configured to receive the data operation sequence set list that the target node sends according to the version value of the node's data operation sequence set, and to receive the data corresponding to the data operation sequence set list sent by the target node. The updating unit 1003 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
The updating unit 1003 may further be configured to update the received data operation sequence set list to the operation record file.
For further description of the receiving unit 1002, refer to the receiving unit 901 of the sixth embodiment; details are not repeated here.
For further description of the updating unit 1003, refer to the updating unit 902 of the sixth embodiment; details are not repeated here.
In the node provided in this embodiment of the present invention, the sending unit sends the version value of the node's data operation sequence set to the target node, the receiving unit receives the data operation sequence set list that the target node sends according to that version value, and the updating unit updates the data of the node. This reduces the amount of data transferred during data recovery and saves network bandwidth. The operation record file of the node can also be updated.
The eighth embodiment of the present invention provides a node which, as shown in FIG. 11, specifically includes a sending unit 1101, a receiving unit 1102, a vector merging unit 1103, and an updating unit 1104.
The sending unit 1101 is configured to send the version value of the node's data operation sequence set to the target node. The receiving unit 1102 is configured to receive the data operation sequence set list that the target node sends according to the version value of the node's data operation sequence set. The vector merging unit 1103 is configured to perform the vector merging operation on the data operation sequence set list received by the receiving unit 1102. The sending unit 1101 is further configured to send the vector-merged data operation sequence set list to the target node, and the receiving unit 1102 is further configured to receive the data corresponding to the vector-merged data operation sequence set list. The updating unit 1104 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list. The updating unit 1104 may further be configured to update the received data operation sequence set list to the operation record file.
For further description of the receiving unit 1102, refer to the receiving unit 901 of the sixth embodiment; details are not repeated here.
For further description of the updating unit 1104, refer to the updating unit 902 of the sixth embodiment; details are not repeated here.
The node provided in this embodiment of the present invention sends the current version value of its data operation sequence set to the target node and performs the vector merging operation on the data operation sequence set list sent by the target node. This further reduces the amount of recovery data transferred, reduces the load on the target node during data recovery, and saves network bandwidth.
The ninth embodiment of the present invention provides a node which, as shown in FIG. 12, specifically includes a sending unit 1201, a receiving unit 1202, a merging unit 1203, and an updating unit 1204.
The sending unit 1201 is configured to send the version value of the node's data operation sequence set to the target node. The receiving unit 1202 is configured to receive the data operation sequence set list that the target node sends according to the version value of the node's data operation sequence set. The merging unit 1203 is configured to merge random IO into sequential IO for the data operation sequence set list received by the receiving unit 1202. The sending unit 1201 is further configured to send to the target node the data operation sequence set list after random IO has been merged into sequential IO, and the receiving unit 1202 is further configured to receive the data corresponding to that list. The updating unit 1204 is configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list. The updating unit 1204 may further be configured to update the received data operation sequence set list to the operation record file.
In another embodiment, the node may include both the vector merging unit 1103 and the merging unit 1203, so that the data operation sequence sets sent by the target node undergo both vector merging and the merging of random IO into sequential IO.
For further description of the receiving unit 1202, refer to the receiving unit 901 of the sixth embodiment; details are not repeated here.
For further description of the updating unit 1204, refer to the updating unit 902 of the sixth embodiment; details are not repeated here.
The node provided in this embodiment of the present invention sends the current version value of its data operation sequence set to the target node and performs the merging operation on the data operation sequence set list sent by the target node. This effectively reduces the amount of recovery data transferred, saves network bandwidth, and reduces the load on the target node.
In the nodes provided in the sixth to ninth embodiments of the present invention, the receiving unit may further be configured to receive the log operation sequence list sent by the target node and to receive the data corresponding to the log operation sequence list sent by the target node. The updating unit may further be configured to update the data of the node according to the log operation sequence list received by the receiving unit and the data corresponding to that log operation sequence list. In this way, all data of the node is recovered while the amount of data transferred is reduced.
For details of the nodes provided in the sixth to ninth embodiments of the present invention, refer to the description of the local node in the first to fifth method embodiments.
The tenth embodiment of the present invention provides a node which, as shown in FIG. 13, includes a receiving unit 1301 and a sending unit 1302.
The receiving unit 1301 is configured to receive the version value of the local node's data operation sequence set. The sending unit 1302 is configured to send, according to the version value received by the receiving unit 1301, the data operation sequence set list and the data corresponding to the data operation sequence set list to the local node.
The receiving unit 1301 may further be configured to receive the data operation sequence set list that the local node sends after performing the vector merging operation on the data operation sequence set list. In this case, the data corresponding to the data operation sequence set list sent by the sending unit 1302 is specifically the data corresponding to the vector-merged data operation sequence set list.
The receiving unit 1301 may further be configured to receive the data operation sequence set list that the local node sends after merging random IO into sequential IO for the data operation sequence set list. In this case, the data corresponding to the data operation sequence set list sent by the sending unit 1302 is specifically the data corresponding to the data operation sequence set list after random IO has been merged into sequential IO.
The node provided in this embodiment of the present invention can provide the list of data operation sequence sets whose version values are greater than the current version value of the local node's data operation sequence set, together with the data corresponding to that list, so as to provide data recovery for the local node.
In the node provided in this embodiment of the present invention, the receiving unit receives the version value of the local node's data operation sequence set, and the sending unit sends, according to the received version value, the data operation sequence set list and the corresponding data to the local node so that the local node can update its data. This reduces the amount of data transferred during data recovery and saves network bandwidth.
An eleventh embodiment of the present invention provides a node which, as shown in FIG. 14, includes a receiving unit 1401, a sending unit 1402, and a vector merging unit 1403.
The receiving unit 1401 is configured to receive the version value of the local node's data operation sequence set. The sending unit 1402 is configured to send a data operation sequence set list to the local node according to the version value received by the receiving unit 1401. The vector merging unit 1403 is configured to perform vector merging on the data operation sequence set list, and the sending unit 1402 is further configured to send to the local node the data corresponding to the vector-merged data operation sequence set list.
For further description of the receiving unit 1401, refer to the receiving unit 1301 of the tenth embodiment; details are not repeated here.
For further description of the sending unit 1402, refer to the sending unit 1302 of the tenth embodiment; details are not repeated here.
The node provided by this embodiment can supply a list of the data operation sequence sets whose version values are greater than the current version value of the local node's data operation sequence set, together with the data corresponding to that list after vector merge processing, so that the local node can recover its data. This further reduces the amount of recovery data transmitted.
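The vector merge performed by unit 1403 can be illustrated with a minimal sketch. This section does not define the operation, so the sketch assumes it means coalescing overlapping or adjacent write ranges that share an identifier; the `Op` record and its `key`, `offset`, and `length` fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    key: str     # identifier of the data object (hypothetical field)
    offset: int  # start of the written range
    length: int  # size of the written range


def vector_merge(ops):
    """Coalesce overlapping or adjacent ranges that share a key.

    Assumption: this is what "vector merge" means; the text above
    does not spell the operation out.
    """
    merged = []
    for op in sorted(ops, key=lambda o: (o.key, o.offset)):
        last = merged[-1] if merged else None
        if (last is not None and last.key == op.key
                and op.offset <= last.offset + last.length):
            # Extend the previous range to cover this one.
            end = max(last.offset + last.length, op.offset + op.length)
            merged[-1] = Op(last.key, last.offset, end - last.offset)
        else:
            merged.append(op)
    return merged
```

Under this reading, fewer and larger ranges remain in the list, so the target node sends each overlapping region once instead of once per recorded operation.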
A twelfth embodiment of the present invention provides a node which, as shown in FIG. 15, includes a receiving unit 1501, a sending unit 1502, and a merging unit 1503.
The receiving unit 1501 is configured to receive the version value of the local node's data operation sequence set. The sending unit 1502 is configured to send a data operation sequence set list to the local node according to the version value received by the receiving unit 1501. The merging unit 1503 is configured to perform a random I/O to sequential I/O merge operation on the data operation sequence set list, and the sending unit 1502 is further configured to send to the local node the data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
For further description of the receiving unit 1501, refer to the receiving unit 1301 of the tenth embodiment; details are not repeated here.
For further description of the sending unit 1502, refer to the sending unit 1302 of the tenth embodiment; details are not repeated here.
In another embodiment of the present invention, the node may include both the vector merging unit 1403 and the merging unit 1503, so that the data operation sequence set list undergoes both vector merge processing and random I/O to sequential I/O merge processing. This can reduce both data transmission and node overhead.
The node provided by this embodiment can supply a list of the data operation sequence sets whose version values are greater than the current version value of the local node's data operation sequence set, together with the data corresponding to that list after random I/O to sequential I/O merge processing, so that the local node can recover its data while node overhead is reduced.
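The merge performed by unit 1503 can be sketched as follows. The merge condition is taken from claim 5 of this document: operations with the same identifier are merged when the hole between them is below a set threshold, or when the hole is below a set percentage of the contiguous space the merged sequence would span. The tuple layout `(key, offset, length)` and the default values of `max_hole` and `max_hole_ratio` are assumptions made for illustration only.

```python
def merge_to_sequential(ops, max_hole=4096, max_hole_ratio=0.1):
    """Merge scattered ranges per key into fewer sequential ranges.

    ops is a list of (key, offset, length) tuples (hypothetical layout).
    Two neighbouring ranges with the same key are merged, hole included,
    when the hole is smaller than max_hole or smaller than
    max_hole_ratio of the span the merged range would cover.
    """
    merged = []
    for key, offset, length in sorted(ops):
        if merged and merged[-1][0] == key:
            _, last_off, last_len = merged[-1]
            last_end = last_off + last_len
            hole = max(0, offset - last_end)
            span = max(last_end, offset + length) - last_off
            if hole < max_hole or hole < max_hole_ratio * span:
                merged[-1] = (key, last_off, span)
                continue
        merged.append((key, offset, length))
    return merged
```

The trade-off in this sketch is that a merged range also carries the bytes in the hole, so the thresholds bound how much extra data is transferred in exchange for one sequential read instead of several random ones.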
In the nodes provided by the tenth to twelfth embodiments of the present invention, the receiving unit is further configured to receive a log data operation sequence request from the local node. The sending unit is further configured to send a log data operation sequence list to the local node according to the log data operation sequence request, and to send to the local node, according to the log data operation sequence list, the data corresponding to that list.
The nodes provided by the tenth to twelfth embodiments of the present invention may further include a search generating unit, configured to search, according to the version value of the local node's data operation sequence set, the operation record file of the target node for data operation sequence sets whose version values are greater than that version value, and to select from them the data operation sequence sets related to the local node in order to generate the data operation sequence set list.
For details of the nodes provided in the tenth to twelfth embodiments of the present invention, refer to the description of the target node in the first to fifth method embodiments.
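The lookup performed by the search generating unit can be sketched in a few lines. The record layout (a list of dictionaries with `version` and `key` entries) and the `is_relevant` predicate are hypothetical; the predicate stands in for whatever placement rule decides which sets concern the local node, for example the hash algorithm or allocation table algorithm mentioned in claim 12.

```python
def build_set_list(operation_record, local_version, is_relevant):
    """Return the sets newer than local_version that concern the local node.

    operation_record: iterable of dicts with at least a "version" entry
                      (hypothetical layout of the operation record file).
    is_relevant:      caller-supplied predicate, an assumption standing in
                      for the placement rule of the cluster.
    """
    return [
        entry for entry in operation_record
        if entry["version"] > local_version and is_relevant(entry)
    ]


# Example: the local node last saw version 2 and stores keys starting with "a".
record = [
    {"version": 1, "key": "a1"},
    {"version": 3, "key": "b7"},
    {"version": 4, "key": "a2"},
]
newer = build_set_list(record, 2, lambda e: e["key"].startswith("a"))
```

Filtering by version first and by relevance second keeps the list limited to what the local node missed and actually stores.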
A thirteenth embodiment of the present invention provides a distributed storage data recovery system which, as shown in FIG. 16, includes a local node 1601 and a target node 1602.
The local node 1601 is configured to receive the data operation sequence set list sent by the target node according to the version value of the local node's data operation sequence set, to receive the data corresponding to that list sent by the target node, and to update the local node's data according to the data operation sequence set list and the corresponding data. The target node 1602 is configured to receive the version value of the local node's data operation sequence set and, according to that version value, to send to the local node the data operation sequence set list and the data corresponding to that list.
The system provided by this embodiment performs data recovery by sending the version value of the local node's data operation sequence set, which reduces the amount of data transmitted and saves network bandwidth.
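The exchange between the local node 1601 and the target node 1602 can be sketched end to end. This is a minimal in-process illustration, not the patented implementation: the class and method names are hypothetical, the operation record is simplified to `(version, key)` pairs, and advancing the local version after recovery is an assumption.

```python
class TargetNode:
    def __init__(self, record, store):
        self.record = record  # list of (version, key) entries (simplified)
        self.store = store    # key -> current data

    def set_list_since(self, version):
        # Only entries newer than the caller's version are listed.
        return [(v, k) for v, k in self.record if v > version]

    def data_for(self, set_list):
        # Data corresponding to the listed entries only.
        return {k: self.store[k] for _, k in set_list}


class LocalNode:
    def __init__(self, version, store):
        self.version = version
        self.store = store

    def recover_from(self, target):
        set_list = target.set_list_since(self.version)
        self.store.update(target.data_for(set_list))
        if set_list:
            # Assumption: the local version advances to the newest entry.
            self.version = max(v for v, _ in set_list)
        return len(set_list)


target = TargetNode([(1, "a"), (2, "b"), (3, "a")], {"a": "A3", "b": "B2"})
local = LocalNode(1, {"a": "A1"})
applied = local.recover_from(target)
```

Because the local node reports its version first, the target transfers only the entries recorded after that version rather than the full data set, which is the bandwidth saving the embodiment describes.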
For details of the nodes provided in the sixth to twelfth embodiments of the present invention and the system provided in the thirteenth embodiment, refer to the description of the method embodiments of the present invention.
The distributed storage data recovery system provided by the embodiments of the present invention performs vector merging on the data operation sequence set list and the log data operation sequence list and merges random I/O into sequential I/O, which further reduces the amount of data transmitted and saves network bandwidth. It also relieves the load on the target node.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are performed in hardware or in software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
Furthermore, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, and the scope of protection of the present invention is not limited to them. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.

Claims

1. A distributed storage data recovery method, characterized by comprising:
a local node receiving a data operation sequence set list sent by a target node according to a version value of the data operation sequence set of the local node;
receiving data, sent by the target node, corresponding to the data operation sequence set list; and
updating the data of the local node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
2. The method according to claim 1, characterized in that the version value of the data operation sequence set of the local node is sent by the local node to the target node.
3. The method according to claim 1 or 2, characterized in that,
after the local node receives the data operation sequence set list sent by the target node according to the version value of the data operation sequence set of the local node, the method further comprises: the local node performing a vector merge operation on the data operation sequence set list, and sending the data operation sequence set list after the vector merge operation to the target node;
and the receiving of the data, sent by the target node, corresponding to the data operation sequence set list is specifically: receiving data, sent by the target node, corresponding to the data operation sequence set list after the vector merge operation.
4. The method according to any one of claims 1 to 3, characterized in that, after the local node receives the data operation sequence set list sent by the target node according to the version value of the data operation sequence set of the local node, the method further comprises:
the local node performing a random I/O to sequential I/O merge operation on the data operation sequence set list, and sending the data operation sequence set list after the random I/O to sequential I/O merge operation to the target node;
and the receiving of the data, sent by the target node, corresponding to the data operation sequence set list is specifically: receiving data, sent by the target node, corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
5. The method according to claim 4, characterized in that the random I/O to sequential I/O merge operation is performed on the data operation sequence set list when, in the data operation sequence set list, the ratio of the hole value between data operation sequences having the same identifier to the size of the contiguous space spanned by the merged sequence is less than a set percentage, or the hole value between data operation sequences having the same identifier is less than a set threshold.
6. The method according to claim 1 or 2, characterized in that the receiving of the data, sent by the target node, corresponding to the data operation sequence set list is specifically: receiving data, sent by the target node, corresponding to the data operation sequence set list obtained after the target node performs a vector merge operation on the data operation sequence set list.
7. The method according to claim 1, 2 or 6, characterized in that the receiving of the data, sent by the target node, corresponding to the data operation sequence set list is specifically: receiving data, sent by the target node, corresponding to the data operation sequence set list obtained after the target node performs a random I/O to sequential I/O merge operation on the data operation sequence set list.
8. The method according to any one of claims 1 to 7, characterized in that the local node updates the received data operation sequence set list into the operation record file of the local node.
9. The method according to any one of claims 1 to 8, characterized by further comprising:
the local node receiving a log operation sequence list sent by the target node;
receiving data, sent by the target node, corresponding to the log operation sequence list; and
updating the data of the local node according to the log operation sequence list and the data corresponding to the log operation sequence list.
10. A distributed storage data recovery method, characterized by comprising:
receiving a version value of the data operation sequence set of a local node;
sending a data operation sequence set list to the local node according to the version value; and
sending, according to the data operation sequence set list, data corresponding to the data operation sequence set list to the local node.
11. The method according to claim 10, characterized in that sending the data operation sequence set list to the local node according to the version value specifically comprises:
searching the operation record file of a target node for data operation sequence sets whose version values are greater than the version value of the data operation sequence set of the local node, and selecting the data operation sequence sets related to the local node to generate the data operation sequence set list; and
sending the data operation sequence set list to the local node.
12. The method according to claim 11, characterized in that the data operation sequence sets related to the local node are selected by using a hash algorithm under a distributed hash table architecture or an allocation table algorithm in a metadata service architecture.
13. The method according to any one of claims 10 to 12, characterized in that,
before the data corresponding to the data operation sequence set list is sent to the local node according to the data operation sequence set list, the method further comprises: performing a vector merge operation on the data operation sequence set list;
and the sending of the data corresponding to the data operation sequence set list to the local node is specifically: sending, to the local node, data corresponding to the data operation sequence set list after the vector merge operation.
14. The method according to any one of claims 10 to 13, characterized in that,
before the data corresponding to the data operation sequence set list is sent to the local node according to the data operation sequence set list, the method further comprises: performing a random I/O to sequential I/O merge operation on the data operation sequence set list;
and the sending of the data corresponding to the data operation sequence set list to the local node is specifically: sending, to the local node, data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
15. The method according to any one of claims 10 to 12, characterized in that,
before the data corresponding to the data operation sequence set list is sent to the local node according to the data operation sequence set list, the method further comprises: receiving, from the local node, the data operation sequence set list obtained after the local node performs a vector merge operation on the data operation sequence set list;
and the sending of the data corresponding to the data operation sequence set list to the local node is specifically: sending, to the local node, data corresponding to the data operation sequence set list after the vector merge operation.
16. The method according to claim 10, 11, 12 or 15, characterized in that, before the data corresponding to the data operation sequence set list is sent to the local node according to the data operation sequence set list, the method further comprises: receiving, from the local node, the data operation sequence set list obtained after the local node performs a random I/O to sequential I/O merge operation on the data operation sequence set list;
and the sending of the data corresponding to the data operation sequence set list to the local node is specifically: sending, to the local node, data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
17. The method according to any one of claims 10 to 16, characterized by further comprising:
receiving a log data operation sequence request from the local node;
sending a log data operation sequence list to the local node according to the log data operation sequence request; and
sending, according to the log data operation sequence list, data corresponding to the log data operation sequence list to the local node.
18. A node, characterized by comprising:
a receiving unit, configured to receive a data operation sequence set list sent by a target node according to a version value of the data operation sequence set of the node, and to receive data, sent by the target node, corresponding to the data operation sequence set list; and
an updating unit, configured to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list.
19. The node according to claim 18, characterized by further comprising:
a sending unit, configured to send the version value of the data operation sequence set of the node to the target node.
20. The node according to claim 19, characterized by further comprising:
a vector merging unit, configured to perform a vector merge operation on the data operation sequence set list received by the receiving unit;
wherein the sending unit is further configured to send the data operation sequence set list after the vector merge operation to the target node;
and the data corresponding to the data operation sequence set list received by the receiving unit is specifically: data corresponding to the data operation sequence set list after the vector merge operation.
21. The node according to claim 19 or 20, characterized by further comprising:
a merging unit, configured to perform a random I/O to sequential I/O merge operation on the data operation sequence set list received by the receiving unit;
wherein the sending unit is further configured to send the data operation sequence set list after the random I/O to sequential I/O merge operation to the target node;
and the data corresponding to the data operation sequence set list received by the receiving unit is specifically: data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
22. The node according to claim 18 or 19, characterized in that the data, received by the receiving unit and sent by the target node, corresponding to the data operation sequence set list is specifically: data, sent by the target node, corresponding to the data operation sequence set list obtained after the target node performs a vector merge operation on the data operation sequence set list.
23. The node according to claim 18, 19 or 22, characterized in that the data, received by the receiving unit and sent by the target node, corresponding to the data operation sequence set list is specifically: data, sent by the target node, corresponding to the data operation sequence set list obtained after the target node performs a random I/O to sequential I/O merge operation on the data operation sequence set list.
24. The node according to any one of claims 18 to 23, characterized in that
the updating unit is further configured to update the data operation sequence set list received by the receiving unit into the operation record file of the node.
25. The node according to any one of claims 18 to 24, characterized in that
the receiving unit is further configured to receive a log operation sequence list sent by the target node, and to receive data, sent by the target node, corresponding to the log operation sequence list; and
the updating unit is further configured to update the data of the node according to the log operation sequence list and the data corresponding to the log operation sequence list.
26. A node, characterized by comprising:
a receiving unit, configured to receive a version value of the data operation sequence set of a local node; and
a sending unit, configured to send to the local node, according to the version value, a data operation sequence set list and data corresponding to the data operation sequence set list.
27. The node according to claim 26, characterized by further comprising:
a search generating unit, configured to search, according to the version value, the operation record file of a target node for data operation sequence sets whose version values are greater than the version value of the data operation sequence set of the local node, and to select the data operation sequence sets related to the local node to generate the data operation sequence set list.
28. The node according to claim 26 or 27, characterized in that the receiving unit is further configured to receive, from the local node, the data operation sequence set list obtained after the local node performs a vector merge operation on the data operation sequence set list;
and the data corresponding to the data operation sequence set list sent by the sending unit is specifically: data corresponding to the data operation sequence set list after the vector merge operation.
29. The node according to any one of claims 26 to 28, characterized in that the receiving unit is further configured to receive, from the local node, the data operation sequence set list obtained after the local node performs a random I/O to sequential I/O merge operation on the data operation sequence set list;
and the data corresponding to the data operation sequence set list sent by the sending unit is specifically: data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
30. The node according to claim 26 or 27, characterized by further comprising:
a vector merging unit, configured to perform a vector merge operation on the data operation sequence set list;
wherein the sending unit being configured to send the data corresponding to the data operation sequence set list to the local node is specifically: being configured to send, to the local node, data corresponding to the data operation sequence set list after the vector merge operation.
31. The node according to claim 26, 27 or 30, characterized by further comprising:
a merging unit, configured to perform a random I/O to sequential I/O merge operation on the data operation sequence set list;
wherein the sending unit being configured to send the data corresponding to the data operation sequence set list to the local node is specifically: being configured to send, to the local node, data corresponding to the data operation sequence set list after the random I/O to sequential I/O merge operation.
32、 如权利要求 26至 31任一所述的节点, 其特征在于,  32. A node according to any of claims 26 to 31, characterized in that
所述接收单元还用于接收本地节点的 log数据操作序列请求;  The receiving unit is further configured to receive a log data operation sequence request of the local node;
所述发送单元还用于根据所述 log数据操作序列请求,向所述本地节点发 送 log数据操作序列列表, 以及根据所述 log数据操作序列列表, 向所述本地 节点发送所述 log数据操作序列列表对应的数据。  The sending unit is further configured to send, according to the log data operation sequence request, a log data operation sequence list to the local node, and send the log data operation sequence to the local node according to the log data operation sequence list. The data corresponding to the list.
33. A distributed storage data recovery system, comprising:
a local node, configured to receive a data operation sequence set list sent by a target node according to a version value of the data operation sequence set of the local node, to receive the data corresponding to the data operation sequence set list sent by the target node, and to update the data of the node according to the data operation sequence set list and the data corresponding to the data operation sequence set list; and
the target node, configured to receive the version value of the data operation sequence set of the local node and, according to the version value, to send to the local node the data operation sequence set list and the data corresponding to the data operation sequence set list.
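The exchange recited in claim 33 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names, the method names (`list_sets_after`, `fetch_data`, `recover_from`), the integer version values, and the key/value form of each operation are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class OperationSet:
    """One data operation sequence set: a versioned batch of writes (assumed shape)."""
    version: int
    operations: list  # (key, value) pairs in the order they were applied


@dataclass
class TargetNode:
    """Node holding the complete history, kept in ascending version order."""
    history: list = field(default_factory=list)

    def list_sets_after(self, version):
        # The "data operation sequence set list": versions newer than the caller's.
        return [s.version for s in self.history if s.version > version]

    def fetch_data(self, versions):
        # The data corresponding to a previously returned list.
        wanted = set(versions)
        return [s for s in self.history if s.version in wanted]


@dataclass
class LocalNode:
    """Recovering node that only knows the version value it last applied."""
    version: int = 0
    data: dict = field(default_factory=dict)

    def recover_from(self, target):
        # Step 1: send the local version value and receive the list of missing sets.
        missing = target.list_sets_after(self.version)
        # Step 2: receive the corresponding data and replay it in version order.
        for op_set in target.fetch_data(missing):
            for key, value in op_set.operations:
                self.data[key] = value
            self.version = op_set.version
        return missing
```

In this sketch, a local node that stopped at version 1 and recovers from a target holding versions 1 to 3 receives only sets 2 and 3, so the amount of transferred data depends on how far the local node has fallen behind rather than on the total data size.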
PCT/CN2011/084219 2011-12-19 2011-12-19 Method, device, and system for recovering distributed storage data WO2013091162A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201180003086.8A CN103262042B (en) 2011-12-19 2011-12-19 Distributed storage data recovery method, apparatus and system
PCT/CN2011/084219 WO2013091162A1 (en) 2011-12-19 2011-12-19 Method, device, and system for recovering distributed storage data

Publications (1)

Publication Number Publication Date
WO2013091162A1 (en) 2013-06-27

Family

ID=48667625

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/084219 WO2013091162A1 (en) 2011-12-19 2011-12-19 Method, device, and system for recovering distributed storage data

Country Status (2)

Country Link
CN (1) CN103262042B (en)
WO (1) WO2013091162A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105320577A (en) * 2014-06-11 2016-02-10 中国移动通信集团公司 Data backup and recovery method, system and device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294211B (en) * 2016-08-08 2019-05-28 浪潮(北京)电子信息产业有限公司 Method and device for detecting multi-channel sequential streams
CN108268357B (en) * 2016-12-30 2021-10-26 阿里巴巴集团控股有限公司 Real-time data processing method and device
CN109788077A (en) * 2019-03-27 2019-05-21 上海爱数信息技术股份有限公司 Cloud backup system supporting clusters and method thereof

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1271441A (en) * 1997-07-21 2000-10-25 艾利森电话股份有限公司 Method relating to databases
CN101593185A (en) * 2008-05-29 2009-12-02 国际商业机器公司 Method and system for data recovery using synchronization
CN101794247A (en) * 2010-03-26 2010-08-04 天津理工大学 Real-time database failure recovery method under nested transaction model
CN101807210A (en) * 2010-04-26 2010-08-18 中兴通讯股份有限公司 Database data synchronization method, system and device
CN101996108A (en) * 2009-08-18 2011-03-30 中兴通讯股份有限公司 Distributed environment backup and recovery method and system
CN102156720A (en) * 2011-03-28 2011-08-17 中国人民解放军国防科学技术大学 Method, device and system for restoring data
US20110208994A1 (en) * 2010-02-22 2011-08-25 International Business Machines Corporation Rebuilding lost data in a distributed redundancy data storage system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7440571B2 (en) * 2002-12-03 2008-10-21 Nagravision S.A. Method for securing software updates
CN100456887C (en) * 2006-04-21 2009-01-28 江苏移动通信有限责任公司 Method and system of realizing data synchronization of user's terminal and server
CN101546269B (en) * 2008-03-28 2013-07-03 鸿富锦精密工业(深圳)有限公司 System and method capable of executing file version updating
CN101699399B (en) * 2009-11-03 2014-04-30 中兴通讯股份有限公司 Software update system and method

Also Published As

Publication number Publication date
CN103262042B (en) 2016-03-30
CN103262042A (en) 2013-08-21

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201180003086.8

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11878155

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11878155

Country of ref document: EP

Kind code of ref document: A1