CN107463462B - Data restoration method and data restoration device - Google Patents


Publication number
CN107463462B
Authority
CN
China
Prior art keywords
data
repair
node
relay
rack
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610399396.0A
Other languages
Chinese (zh)
Other versions
CN107463462A (en
Inventor
胡燏翀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Huazhong University of Science and Technology
Original Assignee
Tencent Technology Shenzhen Co Ltd
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd, Huazhong University of Science and Technology filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201610399396.0A priority Critical patent/CN107463462B/en
Publication of CN107463462A publication Critical patent/CN107463462A/en
Application granted granted Critical
Publication of CN107463462B publication Critical patent/CN107463462B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/085Error detection or correction by redundancy in data representation, e.g. by using checking codes using codes with inherent redundancy, e.g. n-out-of-m codes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a data recovery method, which comprises the following steps: acquiring a data repair instruction, and creating a local new node for repairing a lost data segment in a local rack according to the data repair instruction; according to the data repair instruction, collecting local repair data in the coded data segments of other nodes of the local rack; determining a data relay node in the relay rack according to the data repair instruction, and collecting relay repair data in the coding data segment of the node of the corresponding relay rack through the data relay node; and according to the local repair data and the relay repair data, repairing the lost data segment by the local new node. The invention also provides a data repair device, and the data repair method and the data repair device combine data repair in the rack and data repair across the rack, thereby improving the transmission efficiency of data repair operation.

Description

Data restoration method and data restoration device
Technical Field
The present invention relates to the field of data processing, and in particular, to a data recovery method and a data recovery apparatus.
Background
With the development of science and technology, Internet enterprises place ever higher demands on data storage, and an enterprise that builds a large-scale data storage center still has to deal with data loss when equipment fails. Existing enterprises therefore use erasure codes: the data is divided into segments and encoded with redundancy, and the encoded segments are stored at different hardware locations, so that the data storage equipment achieves a certain degree of fault-tolerant storage.
Existing erasure codes are constructed from two configurable parameters n and k, where k is a positive integer less than n. If the amount of original data to be stored is M, an (n, k) erasure code divides the original data into k data segments of size M/k each, and then encodes these k segments into n encoded data segments of the same size, each of which is stored on a different node or server.
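The sizes described above can be checked with a short sketch. The function name `erasure_layout` and the sample values are illustrative assumptions and do not come from the patent; the sketch only computes segment size, total stored data, and the redundancy factor n/k.

```python
from fractions import Fraction

def erasure_layout(M, n, k):
    """Return (segment_size, total_stored, redundancy) for an (n, k) code."""
    if not (0 < k < n):
        raise ValueError("k must be a positive integer less than n")
    segment = Fraction(M, k)      # each encoded segment holds M/k
    total = n * segment           # n encoded segments are stored in all
    return segment, total, Fraction(n, k)

# Hypothetical example: M = 6 units of data under a (6, 3) code.
segment, total, redundancy = erasure_layout(M=6, n=6, k=3)
print(segment, total, redundancy)  # 2 12 2
```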
Erasure coding gains storage efficiency at the cost of device performance: repairing any one data segment requires the transmission of additional data segments. The conventional way to repair a failed data segment is to retrieve k segments from the remaining non-failed nodes, reconstruct the original data, and thereby repair the lost data segment.
In order to reduce the amount of data transmitted when repairing a data segment, some erasure codes use a minimum redundancy storage coding scheme in which a single data segment is reconstructed from data supplied by the non-failed nodes, while the code still satisfies the maximum distance separable (MDS) property.
However, due to the heterogeneous nature of existing data storage devices, a data center often includes multiple racks, each rack including multiple storage nodes, the storage nodes in the racks being interconnected by a switch at the top of the rack, while the switches at the top of the multiple racks are further connected by a network core switch.
In order to tolerate the faults of the nodes and the racks at the same time, different coded data segments are placed in different nodes of different racks, so that the data segments are inevitably transmitted from non-failed nodes of other racks when any lost data segment is repaired, a large amount of cross-rack data transmission bandwidth is generated, and the normal repair efficiency of the existing data storage equipment is greatly influenced.
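To make the cross-rack cost of conventional repair concrete, the following is a minimal sketch. It assumes (this is not stated in the source) that the repairing node first reads the n/r - 1 surviving segments in its own rack and fetches the rest of the k segments from other racks; the function name is hypothetical.

```python
from fractions import Fraction

def conventional_cross_rack_traffic(M, n, k, r):
    """Cross-rack data moved when k whole segments are fetched for one repair.

    Assumption (not from the source): in-rack survivors are read first and
    only the remaining segments are fetched from other racks.
    """
    per_rack = n // r
    local = min(per_rack - 1, k)   # segments available without leaving the rack
    remote = k - local             # segments that must cross the core switch
    return remote * Fraction(M, k)

# Hypothetical example: M = 6, (n, k) = (6, 3), r = 3 racks of 2 nodes each.
print(conventional_cross_rack_traffic(M=6, n=6, k=3, r=3))  # 4
```

Under these assumptions two of the three required segments, or two thirds of M, cross the core switch for a single-node repair.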
Disclosure of Invention
The embodiment of the invention provides a data restoration method and a data restoration device which can improve restoration efficiency; the data recovery method and the data recovery device solve the technical problem that the existing data recovery method and the existing data recovery device are large in cross-rack data transmission bandwidth and low in recovery efficiency.
The embodiment of the invention provides a data recovery method, which comprises the following steps:
acquiring a data repair instruction, and creating a local new node for repairing a lost data segment in a local rack according to the data repair instruction;
according to the data repair instruction, collecting local repair data in the coded data segments of other nodes of the local rack;
determining a data relay node in a relay rack according to the data repair instruction, and collecting relay repair data in the coding data segment of the node of the corresponding relay rack through the data relay node; and
according to the local repair data and the relay repair data, performing repair operation on the lost data segment at the local new node.
An embodiment of the present invention further provides a data recovery apparatus, which includes:
the local new node creating module is used for acquiring a data repairing instruction and creating a local new node for repairing a lost data segment in a local rack according to the data repairing instruction;
a local repair data collection module, configured to collect local repair data in other nodes of the local rack according to the data repair instruction;
the relay repair data collection module is used for determining a data relay node in a relay rack according to the data repair instruction and collecting relay repair data in the corresponding node of the relay rack through the data relay node; and
the repair module is used for performing repair operation on the lost data segment at the local new node according to the local repair data and the relay repair data.
Compared with the data restoration method and the data restoration device in the prior art, the data restoration method and the data restoration device combine data restoration in the rack and data restoration across the rack, so that the transmission efficiency of data restoration operation is improved; the data restoration method and the data restoration device solve the technical problem that the restoration efficiency is low due to the fact that the cross-rack data transmission bandwidth of the existing data restoration method and the data restoration device is large.
Drawings
FIG. 1 is a flow chart of a first preferred embodiment of a data repair method of the present invention;
FIG. 2 is a flow chart of a second preferred embodiment of the data repair method of the present invention;
FIG. 3 is an information flow diagram of data repair of a second preferred embodiment of the data repair method of the present invention;
FIG. 4 is a schematic structural diagram of a first preferred embodiment of the data recovery apparatus of the present invention;
FIG. 5 is a schematic structural diagram of a data recovery apparatus according to a second preferred embodiment of the present invention;
FIG. 6 is a schematic view of a working environment structure of an electronic device in which the data recovery apparatus of the present invention is located.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present invention are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the invention and should not be taken as limiting the invention with regard to other embodiments that are not detailed herein.
In the description that follows, unless otherwise indicated, embodiments of the invention are described with reference to steps and symbolic representations of operations performed by one or more computers. It will thus be appreciated that such steps and operations, which are at times referred to as being computer-executed, include the manipulation by a computer processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the computer's operation in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, while the principles of the invention are described in the foregoing terms, this is not meant to be limiting, and those skilled in the art will recognize that the various steps and operations described below may also be implemented in hardware.
The data restoration method and the data restoration device can be used in electronic equipment such as a data storage server and the like, the data storage server can carry out MDS storage based on optimal storage redundancy on the stored data, and when a node failure occurs, the bandwidth consumption of the data restored among different racks can be minimized by combining data restoration in the racks and data restoration across the racks, so that the restoration efficiency of the data restoration operation of the data storage server is improved.
Referring to fig. 1, fig. 1 is a flowchart illustrating a data recovery method according to a first preferred embodiment of the present invention. The data recovery method of the preferred embodiment may be implemented by using the electronic device, and the data recovery method includes:
step S101, acquiring a data repair instruction, and creating a local new node for repairing a lost data segment in a local rack according to the data repair instruction;
step S102, according to the data repair instruction, collecting local repair data in the coded data segments of other nodes of the local rack;
step S103, determining a data relay node in the relay rack according to the data repair instruction, and collecting relay repair data in the coding data segment of the node of the corresponding relay rack through the data relay node;
and step S104, according to the local repair data and the relay repair data, repairing the lost data segment at the local new node.
The data repair process of the data repair method of the present preferred embodiment is described in detail below.
In step S101, the data repair apparatus obtains a data repair instruction. The data repair instruction is an instruction indicating that a read failure has occurred in data stored in the data storage server, in which case the node corresponding to the data that failed to be read needs to be rebuilt. Therefore, in this step, the data repair apparatus creates, according to the data repair instruction, a local new node for repairing the lost data segment in the local rack where the data read failure occurred. Subsequently, the process goes to step S102.
In step S102, since the lost data segment is redundantly stored in both the local rack and the relay racks, the data repair apparatus collects, according to the data repair instruction, local repair data from the encoded data segments of the other nodes of the local rack; the local repair data is related to the data node in which the data read failure occurred. Subsequently, the process goes to step S103.
In step S103, the data repair apparatus determines a data relay node in each relay rack according to the data repair instruction, and then collects, through the data relay node of each relay rack, relay repair data from the encoded data segments of the nodes of that relay rack; the relay repair data is related to the data node in which the data read failure occurred. Subsequently, the process goes to step S104.
In step S104, the data repair apparatus performs a repair operation on the lost data segment at the local new node created in step S101, based on the local repair data acquired in step S102 and the relay repair data acquired in step S103; that is, it produces a local new node that contains the lost data segment.
This completes the data repair process of the data repair method of the present preferred embodiment.
The data restoration method of the preferred embodiment combines in-rack data restoration and cross-rack data restoration, and improves restoration efficiency of data restoration operation.
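Steps S101 to S104 can be illustrated with a deliberately simplified toy. The sketch below swaps in a single XOR parity code instead of the MDS erasure code of the embodiment, so it does not tolerate a rack failure; it only shows the idea that each relay rack combines its own segments in-rack and sends one block across racks. All names (`xor_blocks`, `repair`) and the sample data are assumptions for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def repair(racks, failed_rack, failed_index):
    """Rebuild racks[failed_rack][failed_index]; return (segment, cross-rack blocks)."""
    # S101: the "new node" is simply the slot that is about to be refilled.
    # S102: collect local repair data from the in-rack survivors.
    local = [seg for i, seg in enumerate(racks[failed_rack]) if i != failed_index]
    # S103: the relay node of every other rack combines that rack's segments
    # in-rack and sends a single block across racks.
    relayed = [xor_blocks(rack) for h, rack in enumerate(racks) if h != failed_rack]
    # S104: combine local and relay repair data at the new node.
    return xor_blocks(local + relayed), len(relayed)

# Toy layout: five data segments plus one XOR parity, spread over three racks.
data = [b"ab", b"cd", b"ef", b"gh", b"ij"]
parity = xor_blocks(data)
racks = [[data[0], data[1]], [data[2], data[3]], [data[4], parity]]
rebuilt, cross_rack_blocks = repair(racks, failed_rack=0, failed_index=0)
print(rebuilt == data[0], cross_rack_blocks)  # True 2
```

In this toy, fetching every surviving segment individually would move four segments across racks, whereas relay aggregation moves two blocks.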
Referring to fig. 2, fig. 2 is a flowchart illustrating a data recovery method according to a second preferred embodiment of the present invention. The data recovery method of the preferred embodiment may be implemented by using the electronic device, and the data recovery method includes:
step S201, based on the node data reading fault, generating a data repair instruction;
step S202, acquiring a data repair instruction, and creating a local new node for repairing a lost data segment in a local rack according to the data repair instruction;
step S203, collecting local repair data in the coded data segments of other nodes of the local rack according to the data repair instruction;
step S204, determining a data relay node in the relay rack according to the data repair instruction, and collecting relay repair data in the coding data segment of the node of the corresponding relay rack through the data relay node;
and step S205, according to the local repair data and the relay repair data, repairing the lost data segment at the local new node.
The data repair process of the data repair method of the preferred embodiment is described in detail below.
Before data recovery, a data recovery device needs to establish a data storage architecture, specifically:
The data repair apparatus encodes original data of size M into n encoded data segments, each of size M/k, using an MDS erasure code with parameters (n, k). The n encoded data segments are then distributed to n nodes, and the n nodes are evenly distributed over r racks, so that each rack holds n/r nodes.
Here, n is set to an integer multiple of r. In addition, n/r is less than or equal to k, so that no node failure can be repaired locally within a rack alone; and n/r is less than or equal to n-k, so that when any one rack fails, the failure can be repaired by the nodes on the other racks. $R_h$ denotes the h-th rack, and $X_{h,i}$ denotes the i-th node of the h-th rack.
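The three parameter conditions above can be written as a small validation routine. This is a minimal sketch; the function name `place_segments` and the (h, i) tuple labels are illustrative assumptions that mirror the rack and node indexing.

```python
def place_segments(n, k, r):
    """Map rack h -> list of node labels (h, i), both 1-based, after validating n, k, r."""
    if n % r != 0:
        raise ValueError("n must be an integer multiple of r")
    per_rack = n // r
    if per_rack > k:
        raise ValueError("n/r must not exceed k")
    if per_rack > n - k:
        raise ValueError("n/r must not exceed n - k (rack-failure tolerance)")
    return {h: [(h, i) for i in range(1, per_rack + 1)] for h in range(1, r + 1)}

layout = place_segments(n=6, k=3, r=3)
print(layout[1])  # [(1, 1), (1, 2)]
```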
In step S201, the data repair apparatus generates a data repair instruction based on a node data read failure. When the data on a data node of a rack becomes unavailable, a corresponding node data read failure is raised, and the data repair apparatus then generates a corresponding data repair instruction. The data repair instruction is an instruction indicating that a read failure has occurred in data stored in the data storage server. Subsequently, the process goes to step S202.
In step S202, the data repair apparatus obtains the data repair instruction, after which the node corresponding to the data that failed to be read needs to be rebuilt. Therefore, in this step, the data repair apparatus creates, according to the data repair instruction, a local new node for repairing the lost data segment in the local rack where the data read failure occurred. For example, when node $X_{1,1}$ fails, a new node $X'_{1,1}$ is created in rack $R_1$. Subsequently, the process goes to step S203.
In step S203, since the lost data segment is redundantly stored in both the local rack and the relay racks, the data repair apparatus collects, according to the data repair instruction, local repair data from the encoded data segments of the other nodes of the local rack; the local repair data is related to the data node in which the data read failure occurred. That is, node $X'_{1,1}$ collects local repair data from the surviving nodes $X_{1,2}, \ldots, X_{1,n/r}$. Subsequently, the process goes to step S204.
In step S204, the data repair apparatus determines a data relay node in each relay rack according to the data repair instruction, and then collects, through the data relay node of each relay rack, relay repair data from the encoded data segments of the nodes of that relay rack; the relay repair data is related to the data node in which the data read failure occurred. For example, each relay rack uses node $X_{h,1}$ to collect relay repair data from the other nodes in the same rack. Subsequently, the process goes to step S205.
In step S205, the data repair apparatus performs a repair operation on the missing data segment in the local new node created in step S202, that is, creates a local new node including the missing data segment, based on the local repair data acquired in step S203 and the relay repair data acquired in step S204.
This completes the data repair process of the data repair method of the present preferred embodiment.
Preferably, in order to further reduce the cross-rack data transmission bandwidth, every relay rack is set to collect the same amount of relay repair data, namely:
$$\beta = \frac{M}{k\left(r - \lfloor kr/n \rfloor\right)}$$
where β is the amount of relay repair data collected by each relay rack.
In this way, the node data after the repair operation retains the maximum distance separable (MDS) property while cross-rack data transmission is minimized.
Fig. 3 below illustrates the process of acquiring the minimum value of β. Referring to fig. 3, fig. 3 is an information flow diagram of data repair according to a second preferred embodiment of the data repair method of the present invention.
It is assumed here that the data volume of the relay repair data transmitted to the local chassis by each relay chassis is the same, so that the cross-chassis repair bandwidth for repairing one node is (r-1) β.
Here, the information flow G (n, k, r, β) is established using parameters n and k of the MDS erasure correction code, the number of racks r, and the single-relay-rack repair bandwidth β, that is, the data amount of the relay repair data collected by each relay rack as parameters, where n is 6, k is 3, and r is 3.
G has three types of nodes: the virtual data source S, the data collection source T, and the input-output nodes [Figure BDA0001010816470000072].
G has six types of directed edges:
one, edges with infinite capacity from the virtual data source S to each input node [Figure BDA0001010816470000081];
two, edges with capacity M/k from the input node [Figure BDA0001010816470000082] to the output node [Figure BDA0001010816470000083];
three, edges with capacity M/k from the input node [Figure BDA0001010816470000084] to the output node [Figure BDA0001010816470000085], that is, the maximum repair bandwidth (maximum relay repair data) of two data relay nodes in the same relay rack;
four, edges with capacity M/k from the output node [Figure BDA0001010816470000086] to the input node [Figure BDA0001010816470000087], that is, the maximum repair bandwidth between a surviving node and the new node in the local rack;
five, edges with capacity β from the output node [Figure BDA0001010816470000088] to the input node [Figure BDA0001010816470000089], that is, the cross-rack repair bandwidth;
six, edges with infinite capacity from the k selected output nodes [Figure BDA00010108164700000810] to the data collection source T, used for data transmission.
The minimum value of β is obtained from the capacity of the minimum cut of the information flow graph G. Here a cut is a set of directed edges such that every path from S to T contains at least one edge of the cut, and the minimum cut [Figure BDA00010108164700000811] is the cut with the smallest total capacity.
If the capacities of all possible minimum cuts [Figure BDA00010108164700000812] between S and T in the information flow graph G are not smaller than M, then random linear network coding is sufficient to reconstruct the data when any k of the n nodes are connected to T.
If the capacity of every possible minimum cut [Figure BDA00010108164700000813] in the information flow graph G is at least M, then
$$\beta \ge \frac{M}{k\left(r - \lfloor kr/n \rfloor\right)}$$
The specific reasoning process is as follows:
Let the following k nodes be connected to the data collection source T: the new node $X'_{1,1}$ in rack $R_1$; x nodes of rack $R_1$ other than the failed node $X_{1,1}$; y relay nodes; $w_1$ nodes inside relay racks whose relay node is not connected to the data collection source T; and $w_2$ nodes inside relay racks whose relay node is connected to the data collection source T. Here, it is defined that:
$$1 + x + y + w_1 + w_2 = k \quad (1)$$
Let $\lambda(x, y, w_1, w_2)$ denote the capacity of a cut as a function of x, y, $w_1$ and $w_2$.
The contributions to the capacity λ are as follows: all non-failed nodes of the failed rack contribute $(n/r-1) \cdot M/k$; the relay racks whose relay node is connected to the data collector T contribute $y \cdot n/r \cdot M/k$; the relay racks whose relay node is not connected to the data collector T contribute $(r-1-y) \cdot \beta$; the $w_1$ nodes inside relay racks whose relay node is not connected to the data collector T contribute $w_1 \cdot M/k$; and the $w_2$ nodes inside relay racks whose relay node is connected to the data collector T cannot contribute to λ. The capacity of the cut is therefore
$$\lambda = (n/r-1) \cdot \frac{M}{k} + y \cdot \frac{n}{r} \cdot \frac{M}{k} + (r-1-y) \cdot \beta + w_1 \cdot \frac{M}{k} \quad (2)$$
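Since the capacity λ is greater than or equal to M, it can be derived from equation (2) that:
$$\beta \ge \frac{M - (n/r-1) \cdot \frac{M}{k} - y \cdot \frac{n}{r} \cdot \frac{M}{k} - w_1 \cdot \frac{M}{k}}{r-1-y} \quad (3)$$
The right-hand side, written $\beta'(y, w_1)$, can be simplified as:
$$\beta'(y, w_1) = \frac{M \left(k + 1 - n/r - y \cdot n/r - w_1\right)}{k \left(r-1-y\right)} \quad (4)$$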
We now find $\max\{\beta'(y, w_1)\}$. At most n/r-1 nodes in rack $R_1$ (excluding the failed node $X_{1,1}$) can be connected to the data collector T, thus
$$x \le n/r - 1 \quad (5)$$
Likewise, in each relay rack whose relay node is connected to the data collector T, at most n/r-1 further nodes are connected to the data collector T, and therefore
$$w_2 \le (n/r-1) \cdot y \quad (6)$$
By equation (1), larger values of x and $w_2$ leave a smaller range of values for y and $w_1$; by equation (3), $\beta'(y, w_1)$ attains its maximum when x and $w_2$ take their maximum values, namely $x = n/r-1$ and $w_2 = (n/r-1) \cdot y$. Equation (1) then reduces to:
$$w_1 = k - n/r - y \cdot n/r \quad (7)$$
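Since $w_1$ is greater than or equal to 0, equation (7) and the fact that y is an integer give:
$$y \le \lfloor kr/n \rfloor - 1 \quad (8)$$
Substituting equation (7) into the lower bound on β that follows from equation (2) and λ ≥ M leaves $M / (k(r-1-y))$, which is largest at the largest admissible y. It can thereby be derived that
$$\max\{\beta'(y, w_1)\} = \frac{M}{k\left(r - \lfloor kr/n \rfloor\right)}$$
Namely,
$$\beta \ge \frac{M}{k\left(r - \lfloor kr/n \rfloor\right)} \quad (9)$$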
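The bound can be sanity-checked numerically. The sketch below is not part of the patent: it normalizes M to 1, rearranges equation (2) with λ ≥ M into a lower bound β'(y, w1) = (k + 1 - n/r - y·n/r - w1) / (k(r - 1 - y)), and searches every (x, y, w1, w2) allowed by equations (1), (5) and (6). The closed form M / (k(r - ⌊kr/n⌋)) that it is compared against is my reading of the derivation, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def beta_prime(n, k, r, y, w1):
    """Lower bound on beta (in units of M) implied by lambda >= M for given y, w1."""
    per_rack = Fraction(n, r)
    return (k + 1 - per_rack - y * per_rack - w1) / (k * (r - 1 - y))

def max_beta_prime(n, k, r):
    """Search all (x, y, w1, w2) allowed by equations (1), (5) and (6)."""
    per_rack = n // r
    best = Fraction(0)
    for x, y in product(range(per_rack), range(r - 1)):  # x <= n/r - 1, y <= r - 2
        for w2 in range((per_rack - 1) * y + 1):         # w2 <= (n/r - 1) * y
            w1 = k - 1 - x - y - w2                      # equation (1)
            if w1 >= 0:
                best = max(best, beta_prime(n, k, r, y, w1))
    return best

def closed_form(n, k, r):
    """Assumed closed form: beta = M / (k * (r - floor(k*r/n))), with M = 1."""
    return Fraction(1, k * (r - (k * r) // n))

for params in [(6, 3, 3), (9, 6, 3), (12, 8, 4)]:
    assert max_beta_prime(*params) == closed_form(*params)
print(max_beta_prime(6, 3, 3))  # 1/6
```

For n = 6, k = 3, r = 3 this gives β = M/6 per relay rack, so the cross-rack repair bandwidth (r-1)β is M/3.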
We now prove that, after node data is repaired by the data repair method of this preferred embodiment, the repaired node data still has the maximum distance separable (MDS) property.
Divide the original data of size M into q × k blocks, where
$$q = r - \lfloor kr/n \rfloor$$
Each node stores a coded segment consisting of q coded blocks, the size of each coded block being the minimum value of β, namely
$$\beta = \frac{M}{qk} = \frac{M}{k\left(r - \lfloor kr/n \rfloor\right)}$$
For each node $X_{h,i}$ ($1 \le h \le r$, $1 \le i \le n/r$), the j-th block ($1 \le j \le q$) of this node is a linear combination of the q × k original blocks over a finite field F. Let $p_{h,i,j}$ denote the column vector of the coefficients of this linear combination, and let $P_{h,i}$ denote the qk × q matrix formed by $\{p_{h,i,j}\}_{1 \le j \le q}$.
The data repair process of the preferred embodiment is as follows. Suppose $X_{1,1}$ fails; one coded segment $P'_{1,1}$ is then reconstructed on the new node $X'_{1,1}$. First, each relay node $X_{h,1}$ ($2 \le h \le r$) computes one new coded block $p'_h$ from all the coded blocks of rack $R_h$:
$$p'_h = [P_{h,1}; P_{h,2}; \ldots; P_{h,n/r}] \cdot c_h \quad (10)$$
where $c_h$ is a coefficient vector of size qn/r.
Then $X'_{1,1}$ computes the new coded segment $P'_{1,1}$ from all non-failed nodes of rack $R_1$ and the new coded blocks $p'_h$ of the other racks:
$$P'_{1,1} = [P_{1,2}; \ldots; P_{1,n/r}; p'_2; \ldots; p'_r] \cdot D \quad (11)$$
where D is a coefficient matrix of size (q(n/r-1) + r-1) × q.
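The dimensions in equations (10) and (11) can be checked with a small sketch over a prime field. Everything here is an assumption made for illustration: the field GF(257), the random coefficients (which are not guaranteed to give an MDS code), the helper names, and the choice q = r - ⌊kr/n⌋. The bracketed lists in the equations are read as column-wise concatenation, which is the reading under which the stated sizes of $c_h$ and D match.

```python
import random

P_MOD = 257  # small prime field, chosen only for illustration

def hstack(mats):
    """Concatenate matrices (lists of rows) column-wise."""
    return [sum((m[row] for m in mats), []) for row in range(len(mats[0]))]

def matmul(a, b):
    """Matrix product modulo P_MOD."""
    return [[sum(x * y for x, y in zip(row, col)) % P_MOD for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    return [[rng.randrange(P_MOD) for _ in range(cols)] for _ in range(rows)]

n, k, r = 6, 3, 3
per_rack = n // r
q = r - (k * r) // n  # assumption: q chosen so that each block has size M/(qk)
rng = random.Random(0)
# P[h][i] is the qk x q coefficient matrix of node X_{h,i} (0-based indices here).
P = [[rand_matrix(q * k, q, rng) for _ in range(per_rack)] for _ in range(r)]

# Equation (10): each relay rack sends one column p'_h of length qk.
relay_cols = [matmul(hstack(P[h]), rand_matrix(q * per_rack, 1, rng)) for h in range(1, r)]
# Equation (11): the new node combines local survivors and relay columns with D.
inputs = hstack(P[0][1:] + relay_cols)
D = rand_matrix(q * (per_rack - 1) + r - 1, q, rng)
P_new = matmul(inputs, D)
print(len(inputs), len(inputs[0]), len(P_new), len(P_new[0]))  # 6 4 6 2
```

The rebuilt matrix has the same qk × q shape as the lost one, and only r-1 columns (one per relay rack) cross racks.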
In order to ensure that the node data after the repair operation still has the MDS property, the qk vectors of any k of the n nodes must span a space of full rank.
Let U be any set of k-1 of the matrices $P_{h,i}$ (excluding $P_{1,1}$). Suppose $\{P_{h,i}\}_{1 \le h \le r,\, 1 \le i \le n/r}$ has the MDS property, and [Figure BDA0001010816470000105]
A full-rank set of qk vectors can then be constructed according to the following conditions:
one, select q vectors from each of the k-1 coded segments in U;
two, there exist q racks other than $R_1$, each of which has at least one coded segment not in U, so that one vector is selected from each of these q coded segments.
Assume that the set $\{P_{h,i}\}_{1 \le h \le r,\, 1 \le i \le n/r}$ satisfies the MDS property. Then there exists an assignment of $c_h$ ($2 \le h \le r$) and D such that, after one single-node repair, the new set $\{P'_{1,1}, P_{1,2}, \ldots, P_{1,n/r}, \ldots, P_{r,1}, \ldots, P_{r,n/r}\}$ still has the MDS property; that is, for any possible U, the qk vectors in $\{P'_{1,1}\} \cup U$ remain full rank.
If [Figure BDA0001010816470000111]
then, based on equation (10) and equation (11), $c_h$ and D can be adjusted so that $P'_{1,1}$ consists of q vectors from q different racks other than $R_1$, and thus $\{P'_{1,1}\} \cup U$ is full rank.
If [Figure BDA0001010816470000112]
that is, there is one $P_{1,i'}$ that does not belong to U, then, based on equation (11), D can be adjusted so that $P'_{1,1}$ consists of the q vectors of $P_{1,i'}$, and thus $\{P'_{1,1}\} \cup U$ is full rank.
Therefore, after node data is repaired by the data repair method of this preferred embodiment, the repaired node data still has the maximum distance separable (MDS) property.
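The full-rank condition used in this argument can be tested mechanically. The following is a generic sketch, not the patent's procedure: Gaussian elimination over a prime field GF(p), with hypothetical function names. It could be applied to the qk coefficient vectors of any k nodes to check whether they are linearly independent.

```python
def rank_mod_p(rows, p):
    """Rank of a matrix (list of row lists) over GF(p), by Gaussian elimination."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                factor = m[i][col]
                m[i] = [(a - factor * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def is_full_rank(vectors, p):
    """True when the given vectors are linearly independent over GF(p)."""
    return rank_mod_p(vectors, p) == len(vectors)

print(is_full_rank([[1, 0], [0, 1]], 7), is_full_rank([[1, 2], [2, 4]], 7))  # True False
```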
On the basis of the first preferred embodiment, the data recovery method of the present preferred embodiment can further reduce the data transmission bandwidth across the racks by establishing a data storage architecture and setting the minimum value of the relay recovery data collected by the relay racks, thereby further improving the recovery efficiency of the data recovery operation.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a data recovery apparatus according to a first preferred embodiment of the present invention. The data repair apparatus of the present preferred embodiment may be implemented using the first preferred embodiment of the data repair method described above, and the data repair apparatus 40 includes a local new node creation module 41, a local repair data collection module 42, a relay repair data collection module 43, and a repair module 44.
The local new node creating module 41 is configured to obtain a data repair instruction, and create a local new node for repairing a lost data segment in the local rack according to the data repair instruction; the local repair data collection module 42 is configured to collect local repair data in other nodes of the local rack according to the data repair instruction; the relay repair data collection module 43 is configured to determine a data relay node in the relay rack according to the data repair instruction, and collect relay repair data in a node of the corresponding relay rack through the data relay node; the repair module 44 is configured to perform a repair operation on the lost data segment at the local new node according to the local repair data and the relay repair data.
When the data repair apparatus 40 of the preferred embodiment is used, the local new node creation module 41 first obtains a data repair instruction. The data repair instruction is an instruction indicating that a read failure has occurred in data stored in the data storage server, in which case the node corresponding to the data that failed to be read needs to be rebuilt. Therefore, the local new node creation module 41 creates, according to the data repair instruction, a local new node for repairing the lost data segment in the local rack where the data read failure occurred.
Subsequently, since the lost data segment is redundantly stored in both the local rack and the relay racks, the local repair data collection module 42 collects, according to the data repair instruction, local repair data from the encoded data segments of the other nodes of the local rack; the local repair data is related to the data node in which the data read failure occurred.
The relay repair data collection module 43 then determines a data relay node in each relay rack according to the data repair instruction, and collects, through the data relay node of each relay rack, relay repair data from the encoded data segments of the nodes of that relay rack; the relay repair data is related to the data node in which the data read failure occurred.
Finally, the repair module 44 performs a repair operation on the missing data segment in the local new node created by the local new node creation module 41 according to the local repair data acquired by the local repair data collection module 42 and the relay repair data acquired by the relay repair data collection module 43, that is, creates a local new node including the missing data segment.
This completes the data repair process of the data repair apparatus 40 of the present preferred embodiment.
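The flow through modules 41 to 44 can be illustrated with a minimal sketch. It assumes a single XOR parity segment as a stand-in for the erasure code, which the patent does not prescribe, and the names `xor_bytes`, `relay_aggregate`, and `repair` are hypothetical. The sketch only shows how a relay node can combine the segments of its own rack so that one segment-sized block per relay rack crosses racks.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length segments."""
    return bytes(x ^ y for x, y in zip(a, b))


def relay_aggregate(rack_segments: list[bytes]) -> bytes:
    """Relay node: combine all segments of its own rack into one block."""
    return reduce(xor_bytes, rack_segments)


def repair(local_segments: list[bytes], relay_blocks: list[bytes]) -> bytes:
    """New node: combine local repair data with one block per relay rack."""
    return reduce(xor_bytes, local_segments + relay_blocks)


# Assumed toy layout: 3 racks x 2 nodes; the last segment is the XOR parity.
data = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xf0\x0f", b"\x33\x44"]
parity = reduce(xor_bytes, data)
racks = [data[0:2], data[2:4], [data[4], parity]]

lost = racks[0][0]                                  # node X_{1,1} fails
local = racks[0][1:]                                # survivors in the local rack
relays = [relay_aggregate(r) for r in racks[1:]]    # one block per relay rack
rebuilt = repair(local, relays)
cross_rack_blocks = len(relays)                     # 2 blocks cross racks, not 4 segments
```

With this toy code the new node receives two blocks from other racks instead of four raw segments; an MDS code as described in the second embodiment would use different combining coefficients.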
The data repair apparatus of this preferred embodiment combines in-rack data repair with cross-rack data repair, which improves the efficiency of the data repair operation.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a data repair apparatus according to a second preferred embodiment of the present invention. The data repair apparatus of this preferred embodiment may be implemented using the second preferred embodiment of the data repair method described above. The data repair apparatus 50 includes a local new node creation module 51, a local repair data collection module 52, a relay repair data collection module 53, a repair module 54, an encoded data segment generation module 55, a node setting module 56, and an instruction generation module 57.
The local new node creation module 51 is configured to obtain a data repair instruction and, according to the data repair instruction, create in the local rack a local new node for repairing a lost data segment. The local repair data collection module 52 is configured to collect local repair data from the other nodes of the local rack according to the data repair instruction. The relay repair data collection module 53 is configured to determine a data relay node in each relay rack according to the data repair instruction, and to collect relay repair data from the nodes of the corresponding relay rack through that data relay node. The repair module 54 is configured to perform a repair operation on the lost data segment at the local new node according to the local repair data and the relay repair data. The encoded data segment generation module 55 is configured to encode and store the original data as n encoded data segments, each of size M/k, where M is the size of the original data. The node setting module 56 is configured to place the n encoded data segments on n nodes, where the n nodes are evenly distributed across r racks. The instruction generation module 57 is configured to generate the data repair instruction based on a node data read failure.
Before data repair, the data repair apparatus 50 first establishes a data storage architecture, as follows:
The encoded data segment generation module 55 uses an MDS erasure code with parameters (n, k) to encode the original data of size M into n encoded data segments, each of size M/k. The node setting module 56 then distributes the n encoded data segments to n nodes. The n nodes are evenly distributed across r racks, so each rack holds n/r nodes.
Here, n is set to be an integer multiple of r. The condition n/r ≤ k ensures that no single node failure can be repaired locally within its rack, because fewer than k segments survive there. The condition n/r ≤ n-k ensures that when any one rack fails, its nodes can be repaired from nodes on the other racks, because at least k segments remain on those racks. R_h denotes the h-th rack, and X_{h,i} denotes the i-th node of the h-th rack.
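The layout constraints above can be checked mechanically. The following is a minimal sketch; the function names `validate_layout` and `place_segments` and the sequential segment-to-node mapping are assumptions made for illustration, not details given in the patent.

```python
def validate_layout(n: int, k: int, r: int) -> None:
    """Raise ValueError if (n, k, r) violates the stated layout constraints."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    if n % r != 0:
        raise ValueError("n must be an integer multiple of r")
    per_rack = n // r
    if per_rack > k:
        raise ValueError("n/r must be <= k (no purely local repair)")
    if per_rack > n - k:
        raise ValueError("n/r must be <= n - k (tolerate one rack failure)")


def place_segments(n: int, r: int) -> dict[tuple[int, int], int]:
    """Map node X_{h,i} (1-based rack h, node i) to a 0-based segment index."""
    per_rack = n // r
    return {
        (h, i): (h - 1) * per_rack + (i - 1)
        for h in range(1, r + 1)
        for i in range(1, per_rack + 1)
    }


# Example parameters (assumed): n = 9 segments, k = 6, r = 3 racks.
validate_layout(9, 6, 3)          # passes: 3 <= 6 and 3 <= 3
layout = place_segments(9, 3)     # rack 2, node 1 holds segment index 3
```

Any other one-to-one assignment of segments to nodes would serve equally well, provided each rack ends up with exactly n/r nodes.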
The instruction generation module 57 then generates a data repair instruction based on a node data read failure. When the data on a data node of some rack fails, a corresponding node data read failure is raised, and the instruction generation module 57 generates the corresponding data repair instruction from it. The data repair instruction indicates that a read failure has occurred in data stored in the data storage server.
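As a rough illustration of this step, the sketch below turns a failed read into a repair instruction. The `RepairInstruction` type, the `read_or_instruct` function, and the use of a dictionary as the segment store are all hypothetical; the patent does not specify the form of the instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairInstruction:
    """Hypothetical instruction naming the node X_{h,i} whose data is unreadable."""
    rack: int  # h, 1-based rack index
    node: int  # i, 1-based node index within the rack


def read_or_instruct(store: dict, rack: int, node: int):
    """Return the stored segment, or a RepairInstruction if the read fails."""
    try:
        return store[(rack, node)]
    except KeyError:
        # A missing entry stands in for a node data read failure.
        return RepairInstruction(rack, node)


store = {(1, 2): b"seg-1-2", (2, 1): b"seg-2-1"}
ok = read_or_instruct(store, 1, 2)            # segment is readable
instruction = read_or_instruct(store, 1, 1)   # read fails, instruction produced
```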
The local new node creation module 51 then obtains the data repair instruction; at this point, the node corresponding to the unreadable data must be rebuilt. Accordingly, the local new node creation module 51 creates, in the local rack where the data read failure occurred, a local new node for repairing the lost data segment according to the data repair instruction. For example, when node X_{1,1} fails, a new node X'_{1,1} is created in rack R_1.
Next, because redundancy for the lost data segment is stored in both the local rack and the relay racks, the local repair data collection module 52 collects local repair data from the encoded data segments of the other nodes of the local rack according to the data repair instruction. This local repair data is related to the data node where the data read failure occurred. That is, node X'_{1,1} collects local repair data from the surviving nodes X_{1,2}, ..., X_{1,n/r}.
Then, the relay repair data collection module 53 determines a data relay node in each relay rack according to the data repair instruction, and collects relay repair data from the encoded data segments of the nodes of each relay rack through that rack's data relay node. This relay repair data is likewise related to the data node where the data read failure occurred. For example, each relay rack R_h designates a node X_{h,1} to collect relay repair data from the other nodes in the same rack.
The repair module 54 then performs a repair operation on the lost data segment at the local new node created by the local new node creation module 51, according to the local repair data acquired by the local repair data collection module 52 and the relay repair data acquired by the relay repair data collection module 53. The result is a local new node that contains the lost data segment.
This completes the data repair process of the data repair apparatus 50 of the present preferred embodiment.
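The roles assigned in the steps above can be summarised as a repair plan. The sketch below is illustrative only: `RepairPlan` and `plan_repair` are hypothetical names, and choosing node 1 of each relay rack as its data relay node simply follows the X_{h,1} example in the text.

```python
from dataclasses import dataclass

Node = tuple[int, int]  # (rack h, node i), both 1-based


@dataclass
class RepairPlan:
    new_node: Node                       # rebuilt in place of the failed node
    local_helpers: list[Node]            # survivors in the local rack
    relays: dict[int, Node]              # relay rack -> its data relay node
    relay_helpers: dict[int, list[Node]] # relay rack -> nodes feeding the relay


def plan_repair(failed: Node, n: int, r: int) -> RepairPlan:
    """Assign repair roles for one failed node in an n-node, r-rack layout."""
    h, i = failed
    per_rack = n // r
    local = [(h, j) for j in range(1, per_rack + 1) if j != i]
    relays = {g: (g, 1) for g in range(1, r + 1) if g != h}
    helpers = {g: [(g, j) for j in range(2, per_rack + 1)] for g in relays}
    return RepairPlan(failed, local, relays, helpers)


# Example (assumed parameters): node X_{1,1} fails with n = 9 and r = 3.
plan = plan_repair((1, 1), n=9, r=3)
```

Here racks 2 and 3 act as relay racks, and only their relay nodes send data to the new node in rack 1.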
Preferably, to further reduce the cross-rack data transmission bandwidth, the amount of relay repair data collected by each relay rack is set to be the same, and is given by:
Figure BDA0001010816470000141
where β is the amount of relay repair data collected by each relay rack.
In this way, the node data after the repair operation retains the maximum distance separable (MDS) code property while the amount of data transmitted across racks is minimized.
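The claims state that the cross-rack repair read bandwidth for repairing one node is (r-1)β. A minimal sketch of that arithmetic follows. β is passed in as a parameter because its closed form appears only in a figure that is not reproduced in this text. The baseline function is an assumption added for comparison (a new node that reads k whole segments and takes as many as possible from its own rack); it is not taken from the patent.

```python
from fractions import Fraction


def relay_cross_rack_bandwidth(r: int, beta: Fraction) -> Fraction:
    """Cross-rack read bandwidth when each of the r-1 relay racks sends beta."""
    return (r - 1) * beta


def baseline_cross_rack_bandwidth(n: int, k: int, r: int, M: int) -> Fraction:
    """Assumed baseline: read k whole segments, preferring the local rack."""
    per_rack = n // r
    remote_segments = k - (per_rack - 1)   # segments that must come from other racks
    return remote_segments * Fraction(M, k)


# Example (assumed): n = 9, k = 6, r = 3, M = 6 units, so one segment is 1 unit.
# beta = 1 unit per relay rack is an illustrative value, not the patent's optimum.
baseline = baseline_cross_rack_bandwidth(9, 6, 3, 6)
relayed = relay_cross_rack_bandwidth(3, Fraction(1))
```

Under these assumed numbers the baseline moves 4 units across racks and the relay scheme moves 2; a smaller β reduces the relay figure proportionally.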
Building on the first preferred embodiment, the data repair apparatus of this preferred embodiment further reduces the cross-rack data transmission bandwidth by establishing the data storage architecture and by setting the relay repair data collected by each relay rack to its minimum amount, thereby further improving the efficiency of the data repair operation.
The data repair method and data repair apparatus of the present invention combine in-rack data repair with cross-rack data repair, thereby improving the efficiency of the data repair operation. They solve the technical problem that existing data repair methods and apparatuses have low repair efficiency because they require a large cross-rack data transmission bandwidth.
As used herein, the terms "component," "module," "system," "interface," "process," and the like are generally intended to refer to a computer-related entity: hardware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a controller and the controller can be a component. One or more components can reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
Furthermore, the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter. The term "article of manufacture" as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media. Of course, those skilled in the art will recognize many modifications may be made to this configuration without departing from the scope or spirit of the claimed subject matter.
FIG. 6 and the following discussion provide a brief, general description of an operating environment of an electronic device in which the data recovery apparatus of the present invention may be implemented. The operating environment of FIG. 6 is only one example of a suitable operating environment and is not intended to suggest any limitation as to the scope of use or functionality of the operating environment. Example electronic devices 612 include, but are not limited to, personal computers, server computers, hand-held or laptop devices, mobile devices (such as mobile phones, Personal Digital Assistants (PDAs), media players, and the like), multiprocessor systems, consumer electronics, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Although not required, embodiments are described in the general context of "computer readable instructions" being executed by one or more electronic devices. Computer readable instructions may be distributed via computer readable media (discussed below). Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the computer readable instructions may be combined or distributed as desired in various environments.
FIG. 6 illustrates an example of an electronic device 612 that includes one or more embodiments of the data recovery apparatus of the present invention. In one configuration, electronic device 612 includes at least one processing unit 616 and memory 618. Depending on the exact configuration and type of electronic device, memory 618 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two. This configuration is illustrated in fig. 6 by dashed line 614.
In other embodiments, the electronic device 612 may include additional features and/or functionality. For example, device 612 may also include additional storage (e.g., removable and/or non-removable) including, but not limited to, magnetic storage, optical storage, and the like. Such additional storage is illustrated in fig. 6 by storage 620. In one embodiment, computer readable instructions to implement one or more embodiments provided herein may be in storage 620. Storage 620 may also store other computer readable instructions to implement an operating system, an application program, and the like. Computer readable instructions may be loaded in memory 618 for execution by processing unit 616, for example.
The term "computer readable media" as used herein includes computer storage media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions or other data. Memory 618 and storage 620 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by electronic device 612. Any such computer storage media may be part of electronic device 612.
The electronic device 612 may also include a communication connection 626 that allows the electronic device 612 to communicate with other devices. Communication connection 626 may include, but is not limited to, a modem, a Network Interface Card (NIC), an integrated network interface, a radio frequency transmitter/receiver, an infrared port, a USB connection, or other interfaces for connecting electronic device 612 to other electronic devices. Communication connection 626 may include a wired connection or a wireless connection. Communication connection 626 may transmit and/or receive communication media.
The term "computer readable media" may include communication media. Communication media typically embodies computer readable instructions or other data in a "modulated data signal" such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" may include signals that: one or more of the signal characteristics may be set or changed in such a manner as to encode information in the signal.
The electronic device 612 may include input device(s) 624 such as keyboard, mouse, pen, voice input device, touch input device, infrared cameras, video input devices, and/or any other input device. Output device(s) 622 such as one or more displays, speakers, printers, and/or any other output device may also be included in device 612. The input device 624 and the output device 622 may be connected to the electronic device 612 via a wired connection, a wireless connection, or any combination thereof. In one embodiment, an input device or an output device from another electronic device may be used as input device 624 or output device 622 for electronic device 612.
The components of the electronic device 612 may be connected by various interconnects, such as a bus. Such interconnects may include Peripheral Component Interconnect (PCI), such as PCI express, Universal Serial Bus (USB), firewire (IEEE1394), optical bus structures, and the like. In another embodiment, components of the electronic device 612 may be interconnected by a network. For example, memory 618 may be comprised of multiple physical memory units located in different physical locations interconnected by a network.
Those skilled in the art will realize that storage devices utilized to store computer readable instructions may be distributed across a network. For example, an electronic device 630 accessible via network 628 may store computer readable instructions to implement one or more embodiments provided by the present invention. The electronic device 612 may access the electronic device 630 and download a part or all of the computer readable instructions for execution. Alternatively, electronic device 612 may download pieces of the computer readable instructions, as needed, or some instructions may be executed at electronic device 612 and some at electronic device 630.
Various operations of embodiments are provided herein. In one embodiment, the one or more operations may constitute computer readable instructions stored on one or more computer readable media, which when executed by an electronic device, will cause the electronic device to perform the operations. The order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Those skilled in the art will appreciate alternative orderings having the benefit of this description. Moreover, it should be understood that not all operations are necessarily present in each embodiment provided herein.
Also, as used herein, the word "preferred" is intended to mean serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word "preferred" is intended to present concepts in a concrete fashion. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from context, "X employs A or B" is intended to mean any of the natural inclusive permutations. That is, if X employs A; X employs B; or X employs both A and B, then "X employs A or B" is satisfied under any of the foregoing instances.
Also, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The present disclosure includes all such modifications and alterations, and is limited only by the scope of the appended claims. In particular regard to the various functions performed by the above described components (e.g., elements, resources, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated exemplary implementations of the disclosure. In addition, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Furthermore, to the extent that the terms "includes," "has," "contains," or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising."
Each functional unit in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium. The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Each apparatus or system described above may perform the method in the corresponding method embodiment.
In summary, although the present invention has been described with reference to the preferred embodiments, the above-described preferred embodiments are not intended to limit the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention, therefore, the scope of the present invention shall be determined by the appended claims.

Claims (7)

1. A method of data repair, comprising:
acquiring a data repair instruction, and creating a local new node for repairing a lost data segment in a local rack according to the data repair instruction;
according to the data repair instruction, collecting local repair data in the coded data segments of other nodes of the local rack;
determining a data relay node in a relay rack according to the data repair instruction, and collecting relay repair data in the coding data segment of the node of the corresponding relay rack through the data relay node; and
according to the local repair data and the relay repair data, repairing the lost data segment at the local new node;
the method comprises the steps of encoding and saving original data into n encoded data segments with the size of M/k, wherein M is the size of the original data;
placing the n encoded data segments onto n nodes, wherein the n nodes are evenly distributed across r racks, and n is an integer multiple of r;
the data volume of the relay repair data collected by each relay rack is the same, and the data volume of the relay repair data collected by each relay rack is as follows:
Figure FDA0002631952670000011
wherein β is the data volume of the relay repair data collected by each relay rack, so that the node data after the repair operation retains the maximum distance separable (MDS) code property while the data transmitted across racks is minimized, and the cross-rack repair read bandwidth for repairing one node is (r-1)β;
n/r is less than or equal to k, so that the failure of any node cannot be repaired locally within its rack; and n/r is less than or equal to n-k, so that when any rack fails, it can be repaired by nodes on the other racks;
establishing an information flow G (n, k, r, beta) by taking the data volume of the relay repair data collected by each relay rack as a parameter;
wherein G has three types of nodes: a virtual data source S, a data collection source T, and input-output nodes [Figure FDA0002631952670000021];
G has six types of directed edges:
an edge with infinite capacity from the virtual data source S to each input node [Figure FDA0002631952670000022];
an edge with capacity M/k from the input node [Figure FDA0002631952670000023] to the output node [Figure FDA0002631952670000024];
an edge with capacity M/k from the input node [Figure FDA0002631952670000025] to the output node [Figure FDA0002631952670000026];
an edge with capacity M/k from the output node [Figure FDA0002631952670000027] to the input node [Figure FDA0002631952670000028];
an edge from the output node [Figure FDA0002631952670000029] to the input node [Figure FDA00026319526700000210]; and
an edge with unlimited capacity, used for data transmission, from the k selected output nodes [Figure FDA00026319526700000211] to the data collection source T.
2. The data repair method of claim 1, further comprising:
and generating the data repair instruction based on the node data reading fault.
3. The data repair method of claim 1, wherein the node data after the repair operation has the maximum distance separable (MDS) code property.
4. A data recovery apparatus, comprising:
the local new node creating module is used for acquiring a data repairing instruction and creating a local new node for repairing a lost data segment in a local rack according to the data repairing instruction;
a local repair data collection module, configured to collect local repair data in other nodes of the local rack according to the data repair instruction;
the relay repair data collection module is used for determining a data relay node in a relay rack according to the data repair instruction and collecting relay repair data in the corresponding node of the relay rack through the data relay node;
a repair module, configured to perform a repair operation on the lost data segment at the local new node according to the local repair data and the relay repair data;
the coded data segment generating module is used for coding and storing original data into n coded data segments with the size of M/k, wherein M is the size of the original data; and
the node setting module is used for placing the n coded data segments onto n nodes, wherein the n nodes are evenly distributed across r racks, and n is an integer multiple of r;
the data volume of the relay repair data collected by each relay rack is the same, and the data volume of the relay repair data collected by each relay rack is as follows:
Figure FDA0002631952670000031
wherein β is the data volume of the relay repair data collected by each relay rack, so that the node data after the repair operation retains the maximum distance separable (MDS) code property while the data transmitted across racks is minimized, and the cross-rack repair read bandwidth for repairing one node is (r-1)β;
n/r is less than or equal to k, so that the failure of any node cannot be repaired locally within its rack; and n/r is less than or equal to n-k, so that when any rack fails, it can be repaired by nodes on the other racks;
establishing an information flow G (n, k, r, beta) by taking the data volume of the relay repair data collected by each relay rack as a parameter;
wherein G has three types of nodes: a virtual data source S, a data collection source T, and input-output nodes [Figure FDA0002631952670000032];
G has six types of directed edges:
an edge with infinite capacity from the virtual data source S to each input node [Figure FDA0002631952670000033];
an edge with capacity M/k from the input node [Figure FDA0002631952670000034] to the output node [Figure FDA0002631952670000035];
an edge with capacity M/k from the input node [Figure FDA0002631952670000036] to the output node [Figure FDA0002631952670000037];
an edge with capacity M/k from the output node [Figure FDA0002631952670000038] to the input node [Figure FDA0002631952670000039];
an edge from the output node [Figure FDA00026319526700000310] to the input node [Figure FDA00026319526700000311]; and
an edge with unlimited capacity, used for data transmission, from the k selected output nodes [Figure FDA00026319526700000312] to the data collection source T.
5. The data recovery apparatus of claim 4, further comprising:
and the instruction generating module is used for generating the data repairing instruction based on the node data reading fault.
6. The data recovery apparatus according to claim 4, wherein the node data after the recovery operation has the maximum distance separable (MDS) code property.
7. A storage medium having stored therein processor-executable instructions, the instructions being loaded by one or more processors to perform a data repair method according to any one of claims 1 to 3.
CN201610399396.0A 2016-06-06 2016-06-06 Data restoration method and data restoration device Active CN107463462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610399396.0A CN107463462B (en) 2016-06-06 2016-06-06 Data restoration method and data restoration device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610399396.0A CN107463462B (en) 2016-06-06 2016-06-06 Data restoration method and data restoration device

Publications (2)

Publication Number Publication Date
CN107463462A CN107463462A (en) 2017-12-12
CN107463462B true CN107463462B (en) 2020-10-13

Family

ID=60545746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610399396.0A Active CN107463462B (en) 2016-06-06 2016-06-06 Data restoration method and data restoration device

Country Status (1)

Country Link
CN (1) CN107463462B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111385200B (en) * 2020-03-04 2022-03-04 中国人民解放军国防科技大学 Control method and device for data block repair
CN112118604B (en) * 2020-07-27 2023-06-20 哈尔滨工业大学(深圳) Mobile storage system-oriented relay cooperation data restoration method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102270161B (en) * 2011-06-09 2013-03-20 华中科技大学 Methods for storing, reading and recovering erasure code-based multistage fault-tolerant data
CN102624866B (en) * 2012-01-13 2014-08-20 北京大学深圳研究生院 Data storage method, data storage device and distributed network storage system
US9647698B2 (en) * 2013-02-26 2017-05-09 Peking University Shenzhen Graduate School Method for encoding MSR (minimum-storage regenerating) codes and repairing storage nodes

Also Published As

Publication number Publication date
CN107463462A (en) 2017-12-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant