CN111625394B - Data recovery method, device and equipment based on erasure codes and storage medium - Google Patents

Data recovery method, device and equipment based on erasure codes and storage medium

Info

Publication number
CN111625394B
Authority
CN
China
Prior art keywords
matrix
coding
check
auxiliary
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010458910.XA
Other languages
Chinese (zh)
Other versions
CN111625394A (en)
Inventor
唐聃
何瑞
高燕
曾琼
耿微
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daoji Intelligence (Anhui) Information Technology Co.,Ltd.
Original Assignee
Chengdu University of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Information Technology
Priority to CN202010458910.XA
Publication of CN111625394A
Application granted
Publication of CN111625394B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Error Detection And Correction (AREA)

Abstract

The invention provides a data recovery method, device, equipment and storage medium based on erasure codes. First, original data is encoded to generate redundant bits, and the original data and the generated redundant data are stored; then, lost data is recovered from the original data and the generated redundant data, completing the data recovery. By optimizing the check matrix, the invention reduces the time spent on matrix operations during data recovery and reduces the number of steps and the complexity of those operations, thereby improving data recovery efficiency without degrading the recovery result.

Description

Data recovery method, device and equipment based on erasure codes and storage medium
Technical Field
The invention belongs to the technical field of erasure codes, and particularly relates to an erasure code-based data recovery method, device, equipment and storage medium.
Background
With the rapid development of the Internet, applications have become increasingly abundant, the number of users keeps growing, and data volume grows geometrically. Storing such massive data puts huge pressure on local storage, pushing storage systems to the edge of collapse; distributed storage is therefore adopted to reduce storage overhead and pressure.
Early distributed storage systems used the three-copy technique to store and recover data: when data is stored, three copies of the same original data are placed on different storage nodes, and when data is lost, it is recovered directly from a surviving copy. However, the traditional three-copy strategy has obvious defects: (1) fault tolerance for hard disk failures is low, and a hard disk failure immediately triggers data recovery at the cluster level; (2) hard disk utilization is low: with three copies, utilization can reach at most 33%, and once other factors are included, the overall hard disk utilization of the cluster is probably below 30%, which undoubtedly increases the cost of storage hard disks; (3) the write performance of the three-copy technique is very low; (4) when a single disk fails and remains damaged for a period of time, cluster-level data recovery starts immediately, and its timing cannot be controlled manually.
The erasure code technology involved in the present method solves these problems well. Erasure codes were first applied in the communication field, mainly to deal with the loss of some data during transmission: the signal data to be transmitted is segmented and then encoded to generate check bits, the check bits are transmitted together with the original data, and during recovery the lost data is decoded from the data and check bits that were not lost. Erasure coding has since also been applied to storage systems. The core principle of an erasure-code-based distributed storage system is that the original data is divided into several data blocks, redundant blocks are computed from them according to the chosen erasure code algorithm, and the blocks are then stored on different nodes. When a node fails, the lost data is recovered from the remaining data blocks and redundant blocks. This guarantees the reliability and security of the data. Erasure codes have clear advantages in distributed storage: (1) for the same amount of data, erasure coding occupies much less storage space, saving approximately half of the storage space compared with the traditional three-copy technique; (2) space utilization is high, more than twice that of the traditional three-copy technique; (3) the traditional three-copy technique tolerates the failure of at most two nodes, whereas erasure coding can tolerate the simultaneous failure of multiple nodes; (4) the cluster construction cost is low.
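The utilization comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the patent: the function names and the layout n = 4, m = 2 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def replication_utilization(copies: int) -> Fraction:
    """Fraction of raw capacity that holds unique data under k-way replication."""
    return Fraction(1, copies)

def erasure_utilization(n: int, m: int) -> Fraction:
    """Fraction of raw capacity that holds unique data with n data + m check blocks."""
    return Fraction(n, n + m)

three_copy = replication_utilization(3)   # 1/3, the "at most 33%" figure
ec_4_2 = erasure_utilization(4, 2)        # hypothetical layout: n = 4, m = 2

# Raw bytes that must be stored per byte of user data.
raw_three_copy = 1 / three_copy
raw_ec = 1 / ec_4_2

print(float(three_copy), float(ec_4_2), raw_three_copy, raw_ec)
```

With this hypothetical layout the raw storage needed is half that of three copies; layouts with proportionally fewer check blocks (for example n = 10, m = 4) push utilization above twice the three-copy figure.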
At present, the erasure codes applied to distributed storage systems mainly include RS codes, X codes, EVENODD codes and others. Take a common RS code, which is based on an a priori coding algorithm: in the encoding stage, n original data blocks are given, m check data blocks are generated from them, and finally all n + m blocks are stored together. In the decoding stage, the original data can be recovered from any n of the n + m data blocks, that is, as long as the number of lost data blocks is less than or equal to m. Encoding and decoding RS codes often involves matrix inversion together with multiplication in a finite field, so the implementation is complex. The amount of data that can be processed is limited, and matrix operations take a long time during data recovery, which greatly limits encoding and decoding efficiency; moreover, the redundant data grows with the data, which lowers encoding and decoding throughput.
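To make the "any n of n + m blocks" property and the cost of the matrix step concrete, here is a minimal sketch of an RS-style systematic code. It is an illustration under stated assumptions, not the patent's method: it works over the small prime field GF(257) for readability (production RS codes typically use GF(2^8)), it uses a Cauchy block for the check rows, and all function names are hypothetical.

```python
P = 257  # ASSUMPTION: small prime field for readability; real RS codes usually use GF(2^8)

def inv(a):
    """Multiplicative inverse modulo the prime P (Fermat's little theorem)."""
    return pow(a, P - 2, P)

def generator(n, m):
    """(n+m) x n generator matrix: identity on top, an m x n Cauchy block below."""
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    cauchy = [[inv((i - (m + j)) % P) for j in range(n)] for i in range(m)]
    return identity + cauchy

def encode(data, m):
    """Return the n data blocks followed by m check blocks."""
    G = generator(len(data), m)
    return [sum(g * d for g, d in zip(row, data)) % P for row in G]

def solve(A, b):
    """Solve A x = b modulo P by Gauss-Jordan elimination (the costly matrix step)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col])
        M[col], M[pivot] = M[pivot], M[col]
        scale = inv(M[col][col])
        M[col] = [v * scale % P for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                k = M[r][col]
                M[r] = [(v - k * w) % P for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

def decode(survivors, n, m):
    """survivors maps block index -> value; any n surviving blocks suffice."""
    G = generator(n, m)
    idx = sorted(survivors)[:n]
    return solve([G[i] for i in idx], [survivors[i] for i in idx])

data = [10, 20, 30, 40]                      # n = 4 symbols, each below P
blocks = encode(data, m=2)                   # 4 data blocks + 2 check blocks
survivors = {i: v for i, v in enumerate(blocks) if i not in (1, 3)}  # lose two blocks
recovered = decode(survivors, n=4, m=2)
```

The elimination in `solve` is the matrix inversion with finite-field multiplication that the paragraph above identifies as the dominant cost of recovery.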
Disclosure of Invention
In view of the above disadvantages in the prior art, the erasure code-based data recovery method, apparatus, device and storage medium provided by the present invention further improve the efficiency of storage coding.
To achieve the above purpose, the invention adopts the following technical scheme:
The scheme provides a data recovery method based on erasure codes, which comprises the following steps:
S1, encoding the original data based on erasure codes by using a generating matrix to generate redundant bits, and storing the original data and the generated redundant data;
S2, recovering the lost data by using the check matrix according to the original data and the generated redundant data, thereby completing the data recovery.
The invention has the beneficial effects that: the invention reduces the time spent on data recovery, particularly the time spent on matrix operation during data recovery, and reduces the steps and complexity of matrix operation by optimizing the check matrix, thereby improving the efficiency of data recovery.
Further, the encoding process in step S1 includes the following steps:
A1, representing the erasure-code-based coding matrix A to be optimized by its row vectors, $A = [a_1, a_2, a_3, \ldots, a_N]^T \in R^{M \times N}$, wherein $N \le M$, $a_n$ represents a certain row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A, M represents the total number of columns of the coding matrix A, and $R^{M \times N}$ represents an M × N matrix of positive real numbers;
A2, denoting the remaining N-1 rows of the coding matrix A as $A_n$, $A_n = [a_1, a_2, a_3, \ldots, a_{n-1}, a_{n+1}, \ldots, a_N]^T \in R^{M \times (N-1)}$, wherein $a_N$ represents the N-th column of $A_n$, and $R^{M \times (N-1)}$ represents an M × (N-1) matrix of positive real numbers;
A3, introducing a coding auxiliary permutation matrix $T_{n,N}$, performing auxiliary multiplication on the coding matrix A to obtain check bits, and storing the check bits;
A4, exchanging the n-th row and the N-th row of the coding matrix A by means of the coding auxiliary permutation matrix $T_{n,N}$;
A5, introducing a new auxiliary check matrix X, and calculating, from the new auxiliary check matrix X and the coding matrix A, the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000031
wherein $R^{M \times M}$ represents an M × M matrix of positive real numbers;
A6, from the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000041
calculating a full-rank matrix
Figure GDA0004065931240000042
wherein
Figure GDA0004065931240000043
denotes the conjugate transpose of $a_n$;
A7, from the full-rank matrix
Figure GDA0004065931240000044
calculating, by linear algebra, an operation equation relating the coding matrix A and its permutation matrix;
A8, obtaining, according to the operation equation, that the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000045
is, with respect to the augmented matrix of the coding matrix A
Figure GDA0004065931240000046
the orthogonal complement projection matrix on the span of its row space;
A9, setting the number of rows N = M, and decomposing the invertible coding matrix A to be optimized;
A10, introducing a new auxiliary check row vector $x_n$ according to the decomposed invertible coding matrix A to be optimized, wherein the auxiliary check row vector $x_n$ satisfies
Figure GDA0004065931240000047
and $R^{(M-1) \times 1}$ represents an (M-1) × 1 matrix of positive real numbers;
A11, calculating an auxiliary check row from the auxiliary check row vector $x_n$ by using the orthogonal complement projection matrix, wherein, for $\|x_n\| = 1$,
Figure GDA0004065931240000048
A12, calculating, from the auxiliary check row, the value of the determinant of the coding matrix A and its corresponding transpose matrix
Figure GDA0004065931240000049
wherein $A^T$ represents the transposed matrix of the coding matrix A;
A13, calculating the corresponding log value from the value of the determinant of the coding matrix A and its corresponding transpose matrix, and optimizing the corresponding row or column of the check matrix according to the log value, thereby completing the erasure-code-based encoding of the original data.
The beneficial effects of the further scheme are as follows: the invention improves the operations of matrix inversion, multiplication and the like in the process of storing and coding by optimizing the matrix, thereby reducing the time spent on the matrix operation and improving the data recovery efficiency.
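The row exchange performed by a permutation matrix in steps A3 and A4 can be sketched as follows. This is an illustrative assumption about how such a matrix acts, not the patent's implementation; the helper names and the 3 × 2 example matrix are hypothetical, and indices are 0-based.

```python
def permutation_matrix(size, n, N):
    """Identity matrix with rows n and N exchanged (0-based indices)."""
    T = [[int(i == j) for j in range(size)] for i in range(size)]
    T[n], T[N] = T[N], T[n]
    return T

def matmul(X, Y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2], [3, 4], [5, 6]]        # hypothetical coding matrix with 3 rows
T = permutation_matrix(3, 0, 2)     # exchange the first and the last row
swapped = matmul(T, A)              # left-multiplication swaps rows of A
print(swapped)
```

Because the exchange is its own inverse, multiplying by the same matrix a second time restores the original row order.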
Still further, the augmented matrix of the new auxiliary check matrix X in step A5
Figure GDA00040659312400000410
is expressed as follows:
Figure GDA00040659312400000411
wherein I represents the identity matrix,
Figure GDA0004065931240000051
is the augmented matrix of the coding matrix A, and
Figure GDA0004065931240000052
is the conjugate transpose of the following matrix:
Figure GDA0004065931240000053
The beneficial effect of the above further scheme is as follows: the check matrix is calculated through the augmented matrix
Figure GDA0004065931240000054
which prepares for subsequently determining the specific lost data blocks.
Still further, the operation equation in step A7 is expressed as follows:
Figure GDA0004065931240000055
wherein $\det[A \times A^T]$ represents the operation equation, A represents the coding matrix, $A^T$ represents the conjugate transpose of the coding matrix A, $T_{n,N}$ represents the coding auxiliary permutation matrix, P represents the specific value of the primitive variable obtained in the optimization process,
Figure GDA0004065931240000056
represents the conjugate transpose of $T_{n,N}$, $a_n$ represents a certain row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A,
Figure GDA0004065931240000057
represents the conjugate transpose of $a_n$,
Figure GDA0004065931240000058
represents the augmented matrix of the coding matrix A, and
Figure GDA0004065931240000059
is the conjugate transpose of the following matrix:
Figure GDA00040659312400000510
The beneficial effect of the above further scheme is as follows: by calculating the value of the determinant of the coding matrix A and its corresponding transpose matrix, the value of the square of the determinant is further obtained.
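The formula images are not reproduced on this page, so the patent's exact operation equation cannot be restated here. As background only, the following standard fact shows why exchanging rows with a permutation matrix leaves this determinant unchanged; it is offered as an assumption about the relevant property, not as the patent's expression.

```latex
% T is any permutation matrix, so T T^{T} = I and \det T = \pm 1.
\det\!\left[(TA)(TA)^{T}\right]
  = \det(T)\,\det\!\left(A A^{T}\right)\det\!\left(T^{T}\right)
  = (\det T)^{2}\,\det\!\left(A A^{T}\right)
  = \det\!\left(A A^{T}\right)
```

When A is square (the case N = M), $\det(AA^T) = (\det A)^2$, which matches the remark that the square of the determinant is obtained.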
Still further, the value of the determinant of the coding matrix A and its corresponding transpose matrix in step A12 is expressed as follows:
Figure GDA00040659312400000511
wherein
Figure GDA00040659312400000512
represents the value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure GDA00040659312400000513
represents the augmented matrix of the coding matrix A,
Figure GDA00040659312400000514
represents the conjugate transpose of
Figure GDA00040659312400000515
while
Figure GDA00040659312400000516
represents the transpose of a certain row $a_n$ of the coding matrix A, and $x_n$ represents the auxiliary check row vector.
The beneficial effect of the above further scheme is as follows: from the value of the determinant of the coding matrix A and its corresponding transpose matrix, the log value of the determinant is further obtained.
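Since the formula images are missing, the following is only a standard linear-algebra identity that connects an orthogonal complement projection to this kind of determinant; it is not claimed to be the patent's expression. The notation is an assumption: $A_n$ holds the rows of A other than the n-th one, the n-th row is written $a_n^H$ so that $a_n$ is a column vector, and $A_n$ is assumed to have full row rank.

```latex
% Block form of A A^H after moving the n-th row to the last position:
A A^{H} =
\begin{bmatrix}
  A_n A_n^{H} & A_n a_n \\
  a_n^{H} A_n^{H} & a_n^{H} a_n
\end{bmatrix}
% Schur complement of the upper-left block:
\det\!\left(A A^{H}\right)
  = \det\!\left(A_n A_n^{H}\right)\, a_n^{H} P^{\perp} a_n ,
\qquad
P^{\perp} = I - A_n^{H}\left(A_n A_n^{H}\right)^{-1} A_n
```

Here $P^{\perp}$ projects onto the orthogonal complement of the row space of $A_n$, so the determinant factors into the determinant without the row times the squared length of the part of the row that the other rows cannot explain.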
Still further, the log value of the determinant of the coding matrix A and its corresponding transpose matrix in step A13 is expressed as follows:
Figure GDA0004065931240000061
wherein
Figure GDA0004065931240000062
represents the log value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure GDA0004065931240000063
represents the conjugate transpose of a certain row $a_n$ of the coding matrix A, $x_n$ represents the auxiliary check row vector, and delta represents the operator.
The beneficial effect of the above further scheme is as follows: the condition of the scan loop during data recovery is set according to the log value of the corresponding determinant.
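One way to see why a log value is convenient for row-wise optimization: the log of $\det(AA^T)$ splits into one additive term per row. The sketch below illustrates this with exact Gram-Schmidt residuals on real-valued rows (so the conjugate transpose reduces to the transpose). It is an assumption-laden illustration, not the patent's formula; the function names and the example rows are hypothetical.

```python
from fractions import Fraction
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def orthogonal_residuals(rows):
    """Gram-Schmidt without normalisation: each output vector is the part of a row
    orthogonal to all earlier rows (its orthogonal-complement projection)."""
    basis = []
    for row in rows:
        u = [Fraction(x) for x in row]
        for b in basis:
            k = dot(u, b) / dot(b, b)   # requires linearly independent rows
            u = [x - k * y for x, y in zip(u, b)]
        basis.append(u)
    return basis

def log_det_gram(rows):
    """log det(A A^T), computed as a sum of one log term per row."""
    return sum(math.log(dot(u, u)) for u in orthogonal_residuals(rows))

A_n = [[1, 0, 1], [0, 1, 1]]     # hypothetical rows already in the matrix
a_new = [1, 1, 0]                # hypothetical row being added

# Contribution of the new row alone to the log-determinant.
gain = log_det_gram(A_n + [a_new]) - log_det_gram(A_n)
print(gain)
```

Evaluating one row at a time in this way touches only that row's residual, which is consistent with the document's stated aim of processing a single row independently.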
Still further, step S2 comprises the following steps:
B1, according to the check bits, setting $F[y_n]$ as the source entropy of the n-th erasure-code-based check matrix to be optimized, with $F[y_n] = C_1$, wherein $C_1$ represents a constant;
B2, according to $F[y_n]$, scaling the erasure-code-based check matrix by using the primitive variable $P_1$;
B3, using the source entropy of the check matrix, taking the log value of the determinant of the coding matrix A and its corresponding transpose matrix as a regularization term, and calculating, from the generated redundant data, the specific block number $F[s]$ corresponding to the lost data;
B4, introducing a full-rank diagonal matrix D to assist the scaling of the erasure-code-based check matrix, obtaining a coding auxiliary matrix x;
B5, calculating the auxiliary check matrix X from the identity matrix I and the coding auxiliary matrix x, wherein $I = Xx$;
B6, setting the constant $C_2$ to the log value of the determinant of the coding matrix A;
B7, calculating the primitive number $P_2$ from the constants $C_1$ and $C_2$;
the primitive number $P_2$ is expressed as follows:
Figure GDA0004065931240000071
wherein $F[y_n]$ represents the source entropy of the n-th erasure-code-based check matrix to be optimized, $C_1$ and $C_2$ each represent a constant, M represents the total number of columns of the coding matrix A, m represents the m-th column of the coding matrix A, and $m = 1, 2, \ldots, M$;
B8, scaling the check matrix according to the primitive number $P_2$, and recovering the lost data according to the specific block number $F[s]$ corresponding to the lost data.
The beneficial effects of the further scheme are as follows: the invention simplifies the multiplication operation of the matrix by optimizing the row or the column corresponding to the check matrix, thereby improving the data recovery efficiency.
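As background for the scaling in steps B4 and B6, the following standard identity shows how scaling a square invertible matrix by a full-rank diagonal matrix shifts the log of its determinant by a constant. It is an illustrative assumption about the relevant property; the symbols $d_1, \ldots, d_M$ for the diagonal entries of D are introduced here and do not appear in the source.

```latex
% D = \mathrm{diag}(d_1, \ldots, d_M) with every d_i \neq 0, A square and invertible.
\det(DA) = \det(D)\,\det(A) = \Bigl(\prod_{i=1}^{M} d_i\Bigr)\det(A)
\quad\Longrightarrow\quad
\log\lvert\det(DA)\rvert = \sum_{i=1}^{M} \log\lvert d_i\rvert + \log\lvert\det(A)\rvert
```

A diagonal scaling therefore changes the log-determinant only by an additive term that depends on D alone, which is why such a quantity can be treated as a constant during scaling.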
The invention also discloses a data recovery device based on erasure codes, which comprises:
a generating matrix module, used for encoding original data based on erasure codes, generating redundant bits, and storing the original data and the generated redundant data;
a check matrix module, used for recovering the lost data according to the original data and the generated redundant data.
The invention also discloses data recovery equipment based on erasure codes, which comprises:
one or more processors; and
storage means for storing at least one program;
wherein the at least one program is executed by the one or more processors to implement the data recovery method.
The invention also discloses a computer readable storage medium, wherein at least one computer execution instruction or at least one program is stored in the computer readable storage medium, and the at least one computer execution instruction or the at least one program is executed by one or more processors to realize the data recovery method.
The beneficial effects of the invention are: the invention reduces the time spent on data recovery, particularly the time spent on matrix operation during data recovery, and reduces the steps and complexity of matrix operation by optimizing the check matrix, thereby improving the efficiency of data recovery.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of embodiments is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments: to those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
Example 1
As shown in FIG. 1, the present invention provides an erasure code-based data recovery method, which is implemented as follows:
S1, encoding the original data based on erasure codes by using a generating matrix to generate redundant bits, and storing the original data and the generated redundant data, wherein the encoding in step S1 comprises the following steps:
A1, representing the erasure-code-based coding matrix A to be optimized by its row vectors, $A = [a_1, a_2, a_3, \ldots, a_N]^T \in R^{M \times N}$, wherein $N \le M$, $a_n$ represents a certain row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A, M represents the total number of columns of the coding matrix A, and $R^{M \times N}$ represents an M × N matrix of positive real numbers;
A2, denoting the remaining N-1 rows of the coding matrix A as $A_n$, $A_n = [a_1, a_2, a_3, \ldots, a_{n-1}, a_{n+1}, \ldots, a_N]^T \in R^{M \times (N-1)}$, wherein $a_N$ represents the N-th column of $A_n$, and $R^{M \times (N-1)}$ represents an M × (N-1) matrix of positive real numbers;
A3, introducing a coding auxiliary permutation matrix $T_{n,N}$, performing auxiliary multiplication on the coding matrix A to obtain check bits, and storing the check bits;
A4, exchanging the n-th row and the N-th row of the coding matrix A by means of the coding auxiliary permutation matrix $T_{n,N}$, wherein n denotes the n-th row of the coding matrix A;
A5, introducing a new auxiliary check matrix X, and calculating, from the new auxiliary check matrix X and the coding matrix A, the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000091
wherein $R^{M \times M}$ represents an M × M matrix of positive real numbers;
the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000092
is expressed as follows:
Figure GDA0004065931240000093
wherein I represents the identity matrix,
Figure GDA0004065931240000094
is the augmented matrix of the coding matrix A, and
Figure GDA0004065931240000095
is the conjugate transpose of the following matrix:
Figure GDA0004065931240000096
A6, from the augmented matrix of the new auxiliary check matrix X
Figure GDA0004065931240000097
calculating a full-rank matrix
Figure GDA0004065931240000098
wherein
Figure GDA0004065931240000099
denotes the conjugate transpose of $a_n$;
A7, from the full-rank matrix
Figure GDA00040659312400000910
calculating, by linear algebra, an operation equation relating the coding matrix A and its permutation matrix, the operation equation being expressed as follows:
Figure GDA00040659312400000911
wherein $\det[A \times A^T]$ represents the operation equation, A represents the coding matrix, $A^T$ represents the conjugate transpose of the coding matrix A, $T_{n,N}$ represents the coding auxiliary permutation matrix, P represents the specific value of the primitive variable obtained in the optimization process,
Figure GDA00040659312400000912
represents the conjugate transpose of $T_{n,N}$, $a_n$ represents a certain row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A,
Figure GDA00040659312400000913
represents the conjugate transpose of $a_n$,
Figure GDA00040659312400000914
represents the augmented matrix of the coding matrix A, and
Figure GDA00040659312400000915
is the conjugate transpose of the following matrix:
Figure GDA00040659312400000916
A8, obtaining, according to the operation equation, that the augmented matrix of the new auxiliary check matrix X
Figure GDA00040659312400000917
is, with respect to the augmented matrix of the coding matrix A
Figure GDA00040659312400000918
the orthogonal complement projection matrix on the span of its row space;
A9, setting the number of rows N = M, and decomposing the invertible coding matrix A to be optimized;
A10, introducing a new auxiliary check row vector $x_n$ according to the decomposed invertible coding matrix A to be optimized, wherein the auxiliary check row vector $x_n$ satisfies
Figure GDA0004065931240000101
and $R^{(M-1) \times 1}$ represents an (M-1) × 1 matrix of positive real numbers;
A11, calculating an auxiliary check row from the auxiliary check row vector $x_n$ by using the orthogonal complement projection matrix, wherein, for $\|x_n\| = 1$,
Figure GDA0004065931240000102
A12, calculating, from the auxiliary check row, the value of the determinant of the coding matrix A and its corresponding transpose matrix:
Figure GDA0004065931240000103
wherein
Figure GDA0004065931240000104
represents the value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure GDA0004065931240000105
represents the augmented matrix of the coding matrix A,
Figure GDA0004065931240000106
represents the conjugate transpose of
Figure GDA0004065931240000107
while
Figure GDA0004065931240000108
represents the transpose of a certain row $a_n$ of the coding matrix A, and $x_n$ represents the auxiliary check row vector;
A13, calculating the corresponding log value from the value of the determinant of the coding matrix A and its corresponding transpose matrix, and optimizing the corresponding row or column of the check matrix according to the log value, thereby completing the erasure-code-based encoding of the original data;
the log value of the determinant of the coding matrix A and its corresponding transpose matrix is expressed as follows:
Figure GDA0004065931240000109
wherein
Figure GDA00040659312400001010
represents the log value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure GDA00040659312400001011
represents the conjugate transpose of a certain row $a_n$ of the coding matrix A, and $x_n$ represents the auxiliary check row vector;
S2, recovering the lost data by using the check matrix according to the original data and the generated redundant data, thereby completing the data recovery, which is implemented as follows:
B1, according to the check bits, setting $F[y_n]$ as the source entropy of the n-th erasure-code-based check matrix to be optimized, with $F[y_n] = C_1$, wherein $C_1$ represents a constant;
B2, according to $F[y_n]$, scaling the erasure-code-based check matrix by using the primitive variable $P_1$;
B3, using the source entropy of the check matrix, taking the log value of the determinant of the coding matrix A and its corresponding transpose matrix as a regularization term, and calculating, from the generated redundant data, the specific block number $F[s]$ corresponding to the lost data;
B4, introducing a full-rank diagonal matrix D to assist the scaling of the erasure-code-based check matrix, obtaining a coding auxiliary matrix x;
B5, calculating the auxiliary check matrix X from the identity matrix I and the coding auxiliary matrix x, wherein $I = Xx$;
B6, setting the constant $C_2$ to the log value of the determinant of the coding matrix A;
B7, calculating the primitive number $P_2$ from the constants $C_1$ and $C_2$;
the primitive number $P_2$ is expressed as follows:
Figure GDA0004065931240000111
wherein $F[y_n]$ represents the source entropy of the n-th erasure-code-based check matrix to be optimized, $C_1$ and $C_2$ each represent a constant, M represents the total number of columns of the coding matrix A, m represents the m-th column of the coding matrix A, and $m = 1, 2, \ldots, M$;
B8, scaling the check matrix according to the primitive number $P_2$, and recovering the lost data according to the specific block number $F[s]$ corresponding to the lost data.
In this embodiment, the source of the check matrix to be optimized is scaled repeatedly according to steps B1 to B8, which simplifies the matrix operations in encoding and decoding and thereby improves the efficiency of the matrix algorithm.
In this embodiment, the data recovery method is applicable to RS codes, X codes, EVENODD codes and the like. In addition, a given row of the matrix is processed independently during encoding and decoding optimization, which avoids some complex operations in optimizing the encoding and decoding matrices, so the method can also be applied to the wider class of non-orthogonal matrices.
Example 2
The invention also provides a data recovery device based on erasure codes, which comprises:
the generating matrix module is used for coding original data based on erasure codes, generating redundant bits and storing the original data and the generated redundant data;
and the check matrix module is used for recovering the lost data according to the original data and the generated redundant data.
In the embodiment, the original data based on the erasure codes are encoded by using the generating matrix to generate redundant bits, and the original data and the generated redundant data are stored; and recovering the lost data by using the check matrix according to the original data and the generated redundant data to finish the data recovery method. The invention reduces the time spent on data recovery, particularly the time spent on matrix operation during data recovery, and reduces the steps and complexity of matrix operation by optimizing the check matrix, thereby improving the efficiency of data recovery.
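The two-module structure of this embodiment can be sketched with the simplest possible erasure code, a single XOR check block. This is a hypothetical stand-in for illustration only: the class and method names are assumptions, the sketch tolerates exactly one lost block, all blocks are assumed to have equal length, and it does not implement the check-matrix optimization described in embodiment 1.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class GeneratingMatrixModule:
    """Hypothetical stand-in: appends one XOR check block,
    i.e. the generator [I; 1 1 ... 1] over GF(2)."""
    def encode(self, data_blocks):
        return list(data_blocks) + [xor_blocks(data_blocks)]

class CheckMatrixModule:
    """Hypothetical stand-in: rebuilds a single lost block (marked None)
    as the XOR of all surviving blocks."""
    def recover(self, stored_blocks):
        missing = [i for i, b in enumerate(stored_blocks) if b is None]
        if len(missing) != 1:
            raise ValueError("this sketch recovers exactly one lost block")
        survivors = [b for b in stored_blocks if b is not None]
        repaired = list(stored_blocks)
        repaired[missing[0]] = xor_blocks(survivors)
        return repaired

stored = GeneratingMatrixModule().encode([b"abcd", b"efgh", b"ijkl"])
damaged = stored.copy()
damaged[1] = None                              # simulate a failed node
repaired = CheckMatrixModule().recover(damaged)
```

Separating encoding and recovery into two modules mirrors the device described above: the first module only needs the generating matrix, while the second only needs the surviving blocks and the check relation.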
Example 3
The invention also provides a data recovery device based on erasure codes, which comprises:
one or more processors; and
storage means for storing at least one program;
the at least one program is executed by the one or more processors to implement the data recovery method of embodiment 1.
In this embodiment, the one or more processors may be a Central Processing Unit (CPU), or may be other general-purpose processors, Digital Signal Processors (DSP), Application-Specific Integrated Circuits (ASIC), Field-Programmable Gate Arrays (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
In this embodiment, the memory is configured to store at least one program, and the processor runs or executes the program stored in the memory to implement the data recovery method described in embodiment 1. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.
Example 4
The present invention also provides a computer-readable storage medium, in which at least one computer-executable instruction or at least one program is stored, and the at least one computer-executable instruction or the at least one program is executed by one or more processors to implement the data recovery method described in embodiment 1.
In this embodiment, the computer-readable storage medium includes, but is not limited to, various media that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Through the above design, the invention reduces the time spent on data recovery, in particular the time spent on matrix operations during data recovery; by optimizing the check matrix, it reduces the steps and complexity of the matrix operations, thereby improving the efficiency of data recovery.

Claims (9)

1. A data recovery method based on erasure codes, characterized by comprising the following steps:
S1, encoding original data based on erasure codes to generate redundant bits, and storing the original data and the generated redundant data;
S2, recovering the lost data according to the original data and the generated redundant data to complete the recovery of the data;
the encoding process in step S1 includes the following steps:
A1, representing the erasure-code-based encoding matrix A to be optimized as $A = [a_1, a_2, a_3, \ldots, a_N]^T \in R^{M \times N}$, wherein $N \le M$, $a_n$ represents a row of the encoding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the encoding matrix A, M represents the total number of columns of the encoding matrix A, and $R^{M \times N}$ represents an M × N matrix of positive real numbers;
A2, denoting N-1 rows of the coding matrix A as $A_n$, $A_n = [a_1, a_2, a_3, \ldots, a_{n-1}, a_{n+1}, \ldots, a_N]^T \in R^{M \times (N-1)}$, wherein $a_N$ represents the N-th column of $A_n$, and $R^{M \times (N-1)}$ represents an M × (N-1) matrix of positive real numbers;
A3, introducing a coding auxiliary permutation matrix $T_{n,N}$, performing auxiliary multiplication on the coding matrix A to obtain check bits, and storing the check bits;
A4, exchanging the n-th row and the N-th row of the coding matrix A according to the coding auxiliary permutation matrix $T_{n,N}$;
A5, introducing a new auxiliary check matrix X, and calculating, according to the new auxiliary coding matrix X and the coding matrix A, an augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000011
Figure FDA0004048229710000012
wherein $R^{M \times M}$ represents an M × M matrix of positive real numbers;
A6, calculating, according to the augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000013
a full-rank matrix
Figure FDA0004048229710000014
wherein
Figure FDA0004048229710000015
denotes the conjugate transpose of $a_n$;
A7, according to the full-rank matrix
Figure FDA0004048229710000016
calculating, by linear algebra, an operation equation of the coding matrix A and the permutation matrix of the coding matrix A;
A8, obtaining, according to the operation equation, that the augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000021
is, for the augmented matrix of the coding matrix A
Figure FDA0004048229710000022
the orthogonal complement projection matrix onto the span of its row space;
A9, setting the number of rows N = M, and decomposing the invertible encoding matrix A to be optimized;
A10, introducing, according to the decomposed invertible encoding matrix A to be optimized, a new auxiliary check row vector $x_n$, wherein the auxiliary check row vector $x_n$ satisfies
Figure FDA0004048229710000023
and $R^{(M-1) \times 1}$ represents an (M-1) × 1 matrix of positive real numbers;
A11, calculating an auxiliary check row from the auxiliary check row vector $x_n$ by using the orthogonal complement projection matrix, wherein, for $\|x_n\| = 1$,
Figure FDA0004048229710000024
A12, calculating, according to the auxiliary check row, the value of the determinant of the coding matrix A and its corresponding transpose matrix
Figure FDA0004048229710000025
wherein $A^T$ represents the transpose matrix of the coding matrix A;
A13, calculating the corresponding log value from the value of the determinant of the coding matrix A and its corresponding transpose matrix, and optimizing the corresponding row or column of the check matrix according to the log value, thereby completing the encoding of the original data based on the erasure codes.
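As an illustration of steps A3, A4 and A7 only, the sketch below builds a permutation matrix that exchanges two rows of a small coding matrix and checks that the determinant of A times its transpose is unchanged by the exchange. The example matrix, the 0-based indices, and the helper names `matmul`, `transpose`, `det`, and `swap_matrix` are assumptions made for this sketch; the operation equation of the claim itself is given only as an image and is not reproduced here.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det(X):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in X]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)  # no pivot in this column: the matrix is singular
        if p != c:
            M[c], M[p] = M[p], M[c]
            sign = -sign  # each row swap flips the sign of the determinant
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * result

def swap_matrix(size, n, N):
    """Permutation matrix that exchanges rows n and N (0-based) when applied from the left."""
    T = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    T[n], T[N] = T[N], T[n]
    return T

# Hypothetical coding matrix with N = 3 rows and M = 4 columns.
A = [[1, 2, 0, 1],
     [0, 1, 3, 1],
     [2, 0, 1, 4]]
T = swap_matrix(3, 0, 2)
TA = matmul(T, A)                        # rows 0 and 2 of A exchanged
gram_before = det(matmul(A, transpose(A)))
gram_after = det(matmul(TA, transpose(TA)))
```

Because a permutation matrix is orthogonal, `gram_after` equals `gram_before`, so any row can be moved to the last position without changing the determinant that the later steps work with.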
2. The erasure code-based data recovery method of claim 1, wherein the augmented matrix of the new auxiliary check matrix X in step A5
Figure FDA0004048229710000026
is expressed as follows:
Figure FDA0004048229710000027
wherein I represents the identity matrix,
Figure FDA0004048229710000028
is the augmented matrix of the coding matrix A, and
Figure FDA0004048229710000029
represents the conjugate transpose of
Figure FDA00040482297100000210
3. The erasure code-based data recovery method of claim 2, wherein the operation equation in step A7 is expressed as follows:
Figure FDA0004048229710000031
wherein $\det[A \times A^T]$ represents the operation equation, A represents the coding matrix, $A^T$ represents the conjugate transpose of the coding matrix A, $T_{n,N}$ represents the coding auxiliary permutation matrix, P represents the specific value of the primitive variable obtained in the optimization process,
Figure FDA0004048229710000032
represents the conjugate transpose of $T_{n,N}$, $a_n$ represents a row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A,
Figure FDA0004048229710000033
denotes the conjugate transpose of $a_n$,
Figure FDA0004048229710000034
represents the augmented matrix of the coding matrix A, and
Figure FDA0004048229710000035
represents the conjugate transpose of
Figure FDA0004048229710000036
4. The erasure code-based data recovery method of claim 1, wherein the value of the determinant of the coding matrix A and its corresponding transpose matrix in step A12 is expressed as follows:
Figure FDA0004048229710000037
wherein
Figure FDA0004048229710000038
represents the value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure FDA0004048229710000039
represents the augmented matrix of the coding matrix A,
Figure FDA00040482297100000310
represents the conjugate transpose of
Figure FDA00040482297100000311
and
Figure FDA00040482297100000312
represents the transpose of a row $a_n$ of the coding matrix A, and $x_n$ represents the auxiliary check row vector.
5. The erasure code-based data recovery method of claim 1, wherein the log value of the determinant of the encoding matrix A and its corresponding transpose matrix in step A13 is expressed as follows:
Figure FDA00040482297100000313
wherein
Figure FDA00040482297100000314
represents the log value of the determinant of the coding matrix A and its corresponding transpose matrix,
Figure FDA00040482297100000315
represents the conjugate transpose of a row $a_n$ of the coding matrix A, $x_n$ represents the auxiliary check row vector, and delta represents the operator.
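The determinant and log-determinant expressions of claims 4 and 5 survive here only as image placeholders, so the following is a hedged sketch rather than the claimed formula. It checks a standard linear-algebra identity built from the same ingredients (a row $a_n$, the remaining rows $A_n$, and an orthogonal complement projection matrix): $\det(AA^T) = \det(A_n A_n^T) \cdot a_n P^{\perp} a_n^T$ with $P^{\perp} = I - A_n^T (A_n A_n^T)^{-1} A_n$, and hence the log values add. The example matrix and the names `gauss_jordan` and `det_factors` are assumptions.

```python
from fractions import Fraction
from math import isclose, log

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def gauss_jordan(X):
    """Return (det(X), inverse(X)) for an invertible square matrix, using exact fractions."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    determinant = Fraction(1)
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # StopIteration if singular
        if p != c:
            aug[c], aug[p] = aug[p], aug[c]
            determinant = -determinant
        pivot = aug[c][c]
        determinant *= pivot
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return determinant, [row[n:] for row in aug]

def det_factors(A, n):
    """Return det(A A^T), det(A_n A_n^T) and a_n P_perp a_n^T for row index n (0-based)."""
    a_n = [A[n]]                       # 1 x M row vector
    A_n = A[:n] + A[n + 1:]            # the remaining N-1 rows
    M = len(A[0])
    reduced, gram_inv = gauss_jordan(matmul(A_n, transpose(A_n)))
    proj = matmul(matmul(transpose(A_n), gram_inv), A_n)
    # Projection onto the orthogonal complement of the row space of A_n.
    P_perp = [[int(i == j) - proj[i][j] for j in range(M)] for i in range(M)]
    residual = matmul(matmul(a_n, P_perp), transpose(a_n))[0][0]
    full, _ = gauss_jordan(matmul(A, transpose(A)))
    return full, reduced, residual

# Hypothetical coding matrix with N = 3 rows and M = 4 columns.
A = [[1, 2, 0, 1],
     [0, 1, 3, 1],
     [2, 0, 1, 4]]
full, reduced, residual = det_factors(A, 2)
log_full = log(full)
log_sum = log(reduced) + log(residual)
```

Under this identity, the contribution of a single row to the log-determinant is the scalar `log(residual)`, which is one way a row-by-row optimization of the kind described in step A13 could avoid recomputing a full determinant for every candidate row.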
6. The erasure code-based data recovery method of claim 1, wherein step S2 includes the following steps:
B1, setting, according to the check bits, $F[y_n]$ as the source entropy of the n-th erasure-code-based check matrix to be optimized, with $F[y_n] = C_1$, wherein $C_1$ represents a constant;
B2, scaling the erasure-code-based check matrix according to $F[y_n]$ by using a primitive variable $P_1$;
B3, using the source entropy of the check matrix, taking the log value of the determinant of the coding matrix A and its corresponding transpose matrix as a regularization term, and calculating, according to the generated redundant data, the specific block number Fs corresponding to the lost data;
B4, introducing a full-rank diagonal matrix D to assist the scaling of the erasure-code-based check matrix, obtaining a coding auxiliary matrix x;
B5, calculating an auxiliary check matrix X according to the identity matrix I and the coding auxiliary matrix X, wherein I = X X;
B6, setting the constant $C_2$ to be the log value of the determinant of the coding matrix A;
B7, calculating a primitive number $P_2$ according to the constant $C_1$ and the constant $C_2$,
the primitive number $P_2$ being expressed as follows:
Figure FDA0004048229710000041
wherein $F[y_n]$ represents the source entropy of the n-th erasure-code-based check matrix to be optimized, $C_1$ and $C_2$ each represent a constant, M represents the total number of columns of the encoding matrix A, m represents a column index of the encoding matrix A, and $m = 1, 2, \ldots, M$;
B8, scaling the check matrix according to the primitive number $P_2$, and recovering the lost data according to the block number Fs corresponding to the lost data.
7. An erasure code-based data recovery apparatus, comprising:
a generating matrix module, used for encoding original data based on erasure codes, generating redundant bits, and storing the original data and the generated redundant data;
the encoding process includes the following steps:
A1, representing the erasure-code-based encoding matrix A to be optimized as $A = [a_1, a_2, a_3, \ldots, a_N]^T \in R^{M \times N}$, wherein $N \le M$, $a_n$ represents a row of the coding matrix A with $n = 1, 2, \ldots, N$, N represents the total number of rows of the coding matrix A, M represents the total number of columns of the coding matrix A, and $R^{M \times N}$ represents an M × N matrix of positive real numbers;
A2, denoting N-1 rows of the coding matrix A as $A_n$, $A_n = [a_1, a_2, a_3, \ldots, a_{n-1}, a_{n+1}, \ldots, a_N]^T \in R^{M \times (N-1)}$, wherein $a_N$ represents the N-th column of $A_n$, and $R^{M \times (N-1)}$ represents an M × (N-1) matrix of positive real numbers;
A3, introducing a coding auxiliary permutation matrix $T_{n,N}$, performing auxiliary multiplication on the coding matrix A to obtain check bits, and storing the check bits;
A4, exchanging the n-th row and the N-th row of the coding matrix A according to the coding auxiliary permutation matrix $T_{n,N}$;
A5, introducing a new auxiliary check matrix X, and calculating, according to the new auxiliary coding matrix X and the coding matrix A, an augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000051
Figure FDA0004048229710000052
wherein $R^{M \times M}$ represents an M × M matrix of positive real numbers;
A6, calculating, according to the augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000053
a full-rank matrix
Figure FDA0004048229710000054
wherein
Figure FDA0004048229710000055
denotes the conjugate transpose of $a_n$;
A7, according to the full-rank matrix
Figure FDA0004048229710000056
calculating, by linear algebra, an operation equation of the coding matrix A and the permutation matrix of the coding matrix A;
A8, obtaining, according to the operation equation, that the augmented matrix of the new auxiliary check matrix X
Figure FDA0004048229710000057
is, for the augmented matrix of the coding matrix A
Figure FDA0004048229710000058
the orthogonal complement projection matrix onto the span of its row space;
A9, setting the number of rows N = M, and decomposing the invertible encoding matrix A to be optimized;
A10, introducing, according to the decomposed invertible coding matrix A to be optimized, a new auxiliary check row vector $x_n$, wherein the auxiliary check row vector $x_n$ satisfies
Figure FDA0004048229710000061
and $R^{(M-1) \times 1}$ represents an (M-1) × 1 matrix of positive real numbers;
A11, calculating an auxiliary check row from the auxiliary check row vector $x_n$ by using the orthogonal complement projection matrix, wherein, for $\|x_n\| = 1$,
Figure FDA0004048229710000062
A12, calculating, according to the auxiliary check row, the value of the determinant of the coding matrix A and its corresponding transpose matrix
Figure FDA0004048229710000063
wherein $A^T$ represents the transpose matrix of the coding matrix A;
A13, calculating the corresponding log value from the value of the determinant of the coding matrix A and its corresponding transpose matrix, and optimizing the corresponding row or column of the check matrix according to the log value, thereby completing the encoding of the original data based on the erasure codes;
and a check matrix module, used for recovering the lost data according to the original data and the generated redundant data.
8. An erasure code-based data recovery apparatus, comprising:
one or more processors; and
storage means for storing at least one program;
the at least one program is executable by the one or more processors to implement the data recovery method of any one of claims 1-6.
9. A computer-readable storage medium having stored therein at least one computer-executable instruction or at least one program, the at least one computer-executable instruction or at least one program being executable by one or more processors to implement the data recovery method of any one of claims 1-6.
CN202010458910.XA 2020-05-27 2020-05-27 Data recovery method, device and equipment based on erasure codes and storage medium Active CN111625394B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010458910.XA CN111625394B (en) 2020-05-27 2020-05-27 Data recovery method, device and equipment based on erasure codes and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010458910.XA CN111625394B (en) 2020-05-27 2020-05-27 Data recovery method, device and equipment based on erasure codes and storage medium

Publications (2)

Publication Number Publication Date
CN111625394A (en) 2020-09-04
CN111625394B (en) 2023-03-21

Family

ID=72272648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010458910.XA Active CN111625394B (en) 2020-05-27 2020-05-27 Data recovery method, device and equipment based on erasure codes and storage medium

Country Status (1)

Country Link
CN (1) CN111625394B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112799875B (en) * 2020-12-18 2023-01-06 苏州浪潮智能科技有限公司 Method, system, device and medium for verification recovery based on Gaussian elimination
CN113805815B (en) * 2021-09-18 2024-03-01 中国科学院微电子研究所 Data recovery method, device and system for flash memory
CN113568786B (en) * 2021-09-23 2021-12-31 四川大学 Data recovery method, device, equipment and storage medium
CN113901069B (en) * 2021-12-08 2022-03-15 威讯柏睿数据科技(北京)有限公司 Data storage method and device of distributed database
CN115454711A (en) * 2022-11-11 2022-12-09 苏州浪潮智能科技有限公司 Method, device and medium for recovering erasure correction data in distributed storage system
CN115993941B (en) * 2023-03-23 2023-06-02 陕西中安数联信息技术有限公司 Distributed data storage error correction method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103746774A (en) * 2014-01-03 2014-04-23 中国人民解放军国防科学技术大学 Error resilient coding method for high-efficiency data reading

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107544862B (en) * 2016-06-29 2022-03-25 中兴通讯股份有限公司 Stored data reconstruction method and device based on erasure codes and storage node
CN111045853A (en) * 2019-10-29 2020-04-21 烽火通信科技股份有限公司 Method and device for improving erasure code recovery speed and background server
CN110837436B (en) * 2019-11-05 2023-10-13 成都信息工程大学 Method for automatically decoding erasure codes in lightweight manner on finite field and intelligent terminal module
CN110895497B (en) * 2019-12-09 2022-06-07 成都信息工程大学 Method and device for reducing erasure code repair in distributed storage

Also Published As

Publication number Publication date
CN111625394A (en) 2020-09-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
    Inventor after: Tang Dan; He Rui; Gao Yan; Zeng Qiong; Geng Wei
    Inventor before: Cai Hongliang; He Rui; Gao Yan; Zeng Qiong; Geng Wei
GR01 Patent grant
TR01 Transfer of patent right
    Effective date of registration: 20231122
    Address after: 230000 Room 203, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province
    Patentee after: Hefei Jiuzhou Longteng scientific and technological achievement transformation Co.,Ltd.
    Address before: 230000 floor 1, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province
    Patentee before: Dragon totem Technology (Hefei) Co.,Ltd.

    Effective date of registration: 20231122
    Address after: 230000 floor 1, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province
    Patentee after: Dragon totem Technology (Hefei) Co.,Ltd.
    Address before: 610225 24 section 1 Xuefu Road, Southwest Airport Economic Development Zone, Chengdu, Sichuan
    Patentee before: CHENGDU University OF INFORMATION TECHNOLOGY
TR01 Transfer of patent right
    Effective date of registration: 20231211
    Address after: 233600 Star Garden Incubator in the Optoelectronics Standardization Plant, Economic Development Zone, Guoyang County, Bozhou City, Anhui Province
    Patentee after: Daoji Intelligence (Anhui) Information Technology Co.,Ltd.
    Address before: 230000 Room 203, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province
    Patentee before: Hefei Jiuzhou Longteng scientific and technological achievement transformation Co.,Ltd.