WO2024098647A1 - Check code recovery method, system, electronic device and storage medium - Google Patents

Check code recovery method, system, electronic device and storage medium

Info

Publication number
WO2024098647A1
Authority
WO
WIPO (PCT)
Prior art keywords
check code
global
local
transformation relationship
transformation
Prior art date
Application number
PCT/CN2023/085989
Other languages
English (en)
French (fr)
Inventor
吴睿振
王小伟
王凛
陈静静
张永兴
张旭
Original Assignee
苏州元脑智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州元脑智能科技有限公司
Publication of WO2024098647A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1048 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature

Definitions

  • the present application relates to the field of data storage, and in particular to a check code recovery method, system, electronic device and storage medium.
  • the purpose of the present application is to provide a check code recovery method, system, electronic device and storage medium, which can reduce the amount of calculation for recovering the check code and improve the efficiency of check code recovery.
  • a check code recovery method, which includes:
  • generating a global check code of a data block includes:
  • the global check codes corresponding to all data blocks are generated by the RS algorithm.
  • determining a first transformation relationship between a data block and a global check code includes:
  • the first transformation relationship is: $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k Reed-Solomon (RS) algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
  • generating a local check code of a data block includes:
  • All data blocks are divided into m data block groups, and the local check code corresponding to each data block group is generated by the RS algorithm.
  • determining a second transformation relationship between the data block and the local check code includes:
  • the second transformation relationship is: $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
  • combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship of a global check code and a local check code includes:
  • the expression corresponding to the check code transformation relationship is: $lp_1 + lp_2 + \dots + lp_m = pp_1 p_1 + pp_2 p_2 + \dots + pp_r p_r$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ represents the transformation parameter for transforming between the global check code and the local check code.
  • using the transformation parameters to restore the global check code or the local check code of the error report includes:
  • if the target local check code reports an error, determine whether the number of target local check codes is greater than 1;
  • if the number of target local check codes is equal to 1, the target local check code is restored using the transformation parameters, all global check codes and all local check codes except the target local check code.
  • after determining whether the number of target local check codes is greater than 1, the method further includes:
  • if the number of target local check codes is greater than 1, the data block group corresponding to each target local check code is set as the target data block group;
  • the target local check code is restored using all the data blocks in the target data block group.
  • using the transformation parameters to restore the global check code or the local check code of the error report includes:
  • if the target global check code reports an error, determine whether the number of target global check codes is greater than 1;
  • if the number of target global check codes is equal to 1, the target global check code is restored using the transformation parameters, all local check codes and all global check codes except the target global check code.
  • after determining whether the number of target global check codes is greater than 1, the method further includes:
  • if the number of target global check codes is greater than 1, the target global check code is restored using the data blocks.
  • it also includes:
  • the global check code or the local check code is used to recover the erroneous data block.
  • before generating the global check code and the local check code of the data block, the method further includes:
  • the present application also provides a check code recovery system, the system comprising:
  • a check code generation module used for generating a global check code and a local check code of a data block
  • a transformation relationship determination module used to determine a first transformation relationship between a data block and a global check code, and a second transformation relationship between a data block and a local check code; and also used to generate a check code transformation relationship between a global check code and a local check code by combining the first transformation relationship and the second transformation relationship;
  • a parameter determination module used to determine the transformation parameters for transforming between the global check code and the local check code according to the check code transformation relationship
  • the check code recovery module is used to recover the global check code or local check code of the error report by using the transformation parameters.
  • the present application also provides a non-volatile computer-readable storage medium, on which a computer program is stored, and when the computer program is executed, the steps of the above-mentioned check code recovery method are implemented.
  • the present application also provides an electronic device, including a memory and a processor, wherein a computer program is stored in the memory, and when the processor calls the computer program in the memory, the steps of the above-mentioned check code recovery method are implemented.
  • the present application provides a check code recovery method, including: generating a global check code and a local check code of a data block; determining a first transformation relationship between the data block and the global check code, and determining a second transformation relationship between the data block and the local check code; generating a check code transformation relationship between the global check code and the local check code in combination with the first transformation relationship and the second transformation relationship; determining a transformation parameter for transforming between the global check code and the local check code according to the check code transformation relationship; and using the transformation parameter to recover an erroneous global check code or a local check code.
  • the present application determines the check code transformation relationship according to the transformation relationship between the data block and the global check code and the local check code, and the mutual conversion between the global check code and the local check code can be realized based on the check code transformation relationship.
  • the transformation parameters for transforming between the global check code and the local check code can be determined.
  • the above transformation parameters can be used to restore the erroneous global check code or local check code.
  • the above process of recovering the check code does not need to be re-encoded and calculated according to the data code, which can reduce the amount of calculation for recovering the check code and improve the efficiency of check code recovery.
  • the present application also provides a check code recovery system, a non-volatile computer-readable storage medium and an electronic device, which have the above beneficial effects and are not repeated here.
  • FIG. 1 is a flow chart of a check code recovery method provided in some embodiments of the present application;
  • FIG. 2 is a structural example diagram of an LRC provided in some embodiments of the present application;
  • FIG. 3 is a schematic diagram of the relationship between a global check code and a local check code provided in some embodiments of the present application;
  • FIG. 4 is a schematic diagram of the structure of a check code recovery system provided in some embodiments of the present application;
  • FIG. 5 is a schematic diagram of the structure of a non-volatile computer-readable storage medium provided in some embodiments of the present application;
  • FIG. 6 is a schematic diagram of the structure of an electronic device provided in some embodiments of the present application.
  • FIG. 1 is a flow chart of a check code recovery method provided in some embodiments of the present application. The method may include the following steps:
  • S101 Generate a global check code and a local check code of a data block
  • the original data stored by the user can be received, the original data can be split into k data blocks, and then r global check codes and m local check codes of the k data blocks can be generated according to the erasure code algorithm.
  • Erasure codes are a forward error correction technique from coding theory. They were first used in the communications field to deal with problems such as data loss during transmission, and were later introduced into the storage field because they proved effective at preventing data loss. Erasure codes can effectively reduce storage overhead while ensuring the same reliability, so they are widely used in major storage systems and data centers.
  • Reed-Solomon (RS) codes are used in distributed environments. An RS code is defined by two parameters, k and r: given two positive integers k and r, the RS code encodes k data blocks into r additional check codes (i.e., the data in the check blocks).
  • when the r check codes are encoded with a Vandermonde matrix or a Cauchy matrix, the code is called an RS erasure code encoded using the Vandermonde matrix or the Cauchy matrix, respectively.
  • all data blocks can be divided into m data block groups, and a local check code corresponding to each data block group is generated by the RS algorithm, and each data block group corresponds to a local check code. After obtaining the global check code and the local check code, it can also be determined whether the data block has an error; if so, the global check code or the local check code is used to recover the erroneous data block.
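The generation step S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the field GF(2^8) with reduction polynomial 0x11D, the Vandermonde-style global coefficients `(j + 1) ** i`, the all-ones local coefficients, and every helper name (`gf_mul`, `gf_pow`, `combine`, `split_blocks`, `encode`) are hypothetical choices made for the example.

```python
# Illustrative sketch only: the field, the coefficients and all names are assumptions.

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def combine(coeffs, blocks) -> bytes:
    """Return sum_j coeffs[j] * blocks[j], byte by byte (field addition is XOR)."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for pos, byte in enumerate(block):
            out[pos] ^= gf_mul(c, byte)
    return bytes(out)


def split_blocks(data: bytes, k: int):
    """Split data into k equally sized blocks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def encode(data: bytes, k: int, r: int, groups):
    """Produce k data blocks, r global check codes and one local check code per group."""
    blocks = split_blocks(data, k)
    # Global check code i uses the coefficients (j + 1)^i over all k blocks.
    global_coeffs = [[gf_pow(j + 1, i) for j in range(k)] for i in range(r)]
    global_codes = [combine(row, blocks) for row in global_coeffs]
    # Local check code n covers only the blocks of group n (all coefficients 1 here).
    local_codes = [combine([1] * len(g), [blocks[j] for j in g]) for g in groups]
    return blocks, global_codes, local_codes


blocks, p, lp = encode(b"hello world, erasure codes!", k=5, r=2,
                       groups=[[0, 1, 2], [3, 4]])
```

With these particular coefficients the first global check code is the XOR of all data blocks, so it also equals the XOR of the two local check codes; other coefficient choices give a different relation.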
  • S102 Determine a first transformation relationship between a data block and a global check code, and determine a second transformation relationship between a data block and a local check code;
  • the first transformation relationship between the data block and the global check code can be determined according to the generation method of the global check code, and the process is as follows: the global check codes corresponding to all data blocks are generated by the RS algorithm, and the first transformation relationship between the data block and the global check code is determined according to the RS algorithm; wherein the first transformation relationship is: $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
  • the second transformation relationship between the data block and the local check code may also be determined according to the generation method of the local check code, and the process is as follows: all data blocks are divided into m data block groups, a local check code corresponding to each data block group is generated by the RS algorithm, and the second transformation relationship between the data block and the local check code is determined according to the RS algorithm; wherein the second transformation relationship is: $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
  • S103 generating a check code conversion relationship of a global check code and a local check code by combining the first conversion relationship and the second conversion relationship;
  • the first transformation relationship describes the relationship between the data block and the global check code
  • the second transformation relationship describes the relationship between the data block and the local check code.
  • the check code transformation relationship between the global check code and the local check code can be obtained by combining the formulas of the first transformation relationship and the second transformation relationship.
  • S104 Determine a transformation parameter for transforming between a global check code and a local check code according to a check code transformation relationship
  • the transformation parameters for transforming between the global check code and the local check code can be determined.
  • the local check code can be obtained based on the global check code and the transformation parameters, and the global check code can also be obtained based on the local check code and the transformation parameters.
  • S105 Recover the global check code or local check code that reported an error by using the transformation parameters.
  • before executing S105, there may be an operation to detect whether a global check code or a local check code reports an error; if a global check code reports an error, it can be restored using the local check codes, the transformation parameters and the remaining global check codes; if a local check code reports an error, it can be restored using the global check codes, the transformation parameters and the remaining local check codes.
  • this improved LRC algorithm adds no redundant code blocks and speeds up recovery when a check code block is in error, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.
  • the check code transformation relationship is determined according to the transformation relationship between the data block and the global check code and the local check code, and the mutual conversion between the global check code and the local check code can be realized based on the check code transformation relationship.
  • the above-mentioned transformation parameters can be used to restore the erroneous global check code or the local check code. The above-mentioned process of restoring the check code does not need to be re-encoded and calculated according to the data code, which can reduce the calculation amount of restoring the check code and improve the efficiency of restoring the check code.
  • the check code transformation relationship of the global check code and the local check code can be generated by combining the first transformation relationship and the second transformation relationship in the following manner, including: constructing a group of equations based on the first transformation relationship corresponding to all global check codes and the second transformation relationship corresponding to all local check codes; solving the group of equations to obtain the check code transformation relationship.
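  • the above constructed equation group can be: $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$ for $1 \le i \le r$, together with $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$ for $1 \le n \le m$.
  • the expression corresponding to the check code transformation relationship is: $lp_1 + lp_2 + \dots + lp_m = pp_1 p_1 + pp_2 p_2 + \dots + pp_r p_r$, where $lp_n$ (i.e., $lp_1, lp_2, \dots, lp_m$) represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ (i.e., $p_1, p_2, \dots, p_r$) represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ (i.e., $pp_1, pp_2, \dots, pp_r$) represents the transformation parameter for transforming between the global check code and the local check code.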
  • a set of transformation parameters pp 1 , pp 2 , ..., pp r that meet the above expression can be calculated based on the expression corresponding to the above check code transformation relationship, and the above transformation parameters can be stored.
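One way to picture the parameter determination is the sketch below. It assumes the check code transformation relationship reads "the sum of all local check codes equals the sum of pp_i times p_i", that every data block belongs to exactly one group, and that arithmetic is in GF(2^8) with polynomial 0x11D. Under those assumptions the relation holds for every data value exactly when each local coefficient satisfies l_j = sum_i pp_i * a_ij. The brute-force search and the names `derive_local_coeffs` and `solve_transform_params` are hypothetical and chosen for readability; a real system would solve the linear system directly.

```python
from itertools import product

# GF(2^8) log/antilog tables for the polynomial 0x11D (an assumed field choice).
EXP, LOG = [0] * 510, [0] * 256
value = 1
for power in range(255):
    EXP[power] = EXP[power + 255] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log/antilog tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def derive_local_coeffs(a, pp):
    """Return l_j = sum_i pp_i * a_ij for every data block j."""
    coeffs = []
    for j in range(len(a[0])):
        acc = 0
        for pp_i, row in zip(pp, a):
            acc ^= gf_mul(pp_i, row[j])
        coeffs.append(acc)
    return coeffs


def solve_transform_params(a, l):
    """Search for nonzero pp_1..pp_r matching the local coefficients l, or None."""
    for pp in product(range(1, 256), repeat=len(a)):
        if derive_local_coeffs(a, pp) == l:
            return pp
    return None


# Hypothetical example: 5 data blocks, 2 global check codes.
a = [[1, 1, 1, 1, 1], [1, 2, 3, 4, 5]]   # global coefficients a_ij
l = derive_local_coeffs(a, (3, 7))        # local coefficients built for pp = (3, 7)
pp = solve_transform_params(a, l)         # the search recovers the same parameters
```

Once found, the parameters can be stored alongside the coding configuration, since they depend only on the coefficients and the grouping, not on the stored data.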
  • the check code can be recovered by:
  • Step 1 Generate a global check code and a local check code of a data block
  • Step 2 Determine a first transformation relationship between the data block and the global check code, and determine a second transformation relationship between the data block and the local check code;
  • Step 3 Combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship of a global check code and a local check code;
  • Step 4 Determine the transformation parameters for transforming between the global check code and the local check code according to the check code transformation relationship
  • Step 5 If the target local check code reports an error, determine whether the number of target local check codes is greater than 1; if the number of target local check codes is equal to 1, use the transformation parameters, all global check codes and all local check codes except the target local check code to restore the target local check code; if the number of target local check codes is greater than 1, set the data block group corresponding to each target local check code as the target data block group, and restore the target local check code using all the data blocks in the target data block group.
  • Step 6 If the target global check code reports an error, determine whether the number of target global check codes is greater than 1; if the number of target global check codes is equal to 1, use the transformation parameters, all local check codes and all global check codes except the target global check code to restore the target global check code; if the number of target global check codes is greater than 1, use the data block to restore the target global check code.
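Steps 5 and 6 can be sketched as below, again assuming the relation "sum of local check codes = sum of pp_i * p_i" over GF(2^8) with polynomial 0x11D. The helper names and the example coefficients are hypothetical. A single failed local check code is rebuilt from the other check codes only; a single failed global check code is rebuilt the same way and then divided by its own pp_i; several failed local check codes fall back to re-encoding from the data blocks of their groups.

```python
# GF(2^8) log/antilog tables for the polynomial 0x11D (an assumed field choice).
EXP, LOG = [0] * 510, [0] * 256
value = 1
for power in range(255):
    EXP[power] = EXP[power + 255] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]  # a must be nonzero


def combine(coeffs, blocks):
    """Return sum_j coeffs[j] * blocks[j], byte by byte (field addition is XOR)."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for pos, byte in enumerate(block):
            out[pos] ^= gf_mul(c, byte)
    return bytes(out)


def recover_local(n, lp, p, pp):
    """lp_n = (sum of the other local codes) + sum_i pp_i * p_i."""
    others = [code for idx, code in enumerate(lp) if idx != n]
    return combine([1] * len(others) + list(pp), others + list(p))


def recover_global(i, lp, p, pp):
    """p_i = pp_i^-1 * (sum of all local codes + sum over j != i of pp_j * p_j)."""
    rest = [j for j in range(len(p)) if j != i]
    partial = combine([1] * len(lp) + [pp[j] for j in rest],
                      list(lp) + [p[j] for j in rest])
    return combine([gf_inv(pp[i])], [partial])


def recover_local_codes(failed, lp, p, pp, blocks, l, groups):
    """Step 5: one failure uses the relation, several re-encode from data blocks."""
    if len(failed) == 1:
        return {failed[0]: recover_local(failed[0], lp, p, pp)}
    return {n: combine([l[j] for j in groups[n]], [blocks[j] for j in groups[n]])
            for n in failed}


# Hypothetical setup: 5 data blocks, groups {x1..x3} and {x4, x5}, pp = (3, 7).
a = [[1, 1, 1, 1, 1], [1, 2, 3, 4, 5]]
pp = (3, 7)
l = [gf_mul(pp[0], a[0][j]) ^ gf_mul(pp[1], a[1][j]) for j in range(5)]
groups = [[0, 1, 2], [3, 4]]
blocks = [bytes([17 * (j + 1), 200 - j, j]) for j in range(5)]
p = [combine(row, blocks) for row in a]
lp = [combine([l[j] for j in g], [blocks[j] for j in g]) for g in groups]

restored_lp1 = recover_local_codes([0], [None, lp[1]], p, pp, blocks, l, groups)[0]
restored_p2 = recover_global(1, lp, [p[0], None], pp)
both_lp = recover_local_codes([0, 1], [None, None], p, pp, blocks, l, groups)
```

In the single-failure paths only the m + r - 1 surviving check codes are read, not the k data blocks, which is the saving the text describes.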
  • this improved LRC algorithm adds no redundant code blocks and speeds up recovery when a check code block is in error, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.
  • the RS erasure code based on the Vandermonde matrix is as follows:
  • the RS erasure code based on the Cauchy matrix is as follows:
  • the k*k matrix in the above content corresponds to the k original data blocks, and the r*k matrix in the lower part corresponds to the encoding matrix.
  • the newly added P 1 to P r are the data of the r check blocks obtained by encoding (i.e., check codes).
  • the inverse of the matrix corresponding to the remaining data is multiplied by that data to obtain the original data blocks D 1 to D k .
  • erasure codes use the Cauchy matrix or Vandermonde matrix introduced above. The advantage of this is that the resulting matrix is guaranteed to be invertible, any of its submatrices is also invertible, and the size of the matrix can be easily expanded.
  • the inverse matrix of an RS erasure code is commonly calculated by Gaussian elimination. This general method applies to the inversion of any invertible matrix, but it is not optimized for the structure of the coding matrix. Therefore, although the calculation is regular, it introduces a large number of redundant operations.
  • Gaussian elimination requires $(k+r)^3$ operations to obtain the required inverse matrix and then restore the corresponding data block.
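The conventional decoding path described above can be sketched as follows. It is an illustration under assumptions: GF(2^8) with polynomial 0x11D, a systematic generator made of an identity matrix stacked on a Cauchy matrix with entries 1 / ((k + i) XOR j), and hypothetical helper names. The inverse is obtained by Gauss-Jordan elimination, whose cubic cost is what the check code shortcut avoids.

```python
# GF(2^8) log/antilog tables for the polynomial 0x11D (an assumed field choice).
EXP, LOG = [0] * 510, [0] * 256
value = 1
for power in range(255):
    EXP[power] = EXP[power + 255] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]  # a must be nonzero


def generator(k, r):
    """Identity (data rows) stacked on an r x k Cauchy matrix (check rows)."""
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    cauchy = [[gf_inv((k + i) ^ j) for j in range(k)] for i in range(r)]
    return identity + cauchy


def invert(matrix):
    """Gauss-Jordan inversion over GF(2^8); raises StopIteration if singular."""
    n = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(idx for idx in range(col, n) if aug[idx][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for idx in range(n):
            if idx != col and aug[idx][col]:
                factor = aug[idx][col]
                aug[idx] = [v ^ gf_mul(factor, w) for v, w in zip(aug[idx], aug[col])]
    return [row[n:] for row in aug]


def mat_vec(matrix, blocks):
    """Multiply a coefficient matrix by a column of equally sized byte blocks."""
    out = []
    for row in matrix:
        acc = bytearray(len(blocks[0]))
        for c, block in zip(row, blocks):
            for pos, byte in enumerate(block):
                acc[pos] ^= gf_mul(c, byte)
        out.append(bytes(acc))
    return out


def decode(survivors, k, r):
    """survivors maps block index (0..k+r-1) to its bytes; any k of them suffice."""
    chosen = sorted(survivors)[:k]
    rows = generator(k, r)
    inverse = invert([rows[idx] for idx in chosen])
    return mat_vec(inverse, [survivors[idx] for idx in chosen])


k, r = 4, 2
data = [b"abcd", b"efgh", b"ijkl", b"mnop"]
codeword = mat_vec(generator(k, r), data)            # 4 data blocks + 2 check blocks
survivors = {i: blk for i, blk in enumerate(codeword) if i not in (1, 3)}
recovered = decode(survivors, k, r)                   # rebuilds all 4 data blocks
```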
  • LRC (Local Reconstruction Code) is an improvement on RS and can be applied to Windows Azure Storage.
  • a simple LRC structure is shown in Figure 2, where x1-x5 represent five data blocks stored by the user, p1 and p2 represent global check codes generated by RS, and lp1 and lp2 represent local check codes generated by RS.
  • a and b represent different parameters.
  • the data values of the parameters are determined by RS formula relationships such as Vandermonde or Cauchy, and the operation is implemented through multiplication and addition iterations of the Galois Field.
  • LRC proposed the concept of local check code, such as lp1 and lp2 in Figure 2.
  • the generation method is to first group the data blocks x1-x5 based on the relationship and properties of the data blocks. For example, in the example in Figure 2, x1-x3 are grouped together, and x4 and x5 are grouped together. Then:
  • LRC only targets faster recovery from an arbitrary single error.
  • the global check codes p1 and p2 are required. Therefore, from the perspective of data security, the security of the global check codes is equally important.
  • the local check codes of LRC cannot be used for recovery. When re-encoding is required, all data blocks have to be read, which is inefficient.
  • improvements are made on the basis of the LRC generation method so that it can achieve fast recovery of a single error (with a 99.75% probability of occurrence) of the check code block without increasing redundancy.
  • Figure 3 is a schematic diagram of the relationship between a global check code and a local check code provided in some embodiments of the present application.
  • x1, x2, x3, x4 and x5 represent data of the data block
  • p1 and p2 represent global check codes
  • lp1 and lp2 represent local check codes
  • l1, l2 and l3 represent parameters for calculating lp1, l4 and l5 represent parameters for calculating lp2
  • pp1 and pp2 represent transformation parameters for transforming between global check codes and local check codes.
  • the transformation parameters pp1 and pp2 are used to form a third group of local codes, and in order to avoid adding redundant data blocks, their operation result is set equal to the exclusive OR of lp1 and lp2, which is expressed as: $pp_1 p_1 + pp_2 p_2 = lp_1 + lp_2$, where + denotes Galois-field addition (bitwise exclusive OR).
  • li and ppi are the proposed operation parameters, and their values depend on the grouping relationship, that is, the number of data blocks, the grouping method, and the number of check codes.
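The dependence of li and ppi on the grouping can be made explicit with a short derivation for the Figure 3 example. It is a sketch under assumptions: the relation is read as $lp_1 + lp_2 = pp_1 p_1 + pp_2 p_2$ with + as Galois-field addition, and $a_{1j}$, $a_{2j}$ are taken to be the RS parameters of p1 and p2 from the first transformation relationship (the figure itself may name them differently).

```latex
\begin{aligned}
lp_1 + lp_2 &= l_1 x_1 + l_2 x_2 + l_3 x_3 + l_4 x_4 + l_5 x_5, \\
pp_1 p_1 + pp_2 p_2 &= \sum_{j=1}^{5} \left( pp_1 a_{1j} + pp_2 a_{2j} \right) x_j, \\
\Rightarrow\quad l_j &= pp_1 a_{1j} + pp_2 a_{2j}, \qquad j = 1, \dots, 5, \\
lp_1 &= lp_2 + pp_1 p_1 + pp_2 p_2, \qquad
p_1 = pp_1^{-1} \left( lp_1 + lp_2 + pp_2 p_2 \right).
\end{aligned}
```

The third line must hold for every possible data value, which is why the parameters are fixed by the coefficients and the grouping rather than by the stored data. The last line shows how a single missing lp1 or p1 follows from the other check codes alone, provided pp1 is nonzero.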
  • lp1 and lp2 and pp1 and pp2 form a local erasure group. Because the relationship based on the setting is:
  • this improved LRC algorithm adds no redundant code blocks and speeds up recovery when a check code block is in error, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.
  • an improved scheme based on LRC is proposed.
  • the check code block can obtain the same effect as the data code block recovery under the protection of the local check code block, reducing the amount of data reading required for recovery and improving the recovery speed.
  • the check code transformation relationship is determined according to the transformation relationship between the data block and the global check code and the local check code, and the mutual conversion between the global check code and the local check code can be realized based on the check code transformation relationship.
  • the transformation parameters for transforming between the global check code and the local check code can be determined according to the check code transformation relationship.
  • the above-mentioned transformation parameters can be used to restore the erroneous global check code or the local check code.
  • the above-mentioned process of restoring the check code does not need to be re-encoded and calculated according to the data code, which can reduce the calculation amount of restoring the check code and improve the efficiency of restoring the check code.
  • FIG. 4 is a schematic diagram of the structure of a check code recovery system provided in some embodiments of the present application; the system may include:
  • a check code generation module 401 is used to generate a global check code and a local check code of a data block
  • the transformation relationship determination module 402 is used to determine a first transformation relationship between a data block and a global check code, and to determine a second transformation relationship between a data block and a local check code; and is also used to generate a check code transformation relationship between a global check code and a local check code by combining the first transformation relationship and the second transformation relationship;
  • the check code recovery module 404 is used to recover the global check code or the local check code that has been reported as an error by using the transformation parameters.
  • the original data stored by the user can be received, the original data can be split into k data blocks, and then r global check codes and m local check codes of the k data blocks can be generated according to the erasure code algorithm.
  • the first transformation relationship between the data block and the global check code can be determined according to the generation method of the global check code, and the process is as follows: the global check codes corresponding to all data blocks are generated by the RS algorithm, and the first transformation relationship between the data block and the global check code is determined according to the RS algorithm; wherein the first transformation relationship is: $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
  • the second transformation relationship between the data block and the local check code may also be determined according to the generation method of the local check code, and the process is as follows: all data blocks are divided into m data block groups, a local check code corresponding to each data block group is generated by the RS algorithm, and the second transformation relationship between the data block and the local check code is determined according to the RS algorithm; wherein the second transformation relationship is: $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
  • the first transformation relationship describes the relationship between the data block and the global check code
  • the second transformation relationship describes the relationship between the data block and the local check code.
  • the check code transformation relationship between the global check code and the local check code can be obtained by combining the formulas of the first transformation relationship and the second transformation relationship.
  • the transformation parameters for transforming between the global check code and the local check code can be determined.
  • the local check code can be obtained according to the global check code and the transformation parameters, and the global check code can also be obtained according to the local check code and the transformation parameters.
  • in this step, there may be an operation to detect whether a global check code or a local check code reports an error; if a global check code reports an error, it can be restored using the local check codes, the transformation parameters and the remaining global check codes; if a local check code reports an error, it can be restored using the global check codes, the transformation parameters and the remaining local check codes.
  • the check code transformation relationship is determined according to the transformation relationship between the data block and the global check code and the local check code, respectively, and the mutual conversion between the global check code and the local check code can be realized based on the check code transformation relationship.
  • the transformation parameters for transforming between the global check code and the local check code can be determined.
  • the above transformation parameters can be used to restore the erroneous global check code or local check code. The above process of restoring the check code does not need to be re-encoded and calculated according to the data code, which can reduce the amount of calculation for restoring the check code and improve the efficiency of restoring the check code.
  • the process of generating a global check code of a data block by the check code generation module 401 includes: generating the global check codes corresponding to all data blocks by using the RS algorithm.
  • the process of determining the first transformation relationship between the data block and the global check code by the transformation relationship determination module 402 includes: determining the first transformation relationship between the data block and the global check code according to the RS algorithm;
  • the first transformation relationship is: $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
  • the process of generating a local check code of a data block by the check code generation module 401 includes: dividing all data blocks into m data block groups, and generating a local check code corresponding to each data block group by using the RS algorithm.
  • the process of determining the second transformation relationship between the data block and the local check code by the transformation relationship determination module 402 includes: determining the second transformation relationship between the data block and the local check code according to the RS algorithm;
  • the second transformation relationship is: $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
  • the process of the transformation relationship determination module 402 generating the check code transformation relationship of the global check code and the local check code by combining the first transformation relationship and the second transformation relationship includes: constructing an equation group according to the first transformation relationship corresponding to all global check codes and the second transformation relationship corresponding to all local check codes; solving the equation group to obtain the check code transformation relationship; wherein the expression corresponding to the check code transformation relationship is: $lp_1 + lp_2 + \dots + lp_m = pp_1 p_1 + pp_2 p_2 + \dots + pp_r p_r$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ represents the transformation parameter for transforming between the global check code and the local check code.
  • The check code recovery module 404 uses the transformation parameters to recover an erroneous global or local check code as follows: if a target local check code reports an error, determining whether the number of target local check codes is greater than 1; if the number of target local check codes is equal to 1, recovering the target local check code using the transformation parameters, all global check codes, and all local check codes other than the target local check code.
  • The check code recovery module 404 is further configured to, after determining whether the number of target local check codes is greater than 1 and finding that it is, set the data block group corresponding to each target local check code as a target data block group, and recover the target local check code using all data blocks in the target data block group.
  • The check code recovery module 404 uses the transformation parameters to recover an erroneous global or local check code as follows: if a target global check code reports an error, determining whether the number of target global check codes is greater than 1; if the number of target global check codes is equal to 1, recovering the target global check code using the transformation parameters, all local check codes, and all global check codes other than the target global check code.
  • The check code recovery module 404 is further configured to, after determining whether the number of target global check codes is greater than 1 and finding that it is, recover the target global check codes using the data blocks.
  • The system further includes:
  • A data block recovery module, configured to determine whether any data block is in error and, if so, to recover the erroneous data block using a global check code or a local check code.
  • The system further includes:
  • A data splitting module, configured to receive the original data stored by the user and split it into k data blocks before the global check code and the local check code of the data blocks are generated.
  • Because the embodiments of the system correspond to the embodiments of the method, the system embodiments can be understood by referring to the description of the method embodiments, which is not repeated here.
  • Some embodiments of the present application provide a non-volatile computer-readable storage medium 50 on which a computer program 510 is stored.
  • The non-volatile computer-readable storage medium 50 may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.
  • The steps implemented when the computer program 510 is executed include: generating a global check code and a local check code for data blocks; determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code; combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code; determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code; and using the transformation parameters to recover an erroneous global check code or local check code.
  • An electronic device provided in some embodiments may include a memory 61 and a processor 62.
  • The memory 61 stores a computer program.
  • When the processor 62 calls the computer program in the memory 61, the steps provided in the above embodiments can be implemented.
  • The electronic device may also include network interfaces, a power supply, and other components.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Error Detection And Correction (AREA)
  • Detection And Correction Of Errors (AREA)

Abstract

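The present application discloses a check code recovery method and system, an electronic device, and a storage medium, and belongs to the technical field of data storage. The check code recovery method includes: generating a global check code and a local check code for data blocks; determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code; combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code; determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code; and using the transformation parameters to recover a global check code or local check code that reports an error. This recovery process does not require re-encoding from the data code, which reduces the amount of computation needed to recover a check code and improves check code recovery efficiency.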

Description

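Check code recovery method and system, electronic device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202211409901.7, filed with the China Patent Office on November 11, 2022 and entitled "Check code recovery method and system, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data storage, and in particular to a check code recovery method and system, an electronic device, and a storage medium.
Background
Faced with the storage requirements of massive data, distributed storage, with advantages such as low cost and good scalability, has gradually displaced unified storage from its dominant position and has received growing attention in both theoretical research and practical applications. Distributed storage systems mostly use inexpensive disks as storage nodes, so the reliability of each individual node is often not high. In addition, a distributed storage system usually contains many nodes, and node failures occur frequently because of software and hardware faults, human error, and other causes. To improve the data reliability of a distributed storage system and ensure that a data collection node can reconstruct the original file with high probability, a certain amount of redundancy must be stored in addition to the original data, so that the system can continue to operate normally when some nodes fail and the data collection node can still decode and recover the original file. At the same time, to maintain system reliability, failed nodes must be repaired promptly, so a well-designed node repair mechanism is very important.
To protect user data and ensure that it can still be recovered after events such as power loss, distributed storage usually applies erasure coding algorithms such as those mentioned above, forming erasure-coded disk groups to protect the data. When RS (Reed-Solomon) erasure coding is used for data protection, however, any error requires all remaining data to be read for the computation. LRC (Locally Repairable Codes) adds local redundancy codes and thereby speeds up correction in the single-error scenario, which is the most likely one. All erasure-coded disk group protection, though, presupposes that the check code blocks are intact. Although LRC improves the recovery speed for data disk errors, an error in a check code block still requires all remaining data blocks to be read.
Therefore, how to reduce the amount of computation needed to recover a check code and improve check code recovery efficiency is a technical problem that those skilled in the art currently need to solve.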
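Summary
The purpose of the present application is to provide a check code recovery method and system, an electronic device, and a storage medium that can reduce the amount of computation needed to recover a check code and improve check code recovery efficiency.
To solve the above technical problem, the present application provides a check code recovery method, which includes:
generating a global check code and a local check code for data blocks;
determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code;
combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
using the transformation parameters to recover a global check code or local check code that reports an error.
In some embodiments, generating the global check code of the data blocks includes:
generating the global check codes corresponding to all data blocks using the RS algorithm.
In some embodiments, determining the first transformation relationship between the data blocks and the global check code includes:
determining the first transformation relationship between the data blocks and the global check code according to the RS algorithm;
wherein the first transformation relationship is $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k Reed-Solomon (RS) algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
In some embodiments, generating the local check code of the data blocks includes:
dividing all data blocks into m data block groups, and generating the local check code corresponding to each data block group using the RS algorithm.
In some embodiments, determining the second transformation relationship between the data blocks and the local check code includes:
determining the second transformation relationship between the data blocks and the local check code according to the RS algorithm;
wherein the second transformation relationship is $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
In some embodiments, combining the first transformation relationship and the second transformation relationship to generate the check code transformation relationship between the global check code and the local check code includes:
constructing a system of equations from the first transformation relationships corresponding to all global check codes and the second transformation relationships corresponding to all local check codes;
solving the system of equations to obtain the check code transformation relationship, wherein the expression corresponding to the check code transformation relationship is $lp_1 + lp_2 + \dots + lp_m = pp_1p_1 + pp_2p_2 + \dots + pp_rp_r$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ represents the transformation parameter for transforming between the global check code and the local check code.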
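In some embodiments, using the transformation parameters to recover a global check code or local check code that reports an error includes:
if a target local check code reports an error, determining whether the number of target local check codes is greater than 1;
if the number of target local check codes is equal to 1, recovering the target local check code using the transformation parameters, all global check codes, and all local check codes other than the target local check code.
In some embodiments, after determining whether the number of target local check codes is greater than 1, the method further includes:
if the number of target local check codes is greater than 1, setting the data block group corresponding to each target local check code as a target data block group;
recovering the target local check code using all data blocks in the target data block group.
In some embodiments, using the transformation parameters to recover a global check code or local check code that reports an error includes:
if a target global check code reports an error, determining whether the number of target global check codes is greater than 1;
if the number of target global check codes is equal to 1, recovering the target global check code using the transformation parameters, all local check codes, and all global check codes other than the target global check code.
In some embodiments, after determining whether the number of target global check codes is greater than 1, the method further includes:
if the number of target global check codes is greater than 1, recovering the target global check codes using the data blocks.
In some embodiments, the method further includes:
determining whether any data block is in error;
if so, recovering the erroneous data block using a global check code or a local check code.
In some embodiments, before generating the global check code and the local check code of the data blocks, the method further includes:
receiving original data stored by a user, and splitting the original data into k data blocks.
The present application also provides a check code recovery system, which includes:
a check code generation module, configured to generate a global check code and a local check code for data blocks;
a transformation relationship determination module, configured to determine a first transformation relationship between the data blocks and the global check code and a second transformation relationship between the data blocks and the local check code, and further configured to combine the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
a parameter determination module, configured to determine, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
a check code recovery module, configured to use the transformation parameters to recover a global check code or local check code that reports an error.
The present application also provides a non-volatile computer-readable storage medium on which a computer program is stored; when executed, the computer program implements the steps of the above check code recovery method.
The present application also provides an electronic device including a memory and a processor; the memory stores a computer program, and the processor implements the steps of the above check code recovery method when it calls the computer program in the memory.
The present application provides a check code recovery method, including: generating a global check code and a local check code for data blocks; determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code; combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code; determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code; and using the transformation parameters to recover a global check code or local check code that reports an error.
After obtaining the global check code and local check code of the data blocks, the present application determines the check code transformation relationship from the transformation relationships between the data blocks and the global check code and between the data blocks and the local check code. Based on the check code transformation relationship, global check codes and local check codes can be converted into one another. The transformation parameters for transforming between the global check code and the local check code can be determined from the check code transformation relationship, and when a global or local check code reports an error, these parameters can be used to recover it. This recovery process does not require re-encoding from the data code, which reduces the amount of computation needed to recover a check code and improves check code recovery efficiency. The present application also provides a check code recovery system, a non-volatile computer-readable storage medium, and an electronic device, which have the same beneficial effects and are not described again here.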
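Brief Description of the Drawings
To explain the technical solutions in some embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a check code recovery method provided in some embodiments of the present application;
Fig. 2 is an example diagram of an LRC structure provided in some embodiments of the present application;
Fig. 3 is a schematic diagram of the relationship between global check codes and local check codes provided in some embodiments of the present application;
Fig. 4 is a schematic structural diagram of a check code recovery system provided in some embodiments of the present application;
Fig. 5 is a schematic structural diagram of a non-volatile computer-readable storage medium provided in some embodiments of the present application;
Fig. 6 is a schematic structural diagram of an electronic device provided in some embodiments of the present application.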
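Detailed Description
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Referring to Fig. 1, a flowchart of a check code recovery method provided in some embodiments of the present application, the method may include the following steps:
S101: Generate a global check code and a local check code for the data blocks;
Some embodiments may be applied to a distributed storage system. In some embodiments, the original data stored by a user may be received and split into k data blocks, and r global check codes and m local check codes of the k data blocks are then generated according to an erasure coding algorithm.
Erasure coding is a forward error correction technique from coding theory. It was first applied in communications to deal with loss and corruption during data transmission. Because erasure coding proved effective at preventing data loss, it was introduced into the storage field. Erasure codes can effectively reduce storage overhead while providing the same reliability, so the technique is widely used in major storage systems and data centers.
There are many kinds of erasure codes. One commonly seen in practical storage systems is the RS code used in distributed environments. An RS code is governed by two parameters, k and r. Given two positive integers k and r, an RS code encodes k data blocks into r additional check codes (that is, the data in the check blocks). When the r check codes are produced by encoding with a Vandermonde matrix or a Cauchy matrix, the code is called an RS erasure code with Vandermonde matrix encoding or Cauchy matrix encoding, respectively.
In some embodiments, all data blocks may also be divided into m data block groups, and the RS algorithm is used to generate the local check code corresponding to each data block group, with each data block group corresponding to one local check code. After the global and local check codes are obtained, it may also be determined whether any data block is in error; if so, the erroneous data block is recovered using a global check code or a local check code.
S102: Determine a first transformation relationship between the data blocks and the global check code, and determine a second transformation relationship between the data blocks and the local check code;
In some embodiments, the first transformation relationship between the data blocks and the global check code can be determined from the way the global check code is generated, as follows: the global check codes corresponding to all data blocks are generated using the RS algorithm, and the first transformation relationship is determined according to the RS algorithm. The first transformation relationship is $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
In some embodiments, the second transformation relationship between the data blocks and the local check code can likewise be determined from the way the local check code is generated, as follows: all data blocks are divided into m data block groups, the local check code corresponding to each data block group is generated using the RS algorithm, and the second transformation relationship is determined according to the RS algorithm. The second transformation relationship is $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
S103: Combine the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
The first transformation relationship describes the relationship between the data blocks and the global check code, and the second describes the relationship between the data blocks and the local check code. Solving the formulas of the two relationships simultaneously yields the check code transformation relationship between the global check code and the local check code.
S104: Determine, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
Once the check code transformation relationship is obtained, the transformation parameters for transforming between the global check code and the local check code can be determined. A local check code can then be obtained from the global check codes and the transformation parameters, and a global check code can be obtained from the local check codes and the transformation parameters.
S105: Use the transformation parameters to recover a global check code or local check code that reports an error.
In some embodiments, before S105 is performed, there may be an operation of detecting whether a global or local check code reports an error. If a global check code reports an error, it can be recovered using the local check codes, the transformation parameters, and some of the global check codes; if a local check code reports an error, it can be recovered using the global check codes, the transformation parameters, and some of the local check codes.
In the above scheme, when any single check code is in error, only the remaining check code blocks need to be read, and there is no need to read all the data blocks. The recovery speed of check code blocks is thus improved without adding redundant storage blocks and without affecting the function of the local check codes.
Some embodiments thus provide an improved LRC algorithm that speeds up recovery from check code block errors without adding redundant code blocks, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.
In some embodiments, after the global check code and local check code of the data blocks are obtained, the check code transformation relationship is determined from the transformation relationships between the data blocks and the global check code and between the data blocks and the local check code. Based on the check code transformation relationship, global check codes and local check codes can be converted into one another. The transformation parameters for transforming between the global check code and the local check code can be determined from the check code transformation relationship, and when a global or local check code reports an error, these parameters can be used to recover it. This recovery process does not require re-encoding from the data code, which reduces the amount of computation needed to recover a check code and improves check code recovery efficiency.
In some embodiments, the check code transformation relationship between the global check code and the local check code may be generated from the first and second transformation relationships as follows: construct a system of equations from the first transformation relationships corresponding to all global check codes and the second transformation relationships corresponding to all local check codes, then solve the system of equations to obtain the check code transformation relationship.
The system of equations constructed above may be:
$p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$ for $1 \le i \le r$, and $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$ for $1 \le n \le m$
The expression corresponding to the check code transformation relationship is: $lp_1 + lp_2 + \dots + lp_m = pp_1p_1 + pp_2p_2 + \dots + pp_rp_r$
In the above expressions, $lp_n$ (that is, $lp_1, lp_2, \dots, lp_m$) represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ (that is, $p_1, p_2, \dots, p_r$) represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ (that is, $pp_1, pp_2, \dots, pp_r$) represents the transformation parameter for transforming between the global check code and the local check code.
After all global check codes and local check codes are obtained, a set of transformation parameters $pp_1, pp_2, \dots, pp_r$ satisfying the above expression can be calculated and stored.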
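As a minimal sketch of how such a parameter set could be checked, the Python below derives local coefficients from assumed global coefficients and assumed transformation parameters, then confirms that the sum of the local check codes equals the pp-weighted sum of the global check codes. The coefficient rows, the values in `PP`, the grouping, the field polynomial 0x11D, and all function names are illustrative assumptions rather than values from the application, and each block is modeled as a single GF(2^8) symbol.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two symbols in GF(2^8); the polynomial 0x11D is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_dot(coeffs, symbols):
    """Sum of coefficient * symbol products; addition in GF(2^8) is XOR."""
    acc = 0
    for c, s in zip(coeffs, symbols):
        acc ^= gf_mul(c, s)
    return acc


def derive_local_coeffs(global_rows, pp):
    """For every data block j, compute l_j = sum_i pp_i * a_ij."""
    k = len(global_rows[0])
    return [gf_dot(pp, [row[j] for row in global_rows]) for j in range(k)]


def encode(data, global_rows, groups, local_coeffs):
    """Return (global check codes, local check codes) for one stripe."""
    global_codes = [gf_dot(row, data) for row in global_rows]
    local_codes = [
        gf_dot([local_coeffs[j] for j in group], [data[j] for j in group])
        for group in groups
    ]
    return global_codes, local_codes


# Hypothetical layout: k = 5 data blocks, r = 2 global codes, m = 2 groups.
GLOBAL_ROWS = [[1, 1, 1, 1, 1], [1, 2, 3, 4, 5]]
PP = [1, 2]
GROUPS = [[0, 1, 2], [3, 4]]
LOCAL_COEFFS = derive_local_coeffs(GLOBAL_ROWS, PP)

data = [17, 34, 51, 68, 85]
p, lp = encode(data, GLOBAL_ROWS, GROUPS, LOCAL_COEFFS)
relation_holds = (lp[0] ^ lp[1]) == gf_dot(PP, p)
```

The parameters are chosen here so that every local coefficient is nonzero; a coefficient of zero would leave the corresponding data block unprotected by its local check code, so a real implementation would presumably reject such a parameter set.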
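In some embodiments, a check code can be recovered as follows:
Step 1: Generate a global check code and a local check code for the data blocks;
Step 2: Determine a first transformation relationship between the data blocks and the global check code, and determine a second transformation relationship between the data blocks and the local check code;
Step 3: Combine the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
Step 4: Determine, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
Step 5: If a target local check code reports an error, determine whether the number of target local check codes is greater than 1. If the number is equal to 1, recover the target local check code using the transformation parameters, all global check codes, and all local check codes other than the target local check code. If the number is greater than 1, set the data block group corresponding to each target local check code as a target data block group, and recover the target local check code using all data blocks in the target data block group.
Step 6: If a target global check code reports an error, determine whether the number of target global check codes is greater than 1. If the number is equal to 1, recover the target global check code using the transformation parameters, all local check codes, and all global check codes other than the target global check code. If the number is greater than 1, recover the target global check codes using the data blocks.
In the above scheme, when any single check code is in error, only the remaining check code blocks need to be read, and there is no need to read all the data blocks. The recovery speed of check code blocks is thus improved without adding redundant storage blocks and without affecting the function of the local check codes. Some embodiments thus provide an improved LRC algorithm that speeds up recovery from check code block errors without adding redundant code blocks, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.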
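The branching in steps 5 and 6 can be sketched as a small decision function. The function name and the returned labels are hypothetical and only name the source of information each branch would read; the case where local and global check codes fail at the same time is not described in the text and is deliberately left out.

```python
def plan_check_code_recovery(kind, failed_count):
    """Pick a recovery path for erroneous check codes of one kind.

    kind: "local" or "global".
    failed_count: number of check codes of that kind reporting an error.
    Returns a label naming what the recovery would read (labels are illustrative).
    """
    if kind not in ("local", "global"):
        raise ValueError("kind must be 'local' or 'global'")
    if failed_count < 1:
        raise ValueError("failed_count must be at least 1")
    if failed_count == 1:
        # Single error: transformation parameters plus the surviving check codes.
        return "transform"
    if kind == "local":
        # Several local codes failed: re-encode each from its own data block group.
        return "group_data_blocks"
    # Several global codes failed: re-encode from the data blocks.
    return "all_data_blocks"
```

Under this reading, only the single-error branch avoids touching data blocks, which is the case the scheme is designed to speed up.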
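The flow described in the above embodiments is explained in further detail below.
The RS erasure code based on the Vandermonde matrix is as follows:
The RS erasure code based on the Cauchy matrix is as follows:
In the above, the k×k matrix corresponds to the k original data blocks, and the r×k matrix in the lower part is the encoding matrix. Multiplying by the original data D1 to Dk yields the newly added P1 to Pr, which are the data of the r check blocks produced by encoding (that is, the check codes). When at most r of these blocks are corrupted or lost in transmission and need to be corrected, multiplying the inverse of the matrix corresponding to the remaining data by that data yields the original data blocks D1 to Dk.
Taking decoding after the loss of data D1 to Dk as an example, the process is as follows:
It can be seen that the core idea of erasure coding is to construct an invertible encoding matrix to generate the check data; the original data can then be recovered by computing with its inverse matrix. Common RS erasure codes use the Cauchy or Vandermonde matrices introduced above. Their advantages are that the resulting matrix is guaranteed to be invertible, that any of its submatrices is also invertible, and that the matrix is easy to extend in size.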
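A hedged sketch of the encoding side of this idea follows. It assumes GF(2^8) with polynomial 0x11D and one particular Vandermonde layout (row i holds (j+1)^i), neither of which is confirmed by the text, and it models each block as one symbol. General decoding inverts a k×k submatrix; to stay short, the sketch only repairs a single lost data block using the all-ones row.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two symbols in GF(2^8); the polynomial 0x11D is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def vandermonde_rows(r, k):
    """Assumed layout: row i, column j holds (j + 1) ** i."""
    return [[gf_pow(j + 1, i) for j in range(k)] for i in range(r)]


def rs_encode(data, rows):
    """Each check code is the GF(2^8) dot product of one row with the data."""
    parity = []
    for row in rows:
        acc = 0
        for coeff, symbol in zip(row, data):
            acc ^= gf_mul(coeff, symbol)
        parity.append(acc)
    return parity


data = [17, 34, 51, 68, 85]      # k = 5 one-symbol data blocks
rows = vandermonde_rows(2, 5)    # r = 2 check codes
parity = rs_encode(data, rows)

# Repair one lost data block with the all-ones row: XOR parity[0] with survivors.
lost = 2
repaired = parity[0]
for j, symbol in enumerate(data):
    if j != lost:
        repaired ^= symbol
```

Even this single-block repair reads every surviving data block, which is the cost that the local check codes discussed next are meant to reduce.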
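The inverse matrix for common RS erasure codes is computed by Gaussian elimination. This general method can invert any invertible matrix, but it is not optimized for the properties of the encoding matrix, so although the computation is regular, it introduces a large number of redundant operations. When k data blocks are stored with r check data blocks added, recovery of a single erroneous data block accounts for 99.75% of cases (statistics from a 2007 storage technology annual conference), yet Gaussian elimination needs $(k+r)^3$ operations to obtain the required inverse matrix before the corresponding data block can be recovered.
LRC is an improvement on RS and can be applied in Windows Azure Storage. A simple LRC structure is shown in Fig. 2, where x1 to x5 represent five data blocks stored by the user, p1 and p2 represent global check codes generated by RS, and lp1 and lp2 represent local check codes generated by RS.
The global check codes p1 and p2 are calculated with the RS algorithm as follows:
$p_1 = a_1x_1 + a_2x_2 + a_3x_3 + a_4x_4 + a_5x_5$, $p_2 = b_1x_1 + b_2x_2 + b_3x_3 + b_4x_4 + b_5x_5$ (1)
In the above formula, a and b denote different parameters whose values are determined by the RS formula relationship, such as Vandermonde or Cauchy, and the operations are carried out iteratively using Galois field multiplication and addition.
As can be seen, with only p1 and p2, any error requires all data blocks to be read for the computation. For the example in Fig. 2, at least five blocks must be read each time to complete the computation.
LRC introduces the concept of local check codes, such as lp1 and lp2 in Fig. 2. They are generated by first grouping the data blocks x1 to x5 according to their relationships and properties. In the example of Fig. 2, x1 to x3 form one group and x4 and x5 form another. Then:
$lp_1 = x_1 + x_2 + x_3$, $lp_2 = x_4 + x_5$ (2)
As noted above, 99.75% of errors are single-error cases, so for an error in any one of the user data blocks (x1 to x5) in the above example, lp1 or lp2 can be used for fast recovery.
That is, based on the relationship in formula (2), when an error occurs in any one of x1 to x3, only lp1 and the remaining two data blocks need to be read to recover it by XOR. When an error occurs in either x4 or x5, only lp2 and the remaining data block need to be read. This is a considerable improvement over recovery with p1 and p2, which requires reading at least five blocks.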
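The local repair described here can be illustrated with a short Python sketch. It assumes the local check code is the plain byte-wise XOR of the blocks in its group, and the block contents and the helper name `xor_blocks` are made up for the example.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Group 1 holds x1..x3, group 2 holds x4 and x5 (contents are illustrative).
x1, x2, x3 = b"abcd", b"efgh", b"ijkl"
x4, x5 = b"mnop", b"qrst"

lp1 = xor_blocks(x1, x2, x3)
lp2 = xor_blocks(x4, x5)

# Lose x2: read only lp1 and the two surviving blocks of its group.
x2_repaired = xor_blocks(lp1, x1, x3)

# Lose x5: read only lp2 and x4.
x5_repaired = xor_blocks(lp2, x4)
```

Repairing x2 reads three blocks and repairing x5 reads two, compared with at least five when only the global check codes are available.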
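LRC, however, only improves the single-error case. When more than one error occurs, the global check blocks p1 and p2 are needed. From the standpoint of data safety, the integrity of the global check codes is therefore just as important. With conventional LRC, when a global check code is in error, it cannot be recovered from the local check codes; it must be re-encoded, which requires reading all data blocks and is inefficient.
To address this problem, some embodiments improve on the LRC generation method so that a single error in a check code block (which occurs with 99.75% probability) can be recovered quickly without adding redundancy.
Referring to Fig. 3, a schematic diagram of the relationship between global check codes and local check codes provided in some embodiments of the present application: x1, x2, x3, x4 and x5 represent the data of the data blocks, p1 and p2 represent global check codes, lp1 and lp2 represent local check codes, l1, l2 and l3 represent the parameters used to calculate lp1, l4 and l5 represent the parameters used to calculate lp2, and pp1 and pp2 represent the transformation parameters for transforming between the global and local check codes.
In some embodiments, to speed up decoding when a single error occurs in p1 or p2, the transformation parameters pp1 and pp2 are used to form a third group of local codes. To avoid extra redundant data blocks, the result of this operation is made equal to the XOR of lp1 and lp2, expressed as:
$pp_1p_1 + pp_2p_2 = lp_1 + lp_2$ (3)
For formula (3) to hold, the generation of lp1 and lp2 must also be modified, expressed as:
$lp_1 = l_1x_1 + l_2x_2 + l_3x_3$, $lp_2 = l_4x_4 + l_5x_5$ (4)
The $l_i$ and $pp_i$ in formulas (3) and (4) are the operation parameters proposed here. Their values depend on the grouping relationship, that is, the number of data blocks, the grouping scheme, and the number of check codes. For the case in Fig. 3, and for any RS algorithm, formula (1) for p1 and p2 can be substituted to write the above as:
$pp_1(a_1x_1 + \dots + a_5x_5) + pp_2(b_1x_1 + \dots + b_5x_5) = l_1x_1 + l_2x_2 + l_3x_3 + l_4x_4 + l_5x_5$ (5)
Because $l_i$ and $pp_i$ are the proposed operation parameters, their values must satisfy certain conditions for the formula to hold. These conditions can be derived from formula (5): matching the coefficients of each $x_i$ on both sides gives:
$pp_1a_i + pp_2b_i = l_i$, $i = 1, \dots, 5$ (6)
For formula (5) to hold, the relationship in formula (6) must be satisfied.
In formula (6), $a_i$ and $b_i$ are known, so once pp1 and pp2 are given, $l_1, \dots, l_5$ can be solved.
This completes the proof: the values of all $l_i$ and $pp_i$ in Fig. 3 can be obtained from formula (6).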
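As an example of the above relationship, suppose the RS algorithm uses the Vandermonde relationship, that is: $p_1 = x_1 + x_2 + x_3 + x_4 + x_5$, $p_2 = x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5$
Then a valid solution is easily obtained from the above: pp1 = pp2 = 1, l1 = 2, l2 = 3, l3 = 4, l4 = 5, l5 = 6. Verifying the relationship: $lp_1 + lp_2 = 2x_1 + 3x_2 + 4x_3 + 5x_4 + 6x_5 = p_1 + p_2$
The above relationship holds, so formula (3) holds.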
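The quoted solution can be checked numerically with the sketch below. It assumes global coefficients a_i = 1 and b_i = i (the first two Vandermonde rows) and, purely so that the coefficients come out as the quoted values 2 to 6, it assumes arithmetic modulo the prime 257. In GF(2^8), where addition is XOR, the same formula pp1·a_i + pp2·b_i = l_i would yield different coefficient values, so this is an illustration of the algebra rather than of any particular field used by the application. The data values are made up.

```python
P = 257  # assumed prime modulus, chosen only to reproduce the quoted figures

a = [1, 1, 1, 1, 1]   # assumed coefficients of p1
b = [1, 2, 3, 4, 5]   # assumed coefficients of p2
pp1, pp2 = 1, 1       # transformation parameters quoted in the text

# Coefficient matching: l_i = pp1 * a_i + pp2 * b_i
l = [(pp1 * ai + pp2 * bi) % P for ai, bi in zip(a, b)]

x = [10, 20, 30, 40, 50]  # illustrative one-symbol data blocks

p1 = sum(ai * xi for ai, xi in zip(a, x)) % P
p2 = sum(bi * xi for bi, xi in zip(b, x)) % P
lp1 = sum(li * xi for li, xi in zip(l[:3], x[:3])) % P
lp2 = sum(li * xi for li, xi in zip(l[3:], x[3:])) % P

# Check that pp1*p1 + pp2*p2 equals lp1 + lp2 in the assumed field.
relation_holds = (pp1 * p1 + pp2 * p2) % P == (lp1 + lp2) % P
```

The relation holds for any data values because it follows from matching coefficients, not from the particular numbers chosen.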
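Therefore, when the above method is implemented in some embodiments, lp1, lp2, pp1 and pp2 form a local erasure group, because the configured relationship is: $pp_1p_1 + pp_2p_2 = lp_1 + lp_2$
No additional redundant storage is therefore needed. When any one of the check codes lp1, lp2, p1 and p2 is in error, its correction relationship is: $lp_1 = pp_1p_1 + pp_2p_2 - lp_2$, $lp_2 = pp_1p_1 + pp_2p_2 - lp_1$, $p_1 = (lp_1 + lp_2 - pp_2p_2)/pp_1$, $p_2 = (lp_1 + lp_2 - pp_1p_1)/pp_2$ (in a Galois field of characteristic 2, subtraction coincides with addition, that is, XOR).
Compared with the LRC described above, when any single check code is in error, only the remaining check code blocks need to be read, and there is no need to read all the data blocks. The recovery speed of check code blocks is thus improved without adding redundant storage blocks and without affecting the function of the local check codes. Some embodiments thus provide an improved LRC algorithm that speeds up recovery from check code block errors without adding redundant code blocks, so that the resulting erasure-coded disk group recovers quickly in any single-error scenario.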
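A minimal sketch of recovering any single check code from the other three is given below. It assumes GF(2^8) arithmetic with polynomial 0x11D, hypothetical global coefficients `A` and `B`, and hypothetical transformation parameters `PP1 = 1` and `PP2 = 2`; the local coefficients are derived as l_i = pp1·a_i + pp2·b_i so that pp1·p1 + pp2·p2 = lp1 + lp2 holds. None of these values or names come from the application, and each block is modeled as one symbol.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two symbols in GF(2^8); the polynomial 0x11D is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), found by exhaustive search."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return next(c for c in range(1, 256) if gf_mul(a, c) == 1)


def gf_dot(coeffs, symbols):
    """Sum of coefficient * symbol products; addition in GF(2^8) is XOR."""
    acc = 0
    for c, s in zip(coeffs, symbols):
        acc ^= gf_mul(c, s)
    return acc


A = [1, 1, 1, 1, 1]   # hypothetical coefficients of p1
B = [1, 2, 3, 4, 5]   # hypothetical coefficients of p2
PP1, PP2 = 1, 2       # hypothetical transformation parameters
L = [gf_mul(PP1, a) ^ gf_mul(PP2, b) for a, b in zip(A, B)]


def encode(x):
    """Compute the four check codes for five one-symbol data blocks."""
    return {
        "p1": gf_dot(A, x),
        "p2": gf_dot(B, x),
        "lp1": gf_dot(L[:3], x[:3]),
        "lp2": gf_dot(L[3:], x[3:]),
    }


def recover(missing, codes):
    """Rebuild one check code from the other three without reading data blocks."""
    if missing == "lp1":
        return gf_mul(PP1, codes["p1"]) ^ gf_mul(PP2, codes["p2"]) ^ codes["lp2"]
    if missing == "lp2":
        return gf_mul(PP1, codes["p1"]) ^ gf_mul(PP2, codes["p2"]) ^ codes["lp1"]
    if missing == "p1":
        rest = codes["lp1"] ^ codes["lp2"] ^ gf_mul(PP2, codes["p2"])
        return gf_mul(gf_inv(PP1), rest)
    if missing == "p2":
        rest = codes["lp1"] ^ codes["lp2"] ^ gf_mul(PP1, codes["p1"])
        return gf_mul(gf_inv(PP2), rest)
    raise ValueError("unknown check code: " + missing)


codes = encode([17, 34, 51, 68, 85])
recovered = {}
for name in codes:
    survivors = {key: value for key, value in codes.items() if key != name}
    recovered[name] = recover(name, survivors)
```

Each recovery reads three check codes instead of five data blocks. Recovering a global check code additionally needs the inverse of its transformation parameter, so a zero parameter would not be usable in this sketch.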
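Some embodiments propose an improved scheme based on LRC. By modifying the encoding and decoding, and without changing the repair behavior of the local check code blocks or adding any redundant storage, check code blocks obtain the same recovery benefit that data blocks obtain under the protection of local check code blocks, which reduces the amount of data read during recovery and increases recovery speed. In some embodiments, after the global check code and local check code of the data blocks are obtained, the check code transformation relationship is determined from the transformation relationships between the data blocks and the global check code and between the data blocks and the local check code. Based on the check code transformation relationship, global check codes and local check codes can be converted into one another. The transformation parameters for transforming between the global check code and the local check code can be determined from the check code transformation relationship, and when a global or local check code reports an error, these parameters can be used to recover it. This recovery process does not require re-encoding from the data code, which reduces the amount of computation needed to recover a check code and improves check code recovery efficiency.
Referring to Fig. 4, a schematic structural diagram of a check code recovery system provided in some embodiments of the present application, the system may include:
a check code generation module 401, configured to generate a global check code and a local check code for data blocks;
a transformation relationship determination module 402, configured to determine a first transformation relationship between the data blocks and the global check code and a second transformation relationship between the data blocks and the local check code, and further configured to combine the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
a parameter determination module 403, configured to determine, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
a check code recovery module 404, configured to use the transformation parameters to recover a global check code or local check code that reports an error.
Some embodiments may be applied to a distributed storage system. In some embodiments, the original data stored by a user may be received and split into k data blocks, and r global check codes and m local check codes of the k data blocks are then generated according to an erasure coding algorithm.
In some embodiments, the first transformation relationship between the data blocks and the global check code can be determined from the way the global check code is generated, as follows: the global check codes corresponding to all data blocks are generated using the RS algorithm, and the first transformation relationship is determined according to the RS algorithm. The first transformation relationship is $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
In some embodiments, the second transformation relationship between the data blocks and the local check code can likewise be determined from the way the local check code is generated, as follows: all data blocks are divided into m data block groups, the local check code corresponding to each data block group is generated using the RS algorithm, and the second transformation relationship is determined according to the RS algorithm. The second transformation relationship is $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
The first transformation relationship describes the relationship between the data blocks and the global check code, and the second describes the relationship between the data blocks and the local check code. Solving the formulas of the two relationships simultaneously yields the check code transformation relationship between the global check code and the local check code.
Once the check code transformation relationship is obtained, the transformation parameters for transforming between the global check code and the local check code can be determined. A local check code can then be obtained from the global check codes and the transformation parameters, and a global check code can be obtained from the local check codes and the transformation parameters.
Before recovery is performed, there may be an operation of detecting whether a global or local check code reports an error. If a global check code reports an error, it can be recovered using the local check codes, the transformation parameters, and some of the global check codes; if a local check code reports an error, it can be recovered using the global check codes, the transformation parameters, and some of the local check codes.
In some embodiments, after the global check code and local check code of the data blocks are obtained, the check code transformation relationship is determined from the transformation relationships between the data blocks and the global check code and between the data blocks and the local check code. Based on the check code transformation relationship, global check codes and local check codes can be converted into one another. The transformation parameters for transforming between the global check code and the local check code can be determined from the check code transformation relationship, and when a global or local check code reports an error, these parameters can be used to recover it. This recovery process does not require re-encoding from the data code, which reduces the amount of computation needed to recover a check code and improves check code recovery efficiency.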
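In some embodiments, the check code generation module 401 generates the global check code of the data blocks by: generating the global check codes corresponding to all data blocks using the RS algorithm.
In some embodiments, the transformation relationship determination module 402 determines the first transformation relationship between the data blocks and the global check code by: determining the first transformation relationship according to the RS algorithm;
wherein the first transformation relationship is $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of data blocks.
In some embodiments, the check code generation module 401 generates the local check code of the data blocks by: dividing all data blocks into m data block groups, and generating the local check code corresponding to each data block group using the RS algorithm.
In some embodiments, the transformation relationship determination module 402 determines the second transformation relationship between the data blocks and the local check code by: determining the second transformation relationship according to the RS algorithm;
wherein the second transformation relationship is $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
In some embodiments, the transformation relationship determination module 402 generates the check code transformation relationship between the global check code and the local check code by combining the first and second transformation relationships as follows: constructing a system of equations from the first transformation relationships corresponding to all global check codes and the second transformation relationships corresponding to all local check codes; and solving the system of equations to obtain the check code transformation relationship. The expression corresponding to the check code transformation relationship is $lp_1 + lp_2 + \dots + lp_m = pp_1p_1 + pp_2p_2 + \dots + pp_rp_r$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, m represents the total number of local check codes, $p_i$ represents the i-th global check code, $1 \le i \le r$, r represents the total number of global check codes, and $pp_i$ represents the transformation parameter for transforming between the global check code and the local check code.
In some embodiments, the check code recovery module 404 uses the transformation parameters to recover an erroneous global or local check code as follows: if a target local check code reports an error, determining whether the number of target local check codes is greater than 1; if the number is equal to 1, recovering the target local check code using the transformation parameters, all global check codes, and all local check codes other than the target local check code.
In some embodiments, the check code recovery module 404 is further configured to, after determining whether the number of target local check codes is greater than 1 and finding that it is, set the data block group corresponding to each target local check code as a target data block group, and recover the target local check code using all data blocks in the target data block group.
In some embodiments, the check code recovery module 404 uses the transformation parameters to recover an erroneous global or local check code as follows: if a target global check code reports an error, determining whether the number of target global check codes is greater than 1; if the number is equal to 1, recovering the target global check code using the transformation parameters, all local check codes, and all global check codes other than the target global check code.
In some embodiments, the check code recovery module 404 is further configured to, after determining whether the number of target global check codes is greater than 1 and finding that it is, recover the target global check codes using the data blocks.
In some embodiments, the system further includes:
a data block recovery module, configured to determine whether any data block is in error and, if so, to recover the erroneous data block using a global check code or a local check code.
In some embodiments, the system further includes:
a data splitting module, configured to receive original data stored by a user and split it into k data blocks before the global check code and the local check code of the data blocks are generated.
Because the embodiments of the system correspond to the embodiments of the method, the system embodiments can be understood by referring to the description of the method embodiments, which is not repeated here.
Referring to Fig. 5, some embodiments of the present application provide a non-volatile computer-readable storage medium 50 on which a computer program 510 is stored; when executed, the computer program 510 can implement the steps provided in the above embodiments. The non-volatile computer-readable storage medium 50 may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.
The steps implemented when the computer program 510 is executed include: generating a global check code and a local check code for data blocks; determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code; combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code; determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code; and using the transformation parameters to recover a global check code or local check code that reports an error.
Referring to Fig. 6, some embodiments of the present application provide an electronic device that may include a memory 61 and a processor 62. The memory 61 stores a computer program, and when the processor 62 calls the computer program in the memory 61, the steps provided in the above embodiments can be implemented. The electronic device may of course also include network interfaces, a power supply, and other components.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method. It should be noted that those of ordinary skill in the art may make various improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.
It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.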

Claims (20)

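  1. A check code recovery method, comprising:
    generating a global check code and a local check code for data blocks;
    determining a first transformation relationship between the data blocks and the global check code, and determining a second transformation relationship between the data blocks and the local check code;
    combining the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
    determining, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
    using the transformation parameters to recover the global check code or the local check code that reports an error.
  2. The check code recovery method according to claim 1, wherein generating the global check code of the data blocks comprises:
    generating the global check codes corresponding to all the data blocks using a Reed-Solomon (RS) algorithm.
  3. The check code recovery method according to claim 2, wherein
    each data block group corresponds to one local check code.
  4. The check code recovery method according to claim 2, wherein determining the first transformation relationship between the data blocks and the global check code comprises:
    determining the first transformation relationship between the data blocks and the global check code according to the RS algorithm;
    wherein the first transformation relationship is $p_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{ik}x_k$, where $p_i$ represents the i-th global check code, $a_{i1}, a_{i2}, \dots, a_{ik}$ represent the k RS algorithm parameters used to calculate $p_i$, $x_j$ represents the j-th data block, $1 \le j \le k$, and k represents the total number of the data blocks.
  5. The check code recovery method according to claim 1, wherein generating the local check code of the data blocks comprises:
    dividing all the data blocks into m data block groups, and generating the local check code corresponding to each data block group using the RS algorithm.
  6. The check code recovery method according to claim 5, wherein determining the second transformation relationship between the data blocks and the local check code comprises:
    determining the second transformation relationship between the data blocks and the local check code according to the RS algorithm;
    wherein the second transformation relationship is $lp_n = l_{n1}x_{s1} + l_{n2}x_{s2} + \dots + l_{nt}x_{st}$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, $l_{n1}, l_{n2}, \dots, l_{nt}$ represent the t RS algorithm parameters used to calculate $lp_n$, and $x_{s1}, x_{s2}, \dots, x_{st}$ represent the s1-th to st-th data blocks.
  7. The check code recovery method according to claim 1, wherein combining the first transformation relationship and the second transformation relationship to generate the check code transformation relationship between the global check code and the local check code comprises:
    solving the formulas of the first transformation relationship and the second transformation relationship simultaneously to obtain the check code transformation relationship between the global check code and the local check code.
  8. The check code recovery method according to claim 7, wherein solving the formulas of the first transformation relationship and the second transformation relationship simultaneously to obtain the check code transformation relationship between the global check code and the local check code comprises:
    constructing a system of equations from the first transformation relationships corresponding to all the global check codes and the second transformation relationships corresponding to all the local check codes;
    solving the system of equations to obtain the check code transformation relationship, wherein the expression corresponding to the check code transformation relationship is $lp_1 + lp_2 + \dots + lp_m = pp_1p_1 + pp_2p_2 + \dots + pp_rp_r$, where $lp_n$ represents the n-th local check code, $1 \le n \le m$, m represents the total number of the local check codes, $p_i$ represents the i-th global check code, $1 \le i \le r$, r represents the total number of the global check codes, and $pp_i$ represents the transformation parameter for transforming between the global check code and the local check code.
  9. The check code recovery method according to claim 1, wherein using the transformation parameters to recover the global check code or the local check code that reports an error comprises:
    if a target local check code reports an error, determining whether the number of target local check codes is greater than 1;
    if the number of target local check codes is equal to 1, recovering the target local check code using the transformation parameters, all the global check codes, and all the local check codes other than the target local check code.
  10. The check code recovery method according to claim 9, further comprising, after determining whether the number of target local check codes is greater than 1:
    if the number of target local check codes is greater than 1, setting the data block group corresponding to the target local check code as a target data block group;
    recovering the target local check code using all data blocks in the target data block group.
  11. The check code recovery method according to claim 1, wherein using the transformation parameters to recover the global check code or local check code that reports an error comprises:
    if a target global check code reports an error, determining whether the number of target global check codes is greater than 1;
    if the number of target global check codes is equal to 1, recovering the target global check code using the transformation parameters, all the local check codes, and all the global check codes other than the target global check code.
  12. The check code recovery method according to claim 11, further comprising, after determining whether the number of target global check codes is greater than 1:
    if the number of target global check codes is greater than 1, recovering the target global check code using the data blocks.
  13. The check code recovery method according to claim 1, further comprising:
    determining whether any of the data blocks is in error;
    if so, recovering the erroneous data block using the global check code or the local check code.
  14. The check code recovery method according to claim 1, further comprising, before generating the global check code and the local check code of the data blocks:
    receiving original data stored by a user, and splitting the original data into k data blocks.
  15. The check code recovery method according to claim 1, wherein
    the first transformation relationship describes the relationship between the data blocks and the global check code, and the second transformation relationship describes the relationship between the data blocks and the local check code.
  16. The check code recovery method according to claim 1, further comprising, before using the transformation parameters to recover the global check code or the local check code that reports an error:
    an operation of detecting whether a global check code or a local check code reports an error.
  17. The check code recovery method according to claim 15, wherein using the transformation parameters to recover the global check code or the local check code that reports an error comprises:
    if the global check code reports an error, recovering the erroneous global check code using the local check codes, the transformation parameters, and some of the global check codes;
    if the local check code reports an error, recovering the erroneous local check code using the global check codes, the transformation parameters, and some of the local check codes.
  18. A check code recovery system, comprising:
    a check code generation module, configured to generate a global check code and a local check code for data blocks;
    a transformation relationship determination module, configured to determine a first transformation relationship between the data blocks and the global check code and a second transformation relationship between the data blocks and the local check code, and further configured to combine the first transformation relationship and the second transformation relationship to generate a check code transformation relationship between the global check code and the local check code;
    a parameter determination module, configured to determine, according to the check code transformation relationship, transformation parameters for transforming between the global check code and the local check code;
    a check code recovery module, configured to use the transformation parameters to recover the global check code or the local check code that reports an error.
  19. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the check code recovery method according to any one of claims 1 to 17 when calling the computer program in the memory.
  20. A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores computer-executable instructions which, when loaded and executed by a processor, implement the check code recovery method according to any one of claims 1 to 17.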
PCT/CN2023/085989 2022-11-11 2023-04-03 Check code recovery method and system, electronic device, and storage medium WO2024098647A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211409901.7A CN115454712B (zh) 2022-11-11 2022-11-11 Check code recovery method and system, electronic device, and storage medium
CN202211409901.7 2022-11-11

Publications (1)

Publication Number Publication Date
WO2024098647A1 true WO2024098647A1 (zh) 2024-05-16

Family

ID=84295542

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/085989 WO2024098647A1 (zh) 2022-11-11 2023-04-03 Check code recovery method and system, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN115454712B (zh)
WO (1) WO2024098647A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
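CN115454712B (zh) * 2022-11-11 2023-02-28 苏州浪潮智能科技有限公司 Check code recovery method and system, electronic device, and storage medium
CN115793984B (zh) * 2023-01-03 2023-04-28 苏州浪潮智能科技有限公司 Data storage method and apparatus, computer device, and storage medium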

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107656832A (zh) * 2017-09-18 2018-02-02 华中科技大学 Erasure coding method with low data reconstruction overhead
CN111240597A (zh) * 2020-01-15 2020-06-05 书生星际(北京)科技有限公司 Method, apparatus, and device for storing data, and computer-readable storage medium
US20210091789A1 (en) * 2017-07-28 2021-03-25 Industry-University Cooperation Foundation Hanyang University Method and apparatus for encoding erasure code for storing data
US20210271557A1 (en) * 2018-09-03 2021-09-02 Here Data Technology Data encoding, decoding and recovering method for a distributed storage system
CN114048061A (zh) * 2021-10-09 2022-02-15 阿里云计算有限公司 Check block generation method and apparatus
CN115098295A (zh) * 2022-06-29 2022-09-23 阿里巴巴(中国)有限公司 Local data recovery method, device, and storage medium
CN115454712A (zh) * 2022-11-11 2022-12-09 苏州浪潮智能科技有限公司 Check code recovery method and system, electronic device, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106527993B (zh) * 2016-11-09 2019-08-30 北京搜狐新媒体信息技术有限公司 Method and apparatus for storing massive files in a distributed system
CN111858169B (zh) * 2020-07-10 2023-07-25 山东云海国创云计算装备产业创新中心有限公司 Data recovery method, system, and related components
CN115269258A (zh) * 2022-07-27 2022-11-01 山东云海国创云计算装备产业创新中心有限公司 Data recovery method and system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210091789A1 (en) * 2017-07-28 2021-03-25 Industry-University Cooperation Foundation Hanyang University Method and apparatus for encoding erasure code for storing data
CN107656832A (zh) * 2017-09-18 2018-02-02 华中科技大学 Erasure coding method with low data reconstruction overhead
US20210271557A1 (en) * 2018-09-03 2021-09-02 Here Data Technology Data encoding, decoding and recovering method for a distributed storage system
CN111240597A (zh) * 2020-01-15 2020-06-05 书生星际(北京)科技有限公司 Method, apparatus, and device for storing data, and computer-readable storage medium
CN114048061A (zh) * 2021-10-09 2022-02-15 阿里云计算有限公司 Check block generation method and apparatus
CN115098295A (zh) * 2022-06-29 2022-09-23 阿里巴巴(中国)有限公司 Local data recovery method, device, and storage medium
CN115454712A (zh) * 2022-11-11 2022-12-09 苏州浪潮智能科技有限公司 Check code recovery method and system, electronic device, and storage medium

Also Published As

Publication number Publication date
CN115454712A (zh) 2022-12-09
CN115454712B (zh) 2023-02-28

Similar Documents

Publication Publication Date Title
WO2024098647A1 (zh) Check code recovery method and system, electronic device, and storage medium
US11531593B2 (en) Data encoding, decoding and recovering method for a distributed storage system
US9141679B2 (en) Cloud data storage using redundant encoding
US8522122B2 (en) Correcting memory device and memory channel failures in the presence of known memory device failures
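WO2018166078A1 (zh) MDS array code encoding and decoding method for repairing multiple node failures
CN112000512B (zh) Data repair method and related apparatus
WO2018171111A1 (zh) Multi-fault-tolerant MDS array code encoding and repair method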
Gad et al. Repair-optimal MDS array codes over GF (2)
WO2022127289A1 (zh) Method, system, device, and medium for check recovery based on Gaussian elimination
US20100138717A1 (en) Fork codes for erasure coding of data blocks
US20210218419A1 (en) Method, device and apparatus for storing data, computer readable storage medium
Venkatesan et al. Effect of codeword placement on the reliability of erasure coded data storage systems
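CN111682874A (zh) Data recovery method, system, device, and readable storage medium
CN114816837A (zh) Erasure code fusion method and system, electronic device, and storage medium
WO2017185681A1 (zh) Method and apparatus for encoding and decoding a GEL codeword structure, and related device
CN112000278B (zh) Adaptive local reconstruction code design method for hot data storage, and cloud storage system
WO2017041232A1 (zh) Encoding and decoding framework for binary cyclic codes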
Song et al. A Low complexity design of reed solomon code algorithm for advanced RAID system
Chen et al. A new Zigzag MDS code with optimal encoding and efficient decoding
CN115113816A (zh) Erasure code data processing system and method, computer device, and medium
CN115269258A (zh) Data recovery method and system
Lan et al. Efficient Repair Algorithm for Information Column of EVENODD (p, 4) Codes
Galinanes et al. Ensuring data durability with increasingly interdependent content
Kadhe et al. Codes with locality in the rank and subspace metrics
CN114879904B (zh) Data storage erasure coding method, apparatus, device, and readable storage medium