WO2018000788A1 - Data storage method and apparatus, and data recovery method and apparatus - Google Patents

Data storage method and apparatus, and data recovery method and apparatus

Info

Publication number
WO2018000788A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
local
partial
stored
recovered
Prior art date
Application number
PCT/CN2016/113523
Other languages
English (en)
French (fr)
Inventor
张帅
陈晓辉
尹殷
李凯
Original Assignee
北京三快在线科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京三快在线科技有限公司
Priority to US16/314,281 (US10754727B2)
Priority to CA3053855A (CA3053855C)
Priority to EP16907175.0A (EP3480697A4)
Publication of WO2018000788A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1068Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in sector programmable memories, e.g. flash disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/13Linear codes
    • H03M13/15Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
    • H03M13/151Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
    • H03M13/1515Reed-Solomon codes
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H03M13/373Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35 with erasure correction and erasure determination, e.g. for packet loss recovery or setting of erasures for the decoding of Reed-Solomon codes
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/61Aspects and characteristics of methods and arrangements for error correction or error detection, not provided for otherwise
    • H03M13/615Use of computational or mathematical techniques
    • H03M13/616Matrix operations, especially for generator matrices or check matrices, e.g. column or row permutations

Definitions

  • the present application relates to the field of computer information technology, and in particular, to a data storage method and apparatus, and a data recovery method and apparatus.
  • files can be divided into multiple data blocks for storage; to ensure system robustness and disaster recovery capability, data blocks generally have multiple copies that are stored in different physical locations.
  • the multi-copy disaster recovery method needs more storage devices, which increases storage cost. Taking three copies as an example, the multi-copy disaster recovery method adds 200% storage redundancy and therefore 200% storage cost.
  • the RS (Reed-Solomon) method can generate corresponding check blocks for specified data blocks; when a data block fails, the failed data block can be recovered from the un-failed data blocks and the check blocks, which allows higher data reliability with less data redundancy. For example, when the sizes of the specified data blocks and their corresponding check blocks are 100M and 30M, respectively, the RS method can achieve the storage reliability of three copies using 30% redundancy.
  • however, during data recovery the RS method usually needs to read all un-failed data blocks and check blocks, that is, it cannot use IO (input/output) efficiently. An RS method with 30% redundancy usually needs to read 100M of data during data recovery, which causes 10 times IO consumption.
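  • The figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the assumption that the 100M of data is split into ten 10M blocks with one failed block is ours (the text gives only the 100M read and the factor of 10).

```python
def redundancy_ratio(data_mb: float, extra_mb: float) -> float:
    """Extra storage divided by the size of the original data."""
    return extra_mb / data_mb


def read_amplification(read_mb: float, recovered_mb: float) -> float:
    """How many times more data is read than is actually recovered."""
    return read_mb / recovered_mb


# Three copies: two extra copies of 100M of data.
three_copy = redundancy_ratio(100, 2 * 100)   # 2.0, i.e. 200%
# RS: 30M of check blocks for 100M of data.
rs = redundancy_ratio(100, 30)                # 0.3, i.e. 30%
# Assumption: one failed 10M block, recovered by reading 100M.
amplification = read_amplification(100, 10)   # 10.0
```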
  • the technical problem to be solved by the embodiments of the present application is to provide a data storage method and a data recovery method, which can reduce the amount of data read during the data recovery process, thereby greatly reducing the IO consumption in the data recovery process.
  • the embodiment of the present application further provides a data storage device and a data recovery device to ensure implementation and application of the foregoing method.
  • a data storage method including:
  • encoding the overall data of the data to be stored by using the local generation matrix to obtain corresponding overall verification data, wherein the overall verification data includes local verification data related to the local data;
  • the partial verification data and its corresponding local data are stored.
  • the local data of the data to be stored is obtained by the following steps:
  • the divided data blocks are grouped, and corresponding local data is obtained according to the group.
  • the local data of the data to be stored includes local data obtained according to the packets of the data to be stored, and the local check data includes:
  • first partial parity data corresponding to the local data of a single packet; and/or
  • second partial parity data corresponding to the local data of the packet combination.
  • the method further includes:
  • a mapping relationship between the packet combination information and the storage address of the second partial parity data is stored.
  • the present application discloses a data recovery method, including:
  • the corresponding partial check data and the local data are read from the pre-stored data; wherein the partial check data is obtained by encoding the whole data of the to-be-stored data according to the local generation matrix.
  • the local generation matrix is obtained by splitting the rows of the overall generation matrix, and the split rows of the local generation matrix include zero elements;
  • the data to be recovered is restored according to the read partial verification data and the local data.
  • the step of recovering the data to be restored according to the read local check data and the local data includes:
  • the local decoding matrix comprising: a row of an identity matrix and a row of a local generation matrix, wherein the row of the identity matrix does not include the row corresponding to the data to be restored, and the local decoding matrix is a square matrix;
  • the step of reading the corresponding partial verification data from the pre-stored data for the data to be restored includes:
  • the first partial check data and the local data corresponding to the single target packet are read from the pre-stored data and used as the local check data and the local data corresponding to the first to-be-recovered data; wherein the first partial check data is obtained according to the local data of the single packet.
  • the step of reading the first partial parity data corresponding to the single target packet from the pre-stored data includes:
  • Corresponding first partial verification data is read from pre-stored data according to the first target storage address.
  • the step of reading the corresponding partial verification data from the pre-stored data for the data to be restored includes:
  • the target grouping combination includes: a single target grouping
  • the second partial check data and the local data corresponding to the target packet combination are read from the pre-stored data and used as the local check data and the local data corresponding to the second to-be-recovered data; wherein the second partial check data is obtained according to the local data of the packet combination.
  • the step of reading the second partial parity data corresponding to the target packet combination from the pre-stored data includes:
  • the present application discloses a data storage device comprising a processor that reads the machine readable instructions corresponding to the data storage control logic stored on the storage medium and executes the instructions:
  • encoding the overall data of the data to be stored by using the local generation matrix to obtain corresponding overall verification data, wherein the overall verification data includes local verification data related to the local data;
  • the partial verification data and its corresponding local data are stored.
  • the machine readable instructions cause the processor to: acquire local data of the data to be stored, including:
  • the divided data blocks are grouped, and corresponding local data is obtained according to the group.
  • the present application discloses a data recovery apparatus including a processor that reads machine readable instructions corresponding to data recovery control logic stored on a storage medium and executes the instructions:
  • the corresponding partial check data and the local data are read from the pre-stored data; wherein the partial check data is obtained by encoding the whole data of the to-be-stored data according to the local generation matrix.
  • the local generation matrix is obtained by splitting the rows of the overall generation matrix, and the split rows of the local generation matrix include zero elements;
  • the data to be recovered is restored according to the read partial verification data and the local data.
  • when recovering the data to be recovered according to the read partial parity data and local data, the machine readable instructions cause the processor to:
  • the local decoding matrix includes: a row of an identity matrix and a row of a local generation matrix, the row of the identity matrix does not include a row corresponding to the data to be restored, and the local decoding matrix is a square matrix;
  • when the corresponding partial parity data is read from the pre-stored data for the data to be recovered, the machine readable instructions cause the processor to:
  • the first partial check data and the local data corresponding to the single target packet are read from the pre-stored data and used as the local check data and the local data corresponding to the first to-be-recovered data; wherein the first partial check data is obtained according to the local data of the single packet.
  • the machine readable instructions cause the processor to:
  • Corresponding first partial verification data is read from pre-stored data according to the first target storage address.
  • when the corresponding partial parity data is read from the pre-stored data for the data to be recovered, the machine readable instructions cause the processor to:
  • the target grouping combination includes: a single target grouping
  • the second partial check data and the local data corresponding to the target packet combination are read from the pre-stored data and used as the local check data and the local data corresponding to the second to-be-recovered data; wherein the second partial check data is obtained according to the local data of the packet combination.
  • the machine readable instructions cause the processor to:
  • the embodiments of the present application include the following advantages:
  • the local check data of the embodiment of the present application is obtained by encoding with the local generation matrix corresponding to the local data of the data to be stored. Since the split rows of the local generation matrix may include zero elements, it can be ensured that the local check data is related to the local data and may be unrelated to the other data in the data to be stored. Therefore, the local data can be recovered according to the local check data without relying on other data; that is, in the data recovery process, the embodiment of the present application may read only the local check data and the local data corresponding to the data to be restored. Compared with existing solutions, which generally read all un-failed data blocks and check blocks when recovering a failed data block, the embodiment of the present application reduces the amount of data read during the data recovery process, so the IO consumption in the data recovery process can be greatly reduced.
  • FIG. 1 is a flow chart showing the steps of an embodiment of a data storage method of the present application.
  • FIG. 2 is a flow chart showing the steps of an embodiment of a data recovery method of the present application.
  • FIG. 3 is a flow chart showing the steps of an embodiment of a data storage and recovery method of the present application.
  • FIG. 4 is a schematic structural diagram of hardware of a data storage device according to an embodiment of the present application.
  • FIG. 5 is a functional block diagram of a data storage control logic according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of hardware of a data recovery apparatus according to an embodiment of the present application.
  • FIG. 7 is a functional block diagram of a data recovery control logic according to an embodiment of the present application.
  • FIG. 1 a flow chart of steps of an embodiment of a data storage method of the present application is shown, which may specifically include the following steps 101-103.
  • step 101 the rows of the overall generation matrix are split according to the local data of the data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix may include zero elements;
  • step 102 the overall data of the data to be stored is encoded by using the local generation matrix to obtain corresponding overall verification data, where the overall verification data may specifically include partial verification data related to the local data;
  • step 103 the partial verification data and its corresponding local data are stored.
  • the embodiments of the present application can be applied to data storage in any field such as the multimedia field, the e-commerce field, and the search field, that is, the data to be stored may be data of any field.
  • a person skilled in the art can determine the data length of the data to be stored according to actual application requirements.
  • the embodiment of the present application does not limit the specific data to be stored and the data length thereof.
  • the local data may be a part of the overall data of the data to be stored.
  • the local check data of the embodiment of the present application is obtained by encoding with the local generation matrix corresponding to the local data of the data to be stored. Since the split rows of the local generation matrix may include zero elements, it can be ensured that the local check data is related to the local data and may be unrelated to the other data in the data to be stored. Therefore, the local data can be recovered according to the local check data without relying on other data; that is, the embodiment of the present application may read only the local check data and the local data corresponding to the data to be restored in the data recovery process.
  • the local data of the data to be stored may be obtained by dividing the data to be stored into data blocks, grouping the divided data blocks, and obtaining the corresponding local data according to the groups. That is, the local data of the embodiment of the present application may include at least one data block corresponding to a packet. Further, the local check data of the embodiment of the present application may include at least one check block corresponding to the packet. For example, when the data length of the data blocks to be stored is 100M, the embodiment of the present application may group the 100M of data blocks, for example into a first group and a second group of 50M each.
  • the embodiment of the present application may generate a corresponding first check block for the first group and a corresponding second check block for the second group.
  • when a data block in the first group fails, the embodiment of the present application can read the un-failed data blocks of the first group and the first check block for recovery, that is, it can read 50M of data to achieve data recovery; compared with the existing solution, which reads 100M of data, this reduces IO consumption by 50%.
  • likewise, when a data block in the second group fails, the embodiment of the present application can read the un-failed data blocks of the second group and the second check block for recovery, which also reduces IO consumption by 50%.
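  • The 50% figure follows directly from the group size. The sketch below mirrors the accounting used in the text, which counts the size of one group as the amount read; the function name is hypothetical, and equal-sized groups are an assumption.

```python
def recovery_read_mb(total_data_mb: float, num_groups: int) -> float:
    """Data read to repair one failed block when only its own group is read.

    Assumes equal-sized groups and counts the whole group size as read,
    as the 100M / 50M example in the text does.
    """
    return total_data_mb / num_groups


baseline = recovery_read_mb(100, 1)   # no grouping: 100M
grouped = recovery_read_mb(100, 2)    # two groups: 50M
saving = 1 - grouped / baseline       # 0.5, i.e. 50% less IO
```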
  • local data of the data to be stored may be encoded to obtain a corresponding local check vector, where the local check vector corresponding data may be local check data corresponding to the local data.
  • the data to be stored may first be split in units of bytes to obtain a plurality of data blocks and a data vector composed of the plurality of data blocks. Then, the data vector can be split into several sub data vectors, which can be used as the local data of the data to be stored. Finally, each sub data vector can be encoded to obtain a corresponding local check vector.
  • the data length of the data block may be 1, 2, 4, 8 or the like. The specific data length of the data block is not limited in the embodiment of the present application.
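  • The splitting and grouping described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names are hypothetical, and zero-padding the last block is an assumption the text does not make.

```python
def split_into_blocks(data: bytes, block_len: int) -> list[bytes]:
    """Split data into fixed-length blocks, zero-padding the last one."""
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    padded_len = -(-len(data) // block_len) * block_len  # round up
    data = data.ljust(padded_len, b"\x00")
    return [data[i:i + block_len] for i in range(0, padded_len, block_len)]


def group_blocks(blocks: list[bytes], group_size: int) -> list[list[bytes]]:
    """Split the data vector into consecutive sub data vectors (groups)."""
    return [blocks[i:i + group_size] for i in range(0, len(blocks), group_size)]


blocks = split_into_blocks(b"abcdefgh", block_len=2)  # data vector D0..D3
groups = group_blocks(blocks, group_size=2)           # [[D0, D1], [D2, D3]]
```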
  • assume the overall data (that is, the data vector) of the data to be stored includes k data blocks and the local data (that is, the sub data vector) includes k1 data blocks; the k1 data blocks may then be encoded to generate m1 check blocks.
  • n is the total number of data blocks and check blocks
  • the k1 data blocks are denoted D0, D1, ..., D_(k1-1)
  • the size of each data block is M/k1
  • the product of the overall generation matrix and the k1 data blocks can be calculated to obtain m1 check blocks.
  • the size of each check block is also M/k1.
  • the overall generation matrix may be a matrix of m1 rows and k1 columns over the Galois field, and may be a transformed Vandermonde matrix or a Cauchy matrix.
  • in the (n, k1) RS method, when a data block or a check block fails, it needs to be recovered to ensure reliability. Specifically, if a check block fails, it can be obtained by re-encoding with the k1 data blocks; if a data block fails, it can be recovered from any k1 of the remaining n-1 data blocks and check blocks.
  • the Galois field may be the extension field GF(2^8) constructed over the binary field {0, 1} with the polynomial x^8 + x^4 + x^3 + x^2 + 1; it includes 256 elements, 0 to 255, which correspond to all values of one byte.
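  • Arithmetic in this field can be sketched in a few lines. The code below is a minimal illustration assuming the polynomial above (0x11D in hexadecimal); the function names are hypothetical, and the brute-force inverse is chosen for clarity rather than speed.

```python
PRIMITIVE_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8), given as integers 0..255."""
    result = 0
    while b:
        if b & 1:
            result ^= a           # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:             # degree reached 8: reduce by the polynomial
            a ^= PRIMITIVE_POLY
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse by exhaustive search over the 255 candidates."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)
```

    Every byte value is a field element, so a check block can be computed byte by byte from the data blocks with `gf_mul` and XOR.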
  • the above steps 101 and 102 can calculate the product of the overall generation matrix and the k data blocks to obtain m check blocks C0, C1, ..., C_(m-1); that is, the m check blocks correspond to the elements of the overall check vector.
  • the overall check vector may include an element related to the local data; the element is related to the local data and may be unrelated to the other data in the data to be stored. Therefore, the corresponding local data can be recovered according to the local check vector without depending on other data.
  • step 101 may split the first row of the overall generation matrix P into R rows to obtain a local generation matrix Q1, where Q1 may be a matrix of m+R-1 rows and k columns.
  • the split rows of the local generation matrix Q1, namely its first row and second row, both include zero elements.
  • step 102 may multiply the local generation matrix Q1 by the data vector composed of the k data blocks to obtain an overall check vector Cr1:
  • the check block C0 in the overall check vector Cr1 is related to the local data D0 and D1, and the check block C1 in Cr1 is related to the local data D2 and D3; therefore, data recovery can be achieved based on local data and some elements in the overall check vector.
  • any one of the local data D0, the local data D1, and the check block C0 can be recovered according to the other two. For example, when the local data D0 fails, D0 can be recovered from the local data D1 and the check block C0.
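  • The recovery of D0 from D1 and C0 can be sketched as follows. This is an illustration under stated assumptions: the coefficients of the split row are hypothetical (the matrix entries are not reproduced in the text), the field is GF(2^8) with the polynomial x^8 + x^4 + x^3 + x^2 + 1, and all function names are ours.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) with the polynomial x^8 + x^4 + x^3 + x^2 + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Inverse by exhaustive search; a must be non-zero."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def encode_row(row, blocks):
    """Check block = sum of row[j] * blocks[j], computed byte by byte."""
    out = bytearray(len(blocks[0]))
    for coeff, block in zip(row, blocks):
        if coeff:
            for i, byte in enumerate(block):
                out[i] ^= gf_mul(coeff, byte)
    return bytes(out)


def recover_block(row, blocks, check, missing):
    """Recover blocks[missing]; only blocks with a non-zero coefficient are read."""
    acc = bytearray(check)
    for j, coeff in enumerate(row):
        if coeff and j != missing:
            for i, byte in enumerate(blocks[j]):
                acc[i] ^= gf_mul(coeff, byte)
    inv = gf_inv(row[missing])
    return bytes(gf_mul(inv, byte) for byte in acc)


D = [b"\x01\x02", b"\x03\x04", b"\x05\x06", b"\x07\x08"]
split_row_0 = [1, 2, 0, 0]            # hypothetical split row covering D0, D1
C0 = encode_row(split_row_0, D)
survivors = [None, D[1], None, None]  # D0 failed; D2 and D3 are never read
recovered_D0 = recover_block(split_row_0, survivors, C0, missing=0)
```

    Because the third and fourth coefficients are zero, the decoder never touches D2 or D3, which is where the IO saving comes from.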
  • the first row of the overall generation matrix P may also be split into more than two rows. For example, the first row of P may be split into three rows to obtain a local generation matrix Q2.
  • in the corresponding overall check vector Cr2, the check block C0 is related to the local data D0, the check block C1 is related to the local data D1, and the check block C2 is related to the local data D2 and D3, so data recovery can be achieved based on local data and some elements in the overall check vector.
  • for example, the local data D0 may be recovered directly from the check block C0, the local data D1 may be recovered directly from the check block C1, and any one of the check block C2 and the local data D2 and D3 may be recovered from the other two.
  • other rows of the overall generation matrix P may also be split.
  • an example is provided in which the first row and the second row of the overall generation matrix P are each split into two rows to obtain a local generation matrix Q3:
  • in the corresponding overall check vector Cr3, the check blocks C0 and C2 are related to the local data D0 and D1, and the check blocks C1 and C3 are related to the local data D2 and D3, so data recovery can be achieved based on local data and some elements in the overall check vector.
  • any two of the local data D0 and D1 and the check blocks C0 and C2 can be recovered according to the other two; for example, when the local data D0 and D1 both fail, they can be recovered from the check blocks C0 and C2.
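  • Recovering both D0 and D1 from C0 and C2 amounts to solving a 2x2 linear system over the Galois field. The sketch below assumes hypothetical coefficients for the non-zero parts of the two split rows and GF(2^8) with the polynomial x^8 + x^4 + x^3 + x^2 + 1; the function names are ours.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) with the polynomial x^8 + x^4 + x^3 + x^2 + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Inverse by exhaustive search; a must be non-zero."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def encode_pair(row, d0, d1):
    """Check block for a group of two data blocks: row[0]*d0 + row[1]*d1."""
    return bytes(gf_mul(row[0], x) ^ gf_mul(row[1], y) for x, y in zip(d0, d1))


def recover_pair(row_a, row_b, check_a, check_b):
    """Solve [row_a; row_b] * [d0; d1] = [check_a; check_b] byte by byte."""
    (a, b), (c, d) = row_a, row_b
    det = gf_mul(a, d) ^ gf_mul(b, c)
    if det == 0:
        raise ValueError("rows are linearly dependent; cannot decode")
    inv_det = gf_inv(det)
    pairs = list(zip(check_a, check_b))
    d0 = bytes(gf_mul(inv_det, gf_mul(d, p) ^ gf_mul(b, q)) for p, q in pairs)
    d1 = bytes(gf_mul(inv_det, gf_mul(c, p) ^ gf_mul(a, q)) for p, q in pairs)
    return d0, d1


D0, D1 = b"\x0a\x0b", b"\x0c\x0d"
row_c0, row_c2 = (1, 1), (1, 2)   # hypothetical non-zero parts of two split rows
C0 = encode_pair(row_c0, D0, D1)
C2 = encode_pair(row_c2, D0, D1)
recovered = recover_pair(row_c0, row_c2, C0, C2)
```

    The two rows must be linearly independent for the system to be solvable, which is why the sketch rejects a zero determinant.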
  • the process of splitting the rows of the overall generation matrix is only an optional embodiment.
  • those skilled in the art can flexibly split the rows of the overall generation matrix according to actual application requirements.
  • for example, assume four data blocks, namely the local data D0, D1, D2, and D3, are divided into two groups, where the first group includes D0 and D1 and the second group includes D2 and D3. In one split row, the column elements corresponding to the first group (the elements of the first and second columns of the overall generation matrix) can be set to be non-zero while the column elements corresponding to the second group (the elements of the third and fourth columns) are set to zero; in the other split row, the column elements corresponding to the first group are set to zero while the column elements corresponding to the second group are set to be non-zero. In this way, the generation matrix Q3 implements partial recovery of the first packet and the second packet, respectively. Similarly, when the number of packets G of the data to be stored is greater than 2, the correlation of check blocks between packets may be reduced according to the splitting principle for two packets. It can be understood that the specific process of splitting the rows of the overall generation matrix is not limited in the embodiment of the present application.
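  • The row-splitting rule can be sketched as a small helper that keeps a row's coefficients in one group's columns and zeroes the rest. The function name and the example coefficients are hypothetical; the application does not limit how rows are split.

```python
def split_row(row: list[int], groups: list[list[int]]) -> list[list[int]]:
    """Split one generation-matrix row into one row per group.

    Each output row keeps the coefficients in its own group's columns
    and has zero elements everywhere else.
    """
    split = []
    for columns in groups:
        keep = set(columns)
        split.append([c if j in keep else 0 for j, c in enumerate(row)])
    return split


row = [1, 2, 3, 4]                        # hypothetical row of P
two_groups = [[0, 1], [2, 3]]             # first group D0, D1; second D2, D3
local_rows = split_row(row, two_groups)   # [[1, 2, 0, 0], [0, 0, 3, 4]]
three_rows = split_row(row, [[0], [1], [2, 3]])
```

    When the groups are disjoint and cover every column, XOR-ing the split rows column by column gives back the original row, so the original check block is the XOR of the split check blocks.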
  • the overall check vector obtained in step 102 includes an element related to the local data; the element is related to the local data and may be unrelated to the other data in the data to be stored. Therefore, the corresponding local data can be recovered according to the local check vector without depending on other data.
  • the local data of the data to be stored may include local data obtained according to the packets of the data to be stored, and the partial verification data may specifically include:
  • first partial parity data corresponding to the local data of a single packet; and/or
  • second partial parity data corresponding to the local data of the packet combination.
  • the corresponding encoding process may specifically include:
  • Step A1 For local data of a single packet, encoding corresponding first partial parity data; and/or
  • Step A2 For the partial data of the packet combination, the corresponding second partial parity data is encoded.
  • the data to be stored may be grouped according to actual application requirements.
  • the number of packets may be G
  • step A1 may encode the local data of a single packet among the G packets to obtain corresponding first partial parity data, that is, the first partial parity data corresponds to a single packet.
  • Step A2 may encode the local data of a packet combination among the G packets to obtain corresponding second partial parity data, that is, the second partial parity data corresponds to the packet combination, wherein the number of packets covered by the packet combination may be less than G.
  • step 103 may store the partial verification data and its corresponding local data.
  • the data may be stored dispersedly on (k1 + m1) different storage nodes. A storage node is a logical abstraction of a storage device, which can be either a disk or a storage server. That is, the embodiment of the present application may store the local check data and the corresponding local data dispersedly in units of data blocks or check blocks, so as to spread the risk of data loss.
  • the specific manner of storing the local check data and its corresponding local data is not limited.
  • the method in the embodiment of the present application may further include:
  • a mapping relationship between the packet combination information and the storage address of the second partial parity data is stored.
  • the single packet information may be used to identify a single packet, which may include: information such as an ID (identity), a name, and the like of a single packet.
  • the above packet combination information can be used to represent a plurality of combined packets.
  • the storage address may include: a storage path corresponding to the storage node, and the corresponding local verification data may be directly accessed (including read) through the storage path.
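  • One way to picture these mapping relationships is a pair of lookup tables, one keyed by single packet information and one keyed by packet combination information. The sketch below is purely illustrative: the packet IDs, storage paths, and function name are hypothetical and are not taken from the application.

```python
# Hypothetical packet IDs and storage paths, for illustration only.
first_check_addr = {
    "group-1": "node-05:/blocks/c0",
    "group-2": "node-06:/blocks/c1",
}
second_check_addr = {
    frozenset({"group-1", "group-2"}): "node-07:/blocks/c_combined",
}


def lookup_check_address(target_groups: set[str]) -> str:
    """Return the storage address of the check data for one packet or a combination."""
    if len(target_groups) == 1:
        (group,) = target_groups
        return first_check_addr[group]
    return second_check_addr[frozenset(target_groups)]
```

    Using a `frozenset` as the key makes the lookup independent of the order in which the packets of a combination are listed.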
  • the method may further include: generating corresponding overall verification data for the overall data of the data to be stored; and storing the overall verification data.
  • this optional embodiment stores the overall verification data in addition to the local verification data, so that when the local data and the local verification data are insufficient to recover the data to be restored, the data to be restored can be recovered according to the overall verification data.
  • the local check data of the embodiment of the present application is obtained by encoding with the local generation matrix corresponding to the local data of the data to be stored. Since the split rows of the local generation matrix may include zero elements, it can be ensured that the local check data is related to the local data and may be unrelated to the other data in the data to be stored. Therefore, the local data can be recovered according to the local check data without relying on other data; that is, the embodiment of the present application may read only the local check data and the local data corresponding to the data to be restored in the data recovery process, which reduces the amount of data read and thereby greatly reduces the IO consumption in the data recovery process.
  • FIG. 2 a flow chart of steps of an embodiment of a data recovery method of the present application is shown, which may specifically include the following steps 201 and 202.
  • step 201 for the data to be restored, corresponding partial check data and local data are read from the pre-stored data; wherein the partial check data is obtained by encoding the overall data of the data to be stored according to the local generation matrix, the local generation matrix is obtained by splitting the rows of the overall generation matrix, and the split rows of the local generation matrix may include zero elements;
  • step 202 the data to be restored is recovered according to the read partial check data and local data.
  • The data to be recovered denotes data for which there is a recovery demand; it usually corresponds to a failed data block or check block.
  • The data to be recovered may be represented by the identifier of a data block or a check block.
  • For example, the data to be recovered may include a to-be-recovered data block D_X numbered X and/or a to-be-recovered check block C_Y numbered Y.
  • The number of data blocks to be recovered or check blocks to be recovered may be greater than or equal to 1. It can be understood that the embodiments of the present application do not limit the specific data to be recovered.
  • The local data may correspond to a grouping of the data blocks of the overall data; that is, the local data of the embodiments of the present application may include the data blocks of at least one group, and the local check data may likewise include the check blocks of at least one group.
  • The embodiments of the present application may provide the following reading schemes for reading the corresponding local check data and local data from the pre-stored data for the data to be recovered.
  • In reading scheme 1, step 201 of reading the corresponding local check data from the pre-stored data for the data to be recovered may specifically include steps C1 and C2.
  • In step C1, first to-be-recovered data belonging to a single target group is obtained from the data to be recovered.
  • In step C2, when the data length of the first to-be-recovered data does not exceed the data length of the first local check data corresponding to the single target group, the first local check data and the local data corresponding to the single target group are read from the pre-stored data as the local check data and local data corresponding to the first to-be-recovered data; the first local check data may be obtained according to the local data of a single group.
  • In reading scheme 1, the maximum allowed data length of the first to-be-recovered data is the data length of the first local check data.
  • For example, if the local data of a single group includes k_1 data blocks and the first local check data of the group includes m_1 check blocks, then the (k_1 + m_1) data blocks and check blocks of the group tolerate the failure and recovery of at most m_1 blocks (data blocks and check blocks combined).
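  • The tolerance bound above amounts to a simple count comparison. The following sketch assumes equal-sized blocks, so that comparing data lengths is the same as comparing block counts; the function name and the block labels are hypothetical.

```python
def group_can_recover(failed_blocks, m1):
    """A group with m1 check blocks tolerates at most m1 failed blocks,
    counting data blocks and check blocks together."""
    return len(failed_blocks) <= m1

# A group with k1 = 2 data blocks (D0, D1) and m1 = 1 check block (C0).
single_failure = group_can_recover({"D0"}, m1=1)
double_failure = group_can_recover({"D0", "C0"}, m1=1)
```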
  • In practice, a mapping relationship between data block information or check block information and single group information may be pre-stored. Step C1 may then look up this mapping using the information (such as the identifier) of the to-be-recovered data block or check block in the data to be recovered, obtain the single target group to which that block belongs, and extract from the data to be recovered the first to-be-recovered data belonging to the single target group. It can be understood that the embodiments of the present application do not limit the specific process by which step C1 obtains the first to-be-recovered data belonging to the single target group.
  • In step C2, the data length of the first to-be-recovered data may be compared with the data length of the first local check data corresponding to the single target group. When the former is less than or equal to the latter, the local data and the first local check data of the single target group may be considered sufficient to recover the first to-be-recovered data, so the first local check data and the local data corresponding to the single target group may be read from the pre-stored data.
  • The step of reading the first local check data corresponding to the single target group from the pre-stored data may specifically include steps C21 and C22.
  • In step C21, the first target storage address of the first local check data corresponding to the single target group is obtained according to the pre-stored mapping relationship between single group information and the storage address of the first local check data.
  • In step C22, the corresponding first local check data is read from the pre-stored data according to the first target storage address.
  • Similarly, a mapping relationship between single group information and the storage address of the local data may be pre-stored, and the local data corresponding to the single target group may be read according to this mapping. It can be understood that the embodiments of the present application do not limit the specific process of reading the local data and the first local check data corresponding to the single target group.
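  • Reading scheme 1 can be sketched with plain dictionary lookups. Everything concrete here is an assumption for illustration: the block identifiers, the `addr/...` address strings, the three mapping tables and the function name `read_scheme_1`. Blocks are assumed to be of equal size, so data lengths are compared as block counts.

```python
# Hypothetical pre-stored mappings (identifiers and addresses are illustrative).
BLOCK_TO_GROUP = {"D0": "G0", "D1": "G0", "C0": "G0",
                  "D2": "G1", "D3": "G1", "C1": "G1"}
GROUP_CHECK_ADDR = {"G0": ["addr/C0"], "G1": ["addr/C1"]}
GROUP_DATA_ADDR = {"G0": ["addr/D0", "addr/D1"], "G1": ["addr/D2", "addr/D3"]}

def read_scheme_1(to_recover, target_group):
    """Return the addresses to read for recovery inside one group, or None
    when the group's first local check data is too short."""
    # Step C1: keep only the to-be-recovered blocks of the single target group.
    first = [b for b in to_recover if BLOCK_TO_GROUP[b] == target_group]
    # Step C21: look up the storage addresses of the first local check data.
    check_addrs = GROUP_CHECK_ADDR[target_group]
    # Condition of step C2: the failed blocks must not outnumber the check blocks.
    if len(first) > len(check_addrs):
        return None
    failed_addrs = {"addr/" + b for b in first}
    # Step C22 would read these addresses; failed blocks are skipped.
    return {
        "check": [a for a in check_addrs if a not in failed_addrs],
        "data": [a for a in GROUP_DATA_ADDR[target_group] if a not in failed_addrs],
    }

plan = read_scheme_1(["D0"], "G0")
too_many = read_scheme_1(["D0", "D1"], "G0")
```

  • A `None` result in this sketch corresponds to the case where the single target group alone is not sufficient for recovery.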
  • In reading scheme 2, step 201 of reading the corresponding local check data from the pre-stored data for the data to be recovered may specifically include steps D1, D2 and D3.
  • In step D1, first to-be-recovered data belonging to a single target group is obtained from the data to be recovered.
  • In step D2, when the data length of the first to-be-recovered data exceeds the data length of the first local check data corresponding to the single target group, second to-be-recovered data belonging to a target group combination is obtained from the data to be recovered; the target group combination includes the single target group.
  • In step D3, when the data length of the second to-be-recovered data does not exceed the data length of the second local check data corresponding to the target group combination, the second local check data and the local data corresponding to the target group combination are read from the pre-stored data as the local check data and local data corresponding to the second to-be-recovered data; the second local check data may be obtained according to the local data of the group combination.
  • In reading scheme 2, when the data length of the first to-be-recovered data exceeds the data length of the first local check data corresponding to the single target group (for example, the first to-be-recovered data includes three data blocks while the first local check data includes two check blocks), the local data and the local check data of the single target group may be considered insufficient to recover the first to-be-recovered data. In this case, the second local check data and the local data corresponding to the target group combination may be used to recover the second to-be-recovered data.
  • The second to-be-recovered data corresponds to the target group combination and may include the first to-be-recovered data; the target group combination may include the single target group, that is, the single target group is a subset of the target group combination.
  • Suppose the G single groups are denoted G_0, G_1, ..., G_{G-1}; a group combination may specifically include at least two of the G single groups. For example, if the single target group is G_0, the target group combination may be any combination that contains G_0, such as {G_0, G_1}, {G_0, G_2}, {G_0, G_1, G_2}, etc.
  • The second to-be-recovered data may contain more data than the first to-be-recovered data, or may be identical to it.
  • For example, if the local data of the target group combination includes (k_1 + k_2) data blocks and the local check data of the target group combination includes (m_1 + m_2) check blocks, then the (k_1 + k_2 + m_1 + m_2) data blocks and check blocks of the target group combination tolerate the failure and recovery of at most (m_1 + m_2) blocks (data blocks and check blocks combined).
  • In practice, a mapping relationship between data block information or check block information and group combination information may be pre-stored. Step D2 may look up this mapping using the information (such as the identifier) of the first to-be-recovered data, obtain the target group combination to which the first to-be-recovered data belongs, and extract from the data to be recovered the second to-be-recovered data belonging to the target group combination. It can be understood that the embodiments of the present application do not limit the specific process by which step D2 obtains the second to-be-recovered data; for example, the target group combination corresponding to the single target group may also be obtained by looking up, with the information of the single target group, a mapping relationship between single group information and group combination information.
  • In step D3, the data length of the second to-be-recovered data may be compared with the data length of the second local check data corresponding to the target group combination. When the former is less than or equal to the latter, the local data and the second local check data of the target group combination may be considered sufficient to recover the second to-be-recovered data, so the second local check data and the local data corresponding to the target group combination may be read from the pre-stored data.
  • The step of reading the second local check data corresponding to the target group combination from the pre-stored data may specifically include steps D31 and D32.
  • In step D31, the second target storage address of the second local check data corresponding to the target group combination is obtained according to the pre-stored mapping relationship between group combination information and the storage address of the second local check data.
  • In step D32, the corresponding second local check data is read from the pre-stored data according to the second target storage address.
  • Similarly, a mapping relationship between group combination information and the storage address of the local data may be pre-stored, and the local data corresponding to the target group combination may be read according to this mapping. It can be understood that the embodiments of the present application do not limit the specific process of reading the local data and the second local check data corresponding to the target group combination.
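  • The escalation from a single target group to a target group combination in reading scheme 2 can be sketched as follows. The tables, the check-block counts and the function name `choose_scope` are hypothetical, and equal-sized blocks are assumed so that data lengths reduce to block counts. The final `"overall"` result stands for the case in which no listed combination suffices and the overall check data is needed.

```python
# Hypothetical layout; counts of check blocks stand in for data lengths.
BLOCK_TO_GROUP = {"D0": "G0", "D1": "G0", "D2": "G1", "D3": "G1"}
GROUP_CHECKS = {"G0": 1, "G1": 1}        # check blocks per single group
COMBO_CHECKS = {("G0", "G1"): 2}         # check blocks per group combination

def choose_scope(to_recover, target_group):
    """Pick the smallest scope whose check data is long enough."""
    # Step D1: first to-be-recovered data of the single target group.
    first = [b for b in to_recover if BLOCK_TO_GROUP[b] == target_group]
    if len(first) <= GROUP_CHECKS[target_group]:
        return ("single", target_group)
    # Step D2: look for a combination that contains the single target group.
    for combo, n_checks in COMBO_CHECKS.items():
        if target_group not in combo:
            continue
        second = [b for b in to_recover if BLOCK_TO_GROUP[b] in combo]
        # Step D3: the combination's check data must be long enough.
        if len(second) <= n_checks:
            return ("combination", combo)
    return ("overall", None)

one = choose_scope(["D0"], "G0")
two = choose_scope(["D0", "D1"], "G0")
three = choose_scope(["D0", "D1", "D2"], "G0")
```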
  • The process of reading the corresponding local check data and local data from the pre-stored data for the data to be recovered has been described in detail above through reading scheme 1 and reading scheme 2. It can be understood that those skilled in the art may, according to actual application requirements, use either of reading scheme 1 and reading scheme 2 or a combination of them, or use other reading schemes; the embodiments of the present application do not limit the specific process of reading the corresponding local check data and local data from the pre-stored data for the data to be recovered.
  • The data to be recovered may include a to-be-recovered data block and/or a to-be-recovered check block. When the data to be recovered includes only check blocks, the local data may simply be re-encoded to obtain the corresponding to-be-recovered check block.
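  • For the check-block-only case, re-encoding is just the original encoding step applied again to the intact local data. The sketch below assumes integer-valued blocks, arithmetic modulo the prime 257, and a hypothetical check row `[1, 2]` for a group holding D0 and D1.

```python
P = 257  # prime modulus; an assumption of this sketch

def reencode_check_block(local_row, local_data, p=P):
    """Recompute a lost check block from the intact local data of its group."""
    return sum(c * d for c, d in zip(local_row, local_data)) % p

local_row = [1, 2]       # hypothetical check row restricted to group {D0, D1}
local_data = [10, 20]    # D0 and D1 are intact; only the check block C0 was lost
c0 = reencode_check_block(local_row, local_data)
```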
  • The embodiments of the present application mainly describe the recovery process of a to-be-recovered data block.
  • The corresponding recovery process may specifically include: constructing an overall decoding matrix, where the overall decoding matrix may include rows of the overall generation matrix and rows of the identity matrix, the rows of the identity matrix do not include the rows corresponding to the data to be recovered, the number of rows taken from the overall generation matrix equals the number of rows corresponding to the data to be recovered, and the overall decoding matrix is a square matrix; and decoding the local check data and the local data according to the overall decoding matrix to obtain the original data corresponding to the data to be recovered.
  • For example, when the local data includes k_1 data blocks, the corresponding identity matrix I_1 is a k_1 × k_1 square matrix. If one data block fails, the row corresponding to the failed data block is removed from the identity matrix I_1: if the failed data block is located in the i-th row (1 ≤ i ≤ k_1) of the original data matrix, the i-th row is removed from the identity matrix. The removed i-th row is then compensated by a row of the overall generation matrix, so that the overall decoding matrix remains a square matrix.
  • For example, suppose the overall generation matrix is the matrix P in formula (1) and the failed data block is the first data block D_0. The first row is removed from the 4 × 4 identity matrix, and a row of the overall generation matrix corresponding to a non-failed check block is added. If more than one row corresponds to non-failed check blocks, any one of them (for example, the first) may be selected. This gives the overall decoding matrix S corresponding to the above example:
  • The failed data block D_0 can be decoded by formula (9) using the overall decoding matrix S, the local data blocks D_1, D_2 and D_3, and the local check block C_0:
  • Both sides of formula (9) can be multiplied by the inverse matrix S^{-1} of the matrix S to obtain the recovered original data.
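  • The construction of the overall decoding matrix can be sketched numerically. The sketch assumes integer-valued blocks, arithmetic modulo the prime 257, and a hypothetical check row `[1, 2, 3, 4]`; it does not reproduce the matrix P of formula (1). The helper `solve_mod` solves the linear system directly, which gives the same result as multiplying by the inverse of S. It uses `pow(x, -1, p)`, available from Python 3.8.

```python
P = 257  # prime modulus; an assumption of this sketch

def solve_mod(matrix, rhs, p=P):
    """Solve matrix * x = rhs modulo a prime p by Gauss-Jordan elimination.
    Raises StopIteration if the matrix is singular."""
    n = len(matrix)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] % p != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)          # modular inverse of the pivot
        aug[col] = [(v * inv) % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] % p != 0:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

check_row = [1, 2, 3, 4]                 # hypothetical row producing C0
original = [10, 20, 30, 40]              # D0..D3 before the failure
c0 = sum(c * d for c, d in zip(check_row, original)) % P

# D0 failed: drop row 0 of the 4x4 identity matrix and add the check row,
# which keeps the overall decoding matrix S square.
failed = 0
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
S = [row for i, row in enumerate(identity) if i != failed] + [check_row]
surviving = [original[i] for i in range(4) if i != failed] + [c0]  # D1, D2, D3, C0

recovered = solve_mod(S, surviving)      # D0..D3, including the failed D0
```

  • Note that this overall variant reads D1, D2 and D3, that is, every surviving data block, which is the read cost the local variant is meant to avoid.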
  • Step 202 of recovering the data to be recovered according to the read local check data and local data may specifically include steps E1 and E2.
  • In step E1, a local decoding matrix is constructed; the local decoding matrix may specifically include rows of the identity matrix and rows of the local generation matrix, the rows of the identity matrix do not include the row corresponding to the data to be recovered, and the local decoding matrix may be a square matrix.
  • In step E2, the local check data and the local data are decoded according to the local decoding matrix to obtain the original data corresponding to the data to be recovered.
  • For example, suppose D_0, D_1, D_2 and D_3 are divided into two groups, the first containing D_0 and D_1 and the second containing D_2 and D_3, and the overall data D_0, D_1, D_2 and D_3 is encoded with the local generation matrix Q_3 of formula (6). When D_0 fails, D_0 can be recovered using the non-failed data block D_1 and the check block C_0, and the corresponding local decoding matrix T can be expressed as:
  • The failed data block D_0 can then be decoded by formula (11) using the local decoding matrix T, the local data block D_1 and the local check block C_0:
  • Likewise, D_0 and D_1 can be recovered using the non-failed check blocks C_0 and C_2, and the corresponding local decoding matrix T can be expressed as:
  • It can be understood that those skilled in the art may use steps E1 and E2 to recover the data to be recovered according to actual application requirements; the embodiments of the present application do not limit the specific recovery process.
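  • Steps E1 and E2 can be sketched for the group containing D_0 and D_1. The check rows `[1, 2]` and `[1, 3]` are hypothetical stand-ins (they are not the rows of Q_3 in formula (6)), blocks are modelled as integers modulo the prime 257, and `solve_mod` is a helper introduced only for this sketch. Because the split rows are zero outside the group, the local decoding matrix only needs the two columns of the group.

```python
P = 257  # prime modulus; an assumption of this sketch

def solve_mod(matrix, rhs, p=P):
    """Solve matrix * x = rhs modulo a prime p by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] % p != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [(v * inv) % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] % p != 0:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

row_c0 = [1, 2]      # hypothetical local check row producing C0
row_c2 = [1, 3]      # hypothetical second check row on the same group, producing C2
d0, d1 = 10, 20
c0 = (row_c0[0] * d0 + row_c0[1] * d1) % P
c2 = (row_c2[0] * d0 + row_c2[1] * d1) % P

# Case 1 (step E1): only D0 failed, so T = identity row of D1 + check row of C0.
T1 = [[0, 1], row_c0]
one_failure = solve_mod(T1, [d1, c0])        # step E2: yields [D0, D1]

# Case 2: D0 and D1 both failed, so T consists of the two check rows.
T2 = [row_c0, row_c2]
two_failures = solve_mod(T2, [c0, c2])       # yields [D0, D1]
```

  • In both cases the blocks D_2 and D_3 of the other group are never read.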
  • The method may further include: when the data length of the second to-be-recovered data exceeds the data length of the second local check data corresponding to the target group combination, reading the corresponding overall check data and overall data from the pre-stored data; and recovering the data to be recovered according to the read overall check data and overall data.
  • When the local check data and the local data are insufficient for recovery, this optional embodiment recovers the data to be recovered according to the overall check data, so that the reliability of data storage can be ensured.
  • Referring to FIG. 3, a flowchart of the steps of an embodiment of a data storage and recovery method of the present application is shown, which may specifically include the following steps 301-308.
  • In step 301, the rows of the overall generation matrix are split according to the local data of the data to be stored to obtain a corresponding local generation matrix; the split rows of the local generation matrix may include zero elements.
  • In step 302, the overall data of the data to be stored is encoded with the local generation matrix to obtain corresponding overall check data; the overall check data may specifically include local check data related to the local data.
  • In step 303, the local check data and its corresponding local data are stored.
  • In step 304, for the data to be recovered, the corresponding local check data and local data are read from the pre-stored data.
  • In step 305, it is determined whether the data length of the data to be recovered exceeds the data length of the corresponding local check data; if not, step 306 is performed, and if so, step 308 is performed.
  • In step 306, a local decoding matrix is constructed; the local decoding matrix may include rows of the identity matrix and rows of the local generation matrix, the rows of the identity matrix do not include the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix.
  • In step 307, the local check data and the local data are decoded according to the local decoding matrix to obtain the original data corresponding to the data to be recovered.
  • In step 308, the corresponding overall check data and overall data are read from the pre-stored data, and the data to be recovered is recovered according to the read overall check data and overall data.
  • In step 304, the local check data and the local data may be read using either of reading scheme 1 and reading scheme 2 or a combination of them; the data recovery that follows the determination of step 305 may be implemented by steps E1 and E2.
  • For the recovery process of step 308, reference may be made to the recovery process for local check data obtained by independently encoding the local data according to generation scheme 1 above; details are not repeated here.
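  • Steps 301-308 can be put together in a small round-trip sketch. All concrete choices are assumptions made for illustration: blocks are integers modulo the prime 257, the rows in `LOCAL_ROWS` and `OVERALL_ROWS` are hypothetical, each group has a single local check block, only data-block failures are handled, and group combinations are omitted for brevity.

```python
P = 257  # prime modulus; an assumption of this sketch

def solve_mod(matrix, rhs, p=P):
    """Solve matrix * x = rhs modulo a prime p by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] % p != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [(v * inv) % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] % p != 0:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def encode(rows, data):
    """Compute one check value per generation row."""
    return [sum(c * d for c, d in zip(row, data)) % P for row in rows]

GROUPS = [[0, 1], [2, 3]]                      # data block indices per group
LOCAL_ROWS = [[1, 2, 0, 0], [0, 0, 3, 4]]      # split rows with zero elements
OVERALL_ROWS = [[1, 1, 1, 1], [1, 2, 4, 8]]    # rows for the overall check data

def store(data):
    """Steps 301-303, plus the optional overall check data."""
    return {"data": list(data),
            "local": encode(LOCAL_ROWS, data),
            "overall": encode(OVERALL_ROWS, data)}

def recover(stored, failed):
    """Steps 304-308 for failed data block indices; returns {index: value}."""
    hit = [g for g, members in enumerate(GROUPS) if any(i in members for i in failed)]
    if len(hit) == 1 and len(failed) <= 1:     # step 305: one local check block suffices
        cols = GROUPS[hit[0]]                  # steps 306-307: read this group only
        rows, checks = [LOCAL_ROWS[hit[0]]], [stored["local"][hit[0]]]
    else:                                      # step 308: fall back to overall data
        cols = list(range(len(stored["data"])))
        rows, checks = OVERALL_ROWS, stored["overall"]
    alive = [i for i in cols if i not in failed]
    matrix = [[1 if j == i else 0 for j in cols] for i in alive]   # identity rows
    rhs = [stored["data"][i] for i in alive]
    for row, check in list(zip(rows, checks))[:len(failed)]:       # check rows
        matrix.append([row[j] for j in cols])
        rhs.append(check)
    solution = solve_mod(matrix, rhs)
    return {i: solution[cols.index(i)] for i in failed}

stored = store([10, 20, 30, 40])
local_result = recover(stored, [0])        # repaired inside group 0
overall_result = recover(stored, [0, 1])   # needs the overall check data
```

  • In this sketch a single failure touches two stored values (one data block and one local check block), while the fallback path reads all surviving data blocks and the overall check data.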
  • The data storage device may include a processor 41 and a machine-readable storage medium 42, which are typically interconnected by an internal bus 43.
  • The data storage device may also include an external interface 44 to enable communication with other devices or components.
  • The machine-readable storage medium 42 may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard disk drive), a solid-state drive, any type of storage disc (such as a compact disc or DVD), or a similar storage medium, or a combination thereof.
  • Machine-readable instructions corresponding to the data storage control logic 50, executed by the processor 41, may be stored on the machine-readable storage medium 42.
  • When the processor 41 reads and executes the machine-readable instructions stored on the machine-readable storage medium 42, the processor 41 can perform the data storage method described above.
  • The data storage control logic 50 may include a splitting module 501, an encoding module 502 and a local storage module 503.
  • The splitting module 501 is configured to split the rows of the overall generation matrix according to the local data of the data to be stored to obtain a corresponding local generation matrix; the split rows of the local generation matrix may include zero elements.
  • The encoding module 502 is configured to encode the overall data of the data to be stored with the local generation matrix to obtain corresponding overall check data; the overall check data may specifically include local check data related to the local data.
  • The local storage module 503 is configured to store the local check data and its corresponding local data.
  • The data storage control logic 50 may further include an obtaining module configured to acquire the local data of the data to be stored.
  • The obtaining module may include:
  • a grouping sub-module configured to group the divided data blocks and obtain the corresponding local data according to the groups.
  • The local data of the data to be stored may include the local data corresponding to the groups of the data to be stored, and the local check data may specifically include:
  • second local check data corresponding to the local data of a group combination.
  • The data storage control logic 50 may further include:
  • a first relationship storage module configured to store a mapping relationship between single group information and the storage address of the first local check data; and
  • a second relationship storage module configured to store a mapping relationship between group combination information and the storage address of the second local check data.
  • The data storage control logic 50 may further include:
  • an overall check generation module configured to generate corresponding overall check data for the overall data of the data to be stored; and
  • an overall storage module configured to store the overall check data.
  • The control logic of the present application can be understood as machine-readable instructions stored in the machine-readable storage medium 42.
  • When the processor 41 of the data storage device of the present application executes the control logic, the processor 41 performs the following operations by invoking the machine-readable instructions stored on the machine-readable storage medium 42:
  • splitting the rows of the overall generation matrix according to the local data of the data to be stored to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix include zero elements;
  • encoding the overall data of the data to be stored with the local generation matrix to obtain corresponding overall check data, wherein the overall check data includes local check data related to the local data; and
  • storing the local check data and its corresponding local data.
  • The machine-readable instructions stored on the machine-readable storage medium 42 may cause the processor 41 to:
  • group the divided data blocks and obtain the corresponding local data according to the groups.
  • The local data of the data to be stored may include the local data corresponding to the groups of the data to be stored.
  • The local check data may include:
  • second local check data corresponding to the local data of a group combination.
  • The machine-readable instructions stored on the machine-readable storage medium 42 may also cause the processor 41 to:
  • store a mapping relationship between group combination information and the storage address of the second local check data.
  • The description here is relatively brief; for the relevant parts, refer to the description of the storage method embodiment.
  • The data recovery device may include a processor 61 and a machine-readable storage medium 62, which are typically interconnected by an internal bus 63.
  • The data recovery device may also include an external interface 64 to enable communication with other devices or components.
  • The machine-readable storage medium 62 may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard disk drive), a solid-state drive, any type of storage disc (such as a compact disc or DVD), or a similar storage medium, or a combination thereof.
  • Machine-readable instructions corresponding to the data recovery control logic 70, executed by the processor 61, may be stored on the machine-readable storage medium 62.
  • When the processor 61 reads and executes the machine-readable instructions stored on the machine-readable storage medium 62, the processor 61 can perform the data recovery method described above.
  • The data recovery control logic 70 may include a reading module 701 and a recovery module 702.
  • The reading module 701 is configured to read, for the data to be recovered, the corresponding local check data and local data from the pre-stored data; the local check data is obtained by encoding the overall data of the data to be stored according to the local generation matrix, the local generation matrix is obtained by splitting the rows of the overall generation matrix, and the split rows of the local generation matrix may include zero elements.
  • The recovery module 702 is configured to recover the data to be recovered according to the read local check data and local data.
  • The recovery module 702 may specifically include:
  • a sub-module configured to construct a local decoding matrix, where the local decoding matrix may specifically include rows of the identity matrix and rows of the local generation matrix, the rows of the identity matrix do not include the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix; and
  • a decoding sub-module configured to decode the local check data and the local data according to the local decoding matrix to obtain the original data corresponding to the data to be recovered.
  • The reading module 701 may specifically include:
  • a first acquiring sub-module configured to obtain, from the data to be recovered, first to-be-recovered data belonging to a single target group; and
  • a first reading sub-module configured to, when the data length of the first to-be-recovered data does not exceed the data length of the first local check data corresponding to the single target group, read from the pre-stored data the first local check data and the local data corresponding to the single target group as the local check data and local data corresponding to the first to-be-recovered data; the first local check data may be obtained according to the local data of a single group.
  • The first reading sub-module may specifically include:
  • a first address obtaining unit configured to obtain the first target storage address of the first local check data corresponding to the single target group according to the pre-stored mapping relationship between single group information and the storage address of the first local check data; and
  • a first reading unit configured to read the corresponding first local check data from the pre-stored data according to the first target storage address.
  • Alternatively, the reading module 701 may specifically include:
  • a first acquiring sub-module configured to obtain, from the data to be recovered, first to-be-recovered data belonging to a single target group;
  • a second acquiring sub-module configured to, when the data length of the first to-be-recovered data exceeds the data length of the first local check data corresponding to the single target group, obtain from the data to be recovered second to-be-recovered data belonging to a target group combination, where the target group combination includes the single target group; and
  • a second reading sub-module configured to, when the data length of the second to-be-recovered data does not exceed the data length of the second local check data corresponding to the target group combination, read from the pre-stored data the second local check data and the local data corresponding to the target group combination as the local check data and local data corresponding to the second to-be-recovered data; the second local check data is obtained according to the local data of the group combination.
  • The second reading sub-module may specifically include:
  • a second address obtaining unit configured to obtain the second target storage address of the second local check data corresponding to the target group combination according to the pre-stored mapping relationship between group combination information and the storage address of the second local check data; and
  • a second reading unit configured to read the corresponding second local check data from the pre-stored data according to the second target storage address.
  • The data recovery control logic 70 may further include:
  • an overall reading module configured to, when the data length of the second to-be-recovered data exceeds the data length of the second local check data corresponding to the target group combination, read the corresponding overall check data and overall data from the pre-stored data; and
  • an overall recovery module configured to recover the data to be recovered according to the read overall check data and overall data.
  • The following takes a software implementation as an example to further describe how the data recovery device executes the control logic 70.
  • In this example, the control logic of the present application can be understood as machine-readable instructions stored in the machine-readable storage medium 62.
  • When the processor 61 of the data recovery device of the present application executes the control logic, the processor 61 performs the following operations by invoking the machine-readable instructions stored on the machine-readable storage medium 62:
  • reading, for the data to be recovered, the corresponding local check data and local data from the pre-stored data, wherein the local check data is obtained by encoding the overall data of the data to be stored according to the local generation matrix, the local generation matrix is obtained by splitting the rows of the overall generation matrix, and the split rows of the local generation matrix include zero elements; and
  • recovering the data to be recovered according to the read local check data and local data.
  • When recovering the data to be recovered according to the read local check data and local data, the machine-readable instructions cause the processor to:
  • construct a local decoding matrix, wherein the local decoding matrix includes rows of the identity matrix and rows of the local generation matrix, the rows of the identity matrix do not include the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix;
  • The machine-readable instructions cause the processor to:
  • read, from the pre-stored data, the first local check data and the local data corresponding to the single target group as the local check data and local data corresponding to the first to-be-recovered data, wherein the first local check data is obtained according to the local data of a single group.
  • When reading the first local check data corresponding to the single target group from the pre-stored data, the machine-readable instructions cause the processor to:
  • read the corresponding first local check data from the pre-stored data according to the first target storage address.
  • the machine readable instructions when the corresponding partial parity data is read from the pre-stored data for the data to be recovered, the machine readable instructions cause the processor to:
  • the target grouping combination includes: a single target grouping
  • the second partial-verification corresponding to the target packet combination is read from the pre-stored data.
  • the data and the local data are used as the local check data and the local data corresponding to the second to-be-recovered data; wherein the second partial check data is obtained according to the partial data of the packet combination.
  • the machine readable instructions when the second partial parity data corresponding to the target packet combination is read from pre-stored data, the machine readable instructions cause the processor to:
  • the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • Those skilled in the art will appreciate that embodiments of the present application can be provided as a method, an apparatus, or a computer program product. The embodiments may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) that contain computer-usable program code.
  • In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory.
  • Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media.
  • Information storage can be implemented by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible to a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing terminal device to produce a machine, such that the instructions executed by that processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction device.
  • The instruction device implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • Detection And Correction Of Errors (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)
  • Debugging And Monitoring (AREA)
  • Communication Control (AREA)

Abstract

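The present application provides a data storage method and apparatus, and a data recovery method and apparatus. The data recovery method may include: for data to be recovered, reading the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix include zero elements (201); and recovering the data to be recovered according to the local check data and local data that were read (202). This reduces the amount of data read during data recovery and can therefore greatly reduce the IO consumed by recovery.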

Description

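Data storage method and apparatus, and data recovery method and apparatus
Cross-reference to related applications
This patent application claims priority to Chinese patent application No. 201610500318.5, filed on June 29, 2016 and entitled "Data storage method and apparatus, and data recovery method and apparatus", the entire content of which is incorporated herein by reference.
Technical field
The present application relates to the field of computer information technology, and in particular to a data storage method and apparatus and a data recovery method and apparatus.
Background
With the rapid development of the information technology industry, more and more vendors deploy distributed systems in their products for reasons of cost, reliability and the like, and distributed systems have developed rapidly as a result.
In existing distributed system architectures, a file can be divided into multiple data blocks for storage. To ensure robustness and disaster recovery capability, a data block generally has multiple replicas stored at different physical locations. This multi-replica approach, however, requires more storage devices and therefore raises storage cost. With three replicas, for example, it adds 200% storage redundancy and 200% storage cost.
Compared with the multi-replica approach, the RS (Reed-Solomon) method generates check blocks from specified data blocks and, when a data block fails, recovers it from the surviving data blocks and check blocks, so it achieves higher data reliability with lower redundancy. For example, when the specified data blocks and their check blocks are 100M and 30M in size respectively, the RS method reaches the storage reliability of three replicas with 30% redundancy.
However, when recovering a failed data block, the RS method usually has to read all surviving data blocks and check blocks; that is, it cannot use IO (Input/Output) efficiently during recovery. An RS method with 30% redundancy typically has to read 100M of data during recovery, which amounts to 10 times the IO consumption.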
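Summary
The technical problem to be solved by the embodiments of the present application is to provide a data storage method and a data recovery method that reduce the amount of data read during data recovery and thereby greatly reduce the IO consumed by recovery.
Correspondingly, the embodiments of the present application also provide a data storage apparatus and a data recovery apparatus to ensure the implementation and application of the above methods.
To solve the above problem, the present application discloses a data storage method, including:
splitting rows of an overall generation matrix according to local data of data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix include zero elements;
encoding the whole data of the data to be stored with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data includes local check data related to the local data; and
storing the local check data and its corresponding local data.
Optionally, the local data of the data to be stored is obtained by the following steps:
dividing the data to be stored into data blocks; and
grouping the resulting data blocks and obtaining the corresponding local data according to the groups.
Optionally, the local data of the data to be stored includes local data of the groups of the data to be stored, and the local check data includes:
first local check data corresponding to the local data of a single group; and/or
second local check data corresponding to the local data of a group combination.
Optionally, the method further includes:
storing a mapping between single-group information and storage addresses of first local check data; and/or
storing a mapping between group-combination information and storage addresses of second local check data.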
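In another aspect, the present application discloses a data recovery method, including:
for data to be recovered, reading the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix include zero elements; and
recovering the data to be recovered according to the local check data and local data that were read.
Optionally, the step of recovering the data to be recovered according to the local check data and local data that were read includes:
constructing a local decoding matrix, wherein the local decoding matrix includes rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix; and
decoding the local check data and local data according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
Optionally, the step of reading the corresponding local check data from pre-stored data for the data to be recovered includes:
obtaining, from the data to be recovered, first data to be recovered that belongs to a single target group; and
when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, reading the first local check data and local data corresponding to the single target group from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data is obtained from the local data of a single group.
Optionally, the step of reading the first local check data corresponding to the single target group from pre-stored data includes:
obtaining a first target storage address of the first local check data corresponding to the single target group, according to a pre-stored mapping between single-group information and storage addresses of first local check data; and
reading the corresponding first local check data from pre-stored data according to the first target storage address.
Optionally, the step of reading the corresponding local check data from pre-stored data for the data to be recovered includes:
obtaining, from the data to be recovered, first data to be recovered that belongs to a single target group;
when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, obtaining, from the data to be recovered, second data to be recovered that belongs to a target group combination, wherein the target group combination includes the single target group; and
when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, reading the second local check data and local data corresponding to the target group combination from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data is obtained from the local data of a group combination.
Optionally, the step of reading the second local check data corresponding to the target group combination from pre-stored data includes:
obtaining a second target storage address of the second local check data corresponding to the target group combination, according to a pre-stored mapping between group-combination information and storage addresses of second local check data; and
reading the corresponding second local check data from pre-stored data according to the second target storage address.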
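In yet another aspect, the present application discloses a data storage apparatus including a processor, wherein the processor, by reading and executing machine-readable instructions that are stored on a storage medium and correspond to data storage control logic:
splits rows of an overall generation matrix according to local data of data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix include zero elements;
encodes the whole data of the data to be stored with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data includes local check data related to the local data; and
stores the local check data and its corresponding local data.
Optionally, the machine-readable instructions cause the processor to obtain the local data of the data to be stored by:
dividing the data to be stored into data blocks; and
grouping the resulting data blocks and obtaining the corresponding local data according to the groups.
In a further aspect, the present application discloses a data recovery apparatus including a processor, wherein the processor, by reading and executing machine-readable instructions that are stored on a storage medium and correspond to data recovery control logic:
for data to be recovered, reads the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix include zero elements; and
recovers the data to be recovered according to the local check data and local data that were read.
Optionally, when recovering the data to be recovered according to the local check data and local data that were read, the machine-readable instructions cause the processor to:
construct a local decoding matrix, wherein the local decoding matrix includes rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix; and
decode the local check data and local data according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
Optionally, when reading the corresponding local check data from pre-stored data for the data to be recovered, the machine-readable instructions cause the processor to:
obtain, from the data to be recovered, first data to be recovered that belongs to a single target group; and
when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, read the first local check data and local data corresponding to the single target group from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data is obtained from the local data of a single group.
Optionally, when reading the first local check data corresponding to the single target group from pre-stored data, the machine-readable instructions cause the processor to:
obtain a first target storage address of the first local check data corresponding to the single target group, according to a pre-stored mapping between single-group information and storage addresses of first local check data; and
read the corresponding first local check data from pre-stored data according to the first target storage address.
Optionally, when reading the corresponding local check data from pre-stored data for the data to be recovered, the machine-readable instructions cause the processor to:
obtain, from the data to be recovered, first data to be recovered that belongs to a single target group;
when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, obtain, from the data to be recovered, second data to be recovered that belongs to a target group combination, wherein the target group combination includes the single target group; and
when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, read the second local check data and local data corresponding to the target group combination from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data is obtained from the local data of a group combination.
Optionally, when reading the second local check data corresponding to the target group combination from pre-stored data, the machine-readable instructions cause the processor to:
obtain a second target storage address of the second local check data corresponding to the target group combination, according to a pre-stored mapping between group-combination information and storage addresses of second local check data; and
read the corresponding second local check data from pre-stored data according to the second target storage address.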
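Compared with the prior art, the embodiments of the present application have the following advantages:
The local check data of the embodiments is obtained by encoding with the local generation matrix that corresponds to the local data of the data to be stored. Because the split rows of the local generation matrix may include zero elements, the local check data is related to that local data and may be unrelated to the rest of the data to be stored. The local data can therefore be recovered from the local check data without relying on other data; that is, during recovery only the local check data and local data corresponding to the data to be recovered need to be read. Compared with existing schemes, which usually read all surviving data blocks and check blocks when recovering a failed data block, the embodiments read less data during recovery and can thus greatly reduce the IO consumed by recovery.
Brief description of the drawings
FIG. 1 is a flowchart of the steps of an embodiment of a data storage method of the present application;
FIG. 2 is a flowchart of the steps of an embodiment of a data recovery method of the present application;
FIG. 3 is a flowchart of the steps of an embodiment of a data storage and recovery method of the present application;
FIG. 4 is a schematic diagram of the hardware structure of a data storage apparatus according to an embodiment of the present application;
FIG. 5 is a functional module diagram of data storage control logic according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the hardware structure of a data recovery apparatus according to an embodiment of the present application;
FIG. 7 is a functional module diagram of data recovery control logic according to an embodiment of the present application.
Detailed description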
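To make the above objects, features and advantages of the present application easier to understand, the application is described in further detail below with reference to the drawings and specific embodiments.
Storage method embodiment
Referring to FIG. 1, a flowchart of the steps of an embodiment of a data storage method of the present application is shown, which may include the following steps 101 to 103.
In step 101, rows of an overall generation matrix are split according to local data of the data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix may include zero elements.
In step 102, the whole data of the data to be stored is encoded with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data may include local check data related to the local data.
In step 103, the local check data and its corresponding local data are stored.
The embodiments can be applied to the storage of data in any field, such as multimedia, e-commerce or search; that is, the data to be stored can be data from any field. Those skilled in the art can determine the data length of the data to be stored according to actual requirements; the embodiments place no limit on the specific data to be stored or its length.
In the embodiments, local data denotes a part of the whole data of the data to be stored. The local check data is obtained by encoding with the local generation matrix that corresponds to the local data. Because the split rows of the local generation matrix may include zero elements, the local check data is related to that local data and may be unrelated to the rest of the data to be stored. The local data can therefore be recovered from the local check data without relying on other data; that is, during recovery only the local check data and local data corresponding to the data to be recovered need to be read. Compared with reading all surviving data blocks and check blocks when recovering a failed data block, the embodiments read less data during recovery and can thus greatly reduce the IO consumed by recovery.
In an optional embodiment, the local data of the data to be stored can be obtained by dividing the data to be stored into data blocks, grouping the resulting data blocks, and obtaining the corresponding local data according to the groups. That is, the local data may include at least one data block of a group, and the local check data may likewise include at least one check block of a group. For example, when the data blocks to be stored total 100M, they can be grouped. Suppose the first and second groups are each 50M; a first check block is generated for the first group and a second check block for the second group. When a data block of the first group fails, the surviving data blocks of the first group and the first check block are read for recovery, so 50M of data is read; compared with the 100M read by existing schemes, this cuts IO consumption by 50%. Likewise, when a data block of the second group fails, the surviving data blocks of the second group and the second check block are read, again cutting IO consumption by 50%.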
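The 100M example above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the 10M block size, the contiguous grouping and the helper names `group_blocks`, `local_read_mb` and `global_read_mb` are assumptions introduced here, not part of the application.

```python
# Illustrative only: block size and group layout are assumptions.
BLOCK_MB = 10  # assumed size of one data block or check block


def group_blocks(block_ids, group_count):
    """Split the data blocks into `group_count` contiguous groups of (nearly) equal size."""
    size = -(-len(block_ids) // group_count)  # ceiling division
    return [block_ids[i:i + size] for i in range(0, len(block_ids), size)]


def local_read_mb(group, failed):
    """Read volume for recovery inside one group: survivors plus one check block per failed block."""
    survivors = [b for b in group if b not in failed]
    return (len(survivors) + len(failed)) * BLOCK_MB


def global_read_mb(block_count):
    """Read volume of a conventional RS repair: as many blocks as there are data blocks."""
    return block_count * BLOCK_MB


blocks = list(range(10))              # 10 blocks x 10M = 100M of data
groups = group_blocks(blocks, 2)      # two groups of 50M each
local = local_read_mb(groups[0], failed={3})
overall = global_read_mb(len(blocks))
saving = 1 - local / overall          # fraction of IO saved by local recovery
```

Under these assumptions a single failure in the first group is repaired from 50M instead of 100M, which matches the 50% reduction stated in the text.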
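In an optional embodiment, the local data of the data to be stored can be encoded to obtain a corresponding local check vector, whose data serves as the local check data of that local data. This ensures that the local check vector is related to the local data and may be unrelated to the rest of the data to be stored, so the local data can be recovered from the local check vector without relying on other data.
In an application example, the data to be stored is first split in units of bytes into a number of data blocks, which together form a data vector. The data vector is then split into several sub data vectors, each of which can serve as local data of the data to be stored. Finally, each sub data vector is encoded to obtain the corresponding local check vector. The data length of a data block may be 1, 2, 4, 8 or another number of bytes; the embodiments place no limit on it.
In practice, suppose the whole data (the data vector) of the data to be stored includes k data blocks and the local data (a sub data vector) includes k1 data blocks; the k1 data blocks can then be encoded to generate m1 check blocks.
An example of encoding k1 data blocks with the (n, k1) RS method follows. Here n is the total number of data blocks and check blocks, k1 is the number of data blocks to be encoded, and m1 = n - k1 is the number of check blocks. Suppose the k1 data blocks are denoted D0, D1, ..., Dk1-1 and each has size M/k1. The product of the overall generation matrix and the k1 data blocks yields m1 check blocks C0, C1, ..., Cm1-1, each also of size M/k1. The overall generation matrix can be a matrix of m1 rows and k1 columns over a Galois field; it can be a transformed Vandermonde matrix or a Cauchy matrix. With the (n, k1) RS method, when a data block or check block fails it must be recovered to maintain reliability. Specifically, if a check block fails, it can be regenerated by re-encoding the k1 data blocks; if one data block fails, it can be recovered from any k1 of the remaining n - 1 data blocks and check blocks. The Galois field can be the extension of the polynomial ring over {0, 1} modulo x^8+x^4+x^3+x^2+1; it contains the 256 elements 0 to 255, which correspond to all values of one byte.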
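The encoding step can be made concrete with a small sketch. It assumes a Cauchy generation matrix with the hypothetical choice x_i = i, y_j = m + j and the field polynomial named in the text (0x11D); the function names are introduced here for illustration and are not the applicant's implementation.

```python
def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo x^8+x^4+x^3+x^2+1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254 (a must be nonzero)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def cauchy_matrix(m, k):
    """m x k Cauchy matrix with entries 1 / (x_i + y_j); x_i = i and y_j = m + j are assumed."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def encode(matrix, data_blocks):
    """Multiply the generation matrix by the data blocks, one byte position at a time."""
    size = len(data_blocks[0])
    checks = []
    for row in matrix:
        out = bytearray(size)
        for coeff, block in zip(row, data_blocks):
            for t, byte in enumerate(block):
                out[t] ^= gf_mul(coeff, byte)  # addition in GF(2^8) is XOR
        checks.append(bytes(out))
    return checks


k1, m1 = 4, 2
data = [bytes([i, i + 1, i + 2]) for i in (1, 10, 20, 30)]
checks = encode(cauchy_matrix(m1, k1), data)  # m1 check blocks, same size as a data block
```

A failed check block is regenerated simply by calling `encode` again on the k1 data blocks, as the text describes.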
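It can be understood that the above process of encoding the local data is only an example. Those skilled in the art can adopt whatever encoding process their application requires; the embodiments place no limit on the specific process of encoding the local data of the data to be stored.
Taking the RS method described above as an example again, steps 101 and 102 can compute the product of the generation matrix and the k data blocks to obtain m check blocks C0, C1, ..., Cm-1; that is, the m check blocks correspond to the elements of the overall check vector. When the overall check vector includes elements related to the local data, those elements are related to that local data and may be unrelated to the rest of the data to be stored, so the local data can be recovered from the local check vector without relying on other data.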
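In an application example, suppose the overall generation matrix obtained over the Galois field is P, a matrix of m rows and k columns:
Figure PCTCN2016113523-appb-000001
Step 101 can then split row 1 of the overall generation matrix P into R rows to obtain the local generation matrix Q1, where Q1 is a matrix of m+R-1 rows and k columns.
Suppose m=5, k=4 and R=2; the local generation matrix Q1 can then be expressed as:
Figure PCTCN2016113523-appb-000002
It can be seen that the split rows of Q1, its first and second rows, both include zero elements.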
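The two formula images are not available in this text. As a hedged reconstruction for m=5, k=4, R=2, and assuming the notation p_{i,j} for the entry of P in row i and column j (indices from 0) and a split that keeps the first two columns in one row and the last two columns in the other, the matrices would have the following shape:

```latex
% Assumed notation and split; the original figures may order or label entries differently.
P=\begin{pmatrix}
p_{0,0}&p_{0,1}&p_{0,2}&p_{0,3}\\
p_{1,0}&p_{1,1}&p_{1,2}&p_{1,3}\\
p_{2,0}&p_{2,1}&p_{2,2}&p_{2,3}\\
p_{3,0}&p_{3,1}&p_{3,2}&p_{3,3}\\
p_{4,0}&p_{4,1}&p_{4,2}&p_{4,3}
\end{pmatrix},
\qquad
Q_1=\begin{pmatrix}
p_{0,0}&p_{0,1}&0&0\\
0&0&p_{0,2}&p_{0,3}\\
p_{1,0}&p_{1,1}&p_{1,2}&p_{1,3}\\
p_{2,0}&p_{2,1}&p_{2,2}&p_{2,3}\\
p_{3,0}&p_{3,1}&p_{3,2}&p_{3,3}\\
p_{4,0}&p_{4,1}&p_{4,2}&p_{4,3}
\end{pmatrix}
```

Q1 has m+R-1 = 6 rows and k = 4 columns, and the two split rows add up to the original first row of P.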
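In an optional embodiment, step 102 multiplies the local generation matrix Q1 by the data vector formed by the k data blocks to obtain the overall check vector Cr1:
Figure PCTCN2016113523-appb-000003
It can be seen that check block C0 of the overall check vector Cr1 is related to local data D0 and D1, and check block C1 is related to local data D2 and D3, so data can be recovered from the local data and some of the elements of the overall check vector. For example, any two of local data D0, D1 and check block C0 suffice to recover the third: when D0 fails, it can be recovered from D1 and C0.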
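The row split, the encoding with Q1 and the local repair of D0 can be sketched as follows. The Cauchy matrix used for P, the one-byte symbols and the helper names `split_row` and `encode_symbols` are assumptions made for illustration; the application does not prescribe them.

```python
def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo x^8+x^4+x^3+x^2+1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254 (a must be nonzero)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def cauchy_matrix(m, k):
    """Assumed stand-in for the overall generation matrix P."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def split_row(matrix, row_index, column_groups):
    """Replace one row by one row per column group; entries outside the group become zero."""
    row = matrix[row_index]
    pieces = [[c if j in group else 0 for j, c in enumerate(row)] for group in column_groups]
    return matrix[:row_index] + pieces + matrix[row_index + 1:]


def encode_symbols(matrix, data):
    """Matrix-vector product over GF(2^8) for one-byte symbols."""
    checks = []
    for row in matrix:
        acc = 0
        for coeff, symbol in zip(row, data):
            acc ^= gf_mul(coeff, symbol)
        checks.append(acc)
    return checks


P = cauchy_matrix(5, 4)
Q1 = split_row(P, 0, [{0, 1}, {2, 3}])   # 6 rows x 4 columns
data = [0x12, 0x34, 0x56, 0x78]          # D0..D3, one byte each
Cr1 = encode_symbols(Q1, data)

# D0 is lost: C0 = q00*D0 + q01*D1, so D0 = q00^-1 * (C0 + q01*D1); "+" is XOR.
recovered_d0 = gf_mul(gf_inv(Q1[0][0]), Cr1[0] ^ gf_mul(Q1[0][1], data[1]))
```

Only D1 and C0 are read for the repair; D2 and D3 are never touched, which is the locality the split is meant to provide.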
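In other embodiments, the first row of the overall generation matrix P can be split into more than 2 rows. For example, it can be split into 3 rows to obtain the local generation matrix Q2:
Figure PCTCN2016113523-appb-000004
Multiplying the local generation matrix Q2 by the data vector formed by the k data blocks gives the overall check vector Cr2:
Figure PCTCN2016113523-appb-000005
It can be seen that in the overall check vector Cr2 check block C0 is related to local data D0, check block C1 is related to local data D1, and check block C2 is related to local data D2 and D3, so data can be recovered from the local data and some of the elements of the overall check vector. For example, D0 can be recovered directly from C0, D1 directly from C1, and any two of C2, D2 and D3 suffice to recover the third.
In other embodiments, other rows of the overall generation matrix P can also be split. The following example splits both row 1 and row 2 of P into 2 rows each to obtain the local generation matrix Q3:
Figure PCTCN2016113523-appb-000006
Multiplying the local generation matrix Q3 by the data vector formed by the k data blocks gives the overall check vector Cr3:
Figure PCTCN2016113523-appb-000007
It can be seen that in the overall check vector Cr3 check blocks C0 and C2 are each related to local data D0 and D1, and check blocks C1 and C3 are each related to local data D2 and D3, so data can be recovered from the local data and some of the elements of the overall check vector. For example, any two of D0, D1, C0 and C2 suffice to recover the other two: when D0 and D1 both fail, they can be recovered from C0 and C2.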
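The double-failure case can be sketched in the same style: both D0 and D1 are lost and are recovered from C0 and C2 alone by solving a 2x2 system over GF(2^8). As before, the Cauchy matrix standing in for P and all helper names are assumptions for illustration.

```python
def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo x^8+x^4+x^3+x^2+1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254 (a must be nonzero)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def cauchy_matrix(m, k):
    """Assumed stand-in for P; every square submatrix of a Cauchy matrix is invertible."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def split_row(matrix, row_index, column_groups):
    """Replace one row by one row per column group; entries outside the group become zero."""
    row = matrix[row_index]
    pieces = [[c if j in group else 0 for j, c in enumerate(row)] for group in column_groups]
    return matrix[:row_index] + pieces + matrix[row_index + 1:]


def encode_symbols(matrix, data):
    """Matrix-vector product over GF(2^8) for one-byte symbols."""
    checks = []
    for row in matrix:
        acc = 0
        for coeff, symbol in zip(row, data):
            acc ^= gf_mul(coeff, symbol)
        checks.append(acc)
    return checks


def solve_2x2(a, b, c, d, y0, y1):
    """Solve [[a, b], [c, d]] * (x0, x1) = (y0, y1) with Cramer's rule; subtraction is XOR."""
    det_inv = gf_inv(gf_mul(a, d) ^ gf_mul(b, c))
    x0 = gf_mul(det_inv, gf_mul(d, y0) ^ gf_mul(b, y1))
    x1 = gf_mul(det_inv, gf_mul(c, y0) ^ gf_mul(a, y1))
    return x0, x1


halves = [{0, 1}, {2, 3}]
P = cauchy_matrix(5, 4)
# After the first split, the original second row of P sits at index 2.
Q3 = split_row(split_row(P, 0, halves), 2, halves)
data = [0x12, 0x34, 0x56, 0x78]
Cr3 = encode_symbols(Q3, data)

# Rows 0 and 2 of Q3 touch only D0 and D1, so C0 and C2 are enough to recover both.
d0, d1 = solve_2x2(Q3[0][0], Q3[0][1], Q3[2][0], Q3[2][1], Cr3[0], Cr3[2])
```

The 2x2 system is solvable here because the chosen stand-in for P is a Cauchy matrix; with another generation matrix the invertibility of the relevant submatrix would have to be checked.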
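It can be understood that the above ways of splitting rows of the overall generation matrix are only optional embodiments; those skilled in the art can split the rows flexibly according to their requirements. For example, the four data blocks D0, D1, D2 and D3 can be divided into two groups, the first containing D0 and D1 and the second containing D2 and D3. In one split row, the column elements of the first group (the elements in the first and second columns of the overall generation matrix) are kept non-zero while those of the second group (the elements in the third and fourth columns) are set to zero; in the other split row, the column elements of the first group are set to zero while those of the second group are kept non-zero. This reduces the correlation between the check blocks of the first and second groups, so that the local generation matrix Q3 can be used to recover the first group and the second group locally and independently. Likewise, when the number of groups G of the data to be stored is greater than 2, the same splitting principle can be applied to reduce the correlation between the check blocks of the groups. The embodiments place no limit on the specific process of splitting rows of the overall generation matrix.
It should be noted that m=5 and k=4 in formulas (1) to (7) are only an application example. Those skilled in the art can use other values of m and k according to their requirements, for example k=10 and m=4; the embodiments place no limit on the specific numbers of data blocks and check blocks.
In summary, when the overall check vector of step 102 includes elements related to the local data, those elements are related to that local data and may be unrelated to the rest of the data to be stored, so the local data can be recovered from the local check vector without relying on other data.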
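In yet another optional embodiment, the local data of the data to be stored may include local data of the groups of the data to be stored, and the local check data may include:
first local check data corresponding to the local data of a single group; and/or
second local check data corresponding to the local data of a group combination.
The corresponding encoding process may include:
step A1: encoding the local data of a single group to obtain the corresponding first local check data; and/or
step A2: encoding the local data of a group combination to obtain the corresponding second local check data.
In practice, the data to be stored can be grouped according to actual requirements. If the number of groups is G, step A1 encodes the local data of a single group among the G groups to obtain the first local check data, so first local check data corresponds to an individual group. Step A2 encodes the local data of a group combination among the G groups to obtain the second local check data, so second local check data corresponds to a group combination; the number of groups covered by a group combination can be smaller than G.
After step 102 produces the local check data, step 103 stores the local check data and its corresponding local data. In an optional embodiment, suppose the local data includes k1 data blocks and the local check data includes m1 check blocks; these can be distributed over (k1+m1) different storage nodes. A storage node here is a logical abstraction of a storage device and can be a disk or a storage server. In other words, the local check data and its local data can be stored in a distributed manner in units of data blocks or check blocks to spread the risk of data loss. The embodiments place no limit on the specific way the local check data and its local data are stored.
In an optional embodiment, to make the different local check data easy to address, the method may further include:
storing a mapping between single-group information and storage addresses of first local check data; and/or
storing a mapping between group-combination information and storage addresses of second local check data.
The single-group information identifies a single group and may include the ID (identity), name and similar information of that group. Likewise, the group-combination information identifies the groups that are combined. In practice, a storage address may include the storage path on a storage node, through which the corresponding local check data can be accessed (including read) directly.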
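The two mappings can be pictured as simple lookup tables. In the sketch below the group identifiers, the node names and the storage paths are all hypothetical, and an in-memory dictionary stands in for whatever metadata store a real system would use.

```python
# Hypothetical group ids and storage paths, used only to illustrate the two mappings.
single_group_index = {
    "G0": "node-05:/check/G0",
    "G1": "node-06:/check/G1",
}
combination_index = {
    frozenset({"G0", "G1"}): "node-07:/check/G0+G1",
}


def first_check_address(group_id):
    """Storage address of the first local check data of a single group."""
    return single_group_index[group_id]


def second_check_address(group_ids):
    """Storage address of the second local check data of a group combination (order-insensitive)."""
    return combination_index[frozenset(group_ids)]
```

Keying the combination table by a `frozenset` makes the lookup independent of the order in which the groups are listed, which is one reasonable design choice among several.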
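In an optional embodiment, the method may further include generating overall check data for the whole data of the data to be stored, and storing the overall check data. This embodiment stores overall check data in addition to the local check data, so that when the local data and local check data are not sufficient to recover the data to be recovered, recovery can fall back on the overall check data.
In summary, the local check data of the embodiments is obtained by encoding with the local generation matrix that corresponds to the local data of the data to be stored. Because the split rows of the local generation matrix may include zero elements, the local check data is related to that local data and may be unrelated to the rest of the data to be stored. The local data can therefore be recovered from the local check data without relying on other data; that is, during recovery only the local check data and local data corresponding to the data to be recovered need to be read, which reduces the amount of data read and can greatly reduce the IO consumed by recovery.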
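Recovery method embodiment
Referring to FIG. 2, a flowchart of the steps of an embodiment of a data recovery method of the present application is shown, which may include the following steps 201 and 202.
In step 201, for data to be recovered, the corresponding local check data and local data are read from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix may include zero elements.
In step 202, the data to be recovered is recovered according to the local check data and local data that were read.
In the embodiments, data to be recovered denotes data that needs recovery and usually corresponds to failed data blocks or check blocks. In practice it can be expressed by the identifiers of data blocks or check blocks; for example, the data to be recovered may include a data block D_X to be recovered, numbered X, and/or a check block C_Y to be recovered, numbered Y. The number of data blocks or check blocks to be recovered can be one or more; the embodiments place no limit on the specific data to be recovered.
In an optional embodiment, the local data can also correspond to a group of data blocks of the whole data; that is, the local data may include at least one data block of a group, and the local check data may likewise include at least one check block of a group. The embodiments provide the following read schemes for reading the corresponding local check data and local data from pre-stored data for the data to be recovered:
Read scheme 1
In read scheme 1, step 201 of reading the corresponding local check data from pre-stored data for the data to be recovered may include steps C1 and C2.
In step C1, first data to be recovered that belongs to a single target group is obtained from the data to be recovered.
In step C2, when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, the first local check data and local data corresponding to the single target group are read from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data may be obtained from the local data of a single group.
In read scheme 1, for the local data of the single target group and its first local check data, the maximum permitted data length of the first data to be recovered is the data length of the first local check data. For example, if the local data of a single group includes k1 data blocks and its first local check data includes m1 check blocks, then, with data blocks and check blocks of equal length, the (k1+m1) data blocks and check blocks of the group tolerate the failure and recovery of at most m1 blocks (data blocks and check blocks together).
In practice, a mapping between data block or check block information and single-group information can be stored in advance. Step C1 then looks up, in this mapping, the information (such as the identifier) of each data block or check block to be recovered to find the single target group it belongs to, and extracts from the data to be recovered the first data to be recovered that belongs to that group. The embodiments place no limit on the specific process by which step C1 obtains the first data to be recovered.
In an optional embodiment, the data length of the first data to be recovered is compared with the data length of the first local check data corresponding to the single target group. When it is less than or equal, the local data and first local check data of the single target group are considered sufficient to recover the first data to be recovered, so the first local check data and local data of that group are read from pre-stored data.
In another optional embodiment, the step of reading the first local check data corresponding to the single target group from pre-stored data may include steps C21 and C22.
In step C21, a first target storage address of the first local check data corresponding to the single target group is obtained according to a pre-stored mapping between single-group information and storage addresses of first local check data.
In step C22, the corresponding first local check data is read from pre-stored data according to the first target storage address.
Likewise, a mapping between single-group information and storage addresses of local data can be stored in advance, and the local data of the single target group read according to it. The embodiments place no limit on the specific process of reading the local data and first local check data of the single target group.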
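Read scheme 2
In read scheme 2, step 201 of reading the corresponding local check data from pre-stored data for the data to be recovered may include steps D1, D2 and D3.
In step D1, first data to be recovered that belongs to a single target group is obtained from the data to be recovered.
In step D2, when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, second data to be recovered that belongs to a target group combination is obtained from the data to be recovered, wherein the target group combination may include the single target group.
In step D3, when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, the second local check data and local data corresponding to the target group combination are read from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data may be obtained from the local data of a group combination.
In read scheme 2, when the data length of the first data to be recovered exceeds the data length of the first local check data of the single target group, for example when the first data to be recovered includes 3 data blocks while the first local check data includes 2 check blocks, the local data and local check data of the single target group are considered insufficient to recover the first data to be recovered. In that case the second local check data and local data of the target group combination can be used to recover the second data to be recovered.
Whereas the first data to be recovered corresponds to the single target group, the second data to be recovered corresponds to the target group combination. Specifically, the second data to be recovered may include the first data to be recovered, and the target group combination may include the single target group; that is, the single target group is a subset of the target group combination. In an application example, the G single groups are denoted G0, G1, ..., G(G-1), and a group combination includes at least 2 of these G groups. Suppose the first data to be recovered includes 3 data blocks of G0 while G0 has 2 check blocks; a target group combination that contains G0 can then be obtained, such as {G0, G1}, {G0, G2} or {G0, G1, G2}. Note that the groups other than G0 in such a combination may or may not contain failed blocks, so the second data to be recovered may be larger than the first data to be recovered or identical to it.
For the local data of the target group combination and its second local check data, the maximum permitted data length of the second data to be recovered is the data length of the second local check data. For example, if the local data of the target group combination includes (k1+k2) data blocks and its local check data includes (m1+m2) check blocks, then, with data blocks and check blocks of equal length, the (k1+k2+m1+m2) data blocks and check blocks of the combination tolerate the failure and recovery of at most (m1+m2) blocks (data blocks and check blocks together).
In practice, a mapping between data block or check block information and group-combination information can be stored in advance. Step D2 then looks up the information (such as the identifier) of the first data to be recovered in this mapping to find the target group combination it belongs to, and extracts from the data to be recovered the second data to be recovered that belongs to that combination. The embodiments place no limit on the specific process by which step D2 obtains the second data to be recovered; for example, the target group combination of the single target group can also be found by looking up the information of the single target group in a mapping between single-group information and group-combination information.
In an optional embodiment, the data length of the second data to be recovered is compared with the data length of the second local check data corresponding to the target group combination. When it is less than or equal, the local data and second local check data of the target group combination are considered sufficient to recover the second data to be recovered, so the second local check data and local data of the combination are read from pre-stored data.
In another optional embodiment, the step of reading the second local check data corresponding to the target group combination from pre-stored data may include steps D31 and D32.
In step D31, a second target storage address of the second local check data corresponding to the target group combination is obtained according to a pre-stored mapping between group-combination information and storage addresses of second local check data.
In step D32, the corresponding second local check data is read from pre-stored data according to the second target storage address.
Likewise, a mapping between target-group-combination information and storage addresses of local data can be stored in advance, and the local data of the target group combination read according to it. The embodiments place no limit on the specific process of reading the local data and second local check data of the target group combination.
Read schemes 1 and 2 above describe in detail how the corresponding local check data and local data are read from pre-stored data for the data to be recovered. Those skilled in the art can use either scheme or a combination of both according to their requirements, or use other read schemes; the embodiments place no limit on the specific reading process.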
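The choice between the two read schemes can be sketched as a small decision function. It measures data length as a count of equally sized blocks, as the examples above do; the function name `choose_read_plan`, its argument layout and the group identifiers are assumptions introduced for illustration.

```python
def choose_read_plan(failed_blocks, target_group, first_check_blocks, second_check_blocks):
    """Return ("single", groups), ("combination", groups), or None if local recovery is not possible.

    failed_blocks:       group id -> list of failed block ids in that group
    first_check_blocks:  group id -> number of first local check blocks of that group
    second_check_blocks: frozenset of group ids -> number of second local check blocks
    Lengths are counted in blocks, assuming data and check blocks have equal size.
    """
    # Read scheme 1: the single target group can absorb its own failures.
    first_len = len(failed_blocks.get(target_group, []))
    if first_len <= first_check_blocks[target_group]:
        return ("single", frozenset({target_group}))

    # Read scheme 2: look for a group combination that contains the target group.
    for combination, check_count in second_check_blocks.items():
        if target_group not in combination:
            continue
        second_len = sum(len(failed_blocks.get(g, [])) for g in combination)
        if second_len <= check_count:
            return ("combination", combination)

    return None  # neither scheme suffices


first = {"G0": 2, "G1": 2}
second = {frozenset({"G0", "G1"}): 4}
plan_small = choose_read_plan({"G0": [0]}, "G0", first, second)
plan_large = choose_read_plan({"G0": [0, 1, 2]}, "G0", first, second)
plan_none = choose_read_plan({"G0": [0, 1, 2, 3, 4]}, "G0", first, second)
```

With 3 failed blocks in G0 and only 2 check blocks for G0, the function moves on to the combination {G0, G1}, mirroring the example in the text. A `None` result means the caller needs some other recovery path.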
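In practice, the data to be recovered may include data blocks to be recovered and/or check blocks to be recovered. When it includes only check blocks, the local data can simply be re-encoded to obtain them. The embodiments mainly describe the recovery of data blocks.
In an optional embodiment, suppose the local check data that was read was obtained by encoding the local data independently. The recovery process may then include constructing an overall decoding matrix that includes rows of the overall generation matrix and rows of an identity matrix, wherein the rows of the identity matrix exclude the rows corresponding to the data to be recovered, the number of rows taken from the overall generation matrix equals the number of rows corresponding to the data to be recovered, and the overall decoding matrix is a square matrix; and decoding the local check data and local data according to the overall decoding matrix, to obtain the original data corresponding to the data to be recovered.
Suppose the original local data includes k1 data blocks and the original local check data includes m1 check blocks; the corresponding identity matrix I1 is then a k1×k1 square matrix. If one data block fails, the row corresponding to it is removed from I1: if the failed data block corresponds to row i (1≤i≤k1) of the identity matrix, row i is removed. The removed row i is then replaced by a row of the overall generation matrix so that the overall decoding matrix remains square.
In an application example, suppose k1=4, the overall generation matrix is the matrix P of formula (1), and the failed data block is the first data block D0. Row 1 is removed from the 4×4 identity matrix and replaced by a row of the overall generation matrix that corresponds to a surviving check block; if several such rows exist, any of them can be chosen (for example row 1). An overall decoding matrix for this example is:
Figure PCTCN2016113523-appb-000008
In an optional embodiment, the failed data block D0 can be decoded by formula (9) using the overall decoding matrix S, the local data blocks D1, D2 and D3, and the local check block C0:
Figure PCTCN2016113523-appb-000009
Since the matrix S in formula (9) is invertible, both sides of formula (9) can be multiplied by its inverse S^-1 to obtain the recovered original data:
Figure PCTCN2016113523-appb-000010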
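Constructing S and multiplying by its inverse can be sketched as follows. The sketch places the generation-matrix row where the removed identity row was, uses an assumed Cauchy matrix for P and one-byte symbols, and inverts S with Gauss-Jordan elimination over GF(2^8); the row order in the original formula image and the applicant's actual inversion routine are not known from this text.

```python
def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo x^8+x^4+x^3+x^2+1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254 (a must be nonzero)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def gf_mat_vec(matrix, vec):
    """Matrix-vector product over GF(2^8)."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, value in zip(row, vec):
            acc ^= gf_mul(coeff, value)
        out.append(acc)
    return out


def gf_mat_inv(matrix):
    """Invert a square matrix over GF(2^8) by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [v ^ gf_mul(factor, w) for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


k1, failed = 4, 0
P = [[gf_inv(i ^ (5 + j)) for j in range(k1)] for i in range(5)]  # assumed Cauchy matrix
data = [0x12, 0x34, 0x56, 0x78]
c0 = gf_mat_vec([P[0]], data)[0]                  # surviving check block C0

identity = [[int(i == j) for j in range(k1)] for i in range(k1)]
S = [P[0]] + [identity[i] for i in range(k1) if i != failed]      # square decoding matrix
survivors = [c0] + [data[i] for i in range(k1) if i != failed]    # (C0, D1, D2, D3)
recovered = gf_mat_vec(gf_mat_inv(S), survivors)                  # (D0, D1, D2, D3)
```

Because only one row of S is non-trivial, the inversion reduces in practice to a single division, but the general routine also covers the case of several failed blocks.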
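In another optional embodiment, suppose the local check data that was read was obtained by encoding the whole data. Step 202 of recovering the data to be recovered according to the local check data and local data that were read may then include steps E1 and E2.
In step E1, a local decoding matrix is constructed, wherein the local decoding matrix may include rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix may be a square matrix.
In step E2, the local check data and local data are decoded according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
In an application example, suppose the four data blocks D0, D1, D2 and D3 are divided into two groups, the first containing D0 and D1 and the second containing D2 and D3, and the whole data D0, D1, D2 and D3 was encoded with the local generation matrix Q3 of formula (6). When D0 fails, it can be recovered from the surviving data block D1 and check block C0, and the corresponding local decoding matrix T can be expressed as:
Figure PCTCN2016113523-appb-000011
In an optional embodiment, the failed data block D0 can be decoded by formula (12) using the local decoding matrix T, the local data block D1 and the local check block C0:
Figure PCTCN2016113523-appb-000012
Since the matrix T in formula (12) is invertible, both sides of formula (12) can be multiplied by its inverse T^-1 to obtain the recovered original data:
Figure PCTCN2016113523-appb-000013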
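The formula images are not available in this text. As a hedged worked example, assume the first row of Q3 is (p_{0,0}, p_{0,1}, 0, 0), restrict it to the two columns of the first group, and place it above the identity row of the surviving block D1; the row order and the notation p_{i,j} are assumptions. The local decoding then reads:

```latex
% Assumed row order: generation-matrix row first, then the identity row of D_1.
% \oplus denotes addition in GF(2^8), i.e. bytewise XOR.
T=\begin{pmatrix}p_{0,0}&p_{0,1}\\0&1\end{pmatrix},\qquad
T\begin{pmatrix}D_0\\D_1\end{pmatrix}=\begin{pmatrix}C_0\\D_1\end{pmatrix}
\;\Longrightarrow\;
\begin{pmatrix}D_0\\D_1\end{pmatrix}=T^{-1}\begin{pmatrix}C_0\\D_1\end{pmatrix},\qquad
D_0=p_{0,0}^{-1}\,\bigl(C_0\oplus p_{0,1}D_1\bigr)
```

T is invertible whenever p_{0,0} is non-zero, and the last expression shows that only C0 and D1 are needed to rebuild D0.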
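In another application example, when local data blocks D0 and D1 fail at the same time, they can be recovered from the surviving check blocks C0 and C2, and the corresponding local decoding matrix T can be expressed as:
Figure PCTCN2016113523-appb-000014
It can be understood that those skilled in the art can use steps E1 and E2 flexibly to recover the data to be recovered according to their requirements; the embodiments place no limit on the specific recovery process.
In an optional embodiment, the method may further include: when the data length of the second data to be recovered exceeds the data length of the second local check data corresponding to the target group combination, reading the corresponding overall check data and whole data from pre-stored data; and recovering the data to be recovered according to the overall check data and whole data that were read. When the local data and second local check data are not sufficient to recover the data to be recovered, this embodiment falls back on the overall check data, which preserves the reliability of data storage.
To help those skilled in the art better understand the embodiments, FIG. 3 shows a flowchart of the steps of an embodiment of a data storage and recovery method of the present application, which may include the following steps 301 to 308.
In step 301, rows of an overall generation matrix are split according to local data of the data to be stored, to obtain a corresponding local generation matrix; the split rows of the local generation matrix may include zero elements.
In step 302, the whole data of the data to be stored is encoded with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data may include local check data related to the local data.
In step 303, the local check data related to the local data is stored together with the corresponding local data.
In step 304, for data to be recovered, the corresponding local check data and local data are read from pre-stored data.
In step 305, it is determined whether the data length of the data to be recovered exceeds the data length of the corresponding local check data; if not, step 306 is performed, and if so, step 308 is performed.
In step 306, a local decoding matrix is constructed; the local decoding matrix may include at least one of an identity matrix without the row corresponding to the data to be recovered and the local generation matrix, and the local decoding matrix is a square matrix.
In step 307, the local check data and local data are decoded according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
In step 308, the corresponding overall check data and whole data are read from pre-stored data, and the data to be recovered is recovered according to the overall check data and whole data that were read.
It should be noted that step 304 can use either or both of read schemes 1 and 2 to read the local check data and local data, steps 306 and 307 can recover the data using steps E1 and E2, and the recovery in step 308 can follow the recovery process described above for local check data obtained by encoding the local data independently; the details are not repeated here.
It should also be noted that, for simplicity, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments are not limited by the described order of actions, because some steps can be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present application.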
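Storage apparatus embodiment
Referring to FIG. 4, a schematic diagram of the hardware structure of a data storage apparatus according to an embodiment of the present application is shown. The data storage apparatus may include a processor 41 and a machine-readable storage medium 42, which are typically connected to each other by an internal bus 43. In other possible implementations, the data storage apparatus may also include an external interface 44 so that it can communicate with other devices or components.
In different examples, the machine-readable storage medium 42 may be RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard disk drive), a solid-state drive, any type of storage disc (such as an optical disc or DVD), a similar storage medium, or a combination thereof.
Further, the machine-readable storage medium 42 may store machine-readable instructions corresponding to data storage control logic 50 executed by the processor 41. When the processor 41 reads and executes the machine-readable instructions stored on the machine-readable storage medium 42, it can perform the data storage method described above. Divided by function, as shown in FIG. 5, the data storage control logic 50 may include a splitting module 501, an encoding module 502 and a local storage module 503.
The splitting module 501 is configured to split rows of an overall generation matrix according to local data of the data to be stored, to obtain a corresponding local generation matrix; the split rows of the local generation matrix may include zero elements.
The encoding module 502 is configured to encode the whole data of the data to be stored with the local generation matrix, to obtain corresponding overall check data; the overall check data may include local check data related to the local data.
The local storage module 503 is configured to store the local check data and its corresponding local data.
In an optional embodiment, the data storage control logic 50 may further include an obtaining module configured to obtain the local data of the data to be stored.
The obtaining module may include:
a dividing submodule configured to divide the data to be stored into data blocks; and
a grouping submodule configured to group the resulting data blocks and obtain the corresponding local data according to the groups.
In another optional embodiment, the local data of the data to be stored may include local data of the groups of the data to be stored, and the local check data may include:
first local check data corresponding to the local data of a single group; and/or
second local check data corresponding to the local data of a group combination.
In yet another optional embodiment, the data storage control logic 50 may further include:
a first relation storage module configured to store a mapping between single-group information and storage addresses of first local check data; and/or
a second relation storage module configured to store a mapping between group-combination information and storage addresses of second local check data.
In yet another optional embodiment, the data storage control logic 50 may further include:
an overall check generation module configured to generate overall check data for the whole data of the data to be stored; and
an overall storage module configured to store the overall check data.
Taking a software implementation as an example, the following further describes how the data storage apparatus executes the control logic 50. In this example, the control logic can be understood as machine-readable instructions stored in the machine-readable storage medium 42. When the processor 41 of the data storage apparatus executes the control logic, it performs the following operations by invoking the machine-readable instructions stored on the machine-readable storage medium 42:
splitting rows of an overall generation matrix according to local data of the data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix include zero elements;
encoding the whole data of the data to be stored with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data includes local check data related to the local data; and
storing the local check data and its corresponding local data.
In this embodiment, when obtaining the local data of the data to be stored, the machine-readable instructions stored on the machine-readable storage medium 42 may cause the processor 41 to:
divide the data to be stored into data blocks; and
group the resulting data blocks and obtain the corresponding local data according to the groups.
In this embodiment, the local data of the data to be stored may include local data of the groups of the data to be stored.
In this embodiment, the local check data may include:
first local check data corresponding to the local data of a single group; and/or
second local check data corresponding to the local data of a group combination.
In this embodiment, the machine-readable instructions stored on the machine-readable storage medium 42 may further cause the processor 41 to:
store a mapping between single-group information and storage addresses of first local check data; and/or
store a mapping between group-combination information and storage addresses of second local check data.
Since the data storage apparatus embodiment is substantially similar to the storage method embodiment, it is described only briefly; for the relevant details, refer to the description of the storage method embodiment.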
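Recovery apparatus embodiment
Referring to FIG. 6, a schematic diagram of the hardware structure of a data recovery apparatus according to an embodiment of the present application is shown. The data recovery apparatus may include a processor 61 and a machine-readable storage medium 62, which are typically connected to each other by an internal bus 63. In other possible implementations, the data recovery apparatus may also include an external interface 64 so that it can communicate with other devices or components.
In different examples, the machine-readable storage medium 62 may be RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard disk drive), a solid-state drive, any type of storage disc (such as an optical disc or DVD), a similar storage medium, or a combination thereof.
Further, the machine-readable storage medium 62 may store machine-readable instructions corresponding to data recovery control logic 70 executed by the processor 61. When the processor 61 reads and executes the machine-readable instructions stored on the machine-readable storage medium 62, it can perform the data recovery method described above. Divided by function, as shown in FIG. 7, the data recovery control logic 70 may include a reading module 701 and a recovery module 702.
The reading module 701 is configured to read, for data to be recovered, the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix may include zero elements.
The recovery module 702 is configured to recover the data to be recovered according to the local check data and local data that were read.
In an optional embodiment, the recovery module 702 may include:
a construction submodule configured to construct a local decoding matrix, wherein the local decoding matrix may include rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix; and
a decoding submodule configured to decode the local check data and local data according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
In an optional embodiment, the reading module 701 may include:
a first obtaining submodule configured to obtain, from the data to be recovered, first data to be recovered that belongs to a single target group; and
a first reading submodule configured to, when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, read the first local check data and local data corresponding to the single target group from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data may be obtained from the local data of a single group.
In an optional embodiment, the first reading submodule may include:
a first address obtaining unit configured to obtain a first target storage address of the first local check data corresponding to the single target group, according to a pre-stored mapping between single-group information and storage addresses of first local check data; and
a first reading unit configured to read the corresponding first local check data from pre-stored data according to the first target storage address.
In an optional embodiment, the reading module 701 may include:
a first obtaining submodule configured to obtain, from the data to be recovered, first data to be recovered that belongs to a single target group;
a second obtaining submodule configured to, when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, obtain, from the data to be recovered, second data to be recovered that belongs to a target group combination, wherein the target group combination includes the single target group; and
a second reading submodule configured to, when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, read the second local check data and local data corresponding to the target group combination from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data is obtained from the local data of a group combination.
In an optional embodiment, the second reading submodule may include:
a second address obtaining unit configured to obtain a second target storage address of the second local check data corresponding to the target group combination, according to a pre-stored mapping between group-combination information and storage addresses of second local check data; and
a second reading unit configured to read the corresponding second local check data from pre-stored data according to the second target storage address.
In an optional embodiment, the data recovery control logic 70 may further include:
an overall reading module configured to, when the data length of the second data to be recovered exceeds the data length of the second local check data corresponding to the target group combination, read the corresponding overall check data and whole data from pre-stored data; and
an overall recovery module configured to recover the data to be recovered according to the overall check data and whole data that were read.
Taking a software implementation as an example, the following further describes how the data recovery apparatus executes the control logic 70. In this example, the control logic can be understood as machine-readable instructions stored in the machine-readable storage medium 62. When the processor 61 of the data recovery apparatus executes the control logic, it performs the following operations by invoking the machine-readable instructions stored on the machine-readable storage medium 62:
for data to be recovered, reading the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix include zero elements; and
recovering the data to be recovered according to the local check data and local data that were read.
In this embodiment, when recovering the data to be recovered according to the local check data and local data that were read, the machine-readable instructions cause the processor to:
construct a local decoding matrix, wherein the local decoding matrix includes rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix; and
decode the local check data and local data according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
In this embodiment, when reading the corresponding local check data from pre-stored data for the data to be recovered, the machine-readable instructions cause the processor to:
obtain, from the data to be recovered, first data to be recovered that belongs to a single target group; and
when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, read the first local check data and local data corresponding to the single target group from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data is obtained from the local data of a single group.
In this embodiment, when reading the first local check data corresponding to the single target group from pre-stored data, the machine-readable instructions cause the processor to:
obtain a first target storage address of the first local check data corresponding to the single target group, according to a pre-stored mapping between single-group information and storage addresses of first local check data; and
read the corresponding first local check data from pre-stored data according to the first target storage address.
In this embodiment, when reading the corresponding local check data from pre-stored data for the data to be recovered, the machine-readable instructions cause the processor to:
obtain, from the data to be recovered, first data to be recovered that belongs to a single target group;
when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, obtain, from the data to be recovered, second data to be recovered that belongs to a target group combination, wherein the target group combination includes the single target group; and
when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, read the second local check data and local data corresponding to the target group combination from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data is obtained from the local data of a group combination.
In this embodiment, when reading the second local check data corresponding to the target group combination from pre-stored data, the machine-readable instructions cause the processor to:
obtain a second target storage address of the second local check data corresponding to the target group combination, according to a pre-stored mapping between group-combination information and storage addresses of second local check data; and
read the corresponding second local check data from pre-stored data according to the second target storage address.
Since the apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for the relevant details, refer to the description of the method embodiments.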
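The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced.
Those skilled in the art will appreciate that embodiments of the present application can be provided as a method, an apparatus, or a computer program product. The embodiments may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) that contain computer-usable program code.
In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory. The memory may include non-persistent memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems) and computer program products according to the embodiments. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing terminal device to produce a machine, such that the instructions executed by that processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is performed on it to produce computer-implemented processing, and the instructions executed on it provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device that includes it.
The data storage method and apparatus and the data recovery method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the above embodiments is only intended to help in understanding the method of the application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.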

Claims (14)

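  1. A data storage method, comprising:
    splitting rows of an overall generation matrix according to local data of data to be stored, to obtain a corresponding local generation matrix, wherein the split rows of the local generation matrix include zero elements;
    encoding the whole data of the data to be stored with the local generation matrix, to obtain corresponding overall check data, wherein the overall check data includes local check data related to the local data;
    storing the local check data and its corresponding local data.
  2. The method according to claim 1, wherein obtaining the local data of the data to be stored comprises:
    dividing the data to be stored into data blocks;
    grouping the resulting data blocks and obtaining the corresponding local data according to the groups.
  3. The method according to claim 1 or 2, wherein the local data of the data to be stored comprises:
    local data of the groups of the data to be stored;
    and the local check data comprises:
    first local check data corresponding to the local data of a single group; and/or
    second local check data corresponding to the local data of a group combination.
  4. The method according to claim 3, wherein the method further comprises:
    storing a mapping between single-group information and storage addresses of first local check data; and/or
    storing a mapping between group-combination information and storage addresses of second local check data.
  5. A data recovery method, comprising:
    for data to be recovered, reading the corresponding local check data and local data from pre-stored data, wherein the local check data is obtained by encoding the whole data of the data to be stored according to a local generation matrix, the local generation matrix is obtained by splitting rows of an overall generation matrix, and the split rows of the local generation matrix include zero elements;
    recovering the data to be recovered according to the local check data and local data that were read.
  6. The method according to claim 5, wherein the step of recovering the data to be recovered according to the local check data and local data that were read comprises:
    constructing a local decoding matrix, wherein the local decoding matrix includes:
    rows of an identity matrix and rows of the local generation matrix, the rows of the identity matrix exclude the row corresponding to the data to be recovered, and the local decoding matrix is a square matrix;
    decoding the local check data and local data according to the local decoding matrix, to obtain the original data corresponding to the data to be recovered.
  7. The method according to claim 5 or 6, wherein the step of reading the corresponding local check data from pre-stored data for the data to be recovered comprises:
    obtaining, from the data to be recovered, first data to be recovered that belongs to a single target group;
    when the data length of the first data to be recovered does not exceed the data length of the first local check data corresponding to the single target group, reading the first local check data and local data corresponding to the single target group from pre-stored data, as the local check data and local data corresponding to the first data to be recovered, wherein the first local check data is obtained from the local data of a single group.
  8. The method according to claim 7, wherein the step of reading the first local check data corresponding to the single target group from pre-stored data comprises:
    obtaining a first target storage address of the first local check data corresponding to the single target group, according to a pre-stored mapping between single-group information and storage addresses of first local check data;
    reading the corresponding first local check data from pre-stored data according to the first target storage address.
  9. The method according to claim 5 or 6, wherein the step of reading the corresponding local check data from pre-stored data for the data to be recovered comprises:
    obtaining, from the data to be recovered, first data to be recovered that belongs to a single target group;
    when the data length of the first data to be recovered exceeds the data length of the first local check data corresponding to the single target group, obtaining, from the data to be recovered, second data to be recovered that belongs to a target group combination, wherein the target group combination includes the single target group;
    when the data length of the second data to be recovered does not exceed the data length of the second local check data corresponding to the target group combination, reading the second local check data and local data corresponding to the target group combination from pre-stored data, as the local check data and local data corresponding to the second data to be recovered, wherein the second local check data is obtained from the local data of a group combination.
  10. The method according to claim 9, wherein the step of reading the second local check data corresponding to the target group combination from pre-stored data comprises:
    obtaining a second target storage address of the second local check data corresponding to the target group combination, according to a pre-stored mapping between group-combination information and storage addresses of second local check data;
    reading the corresponding second local check data from pre-stored data according to the second target storage address.
  11. A data storage apparatus, comprising a processor, wherein the processor reads machine-readable instructions that are stored on a storage medium and correspond to data storage control logic, and performs the data storage method according to claims 1-4.
  12. A data recovery apparatus, comprising a processor, wherein the processor reads machine-readable instructions that are stored on a storage medium and correspond to data recovery control logic, and performs the data recovery method according to claims 5-10.
  13. A machine-readable storage medium storing machine-readable instructions executed by one or more processors, wherein the machine-readable instructions cause the processors to perform the data storage method according to claims 1-4.
  14. A machine-readable storage medium storing machine-readable instructions executed by one or more processors, wherein the machine-readable instructions cause the processors to perform the data recovery method according to claims 5-10.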
PCT/CN2016/113523 2016-06-29 2016-12-30 一种数据存储方法和装置、一种数据恢复方法和装置 WO2018000788A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US16/314,281 US10754727B2 (en) 2016-06-29 2016-12-30 Method and apparatus for storing data and method and apparatus for recovering data
CA3053855A CA3053855C (en) 2016-06-29 2016-12-30 Data-storage method and apparatus, and data-recovery method and apparatus
EP16907175.0A EP3480697A4 (en) 2016-06-29 2016-12-30 METHOD AND APPARATUS FOR STORING DATA AND METHOD AND APPARATUS FOR DATA RE-ESTABLISHMENT

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610500318.5 2016-06-29
CN201610500318.5A CN106201764B (zh) 2016-06-29 2016-06-29 一种数据存储方法和装置、一种数据恢复方法和装置

Publications (1)

Publication Number Publication Date
WO2018000788A1 true WO2018000788A1 (zh) 2018-01-04

Family

ID=57463474

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/113523 WO2018000788A1 (zh) 2016-06-29 2016-12-30 一种数据存储方法和装置、一种数据恢复方法和装置

Country Status (5)

Country Link
US (1) US10754727B2 (zh)
EP (1) EP3480697A4 (zh)
CN (1) CN106201764B (zh)
CA (2) CA3177662C (zh)
WO (1) WO2018000788A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201764B (zh) * 2016-06-29 2019-03-12 北京三快在线科技有限公司 一种数据存储方法和装置、一种数据恢复方法和装置
CN108880620B (zh) * 2018-08-20 2021-06-11 广东石油化工学院 电力线通信信号重构方法
CN113051428B (zh) * 2019-12-28 2024-04-05 浙江宇视科技有限公司 一种摄像机前端存储备份的方法及装置
CN112860475B (zh) * 2021-02-04 2023-02-28 山东云海国创云计算装备产业创新中心有限公司 基于rs纠删码的校验块恢复方法、装置、系统及介质
CN112948335A (zh) * 2021-02-23 2021-06-11 北京星震同源数字系统股份有限公司 一种数据处理方法和系统
CN114153393A (zh) * 2021-11-29 2022-03-08 山东云海国创云计算装备产业创新中心有限公司 一种数据编码方法、系统、设备以及介质
CN117636998A (zh) * 2022-08-09 2024-03-01 长鑫存储技术有限公司 数据处理方式、数据处理结构及存储器

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833040A (zh) * 2012-08-03 2012-12-19 中兴通讯股份有限公司 解码处理方法、装置及编解码系统
CN104461781A (zh) * 2014-12-01 2015-03-25 华中科技大学 一种基于纠删码的数据块重建方法
WO2015195104A1 (en) * 2014-06-17 2015-12-23 Hewlett-Packard Development Company, L.P. Distributed storage data recovery
CN105335252A (zh) * 2015-10-22 2016-02-17 浪潮电子信息产业股份有限公司 一种数据保护方法、装置以及系统
CN105335150A (zh) * 2014-08-13 2016-02-17 苏宁云商集团股份有限公司 纠删码数据的快速编解码方法和系统
CN105610879A (zh) * 2014-10-31 2016-05-25 深圳市华为技术软件有限公司 数据处理方法和装置
CN106201764A (zh) * 2016-06-29 2016-12-07 北京三快在线科技有限公司 一种数据存储方法和装置、一种数据恢复方法和装置

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102254102B1 (ko) * 2015-01-23 2021-05-20 삼성전자주식회사 메모리 시스템 및 메모리 시스템의 동작 방법

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3480697A4 *

Also Published As

Publication number Publication date
CN106201764A (zh) 2016-12-07
CA3177662A1 (en) 2018-01-04
CN106201764B (zh) 2019-03-12
CA3053855C (en) 2023-01-03
US20190205212A1 (en) 2019-07-04
EP3480697A1 (en) 2019-05-08
US10754727B2 (en) 2020-08-25
EP3480697A4 (en) 2019-06-12
CA3177662C (en) 2023-06-27
CA3053855A1 (en) 2018-01-04

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 16907175; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
ENP Entry into the national phase
    Ref document number: 2016907175; Country of ref document: EP; Effective date: 20190129
ENP Entry into the national phase
    Ref document number: 3053855; Country of ref document: CA