WO2022105442A1 - Erasure code-based data reconstruction method, apparatus, device, and storage medium - Google Patents

Erasure code-based data reconstruction method, apparatus, device, and storage medium

Info

Publication number
WO2022105442A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
erasure
offset
incremental
osd
Prior art date
Application number
PCT/CN2021/121225
Other languages
English (en)
French (fr)
Inventor
王庆海
孟祥瑞
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Priority to US18/029,700 (US20240045763A1)
Publication of WO2022105442A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1088 Reconstruction on already foreseen single or plurality of spare disks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H03M13/373 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35 with erasure correction and erasure determination, e.g. for packet loss recovery or setting of erasures for the decoding of Reed-Solomon codes

Definitions

  • The present application relates to the field of distributed storage, and in particular to an erasure code-based data reconstruction method, apparatus, device, and storage medium.
  • Object-based storage is a new network storage architecture.
  • A device built on object storage technology is called an Object-based Storage Device (OSD).
  • The main functions of an OSD are to store, replicate, balance, and recover data.
  • A disk corresponds to an OSD, and the OSD manages that disk's storage.
  • When an OSD fails, its data needs to be recovered, and the data of the failed OSD is recovered onto other OSDs. This process is called data reconstruction.
  • Erasure coding is a data protection method that splits data into fragments, expands and encodes them, and stores the resulting redundant data blocks in different locations, such as disks, storage nodes, or other geographic locations.
  • The data protection provided by an erasure code can be expressed as K+M, where K is the number of data disks and M is the number of check disks; at most M disks are allowed to fail.
  • When the erasure-coded storage mode is applied to a distributed file system that stores data objects, the data held by each OSD member of a data object is unique, so current approaches reconstruct the entire data object when recovering its data. For erasure-coded storage in a distributed file system, data reconstruction therefore reads and writes a large amount of data, and its overall efficiency is hard to ensure.
  • The purpose of the present application is to provide an erasure code-based data reconstruction method, apparatus, device, and storage medium that better ensure the overall efficiency of data reconstruction.
  • The present application provides an erasure code-based data reconstruction method, including:
  • acquiring data offset information of incremental data in a data object;
  • acquiring corresponding data segments from multiple source OSDs according to the data offset information;
  • wherein the source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and the number of source OSDs is the same as the number of data disks of the erasure code;
  • integrating the data segments into an erasure incremental segment, and writing the erasure incremental segment to the OSD to be reconstructed, that is, an OSD among them that does not store the incremental data.
  • The data offset information includes an offset start address and an offset data amount;
  • acquiring the corresponding data segments from multiple source OSDs includes:
  • dividing the offset start address equally by the number of data disks to obtain the erasure offset start address, and dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • acquiring the corresponding data segments from the multiple source OSDs according to the erasure offset start address and the erasure offset data amount.
  • Before the offset data amount is divided equally by the number of data disks, the method further includes:
  • if the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code, performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • if the offset data amount is not an integer multiple of the erasure stripe data amount of the erasure code, increasing the offset data amount to an integer multiple of the erasure stripe data amount;
  • and then performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount.
  • Before the erasure incremental segment is written to the OSD to be reconstructed, the method further includes storing the erasure incremental segments corresponding to each piece of data offset information as one continuous data segment and recording the location information of each erasure incremental segment within it;
  • writing the erasure incremental segment to the OSD to be reconstructed then includes:
  • writing the erasure incremental segment, read from the continuous data segment based on the location information, to the OSD to be reconstructed.
  • Writing the erasure incremental segment to the OSD to be reconstructed includes writing it to the address range, in the OSD to be reconstructed, that corresponds to the offset start address and the offset data amount.
  • Acquiring the data offset information of the incremental data in the data object includes acquiring it from the write operation log of the placement group where the data object is located.
  • The present application also provides an erasure code-based data reconstruction apparatus, including:
  • an offset information acquisition module, configured to acquire data offset information of incremental data in a data object;
  • a data segment acquisition module, configured to acquire corresponding data segments from multiple source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and the number of source OSDs is the same as the number of data disks of the erasure code;
  • an incremental segment storage module, configured to integrate the data segments into an erasure incremental segment and write the erasure incremental segment to the OSD to be reconstructed, that is, an OSD among them that does not store the incremental data.
  • The data offset information includes an offset start address and an offset data amount;
  • the data segment acquisition module includes:
  • an address range acquisition module, configured to divide the offset start address equally by the number of data disks to obtain the erasure offset start address, and to divide the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • an interval data acquisition module, configured to acquire the corresponding data segments from the multiple source OSDs according to the erasure offset start address and the erasure offset data amount.
  • The present application also provides an erasure code-based data reconstruction device, including:
  • a processor, configured to implement the steps of the above erasure code-based data reconstruction method when executing a computer program.
  • The present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above erasure code-based data reconstruction method are implemented.
  • The erasure code-based data reconstruction method provided by the present application first acquires the data offset information of the incremental data in the data object, and then acquires the corresponding data segments from multiple source OSDs according to that information.
  • The source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and their number is the same as the number of data disks of the erasure code. After the data segments are acquired, they are integrated into an erasure incremental segment, which is written to the OSD to be reconstructed, that is, an OSD that does not store the incremental data.
  • In this way, the method uses the source OSDs that store the data object's incremental data to reconstruct, on the OSD to be reconstructed, only the erasure incremental segments that it is missing.
  • Compared with reconstructing the entire data object, the method reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.
  • The present application also provides an erasure code-based data reconstruction apparatus, device, and storage medium, whose beneficial effects are the same as those described above.
  • FIG. 1 is a flowchart of an erasure code-based data reconstruction method disclosed in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of incremental data generation for a data object disclosed in an embodiment of the present application.
  • FIG. 3.a is a schematic diagram of a data segment disclosed in a scenario embodiment of the present application.
  • FIG. 3.b is a schematic diagram of a data segment disclosed in a scenario embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an erasure code-based data reconstruction apparatus disclosed in an embodiment of the present application.
  • When the erasure-coded storage mode is applied to a distributed file system that stores data objects, the data held by each OSD member of a data object is unique, so current approaches reconstruct the entire data object when recovering its data. For erasure-coded storage in a distributed file system, data reconstruction therefore reads and writes a large amount of data, and its overall efficiency is hard to ensure.
  • The core of the present application is to provide an erasure code-based data reconstruction method that better ensures the overall efficiency of data reconstruction.
  • An embodiment of the present application discloses an erasure code-based data reconstruction method, including:
  • Step S10: Acquire data offset information of incremental data in a data object.
  • The data object in this step is stored jointly by multiple OSDs (Object-based Storage Devices) under the erasure code storage mechanism.
  • The data object is a complete unit of data.
  • Incremental data is the data within the data object that has changed. It is a part of the data object, and the data offset information indicates the offset position of the incremental data within the data object.
  • Step S11: Acquire corresponding data segments from multiple source OSDs according to the data offset information.
  • A source OSD is a target OSD that stores the incremental data, among the OSDs that jointly store the data object based on the erasure code. The number of source OSDs is the same as the number of data disks of the erasure code.
  • Because the data object is stored under the erasure code storage mechanism, it is stored and maintained by multiple OSDs whose number meets the disk count the erasure code requires; that is, the data on these OSDs can be integrated into the complete data object. When part of the data object changes and incremental data is generated, the incremental data is likewise stored across a corresponding number of OSDs.
  • The data protection provided by an erasure code can be expressed as K+M, where K is the number of data disks and M is the number of check disks; at most M disks are allowed to fail. The data segments related to the incremental data on the K source OSDs that store it are therefore sufficient to reconstruct the missing incremental data on the OSDs to be reconstructed.
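  • The choice of source OSDs can be illustrated with a minimal Python sketch. It assumes the six-member placement group of the scenario embodiment with K = 4 (inferred from the four OSDs that are read there); the function name `select_source_osds` and the `holders` set are hypothetical and do not come from the application.

```python
def select_source_osds(members, holders, k):
    """Return k OSD ids (in member order) that hold the incremental data."""
    sources = [osd for osd in members if osd in holders]
    if len(sources) < k:
        # With fewer than K intact shards the erasure code cannot decode.
        raise ValueError("fewer than K OSDs hold the incremental data")
    return sources[:k]


members = [2, 13, 25, 39, 46, 61]   # placement-group members in the scenario
holders = {2, 13, 25, 46}           # assumed: OSDs that saw every modification
sources = select_source_osds(members, holders, k=4)
to_rebuild = [osd for osd in members if osd not in holders]
print(sources, to_rebuild)
```

  • Here `sources` plays the role of the K source OSDs and `to_rebuild` lists the OSDs to be reconstructed.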
  • Step S12: Integrate the data segments into an erasure incremental segment, and write the erasure incremental segment to the OSD to be reconstructed, that is, an OSD that does not store the incremental data.
  • After the corresponding data segments have been acquired from the multiple source OSDs according to the data offset information, this step integrates them into erasure incremental segments and then writes those segments to the OSD to be reconstructed.
  • The erasure code-based data reconstruction method provided by the present application first acquires the data offset information of the incremental data in the data object, and then acquires the corresponding data segments from multiple source OSDs according to that information.
  • The source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and their number is the same as the number of data disks of the erasure code. After the data segments are acquired, they are integrated into an erasure incremental segment, which is written to the OSD to be reconstructed.
  • In this way, the method uses the source OSDs that store the data object's incremental data to reconstruct, on the OSD to be reconstructed, only the erasure incremental segments that it is missing.
  • Compared with reconstructing the entire data object, the method reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.
  • The data offset information includes an offset start address and an offset data amount;
  • acquiring the corresponding data segments from multiple source OSDs includes:
  • dividing the offset start address equally by the number of data disks to obtain the erasure offset start address, and dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • acquiring the corresponding data segments from the multiple source OSDs according to the erasure offset start address and the erasure offset data amount.
  • In this embodiment, the offset start address is the data address of the starting position of the incremental data, and the offset data amount is the address length occupied by the incremental data as a whole. Because the erasure code distributes a data object evenly across the K data disks, the incremental data is also distributed evenly across the data disks.
  • The offset start address is therefore divided equally by the number of data disks to obtain the erasure offset start address, and the offset data amount is divided equally by the number of data disks to obtain the erasure offset data amount.
  • The corresponding data segments are then acquired from the multiple source OSDs according to the erasure offset start address and the erasure offset data amount. This embodiment further ensures the accuracy of acquiring the corresponding data segments from the multiple source OSDs.
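  • As a worked illustration of the equal division, the following Python sketch maps an extent of the data object to the extent each source OSD holds. The function name `per_osd_range` is hypothetical, K = 4 is assumed from the scenario embodiment, and the sketch assumes the offset and length are already divisible by K.

```python
KIB = 1024
MIB = 1024 * KIB


def per_osd_range(offset, length, k):
    """Map an extent of the data object to the extent held on each data OSD."""
    if offset % k or length % k:
        raise ValueError("offset and length must be divisible by K")
    # Erasure offset start address and erasure offset data amount.
    return offset // k, length // k


# Two modifications taken from the scenario embodiment, with K = 4.
first = per_osd_range(512 * KIB, 32 * KIB, k=4)
second = per_osd_range(1 * MIB, 64 * KIB, k=4)
print(first, second)
```

  • The results correspond to the per-OSD extents [128K, 8K] and [256K, 16K] that appear in the scenario embodiment.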
  • Before the offset data amount is divided equally by the number of data disks, the method further includes:
  • if the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code, performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • if the offset data amount is not an integer multiple of the erasure stripe data amount of the erasure code, increasing the offset data amount to an integer multiple of the erasure stripe data amount;
  • and then performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount.
  • In this embodiment, the size of the incremental data should be an integer multiple of the erasure stripe size. Before the offset data amount is divided equally by the number of data disks, it is therefore determined whether the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code. If it is, the offset data amount is divided equally by the number of data disks to obtain the erasure offset data amount.
  • If it is not, the offset data amount is first increased to an integer multiple of the erasure stripe data amount, and the division by the number of data disks is then performed on the modified offset data amount.
  • This embodiment further ensures the accuracy of the erasure offset data amount obtained by dividing the offset data amount equally by the number of data disks.
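  • The stripe check can be sketched in Python as a round-up to the next multiple. The function name `align_length_up` is hypothetical; the 32K stripe and the 40K modification are taken from the scenario embodiment, and K = 4 is assumed. The sketch covers only the offset data amount and assumes the offset start address is already stripe-aligned.

```python
KIB = 1024


def align_length_up(length, stripe):
    """Round length up to the next integer multiple of stripe."""
    if length <= 0 or stripe <= 0:
        raise ValueError("length and stripe must be positive")
    # Ceiling division written with integer arithmetic only.
    return -(-length // stripe) * stripe


stripe = 32 * KIB
aligned = align_length_up(40 * KIB, stripe)          # 40K is not a multiple of 32K
unchanged = align_length_up(32 * KIB, stripe)        # already aligned
per_osd = aligned // 4                               # equal division, K = 4
print(aligned // KIB, unchanged // KIB, per_osd // KIB)
```

  • A 40K modification is thus treated as 64K, which yields the 16K per-OSD segment seen in the scenario embodiment.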
  • Before the erasure incremental segment is written to the OSD to be reconstructed, the method further includes storing the erasure incremental segments corresponding to each piece of data offset information as one continuous data segment and recording the location information of each erasure incremental segment within it;
  • writing the erasure incremental segment to the OSD to be reconstructed then includes:
  • writing the erasure incremental segment, read from the continuous data segment based on the location information, to the OSD to be reconstructed.
  • In this embodiment, before the erasure incremental segments are written to the OSD to be reconstructed, the erasure incremental segments corresponding to each piece of data offset information are first stored as one continuous data segment, and the location information of each erasure incremental segment within that data segment is recorded.
  • A continuous data segment is one whose data occupies consecutive data addresses, and the location information records which data in the data segment corresponds to each erasure incremental segment.
  • This embodiment further ensures the accuracy of writing the erasure incremental segments to the OSD to be reconstructed.
  • Writing the erasure incremental segment to the OSD to be reconstructed includes:
  • writing the erasure incremental segment to the address range, in the OSD to be reconstructed, that corresponds to the offset start address and the offset data amount, which further ensures the accuracy of writing the erasure incremental segment to the OSD to be reconstructed.
  • Acquiring the data offset information of the incremental data in the data object includes:
  • The data offset information of the incremental data is stored in the write operation log of the placement group where the data object is located.
  • The data offset information of the incremental data in the data object is therefore acquired from the write operation log of the placement group where the data object is located.
  • The present application further provides a scenario embodiment for a specific application scenario.
  • OSD.2 acts as the main OSD. It sends read requests to OSD.2, OSD.13, OSD.25, and OSD.46, reads 1M of data from each, and uses the resulting 4M of data to decode the data required by OSD.39 and OSD.61 (1M each).
  • It then sends the decoded data to OSD.39 and OSD.61 respectively.
  • The schematic diagram of incremental data generation for the data object, shown in FIG. 2, is used to illustrate how incremental data reconstruction based on erasure codes is implemented in a fault scenario.
  • The version of the data object object is version1, and the OSDs [2, 13, 25, 39, 46, 61] that carry the data object are all normal. Write operations to a PG are logged, and the log is called pg_log. The object is stored on pg 1.2d, so OSD[2,13,25,39,46,61] are the 6 members of pg 1.2d. Every modification to the object is recorded in pg_log, that is, in the pg_log of each of OSD [2, 13, 25, 39, 46, 61].
  • OSD.39 then fails, after which 32K of data is modified starting at offset 512K of the data object. The data object version becomes version2, and the pg_log of OSD[2,13,25,NONE,46,61] records this modification as {object, version2, [512K, 32K]}. Next, 64K of data is modified at offset 1M of the data object, and the version becomes version3. The pg_log of OSD[2,13,25,NONE,46,61] records this modification as {object, version3, [1M, 64K]}. Then OSD.61 fails.
  • OSD.39 is missing three versions of the data object object (version2, version3, and version4), and OSD.61 is missing version4. Restoring these versions requires reading data from the 4 OSDs that have not failed and decoding the version data required by OSD.39 and OSD.61.
  • The data to be recovered for OSD.39 and OSD.61 is recorded as follows:
  • the missing data segments of OSD.39 are: {object, [512K, 32K], [1M, 64K], [3M, 40K]};
  • the missing data segments of OSD.61 are: {object, [3M, 40K]}.
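  • How such missing-segment lists could be derived from the logged modifications is sketched below in Python. This is an illustration under simplifying assumptions, not the pg_log format: each hypothetical log entry carries the set of OSDs that were up when it was written, and the version4 extent [3M, 40K] is taken from the missing-segment lists above.

```python
from collections import defaultdict

KIB = 1024
MIB = 1024 * KIB


def missing_extents(members, log):
    """Return {osd: [(offset, length), ...]} for writes each OSD did not see.

    log entries are (version, offset, length, up_osds) tuples.
    """
    missing = defaultdict(list)
    for _version, offset, length, up_osds in log:
        for osd in members:
            if osd not in up_osds:
                missing[osd].append((offset, length))
    return dict(missing)


members = [2, 13, 25, 39, 46, 61]
log = [
    ("version2", 512 * KIB, 32 * KIB, {2, 13, 25, 46, 61}),  # OSD.39 is down
    ("version3", 1 * MIB, 64 * KIB, {2, 13, 25, 46, 61}),
    ("version4", 3 * MIB, 40 * KIB, {2, 13, 25, 46}),        # OSD.61 is down too
]
missing = missing_extents(members, log)
print(missing)
```

  • The result lists three extents for OSD.39 and one for OSD.61, matching the lists above.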
  • The offset and length of the data read and written must be integer multiples of the erasure stripe. Taking an erasure stripe of 32K as an example, after the segments above are aligned to the erasure stripe, the obtained union is:
  • The data object and the data-object segments that each normal OSD needs to read, as constructed in (2) of step 1, are sent to each OSD.
  • OSD.2 sends a message to OSD.2, OSD.13, OSD.25, and OSD.46: each needs to read 3 segments of data, [128K, 8K], [256K, 16K], and [768K, 16K], on the data object object.
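  • The path from the missing segments to these three per-OSD reads can be reproduced with a short Python sketch. The function name `read_plan` is hypothetical and K = 4 is assumed. The sketch takes the union by de-duplicating identical aligned extents; it does not merge overlapping extents and it rejects offsets that are not stripe-aligned, which a real implementation would need to handle.

```python
KIB = 1024
MIB = 1024 * KIB


def read_plan(missing, stripe, k):
    """Return the sorted per-OSD (offset, length) reads covering all missing data."""
    aligned = set()
    for extents in missing.values():
        for offset, length in extents:
            if offset % stripe:
                raise ValueError("sketch assumes stripe-aligned offsets")
            # Round the length up to a whole number of stripes.
            aligned.add((offset, -(-length // stripe) * stripe))
    # Each source OSD holds 1/K of every stripe-aligned extent.
    return [(offset // k, length // k) for offset, length in sorted(aligned)]


missing = {
    39: [(512 * KIB, 32 * KIB), (1 * MIB, 64 * KIB), (3 * MIB, 40 * KIB)],
    61: [(3 * MIB, 40 * KIB)],
}
plan = read_plan(missing, stripe=32 * KIB, k=4)
bytes_per_osd = sum(length for _, length in plan)
print([(o // KIB, n // KIB) for o, n in plan], bytes_per_osd // KIB)
```

  • Under these assumptions each source OSD reads 40K in total, compared with the 1M per OSD read when the whole object is reconstructed.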
  • The K normal OSDs read the data from their local hard disks, fill it into a message, and send it to the main OSD.
  • After receiving the K pieces of data, the main OSD decodes the data to be recovered from them, encapsulates that data into messages, and sends them to the OSDs that lack the data.
  • In the existing solution, the main OSD receives 1M of data from each of the 4 OSDs and decodes the 4 pieces of data to obtain the 1M of data required by each of OSD.39 and OSD.61.
  • In that case, the data sent to OSD.39 and OSD.61 is a single segment of length 1M.
  • In this solution, the data sent to an OSD with missing data may consist of multiple segments, which need to be processed. The process is as follows:
  • For OSD.61, only the [768K, 16K] part of the data object object needs to be decoded.
  • For OSD.39, the three segments [128K, 8K], [256K, 16K], and [768K, 16K] of the data object object need to be decoded.
  • Because the recovered data sent to an OSD to be recovered is stored in a bufferlist, in which data is stored contiguously, the multiple pieces of data obtained in step (1) must be merged and the information of each data segment inserted into an information table.
  • The merged data is denoted data, and the information table of the data segments is denoted data_included.
  • The data segments corresponding to OSD.39 are shown in Figure 3.a, and those corresponding to OSD.61 are shown in Figure 3.b.
  • The data sent to OSD.39 is the merged, contiguous 40K of data, together with the data segment information table {object111, [128K, 8K], [256K, 16K], [768K, 16K]}, which is needed in step 5.
  • The data sent to OSD.61 is 16K, and its data segment information table is {object111, [768K, 16K]}.
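  • The merge into `data` and `data_included` can be sketched in Python as follows. The function name `pack_segments` and the filler payload bytes are hypothetical; a Python `bytes` value stands in for the contiguous bufferlist, and the object name is left out of the table for brevity.

```python
KIB = 1024


def pack_segments(segments):
    """Merge (offset, payload) pairs into one contiguous buffer plus a table.

    Returns (data, data_included), where data_included lists (offset, length)
    for each segment in the order it was appended to data.
    """
    data = bytearray()
    data_included = []
    for offset, payload in segments:
        data += payload
        data_included.append((offset, len(payload)))
    return bytes(data), data_included


# Decoded segments for OSD.39; the payload bytes are placeholders.
decoded_for_osd39 = [
    (128 * KIB, b"\x01" * (8 * KIB)),
    (256 * KIB, b"\x02" * (16 * KIB)),
    (768 * KIB, b"\x03" * (16 * KIB)),
]
data, data_included = pack_segments(decoded_for_osd39)
print(len(data) // KIB, data_included)
```

  • The packed buffer is 40K long, matching the amount of data sent to OSD.39 in the text.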
  • After receiving the data, the OSD that lacks it writes the data to its local hard disk and then sends a response to the main OSD.
  • In this step, the existing solution simply writes 1M of data to the local hard disk at object position [0, 1M], that is, 1M of data written starting from offset 0 in the object.
  • In this solution, each data segment must be extracted from the data (data) according to the data segment information table (data_included) and then written to the local hard disk.
  • The specific process for data_included is as follows:
  • Initially the offset is 0 and data_len is 8K, so the data from 0 to 8K is cut out of data and recorded as write_data.
  • Then offset = offset + data_len; that is, data_len is added to offset.
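  • The extraction loop can be sketched in Python as below. The names `offset`, `data_len`, `write_data`, `data`, and `data_included` follow the text; the function name `apply_segments`, the filler bytes, and the use of a 1M `bytearray` as a stand-in for the local shard on disk are assumptions made for illustration.

```python
KIB = 1024


def apply_segments(shard, data, data_included):
    """Write each packed segment of data to its own position in shard."""
    offset = 0                                   # read cursor inside data
    for seg_off, data_len in data_included:
        write_data = data[offset:offset + data_len]
        if len(write_data) != data_len:
            raise ValueError("data is shorter than data_included describes")
        shard[seg_off:seg_off + data_len] = write_data
        offset += data_len                       # advance to the next segment
    if offset != len(data):
        raise ValueError("data is longer than data_included describes")


shard = bytearray(1024 * KIB)                    # stand-in for the local 1M shard
data = b"\x01" * (8 * KIB) + b"\x02" * (16 * KIB) + b"\x03" * (16 * KIB)
data_included = [(128 * KIB, 8 * KIB), (256 * KIB, 16 * KIB), (768 * KIB, 16 * KIB)]
apply_segments(shard, data, data_included)
print(shard[128 * KIB], shard[256 * KIB], shard[768 * KIB])
```

  • Only the three listed ranges of the shard are overwritten; the rest of the shard is left untouched, which is the point of the incremental write.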
  • An embodiment of the present application provides an erasure code-based data reconstruction apparatus, including:
  • an offset information acquisition module 10, configured to acquire data offset information of incremental data in a data object;
  • a data segment acquisition module 11, configured to acquire corresponding data segments from multiple source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and the number of source OSDs is the same as the number of data disks of the erasure code;
  • an incremental segment storage module 12, configured to integrate the data segments into an erasure incremental segment and write the erasure incremental segment to the OSD to be reconstructed, that is, an OSD among them that does not store the incremental data.
  • The data offset information includes an offset start address and an offset data amount;
  • the data segment acquisition module includes:
  • an address range acquisition module, configured to divide the offset start address equally by the number of data disks to obtain the erasure offset start address, and to divide the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
  • an interval data acquisition module, configured to acquire the corresponding data segments from the multiple source OSDs according to the erasure offset start address and the erasure offset data amount.
  • The erasure code-based data reconstruction apparatus first acquires the data offset information of the incremental data in the data object and then acquires the corresponding data segments from multiple source OSDs according to that information. The source OSDs are the target OSDs that store the incremental data, among the OSDs that store the data object based on the erasure code, and their number is the same as the number of data disks of the erasure code. After the data segments are acquired, they are integrated into an erasure incremental segment, which is written to the OSD to be reconstructed.
  • In this way, the apparatus uses the source OSDs that store the data object's incremental data to reconstruct, on the OSD to be reconstructed, only the erasure incremental segments that it is missing.
  • Compared with reconstructing the entire data object, the apparatus reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.
  • the present application also provides a data reconstruction device based on erasure codes, including:
  • the processor is configured to implement the steps of the above-mentioned data reconstruction method based on erasure code when executing the computer program.
  • the data reconstruction device based on erasure code first obtains data offset information of incremental data in a data object, and then obtains corresponding data segments from multiple source OSDs according to the data offset information.
  • the source OSDs are Among the OSDs that store data objects based on erasure codes, target OSDs with incremental data are stored, and the number of source OSDs is the same as the number of data disks corresponding to erasure codes. After the data fragments are obtained, the data fragments are further integrated into Erasure the incremental fragments, and write the erasure incremental fragments to the OSD to be reconstructed in which no incremental data is stored in the OSD.
  • this device realizes the incremental data for the OSD to be reconstructed that does not store incremental data based on the source OSD that stores incremental data of data objects. Data reconstruction of partially erased incremental fragments. Compared with the method of data reconstruction for the entire data object, the device reduces the amount of data for data reconstruction, and further ensures the overall efficiency of data reconstruction.
  • The present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above erasure-code-based data reconstruction method.
  • With the computer-readable storage medium provided by the present application, data offset information of incremental data in a data object is first acquired, and corresponding data fragments are then acquired from a plurality of source OSDs according to the data offset information.
  • The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code.
  • After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data.
  • The computer-readable storage medium applies to a distributed file system that stores objects in erasure-coding mode.
  • In that scenario, the source OSDs that hold the incremental data of the data object are used to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data.
  • Compared with reconstructing the entire data object, the computer-readable storage medium reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

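The present application discloses a data reconstruction method based on erasure coding, an apparatus, a device and a storage medium. The method comprises: acquiring data offset information of incremental data in a data object; acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code; and integrating the data fragments into erasure incremental fragments and writing the erasure incremental fragments to an OSD to be reconstructed, i.e. an OSD among the OSDs that does not store the incremental data. The method reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction. The present application further provides a data reconstruction apparatus based on erasure coding, a device and a storage medium, which have the same beneficial effects.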

Description

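A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium
The present application claims priority to Chinese patent application No. 202011305572.2, entitled "A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium", filed with the China Patent Office on November 19, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of distributed storage, and in particular to a data reconstruction method based on erasure coding, an apparatus, a device and a storage medium.
Background
Object-based storage is a network storage architecture. A device based on object storage technology is an object-based storage device, OSD for short. In a distributed file system, the main functions of an OSD are storing, replicating, balancing and recovering data. In general, one disk corresponds to one OSD, and the OSD manages the storage of that disk. When the disk mounted by an OSD fails permanently, the data of that OSD must be recovered; the process of recovering the data of a failed OSD on other OSDs is called data reconstruction.
Erasure coding (EC) is a data protection method that splits data into fragments, expands and encodes them with redundant data blocks, and stores them in different locations such as disks, storage nodes or other geographic locations. The protection offered by an erasure code can be expressed as K+M, where K is the number of data disks and M is the number of parity disks; at most M disks may fail.
When the erasure-coded storage mode is applied to a distributed file system that stores data objects, the data held by each OSD member storing a data object is unique, so at present the entire data object is reconstructed whenever the data of the data object is recovered. Consequently, for erasure-code-based storage in a distributed file system, data reconstruction involves a large volume of reads and writes, and the overall efficiency of data reconstruction is difficult to ensure.
It can be seen that providing a data reconstruction method based on erasure coding that better ensures the overall efficiency of data reconstruction is a problem to be solved by those skilled in the art.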
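Summary
The purpose of the present application is to provide a data reconstruction method based on erasure coding, an apparatus, a device and a storage medium, so as to better ensure the overall efficiency of data reconstruction.
To solve the above technical problem, the present application provides a data reconstruction method based on erasure coding, comprising:
acquiring data offset information of incremental data in a data object;
acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code;
integrating the data fragments into erasure incremental fragments, and writing the erasure incremental fragments to an OSD to be reconstructed, i.e. an OSD among the OSDs that does not store the incremental data.
Preferably, the data offset information comprises an offset start address and an offset data amount;
correspondingly, acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information comprises:
dividing the offset start address equally by the number of data disks to obtain an erasure offset start address, and dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
acquiring the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
Preferably, before dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount, the method further comprises:
determining whether the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code;
if the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code, performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
if the offset data amount is not an integer multiple of the erasure stripe data amount of the erasure code, increasing the offset data amount to an integer multiple of the erasure stripe data amount;
performing, based on the modified offset data amount, the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount.
Preferably, when there is more than one piece of data offset information, before writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data, the method further comprises:
storing the erasure incremental fragments corresponding to the pieces of data offset information as one contiguous data segment, and recording position information of each erasure incremental fragment within the data segment;
correspondingly, writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data comprises:
writing the erasure incremental fragments, read from the data segment according to the position information, to the OSD to be reconstructed that does not store the incremental data.
Preferably, writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data comprises:
writing the erasure incremental fragments into the address range of the OSD to be reconstructed that corresponds to the offset start address and the offset data amount.
Preferably, acquiring data offset information of incremental data in a data object comprises:
acquiring the data offset information of the incremental data in the data object from the write operation log of the placement group in which the data object is located.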
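In addition, the present application provides a data reconstruction apparatus based on erasure coding, comprising:
an offset information acquisition module, configured to acquire data offset information of incremental data in a data object;
a data fragment acquisition module, configured to acquire corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code;
an incremental fragment storage module, configured to integrate the data fragments into erasure incremental fragments and write the erasure incremental fragments to an OSD to be reconstructed, i.e. an OSD among the OSDs that does not store the incremental data.
Preferably, the data offset information comprises an offset start address and an offset data amount;
correspondingly, the data fragment acquisition module comprises:
an address range acquisition module, configured to divide the offset start address equally by the number of data disks to obtain an erasure offset start address, and to divide the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
a range data acquisition module, configured to acquire the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
In addition, the present application provides a data reconstruction device based on erasure coding, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above data reconstruction method based on erasure coding when executing the computer program.
In addition, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above data reconstruction method based on erasure coding.
In the data reconstruction method based on erasure coding provided by the present application, data offset information of incremental data in a data object is first acquired, and corresponding data fragments are then acquired from a plurality of source OSDs according to the data offset information. The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code. After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data. In a distributed file system that stores objects in erasure-coding mode, the method thus uses the source OSDs that hold the incremental data of the data object to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data. Compared with reconstructing the entire data object, the method reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction. The present application further provides a data reconstruction apparatus based on erasure coding, a device and a storage medium, which have the same beneficial effects.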
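Brief Description of the Drawings
To describe the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a data reconstruction method based on erasure coding disclosed in an embodiment of the present application;
Fig. 2 is a schematic diagram of the generation of incremental data of a data object disclosed in an embodiment of the present application;
Fig. 3.a is a schematic diagram of a data segment disclosed in a scenario embodiment of the present application;
Fig. 3.b is a schematic diagram of a data segment disclosed in a scenario embodiment of the present application;
Fig. 4 is a schematic structural diagram of a data reconstruction apparatus based on erasure coding disclosed in an embodiment of the present application.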
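Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present application without creative effort fall within the scope of protection of the present application.
When the erasure-coded storage mode is applied to a distributed file system that stores data objects, the data held by each OSD member storing a data object is unique, so at present the entire data object is reconstructed whenever the data of the data object is recovered. Consequently, for erasure-code-based storage in a distributed file system, data reconstruction involves a large volume of reads and writes, and the overall efficiency of data reconstruction is difficult to ensure.
The core of the present application is therefore to provide a data reconstruction method based on erasure coding that better ensures the overall efficiency of data reconstruction.
To help those skilled in the art better understand the solution, the present application is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, an embodiment of the present application discloses a data reconstruction method based on erasure coding, comprising:
Step S10: acquire data offset information of incremental data in a data object.
It should be noted that the data object in this step is stored jointly by a plurality of OSDs (object-based storage devices) under the erasure-code storage mechanism, while logically the data object is a single whole. On this basis, incremental data refers to the data that has changed within the data object; the incremental data is a part of the data object, and the data offset information indicates the offset at which the incremental data is located within the data object.
Step S11: acquire corresponding data fragments from a plurality of source OSDs according to the data offset information.
Here the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code.
After the data offset information of the incremental data in the data object has been acquired, this step acquires the corresponding data fragments from the plurality of source OSDs according to the data offset information, the source OSDs being the target OSDs that store the incremental data among the OSDs that jointly store the data object based on the erasure code.
In erasure-code-based distributed object storage, a data object is stored and maintained jointly by as many OSDs as the erasure code requires disks; in other words, the data on these OSDs can be combined into the complete data object. When part of the data object changes and incremental data is produced, the incremental data is therefore still stored by the corresponding number of OSDs. Since the protection offered by the erasure code can be expressed as K+M, where K is the number of data disks, M is the number of parity disks and at most M disks may fail, when some OSD has not stored the incremental data properly and must be reconstructed, the incremental data missing from that OSD can be reconstructed on it from the data fragments of the incremental data held by K source OSDs that do store it.
Step S12: integrate the data fragments into erasure incremental fragments, and write the erasure incremental fragments to an OSD to be reconstructed, i.e. an OSD among the OSDs that does not store the incremental data.
After the corresponding data fragments have been acquired from the plurality of source OSDs according to the data offset information, this step integrates the data fragments into erasure incremental fragments and then writes the erasure incremental fragments to the OSD to be reconstructed.
In the data reconstruction method based on erasure coding provided by the present application, data offset information of incremental data in a data object is first acquired, and corresponding data fragments are then acquired from a plurality of source OSDs according to the data offset information. The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code. After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data. In a distributed file system that stores objects in erasure-coding mode, the method thus uses the source OSDs that hold the incremental data of the data object to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data. Compared with reconstructing the entire data object, the method reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.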
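The core idea of steps S10 to S12 can be illustrated with a toy code. The sketch below is not the implementation described in this application: it swaps in a single XOR parity shard (a 2+1 layout) for a real K+M erasure code, it splits the object into contiguous shards rather than stripes, and the names `xor_bytes`, `encode` and `rebuild_range` are hypothetical. It only shows that reading the changed byte range from K surviving shards is enough to rebuild that range on the stale shard.

```python
from functools import reduce

def xor_bytes(*parts):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*parts))

def encode(obj, k=2):
    """Toy k+1 code: split obj into k equal shards and append one XOR parity shard."""
    size = len(obj) // k
    data = [obj[i * size:(i + 1) * size] for i in range(k)]
    return [bytearray(shard) for shard in data] + [bytearray(xor_bytes(*data))]

def rebuild_range(shards, stale, start, length):
    """Rebuild only shards[stale][start:start+length] from the other shards."""
    sources = [bytes(s[start:start + length])
               for i, s in enumerate(shards) if i != stale]
    shards[stale][start:start + length] = xor_bytes(*sources)

old = bytes(range(16))
new = bytearray(old)
new[10:12] = b"\xff\xee"        # incremental data: 2 bytes at object offset 10
expected = encode(bytes(new))   # what every shard should hold after the write
cluster = encode(bytes(new))
cluster[1] = encode(old)[1]     # shard 1 was offline and still holds old data
# Object offset 10 falls into shard 1 at shard offset 2 (shard size is 8 here).
rebuild_range(cluster, stale=1, start=2, length=2)
print(cluster == expected)      # True
```

Only 2 bytes per source shard are read here, instead of the full 8-byte shards that a whole-object reconstruction would need.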
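On the basis of the above embodiment, in a preferred implementation the data offset information comprises an offset start address and an offset data amount;
correspondingly, acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information comprises:
dividing the offset start address equally by the number of data disks to obtain an erasure offset start address, and dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
acquiring the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
In this implementation, the data offset information comprises an offset start address and an offset data amount, where the offset start address is the data address of the start of the incremental data and the offset data amount is the address length occupied by the incremental data as a whole. Because the erasure code distributes the data object evenly over the K data disks, the incremental data is also distributed evenly over the data disks. When acquiring the corresponding data fragments from the source OSDs, the offset start address is therefore divided equally by the number of data disks to obtain the erasure offset start address, and the offset data amount is divided equally by the number of data disks to obtain the erasure offset data amount; the corresponding data fragments are then acquired from the source OSDs according to the erasure offset start address and the erasure offset data amount. This implementation further ensures that the data fragments are acquired from the source OSDs accurately.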
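As a minimal sketch of this equal division, assuming the range is already aligned so that it divides evenly (the function name `to_chunk_range` is hypothetical and not taken from the application):

```python
KiB = 1024

def to_chunk_range(obj_offset, obj_length, k):
    """Map an object-level range to the range held by each of the k data OSDs."""
    if obj_offset % k or obj_length % k:
        raise ValueError("range must divide evenly across the k data OSDs")
    return obj_offset // k, obj_length // k

# A 32K change at object offset 512K with K = 4 data disks:
print(to_chunk_range(512 * KiB, 32 * KiB, 4))  # (131072, 8192), i.e. [128K, 8K]
```

Every source OSD is read at the same per-OSD range, which is why a single (start, length) pair is enough for all K of them.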
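On the basis of the above implementation, in a preferred implementation, before dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount, the method further comprises:
determining whether the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code;
if the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code, performing the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount;
if the offset data amount is not an integer multiple of the erasure stripe data amount of the erasure code, increasing the offset data amount to an integer multiple of the erasure stripe data amount;
performing, based on the modified offset data amount, the step of dividing the offset data amount equally by the number of data disks to obtain the erasure offset data amount.
It should be noted that an erasure code usually stores a data object in units of erasure stripes, so when a change in the data object produces incremental data, the size of the incremental data to be processed should be an integer multiple of the erasure stripe size. Before the offset data amount is divided equally by the number of data disks, it is therefore checked whether the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code. If it is, the division is performed directly. If it is not, the offset data amount is first increased to an integer multiple of the erasure stripe data amount, and the division is performed on the modified offset data amount. This implementation further ensures that the erasure offset data amount obtained by the equal division is accurate.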
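The rounding step can be sketched as follows. The function name `align_to_stripe` is hypothetical. Rounding the length up follows the text above; additionally rounding the start address down to a stripe boundary is an assumption made here so that the whole original range stays covered when the start is not already aligned.

```python
KiB, MiB = 1024, 1024 * 1024

def align_to_stripe(offset, length, stripe):
    """Expand [offset, offset + length) to the enclosing whole stripes."""
    start = (offset // stripe) * stripe             # round start down (assumption)
    end = -(-(offset + length) // stripe) * stripe  # round end up (ceiling division)
    return start, end - start

# A 40K change at offset 3M with a 32K stripe grows to 64K:
print(align_to_stripe(3 * MiB, 40 * KiB, 32 * KiB) == (3 * MiB, 64 * KiB))  # True
```

A range that is already a whole number of stripes is returned unchanged, which corresponds to the branch in which the division is performed directly.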
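In addition, in a preferred implementation, when there is more than one piece of data offset information, before writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data, the method further comprises:
storing the erasure incremental fragments corresponding to the pieces of data offset information as one contiguous data segment, and recording position information of each erasure incremental fragment within the data segment;
correspondingly, writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data comprises:
writing the erasure incremental fragments, read from the data segment according to the position information, to the OSD to be reconstructed that does not store the incremental data.
It should be noted that more than one piece of data offset information means that the data object contains several pieces of incremental data, and a corresponding erasure incremental fragment is generated for each of them. Before the erasure incremental fragments are written to the OSD to be reconstructed, this implementation therefore first stores the erasure incremental fragments corresponding to the pieces of data offset information as one contiguous data segment and records the position information of each erasure incremental fragment within the data segment. A contiguous data segment is one whose data occupies contiguous data addresses, and the position information records the address range within the data segment that corresponds to each erasure incremental fragment. When the erasure incremental fragments are written to the OSD to be reconstructed, they are read from the data segment according to the position information and then written. This implementation further ensures that the erasure incremental fragments are written to the OSD to be reconstructed accurately.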
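A minimal sketch of this packing step is shown below. The names `data` and `data_included` follow the scenario embodiment later in this description; the function `pack_fragments` and the in-memory representation are assumptions for illustration.

```python
def pack_fragments(fragments):
    """Concatenate decoded fragments and record where each one belongs.

    fragments: list of (chunk_offset, payload) pairs, in the order to be packed.
    Returns (data, data_included), where data_included lists one
    (chunk_offset, length) entry per fragment, in the same order.
    """
    data = bytearray()
    data_included = []
    for chunk_offset, payload in fragments:
        data_included.append((chunk_offset, len(payload)))
        data += payload
    return bytes(data), data_included

data, data_included = pack_fragments([(128, b"A" * 8), (256, b"B" * 16)])
print(len(data), data_included)  # 24 [(128, 8), (256, 16)]
```

Because the fragments are stored back to back, the position of each fragment inside `data` is implied by the running sum of the earlier lengths, so only the target offset and length need to be recorded.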
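In addition, in a preferred implementation, writing the erasure incremental fragments to the OSD to be reconstructed that does not store the incremental data comprises:
writing the erasure incremental fragments into the address range of the OSD to be reconstructed that corresponds to the offset start address and the offset data amount.
In this implementation, the erasure incremental fragments are written into the address range of the OSD to be reconstructed that corresponds to the offset start address and the offset data amount, which further ensures that the erasure incremental fragments are written to the OSD to be reconstructed accurately.
Furthermore, on the basis of the above embodiments, in a preferred implementation, acquiring data offset information of incremental data in a data object comprises:
acquiring the data offset information of the incremental data in the data object from the write operation log of the placement group in which the data object is located.
In this implementation, after a write operation changes the content of the data object, the data offset information of the incremental data is stored in the write operation log of the placement group in which the data object is located; when the data offset information is needed, it is acquired from that write operation log. This implementation further ensures that the data offset information is acquired accurately.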
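To deepen the understanding of the above embodiments, the present application further provides a scenario embodiment for a specific application scenario.
Take a data object of 4M, a 4+2 erasure code and an erasure stripe of 32K as an example. Suppose the six OSDs holding the object `object` are [2,13,25,39,46,61], and the nodes of OSD.39 and OSD.61 have been restarted, so that the data versions on OSD.39 and OSD.61 lag behind and must be recovered. Under the existing scheme, OSD.2 acts as the primary OSD and sends read requests to OSD.2, OSD.13, OSD.25 and OSD.46, reading 1M from each; it then decodes the data needed by OSD.39 and OSD.61 (1M each) from the resulting 4M and sends the decoded data to OSD.39 and OSD.61 respectively.
Below, the schematic diagram of incremental data generation for a data object shown in Fig. 2 is used to illustrate how the present solution performs incremental reconstruction based on the erasure code in a failure scenario.
The data object `object` is at version1, and all OSDs holding it, [2,13,25,39,46,61], are healthy. Every write to a PG is logged; the log is called the pg_log. `object` is stored in pg 1.2d, so OSD[2,13,25,39,46,61] are the six members of pg 1.2d. Every modification of `object` is recorded in the pg_log, that is, in the pg_log of each of OSD[2,13,25,39,46,61]. Suppose OSD.39 now fails, and 32K of data is then modified at offset 512K of the data object. The data object version becomes version2, and the pg_log of OSD[2,13,25,NONE,46,61] records the modification {object,version2,[512K,32K]}. Next, 64K of data is modified at offset 1M of the data object; the version becomes version3, and the pg_log of OSD[2,13,25,NONE,46,61] records the modification {object,version3,[1M,64K]}. OSD.61 then fails. After that, 40K of data is modified at offset 3M of the data object; the version becomes version4, and the pg_log of OSD[2,13,25,NONE,46,NONE] records the modification {object,version4,[3M,40K]}. OSD.39 and OSD.61 are then restarted. By comparing the pg_log of each OSD, OSD.2 learns that OSD.39 is missing three versions of `object` (version2, version3 and version4) and that OSD.61 is missing one version (version4). To recover these versions, data must be read from the four OSDs that did not fail in order to decode the version data needed by OSD.39 and OSD.61.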
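1. The read requests are constructed as follows:
(1) According to the pg_log, record the OSDs that need data recovery and the data they are missing (in the form of a map).
The data to be recovered for OSD.39 and OSD.61 is recorded as:
<
<OSD.39,{object,[512K,32K],[1M,64K],[3M,40K]}>,
<OSD.61,{object,[3M,40K]}>
>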
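How such a map might be derived from the log can be sketched as follows. This is a deliberately simplified model: the log is reduced to (version, offset, length) tuples for a single object, each lagging OSD is represented only by the last version it applied, and the function name `missing_ranges` is hypothetical. It is not the log format of any real storage system.

```python
KiB, MiB = 1024, 1024 * 1024

# Log as held by the primary OSD: (version, offset, length) per modification.
pg_log = [
    (2, 512 * KiB, 32 * KiB),
    (3, 1 * MiB, 64 * KiB),
    (4, 3 * MiB, 40 * KiB),
]

def missing_ranges(pg_log, last_version_by_osd):
    """For each lagging OSD, list the ranges written after its last version."""
    return {
        osd: [(offset, length) for version, offset, length in pg_log
              if version > last_version]
        for osd, last_version in last_version_by_osd.items()
    }

# OSD.39 stopped at version1, OSD.61 at version3.
missing = missing_ranges(pg_log, {39: 1, 61: 3})
print(len(missing[39]), len(missing[61]))  # 3 1
```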
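(2) Take the union of the data segments missing from the OSDs that need recovery.
For example, the data segments missing from OSD.39 are: {object,[512K,32K],[1M,64K],[3M,40K]}
The data segments missing from OSD.61 are: {object,[3M,40K]}
The union is: {object,[512K,32K],[1M,64K],[3M,40K]}
Because the erasure stripe is the smallest unit in which the erasure code organizes data, both the offset and the length of any read or write must be integer multiples of the erasure stripe. With the 32K erasure stripe of the example above, the union after alignment to the erasure stripe is:
{object,[512K,32K],[1M,64K],[3M,64K]}.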
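By the concept of the erasure unit (the share of one erasure stripe held by a single data OSD, here 32K / 4 = 8K), each OSD member stores only 1/K of the complete data object. The object ranges in the read requests sent to the healthy OSDs must therefore be further adjusted to erasure units. After adjustment they are:
<
<OSD.2,{object,[128K,8K],[256K,16K],[768K,16K]}>,
<OSD.13,{object,[128K,8K],[256K,16K],[768K,16K]}>,
<OSD.25,{object,[128K,8K],[256K,16K],[768K,16K]}>,
<OSD.46,{object,[128K,8K],[256K,16K],[768K,16K]}>
>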
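Steps (1) and (2) can be combined into one small sketch that reproduces the numbers above. The function name `build_read_ranges` is hypothetical. The union here only removes identical ranges; merging overlapping but non-identical ranges is omitted, and rounding the start address down is an assumption, since the text only states that offset and length must be stripe multiples.

```python
KiB, MiB = 1024, 1024 * 1024

def build_read_ranges(missing, stripe, k):
    """Union the missing ranges, align them to stripes, and scale them to one OSD."""
    aligned = set()
    for ranges in missing.values():
        for offset, length in ranges:
            start = (offset // stripe) * stripe             # round down (assumption)
            end = -(-(offset + length) // stripe) * stripe  # round up
            aligned.add((start, end - start))               # the set removes duplicates
    return [(start // k, length // k) for start, length in sorted(aligned)]

missing = {
    39: [(512 * KiB, 32 * KiB), (1 * MiB, 64 * KiB), (3 * MiB, 40 * KiB)],
    61: [(3 * MiB, 40 * KiB)],
}
per_osd = build_read_ranges(missing, stripe=32 * KiB, k=4)
read_requests = {osd: per_osd for osd in (2, 13, 25, 46)}
print([(s // KiB, n // KiB) for s, n in per_osd])  # [(128, 8), (256, 16), (768, 16)]
```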
(3) Adjust the data to be recovered for OSD.39 and OSD.61 from (1) to erasure stripes and erasure units in the same way. The result is:
<
<OSD.39,{object,[128K,8K],[256K,16K],[768K,16K]}>,
<OSD.61,{object,[768K,16K]}>
>
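2. Send the read requests to K healthy OSDs.
The data object to be read and its segments, as constructed for each healthy OSD in step 1 (2), are sent to the respective OSDs.
Messages are sent to OSD.2, OSD.13, OSD.25 and OSD.46, requesting the three segments [128K,8K], [256K,16K] and [768K,16K] of the data object `object`.
3. Each of the K healthy OSDs reads the data from its local disk, fills it into a message and sends it to the primary OSD.
In this step the existing scheme reads a single segment (in the example, each healthy OSD must read 1M of the 4M data object), whereas the present solution reads several segments, {object,[128K,8K],[256K,16K],[768K,16K]}; OSD.2, OSD.13, OSD.25 and OSD.46 each read 8K+16K+16K=40K of data.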
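The saving in read volume for this example can be checked with a few lines of arithmetic. The variable names are illustrative only, and the figures are the ones given in the example above.

```python
KiB, MiB = 1024, 1024 * 1024

k = 4
object_size = 4 * MiB
segments = [(128 * KiB, 8 * KiB), (256 * KiB, 16 * KiB), (768 * KiB, 16 * KiB)]

full_read_per_osd = object_size // k                              # existing scheme
incremental_read_per_osd = sum(length for _, length in segments)  # present solution

print(full_read_per_osd // KiB, incremental_read_per_osd // KiB)  # 1024 40
print(incremental_read_per_osd / full_read_per_osd)               # 0.0390625
```

In this example each source OSD therefore reads about 3.9% of what a whole-object reconstruction would read; the ratio depends entirely on how much of the object changed while the OSDs were unavailable.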
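4. After receiving the K pieces of data, the primary OSD decodes from them the data to be recovered, packs that data into a message and sends it to the OSD that is missing it.
In the example, under the existing scheme the primary OSD receives 1M from each of the four OSDs when recovering the 4M data object and decodes from these four pieces the 1M needed by OSD.39 and by OSD.61. Only one segment, 1M long, is sent to each of OSD.39 and OSD.61. In the present solution the data sent to an OSD with missing data may consist of several segments and needs some processing, as follows:
(1) Decode the segments sent by the healthy OSDs one by one to obtain the data needed by the OSDs to be recovered.
According to step 1 (3), for OSD.61 only the [768K,16K] part of the data object `object` needs to be decoded.
For OSD.39 the three segments [128K,8K], [256K,16K] and [768K,16K] of the data object `object` need to be decoded.
(2) Merge the decoded segments and insert the segment information into a table (map).
Because the recovery data sent to an OSD to be recovered is held in a single bufferlist in which the data is stored contiguously, the segments obtained in step (1) must be merged and their segment information inserted into an information table. The merged data is denoted data, and the segment information table is denoted data_included. The data segment for OSD.39 is shown in Fig. 3.a, and the data segment for OSD.61 is shown in Fig. 3.b.
(3) Send the merged data and the segment information table to each OSD to be recovered.
With the data merged as in (2), the data sent to OSD.39 is the merged contiguous 40K, together with the segment information table {object111,[128K,8K],[256K,16K],[768K,16K]}, which is needed in step 5.
The data sent to OSD.61 is 16K, and its segment information table is {object111,[768K,16K]}.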
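5. After receiving the data, the OSD that was missing it writes the data to its local disk and then sends an acknowledgement to the primary OSD.
In the example, the existing scheme writes only 1M to the local disk in this step, and the write position within the object is [0,1M], i.e. 1M of data is written starting at offset 0 within the object.
Because the present solution must write several segments to the local disk, each segment is extracted from the data (data) according to the segment information table (data_included) and then written to the local disk. The procedure is as follows:
(1) Initialize offset=0;
(2) Take the position information of one segment from data_included.
For example, when OSD.39 processes its data, it first takes [128K,8K] from data_included{[128K,8K],[256K,16K],[768K,16K]}. Here 128K is denoted write_start and 8K is denoted data_len.
(3) Cut one segment out of data, matching the position information from (2). The cut starts at offset and its length is the data length (data_len) from the position information in (2).
At this point offset is 0 and data_len is 8K, so the data from 0 to 8K is cut out of data and denoted write_data.
(4) Write the segment cut out in (3) (write_data) to its position within the object on disk, i.e. write data_len bytes starting at offset write_start within the object.
(5) offset+=data_len, i.e. data_len is added to offset.
(6) If data_included still contains unprocessed segment information, go back to (2); when all of data_included has been processed, finish.
6. When the primary OSD has received the acknowledgements, data recovery is complete.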
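The write loop of step 5 can be sketched as follows. The names `offset`, `write_start`, `data_len`, `write_data`, `data` and `data_included` come from the steps above; modelling the local disk as an in-memory `bytearray` and the function name `apply_fragments` are assumptions for illustration.

```python
def apply_fragments(chunk, data, data_included):
    """Write each packed segment of `data` to its position inside `chunk`.

    chunk: bytearray standing in for the object's data on the local disk.
    data_included: list of (write_start, data_len) entries, in packing order.
    """
    offset = 0                                       # (1) read position inside data
    for write_start, data_len in data_included:      # (2) next segment's position info
        write_data = data[offset:offset + data_len]  # (3) cut the segment out of data
        if len(write_data) != data_len:
            raise ValueError("data is shorter than data_included describes")
        chunk[write_start:write_start + data_len] = write_data  # (4) write in place
        offset += data_len                           # (5) advance; (6) loop until done

chunk = bytearray(32)
apply_fragments(chunk, b"AAAABB", [(4, 4), (16, 2)])
print(bytes(chunk[4:8]), bytes(chunk[16:18]))  # b'AAAA' b'BB'
```

The length check guards against a truncated message; without it, the slice assignment could silently change the size of the buffer.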
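Referring to Fig. 4, an embodiment of the present application provides a data reconstruction apparatus based on erasure coding, comprising:
an offset information acquisition module 10, configured to acquire data offset information of incremental data in a data object;
a data fragment acquisition module 11, configured to acquire corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code;
an incremental fragment storage module 12, configured to integrate the data fragments into erasure incremental fragments and write the erasure incremental fragments to an OSD to be reconstructed, i.e. an OSD among the OSDs that does not store the incremental data.
In addition, in a preferred implementation, the data offset information comprises an offset start address and an offset data amount;
correspondingly, the data fragment acquisition module comprises:
an address range acquisition module, configured to divide the offset start address equally by the number of data disks to obtain an erasure offset start address, and to divide the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
a range data acquisition module, configured to acquire the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
The data reconstruction apparatus based on erasure coding provided by the present application first acquires data offset information of incremental data in a data object, and then acquires corresponding data fragments from a plurality of source OSDs according to the data offset information. The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code. After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data. In a distributed file system that stores objects in erasure-coding mode, the apparatus thus uses the source OSDs that hold the incremental data of the data object to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data. Compared with reconstructing the entire data object, the apparatus reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.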
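In addition, the present application provides a data reconstruction device based on erasure coding, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above data reconstruction method based on erasure coding when executing the computer program.
The data reconstruction device based on erasure coding provided by the present application first acquires data offset information of incremental data in a data object, and then acquires corresponding data fragments from a plurality of source OSDs according to the data offset information. The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code. After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data. In a distributed file system that stores objects in erasure-coding mode, the device thus uses the source OSDs that hold the incremental data of the data object to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data. Compared with reconstructing the entire data object, the device reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.
In addition, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above data reconstruction method based on erasure coding.
With the computer-readable storage medium provided by the present application, data offset information of incremental data in a data object is first acquired, and corresponding data fragments are then acquired from a plurality of source OSDs according to the data offset information. The source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on the erasure code, and the number of source OSDs equals the number of data disks of the erasure code. After the data fragments are acquired, they are integrated into erasure incremental fragments, and the erasure incremental fragments are written to the OSD to be reconstructed, i.e. an OSD that does not store the incremental data. In a distributed file system that stores objects in erasure-coding mode, the computer-readable storage medium thus uses the source OSDs that hold the incremental data of the data object to reconstruct, on the OSD to be reconstructed, only the erasure incremental fragments corresponding to the incremental data. Compared with reconstructing the entire data object, the computer-readable storage medium reduces the amount of data involved in reconstruction and thereby helps ensure the overall efficiency of data reconstruction.
The data reconstruction method based on erasure coding, the apparatus, the device and the storage medium provided by the present application have been described in detail above. The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant parts can be found in the description of the method. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.
It should further be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.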

Claims (10)

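  1. A data reconstruction method based on erasure coding, characterized by comprising:
    acquiring data offset information of incremental data in a data object;
    acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on an erasure code, and the number of the source OSDs equals the number of data disks of the erasure code;
    integrating the data fragments into erasure incremental fragments, and writing the erasure incremental fragments to an OSD to be reconstructed, among the OSDs, that does not store the incremental data.
  2. The data reconstruction method based on erasure coding according to claim 1, characterized in that the data offset information comprises an offset start address and an offset data amount;
    the acquiring corresponding data fragments from a plurality of source OSDs according to the data offset information comprises:
    dividing the offset start address equally by the number of data disks to obtain an erasure offset start address, and dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
    acquiring the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
  3. The data reconstruction method based on erasure coding according to claim 2, characterized in that, before the dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount, the method further comprises:
    determining whether the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code;
    if the offset data amount is an integer multiple of the erasure stripe data amount of the erasure code, performing the step of dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
    if the offset data amount is not an integer multiple of the erasure stripe data amount of the erasure code, increasing the offset data amount to an integer multiple of the erasure stripe data amount;
    performing, based on the modified offset data amount, the step of dividing the offset data amount equally by the number of data disks to obtain an erasure offset data amount.
  4. The data reconstruction method based on erasure coding according to claim 2, characterized in that, when there is more than one piece of the data offset information, before the writing the erasure incremental fragments to an OSD to be reconstructed, among the OSDs, that does not store the incremental data, the method further comprises:
    storing the erasure incremental fragments corresponding to the pieces of data offset information as one contiguous data segment, and recording position information of each erasure incremental fragment within the data segment;
    the writing the erasure incremental fragments to an OSD to be reconstructed, among the OSDs, that does not store the incremental data comprises:
    writing the erasure incremental fragments, read from the data segment according to the position information, to the OSD to be reconstructed, among the OSDs, that does not store the incremental data.
  5. The data reconstruction method based on erasure coding according to claim 2, characterized in that the writing the erasure incremental fragments to an OSD to be reconstructed, among the OSDs, that does not store the incremental data comprises:
    writing the erasure incremental fragments into the address range of the OSD to be reconstructed that corresponds to the offset start address and the offset data amount.
  6. The data reconstruction method based on erasure coding according to any one of claims 1 to 5, characterized in that the acquiring data offset information of incremental data in a data object comprises:
    acquiring the data offset information of the incremental data in the data object from the write operation log of the placement group in which the data object is located.
  7. A data reconstruction apparatus based on erasure coding, characterized by comprising:
    an offset information acquisition module, configured to acquire data offset information of incremental data in a data object;
    a data fragment acquisition module, configured to acquire corresponding data fragments from a plurality of source OSDs according to the data offset information, wherein the source OSDs are the target OSDs that store the incremental data among the OSDs storing the data object based on an erasure code, and the number of the source OSDs equals the number of data disks of the erasure code;
    an incremental fragment storage module, configured to integrate the data fragments into erasure incremental fragments and write the erasure incremental fragments to an OSD to be reconstructed, among the OSDs, that does not store the incremental data.
  8. The data reconstruction apparatus based on erasure coding according to claim 7, characterized in that the data offset information comprises an offset start address and an offset data amount;
    the data fragment acquisition module comprises:
    an address range acquisition module, configured to divide the offset start address equally by the number of data disks to obtain an erasure offset start address, and to divide the offset data amount equally by the number of data disks to obtain an erasure offset data amount;
    a range data acquisition module, configured to acquire the corresponding data fragments from the plurality of source OSDs according to the erasure offset start address and the erasure offset data amount.
  9. A data reconstruction device based on erasure coding, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to implement, when executing the computer program, the steps of the data reconstruction method based on erasure coding according to any one of claims 1 to 6.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the data reconstruction method based on erasure coding according to any one of claims 1 to 6.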
PCT/CN2021/121225 2020-11-19 2021-09-28 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium WO2022105442A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/029,700 US20240045763A1 (en) 2020-11-19 2021-09-28 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011305572.2A CN112463434B (zh) 2020-11-19 2020-11-19 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium
CN202011305572.2 2020-11-19

Publications (1)

Publication Number Publication Date
WO2022105442A1 true WO2022105442A1 (zh) 2022-05-27

Family

ID=74836871

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/121225 WO2022105442A1 (zh) 2020-11-19 2021-09-28 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium

Country Status (3)

Country Link
US (1) US20240045763A1 (zh)
CN (1) CN112463434B (zh)
WO (1) WO2022105442A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
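CN112463434B (zh) * 2020-11-19 2022-08-02 苏州浪潮智能科技有限公司 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium
CN113986944B (zh) * 2021-12-29 2022-03-25 天地伟业技术有限公司 Method, system and electronic device for writing sharded data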

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103645861A (zh) * 2013-12-03 2014-03-19 华中科技大学 Method for reconstructing a failed node in an erasure-coded cluster
CN103955343A (zh) * 2014-04-16 2014-07-30 华中科技大学 Failed-node data reconstruction optimization method based on an I/O pipeline
US20180157671A1 (en) * 2016-12-02 2018-06-07 International Business Machines Corporation Accessing objects in an erasure code supported object storage environment
CN110019408A (zh) * 2017-12-29 2019-07-16 北京奇虎科技有限公司 Method, apparatus and computer device for tracing data state
CN112463434A (zh) * 2020-11-19 2021-03-09 苏州浪潮智能科技有限公司 A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
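CN105930103B (zh) * 2016-05-10 2019-04-16 南京大学 Erasure-code overwrite method for the distributed storage system Ceph
CN108664351A (zh) * 2017-03-31 2018-10-16 杭州海康威视数字技术股份有限公司 Data storage, reconstruction and cleanup methods, apparatus and data processing system
CN109213637B (zh) * 2018-11-09 2022-03-04 浪潮电子信息产业股份有限公司 Data recovery method, apparatus and medium for cluster nodes of a distributed file system
CN110262922B (zh) * 2019-05-15 2021-02-09 中国科学院计算技术研究所 Erasure-code update method and system based on replica data logs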

Also Published As

Publication number Publication date
CN112463434B (zh) 2022-08-02
CN112463434A (zh) 2021-03-09
US20240045763A1 (en) 2024-02-08

Similar Documents

Publication Publication Date Title
US7853750B2 (en) Method and an apparatus to store data patterns
US6665815B1 (en) Physical incremental backup using snapshots
US7010645B2 (en) System and method for sequentially staging received data to a write cache in advance of storing the received data
US7778958B2 (en) Recovery of data on a primary data volume
US10481988B2 (en) System and method for consistency verification of replicated data in a recovery system
WO2022105442A1 (zh) A data reconstruction method based on erasure coding, an apparatus, a device and a storage medium
US7631158B2 (en) Disk snapshot method using a copy-on-write table in a user space
US20120005163A1 (en) Block-based incremental backup
US20100161565A1 (en) Cluster data management system and method for data restoration using shared redo log in cluster data management system
US9256498B1 (en) System and method for generating backups of a protected system from a recovery system
CN106951375B (zh) Method and apparatus for deleting a snapshot volume in a storage system
US7487385B2 (en) Apparatus and method for recovering destroyed data volumes
JPH05502747A (ja) Recovery in large-capacity database systems
US7020805B2 (en) Efficient mechanisms for detecting phantom write errors
US20140181396A1 (en) Virtual tape using a logical data container
US20210349793A1 (en) System and methods of efficiently resyncing failed components without bitmap in an erasure-coded distributed object with log-structured disk layout
US11556423B2 (en) Using erasure coding in a single region to reduce the likelihood of losing objects maintained in cloud object storage
CN110555055A (zh) Data mining method for Oracle database redo log files
CN111367926A (zh) Data processing method and apparatus for a distributed system
US20070106925A1 (en) Method and system using checksums to repair data
US7716519B2 (en) Method and system for repairing partially damaged blocks
US7930495B2 (en) Method and system for dirty time log directed resilvering
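CN114840364A (zh) Method, apparatus and electronic device for backing up stored data in memory
CN113259410A (zh) Data transmission verification method and system based on distributed storage
CN111897676A (zh) File backup method and apparatus based on a database index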

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893593

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21893593

Country of ref document: EP

Kind code of ref document: A1