WO2023151290A1 - Data encoding method, apparatus, device, and medium - Google Patents

Data encoding method, apparatus, device, and medium

Info

Publication number
WO2023151290A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
group
updated
stripes
disk
Prior art date
Application number
PCT/CN2022/123401
Other languages
English (en)
French (fr)
Inventor
吴睿振
陈静静
张永兴
张旭
王凛
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2023151290A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error

Definitions

  • the present application relates to the technical field of data storage, and in particular to a data encoding method, device, equipment and medium.
  • large-stripe erasure coding is a relatively clear application requirement.
  • "large stripe" refers to an erasure-coding configuration in which the number of data and parity stripes is relatively large.
  • this can greatly improve the security of the data and reduce the probability that a hard disk inspection is needed.
  • the amount of data that must be read out is too large. Storage speed is currently limited mainly by IOPS (Input/Output Operations Per Second, the number of read and write operations per second), so when the amount of data is large, reading slows down, which in turn slows data recovery.
  • the purpose of this application is to provide a data encoding method, apparatus, device, and medium that reduce the amount of data that must be read during data recovery and improve the speed of data recovery in a large-stripe erasure coding scenario.
  • the specific plan is as follows:
  • the present application discloses a data encoding method, including:
  • the parity block to be updated is updated based on the different stripe groups and the different data disk groups, according to a preset encoding rule, so as to complete data encoding.
  • the data encoding method also includes:
  • grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
  • the first division rule is determined according to the second preset number of stripes.
  • grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups further includes:
  • encoding the stripe group that contains a single stripe using the original encoding method includes:
  • the stripe group that contains a single stripe does not participate in re-encoding and is encoded according to the original encoding method.
  • the grouping of data disks corresponding to different stripes in each group based on the second division rule to obtain different data disk groups includes:
  • the data encoding method further includes: determining one parity disk from all the parity disks based on a preset operation principle, encoding the parity blocks in that parity disk using the original encoding method, and then determining the parity blocks in the remaining parity disks as the parity blocks to be updated.
  • the preset operation principle is the simplest-operation principle.
  • updating the parity block to be updated based on the grouped storage erasure correction structure and according to a preset encoding rule includes:
  • in each stripe group, after determining the sequence number of the parity block to be updated, using the parity block to be updated in the parity disk corresponding to the even-numbered stripe in the group, together with the data blocks in the data disk group that has the same sequence number among the data disks corresponding to the odd-numbered stripe in the group, to update the parity block to be updated in the parity disk corresponding to the even-numbered stripe in the group.
  • alternatively, updating the parity block to be updated based on the grouped storage erasure correction structure and according to a preset encoding rule includes:
  • in each stripe group, after determining the sequence number of the parity block to be updated, using the parity block to be updated in the parity disk corresponding to the odd-numbered stripe in the group, together with the data blocks in the data disk group that has the same sequence number among the data disks corresponding to the even-numbered stripe in the group, to update the parity block to be updated in the parity disk corresponding to the odd-numbered stripe in the group.
  • each stripe has a corresponding original data block.
  • the present application discloses a data encoding device, including:
  • an erasure correction structure acquisition module configured to acquire a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks;
  • a grouping module configured to group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and to group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
  • an updating module configured to update the parity block to be updated based on the different stripe groups and the different data disk groups, according to a preset encoding rule, so as to complete data encoding.
  • an electronic device comprising:
  • a processor configured to execute the computer program, so as to implement the data encoding method disclosed above.
  • the present application discloses a non-volatile readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, the aforementioned disclosed data encoding method is implemented.
  • the present application discloses a computing processing device, including:
  • one or more processors; when the computer readable code is executed by the one or more processors, the computing processing device executes the steps of the data encoding method disclosed above.
  • the present application discloses a computer program product, including computer readable code which, when run on a computing processing device, causes the computing processing device to execute the steps of the data encoding method disclosed above.
  • the present application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks; grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups; and updating the parity block to be updated based on the different stripe groups and the different data disk groups, according to the preset encoding rule, to complete data encoding.
  • by improving the original encoding method in this way, this application reduces the amount of data that must be read when decoding, which greatly improves decoding speed.
  • Fig. 1 is a flow chart of a data encoding method disclosed in the present application
  • FIG. 2 is a flow chart of a specific data encoding method disclosed in the present application.
  • FIG. 3 is a flow chart of a specific data encoding method disclosed in the present application.
  • Fig. 4 discloses a schematic diagram of an erasure code encoding structure based on an original encoding method
  • FIG. 7 is a schematic structural diagram of an encoding hardware disclosed in the present application.
  • FIG. 8 is a schematic structural diagram of a data encoding device disclosed in the present application.
  • FIG. 9 is a structural diagram of an electronic device disclosed in the present application.
  • Figure 10 schematically shows a block diagram of a computing processing device for performing a method according to the present application.
  • Fig. 11 schematically shows a storage unit for holding or carrying program codes for realizing the method according to the present application.
  • the embodiment of the present application proposes a data encoding scheme, which can reduce the amount of data to be read during data recovery and improve the speed of data recovery in the scenario of large-stripe erasure correction.
  • the embodiment of the present application discloses a data encoding method, as shown in Figure 1, the method includes:
  • Step S11 Obtain the storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks.
  • the hard disk includes a data disk and a parity disk
  • the data disk is used to store data blocks
  • the parity disk is used to store parity blocks.
  • Step S12 Group the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups, and group the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups.
  • the storage capacity of the hard disks is divided into stripes. Specifically, each hard disk is divided into the second preset number of stripes; these stripes are then grouped based on the first division rule to obtain different stripe groups, and the data disks corresponding to the different stripes in each group are grouped based on the second division rule to obtain different data disk groups.
  • Step S13 Update the parity block to be updated according to the different stripe groups and the different data disk groups and according to a preset encoding rule, so as to complete data encoding.
  • the parity block to be updated is updated based on the different stripe groups and the different data disk groups, according to a preset encoding rule, so as to complete data encoding.
  • the parity blocks to be updated are determined as follows: based on the preset operation principle, one parity disk is selected from all the parity disks and its parity blocks are encoded using the original encoding method; the parity blocks in the remaining parity disks are then taken as the parity blocks to be updated.
  • the preset operation principle is the simplest-operation principle; that is, the parity blocks to be updated are chosen so that the overall operation is as simple as possible.
  • the present application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks; grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups; and updating the parity blocks to be updated based on the different stripe groups and data disk groups, according to the preset encoding rule, to complete data encoding.
  • by improving the original encoding method in this way, this application reduces the amount of data that must be read when decoding, which greatly improves decoding speed.
  • the embodiment of the present application discloses a specific data encoding method, as shown in Figure 2, the method includes:
  • Step S21 Obtain a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks.
  • Step S22 Divide every two stripes in the storage erasure correction structure into a group to obtain different stripe groups.
  • the grouping rule depends on the number of stripes, that is, on the second preset number.
  • when the second preset number is even, every two stripes in the storage erasure correction structure are placed in one group to obtain the stripe groups. When the second preset number is odd, every two stripes are placed in one group and the one remaining stripe forms a group of its own; the stripe group that contains a single stripe is encoded using the original encoding method.
  • that is, the single-stripe group does not participate in re-encoding and is encoded according to the original encoding method.
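  • The pairing rule in step S22 can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `group_stripes`, the 1-based stripe numbering, and the boolean "re-encode" flag are assumptions made for the example.

```python
def group_stripes(num_stripes):
    """Pair consecutive stripes; a leftover stripe forms a group of one.

    Returns a list of (stripe_numbers, reencode) tuples. The flag is True for
    two-stripe groups and False for a single-stripe group, which keeps the
    original encoding (hypothetical representation chosen for this sketch).
    """
    stripes = list(range(1, num_stripes + 1))  # 1-based stripe numbers
    groups = [stripes[i:i + 2] for i in range(0, num_stripes, 2)]
    return [(g, len(g) == 2) for g in groups]


print(group_stripes(4))  # even count: every stripe is paired
print(group_stripes(5))  # odd count: the last stripe is left in its own group
```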
  • Step S23 Determine the number of data blocks corresponding to the different stripes in each group and the number of parity blocks to be updated; calculate the ratio of the number of data blocks to the number of parity blocks to be updated, rounding up when the ratio is not an integer; using the ratio as the division length, group the data disks corresponding to the different stripes in each group; and when the number of data disks not yet grouped is less than the division length, place those remaining data disks in one group, to obtain the different data disk groups.
  • this is the second division rule: the division length is the rounded-up ratio of the number of data blocks to the number of parity blocks to be updated, the data disks of each stripe group are split into groups of that length, and any remainder shorter than the division length forms one further group.
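  • The division rule in step S23 can be sketched as below. The function name `group_data_disks` and the 1-based disk numbering are assumptions for illustration; the input values 5 and 3 are chosen only because they reproduce the grouping of disks 1 and 2, disks 3 and 4, and disk 5 that the description uses as an example.

```python
import math


def group_data_disks(num_data_disks, num_parity_to_update):
    """Split data disks into groups whose length is ceil(data / parity-to-update)."""
    division_length = math.ceil(num_data_disks / num_parity_to_update)
    disks = list(range(1, num_data_disks + 1))  # 1-based disk numbers
    # Slicing naturally leaves a shorter final group when disks run out.
    return [disks[i:i + division_length]
            for i in range(0, num_data_disks, division_length)]


print(group_data_disks(5, 3))  # ratio 5/3 rounds up to a division length of 2
```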
  • Step S24 Sort the data disk groups and the parity blocks to be updated separately. In each stripe group, after determining the sequence number of a parity block to be updated, use that parity block in the parity disk corresponding to the even-numbered stripe of the group, together with the data blocks of the data disk group that has the same sequence number among the data disks corresponding to the odd-numbered stripe of the group, to update the parity block to be updated in the parity disk corresponding to the even-numbered stripe.
  • in other words, each parity block to be updated in the even-numbered stripe is combined with the data blocks of the same-numbered data disk group in the odd-numbered stripe, and the result replaces that parity block, which completes data encoding.
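  • A minimal sketch of the update in step S24, assuming the combination is a byte-wise XOR (the description later refers to "additional XOR data" added to the parity blocks). The names `xor_blocks` and `update_parity`, the 0-based disk indices, and the block representation as `bytes` are all hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def update_parity(target_parity, source_data, disk_groups):
    """Return updated copies of the parity blocks to be updated.

    target_parity: parity blocks to be updated in the target stripe, in sequence order
    source_data:   data blocks of the other stripe in the group, indexed by disk (0-based)
    disk_groups:   data disk groups in sequence order, e.g. [[0, 1], [2, 3], [4]]
    """
    updated = []
    for seq, parity in enumerate(target_parity):
        if seq < len(disk_groups):
            # Fold the same-numbered data disk group into this parity block.
            group_blocks = [source_data[d] for d in disk_groups[seq]]
            parity = xor_blocks([parity] + group_blocks)
        updated.append(parity)
    return updated


odd_data = [bytes([1]), bytes([2]), bytes([4]), bytes([8]), bytes([16])]
even_parity = [bytes([0xA0]), bytes([0xB0]), bytes([0xC0])]
print(update_parity(even_parity, odd_data, [[0, 1], [2, 3], [4]]))
```

  • Passing the odd-numbered stripe's parity blocks and the even-numbered stripe's data instead gives the mirrored variant; because XOR is its own inverse, applying the same update twice restores the original parity blocks.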
  • the present application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks; dividing every two stripes in the storage erasure correction structure into a group to obtain different stripe groups; determining the number of data blocks corresponding to the different stripes in each group and the number of parity blocks to be updated; calculating the ratio of the number of data blocks to the number of parity blocks to be updated, rounding up when the ratio is not an integer; using the ratio as the division length to group the data disks corresponding to the different stripes in each group, with any remaining data disks fewer than the division length placed in one group, to obtain different data disk groups; and, in each stripe group, after determining the sequence number of the parity block
  • the embodiment of the present application discloses a specific data encoding method, as shown in Figure 3, the method includes:
  • Step S31 Obtain a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks.
  • Step S32 Divide every two stripes in the storage erasure correction structure into a group to obtain different stripe groups.
  • the grouping rule depends on the number of stripes, that is, on the second preset number.
  • when the second preset number is even, every two stripes in the storage erasure correction structure are placed in one group to obtain the stripe groups. When the second preset number is odd, every two stripes are placed in one group and the one remaining stripe forms a group of its own; the stripe group that contains a single stripe is encoded using the original encoding method.
  • that is, the single-stripe group does not participate in re-encoding and is encoded according to the original encoding method.
  • Step S33 Determine the number of data blocks corresponding to the different stripes in each group and the number of parity blocks to be updated; calculate the ratio of the number of data blocks to the number of parity blocks to be updated, rounding up when the ratio is not an integer; using the ratio as the division length, group the data disks corresponding to the different stripes in each group; and when the number of data disks not yet grouped is less than the division length, place those remaining data disks in one group, to obtain the different data disk groups.
  • this is the second division rule: the division length is the rounded-up ratio of the number of data blocks to the number of parity blocks to be updated, the data disks of each stripe group are split into groups of that length, and any remainder shorter than the division length forms one further group.
  • Step S34 Sort the data disk groups and the parity blocks to be updated separately. In each stripe group, after determining the sequence number of a parity block to be updated, use that parity block in the parity disk corresponding to the odd-numbered stripe of the group, together with the data blocks of the data disk group that has the same sequence number among the data disks corresponding to the even-numbered stripe of the group, to update the parity block to be updated in the parity disk corresponding to the odd-numbered stripe.
  • in other words, each parity block to be updated in the odd-numbered stripe is combined with the data blocks of the same-numbered data disk group in the even-numbered stripe, and the result replaces that parity block, which completes data encoding.
  • either the parity blocks to be updated in the even-numbered stripe of a group are updated using the same-numbered data disk groups of the odd-numbered stripe, or the parity blocks to be updated in the odd-numbered stripe are updated using the same-numbered data disk groups of the even-numbered stripe. The two cases cannot be used at the same time.
  • the present application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks; dividing every two stripes in the storage erasure correction structure into a group to obtain different stripe groups; determining the number of data blocks corresponding to the different stripes in each group and the number of parity blocks to be updated; calculating the ratio of the number of data blocks to the number of parity blocks to be updated, rounding up when the ratio is not an integer; using the ratio as the division length to group the data disks corresponding to the different stripes in each group, with any remaining data disks fewer than the division length placed in one group, to obtain different data disk groups; and, in each stripe group, after determining the sequence number of the parity block
  • Fig. 4 discloses a schematic diagram of an erasure code encoding structure based on an original encoding method.
  • k is the number of data blocks: the number of blocks into which the original data is divided, and also the minimum number of blocks needed to restore the original data. The smaller the value of k, the greater the cost of data reconstruction when a failure occurs; the larger the value of k, the more data must be copied, which increases the load on the network and IO.
  • m is the number of coding blocks. It affects the reliability of data preservation and the storage cost: the larger the value, the greater the fault tolerance, the greater the data redundancy, and the higher the storage cost.
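  • The storage-cost side of this trade-off can be made concrete with a small calculation. The sketch assumes each of the k + m blocks has the same size, so the raw capacity used per unit of user data is (k + m) / k; the function name and the sample (k, m) pairs are illustrative and do not come from the application.

```python
def storage_overhead(k, m):
    """Raw capacity consumed per unit of user data for k data + m coding blocks."""
    return (k + m) / k


for k, m in [(4, 2), (10, 4), (20, 4)]:
    print(f"k={k}, m={m}: {m} coding blocks, overhead {storage_overhead(k, m):.2f}x")
```

  • For a fixed m, a larger k lowers the overhead, which is why large-stripe configurations are attractive despite the higher read volume during recovery.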
  • such an erasure correction system encodes k data blocks (D) to obtain m coding blocks (C).
  • once the m coding blocks have been generated, the erasure correction system can decode and recover from any m errors in the system.
  • RS codes (Reed-Solomon codes) are the more common choice in actual distributed storage systems.
  • an RS code is associated with two parameters, k and r: given two positive integers k and r, the RS code encodes k data blocks into r additional parity blocks.
  • when the r parity blocks are computed using a Vandermonde matrix or a Cauchy matrix, the code is called a Vandermonde-based or Cauchy-based RS erasure code, respectively.
  • the specific encoding process is as follows:
  • in the above formula, the k*k matrix corresponds to the k original data blocks and the r*k matrix is the encoding matrix. The newly added P1 to Pr are the r parity blocks obtained by encoding. When up to r blocks are corrupted or lost in transmission and need to be corrected, the inverse of the matrix corresponding to the remaining data is multiplied by that data to obtain the original data blocks D1 to Dk.
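  • The formula itself did not survive in this text. A standard systematic form that matches the description (a k*k block on top of an r*k encoding matrix, producing D1 to Dk followed by P1 to Pr) is given here as an assumption; the symbols I_k and B, and the entries b_ij, are introduced for illustration and may differ from the notation of the original figure.

```latex
\begin{bmatrix} I_k \\ B \end{bmatrix}
\begin{bmatrix} D_1 \\ \vdots \\ D_k \end{bmatrix}
=
\begin{bmatrix} D_1 \\ \vdots \\ D_k \\ P_1 \\ \vdots \\ P_r \end{bmatrix},
\qquad
B = \begin{bmatrix}
b_{11} & \cdots & b_{1k} \\
\vdots & \ddots & \vdots \\
b_{r1} & \cdots & b_{rk}
\end{bmatrix}
```

  • Here I_k is the k*k identity matrix and B is the r*k Vandermonde or Cauchy encoding matrix, so that each parity block is P_i = b_i1 D_1 + ... + b_ik D_k over the chosen finite field. To decode, the rows for any k surviving blocks are selected, and the inverse of that k*k submatrix is applied to those blocks.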
  • erasure codes use the Cauchy matrix or Vandermonde matrix introduced above because the resulting matrix is invertible, any sub-matrix is also invertible, and the size of the matrix is easy to extend.
  • the RS algorithm has the advantages of simple calculation and flexible expansion, so it has a wide range of applications in the industry.
  • the RS algorithm generally uses the Vandermonde or Cauchy construction described above. Whichever is used, this application sets the encoding and decoding relationship as follows:
  • taking p1 as an example in the encoding and decoding relationship proposed by this application:
  • this application proposes an algorithm that reduces the amount of data read for decoding and recovery at the cost of encoding complexity.
  • because the data read during encoding can be applied to different parity blocks in parallel, the actual encoding speed is not affected, while the decoding speed is greatly improved because less data is read.
  • the above example is arbitrarily divided into: a group of disk 1 and disk 2, a group of disk 3 and disk 4, and a group of disk 5.
  • the even-numbered stripes of the corresponding group 2 are likewise updated using the data of the odd-numbered stripes, which completes all of the encoding. Note that this encoding operation does not change how the original RS code is generated. The additional XOR data added to the even (or odd) parity blocks is data already involved in the encoding itself, so it can be sent to the parity blocks being updated at the same time.
  • FIG. 7 is a schematic diagram of a coding hardware structure disclosed in the present application.
  • for decoding, take a failure of a single disk, disk 5, as an example: the lost blocks are d15, d25, d35, and d45.
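  • The decoding formulas are not present in this text, so the following is only a sketch of how recovery could proceed for one two-stripe group under the update rule of step S24. Everything specific here is an assumption: 5 data disks and 4 parity blocks per stripe, one-byte blocks, the illustrative parity function `parity` (which is not claimed to be the application's RS matrix), and the recovery order in `recover`.

```python
from functools import reduce
from operator import xor

K = 5                             # data disks (assumed)
GROUPS = [[0, 1], [2, 3], [4]]    # data disk groups, 0-based disk indices


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r


def parity(data, j):
    """Illustrative j-th parity: sum of (i+1)^j * d_i; j = 0 is a plain XOR."""
    total = 0
    for i, d in enumerate(data):
        coeff = 1
        for _ in range(j):
            coeff = gf_mul(coeff, i + 1)
        total ^= gf_mul(coeff, d)
    return total


def encode(odd, even):
    """Encode both stripes, then fold odd-stripe groups into even-stripe parity."""
    p_odd = [parity(odd, j) for j in range(4)]
    p_even = [parity(even, j) for j in range(4)]
    for g, disks in enumerate(GROUPS):   # parity 0 keeps the original encoding
        p_even[g + 1] ^= reduce(xor, (odd[d] for d in disks))
    return p_odd, p_even


def recover(failed, odd, even, p_even):
    """Rebuild both blocks of the failed disk; also return how many blocks were read."""
    others = [d for d in range(K) if d != failed]
    # Even stripe: conventional recovery from parity 0 (K - 1 data blocks + 1 parity).
    even_val = p_even[0] ^ reduce(xor, (even[d] for d in others))
    reads = len(others) + 1
    full_even = list(even)
    full_even[failed] = even_val
    # Odd stripe: remove the even stripe's own parity to expose the folded-in XOR.
    g = next(i for i, disks in enumerate(GROUPS) if failed in disks)
    odd_val = p_even[g + 1] ^ parity(full_even, g + 1)
    reads += 1
    for d in GROUPS[g]:
        if d != failed:
            odd_val ^= odd[d]
            reads += 1
    return even_val, odd_val, reads


odd, even = [11, 22, 33, 44, 55], [66, 77, 88, 99, 110]
_, p_even = encode(odd, even)
print(recover(4, odd, even, p_even))   # (110, 55, 6)
```

  • In this sketch, rebuilding disk 5 (index 4) for the two stripes reads 6 blocks, compared with 2 * K = 10 if each stripe were recovered independently from K surviving blocks; the four lost blocks in the example above would correspond to two such stripe groups.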
  • the embodiment of the present application also discloses a data encoding device, as shown in Fig. 8, the device includes:
  • the erasure correction structure acquisition module 11 is configured to acquire the storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and parity disks;
  • the grouping module 12 is configured to group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and to group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
  • the update module 13 is configured to update the check block to be updated according to the different stripe groups and the different data disk groups and according to the preset encoding rules, so as to complete the data encoding.
  • The present application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on the original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups; and updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with the preset encoding rule, so as to complete data encoding.
  • This application improves the original encoding method so that, when decoding based on the improved encoding method, the amount of data to be read is reduced and the decoding speed is considerably improved.
  • FIG. 9 is a structural diagram of an electronic device 20 according to an exemplary embodiment, and the content in the diagram should not be regarded as any limitation on the application scope of this application.
  • FIG. 9 is a schematic structural diagram of an electronic device 20 provided by an embodiment of the present application.
  • the electronic device 20 may specifically include: at least one processor 21 , at least one memory 22 , a display screen 23 , an input/output interface 24 , a communication interface 25 , a power supply 26 , and a communication bus 27 .
  • the memory 22 is used to store a computer program, and the computer program is loaded and executed by the processor 21 to implement relevant steps in the data encoding method disclosed in any of the foregoing embodiments.
  • the electronic device 20 in this embodiment may specifically be an electronic computer.
  • The power supply 26 provides the working voltage for each hardware device on the electronic device 20; the communication interface 25 can create a data transmission channel between the electronic device 20 and external devices, and it may follow any communication protocol applicable to the technical solution of the present application, which is not specifically limited here; the input/output interface 24 is used to obtain external input data or to output data externally, and its specific interface type can be selected according to the application, which is not specifically limited here.
  • the memory 22, as a resource storage carrier can be a read-only memory, random access memory, magnetic disk or optical disk, etc., and the resources stored thereon can include the computer program 221, and the storage method can be temporary storage or permanent storage.
  • the computer program 221 may further include a computer program capable of completing other specific tasks in addition to the computer program capable of completing the data encoding method performed by the electronic device 20 disclosed in any of the foregoing embodiments.
  • the embodiment of the present application also discloses a non-volatile readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, the aforementioned disclosed data encoding method is implemented.
  • the various component embodiments of the present application may be realized in hardware, or in software modules running on one or more processors, or in a combination thereof.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some or all components in the data encoding device according to the embodiments of the present application.
  • the present application can also be implemented as an apparatus or apparatus program (eg, computer program and computer program product) for performing a part or all of the methods described herein.
  • Such a program implementing the present application may be stored on a computer-readable medium, or may be in the form of one or more signals.
  • Such a signal may be downloaded from an Internet site, or provided on a carrier signal, or provided in any other form.
  • Figure 10 illustrates a computing processing device that may implement methods according to the present application.
  • the computing processing device includes thereon a processor 1010 and a computer program product in the form of a memory 1020 or a non-volatile readable storage medium.
  • Memory 1020 may be electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 1020 has a storage space 1030 for program code 1031 for performing any method steps in the methods described above.
  • the storage space 1030 for program codes may include respective program codes 1031 for respectively implementing various steps in the above methods. These program codes can be read from or written into one or more computer program products.
  • These computer program products comprise program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit as with reference to FIG. 11 .
  • the storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 10 .
  • the program code can eg be compressed in a suitable form.
  • The storage unit includes computer readable code 1031', i.e. code readable by a processor such as 1010, which, when executed by a computing processing device, causes the computing processing device to perform each step of the above-described methods.
  • Each embodiment in this application is described in a progressive manner: each focuses on its differences from the other embodiments, and the same or similar parts can be cross-referenced.
  • For the device disclosed in the embodiments, the description is relatively brief because it corresponds to the disclosed method; for relevant details, refer to the description of the method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Error Detection And Correction (AREA)
  • Detection And Correction Of Errors (AREA)

Abstract

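A data encoding method, apparatus, device and medium. The method includes: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups; and updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding. By improving the original encoding method, the amount of data that must be read during decoding is reduced, which in turn considerably increases the decoding speed.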

Description

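A data encoding method, apparatus, device and medium
This application claims priority to Chinese patent application No. 202210119841.9, entitled "一种数据编码方法、装置、设备及介质" (A data encoding method, apparatus, device and medium), filed with the China Patent Office on February 9, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data storage technology, and in particular to a data encoding method, apparatus, device and medium.
Background
With the rapid development of communication and network technology, digital information is growing exponentially, which poses a great challenge to data storage technology. The reliability of data in storage systems and the energy consumption of storage systems are attracting more and more attention. At today's data scale, the reliability of data in a storage system is inversely related to the number of components the system contains: the more components, the lower the reliability. According to a relevant survey, in an Internet data center consisting of 600 disks, about 30 disks fail every month. In large-scale storage systems, the loss of data reliability caused by disk failures is a serious problem, which has prompted research into fault-tolerance techniques. Erasure coding (EC) is a data protection method that splits data into fragments, expands and encodes them with redundant data, and stores them in different locations such as disks, storage nodes, or other geographic sites. The original data is split into k data blocks, m code blocks are generated according to an encoding matrix, and the n (n = k + m) blocks are distributed across different servers. When no more than m blocks are in error, only k blocks are needed to recover the original data.
In the current environment, large-stripe erasure coding is a fairly clear application requirement. "Large stripe" means that the numbers of data and check stripes that make up the erasure code are both relatively large. In this case data safety is greatly improved and the likelihood that hard disks need to be checked is reduced. However, with large-stripe erasure coding, recovering data with existing erasure algorithms requires fetching too much data. Because storage speed is currently limited mainly by the IOPS (Input/Output Operations Per Second) of the hard disks, a large data volume slows data reads, which in turn slows data recovery.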
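Summary
In view of this, the purpose of this application is to provide a data encoding method, apparatus, device and medium that can reduce the amount of data to be read during data recovery in large-stripe erasure coding scenarios and thus increase the data recovery speed. The specific solution is as follows:
In a first aspect, this application discloses a data encoding method, including:
obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks;
grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.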
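Optionally, the data encoding method further includes:
determining, based on the storage erasure correction structure, the correspondence between the stripes and the data disks and the check disks.
Optionally, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
determining the first division rule according to the second preset number of stripes.
Optionally, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
dividing each hard disk based on the second preset number of stripes, and grouping the second preset number of stripes based on the first division rule to obtain different stripe groups.
Optionally, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
placing every two stripes in the storage erasure correction structure into one group to obtain different stripe groups.
Optionally, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups further includes:
placing every two stripes in the storage erasure correction structure into one group, then placing the one remaining stripe into a group of its own, to obtain different stripe groups, and encoding the stripe group that contains the single stripe using the original encoding method.
Optionally, encoding the stripe group that contains the single stripe using the original encoding method includes:
the stripe group containing the single stripe does not take part in re-encoding and is encoded according to the original encoding method.
Optionally, grouping the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups includes:
determining the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group;
calculating the ratio of the number of data blocks to the number of check blocks to be updated, and rounding the ratio up when it is not an integer;
using the ratio as the division length, grouping the data disks corresponding to the different stripes in each group, and, when the number of not yet divided data disks is smaller than the division length, placing those data disks into one group, to obtain different data disk groups.
Optionally, the data encoding method further includes: determining one check disk from all the check disks based on a preset operation principle, encoding the check blocks in that check disk using the original encoding method, and then determining the check blocks in the remaining check disks as the check blocks to be updated.
Optionally, the preset operation principle is the simplest-operation principle.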
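Optionally, updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
sorting each data disk group and the check blocks to be updated respectively;
in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the even-numbered stripe in the group to update the check block to be updated in the check disk corresponding to the even-numbered stripe in the group.
Optionally, updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
sorting each data disk group and the check blocks to be updated respectively;
in each stripe group, after the sequence number of a check block to be updated is determined, using the data blocks of the data disk group that has the same sequence number as the check block to be updated, among the data disks corresponding to the odd-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the even-numbered stripe in the group.
Optionally, updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
sorting each data disk group and the check blocks to be updated respectively;
in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group to update the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group.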
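Optionally, updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
sorting each data disk group and the check blocks to be updated respectively;
in each stripe group, after the sequence number of a check block to be updated is determined, using the data blocks of the data disk group that has the same sequence number as the check block to be updated, among the data disks corresponding to the even-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group.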
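Optionally, when the check blocks to be updated are updated in accordance with the preset encoding rule, every stripe has its corresponding original data blocks.
In a second aspect, this application discloses a data encoding apparatus, including:
an erasure correction structure acquisition module, configured to obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks;
a grouping module, configured to group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and to group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
an update module, configured to update the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.
In a third aspect, this application discloses an electronic device, including:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the data encoding method disclosed above.
In a fourth aspect, this application discloses a non-volatile readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data encoding method disclosed above.
In a fifth aspect, this application discloses a computing processing device, including:
a memory in which computer readable code is stored;
one or more processors, wherein, when the computer readable code is executed by the one or more processors, the computing processing device performs the steps of the data encoding method disclosed above.
In a sixth aspect, this application discloses a computer program product including computer readable code which, when run on a computing processing device, causes the computing processing device to perform the steps of the data encoding method disclosed above.
It can be seen that this application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups; and updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding. In this way, by improving the original encoding method, this application reduces the amount of data that must be read when decoding based on the improved encoding method, which in turn considerably increases the decoding speed.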
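Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a data encoding method disclosed in this application;
FIG. 2 is a flowchart of a specific data encoding method disclosed in this application;
FIG. 3 is a flowchart of a specific data encoding method disclosed in this application;
FIG. 4 is a schematic diagram of an erasure code encoding structure based on the original encoding method;
FIG. 5 shows an original storage erasure correction structure with 4 stripes per disk for K=5, R=4;
FIG. 6 shows an improved storage erasure correction structure, disclosed in this application, with 4 stripes per disk for K=5, R=4;
FIG. 7 is a schematic diagram of an encoding hardware structure disclosed in this application;
FIG. 8 is a schematic structural diagram of a data encoding apparatus disclosed in this application;
FIG. 9 is a structural diagram of an electronic device disclosed in this application;
FIG. 10 schematically shows a block diagram of a computing processing device for performing the method according to this application; and
FIG. 11 schematically shows a storage unit for holding or carrying program code that implements the method according to this application.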
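Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
With large-stripe erasure coding, recovering data with existing erasure algorithms requires fetching too much data. Because storage speed is currently limited mainly by hard disk IOPS, a large data volume slows data reads, which in turn slows data recovery.
For this reason, the embodiments of this application propose a data encoding solution that can reduce the amount of data to be read during data recovery in large-stripe erasure coding scenarios and increase the data recovery speed.
An embodiment of this application discloses a data encoding method. Referring to FIG. 1, the method includes:
Step S11: obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks.
In this embodiment, the storage erasure correction structure determined based on the original encoding method is obtained first. The structure shows directly how the stripes correspond to the data disks and the check disks. Specifically, the structure corresponds to a first preset number of hard disks and a second preset number of stripes; the hard disks include data disks, which store data blocks, and check disks, which store check blocks.
Step S12: group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups.
In this embodiment, the storage capacity of the hard disks is first divided by stripe: each hard disk is divided using the second preset number of stripes. The stripes are then grouped based on the first division rule to obtain different stripe groups, and the data disks corresponding to the different stripes in each group are grouped based on the second division rule to obtain different data disk groups.
Step S13: update the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.
In this embodiment, after the different stripe groups and data disk groups are obtained, the check blocks to be updated are updated based on them and in accordance with the preset encoding rule, so as to complete data encoding.
It should be noted that, in this embodiment, the check blocks to be updated are determined as follows: one check disk is determined from all the check disks based on a preset operation principle, the check blocks in that check disk are encoded using the original encoding method, and the check blocks in the remaining check disks are determined as the check blocks to be updated. In this embodiment the preset operation principle is the simplest-operation principle, that is, the check blocks to be updated are chosen so as to make the overall computation as simple as possible.
It can be seen that this application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups; and updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding. In this way, by improving the original encoding method, this application reduces the amount of data that must be read when decoding based on the improved encoding method, which in turn considerably increases the decoding speed.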
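An embodiment of this application discloses a specific data encoding method. Referring to FIG. 2, the method includes:
Step S21: obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks.
For a more detailed description of this step, refer to the embodiments disclosed above; it is not repeated here.
Step S22: place every two stripes in the storage erasure correction structure into one group to obtain different stripe groups.
In this embodiment, the grouping rule for stripes is defined according to the number of stripes, which is the second preset number. When the second preset number is even, every two stripes in the storage erasure correction structure are placed into one group to obtain different stripe groups. When the second preset number is odd, every two stripes are placed into one group, the one remaining stripe is placed into a group of its own, and the stripe group containing the single stripe is encoded using the original encoding method. Encoding that group using the original encoding method means that it does not take part in re-encoding and is encoded as in the original encoding method.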
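The pairing rule of step S22 can be sketched in a few lines of Python. This is only an illustration: the function names `group_stripes` and `needs_reencoding` and the 1-based stripe numbering are assumptions made for the example, not identifiers from the application.

```python
def group_stripes(num_stripes):
    """Pair consecutive stripes (numbered from 1).

    With an odd stripe count, the last stripe ends up in a group of its own.
    """
    stripes = list(range(1, num_stripes + 1))
    return [stripes[i:i + 2] for i in range(0, num_stripes, 2)]


def needs_reencoding(group):
    """Only two-stripe groups take part in the modified encoding;
    a single-stripe group keeps the original encoding."""
    return len(group) == 2
```

For example, `group_stripes(5)` yields two pairs plus a final single-stripe group, and `needs_reencoding` is false for that last group.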
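Step S23: determine the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group; calculate the ratio of the number of data blocks to the number of check blocks to be updated, and round the ratio up when it is not an integer; using the ratio as the division length, group the data disks corresponding to the different stripes in each group, and, when the number of not yet divided data disks is smaller than the division length, place those data disks into one group, to obtain different data disk groups.
In this embodiment, after the stripe groups are obtained, the data disks corresponding to the different stripes in each group are grouped based on the second division rule to obtain different data disk groups. Specifically, the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group are determined; the ratio of the two is calculated and rounded up when it is not an integer; the ratio is then used as the division length to group the data disks, and when fewer data disks than the division length remain undivided, they are placed into one group.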
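A minimal sketch of the second division rule follows. It assumes that the number of check blocks to be updated per stripe equals the number of check disks minus the one check disk that keeps the original encoding; the function name `group_data_disks` and the 1-based disk numbering are hypothetical.

```python
from math import ceil


def group_data_disks(num_data_disks, num_checks_to_update):
    """Split data disks 1..k into consecutive groups.

    The division length is ceil(k / number of check blocks to update);
    the last group simply takes whatever disks remain.
    """
    length = ceil(num_data_disks / num_checks_to_update)
    disks = list(range(1, num_data_disks + 1))
    return [disks[i:i + length] for i in range(0, num_data_disks, length)]
```

With 5 data disks and 3 check blocks to update, the division length is 2 and the groups have 2, 2 and 1 disks.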
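Step S24: sort each data disk group and the check blocks to be updated respectively; in each stripe group, after the sequence number of a check block to be updated is determined, use the check block to be updated in the check disk corresponding to the even-numbered stripe in the group, together with the data blocks of the data disk group that has the same sequence number as the check block to be updated among the data disks corresponding to the odd-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the even-numbered stripe in the group.
In this embodiment, after the different stripe groups and data disk groups are obtained, the check blocks to be updated are updated according to them and in accordance with the preset encoding rule, so as to complete data encoding. Specifically, each data disk group and the check blocks to be updated are sorted respectively; in each stripe group, after the sequence number of a check block to be updated is determined, the check block to be updated in the check disk corresponding to the even-numbered stripe in the group is combined with the data blocks of the same-numbered data disk group among the data disks corresponding to the odd-numbered stripe in the group, and the result replaces the check block to be updated in the check disk corresponding to the even-numbered stripe.
It can be seen that this application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; placing every two stripes in the storage erasure correction structure into one group to obtain different stripe groups; determining the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group; calculating the ratio of the two and rounding it up when it is not an integer; using the ratio as the division length to group the data disks corresponding to the different stripes in each group, and placing the remaining data disks into one group when fewer than the division length remain undivided, to obtain different data disk groups; sorting each data disk group and the check blocks to be updated respectively; and, in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the even-numbered stripe in the group, together with the data blocks of the same-numbered data disk group among the data disks corresponding to the odd-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the even-numbered stripe. In this way, by improving the original encoding method, this application reduces the amount of data that must be read when decoding based on the improved encoding method, which in turn considerably increases the decoding speed.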
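An embodiment of this application discloses a specific data encoding method. Referring to FIG. 3, the method includes:
Step S31: obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks.
For a more detailed description of this step, refer to the embodiments disclosed above; it is not repeated here.
Step S32: place every two stripes in the storage erasure correction structure into one group to obtain different stripe groups.
In this embodiment, the grouping rule for stripes is defined according to the number of stripes, which is the second preset number. When the second preset number is even, every two stripes in the storage erasure correction structure are placed into one group to obtain different stripe groups. When the second preset number is odd, every two stripes are placed into one group, the one remaining stripe is placed into a group of its own, and the stripe group containing the single stripe is encoded using the original encoding method. Encoding that group using the original encoding method means that it does not take part in re-encoding and is encoded as in the original encoding method.
Step S33: determine the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group; calculate the ratio of the number of data blocks to the number of check blocks to be updated, and round the ratio up when it is not an integer; using the ratio as the division length, group the data disks corresponding to the different stripes in each group, and, when the number of not yet divided data disks is smaller than the division length, place those data disks into one group, to obtain different data disk groups.
In this embodiment, after the stripe groups are obtained, the data disks corresponding to the different stripes in each group are grouped based on the second division rule to obtain different data disk groups. Specifically, the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group are determined; the ratio of the two is calculated and rounded up when it is not an integer; the ratio is then used as the division length to group the data disks, and when fewer data disks than the division length remain undivided, they are placed into one group.
Step S34: sort each data disk group and the check blocks to be updated respectively; in each stripe group, after the sequence number of a check block to be updated is determined, use the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group, together with the data blocks of the data disk group that has the same sequence number as the check block to be updated among the data disks corresponding to the even-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group.
In this embodiment, after the different stripe groups and data disk groups are obtained, the check blocks to be updated are updated according to them and in accordance with the preset encoding rule, so as to complete data encoding. Specifically, each data disk group and the check blocks to be updated are sorted respectively; in each stripe group, after the sequence number of a check block to be updated is determined, the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group is combined with the data blocks of the same-numbered data disk group among the data disks corresponding to the even-numbered stripe in the group, and the result replaces the check block to be updated in the check disk corresponding to the odd-numbered stripe.
It should be noted that, in this embodiment, to preserve the maximum number of tolerable errors, one stripe must retain its original data blocks when the check blocks to be updated are updated in accordance with the preset encoding rule. The encoding rule can therefore only be one of the following: either the check blocks to be updated of the even-numbered stripe are updated using those check blocks together with the data blocks of the same-numbered data disk group of the odd-numbered stripe in the group, or the check blocks to be updated of the odd-numbered stripe are updated using those check blocks together with the data blocks of the same-numbered data disk group of the even-numbered stripe in the group. The two cases cannot be used at the same time.
It can be seen that this application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; placing every two stripes in the storage erasure correction structure into one group to obtain different stripe groups; determining the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group; calculating the ratio of the two and rounding it up when it is not an integer; using the ratio as the division length to group the data disks corresponding to the different stripes in each group, and placing the remaining data disks into one group when fewer than the division length remain undivided, to obtain different data disk groups; sorting each data disk group and the check blocks to be updated respectively; and, in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group, together with the data blocks of the same-numbered data disk group among the data disks corresponding to the even-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the odd-numbered stripe. In this way, by improving the original encoding method, this application reduces the amount of data that must be read when decoding based on the improved encoding method, which in turn considerably increases the decoding speed.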
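FIG. 4 is a schematic diagram of an erasure code encoding structure based on the original encoding method.
1. Erasure codes are a forward error correction technique from coding theory, first applied in communications to deal with loss and corruption in data transmission. Because erasure codes proved effective at preventing data loss, they were introduced into storage. Erasure codes can effectively reduce storage overhead while providing the same reliability, so they are widely used in large storage systems and data centers, for example Microsoft Azure and Facebook F4. An erasure code splits the original data into k data blocks, generates m code blocks according to an encoding matrix, and distributes the n (n = k + m) blocks across different servers. When no more than m blocks are in error, only k blocks are needed to recover the original data. The parameters are as follows:
(1) k: data blocks. k is the number of blocks into which the original data is split and the minimum number of blocks needed to recover the original data. The smaller k is, the greater the cost of data reconstruction when a failure occurs; the larger k is, the more data copies must be transferred, which increases network and I/O load.
(2) m: code blocks. m affects the reliability of data storage and the storage cost. The larger the value, the greater the fault tolerance, but data redundancy and storage cost also increase.
(3) n: number of generated blocks (n = k + m).
(4) Effective storage ratio: k/n.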
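As a quick check of these parameters, the relationships between k, m, n and the effective storage ratio can be written out directly. The helper name `ec_parameters` is an assumption for this sketch.

```python
from fractions import Fraction


def ec_parameters(k, m):
    """Derive n, the tolerated number of lost blocks, and the storage ratio k/n."""
    n = k + m
    return {"n": n, "max_lost_blocks": m, "storage_ratio": Fraction(k, n)}
```

For k = 5 data blocks and m = 3 code blocks this gives n = 8 and a storage ratio of 5/8.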
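Original erasure code encoding generally uses a Vandermonde or Cauchy matrix. Referring to FIG. 4, there are k=5 data blocks to encode and the encoding requirement is m=3. The entries B11, B12 and so on can come from a Vandermonde matrix or a Cauchy matrix. The final generated code consists of the D and C parts, k+m=8 blocks in total, with an effective storage ratio of k/n=5/8. Such an erasure system encodes the k data blocks D to obtain the m code blocks C. Once the m code blocks have been generated, the system can decode and recover from any m errors.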
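2. In practical storage systems, a common choice in distributed environments is the RS code (Reed-Solomon code). An RS code is defined by two parameters, k and r. Given two positive integers k and r, the RS code encodes k data blocks into r additional check blocks. When the r check blocks are generated from a Vandermonde or Cauchy matrix, the code is called an RS erasure code based on the Vandermonde or Cauchy matrix. The encoding processes of the Vandermonde-based and the Cauchy-based RS erasure codes are, respectively:
Figure PCTCN2022123401-appb-000001
Figure PCTCN2022123401-appb-000002
In the formulas above, the k×k matrix corresponds to the k original data blocks and the r×k matrix is the encoding matrix. Multiplying by the original data D_1 to D_k yields the newly added P_1 to P_r, which are the r check data produced by encoding. When at most r of these data are corrupted or lost in transmission and need to be corrected, multiplying the inverse of the matrix corresponding to the remaining data by that data yields the original data blocks D_1 to D_k. Taking the loss of D_1 to D_r as an example, decoding proceeds as follows:
Figure PCTCN2022123401-appb-000003
It follows that the core idea of an erasure code is to construct an invertible encoding matrix to generate the check data, so that the original data can be recovered by computation with the inverse matrix. Common RS erasure codes use the Cauchy or Vandermonde matrices introduced above. The advantage is that the resulting matrix is invertible, any of its submatrices is also invertible, and the matrix is easy to extend in size.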
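To make the encode-then-invert idea concrete, here is a small self-contained sketch of a systematic RS-style erasure code over GF(2^8) that uses a Cauchy matrix for the check rows. It is an illustration under stated assumptions, not the encoder of the application: the field polynomial 0x11D, the choice x_i = i and y_j = r + j for the Cauchy entries, and all function names are choices made for this example.

```python
def _build_tables():
    """Exp/log tables for GF(2^8) with the primitive polynomial 0x11D."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]  # a must be non-zero


def generator_rows(k, r):
    """Rows of [I_k ; C]: identity rows for data, Cauchy rows 1/(x_i + y_j) for checks."""
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    cauchy = [[gf_inv(i ^ (r + j)) for j in range(k)] for i in range(r)]
    return identity + cauchy


def encode(data, r):
    """Return the k data blocks followed by r check blocks (one byte each)."""
    out = []
    for row in generator_rows(len(data), r):
        acc = 0
        for coeff, d in zip(row, data):
            acc ^= gf_mul(coeff, d)
        out.append(acc)
    return out


def decode(survivors, k, r):
    """Recover the k data bytes from any k surviving blocks.

    survivors maps block index (0..k+r-1) to its value; Gauss-Jordan
    elimination over GF(2^8) applies the inverse of the surviving rows.
    """
    rows = generator_rows(k, r)
    m = [rows[i][:] + [survivors[i]] for i in sorted(survivors)[:k]]
    for col in range(k):
        piv = next(row for row in range(col, k) if m[row][col])
        m[col], m[piv] = m[piv], m[col]
        inv = gf_inv(m[col][col])
        m[col] = [gf_mul(inv, v) for v in m[col]]
        for row in range(k):
            if row != col and m[row][col]:
                f = m[row][col]
                m[row] = [v ^ gf_mul(f, w) for v, w in zip(m[row], m[col])]
    return [m[i][k] for i in range(k)]
```

A Cauchy matrix was chosen for the sketch because every square submatrix of a Cauchy matrix is invertible, so any k surviving rows of the stacked matrix can be inverted; with k = 5 and r = 4, losing any four of the nine blocks still leaves enough to call `decode`.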
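Most existing erasure algorithms use the RS algorithm, which is computationally simple and flexible to extend and is therefore widely used in industry. The RS algorithm generally uses the Vandermonde or Cauchy construction described above. Whichever is used, this application denotes its encoding and decoding relations respectively as:
encoding: p_i = fe_i(d_i);
decoding: d_i = fde_i(d_i);
Take as an example an erasure system with large-stripe parameters k=5, r=4 that is built using the standard Vandermonde RS algorithm for encoding and decoding. The encoding relation is then:
Figure PCTCN2022123401-appb-000004
For this encoding relation, taking p1 as an example in the encoding and decoding relations defined above:
Figure PCTCN2022123401-appb-000005
The corresponding fde relation for decoding is obtained in the same way, where ⊕ denotes XOR.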
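FIG. 5 shows an original storage erasure correction structure with 4 stripes per disk for K=5, R=4.
Assume each hard disk is divided into four stripes. Ignoring load balancing and considering only the relation between data and checks, the storage erasure correction structure is as shown in FIG. 5. In FIG. 5, p11, p12, p13 and p14 are the check data generated from stripe 1 through the encoding relation, and the other stripes are encoded in the same way. With the original encoding, the code can recover from errors on any 1 to 4 disks. When a single error occurs and it is on disk 1, the original RS code needs the data of disks 2 to 5 plus any one check from disks 6 to 9 to complete decoding, so 4 data blocks and 1 check block are fetched for each of the 4 stripes, 20 blocks in total.
This application proposes an algorithm that reduces the amount of data to be read for decoding and recovery at the cost of encoding complexity. In a hardware implementation, the data read during encoding can be applied in parallel to the generation of different check blocks, so the actual encoding speed is not affected, while the decoding speed improves considerably because less data is read.
FIG. 6 shows an improved storage erasure correction structure, disclosed in this application, with 4 stripes per disk for K=5, R=4.
The specific implementation process is shown in FIG. 6: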
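(1) Group the stripes in pairs: specifically, place every two stripes into one group.
(2) Group the data disks based on the number of check disks, as follows:
Figure PCTCN2022123401-appb-000007
When the division is not exact, round up; the data disks of each group are then divided into whole groups of size n. For the case above, k=5 and r=4, so:
Figure PCTCN2022123401-appb-000008
Each group is therefore divided on the basis of n=2 into groups of 2, 2 and 1 elements.
In the example above the division is chosen arbitrarily as: disks 1 and 2 in one group, disks 3 and 4 in one group, and disk 5 in one group.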
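The two formulas of step (2) appear only as figure references in this text. One reading that agrees with the surrounding description (one check disk keeps the original encoding, so r − 1 check blocks per stripe are to be updated, and the division length is the ratio of data blocks to check blocks to be updated, rounded up) is the following; it is a reconstruction offered as an assumption, not a transcription of the figures:

```latex
n = \left\lceil \frac{k}{r-1} \right\rceil,
\qquad
k = 5,\ r = 4:\quad n = \left\lceil \frac{5}{3} \right\rceil = 2
```

With n = 2, five data disks split into groups of 2, 2 and 1, which matches the division stated in the text.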
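(3) In each stripe group, add the data disk groups from step (2) of the even (or odd) stripe to the check generation of the odd (or even) stripe. Taking FIG. 6 as an example, consider adding the data disks of the odd stripe to the check disks of the even stripe. For group 1, the additions are: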
Figure PCTCN2022123401-appb-000009
Figure PCTCN2022123401-appb-000010
Figure PCTCN2022123401-appb-000011
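Similarly, the even stripe of group 2 is updated in the same way using the data of the odd stripe, which completes all the encoding. Note that this encoding operation does not change how the original RS code is generated: the extra XOR data added to the even (or odd) check codes only needs to be forwarded to the updated check block at the same time as it is used in its own encoding.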
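The three update formulas for group 1 are present only as figure references in this text. A form that is consistent with the grouping (disks 1 and 2, disks 3 and 4, disk 5), with p21 keeping the original encoding, and with the decoding example in which p24' together with d21 to d25 yields d15, would be the following. The pairing of the first two data disk groups with p22 and p23 is inferred from the sequence-number matching rule and is an assumption:

```latex
\begin{aligned}
p_{22}' &= p_{22} \oplus d_{11} \oplus d_{12} \\
p_{23}' &= p_{23} \oplus d_{13} \oplus d_{14} \\
p_{24}' &= p_{24} \oplus d_{15}
\end{aligned}
```

Here p_{2j} denotes the original check block of stripe 2 on the j-th check disk and d_{1j} the data block of stripe 1 on data disk j.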
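FIG. 7 is a schematic diagram of an encoding hardware structure disclosed in this application.
Taking the generation of p24' as an example, its hardware structure is shown in FIG. 7. The original encoding order and method do not need to change; the part newly added to a check block is obtained by directly forwarding the data blocks involved in the other encoding. These operations run in parallel and require no additional data reads or moves, so speed and area are not affected.
In the decoding part, when an error occurs on one disk, take an error on disk 5 as an example. The erroneous blocks are d15, d25, d35 and d45. First, all 8 data blocks d21-d24 and d41-d44 are read and, using p21 and p41 respectively, the two data blocks d25 and d45 are recovered. Then p24' and p44' are fetched. From the update formulas above, since d21-d25 and d41-d45 are now available, d15 and d35 can be obtained directly. In other words, recovering from a single disk error only requires fetching d21-d24, d41-d44, p21, p41, p24' and p44' from the hard disks, 12 blocks in total. Compared with the 20 blocks required by the original method, this reduces the amount of data to be read and increases the read speed accordingly. Similar speed gains apply when more than one error occurs; no further examples are given here. In this way, this application proposes an erasure hardware accelerator solution that improves error recovery speed under large-stripe erasure coding. It addresses the current user need for fast recovery after errors and the fact that the speed of a storage erasure correction structure is limited mainly by the IOPS of data movement: on the basis of the original RS erasure method, the encoding scheme is improved so that, when decoding is required, less data has to be moved and decoding is faster.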
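The read-count saving described above can be reproduced with a small simulation of one stripe pair. The sketch below rests on several assumptions that are not taken from the application: the original encoder is modelled as p_i = ⊕_j i^j · d_j over GF(2^8) (so the first check block is a plain XOR), the even stripe's second to fourth check blocks absorb the odd stripe's data disk groups in order, and all names (`original_checks`, `encode_pair`, `recover_disk`) are hypothetical. Disk indices are 0-based, so index 4 is disk 5.

```python
from math import ceil


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def original_checks(stripe, r):
    """Assumed stand-in for fe: p_i = XOR_j i^j * d_j for i = 1..r."""
    checks = []
    for i in range(1, r + 1):
        acc, coeff = 0, 1
        for d in stripe:
            acc ^= gf_mul(coeff, d)
            coeff = gf_mul(coeff, i)
        checks.append(acc)
    return checks


def disk_groups(k, r):
    """Consecutive data disk groups of length ceil(k / (r - 1))."""
    length = ceil(k / (r - 1))
    return [list(range(s, min(s + length, k))) for s in range(0, k, length)]


def encode_pair(odd, even, r):
    """Encode one stripe pair; the even stripe's checks 2..r absorb odd-stripe data."""
    p_odd, p_even = original_checks(odd, r), original_checks(even, r)
    for g, disks in enumerate(disk_groups(len(odd), r)):
        for disk in disks:
            p_even[g + 1] ^= odd[disk]  # index 0 (first check) stays original
    return p_odd, p_even


def recover_disk(failed, odd, even, p_even, r):
    """Recover the failed disk's blocks in both stripes and count block reads."""
    k, reads = len(even), 0
    # Even stripe: the first check block is a plain XOR of the data blocks.
    even_value = p_even[0]
    reads += 1
    for j in range(k):
        if j != failed:
            even_value ^= even[j]
            reads += 1
    full_even = even[:failed] + [even_value] + even[failed + 1:]
    # Odd stripe: strip the original check out of the updated check block.
    g, disks = next((g, ds) for g, ds in enumerate(disk_groups(k, r)) if failed in ds)
    odd_value = p_even[g + 1] ^ original_checks(full_even, r)[g + 1]
    reads += 1
    for j in disks:
        if j != failed:
            odd_value ^= odd[j]
            reads += 1
    return odd_value, even_value, reads
```

Under these assumptions, recovering disk 5 for one stripe pair with k = 5 and r = 4 takes 6 block reads, so the four-stripe example needs 12, in line with the count given in the text; a failure on a disk that shares its data disk group with another disk costs one extra read per pair.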
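Correspondingly, an embodiment of this application also discloses a data encoding apparatus. Referring to FIG. 8, the apparatus includes:
an erasure correction structure acquisition module 11, configured to obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks;
a grouping module 12, configured to group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and to group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
an update module 13, configured to update the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.
For the more detailed working process of each of these modules, refer to the corresponding content disclosed in the foregoing embodiments; it is not repeated here.
It can be seen that this application proposes a data encoding method, including: obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks; grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups; and updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding. In this way, by improving the original encoding method, this application reduces the amount of data that must be read when decoding based on the improved encoding method, which in turn considerably increases the decoding speed.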
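Further, an embodiment of this application also provides an electronic device. FIG. 9 is a structural diagram of an electronic device 20 according to an exemplary embodiment, and the content of the figure should not be regarded as any limitation on the scope of use of this application.
FIG. 9 is a schematic structural diagram of an electronic device 20 provided by an embodiment of this application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a display screen 23, an input/output interface 24, a communication interface 25, a power supply 26, and a communication bus 27. The memory 22 is used to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the data encoding method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be an electronic computer.
In this embodiment, the power supply 26 provides the working voltage for each hardware device on the electronic device 20. The communication interface 25 can create a data transmission channel between the electronic device 20 and external devices; it may follow any communication protocol applicable to the technical solution of this application, which is not specifically limited here. The input/output interface 24 is used to obtain external input data or to output data externally; its specific interface type can be selected according to the application, which is not specifically limited here.
In addition, the memory 22, as a carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disk or the like. The resources stored on it may include a computer program 221, and the storage may be temporary or permanent. Besides the computer program that can be used to perform the data encoding method executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 221 may further include computer programs that can be used to perform other specific tasks.
Further, an embodiment of this application also discloses a non-volatile readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data encoding method disclosed above.
For the specific steps of the method, refer to the corresponding content disclosed in the foregoing embodiments; they are not repeated here.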
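The component embodiments of this application may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the data encoding apparatus according to the embodiments of this application. This application may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described here. Such a program implementing this application may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet site, provided on a carrier signal, or provided in any other form.
For example, FIG. 10 shows a computing processing device that can implement the method according to this application. The computing processing device includes a processor 1010 and a computer program product or non-volatile readable storage medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk or ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps described above. For example, the storage space 1030 for program code may include individual program codes 1031 for implementing the various steps of the above methods. These program codes can be read from or written to one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such a computer program product is typically a portable or fixed storage unit as shown in FIG. 11. The storage unit may have storage segments, storage spaces and so on arranged similarly to the memory 1020 in the computing processing device of FIG. 10. The program code may, for example, be compressed in a suitable form. Typically, the storage unit includes computer readable code 1031', that is, code that can be read by a processor such as 1010, which, when run by a computing processing device, causes the computing processing device to perform the steps of the methods described above.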
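The embodiments in this application are described in a progressive manner: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed here can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be regarded as going beyond the scope of this application.
The steps of the methods or algorithms described in connection with the embodiments disclosed here may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
Finally, it should also be noted that relational terms such as "first" and "second" are used here only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any of their variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The data encoding method, apparatus, device and storage medium provided by this application have been described in detail above. Specific examples have been used here to explain the principles and implementations of this application, and the description of the above embodiments is only intended to help in understanding the method of this application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.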

Claims (20)

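  1. A data encoding method, characterized by including:
    obtaining a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks;
    grouping the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and grouping the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
    updating the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.
  2. The data encoding method according to claim 1, characterized by further including:
    determining, based on the storage erasure correction structure, the correspondence between the stripes and the data disks and the check disks.
  3. The data encoding method according to claim 1, characterized in that grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
    determining the first division rule according to the second preset number of stripes.
  4. The data encoding method according to claim 3, characterized in that grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
    dividing each hard disk based on the second preset number of stripes, and grouping the second preset number of stripes based on the first division rule to obtain different stripe groups.
  5. The data encoding method according to claim 3, characterized in that, when the second preset number is even, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups includes:
    placing every two stripes in the storage erasure correction structure into one group to obtain different stripe groups.
  6. The data encoding method according to claim 1, characterized in that, when the second preset number is odd, grouping the second preset number of stripes in the storage erasure correction structure based on the first division rule to obtain different stripe groups further includes:
    placing every two stripes in the storage erasure correction structure into one group, then placing the one remaining stripe into a group of its own, to obtain different stripe groups, and encoding the stripe group that contains the single stripe using the original encoding method.
  7. The data encoding method according to claim 6, characterized in that encoding the stripe group that contains the single stripe using the original encoding method includes:
    the stripe group containing the single stripe does not take part in re-encoding and is encoded according to the original encoding method.
  8. The data encoding method according to claim 1, characterized in that grouping the data disks corresponding to the different stripes in each group based on the second division rule to obtain different data disk groups includes:
    determining the number of data blocks and the number of check blocks to be updated corresponding to the different stripes in each group;
    calculating the ratio of the number of data blocks to the number of check blocks to be updated, and rounding the ratio up when it is not an integer;
    using the ratio as the division length, grouping the data disks corresponding to the different stripes in each group, and, when the number of not yet divided data disks is smaller than the division length, placing those data disks into one group, to obtain different data disk groups.
  9. The data encoding method according to claim 1, characterized by further including:
    determining one check disk from all the check disks based on a preset operation principle, encoding the check blocks in that check disk using the original encoding method, and then determining the check blocks in the remaining check disks as the check blocks to be updated.
  10. The data encoding method according to claim 9, characterized in that the preset operation principle is the simplest-operation principle.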
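  11. The data encoding method according to any one of claims 1 to 10, characterized in that updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
    sorting each data disk group and the check blocks to be updated respectively;
    in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the even-numbered stripe in the group to update the check block to be updated in the check disk corresponding to the even-numbered stripe in the group.
  12. The data encoding method according to any one of claims 1 to 10, characterized in that updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
    sorting each data disk group and the check blocks to be updated respectively;
    in each stripe group, after the sequence number of a check block to be updated is determined, using the data blocks of the data disk group that has the same sequence number as the check block to be updated, among the data disks corresponding to the odd-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the even-numbered stripe in the group.
  13. The data encoding method according to any one of claims 1 to 10, characterized in that updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
    sorting each data disk group and the check blocks to be updated respectively;
    in each stripe group, after the sequence number of a check block to be updated is determined, using the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group to update the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group.
  14. The data encoding method according to any one of claims 1 to 10, characterized in that updating the check blocks to be updated according to the grouped storage erasure correction structure and in accordance with the preset encoding rule includes:
    sorting each data disk group and the check blocks to be updated respectively;
    in each stripe group, after the sequence number of a check block to be updated is determined, using the data blocks of the data disk group that has the same sequence number as the check block to be updated, among the data disks corresponding to the even-numbered stripe in the group, to update the check block to be updated in the check disk corresponding to the odd-numbered stripe in the group.
  15. The data encoding method according to claim 1, characterized in that, when the check blocks to be updated are updated in accordance with the preset encoding rule, every stripe has its corresponding original data blocks.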
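  16. A data encoding apparatus, characterized by including:
    an erasure correction structure acquisition module, configured to obtain a storage erasure correction structure determined based on an original encoding method, wherein the storage erasure correction structure corresponds to a first preset number of hard disks and a second preset number of stripes, and the hard disks include data disks and check disks;
    a grouping module, configured to group the second preset number of stripes in the storage erasure correction structure based on a first division rule to obtain different stripe groups, and to group the data disks corresponding to the different stripes in each group based on a second division rule to obtain different data disk groups;
    an update module, configured to update the check blocks to be updated according to the different stripe groups and the different data disk groups and in accordance with a preset encoding rule, so as to complete data encoding.
  17. An electronic device, characterized by including:
    a memory, configured to store a computer program;
    a processor, configured to execute the computer program to implement the data encoding method according to any one of claims 1 to 15.
  18. A non-volatile readable storage medium, characterized by being used to store a computer program, wherein the computer program, when executed by a processor, implements the data encoding method according to any one of claims 1 to 15.
  19. A computing processing device, characterized by including:
    a memory in which computer readable code is stored;
    one or more processors, wherein, when the computer readable code is executed by the one or more processors, the computing processing device performs the steps of the data encoding method according to any one of claims 1 to 15.
  20. A computer program product, characterized by including computer readable code which, when run on a computing processing device, causes the computing processing device to perform the steps of the data encoding method according to any one of claims 1 to 15.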
PCT/CN2022/123401 2022-02-09 2022-09-30 一种数据编码方法、装置、设备及介质 WO2023151290A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210119841.9A CN114153651B (zh) 2022-02-09 2022-02-09 一种数据编码方法、装置、设备及介质
CN202210119841.9 2022-02-09

Publications (1)

Publication Number Publication Date
WO2023151290A1 true WO2023151290A1 (zh) 2023-08-17

Family

ID=80450020

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123401 WO2023151290A1 (zh) 2022-02-09 2022-09-30 一种数据编码方法、装置、设备及介质

Country Status (2)

Country Link
CN (1) CN114153651B (zh)
WO (1) WO2023151290A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114153651B (zh) * 2022-02-09 2022-04-29 苏州浪潮智能科技有限公司 一种数据编码方法、装置、设备及介质
CN114816837B (zh) * 2022-06-28 2022-12-02 苏州浪潮智能科技有限公司 一种纠删码融合方法、系统、电子设备及存储介质
CN115080303B (zh) * 2022-07-26 2023-01-06 苏州浪潮智能科技有限公司 Raid6磁盘阵列的编码方法、解码方法、装置及介质
CN116501553B (zh) * 2023-06-25 2023-09-19 苏州浪潮智能科技有限公司 数据恢复方法、装置、系统、电子设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090055682A1 (en) * 2007-07-18 2009-02-26 Panasas Inc. Data storage systems and methods having block group error correction for repairing unrecoverable read errors
CN106844098A (zh) * 2016-12-29 2017-06-13 中国科学院计算技术研究所 一种基于十字交叉纠删编码的快速数据恢复方法及系统
US20200285551A1 (en) * 2019-03-04 2020-09-10 Hitachi, Ltd. Storage system, data management method, and data management program
CN112860475A (zh) * 2021-02-04 2021-05-28 山东云海国创云计算装备产业创新中心有限公司 基于rs纠删码的校验块恢复方法、装置、系统及介质
CN113258938A (zh) * 2021-06-03 2021-08-13 成都信息工程大学 一种单节点故障快速修复纠删码的构造方法
CN113590042A (zh) * 2021-07-29 2021-11-02 杭州宏杉科技股份有限公司 一种数据保护存储方法、装置及设备
CN114153651A (zh) * 2022-02-09 2022-03-08 苏州浪潮智能科技有限公司 一种数据编码方法、装置、设备及介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009050761A1 (ja) * 2007-10-15 2009-04-23 Fujitsu Limited ストレージシステム、ストレージ制御装置、ストレージシステムの制御方法及びそのプログラム
US8914706B2 (en) * 2011-12-30 2014-12-16 Streamscale, Inc. Using parity data for concurrent data authentication, correction, compression, and encryption
US9201800B2 (en) * 2013-07-08 2015-12-01 Dell Products L.P. Restoring temporal locality in global and local deduplication storage systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG HANG, LIU SHANZHENG; TANG DAN; CAI HONGLIANG: "Erasure Code with Low Recovery-overhead in Distributed Storage Systems", JOURNAL OF COMPUTER APPLICATIONS, JISUANJI YINGYONG, CN, vol. 40, no. 10, 31 October 2020 (2020-10-31), CN , pages 2942 - 2950, XP055960281, ISSN: 1001-9081, DOI: 10.11772/j.issn.1001-9081.2020010127 *

Also Published As

Publication number Publication date
CN114153651A (zh) 2022-03-08
CN114153651B (zh) 2022-04-29

Similar Documents

Publication Publication Date Title
WO2023151290A1 (zh) 一种数据编码方法、装置、设备及介质
US10146618B2 (en) Distributed data storage with reduced storage overhead using reduced-dependency erasure codes
US10481978B2 (en) Optimal slice encoding strategies within a dispersed storage unit
US10089176B1 (en) Incremental updates of grid encoded data storage systems
US10162704B1 (en) Grid encoded data storage systems for efficient data repair
US10108819B1 (en) Cross-datacenter extension of grid encoded data storage systems
US9998539B1 (en) Non-parity in grid encoded data storage systems
US9959167B1 (en) Rebundling grid encoded data storage systems
US9904589B1 (en) Incremental media size extension for grid encoded data storage systems
CN112860475B (zh) 基于rs纠删码的校验块恢复方法、装置、系统及介质
CN111831223B (zh) 提高数据去重系统可扩展性的容错编码方法、装置及系统
WO2023138289A1 (zh) 一种数据存储方法、装置、设备及计算机可读存储介质
CN113687975B (zh) 数据处理方法、装置、设备及存储介质
CN102843212B (zh) 编解码处理方法及装置
CN114116297B (zh) 一种数据编码方法、装置、设备及介质
CN116501553B (zh) 数据恢复方法、装置、系统、电子设备及存储介质
CN107153661A (zh) 一种基于hdfs系统的数据的存储、读取方法及其装置
CN113258936B (zh) 一种基于循环移位的双重编码的构造方法
Ivanichkina et al. Mathematical methods and models of improving data storage reliability including those based on finite field theory
US10235402B1 (en) Techniques for combining grid-encoded data storage systems
US10324790B1 (en) Flexible data storage device mapping for data storage systems
US10198311B1 (en) Cross-datacenter validation of grid encoded data storage systems
CN105007286A (zh) 解码方法和装置及云存储方法和系统
CN115061640B (zh) 一种容错分布存储系统、方法、电子设备及介质
CN111224747A (zh) 可降低修复带宽和磁盘读取开销的编码方法及其修复方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22925648

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18696394

Country of ref document: US