WO2021012164A1 - Data reconstruction method, apparatus, computer device, storage medium, and system - Google Patents

Data reconstruction method, apparatus, computer device, storage medium, and system

Info

Publication number
WO2021012164A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
storage device
matrix
sub
storage
Prior art date
Application number
PCT/CN2019/097155
Other languages
English (en)
French (fr)
Inventor
张进毅
董如良
陈亮
薛强
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2019/097155 priority Critical patent/WO2021012164A1/zh
Priority to EP19938765.5A priority patent/EP3989069B1/en
Priority to CN201980008279.9A priority patent/CN112543920B/zh
Publication of WO2021012164A1 publication Critical patent/WO2021012164A1/zh
Priority to US17/574,069 priority patent/US20220138046A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1088Reconstruction on already foreseen single or plurality of spare disks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1092Rebuilding, e.g. when physically replacing a failing disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • G06F11/1435Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD

Definitions

  • This application relates to the field of data storage technology, and in particular to a method, device, computer equipment, storage medium, and storage system for data reconstruction.
  • erasure coding (EC) and striping technologies are generally used to store the data to be stored.
  • the data to be stored can be divided into multiple stripes.
  • if the redundancy mode is N data blocks + M check blocks, each stripe includes N data blocks and M check blocks, where N and M are integers greater than 0; a data block or a check block can each be regarded as a block.
  • the N+M blocks included in each stripe can be stored on multiple storage devices. When a storage device loses a block that it stores, that block is lost. To ensure that lost blocks do not affect the normal services of the storage system, the storage system needs to reconstruct the lost blocks.
  • block reconstruction can proceed as follows: for a stripe that has lost blocks, the control device in the storage system determines the lost block in the stripe and sends N read requests to the storage devices, each read request instructing a storage device to read one non-lost block. After receiving a read request, a storage device sends the non-lost block indicated by that request to the control device. Once the control device has obtained N blocks, it computes the lost block from the check matrix and the N obtained blocks, and sends the resulting block through a write request to the storage device that lost it. After receiving the write request, that storage device obtains the block from the request and writes it to its storage medium, which completes the block reconstruction.
  • in this process, N read requests, N blocks, and the reconstructed block need to be transmitted between the control device and the storage devices that store the stripe. This occupies a large amount of network bandwidth and thus reduces the performance of block reconstruction.
  • the embodiments of the present invention provide a data reconstruction method, device, computer equipment, storage medium and system, which can improve the performance of block reconstruction.
  • the technical scheme is as follows:
  • a method for data reconstruction includes: determining a first block among the lost blocks in a first stripe, where the blocks in the first stripe are stored by a target number of storage devices;
  • the first block lost in the first stripe is reconstructed directly from the first results obtained from the target number of storage devices, so the first block can be reconstructed without reading the non-lost blocks of the first stripe. Because the first results contain less data than the non-lost blocks of the first stripe, they occupy less network bandwidth during transmission, which improves the performance of block reconstruction.
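  • As an illustration of why the first results carry less data than the non-lost blocks, the following minimal sketch assumes a single XOR check block (a simplification; the source describes a general check matrix) and a hypothetical layout in which each surviving storage device holds two blocks of the stripe. The names `xor_blocks`, `first_result`, `device_a`, and `device_b` are illustrative and do not come from the source.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: 4 data blocks + 1 XOR check block (assumption).
data = [bytes([value] * 8) for value in (1, 2, 3, 4)]
check = xor_blocks(data)

lost = data[0]                 # block on the failed storage medium
device_a = [data[1], data[2]]  # surviving blocks held by device A
device_b = [data[3], check]    # surviving blocks held by device B


def first_result(device_blocks):
    """Each storage device combines its own blocks locally."""
    return xor_blocks(device_blocks)


# Only one first result per device crosses the network.
results = [first_result(device_a), first_result(device_b)]
rebuilt = xor_blocks(results)

blocks_sent_conventional = len(device_a) + len(device_b)
blocks_sent_with_first_results = len(results)
```

  • In this sketch the conventional approach would move four blocks to the reconstructing device, whereas the first-result approach moves two block-sized results; the saving grows with the number of blocks each device holds.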
  • the method before the obtaining the first result of the target number of storage devices, the method further includes:
  • the first sub-matrix includes a column corresponding to the first block and a second sub-matrix.
  • the reconstructing the first block according to the first result of the target number of storage devices includes:
  • the reconstructing the first block according to the first result of the target number of storage devices and the first sub-matrix includes:
  • the first target row of the target block matrix is determined as the reconstructed first block.
  • before the reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
  • the first results returned by the target number of storage devices in response to a first acquisition request are obtained.
  • the first result sent by each storage device can be obtained without sending a large number of read requests to each storage device, which reduces the CPU overhead of the control device and further reduces the network bandwidth occupied, which in turn further improves the performance of block reconstruction.
  • the method further includes:
  • a write request is sent to a target storage device, where the write request carries the reconstructed first block and the block information of the reconstructed first block, and the target storage device stores the reconstructed first block based on that block information.
  • before the sending the write request to the target storage device, the method further includes:
  • if the reconstructed second block is the same as the acquired second block, the step of sending a write request to the target storage device is executed; otherwise, the step of sending a write request to the target storage device is not executed, and use of the first stripe is prohibited.
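  • The check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the source's implementation: arithmetic is in GF(2^8) with reduction polynomial 0x11D, the check matrix has two rows with the hypothetical column (1, 2^i) for block i, and the stripe has N = 3 data blocks and M = 2 check blocks. All function names are illustrative. The sum of the surviving columns times their blocks is solved against the two columns of the first sub-matrix, which yields the first block and the second block together; the second block is then compared with the copy actually read.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8); polynomial 0x11D is an assumption."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def gf_inv(a: int) -> int:
    """Inverse of a nonzero byte, computed as a**254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def column(i: int) -> tuple:
    """Hypothetical check-matrix column for block i: (1, 2**i)."""
    x = 1
    for _ in range(i):
        x = gf_mul(x, 2)
    return (1, x)


def partial_sum(indexed_blocks, length: int):
    """Two-row sum of column(i) * block over the given (i, block) pairs."""
    rows = [bytearray(length), bytearray(length)]
    for i, block in indexed_blocks:
        for r, coeff in enumerate(column(i)):
            for k, byte in enumerate(block):
                rows[r][k] ^= gf_mul(coeff, byte)
    return rows


def solve_two(i: int, j: int, rows):
    """Solve column(i)*X + column(j)*Y = rows for the blocks X and Y."""
    (a, c), (b, d) = column(i), column(j)
    det_inv = gf_inv(gf_mul(a, d) ^ gf_mul(b, c))
    x = bytes(gf_mul(det_inv, gf_mul(d, s0) ^ gf_mul(b, s1)) for s0, s1 in zip(*rows))
    y = bytes(gf_mul(det_inv, gf_mul(c, s0) ^ gf_mul(a, s1)) for s0, s1 in zip(*rows))
    return x, y


# Build a stripe whose blocks satisfy (check matrix) * (blocks) = 0.
data = [b"abcd", b"efgh", b"ijkl"]
length = len(data[0])
p, q = solve_two(3, 4, partial_sum(list(enumerate(data)), length))
stripe = data + [p, q]

# Block 1 is lost; block 0 plays the role of the second block.
first, second = 1, 0
third = [(i, stripe[i]) for i in range(len(stripe)) if i not in (first, second)]
rebuilt_first, rebuilt_second = solve_two(first, second, partial_sum(third, length))

# Write the first block only if the second block matches what was read.
verified = rebuilt_second == stripe[second]
```

  • If any block contributing to the sum were corrupted, `rebuilt_second` would differ from the stored second block under this particular column choice, so the write would be skipped and the stripe taken out of use.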
  • before the reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
  • a target reconstruction request is sent to the target storage device, where the target reconstruction request carries a first sub-matrix and is used to instruct reconstruction of the first block according to the first sub-matrix.
  • the first result sent by each storage device can be obtained without sending a large number of read requests to each storage device, which reduces the CPU overhead of the control device and further reduces the network bandwidth occupied, which can further improve the performance of block reconstruction.
  • before the determining the first block among the lost blocks in the first stripe, the method further includes:
  • the determining the first block among the lost blocks in the first stripe includes:
  • any one of the lost blocks in the first stripe is determined as the first block.
  • before the determining the first block among the lost blocks in the first stripe, the method further includes:
  • the determining the first block among the lost blocks in the first stripe includes:
  • any one of the lost blocks in the first stripe is determined as the first block.
  • the method further includes:
  • before the determining the first block among the lost blocks in the first stripe, the method further includes:
  • the determining the first block among the lost blocks in the first stripe includes:
  • before the determining the first block among the lost blocks in the first stripe, the method further includes:
  • the first block among the lost blocks in the first stripe is determined from the at least one block.
  • before the reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
  • the first results returned by the storage units of the target number of storage devices in response to a third acquisition request are obtained.
  • before the reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
  • the at least one primary storage device is used to manage the target number of storage devices, and each target sum matrix is the sum of the first results returned by at least one storage device in response to a fourth acquisition request, the at least one storage device being managed by one primary storage device;
  • the reconstructing the first block according to the first result of the target number of storage devices and the first sub-matrix includes:
  • the first block is reconstructed.
  • the reconstructing the first block according to the at least one target sum matrix and the first sub-matrix includes:
  • the first target row of the target block matrix is determined as the reconstructed first block.
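  • The two-level summation described above can be sketched as follows, assuming (as is usual for erasure codes over GF(2^w), though not stated in this excerpt) that adding first results is a byte-wise XOR, and treating each first result as a single row for brevity. The group names and byte values are made up for illustration.

```python
def xor_sum(results):
    """Byte-wise XOR of equally sized results (addition in GF(2^w))."""
    acc = bytearray(len(results[0]))
    for result in results:
        for k, byte in enumerate(result):
            acc[k] ^= byte
    return bytes(acc)


# Hypothetical first results, grouped by the primary storage device
# that manages the storage devices which produced them.
groups = {
    "primary-1": [b"\x01\x02", b"\x10\x20"],
    "primary-2": [b"\x04\x08", b"\x40\x80"],
}

# Each primary storage device forwards one target sum for its group.
target_sums = {name: xor_sum(results) for name, results in groups.items()}

# The reconstructing device only has to combine one sum per group.
total = xor_sum(list(target_sums.values()))

# Same value as summing every first result directly.
flat_total = xor_sum([r for results in groups.values() for r in results])
```

  • Because the summation is associative, the reconstructing device receives one result per group instead of one per storage device, and the final value is unchanged.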
  • a method for data reconstruction which includes:
  • the first result is obtained
  • the first result is sent to the target device, and the target device reconstructs the first block among the lost blocks in the first stripe according to the first results of the target number of storage devices.
  • before the reading the effective blocks of the first stripe stored in the first storage device, the method further includes:
  • an acquisition request is received, where the acquisition request carries a second sub-matrix corresponding to the storage device, block information of the third block corresponding to each column in the second sub-matrix, and an identifier of the target device; the second sub-matrix includes the columns corresponding to at least one third block of the first stripe stored on the corresponding storage device, the second block is any one of the effective blocks of the first stripe, and the third block is any effective block other than the second block;
  • the calculating according to the at least one third block and the second sub-matrix to obtain the first result includes:
  • the second sub-matrix is multiplied by the block matrix to obtain the first result.
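  • A minimal sketch of this multiplication on the storage device side, assuming the matrix entries and block bytes are elements of GF(2^8) with reduction polynomial 0x11D (the excerpt does not name the field, so this is an assumption). The function names and the example coefficients are hypothetical. Each column of the second sub-matrix pairs with one third block, and the result has one block-sized row per row of the sub-matrix.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8); polynomial 0x11D is an assumption."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def first_result(sub_matrix, blocks):
    """Multiply a (rows x k) coefficient matrix by k equally sized blocks.

    Returns one bytes object per matrix row.
    """
    length = len(blocks[0])
    result = []
    for row in sub_matrix:
        acc = bytearray(length)
        for coeff, block in zip(row, blocks):
            for k, byte in enumerate(block):
                acc[k] ^= gf_mul(coeff, byte)
        result.append(bytes(acc))
    return result


# Hypothetical second sub-matrix with two columns, one per stored third block.
second_sub_matrix = [[1, 1], [2, 4]]
third_blocks = [b"\x01\x02", b"\x03\x04"]
result = first_result(second_sub_matrix, third_blocks)
```

  • However many third blocks the device holds, the result keeps the same number of rows as the sub-matrix, which is why sending it costs less than sending the blocks themselves once a device holds more blocks than the matrix has rows.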
  • the target device includes a control device, a target storage device, a first storage device, and at least one primary storage device.
  • the first storage device is any one of the target storage devices.
  • the at least one primary storage device is used to manage the target number of storage devices;
  • the acquisition request is a first acquisition request used to instruct sending the first result to the control device; or
  • the acquisition request is a second acquisition request used to instruct sending the first result to the target storage device; or
  • the acquisition request is a third acquisition request used to instruct sending the first result to the first storage device; or
  • the acquisition request is a fourth acquisition request used to instruct sending the first result to the primary storage device.
  • the method before the receiving the acquisition request, the method further includes:
  • a first reconstruction request is sent to the control device, where the first reconstruction request carries the storage medium identifier of the failed storage medium in the storage device.
  • the method before the receiving the acquisition request, the method further includes:
  • a second reconstruction request is sent to the control device, where the second reconstruction request carries block information of the at least one segment.
  • a method for data reconstruction which includes:
  • a target reconstruction request is received, where the target reconstruction request carries a first sub-matrix and is used to instruct reconstruction, according to the first sub-matrix, of the first block among the lost blocks in the first stripe; the first sub-matrix includes the column corresponding to the first block and the column corresponding to a second block, and the second block is any one of the effective blocks of the first stripe;
  • the first results returned by the target number of storage devices in response to a second acquisition request are acquired, where the second acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of the target storage device; the blocks in the first stripe are stored by the target number of storage devices; each first result is obtained by one of the target number of storage devices reading its stored effective blocks of the first stripe and calculating on the effective blocks read; and the third block is any effective block other than the second block;
  • the reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix includes:
  • the first target row of the target block matrix is used as the reconstructed first block.
  • the method further includes:
  • a reconstruction completion response is sent to the control device, where the reconstruction completion response is used to indicate that the reconstruction of the first block is completed.
  • before the storing the reconstructed first block, the method further includes:
  • the step of storing the reconstructed first block is performed; otherwise, the step of storing the reconstructed first block is not performed, and use of the first stripe is prohibited.
  • a block reconstruction device for performing the above-mentioned data reconstruction method.
  • the block reconstruction device includes a functional module for executing the data reconstruction method provided in the foregoing first aspect or any optional manner of the foregoing first aspect.
  • a block reconstruction device for performing the above-mentioned data reconstruction method.
  • the block reconstruction device includes a functional module for executing the data reconstruction method provided in the foregoing second aspect or any optional manner of the foregoing second aspect.
  • a block reconstruction device which is used to perform the above data reconstruction method.
  • the block reconstruction device includes a functional module for executing the data reconstruction method provided by the foregoing third aspect or any optional manner of the foregoing third aspect.
  • in a seventh aspect, a computer device is provided; the computer device includes a processor and a memory, at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the operations performed by the above-mentioned data reconstruction method.
  • a storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed by the above-mentioned data reconstruction method.
  • a method for data reconstruction in a storage system, where the storage system includes a control device and one or more storage devices, each of the one or more storage devices includes a hard disk, and the one or more storage devices are used to store the blocks of a stripe generated by the control device; the method includes:
  • a first storage device, among the one or more storage devices, that stores effective blocks of the stripe reads the stored effective blocks and calculates a first result according to the effective blocks read;
  • the control device receives the first result and restores the damaged blocks in the stripe according to the first result.
  • the stripe is generated according to an erasure coding algorithm.
  • the one or more storage devices are hard disk enclosures.
  • in a tenth aspect, a storage system is provided; the storage system includes a control device and one or more storage devices, each of the one or more storage devices includes a hard disk, and the one or more storage devices are used to store a stripe generated by the control device;
  • a first storage device, among the one or more storage devices, that stores effective blocks of the stripe is used to read the stored effective blocks, calculate a first result according to the effective blocks read, and send the first result to the control device;
  • the control device is configured to receive the first result and restore the damaged blocks in the stripe according to the first result.
  • the one or more storage devices are hard disk enclosures.
  • the stripe is generated according to an erasure coding algorithm.
  • a method for data reconstruction in a storage system, where the storage system includes a plurality of storage devices, each of the plurality of storage devices includes one or more hard disks, and the plurality of storage devices are used to store the blocks of a stripe; the method includes:
  • a first storage device, among the plurality of storage devices, that stores effective blocks of the stripe reads the stored effective blocks and calculates a first result according to the effective blocks read;
  • a second storage device of the plurality of storage devices restores the damaged blocks in the stripe according to the first result.
  • the second storage device is a primary storage device of the plurality of storage devices, or is the storage device, among the plurality of storage devices, that stores the damaged block of the stripe.
  • the method also includes:
  • the second storage device receives the first result.
  • the method further includes:
  • the first storage device includes the second storage device
  • the second storage device receives the first result sent by the other storage device.
  • in a twelfth aspect, a storage system is provided; the storage system includes a plurality of storage devices, each of the plurality of storage devices includes one or more hard disks, and the plurality of storage devices are used to store the blocks of a stripe;
  • a first storage device, among the plurality of storage devices, that stores effective blocks of the stripe is used to read the stored effective blocks and calculate a first result according to the effective blocks read;
  • a second storage device of the plurality of storage devices is used to restore the damaged blocks in the stripe according to the first result.
  • the second storage device is a primary storage device of the plurality of storage devices, or is the storage device, among the plurality of storage devices, that stores the damaged block of the stripe.
  • the first storage device is configured to send the first result to the second storage device, where the first storage device does not include the second storage device;
  • the second storage device is used to receive the first result.
  • Figure 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for data reconstruction provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 7 is a flowchart of a method for data reconstruction provided by an embodiment of the present invention.
  • FIG. 8 is a flowchart of a method for data reconstruction provided by an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 10 is a flowchart of a method for data reconstruction provided by an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 12 is a flowchart of a method for data reconstruction provided by an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention.
  • FIG. 16 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention.
  • Figure 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention.
  • the implementation environment includes a control device 101, at least one storage device 102, and a target storage device 103; the control device 101, the storage device 102, and the target storage device 103 can be connected by optical fiber or cable.
  • the control device 101 is used to write data to the storage device 102 and the target storage device 103.
  • the control device 101 can also be used to read data from the storage device 102 or the target storage device 103.
  • the control device 101 can also be used to reconstruct the lost blocks in the storage device 102.
  • the storage device 102 is used to store data and output the stored data to the control device.
  • the storage device 102 is also used to send intermediate result data of the block reconstruction process to the control device 101 or the target storage device 103, so that the control device 101 or the target storage device 103 can directly reconstruct the lost block according to the intermediate result data sent by the at least one storage device 102.
  • the target storage device 103 is configured to store the reconstructed blocks, and the target storage device 103 is further configured to reconstruct the lost blocks according to the intermediate result data sent by the at least one storage device 102. It should be noted that the target storage device 103 may be any storage device in the at least one storage device 102, or may be a storage device other than the at least one storage device 102.
  • Both the storage device 102 and the target storage device 103 may also include at least one storage medium, and each storage medium is used to store data written by the control device.
  • the control device 101, the storage device 102, and the target storage device 103 may also include an input/output (I/O) unit, and the I/O unit is used to send or receive messages.
  • the control device 101, the storage device 102, and the target storage device 103 may each include a storage unit and a control unit.
  • the control unit may have the functions of the control device 101 described above, and the storage unit may have the functions of the storage device 102 or the target storage device 103 described above; that is, the control device 101, the storage device 102, and the target storage device 103 are each a device that includes a storage unit and a control unit, and such a device has the functions of a control device, a storage device, and a target storage device.
  • the at least one storage device that includes a storage unit and a control unit may be divided into multiple groups, and each group may have a primary storage device.
  • the primary storage device in any group is used to manage the storage devices in that group. Each storage device can send the intermediate results of the block reconstruction process to the primary storage device, and the primary storage device can sum the received intermediate results to obtain a sum value and send the sum value to the block reconstruction device; the block reconstruction device restores the lost block according to the received sum value.
  • the primary storage device may be any storage device in a group, or a device other than the storage devices in that group.
  • for example, a disk enclosure is treated as a group, and the disk enclosure includes a primary storage device and at least one storage device with a storage unit and a control unit; or an availability zone (AZ) is treated as a group, and the AZ includes a primary storage device and at least one storage device with a storage unit and a control unit.
  • FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present invention.
  • the configuration and performance of the computer device 200 may vary considerably; the computer device 200 may include one or more processors (central processing units, CPUs) 201 and one or more memories 202, where at least one instruction is stored in the memory 202, and the at least one instruction is loaded and executed by the processor 201 to implement the methods provided in the following method embodiments.
  • the computer device 200 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output.
  • the computer device 200 may also include other components for implementing device functions, which will not be repeated here.
  • a computer-readable storage medium such as a memory including instructions, which may be executed by a processor in a computer device to complete the data reconstruction method in the following embodiments.
  • the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
  • the control device can assign a stripe identifier to the stripe to distinguish different stripes; the stripe identifier can be the number of the stripe.
  • the control device can also allocate block information, a block number, and a node identifier for each block on the stripe.
  • the node identifier is used to indicate the storage device storing the block.
  • the block information of any block can be the storage address of that block.
  • the storage address of any block may include the Internet Protocol (IP) address of the storage device storing the block and the offset address of the block in the storage medium of that storage device.
  • the storage address can also be a World Wide Name (WWN).
  • the block information of any block may also include the medium identifier of the storage medium storing the block, and the medium identifier may be the number of the storage medium.
  • the embodiment of the present invention does not specifically limit the strip identifier, block information, the node identifier, and the medium identifier.
  • the control device can store the stripe in association with the information allocated to each block of the stripe, so that the control device can later determine from the stored information which block of which stripe is lost and on which storage medium of which storage device the lost block was stored.
  • the control device may store the information allocated for the strip and each block of the strip in the allocation table, so as to implement associative storage.
  • the control device can store the segment identifier of each segment in the association table.
  • the segment identifier is either a data identifier or a check identifier, where the data identifier indicates that the segment is a data block, and the check identifier indicates that the segment is a check block.
  • the control device can store each block according to the information corresponding to each block in the allocation table.
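  • The associative storage described above can be pictured as a small lookup table. The following is a minimal sketch only: the record layout, the field names, and the helper `lost_blocks_on_medium` are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockRecord:
    stripe_id: int   # stripe identifier (here: the stripe number)
    block_no: int    # block number within the stripe
    node_id: str     # identifies the storage device holding the block
    medium_id: str   # identifies the storage medium inside that device
    address: str     # storage address (illustrative "IP:offset" string)
    is_check: bool   # False = data block, True = check block

def lost_blocks_on_medium(table, node_id, medium_id):
    """Return the records of all blocks stored on one (failed) medium."""
    return [r for r in table
            if r.node_id == node_id and r.medium_id == medium_id]

# Hypothetical table contents, for illustration only.
table = [
    BlockRecord(1, 1, "dev1", "disk0", "10.0.0.1:0x0000", False),
    BlockRecord(1, 2, "dev1", "disk1", "10.0.0.1:0x1000", False),
    BlockRecord(1, 3, "dev2", "disk0", "10.0.0.2:0x0000", True),
    BlockRecord(2, 1, "dev1", "disk0", "10.0.0.1:0x2000", False),
]
lost = lost_blocks_on_medium(table, "dev1", "disk0")
print([(r.stripe_id, r.block_no) for r in lost])
```

  • With such a table, a medium identifier is enough to find every stripe and block affected by a medium failure.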
  • for example, stripe 1 includes blocks 1 to p, where p is an integer greater than 1; block 1 is a data block to be stored on storage device 1, so the control device may store block 1 in storage device 1.
  • block p is a check block to be stored on storage device 2, so the control device may store block p in storage device 2.
  • the storage device performs an erasure coding (EC) calculation or a redundant array of independent disks (RAID) calculation on the data blocks of the stripe and the check matrix to generate at least one check block, where the check matrix H is a 2 × (N+M) matrix.
  • the strip includes N data blocks and M check blocks.
  • the j-th column of the check matrix H corresponds to the j-th block of the stripe, that is, each column of the check matrix corresponds to one block of the stripe, where N, M, and j are all positive integers.
  • the control device saves the computer program on the local storage medium and executes the computer program on the processor of the control device.
  • the control device divides the original data into 23 blocks, that is, 23 data blocks, and arranges the 23 data blocks in a certain order; it then performs the EC/RAID calculation on the 23 data blocks to generate the parity data (2 check blocks).
  • control device can use the check matrix equation (1) for calculation.
  • the check matrix H is stored in a computer program, and $x^{T}$ is the transpose of the stripe matrix x.
  • the check matrix H can be represented by formula (2)
  • the stripe matrix x can be represented by formula (3).
  • $a_{1,j}$ represents the element in the first row and the j-th column of the check matrix H
  • $a_{2,j}$ represents the element in the second row and the j-th column of the check matrix H
  • $x_j$ represents the j-th block in the stripe
  • $x_1$ to $x_{23}$ are data blocks
  • $x_{24}$ and $x_{25}$ are check blocks.
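  • Formulas (1)-(3) themselves are not reproduced in this text. A form consistent with the symbol definitions above and with the later reconstruction step (which negates the product of an inverse sub-matrix and a summation matrix) would be the following; it is a reconstruction under that assumption, not a quotation of the original formulas.

```latex
H \cdot x^{T} = 0 \tag{1}

H = \begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,25} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,25}
\end{pmatrix} \tag{2}

x = (x_1, x_2, \ldots, x_{25}) \tag{3}
```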
  • each column of the check matrix H corresponds to one element of the stripe matrix x, that is, $x_j$ corresponds to $H_j$, where $H_j$ is the j-th column of the check matrix H. Since the data blocks $x_1$ to $x_{23}$ are the original data and are therefore known, and the check matrix H is also known, the control device can solve for the unknown check blocks $x_{24}$ and $x_{25}$ according to formulas (1)-(3). The specific solution process is as follows.
  • the control device splits the check matrix H into two sub-matrices $H_N$ and $H_M$.
  • the control device can obtain formula (4) according to formula (1):
  • the control device can obtain the check blocks $x_{24}$ and $x_{25}$ according to formula (4):
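  • Formula (4) and the resulting expression are not reproduced in this text. Assuming that $H_N$ holds the 23 columns of H that correspond to the data blocks and $H_M$ holds the 2 columns that correspond to the check blocks, and assuming formula (1) has the form $H \cdot x^{T} = 0$, the step can be sketched as:

```latex
% x_N = (x_1, ..., x_23) are the data blocks, x_M = (x_24, x_25) the check blocks.
H_N \cdot x_N^{T} + H_M \cdot x_M^{T} = 0 \tag{4}

% H_M is a 2 x 2 matrix and must be invertible.
\begin{pmatrix} x_{24} \\ x_{25} \end{pmatrix}
  = x_M^{T}
  = -\,H_M^{-1} \cdot H_N \cdot x_N^{T}
```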
  • the control device can send the data blocks $x_1$ to $x_{23}$ and the check blocks $x_{24}$ and $x_{25}$ of the stripe to at least one storage device through the IO unit; the at least one storage device can access multiple storage media (disks, solid state drives, etc.) through its IO unit and writes the data blocks or check blocks into the storage media.
  • the foregoing process of generating strips is also a process of generating strips according to an erasure code algorithm.
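  • The stripe generation described above can be tried out numerically. The sketch below rests on several assumptions that are not in the embodiment: arithmetic is done modulo the prime 257 instead of in the Galois field a real EC code would use, each block is a single symbol, the stripe has 4 data blocks instead of 23, and the check matrix (a row of ones over a row 1..n) and the function names are hypothetical.

```python
P = 257  # assumed prime modulus; production EC codes typically use GF(2^8)

def check_matrix(n_cols):
    """2 x n_cols check matrix: row 1 is all ones, row 2 is 1..n_cols."""
    return [[1] * n_cols, list(range(1, n_cols + 1))]

def solve_2x2(A, b):
    """Solve A*y = b (mod P) for a 2x2 matrix A using Cramer's rule."""
    det = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P
    inv = pow(det, -1, P)  # modular inverse (Python 3.8+)
    y0 = (b[0] * A[1][1] - A[0][1] * b[1]) * inv % P
    y1 = (A[0][0] * b[1] - b[0] * A[1][0]) * inv % P
    return [y0, y1]

def encode(data):
    """Return the stripe: the data symbols followed by 2 check symbols."""
    n = len(data)
    H = check_matrix(n + 2)
    # Contribution of the known data columns: H_N * x_N^T
    s = [sum(H[r][j] * data[j] for j in range(n)) % P for r in range(2)]
    # Solve H_M * x_M^T = -(H_N * x_N^T) for the two check symbols
    H_M = [[H[r][n], H[r][n + 1]] for r in range(2)]
    checks = solve_2x2(H_M, [(-v) % P for v in s])
    return data + checks

stripe = encode([10, 20, 30, 40])
H = check_matrix(len(stripe))
syndrome = [sum(h * x for h, x in zip(row, stripe)) % P for row in H]
print(stripe, syndrome)
```

  • The syndrome of a correctly generated stripe is zero in every row, which is the property that the later reconstruction steps rely on.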
  • the storage medium in the storage device may lose its stored blocks.
  • the control device needs to reconstruct the lost blocks.
  • similar to the above process of solving for the check blocks, when any block in the stripe is lost, the lost block can also be solved from the other blocks of the stripe.
  • the above solution process can be decomposed into multiple sub-processes, and the storage devices complete the calculation of the sub-processes. Each storage device then sends the calculation result of its sub-process to the control device, and the control device finally reconstructs the lost block according to the received calculation results.
  • the lost block is also called a damaged block, that is, a block that cannot be read. Solving the lost block is to restore the damaged block.
  • the storage device can calculate an intermediate result from the effective blocks of the stripe that it stores and then send the intermediate result to the control device, and the control device recovers the damaged block according to the intermediate result.
  • the storage device calculates an intermediate result according to the effective blocks in the strips stored by the storage device, and the storage device storing the damaged block restores the damaged block according to the intermediate result.
  • when the storage device storing the damaged block also stores valid blocks, the other storage devices storing valid blocks send their calculated intermediate results to the storage device storing the damaged block, and that storage device recovers the damaged block based on these intermediate results.
  • when the storage device storing the damaged block does not store any valid block, the storage devices storing the valid blocks send their intermediate results to the storage device storing the damaged block, and that storage device recovers the damaged block according to the intermediate results.
  • the storage device calculates the intermediate result according to the effective blocks in the strips stored by the storage device, and the main storage device in the storage device restores the damaged blocks according to the intermediate result.
  • an effective block refers to a block that can be read or an undamaged block in a stripe.
  • an embodiment of the present invention provides a method for data reconstruction in a storage system
  • the storage system includes a control device and one or more storage devices, each of which includes a hard disk; the one or more storage devices are used to store the blocks of a stripe generated by the control device; the method includes:
  • a first storage device, among the one or more storage devices, that stores effective blocks of the stripe reads the stored effective blocks and calculates a first result according to the effective blocks read;
  • the control device receives the first result and restores damaged blocks in the strip according to the first result.
  • the stripe is generated according to an erasure coding algorithm.
  • the one or more storage devices are hard disk enclosures.
  • Another embodiment of the present invention provides a method for data reconstruction in a storage system, the storage system includes a plurality of storage devices, and each storage device of the plurality of storage devices includes one or more hard disks, The multiple storage devices are used to store blocks of a strip; the method includes:
  • a first storage device, among the plurality of storage devices, that stores effective blocks of the stripe reads the stored effective blocks and calculates a first result according to the effective blocks read;
  • the second storage device of the plurality of storage devices restores the damaged blocks in the stripe according to the first result.
  • the second storage device is a primary storage device among the multiple storage devices, or is the storage device among them that stores the damaged block of the stripe.
  • the method further includes:
  • the second storage device receives the first result.
  • the storage devices among the first storage devices other than the second storage device send the first result to the second storage device, where the first storage devices include the second storage device;
  • the second storage device receives the first result sent by the other storage device.
  • the multiple storage devices are hard disk enclosures.
  • the process of reconstructing the lost blocks is illustrated by the flow chart of a data reconstruction method provided in an embodiment of the present invention, shown in FIG. 3.
  • the process of the method specifically includes:
  • the storage device that lost the block sends a reconstruction request to the control device.
  • the reconstruction request is used to instruct the reconstruction of the lost blocks, and may be a first reconstruction request or a second reconstruction request. The first reconstruction request carries the medium identifier of a failed storage medium: when any storage medium in the storage device fails, the storage device can send a first reconstruction request carrying the medium identifier of that storage medium to the control device. Therefore, before step 301, the storage device can query whether a storage medium in the storage device has failed; when a storage medium in the storage device has failed, the storage device sends the first reconstruction request to the control device.
  • the second reconstruction request carries the block information of at least one block lost by the storage medium in the storage device.
  • the storage device can send the second reconstruction request to the control device. Therefore, before step 301, the storage device can also query whether a storage medium in the storage device has lost blocks; when any storage medium in the storage device has lost at least one block, the storage device sends a second reconstruction request to the control device.
  • the control device determines, according to the reconstruction request, the first block in the lost blocks in the first strip, and the blocks in the first strip are stored by the target number of storage devices.
  • the target number is the number of storage devices used to store the blocks of the first strip, and the target number may be one or more.
  • the embodiment of the present invention does not specifically limit the target number.
  • the first strip is any strip with missing blocks.
  • the control device can implement step 302 through the process shown in Manner 1 or Manner 2.
  • Step 21 The control device determines at least one second stripe according to the storage medium identifier carried in the first reconstruction request.
  • the at least one second stripe is a stripe that has lost blocks, and the blocks stored on the storage medium indicated by the storage medium identifier are blocks of the at least one second stripe. Therefore, the control device can determine the at least one second stripe according to the storage medium identifier carried in the first reconstruction request. In a possible implementation, the control device may look up, in the association table, the at least one second stripe corresponding to the storage medium identifier.
  • Step 22 The control device determines any one of the at least one second stripe as the first stripe.
  • the control device may randomly select a second stripe from the at least one second stripe as the first stripe. If the stripe identifier is a number, the control device can also select the second stripe with the largest or smallest number as the first stripe.
  • the embodiment of the present invention does not specifically limit the manner of selecting the first strip from the at least one second strip.
  • Step 23 The control device determines any one of the missing blocks in the first strip as the first block according to the storage medium identifier.
  • the control device may first select, from the association table, at least one fourth block corresponding to both the medium identifier of the storage medium and the stripe identifier of the first stripe; the at least one fourth block is the set of blocks of the first stripe that the storage medium has lost. The control device then selects any one block from the at least one fourth block as the first block. The manner in which the control device selects the first block from the at least one fourth block is the same as the manner in which the first stripe is selected from the at least one second stripe in step 22, and is not repeated here.
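  • Steps 21-23 amount to a lookup followed by two selections. The sketch below assumes a hypothetical association table made of `(stripe_id, block_no, medium_id)` tuples and uses the "smallest number" rule for both selections; the embodiment equally allows a random choice or the largest number.

```python
def pick_first_block(table, medium_id):
    """Return (first_stripe, first_block) for a failed medium, or None."""
    # Step 21: stripes and blocks that had data on the failed medium
    lost = [(s, b) for (s, b, m) in table if m == medium_id]
    if not lost:
        return None
    # Step 22: choose one affected stripe (here: the smallest stripe number)
    first_stripe = min(s for s, _ in lost)
    # Step 23: choose one lost block of that stripe (here: the smallest number)
    candidates = [b for s, b in lost if s == first_stripe]
    return first_stripe, min(candidates)

# Hypothetical table rows: (stripe_id, block_no, medium_id)
table = [(7, 3, "m1"), (7, 9, "m1"), (4, 2, "m1"), (4, 5, "m2")]
choice = pick_first_block(table, "m1")
print(choice)
```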
  • Step 2A The control device determines at least one second stripe according to the block information of the at least one block carried in the second reconstruction request.
  • the control device may determine, from the association table, at least one second stripe corresponding to the block information of the at least one block.
  • Step 2B The control device determines any one of the at least one second strip as the first strip.
  • the implementation of step 2B is the same as that of step 22 and is not repeated here.
  • Step 2C The control device determines any one of the lost blocks in the first strip as the first block according to the block information of the at least one missing block.
  • the control device can first select, from the association table, at least one fourth block corresponding to both the block information of the at least one lost block and the stripe identifier of the first stripe, and then select one block from the at least one fourth block as the first block.
  • determining the first block from the at least one fourth block is described in step 23 and is not repeated here.
  • the control device splits the check matrix of the first stripe into a first sub-matrix and a target number of second sub-matrices, where the first sub-matrix includes the column corresponding to the first block and the column corresponding to a second block; each second sub-matrix includes the columns corresponding to at least one third block of the first stripe stored on one storage device; the second block is any one of the effective blocks of the first stripe, and a third block is any effective block other than the second block.
  • the control device may decompose the check matrix of the first strip through step 303.
  • the control device can split out the columns of the check matrix corresponding to the first block and the second block to form the first sub-matrix, which is the matrix to be solved, and split out the columns corresponding to the third blocks to form a parameter matrix.
  • the control device then splits the parameter matrix into the target number of second sub-matrices according to the third blocks stored by each of the target number of storage devices, where each second sub-matrix corresponds to one storage device and each column of a second sub-matrix corresponds to one third block stored in that storage device.
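  • The column split can be sketched in a few lines. The function name, the dictionary describing which device stores which blocks, and the concrete check matrix values below are hypothetical; only the splitting rule (unknown columns into the first sub-matrix, remaining columns grouped per storage device) follows the description.

```python
def split_check_matrix(H, unknown_cols, placement):
    """
    H            : check matrix as a list of rows
    unknown_cols : 1-based columns of the first and second block
    placement    : dict mapping a device name to the 1-based block numbers it stores
    Returns (first_sub, {device: (cols, second_sub)}).
    """
    def take(cols):
        return [[row[c - 1] for c in cols] for row in H]

    first_sub = take(unknown_cols)
    second = {}
    for dev, blocks in placement.items():
        cols = [b for b in blocks if b not in unknown_cols]  # third blocks only
        if cols:
            second[dev] = (cols, take(cols))
    return first_sub, second

# Illustrative 2 x 25 check matrix and block placement (assumed values).
H = [[1] * 25, list(range(1, 26))]
placement = {"dev1": list(range(1, 13)), "dev2": list(range(13, 26))}
H3, subs = split_check_matrix(H, [1, 25], placement)
print(H3, subs["dev1"][0], subs["dev2"][0])
```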
  • the following description takes as an example a first stripe that includes blocks 1-25, where blocks 1-23 are data blocks and blocks 24-25 are check blocks, the first block is block 1, the second block is block 25, and blocks 2-24 are the third blocks.
  • each hard disk enclosure is equivalent to a storage device, that is, the target number is 2.
  • the control device and hard disk enclosure include CPU, memory, and IO units, and the hard disk enclosure also includes storage media for storing data
  • the IO unit in the hard disk enclosure can also be replaced with a network interface unit (the network interface unit is referred to as network in Figure 4), where the network interface unit has the same function as the IO unit.
  • the control device groups the blocks of the first stripe by storage device.
  • the control device can use block 1 (the first block) and block 25 (the second block) as the blocks to be repaired, that is, the blocks to be solved. Therefore, blocks 2-12 stored in storage device 1 are the third blocks of the first stripe stored in storage device 1, and blocks 13-24 stored in storage device 2 are the third blocks of the first stripe stored in storage device 2.
  • the control device can split out columns 2-12 of the check matrix, which correspond to blocks 2-12, to form the second sub-matrix $H_1$ corresponding to storage device 1, and split out columns 13-24, which correspond to blocks 13-24, to form the second sub-matrix $H_2$ corresponding to storage device 2.
  • columns 1 and 25 of the check matrix H, which correspond to blocks 1 and 25, are split out to form the first sub-matrix $H_3$, where $H_1$-$H_3$ can be expressed as:
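  • The expressions for the three sub-matrices are not reproduced in this text. Written with the check matrix elements $a_{i,j}$ defined earlier, and following the column assignment just described, they would presumably take the following form:

```latex
H_1 = \begin{pmatrix}
a_{1,2} & a_{1,3} & \cdots & a_{1,12} \\
a_{2,2} & a_{2,3} & \cdots & a_{2,12}
\end{pmatrix},
\quad
H_2 = \begin{pmatrix}
a_{1,13} & a_{1,14} & \cdots & a_{1,24} \\
a_{2,13} & a_{2,14} & \cdots & a_{2,24}
\end{pmatrix},
\quad
H_3 = \begin{pmatrix}
a_{1,1} & a_{1,25} \\
a_{2,1} & a_{2,25}
\end{pmatrix}
```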
  • the target number is equal to 1
  • the control device and hard disk enclosure include CPU, memory, and IO units.
  • the hard disk enclosure also includes storage media for storing data.
  • the IO unit in the hard disk enclosure can also be replaced with a network interface unit (the network interface unit is referred to as network in Figure 4), where the network interface unit has the same function as the IO unit.
  • the storage device (hard disk enclosure) 3 stores the blocks 1-25 on the first strip.
  • the control device can use block 1 (the first block) and block 25 (the second block) as the blocks to be repaired, that is, the blocks to be solved.
  • therefore, blocks 2-24 stored in storage device 3 are the third blocks of the first stripe stored in storage device 3; the control device can split out columns 2-24 of the check matrix, which correspond to blocks 2-24, to form the second sub-matrix $H_4$ corresponding to storage device 3, and split out columns 1 and 25 of the check matrix H, which correspond to blocks 1 and 25, to form the first sub-matrix $H_3$, where $H_4$ can be expressed as:
  • the control device sends a first acquisition request to the storage device corresponding to each second sub-matrix, where the first acquisition request carries the second sub-matrix corresponding to that storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the control device.
  • each first acquisition request instructs the storage device to send a first result to the control device
  • the first acquisition request is one type of acquisition request; the acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the control device.
  • the acquisition request is the fourth acquisition request.
  • the target device is also the control device .
  • the control device can send each second sub-matrix to its corresponding storage device, and the storage device completes the sub-process (the process of solving the first result shown in steps 306-307).
  • the control device can also send the block information of the third blocks corresponding to the second sub-matrix to the storage device, so that the storage device can obtain, based on that block information, the third blocks corresponding to the received second sub-matrix. Therefore, the control device may combine a second sub-matrix and the block information of the third blocks corresponding to that second sub-matrix into a first acquisition request and send it to the corresponding storage device.
  • either the control device or the target storage device may complete the reconstruction of the first block; whichever device completes the reconstruction needs to obtain the calculation results of the target number of storage devices. Therefore, when the identifier of the target device carried in the acquisition request is the identifier of the control device, the acquisition request is the first acquisition request, which instructs the storage device to send the first result to the control device. The first result is obtained by the storage device reading its stored effective blocks of the first stripe and calculating on the effective blocks read, that is, it is the calculation result of the storage device.
  • the control device in FIG. 4 sends the first acquisition request 1 to the storage device 1 through the IO unit.
  • the first acquisition request 1 carries the second sub-matrix $H_1$ and the block information of blocks 2-12; the control device also sends the first acquisition request 2 to storage device 2 through the IO unit, and the first acquisition request 2 carries the second sub-matrix $H_2$ and the block information of blocks 13-24.
  • the control device in FIG. 5 sends the first acquisition request 3 to the storage device 3 through the IO unit, and the first acquisition request 3 carries the second sub-matrix $H_4$ and the block information of blocks 2-24.
  • the storage device receives the first acquisition request.
  • the storage device is any storage device among the target number of storage devices.
  • the target number of storage devices will all receive a first acquisition request.
  • storage devices 1 and 2 in FIG. 4 receive first acquisition requests 1 and 2 through IO units, respectively.
  • the storage device 3 in FIG. 5 receives the first acquisition request 3 through the IO unit.
  • This step 305 is also a step in which the storage device receives the acquisition request.
  • the storage device reads the at least one third block according to the block information of the third block corresponding to each column in the second sub-matrix.
  • the storage device can determine the storage location of the third block corresponding to each column of the second sub-matrix according to the block information of that third block, and read the at least one third block from those storage locations.
  • the storage device 1 in FIG. 4 can read blocks 2-12 through the IO unit according to the block information of blocks 2-12 in the first acquisition request 1, and the storage device 2 can read blocks 13-24 through the IO unit according to the block information of blocks 13-24 in the first acquisition request 2.
  • the storage device 3 in FIG. 5 can read blocks 2-24 through the IO unit according to the block information of blocks 2-24 in the first acquisition request 3.
  • the third blocks read are also the effective blocks read. It should be noted that the process shown in step 306 is the process of reading the at least one third block based on the block information of the third block corresponding to each column in the second sub-matrix.
  • the storage device calculates according to the at least one third block and the second sub-matrix to obtain the first result
  • the first result can be used to represent the characteristics of the third block corresponding to each column in the second sub-matrix.
  • the storage device may form the at least one third block into a block matrix, each row of the block matrix is a third block; the second sub-matrix is multiplied by the block matrix to obtain the first result .
  • the process shown in this step 307 is also a process of obtaining the first result based on the read effective block calculation.
  • the CPU in storage device 1 in FIG. 4 forms blocks 2-12 into a block matrix $P_1$ and multiplies $H_1$ by $P_1$ to obtain the first result $Q_1$
  • the CPU in storage device 2 forms blocks 13-24 into a block matrix $P_2$ and multiplies $H_2$ by $P_2$ to obtain the first result $Q_2$
  • $P_1 = (x_2, x_3, \ldots, x_{12})^{T}$
  • $Q_1 = H_1 \cdot P_1$
  • $P_2 = (x_{13}, x_{14}, \ldots, x_{24})^{T}$
  • $Q_2 = H_2 \cdot P_2$.
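  • The calculation performed on each storage device is a single matrix product. The sketch below assumes arithmetic modulo the prime 257 rather than the field used by an actual EC implementation, and the function name and the small sample values are hypothetical. Each row of the block matrix is one valid block, as described above.

```python
P = 257  # assumed prime modulus for illustration

def partial_result(H_sub, blocks):
    """
    Compute Q = H_sub * P_blocks (mod P).
    H_sub  : second sub-matrix, one column per valid block
    blocks : block matrix, blocks[i] is the i-th valid block (a list of symbols)
    """
    width = len(blocks[0])
    return [[sum(row[i] * blocks[i][k] for i in range(len(blocks))) % P
             for k in range(width)]
            for row in H_sub]

# Three valid blocks of two symbols each, and a matching 2 x 3 sub-matrix.
H1 = [[1, 1, 1], [2, 3, 4]]
blocks = [[5, 6], [7, 8], [9, 10]]
Q1 = partial_result(H1, blocks)
print(Q1)
```

  • The result has as many rows as the check matrix and as many columns as a block has symbols, so its size does not grow with the number of blocks the device holds; this is why only the intermediate result needs to be sent.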
  • the storage device sends the first result to the control device.
  • since the first acquisition request carries the identifier of the control device, the storage device sends the first result to the control device. It should be noted that, since the storage device is any storage device among the target number of storage devices, all the target number of storage devices need to perform the process shown in steps 305-308.
  • the control device acquires the first results returned by the target number of storage devices in response to the first acquisition requests.
  • the control device can receive the target number of first results, each of which is obtained by one of the target number of storage devices reading its stored effective blocks of the first stripe and calculating on the effective blocks read.
  • control device in FIG. 4 may obtain the first result Q 1 sent by the storage device 1 and the first result Q 2 sent by the storage device 2 .
  • control device in FIG. 5 may obtain the first result Q 3 sent by the storage device 3 .
  • the control device reconstructs the first block according to the first result of the target number of storage devices and the first sub-matrix.
  • in addition to reconstructing the first block, the control device may also reconstruct the second block according to the first results of the target number of storage devices and the first sub-matrix.
  • the control device may sum the first results of the target number of storage devices to obtain a summation matrix, and obtain a target block matrix based on the inverse matrix of the first sub-matrix and the summation matrix. The control device determines the first target row of the target block matrix as the reconstructed first block and the second target row of the target block matrix as the reconstructed second block. Here, the first target column of the transpose of the target block matrix is the first target row of the target block matrix, the second target column is the second target row, the first target column corresponds to the first block, and the second target column corresponds to the second block.
  • obtaining the target block matrix may consist of multiplying the inverse matrix of the first sub-matrix by the summation matrix and negating the product. It should be noted that the process shown in this step 310 is also a process of reconstructing the first block according to the first results of the target number of storage devices.
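  • For the two-device example, the calculation of the target block matrix can be written out as follows. This is a sketch that assumes the check equation has the form $H \cdot x^{T} = 0$, which is consistent with the negation described above; $Q_1$ and $Q_2$ are the first results and $H_3$ is the first sub-matrix.

```latex
% Splitting the check equation by columns:
H_3 \begin{pmatrix} x_1 \\ x_{25} \end{pmatrix} + H_1 P_1 + H_2 P_2 = 0

% Substituting the first results Q_1 = H_1 P_1 and Q_2 = H_2 P_2:
\begin{pmatrix} x_1 \\ x_{25} \end{pmatrix}
  = -\,H_3^{-1} \left( Q_1 + Q_2 \right)
```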
  • $x_1$ calculated by the control device is the reconstructed first block
  • $x_{25}$ is the reconstructed second block
  • when only one first result is received (the target number is 1), the control device does not need to calculate the summation matrix and directly obtains the target block matrix based on the inverse matrix of the first sub-matrix and the first result.
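  • The whole exchange, from the partial results on two storage devices to the reconstruction on the control device, can be checked end to end with a small simulation. The sketch makes assumptions that are not in the embodiment: arithmetic modulo the prime 257, blocks of a single symbol, and a hypothetical check matrix (a row of ones over a row 1..25); all function names are illustrative.

```python
P = 257  # assumed prime modulus for illustration

def solve_2x2(A, b):
    """Solve A*y = b (mod P) for a 2x2 matrix A using Cramer's rule."""
    det = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P
    inv = pow(det, -1, P)
    return [(b[0] * A[1][1] - A[0][1] * b[1]) * inv % P,
            (A[0][0] * b[1] - b[0] * A[1][0]) * inv % P]

def columns(H, cols):
    """Sub-matrix of H made of the given 1-based columns."""
    return [[row[c - 1] for c in cols] for row in H]

def mat_vec(M, v):
    return [sum(a * x for a, x in zip(row, v)) % P for row in M]

H = [[1] * 25, list(range(1, 26))]

# Build a valid stripe: 23 data symbols, then check symbols x24 and x25.
data = list(range(100, 123))
s = mat_vec(columns(H, range(1, 24)), data)
x24, x25 = solve_2x2(columns(H, [24, 25]), [(-v) % P for v in s])
stripe = data + [x24, x25]

# Storage side: each device multiplies its sub-matrix by its valid blocks.
dev1_cols, dev2_cols = list(range(2, 13)), list(range(13, 25))
Q1 = mat_vec(columns(H, dev1_cols), [stripe[c - 1] for c in dev1_cols])
Q2 = mat_vec(columns(H, dev2_cols), [stripe[c - 1] for c in dev2_cols])

# Control side: sum the first results, then solve H3 * (x1, x25)^T = -(Q1 + Q2).
total = [(a + b) % P for a, b in zip(Q1, Q2)]
x1_rec, x25_rec = solve_2x2(columns(H, [1, 25]), [(-t) % P for t in total])
print(x1_rec == stripe[0], x25_rec == stripe[24])
```

  • Only `Q1` and `Q2` cross the link between the storage devices and the control device, instead of the 23 valid blocks themselves.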
  • the control device sends a write request to the target storage device, where the write request carries the reconstructed first block and block information of the reconstructed first block.
  • some stripes may suffer a silent error, in which the calculation process is correct but the final calculation result is incorrect. If a silent error occurs in the first stripe, the content of the reconstructed first block may be inconsistent with the original content of the first block. Since the first block has been lost, the control device cannot directly determine whether the content of the reconstructed first block is consistent with the original content of the first block. Instead, the control device can determine whether a silent error has occurred in the first stripe by checking whether the content of the stored second block is consistent with the content of the reconstructed second block.
  • the control device obtains the second block from the storage device storing the second block; if the reconstructed second block is the same as the obtained second block, the control device performs the step of sending a write request to the target storage device; otherwise, the control device does not perform that step and prohibits the first stripe from being used.
  • the process by which the control device obtains the second block from the storage device storing the second block may be as follows: the control device sends a read request carrying the block information of the second block to the storage device storing the second block; after receiving the read request, that storage device sends the second block to the control device according to the block information of the second block, so that the control device receives the second block.
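  • The silent-error check is a comparison between the reconstructed second block and the second block actually read from storage. A minimal sketch of that decision follows; the function name and the returned action labels are hypothetical.

```python
def commit_or_isolate(rebuilt_first, rebuilt_second, stored_second):
    """
    Decide what to do after reconstruction.
    Returns ("write", block) when the second block matches, meaning the
    reconstructed first block can be written to the target storage device;
    returns ("isolate", None) when a silent error is suspected.
    """
    if rebuilt_second == stored_second:
        return ("write", rebuilt_first)
    return ("isolate", None)

ok = commit_or_isolate(b"block-1", b"block-25", b"block-25")
bad = commit_or_isolate(b"block-1", b"block-25", b"corrupted")
print(ok, bad)
```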
  • FIG. 6 shows a schematic diagram of the block reconstruction process provided by the embodiment of the present invention.
  • the block reconstruction process shown in FIG. 6 may include the following steps 61-66.
  • Step 61 The control device initiates reconstruction and scans the faulty block (block 1).
  • step 61 The process shown in this step 61 is also the process shown in step 302.
  • Step 62 The control device groups the blocks of the stripe where block 1 is located by enclosure. Assuming that blocks 1-12 are on storage device 1, blocks 13-25 are on storage device 2, and block 1 and block 25 need to be repaired, the control device splits the check matrix H into three sub-matrices ($H_1$, $H_2$, $H_3$) and sends the first acquisition requests to the storage devices through the IO unit.
  • the matrices H1 and H2 are the second sub-matrices, and the sub-matrix H3 is the first sub-matrix.
  • the control device divides blocks 2-24 into 2 groups: blocks 2-12 form one group, for which storage device 1 performs the calculation with sub-matrix $H_1$; blocks 13-24 form the other group, for which storage device 2 performs the calculation with sub-matrix $H_2$.
  • This step 62 can be executed by the IO unit of the control device.
  • Step 63 The control device reads the block 25 separately.
  • the control device reads block 25 from storage device 2, which stores block 25.
  • This step 63 can be executed by the IO unit of the control device.
  • Step 64 After receiving the request, the storage device reads the blocks on the storage medium through the IO unit.
  • This step 64 can be performed by the IO unit of the storage device.
  • Step 65 After the storage device reads the block, the intermediate result Q (first result) is obtained through calculation.
  • on storage device 1, the sub-matrix $H_1$ and its corresponding third blocks are used to calculate the intermediate result $Q_1$; on storage device 2, the sub-matrix $H_2$ and its corresponding third blocks are used to calculate the intermediate result $Q_2$; finally, storage devices 1 and 2 return the intermediate results ($Q_1$, $Q_2$) to the control device, where Q includes the intermediate results $Q_1$ and $Q_2$.
  • This step 65 can be executed by the CPU of the storage device.
  • Step 66 The control device uses the intermediate result Q and the sub-matrix H 3 to calculate the final result, and restores block 1 and block 25. And compare the restored block 25 with the block 25 read in step 63. If the data is consistent, write block 1 into the storage medium of the target storage device; if they are inconsistent, it means that the striping has encountered a silent error , Isolate and protect the strips.
  • This step 65 can be executed by the CPU of the control device. It should be noted that the structure of the storage device and the control device in FIG. 6 is similar to the structure of the storage device and the control device in FIG. 4.
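  • The flow of steps 61-66 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: arithmetic in GF(2^8) with reduction polynomial 0x11D, the two-row check matrix, the six-block stripe, and every function name are hypothetical choices made for the example.

```python
# Illustrative sketch: the field GF(2^8), the 2-row check matrix and the
# 6-block stripe are assumptions, not details taken from the patent text.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) (reduction polynomial 0x11D)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r

def gf_inv(a):
    """Multiplicative inverse computed as a^254 (a must be non-zero)."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def check_matrix(n):
    """2 x n check matrix: a row of ones and a row of powers of 2."""
    powers, x = [], 1
    for _ in range(n):
        powers.append(x)
        x = gf_mul(x, 2)
    return [[1] * n, powers]

def partial_result(H, cols, blocks):
    """Intermediate result Q_i = H_i * B_i for the columns held by one device."""
    size = len(blocks[cols[0]])
    Q = [[0] * size for _ in H]
    for i, row in enumerate(H):
        for j in cols:
            for t, byte in enumerate(blocks[j]):
                Q[i][t] ^= gf_mul(row[j], byte)
    return Q

def xor_sum(results):
    """Add intermediate results element-wise (addition in GF(2^8) is XOR)."""
    total = [row[:] for row in results[0]]
    for Q in results[1:]:
        for i, row in enumerate(Q):
            for t, byte in enumerate(row):
                total[i][t] ^= byte
    return total

def solve_two(H, u, v, Q):
    """Solve H[:, (u, v)] * X = Q for the two unknown blocks (2x2 inverse)."""
    a, b, c, d = H[0][u], H[0][v], H[1][u], H[1][v]
    det_inv = gf_inv(gf_mul(a, d) ^ gf_mul(b, c))
    xu = [gf_mul(det_inv, gf_mul(d, q0) ^ gf_mul(b, q1)) for q0, q1 in zip(*Q)]
    xv = [gf_mul(det_inv, gf_mul(c, q0) ^ gf_mul(a, q1)) for q0, q1 in zip(*Q)]
    return xu, xv

# Build a consistent stripe: blocks 0-3 hold data, blocks 4-5 are parity.
H = check_matrix(6)
stripe = {j: [10 * j + t for t in range(4)] for j in range(4)}
stripe[4], stripe[5] = solve_two(H, 4, 5, partial_result(H, [0, 1, 2, 3], stripe))

# Block 0 is lost; block 5 plays the role of the second (verification) block.
Q1 = partial_result(H, [1, 2], stripe)   # computed on storage device 1
Q2 = partial_result(H, [3, 4], stripe)   # computed on storage device 2
rebuilt0, rebuilt5 = solve_two(H, 0, 5, xor_sum([Q1, Q2]))  # control device
silent_error = rebuilt5 != stripe[5]     # compare with block 5 read separately
```

  • In this sketch the columns for blocks 0 and 5 form the first sub-matrix, and the columns passed to `partial_result` form the second sub-matrices; only `Q1` and `Q2` would cross the network instead of the surviving blocks themselves.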
  • The target storage device stores the reconstructed first block according to the block information of the reconstructed first block.
  • The target storage device can determine, according to the block information of the reconstructed first block, the storage location of the reconstructed first block in the target storage device, and then store the received reconstructed first block at that location.
  • After the target storage device stores the reconstructed first block, it sends a storage success response to the control device; the storage success response indicates that the reconstructed first block has been stored.
  • The control device queries whether all the lost blocks of the first stripe have been reconstructed. If not, it continues to execute the above block reconstruction process for any block that has not been reconstructed; otherwise, the reconstruction of the first stripe is complete.
  • After the control device completes the reconstruction of the first block, it may also query whether other blocks are lost on the first stripe. If so, the control device performs the process shown in steps 302-312 on any of the lost blocks; if no other blocks on the first stripe are lost, the reconstruction of the first stripe can be considered complete.
  • The control device may also query whether other stripes have lost blocks. If so, the control device reconstructs the lost blocks on those stripes, that is, executes the process shown in steps 302-312.
  • To further illustrate the process shown in steps 301-312, refer to the flowchart of a data reconstruction method provided by the embodiment of the present invention shown in FIG. 7.
  • When the controller (control device) initiates reconstruction, it scans the stripe to confirm whether there are faulty blocks. If there are none, the reconstruction is complete. If there is a faulty block, the controller obtains the faulty block (the lost block), groups the blocks that need to be read, and splits the obtained check matrix according to the grouping. It then sends each second sub-matrix obtained by the split to the disk enclosure of the corresponding group. Each disk enclosure reads the blocks in its group.
  • The controller receives the intermediate results returned by the disk enclosures, calculates the recovered faulty block, and writes it to a disk of one of the disk enclosures. When the write is complete, the controller scans the stripe again to determine whether it still contains faulty blocks.
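  • The scan-and-repeat loop of FIG. 7 can be sketched as follows. The set-based representation of faulty blocks and the `rebuild_block` callback are hypothetical; the callback stands in for the grouping, matrix splitting, intermediate-result collection, recovery, and write described above.

```python
def rebuild_stripe(faulty_blocks, rebuild_block):
    """Repeat the per-block reconstruction until no faulty block remains.

    faulty_blocks: iterable of block ids currently missing on the stripe
                   (hypothetical representation).
    rebuild_block: callable that reconstructs and writes one block.
    Returns the order in which the blocks were rebuilt.
    """
    pending = set(faulty_blocks)     # work on a copy of the scan result
    rebuilt = []
    while pending:                   # scan: is any faulty block left?
        block = min(pending)         # pick any faulty block
        rebuild_block(block)         # split H, collect results, recover, write
        pending.discard(block)       # write completed, block is healthy again
        rebuilt.append(block)
    return rebuilt                   # empty stripe scan: reconstruction done

log = []
order = rebuild_stripe({7, 1}, log.append)
```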
  • The control device updates the information of the first block in the association table. Specifically, it updates the block information of the first block to the block information of the reconstructed first block and updates the node identifier of the first block to the node identifier of the target storage device, so that the control device can subsequently read the reconstructed first block from the target storage device.
  • In the method provided by this embodiment, the control device reconstructs the lost first block in the first stripe directly from the first results obtained from the target number of storage devices, so the control device does not need to read the unlost blocks of the first stripe from the storage devices.
  • Because the first results contain less data than the unlost blocks of the first stripe, less data is transmitted between the control device and the storage devices while the first block is reconstructed, and less network bandwidth is occupied, which improves block reconstruction performance.
  • Moreover, the control device obtains the first result from each storage device without sending a large number of read requests to it, which not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, further improving block reconstruction performance.
  • In addition, because the reconstructed second block is compared with the stored second block, silent data errors can be caught, which improves the reliability of the system.
  • The control device may reconstruct the first block according to the first result and the first sub-matrix.
  • In some embodiments, the target storage device may instead reconstruct the first block according to the first result and the first sub-matrix, as described in the following steps.
  • Step 801: The storage device that lost the block sends a reconstruction request to the control device.
  • Step 801 is the same as the process shown in step 301 and is not described again in the embodiment of the present invention.
  • Step 802: The control device determines, according to the reconstruction request, the first block among the lost blocks in the first stripe, where the blocks in the first stripe are stored by the target number of storage devices.
  • Step 802 is similar to the process shown in step 302 and is not described in detail in the embodiment of the present invention.
  • Step 803: The control device splits the check matrix of the first stripe into a first sub-matrix and a target number of second sub-matrices, where the first sub-matrix includes the column of the check matrix corresponding to the first block and the column corresponding to the second block, and each second sub-matrix includes the columns corresponding to at least one third block stored on one storage device in the first stripe. The second block is one of the valid (unlost) blocks in the first stripe, and the third block is any one of the valid blocks other than the second block.
  • Step 803 is the same as the process shown in step 303 and is not described again in the embodiment of the present invention.
  • Step 804: The control device sends a second acquisition request to the storage device corresponding to each second sub-matrix. Each second acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of the target storage device, and instructs the storage device to send the first result to the target storage device.
  • The second acquisition request is one type of acquisition request. An acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of a target device. When the target device is the target storage device, the acquisition request is the second acquisition request.
  • The second acquisition request may also carry the IP address of the target storage device, so that each storage device can send the first result to the target storage device according to that IP address.
  • Step 805: The control device sends a target reconstruction request to the target storage device, where the target reconstruction request carries the first sub-matrix and instructs the target storage device to reconstruct the first block according to the first sub-matrix.
  • The target reconstruction request may also carry the block information of the reconstructed first block, so that the target storage device can store the reconstructed first block according to that block information. It should be noted that step 805 can be executed at any time before step 811.
  • Step 806: The storage device receives the second acquisition request.
  • Both step 806 and step 305 are processes in which the storage device receives an acquisition request. When the acquisition request is the first acquisition request, it instructs the storage device to send the first result to the control device; when the acquisition request is the second acquisition request, it instructs the storage device to send the first result to the target storage device.
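  • The relationship between the request type and the destination of the first result can be sketched as a small data structure. The class names, field names, and addresses below are hypothetical; the text only states which items a request carries and where the first result is sent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class RequestKind(Enum):
    FIRST = "first"    # first result is returned to the control device
    SECOND = "second"  # first result is sent to the target storage device


@dataclass
class AcquisitionRequest:
    kind: RequestKind
    sub_matrix: List[List[int]]      # second sub-matrix for this storage device
    block_info: List[str]            # block information, one entry per column
    control_address: str             # hypothetical: address of the requester
    target_address: Optional[str] = None  # e.g. IP address of the target storage device


def result_destination(request: AcquisitionRequest) -> str:
    """Return the address to which the storage device sends its first result."""
    if request.kind is RequestKind.FIRST:
        return request.control_address
    if request.target_address is None:
        raise ValueError("a second acquisition request must name the target storage device")
    return request.target_address


first = AcquisitionRequest(RequestKind.FIRST, [[1, 1]], ["block-2", "block-3"], "10.0.0.1")
second = AcquisitionRequest(RequestKind.SECOND, [[1, 1]], ["block-2", "block-3"],
                            "10.0.0.1", target_address="10.0.0.2")
```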
  • Step 807: The storage device reads the at least one third block according to the block information of the third block corresponding to each column in the second sub-matrix.
  • Step 807 is the same as the process shown in step 306 and is not described in detail in the embodiment of the present invention.
  • Step 808: The storage device performs a calculation on the at least one third block and the second sub-matrix to obtain the first result.
  • Step 808 is the same as the process shown in step 307 and is not described in detail in the embodiment of the present invention.
  • Step 809: The storage device sends the first result to the target storage device.
  • The storage device may send the first result to the target storage device according to the IP address of the target storage device carried in the second acquisition request. Every one of the target number of storage devices needs to execute the process shown in steps 806-809.
  • Step 810: The target storage device obtains the first results returned by the target number of storage devices based on the second acquisition request.
  • Step 810 is the same as the process in step 309 in which the control device obtains the first results returned by the target number of storage devices based on the first acquisition request, and is not described again in the embodiment of the present invention.
  • Step 811: The target storage device reconstructs the first block according to the first results of the target number of storage devices and the first sub-matrix.
  • The process in which the target storage device reconstructs the first block is the same as the process in which the control device reconstructs the first block in step 310, so step 811 is not described again in the embodiment of the present invention.
  • Step 812: The target storage device stores the reconstructed first block.
  • The target storage device may store the reconstructed first block according to the block information of the reconstructed first block.
  • Alternatively, the target storage device may store the reconstructed first block according to a preset storage rule. The embodiment of the present invention does not specifically limit the preset storage rule.
  • Before storing, the target storage device may obtain the second block from the storage device storing the second block. If the reconstructed second block is the same as the obtained second block, the step of storing the reconstructed first block is executed; otherwise, that step is not executed and use of the first stripe is prohibited.
  • The process in which the target storage device obtains the second block is the same as the process in which the control device obtains the second block, and is not described in detail in the embodiment of the present invention.
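  • The check described above can be sketched as follows. The function and callback names are hypothetical: `write` stands for storing the reconstructed first block and `isolate` stands for prohibiting use of the first stripe.

```python
def store_if_verified(rebuilt_first, rebuilt_second, stored_second, write, isolate):
    """Store the rebuilt first block only when the rebuilt second block
    matches the copy obtained from storage; otherwise isolate the stripe."""
    if rebuilt_second == stored_second:
        write(rebuilt_first)
        return True
    isolate()          # mismatch points to a silent error somewhere in the stripe
    return False


written, isolated = [], []
ok = store_if_verified(b"AAAA", b"ZZZZ", b"ZZZZ",
                       written.append, lambda: isolated.append("stripe-1"))
bad = store_if_verified(b"AAAA", b"ZZZY", b"ZZZZ",
                        written.append, lambda: isolated.append("stripe-1"))
```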
  • Step 813: When the reconstructed first block has been stored, the target storage device sends a reconstruction completion response to the control device, where the reconstruction completion response indicates that the reconstruction of the first block is complete.
  • The reconstruction completion response may also carry the block information of the reconstructed first block, so that the control device can update the block information of the first block in the association table after receiving the response.
  • After receiving the reconstruction completion response, the control device queries whether all the lost blocks of the first stripe have been reconstructed. If not, it continues to execute the above block reconstruction process for any block that has not been reconstructed; otherwise, the reconstruction of the first stripe is complete.
  • The control device updates the information of the first block in the association table. Specifically, it updates the block information of the first block to the block information of the reconstructed first block and updates the node identifier of the first block to the node identifier of the target storage device, so that the control device can subsequently read the reconstructed first block from the target storage device.
  • To further illustrate the process shown in steps 801-813, refer to the schematic diagram of a block reconstruction process provided by an embodiment of the present invention shown in FIG. 9.
  • The block reconstruction process shown in FIG. 9 may include the following steps 1-7.
  • Step 1: The control device initiates reconstruction and scans for the faulty block (block 1).
  • The process shown in step 1 is the same as the process shown in step 802. This step 1 can be executed by the CPU of the control device.
  • Step 2: The control device groups the blocks of the stripe containing block 1 by enclosure. Assume that blocks 1-12 are on storage device 1, blocks 13-25 are on storage device 2, and block 1 and block 25 need to be recovered. The control device then splits the check matrix H into three sub-matrices (H1, H2, H3) and sends a second acquisition request to the storage devices through the IO unit.
  • Sub-matrices H1 and H2 are the second sub-matrices, and sub-matrix H3 is the first sub-matrix.
  • The control device divides blocks 2-24 into two groups: blocks 2-12 form one group, on which storage device 1 performs the calculation with sub-matrix H1; blocks 13-24 form the other group, on which storage device 2 performs the calculation with sub-matrix H2.
  • This step 2 can be executed by the IO unit of the control device.
  • Step 3: After receiving the request, each storage device reads the blocks from its storage medium through the IO unit.
  • Step 4: After the storage device reads the blocks through the IO unit, its CPU calculates the intermediate result Q (the first result).
  • On storage device 1, sub-matrix H1 and its corresponding third blocks are used to calculate the intermediate result Q1; on storage device 2, sub-matrix H2 and its corresponding third blocks are used to calculate the intermediate result Q2.
  • Step 5: Storage device 1 sends the intermediate result Q1 to storage device 2 through the IO unit.
  • Step 6: Storage device 2 uses, through its CPU, the intermediate results Q1 and Q2 and sub-matrix H3 to calculate the final result, restoring block 1 and block 25. It then compares the restored block 25 with the block 25 read from the storage medium and, if the data is consistent, writes block 1 to the storage medium.
  • Step 7: Storage device 2 notifies the control device that the block has been reconstructed, and the control device, through its CPU, either continues with step 1 or ends the reconstruction.
  • In the method provided by this embodiment, the lost first block in the first stripe is reconstructed directly from the first results obtained from the target number of storage devices, so the unlost blocks of the first stripe do not need to be read from the storage devices.
  • Because the first results contain less data than the unlost blocks of the first stripe, less data is transmitted between the control device and the storage devices while the first block is reconstructed, and less network bandwidth is occupied, which improves block reconstruction performance.
  • Moreover, the first result is obtained from each storage device without sending a large number of read requests to it, which not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, further improving block reconstruction performance.
  • In addition, because the reconstructed second block is compared with the stored second block, silent data errors can be caught, which improves the reliability of the system.
  • A device that includes both a storage unit and a control unit has the functions of the control device, the storage device, and the target storage device.
  • In this case, the lost block can be reconstructed according to the flowchart of a data reconstruction method provided by the embodiment of the present invention shown in FIG. 10. The flow of the method is as follows.
  • Step 1001: The first storage device determines the first block among the lost blocks in the first stripe, where the blocks in the first stripe are stored by a target number of storage devices and the first storage device is any one of the target number of storage devices.
  • Each storage device includes a control unit and a storage unit, and the first storage device can execute step 1001 through its control unit.
  • The first storage device may first query whether a storage medium in its storage unit has lost blocks. If so, it first determines which stripes have lost blocks and then determines which blocks are missing on each of those stripes.
  • The first storage device may query whether a storage medium in the first storage device has failed. If a storage medium in the first storage device has failed, the first storage device determines the first block among the lost blocks in the first stripe from the blocks stored on the failed storage medium.
  • Specifically, when any storage medium fails, the first storage device determines from the association table at least one second stripe corresponding to the identifier of that storage medium and takes any one of the at least one second stripe as the first stripe. It then determines at least one fourth block corresponding to both the identifier of that storage medium and the stripe identifier of the first stripe, and determines any one of the at least one fourth block as the first block.
  • Alternatively, the first storage device may query whether a storage medium in the first storage device has lost blocks. When any storage medium in the first storage device has lost at least one block, the first storage device determines the first block among the lost blocks in the first stripe from the at least one block. Specifically, the first storage device determines from the association table at least one second stripe corresponding to the at least one block, takes any one of the at least one second stripe as the first stripe, and then determines, as the first block, any one of the at least one fourth block in the first stripe that corresponds to the at least one block.
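  • The lookup for a failed storage medium can be sketched as follows. The layout of the association table is not given in this passage, so the `Entry` record (stripe identifier, block identifier, storage-medium identifier) and the function name are assumptions made for illustration; "any one" is implemented by taking the smallest identifier.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """One hypothetical row of the association table."""
    stripe_id: int
    block_id: int
    medium_id: str


def pick_first_block(table: List[Entry], failed_medium: str) -> Optional[Tuple[int, int]]:
    """Return (first stripe, first block) for a failed medium, or None."""
    # Second stripes: every stripe with a block on the failed medium.
    second_stripes = sorted({e.stripe_id for e in table if e.medium_id == failed_medium})
    if not second_stripes:
        return None
    first_stripe = second_stripes[0]            # any one of the second stripes
    # Fourth blocks: blocks of the first stripe that lived on the failed medium.
    fourth_blocks = sorted(e.block_id for e in table
                           if e.medium_id == failed_medium and e.stripe_id == first_stripe)
    return first_stripe, fourth_blocks[0]       # any one of the fourth blocks


table = [Entry(1, 1, "disk-a"), Entry(1, 2, "disk-b"),
         Entry(2, 26, "disk-a"), Entry(1, 3, "disk-a")]
choice = pick_first_block(table, "disk-a")
```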
  • Step 1002: The first storage device splits the check matrix of the first stripe into a first sub-matrix and a target number of second sub-matrices.
  • This step 1002 may be executed by the control unit of the first storage device.
  • Step 1002 is the same as the process in step 303 in which the control device splits the check matrix of the first stripe into the first sub-matrix and the target number of second sub-matrices, and is not described again in the embodiment of the present invention.
  • Step 1003: The first storage device sends a third acquisition request to the storage unit of the storage device corresponding to each second sub-matrix, where the third acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of the first storage device.
  • Each storage device has a storage unit, the storage unit in each storage device stores some of the blocks of the first stripe, and the storage unit has a certain computing capacity. The first storage device can therefore send a third acquisition request to the storage unit of each storage device, so that the storage unit can obtain the first result based on the content of the third acquisition request.
  • The third acquisition request instructs the storage unit to send the first result to the first storage device. The third acquisition request is one type of acquisition request: an acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of a target device. When the identifier of the target device is the identifier of the first storage device, that is, when the target device is the first storage device, the acquisition request is the third acquisition request.
  • Step 1004: The storage unit of the storage device receives the third acquisition request.
  • Step 1005: The storage unit of the storage device reads the at least one third block according to the block information of the third block corresponding to each column in the second sub-matrix.
  • Step 1005 is the same as the process in step 306 in which the storage device reads the at least one third block according to the block information of the third block corresponding to each column in the second sub-matrix, and is not described again in the embodiment of the present invention.
  • Step 1006: The storage unit of the storage device performs a calculation on the at least one third block and the second sub-matrix to obtain a first result.
  • Step 1006 is the same as the process in step 307 in which the storage device obtains the first result, and is not described again in the embodiment of the present invention.
  • Step 1007: The storage unit of the storage device sends the first result to the first storage device.
  • It should be noted that the storage devices in steps 1004-1007 include the first storage device.
  • Step 1008: The first storage device obtains the first results returned by the storage units of the target number of storage devices based on the third acquisition request.
  • Because steps 1004-1007 are executed in the storage unit of each of the target number of storage devices, the first storage device obtains a first result from the storage unit of each of those storage devices.
  • Step 1009: The first storage device reconstructs the first block according to the first results of the target number of storage devices and the first sub-matrix.
  • Step 1009 is similar to the process in step 310 in which the control device reconstructs the first block, and is not described in detail in the embodiment of the present invention.
  • Step 1010: The first storage device stores the reconstructed first block in the storage medium of the first storage device.
  • The storage medium of the first storage device may be any storage medium in the storage unit of the first storage device. After the first storage device stores the reconstructed first block, it may update the information corresponding to the first block in the association table, so that all storage devices can subsequently read the reconstructed first block from the first storage device.
  • Before storing, the first storage device may obtain the second block from the storage device storing the second block. If the reconstructed second block is the same as the obtained second block, the step of storing the reconstructed first block in the storage medium of the first storage device is executed; otherwise, that step is not executed and use of the first stripe is prohibited. It should be noted that the process in which the first storage device reconstructs the second block is the same as the process in which the control device reconstructs the second block.
  • Step 1011: The first storage device queries whether all the lost blocks of the first stripe have been reconstructed. If not, it continues to execute the block reconstruction process for any block that has not been reconstructed; otherwise, the reconstruction of the first stripe is complete.
  • The lost blocks on the first stripe queried by the first storage device may be blocks lost by the first storage device or blocks lost by other storage devices.
  • To further illustrate the process shown in steps 1001-1011, refer to the schematic diagram of a block reconstruction process provided by an embodiment of the present invention shown in FIG. 11.
  • Assume that block 1 fails (the failure may be a block loss), blocks 1-8 are on node 1, blocks 9-16 are on node 2, and blocks 17-25 are on node 3.
  • Each node is a storage device running a control unit and a storage unit (that is, a device that integrates control and storage).
  • The control unit of node 1 initiates reconstruction and scans for the faulty block (block 1).
  • The control unit of node 1 groups the blocks of the stripe containing block 1; block 1 and block 25 need to be recovered.
  • The control unit of node 1 splits the check matrix H into four sub-matrices (Hx1, Hx2, Hx3, Hx4), and node 1 sends a request (the third acquisition request) to the storage unit of each node through the IO unit.
  • Sub-matrices Hx1, Hx2, and Hx3 are the second sub-matrices, and sub-matrix Hx4 is the first sub-matrix.
  • The control unit of node 1 divides blocks 2-24 into three groups: blocks 2-8 form one group, on which the storage unit of node 1 performs the calculation with sub-matrix Hx1; blocks 9-16 form one group, on which the storage unit of node 2 performs the calculation with sub-matrix Hx2; blocks 17-24 form one group, on which the storage unit of node 3 performs the calculation with sub-matrix Hx3.
  • After receiving the request, the storage unit of each node reads the blocks from the storage medium through the IO unit.
  • The storage unit of node 1 reads blocks 2-8, the storage unit of node 2 reads blocks 9-16, and the storage unit of node 3 reads blocks 17-25.
  • The storage unit of node 1 performs the calculation with sub-matrix Hx1 to obtain the intermediate result Qx1; the storage unit of node 2 performs the calculation with sub-matrix Hx2 to obtain Qx2; the storage unit of node 3 performs the calculation with sub-matrix Hx3 to obtain Qx3. Qx1, Qx2, and Qx3 are the first results.
  • The storage unit of each node sends its intermediate result to node 1.
  • Node 1 uses the intermediate results (Qx1, Qx2, and Qx3) and sub-matrix Hx4 to calculate the final result, restoring block 1 and block 25, and writes the restored block 1 to the storage medium.
  • The control unit of node 1 continues to reconstruct other faulty blocks on the first stripe, or the reconstruction ends.
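  • The grouping used in the FIG. 11 example can be sketched as follows. The function name and the dictionary layout are hypothetical; the block placement, the lost block (block 1), and the second block (block 25) are taken from the example above.

```python
def group_columns(placement, lost_block, verify_block):
    """Split block (column) indices into the first sub-matrix columns and
    one group of third-block columns per node."""
    first_cols = [lost_block, verify_block]      # columns of the first sub-matrix
    groups = {}
    for node, blocks in placement.items():
        cols = [b for b in blocks if b not in first_cols]
        if cols:
            groups[node] = cols                  # columns of one second sub-matrix
    return first_cols, groups


placement = {
    "node1": list(range(1, 9)),     # blocks 1-8
    "node2": list(range(9, 17)),    # blocks 9-16
    "node3": list(range(17, 26)),   # blocks 17-25
}
first_cols, groups = group_columns(placement, lost_block=1, verify_block=25)
```

  • With this placement the groups are blocks 2-8, 9-16, and 17-24, matching the three second sub-matrices Hx1, Hx2, and Hx3 of the example.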
  • In the method provided by this embodiment, the first storage device reconstructs the lost first block in the first stripe directly from the first results obtained from the target number of storage devices. The first storage device therefore does not need to read the unlost blocks of the first stripe from other storage devices; it only needs to obtain the target number of first results from the storage units of the storage devices to reconstruct the first block. Because the first results contain less data than the unlost blocks of the first stripe, less data is transferred between the first storage device and the other storage devices while the first block is reconstructed, and less network bandwidth is occupied, which improves block reconstruction performance.
  • Moreover, the first result is obtained from the storage unit of each storage device without sending a large number of read requests to that storage unit, which not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, further improving block reconstruction performance.
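  • The bandwidth argument can be made concrete with a rough count. The sketch below assumes, beyond what the text states, that each row of an intermediate result is one block in size and that an intermediate result has as many rows as there are blocks being recovered; the function name is hypothetical.

```python
def blocks_over_network(remote_group_sizes, result_rows):
    """Compare block-sized units sent to the rebuilding node.

    remote_group_sizes: number of third blocks held by each other device.
    result_rows: rows per intermediate result (assumed equal to the number
                 of blocks being recovered, each row one block in size).
    """
    direct = sum(remote_group_sizes)                    # ship every surviving block
    offloaded = result_rows * len(remote_group_sizes)   # ship one result per device
    return direct, offloaded


# FIG. 11 layout: node 2 holds blocks 9-16 and node 3 holds blocks 17-24;
# blocks 1 and 25 are recovered, so each result is assumed to have 2 rows.
direct, offloaded = blocks_over_network([8, 8], result_rows=2)
```

  • Under these assumptions, node 1 would receive 4 block-sized rows instead of 16 blocks from the other two nodes.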
  • In addition, when the control module is connected to only one storage module, the reconstruction bandwidth between the control module and the storage module can be reduced to the greatest extent.
  • In some embodiments, the target number of storage devices may be managed by at least one main storage device. Each of the target number of storage devices sends the first result generated by its storage unit to the corresponding main storage device, and the main storage device sums the first results it obtains. The device that reconstructs the first block can then obtain the summed matrix from the main storage device and reconstruct the first block according to the summed matrix.
  • The process of the method includes the following steps.
  • Step 1201: The first storage device determines the first block among the lost blocks in the first stripe, where the blocks in the first stripe are stored by the target number of storage devices.
  • Step 1201 is similar to the process shown in step 1001 and is not described in detail in the embodiment of the present invention.
  • Step 1202: The first storage device splits the check matrix of the first stripe into a first sub-matrix and a target number of second sub-matrices.
  • Step 1202 is similar to the process shown in step 1002 and is not described in detail in the embodiment of the present invention.
  • Step 1203: The first storage device sends a fourth acquisition request to the storage unit of the storage device corresponding to each second sub-matrix, where the fourth acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of the main storage device.
  • The fourth acquisition request instructs the storage unit to send the first result to the main storage device. The fourth acquisition request is one type of acquisition request: an acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column in the second sub-matrix, and the identifier of a target device. When the target device is the main storage device, the acquisition request is the fourth acquisition request.
  • the storage unit of the storage device receives the fourth acquisition request.
  • the storage unit of the storage device reads the at least one third block according to the block information of the third block corresponding to each column in the second sub-matrix.
  • This step 1205 is similar to the process shown in step 1005, and this step 1205 is not described in detail in the embodiment of the present invention.
  • the storage unit of the storage device calculates according to the at least one third sub-block and the second sub-matrix to obtain a first result.
  • This step 1206 is the same as the process shown in step 1006, and this step 1206 is not repeated in this embodiment of the present invention.
  • the storage device sends the first result to the main storage device that manages the storage device.
  • any main storage device obtains the first results returned, in response to the fourth acquisition request, by the at least one storage device that it manages.
  • the main storage device sums the first results returned by the at least one storage device to obtain a target summation matrix.
  • the target summation matrix is the sum of the first results returned by the at least one storage device in response to the fourth acquisition request.
  • the main storage device sends the target summation matrix to the first storage device.
  • each main storage device in the at least one main storage device performs the process shown in steps 1208-1210.
  • the first storage device obtains at least one target summation matrix from at least one main storage device.
  • the first storage device obtains the target summation matrix of each of the at least one main storage device, that is, it obtains at least one target summation matrix.
  • the first storage device reconstructs the first block according to the at least one target summation matrix and the first sub-matrix.
  • the control unit of the first storage device may sum the at least one target summation matrix to obtain a summation matrix, obtain a target block matrix based on the inverse matrix of the first sub-matrix and the summation matrix, and use the first target row of the target block matrix as the reconstructed first block and the second target row of the target block matrix as the reconstructed second block.
  • step 1212 is also a process of reconstructing the first block according to the first results of the target number of storage devices.
  • the first storage device stores the reconstructed first block in the storage medium of the first storage device.
  • This step 1213 is similar to the process shown in step 1011, and this step 1213 is not repeated in this embodiment of the present invention. It should be noted that the lost blocks on the first stripe queried by the first storage device may be the blocks lost by the first storage device, or the blocks lost by other storage devices.
  • the first storage device obtains the second block from the storage device that stores the second block. If the reconstructed second block is the same as the obtained second block, the step of storing the reconstructed first block in the storage medium of the first storage device is performed; otherwise, that step is not performed and use of the first stripe is prohibited. It should be noted that the process by which the first storage device reconstructs the second block is the same as the process by which the control device reconstructs the second block.
  • the first storage device queries whether all lost blocks of the first stripe have been reconstructed. If not, the block reconstruction process continues for any block that has not yet been reconstructed; otherwise, the reconstruction of the first stripe is complete.
  • This step 1214 is the same as the process shown in step 1012, and this step 1214 is not repeated in this embodiment of the present invention.
  • Refer to FIG. 13 for a schematic diagram of a block reconstruction process provided by an embodiment of the present invention.
  • block 1 is the faulty block, and block 1 and block 25 need to be repaired.
  • Blocks 1-6 are on node 1
  • blocks 7-12 are on node 2.
  • Blocks 13-18 are on node 3, and blocks 19-25 are on node 4.
  • node 1 and node 2 are in box 1 or AZ1
  • node 3 and node 4 are in box 2 or AZ2.
  • Each node is a storage device running a control unit and a storage unit.
  • the control unit of node 1 initiates reconstruction, and scans the faulty block (block 1).
  • the control unit of node 1 groups the blocks of the stripe that contains block 1 by box (or AZ).
  • the control unit of node 1 splits the check matrix H into 5 sub-matrices (H_y1, H_y2, H_y3, H_y4, H_y5), and node 1 sends a request (the fourth acquisition request) to the storage unit of each node through the IO unit.
  • the sub-matrices H_y1, H_y2, H_y3, and H_y4 are the second sub-matrices.
  • the sub-matrix H_y5 is the first sub-matrix.
  • the control unit of node 1 divides blocks 2-24 into 4 groups: blocks 2-6 form one group, for which the storage unit of node 1 performs the calculation with H_y1; blocks 7-12 form one group, for which the storage unit of node 2 performs the calculation with H_y2; blocks 13-18 form one group, for which the storage unit of node 3 performs the calculation with H_y3; and blocks 19-24 form one group, for which the storage unit of node 4 performs the calculation with H_y4.
  • After receiving the request, the storage unit of each node reads the blocks from the storage medium through the IO unit.
  • the storage unit of node 1 reads blocks 2-6, the storage unit of node 2 reads blocks 7-12, the storage unit of node 3 reads blocks 13-18, and the storage unit of node 4 reads blocks 19-24.
  • the storage unit of node 1 performs the calculation on the sub-matrix H_y1 and its corresponding third blocks to obtain the intermediate result Q_y1;
  • the storage unit of node 2 performs the calculation on the sub-matrix H_y2 and its corresponding third blocks to obtain Q_y2;
  • the storage unit of node 3 performs the calculation on the sub-matrix H_y3 and its corresponding third blocks to obtain Q_y3;
  • the storage unit of node 4 performs the calculation on the sub-matrix H_y4 and its corresponding third blocks to obtain Q_y4; Q_y1, Q_y2, Q_y3, and Q_y4 are the first results.
  • each node sends the intermediate result to the master node (main storage device) in the frame or AZ, and each master node calculates the target sum matrix.
  • Node 2 sends Q_y2 to node 1 (the main storage device) through the internal switch, and node 1 adds the intermediate results (Q_y1, Q_y2) to obtain the target summation matrix T_1;
  • node 4 sends Q_y4 to node 3 (the main storage device) through the internal switch, and node 3 adds the intermediate results (Q_y3, Q_y4) to obtain the target summation matrix T_2.
  • the master node 3 sends its calculated target summation matrix to node 1 through the switch between the boxes or AZs.
  • Node 1 uses the target summation matrices and the sub-matrix H_y5 to calculate the final result, restores block 1 and block 25, and writes the restored block 1 into the storage medium.
  • the control unit of node 1 continues to reconstruct other faulty blocks, or the reconstruction ends.
  • the first storage device splits the check matrix of the first stripe into the first sub-matrix and the target number of second sub-matrices, and sends each second sub-matrix to the storage unit of the corresponding storage device.
  • the storage unit of each storage device generates the first result.
  • Each storage device sends the generated first result to the main storage device.
  • the main storage device sums the received first results to obtain the target summation matrix and sends the target summation matrix to the first storage device. Finally, the first storage device reconstructs the first block according to the at least one target summation matrix and the first sub-matrix.
  • the first storage device does not need to read the unlost blocks of the first stripe from other storage devices. It only needs to obtain the target summation matrices from the main storage devices to reconstruct the first block. Because the first results and the target summation matrices contain less data than the unlost blocks of the first stripe, less data is transmitted between the first storage device and the other storage devices during reconstruction of the first block, and less network bandwidth is occupied, which improves block reconstruction performance. If the at least one main storage device is located in different hard disk enclosures, the data exchanged between hard disk enclosures is limited to the second sub-matrices and the target summation matrices, so the amount of data exchanged between hard disk enclosures is reduced.
  • each main storage device can obtain the first results sent by the storage units of the storage devices without a large number of read requests being sent to those storage units. This not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, which further improves block reconstruction performance. In addition, comparing the content of the reconstructed second block with the content of the obtained second block helps detect silent data corruption and improves the reliability of the system.
  • FIG. 14 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention, and the device includes:
  • the determining module 1401 is configured to execute the above step 302;
  • the first obtaining module 1402 is configured to obtain the first results of the target number of storage devices, where each first result is calculated by one of the target number of storage devices, which reads the valid blocks of the first stripe that it stores and performs a calculation on the valid blocks it has read;
  • the reconstruction module 1403 is configured to reconstruct the first block according to the first result of the target number of storage devices.
  • the device further includes a splitting module for performing step 303 above;
  • the reconstruction module 1403 is configured to execute the above step 3010.
  • the reconstruction module 1403 is used to:
  • the first target row of the target block matrix is determined as the reconstructed first block.
  • the device further includes:
  • the first sending module is configured to perform the above step 304;
  • the first obtaining module 1402 is configured to execute the above step 309.
  • the first sending module is configured to perform step 311 above.
  • the device further includes an execution module
  • the first obtaining module 1402 is further configured to obtain the second block from a storage device storing the second block;
  • the reconstruction module 1403 is further configured to reconstruct the second block according to the first result of the target number of storage devices and the first sub-matrix;
  • the execution module is configured to perform the step of sending a write request to the target storage device if the reconstructed second block is the same as the obtained second block, and otherwise not to perform that step.
  • the device further includes:
  • the second sending module is configured to perform the above step 804;
  • the second sending module is also used to execute the above step 805;
  • the first receiving module is configured to perform step 813 above.
  • the device further includes:
  • the second receiving module is configured to receive a first reconstruction request, where the first reconstruction request carries the storage medium identifier of the failed storage medium in the storage device;
  • the determining module 1401 is also used to execute the above steps 21-23.
  • the device further includes:
  • a third receiving module configured to receive a second reconstruction request, where the second reconstruction request carries block information of at least one block missing from the storage medium in the storage device;
  • the determining module 1401 is also used to execute the above steps 2A-2C.
  • the device further includes:
  • the first query module is used to query and execute the above step 1012.
  • the device further includes:
  • the second query module is configured to query whether the storage medium in the first storage device of the target number of storage devices is invalid
  • the determining module 1401 is further configured to determine the first block in the lost block in the first strip from the blocks stored in the failed storage medium when the storage medium in the first storage device fails.
  • the device further includes:
  • the third query module is used to query whether the storage medium in the first storage device of the target number of storage devices has lost blocks
  • the determining module 1401 is further configured to: when any storage medium in the first storage device loses at least one block, determine, from the at least one block, the first block among the lost blocks in the first stripe.
  • the device further includes:
  • the third sending module is configured to perform the above step 1003;
  • the second acquiring module is used to acquire and execute the above step 1008.
  • the device further includes:
  • the third acquisition module is configured to perform the above step 1204;
  • the third acquisition module is also used to execute the above step 1211;
  • the reconstruction module 1403 is also used to execute the above step 1012.
  • the reconstruction module 1403 is also used to:
  • the first target row of the target block matrix is determined as the reconstructed first block.
  • FIG. 15 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention, and the device includes:
  • the reading module 1501 is configured to read the valid blocks of the first stripe stored by the storage device, where the first stripe is stored by the target number of storage devices;
  • the calculation module 1502 is configured to perform a calculation on the valid blocks that were read to obtain the first result;
  • the sending module 1503 is configured to perform step 308 above.
  • the device further includes:
  • the receiving module is configured to receive an acquisition request, where the acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the target device; the second sub-matrix includes the columns corresponding to at least one third block of the first stripe stored on the corresponding storage device, the second block is any one of the unlost blocks in the first stripe, and the third block is any unlost block other than the second block;
  • the reading module 1501 is configured to perform the above step 1205;
  • the calculation module 1502 is configured to execute the aforementioned step 1206.
  • calculation module 1502 is also used to:
  • the second sub-matrix is multiplied by the block matrix to obtain the first result.
  • the target device includes a control device, a target storage device, a first storage device, and at least one main storage device.
  • the first storage device is any one of the target storage devices, and the at least one main storage device is used to manage the target number of storage devices;
  • when the identifier of the target device is the identifier of the control device, the acquisition request is a first acquisition request, which instructs that the first result be sent to the control device;
  • when the identifier of the target device is the identifier of the target storage device, the acquisition request is a second acquisition request, which instructs that the first result be sent to the target storage device;
  • when the identifier of the target device is the identifier of the first storage device, the acquisition request is a third acquisition request, which instructs that the first result be sent to the first storage device;
  • when the identifier of the target device is the identifier of the main storage device, the acquisition request is a fourth acquisition request, which instructs that the first result be sent to the main storage device.
  • the device further includes:
  • the first query module is used to query whether the storage medium in the storage device is invalid
  • the sending module 1503 is further configured to send a first reconstruction request to the control device when a storage medium in the storage device fails, where the first reconstruction request carries the storage medium identifier of the failed storage medium in the storage device.
  • the device further includes:
  • the second query module is used to query whether any block has been lost from the storage medium in the storage device;
  • the sending module 1503 is further configured to send a second reconstruction request to the control device when any storage medium in the storage device loses at least one block, where the second reconstruction request carries the block information of the at least one block.
  • FIG. 16 is a schematic structural diagram of a data reconstruction device provided by an embodiment of the present invention, and the device includes:
  • the receiving module 1601 is configured to receive a target reconstruction request, where the target reconstruction request carries a first sub-matrix and instructs that the first block among the lost blocks in the first stripe be reconstructed according to the first sub-matrix; the first sub-matrix includes the column corresponding to the first block among the lost blocks in the first stripe and the column corresponding to the second block, and the second block is any block that is not lost in the first stripe;
  • the obtaining module 1602 is configured to perform the above step 810;
  • the reconstruction module 1603 is used to execute the above step 811.
  • the reconstruction module 1603 is further configured to:
  • the first target row of the target block matrix is used as the reconstructed first block.
  • the device further includes:
  • the storage module is used to execute the above step 812;
  • the sending module is used to perform step 813 above.
  • the device further includes an execution module
  • the acquiring module 1602 is further configured to acquire the second partition from a storage device storing the second partition;
  • the reconstruction module 1603 is further configured to reconstruct the second block according to the first result of the target number of storage devices and the first sub-matrix;
  • the execution module is configured to perform the step of storing the reconstructed first block if the reconstructed second block is the same as the obtained second block, and otherwise not to perform that step and to prohibit use of the first stripe.

Abstract

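The present invention discloses a data reconstruction method, apparatus, computer device, storage medium, and system, and belongs to the field of data storage technologies. The method reconstructs a lost first block in a first stripe directly from the first results obtained from a target number of storage devices. The unlost blocks of the first stripe therefore do not need to be read; only the target number of first results is needed to reconstruct the first block. Because the first results contain less data than the unlost blocks of the first stripe, less network bandwidth is occupied during data transmission, which improves block reconstruction performance.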

Description

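Data reconstruction method, apparatus, computer device, storage medium, and system
Technical Field
This application relates to the field of data storage technologies, and in particular to a data reconstruction method, apparatus, computer device, storage medium, and storage system.
Background
To improve the fault tolerance of a storage system, data to be stored is generally stored using erasure coding (EC) and striping. Specifically, the data to be stored can be divided into multiple stripes. If the EC redundancy mode is N data blocks + M parity blocks, each stripe includes N data blocks and M parity blocks, where N and M are integers greater than 0, and each data block or parity block is regarded as one block. The N+M blocks of a stripe can be stored on multiple storage devices. When any storage device loses a block that it stores, the storage system needs to reconstruct the lost block so that the loss does not affect normal services.
Block reconstruction can proceed as follows. For a stripe that has lost a block, the control device in the storage system determines the one lost block in the stripe and sends N read requests to the storage devices, each read request instructing that one unlost block be read. After receiving each read request, a storage device sends the unlost block indicated by that request to the control device. After obtaining N blocks, the control device computes the lost block from the check matrix of the stripe and the N obtained blocks, and sends the recovered block in a write request to the storage device that lost it. After receiving the write request, that storage device writes the recovered block into its storage medium, which completes the block reconstruction.
In this reconstruction process, N read requests, N blocks, and the reconstructed block must be transmitted between the control device and the storage devices that store the stripe. This transmission occupies a large amount of network bandwidth and therefore degrades block reconstruction performance.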
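Summary
Embodiments of the present invention provide a data reconstruction method, apparatus, computer device, storage medium, and system that can improve block reconstruction performance. The technical solutions are as follows:
According to a first aspect, a data reconstruction method is provided. The method includes: determining a first block among the lost blocks in a first stripe, where the blocks in the first stripe are stored by a target number of storage devices;
obtaining first results of the target number of storage devices, where each first result is calculated by one of the target number of storage devices, which reads the valid blocks of the first stripe that it stores and performs a calculation on the valid blocks it has read; and
reconstructing the first block according to the first results of the target number of storage devices.
Based on this implementation, the lost first block in the first stripe is reconstructed directly from the first results obtained from the target number of storage devices. The unlost blocks of the first stripe do not need to be read; only the target number of first results is needed to reconstruct the first block. Because the first results contain less data than the unlost blocks of the first stripe, less network bandwidth is occupied during data transmission, which improves block reconstruction performance.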
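In a possible implementation, before the obtaining of the first results of the target number of storage devices, the method further includes:
splitting the check matrix of the first stripe into a first sub-matrix and the target number of second sub-matrices, where the first sub-matrix includes the column corresponding to the first block and the column corresponding to a second block, each second sub-matrix includes the columns corresponding to at least one third block of the first stripe stored on one storage device, the second block is any one of the valid blocks of the first stripe, and the third block is any valid block other than the second block;
the reconstructing of the first block according to the first results of the target number of storage devices includes:
reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix.
In a possible implementation, the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix includes:
summing the first results of the target number of storage devices to obtain a summation matrix;
obtaining a target block matrix based on the inverse matrix of the first sub-matrix and the summation matrix; and
determining the first target row of the target block matrix as the reconstructed first block.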
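The three sub-steps above (sum, invert, take the first target row) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: arithmetic is done in the prime field GF(257) for brevity, whereas erasure codes commonly use GF(2^w), where addition and subtraction coincide and the explicit negation below disappears. The names `P`, `mat_add`, `inv2x2`, and `reconstruct` are hypothetical, and the check matrix is assumed to have two rows as in the description.

```python
P = 257  # assumed prime modulus; a stand-in for the code's real Galois field


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices modulo P."""
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inv2x2(m):
    """Inverse of a 2x2 matrix modulo P; raises ValueError if it is singular."""
    (a, b), (c, d) = m
    det_inv = pow((a * d - b * c) % P, -1, P)  # modular inverse, Python 3.8+
    return [[d * det_inv % P, -b * det_inv % P],
            [-c * det_inv % P, a * det_inv % P]]


def reconstruct(first_results, first_sub):
    """Sum the first results, then solve first_sub * X = -sum for X.

    Row 0 of X is the rebuilt first block and row 1 is the rebuilt second
    block (the first and second target rows of the target block matrix).
    """
    total = first_results[0]
    for q in first_results[1:]:
        total = mat_add(total, q)
    inv = inv2x2(first_sub)
    width = len(total[0])
    return [[-(inv[r][0] * total[0][c] + inv[r][1] * total[1][c]) % P
             for c in range(width)] for r in range(2)]
```

The second row of the returned matrix is what the verification step described later compares against the stored second block.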
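In a possible implementation, before the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
sending a first acquisition request to the storage device corresponding to each second sub-matrix, where the first acquisition request carries the second sub-matrix corresponding to that storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the control device; and
obtaining the first results returned by the target number of storage devices in response to the first acquisition request.
Based on this possible implementation, a single first acquisition request per storage device is enough to obtain the first result from that storage device, and a large number of read requests does not need to be sent to each storage device. This not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, which further improves block reconstruction performance.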
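In a possible implementation, after the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
sending a write request to a target storage device, where the write request carries the reconstructed first block and the block information of the reconstructed first block, so that the target storage device stores the reconstructed first block according to its block information.
In a possible implementation, before the sending of the write request to the target storage device, the method further includes:
obtaining the second block from the storage device that stores the second block;
reconstructing the second block according to the first results of the target number of storage devices and the first sub-matrix; and
if the reconstructed second block is the same as the obtained second block, performing the step of sending the write request to the target storage device; otherwise, not performing that step and prohibiting use of the first stripe.
Based on this possible implementation, comparing the content of the second block with that of the reconstructed second block helps detect silent data corruption and improves the reliability of the system.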
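The consistency check described above can be sketched as a small guard around the write. This is only an illustration: the function name `commit_if_consistent` and the callbacks `write_block` and `disable_stripe` are hypothetical placeholders for whatever the control device uses to send the write request and to prohibit use of the stripe.

```python
def commit_if_consistent(rebuilt_first, rebuilt_second, stored_second,
                         write_block, disable_stripe):
    """Write the rebuilt first block only if the rebuilt second block matches
    the copy read from the device that stores it; otherwise disable the stripe.

    Returns True when the write was issued, False when the stripe was disabled.
    """
    if rebuilt_second == stored_second:
        write_block(rebuilt_first)
        return True
    disable_stripe()
    return False
```

A mismatch means that at least one input to the reconstruction was silently corrupted, so the rebuilt first block cannot be trusted either.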
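In a possible implementation, before the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
sending a second acquisition request to the storage device corresponding to each second sub-matrix, where the second acquisition request carries the second sub-matrix corresponding to that storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the target storage device;
sending a target reconstruction request to the target storage device, where the target reconstruction request carries the first sub-matrix and instructs that the first block be reconstructed according to the first sub-matrix; and
receiving a reconstruction completion response sent by the target storage device, where the reconstruction completion response indicates that reconstruction of the first block is complete.
Based on this possible implementation, a single second acquisition request per storage device is enough to have that storage device send its first result, and a large number of read requests does not need to be sent to each storage device. This not only reduces the CPU overhead of the control device but also further reduces network bandwidth usage, which further improves block reconstruction performance.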
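In a possible implementation, before the determining of the first block among the lost blocks in the first stripe, the method further includes:
receiving a first reconstruction request, where the first reconstruction request carries the storage medium identifier of a failed storage medium in a storage device;
determining at least one second stripe according to the storage medium identifier carried in the first reconstruction request; and
determining any one of the at least one second stripe as the first stripe;
the determining of the first block among the lost blocks in the first stripe includes:
determining, according to the storage medium identifier, any one of the lost blocks in the first stripe as the first block.
In a possible implementation, before the determining of the first block among the lost blocks in the first stripe, the method further includes:
receiving a second reconstruction request, where the second reconstruction request carries the block information of at least one block lost from a storage medium in a storage device;
determining at least one second stripe according to the block information of the at least one block carried in the second reconstruction request; and
determining any one of the at least one second stripe as the first stripe;
the determining of the first block among the lost blocks in the first stripe includes:
determining, according to the block information of the at least one block, any one of the lost blocks in the first stripe as the first block.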
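In a possible implementation, after the reconstructing of the first block according to the first results of the target number of storage devices, the method further includes:
querying whether all lost blocks of the first stripe have been reconstructed; if not, continuing to perform the foregoing block reconstruction process on any block that has not been reconstructed; otherwise, the reconstruction of the first stripe is complete.
In a possible implementation, before the determining of the first block among the lost blocks in the first stripe, the method further includes:
querying whether a storage medium in a first storage device among the target number of storage devices has failed;
the determining of the first block among the lost blocks in the first stripe includes:
when a storage medium in the first storage device has failed, determining the first block among the lost blocks in the first stripe from the blocks stored in the failed storage medium.
In a possible implementation, before the determining of the first block among the lost blocks in the first stripe, the method further includes:
querying whether any block has been lost from a storage medium in a first storage device among the target number of storage devices; and
when any storage medium in the first storage device has lost at least one block, determining the first block among the lost blocks in the first stripe from the at least one block.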
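In a possible implementation, before the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
sending a third acquisition request to the storage unit of the storage device corresponding to each second sub-matrix, where the third acquisition request carries the second sub-matrix corresponding to that storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the first storage device; and
obtaining the first results returned by the storage units of the target number of storage devices in response to the third acquisition request.
In a possible implementation, before the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
sending a fourth acquisition request to the storage unit of the storage device corresponding to each second sub-matrix, where the fourth acquisition request carries the second sub-matrix corresponding to that storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of a main storage device; and
obtaining at least one target summation matrix from at least one main storage device, where the at least one main storage device is used to manage the target number of storage devices, each target summation matrix is the sum of the first results returned by at least one storage device in response to the fourth acquisition request, and the at least one storage device is managed by one main storage device;
the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix includes:
reconstructing the first block according to the at least one target summation matrix and the first sub-matrix.
In a possible implementation, the reconstructing of the first block according to the at least one target summation matrix and the first sub-matrix includes:
summing the at least one target summation matrix to obtain a summation matrix;
obtaining a target block matrix based on the inverse matrix of the first sub-matrix and the summation matrix; and
determining the first target row of the target block matrix as the reconstructed first block.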
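The two-level summation above works because matrix addition is associative: summing per-group partial sums gives the same summation matrix as summing all first results at once. A minimal sketch follows, assuming arithmetic modulo the prime 257 as a stand-in for the code's real field; `mat_sum`, `aggregate_by_group`, and the device identifiers are hypothetical.

```python
P = 257  # assumed prime modulus; a stand-in for the code's real Galois field


def mat_sum(matrices):
    """Element-wise sum modulo P of a non-empty list of equally sized matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) % P for c in range(cols)]
            for r in range(rows)]


def aggregate_by_group(first_results, main_of):
    """Each main storage device sums the first results of the devices it manages.

    first_results: {device_id: first result matrix}
    main_of:       {device_id: identifier of the managing main storage device}
    Returns {main_device_id: target summation matrix}.
    """
    grouped = {}
    for dev, q in first_results.items():
        grouped.setdefault(main_of[dev], []).append(q)
    return {main: mat_sum(qs) for main, qs in grouped.items()}
```

With this layout only one matrix per group crosses the link between enclosures or availability zones, regardless of how many storage devices the group contains.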
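According to a second aspect, a data reconstruction method is provided. The method includes:
reading the valid blocks of a first stripe stored by a storage device, where the first stripe is stored by a target number of storage devices;
performing a calculation on the valid blocks that were read to obtain a first result; and
sending the first result to a target device, so that the target device reconstructs a first block among the lost blocks in the first stripe according to the first results of the target number of storage devices.
In a possible implementation, before the reading of the valid blocks of the first stripe stored by the storage device, the method further includes:
receiving an acquisition request, where the acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the target device; the second sub-matrix includes the columns corresponding to at least one third block of the first stripe stored on the corresponding storage device, the second block is any one of the valid blocks of the first stripe, and the third block is any valid block other than the second block;
the reading of the valid blocks of the first stripe stored by the storage device includes:
reading the at least one third block according to the block information of the third block corresponding to each column of the second sub-matrix;
the performing of a calculation on the valid blocks that were read to obtain the first result includes:
performing a calculation on the at least one third block and the second sub-matrix to obtain the first result.
In a possible implementation, the performing of a calculation on the at least one third block and the second sub-matrix to obtain the first result includes:
forming a block matrix from the at least one third block, where each row of the block matrix is one third block; and
multiplying the second sub-matrix by the block matrix to obtain the first result.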
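The storage-side calculation above is a plain matrix product. The sketch below assumes arithmetic modulo the prime 257 instead of the Galois field a production erasure code would use, and represents each block as a list of symbols; `first_result` and its parameter names are hypothetical.

```python
P = 257  # assumed prime modulus; a stand-in for the code's real Galois field


def first_result(second_sub, third_blocks):
    """Multiply a second sub-matrix (rows x k) by the block matrix (k x L).

    Row i of the block matrix is the i-th third block read from local media,
    so the sub-matrix must have exactly one column per third block.
    """
    k = len(third_blocks)
    if any(len(row) != k for row in second_sub):
        raise ValueError("one sub-matrix column is required per third block")
    length = len(third_blocks[0])
    return [[sum(row[i] * third_blocks[i][c] for i in range(k)) % P
             for c in range(length)] for row in second_sub]
```

The result always has as many rows as the check matrix, however many third blocks the device read. With the two-row check matrix of the description, a device holding six blocks therefore returns two rows instead of six blocks.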
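In a possible implementation, the target device includes a control device, a target storage device, a first storage device, and at least one main storage device, where the first storage device is any one of the target storage devices, and the at least one main storage device is used to manage the target number of storage devices;
when the identifier of the target device is the identifier of the control device, the acquisition request is a first acquisition request, which instructs that the first result be sent to the control device;
when the identifier of the target device is the identifier of the target storage device, the acquisition request is a second acquisition request, which instructs that the first result be sent to the target storage device;
when the identifier of the target device is the identifier of the first storage device, the acquisition request is a third acquisition request, which instructs that the first result be sent to the first storage device;
when the identifier of the target device is the identifier of the main storage device, the acquisition request is a fourth acquisition request, which instructs that the first result be sent to the main storage device.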
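The four request types above differ only in which device the carried identifier names. One way to model this is sketched below; the field names, the `TargetKind` enum, and the `kind_of` lookup are all hypothetical and do not describe a wire format defined by the source.

```python
from dataclasses import dataclass
from enum import Enum


class TargetKind(Enum):
    CONTROL_DEVICE = 1         # first acquisition request
    TARGET_STORAGE_DEVICE = 2  # second acquisition request
    FIRST_STORAGE_DEVICE = 3   # third acquisition request
    MAIN_STORAGE_DEVICE = 4    # fourth acquisition request


@dataclass(frozen=True)
class AcquisitionRequest:
    second_sub: tuple  # second sub-matrix for the receiving storage device
    block_info: tuple  # block information, one entry per sub-matrix column
    target_id: str     # identifier of the device that receives the first result


def request_ordinal(request, kind_of):
    """Return 1..4 for a first..fourth acquisition request.

    kind_of maps a device identifier to the role that device plays.
    """
    return kind_of[request.target_id].value
```

The receiving storage unit performs the same calculation in every case; only the destination of the first result changes.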
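In a possible implementation, before the receiving of the acquisition request, the method further includes:
querying whether a storage medium in the storage device has failed; and
when a storage medium in the storage device has failed, sending a first reconstruction request to the control device, where the first reconstruction request carries the storage medium identifier of the failed storage medium in the storage device.
In a possible implementation, before the receiving of the acquisition request, the method further includes:
querying whether any block has been lost from a storage medium in the storage device; and
when any storage medium in the storage device has lost at least one block, sending a second reconstruction request to the control device, where the second reconstruction request carries the block information of the at least one block.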
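According to a third aspect, a data reconstruction method is provided. The method includes:
receiving a target reconstruction request, where the target reconstruction request carries a first sub-matrix and instructs that a first block among the lost blocks in a first stripe be reconstructed according to the first sub-matrix; the first sub-matrix includes the column corresponding to the first block among the lost blocks in the first stripe and the column corresponding to a second block, and the second block is any one of the valid blocks of the first stripe;
obtaining the first results returned by a target number of storage devices in response to a second acquisition request, where the second acquisition request carries the second sub-matrix corresponding to a storage device, the block information of the third block corresponding to each column of the second sub-matrix, and the identifier of the target storage device; the blocks in the first stripe are stored by the target number of storage devices; each first result is calculated by one of the target number of storage devices, which reads the valid blocks of the first stripe that it stores and performs a calculation on the valid blocks it has read; and the third block is any valid block other than the second block; and
reconstructing the first block according to the first results of the target number of storage devices and the first sub-matrix.
In a possible implementation, the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix includes:
summing the first results of the target number of storage devices to obtain a summation matrix;
obtaining a target block matrix based on the inverse matrix of the first sub-matrix and the summation matrix; and
using the first target row of the target block matrix as the reconstructed first block.
In a possible implementation, after the reconstructing of the first block according to the first results of the target number of storage devices and the first sub-matrix, the method further includes:
storing the reconstructed first block; and
when storage of the reconstructed first block is complete, sending a reconstruction completion response to the control device, where the reconstruction completion response indicates that reconstruction of the first block is complete.
In a possible implementation, before the storing of the reconstructed first block, the method further includes:
obtaining the second block from the storage device that stores the second block;
reconstructing the second block according to the first results of the target number of storage devices and the first sub-matrix; and
if the reconstructed second block is the same as the obtained second block, performing the step of storing the reconstructed first block; otherwise, not performing that step and prohibiting use of the first stripe.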
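According to a fourth aspect, a block reconstruction apparatus is provided, configured to perform the foregoing data reconstruction method. Specifically, the apparatus includes functional modules for performing the data reconstruction method provided in the first aspect or in any optional implementation of the first aspect.
According to a fifth aspect, a block reconstruction apparatus is provided, configured to perform the foregoing data reconstruction method. Specifically, the apparatus includes functional modules for performing the data reconstruction method provided in the second aspect or in any optional implementation of the second aspect.
According to a sixth aspect, a block reconstruction apparatus is provided, configured to perform the foregoing data reconstruction method. Specifically, the apparatus includes functional modules for performing the data reconstruction method provided in the third aspect or in any optional implementation of the third aspect.
According to a seventh aspect, a computer device is provided. The computer device includes a processor and a memory, the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the operations performed in the foregoing data reconstruction method.
According to an eighth aspect, a storage medium is provided. The storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed in the foregoing data reconstruction method.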
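According to a ninth aspect, a data reconstruction method in a storage system is provided. The storage system includes a control device and one or more storage devices, each of the one or more storage devices includes a hard disk, and the one or more storage devices are used to store the blocks of a stripe generated by the control device. The method includes:
a first storage device, among the one or more storage devices, that stores valid blocks of the stripe reads the stored valid blocks and performs a calculation on the valid blocks it has read to obtain a first result;
the first storage device sends the first result to the control device; and
the control device receives the first result and restores the damaged block in the stripe according to the first result.
In a possible implementation, the stripe is generated according to an erasure coding algorithm.
In a possible implementation, the one or more storage devices are hard disk enclosures.
According to a tenth aspect, a storage system is provided. The storage system includes a control device and one or more storage devices, each of the one or more storage devices includes a hard disk, and the one or more storage devices are used to store the blocks of a stripe generated by the control device;
a first storage device, among the one or more storage devices, that stores valid blocks of the stripe is configured to read the stored valid blocks, perform a calculation on the valid blocks it has read to obtain a first result, and send the first result to the control device; and
the control device is configured to receive the first result and restore the damaged block in the stripe according to the first result.
In a possible implementation, the one or more storage devices are hard disk enclosures.
In a possible implementation, the stripe is generated according to an erasure coding algorithm.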
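According to an eleventh aspect, a data reconstruction method in a storage system is provided. The storage system includes multiple storage devices, each of the multiple storage devices includes one or more hard disks, and the multiple storage devices are used to store the blocks of a stripe. The method includes:
a first storage device, among the multiple storage devices, that stores valid blocks of the stripe reads the stored valid blocks and performs a calculation on the valid blocks it has read to obtain a first result; and
a second storage device among the multiple storage devices restores the damaged block in the stripe according to the first result.
In a possible implementation, the second storage device is the main storage device among the multiple storage devices, or the storage device among the multiple storage devices that stores the damaged block of the stripe.
In a possible implementation, the method further includes:
the first storage device sends the first result to the second storage device, where the first storage devices do not include the second storage device; and
the second storage device receives the first result.
In a possible implementation, the method further includes:
the storage devices among the first storage devices other than the second storage device send the first result to the second storage device, where the first storage devices include the second storage device; and
the second storage device receives the first results sent by the other storage devices.
According to a twelfth aspect, a storage system is provided. The storage system includes multiple storage devices, each of the multiple storage devices includes one or more hard disks, and the multiple storage devices are used to store the blocks of a stripe;
a first storage device, among the multiple storage devices, that stores valid blocks of the stripe is configured to read the stored valid blocks and perform a calculation on the valid blocks it has read to obtain a first result; and
a second storage device among the multiple storage devices is configured to restore the damaged block in the stripe according to the first result.
In a possible implementation, the second storage device is the main storage device among the multiple storage devices, or the storage device among the multiple storage devices that stores the damaged block of the stripe.
In a possible implementation, the first storage device is configured to send the first result to the second storage device, where the first storage devices do not include the second storage device; and
the second storage device is configured to receive the first result.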
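Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings used in describing the embodiments. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present invention;
FIG. 3 is a flowchart of a data reconstruction method provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 7 is a flowchart of a data reconstruction method provided by an embodiment of the present invention;
FIG. 8 is a flowchart of a data reconstruction method provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 10 is a flowchart of a data reconstruction method provided by an embodiment of the present invention;
FIG. 11 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 12 is a flowchart of a data reconstruction method provided by an embodiment of the present invention;
FIG. 13 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a data reconstruction apparatus provided by an embodiment of the present invention;
FIG. 15 is a schematic structural diagram of a data reconstruction apparatus provided by an embodiment of the present invention;
FIG. 16 is a schematic structural diagram of a data reconstruction apparatus provided by an embodiment of the present invention.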
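Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the following describes the implementations of the present invention in further detail with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present invention. Referring to FIG. 1, the implementation environment includes a control device 101, at least one storage device 102, and a target storage device 103, which can be connected to one another through optical fiber or cable.
The control device 101 is used to write data to the storage device 102 and the target storage device 103. The control device 101 can also be used to read data from the storage device 102 or the target storage device 103, and to reconstruct blocks lost in the storage device 102.
The storage device 102 is used to store data and to output stored data to the control device. The storage device 102 is also used to send intermediate result data of the block reconstruction process to the control device 101 or the target storage device 103, so that the control device 101 or the target storage device 103 can reconstruct a lost block directly from the intermediate result data sent by the at least one storage device 102.
The target storage device 103 is used to store reconstructed blocks. The target storage device 103 is also used to reconstruct a lost block from the intermediate result data sent by the at least one storage device 102. It should be noted that the target storage device 103 may be any one of the at least one storage device 102, or a storage device other than the at least one storage device 102.
The storage device 102 and the target storage device 103 may each further include at least one storage medium, and each storage medium is used to store data written by the control device.
The control device 101, the storage device 102, and the target storage device 103 may each further include an input/output (IO) unit, which is used to send and receive messages.
It should be noted that, in some embodiments, the control device 101, the storage device 102, and the target storage device 103 may each include a storage unit and a control unit, where the control unit may have the functions of the control device 101 described above, and the storage unit may have the functions of the storage device 102 or the target storage device 103 described above. In other words, the control device 101, the storage device 102, and the target storage device 103 are all devices that include a storage unit and a control unit, and such a device has the functions of the control device, the storage device, and the target storage device.
In some embodiments, the storage devices that include a storage unit and a control unit can be divided into multiple groups, and each group can have one main storage device that manages the storage devices in that group. Each storage device can send its intermediate result of the block reconstruction process to the main storage device. The main storage device can sum the received intermediate results and send the sum to the device that reconstructs the block, which restores the block from the received sums. The main storage device may be any one of the storage devices in the group or a device outside the group. For example, one hard disk enclosure can be regarded as one group, where the hard disk enclosure includes at least one storage device having a storage unit and a control unit and one main storage device; or one availability zone (AZ) can be regarded as one group, where the AZ includes at least one storage device having a storage unit and a control unit and one main storage device.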
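The control device, the storage device, and the target storage device described above may each be a computer device. FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present invention. The computer device 200 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 201 and one or more memories 202, where the memory 202 stores at least one instruction that is loaded and executed by the processor 201 to implement the methods provided in the method embodiments below. The computer device 200 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may include other components for implementing device functions, which are not described here.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions, where the instructions can be executed by a processor in a computer device to complete the data reconstruction method in the embodiments below. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
The foregoing describes the implementation environment and the computer device. In addition, before the blocks of a stripe are stored on at least one storage device based on EC and striping, the control device can assign a stripe identifier to the stripe to distinguish it from other stripes; the stripe identifier may be the number of the stripe. The control device can also assign block information, a block number, and a node identifier to each block of the stripe, where the node identifier indicates the storage device that stores the block. The block information of a block may be its storage address, which may include the internet protocol (IP) address of the storage device that stores the block and the offset address of the block in the storage medium of that storage device; the storage address may also be a worldwide name (WWN). When the storage device that stores a block includes multiple storage media, the block information of the block may further include the medium identifier of the storage medium that stores the block, which may be the number of the storage medium. The embodiments of the present invention do not specifically limit the stripe identifier, the block information, the node identifier, or the medium identifier.
After the assignment is complete, the control device can store the information assigned to the stripe and to each block of the stripe in association, so that the control device can later determine from the stored information which block of which stripe is lost and on which storage medium of which storage device the lost block was stored. To achieve this associated storage, the control device can store the assigned information in an allocation table, referred to below as the association table. The control device can also store a block identifier for each block in the association table. The block identifier is either a data identifier or a parity identifier, where the data identifier indicates that the block is a data block and the parity identifier indicates that the block is a parity block. After storing the assigned information in the table, the control device can store each block according to the information that corresponds to it in the table.
For example, in the association table shown in Table 1, stripe 1 includes blocks 1 to p, where p is an integer greater than 1. Block 1 is a data block stored on storage device 1, so the control device can store block 1 in storage device 1; block p is a parity block stored on storage device 2, so the control device can store block p in storage device 2.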
Table 1
Figure PCTCN2019097155-appb-000001
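In addition, before the control device stores the blocks of the first stripe, the control device generates at least one parity block from the data blocks of the stripe and a check matrix through EC calculation or redundant arrays of independent drives (RAID) calculation. The check matrix H is a 2×(N+M) matrix, where the stripe includes N data blocks and M parity blocks. Column j of the check matrix H corresponds to the j-th block of the stripe; in other words, each column of the check matrix corresponds to one block of the first stripe, where N, M, and j are all positive integers.
The process of generating parity blocks is described below using the EC 23+2 redundancy mode as an example, where 23 is the number of data blocks and 2 is the number of parity blocks.
The control device saves a computer program on a local storage medium and executes it on the processor of the control device. The control device divides the original data into 23 blocks, that is, 23 data blocks, arranges the 23 data blocks in a certain order, and generates the parity data (2 parity blocks) from the 23 data blocks through EC/RAID calculation.
Specifically, the control device can perform the calculation using the check matrix equation (1).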
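H * x^T = 0        (1)
Here, the check matrix H is stored in the computer program, and x^T is the transpose of the stripe matrix x. With the 23+2 ratio, the check matrix H can be expressed by formula (2), and the stripe matrix x by formula (3).
H = [ a_{1,1} a_{1,2} … a_{1,j} … a_{1,25} ; a_{2,1} a_{2,2} … a_{2,j} … a_{2,25} ]      (2)
x = (x_1, x_2, …, x_j, …, x_24, x_25)      (3)
Here, a_{1,j} denotes the element in row 1, column j of the check matrix H, a_{2,j} denotes the element in row 2, column j of the check matrix H, and x_j denotes the j-th block of the stripe; x_1 to x_23 are data blocks, and x_24 and x_25 are parity blocks.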
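It should be noted that each column of the check matrix H corresponds to a column of the stripe matrix x, that is, x_j corresponds to H_j, where H_j is column j of the check matrix H. Because the data blocks x_1 to x_23 are the original data and are therefore known, and the check matrix H is also known, the control device can solve for the unknown parity blocks x_24 and x_25 according to formulas (1) to (3). The solution process is as follows.
The control device splits the check matrix H into two sub-matrices, H_N and H_M: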
H_N = [ a_{1,1} a_{1,2} … a_{1,23} ; a_{2,1} a_{2,2} … a_{2,23} ]
H_M = [ a_{1,24} a_{1,25} ; a_{2,24} a_{2,25} ]
From formula (1), the control device obtains formula (4):
H_N * (x_1, x_2, …, x_23)^T + H_M * (x_24, x_25)^T = 0    (4)
From formula (4), the control device obtains the parity blocks x_24 and x_25:
Figure PCTCN2019097155-appb-000005
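For reference, formula (4) can be solved for the parity blocks directly. The following is a derivation from formula (4) under the assumption that H_M is invertible; it is not a transcription of the equation image above.

```latex
\begin{aligned}
H_N\,(x_1, x_2, \dots, x_{23})^{T} + H_M\,(x_{24}, x_{25})^{T} &= 0 \\
(x_{24}, x_{25})^{T} &= -\,H_M^{-1}\,H_N\,(x_1, x_2, \dots, x_{23})^{T}
\end{aligned}
```

Over a field of characteristic 2 such as GF(2^w), which erasure codes commonly use, every element is its own additive inverse, so the minus sign can be dropped.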
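After obtaining the parity blocks x_24 and x_25, the control device can send the data blocks x_1 to x_23 and the parity blocks x_24 and x_25 of the stripe to at least one storage device through the IO unit. Each of the at least one storage device can access multiple storage media (magnetic disks, solid-state drives, and the like) through its IO unit, and then writes the data blocks or parity blocks into the storage media. It should be noted that the stripe generation process described above is the process of generating a stripe according to an erasure coding algorithm.
In some environments, a storage medium in a storage device may lose the blocks it stores. To ensure that lost blocks do not affect normal services, the control device needs to reconstruct them. Following the parity block solution process above, when any block in a stripe is lost, it can likewise be solved from the other blocks of the stripe. To simplify the calculation in the control device, the solution process can be decomposed into multiple sub-processes that are calculated by the storage devices. Each storage device then sends the result of its sub-process to the control device, and the control device finally reconstructs the lost block from the received results.
In the embodiments of the present invention, a lost block is also called a damaged block, that is, a block that cannot be read. Solving for the lost block means restoring the damaged block. In the embodiments of the present invention, a storage device can calculate an intermediate result from the valid blocks of the stripe that it stores and send the intermediate result to the control device, which restores the damaged block from the intermediate results. In another implementation, a storage device calculates an intermediate result from the valid blocks of the stripe that it stores, and the storage device that stores the damaged block restores it from the intermediate results. Specifically, when the storage device that stores the damaged block also stores valid blocks, the other storage devices that store valid blocks send their calculated intermediate results to it, and it restores the damaged block from these intermediate results. When the storage device that stores the damaged block stores no valid block, the storage devices that store valid blocks send their intermediate results to it, and it restores the damaged block from the intermediate results. In this way, the damaged block can be restored directly on the storage device that stores it, which reduces data exchange. In yet another implementation, a storage device calculates an intermediate result from the valid blocks of the stripe that it stores, and the main storage device among the storage devices restores the damaged block from the intermediate results. Specifically, when the main storage device also stores valid blocks, the other storage devices that store valid blocks send their calculated intermediate results to the main storage device, which restores the damaged block from these intermediate results. When the main storage device stores no valid block, the storage devices that store valid blocks send their intermediate results to the main storage device, which restores the damaged block from the intermediate results. In the embodiments of the present invention, a valid block is a block of the stripe that can be read, that is, a block that is not damaged.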
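In line with the foregoing description, an embodiment of the present invention provides a data reconstruction method in a storage system. The storage system includes a control device and one or more storage devices, each of the one or more storage devices includes a hard disk, and the one or more storage devices are used to store the blocks of a stripe generated by the control device. The method includes:
a first storage device, among the one or more storage devices, that stores valid blocks of the stripe reads the stored valid blocks and performs a calculation on the valid blocks it has read to obtain a first result;
the first storage device sends the first result to the control device; and
the control device receives the first result and restores the damaged block in the stripe according to the first result.
In a possible implementation, the stripe is generated according to an erasure coding algorithm.
In a possible implementation, the one or more storage devices are hard disk enclosures.
Another embodiment of the present invention provides a data reconstruction method in a storage system. The storage system includes multiple storage devices, each of the multiple storage devices includes one or more hard disks, and the multiple storage devices are used to store the blocks of a stripe. The method includes:
a first storage device, among the multiple storage devices, that stores valid blocks of the stripe reads the stored valid blocks and performs a calculation on the valid blocks it has read to obtain a first result; and
a second storage device among the multiple storage devices restores the damaged block in the stripe according to the first result.
In a possible implementation, the second storage device is the main storage device among the multiple storage devices, or the storage device among the multiple storage devices that stores the damaged block of the stripe.
Further, the method also includes:
the first storage device sends the first result to the second storage device, where the first storage devices do not include the second storage device; and
the second storage device receives the first result.
Further, the storage devices among the first storage devices other than the second storage device send the first result to the second storage device, where the first storage devices include the second storage device; and
the second storage device receives the first results sent by the other storage devices.
In a possible implementation, the multiple storage devices are hard disk enclosures.
For the foregoing embodiments of the present invention, refer to the descriptions in the embodiments below; details are not repeated here.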
对丢失的分块进行重构,可以使用图3所示的本发明实施例提供的一种数据重构的方法的流程图,来表明此过程,该方法的流程具体包括:
301、丢失分块的存储设备向控制设备发送重构请求。
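The reconstruction request indicates that lost blocks are to be reconstructed, and may be a first reconstruction request or a second reconstruction request. The first reconstruction request carries the storage medium identifier of a failed storage medium in the storage device. When any storage medium in the storage device fails, the storage device can send the control device a first reconstruction request carrying the identifier of that medium. Therefore, before step 301, the storage device can query whether any storage medium in the storage device has failed; when a storage medium in the storage device has failed, the storage device sends the first reconstruction request to the control device.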
第二重构请求携带存储设备中存储介质丢失的至少一个分块的块信息,当该存储设备内的任一存储介质部分失效,导致丢失至少一个分块时,该存储设备可以将该第二重构请求发送至控制设备,因此,在本步骤301之前,该存储设备还可以查询存储设备中的存储介质是否有分块丢失;当该存储设备内的任一存储介质丢失至少一个分块时,该存储设备向控制设备发送第二重构请求。
302、控制设备根据该重构请求,确定第一条带中已丢失分块中的第一分块,该第一条带中的分块由目标个数的存储设备存储。
该目标个数为用于存储第一条带的分块的存储设备的数目,该目标个数可以是1个也可以是多个,本发明实施例对该目标个数不做具体限定。该第一条带为丢失分块的任一条带。
对于不同的重构请求,控制设备可以通过方式1-2所示的过程,来实现本步骤302。
方式1、当该重构请求为第一重构请求时,本步骤302可以通过下述步骤21-23所示的过程来实现。
步骤21、控制设备根据该第一重构请求携带的存储介质标识,确定至少一个第二条带。
该至少一个第二条带也即是丢失了分块的条带,该存储介质标识所存储的分块为该至少一个第二条带上的分块,因此,该控制设备根据该第一重构请求携带的存储介质标识,确定至少一个第二条带。在一种可能的实现方式中,该控制设备可以从关联表中确定与该存储介质标识对应的至少一个第二条带。
Step 22. The control device determines any one of the at least one second stripe as the first stripe.
该控制设备可以从该至少一个第二条带中随机选择一个第二条带作为该第一条带。若条带标识为编号,该控制设备还可以选择一个编号最大的或最小的第二条带作为该第一条带。本发明实施例对从该至少一个第二条带中选择该第一条带的方式不做具体限定。
步骤23、控制设备根据该存储介质标识,将该第一条带中已丢失分块中的任一个分块确定为该第一分块。
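The control device can first select, from the association table, at least one fourth block that corresponds to both the medium identifier of the storage medium and the stripe identifier of the first stripe; the at least one fourth block is the set of blocks of the first stripe lost by that storage medium. The control device then selects any one of the at least one fourth block as the first block. Note that the way the control device selects the first block from the at least one fourth block is analogous to the way it selects the first stripe from the at least one second stripe in step 22, so it is not described again here.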
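Steps 21-23 amount to two lookups in the association table. The source does not describe the table layout, so the sketch below assumes a hypothetical list of records with `stripe`, `block`, and `medium` fields, and picks the smallest stripe and block numbers (the text also allows a random choice or the largest number).

```python
def pick_first_block(table, failed_medium):
    """Return (first stripe id, first block number) for a failed medium, or None."""
    lost = [rec for rec in table if rec["medium"] == failed_medium]
    stripes = sorted({rec["stripe"] for rec in lost})  # step 21: second stripes
    if not stripes:
        return None
    first_stripe = stripes[0]                          # step 22: pick one stripe
    fourth_blocks = sorted(rec["block"] for rec in lost
                           if rec["stripe"] == first_stripe)
    return first_stripe, fourth_blocks[0]              # step 23: pick one block


# Hypothetical association table contents
table = [
    {"stripe": 7, "block": 3, "medium": "disk-a"},
    {"stripe": 7, "block": 9, "medium": "disk-a"},
    {"stripe": 4, "block": 1, "medium": "disk-a"},
    {"stripe": 4, "block": 2, "medium": "disk-b"},
]
choice = pick_first_block(table, "disk-a")
```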
方式2、当该重构请求为第二重构请求时,本步骤302可以通过下述步骤2A-2C所示的过程来实现。
步骤2A、该控制设备根据该第二重构请求携带的至少一个分块的块信息,确定至少一个第二条带。
该控制设备可以从关联表中确定与该至少一个分块的块信息对应的至少一个第二条带。
步骤2B、控制设备将该至少一个第二条带中任一个条带确定为该第一条带。
本步骤2B与步骤22的实现过程同理,在此,本发明实施例对本步骤2B不做赘述。
步骤2C、控制设备根据该丢失的至少一个分块的块信息,将该第一条带中已丢失分块中的任一个分块确定为该第一分块。
该控制设备可以先在关联表中,选择出与该至少一个分块的块信息以及该第一条带的条带标识所对应的至少一个第四分块,该控制设备可以从该至少一个第四分块选择一个分块作为该第一分块。在步骤23中有对从该至少一个第四分块中确定第一分块的描述,在此对从该至少一个第四分块中确定第一分块不做赘述。
303、控制设备将该第一条带的校验矩阵拆分成第一子矩阵和目标个数的第二子矩阵,该第一子矩阵包括该第一分块所对应的列以及第二分块所对应的列,每个第二子矩阵包括第一条带中在一个存储设备上所存储的至少一个第三分块所对应的列,该第二分块为所述第一条带中的有效分块中的任一分块,该第三分块为有效分块中除该第二分块以外的任一分块。
为了将求解第一分块的过程进行分解,该控制设备可以通过本步骤303将第一条带的校验矩阵进行分解。控制设备可以将校验矩阵中对应第一分块与第二分块的列拆分出来,组成第一子矩阵,也即是待求解的矩阵,控制设备可以将校验矩阵中第三分块所对应的列组成参量矩阵,控制设备再根据目标个数的存储设备所存储的第三分块,将参量矩阵拆分成目标个数的第二子矩阵,其中,一个第二子矩阵对应一个存储设备,一个第二子矩阵中的每一列与 对应的存储设备存储的一个第三分块对应。
需要说明的是,本发明实施例以第一条带包括分块1-25其中分块1-23为数据块,分块24-25为校验块,第一分块为分块1,第二分块为分块25以及分块2-24为第三分块为例进行说明。
当目标个数大于1时,为了进一步说明本步骤303所述的过程,参见图4所示的本发明实施例提供的一种分块重构过程的示意图,在图4中有1个控制设备和2个硬盘框,每个硬盘框相当于一个存储设备,也即是目标个数为2,控制设备和硬盘框均包括CPU、内存以及IO单元,硬盘框还包括用于存储数据的存储介质,硬盘框中的IO单元还可以替换为网络接口单元(网络接口单元在图4中简称网络),其中,网络接口单元的作用和IO单元相同。控制设备把第一条带的分块按框分组,具体地,假设存储设备(硬盘框)1存储有第一条带上的分块1-12,存储设备(硬盘框)2存储有第一条带上的分块13-25,该控制设备可以将分块1(第一分块)和分块25(第二分块)作为待修复的分块,也即是待求解的分块,因此,存储设备1中存储的分块2-12为存储设备1存储的第一条带的第三分块,存储设备2中存储的分块13-24为存储设备2存储的第一条带的第三分块,进而控制设备可以将校验矩阵中分块2-12所对应的第2-12列拆分出来,组成存储设备1所对应的第二子矩阵H 1,将校验矩阵中分块13-24所对应的第13-24列拆分出来,组成存储设备2所对应的第二子矩阵H 2,将校验矩阵H中与分块1和25对应的第1和25列拆分出来,组成第一子矩阵H 3,其中H 1-H 3可以表示为:
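$$H_1 = \begin{pmatrix} a_{1,2} & a_{1,3} & \cdots & a_{1,12} \\ a_{2,2} & a_{2,3} & \cdots & a_{2,12} \end{pmatrix}$$
$$H_2 = \begin{pmatrix} a_{1,13} & a_{1,14} & \cdots & a_{1,24} \\ a_{2,13} & a_{2,14} & \cdots & a_{2,24} \end{pmatrix}$$
$$H_3 = \begin{pmatrix} a_{1,1} & a_{1,25} \\ a_{2,1} & a_{2,25} \end{pmatrix}$$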
当目标个数等于1时,为了进一步说明本步骤303所述的过程,参见图5所示的本发明实施例提供的一种分块重构过程的示意图,在图5中有1个控制设备和1个硬盘框,也即是目标个数为1,控制设备和硬盘框均包括CPU、内存以及IO单元,硬盘框还包括用于存储数据的存储介质,硬盘框中的IO单元还可以替换为网络接口单元(网络接口单元在图4中简称网络),其中,网络接口单元的作用和IO单元相同。存储设备(硬盘框)3存储有第一条带上的分块1-25,该控制设备可以将分块1(第一分块)和分块25(第二分块)作为待修复的分块,也即是待求解的分块,因此,存储设备3中存储的分块2-24为存储设备3存储的第一条带的第三分块,进而控制设备可以将校验矩阵中分块2-24所对应的第2-24列拆分出来,组成存储设备3所对应的第二子矩阵H 4,将校验矩阵H中与分块1和25对应的第1和25列拆分出来,组成第一子矩阵H 3,其中H 4可以表示为:
$$H_4 = \begin{pmatrix} a_{1,2} & a_{1,3} & \cdots & a_{1,24} \\ a_{2,2} & a_{2,3} & \cdots & a_{2,24} \end{pmatrix}$$
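The column split of step 303 can be sketched as follows. The function name `split_check_matrix`, the `placement` dictionary, and the sample matrix values are assumptions made for illustration; only the grouping rule (first and second block columns in one sub-matrix, the remaining columns grouped per storage device) comes from the description.

```python
def split_check_matrix(H, placement, first, second):
    """Split H by columns.

    placement maps a device name to the 1-based block numbers it stores.
    Returns the first sub-matrix and, per device, (third block numbers, second sub-matrix).
    """
    def columns(block_numbers):
        return [[row[b - 1] for b in block_numbers] for row in H]

    first_sub = columns([first, second])
    second_subs = {}
    for device, blocks in placement.items():
        third = [b for b in blocks if b not in (first, second)]
        if third:  # a device holding no third block gets no sub-matrix
            second_subs[device] = (third, columns(third))
    return first_sub, second_subs


# Hypothetical 2 x 25 check matrix and the two-enclosure placement of Figure 4
H = [[1] * 25, list(range(1, 26))]
placement = {"enclosure-1": list(range(1, 13)), "enclosure-2": list(range(13, 26))}
first_sub, second_subs = split_check_matrix(H, placement, first=1, second=25)
```

With a single enclosure holding blocks 1-25, the same call yields one second sub-matrix covering columns 2-24, which matches the Figure 5 case.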
304、控制设备向每个第二子矩阵所对应的存储设备分别发送第一获取请求,该第一获取请求携带与存储设备对应的第二子矩阵、该一个第二子矩阵中每一列所对应的第三分块的块信息以及控制设备的标识。
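Each first acquisition request instructs the receiving storage device to send its first result to the control device. The first acquisition request is one kind of acquisition request. An acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column of that second sub-matrix, and the identifier of a target device. When the identifier of the target device is the identifier of the control device, the acquisition request is a first acquisition request, and the target device is the control device.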
为了让各个存储设备可以计算求解第一子矩阵过程中的各个子过程,该控制设备可以将每个第二子矩阵发送至其对应的存储设备,由存储设备来完成子过程(步骤306-307所示的求解第一结果的过程)。
由于存储设备可能存储有多个条带的多个分块,该控制设备可以将第二子矩阵对应的第三分块的块信息发送给存储设备,以便存储设备可以根据第三分块的块信息,获取到与接收的第二子矩阵对应的第三分块,因此,控制设备可以将一个第二子矩阵以及与该一个子矩阵对应的第三分块的块信息,组成第一获取请求,并发送给对应的存储设备。
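Note that in some embodiments the control device reconstructs the first block, while in other embodiments the target storage device does. Whichever device reconstructs the first block needs to obtain the computation results of the target number of storage devices. Therefore, when the target device identifier carried in an acquisition request is the identifier of the control device, the acquisition request is a first acquisition request and instructs the storage device to send its first result to the control device. The first result is computed by the storage device from the valid blocks of the first stripe that it reads from its storage, that is, it is the computation result of the storage device.
For example, the control device in Figure 4 sends first acquisition request 1 to storage device 1 through its IO unit; first acquisition request 1 carries the second sub-matrix $H_1$ and the block information of blocks 2-12. The control device sends first acquisition request 2 to storage device 2 through its IO unit; first acquisition request 2 carries the second sub-matrix $H_2$ and the block information of blocks 13-24. As another example, the control device in Figure 5 sends first acquisition request 3 to storage device 3 through its IO unit; first acquisition request 3 carries the second sub-matrix $H_4$ and the block information of blocks 2-24.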
305、存储设备接收第一获取请求。
该存储设备为目标个数的存储设备中的任一个存储设备。通过本步骤305目标个数的存储设备均会接收到一个第一获取请求。例如,图4中的存储设备1和2通过IO单元,分别接收到第一获取请求1和2。再例如,图5中的存储设备3通过IO单元,接收到第一获取请求3。本步骤305也即是存储设备接收获取请求的步骤。
306、存储设备根据该第二子矩阵中每一列所对应的第三分块的块信息,读取该至少一个第三分块。
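Based on the block information of the third block corresponding to each column of the second sub-matrix, the storage device can determine the storage location of each such third block and then read the at least one third block from those locations. For example, storage device 1 in Figure 4 can read blocks 2-12 through its IO unit according to the block information of blocks 2-12 in first acquisition request 1, and storage device 2 can read blocks 13-24 through its IO unit according to the block information of blocks 13-24 in first acquisition request 2. As another example, storage device 3 in Figure 5 can read blocks 2-24 through its IO unit according to the block information of blocks 2-24 in first acquisition request 3.
The third blocks are the valid blocks that are read. Note that the process shown in step 306 is the process by which the storage device reads the valid blocks of the first stripe that it stores.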
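307. The storage device computes the first result from the at least one third block and the second sub-matrix.
The first result can be used to represent a feature of the third blocks corresponding to the columns of the second sub-matrix. The storage device can assemble the at least one third block into a block matrix, each row of which is one third block, and multiply the second sub-matrix by the block matrix to obtain the first result. The process shown in step 307 is the process of computing the first result from the valid blocks that were read.
For example, the CPU in storage device 1 in Figure 4 assembles blocks 2-12 into the block matrix $P_1$ and multiplies $H_1$ by $P_1$ to obtain the first result $Q_1$; the CPU in storage device 2 assembles blocks 13-24 into the block matrix $P_2$ and multiplies $H_2$ by $P_2$ to obtain the first result $Q_2$, where $P_1 = (x_2, x_3, \dots, x_{12})^T$, $Q_1 = H_1 P_1$, $P_2 = (x_{13}, x_{14}, \dots, x_{24})^T$, and $Q_2 = H_2 P_2$.
As another example, the CPU in storage device 3 in Figure 5 assembles blocks 2-24 into the block matrix $P_3$ and multiplies $H_4$ by $P_3$ to obtain the first result $Q_3$, where $P_3 = (x_2, x_3, \dots, x_{24})^T$ and $Q_3 = H_4 P_3$.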
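The per-device computation of step 307 is a plain matrix product. The sketch below assumes arithmetic modulo the prime 257 (the source does not name the field) and uses made-up values; `first_result` is a hypothetical helper name.

```python
P = 257  # assumed prime modulus; the source does not specify the field


def first_result(sub_matrix, blocks):
    """Multiply a second sub-matrix (r x m) by a block matrix (m rows, one block per row)."""
    length = len(blocks[0])
    return [[sum(row[i] * blocks[i][t] for i in range(len(blocks))) % P
             for t in range(length)]
            for row in sub_matrix]


# Two third blocks of three symbols each, and a 2 x 2 second sub-matrix
H1 = [[1, 1],
      [2, 3]]
third_blocks = [[5, 6, 7],
                [1, 0, 2]]
Q1 = first_result(H1, third_blocks)
```

Note that the result has one block-sized row per row of the check matrix, regardless of how many third blocks the device holds.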
308、该存储设备向控制设备发送该第一结果。
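Because the first acquisition request carries the identifier of the control device, the storage device sends the first result to the control device. Note that, because the storage device is any one of the target number of storage devices, every one of the target number of storage devices performs the process shown in steps 305-308.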
309、控制设备获取目标个数的存储设备基于该第一获取请求返回的第一结果。
由于目标个数的存储设备均会执行步骤308,控制设备就可以接收到目标个数的第一结果,因此,控制设备可以获取到目标个数的第一结果,每个第一结果由该目标个数的存储设备中的一个存储设备读取存储的该第一条带的有效分块并且根据读取的有效分块计算得到。
例如,图4中的控制设备可以获取到存储设备1发送的第一结果Q 1以及存储设备2发送的第一结果Q 2。再例如,图5中的控制设备可以获取到存储设备3发送的第一结果Q 3
310、控制设备根据该目标个数的存储设备的第一结果以及该第一子矩阵,对该第一分块进行重构。
由于第一子矩阵中一列对应第一分块,另一列对应第二分块,因此,控制设备除了根据该目标个数的存储设备的第一结果以及该第一子矩阵,对该第一分块进行重构以外,还可以根据该目标个数的存储设备第一结果以及该第一子矩阵,对第二分块进行重构。
在一种可能的实现方式中,该控制设备可以对该目标个数的存储设备的第一结果进行求和,得到求和矩阵;基于该第一子矩阵的逆矩阵以及该求和矩阵,获取目标分块矩阵;该控制设备将该目标分块矩阵的第一目标行确定为重构的第一分块;将该目标分块矩阵的第二目标行确定为重构的第二分块,其中,目标分块矩阵的转置矩阵的第一目标列为目标分块矩阵的第一目标行,第二目标列为目标分块矩阵的第二目标行,第一目标列对应第一分块,第二目标列对应第二分块。其中,基于该第一子矩阵的逆矩阵以及该求和矩阵,获取目标分块矩阵可以是将该第一子矩阵的逆矩阵乘以该求和矩阵并取负,得到目标分块矩阵。需要说明的是,本步骤310所示的过程也即是根据目标个数的存储设备的第一结果,对第一分块进行重构的过程。
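For example, $H x^T = H_1 (x_2, x_3, \dots, x_{12})^T + H_2 (x_{13}, x_{14}, \dots, x_{24})^T + H_3 (x_1, x_{25})^T = 0$, that is, $H_1 P_1 + H_2 P_2 + H_3 (x_1, x_{25})^T = 0$, that is, $Q_1 + Q_2 + H_3 (x_1, x_{25})^T = 0$, and hence $(x_1, x_{25})^T = -H_3^{-1} (Q_1 + Q_2)$. The control device in Figure 4 can therefore perform the computation $(x_1, x_{25})^T = -H_3^{-1} (Q_1 + Q_2)$ to obtain the target block matrix $(x_1, x_{25})^T$, where $(x_1, x_{25})$ is the transpose of the target block matrix. The $x_1$ computed by the control device is the reconstructed first block, and $x_{25}$ is the reconstructed second block.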
需要说明的是,当目标个数为1时,该控制设备无需计算求和矩阵,而是将直接基于该第一子矩阵的逆矩阵以及该第一结果,获取目标分块矩阵。
例如,H*x T=H 4*(x 2,x 3,…,x 24) T+H 3*(x 1,x 25) T=0,也即是,Q 3+H 3*(x 1,x 25) T=0,从而图5中的控制设备可以执行(x 1,x 25) T=-H 3 -1*Q 3的计算过程,以获取x 1和x 25
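The final solve of step 310 can be sketched as below. As an assumption, arithmetic is taken modulo the prime 257, and the example uses a hypothetical 3+2 stripe with one symbol per block whose check matrix is `[[1, 1, 1, 1, 1], [1, 2, 3, 4, 5]]` and whose contents are `[10, 20, 30, 97, 100]`; block 1 is lost and block 5 plays the role of the second block.

```python
P = 257  # assumed prime modulus; the source does not specify the field


def reconstruct(first_sub, results):
    """Return (first block, second block) = -first_sub^{-1} * sum(results) mod P."""
    length = len(results[0][0])
    # Sum matrix: element-wise sum of all first results
    total = [[sum(q[r][t] for q in results) % P for t in range(length)]
             for r in range(2)]
    (a, b), (c, d) = first_sub
    det_inv = pow((a * d - b * c) % P, -1, P)  # first sub-matrix must be invertible
    first = [(-(d * total[0][t] - b * total[1][t]) * det_inv) % P
             for t in range(length)]
    second = [(-(-c * total[0][t] + a * total[1][t]) * det_inv) % P
              for t in range(length)]
    return first, second


H3 = [[1, 1], [1, 5]]   # columns of block 1 and block 5
Q1 = [[20], [40]]       # device 1: column 2 times block 2
Q2 = [[127], [221]]     # device 2: columns 3-4 times blocks 3-4, reduced mod P
x_first, x_second = reconstruct(H3, [Q1, Q2])
```

When only one storage device is involved, `results` holds a single first result and the summation is a no-op, which corresponds to the single-enclosure case.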
311、控制设备向目标存储设备发送写请求,该写请求携带重构的第一分块以及该重构的第一分块的块信息。
需要说明的是,在一些实施例中,有的条带会出现计算过程正确,但是最终出现的计算结果不正确的静默错误,若第一条带出现静默错误,则重构的第一分块的内容可能与第一分块的内容不一致,由于第一分块已丢失,控制设备无法确定重构的第一分块的内容是否与第一分块原来的内容一致,则控制设备可以通过第二分块的内容与重构的第二分块的内容是否一致,来确定第一条带是否出现静默错误。
在一些实施例中,在本步骤311之前,控制设备从存储该第二分块的存储设备获取该第二分块;若该重构的第二分块与获取的第二分块相同,则执行该向目标存储设备发送写请求的步骤,否则,不执行向目标存储设备发送写请求的步骤,并且禁止使用该第一条带。
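The control device can obtain the second block from the storage device that stores it as follows: the control device sends a read request carrying the block information of the second block to the storage device that stores the second block; after receiving the read request, that storage device sends the second block to the control device according to the block information, and the control device receives the second block.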
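The silent-error check described above reduces to one comparison before the write. In this sketch, `write_block` and `quarantine_stripe` are hypothetical callbacks standing in for the write request of step 311 and for disabling the first stripe.

```python
def verify_and_commit(rebuilt_first, rebuilt_second, stored_second,
                      write_block, quarantine_stripe):
    """Write the rebuilt first block only if the rebuilt second block matches the stored one."""
    if rebuilt_second == stored_second:
        write_block(rebuilt_first)
        return True
    quarantine_stripe()  # mismatch suggests a silent error somewhere in the stripe
    return False


written, quarantined = [], []
ok = verify_and_commit([10], [100], [100],
                       written.append, lambda: quarantined.append("stripe"))
bad = verify_and_commit([11], [99], [100],
                        written.append, lambda: quarantined.append("stripe"))
```

The check works because the second block is a valid block that is deliberately left out of the third blocks, so it can be both recomputed and read back.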
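For example, Figure 6 is a schematic diagram of a block reconstruction process provided by an embodiment of the present invention. The block reconstruction process shown in Figure 6 can include the following steps 61-66.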
步骤61、控制设备发起重构,扫描到故障的分块(分块1)。
本步骤61所示的过程也即是步骤302所示的过程。
步骤62、控制设备把分块1所在的条带的分块按框分组,假设分块1-12在存储设备1上,分块13-25在存储设备2上,需要修复分块1和分块25,那么,控制设备把校验矩阵H拆分成3个子矩阵(H 1,H 2,H 3),通过IO单元向存储设备发送第一获取请求。
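Here, the sub-matrices $H_1$ and $H_2$ are second sub-matrices and the sub-matrix $H_3$ is the first sub-matrix. The control device divides blocks 2-24 into two groups: blocks 2-12 form one group, for which storage device 1 performs the computation with sub-matrix $H_1$; blocks 13-24 form the other group, for which storage device 2 performs the computation with sub-matrix $H_2$. Step 62 can be performed by the IO unit of the control device.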
步骤63:控制设备单独读取分块25。
也即是,控制设备从存储分块25的存储设备2处读取分块25。本步骤63可以由控制设备的IO单元来执行。
步骤64:存储设备收到请求后,通过IO单元读取存储介质上的分块。
本步骤64可以由存储设备的IO单元来执行。
步骤65:存储设备读取到分块后,通过计算得到中间结果Q(第一结果)。
存储设备1上,对子矩阵H 1和其对应的第三分块进行计算,得到中间结果Q 1;存储设备2上,对子矩阵H 2和其对应的第三分块进行计算,得到中间结果Q 2,最后存储设备1和2把中间结果(Q 1,Q 2)返回控制设备,其中,Q包括中间结果Q 1和Q 2。本步骤65可以由存储设备的CPU来执行。
步骤66、控制设备使用中间结果Q和子矩阵H 3,计算最终结果,恢复分块1和分块25。并把恢复的分块25和步骤63读取到的分块25进行比较,如果数据一致,则把分块1写入目标存储设备的存储介质中;如果不一致,则表示分条遭遇了静默错误,对分条进行隔离保护。
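Step 66 can be performed by the CPU of the control device. Note that the structures of the storage device and the control device in Figure 6 are similar to those of the storage device and the control device in Figure 4.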
312、目标存储设备根据该重构的第一分块的块信息,对该重构的第一分块进行存储。
该目标存储设备根据重构的第一分块的块信息,可以确定重构的第一分块在该目标存储设备的存储位置,从而该目标设备可以将接收到的重构的第一分块存储在确定的存储位置处。
需要说明的是,当目标存储设备存储完重构的第一分块后,目标存储设备向控制设备发送存储成功响应,该存储成功响应用于指示已经存储重构的第一分块。当接收到该存储成功响应后,控制设备查询该第一条带的已丢失的所有分块是否全部重构完成,若未全部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则所述第一条带重构完成。
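After the control device finishes reconstructing the first block, it can also query whether any other block of the first stripe is lost. If so, the control device performs the process shown in steps 302-312 on any one of the lost blocks it found; if no other block of the first stripe is lost, reconstruction of the first stripe can be regarded as complete.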
由于可能存在多个条带均丢失分块的情况,当重构完第一条带后,该控制设备还可以查询是否还有其他条带丢失分块,若有其他条带丢失分块,则控制设备重构其他条带上丢失的各个分块,也即是执行步骤302-312所示的过程。
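To further illustrate the process shown in steps 301-312, refer to the flowchart of another data reconstruction method provided by an embodiment of the present invention shown in Figure 7. In Figure 7, after the controller (the control device) initiates reconstruction, it scans the stripes to check whether there is a faulty block; if there is none, reconstruction is complete. If there is a faulty block, the controller obtains the faulty (lost) block, groups the blocks that need to be read by enclosure, splits the check matrix of the stripe according to the grouping, and sends each resulting second sub-matrix to the corresponding disk enclosure, that is, it sends the second sub-matrices to the disk enclosures group by group. Each disk enclosure reads the blocks in its group, computes the EC/RAID intermediate result once the reads complete, and returns the intermediate result to the controller. The controller receives the intermediate results returned by the disk enclosures, computes the recovered faulty block, and writes the recovered block to a disk in one of the disk enclosures. When the write finishes, the controller continues scanning the stripe to determine whether it still contains faulty blocks.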
需要说明的是,当控制设备接收到该存储成功的响应后,该控制设备更新关联表中第一分块的信息,具体地,将第一分块的块信息更新成重构的第一分块的块信息,将第一分块的节点标识更新为目标存储设备的节点标识等,以便后续控制设备可以从目标存储设备处读取重构的第一分块。
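In the method provided by this embodiment, the control device reconstructs the lost first block of the first stripe directly from the first results obtained from the target number of storage devices. The control device therefore does not need to read the undamaged blocks of the first stripe from the storage devices; it only needs to obtain the target number of first results from the storage devices. Because the first results contain less data than the undamaged blocks of the first stripe, less data is transferred between the control device and the storage devices during reconstruction and less network bandwidth is occupied, which improves reconstruction performance. In addition, sending a single first acquisition request to each storage device is enough to obtain that device's first result, so there is no need to send a large number of read requests to each storage device; this reduces the CPU overhead of the control device and further reduces network bandwidth usage, further improving reconstruction performance. Comparing the second block with the reconstructed second block also prevents silent data errors from going undetected, which improves system reliability. Because less data is transferred, data transfer is fast and takes little time; because CPU overhead is reduced, more CPU capacity is available for computation, which shortens computation time; and because the first results are computed on different devices, computation time is shortened further. Reconstructing a block therefore takes less time, which improves system reliability.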
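The bandwidth argument can be made concrete with a back-of-the-envelope count for the 23+2 example with two disk enclosures. The sketch assumes that a conventional rebuild has the control device read 23 valid blocks, and that each first result has one block-sized row per row of the check matrix (two rows here); both the helper name and the counting model are illustrative assumptions, not figures from the source.

```python
def transfer_units(data_blocks, check_rows, devices, verify_second_block=True):
    """Return (conventional, offloaded) traffic to the rebuilding device, in block-sized units."""
    conventional = data_blocks          # one read per valid block needed for the solve
    offloaded = devices * check_rows    # one first result per device
    if verify_second_block:
        offloaded += 1                  # extra read of the second block for the silent-error check
    return conventional, offloaded


conventional, offloaded = transfer_units(data_blocks=23, check_rows=2, devices=2)
```

Under these assumptions the saving shrinks as the stripe is spread over more devices, since every device contributes a full first result.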
在图3所示的实施例中,控制设备可以根据第一结果以及第一子矩阵,对第一分块进行重构,而在一些实施例中,目标存储设备也可以根据第一结果以及第一子矩阵,对第一分块进行重构。
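To further illustrate how the target storage device reconstructs the first block from the first results and the first sub-matrix, refer to the flowchart of a data reconstruction method provided by an embodiment of the present invention shown in Figure 8. The method specifically includes: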
801、丢失分块的存储设备向控制设备发送重构请求。
本步骤801与步骤301所示的过程同理,在此,本发明实施例对本步骤801不做赘述。
802、控制设备根据该重构请求,确定第一条带中丢失的第一分块,该第一条带中的分块由目标个数的存储设备存储。
本步骤802与步骤302所示的过程同理,在此,本发明实施例对本步骤802不做赘述。
803、控制设备将该第一条带的校验矩阵拆分成第一子矩阵和目标个数的第二子矩阵,该第一子矩阵包括该校验矩阵中该第一分块所对应的列以及第二分块所对应的列,每个第二子矩阵包括第一条带中在一个存储设备上所存储的至少一个第三分块所对应的列,该第二分块为该第一条带的有效分块中的任一分块,该第三分块为有效分块中除该第二分块以外的任一分块。
本步骤803与步骤303所示的过程同理,在此,本发明实施例对本步骤803不做赘述。
804、控制设备向每个第二子矩阵所对应的存储设备分别发送第二获取请求,每个第二获取请求携带与存储设备对应的第二子矩阵、该第二子矩阵中每一列所对应的第三分块的块信息以及目标存储设备的标识,每个第二获取请求用于指示向该目标存储设备发送第一结果。
该第二获取请求为获取请求的一种,该获取请求携带与存储设备对应的第二子矩阵、该第二子矩阵中每一列所对应的第三分块的块信息以及目标设备的标识,当该目标设备的标识为目标存储设备的标识时,该获取请求为第二获取请求,此时,该目标设备也即是目标存储设备。
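Note that when the target device identifier carried in an acquisition request is the identifier of the target storage device, the acquisition request is a second acquisition request. The second acquisition request can also carry the IP address of the target storage device, so that each storage device can send its first result to the target storage device using that IP address.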
805、控制设备向该目标存储设备发送目标重构请求,该目标重构请求携带第一子矩阵,目标重构请求用于指示根据第一子矩阵,对该第一分块进行重构。
该目标重构请求还可以携带该重构的第一分块的块信息,以便目标存储设备可以根据重构的第一分块的块信息,对重构的第一分块进行存储。需要说明的是,本步骤805可以在步骤811之前的任一时机执行。
806、存储设备接收第二获取请求。
需要说明的是,本步骤806或步骤305所示的过程,也即是存储设备接收获取请求的过程,当该获取请求为所述第一获取请求时,用于指示向控制设备发送第一结果,当该获取请求为第二获取请求时,用于指示向目标存储设备发送第一结果。
807、存储设备根据该第二子矩阵中每一列所对应的第三分块的块信息,读取该至少一个第三分块。
本步骤807与步骤306所示的过程同理,在此,本发明实施例对本步骤807不做赘述。
808、存储设备根据该至少一个第三分块以及该第二子矩阵计算,得到该第一结果。
本步骤808与步骤307所示的过程同理,在此,本发明实施例对本步骤808不做赘述。
809、存储设备向目标存储设备发送该第一结果。
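The storage device can send the first result to the target storage device using the IP address of the target storage device carried in the second acquisition request. Every one of the target number of storage devices performs the process shown in steps 806-809.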
810、目标存储设备获取目标个数的存储设备基于该第二获取请求返回的第一结果。
本步骤810与步骤309中控制设备获取目标个数的存储设备基于该第一获取请求返回的第一结果的过程同理,在此,本发明实施例对本步骤810不做赘述。
811、目标存储设备根据该目标个数的存储设备的第一结果以及该第一子矩阵,重构该第一分块。
目标存储设备重构第一分块的过程与步骤310中控制设备重构第一分块的过程同理,在此,本发明实施例对本步骤811不做赘述。
812、目标存储设备对该重构的第一分块进行存储。
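When the target reconstruction request carries the block information of the reconstructed first block, the target storage device can store the reconstructed first block according to that block information. If the target reconstruction request does not carry the block information of the reconstructed first block, the target storage device can store the reconstructed first block according to a preset storage rule; this embodiment does not specifically limit the preset storage rule.
Note that in some embodiments, before step 812, the target storage device obtains the second block from the storage device that stores it. If the reconstructed second block is identical to the obtained second block, the target storage device performs the step of storing the reconstructed first block; otherwise, it does not perform that step and use of the first stripe is prohibited. The way the target storage device obtains the second block is analogous to the way the control device obtains the second block and is not described again here.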
813、当将重构的第一分块存储完成时,目标存储设备向控制设备发送重构完成响应,该重构完成的响应用于指示第一分块重构完成。
当目标存储设备根据预设的存储规则,对重构的第一分块进行存储时,该重构完成响应还可以携带重构的第一分块的块信息,以便控制设备接收到该重构的第一分块的块信息后,对关联表中第一分块的块信息进行更新。
当接收到该重构完成响应后,控制设备查询该第一条带的已丢失的所有分块是否全部重构完成,若未全部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则所述第一条带重构完成。
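Note that after the control device receives the reconstruction completion response, it updates the information about the first block in the association table. Specifically, it updates the block information of the first block to the block information of the reconstructed first block, updates the node identifier of the first block to the node identifier of the target storage device, and so on, so that the control device can later read the reconstructed first block from the target storage device.
To further illustrate the process shown in steps 801-813, refer to the schematic diagram of a block reconstruction process provided by an embodiment of the present invention shown in Figure 9. The block reconstruction process shown in Figure 9 can include the following steps 1-7.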
步骤1、控制设备发起重构,扫描到故障的分块(分块1)。
本步骤1所示的过程也即是步骤802所示的过程。本步骤1可以通过控制设备的CPU来 执行。
步骤2、控制设备把分块1所在的条带的分块按框分组,假设分块1-12在存储设备1上,分块13-25在存储设备2上,需要修复分块1和分块25,那么,控制设备把校验矩阵H拆分成3个子矩阵(H 1,H 2,H 3),通过IO单元向存储设备发送第二获取请求。
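Here, the sub-matrices $H_1$ and $H_2$ are second sub-matrices and the sub-matrix $H_3$ is the first sub-matrix. The control device divides blocks 2-24 into two groups: blocks 2-12 form one group, for which storage device 1 performs the computation with sub-matrix $H_1$; blocks 13-24 form the other group, for which storage device 2 performs the computation with sub-matrix $H_2$. Step 2 can be performed by the IO unit of the control device.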
步骤3:存储设备收到请求后,通过IO单元读取存储介质上的分块。
步骤4:存储设备通过IO单元读取到分块后,通过CPU计算得到中间结果Q(第一结果)。
存储设备1上,对子矩阵H 1和其对应的第三分块进行计算,得到中间结果Q 1;存储设备2上,对子矩阵H 2和其对应的第三分块进行计算,得到中间结果Q 2。
步骤5、存储设备1通过IO单元,把中间结果Q 1发送存储设备2。
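Step 6. Storage device 2 uses its CPU to compute the final result from the intermediate results $Q_1$ and $Q_2$ and the sub-matrix $H_3$, recovering block 1 and block 25. It then compares the recovered block 25 with the block 25 read from its storage medium; if the data are identical, it writes block 1 to the storage medium.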
步骤7、存储设备2通知控制设备该分块已完成重构,控制设备通过CPU继续执行步骤1或者重构完成。
本发明实施例提供的方法,通过控制设备直接根据获取的目标个数的存储设备的第一结果对第一条带中丢失的第一分块进行重构,从而控制设备无需从存储设备上读取第一条带上未丢的分块,仅需要从存储设备处获取目标个数的第一结果,就可以对第一分块进行重构,由于第一结果与第一条带上未丢的分块相比数据量较少,因此在重构第一分块的过程中,控制设备和存储设备之间传输的数据量比较少,在传输过程中,占用的网络带宽也比较少,从而提高了重构分块的性能。并且由于仅向各个存储设备发送一个第二获取请求就可以获取到各个存储设备发送的第一结果,而无需向各个存储设备发送大量的读请求,不仅降低了控制设备的CPU的开销,也进一步降低网络带宽的占用情况,进而可以进一步提高重构分块的性能。并且通过比较第二分块与重构的第二分块的内容是否一致,可以避免出现数据静默的故障,提高了系统的可靠性。
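For some special application scenarios (for example, cloud storage scenarios in which storage and control are integrated), a device that contains a storage unit and a control unit combines the functions of the control device, the storage device, and the target storage device. Such a device can reconstruct lost blocks according to the flowchart of a data reconstruction method provided by an embodiment of the present invention shown in Figure 10. The method includes the following steps.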
1001、第一存储设备确定第一条带中已丢失分块中的第一分块,该第一条带中的分块由目标个数的设备存储,其中,第一存储设备为目标个数的存储设备中的任一设备。
每个存储设备包括控制单元和存储单元,第一存储设备可以通过第一存储设备的控制单元执行本步骤1001。在本步骤1001之前,第一存储设备可以先查询第一存储设备的存储单元内的存储介质是否丢失分块,若存储介质有丢失的分块,则可以先确定哪个条带丢失了分块,再确定哪个条带上丢失了哪些分块。
对于整个存储介质失效的情况下,在一种可能的实现方式中,第一存储设备查询目标个 数的存储设备中的第一存储设备内的存储介质是否失效;若第一存储设备内的存储介质失效时,该第一存储设备从失效的存储介质所存储的分块中,确定第一条带中已丢失分块中的第一分块。具体地,当第一存储设备内的存储介质中的任一介质失效时,第一存储设备从关联表中,确定该任一存储介质的标识对应的至少一个第二条带,然后再将该至少一个第二条带中的任一条带作为该第一条带,再确定与该任一存储介质的标识以及第一条带的条带标识均对应的至少一个第四分块,该第一存储设备将该至少一个第四分块中的任一分块确定为该第一分块。
对于存储介质部分失效的情况下,在一种可能的实现方式中,第一存储设备查询该目标个数的存储设备中的第一存储设备中的存储介质是否有分块丢失;当该第一存储设备内的任一存储介质丢失至少一个分块时,从至少一个分块中确定第一条带中已丢失分块中的第一分块。具体地,第一存储设备从关联表中,确定至少一个分块对应的至少一个第二条带,然后再将该至少一个第二条带中的任一条带作为该第一条带,该第一存储设备再将该第一条带内与该至少一个分块对应的至少一个第四分块中的任一分块确定为该第一分块。
1002、第一存储设备将该第一条带的校验矩阵拆分成第一子矩阵和目标个数的第二子矩阵。
本步骤1002可以由第一存储设备的控制单元来执行。本步骤1002与步骤303中控制设备将该第一条带的校验矩阵拆分成第一子矩阵和目标个数的第二子矩阵的过程同理,在此本发明实施例对本步骤1002不做赘述。
1003、第一存储设备向该每个第二子矩阵所对应的存储设备的存储单元分别发送第三获取请求,该第三获取请求携带与存储设备对应的第二子矩阵、该第二子矩阵中每一列所对应的第三分块的块信息以及该第一存储设备的标识,其中,该第二存储设备为该目标个数的存储设备中除该第一存储设备以外的任一设备。
由于每个存储设备都有存储单元,且每个存储设备内的存储单元均存储有第一条带的部分分块,每个存储设备内的存储单元具有一定的计算能力,因此,该第一存储设备可以向每个存储设备的存储单元发送第三获取请求,以便每个存储设备的存储单元可以基于第三获取请求的内容,获取第一结果。
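The third acquisition request instructs the receiver to send its first result to the first storage device. The third acquisition request is one kind of acquisition request. An acquisition request carries the second sub-matrix corresponding to the storage device, the block information of the third block corresponding to each column of that second sub-matrix, and the identifier of a target device. When the identifier of the target device is the identifier of the first storage device, the acquisition request is a third acquisition request, and the target device is the first storage device.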
1004、存储设备的存储单元接收该第三获取请求。
1005、存储设备的存储单元根据该第二子矩阵中每一列所对应的第三分块的块信息,读取该至少一个第三分块。
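Step 1005 is analogous to the process in step 306 in which the storage device reads the at least one third block according to the block information of the third block corresponding to each column of the second sub-matrix, so step 1005 is not described again here.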
1006、存储设备的存储单元根据该至少一个第三分块以及该第二子矩阵计算,得到第一结果。
本步骤1006与步骤307中存储设备获取该第一结果的过程同理,在此,本发明实施例对本步骤1006不做赘述。
1007、存储设备的存储单元向第一存储设备发送该第一结果。
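Note that the storage units of all of the target number of storage devices perform the process shown in steps 1004-1007, and that the storage devices in steps 1004-1007 include the first storage device.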
1008、第一存储设备获取目标个数的存储设备的存储单元基于第三获取请求返回的第一结果。
由于该目标个数的存储设备的存储单元中均要执行步骤1004-1007,因此,第一存储设备获取目标个数的存储设备的存储单元基于第三获取请求返回的第一结果。
1009、第一存储设备根据目标个数的存储设备的第一结果以及第一子矩阵,对第一分块进行重构。
本步骤1009与步骤310中控制设备对该第一分块进行重构的过程同理,在此本发明实施例对本步骤1009不做赘述。
1010、第一存储设备将重构的第一分块存储在该第一存储设备的存储介质内。
第一存储设备的存储介质可以是第一存储设备的存储单元内的任一存储介质,当该第一存储设备存储完重构的第一分块后,可以更新关联表中与第一分块对应的信息,以便后续所有的存储设备均可以从第一存储设备处读取重构的第一分块。
为了避免出现静默错误,在本步骤1010之前,第一存储设备从存储该第二分块的存储设备获取该第二分块;若该重构的第二分块与获取的第二分块相同,则执行将重构的第一分块存储在该第一存储设备的存储介质内的步骤,否则,不执行将重构的第一分块存储在该第一存储设备的存储介质内的步骤,并且禁止使用该第一条带。需要说明的是,第一存储设备重构第二分块的过程与控制设备重构第二分块的过程同理。
1011、第一存储设备查询该第一条带的已丢失的所有分块是否全部重构完成,若未全部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则该第一条带重构完成。
需要说明的是,该第一存储设备查询的第一条带上丢失的分块可以是第一存储设备丢失的分块,还可以其他存储设备丢失的分块。
为了进一步说明步骤1001-1011所示的过程,参见图11所示本发明实施例提供的一种分块重构过程的示意图。在图11中,以23+2为例,分块1故障(该故障可以是分块丢失),分块1-8在节点1上,分块9-16在节点2上,分块17-25在节点3上。每个节点都为运行了控制单元和存储单元的存储设备(也即是控制与存储一体的设备)。
1101、节点1的控制单元发起重构,扫描到故障的分块(分块1)。
1102、节点1的控制单元把分块1所在的条带的分块按框分组,需要修复分块1和分块25,节点1的控制单元把校验矩阵H拆分成4个子矩阵(H x1,H x2,H x3,H x4),节点1通过IO单元向各个节点的存储单元发送请求(第三获取请求)。
其中,子矩阵H x1,H x2以及H x3为第二子矩阵,子矩阵H x4为第一子矩阵,节点1的控制单元把分块2-24分成3组:分块2-8为1组,节点1的存储单元对子矩阵H x1进行计算;分块9-16为1组,节点2的存储单元对子矩阵H x2进行计算;分块17-24为1组,节点3的存储单元对子矩阵H x3进行计算。
1103、接收到请求后,各个节点的存储单元通过IO单元读取存储介质上的分块。
The storage unit of node 1 reads blocks 2-8, the storage unit of node 2 reads blocks 9-16, and the storage unit of node 3 reads blocks 17-24.
1104、各个节点的存储单元读取到分块后,通过计算得到中间结果(第一结果)。
节点1的存储单元使用子矩阵H x1进行计算,得到中间结果Q x1;节点2的存储单元使用子矩阵H x2进行计算,得到Q x2;节点3的存储单元使用子矩阵H x3进行计算,得到Q x3,其中,Q x1、Q x2和Q x3为第一结果。
1105、各个节点的存储单元把中间结果发送给节点1。节点1使用中间结果(Q x1、Q x2和Q x3)和子矩阵H x4,计算最终结果,恢复分块1和分块25。并把恢复的分块1写入存储介质中。节点1的控制单元继续重构第一条带上故障的其他分块,或者重构完成结束。
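In the method provided by this embodiment, the first storage device reconstructs the lost first block of the first stripe directly from the first results obtained from the target number of storage devices. The first storage device therefore does not need to read the undamaged blocks of the first stripe from other storage devices; it only needs to obtain the target number of first results from the storage units of the storage devices. Because the first results contain less data than the undamaged blocks of the first stripe, less data is transferred between the first storage device and the other storage devices during reconstruction and less network bandwidth is occupied, which improves reconstruction performance. In addition, sending a single third acquisition request to the storage unit of each storage device is enough to obtain that storage unit's first result, so there is no need to send a large number of read requests to the storage unit of each storage device; this reduces the CPU overhead of the control device and further reduces network bandwidth usage, further improving reconstruction performance. Comparing the reconstructed second block with the obtained second block also prevents silent data errors from going undetected, which improves system reliability. Furthermore, in a scenario in which a control module is attached to only one storage module, the reconstruction bandwidth between the control module and the storage module can be reduced to the greatest extent.
When the first stripe is stored across multiple disk enclosures or multiple AZs, the target number of storage devices can be managed by at least one master storage device. Each of the target number of storage devices can send the first result generated by its storage unit to its master storage device; the master storage device then sums the first results it has obtained, and the device that reconstructs the first block can obtain the summed matrix from the master storage device and reconstruct the first block from it. To illustrate this process, refer to the flowchart of a data reconstruction method provided by an embodiment of the present invention shown in Figure 12. The method includes the following steps.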
1201、第一存储设备确定第一条带中已丢失分块中的第一分块,该第一条带中的分块由目标个数的设备存储。
本步骤1201与步骤1001所示的过程同理,本发明实施例对本步骤1201不做赘述。
1202、第一存储设备将该第一条带的校验矩阵拆分成第一子矩阵和目标个数的第二子矩阵。
本步骤1202与步骤1002所示的过程同理,本发明实施例对本步骤1202不做赘述。
1203、第一存储设备向每个第二子矩阵所对应的存储设备的存储单元分别发送第四获取请求,该第四获取请求携带与存储设备对应的第二子矩阵、该第二子矩阵中每一列所对应的第三分块的块信息以及主存储设备的标识。
该第四获取请求用于指示向主存储设备发送第一结果,该第四获取请求为获取请求的一种,该获取请求携带与存储设备对应的第二子矩阵、该第二子矩阵中每一列所对应的第三分块的块信息以及目标设备的标识,当该目标设备的标识为主存储设备的标识时,该获取请求为第四获取请求,此时,该目标设备也即是主存储设备。
1204、存储设备的存储单元接收该第四获取请求。
1205、存储设备的存储单元根据该第二子矩阵中每一列所对应的第三分块的块信息,读 取该至少一个第三分块。
本步骤1205与步骤1005所示的过程同理,本发明实施例对本步骤1205不做赘述。
1206、存储设备的存储单元根据该至少一个第三分块以及该第二子矩阵计算,得到第一结果。
本步骤1206与步骤1006所示的过程同理,本发明实施例对本步骤1206不做赘述。
1207、存储设备向管理该存储设备的主存储设备发送该第一结果。
Note that every one of the target number of storage devices performs the process of steps 1204-1207.
1208、对于至少一个主存储设备中的任一主存储设备,该任一主存储设备获取该任一主存储设备管理的至少一个存储设备基于该第四获取请求返回的第一结果。
1209、该任一主存储设备对获取的至少一个存储设备返回的第一结果进行求和,得到目标求和矩阵。
该目标求和矩阵,也即是至少一个存储设备基于该第四获取请求返回的第一结果的和。
1210、该任一主存储设备向第一存储设备发送目标求和矩阵。
由于每个存储设备均会向对应的主存储设备发送第一结果,因此,该至少一个主存储设备中的每个主存储设备均执行步骤1208-1210所示的过程。
1211、第一存储设备从至少一个主存储设备获取至少一个目标求和矩阵。
由于该至少一个主存储设备中的每个主存储设备均执行步骤1208-1210所示的过程,该第一存储设备会获取到至少一个主存储设备的目标求和矩阵,也即是获取到至少一个目标求和矩阵。
1212、第一存储设备根据至少一个目标求和矩阵以及该第一子矩阵,对该第一分块进行重构。
该第一存储设备的控制单元可以对该至少一个目标求和矩阵进行求和,得到求和矩阵;基于该第一子矩阵的逆矩阵以及该求和矩阵,获取目标分块矩阵;将该目标分块矩阵的第一目标行作为重构的第一分块,将该目标分块矩阵的第二目标行作为重构的第二分块。
需要说明的是,本步骤1212所示的过程也即是根据目标个数的存储设备的第一结果,对第一分块进行重构的过程
1213、第一存储设备将重构的第一分块存储在该第一存储设备的存储介质内。
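Step 1213 is analogous to the process shown in step 1010 and is not described again here. Note that the lost blocks of the first stripe queried by the first storage device can be blocks lost by the first storage device or blocks lost by other storage devices.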
为了避免出现静默错误,在本步骤1213之前,第一存储设备从存储该第二分块的存储设备获取该第二分块;若该重构的第二分块与获取的第二分块相同,则执行将重构的第一分块存储在该第一存储设备的存储介质内的步骤,否则,不执行将重构的第一分块存储在该第一存储设备的存储介质内的步骤,并且禁止使用该第一条带。需要说明的是,第一存储设备重构第二分块的过程与控制设备重构第二分块的过程同理。
1214、第一存储设备查询该第一条带的已丢失的所有分块是否全部重构完成,若未全部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则该第一条带重构完成。
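Step 1214 is analogous to the process shown in step 1011 and is not described again here.
To further illustrate the process shown in steps 1201-1214, refer to the schematic diagram of a block reconstruction process provided by an embodiment of the present invention shown in Figure 13. In Figure 13, taking 23+2 as an example, block 1 is the faulty block, and block 1 and block 25 need to be repaired. Blocks 1-6 are on node 1, blocks 7-12 are on node 2, blocks 13-18 are on node 3, and blocks 19-25 are on node 4. Node 1 and node 2 are in enclosure 1 or AZ1, and node 3 and node 4 are in enclosure 2 or AZ2. Each node is a storage device running a control unit and a storage unit.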
1301、节点1的控制单元发起重构,扫描到故障的分块(分块1)。
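1302. The control unit of node 1 groups the blocks of the stripe containing block 1 by node. The control unit of node 1 splits the check matrix H into five sub-matrices ($H_{y1}$, $H_{y2}$, $H_{y3}$, $H_{y4}$, $H_{y5}$), and node 1 sends a request (the fourth acquisition request) to the storage unit of each node through its IO unit.
Here, the sub-matrices $H_{y1}$, $H_{y2}$, $H_{y3}$, and $H_{y4}$ are second sub-matrices and the sub-matrix $H_{y5}$ is the first sub-matrix. The control unit of node 1 divides blocks 2-24 into four groups: blocks 2-6 form one group, for which the storage unit of node 1 performs the computation with $H_{y1}$; blocks 7-12 form one group, for which the storage unit of node 2 performs the computation with $H_{y2}$; blocks 13-18 form one group, for which the storage unit of node 3 performs the computation with $H_{y3}$; and blocks 19-24 form one group, for which the storage unit of node 4 performs the computation with $H_{y4}$.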
1303、接收到请求后,各个节点的存储单元通过IO单元读取存储介质上的分块。
The storage unit of node 1 reads blocks 2-6, the storage unit of node 2 reads blocks 7-12, the storage unit of node 3 reads blocks 13-18, and the storage unit of node 4 reads blocks 19-24.
1304、各个节点存储单元读取到分块后,通过计算得到中间结果(第一结果)。
节点1的存储单元对子矩阵H y1和其对应的第三分块进行计算,得到中间结果Q y1;节点2的存储单元对子矩阵H y2和其对应的第三分块进行计算,得到Q y2;节点3的存储单元对子矩阵H y3和其对应的第三分块进行计算,得到Q y3;节点4的存储单元对子矩阵H y4和其对应的第三分块进行计算,得到Q y4;其中,Q y1、Q y2、Q y3和Q y4为第一结果。
1305、各个节点的存储单元把中间结果发送给框内或者AZ内的主节点(主存储设备),各个主节点计算目标求和矩阵。
节点2通过内部交换机把Q y2发送给节点1(主存储设备),节点1对中间结果(Q y2,Q y1)进行加法运算,得到目标求和矩阵T 1;节点4通过内部交换机把Q y4发送给节点3(主存储设备),节点3对中间结果(Q y3,Q y4)进行加法运算,得到目标求和矩阵T 2
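1306. Master node 3 sends the target sum matrix it computed to node 1 through the inter-enclosure or inter-AZ switch.
1307. Node 1 uses the target sum matrices and the sub-matrix $H_{y5}$ to compute the final result and recover block 1 and block 25, and writes the recovered block 1 to a storage medium. The control unit of node 1 then continues to reconstruct other faulty blocks, or reconstruction ends.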
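The two-level aggregation of steps 1305-1307 relies on addition being associative: summing per master node and then summing the target sum matrices gives the same sum matrix as adding all first results at once. The sketch assumes arithmetic modulo the prime 257 and uses made-up first results; `add_results` is a hypothetical helper.

```python
P = 257  # assumed prime modulus; the source does not specify the field


def add_results(results):
    """Element-wise sum (mod P) of equally shaped matrices."""
    rows, length = len(results[0]), len(results[0][0])
    return [[sum(q[r][t] for q in results) % P for t in range(length)]
            for r in range(rows)]


# Made-up first results of nodes 1-4 (2 rows, 2 symbols per block)
Qy1, Qy2 = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
Qy3, Qy4 = [[250, 1], [2, 3]], [[10, 0], [1, 1]]

# Each master node sums the first results of the nodes it manages
target_sums = {
    "node-1": add_results([Qy1, Qy2]),   # T1, enclosure 1 / AZ1
    "node-3": add_results([Qy3, Qy4]),   # T2, enclosure 2 / AZ2
}
# Node 1 then adds the target sum matrices before applying the first sub-matrix
sum_matrix = add_results(list(target_sums.values()))
```

Only one matrix per enclosure or AZ crosses the inter-enclosure switch, however many nodes each enclosure contains.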
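In the method provided by this embodiment, the first storage device splits the check matrix of the first stripe into the first sub-matrix and the target number of second sub-matrices and sends each second sub-matrix to the storage unit of the corresponding storage device. The storage unit of each storage device generates a first result, and each storage device sends its first result to its master storage device. Each master storage device first sums the first results it receives to obtain a target sum matrix and then sends the target sum matrix to the first storage device. Finally, the first storage device reconstructs the first block from the at least one target sum matrix and the first sub-matrix. In this process the first storage device does not need to read the undamaged blocks of the first stripe from other storage devices; it only needs to obtain the target sum matrices from the master storage devices. Because the first results and the target sum matrices contain less data than the undamaged blocks of the first stripe, less data is transferred between the first storage device and the other storage devices during reconstruction and less network bandwidth is occupied, which improves reconstruction performance. If the at least one master storage device is located in different disk enclosures, the only data exchanged between disk enclosures are the second sub-matrices and the target sum matrices, which reduces the amount of data exchanged between disk enclosures. In addition, sending a single fourth acquisition request to the storage unit of each storage device is enough for the master storage devices to obtain the first results from those storage units, so there is no need to send a large number of read requests to the storage unit of each storage device; this reduces the CPU overhead of the control device and further reduces network bandwidth usage, further improving reconstruction performance. Comparing the reconstructed second block with the obtained second block also prevents silent data errors from going undetected, which improves system reliability.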
图14是本发明实施例提供的一种数据重构的装置的结构示意图,该装置包括:
确定模块1401,用于执行上述步骤302;
第一获取模块1402,用于获取所述目标个数的存储设备的第一结果,每个第一结果由所述目标个数的存储设备中的一个存储设备读取存储的所述第一条带的有效分块并且根据读取的有效分块计算得到;
重构模块1403,用于根据所述目标个数的存储设备的第一结果,对所述第一分块进行重构。
可选地,该装置还包括拆分模块,用于执行上述步骤303;
The reconstruction module 1403 is configured to perform step 310 above.
可选地,该重构模块1403,用于:
对所述目标个数的存储设备的第一结果进行求和,得到求和矩阵;
基于所述第一子矩阵的逆矩阵以及所述求和矩阵,获取目标分块矩阵;
将所述目标分块矩阵的第一目标行确定为重构的第一分块。
可选地,该装置还包括:
第一发送模块,用于执行上述步骤304;
所述第一获取模块1402,用于执行上述步骤309。
可选地,该第一发送模块,用于执行上述步骤311。
可选地,所述装置还包括执行模块;
所述第一获取模块1402,还用于从存储所述第二分块的存储设备获取所述第二分块;
所述重构模块1403,还用于根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第二分块进行重构;
所述执行模块,用于若重构的第二分块与获取的所述第二分块相同,则执行所述向目标存储设备发送写请求的步骤,否则,不执行所述向目标存储设备发送写请求的步骤,并且禁止使用所述第一条带。
可选地,所述装置还包括:
第二发送模块,用于执行上述步骤804;
所述第二发送模块,还用于执行上述步骤805;
第一接收模块,用于执行上述步骤813。
可选地,所述装置还包括:
第二接收模块,用于接收第一重构请求,所述第一重构请求携带存储设备中失效的存储介质的存储介质标识;
所述确定模块1401,还用于执行上述步骤21-23。
可选地,所述装置还包括:
第三接收模块,用于接收第二重构请求,所述第二重构请求携带存储设备中存储介质丢失的至少一个分块的块信息;
所述确定模块1401,还用于执行上述步骤2A-2C。
可选地,所述装置还包括:
a first query module, configured to perform step 1011 above.
可选地,所述装置还包括:
第二查询模块,用于查询所述目标个数的存储设备中的第一存储设备内的存储介质是否失效;
所述确定模块1401,还用于若第一存储设备内的存储介质失效时,从失效的存储介质所存储的分块中,确定第一条带中已丢失分块中的第一分块。
可选地,所述装置还包括:
第三查询模块,用于查询所述目标个数的存储设备中的第一存储设备中的存储介质是否有分块丢失;
所述确定模块1401,还用于当所述第一存储设备内的任一存储介质丢失至少一个分块时,从所述至少一个分块中确定第一条带中已丢失分块中的第一分块。
可选地,所述装置还包括:
第三发送模块,用于执行上述步骤1003;
a second obtaining module, configured to perform step 1008 above.
可选地,所述装置还包括:
a third obtaining module, configured to perform step 1203 above;
所述第三获取模块,还用于执行上述步骤1211;
The reconstruction module 1403 is further configured to perform step 1212 above.
可选地,所述重构模块1403还用于:
对所述至少一个目标求和矩阵进行求和,得到求和矩阵;
obtain a target block matrix based on the inverse matrix of the first sub-matrix and the sum matrix;
将所述目标分块矩阵的第一目标行确定为重构的第一分块。
图15是本发明实施例提供的一种数据重构的装置的结构示意图,该装置包括:
读取模块1501,用于读取存储设备存储的第一条带的有效分块,所述第一条带由目标个数的存储设备存储;
计算模块1502,用于根据读取的有效分块计算,得到第一结果;
发送模块1503,用于执行上述步骤308。
可选地,该装置还包括:
接收模块,用于接收获取请求,所述获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及目标设备的标识,所述第二子矩阵包括第一条带中在对应的存储设备上所存储的至少一个第三分块所对应的列,所述第二分块为所述第一条带中未丢失的任一分块,所述第三分块为未丢失的分块中除所述第二分块以外的任一分块;
所述读取模块1501,用于执行上述步骤1205;
所述计算模块1502,用于执行上述步骤1206。
可选地,该计算模块1502还用于:
将所述至少一个第三分块组成分块矩阵,所述分块矩阵的每一行为一个第三分块;
将所述第二子矩阵与所述分块矩阵相乘,得到所述第一结果。
可选地,所述目标设备包括控制设备、目标存储设备、第一存储设备以及至少一个主存储设备,所述第一存储设备为所述目标存储设备中的任一设备,所述至少一个主存储设备用于管理所述目标个数的存储设备;
当所述目标设备的标识为控制设备的标识时,所述获取请求为第一获取请求,用于指示向所述控制设备发送所述第一结果;
当所述目标设备的标识为目标存储设备的标识时,所述获取请求为第二获取请求,用于指示向所述目标存储设备发送所述第一结果;
当所述目标设备的标识为第一存储设备的标识时,所述获取请求为第三获取请求,用于指示向所述第一存储设备发送所述第一结果;
当所述目标设备的标识为主存储设备的标识时,所述获取请求为第四获取请求,用于指示向所述主存储设备发送所述第一结果。
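The four kinds of acquisition request differ only in which device the carried target identifier refers to. A minimal sketch of that mapping follows; the enum `TargetRole`, the `directory` lookup, and the identifiers are hypothetical names introduced for illustration, since the source does not describe how a device identifier is resolved to a role.

```python
from enum import Enum


class TargetRole(Enum):
    CONTROL_DEVICE = "control device"
    TARGET_STORAGE = "target storage device"
    FIRST_STORAGE = "first storage device"
    MASTER_STORAGE = "master storage device"


REQUEST_KIND = {
    TargetRole.CONTROL_DEVICE: "first acquisition request",
    TargetRole.TARGET_STORAGE: "second acquisition request",
    TargetRole.FIRST_STORAGE: "third acquisition request",
    TargetRole.MASTER_STORAGE: "fourth acquisition request",
}


def classify(target_id, directory):
    """Name the request kind implied by the target device identifier it carries."""
    return REQUEST_KIND[directory[target_id]]


# Hypothetical identifier-to-role directory
directory = {"ctrl-0": TargetRole.CONTROL_DEVICE, "node-3": TargetRole.MASTER_STORAGE}
kind = classify("node-3", directory)
```

In every case the storage device performs the same read and multiply; only the destination of the first result changes.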
可选地,所述装置还包括:
第一查询模块,用于查询存储设备内的存储介质是否失效;
所述发送模块1503,还用于当所述存储设备中的存储介质失效时,向控制设备发送第一重构请求,所述第一重构请求携带所述存储设备中失效的存储介质的存储介质标识。
可选地,所述装置还包括:
第二查询模块,用于查询存储设备中的存储介质是否有分块丢失;
所述发送模块1503,还用于当所述存储设备内的任一存储介质丢失至少一个分块时,向控制设备发送第二重构请求,所述第二重构请求携带所述至少一个分块的块信息。
图16是本发明实施例提供的一种数据重构的装置的结构示意图,该装置包括:
接收模块1601,用于接收目标重构请求,所述目标重构请求携带第一子矩阵,所述目标重构请求用于指示根据所述第一子矩阵,对第一条带中已丢失分块中的第一分块进行重构,所述第一子矩阵包括所述第一条带中已丢失分块中的第一分块所对应的列以及第二分块所对应的列,所述第二分块为所述第一条带中未丢失的任一分块;
获取模块1602,用于执行上述步骤810;
重构模块1603,用于执行上述步骤811。
可选地,所述重构模块1603还用于:
对所述目标个数的存储设备的第一结果进行求和,得到求和矩阵;
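obtain a target block matrix based on the inverse matrix of the first sub-matrix and the sum matrix;
use the first target row of the target block matrix as the reconstructed first block.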
可选地,所述装置还包括:
存储模块,用于执行上述步骤812;
发送模块,用于执行上述步骤813。
可选地,所述装置还包括执行模块;
所述获取模块1602,还用于从存储所述第二分块的存储设备获取所述第二分块;
所述重构模块1603,还用于根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第二分块进行重构;
所述执行模块,用于若重构的第二分块与获取的所述第二分块相同,则执行所述对重构后的第一分块进行存储的步骤,否则,不执行所述对重构后的第一分块进行存储的步骤,并且禁止使用所述第一条带。
上述所有可选技术方案,可以采用任意结合形成本公开的可选实施例,在此不再一一赘述。
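It should be noted that, when the data reconstruction apparatus provided in the above embodiments reconstructs a block, the division into the functional modules described above is used only as an example. In practical applications, the functions can be assigned to different functional modules as needed; that is, the internal structure of the apparatus can be divided into different functional modules to perform all or part of the functions described above. In addition, the data reconstruction apparatus provided in the above embodiments and the data reconstruction method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.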
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,存储介质或光盘等。
以上所述仅为本发明的较佳实施例,并不用以限制本发明,凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (66)

  1. 一种数据重构的方法,其特征在于,所述方法包括:
    确定第一条带中已丢失分块中的第一分块,所述第一条带中的分块由目标个数的存储设备存储;
    获取所述目标个数的存储设备的第一结果,每个第一结果由所述目标个数的存储设备中的一个存储设备读取存储的所述第一条带的有效分块并且根据读取的有效分块计算得到;
    根据所述目标个数的存储设备的第一结果,对所述第一分块进行重构。
  2. 根据权利要求1所述的方法,其特征在于,所述获取所述目标个数的存储设备的第一结果之前,所述方法还包括:
    将所述第一条带的校验矩阵拆分成第一子矩阵和所述目标个数的第二子矩阵,所述第一子矩阵包括所述第一分块所对应的列以及第二分块所对应的列,每个第二子矩阵包括所述第一条带中在一个存储设备上所存储的至少一个第三分块所对应的列,所述第二分块为所述第一条带的有效分块中的任一分块,所述第三分块为所述有效分块中除所述第二分块以外的任一分块;
    所述根据所述目标个数的存储设备的第一结果,对所述第一分块进行重构包括:
    根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构。
  3. 根据权利要求2所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构包括:
    对所述目标个数的存储设备的第一结果进行求和,得到求和矩阵;
    基于所述第一子矩阵的逆矩阵以及所述求和矩阵,获取目标分块矩阵;
    将所述目标分块矩阵的第一目标行确定为重构的第一分块。
  4. 根据权利要求2-3所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之前,所述方法还包括:
    向所述每个第二子矩阵所对应的存储设备分别发送第一获取请求,所述第一获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及控制设备的标识;
    获取所述目标个数的存储设备基于所述第一获取请求返回的第一结果。
  5. 根据权利要求2-4所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之后,所述方法还包括:
    向目标存储设备发送写请求,所述写请求携带重构的第一分块以及所述重构的第一分块的块信息,由所述目标存储设备根据所述重构的第一分块的块信息,对所述重构的第一分块进行存储。
  6. 根据权利要求2-5所述的任一方法,其特征在于,所述向目标存储设备发送写请求之前,所述方法还包括:
    从存储所述第二分块的存储设备获取所述第二分块;
    根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第二分块进行重构;
    若重构的第二分块与获取的所述第二分块相同,则执行所述向目标存储设备发送写请求的步骤,否则,不执行所述向目标存储设备发送写请求的步骤,并且禁止使用所述第一条带。
  7. 根据权利要求2-6所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之前,所述方法还包括:
    向所述每个第二子矩阵所对应的存储设备分别发送第二获取请求,所述第二获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及目标存储设备的标识;
    向所述目标存储设备发送目标重构请求,所述目标重构请求携带所述第一子矩阵,所述目标重构请求用于指示根据所述第一子矩阵,对所述第一分块进行重构;
    接收所述目标存储设备发送的重构完成响应,所述重构完成响应用于指示所述第一分块重构完成。
  8. 根据权利要求1-7所述的任一方法,其特征在于,所述确定第一条带中已丢失分块中的第一分块之前,所述方法还包括:
    接收第一重构请求,所述第一重构请求携带存储设备中失效的存储介质的存储介质标识;
    根据所述第一重构请求携带的存储介质标识,确定至少一个第二条带;
    将所述至少一个第二条带中任一个条带确定为所述第一条带;
    所述确定第一条带中已丢失分块中的第一分块包括:
    根据所述存储介质标识,将所述第一条带中已丢失分块中的任一个分块确定为所述第一分块。
  9. 根据权利要求1-7所述的任一方法,其特征在于,所述确定第一条带中已丢失分块中的第一分块之前,所述方法还包括:
    接收第二重构请求,所述第二重构请求携带存储设备中存储介质丢失的至少一个分块的块信息;
    根据所述第二重构请求携带的至少一个分块的块信息,确定至少一个第二条带;
    将所述至少一个第二条带中任一个条带确定为所述第一条带;
    所述确定第一条带中已丢失分块中的第一分块包括:
    根据所述至少一个分块的块信息,将所述第一条带中已丢失分块中的任一个分块确定为所述第一分块。
  10. 根据权利要求1-9所述的任一方法,其特征在于,所述根据所述目标个数的存储设 备的第一结果,对所述第一分块进行重构之后,所述方法还包括:
    查询所述第一条带的已丢失的所有分块是否全部重构完成,若未全部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则所述第一条带重构完成。
  11. 根据权利要求1-10所述的任一方法,其特征在于,所述确定第一条带中已丢失分块中的第一分块之前,所述方法还包括:
    查询所述目标个数的存储设备中的第一存储设备内的存储介质是否失效;
    所述确定第一条带中已丢失分块中的第一分块包括:
    若第一存储设备内的存储介质失效时,从失效的存储介质所存储的分块中,确定第一条带中已丢失分块中的第一分块。
  12. 根据权利要求1-10所述的任一方法,其特征在于,所述确定第一条带中已丢失分块中的第一分块之前,所述方法还包括:
    查询所述目标个数的存储设备中的第一存储设备中的存储介质是否有分块丢失;
    当所述第一存储设备内的任一存储介质丢失至少一个分块时,从所述至少一个分块中确定第一条带中已丢失分块中的第一分块。
  13. 根据权利要求2-12所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之前,所述方法还包括:
    向所述每个第二子矩阵所对应的存储设备的存储单元分别发送第三获取请求,所述第三获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及所述第一存储设备的标识;
    获取所述目标个数的存储设备的存储单元基于所述第三获取请求返回的第一结果。
  14. 根据权利要求2-12所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之前,所述方法还包括:
    向所述每个第二子矩阵所对应的存储设备的存储单元分别发送第四获取请求,所述第四获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及主存储设备的标识;
    从至少一个主存储设备获取至少一个目标求和矩阵,所述至少一个主存储设备用于管理所述目标个数的存储设备,每个目标求和矩阵为至少一个存储设备基于所述第四获取请求返回的第一结果的和,所述至少一个存储设备为一个主存储设备所管理的设备;
    所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构包括:
    根据所述至少一个目标求和矩阵以及所述第一子矩阵,对所述第一分块进行重构。
  15. 根据权利要求14所述的方法,其特征在于,所述根据所述至少一个目标求和矩阵以及所述第一子矩阵,对所述第一分块进行重构包括:
    对所述至少一个目标求和矩阵进行求和,得到求和矩阵;
    基于将所述第一子矩阵的逆矩阵以及所述求和矩阵,获取目标分块矩阵;
    将所述目标分块矩阵的第一目标行确定为重构的第一分块。
  16. 一种数据重构的方法,其特征在于,所述方法包括:
    读取存储设备存储的第一条带的有效分块,所述第一条带由目标个数的存储设备存储;
    根据读取的有效分块计算,得到第一结果;
    向所述目标设备发送所述第一结果,由所述目标设备根据所述目标个数的存储设备的第一结果,对所述第一条带中已丢失分块中的第一分块进行重构。
  17. 根据权利要求16所述的方法,其特征在于,所述读取第一存储设备存储的第一条带的有效分块之前,所述方法还包括:
    接收获取请求,所述获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及目标设备的标识,所述第二子矩阵包括第一条带中在对应的存储设备上所存储的至少一个第三分块所对应的列,所述第二分块为所述第一条带的有效分块中的任一分块,所述第三分块为所述有效分块中除所述第二分块以外的任一分块;
    所述读取第一存储设备存储的第一条带的有效分块包括:
    根据所述第二子矩阵中每一列所对应的第三分块的块信息,读取所述至少一个第三分块。
    所述根据读取的有效分块计算,得到第一结果包括:
    根据所述至少一个第三分块以及所述第二子矩阵计算,得到所述第一结果。
  18. 根据权利要求17所述的方法,其特征在于,所述根据所述至少一个第三分块以及所述第二子矩阵计算,得到所述第一结果包括:
    将所述至少一个第三分块组成分块矩阵,所述分块矩阵的每一行为一个第三分块;
    将所述第二子矩阵与所述分块矩阵相乘,得到所述第一结果。
  19. 根据权利要求17-18所述的任一方法,其特征在于,所述目标设备包括控制设备、目标存储设备、第一存储设备以及至少一个主存储设备,所述第一存储设备为所述目标存储设备中的任一设备,所述至少一个主存储设备用于管理所述目标个数的存储设备;
    当所述目标设备的标识为控制设备的标识时,所述获取请求为第一获取请求,用于指示向所述控制设备发送所述第一结果;
    当所述目标设备的标识为目标存储设备的标识时,所述获取请求为第二获取请求,用于指示向所述目标存储设备发送所述第一结果;
    当所述目标设备的标识为第一存储设备的标识时,所述获取请求为第三获取请求,用于指示向所述第一存储设备发送所述第一结果;
    当所述目标设备的标识为主存储设备的标识时,所述获取请求为第四获取请求,用于指示向所述主存储设备发送所述第一结果。
  20. 根据权利要求17-19所述的任一方法,其特征在于,所述接收获取请求之前,所述方法还包括:
    查询存储设备内的存储介质是否失效;
    当所述存储设备中的存储介质失效时,向控制设备发送第一重构请求,所述第一重构请求携带所述存储设备中失效的存储介质的存储介质标识。
  21. 根据权利要求17-19所述的任一方法,其特征在于,所述接收获取请求之前,所述方法还包括:
    查询存储设备中的存储介质是否有分块丢失;
    当所述存储设备内的任一存储介质丢失至少一个分块时,向控制设备发送第二重构请求,所述第二重构请求携带所述至少一个分块的块信息。
  22. 一种数据重构的方法,其特征在于,所述方法包括:
    接收目标重构请求,所述目标重构请求携带第一子矩阵,所述目标重构请求用于指示根据所述第一子矩阵,对第一条带中已丢失分块中的第一分块进行重构,所述第一子矩阵包括所述第一条带中已丢失分块中的第一分块所对应的列以及第二分块所对应的列,所述第二分块为所述第一条带的有效分块中的任一分块;
    获取目标个数的存储设备基于所述第二获取请求返回的第一结果,所述第二获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及目标存储设备的标识,所述第一条带中的分块由所述目标个数的存储设备存储,每个第一结果由所述目标个数的存储设备中的一个存储设备读取存储的所述第一条带的有效分块并且根据读取的有效分块计算得到,所述第三分块为所述失效分块中除所述第二分块以外的任一分块;
    根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构。
  23. 根据权利要求22所述的方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构包括:
    对所述目标个数的存储设备的第一结果进行求和,得到求和矩阵;
    基于将所述第一子矩阵的逆矩阵以及所述求和矩阵,获取目标分块矩阵;
    将所述目标分块的第一目标行作为重构的第一分块。
  24. 根据权利要求22-23所述的任一方法,其特征在于,所述根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构之后,所述方法还包括:
    对重构后的第一分块进行存储;
    当对所述重构后的第一分块存储完成时,向控制设备发送重构完成响应,所述重构完成响应用于指示所述第一分块重构完成。
  25. 根据权利要求24所述的方法,其特征在于,所述对重构后的第一分块进行存储之前,所述方法还包括:
    从存储所述第二分块的存储设备获取所述第二分块;
    根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第二分块进行重构;
    若重构的第二分块与获取的所述第二分块相同,则执行所述对重构后的第一分块进行存储的步骤,否则,不执行所述对重构后的第一分块进行存储的步骤,并且禁止使用所述第一条带。
  26. 一种数据重构的装置,其特征在于,所述装置包括:
    确定模块,用于确定第一条带中已丢失分块中的第一分块,所述第一条带中的分块由目标个数的存储设备存储;
    第一获取模块,用于获取所述目标个数的存储设备的第一结果,每个第一结果由所述目标个数的存储设备中的一个存储设备读取存储的所述第一条带的有效分块并且根据读取的有效分块计算得到;
    重构模块,用于根据所述目标个数的存储设备的第一结果,对所述第一分块进行重构。
  27. 根据权利要求26所述的装置,其特征在于,所述装置还包括拆分模块;
    所述拆分模块,用于将所述第一条带的校验矩阵拆分成第一子矩阵和所述目标个数的第二子矩阵,所述第一子矩阵包括所述第一分块所对应的列以及第二分块所对应的列,每个第二子矩阵包括所述第一条带中在一个存储设备上所存储的至少一个第三分块所对应的列,所述第二分块为所述第一条带的有效分块中的任一分块,所述第三分块为所述有效分块中除所述第二分块以外的任一分块;
    所述重构模块,用于根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第一分块进行重构。
  28. 根据权利要求27所述的装置,其特征在于,所述重构模块用于:
    对所述目标个数的存储设备的第一结果进行求和,得到求和矩阵;
    基于所述第一子矩阵的逆矩阵以及所述求和矩阵,获取目标分块矩阵;
    将所述目标分块矩阵的第一目标行确定为重构的第一分块。
  29. 根据权利要求27-28所述的任一装置,其特征在于,所述装置还包括:
    第一发送模块,用于向所述每个第二子矩阵所对应的存储设备分别发送第一获取请求,所述第一获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及控制设备的标识;
    所述第一获取模块,还用于获取所述目标个数的存储设备基于所述第一获取请求返回的第一结果。
  30. 根据权利要求29所述的装置,其特征在于,所述第一发送模块还用于:
    向目标存储设备发送写请求,所述写请求携带重构的第一分块以及所述重构的第一分块的块信息,由所述目标存储设备根据所述重构的第一分块的块信息,对所述重构的第一分块进行存储。
  31. 根据权利要求27-30所述的任一装置,其特征在于,所述装置还包括执行模块;
    所述第一获取模块,还用于从存储所述第二分块的存储设备获取所述第二分块;
    所述重构模块,还用于根据所述目标个数的存储设备的第一结果以及所述第一子矩阵,对所述第二分块进行重构;
    所述执行模块,用于若重构的第二分块与获取的所述第二分块相同,则执行所述向目标存储设备发送写请求的步骤,否则,不执行所述向目标存储设备发送写请求的步骤,并且禁止使用所述第一条带。
  32. 根据权利要求27-31所述的任一装置,其特征在于,所述装置还包括:
    第二发送模块,用于向所述每个第二子矩阵所对应的存储设备分别发送第二获取请求,所述第二获取请求携带与存储设备对应的第二子矩阵、所述第二子矩阵中每一列所对应的第三分块的块信息以及目标存储设备的标识;
    所述第二发送模块,还用于向所述目标存储设备发送目标重构请求,所述目标重构请求携带所述第一子矩阵,所述目标重构请求用于指示根据所述第一子矩阵,对所述第一分块进行重构;
    第一接收模块,用于接收所述目标存储设备发送的重构完成响应,所述重构完成响应用于指示所述第一分块重构完成。
  33. 根据权利要求26-32所述的任一装置,其特征在于,所述装置还包括:
    第二接收模块,用于接收第一重构请求,所述第一重构请求携带存储设备中失效的存储介质的存储介质标识;
    所述确定模块还用于:
    根据所述第一重构请求携带的存储介质标识,确定至少一个第二条带;
    将所述至少一个第二条带中任一个条带确定为所述第一条带;
    根据所述存储介质标识,将所述第一条带中已丢失分块中的任一个分块确定为所述第一分块。
  34. 根据权利要求26-32所述的任一装置,其特征在于,所述装置还包括:
    第三接收模块,用于接收第二重构请求,所述第二重构请求携带存储设备中存储介质丢失的至少一个分块的块信息;
    所述确定模块还用于:
    根据所述第二重构请求携带的至少一个分块的块信息,确定至少一个第二条带;
    将所述至少一个第二条带中任一个条带确定为所述第一条带;
    根据所述至少一个分块的块信息,将所述第一条带中已丢失分块中的任一个分块确定为所述第一分块。
  35. 根据权利要求26-34所述的任一装置,其特征在于,所述装置还包括:
    第一查询模块,用于查询所述第一条带的已丢失的所有分块是否全部重构完成,若未全 部重构完成,则对未重构的任一个分块继续执行上述分块重构过程,否则所述第一条带重构完成。
  36. 根据权利要求26-35所述的任一装置,其特征在于,所述装置还包括:
    第二查询模块,用于查询所述目标个数的存储设备中的第一存储设备内的存储介质是否失效;
    所述确定模块,还用于若第一存储设备内的存储介质失效时,从失效的存储介质所存储的分块中,确定第一条带中已丢失分块中的第一分块。
  37. 根据权利要求26-35所述的任一装置,其特征在于,所述装置还包括:
    第三查询模块,用于查询所述目标个数的存储设备中的第一存储设备中的存储介质是否有分块丢失;
    所述确定模块,还用于当所述第一存储设备内的任一存储介质丢失至少一个分块时,从所述至少一个分块中确定第一条带中已丢失分块中的第一分块。
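  38. The apparatus according to any one of claims 27 to 37, wherein the apparatus further comprises:
    a third sending module, configured to send a third obtaining request to a storage unit of the storage device corresponding to each second submatrix, wherein the third obtaining request carries the second submatrix corresponding to the storage device, block information of the third block corresponding to each column in the second submatrix, and an identifier of the first storage device; and
    a second obtaining module, configured to obtain the first results returned by the storage units of the target quantity of storage devices in response to the third obtaining request.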
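  39. The apparatus according to any one of claims 27 to 37, wherein the apparatus further comprises:
    a third obtaining module, configured to send a fourth obtaining request to a storage unit of the storage device corresponding to each second submatrix, wherein the fourth obtaining request carries the second submatrix corresponding to the storage device, block information of the third block corresponding to each column in the second submatrix, and an identifier of a master storage device, wherein
    the third obtaining module is further configured to obtain at least one target sum matrix from at least one master storage device, wherein the at least one master storage device is configured to manage the target quantity of storage devices, each target sum matrix is the sum of the first results returned by at least one storage device in response to the fourth obtaining request, and the at least one storage device is managed by one master storage device; and
    the reconstruction module is further configured to reconstruct the first block based on the at least one target sum matrix and the first submatrix.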
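  40. The apparatus according to claim 39, wherein the reconstruction module is further configured to:
    sum the at least one target sum matrix to obtain a sum matrix;
    obtain a target block matrix based on an inverse matrix of the first submatrix and the sum matrix; and
    determine a first target row of the target block matrix as the reconstructed first block.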
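The two-level summation of claims 39 and 40 can be illustrated with a short sketch. It assumes, purely for illustration, arithmetic modulo the prime 257 and a hypothetical grouping of storage devices under two master storage devices; `mat_sum`, the group names and the sample first results are not taken from the source.

```python
P_MOD = 257  # assumption: small prime field used only for illustration


def mat_sum(matrices):
    """Element-wise sum of equally shaped matrices modulo P_MOD."""
    return [[sum(vals) % P_MOD for vals in zip(*rows)]
            for rows in zip(*matrices)]


# Hypothetical first results returned by three storage devices.
r1 = [[7, 8, 9], [21, 24, 27]]
r2 = [[245, 242, 239], [0, 0, 0]]
r3 = [[0, 0, 0], [227, 221, 215]]

# Each master storage device sums the first results of the devices it manages.
groups = {"master-0": [r1, r2], "master-1": [r3]}
target_sums = [mat_sum(results) for results in groups.values()]

# The reconstructing device only sums the (fewer) target sum matrices.
sum_matrix = mat_sum(target_sums)
```

Because addition is associative, `sum_matrix` equals the flat sum of all first results, while the reconstructing device receives one matrix per master storage device instead of one per storage device.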
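  41. A data reconstruction apparatus, wherein the apparatus comprises:
    a reading module, configured to read valid blocks of a first stripe that are stored on a storage device, wherein the first stripe is stored by a target quantity of storage devices;
    a calculation module, configured to perform calculation based on the read valid blocks to obtain a first result; and
    a sending module, configured to send the first result to the target device, so that the target device reconstructs a first block among the lost blocks of the first stripe based on the first results of the target quantity of storage devices.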
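  42. The apparatus according to claim 41, wherein the apparatus further comprises:
    a receiving module, configured to receive an obtaining request, wherein the obtaining request carries a second submatrix corresponding to the storage device, block information of the third block corresponding to each column in the second submatrix, and an identifier of the target device; the second submatrix comprises the columns corresponding to at least one third block of the first stripe that is stored on the corresponding storage device; the second block is any one of the valid blocks of the first stripe; and the third block is any one of the valid blocks other than the second block;
    the reading module is configured to read the at least one third block based on the block information of the third block corresponding to each column in the second submatrix; and
    the calculation module is configured to perform calculation based on the at least one third block and the second submatrix to obtain the first result.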
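  43. The apparatus according to claim 42, wherein the calculation module is configured to:
    form a block matrix from the at least one third block, wherein each row of the block matrix is one third block; and
    multiply the second submatrix by the block matrix to obtain the first result.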
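The storage-side computation of claim 43 is a single matrix product. The following sketch assumes arithmetic modulo the prime 257 as a stand-in for the finite-field arithmetic an erasure code would actually use; `first_result` and the sample values are hypothetical.

```python
P_MOD = 257  # assumption: small prime field used only for illustration


def first_result(second_sub, third_blocks):
    """Multiply the second submatrix by the block matrix of third blocks.

    Each row of the block matrix is one third block, so the number of
    columns in second_sub must equal the number of third blocks.
    """
    width = len(third_blocks[0])
    if any(len(block) != width for block in third_blocks):
        raise ValueError("all third blocks must have the same length")
    if any(len(row) != len(third_blocks) for row in second_sub):
        raise ValueError("one submatrix column is required per third block")
    return [[sum(coef * block[j] for coef, block in zip(row, third_blocks)) % P_MOD
             for j in range(width)]
            for row in second_sub]


# A device holding one third block, with check-matrix column [1, 3].
single = first_result([[1], [3]], [[7, 8, 9]])
# A device holding two third blocks, with identity columns.
double = first_result([[1, 0], [0, 1]], [[245, 242, 239], [227, 221, 215]])
```

The first result has one row per row of the check matrix, regardless of how many third blocks the device holds, so the device sends a fixed amount of data rather than every block it stores.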
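  44. The apparatus according to claim 41 or 42, wherein the target device comprises a control device, a target storage device, a first storage device, and at least one master storage device, the first storage device is any one of the target storage devices, and the at least one master storage device is configured to manage the target quantity of storage devices;
    when the identifier of the target device is an identifier of the control device, the obtaining request is a first obtaining request, which is used to instruct sending the first result to the control device;
    when the identifier of the target device is an identifier of the target storage device, the obtaining request is a second obtaining request, which is used to instruct sending the first result to the target storage device;
    when the identifier of the target device is an identifier of the first storage device, the obtaining request is a third obtaining request, which is used to instruct sending the first result to the first storage device; and
    when the identifier of the target device is an identifier of the master storage device, the obtaining request is a fourth obtaining request, which is used to instruct sending the first result to the master storage device.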
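The four cases of claim 44 differ only in the kind of device that the carried identifier names. A minimal sketch of that dispatch is shown below; the `TargetKind` enum, the `directory` lookup table and the device identifiers are hypothetical, since the source does not say how a storage device resolves an identifier to a device role.

```python
from enum import Enum


class TargetKind(Enum):
    """Device roles, each mapped to the obtaining request it implies."""
    CONTROL_DEVICE = "first obtaining request"
    TARGET_STORAGE_DEVICE = "second obtaining request"
    FIRST_STORAGE_DEVICE = "third obtaining request"
    MASTER_STORAGE_DEVICE = "fourth obtaining request"


def classify_request(target_id, directory):
    """Return (request type, destination) for the identifier in a request."""
    kind = directory[target_id]  # raises KeyError for an unknown identifier
    return kind.value, target_id


# Hypothetical identifier directory known to the storage device.
directory = {
    "ctrl-0": TargetKind.CONTROL_DEVICE,
    "dev-7": TargetKind.TARGET_STORAGE_DEVICE,
    "dev-2": TargetKind.FIRST_STORAGE_DEVICE,
    "master-1": TargetKind.MASTER_STORAGE_DEVICE,
}
request_type, destination = classify_request("master-1", directory)
```

In every case the storage device performs the same computation; only the destination of the first result changes.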
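  45. The apparatus according to any one of claims 42 to 44, wherein the apparatus further comprises:
    a first query module, configured to query whether a storage medium in the storage device has failed; and
    the sending module is further configured to: when a storage medium in the storage device has failed, send a first reconstruction request to a control device, wherein the first reconstruction request carries a storage medium identifier of the failed storage medium in the storage device.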
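  46. The apparatus according to any one of claims 42 to 44, wherein the apparatus further comprises:
    a second query module, configured to query whether any block has been lost from a storage medium in the storage device; and
    the sending module is further configured to: when any storage medium in the storage device has lost at least one block, send a second reconstruction request to a control device, wherein the second reconstruction request carries block information of the at least one block.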
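  47. A data reconstruction apparatus, wherein the apparatus comprises:
    a receiving module, configured to receive a target reconstruction request, wherein the target reconstruction request carries a first submatrix and is used to instruct reconstruction, based on the first submatrix, of a first block among the lost blocks of a first stripe; the first submatrix comprises a column corresponding to the first block among the lost blocks of the first stripe and a column corresponding to a second block; and the second block is any one of the valid blocks of the first stripe;
    an obtaining module, configured to obtain first results returned by a target quantity of storage devices in response to the second obtaining request, wherein the second obtaining request carries a second submatrix corresponding to a storage device, block information of the third block corresponding to each column in the second submatrix, and an identifier of a target storage device; the blocks of the first stripe are stored by the target quantity of storage devices; each first result is obtained by one of the target quantity of storage devices by reading the valid blocks of the first stripe that it stores and performing calculation based on the read valid blocks; and the third block is any one of the valid blocks other than the second block; and
    a reconstruction module, configured to reconstruct the first block based on the first results of the target quantity of storage devices and the first submatrix.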
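  48. The apparatus according to claim 47, wherein the reconstruction module is further configured to:
    sum the first results of the target quantity of storage devices to obtain a sum matrix;
    obtain a target block matrix based on an inverse matrix of the first submatrix and the sum matrix; and
    use a first target row of the target block matrix as the reconstructed first block.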
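  49. The apparatus according to claim 47 or 48, wherein the apparatus further comprises:
    a storage module, configured to store the reconstructed first block; and
    a sending module, configured to: when storage of the reconstructed first block is complete, send a reconstruction completion response to a control device, wherein the reconstruction completion response indicates that reconstruction of the first block is complete.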
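  50. The apparatus according to claim 49, wherein the apparatus further comprises an execution module;
    the obtaining module is further configured to obtain the second block from the storage device that stores the second block;
    the reconstruction module is further configured to reconstruct the second block based on the first results of the target quantity of storage devices and the first submatrix; and
    the execution module is configured to: if the reconstructed second block is the same as the obtained second block, perform the step of storing the reconstructed first block; otherwise, skip the step of storing the reconstructed first block and prohibit use of the first stripe.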
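  51. A computer device, wherein the computer device comprises a processor and a memory, the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the operations performed in the data reconstruction method according to any one of claims 1 to 25.
  52. A storage medium, wherein the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed in the data reconstruction method according to any one of claims 1 to 25.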
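  53. A data reconstruction method in a storage system, wherein the storage system comprises a control device and one or more storage devices, each of the one or more storage devices comprises a hard disk, and the one or more storage devices are configured to store the blocks of a stripe generated by the control device; and the method comprises:
    reading, by a first storage device that is among the one or more storage devices and that stores valid blocks of the stripe, the stored valid blocks, and calculating a first result based on the read valid blocks;
    sending, by the first storage device, the first result to the control device; and
    receiving, by the control device, the first result, and recovering a damaged block in the stripe based on the first result.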
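  54. The method according to claim 53, wherein the stripe is generated according to an erasure coding algorithm.
  55. The method according to claim 53 or 54, wherein the one or more storage devices are disk enclosures.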
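  56. A storage system, wherein the storage system comprises a control device and one or more storage devices, each of the one or more storage devices comprises a hard disk, and the one or more storage devices are configured to store the blocks of a stripe generated by the control device;
    a first storage device that is among the one or more storage devices and that stores valid blocks of the stripe is configured to read the stored valid blocks, calculate a first result based on the read valid blocks, and send the first result to the control device; and
    the control device is configured to receive the first result and recover a damaged block in the stripe based on the first result.
  57. The storage system according to claim 56, wherein the one or more storage devices are disk enclosures.
  58. The storage system according to claim 56 or 57, wherein the stripe is generated according to an erasure coding algorithm.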
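  59. A data reconstruction method in a storage system, wherein the storage system comprises a plurality of storage devices, each of the plurality of storage devices comprises one or more hard disks, and the plurality of storage devices are configured to store the blocks of a stripe; and the method comprises:
    reading, by a first storage device that is among the plurality of storage devices and that stores valid blocks of the stripe, the stored valid blocks, and calculating a first result based on the read valid blocks; and
    recovering, by a second storage device among the plurality of storage devices, a damaged block in the stripe based on the first result.
  60. The method according to claim 59, wherein the second storage device is a master storage device among the plurality of storage devices, or is the storage device, among the plurality of storage devices, that stores the damaged block of the stripe.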
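  61. The method according to claim 60, wherein the method further comprises:
    sending, by the first storage device, the first result to the second storage device, wherein the first storage device does not include the second storage device; and
    receiving, by the second storage device, the first result.
  62. The method according to claim 60, wherein the method further comprises:
    sending, by the storage devices of the first storage device other than the second storage device, the first result to the second storage device, wherein the first storage device includes the second storage device; and
    receiving, by the second storage device, the first result sent by the other storage devices.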
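  63. A storage system, wherein the storage system comprises a plurality of storage devices, each of the plurality of storage devices comprises one or more hard disks, and the plurality of storage devices are configured to store the blocks of a stripe;
    a first storage device that is among the plurality of storage devices and that stores valid blocks of the stripe is configured to read the stored valid blocks and calculate a first result based on the read valid blocks; and
    a second storage device among the plurality of storage devices is configured to recover a damaged block in the stripe based on the first result.
  64. The storage system according to claim 63, wherein the second storage device is a master storage device among the plurality of storage devices, or is the storage device, among the plurality of storage devices, that stores the damaged block of the stripe.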
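  65. The storage system according to claim 64, wherein the first storage device is configured to send the first result to the second storage device, wherein the first storage device does not include the second storage device; and
    the second storage device is configured to receive the first result.
  66. The storage system according to claim 64, wherein the storage devices of the first storage device other than the second storage device are configured to send the first result to the second storage device, wherein the first storage device includes the second storage device; and
    the second storage device is configured to receive the first result sent by the other storage devices.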
PCT/CN2019/097155 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system WO2021012164A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2019/097155 WO2021012164A1 (zh) 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system
EP19938765.5A EP3989069B1 (en) 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system
CN201980008279.9A CN112543920B (zh) 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system
US17/574,069 US20220138046A1 (en) 2019-07-22 2022-01-12 Data reconstruction method and apparatus, computer device, and storage medium and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/097155 WO2021012164A1 (zh) 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/574,069 Continuation US20220138046A1 (en) 2019-07-22 2022-01-12 Data reconstruction method and apparatus, computer device, and storage medium and system

Publications (1)

Publication Number Publication Date
WO2021012164A1 (zh) 2021-01-28

Family

ID=74192748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/097155 WO2021012164A1 (zh) 2019-07-22 2019-07-22 Data reconstruction method and apparatus, computer device, storage medium, and system

Country Status (4)

Country Link
US (1) US20220138046A1 (zh)
EP (1) EP3989069B1 (zh)
CN (1) CN112543920B (zh)
WO (1) WO2021012164A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168064A (zh) * 2020-09-10 2022-03-11 伊姆西Ip控股有限责任公司 Method, device, and computer program product for rebuilding a storage system
US11934280B2 (en) 2021-11-16 2024-03-19 Netapp, Inc. Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors
CN116846546B (zh) * 2023-04-24 2024-03-22 广州智臣信息科技有限公司 Cross-network data exchange system that prevents information loss and duplication

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104991740A (zh) * 2015-06-24 2015-10-21 华中科技大学 General matrix optimization method for accelerating erasure code encoding and decoding
CN107656832A (zh) * 2017-09-18 2018-02-02 华中科技大学 Erasure coding method with low data reconstruction overhead
US20180095676A1 (en) * 2016-06-30 2018-04-05 Western Digital Technologies, Inc. Declustered array of storage devices with chunk groups and support for multiple erasure schemes
WO2018112980A1 (zh) * 2016-12-24 2018-06-28 华为技术有限公司 Storage controller, data processing chip, and data processing method
CN108334280A (zh) * 2017-12-28 2018-07-27 创新科存储技术(深圳)有限公司 Method and apparatus for fast rebuild of a RAID 5 disk group

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5623595A (en) * 1994-09-26 1997-04-22 Oracle Corporation Method and apparatus for transparent, real time reconstruction of corrupted data in a redundant array data storage system
US6993701B2 (en) * 2001-12-28 2006-01-31 Network Appliance, Inc. Row-diagonal parity technique for enabling efficient recovery from double failures in a storage array
US7073115B2 (en) * 2001-12-28 2006-07-04 Network Appliance, Inc. Correcting multiple block data loss in a storage array using a combination of a single diagonal parity group and multiple row parity groups
US20050091452A1 (en) * 2003-10-28 2005-04-28 Ying Chen System and method for reducing data loss in disk arrays by establishing data redundancy on demand
US7263629B2 (en) * 2003-11-24 2007-08-28 Network Appliance, Inc. Uniform and symmetric double failure correcting technique for protecting against two disk failures in a disk array
US7206899B2 (en) * 2003-12-29 2007-04-17 Intel Corporation Method, system, and program for managing data transfer and construction
US7512862B1 (en) * 2005-04-13 2009-03-31 Network Appliance, Inc. Compression of data for protection
US20080104445A1 (en) * 2006-10-31 2008-05-01 Hewlett-Packard Development Company, L.P. Raid array
US8156405B1 (en) * 2007-12-07 2012-04-10 Emc Corporation Efficient redundant memory unit array
US8386834B1 (en) * 2010-04-30 2013-02-26 Network Appliance, Inc. Raid storage configuration for cached data storage
US8683296B2 (en) * 2011-12-30 2014-03-25 Streamscale, Inc. Accelerated erasure coding system and method
US8914706B2 (en) * 2011-12-30 2014-12-16 Streamscale, Inc. Using parity data for concurrent data authentication, correction, compression, and encryption
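CN102981778B (zh) * 2012-11-15 2016-11-16 浙江宇视科技有限公司 RAID array rebuild method and apparatus
CN103975309B (zh) * 2012-11-28 2017-08-25 华为技术有限公司 Data recovery method, data recovery apparatus, memory, and storage system
CN104461394B (zh) * 2014-12-09 2018-11-13 华为技术有限公司 RAID and method for reading data from the RAID
CN104536698A (zh) * 2014-12-10 2015-04-22 华为技术有限公司 RAID-based disk reconstruction method and related device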
EP3208714B1 (en) * 2015-12-31 2019-08-21 Huawei Technologies Co., Ltd. Data reconstruction method, apparatus and system in distributed storage system
KR102580123B1 (ko) * 2016-05-03 2023-09-20 삼성전자주식회사 RAID storage device and management method thereof
US10585749B2 (en) * 2017-08-10 2020-03-10 Samsung Electronics Co., Ltd. System and method for distributed erasure coding

Also Published As

Publication number Publication date
US20220138046A1 (en) 2022-05-05
EP3989069A1 (en) 2022-04-27
CN112543920B (zh) 2023-02-10
CN112543920A (zh) 2021-03-23
EP3989069A4 (en) 2022-07-27
EP3989069B1 (en) 2023-10-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19938765

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019938765

Country of ref document: EP

Effective date: 20211230