CN113377569A - Method, apparatus and computer program product for recovering data - Google Patents

Method, apparatus and computer program product for recovering data

Info

Publication number
CN113377569A
CN113377569A
Authority
CN
China
Prior art keywords
data
corrupted
disk
recovery
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010158555.4A
Other languages
Chinese (zh)
Inventor
汤海鹰
吴志龙
康剑斌
商蓉蓉
高健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC IP Holding Co LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by EMC IP Holding Co LLC filed Critical EMC IP Holding Co LLC
Priority to CN202010158555.4A priority Critical patent/CN113377569A/en
Priority to US17/023,815 priority patent/US11314594B2/en
Publication of CN113377569A publication Critical patent/CN113377569A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1084Degraded mode, e.g. caused by single or multiple storage removals or disk failures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1088Reconstruction on already foreseen single or plurality of spare disks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1092Rebuilding, e.g. when physically replacing a failing disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2211/00Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
    • G06F2211/10Indexing scheme relating to G06F11/10
    • G06F2211/1002Indexing scheme relating to G06F11/1076
    • G06F2211/1026Different size groups, i.e. non uniform size of groups in RAID systems with parity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2211/00Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
    • G06F2211/10Indexing scheme relating to G06F11/10
    • G06F2211/1002Indexing scheme relating to G06F11/1076
    • G06F2211/1057Parity-multiple bits-RAID6, i.e. RAID 6 implementations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2211/00Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
    • G06F2211/10Indexing scheme relating to G06F11/10
    • G06F2211/1002Indexing scheme relating to G06F11/1076
    • G06F2211/1059Parity-single bit-RAID5, i.e. RAID 5 implementations

Abstract

Embodiments of the present disclosure relate to a method, apparatus and computer program product for recovering data. The method includes determining whether data read from a Redundant Array of Independent Disks (RAID) is corrupted, wherein the RAID includes two check disks. The method further includes, in accordance with a determination that the read data is corrupted, determining whether single-disk data recovery can recover the corrupted data. The method further includes recovering the corrupted data using dual-disk data recovery in the event that single-disk data recovery fails to recover the corrupted data. Embodiments of the present disclosure provide a recovery scheme for unrecorded (silent) data corruption in a RAID with two check disks, so that the corrupted data can be recovered in the case of a single-disk failure or a double-disk failure, thereby improving the performance of the storage system.

Description

Method, apparatus and computer program product for recovering data
Technical Field
Embodiments of the present disclosure relate generally to the field of data storage technology, and more particularly, to a method, apparatus, and computer program product for recovering data.
Background
Redundant Array of Independent Disks (RAID) is a data redundancy technique that combines multiple independent physical disks in different ways into an array of disks (i.e., a logical disk), thereby providing higher storage performance and higher reliability than a single disk. In order to recover data when a certain disk in the RAID fails, one parity information block (for example, RAID 5) or a plurality of parity information blocks (for example, RAID 6) is generally provided in the RAID. Taking RAID6 as an example, if the data on one or two disks in RAID6 fails, RAID6 can reconstruct the data on the failed disk or disks from the parity information.
Generally, a RAID may include a number of disks equal to or greater than the RAID width, where each disk is divided into a plurality of slices, and each slice may have a fixed size (e.g., 4GB). A RAID typically stores data by striping; for example, in RAID6, 6 slices across 6 disks may be combined to form a RAID stripe set, also referred to as an "Uber", which includes multiple stripes. That is, 4 data blocks and 2 parity blocks (i.e., "4D + P + Q") may form a stripe, and when a certain disk in the RAID fails, reconstruction can be performed using the parity information, so that the data can be recovered and is not lost.
Disclosure of Invention
Embodiments of the present disclosure provide a method, apparatus, and computer program product for recovering data.
In one aspect of the disclosure, a method for recovering data is provided. The method comprises: determining whether data read from a Redundant Array of Independent Disks (RAID) is corrupted, wherein the RAID comprises two check disks; in accordance with a determination that the read data is corrupted, determining whether single-disk data recovery is capable of recovering the corrupted data; and in accordance with a determination that the single-disk data recovery fails to recover the corrupted data, recovering the corrupted data using dual-disk data recovery.
In another aspect of the present disclosure, an electronic device is provided. The device includes a processing unit and a memory coupled to the processing unit and storing instructions. The instructions, when executed by the processing unit, perform the following acts: determining whether data read from a Redundant Array of Independent Disks (RAID) is corrupted, wherein the RAID comprises two check disks; in accordance with a determination that the read data is corrupted, determining whether single-disk data recovery is capable of recovering the corrupted data; and in accordance with a determination that the single-disk data recovery fails to recover the corrupted data, recovering the corrupted data using dual-disk data recovery.
In yet another aspect of the disclosure, a computer program product is provided. The computer program product is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions that, when executed, cause a computer to perform a method or process in accordance with embodiments of the present disclosure.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the disclosure, nor is it intended to limit the scope of various embodiments of the disclosure.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like elements throughout the exemplary embodiments of the disclosure.
FIG. 1 illustrates a schematic diagram of an example environment of a storage system, according to an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of stripes in RAID 6;
FIG. 3 shows a flow diagram of a method for data corruption recovery in accordance with an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram for single disk data recovery in RAID6, in accordance with an embodiment of the present disclosure;
FIG. 5 illustrates another schematic diagram for single disk data recovery in RAID6 according to an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram for dual disk data recovery in RAID6, in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates another schematic diagram for dual disk data recovery in RAID6 according to an embodiment of the present disclosure; and
FIG. 8 shows a schematic block diagram of a device that may be used to implement embodiments of the present disclosure.
Detailed Description of Embodiments
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain specific embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The term "include" and variations thereof as used herein is meant to be inclusive in an open-ended manner, i.e., "including but not limited to". Unless specifically stated otherwise, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one additional embodiment". The terms "first," "second," and the like may refer to different or the same items unless explicitly indicated to be different.
Silent data corruption, also known as static data corruption, refers to a data error that is not detected by the disk firmware or the host operating system: it is deemed to have occurred when a user issues a read command to a hard disk drive and the data returned by the hard disk drive differs from the data originally written, while the disk hardware and software are unaware of the corruption before or during the read. Such corruption events may be transient, or may be permanent data corruption. However, conventional storage systems do not have a recovery scheme for silent data corruption in a RAID with two parity disks.
Therefore, embodiments of the present disclosure provide a recovery scheme for unrecorded (silent) data corruption in a RAID with two check disks, so that corrupted data can be recovered in the case of a single-disk failure or a double-disk failure, thereby improving the performance of the storage system. According to embodiments of the present disclosure, dual-disk recovery from silent data corruption can be supported.
It should be appreciated that although RAID6 is used as an example of a RAID comprising two parity disks in some embodiments of the present disclosure, any other RAID comprising two parity disks, known or developed in the future, may be used in conjunction with embodiments of the present disclosure.
The basic principles and several example implementations of the present disclosure are explained below with reference to fig. 1-8. It should be understood that these exemplary embodiments are given solely for the purpose of enabling those skilled in the art to better understand and thereby implement the embodiments of the present disclosure, and are not intended to limit the scope of the present disclosure in any way.
FIG. 1 illustrates a schematic diagram of an example environment 100 of a storage system, according to an embodiment of the disclosure. As shown in the example environment 100 of FIG. 1, the storage pool 110 includes multiple RSSs 111, 112, 113, and 114, each of which constitutes a failure domain, meaning that if one disk drive in a certain RSS fails, it does not affect the reliability of the other RSSs. The storage pool 110 manages all disk drives in the storage system. In embodiments of the present disclosure, each RSS may include a plurality of disk drives, for example, between 5 and 25.
Each disk may be divided into fixed-size disk slices, for example, slices of 4GB each. Multiple slices on different disks may make up one stripe set (Uber), and multiple stripe sets may make up one mapper layer. For example, a stripe set may be allocated from the storage pool 110; in the case of a RAID 5 type RAID, creating a stripe set requires allocating 5 free slices from 5 disks to make up a RAID 5 stripe set, and in the case of a RAID6 type RAID, creating a stripe set requires allocating 6 free slices from 6 disks to make up a RAID6 stripe set. Furthermore, all slices included in one stripe set must come from the same RSS. Each stripe set includes a plurality of RAID stripes. In some embodiments, each stripe in a stripe set may be 2MB in size, also referred to as a physical large block (PLB).
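As a rough sketch only, the following Python snippet shows how a RAID6 stripe set (Uber) might be assembled from free slices, one from each of six different disks in the same RSS. The Disk class and the allocate_raid6_stripe_set function are assumptions made for this sketch and are not identifiers from the disclosure.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    RAID6_WIDTH = 6        # 4 data slices + 2 parity slices

    @dataclass
    class Disk:
        disk_id: int
        rss_id: int
        free_slices: List[int] = field(default_factory=list)   # indices of free slices on this disk

    def allocate_raid6_stripe_set(disks: List[Disk], rss_id: int) -> Optional[List[Tuple[int, int]]]:
        """Pick one free slice from each of RAID6_WIDTH different disks in the same RSS.

        Returns a list of (disk_id, slice_index) pairs forming the stripe set (Uber),
        or None if the RSS cannot supply enough disks with free slices.
        """
        candidates = [d for d in disks if d.rss_id == rss_id and d.free_slices]
        if len(candidates) < RAID6_WIDTH:
            return None
        return [(d.disk_id, d.free_slices.pop(0)) for d in candidates[:RAID6_WIDTH]]

    # Example: an RSS with 6 disks, each exposing two free 4 GB slices.
    pool = [Disk(disk_id=i, rss_id=0, free_slices=[0, 1]) for i in range(6)]
    print(allocate_raid6_stripe_set(pool, rss_id=0))   # [(0, 0), (1, 0), ..., (5, 0)]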
The storage pool 110 exposes some layers (e.g., the user data layer 130, the metadata layer 140, etc.) for use by other components, and each layer may include multiple stripe sets. Each layer applies a respective RAID policy based on its data type, and all stripe sets in a layer apply the same RAID policy, e.g., the same RAID width and RAID type. Layers can be expanded as needed, so new stripe sets can be dynamically allocated and assigned to the corresponding layer.
As illustrated by the example environment 100, a RAID database (DB) layer 120, a user data layer 130, a metadata layer 140, and so on may be constructed, each of which is mapped by a mapper 150 into a namespace 160 for use by an external host. The storage pool 110, the RAID database layer 120, the user data layer 130, the metadata layer 140, the mapper 150, and so on may make up the entire RAID system. The RAID DB layer 120 includes only a single stripe set and is not exposed; it is consumed only by the RAID internally. The user data layer 130 employs RAID 5 and/or RAID6, with the RAID type and width depending on the type and number of disks in the system. For example, RAID 5 may typically support 4+1, 8+1 or 16+1, and RAID6 may typically support 4+2, 8+2 or 16+2. In general, 2 or 3 mirrors may be provided for each layer, depending on the level of protection required for the particular data.
The mapper 150 is a core component in the RAID; it treats each layer as a flat linear physical address space and, in addition, exposes a single flat linear logical address space to the namespace 160. The logical address space may be very large. In some embodiments, the mapper 150 uses a B+ tree to maintain the mapping between logical addresses and physical addresses at a 4K page granularity. The namespace 160 consumes and manages the linear logical space exposed by the mapper 150, and creates and exposes volumes to external hosts. The mapper 150 consumes the boot layer (not shown), the user data layer 130 and the metadata layer 140. The boot layer employs 3 mirrors, and the mapper 150 stores some important configurations in the boot layer to be loaded on the boot path. The metadata layer 140 may employ 2 mirrors, and the mapper 150 stores metadata, such as B+ tree nodes, in the metadata layer 140. The user data layer 130 employs RAID 5 and/or RAID6, and all host user data is stored in the user data layer 130.
When processing IO, the mapper 150 generates read IOs and write IOs for these layers. The mapper 150 operates in a log-based mode, meaning that when the mapper 150 writes any host data to the user data layer 130, it first aggregates enough pages, then packs them into a 2MB PLB, and writes the PLB to the RAID. This log-based mode significantly simplifies the write IO path. At the user data layer 130, the mapper 150 always performs write IOs of size 2MB, and a 2MB write is always a full-stripe write to the RAID. Read IOs on the user data layer 130 may be of any size within 2MB, but are typically 4K page aligned.
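A minimal sketch of this log-based aggregation, assuming 4K pages and 2MB PLBs as described above, is given below. The LogStructuredWriter class and its raid_write callback are illustrative assumptions; a real mapper would also persist the logical-to-physical mapping for each page.

    PAGE_SIZE = 4 * 1024            # 4K page
    PLB_SIZE = 2 * 1024 * 1024      # 2 MB physical large block
    PAGES_PER_PLB = PLB_SIZE // PAGE_SIZE   # 512 pages per PLB

    class LogStructuredWriter:
        """Illustrative aggregation of host pages into full-stripe 2 MB writes."""

        def __init__(self, raid_write):
            self._raid_write = raid_write   # callable performing one full-stripe write
            self._pending = []              # buffered (logical_address, page_bytes) pairs

        def write_page(self, logical_address, page_bytes):
            assert len(page_bytes) == PAGE_SIZE
            self._pending.append((logical_address, page_bytes))
            if len(self._pending) == PAGES_PER_PLB:
                self._flush_plb()

        def _flush_plb(self):
            plb = b"".join(page for _, page in self._pending)
            # One 2 MB PLB maps to one full RAID stripe write, so parity can be
            # computed from the new data alone, without a read-modify-write.
            self._raid_write(plb)
            # A real mapper would also record the logical-to-physical mapping here
            # (e.g., in its B+ tree) before discarding the buffered pages.
            self._pending.clear()

    writer = LogStructuredWriter(raid_write=lambda plb: print("full-stripe write of", len(plb), "bytes"))
    for n in range(PAGES_PER_PLB):
        writer.write_page(n * PAGE_SIZE, bytes(PAGE_SIZE))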
Further, although not shown, the storage system may also include modules and components such as a cache, a logger, a log user data layer and a log metadata layer. The cache provides an in-memory caching function and has two instances in the system, one for user data and another for metadata; it provides a transactional operation function for the mapper 150 to speed up data access. When a transaction is committed, if the transaction has modified some pages, then in order to prevent data loss the cache persists all the modifications, through the logger component, to some special layers exposed by the RAID. The log user data layer and the log metadata layer are created on some special drives whose performance is close to that of DRAM and better than that of SSD. The logger component consumes and manages the space of the log user data layer and the log metadata layer, and the cache loads and retains dirty pages using the APIs exposed by the logger component.
FIG. 2 shows a schematic diagram of stripes in RAID 6. As shown in FIG. 2, RAID6 involves disks 210, 220, 230, 240, 250 and 260, although RAID6 may involve more disks. RAID6 uses two parity disks P and Q, where P holds the ordinary XOR parity information and Q holds a Reed-Solomon code. RAID6 tolerates data failures on up to two disks in the RAID without causing any data loss. In general, RAID6 may be configured as 4 data disks + 2 parity disks, 8 data disks + 2 parity disks, or 16 data disks + 2 parity disks.
FIG. 2 shows a RAID6 configuration of 4 data disks + 2 parity disks; for example, data blocks A1, B1, C1, D1 and parity blocks P1, Q1 constitute one RAID6 stripe. In a RAID6 system, data can be recovered using the parity blocks. For example, if data block A1 in the stripe is corrupted, the data of data block A1 may be recovered using parity block P1 and the other data blocks B1, C1, D1, just as in RAID 5 recovery. If parity block P1 or Q1 in the stripe is corrupted, P1 or Q1 can simply be recalculated. If both data block A1 and parity block Q1 in the stripe are corrupted, the data of data block A1 may be recovered using parity block P1 and the other data blocks B1, C1, D1, and parity block Q1 may then be recalculated. If data block A1 and parity block P1 in the stripe are corrupted at the same time, the data of data block A1 can be recovered using parity block Q1 and the other data blocks B1, C1 and D1, and parity block P1 can then be recalculated. If two data blocks A1 and B1 in a stripe are corrupted at the same time, the data of data blocks A1 and B1 may be recovered using parity blocks P1, Q1 and data blocks C1, D1.
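The following sketch illustrates only the XOR relationship behind the P parity and the single-block recovery cases above; recovery cases that involve the Q parity require Reed-Solomon (Galois-field) arithmetic, which is not reproduced here.

    def xor_blocks(*blocks: bytes) -> bytes:
        """Byte-wise XOR of equally sized blocks."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)

    # A "4 data + 2 parity" stripe: P is the XOR of the four data blocks.
    a1, b1, c1, d1 = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
    p1 = xor_blocks(a1, b1, c1, d1)

    # If A1 is corrupted, it can be rebuilt from P1 and the surviving data blocks.
    recovered_a1 = xor_blocks(p1, b1, c1, d1)
    assert recovered_a1 == a1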
In some embodiments of the present disclosure, a method for recovering data is provided that includes determining whether data read from a RAID is corrupted, wherein the RAID includes two check disks. The method further includes, in accordance with a determination that the read data is corrupted, determining whether single-disk data recovery is capable of recovering the corrupted data. The method also includes, in accordance with a determination that the single-disk data recovery fails to recover the corrupted data, recovering the corrupted data using dual-disk data recovery. In this way, embodiments of the present disclosure provide a recovery scheme for unrecorded data corruption in a RAID with two check disks, so that the corrupted data can be recovered in the case of a single-disk failure or a double-disk failure, thereby improving the performance of the storage system.
FIG. 3 illustrates a flow diagram of a method 300 for data corruption recovery in accordance with the present disclosure, which is a two-step recovery method that combines single-disk data recovery and dual-disk data recovery. In RAID6, when the mapper 150 finds data corruption, neither the mapper nor the RAID knows how many disks in total hold corrupted data. Thus, the method 300 of the present disclosure first assumes that data on only one disk is corrupted and attempts a single-disk data recovery mechanism, as in RAID 5 recovery. If the data is successfully recovered, the whole recovery process succeeds; otherwise, the dual-disk data recovery mechanism is attempted next. If the data is then successfully recovered, the whole recovery process also succeeds; otherwise, the whole recovery process fails, because RAID6 cannot recover from data failures on more than 2 disks.
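This two-step flow can be summarized in the following sketch. The helper callables single_disk_recover, dual_disk_recover and checksum_fn are placeholders assumed for the sketch, not identifiers from the disclosure.

    def recover_corrupted_read(stripe, expected_checksum, checksum_fn,
                               single_disk_recover, dual_disk_recover):
        """Two-step recovery: try single-disk recovery first, then dual-disk recovery.

        single_disk_recover and dual_disk_recover return candidate data for the
        corrupted block (or None), and checksum_fn recomputes the checksum that the
        mapper cached at write time; all of these are assumed placeholders.
        """
        candidate = single_disk_recover(stripe)
        if candidate is not None and checksum_fn(candidate) == expected_checksum:
            return candidate                  # single-disk recovery succeeded

        candidate = dual_disk_recover(stripe, expected_checksum, checksum_fn)
        if candidate is not None:
            return candidate                  # dual-disk recovery succeeded

        raise IOError("uncorrectable error: data corrupted on more than two disks")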
At 302, it is determined whether data corruption is found. For example, the mapper 150 described with reference to FIG. 1 may use a predetermined checksum of the data written to a data block (e.g., a 4K page) to verify whether the read data is corrupted. The checksum of each 4K page of data may be calculated and cached or saved when the data is written. Then, when the data is read, the read data is checked against the cached checksum to determine whether the read data block is corrupted; such corruption is referred to as unrecorded data corruption. In a RAID6 system, if one or two blocks in a stripe are corrupted, the data of the corrupted block or blocks can be recovered from the other blocks.
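A minimal sketch of such checksum bookkeeping is shown below; CRC32 is used purely as an example, since the disclosure only requires some predetermined checksum.

    import zlib

    PAGE_SIZE = 4 * 1024
    checksum_cache = {}     # logical page address -> checksum recorded at write time

    def write_page(address: int, data: bytes) -> None:
        assert len(data) == PAGE_SIZE
        checksum_cache[address] = zlib.crc32(data)   # checksum computed and cached on write
        # ... the data is then written to the RAID stripe ...

    def is_read_corrupted(address: int, data: bytes) -> bool:
        """True if the page read back no longer matches the checksum cached at write time."""
        return zlib.crc32(data) != checksum_cache[address]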
If no data corruption is found at 302, the mapper 150 may forward the data to the host as normal. If data corruption is found at 302, then a single disk data recovery process is performed at 304, i.e., assuming that one data block in a RAID6 stripe is corrupted. Fig. 4-5 illustrate two examples of single disk data recovery for RAID6 in accordance with embodiments of the present disclosure.
Referring to FIG. 4, an example of single-disk data recovery in which the corrupted data does not cross pages is shown. As shown in FIG. 4, the checksum of the data read by the mapper 150 from data chunk D1 does not match the checksum 420 for data chunk D1, indicating that data chunk D1 is corrupted. The mapper 150 may send a request to the RAID for data recovery. Assuming that only data block D1 in RAID6 stripe 410 is corrupted, the data of data block D1 may be recovered from the other data blocks D2, D3, D4 and parity block P. The recovered data is then compared with the checksum 420 to determine whether the data recovery was successful. If it matches the checksum 420, the location of data block D1 is filled with the newly recovered data, and the recovery operation succeeds.
In some embodiments, the 4K data being read may span two pages, in which case two data blocks need to be checked. Referring to FIG. 5, an example of single-disk data recovery in which the corrupted data crosses pages is shown. As shown in FIG. 5, the mapper 150 reads 4K data from data blocks D12 and D13 and then determines from the checksum 420 that data corruption has occurred, where D11 and D12 are on the same 4K page and D13 and D14 are on the same 4K page. In this case, two data recoveries need to be performed: for RAID6 stripe 510, the data of data blocks D11 and D12 is recovered through data blocks D21, D31, D41 and parity block P1; for RAID6 stripe 520, the data of data blocks D13 and D14 is recovered through data blocks D22, D32, D42 and parity block P2. Next, the recovered data chunks D12 and D13 are verified against the checksum 420 to determine whether the recovery was successful. Therefore, when the corrupted data crosses pages, the single-disk recovery process in RAID6 needs to be performed twice. In some embodiments, if the correctness of the data has already been verified by the checksum 420 after recovering data block D12, then data block D13 must already be correct, and data recovery need not be performed for data block D13.
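A sketch of this page-crossing single-disk recovery, including the shortcut of skipping the second rebuild when the first already satisfies the checksum, might look as follows; all helper names are assumptions rather than identifiers from the disclosure.

    def recover_cross_page_read(second_half_as_read, expected_checksum, checksum_fn,
                                rebuild_first_half, rebuild_second_half):
        """Single-disk recovery when a 4K read spans two pages (two RAID6 stripes).

        The rebuild_* callables perform a RAID 5-style rebuild of the corresponding
        half from the other blocks and the P parity of its stripe.
        """
        rebuilt_first = rebuild_first_half()
        # If replacing only the first half already satisfies the checksum, the second
        # half that was read from disk must be correct, so no second rebuild is needed.
        if checksum_fn(rebuilt_first + second_half_as_read) == expected_checksum:
            return rebuilt_first + second_half_as_read

        rebuilt_second = rebuild_second_half()
        candidate = rebuilt_first + rebuilt_second
        if checksum_fn(candidate) == expected_checksum:
            return candidate
        return None     # single-disk recovery failed; fall back to dual-disk recovery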
Referring back to FIG. 3, at 306, it is determined whether the single-disk data recovery was successful. If so, an indication of successful recovery is returned at 312, and the corrupted data in the corresponding disk location of the RAID is replaced with the recovered data. If the single-disk data recovery is unsuccessful, indicating that two or more disks hold corrupted data, dual-disk data recovery is performed at 308, i.e., it is assumed that two data blocks, or one data block and one parity block, in a RAID6 stripe are corrupted. FIGS. 6-7 illustrate two examples of dual-disk data recovery in RAID6 according to embodiments of the present disclosure.
FIG. 6 illustrates an example of dual-disk data recovery in which the corrupted data does not cross pages. As shown in FIG. 6, all combinations of potentially corrupted disk pairs are traversed to determine whether the data can be correctly recovered. For the example in FIG. 6, since data chunk D1 has already been determined to be corrupted via the checksum 420, the candidate combinations of potentially corrupted disk pairs include 610, 620, 630 and 640. In combination 610, assuming that data chunks D1 and D2 are corrupted, the data of data chunks D1 and D2 is recovered through data chunks D3, D4 and parity chunks P and Q. In combination 620, assuming that data chunks D1 and D3 are corrupted, the data of data chunks D1 and D3 is recovered through data chunks D2, D4 and parity chunks P and Q. In combination 630, assuming that data chunks D1 and D4 are corrupted, the data of data chunks D1 and D4 is recovered through data chunks D2, D3 and parity chunks P and Q. In combination 640, assuming that data chunk D1 and parity chunk P are corrupted, the data of data chunk D1 and parity chunk P is recovered via data chunks D2, D3, D4 and parity chunk Q. If the data of data block D1 recovered by some combination matches the checksum 420, the recovery succeeds. If the data of data block D1 recovered by every combination fails to match the checksum 420, a recovery failure is indicated.
None of the candidate combinations includes the parity disk Q, mainly for the following two reasons. First, for the combination of data chunk D1 and parity chunk Q, the data of data chunk D1 can be recovered directly by the single-disk data recovery method. Second, if both parity blocks P and Q were corrupted in addition to data block D1, data on at least three disks would be corrupted, and RAID6 does not support recovery in that case.
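The traversal of candidate combinations and their validation against the checksum can be sketched as follows; the rebuild_two helper is an assumption standing in for the RAID6 two-block rebuild, not an identifier from the disclosure.

    def dual_disk_recover(stripe_blocks, corrupted_index, expected_checksum,
                          checksum_fn, rebuild_two):
        """Traverse candidate pairings of the known-corrupted block with one other block.

        stripe_blocks lists the block positions that may be paired with the corrupted
        block (the data blocks and the P parity; Q is excluded for the reasons above).
        rebuild_two(i, j) performs the RAID6 two-block rebuild under the assumption
        that positions i and j are bad, returning the stripe with those positions rebuilt.
        """
        for other_index in range(len(stripe_blocks)):
            if other_index == corrupted_index:
                continue
            rebuilt = rebuild_two(corrupted_index, other_index)
            # Only the block covered by the cached checksum can be validated directly.
            if checksum_fn(rebuilt[corrupted_index]) == expected_checksum:
                return rebuilt              # this candidate combination was the right one
        return None                         # no combination validated: more than two disks corrupted

For a "4+2" stripe, stripe_blocks holds five positions (D1 through D4 and P), so at most four rebuild attempts are made, matching the counts given below.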
Therefore, for RAID6 of "4 + 2", assuming that there is a double disk failure and the damaged data is in a single page, it is necessary to perform a maximum of 4 data recovery operations. For RAID6 of "8 + 2", a maximum of 8 data recovery operations need to be performed. For RAID6 of "16 + 2", a maximum of 16 data recovery operations need to be performed.
In some embodiments, the 4K data being read may span two pages, so two data blocks need to be checked and more combinations need to be tried. Referring to FIG. 7, an example of dual-disk data recovery in which the corrupted data crosses pages is shown. In the case where data corruption of data chunks D12 and D13 is found, if only data chunk D12 is corrupted, only data chunks D11+D12 need to be recovered; this follows a process similar to that of FIG. 6 and requires at most 4 data recovery operations. Likewise, if only data block D13 is corrupted, only data chunks D13+D14 need to be recovered, which is performed similarly to FIG. 6.
If both data blocks D12 and D13 are corrupted, then not only data chunks D11+D12 but also data chunks D13+D14 need to be recovered, giving 4 × 4 combinations. Referring to FIG. 7, it is known that both D11+D12 and D13+D14 are corrupted; assuming that D22 is also corrupted, the possible combinations include 710, 720, 730 and 740. Similarly, assuming that D32, D42 or P2 is also corrupted, there are four combinations in each case. Therefore, for a RAID6 with a RAID width of R, at most R × (R-2) combinations are needed. For example, there are 288 combinations for a "16+2" RAID6.
Referring back to FIG. 3, at 310, it is determined whether the dual-disk data recovery was successful. If so, an indication of successful recovery is returned at 312, and the corrupted data in the corresponding disk location of the RAID is replaced with the recovered data. If the dual-disk data recovery is also unsuccessful, indicating that more than two disks hold corrupted data, the mechanisms of RAID6 cannot recover the data, and thus an indication of a recovery failure, such as an uncorrectable error message, is returned at 314. In this way, double-disk failure recovery from unrecorded data corruption in RAID6 is achieved. Accordingly, some embodiments of the present disclosure provide a recovery scheme for unrecorded data corruption in RAID6, so that corrupted data can be recovered in the case of a single-disk failure or a double-disk failure, improving the performance of the storage system.
FIG. 8 shows a schematic block diagram of a device 800 that may be used to implement embodiments of the present disclosure; the device 800 may be a device or apparatus as described in embodiments of the present disclosure. As shown in FIG. 8, the device 800 includes a Central Processing Unit (CPU) 801 that may perform various appropriate actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 802 or loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The CPU 801, the ROM 802 and the RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
Various methods or processes described above may be performed by the processing unit 801. For example, in some embodiments, the methods may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto device 800 via ROM 802 and/or communications unit 809. When loaded into RAM 803 and executed by CPU 801, a computer program may perform one or more steps or actions of the methods or processes described above.
In some embodiments, the methods and processes described above may be implemented as a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for carrying out various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language as well as conventional procedural programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
These computer-readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement various aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (17)

1. A method for recovering data, comprising:
determining whether data read from a Redundant Array of Independent Disks (RAID) is corrupted, the RAID comprising two check disks;
determining, in accordance with a determination that the read data is corrupted, whether single-disk data recovery is capable of recovering the corrupted data; and
in accordance with a determination that the single-disk data recovery fails to recover the corrupted data, recovering the corrupted data using dual-disk data recovery.
2. The method of claim 1, wherein determining whether the read data is corrupted comprises:
verifying whether the data is corrupted based on a predetermined checksum.
3. The method of claim 2, wherein recovering the corrupted data using dual disk data recovery comprises:
determining a first data disk in which the corrupted data exists based on the checksum; and
determining a plurality of candidate combinations of dual disk failures in the RAID, each of the plurality of candidate combinations including the first data disk.
4. The method of claim 3, wherein recovering the corrupted data using dual disk data recovery further comprises:
restoring corresponding data in two candidate failed disks using a first candidate combination of the plurality of candidate combinations;
determining whether the data recovered by the first candidate combination is correct through the checksum; and
in accordance with a determination that the data recovered by the first candidate combination is incorrect, using a second candidate combination of the plurality of candidate combinations to recover corresponding data in two candidate failed disks.
5. The method of claim 1, wherein recovering the corrupted data using dual disk data recovery comprises:
determining whether the corrupted data relates to both a first page and a second page in a first data disk; and
in accordance with a determination that the corrupted data relates to both the first page and the second page:
for a first stripe involving the first page, determining a first set of candidate combinations of dual disk failures for the first stripe;
for a second stripe involving the second page, determining a second set of candidate combinations of dual disk failures for the second stripe; and
a third set of candidate combinations is obtained based on the first set of candidate combinations and the second set of candidate combinations.
6. The method of claim 5, wherein recovering the corrupted data using dual disk data recovery further comprises:
restoring corresponding data in the first stripe and corresponding data in the second stripe, respectively, using each candidate combination in the third set of candidate combinations until the corrupted data is correctly restored.
7. The method of claim 1, wherein recovering the corrupted data using dual disk data recovery comprises:
in accordance with a determination that the dual disk data recovery is capable of recovering the corrupted data, replacing the corrupted data with the recovered data in a corresponding disk location of the RAID; and
generating an indication of a recovery failure in accordance with a determination that the dual disk data recovery fails to recover the corrupted data.
8. The method of claim 1, wherein the RAID is RAID6 comprising at least four data disks and the two check disks.
9. An electronic device, comprising:
a processing unit; and
a memory coupled to the processing unit and storing instructions that, when executed by the processing unit, perform the following:
determining whether data read from a Redundant Array of Independent Disks (RAID) is corrupted, the RAID comprising two check disks;
determining, in accordance with a determination that the read data is corrupted, whether single-disk data recovery is capable of recovering the corrupted data; and
in accordance with a determination that the single-disk data recovery fails to recover the corrupted data, recovering the corrupted data using dual-disk data recovery.
10. The apparatus of claim 9, wherein determining whether the read data is corrupted comprises:
verifying whether the data is corrupted based on a predetermined checksum.
11. The apparatus of claim 10, wherein recovering the corrupted data using dual disk data recovery comprises:
determining a first data disk in which the corrupted data exists based on the checksum; and
determining a plurality of candidate combinations of dual disk failures in the RAID, each of the plurality of candidate combinations including the first data disk.
12. The apparatus of claim 11, wherein recovering the corrupted data using dual disk data recovery further comprises:
restoring corresponding data in two candidate failed disks using a first candidate combination of the plurality of candidate combinations;
determining whether the data recovered by the first candidate combination is correct through the checksum; and
in accordance with a determination that the data recovered by the first candidate combination is incorrect, using a second candidate combination of the plurality of candidate combinations to recover corresponding data in two candidate failed disks.
13. The apparatus of claim 9, wherein recovering the corrupted data using dual disk data recovery comprises:
determining whether the corrupted data relates to both a first page and a second page in a first data disk; and
in accordance with a determination that the corrupted data relates to both the first page and the second page:
for a first stripe involving the first page, determining a first set of candidate combinations of dual disk failures for the first stripe;
for a second stripe involving the second page, determining a second set of candidate combinations of dual disk failures for the second stripe; and
a third set of candidate combinations is obtained based on the first set of candidate combinations and the second set of candidate combinations.
14. The apparatus of claim 13, wherein recovering the corrupted data using dual disk data recovery further comprises:
restoring corresponding data in the first stripe and corresponding data in the second stripe, respectively, using each candidate combination in the third set of candidate combinations until the corrupted data is correctly restored.
15. The apparatus of claim 9, wherein recovering the corrupted data using dual disk data recovery comprises:
in accordance with a determination that the dual disk data recovery is capable of recovering the corrupted data, replacing the corrupted data with the recovered data in a corresponding disk location of the RAID; and
generating an indication of a recovery failure in accordance with a determination that the dual disk data recovery fails to recover the corrupted data.
16. The apparatus of claim 9, wherein the RAID is RAID6 comprising at least four data disks and the two check disks.
17. A computer program product tangibly stored on a non-transitory computer-readable medium and comprising computer-executable instructions that, when executed, cause a computer to perform the method of any of claims 1 to 8.
CN202010158555.4A 2020-03-09 2020-03-09 Method, apparatus and computer program product for recovering data Pending CN113377569A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010158555.4A CN113377569A (en) 2020-03-09 2020-03-09 Method, apparatus and computer program product for recovering data
US17/023,815 US11314594B2 (en) 2020-03-09 2020-09-17 Method, device and computer program product for recovering data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010158555.4A CN113377569A (en) 2020-03-09 2020-03-09 Method, apparatus and computer program product for recovering data

Publications (1)

Publication Number Publication Date
CN113377569A (en) 2021-09-10

Family

ID=77554844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010158555.4A Pending CN113377569A (en) 2020-03-09 2020-03-09 Method, apparatus and computer program product for recovering data

Country Status (2)

Country Link
US (1) US11314594B2 (en)
CN (1) CN113377569A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112749039A (en) * 2019-10-31 2021-05-04 伊姆西Ip控股有限责任公司 Method, apparatus and program product for data writing and data recovery
CN114221975B (en) * 2021-11-30 2024-01-30 浙江大华技术股份有限公司 Cloud storage data recovery method and device based on SMR disk and electronic equipment
CN116501553B (en) * 2023-06-25 2023-09-19 苏州浪潮智能科技有限公司 Data recovery method, device, system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090259882A1 (en) * 2008-04-15 2009-10-15 Dot Hill Systems Corporation Apparatus and method for identifying disk drives with unreported data corruption
CN110413205A (en) * 2018-04-28 2019-11-05 伊姆西Ip控股有限责任公司 Method, equipment and computer readable storage medium for being written to disk array

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040123032A1 (en) * 2002-12-24 2004-06-24 Talagala Nisha D. Method for storing integrity metadata in redundant data layouts
US7552357B2 (en) * 2005-04-29 2009-06-23 Network Appliance, Inc. Lost writes detection in a redundancy group based on RAID with multiple parity
US7689877B2 (en) * 2005-11-04 2010-03-30 Sun Microsystems, Inc. Method and system using checksums to repair data
US7752489B2 (en) * 2007-05-10 2010-07-06 International Business Machines Corporation Data integrity validation in storage systems
US8706701B1 (en) 2010-11-18 2014-04-22 Emc Corporation Scalable cloud file system with efficient integrity checks
US20130198585A1 (en) * 2012-02-01 2013-08-01 Xyratex Technology Limited Method of, and apparatus for, improved data integrity
US8327185B1 (en) 2012-03-23 2012-12-04 DSSD, Inc. Method and system for multi-dimensional raid
US9354975B2 (en) 2013-03-15 2016-05-31 Emc Corporation Load balancing on disks in raid based on linear block codes
RU2013128346A (en) 2013-06-20 2014-12-27 ИЭмСи КОРПОРЕЙШН DATA CODING FOR A DATA STORAGE SYSTEM BASED ON GENERALIZED CASCADE CODES
US9641615B1 (en) 2014-03-31 2017-05-02 EMC IP Holding Company LLC Allocating RAID storage volumes across a distributed network of storage elements
US10466913B2 (en) 2015-04-29 2019-11-05 EMC IP Holding Company LLC Method and system for replicating and using grid level metadata in a storage system
US9905289B1 (en) 2017-04-28 2018-02-27 EMC IP Holding Company LLC Method and system for systematic read retry flow in solid state memory
US11151056B2 (en) 2019-04-25 2021-10-19 EMC IP Holding Company LLC Efficient virtualization layer structure for a data storage system
US11119803B2 (en) 2019-05-01 2021-09-14 EMC IP Holding Company LLC Method and system for offloading parity processing
US10990474B1 (en) * 2020-03-06 2021-04-27 Seagate Technology Llc Cost-benefit aware read-amplification in RAID scrubbing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090259882A1 (en) * 2008-04-15 2009-10-15 Dot Hill Systems Corporation Apparatus and method for identifying disk drives with unreported data corruption
CN110413205A (en) * 2018-04-28 2019-11-05 伊姆西Ip控股有限责任公司 Method, equipment and computer readable storage medium for being written to disk array

Also Published As

Publication number Publication date
US11314594B2 (en) 2022-04-26
US20210279135A1 (en) 2021-09-09

Similar Documents

Publication Publication Date Title
CN109725822B (en) Method, apparatus and computer program product for managing a storage system
CN108733314B (en) Method, apparatus, and computer-readable storage medium for Redundant Array of Independent (RAID) reconstruction
CN109213618B (en) Method, apparatus and computer program product for managing a storage system
US20180088857A1 (en) Method and system for managing storage system
US8943357B2 (en) System and methods for RAID writing and asynchronous parity computation
US10664367B2 (en) Shared storage parity on RAID
US9690651B2 (en) Controlling a redundant array of independent disks (RAID) that includes a read only flash data storage device
US11314594B2 (en) Method, device and computer program product for recovering data
US10503620B1 (en) Parity log with delta bitmap
US11449400B2 (en) Method, device and program product for managing data of storage device
CN110413208B (en) Method, apparatus and computer program product for managing a storage system
CN110058787B (en) Method, apparatus and computer program product for writing data
US11003554B2 (en) RAID schema for providing metadata protection in a data storage system
US20200174689A1 (en) Update of raid array parity
US7577804B2 (en) Detecting data integrity
CN113552998B (en) Method, apparatus and program product for managing stripes in a storage system
US20200348858A1 (en) Method, device and computer program product
US10664346B2 (en) Parity log with by-pass
US11620080B2 (en) Data storage method, device and computer program product
US11481275B2 (en) Managing reconstruction of a malfunctioning disk slice
US11561859B2 (en) Method, device and computer program product for managing data
US20210208969A1 (en) Dropped write error detection
US11249667B2 (en) Storage performance enhancement
CN112328182A (en) RAID data management method, device and computer readable storage medium
US10133630B2 (en) Disposable subset parities for use in a distributed RAID

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination