US11301326B2 - Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks - Google Patents

Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Info

Publication number
US11301326B2
US11301326B2
Authority
US
United States
Prior art keywords
protected data
data
protected
storage devices
raid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US17/084,650
Other versions
US20210073074A1 (en)
Inventor
An-nan Chang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Silicon Motion Inc
Original Assignee
Silicon Motion Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Silicon Motion Inc filed Critical Silicon Motion Inc
Priority to US17/084,650 priority Critical patent/US11301326B2/en
Publication of US20210073074A1 publication Critical patent/US20210073074A1/en
Application granted granted Critical
Publication of US11301326B2 publication Critical patent/US11301326B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1092Rebuilding, e.g. when physically replacing a failing disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1471Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653Monitoring storage devices or systems

Definitions

  • the present invention is related to storage systems, and more particularly, to a method and apparatus for performing dynamic recovery management regarding a redundant array of independent disks (RAID).
  • a redundant array of independent disks may be implemented in a server.
  • the server may be designed to be equipped with a copy-on-write (COW) architecture. Due to features of the COW architecture, performance of the server may degrade over time. To prevent this, the server may instead be designed to be equipped with a redirect-on-write (ROW) architecture, but this may introduce other problems.
  • An objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a redundant array of independent disks (RAID), to solve the related art problems.
  • Another objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a RAID that can guarantee the storage system can properly operate under various situations.
  • Another objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a RAID that can solve the related art problems without introducing any side effect or in a way that is less likely to introduce side effects.
  • At least one embodiment of the present invention provides a method for performing dynamic recovery management regarding a RAID.
  • the method may comprise: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second protected access unit of the multiple protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data, wherein the second set of protected data comprises data and multiple parity-check codes, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data.
  • the present invention further provides a storage system operating according to the aforementioned method, wherein the storage system comprises the RAID.
  • At least one embodiment of the present invention provides an apparatus for performing dynamic recovery management regarding a RAID.
  • the apparatus may comprise a processing circuit, wherein the processing circuit is positioned in a storage system, and is configured to control operations of the storage system.
  • the operations of the storage system may comprise: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second protected access unit of the multiple protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data.
  • the method and apparatus of the present invention can guarantee the storage system will properly operate under various situations. For example, when any disk within a RAID malfunctions, the system manager does not need to be concerned that the probability of the data of the server being unrecoverable will greatly increase due to a second disk malfunctioning.
  • the method and apparatus of the present invention provide a powerful dynamic recovery management mechanism. Thus, the objectives of optimal performance, high security, budget control, etc. can be achieved. Additionally, the method and apparatus of the present invention can solve the problems in the related art without introducing any side effect or in a way that is less likely to introduce side effects.
  • FIG. 1 is a diagram illustrating a storage system and a user device according to an embodiment of the present invention.
  • FIG. 2 is a working flow of a method for performing dynamic recovery management regarding a RAID (such as that shown in FIG. 1 ) according to an embodiment of the present invention.
  • FIG. 3 illustrates a plurality of protected access units according to an embodiment of the present invention, where examples of the plurality of protected access units may include protected blocks.
  • FIG. 4 illustrates a redirect-on-write (ROW) scheme of the method according to an embodiment of the present invention.
  • FIG. 5 illustrates a control scheme of the method according to an embodiment of the present invention.
  • FIG. 6 illustrates a control scheme of the method according to another embodiment of the present invention.
  • FIG. 7 illustrates a control scheme of the method according to another embodiment of the present invention.
  • FIG. 8 illustrates a control scheme of the method according to another embodiment of the present invention.
  • FIG. 9 illustrates a control scheme of the method according to another embodiment of the present invention.
  • FIG. 10 illustrates a control scheme of the method according to another embodiment of the present invention.
  • FIG. 1 is a diagram illustrating a storage system 100 and a user device 10 according to an embodiment of the present invention.
  • the user device 10 may comprise a processing circuit 11 (e.g. at least one processor and associated circuits), and may further comprise an interface circuit 12 coupled to the processing circuit 11 , and a storage device.
  • the storage system 100 may comprise a processing circuit 111 (e.g. at least one processor and associated circuits), and may further comprise interface circuits 112 and 122 and random access memory (RAM) 121 that are coupled to the processing circuit 111 through a bus 110 .
  • Storage devices {130, 131, 132, . . . , 146} may be installed in the storage system 100. The storage devices {131, 132, . . . , 146} may form a RAID, where a program code 111 P executed on the processing circuit 111 may be read from the storage device 130 (e.g. a system disk), and may maintain (e.g. establish, store and/or update) a management table 121 T within the RAM 121 in order to perform related operations to manage a data region DR.
  • the management table 121 T may comprise multiple sets of management information for dynamic recovery management, and each set of management information within the multiple sets of management information (e.g. a row of information within the management table 121 T) may comprise RAID information such as RAID bitmap information, and may comprise validity information such as validity bitmap information.
  • the management table 121 T may be backed up in a table region TR, but the present invention is not limited thereto.
  • the interface circuits 12 and 112 may be implemented as a wired network interface and/or wireless network interface, to allow the storage system 100 and the user device 10 to exchange information with each other.
  • a user may access (read or write) user data in the storage system 100 through the user device 10 .
  • Examples of the user device 10 may include, but are not limited to: a multifunctional mobile phone, a tablet, a wearable device and a personal computer (such as a desktop computer and a laptop computer).
  • Examples of the storage system 100 may include, but are not limited to: a server such as a storage server.
  • the architecture of the storage system 100 may vary.
  • the program code 111 P may be implemented by dedicated hardware configured in the interface circuit 122 , to perform the related operations of the method of the present invention.
  • the number of storage devices {131, 132, . . . , 146} within the RAID may vary, e.g. may be increased or reduced.
  • FIG. 2 is a working flow 200 of a method for performing dynamic recovery management regarding a RAID (such as the RAID shown in FIG. 1 ) according to an embodiment of the present invention, where the RAID may comprise multiple storage devices such as the storage devices {131, 132, . . . , 146}.
  • the method may be applied to the storage system 100 , the processing circuit 111 executing the program code 111 P, and associated components shown in FIG. 1 .
  • the storage system 100 (e.g. the processing circuit 111 ) may maintain (e.g. establish, store and/or update) respective validity information of the multiple sets of management information within the management table 121 T according to at least one health state of the RAID, such as one or more health states thereof, in order to generate the latest version of the multiple sets of management information.
  • the one or more health states of the RAID may include, but are not limited to: a normal state, a malfunction state and a recovery state of one or more storage devices within the RAID.
  • the storage system 100 (e.g. the processing circuit 111 ) writes a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and records a first set of management information corresponding to the first set of protected data, such as a certain row of information of the management table 121 T, for data recovery of the first set of protected data, where the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data.
  • the RAID information within the first set of management information may comprise first RAID bitmap information
  • the first RAID bitmap information may comprise a first set of first bits, where the first set of first bits indicates the first set of protected data is respectively stored in the first set of storage devices.
  • the multiple storage devices comprise all of the storage devices {131, 132, . . . , 146}, and all these storage devices are currently operating normally.
  • the first set of storage devices may comprise all of the multiple storage devices, but the present invention is not limited thereto.
  • the first set of first bits may be 1111111111111111 (which may be recorded as 0xFFFF) to indicate the first set of protected data (such as the aforementioned data and multiple parity-check codes therein) is respectively stored in the storage devices {131, 132, . . . , 146}.
  • the validity information within the first set of management information may comprise first validity bitmap information
  • the first validity bitmap information may comprise a first set of second bits, where the first set of second bits indicates respective validities of the first set of protected data, respectively.
  • the first set of second bits may be 1111111111111111 (which may be recorded as 0xFFFF) to indicate all the first set of protected data is valid.
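  • The bitmap recording above can be sketched as follows (a minimal illustration, not the patent's implementation; the 16-bit packing, the bit order with the storage device 131 as the most significant bit, and the name make_bitmap are assumptions chosen so that an all-healthy, all-valid array yields 0xFFFF, matching the example above):

```python
# Illustrative sketch: pack per-device flags into a 16-bit bitmap, with the
# first storage device (131) occupying the most significant bit.

NUM_DEVICES = 16

def make_bitmap(device_flags):
    """Pack per-device boolean flags (index 0 = storage device 131)."""
    bitmap = 0
    for i, flag in enumerate(device_flags):
        if flag:
            bitmap |= 1 << (NUM_DEVICES - 1 - i)
    return bitmap

# First set of management information: all 16 devices hold protected data
# (RAID bitmap) and all of it is valid (validity bitmap).
first_raid_bits = make_bitmap([True] * NUM_DEVICES)
first_validity_bits = make_bitmap([True] * NUM_DEVICES)
print(f"0x{first_raid_bits:04X}", f"0x{first_validity_bits:04X}")  # 0xFFFF 0xFFFF
```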
  • in Step 220 , when any storage device of the multiple storage devices malfunctions, the storage system 100 (e.g. the processing circuit 111 ) writes a second set of protected data into a second protected access unit of the multiple protected access units, and records a second set of management information corresponding to the second set of protected data, such as another row of information within the management table 121 T, for data recovery of the second set of protected data, where the second set of protected data comprises data and multiple parity-check codes, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data. More particularly, the second set of storage devices is different from the first set of storage devices. For example, the second set of storage devices does not comprise the aforementioned any storage device of the multiple storage devices.
  • the RAID information within the second set of management information may comprise second RAID bitmap information
  • the second RAID bitmap information may comprise a second set of first bits, where the second set of first bits indicates the second set of protected data is respectively stored in the second set of storage devices.
  • the multiple storage devices comprise all of the storage devices {131, 132, . . . , 146}, and most of these storage devices are currently operating normally, where the storage device 131 malfunctions.
  • the second set of storage devices may comprise the storage devices {132, . . . , 146}, but the present invention is not limited thereto.
  • the second set of first bits may be 0111111111111111 (which may be recorded as 0x7FFF) to indicate the second set of protected data (such as the aforementioned data and multiple parity-check codes therein) is respectively stored in the storage devices {132, . . . , 146}.
  • the validity information within the second set of management information may comprise second validity bitmap information
  • the second validity bitmap information may comprise a second set of second bits, where the second set of second bits indicates respective validities of the second set of protected data, respectively. Under a situation where all the storage devices {132, . . . , 146} are operating normally, the second set of second bits may be 0111111111111111 (which may be recorded as 0x7FFF) to indicate all the second set of protected data is valid.
  • as the second set of first bits 0111111111111111 indicates the second set of protected data is respectively stored in the storage devices {132, . . . , 146}, only the last 15 bits 111111111111111 are meaningful in the second set of second bits 0111111111111111, while the first bit 0 may be regarded as “Don't care” according to some viewpoints, but the present invention is not limited thereto.
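  • A sketch of how the second set of bitmaps might be derived once a device malfunctions (illustrative only; the helper name and the bit order, with the first device as the most significant bit, are assumptions — note that the 16-bit pattern 0111111111111111 equals 0x7FFF):

```python
# Illustrative sketch: when a storage device malfunctions, newly written
# protected data is placed only on the remaining devices, so the failed
# device's bit is simply cleared in the new RAID bitmap.

NUM_DEVICES = 16

def bitmap_excluding(failed_indices):
    """All-ones 16-bit bitmap (MSB = first device) minus the failed devices."""
    bitmap = (1 << NUM_DEVICES) - 1  # 0xFFFF: every device present
    for i in failed_indices:
        bitmap &= ~(1 << (NUM_DEVICES - 1 - i))
    return bitmap

second_raid_bits = bitmap_excluding([0])  # storage device 131 (index 0) failed
second_validity_bits = second_raid_bits   # everything actually stored is valid
print(f"{second_raid_bits:016b} = 0x{second_raid_bits:04X}")
# 0111111111111111 = 0x7FFF
```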
  • according to the health state of the RAID (e.g. the one or more health states thereof), the storage system 100 may update respective validity information of the multiple sets of management information, such as multiple sets of second bits, to generate the latest versions of the multiple sets of management information, where each set of second bits within the multiple sets of second bits indicates the respective validities of a corresponding set of protected data, respectively.
  • according to at least one set of management information within the multiple sets of management information, the storage system 100 performs data recovery of at least one set of protected data, where the aforementioned at least one set of management information corresponds to the aforementioned at least one set of protected data.
  • the aforementioned at least one set of management information may comprise the first set of management information, and the aforementioned at least one set of protected data may comprise the first set of protected data.
  • the aforementioned at least one set of management information may comprise the second set of management information, and the aforementioned at least one set of protected data may comprise the second set of protected data.
  • the aforementioned at least one set of management information may comprise the first set of management information and the second set of management information, and the aforementioned at least one set of protected data may comprise the first set of protected data and the second set of protected data.
  • the method may be illustrated by the working flow 200 shown in FIG. 2 , but the present invention is not limited thereto. According to some embodiments, one or more steps may be added, removed or modified in the working flow 200 .
  • when any storage device of the first set of storage devices malfunctions, the storage system 100 may update the validity information within the first set of management information, to indicate that the protected data within the first set of protected data stored in this storage device is invalid, for data recovery of the first set of protected data.
  • the storage system 100 (e.g. the processing circuit 111 ) may read valid protected data within the first set of protected data, to perform data recovery of the first set of protected data according to the valid protected data, where the valid protected data comprises at least one portion (such as one portion or all) of the data within the first set of protected data, and comprises at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the first set of protected data.
  • when a second storage device of the multiple storage devices malfunctions, the storage system 100 may update the validity information within the first set of management information to indicate that the protected data within the first set of protected data stored in the second storage device is invalid, for data recovery of the first set of protected data.
  • the storage system 100 (e.g. the processing circuit 111 ) may read valid protected data within the first set of protected data to perform the data recovery, where the valid protected data may comprise at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the first set of protected data.
  • the storage system 100 may update the validity information within the second set of management information to indicate that protected data within the second set of protected data stored in the second storage device is invalid, for data recovery of the second set of protected data.
  • the storage system 100 (e.g. the processing circuit 111 ) may read valid protected data within the second set of protected data to perform data recovery of the second set of protected data according to the valid protected data, where the valid protected data comprises at least one portion (such as one portion or all) of the data within the second set of protected data, and comprises at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the second set of protected data.
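  • The recovery step can be illustrated with a single-parity XOR code standing in for the parity-check codes (the patent does not specify its encoding rules; the helper name xor_bytes and the small stripe layout are assumptions):

```python
# Illustrative sketch: rebuild an invalid subset of a protected access unit
# from the remaining valid protected data. XOR parity is used here purely as
# a stand-in for the unspecified parity-check codes.

def xor_bytes(blocks):
    """Bytewise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Data subsets striped across three devices, plus one parity subset P.
d = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
p = xor_bytes(d)

# Suppose the device holding d[1] malfunctions: the validity information marks
# that subset invalid, and recovery reads only the valid protected data.
recovered = xor_bytes([d[0], d[2], p])
assert recovered == d[1]
print("recovered:", recovered.hex())  # recovered: 1020
```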
  • FIG. 3 illustrates a plurality of protected access units according to an embodiment of the present invention, where examples of the plurality of protected access units may include protected blocks 310 and 320 , but the present invention is not limited thereto.
  • a symbol “D” may represent data within the protected block such as user data respectively stored in some storage devices, and symbols “P” and “Q” may respectively represent parity-check codes within the protected block.
  • the parity-check codes P and Q may be the same as or different from each other; more particularly, under a situation where they are different from each other, the storage system 100 (e.g. the processing circuit 111 ) may respectively adopt different encoding rules to perform error correction code (ECC) encoding on the data D in order to generate the corresponding parity-check codes P and Q.
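  • As one illustration of two parity-check codes generated by different encoding rules, the sketch below uses RAID-6-style arithmetic (plain XOR for P, GF(2^8) weighting for Q); the patent does not specify its actual encoding rules, so this is only a common possibility, not the claimed method:

```python
# Illustrative sketch: P is plain XOR parity, while Q weights each data byte
# by a distinct power of 2 in GF(2^8) (reduction polynomial 0x11D), as in
# RAID-6. Because the two codes follow different rules, two lost subsets can
# be recovered instead of one.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with reduction polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def parities(data_bytes):
    """Return (P, Q) parity-check bytes for one byte per storage device."""
    p, q, coeff = 0, 0, 1
    for d in data_bytes:
        p ^= d
        q ^= gf_mul(coeff, d)
        coeff = gf_mul(coeff, 2)  # next power of the generator
    return p, q

p, q = parities([0x01, 0x02, 0x03])
print(hex(p), hex(q))  # 0x0 0x9
```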
  • the multiple storage devices of the RAID may comprise the storage devices {131, 132, . . . , 144, 145, 146}, but the present invention is not limited thereto.
  • the storage devices {131, 132, . . . , 144, 145, 146} may store a set of protected data (e.g. the first set of protected data), and any of the storage devices {131, 132, . . . , 146} may store a corresponding subset (e.g. a portion of the data D, or one of the parity-check codes P and Q) of this set of protected data.
  • a type and/or a protection degree of the RAID may vary, where the user data may obtain protection of a corresponding type and/or degree.
  • the arrangement of the data D, the parity-check code P and/or the parity-check code Q may vary.
  • a number of storage devices configured to store the data D and/or a number of storage devices configured to store the parity-check codes may vary.
  • a total number of storage devices configured to store the data D and the parity-check codes P and Q may vary.
  • FIG. 4 illustrates a redirect-on-write (ROW) scheme of the method according to an embodiment of the present invention.
  • according to the ROW scheme, when the storage system 100 (e.g. the processing circuit 111 ) updates user data of any protected access unit (e.g. a certain protected block within the data region DR), the storage system 100 (e.g. the processing circuit 111 ) may write the updated data to a new storage location rather than overwrite the old one, and may record the new storage location in an L2p table 410 . The L2p table 410 may comprise multiple L2p sub-tables, where a first row of L2p sub-tables may respectively map pages 0-511 (more particularly, logical addresses 0-511) to respective storage locations thereof (e.g. some protected access units such as protected blocks); a second row of L2p sub-tables may respectively map pages 512-1023 (more particularly, logical addresses 512-1023) to respective storage locations thereof (e.g. some protected access units such as protected blocks); and the rest may be deduced by analogy, but the present invention is not limited thereto. According to some embodiments, these storage locations may be regarded as ROW locations.
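  • The page-to-location mapping described above can be sketched as follows (assuming each L2p sub-table covers 512 consecutive pages, consistent with the 0-511/512-1023 ranges; the function names are illustrative):

```python
# Illustrative sketch of the L2p table: the sub-table index and the offset
# within the sub-table both derive directly from the logical page number.

PAGES_PER_SUBTABLE = 512

def locate(l2p_subtables, page):
    """Return the ROW storage location recorded for a logical page."""
    return l2p_subtables[page // PAGES_PER_SUBTABLE][page % PAGES_PER_SUBTABLE]

def redirect_write(l2p_subtables, page, new_location):
    """Redirect-on-write: record the page's new protected access unit rather
    than overwriting the data at its old location."""
    l2p_subtables[page // PAGES_PER_SUBTABLE][page % PAGES_PER_SUBTABLE] = new_location

# Two sub-tables, covering pages 0-511 and 512-1023 respectively.
tables = [[None] * PAGES_PER_SUBTABLE for _ in range(2)]
redirect_write(tables, 600, "protected_block_A")
print(locate(tables, 600))  # protected_block_A
```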
  • a total number of storage devices within the RAID may vary, and the total number of storage devices configured to store the data D and the parity-check codes P and Q may accordingly vary.
  • the RAID may comprise ten storage devices, such as the first ten storage devices {131, 132, . . . } within the storage devices {131, 132, . . . , 146} shown in FIG. 1 .
  • the ten storage devices {131, 132, . . . } may be respectively represented by {SD 0 , SD 1 , . . . , SD 9 }.
  • FIG. 5 illustrates a control scheme of the method according to an embodiment of the present invention, where the plurality of protected access units may comprise multiple groups of protected access units, such as a group 510 that is firstly written and a group 520 that is subsequently written, but the present invention is not limited thereto.
  • a row of small frames may represent a protected access unit, and the ten small frames (from left to right) within the row of small frames may respectively correspond to the ten storage devices {131, 132, . . . } such as the storage devices {SD 0 , SD 1 , . . . , SD 9 }, and may respectively represent subsets of this protected access unit which are located at the storage devices {SD 0 , SD 1 , . . . , SD 9 }.
  • Any row of small frames labeled with symbols “D”, “P” and “Q” may represent a protected access unit in which the data D and the parity-check codes P and Q were written before.
  • the data D and the parity-check codes P and Q may be respectively stored in the storage devices {SD 0 , SD 1 , . . . , SD 9 }.
  • when writing protected data into any protected access unit within the group 510, the storage system 100 (e.g. the processing circuit 111 ) may respectively record corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111111000000 and a set of second bits 1111111111000000, meaning the protected data is stored in the storage devices {SD 0 , SD 1 , . . . , SD 9 }, respectively, and is all valid.
  • when the storage device SD 7 malfunctions, the storage system 100 may update corresponding validity bitmap information as a set of second bits 1111111011000000, meaning the majority of the protected data is valid, but the protected data within the storage device SD 7 may be regarded as invalid.
  • the storage system 100 (e.g. the processing circuit 111 ) may continue writing, and more particularly, write the user data into protected access units within the group 520.
  • the storage system 100 (e.g. the processing circuit 111) may respectively record the corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111011000000 and a set of second bits 1111111011000000, meaning the protected data is stored in the nine normal storage devices {SD0, SD1, . . . , SD6, SD8, SD9} within the storage devices {SD0, SD1, . . . , SD9}, respectively, and is all valid (in the storage devices {SD0, SD1, . . . , SD6, SD8, SD9}).
  • the protected data in each of the protected access units within the group 510 may be regarded as (8+2) protected data, where 8 means the data D is distributed in eight storage devices {SD0, SD1, . . . , SD7} (the storage device SD7 malfunctions), and 2 means the parity-check codes P and Q are distributed in two storage devices {SD8, SD9}.
  • the protected data in each of the protected access units within the group 520 may be regarded as (7+2) protected data, where 7 means the data D is distributed in seven storage devices {SD0, SD1, . . . , SD6}, and 2 means the parity-check codes P and Q are distributed in two storage devices {SD8, SD9}.
  • FIG. 6 illustrates a control scheme of the method according to another embodiment of the present invention, where the multiple groups of protected access units may comprise the two groups 510 and 520 which were written previously, a group 530 which is subsequently written, and a group 540 which is not written yet, but the present invention is not limited thereto.
  • the leftmost portion of FIG. 6 is similar to the rightmost portion of FIG. 5 .
  • when another storage device such as the storage device SD9 malfunctions (labeled “Disk fail” for better comprehension), the protected data within the storage device SD9 becomes unobtainable (labeled “F” for better comprehension), and therefore may be regarded as invalid.
  • the storage system 100 (e.g. the processing circuit 111) may update the corresponding validity bitmap information to be a set of second bits 1111111010000000, meaning the majority of the protected data is valid, but the protected data stored by the storage devices SD7 and SD9 may be regarded as invalid.
  • the storage system 100 (e.g. the processing circuit 111) may update the corresponding validity bitmap information to be a set of second bits 1111111010000000, meaning the majority of the protected data is valid, but the protected data stored by the storage device SD9 may be regarded as invalid.
  • the storage system 100 (e.g. the processing circuit 111) may respectively record the corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111010000000 and a set of second bits 1111111010000000, meaning the protected data is respectively stored in the eight normal storage devices {SD0, SD1, . . . , SD6, SD8} within the storage devices {SD0, SD1, . . . , SD9} and is all valid (in the storage devices {SD0, SD1, . . . , SD6, SD8}).
  • the protected data in each of the protected access units within the group 530 may be regarded as (6+2) protected data, where 6 means the data D is distributed in six storage devices {SD0, SD1, . . . , SD5}, and 2 means the parity-check codes P and Q are distributed in two storage devices {SD6, SD8}.
  • a number RAID_DISK(510) of RAID disks {SD0, SD1, . . . , SD9} adopted by the group 510 is equal to 10, where a number FAIL_DISK(510) of malfunctioning disks {SD7, SD9} is equal to 2.
  • a number RAID_DISK(520) of RAID disks {SD0, SD1, . . . , SD6, SD8, SD9} adopted by the group 520 is equal to 9, where a number FAIL_DISK(520) of malfunctioning disks {SD9} is equal to 1.
  • a number RAID_DISK(530) of RAID disks {SD0, SD1, . . . , SD6, SD8} adopted by the group 530 is equal to 8, where a number FAIL_DISK(530) of malfunctioning disks is equal to 0.
  • FIG. 7 illustrates a control scheme of the method according to another embodiment of the present invention.
  • the leftmost portion of FIG. 7 is equivalent to the rightmost portion of FIG. 6 .
  • a new storage device is coupled to the interface circuit 122 to replace a certain malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD 9 (this is labeled “New disk inserted” for better comprehension).
  • Protected access units within the storage system 100 that need to be recovered (or restored) at this moment may comprise respective protected access units of the groups 510 and 520 .
  • the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P stored in the storage device SD8; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7, and perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD7} in order to generate the parity-check code Q corresponding to the storage device SD9.
  • the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} in order to generate the parity-check code Q corresponding to the storage device SD9, and may update the corresponding validity bitmap information to be a set of second bits 1111111011000000, meaning the protected data is all valid. As a result, the protected data within the group 520 is completely recovered.
  • FIG. 8 illustrates a control scheme of the method according to another embodiment of the present invention.
  • A new storage device is coupled to the interface circuit 122 to replace another malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD7 (labeled “New disk inserted” for better comprehension).
  • Protected access units within the storage system 100 that need to be recovered at this moment may comprise the protected access units within the group 510.
  • the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7, and may update the corresponding validity bitmap information to be a set of second bits 1111111111000000, meaning the protected data is all valid. As a result, the protected data within the group 510 is completely recovered.
  • FIG. 9 illustrates a control scheme of the method according to another embodiment of the present invention.
  • the leftmost portion of FIG. 9 is equivalent to the rightmost portion of FIG. 6 .
  • a new storage device is coupled to the interface circuit 122 to replace a certain malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD 7 (this is labeled “New disk inserted” for better comprehension).
  • Protected access units within the storage system 100 that need to be recovered at this moment may comprise the protected access units within the group 510.
  • the storage system 100 (e.g. the processing circuit 111) may recover the data D corresponding to the storage device SD7 according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P stored in the storage device SD8; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7, and may update the corresponding validity bitmap information to be a set of second bits 1111111110000000, meaning the majority of the protected data is valid, but the protected data stored by the storage device SD9 is regarded as invalid.
  • FIG. 10 illustrates a control scheme of the method according to another embodiment of the present invention.
  • the leftmost portion of FIG. 10 is equivalent to the rightmost portion of FIG. 9 .
  • a new storage device is coupled to the interface circuit 122 to replace another malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD9 (labeled “New disk inserted” for better comprehension).
  • the protected access units within the storage system 100 that need to be recovered at this moment may comprise respective protected access units of the groups 510 and 520.
  • the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD7}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD7} in order to generate the parity-check code Q corresponding to the storage device SD9, and may update the corresponding validity bitmap information to be a set of second bits 1111111111000000, meaning the protected data is all valid. As a result, the protected data within the group 510 is completely recovered.
  • the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} in order to generate the parity-check code Q corresponding to the storage device SD9, and may update the corresponding validity bitmap information to be a set of second bits 1111111011000000, meaning the protected data is all valid. As a result, the protected data within the group 520 is completely recovered.
  • the multiple sets of management information may vary. For example, regarding any set (more particularly, each set) of the multiple sets of management information, a bit count of first bits within the RAID bitmap information (such as the first RAID bitmap information, the second RAID bitmap information, etc.) and/or a bit count of second bits within the validity bitmap information (such as the first validity bitmap information, the second validity bitmap information, etc.) may vary (e.g. increase or decrease). For brevity, similar descriptions for these embodiments are not repeated in detail here.
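The bitmap bookkeeping walked through above for FIG. 5 and FIG. 6 can be sketched in code. The following Python sketch is an illustrative reconstruction, not the patent's implementation: the helper names are assumptions, the bitmaps are modeled as 16-bit strings with SD0 through SD9 in the leftmost ten positions (matching the quoted patterns), and a device is counted as malfunctioning for a group when its RAID bit is 1 but its validity bit is 0.

```python
def make_bitmap(active_slots, width=16):
    """Build a bitmap string with '1' at each active slot (slot 0 = leftmost bit)."""
    bits = ["0"] * width
    for slot in active_slots:
        bits[slot] = "1"
    return "".join(bits)

def raid_disk(raid_bits):
    """Number of RAID disks adopted by a group."""
    return raid_bits.count("1")

def fail_disk(raid_bits, valid_bits):
    """Disks adopted by the group whose protected data has become invalid."""
    return sum(r == "1" and v == "0" for r, v in zip(raid_bits, valid_bits))

healthy = set(range(10))                       # SD0..SD9 all operate normally

# Group 510 is written first: its RAID bitmap equals its validity bitmap.
raid_510 = make_bitmap(healthy)                # "1111111111000000"
valid_510 = raid_510

# SD7 fails: only the validity bitmaps of already-written groups change.
healthy -= {7}
valid_510 = make_bitmap(healthy)               # "1111111011000000"

# Group 520 is written next and adopts only the nine surviving devices.
raid_520 = valid_520 = make_bitmap(healthy)    # "1111111011000000"

# SD9 fails as well (FIG. 6), and group 530 is written afterwards.
healthy -= {9}
valid_510 = valid_520 = make_bitmap(healthy)   # "1111111010000000"
raid_530 = valid_530 = make_bitmap(healthy)

# The per-group RAID_DISK / FAIL_DISK counts quoted above follow directly.
assert (raid_disk(raid_510), fail_disk(raid_510, valid_510)) == (10, 2)
assert (raid_disk(raid_520), fail_disk(raid_520, valid_520)) == (9, 1)
assert (raid_disk(raid_530), fail_disk(raid_530, valid_530)) == (8, 0)
```

Because each group keeps its own RAID bitmap, a later disk failure never forces already-written groups to be rewritten; only their validity bits change.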

Abstract

A method and apparatus for performing dynamic recovery management regarding a RAID are provided. The method includes: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data; and when any storage device of multiple storage devices of the RAID malfunctions, writing a second set of protected data into a second protected access unit of the protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data. Any set of the first set of protected data and the second set of protected data includes data and multiple parity-check codes.

Description

CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation application and claims the benefit of U.S. Non-provisional application Ser. No. 16/513,675, which was filed on Jul. 16, 2019, and is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is related to storage systems, and more particularly, to a method and apparatus for performing dynamic recovery management regarding a redundant array of independent disks (RAID).
2. Description of the Prior Art
A redundant array of independent disks (RAID) may be implemented in a server. Through various types of RAID schemes, data can be protected at a corresponding level. For the purpose of data backup, the server may be designed to be equipped with a copy-on-write (COW) architecture. Due to features of the COW architecture, performance of the server may degrade as time goes by. To prevent this, the server may be designed to be equipped with a redirect-on-write (ROW) architecture, but this may result in other problems. After any disk within the RAID malfunctions, if a second disk also malfunctions, the probability of data of the server being unrecoverable greatly increases. Thus, there is a need for a novel method and associated architecture, to guarantee that a storage system can properly operate under various situations.
SUMMARY OF THE INVENTION
An objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a redundant array of independent disks (RAID), to solve the related art problems.
Another objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a RAID that can guarantee the storage system can properly operate under various situations.
Another objective of the present invention is to provide a method and apparatus for performing dynamic recovery management regarding a RAID that can solve the related art problems without introducing any side effect or in a way that is less likely to introduce side effects.
At least one embodiment of the present invention provides a method for performing dynamic recovery management regarding a RAID. The method may comprise: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second protected access unit of the multiple protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data, wherein the second set of protected data comprises data and multiple parity-check codes, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data. The second set of storage devices is different from the first set of storage devices.
The present invention further provides a storage system operating according to the aforementioned method, wherein the storage system comprises the RAID.
At least one embodiment of the present invention provides an apparatus for performing dynamic recovery management regarding a RAID. The apparatus may comprise a processing circuit, wherein the processing circuit is positioned in a storage system, and is configured to control operations of the storage system. The operations of the storage system may comprise: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second protected access unit of the multiple protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data, wherein the second set of protected data comprises data and multiple parity-check codes, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data. The second set of storage devices is different from the first set of storage devices.
The method and apparatus of the present invention can guarantee the storage system will properly operate under various situations. For example, when any disk within a RAID malfunctions, the system manager does not need to be concerned that the probability of the data of the server being unrecoverable will greatly increase due to a second disk malfunctioning. In addition, the method and apparatus of the present invention provide a powerful dynamic recovery management mechanism. Thus, the objectives of optimal performance, high security, budget control, etc. can be achieved. Additionally, the method and apparatus of the present invention can solve the problems in the related art without introducing any side effect or in a way that is less likely to introduce side effects.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating a storage system and a user device according to an embodiment of the present invention.
FIG. 2 is a working flow of a method for performing dynamic recovery management regarding a RAID (such as that shown in FIG. 1) according to an embodiment of the present invention.
FIG. 3 illustrates a plurality of protected access units according to an embodiment of the present invention, where examples of the plurality of protected access units may include protected blocks.
FIG. 4 illustrates a redirect-on-write (ROW) scheme of the method according to an embodiment of the present invention.
FIG. 5 illustrates a control scheme of the method according to an embodiment of the present invention.
FIG. 6 illustrates a control scheme of the method according to another embodiment of the present invention.
FIG. 7 illustrates a control scheme of the method according to another embodiment of the present invention.
FIG. 8 illustrates a control scheme of the method according to another embodiment of the present invention.
FIG. 9 illustrates a control scheme of the method according to another embodiment of the present invention.
FIG. 10 illustrates a control scheme of the method according to another embodiment of the present invention.
DETAILED DESCRIPTION
FIG. 1 is a diagram illustrating a storage system 100 and a user device 10 according to an embodiment of the present invention. The user device 10 may comprise a processing circuit 11 (e.g. at least one processor and associated circuits), and may further comprise an interface circuit 12 coupled to the processing circuit 11, and a storage device. The storage system 100 may comprise a processing circuit 111 (e.g. at least one processor and associated circuits), and may further comprise interface circuits 112 and 122 and random access memory (RAM) 121 that are coupled to the processing circuit 111 through a bus 110. Storage devices {130, 131, 132, . . . , 146} (such as hard disks and/or solid state drives) may be installed in the storage system 100 through the interface circuit 122, and more particularly, the storage devices {131, 132, . . . , 146} may form a RAID, where a program code 111P executed on the processing circuit 111 may be read from the storage device 130 (e.g. a system disk), and may maintain (e.g. establish, store and/or update) a management table 121T within the RAM 121 in order to perform related operations to manage a data region DR. In addition, the management table 121T may comprise multiple sets of management information for dynamic recovery management, and each set of management information within the multiple sets of management information (e.g. a row of information within the management table 121T) may comprise RAID information such as RAID bitmap information, and may comprise validity information such as validity bitmap information. When needed, the management table 121T may be backed up in a table region TR, but the present invention is not limited thereto. Additionally, the interface circuits 12 and 112 may be implemented as a wired network interface and/or wireless network interface, to allow the storage system 100 and the user device 10 to exchange information with each other.
A user may access (read or write) user data in the storage system 100 through the user device 10. Examples of the user device 10 may include, but are not limited to: a multifunctional mobile phone, a tablet, a wearable device and a personal computer (such as a desktop computer and a laptop computer). Examples of the storage system 100 may include, but are not limited to: a server such as a storage server. According to some embodiments, the architecture of the storage system 100 may vary. For example, the program code 111P may be implemented by a dedicated hardware configured in the interface circuit 122, to perform related operations of the present invention method. According to some embodiments, the number of storage devices {131, 132, . . . , 146} within the RAID may vary, e.g. may be increased or reduced.
FIG. 2 is a working flow 200 of a method for performing dynamic recovery management regarding a RAID (such as the RAID shown in FIG. 1) according to an embodiment of the present invention, where the RAID may comprise multiple storage devices such as the storage devices {131, 132, . . . , 146}. The method may be applied to the storage system 100, the processing circuit 111 executing the program code 111P, and associated components shown in FIG. 1. For example, the storage system 100 (e.g. the processing circuit 111) may maintain (e.g. establish, store and/or update) respective validity information of the multiple sets of management information within the management table 121T according to at least one health state of the RAID such as one or more health states thereof, in order to generate the latest version of the multiple sets of management information. Examples of the one or more health states of the RAID may include, but are not limited to: a normal state, a malfunction state and a recovery state of one or more storage devices within the RAID.
In Step 210, the storage system 100 (e.g. the processing circuit 111) writes a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and records a first set of management information corresponding to the first set of protected data, such as a certain row of information of the management table 121T, for data recovery of the first set of protected data, where the first set of protected data comprises data and multiple parity-check codes, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data.
According to this embodiment, the RAID information within the first set of management information may comprise first RAID bitmap information, and the first RAID bitmap information may comprise a first set of first bits, where the first set of first bits indicates the first set of protected data is respectively stored in the first set of storage devices. For better comprehension, assume that the multiple storage devices comprise all of the storage devices {131, 132, . . . , 146}, and all these storage devices are currently operating normally. Under this situation, the first set of storage devices may comprise all of the multiple storage devices, but the present invention is not limited thereto. The first set of first bits may be 1111111111111111 (which may be recorded as 0xFFFF) to indicate the first set of protected data (such as the aforementioned data and multiple parity-check codes therein) is respectively stored in the storage devices {131, 132, . . . , 146}. In addition, the validity information within the first set of management information may comprise first validity bitmap information, and the first validity bitmap information may comprise a first set of second bits, where the first set of second bits indicates respective validities of the first set of protected data, respectively. Under a situation where all the storage devices {131, 132, . . . , 146} are currently operating normally, the first set of second bits may be 1111111111111111 (which may be recorded as 0xFFFF) to indicate all the first set of protected data is valid.
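Assuming the bit patterns quoted in this step are stored as 16-bit integers (the patent does not specify an encoding), the correspondence between the bitstring and hexadecimal notations can be checked with a small illustrative helper:

```python
# Illustrative helper: convert a 16-bit bitmap string (leftmost bit first)
# into the hexadecimal notation used alongside the bit patterns.

def bits_to_hex(bits):
    return "0x{:04X}".format(int(bits, 2))

# All sixteen storage devices {131, 132, . . . , 146} hold valid protected data.
assert bits_to_hex("1111111111111111") == "0xFFFF"

# With the first device excluded, the leading (most significant) bit drops out.
assert bits_to_hex("0111111111111111") == "0x7FFF"
```

Note that, read as a straightforward most-significant-bit-first integer, the pattern 0111111111111111 corresponds to 0x7FFF.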
In Step 220, when any storage device of the multiple storage devices malfunctions, the storage system 100 (e.g. the processing circuit 111) writes a second set of protected data into a second protected access unit of the multiple protected access units, and records a second set of management information corresponding to the second set of protected data, such as another row information within the management table 121T, for data recovery of the second set of protected data, where the second set of protected data comprises data and multiple parity-check codes, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data. More particularly, the second set of storage devices is different from the first set of storage devices. For example, the second set of storage devices does not comprise the aforementioned any storage device of the multiple storage devices.
According to this embodiment, the RAID information within the second set of management information may comprise second RAID bitmap information, and the second RAID bitmap information may comprise a second set of first bits, where the second set of first bits indicates the second set of protected data is respectively stored in the second set of storage devices. For better comprehension, assume that the multiple storage devices comprise all of the storage devices {131, 132, . . . , 146}, and most of these storage devices are currently operating normally, where the storage device 131 malfunctions. Under this situation, the second set of storage devices may comprise the storage devices {132, . . . , 146}, but the present invention is not limited thereto. The second set of first bits may be 0111111111111111 (which may be recorded as 0x7FFF) to indicate the second set of protected data (such as the aforementioned data and multiple parity-check codes therein) is respectively stored in the storage devices {132, . . . , 146}. In addition, the validity information within the second set of management information may comprise second validity bitmap information, and the second validity bitmap information may comprise a second set of second bits, where the second set of second bits indicates respective validities of the second set of protected data, respectively. Under a situation where all the storage devices {132, . . . , 146} are currently operating normally, the second set of second bits may be 0111111111111111 (which may be recorded as 0x7FFF) to indicate all the second set of protected data is valid. Please note that, since the second set of first bits 0111111111111111 indicates the second set of protected data is respectively stored in the storage devices {132, . . . , 146}, only the last 15 bits 111111111111111 are meaningful in the second set of second bits 0111111111111111, while the first bit 0 may be regarded as “don't care” according to some viewpoints, but the present invention is not limited thereto. When needed, and more particularly, when the health state of the RAID (e.g. one or more storage devices therein) changes, the storage system 100 (e.g. the processing circuit 111) may update respective validity information of the multiple sets of management information, such as multiple sets of second bits, to generate the latest versions of the multiple sets of management information, where each set of second bits within the multiple sets of second bits indicates respective validities of a corresponding set of protected data, respectively.
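The "don't care" reading described above can be made concrete: only validity bits at positions covered by the RAID bitmap are meaningful. A hedged sketch, in which the helper name and the masking convention are assumptions rather than the patent's:

```python
# Judge whether all protected data of a group is valid, ignoring validity
# bits at positions whose RAID bit is 0 ("don't care" positions).

def all_valid(raid_bits, valid_bits):
    raid = int(raid_bits, 2)
    valid = int(valid_bits, 2)
    return (valid & raid) == raid   # mask out bits outside the RAID bitmap

# Group written after the first device (131) failed: bit 0 of the validity
# bitmap is not meaningful for this group.
assert all_valid("0111111111111111", "0111111111111111") is True
assert all_valid("0111111111111111", "0011111111111111") is False
```

Masking with the RAID bitmap means a group's recovery status can be judged without special-casing the devices it never adopted.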
In Step 230, according to a latest version of at least one set of management information, the storage system 100 (e.g. the processing circuit 111) performs data recovery of at least one set of protected data, where the aforementioned at least one set of management information corresponds to the aforementioned at least one set of protected data. For example, the aforementioned at least one set of management information may comprise the first set of management information, and the aforementioned at least one set of protected data may comprise the first set of protected data. In another example, the aforementioned at least one set of management information may comprise the second set of management information, and the aforementioned at least one set of protected data may comprise the second set of protected data. In yet another example, the aforementioned at least one set of management information may comprise the first set of management information and the second set of management information, and the aforementioned at least one set of protected data may comprise the first set of protected data and the second set of protected data.
For better comprehension, the method may be illustrated by the working flow 200 shown in FIG. 2, but the present invention is not limited thereto. According to some embodiments, one or more steps may be added, removed or modified in the working flow 200.
When the storage device mentioned in Step 220 (i.e. the aforementioned any storage device of the multiple storage devices) malfunctions, the storage system 100 (e.g. the processing circuit 111) may update the validity information within the first set of management information, to indicate that protected data within the first set of protected data stored in this storage device is invalid, for data recovery of the first set of protected data. In Step 230, according to latest validity information within the first set of management information, the storage system 100 (e.g. the processing circuit 111) may read valid protected data within the first set of protected data, to perform data recovery of the first set of protected data according to the valid protected data, where the valid protected data comprises at least one portion (such as one portion or all) of the data within the first set of protected data, and comprises at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the first set of protected data.
In another example, when a second storage device of the multiple storage devices malfunctions, the storage system 100 (e.g. the processing circuit 111) may update the validity information within the first set of management information to indicate that protected data within the first set of protected data stored in the second storage device is invalid, for data recovery of the first set of protected data. In Step 230, according to latest validity information within the first set of management information, the storage system 100 (e.g. the processing circuit 111) may read valid protected data of the first set of protected data, to perform data recovery of the first set of protected data according to the valid protected data, where the valid protected data comprises at least one portion (such as one portion or all) of the data within the first set of protected data, but the present invention is not limited thereto. Under some situations (e.g. the valid protected data comprises a portion of the data), the valid protected data may comprise at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the first set of protected data.
In yet another example, when the second storage device malfunctions, the storage system 100 (e.g. the processing circuit 111) may update the validity information within the second set of management information to indicate that protected data within the second set of protected data stored in the second storage device is invalid, for data recovery of the second set of protected data. In Step 230, according to latest validity information within the second set of management information, the storage system 100 (e.g. the processing circuit 111) may read valid protected data within the second set of protected data to perform data recovery of the second set of protected data according to the valid protected data, where the valid protected data comprises at least one portion (such as one portion or all) of the data within the second set of protected data, and comprises at least one parity-check code of the multiple parity-check codes (such as one or more of these parity-check codes) within the second set of protected data.
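The validity-information updates and valid-data selection described in the three examples above can be sketched in a few lines. This is an illustrative sketch only, not the claimed implementation; the function names, the bitmap-string representation, and the convention that the leftmost bit corresponds to SD0 are assumptions chosen to match the notation used later in FIG. 5 to FIG. 10.

```python
def mark_invalid(validity_bits: str, disk: int) -> str:
    """Clear the bit of a malfunctioning disk; the leftmost bit corresponds to SD0."""
    bits = list(validity_bits)
    bits[disk] = '0'
    return ''.join(bits)

def valid_disks(raid_bits: str, validity_bits: str) -> list:
    """Disks whose share of the protected data is still readable for data recovery:
    the disk stores part of this protected access unit AND that part is still valid."""
    return [d for d, (r, v) in enumerate(zip(raid_bits, validity_bits))
            if r == '1' and v == '1']
```

For example, `mark_invalid("1111111111000000", 7)` yields `"1111111011000000"`, matching the update performed when the storage device SD7 malfunctions in FIG. 5.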
FIG. 3 illustrates a plurality of protected access units according to an embodiment of the present invention, where examples of the plurality of protected access units may include protected blocks 310 and 320, but the present invention is not limited thereto. Regarding any protected block within the protected blocks 310 and 320, a symbol “D” may represent data within the protected block such as user data respectively stored in some storage devices, and symbols “P” and “Q” may respectively represent parity-check codes within the protected block. Through the parity-check codes P and Q, the data D can be protected. The parity-check codes P and Q may be the same as or different from each other, and more particularly, under a situation where they are different from each other, the storage system 100 (e.g. the processing circuit 111) may respectively adopt different encoding rules to perform error correction code (ECC) encoding on the data D in order to generate the corresponding parity-check codes P and Q. For better comprehension, the multiple storage devices of the RAID may comprise the storage devices {131, 132, . . . , 144, 145, 146}, but the present invention is not limited thereto. Regarding any protected block within the protected blocks 310 and 320, the storage devices {131, 132, . . . , 144, 145, 146} may store a set of protected data (e.g. the first set of protected data), and any of the storage devices {131, 132, . . . , 144, 145, 146} may store corresponding protected data within this set of protected data, such as the data D, the parity-check code P or the parity-check code Q. According to some embodiments, a type and/or a protection degree of the RAID may vary, where the user data may obtain protection of a corresponding type and/or degree. The arrangement of the data D, the parity-check code P and/or the parity-check code Q may vary.
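One common way to realize two parity-check codes generated under different encoding rules is the RAID-6 convention: P as a bytewise XOR, and Q as a weighted sum over GF(2^8). The sketch below is illustrative only; the patent does not specify these particular codes, and the 0x11D reduction polynomial and the generator 2 are assumptions borrowed from conventional RAID-6 practice.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
    return p

def encode_pq(data: bytes):
    """P is the XOR of the data bytes; Q additionally weights the i-th byte by 2**i
    in GF(2^8), so P and Q come from two different encoding rules, as in FIG. 3."""
    p = q = 0
    weight = 1
    for d in data:
        p ^= d
        q ^= gf_mul(d, weight)
        weight = gf_mul(weight, 2)
    return p, q
```

Because the two rules differ, losing any two of the protected members (two data bytes, or one data byte plus P or Q) still leaves enough information to solve for the missing pieces.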
In another example, a number of storage devices configured to store the data D and/or a number of storage devices configured to store the parity-check codes (such as the parity-check codes P and Q) may vary. In yet another example, regarding any protected block within the protected blocks 310 and 320, a total number of storage devices configured to store the data D and the parity-check codes P and Q may vary.
FIG. 4 illustrates a redirect-on-write (ROW) scheme of the method according to an embodiment of the present invention. The storage system 100 (e.g. the processing circuit 111) can write multiple sets of protected data into multiple protected blocks of the RAID in a ROW manner, and respectively record the multiple sets of management information corresponding to the multiple sets of protected data, for data recovery of the multiple sets of protected data, where any set of protected data within the multiple sets of protected data may comprise data (such as the data D) and multiple parity-check codes (such as the parity-check codes P and Q). Regarding any protected access unit (e.g. a certain protected block within the data region DR) within the aforementioned multiple protected access units in Step 210, the storage system 100 (e.g. the processing circuit 111) may record or update mapping information between a logical address of the data D and a protected-access-unit address (p-address) of this protected access unit into a logical-address-to-p-address (L2p) table 410 within the table region TR. The L2p table 410 may comprise multiple L2p sub-tables, where a first row of L2p sub-tables may respectively map pages 0-511 (more particularly, logical addresses 0-511) to respective storage locations thereof (e.g. some protected access units such as protected blocks); a second row of L2p sub-tables may respectively map pages 512-1023 (more particularly, logical addresses 512-1023) to respective storage locations thereof (e.g. some protected access units such as protected blocks); and the rest may be deduced by analogy, but the present invention is not limited thereto. According to some embodiments, these storage locations may be regarded as ROW locations.
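The page-to-ROW-location bookkeeping of the L2p table 410 can be sketched as follows. The class name, the dictionary representation, and the 512-page sub-table granularity are illustrative assumptions; the granularity merely mirrors the 0-511 / 512-1023 rows described above, and a real implementation would store the sub-tables in the table region TR.

```python
SUBTABLE_SIZE = 512  # pages 0-511 in the first sub-table, 512-1023 in the second, ...

class L2PTable:
    """Maps a logical page to the p-address of its latest ROW location."""

    def __init__(self):
        self.subtables = {}  # sub-table index -> {page offset: p-address}

    def update(self, page: int, p_addr) -> None:
        # Redirect-on-write: each new write of a page simply records the
        # page's newest storage location, superseding the old mapping.
        sub = self.subtables.setdefault(page // SUBTABLE_SIZE, {})
        sub[page % SUBTABLE_SIZE] = p_addr

    def lookup(self, page: int):
        # Returns None for a page that has never been written.
        return self.subtables.get(page // SUBTABLE_SIZE, {}).get(page % SUBTABLE_SIZE)
```

Writing a page twice leaves only the newer mapping visible, which is exactly what allows the system to keep writing into fresh protected access units after a disk failure instead of updating data in place.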
According to some embodiments, a total number of storage devices within the RAID may vary, and the total number of storage devices configured to store the data D and the parity-check codes P and Q may accordingly vary. For example, the RAID may comprise ten storage devices, such as the first ten storage devices {131, 132, . . . } within the storage devices {131, 132, . . . , 146} shown in FIG. 1. For better comprehension, in the embodiments shown in FIG. 5 to FIG. 10, assume that the ten storage devices {131, 132, . . . } are respectively represented by {SD0, SD1, . . . , SD9}.
FIG. 5 illustrates a control scheme of the method according to an embodiment of the present invention, where the plurality of protected access units may comprise multiple groups of protected access units, such as a group 510 that is firstly written and a group 520 that is subsequently written, but the present invention is not limited thereto. For brevity, a row of small frames may represent a protected access unit, and ten small frames (from left to right) within the row of small frames may respectively correspond to the ten storage devices {131, 132, . . . } such as the storage devices {SD0, SD1, . . . , SD9}, and more particularly, may represent subsets of this protected access unit which are respectively located at the storage devices {SD0, SD1, . . . , SD9}. Any row of small frames labeled with symbols “D”, “P” and “Q” may represent a protected access unit in which the data D and the parity-check codes P and Q have been written.
As shown in the upper left corner of FIG. 5, for protected data in any protected access unit within the group 510, the data D and the parity-check codes P and Q may be respectively stored in the storage devices {SD0, SD1, . . . , SD9}. Regarding protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may respectively record corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111111000000 and a set of second bits 1111111111000000, meaning the protected data is stored in the storage devices {SD0, SD1, . . . , SD9}, respectively, and is all valid. Afterwards, when a certain storage device such as the storage device SD7 malfunctions (this is labeled “Disk fail” for better comprehension), protected data within the storage device SD7 becomes unobtainable (this is labeled “F” for better comprehension), and therefore may be regarded as invalid. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may update corresponding validity bitmap information as a set of second bits 1111111011000000, meaning the majority of the protected data is valid, but the protected data within the storage device SD7 may be regarded as invalid. Afterwards, the storage system 100 (e.g. the processing circuit 111) may continue writing, and more particularly, write the user data into protected access units within the group 520. Regarding protected data in each of the protected access units within the group 520, the storage system 100 (e.g. the processing circuit 111) may respectively record corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111011000000 and a set of second bits 1111111011000000, meaning the protected data is stored in nine normal storage devices {SD0, SD1, . . . , SD6, SD8, SD9} within the storage devices {SD0, SD1, . . . , SD9}, respectively, and is all valid (in the storage devices {SD0, SD1, . . . , SD6, SD8, SD9}).
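Recording the two bitmaps at write time, as described for the groups 510 and 520, can be sketched as follows; the function name, the 16-bit width, and the string representation are assumptions chosen to mirror the bit patterns shown in FIG. 5.

```python
def record_bitmaps(healthy_disks, width=16):
    """At write time the RAID bitmap and the validity bitmap start out identical:
    the protected access unit is stored on exactly the currently healthy disks,
    and all of its protected data is valid."""
    bits = ['0'] * width
    for d in healthy_disks:
        bits[d] = '1'
    s = ''.join(bits)
    return s, s  # (RAID bitmap information, validity bitmap information)
```

For example, `record_bitmaps(range(10))` gives the `1111111111000000` pair recorded for the group 510, and `record_bitmaps([0, 1, 2, 3, 4, 5, 6, 8, 9])` (SD7 excluded after its failure) gives the `1111111011000000` pair recorded for the group 520.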
Please note that the protected data in each of the protected access units within the group 510 may be regarded as (8+2) protected data, where 8 means the data D is distributed in eight storage devices {SD0, SD1, . . . , SD7} (the storage device SD7 malfunctions), and 2 means the parity-check codes P and Q are distributed in two storage devices {SD8, SD9}. In addition, the protected data in each of the protected access units within the group 520 may be regarded as (7+2) protected data, where 7 means the data D is distributed in seven storage devices {SD0, SD1, . . . , SD6}, and 2 means the parity-check codes P and Q are distributed in two storage devices {SD8, SD9}.
FIG. 6 illustrates a control scheme of the method according to another embodiment of the present invention, where the multiple groups of protected access units may comprise the two groups 510 and 520 which are written before, a group 530 which is subsequently written and a group 540 which is not written yet, but the present invention is not limited thereto. The leftmost portion of FIG. 6 is similar to the rightmost portion of FIG. 5. When another storage device such as the storage device SD9 malfunctions (labeled “Disk fail” for better comprehension), protected data within the storage device SD9 becomes unobtainable (labeled “F” for better comprehension), and therefore may be regarded as invalid. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may update corresponding validity bitmap information as a set of second bits 1111111010000000, meaning the majority of the protected data is valid, but the protected data stored by the storage devices SD7 and SD9 may be regarded as invalid. Regarding the protected data in each of the protected access units within the group 520, the storage system 100 (e.g. the processing circuit 111) may update corresponding validity bitmap information as a set of second bits 1111111010000000, meaning the majority of the protected data is valid, but the protected data stored by the storage device SD9 may be regarded as invalid. Afterwards, the storage system 100 (e.g. the processing circuit 111) may continue writing, and more particularly, write the user data into protected access units within the group 530. Regarding protected data in each of the protected access units within the group 530, the storage system 100 (e.g. the processing circuit 111) may respectively record corresponding RAID bitmap information and validity bitmap information as a set of first bits 1111111010000000 and a set of second bits 1111111010000000, meaning the protected data is respectively stored in eight normal storage devices {SD0, SD1, . . . , SD6, SD8} within the storage devices {SD0, SD1, . . . , SD9} and is all valid (in the storage devices {SD0, SD1, . . . , SD6, SD8}).
Please note that the protected data in each of the protected access units within the group 530 may be regarded as (6+2) protected data, where 6 means the data D is distributed in six storage devices {SD0, SD1, . . . , SD5}, and 2 means the parity-check codes P and Q are distributed in two storage devices {SD6, SD8}. As shown in the rightmost portion of FIG. 6, a number RAID_DISK(510) of RAID disks {SD0, SD1, . . . , SD9} adopted by the group 510 is equal to 10, where a number FAIL_DISK(510) of malfunctioning disks {SD7, SD9} is equal to 2. In addition, a number RAID_DISK(520) of RAID disks {SD0, SD1, . . . , SD6, SD8, SD9} adopted by the group 520 is equal to 9, where a number FAIL_DISK(520) of malfunctioning disks {SD9} is equal to 1. Additionally, a number RAID_DISK(530) of RAID disks {SD0, SD1, . . . , SD6, SD8} adopted by the group 530 is equal to 8, where a number FAIL_DISK(530) of malfunctioning disks is equal to 0.
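The per-group counts RAID_DISK and FAIL_DISK in the rightmost portion of FIG. 6 follow directly from the two bitmaps; a minimal sketch (function names assumed, bitmaps as bit strings with the leftmost bit for SD0):

```python
def raid_disk_count(raid_bitmap: str) -> int:
    """RAID_DISK: how many disks the group's protected data is spread across."""
    return raid_bitmap.count('1')

def fail_disk_count(raid_bitmap: str, validity_bitmap: str) -> int:
    """FAIL_DISK: disks that hold part of the group's protected data
    but whose copy has been marked invalid by a malfunction."""
    return sum(1 for r, v in zip(raid_bitmap, validity_bitmap)
               if r == '1' and v == '0')
```

With the bitmaps recorded for FIG. 6, the group 510 gives (10, 2), the group 520 gives (9, 1), and the group 530 gives (8, 0), matching RAID_DISK and FAIL_DISK above.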
FIG. 7 illustrates a control scheme of the method according to another embodiment of the present invention. The leftmost portion of FIG. 7 is equivalent to the rightmost portion of FIG. 6. A new storage device is coupled to the interface circuit 122 to replace a certain malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD9 (this is labeled “New disk inserted” for better comprehension). Protected access units within the storage system 100 that need to be recovered (or restored) at this moment may comprise respective protected access units of the groups 510 and 520. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P stored in the storage device SD8; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7, and perform ECC encoding according to the data respectively corresponding to the storage devices {SD0, SD1, . . . , SD7} in order to generate the parity-check code Q corresponding to the storage device SD9; and may update the corresponding validity bitmap information to be a set of second bits 1111111011000000, meaning the majority of the protected data is valid, but the protected data stored by the storage device SD7 may be regarded as invalid. In addition, regarding the protected data in each of the protected access units within the group 520, the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} in order to generate the parity-check code Q corresponding to the storage device SD9; and may update the corresponding validity bitmap information to be a set of second bits 1111111011000000, meaning the protected data is all valid. As a result, the protected data within the group 520 is completely recovered.
FIG. 8 illustrates a control scheme of the method according to another embodiment of the present invention. A new storage device is coupled to the interface circuit 122 to replace another malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD7 (this is labeled “New disk inserted” for better comprehension). Protected access units within the storage system 100 that need to be recovered at this moment may comprise the protected access units within the group 510. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may recover the data D corresponding to the storage device SD7 according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P stored in the storage device SD8; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7; and may update the corresponding validity bitmap information to be a set of second bits 1111111111000000, meaning the protected data is all valid. As a result, the protected data within the group 510 is completely recovered.
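When a single member of a protected access unit is lost and an XOR-style parity-check code P survives, the lost member is simply the XOR of all surviving members. The sketch below illustrates this rebuild step under the assumption that P is a plain bytewise XOR of the data chunks; the patent leaves the concrete encoding rule unspecified, so this is one possible realization, not the claimed one.

```python
def rebuild_missing(survivors):
    """XOR all surviving chunks (remaining data chunks plus the parity chunk P)
    to reconstruct the single missing chunk of the protected access unit."""
    missing = bytes(len(survivors[0]))
    for chunk in survivors:
        missing = bytes(a ^ b for a, b in zip(missing, chunk))
    return missing
```

For instance, if P was computed as the XOR of three data chunks and one of them is lost, feeding the two remaining data chunks together with P to `rebuild_missing` reproduces the lost chunk exactly, after which the validity bitmap information can be updated to mark it valid again.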
FIG. 9 illustrates a control scheme of the method according to another embodiment of the present invention. The leftmost portion of FIG. 9 is equivalent to the rightmost portion of FIG. 6. A new storage device is coupled to the interface circuit 122 to replace a certain malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD7 (this is labeled “New disk inserted” for better comprehension). Protected access units within the storage system 100 that need to be recovered at this moment may comprise the protected access units within the group 510. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may recover the data D corresponding to the storage device SD7 according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P stored in the storage device SD8; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC decoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} and the parity-check code P corresponding to the storage device SD8 in order to generate the data D corresponding to the storage device SD7; and may update the corresponding validity bitmap information to be a set of second bits 1111111110000000, meaning the majority of the protected data is valid, but the protected data stored by the storage device SD9 is regarded as invalid.
FIG. 10 illustrates a control scheme of the method according to another embodiment of the present invention. The leftmost portion of FIG. 10 is equivalent to the rightmost portion of FIG. 9. A new storage device is coupled to the interface circuit 122 to replace another malfunctioning storage device; for example, this new storage device is installed in the storage system 100 to serve as the latest storage device SD9 (this is labeled “New disk inserted” for better comprehension). The protected access units within the storage system 100 that need to be recovered at this moment may comprise respective protected access units of the groups 510 and 520. Regarding the protected data in each of the protected access units within the group 510, the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD7}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD7} in order to generate the parity-check code Q corresponding to the storage device SD9; and may update the corresponding validity bitmap information to be a set of second bits 1111111111000000, meaning the protected data is all valid. As a result, the protected data within the group 510 is completely recovered. In addition, regarding the protected data in each of the protected access units within the group 520, the storage system 100 (e.g. the processing circuit 111) may recover the parity-check code Q according to the data D respectively stored in the storage devices {SD0, SD1, . . . , SD6}; more particularly, the storage system 100 (e.g. the processing circuit 111) may perform ECC encoding according to the data D respectively corresponding to the storage devices {SD0, SD1, . . . , SD6} in order to generate the parity-check code Q corresponding to the storage device SD9; and may update the corresponding validity bitmap information to be a set of second bits 1111111011000000, meaning the protected data is all valid. As a result, the protected data within the group 520 is completely recovered.
According to some embodiments, the multiple sets of management information may vary. For example, regarding any set (more particularly, each set) of the multiple sets of management information, a bit count of first bits within the RAID bitmap information (such as the first RAID bitmap information, the second RAID bitmap information, etc.) and/or a bit count of second bits within the validity bitmap information (such as the first validity bitmap information, the second validity bitmap information, etc.) may vary (e.g. increase or decrease). For brevity, similar descriptions for these embodiments are not repeated in detail here.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.

Claims (14)

What is claimed is:
1. A method for performing dynamic recovery management regarding a redundant array of independent disks (RAID), the method comprising:
writing a first set of protected data into a first access unit of multiple access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, any access unit of the multiple access units is a logical access unit of the RAID regarding accessing the RAID, and comprises respective partial storage regions of the multiple storage devices, the first set of protected data comprises data and multiple parity-check codes configured to protect said data of the first set of protected data, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and
in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second access unit of the multiple access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data, wherein the second set of protected data comprises data and multiple parity-check codes configured to protect said data of the second set of protected data, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data;
wherein the second set of storage devices is different from the first set of storage devices.
2. The method of claim 1, wherein the RAID information within the first set of management information comprises first RAID bitmap information, the first RAID bitmap information comprises a first set of first bits, and the first set of first bits indicates the first set of protected data is respectively stored in the first set of storage devices.
3. The method of claim 2, wherein the RAID information within the second set of management information comprises second RAID bitmap information, the second RAID bitmap information comprises a second set of first bits, and the second set of first bits indicates the second set of protected data is respectively stored in the second set of storage devices.
4. The method of claim 1, wherein the validity information within the first set of management information comprises first validity bitmap information, the first validity bitmap information comprises a first set of second bits, and the first set of second bits indicates respective validities of the first set of protected data, respectively.
5. The method of claim 4, wherein the validity information within the second set of management information comprises second validity bitmap information, the second validity bitmap information comprises a second set of second bits, and the second set of second bits indicates respective validities of the second set of protected data, respectively.
6. The method of claim 1, wherein the second set of storage devices does not comprise said any storage device.
7. The method of claim 1, further comprising:
in response to said any storage device malfunctioning, updating the validity information within the first set of management information to indicate protected data within the first set of protected data stored in said any storage device is invalid, for data recovery of the first set of protected data.
8. The method of claim 7, further comprising:
according to latest validity information within the first set of management information, reading valid protected data of the first set of protected data to perform data recovery of the first set of protected data according to the valid protected data, wherein the valid protected data comprises at least one portion of the data within the first set of protected data, and comprises at least one parity-check code of the multiple parity-check codes within the first set of protected data.
9. The method of claim 7, further comprising:
in response to a second storage device of the multiple storage devices malfunctioning, updating the validity information within the first set of management information to indicate protected data within the first set of protected data stored in the second storage device is invalid, for data recovery of the first set of protected data.
10. The method of claim 9, further comprising:
according to latest validity information within the first set of management information, reading valid protected data of the first set of protected data to perform data recovery of the first set of protected data according to the valid protected data, wherein the valid protected data comprises at least one portion of the data within the first set of protected data.
11. The method of claim 7, further comprising:
in response to a second storage device of the multiple storage devices malfunctioning, updating the validity information within the second set of management information to indicate protected data within the second set of protected data stored in the second storage device is invalid, for data recovery of the second set of protected data.
12. The method of claim 11, further comprising:
according to latest validity information within the second set of management information, reading valid protected data of the second set of protected data to perform data recovery of the second set of protected data according to the valid protected data, wherein the valid protected data comprises at least one portion of the data within the second set of protected data, and comprises at least one parity-check code of the multiple parity-check codes within the second set of protected data.
13. A storage system operating according to the method of claim 1, wherein the storage system comprises the RAID.
14. An apparatus for performing dynamic recovery management regarding a redundant array of independent disks (RAID), the apparatus comprising:
a processing circuit, positioned in a storage system, configured to control operations of the storage system, wherein the operations of the storage system comprise:
writing a first set of protected data into a first access unit of multiple access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data, wherein the RAID comprises multiple storage devices, any access unit of the multiple access units is a logical access unit of the RAID regarding accessing the RAID, and comprises respective partial storage regions of the multiple storage devices, the first set of protected data comprises data and multiple parity-check codes configured to protect said data of the first set of protected data, RAID information within the first set of management information indicates the first set of protected data is stored in a first set of storage devices of the multiple storage devices, and validity information within the first set of management information indicates respective validities of the first set of protected data; and
in response to any storage device of the multiple storage devices malfunctioning, writing a second set of protected data into a second access unit of the multiple access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data, wherein the second set of protected data comprises data and multiple parity-check codes configured to protect said data of the second set of protected data, RAID information within the second set of management information indicates the second set of protected data is stored in a second set of storage devices of the multiple storage devices, and validity information within the second set of management information indicates respective validities of the second set of protected data;
wherein the second set of storage devices is different from the first set of storage devices.
US17/084,650 2019-01-02 2020-10-30 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks Active US11301326B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/084,650 US11301326B2 (en) 2019-01-02 2020-10-30 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
TW108100004 2019-01-02
TW108100004A TWI704451B (en) 2019-01-02 2019-01-02 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks and storage system operating according to the method
US16/513,675 US10860423B2 (en) 2019-01-02 2019-07-16 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks
US17/084,650 US11301326B2 (en) 2019-01-02 2020-10-30 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/513,675 Continuation US10860423B2 (en) 2019-01-02 2019-07-16 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Publications (2)

Publication Number Publication Date
US20210073074A1 US20210073074A1 (en) 2021-03-11
US11301326B2 true US11301326B2 (en) 2022-04-12

Family

ID=71124273

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/513,675 Active US10860423B2 (en) 2019-01-02 2019-07-16 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks
US17/084,650 Active US11301326B2 (en) 2019-01-02 2020-10-30 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/513,675 Active US10860423B2 (en) 2019-01-02 2019-07-16 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Country Status (3)

Country Link
US (2) US10860423B2 (en)
CN (1) CN111400084B (en)
TW (1) TWI704451B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI704451B (en) * 2019-01-02 2020-09-11 慧榮科技股份有限公司 Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks and storage system operating according to the method
CN111949434B (en) * 2019-05-17 2022-06-14 华为技术有限公司 RAID management method, RAID controller and system
US11507319B2 (en) * 2021-02-04 2022-11-22 Silicon Motion, Inc. Memory controller having a plurality of control modules and associated server
CN112905387B (en) * 2021-03-04 2022-05-24 河北工业大学 RAID6 encoding and data recovery method based on same
CN114080596A (en) * 2021-09-29 2022-02-22 长江存储科技有限责任公司 Data protection method for memory and memory device thereof

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5826001A (en) 1995-10-13 1998-10-20 Digital Equipment Corporation Reconstructing data blocks in a raid array data storage system having storage device metadata and raid set metadata
US20050055603A1 (en) 2003-08-14 2005-03-10 Soran Philip E. Virtual disk drive system and method
US20050108594A1 (en) 2003-11-18 2005-05-19 International Business Machines Corporation Method to protect data on a disk drive from uncorrectable media errors
US6950901B2 (en) 2001-01-05 2005-09-27 International Business Machines Corporation Method and apparatus for supporting parity protection in a RAID clustered environment
US7386758B2 (en) * 2005-01-13 2008-06-10 Hitachi, Ltd. Method and apparatus for reconstructing data in object-based storage arrays
US7603529B1 (en) * 2006-03-22 2009-10-13 Emc Corporation Methods, systems, and computer program products for mapped logical unit (MLU) replications, storage, and retrieval in a redundant array of inexpensive disks (RAID) environment
US20160259554A1 (en) * 2012-12-13 2016-09-08 Hitachi, Ltd. Computer realizing high-speed access and data protection of storage device, computer system, and i/o request processing method
US9535563B2 (en) 1999-02-01 2017-01-03 Blanding Hovenweep, Llc Internet appliance system and method
US20190129815A1 (en) * 2017-10-31 2019-05-02 EMC IP Holding Company LLC Drive extent based end of life detection and proactive copying in a mapped raid (redundant array of independent disks) data storage system
US20200150887A1 (en) * 2018-11-08 2020-05-14 Silicon Motion Inc. Method and apparatus for performing mapping information management regarding redundant array of independent disks
US20200210290A1 (en) * 2019-01-02 2020-07-02 Silicon Motion Inc. Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001075741A (en) * 1999-09-02 2001-03-23 Toshiba Corp Disk control system and data maintenance method
US7434097B2 (en) * 2003-06-05 2008-10-07 Copan System, Inc. Method and apparatus for efficient fault-tolerant disk drive replacement in raid storage systems
JP4862847B2 (en) * 2008-03-07 2012-01-25 日本電気株式会社 Disk array data recovery method, disk array system, and control program
CN105068896B (en) * 2015-09-25 2019-03-12 浙江宇视科技有限公司 Data processing method and device based on RAID backup
KR102573301B1 (en) * 2016-07-15 2023-08-31 삼성전자 주식회사 Memory System performing RAID recovery and Operating Method thereof

Also Published As

Publication number Publication date
TWI704451B (en) 2020-09-11
TW202026874A (en) 2020-07-16
US10860423B2 (en) 2020-12-08
CN111400084A (en) 2020-07-10
US20200210290A1 (en) 2020-07-02
CN111400084B (en) 2023-06-20
US20210073074A1 (en) 2021-03-11

Similar Documents

Publication Publication Date Title
US11301326B2 (en) Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks
US11941257B2 (en) Method and apparatus for flexible RAID in SSD
JP5341896B2 (en) Self-healing nonvolatile memory
US7254754B2 (en) Raid 3+3
US9158675B2 (en) Architecture for storage of data on NAND flash memory
US11874741B2 (en) Data recovery method in storage medium, data recovery system, and related device
CN112068778B (en) Method and apparatus for maintaining integrity of data read from a storage array
CN107885620B (en) Method and system for improving performance and reliability of solid-state disk array
CN111124266A (en) Data management method, device and computer program product
US11221773B2 (en) Method and apparatus for performing mapping information management regarding redundant array of independent disks
CN111506259B (en) Data storage method, data reading method, data storage device, data reading apparatus, data storage device, and readable storage medium
TWI768476B (en) Method and apparatus for performing mapping information management regarding redundant array of independent disks, and associated storage system
CN113703684A (en) Data storage method, data reading method and storage system based on RAID
WO2013023564A1 (en) Method and apparatus for flexible raid in ssd
US20100085658A1 (en) Storage device controlling device, storage device, storage device controlling program, and storage device controlling method
JP2006268286A (en) Disk array device

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE