WO2014101412A1 - Disk reconstruction method and apparatus therefor - Google Patents

Disk reconstruction method and apparatus therefor

Info

Publication number
WO2014101412A1
Authority
WO
WIPO (PCT)
Prior art keywords
disk
data
member disk
reconstruction
restored
Application number
PCT/CN2013/080582
Other languages
English (en)
French (fr)
Inventor
何孝金
覃中
熊伟
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2014101412A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F 11/2056 Error detection or correction of the data by redundancy in hardware using active fault-masking, with redundant persistent mass storage by mirroring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F 11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F 11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F 11/1092 Rebuilding, e.g. when physically replacing a failing disk

Definitions

  • the present invention relates to the field of storage, and in particular, to a disk reconstruction method and apparatus therefor.
  • A Redundant Array of Independent Disks (RAID), formerly known as a Redundant Array of Inexpensive Disks (RAID), is a disk group or hard disk group formed by combining multiple independent disks or hard disks; it can also be called a logical hard disk. The multiple disks in a disk group are member disks of one another.
  • RAID technology is one of the most commonly used technologies in the storage field. It virtualizes multiple disks or hard disks into one large-capacity disk or hard disk, speeds up overall storage through parallel reads and writes, and can use redundant error-correction techniques to achieve a degree of fault tolerance, thereby providing higher storage performance and data backup capability than a single disk or hard disk of the same capacity.
  • Embodiments of the present invention provide a disk reconstruction method and apparatus, which can reduce data loss after a disk is reconstructed.
  • a disk reconstruction method is provided, including: when a first member disk in a RAID group fails, recovering the data of the first member disk according to the data of second member disks in the RAID group other than the first member disk, and storing the recovered data to a target disk; before the failure of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as a member disk of the RAID group; and after the failure of the first member disk is recovered, performing corresponding reconstruction processing according to the restored first member disk.
  • in one implementation, the target disk containing the recovered data is replaced with the restored first member disk as a member disk of the RAID group.
  • in another implementation, the method further includes: recovering the data of a first area of the first member disk according to the data of readable areas of the second member disks in the RAID group other than the first member disk, and storing the recovered data to the target disk, where no data is written to the area on the target disk corresponding to a second area of the first member disk, the area of the second member disk corresponding to the first area is readable, and the second area corresponds to the unreadable area of the second member disk. Further, after the failure of the first member disk is recovered, the data of the second area of the restored first member disk may be stored to the target disk.
  • an apparatus for implementing disk reconstruction including:
  • the data acquisition unit is configured to recover data of the first member disk according to the data of the second member disk other than the first member disk in the RAID group, where the first member disk is a member disk that fails in the RAID group;
  • a write processing unit is configured to write data recovered by the data acquisition unit to the target disk.
  • a reconstruction control unit is configured to switch the member disk of the RAID group from the first member disk to the target disk containing the recovered data before the failure of the first member disk is recovered, and to complete the reconstruction processing according to the restored first member disk after the failure of the first member disk is recovered.
  • in one implementation, after the failure of the first member disk is recovered, the reconstruction control unit switches the member disk of the RAID group from the target disk containing the recovered data to the restored first member disk.
  • in another implementation, upon determining that the failure of the first member disk is recoverable, the reconstruction control unit selects the reconstruction mode in which, after the failure of the first member disk is recovered, the member disk of the RAID group is switched from the target disk containing the recovered data to the restored first member disk.
  • a storage device is provided, comprising the above apparatus for implementing disk reconstruction, and one or more RAID groups and/or target disks coupled to the apparatus for implementing disk reconstruction.
  • a fourth aspect provides a disk reconstruction apparatus, including:
  • the disk adapter serves as an interface between the redundant array of independent disks (RAID) group and the target disk; the storage controller is configured to determine whether the failure of the first member disk is recoverable, to process according to a first reconstruction mode if the failure of the first member disk is recoverable, and to process according to a second reconstruction mode if the failure of the first member disk is unrecoverable;
  • in the first reconstruction mode, the data of the first member disk is recovered according to the data of the second member disks in the RAID group other than the first member disk, and the recovered data is stored to the target disk; before the failure of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group; after the failure of the first member disk is recovered, the reconstruction processing is completed according to the restored first member disk;
  • in the second reconstruction mode, the data of the first member disk is recovered according to the data of the second member disks, and the recovered data is stored to the target disk; after the reconstruction from the first member disk to the target disk is completed, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group, and the first member disk is removed from the RAID group to complete the disk reconstruction.
  • in one implementation, completing the reconstruction processing according to the restored first member disk includes: replacing the target disk containing the recovered data with the restored first member disk as a member disk of the RAID group to complete the reconstruction processing.
  • in another implementation, completing the reconstruction processing according to the restored first member disk includes: storing the data of the area on the restored first member disk corresponding to the unreadable area of the second member disk to the target disk to complete the reconstruction processing.
  • a storage device is provided, including the above disk reconstruction apparatus and one or more RAID groups and/or target disks coupled to the disk reconstruction apparatus.
  • FIG. 1 is a schematic flowchart of a disk reconstruction method according to an embodiment of the present invention.
  • FIG. 2A is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention.
  • FIG. 2B is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention.
  • FIG. 6A is a schematic diagram of data storage of a RAID group according to an embodiment of the present invention.
  • FIG. 6B is a schematic diagram of disk reconstruction according to another embodiment of the present invention.
  • FIG. 6C is a schematic diagram of disk reconstruction according to another embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a device according to another embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a device according to another embodiment of the present invention.
  • FIG. 9A is a block diagram of an application system according to an embodiment of the present invention.
  • FIG. 9B is a block diagram of an application system according to an embodiment of the present invention.
  • the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
  • "disk" and "hard disk" in the embodiments of the present invention have substantially the same meaning.
  • a disk is a storage device that is read and written magnetically. It can be a non-volatile storage medium: files saved to it are not lost after power-off. A hard disk protects the storage platters better by enclosing them in a hard metal case.
  • the disk reconstruction involved in the embodiment of the present invention is to reconstruct or restore data on the disk.
  • the recovered data can be written to the target disk.
  • the target disk can be a designated backup disk or any available free disk.
  • the various disk reconstruction methods and apparatuses provided by the embodiments of the present invention are applicable to a disk group including a plurality of member disks, such as a RAID group.
  • the disk group is used to distribute and store an integer number of data blocks and an integer number of pieces of check data formed from the data blocks. In such a disk group, if a disk needs to be reconstructed, the data of the remaining disks in the disk group can be used to recover the data of that disk, thereby realizing disk reconstruction.
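  • as a concrete illustration of this recovery principle, the sketch below rebuilds a lost block from the surviving blocks of the same stripe by XOR, as used in single-check-block RAID levels such as RAID-5. This is a minimal sketch rather than the patent's implementation; the function names and the assumption of XOR check data are ours.

```python
def xor_blocks(blocks):
    """XOR a list of equal-sized byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def recover_lost_block(surviving_blocks):
    """With XOR check data, a lost block (data or check) equals the
    XOR of all surviving blocks of the same stripe."""
    return xor_blocks(surviving_blocks)

# Example: 3 data blocks + 1 check block per stripe (as in FIG. 6A).
d1, d2, d3 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b"
p = xor_blocks([d1, d2, d3])                  # check block on one member disk
assert recover_lost_block([d1, d2, p]) == d3  # the disk holding d3 failed
```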
  • the "disk group" provided by the embodiment of the present invention may be a software-based array or a hardware-based array.
  • a software-based array is implemented by a software program running on the computer's central processing unit (CPU).
  • a software-based array configures multiple hard disks on a connected Small Computer System Interface (SCSI) card into logical disks through an array management function provided by the network operating system itself.
  • Software-based arrays provide data redundancy.
  • Hardware-based arrays are implemented using specialized disk array cards.
  • Hardware-based arrays provide online capacity expansion, dynamic modification of array levels, automatic data recovery, drive roaming, and caching. It provides solutions for performance, data protection, reliability, availability and manageability.
  • a hardware array card has a dedicated processing unit to perform the array operations, so its performance is much higher than that of a conventional non-array hard disk, and it is safer and more stable.
  • the disk group or the disk array provided by the embodiments of the present invention may adopt RAID technology, and the RAID may be software-based or hardware-based.
  • the embodiments of the present invention can be applied to various RAID combinations, identified by RAID levels, such as RAID-0, RAID-1, RAID-1E, RAID-5, RAID-6, RAID-7, RAID-10, and RAID-50.
  • RAID levels can meet the diverse needs of performance and security.
  • the number and storage methods of the disks required for various RAID levels are known to the public and will not be described.
  • each member disk contains an equal number of blocks, and the aligned blocks across all the member disks in the RAID group form a stripe.
  • FIG. 6A shows a RAID group data storage format.
  • the RAID group is divided into N stripes; each stripe corresponds to 4 blocks, of which 3 blocks store the data of 3 data blocks and 1 block stores the check data computed from the 3 data blocks of the stripe.
  • the size of each data block, in bits or bytes, can be set according to the storage device or system, through a local or remote control interface.
  • as shown in FIG. 6A, a set of data blocks D1, D2, D3, … and the check data P1, P2, …, PN formed from these data blocks are distributed across the member disks 601-604 of the RAID group 600. It is worth noting that the number of member disks included in a RAID group is not limited to four; the number of member disks can be determined according to the requirements of the RAID level and customer needs. The data storage mode in the embodiments of the present invention is not limited to that shown in FIG. 6A, and may include the existing storage modes of the various RAID levels.
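  • to make the layout in FIG. 6A concrete, the sketch below maps a logical data-block index to its (stripe, member disk) position for a 4-disk group with 3 data blocks plus 1 check block per stripe. This is a hedged sketch: the patent does not fix a placement rule, so a fixed check-disk column (as drawn in FIG. 6A) is assumed here.

```python
DATA_PER_STRIPE = 3   # data blocks per stripe (FIG. 6A)
CHECK_DISK = 3        # assumption: the last member disk holds the check block

def locate(block_index):
    """Map a logical data-block index (0-based) to (stripe, disk)."""
    stripe = block_index // DATA_PER_STRIPE
    disk = block_index % DATA_PER_STRIPE
    if disk >= CHECK_DISK:  # skip the check disk's column if it is not last
        disk += 1
    return stripe, disk

# D1..D6 land on disks 0-2 of stripes 0 and 1; disk 3 holds P1 and P2.
print([locate(i) for i in range(6)])  # [(0,0),(0,1),(0,2),(1,0),(1,1),(1,2)]
```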
  • when the first member disk in the RAID group fails, for example member disk 604 fails, in step S101 it is determined that a second member disk in the RAID group other than the first member disk has an unreadable area.
  • in step S103, the data of the first member disk can be recovered according to the data of the second member disks 601-603 in the RAID group other than member disk 604, and the recovered data is stored to the target disk 605.
  • in step S105, before the failure of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group.
  • in step S107, after the failure of the first member disk is recovered, the restored first member disk is used to complete the corresponding reconstruction processing.
  • if the second member disk is found to have an unreadable area, the first area of the first member disk is reconstructed while the second area of the first member disk may be left unreconstructed.
  • the first area corresponds to a readable area of the second member disk;
  • the second area corresponds to the unreadable area of the second member disk.
  • the reconstruction operation here includes data recovery processing.
  • on the target disk, disk space can be reserved for the second area. This keeps the block correspondence between the target disk and the second member disk consistent with the block correspondence between the first member disk and the second member disk; not changing the block correspondence reduces the storage and data-processing complexity while the target disk serves as a member disk. Bad-block marking may be skipped for the disk space reserved for the second area.
  • in the case where the second member disk has an unreadable area, after the failure of the first member disk is recovered, the target disk may be replaced with the restored first member disk.
  • alternatively, the data of the second area, which corresponds to the unreadable area of the second member disk, on the restored first member disk may be stored to the target disk. In this way, the integrity of the reconstructed disk data can be achieved, and the time delay caused by disk switching can be further reduced.
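  • the two phases just described can be summarized in code; see the sketch below. It is our own illustration with an assumed data model (each disk is a dict mapping region to bytes, or None for an unreadable region), not the patent's implementation.

```python
def xor(blocks):
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)

def reconstruct(second_disks, regions, target):
    """Phase 1: rebuild each region whose peer blocks are all readable
    (the first area); collect the rest (the second area) for later."""
    deferred = []
    for r in regions:
        peer = [disk[r] for disk in second_disks]
        if all(block is not None for block in peer):
            target[r] = xor(peer)   # first area: recover immediately
        else:
            deferred.append(r)      # second area: peer data unreadable
    return deferred

def finish_after_recovery(restored_disk, target, deferred):
    """Phase 2: once the failed disk's fault is recovered, copy the
    second-area data straight from the restored disk to the target."""
    for r in deferred:
        target[r] = restored_disk[r]
```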
  • FIG. 2A shows an implementation that requires switching back to the original disk after failure recovery.
  • the disk reconstruction method shown in FIG. 2A is applied to a storage device or a storage system of a RAID group including a plurality of member disks.
  • the processing for reconstructing a failed first member disk in a RAID group includes:
  • the second member disk may be determined to have an unreadable area according to a state detection result of the second member disk, where the state detection result indicates the readability detection result of a stripe and/or a block.
  • the first area may be determined according to the state detection result.
  • the state detection result can include a record of the readable area and/or the unreadable area; the unreadable area can also be determined from the record of the readable area.
  • the state detection result can be identified by a stripe identifier and/or a block identifier.
  • the stripe identifier and/or the block identifier may also be represented by a storage address; for example, the stripe identifier may be a stripe number or a stripe address (the start address and/or end address of the stripe), and the block identifier may be a block number or a block address (the start address and/or end address of the block).
  • the state detection result may be historical data stored in memory, such as a log containing records of unreadable areas, or may be obtained by performing state detection on the second member disk.
  • the state detection result may be historical data obtained before the fault occurs, or data obtained by detecting the disk state of the second member disk after the fault occurs. The detection can scan region by region: for each region of the second member disk that is readable, the corresponding region of the first member disk can be recovered. Alternatively, it is determined that an unreadable area exists on the second member disk, the second area of the first member disk is determined according to the unreadable area of the second member disk, and the remaining area of the first member disk is the first area. An embodiment of the present invention may include an operation of determining whether the failure of the first member disk is recoverable, which may be implemented by detecting the cause of the disk failure.
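  • a minimal sketch of deriving the two areas from such a detection result follows; the log format (a set of unreadable stripe identifiers per member disk) is an assumption of ours, not mandated by the patent.

```python
def split_areas(all_stripes, unreadable_log, second_disks):
    """Partition the failed disk's stripes into the first area (all
    peer stripes readable) and the second area (some peer stripe
    unreadable), given a per-disk log of unreadable stripe IDs."""
    unreadable = set()
    for disk in second_disks:
        unreadable |= unreadable_log.get(disk, set())
    first_area = [s for s in all_stripes if s not in unreadable]
    second_area = sorted(unreadable)
    return first_area, second_area

first, second = split_areas(range(8), {"disk1": {3, 5}}, ["disk1", "disk2"])
# first -> stripes [0, 1, 2, 4, 6, 7] (rebuild now); second -> [3, 5] (defer)
```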
  • a disk failure may have one or more causes, such as the disk being offline or a physical-media failure of the disk.
  • the in-position (slot) state of the first member disk in the RAID group can be detected; for example, slot state detection is performed on the first member disk before step S201a. If the first member disk is not in position, it is determined that the failure of the first member disk is recoverable.
  • an operation of determining whether the failure of the first member disk has been recovered may also be included, and it may be performed by detecting one or more of the disk in-position state, the disk identity information, the disk physical-media integrity, and the like.
  • the operation of determining whether the first member disk is recoverable may be determined by detecting the slot state of the first member disk, the identity of the first member disk, and the like, and may also take the integrity of the physical medium of the first member disk into account. This effectively handles cases where mistaken operation on a faulty disk or poor contact causes the disk to be undetectable.
  • if the failure is caused by damaged physical media, the failed disk is determined to be unrecoverable;
  • otherwise, for example if the disk is merely offline, the disk can be determined to be recoverable.
  • Figure 2B shows an implementation process that does not require switching back to the original disk after failure recovery.
  • the disk reconstruction method shown in FIG. 2B is applied to a storage device or a storage system containing a RAID group with a plurality of member disks.
  • the method includes: S201b: determining that a second member disk in the RAID group other than the first member disk has an unreadable area.
  • S203b: recovering the data of the first area of the first member disk by using the data of the second member disk, and saving the recovered data to the target disk, where the area of the second member disk corresponding to the first area of the first member disk is readable.
  • the operation of determining whether the second member disk has an unreadable area in step S201b may be the same as or similar to the operation in step S201a.
  • the operation of determining the first area in step S203b may be the same as or similar to that in step S203a, and details are not described here again.
  • Another embodiment of the present invention provides a disk reconstruction method applied to a storage device or a storage system of a RAID group including a plurality of member disks. As shown in Figure 3, when the first member disk in the RAID group is faulty, the reconstruction processing method includes:
  • S301 Determine whether the first member disk is recoverable. If the first member disk is recoverable, perform S303. If the first member disk is unrecoverable, perform S306.
  • the operation of determining whether the first member disk is recoverable can be implemented by detecting the cause of the disk failure, which may include one or more causes such as the disk being offline or a physical-media failure. For example, slot state detection is performed on the first member disk; if the slot state detection result indicates that the first member disk is not in position, it is determined that the first member disk is recoverable.
  • for example, if the first member disk has been pulled out, it is likely to be inserted back, and the first member disk can be determined to be recoverable; if the failure of the first member disk is caused by physical media, it is determined that the first member disk is unrecoverable.
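  • the sketch below condenses this decision into one function; the failure-cause enumeration is a hypothetical simplification of the slot-state, identity, and media checks described above.

```python
from enum import Enum, auto

class FailureCause(Enum):
    NOT_IN_POSITION = auto()  # disk pulled out, or poor slot contact
    OFFLINE = auto()          # disk temporarily offline
    MEDIA_FAILURE = auto()    # physical medium damaged

def is_recoverable(cause):
    """A pulled-out or offline disk may come back; damaged media cannot."""
    return cause in (FailureCause.NOT_IN_POSITION, FailureCause.OFFLINE)

mode = "first" if is_recoverable(FailureCause.NOT_IN_POSITION) else "second"
print(mode)  # "first": the disk is expected back, so plan to switch back
```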
  • the target disk can be any available free disk.
  • the disk space of the target disk can be divided into multiple areas that are kept in correspondence with the multiple areas of the first member disk, so that the data recovered for each area can be stored to the corresponding area of the target disk.
  • of course, dividing the space into areas is not mandatory; for example, the data can be stored in the order given by the data-block distribution rules.
  • a set of consecutive data blocks and a check block formed by the data blocks are distributed in a plurality of member disks of the RAID group. These disk areas of the storage data block and the check block are referred to as blocks, and at least one set of blocks spanning a plurality of member disks can form a stripe.
  • the data of the first member disk is recovered by using the distribution relationship between the data block and the check block on the second member disk.
  • the data of the area on the first member disk corresponding to the unreadable area is unrecoverable, so the corresponding area is not reconstructed; for example, no bad-block marking is performed on the target disk.
  • as shown in FIG. 6B, the first member disk (disk 4) fails.
  • an unreadable area is detected on the second member disks (disks 1-3): area j of disk 1 is unreadable and corresponds to area m of the first member disk.
  • area m is not reconstructed, that is, the data recovery calculation is not performed for it; the corresponding area n on the target disk is not marked as a bad block.
  • area j, area m, and area n can be identified by a block identifier and/or a stripe identifier.
  • the target disk is a temporary member disk of the RAID group.
  • RAID can be applied in a network environment where high speed servers and high speed storage devices are interconnected at high speed through a storage area network (SAN).
  • the high-speed storage device can be a RAID-based storage device or system, which makes physical long-distance storage easy and convenient, and improves data reliability and security.
  • it can be applied to enterprises that require high data security and storage performance.
  • the target disk can provide service access for other devices, thereby realizing rapid backup of data and quickly restoring the user's services.
  • in step S305, after the failure of the first member disk is recovered, the disk reconstruction process is completed by using the restored first member disk. Step S305 can be implemented in multiple manners. For example, after the failure of the first member disk is recovered, the target disk is replaced with the restored first member disk as a member disk of the RAID group to complete the disk reconstruction process.
  • after the disk reconstruction process is completed, the record of the target disk as a member disk of the RAID group is deleted or deactivated; correspondingly, the record of the first member disk as a member of the RAID group is restored, that is, the first member disk is re-activated. If a new record was added to the RAID group member disk table and the RAID group area mapping table in step S304, the newly added record is deleted in step S305 and the record of the original first member disk is re-activated. If a temporary RAID group member disk table and a temporary RAID group area mapping table were created in step S304, they are deleted in step S305, and the original RAID group member disk table and the original RAID group area mapping table are used again.
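  • a sketch of this record swap is shown below; the table layout (a dict of member-disk records with an active flag) is our assumption, purely for illustration.

```python
def switch_to_target(member_table, failed_disk, target_disk):
    """Step S304: deactivate the failed member's record and add the
    target disk as a (possibly temporary) member record."""
    member_table[failed_disk]["active"] = False
    member_table[target_disk] = {"active": True, "temporary": True}

def switch_back(member_table, failed_disk, target_disk):
    """Step S305: delete the target disk's record and re-activate the
    record of the original first member disk."""
    del member_table[target_disk]
    member_table[failed_disk]["active"] = True

table = {"disk4": {"active": True, "temporary": False}}
switch_to_target(table, "disk4", "disk5")
switch_back(table, "disk4", "disk5")
assert table == {"disk4": {"active": True, "temporary": False}}
```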
  • in step S303, if the second member disk has an unreadable area, the recoverable data includes the data of the first area of the first member disk, where the area of the second member disk corresponding to the first area is readable.
  • after the failure of the first member disk is recovered, the data of the second area of the restored first member disk is stored to the target disk, where the second area corresponds to the unreadable area of the second member disk.
  • for example, the data of area m of the restored first member disk is stored to the corresponding area n of the target disk.
  • the operation of determining whether the failure of the first member disk has been recovered can be found in the description above.
  • the target disk is then replaced with the restored first member disk. This prevents the data of the first member disk from being lost due to the unreadable area of the second member disk during the reconstruction, and ensures the integrity of the data of the first member disk after its failure is recovered, thereby ensuring the data integrity and security of the RAID group.
  • replacing the target disk with the restored first member disk can maintain the original data processing mode of the RAID group and restore the data storage state before the failure of the RAID group.
  • the second member disk may have an unreadable area, and the area on the target disk corresponding to the unreadable area may be marked as a bad block, may remain idle with no data written, or may be filled with a fixed value for system identification.
  • as shown in FIG. 6C, the first member disk (disk 4) fails;
  • area j of disk 1 is unreadable and corresponds to area m of the first member disk;
  • bad-block marking is performed on the corresponding area n of the target disk.
  • area j, area m, and area n can be identified by a block identifier and/or a stripe identifier.
  • area n can also be padded with a fixed value or left idle.
  • step S306 may include the following sub-steps:
  • step S306a Determine whether the second member disk has an unreadable area. If the second member disk has an unreadable area, go to step S306b. If the second member disk does not have an unreadable area, go to step S306c.
  • S306b: recover the data of the first area of the first member disk according to the data of the second member disk, and store the recovered data to the target disk, where the area on the target disk corresponding to the second area of the first member disk is marked as a bad block and no data is written to it, the area of the second member disk corresponding to the first area is readable, and the second area corresponds to the unreadable area of the second member disk.
  • S306c Restore data of the first member disk according to data of the second member disk, and store the restored data to the target disk.
  • the first member disk is removed from the RAID group: the record of the first member disk as a member disk of the RAID group is deleted, and the RAID group area mapping table is refreshed with the information of the target disk. This prevents subsequent failures caused by inaccurate member information.
  • when the failure of the first member disk is recoverable, the RAID group can maintain its original data storage state and no RAID group data is lost; when the failure of the first member disk is unrecoverable, the data of the first area of the first member disk is recovered according to the readable areas of the second member disk, and at most the data of the second area of the first member disk is lost, where the area of the second member disk corresponding to the first area is readable and the second area corresponds to the unreadable area of the second member disk.
  • this state-aware, intelligent reconstruction of the member disks in the RAID group balances the data integrity of the RAID group against the time wasted on unreadable areas. It can reduce, or even eliminate, the loss of data after the RAID group is reconstructed, and can quickly restore the user's services.
  • FIG. 4 is a flow chart of a method according to another embodiment of the present invention.
  • the method shown in FIG. 4 is similar to that of FIG. 3, and the main difference is that different reconstruction methods are selected according to whether the failure of the first member disk is recoverable and whether the second member disk has an unreadable region:
  • the first reconfiguration method is applied to the case where the failure of the first member disk is recoverable and the second member disk has an unreadable area.
  • the processing of the first reconstruction mode includes: in step S403, recovering the data of the first member disk according to the data of the second member disk, and storing the recovered data to the target disk; in step S404, before the failure of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as the member disk of the RAID group.
  • in step S405, after the failure of the first member disk is recovered, the disk reconstruction process is completed with the restored first member disk.
  • for example, in step S405, after the failure of the first member disk is recovered, the target disk can be replaced with the restored first member disk as the member disk of the RAID group to complete the disk reconstruction process.
  • the data recovered in step S403 includes the data of the first area of the first member disk, where the area of the second member disk corresponding to the first area is readable.
  • alternatively, after the failure of the first member disk is recovered, the data of the second area of the restored first member disk may be stored to the target disk to complete the disk reconstruction process, where the second area corresponds to the unreadable area of the second member disk.
  • the second reconstruction mode is applied to the case where the failure of the first member disk is unrecoverable and the second member disk has an unreadable area.
  • the processing of the second reconstruction mode includes: in step S406b, recovering the data of the first area of the first member disk according to the data of the second member disk, and storing the recovered data to the target disk, where the area on the target disk corresponding to the second area of the first member disk is marked as a bad block or is not written with data, the area of the second member disk corresponding to the first area is readable, and the second area corresponds to the unreadable area of the second member disk.
  • in step S407, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group.
  • the third reconstruction method is applied to the case where the second member disk does not have an unreadable area.
  • the processing of the third reconstruction method includes: Step S406c: recovering data of the first member disk according to data of the second member disk, and storing the restored data to the target disk.
  • Step S407 the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group.
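  • the selection among the three modes can be expressed compactly; the sketch below follows the branches of FIG. 4 and reuses the hypothetical helpers from the earlier sketches (reconstruct, is_recoverable) plus equally hypothetical ones (rebuild_all, replace_member, wait_for_recovery, mark_bad_blocks).

```python
def choose_and_reconstruct(raid, failed, target, cause, unreadable_log):
    """Dispatch among the three reconstruction modes of FIG. 4."""
    peers = raid.surviving(failed)
    if not any(unreadable_log.get(d) for d in peers):
        # Third mode (S406c): full rebuild; target permanently replaces disk.
        rebuild_all(raid, failed, target)
        replace_member(raid, old=failed, new=target)
    elif is_recoverable(cause):
        # First mode (S403-S405): partial rebuild onto a temporary target,
        # then switch back once the original disk's failure is recovered.
        reconstruct(peers, raid.regions(), target)
        replace_member(raid, old=failed, new=target)   # temporary member
        wait_for_recovery(failed)
        replace_member(raid, old=target, new=failed)   # switch back
    else:
        # Second mode (S406b): partial rebuild; the second area is marked
        # as bad blocks; the target permanently replaces the failed disk.
        deferred = reconstruct(peers, raid.regions(), target)
        mark_bad_blocks(target, deferred)
        replace_member(raid, old=failed, new=target)
```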
  • FIG. 5 is a flowchart of a method according to another embodiment of the present invention, and the method shown in FIG. 5 is similar to FIG. 4.
  • the specific processing includes:
  • in step S501, determine whether the second member disk has an unreadable area. If there is an unreadable area, go to step S502; if there is no unreadable area, go to step S506c.
  • the second member disk is a member disk other than the first member disk that failed in the RAID group.
  • in step S502, determine whether the fault of the first member disk is recoverable. If the fault is recoverable, go to step S503; if the fault is not recoverable, go to step S506b.
  • the disk reconstruction process is completed by using the restored first member disk.
  • the target disk can be replaced with the restored first member disk as a member disk of the RAID group to complete the disk reconstruction process.
  • the data of the second area of the restored first member disk can be stored to the corresponding area of the target disk to complete the disk reconstruction process.
  • the second area corresponds to an unreadable area of the second member disk.
  • S506b: recover the data of the first area of the first member disk according to the data of the second member disk, and store the recovered data to the target disk, where the area on the target disk corresponding to the second area of the first member disk is marked as bad blocks or is not written with data, and the area of the second member disk corresponding to the first area is readable.
  • S506c recover data of the first member disk according to data of the second member disk, and store the restored data to the target disk. After the operation of S506c is completed, step S507 is performed.
  • FIG. 7 shows an apparatus 700 for implementing disk reconstruction according to an embodiment of the present invention.
  • the device 700 is coupled to one or more RAID groups, each of which contains a plurality of member disks.
  • Apparatus 700 includes:
  • the data obtaining unit 703 is configured to recover data of the first member disk according to data of the second member disk other than the first member disk in the RAID group.
  • the write processing unit 704 is configured to write the data recovered by the data acquisition unit 703 to the target disk.
  • the reconstruction control unit 702 is configured to: before the failure of the first member disk is recovered, switch the member disk of the RAID group from the first member disk to the target disk containing the recovered data; and after the failure of the first member disk is recovered, complete the reconstruction processing according to the restored first member disk.
  • for example, the reconstruction control unit 702 can switch the member disk of the RAID group from the target disk containing the recovered data to the restored first member disk after the failure of the first member disk is recovered.
  • the reconstruction control unit 702 may also instruct the data acquisition unit 703 to acquire the data of the second area of the first member disk after the failure of the first member disk is recovered, and instruct the write processing unit 704 to store the data of the second area to the corresponding area of the target disk.
  • the second area corresponds to an unreadable area of the second member disk.
  • upon determining that the failure of the first member disk is recoverable, the reconstruction control unit 702 selects the first reconstruction mode, in which the member disk of the RAID group is switched from the target disk containing the recovered data to the restored first member disk after the failure of the first member disk is recovered.
  • in the case where the failure of the first member disk is unrecoverable, the reconstruction control unit 702 selects the second reconstruction mode, in which the first member disk is removed from the RAID group after the reconstruction from the first member disk to the target disk is completed.
  • the reconstruction control unit 702 can also perform reconfiguration selection in combination with the failure recoverability detection result of the first member disk and the detection result of whether the second member disk has an unreadable area.
  • the reconstruction control unit 702 can include a fault state management unit 7022, which is used to implement fault state management of the first member disk.
  • the fault state management unit 7022 can obtain information such as the cause of the failure of the first member disk and whether the fault has been recovered, where the fault cause may include one or more of the disk not being in position, a failure of the disk medium, and the like.
  • the reconfiguration control unit 702 may further include a disk readability detecting unit 7021, configured to obtain a readability detection result of the RAID group member disk, including obtaining a readability detection result of the second member disk.
  • the reconstruction control unit 702 can include a disk management unit 7023 for managing the RAID group member disks, covering one or more of maintaining member disk information, managing member disk changes or switching, managing readable and/or unreadable disk areas, and selecting the disk reconstruction mode.
  • the reconstruction control unit 702 may further include a processing unit 7024, which is responsible for controlling the data acquisition unit 703 and the write processing unit 704 according to information or instructions from one or more units such as the fault state management unit 7022, the disk readability detecting unit 7021, and the disk management unit 7023.
  • the processing unit 7024 controls the data recovery processing of the data acquisition unit 703 according to the readability detection result of the second member disk, so that the data acquisition unit 703 recovers the data of the first area of the first member disk according to the data of the readable area of the second member disk,
  • where the area of the second member disk corresponding to the first area is readable, and the data acquisition unit 703 does not reconstruct the second area of the first member disk, where the second area corresponds to the unreadable area of the second member disk.
  • the processing unit 7024 may also instruct the write processing unit 704, according to the disk readability detection result, not to write data to, or to mark bad blocks in, the area on the target disk corresponding to the second area. The processing unit 7024 can perform one or more operations of the foregoing method flows as needed.
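  • as a structural illustration of how unit 7024 coordinates the other units, consider the following sketch; the class and method names are hypothetical, chosen only to mirror the unit numbering of FIG. 7.

```python
class ProcessingUnit:
    """Mirrors unit 7024: consults 7021/7022/7023 and drives 703/704."""

    def __init__(self, readability, fault_state, disk_mgmt, acquire, write):
        self.readability = readability  # disk readability detecting unit 7021
        self.fault_state = fault_state  # fault state management unit 7022
        self.disk_mgmt = disk_mgmt      # disk management unit 7023
        self.acquire = acquire          # data acquisition unit 703
        self.write = write              # write processing unit 704

    def run(self, failed_disk, target_disk):
        first, second = self.readability.split_areas(failed_disk)
        for region in first:            # rebuild only the first area
            data = self.acquire.recover(region)
            self.write.store(target_disk, region, data)
        if not self.fault_state.recoverable(failed_disk):
            self.write.mark_bad_blocks(target_disk, second)
        self.disk_mgmt.switch_member(failed_disk, target_disk)
```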
  • an apparatus 800 is provided according to an embodiment of the present invention, including:
  • the reconstruction mode selection unit 801 is configured to determine whether the failure of the first member disk in the RAID group is recoverable; the soft reconstruction mode is selected if the failure of the first member disk is recoverable, and the hard reconstruction mode is selected if the failure of the first member disk is unrecoverable.
  • a soft reconstruction unit 802 is configured to, in the soft reconstruction mode, recover the data of the first member disk according to the data of the second member disks in the RAID group other than the first member disk, and store the recovered data to the target disk, where, before the failure of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as the member disk of the RAID group, and after the failure of the first member disk is recovered, the target disk is replaced with the restored first member disk as the member disk of the RAID group.
  • if the second member disk has an unreadable area, only the data of the first area of the first member disk may be recovered, and the second area of the first member disk is neither reconstructed nor data-recovered; the area of the second member disk corresponding to the first area is readable, and the second area corresponds to the unreadable area of the second member disk.
  • the hard reconstruction unit 805 is configured to, in the hard reconstruction mode, recover the data of the first member disk according to the data of the second member disks in the RAID group other than the first member disk and store the recovered data to the target disk, where, after the storage of the recovered data is completed, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group, and the first member disk is removed from the RAID group to complete the disk reconstruction; even after the failure of the first member disk is recovered, the target disk is maintained as a member disk of the RAID group. If the second member disk has an unreadable area, only the data of the first area of the first member disk may be recovered, and the second area of the first member disk is neither reconstructed nor data-recovered; the area of the second member disk corresponding to the first area is readable, and the second area corresponds to the unreadable area of the second member disk.
  • the soft reconstruction unit 802 can include:
  • the data obtaining unit 8021 is configured to recover data of the first member disk according to data of the second member disk.
  • for example, the data acquisition unit 8021 can obtain the data of the first area of the first member disk according to the data of the readable area of the second member disk, where the area of the second member disk corresponding to the first area is readable.
  • the data obtained by the data acquisition unit 8021 can be buffered into a buffer coupled to the device 800, which can be externally connected to the device 800 or integrated into the device 800.
  • the write processing unit 8023 is configured to control the storage of the recovered data to the corresponding area of the target disk.
  • the reconstruction control unit 8024 is configured to, after the storage of all recoverable data of the first member disk in the target disk is completed, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group; and after the failure of the first member disk is recovered, replace the target disk containing the recovered data with the restored first member disk as a member disk of the RAID group.
  • the hard reconstruction unit 805 can include:
  • the data obtaining unit 8051 is configured to recover data of the first member disk according to the data of the second member disk.
  • the data can be recovered by redundancy calculation using the distribution of the data blocks and check blocks on the second member disks. If the second member disk has an unreadable area, the data acquisition unit 8051 may obtain the data of the first area of the first member disk according to the data of the readable area of the second member disk, where the area of the second member disk corresponding to the first area is readable; no data recovery processing is performed for the second area of the first member disk, where the second area corresponds to the unreadable area of the second member disk.
  • the write processing unit 8053 is configured to store the data recovered by the data obtaining unit 8051 to the corresponding area of the target disk.
  • the hard reconstruction unit 805 may include a bad block marking unit 8052, configured to mark the area of the target disk corresponding to the unreadable area of the second member disk as a bad block when the second member disk has an unreadable area; no data is written to the marked bad blocks of the target disk.
  • alternatively, the hard reconstruction unit 805 may omit the bad block marking unit 8052 and instead write a fixed value, as the data of the second area of the first member disk, to the area of the target disk corresponding to the second area; of course, the area corresponding to the second area may also be left with no data written.
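  • the three treatments of the target disk's second-area space (bad-block marking, fixed-value fill, or leaving it idle) can be sketched as follows; the constants and method names are our own illustration.

```python
FILL_VALUE = b"\x00"  # assumed fixed pattern used for system identification

def handle_second_area(target, regions, policy):
    """Apply one of the three documented treatments to the target-disk
    regions that mirror the second area of the failed member disk."""
    for region in regions:
        if policy == "mark_bad":
            target.mark_bad_block(region)   # bad block marking unit 8052
        elif policy == "fill":
            target.write(region, FILL_VALUE * target.region_size)
        elif policy == "idle":
            pass                            # reserve the space, write nothing
```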
  • after the storage of all recoverable data of the first member disk in the target disk is completed, the reconstruction control unit 8054 can add the target disk to the RAID group as a member disk and remove the first member disk from the RAID group.
  • the reconstruction control unit 8054 can update the RAID group disk member information accordingly.
  • FIG. 9A and FIG. 9B are schematic diagrams showing the architectures of application systems 900a and 900b according to embodiments of the present invention.
  • the system 900a includes one or more storage devices or storage systems, such as the device 910a, which is connected through a Fibre Channel (FC) network or an Internet Protocol (IP) network 905 to network devices such as the hosts 901a and 901b and the database 902.
  • Fig. 9A schematically shows two hosts 901a and 901b.
  • Actual systems may include more similar hosts, which may be servers that perform various functions, such as a web server, a file server, a service server, and the like.
  • the database 902 provides a content index, access address information, user information, and the like of the stored file.
  • the device 910a can implement the disk reconstruction function, and the components for providing the disk reconstruction function mainly include:
  • the storage controller 912a is connected to a plurality of disks through one or more disk adapters 913a. At least some of the disks can form RAID groups, such as RAID groups 915a and 915b; at least some of the disks serve as free disks, such as disks 916a and 916b, which can be used as target disks for disk reconstruction.
  • RAID group can be a software-based RAID group or a hardware-based RAID group.
  • the disk adapter 913a is an interface between the redundant array of independent disks and the target disk. It provides input and output adaptation and can act as an intermediary between the RAID groups and other components such as the storage controller 912a and the buffers.
  • the storage controller 912a is coupled to the RAID group and is used to complete the control of the RAID group.
  • the control operations that can be completed include the disk reconfiguration control operation.
  • the storage controller 912a may perform the following operations: when the first member disk in the RAID group fails, recovering the data of the first member disk according to the second member disks in the RAID group other than the first member disk, and storing the recovered data to the target disk; before the failure of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as the member disk of the RAID group; and after the failure of the first member disk is recovered, performing the reconstruction processing according to the restored first member disk.
  • the embodiment of the present invention provides at least two reconstruction modes. The reconstruction mode in which, after the failure of the first member disk is recovered, the member disk is switched from the target disk back to the restored first member disk is called the first reconstruction mode; the reconstruction mode in which there is no need to switch from the target disk back to the restored first member disk is called the second reconstruction mode. Of course, embodiments may also include other reconstruction modes.
  • in the first reconstruction mode, the storage controller 912a replaces the target disk containing the recovered data with the restored first member disk as a member disk of the RAID group after the failure of the first member disk is recovered.
  • alternatively, the storage controller 912a may store the data of the second area of the restored first member disk to the corresponding area of the target disk after the failure of the first member disk is recovered, where the second area corresponds to the unreadable area of the second member disk. This operation can be performed when the second member disk has an unreadable area.
  • the first reconstruction mode or the second reconstruction mode may be selected by the storage controller 912a.
  • the storage controller 912a may select based on the judgment of whether the failure of the first member disk is recoverable: if the failure is recoverable, the first reconstruction mode is selected, in which the member disk needs to be switched back to the restored first member disk; if the failure is unrecoverable, the second reconstruction mode is selected, in which there is no need to switch back to the restored first member disk.
  • the storage controller 912a may obtain at least part of information such as disk member information, disk area information, information indicating whether the disk state is recoverable, information indicating a disk readability detection result, and information or rules indicating the distribution of data blocks and check blocks, for use in the disk reconstruction control process.
  • the storage controller 912a may obtain the above information from a memory included in itself, or may obtain the above information from a memory connected thereto.
  • the storage controller 912a may determine whether the failure of the failed first member disk is recoverable based on information indicating whether the disk state is recoverable.
  • the storage controller 912a may determine whether the second member disk has an unreadable area based on the information indicating the result of the disk readability detection.
  • the storage controller 912a may restore the data of the first member disk based on the data of the second member disk in the RAID group, and store the restored data to the target disk.
  • Disk area information can include disk stripe or disk block information.
  • Data recovery processing can be based on information or rules indicating the distribution of data blocks and check blocks.
  • the storage controller 912a is typically a RAID controller that performs the various disk reconstruction methods described above by executing a program stored in a computer readable storage medium.
  • the device 910a may further include the following components:
  • one or more communication adapters, such as communication adapters 911a and 911b, act as network adapters for the FC network or the IP network and communicate with network devices, such as the hosts 901a and 901b and the database 902, via the FC network or the IP network.
  • one or more buffers may be used to cache data between the RAID groups (RAID group 915a or 915b) and the free disks (disk 916a or 916b), or to cache data between the RAID groups and other network devices.
  • the management controller 917 can manage the device 910a through the user's management interface.
  • the memory 918 can be responsible for storing the system parameters of the device 910a.
  • the bus bridge 919a can provide a series of data buses and control buses to implement the interaction of data and control commands between components.
  • the bus bridge 919a may also include a power bus that powers the components through the power bus.
  • the storage controller 912a can implement control over part or all of the processing steps of the methods illustrated in FIGS. 1 to 3.
  • the system 900b shown in FIG. 9B is similar to the system 900a and includes the device 910b, a storage device or storage system similar to the device 910a, which is coupled to network devices such as the hosts 901a and 901b and the database 902 via an FC network 905a and/or an IP network 905b.
  • Device 910b includes the following components:
  • the storage controller 912b is connected to a plurality of disks through one or more disk adapters 913b. At least some of the disks may form RAID groups, such as RAID groups 915a and 915b; at least some of the disks serve as free disks, such as disks 916a and 916b, which can be used as target disks for disk reconstruction.
  • the RAID group can be a software-based RAID group or a hardware-based RAID group.
  • the disk adapter 913b is an interface between the redundant array of independent disks and the target disk. It provides input and output adaptation and can act as an intermediary between the RAID groups and the storage controller 912b.
  • the basic functions of the storage controller 912b are the same as those of the storage controller 912a of FIG. 9A and are not described again.
  • At least one of the following components may also be included:
  • one or more buffers may be used to cache data between the RAID groups (RAID group 915a or 915b) and the free disks (disk 916a or 916b), or to cache data between the RAID groups and other network devices.
  • the management controller 917 can manage the device 910b through the user's management interface.
  • One or more memories such as memories 918a and 918b, are coupled to memory controller 912b.
  • the memory 918a is primarily responsible for storing the system parameters of the device 910b, and the memory 918b can provide the information required for RAID group control, as described above for the storage controller 912a.
  • the bus bridge 919b can provide a series of data buses and control buses to implement the interaction of data and control commands between components.
  • the bus bridge 919b can also include a power bus that powers the components through the power bus.
  • the storage controller 912b and the disk adapter 913b may be integrated to form a disk reconstruction device 920.
  • Disk reconstruction device 920 can also integrate memory 918b.
  • the memory 918b can be used to store at least part of information such as disk member information, disk area information, information indicating whether the disk status is recoverable, information indicating the disk readability detection result, and the like.
  • the disk member information is stored in the form of a RAID group member disk table
  • the disk area information is stored in a RAID group area mapping table.
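  • for illustration, these two tables might look like the sketch below; the field names are hypothetical, since the patent does not specify a table schema.

```python
# RAID group member disk table: one record per member disk (assumed schema).
member_disk_table = [
    {"raid_group": 600, "disk_id": 601, "active": True},
    {"raid_group": 600, "disk_id": 604, "active": False},  # failed member
    {"raid_group": 600, "disk_id": 605, "active": True},   # target disk
]

# RAID group area mapping table: per stripe, which disk holds which block,
# including the block that stores the check data (P1, P2, ...).
area_mapping_table = [
    {"stripe": 1, "blocks": {601: "D1", 602: "D2", 603: "D3", 605: "P1"}},
    {"stripe": 2, "blocks": {601: "D4", 602: "D5", 603: "D6", 605: "P2"}},
]
```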
  • the memory 918b can store a computer readable program implementing at least one of the methods of the present invention, so that one or more processors (not shown) in the device 910b can execute the computer readable program to perform disk reconstruction; the processors may be integrated inside the device 920 or connected to the device 920 through an interface.
  • One or more communication adapters such as communication adapters 911a and 911b.
  • the communication adapters 911a and 911b are an FC communication adapter and an IP communication adapter, respectively.
  • the disclosed method and apparatus may be implemented in other manners.
  • the device embodiments described above are merely illustrative; for example, the division of units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • each functional unit may be integrated into one processing unit, or each unit may be physically included separately, or two or more units may be integrated into one unit.
  • the above units may be implemented in the form of hardware or in the form of hardware plus software functional units.
  • all or part of the steps of the above method embodiments may be implemented by program instructions instructing the relevant hardware.
  • the foregoing program may be stored in a computer readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed;
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A disk reconstruction method is provided. The method includes: when a first member disk in a redundant array of independent disks (RAID) group fails, recovering the data of the first member disk from the data of a second member disk in the RAID group other than the first member disk, and storing the recovered data to a target disk; before the fault of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as a member disk of the RAID group (S105); and after the fault of the first member disk is recovered, performing corresponding reconstruction processing according to the recovered first member disk (S107). A disk reconstruction apparatus is also provided. The method and apparatus can reduce data loss and quickly restore user services.

Description

Disk Reconstruction Method and Apparatus Therefor

Technical Field

The present invention relates to the field of storage, and in particular to a disk reconstruction method and an apparatus therefor.

Background

A redundant array of independent disks (RAID), formerly a redundant array of inexpensive disks (RAID), is a disk group or hard disk group formed by combining multiple independent disks or hard disks; it may also be called a logical hard disk. The multiple disks in a disk group are member disks of one another.

RAID technology is one of the most widely used technologies in the storage field. It virtualizes multiple disks or hard disks into a single large-capacity disk or hard disk, can speed up overall storage through parallel reads and writes, and can use redundancy and error correction to provide a degree of fault tolerance, thereby offering higher storage performance and data backup capability than a single disk or hard disk of the same capacity.

In the prior art, after a disk fails, the content of the failed disk can be recovered from the content on the remaining disks in the disk group, and the recovered content is written to a free disk; this process is called disk reconstruction.

During disk reconstruction, if an unreadable region is encountered on a remaining disk, the free region on the free disk corresponding to the unreadable region is marked as a bad block, and reconstruction continues using the other regions of the remaining disks; after reconstruction completes, the failed disk is replaced with the free disk and put into service. With this existing approach, however, part of the data on the failed disk is lost; for example, the data in the region of the failed disk corresponding to the unreadable region is lost, so the reconstructed disk data is incomplete.

Summary

Embodiments of the present invention provide a disk reconstruction method and apparatus that can reduce data loss after disk reconstruction.
In a first aspect, a disk reconstruction method is provided. When a first member disk in a RAID group fails, the method includes:

recovering the data of the first member disk from the data of a second member disk in the RAID group other than the first member disk, and storing the recovered data to a target disk;

before the fault of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as a member disk of the RAID group;

after the fault of the first member disk is recovered, performing corresponding reconstruction processing according to the recovered first member disk.

In a first possible implementation, with reference to the first aspect, the target disk containing the recovered data is replaced with the recovered first member disk as a member disk of the RAID group.

In a second possible implementation, with reference to the first aspect, the method further includes: recovering the data of a first region of the first member disk from the data of the readable regions of the second member disk in the RAID group other than the first member disk, and storing the recovered data to the target disk, where no data is written to the region on the target disk corresponding to a second region of the first member disk; the region of the second member disk corresponding to the first region is readable, and the second region corresponds to an unreadable region of the second member disk. Further, after the fault of the first member disk is recovered, the data of the second region of the recovered first member disk can be stored to the target disk.
In a second aspect, an apparatus for implementing disk reconstruction is provided, including:

a data acquisition unit, configured to recover the data of a first member disk from the data of a second member disk in a RAID group other than the first member disk, the first member disk being a failed member disk in the RAID group;

a write processing unit, configured to write the data recovered by the data acquisition unit to a target disk; and

a reconstruction control unit, configured to switch the member disk of the RAID group from the first member disk to the target disk containing the recovered data before the fault of the first member disk is recovered, and to complete reconstruction processing according to the recovered first member disk after the fault of the first member disk is recovered.

In a first possible implementation, with reference to the second aspect, after the fault of the first member disk is recovered, the reconstruction control unit switches the member disk of the RAID group from the target disk containing the recovered data to the recovered first member disk.

In a second possible implementation, with reference to the second aspect, when judging that the fault of the first member disk is recoverable, the reconstruction control unit selects the reconstruction mode in which, after the fault of the first member disk is recovered, the member disk of the RAID group is switched from the target disk containing the recovered data to the recovered first member disk.

In a third aspect, a storage apparatus is provided, including: the apparatus for implementing disk reconstruction according to the second aspect or any of its possible implementations, and one or more RAID groups and/or target disks coupled to the apparatus for implementing disk reconstruction.
In a fourth aspect, a disk reconstruction apparatus is provided, including:

a disk adapter, serving as the interface between a redundant array of independent disks (RAID) group and a target disk; and

a storage controller, configured to judge whether the fault of a first member disk is recoverable, process in a first reconstruction mode if the fault of the first member disk is recoverable, and process in a second reconstruction mode if the fault of the first member disk is unrecoverable;

where, in the first reconstruction mode, the data of the first member disk is recovered from the data of a second member disk in the RAID group other than the first member disk and the recovered data is stored to the target disk; before the fault of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group; and after the fault of the first member disk is recovered, reconstruction processing is completed according to the recovered first member disk;

where, in the second reconstruction mode, the data of the first member disk is recovered from the data of the second member disk and the recovered data is stored to the target disk; after the reconstruction from the first member disk to the target disk completes, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group, and the first member disk is removed from the RAID group to complete the disk reconstruction.

In a second possible implementation, with reference to the fourth aspect, the operation of completing reconstruction processing according to the recovered first member disk includes: replacing the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group to complete the reconstruction processing.

In a third possible implementation, with reference to the fourth aspect, the operation of completing reconstruction processing according to the recovered first member disk includes: storing the data of the region of the recovered first member disk corresponding to the unreadable region of the second member disk to the target disk to complete the reconstruction processing.

In a fifth aspect, a storage apparatus is provided, including: the disk reconstruction apparatus according to the fourth aspect or any of its possible implementations, and one or more RAID groups and/or target disks coupled to the disk reconstruction apparatus.
The methods and apparatuses provided by the embodiments of the present invention can reduce, or even eliminate, data loss after disk reconstruction, and can quickly restore user services.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are introduced briefly below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a disk reconstruction method according to an embodiment of the present invention;

FIG. 2A is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention; FIG. 2B is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention;

FIG. 3 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention;

FIG. 4 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention;

FIG. 5 is a schematic flowchart of a disk reconstruction method according to another embodiment of the present invention;

FIG. 6A is a schematic diagram of data storage in a RAID group according to an embodiment of the present invention;

FIG. 6B is a schematic diagram of disk reconstruction according to another embodiment of the present invention;

FIG. 6C is a schematic diagram of disk reconstruction according to another embodiment of the present invention;

FIG. 7 is a schematic diagram of an apparatus according to yet another embodiment of the present invention;

FIG. 8 is a schematic diagram of an apparatus according to yet another embodiment of the present invention;

FIG. 9A is a block diagram of an application system according to an embodiment of the present invention;

FIG. 9B is a block diagram of an application system according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The terms "disk" and "hard disk" in the embodiments of the present invention have essentially the same meaning. A disk is a device that performs read and write functions magnetically; it can be a non-volatile storage medium, so files saved on it are not lost when power is removed. A hard disk packages the disk platters in a rigid metal case for better protection.

The disk reconstruction in the embodiments of the present invention is the rebuilding or recovery of the data on a disk. The recovered data can be written to a target disk, which may be a designated backup disk or any available free disk. The disk reconstruction methods and apparatuses provided by the embodiments of the present invention are applicable to a disk group containing multiple member disks, for example a RAID group. The disk group stores, in distributed fashion, an integer number of data blocks and an integer number of parity data formed from these data blocks. In the disk group, if a disk needs to be reconstructed, its data can be recovered from the data of the remaining disks in the group, thereby implementing disk reconstruction.

A "disk group" in the embodiments of the present invention, also called a disk array, may be a software-based array or a hardware-based array. In particular, a software array is implemented by a software program running on the central processing unit (CPU) of the computer. For example, a software-based array uses the disk management function of the network operating system to configure multiple hard disks attached to a Small Computer System Interface (SCSI) card into logical disks that form the array. A software-based array can provide data redundancy. A hardware-based array is implemented with a dedicated disk array card and can provide functions such as online capacity expansion, dynamic modification of the array level, automatic data recovery, drive roaming, and high-speed caching. It delivers performance, data protection, reliability, availability, and manageability. The array card operates with its own dedicated processing unit, so its performance is far higher than that of a conventional non-array hard disk, and it is safer and more stable.

The disk groups or disk arrays in the embodiments of the present invention may use RAID technology, which may be software-based or hardware-based. The embodiments are applicable to various RAID configurations, identified by RAID level, for example RAID-0, RAID-1, RAID-1E, RAID-5, RAID-6, RAID-7, RAID-10, and RAID-50. Different RAID levels meet different performance and safety needs. The number of disks and the storage layout required by each RAID level are publicly known and are not described again.
In a RAID group, each member disk contains an equal number of blocks, and the aligned blocks spanning all member disks of the RAID group are called a stripe. FIG. 6A shows one form of RAID group data storage. In FIG. 6A, the RAID group is divided into N stripes, each stripe corresponding to 4 blocks: 3 blocks store the data of 3 data blocks, and 1 block stores the parity data of the 3 data blocks of that stripe. The number of data bits or bytes carried by each data block can be set according to the storage device or system, through a local or remote control interface. As shown in FIG. 6A, a group of data blocks D1, D2, ..., D3N+3 and the parity data P1, P2, ..., PN formed from these data blocks are stored in distributed fashion across the member disks 601-604 of the RAID group 600. Note that the number of member disks in a RAID group is not limited to the 4 shown; it can be determined according to the basic requirements of the RAID level and customer needs. The data storage layout in the embodiments of the present invention is not limited to that shown in FIG. 6A and may include the storage layouts of the various existing RAID levels.

In an embodiment of the present invention, when the first member disk in the RAID group fails, for example member disk 604, in step S101 it is determined that a second member disk in the RAID group other than the first member disk has an unreadable region. In step S103, the data of the first member disk can be recovered from the data of the second member disks 601-603 other than member disk 604, and the recovered data is stored to the target disk 605. In step S105, before the fault of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group. In step S107, after the fault of the first member disk is recovered, the corresponding reconstruction processing is completed with the recovered first member disk.
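The parity relationship that makes this recovery possible can be illustrated with a short sketch. This is a minimal single-parity (RAID-5-style) model, not the implementation of the present embodiments; the block size and function names are assumptions chosen for illustration only.

from functools import reduce

BLOCK_SIZE = 4  # bytes per block; real systems use much larger blocks

def xor_blocks(blocks):
    """XOR byte strings of equal length together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def recover_lost_block(surviving_blocks):
    """In single-parity RAID, any one lost block of a stripe equals
    the XOR of all surviving blocks of that stripe (data and parity alike)."""
    return xor_blocks(surviving_blocks)

# One stripe across 4 member disks: 3 data blocks plus 1 parity block.
d1, d2, d3 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d1, d2, d3])

# The disk holding d3 in this stripe fails; rebuild its block from the rest.
rebuilt = recover_lost_block([d1, d2, parity])
assert rebuilt == d3

The same recovery runs once per stripe; a whole-disk rebuild is simply this computation repeated over all N stripes of the group.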
If the second member disk is found to have an unreadable region, the reconstruction operation is performed for a first region of the first member disk and may be skipped for a second region of the first member disk. The region of the second member disk corresponding to the first region is readable; the second region corresponds to the unreadable region of the second member disk. The reconstruction operation here includes data recovery processing. Disk space can be reserved on the target disk for the second region, so that the block correspondence between the target disk and the second member disk stays consistent with the block correspondence between the first member disk and the second member disk; that is, the block correspondence is not changed, which reduces the storage and data-processing complexity when the target disk is used as a member disk. No bad-block marking need be done on the disk space reserved for the second region.

After the fault of the first member disk is recovered, the target disk can be replaced with the recovered first member disk as a member disk of the RAID group. With this approach, whether or not the second member disk has an unreadable region, the integrity and safety of the system data are guaranteed after fault recovery. In particular, when the second member disk does have an unreadable region, this approach effectively solves the problem of partial data loss on the target disk.

In an embodiment of the present invention, when the second member disk has an unreadable region, the target disk need not be replaced with the recovered first member disk after fault recovery. Instead, the data of the second region of the recovered first member disk, the region corresponding to the unreadable region of the second member disk, can be stored to the target disk. This achieves data integrity of the reconstructed disk and further reduces the time delay caused by disk switching.
FIG. 2A shows an implementation in which the system switches back to the original disk after fault recovery. The disk reconstruction method of FIG. 2A applies to a storage apparatus or storage system with a RAID group containing multiple member disks. Reconstructing a failed first member disk in the RAID group includes:

S201a. Determine that a second member disk in the RAID group other than the first member disk has an unreadable region.

S203a. Recover the data of a first region of the first member disk from the data of the second member disk, and store the recovered data to a target disk, where the region of the second member disk corresponding to the first region is readable.

S205a. Before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.

S207a. After the fault of the first member disk is recovered, replace the target disk with the recovered first member disk as a member disk of the RAID group.
In step S201a, whether the second member disk has an unreadable region can be determined from the state detection result of the second member disk; the state detection result indicates the readability detection result of stripes and/or blocks. In step S203a, the first region can be determined from the state detection result. The state detection result may contain records of readable regions and/or unreadable regions; knowing the readable regions is enough to determine the unreadable ones.

The state detection result can be identified by stripe identifiers and/or block identifiers. Specifically, stripe identifiers and block identifiers can also be expressed as storage addresses; for example, a stripe identifier can be a stripe number or a stripe address (the start and/or end address of the stripe), and a block identifier can be a block number or a block address (the start and/or end address of the block). The state detection result may be historical data stored in memory, such as a log containing records of the unreadable region, or it may be obtained by running a state detection on the second member disk.

The state detection result may be historical data obtained before the fault occurred, or data obtained by starting a disk state detection on the second member disk after the fault occurred. The scan can proceed region by region: if the corresponding regions of the second member disks are all readable, the corresponding region of the first member disk is recoverable. Alternatively, the unreadable regions of the second member disk are determined first, the second region of the first member disk is derived from them, and the remaining regions of the first member disk form the first region. Embodiments of the present invention may include an operation of judging whether the fault of the first member disk is recoverable, which can be implemented by detecting the cause of the disk fault. Causes of disk faults include one or more of disk offline, physical medium failure of the disk, and the like. For example, the slot state of the slot where the first member disk resides can be detected; a slot state detection is performed on the first member disk before step S201a, and if the result indicates that the first member disk is not present, the first member disk is judged recoverable.

Embodiments of the present invention may also include an operation of judging whether the fault of the first member disk has recovered, which can be implemented by checking one or more of the disk presence state, the identity information of the disk, the integrity of the disk's physical medium, and the like. During reconstruction, whether the first member disk is recoverable can be determined by checking the slot state of the first member disk, the identity of the first member disk, and so on; the judgment can further take the physical medium integrity of the first member disk into account. This effectively handles cases where a failed disk cannot be detected because it was pulled out by mistake or has a poor connection. For example, if disk diagnosis determines that a disk's fault is caused by a physical medium failure, the failed disk is judged unrecoverable. Typically, if no new disk has been inserted into the slot of a disk, the disk is likely to be re-inserted and can be judged recoverable.
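The recoverability judgment described above can be condensed into a small decision function. This is a sketch under stated assumptions: the FaultInfo fields and the fallback branch for a present-but-faulty disk are hypothetical stand-ins for whatever diagnostics the storage controller actually exposes.

from dataclasses import dataclass

@dataclass
class FaultInfo:
    slot_occupied_by_new_disk: bool  # a different disk was inserted into the slot
    disk_present: bool               # slot state detection: is the disk in place?
    medium_failed: bool              # disk diagnosis found a physical medium fault

def fault_is_recoverable(fault: FaultInfo) -> bool:
    """Mirror the judgment in the text: a physical medium failure or a
    replaced slot means unrecoverable; a disk that is merely absent
    (likely pulled out by mistake) is treated as recoverable."""
    if fault.medium_failed:
        return False
    if fault.slot_occupied_by_new_disk:
        return False
    if not fault.disk_present:
        return True   # probably pulled out; likely to be re-inserted
    return False      # present but faulty for another reason: assumed unrecoverable

print(fault_is_recoverable(FaultInfo(False, False, False)))  # True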
FIG. 2B shows an implementation in which there is no need to switch back to the original disk after fault recovery. The disk reconstruction method of FIG. 2B applies to a storage apparatus or storage system with a RAID group containing multiple member disks. When the first member disk in the RAID group fails, the reconstruction processing method includes:

S201b. Determine that a second member disk in the RAID group other than the first member disk has an unreadable region.

S203b. Recover the data of the first region of the first member disk using the data of the second member disk, and save the recovered data to the target disk, where the region of the second member disk corresponding to the first region of the first member disk is readable.

S205b. Before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.

S207b. After the fault of the first member disk is recovered, store the data of the second region of the recovered first member disk to the target disk, the second region corresponding to the unreadable region of the second member disk.

The operation of determining whether the second member disk has an unreadable region in step S201b can be the same as or similar to that of step S201a, and the operation of determining the first region in step S203b can be the same as or similar to that of step S203a; details are not repeated.
Another embodiment of the present invention provides a disk reconstruction method, applied to a storage apparatus or storage system with a RAID group containing multiple member disks. As shown in FIG. 3, when the first member disk in the RAID group fails, the reconstruction processing method includes:

S301. Judge whether the first member disk is recoverable. If it is recoverable, go to S303; if it is not recoverable, go to S306.

The judgment of whether the first member disk is recoverable can be implemented by detecting the cause of the disk fault, which may include one or more of disk offline, physical medium failure, and the like. For example, a slot state detection is performed on the first member disk; if the result indicates that the first member disk is not present, the first member disk is judged recoverable. If the first member disk was pulled out by mistake and no new disk has been inserted into its slot, the first member disk is likely to be re-inserted and can be judged recoverable. If the first member disk was pulled out and a new disk has been inserted into its slot, the first member disk is judged unrecoverable. If the fault of the first member disk is caused by a physical medium failure, the first member disk is judged unrecoverable.
S303. Recover the data of the first member disk from the data of the second member disks in the RAID group other than the first member disk, and store the recovered data to the target disk.

The target disk can be any available free disk. Before storing data, the disk space of the target disk can be divided into multiple regions that correspond to the regions of the first member disk, so that the recovered data of each region can be stored to the corresponding region of the target disk. Of course, region division is not required; for example, data can be stored in order according to the data block distribution rule.

Specifically, as shown in FIG. 6A, a group of consecutive data blocks and the parity blocks formed from them are stored in distributed fashion across the member disks of the RAID group. The disk regions storing these data blocks and parity blocks are called blocks, and at least one group of blocks spanning multiple member disks can form a stripe.

Illustratively, the data of the first member disk is recovered using the distribution relationship of data blocks and parity blocks on the second member disks. When the second member disk has an unreadable region, such as a bad sector, the data of the region of the first member disk corresponding to the unreadable region cannot be recovered; no reconstruction processing is performed on that corresponding region, and, for example, no bad-block marking is done on the target disk. As shown in FIG. 6B, the first member disk (disk 4) fails; when region j of disk 1 among the detected second member disks (disks 1-3) is unreadable, it corresponds to region m of the first member disk, and no reconstruction processing, that is, no data recovery computation, is performed on region m; the corresponding region n on the target disk is not marked as a bad block either. Region j, region m, and region n can be identified by block identifiers and/or stripe identifiers.
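The region handling just described, rebuilding wherever all source blocks are readable, skipping region m, and leaving region n untouched, can be sketched as a rebuild loop. The stripe-level read/write interface here is an assumption for illustration; a real controller would work with the block identifiers and readability records described above. The xor_blocks helper is repeated from the earlier parity sketch so the block is self-contained.

from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild_to_target(source_disks, target_disk, num_stripes):
    """Rebuild a failed member onto the target disk, stripe by stripe.

    source_disks: surviving members whose read_block() returns bytes,
    or None when the block is unreadable. Stripes with any unreadable
    source block are skipped: no recovery computation is done, and the
    corresponding target region is neither written nor marked bad."""
    skipped = []  # the "second region": stripes to backfill after recovery
    for stripe in range(num_stripes):
        blocks = [disk.read_block(stripe) for disk in source_disks]
        if any(b is None for b in blocks):
            skipped.append(stripe)          # regions m/n: leave reserved
            continue
        target_disk.write_block(stripe, xor_blocks(blocks))
    return skipped

def backfill_from_recovered(recovered_disk, target_disk, skipped):
    """After the first member disk recovers, copy the second-region data
    from it into the reserved regions of the target disk (step S207b)."""
    for stripe in skipped:
        target_disk.write_block(stripe, recovered_disk.read_block(stripe))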
S304. Before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.

In this embodiment, the target disk serves as a temporary member disk of the RAID group. A new record can be added to the RAID group member disk table and the RAID group region mapping table, while the records of the first member disk are retained for the time being and can be flagged as deactivated. Alternatively, a temporary RAID group member disk table and a temporary RAID group region mapping table can be created, the original RAID group member disk table and region mapping table deactivated, and the originals reactivated after fault recovery.

Illustratively, RAID can be applied in a network environment where a storage area network (SAN) interconnects high-speed servers with high-speed storage devices. The high-speed storage device can be a RAID-based storage apparatus or system, which makes physically remote storage easy and convenient and improves data reliability and security. For example, it can be applied in enterprises with high requirements for data security and storage performance.

For instance, in a network environment for the storage and backup management of enterprise business data or carrier data, after the target disk takes over from the failed disk, it can provide service access to other devices, achieving fast data backup and quickly restoring user services, thereby guaranteeing the security and stability of remote transmission and remote storage of enterprise business data.

S305. After the fault of the first member disk is recovered, complete the disk reconstruction processing with the recovered first member disk. Step S305 can be implemented in multiple ways; for example, after the fault of the first member disk is recovered, the target disk is replaced with the recovered first member disk as a member disk of the RAID group to complete the disk reconstruction processing.

After the disk reconstruction processing completes, the record of the target disk as a member disk of the RAID group is deleted or deactivated. Meanwhile, the record of the first member disk as a RAID group member is restored; that is, the first member disk is reactivated. If step S304 added a new record to the RAID group member disk table and the RAID group region mapping table, step S305 deletes the added record and reactivates the original record of the first member disk. If step S304 created a temporary RAID group member disk table and a temporary RAID group region mapping table, step S305 deletes the temporary tables and resumes using the original RAID group member disk table and region mapping table.
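The record bookkeeping in steps S304 and S305 can be modeled with a small member-disk table. The table structure and field names are assumptions, not an actual on-disk format; the point is only the add/deactivate/reactivate sequence.

class MemberDiskTable:
    """RAID group member disk table with active/deactivated records."""

    def __init__(self, members):
        self.records = {disk_id: True for disk_id in members}  # id -> active?

    def switch_to_target(self, failed_id, target_id):
        """S304: add the target disk and deactivate, but keep, the
        failed member's record so it can be reactivated later."""
        self.records[failed_id] = False
        self.records[target_id] = True

    def switch_back(self, recovered_id, target_id):
        """S305: drop the target disk's record and reactivate the
        recovered first member disk."""
        del self.records[target_id]
        self.records[recovered_id] = True

table = MemberDiskTable(["disk1", "disk2", "disk3", "disk4"])
table.switch_to_target("disk4", "spare")   # fault on disk4; spare takes over
table.switch_back("disk4", "spare")        # disk4 re-inserted and recovered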
In step S303, if the second member disk has an unreadable region, the recoverable data comprises the data of the first region of the first member disk, whose corresponding regions on the second member disks are all readable. After the fault of the first member disk is recovered, the data of the second region of the recovered first member disk is stored to the target disk, the second region corresponding to the unreadable region of the second member disk. Referring to FIG. 6B, the data of region m of the recovered first member disk is stored to region n of the target disk.

Whether the fault of the first member disk has recovered can be determined as described above. When the fault of the first member disk recovers, for example when the first member disk is plugged back into its slot in the RAID group, the target disk is replaced with the recovered first member disk. This prevents the data loss on the first member disk that unreadable regions on the second member disks would otherwise cause during reconstruction; after fault recovery, the data integrity of the first member disk is guaranteed, and hence the data integrity and security of the RAID group. Moreover, replacing the target disk with the recovered first member disk preserves the RAID group's original data-processing mode and restores the data storage state that existed before the RAID group failed.
S306. Recover the data of the first member disk from the data of the second member disks in the RAID group other than the first member disk, and store the recovered data to the target disk.

During step S306, the second member disk may have an unreadable region. The region on the target disk corresponding to the unreadable region can be marked as a bad block, left free with no data written, or filled with a fixed value for easy identification by the system.

As shown in FIG. 6C, the first member disk (disk 4) fails; when region j of disk 1 among the detected second member disks (disks 1-3) is unreadable, it corresponds to region m of the first member disk, and the corresponding region n on the target disk is marked as a bad block. Region j, region m, and region n can be identified by block identifiers and/or stripe identifiers. Of course, region n can also be filled with a fixed value or left free.
Specifically, step S306 can include the following substeps:

S306a. Judge whether the second member disk has an unreadable region. If the second member disk has an unreadable region, go to step S306b; if it does not, go to step S306c.

S306b. Recover the data of the first region of the first member disk from the data of the second member disk, and store the recovered data to the target disk, where the region on the target disk corresponding to the second region of the first member disk is marked as a bad block or left unwritten; the regions of the second member disks corresponding to the first region are readable, and the second region corresponds to the unreadable region of the second member disk.

S306c. Recover the data of the first member disk from the data of the second member disk, and store the recovered data to the target disk.
S307. Replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group. The first member disk can be removed from the RAID group regardless of whether its fault recovers.

Removing the first member disk from the RAID group, for example deleting its record as a member disk of the RAID group and refreshing the RAID group region mapping table with the target disk's information, prevents failures caused by inaccurate member information if another member disk subsequently fails.

In this way, when the fault of the first member disk is recoverable, the RAID group keeps its original data storage state and no RAID group data is lost. When the fault of the first member disk is unrecoverable, the data of the first region of the first member disk is recovered from the data of the readable regions of the second member disks, and at most the data of the second region of the first member disk is lost, where the regions of the second member disks corresponding to the first region are all readable and the second region corresponds to the unreadable region of the second member disk. By intelligently selecting the reconstruction mode according to the state of the member disks in the RAID group, the trade-off between RAID group data integrity and the reconstruction time wasted on unreadable regions can be balanced; data loss after RAID group reconstruction can be reduced, or even eliminated, and user services can be restored quickly.
FIG. 4 is a flowchart of a method of another embodiment of the present invention. The method of FIG. 4 is similar to that of FIG. 3; the main difference is that the reconstruction mode is selected according to both whether the fault of the first member disk is recoverable and whether the second member disk has an unreadable region:

The first reconstruction mode applies when the fault of the first member disk is recoverable and the second member disk has an unreadable region. Its processing includes: in step S403, recover the data of the first member disk from the data of the second member disk and store the recovered data to the target disk; in step S404, before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group; in step S405, after the fault of the first member disk is recovered, complete the disk reconstruction processing with the recovered first member disk.

In step S405, after the fault of the first member disk is recovered, the target disk can be replaced with the recovered first member disk as a member disk of the RAID group to complete the disk reconstruction processing.

The data recovered in step S403 includes the data of the first region of the first member disk, whose corresponding regions on the second member disks are all readable. In step S405, after the fault of the first member disk is recovered, the data of the second region of the first member disk can be stored to the target disk to complete the disk reconstruction processing, the second region corresponding to the unreadable region of the second member disk.

The second reconstruction mode applies when the fault of the first member disk is unrecoverable and the second member disk has an unreadable region. Its processing includes: in step S406b, recover the data of the first region of the first member disk from the data of the second member disk and store the recovered data to the target disk, where the region on the target disk corresponding to the second region of the first member disk is marked as a bad block or left unwritten; the region of the second member disk corresponding to the first region is readable, and the second region corresponds to the unreadable region of the second member disk. In step S407, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.

The third reconstruction mode applies when the second member disk has no unreadable region. Its processing includes: in step S406c, recover the data of the first member disk from the data of the second member disk and store the recovered data to the target disk; in step S407, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.
FIG. 5 is a flowchart of a method of another embodiment of the present invention; the method of FIG. 5 is similar to that of FIG. 4. When the first member disk in the RAID group fails, the processing includes:

S501. Judge whether a second member disk in the RAID group has an unreadable region. If an unreadable region exists, go to step S502; if not, go to step S506c. The second member disks are the member disks of the RAID group other than the failed first member disk.

S502. Judge whether the fault of the first member disk is recoverable. If it is recoverable, go to step S503; if not, go to step S506b.

S503. Recover the data of the first member disk from the data of the second member disk, and store the recovered data to the target disk.

S504. Before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.

S505. After the fault of the first member disk is recovered, complete the disk reconstruction processing with the recovered first member disk.

In one approach, after the fault of the first member disk is recovered, the target disk can be replaced with the recovered first member disk as a member disk of the RAID group to complete the disk reconstruction processing.

In another approach, after the fault of the first member disk is recovered, the data of the second region of the recovered first member disk can be stored to the corresponding region of the target disk to complete the disk reconstruction processing, where the second region corresponds to the unreadable region of the second member disk.

S506b. Recover the data of the first region of the first member disk from the data of the second member disk, and store the recovered data to the target disk, where the region on the target disk corresponding to the second region of the first member disk is marked as a bad block or left unwritten, and the region of the second member disk corresponding to the first region is readable. After S506b, go to step S507.

S506c. Recover the data of the first member disk from the data of the second member disk, and store the recovered data to the target disk. After S506c, go to step S507.

S507. Replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group.
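Putting the two judgments of FIG. 5 together, the mode selection reduces to a two-level branch. The sketch below only encodes that branch; the mode descriptions returned are summaries of the steps above, and nothing here is the controller's actual interface.

def select_reconstruction_mode(survivors_have_unreadable: bool,
                               fault_recoverable: bool) -> str:
    """The two-level branch of FIG. 5 (steps S501 and S502)."""
    if not survivors_have_unreadable:
        return "full rebuild, then replace (S506c -> S507)"
    if fault_recoverable:
        return "rebuild readable regions, temporary switch, then switch back or backfill (S503-S505)"
    return "rebuild readable regions, mark bad blocks, remove failed disk (S506b -> S507)"

for unreadable in (False, True):
    for recoverable in (False, True):
        print(unreadable, recoverable, "->",
              select_reconstruction_mode(unreadable, recoverable))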
FIG. 7 shows an apparatus 700 provided by an embodiment of the present invention for implementing disk reconstruction. The apparatus 700 is coupled to one or more RAID groups, each containing multiple member disks. The apparatus 700 includes:

a data acquisition unit 703, configured to recover the data of the first member disk from the data of the second member disks in the RAID group other than the first member disk;

a write processing unit 704, configured to write the data recovered by the data acquisition unit 703 to the target disk; and

a reconstruction control unit 702, configured to switch the member disk of the RAID group from the first member disk to the target disk containing the recovered data before the fault of the first member disk is recovered, and to complete reconstruction processing according to the recovered first member disk after the fault of the first member disk is recovered.
After the fault of the first member disk is recovered, the reconstruction control unit 702 can switch the member disk of the RAID group from the target disk containing the recovered data to the recovered first member disk. Alternatively, the reconstruction control unit 702 can instruct the data acquisition unit 703 to obtain the data of the second region of the first member disk after the fault is recovered, and instruct the write processing unit 704 to store the data of the second region to the corresponding region of the target disk, where the second region corresponds to the unreadable region of the second member disk.

When judging that the fault of the first member disk is recoverable, the reconstruction control unit 702 selects the first reconstruction mode, in which, after the fault of the first member disk is recovered, the member disk of the RAID group is switched from the target disk containing the recovered data to the recovered first member disk.

When judging that the fault of the first member disk is unrecoverable, the reconstruction control unit 702 selects the second reconstruction mode, which disregards whether the fault recovers and removes the first member disk from the RAID group once the reconstruction from the first member disk to the target disk completes.

The reconstruction control unit 702 can also combine the recoverability detection result of the first member disk's fault with the detection result of whether the second member disk has an unreadable region when selecting the reconstruction mode.
Specifically, the reconstruction control unit 702 can contain a fault state management unit 7022, which manages the fault state of the first member disk. The fault state management unit 7022 can obtain information such as the cause of the first member disk's fault and whether the fault has recovered; the cause may include one or more of not-present, disk medium failure, and the like.

The reconstruction control unit 702 can also contain a disk readability detection unit 7021, configured to obtain the readability detection results of the RAID group member disks, including the readability detection result of the second member disk.

The reconstruction control unit 702 can contain a disk management unit 7023, configured to manage the RAID group member disks, covering one or more of maintaining member disk information, handling member disk change or switch management, managing readable and/or unreadable disk regions, selecting the disk reconstruction mode, and so on.

The reconstruction control unit 702 can also contain a processing unit 7024, which controls the operations of the data acquisition unit 703 and the write processing unit 704 according to information or instructions from one or more of the fault state management unit 7022, the disk readability detection unit 7021, and the disk management unit 7023. For example, the processing unit 7024 controls the data recovery process of the data acquisition unit 703 according to the readability detection result of the second member disk, so that the data acquisition unit 703 recovers the data of the first region of the first member disk from the data of the readable regions of the second member disk (the region of the second member disk corresponding to the first region being readable) and is prevented from reconstructing the second region of the first member disk, which corresponds to the unreadable region of the second member disk. The processing unit 7024 can also, according to the disk readability detection result, instruct the write processing unit 704 not to write data to, or to mark bad blocks in, the region of the target disk corresponding to the second region. Specifically, the processing unit 7024 can perform one or more operations of the method flows described above as needed.
FIG. 8 shows an apparatus 800 provided by an embodiment of the present invention, including:

a reconstruction mode selection unit 801, configured to judge whether the failed first member disk in the RAID group is recoverable, selecting the soft reconstruction mode if the fault of the first member disk is recoverable and the hard reconstruction mode if it is not;

a soft reconstruction unit 802, configured to, in the soft reconstruction mode, recover the data of the first member disk from the data of the second member disks in the RAID group other than the first member disk and store the recovered data to the target disk; before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group; and after the fault of the first member disk is recovered, replace the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group. If the second member disk has an unreadable region, only the data of the first region of the first member disk may be recovered, with no reconstruction or data recovery processing performed on the second region of the first member disk, where the regions of the second member disks corresponding to the first region are all readable and the second region corresponds to the unreadable region of the second member disk;

a hard reconstruction unit 805, configured to, in the hard reconstruction mode, recover the data of the first member disk from the data of the second member disks in the RAID group other than the first member disk and store the recovered data to the target disk; after the recovered data is stored, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group and remove the first member disk from the RAID group to complete the disk reconstruction. The target disk remains a member disk of the RAID group even after the fault of the first member disk recovers. If the second member disk has an unreadable region, only the data of the first region of the first member disk may be recovered, with no reconstruction or data recovery processing performed on the second region of the first member disk, where the regions of the second member disks corresponding to the first region are all readable and the second region corresponds to the unreadable region of the second member disk.
The soft reconstruction unit 802 can include:

a data acquisition unit 8021, configured to recover the data of the first member disk from the data of the second member disk. When the second member disk has an unreadable region, the data acquisition unit 8021 can obtain the data of the first region of the first member disk from the data of the readable regions of the second member disk, where the regions of the second member disks corresponding to the first region are all readable. The data obtained by the data acquisition unit 8021 can be buffered in a buffer coupled to the apparatus 800; the buffer may be externally connected to the apparatus 800 or integrated into it;

a write processing unit 8023, configured to control storing the recovered data to the corresponding region of the target disk; and

a reconstruction control unit 8024, configured to, after all recoverable data of the first member disk has been stored in the target disk, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group, and, after the fault of the first member disk is recovered, replace the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group.
The hard reconstruction unit 805 can include:

a data acquisition unit 8051, configured to recover the data of the first member disk from the data of the second member disks. The data can be recovered by redundancy computation over the distribution of data blocks and parity blocks on the second member disks. If the second member disk has an unreadable region, the data acquisition unit 8051 may obtain only the data of the first region of the first member disk from the data of the readable regions of the second member disk, the regions of the second member disks corresponding to the first region being all readable, and perform no data recovery processing on the second region of the first member disk, which corresponds to the unreadable region of the second member disk; and

a write processing unit 8053, configured to store the data recovered by the data acquisition unit 8051 to the corresponding region of the target disk.

The hard reconstruction unit 805 can include a bad-block marking unit 8052, configured to mark, when the second member disk has an unreadable region, the region of the target disk corresponding to the unreadable region of the second member disk as a bad block; no data is written at the marked bad blocks of the target disk.

The hard reconstruction unit 805 may also omit the bad-block marking unit 8052 and instead write a fixed value, as the data of the second region of the first member disk, to the region of the target disk corresponding to the second region. Of course, the region corresponding to the second region may also be left with no data written.
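The three policies for the target-disk region corresponding to the second region, marking it as a bad block, leaving it unwritten, or filling it with a fixed value, can be expressed as one write step. The policy names and the dictionary-based block interface below are illustrative assumptions, not the unit 8052 interface itself.

BLOCK_SIZE = 4     # bytes per block in this toy model
FILL_VALUE = 0x00  # any agreed-upon constant the system can recognize

def write_second_region(target_blocks, bad_blocks, stripe, policy):
    """Handle one target block whose source region was unreadable."""
    if policy == "mark_bad":
        bad_blocks.add(stripe)            # unit 8052: mark, write nothing
    elif policy == "fill_fixed":
        target_blocks[stripe] = bytes([FILL_VALUE]) * BLOCK_SIZE
    elif policy == "leave_free":
        pass                              # reserve the space, write nothing
    else:
        raise ValueError(f"unknown policy: {policy}")

target, bad = {}, set()
write_second_region(target, bad, stripe=7, policy="mark_bad")
print(bad)  # {7}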
After all recoverable data of the first member disk has been stored in the target disk, the reconstruction control unit 8054 can also add the target disk to the RAID group as a member disk and remove the first member disk from the RAID group. The reconstruction control unit 8054 can update the RAID group disk member information.
FIGS. 9A and 9B are schematic architecture diagrams of application systems 900a and 900b provided by embodiments of the present invention.

In FIG. 9A, the system 900a includes one or more storage apparatuses or storage systems, such as the apparatus 910a, which connects through a Fibre Channel (FC) network or an Internet Protocol (IP) network 905 to network devices such as the hosts 901a and 901b and the database 902. FIG. 9A schematically shows two hosts 901a and 901b; a real system can include more such hosts, which can be servers performing various functions, such as web servers, file servers, and service servers. The database 902 provides content indexes of stored files, access address information, user information, and the like.
The apparatus 910a can implement the disk reconstruction function. The components providing the disk reconstruction function mainly include:

a storage controller 912a, connected to multiple disks through one or more disk adapters 913a. At least some of the disks can form RAID groups, such as the RAID groups 915a and 915b. At least some disks serve as free disks, such as the disks 916a and 916b, which can serve as target disks for disk reconstruction. A RAID group can be a software-based RAID group or a hardware-based RAID group.

The disk adapter 913a is the interface between the RAID group and the target disk, providing input and output adaptation; it can act as an intermediary between the RAID group and other components (such as the storage controller 912a and the buffer).

The storage controller 912a is coupled to the RAID groups and performs control of the RAID groups; the control operations it can perform include disk reconstruction control.

The storage controller 912a can perform the following operations: when the first member disk in a RAID group fails, recover the data of the first member disk from the second member disks in the RAID group other than the first member disk, and store the recovered data to the target disk; before the fault of the first member disk is recovered, replace the first member disk with the target disk containing the recovered data as a member disk of the RAID group; and after the fault of the first member disk is recovered, perform the corresponding processing according to the recovered first member disk.
The embodiments of the present invention provide at least two reconstruction modes. The reconstruction mode that requires switching from the target disk back to the recovered first member disk after fault recovery is called the first reconstruction mode. The reconstruction mode that does not require switching back to the recovered first member disk is called the second reconstruction mode. Of course, embodiments can also include other reconstruction modes.

After the fault of the first member disk is recovered, the storage controller 912a can replace the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group.

After the fault of the first member disk is recovered, the storage controller 912a can store the data of the second region of the recovered first member disk to the corresponding region of the target disk, the second region corresponding to the unreadable region of the second member disk. This operation can be performed when the second member disk has an unreadable region.

The first and second reconstruction modes can be selected by the storage controller 912a. The storage controller 912a can select based on the judgment of whether the fault of the first member disk is recoverable: if recoverable, the first reconstruction mode is selected, that is, switching back to the recovered first member disk is required; if unrecoverable, the second reconstruction mode is selected, that is, no switch back to the recovered first member disk is required.

The storage controller 912a can obtain at least some of the following information for disk reconstruction control: disk member information, disk region information, information indicating whether the disk state is recoverable, information indicating the disk readability detection result, and information or rules indicating the distribution of data blocks and parity blocks. The storage controller 912a can obtain this information from its own memory or from a memory connected to it. Based on the information indicating whether the disk state is recoverable, the storage controller 912a can judge whether the fault of the failed first member disk is recoverable. Based on the information indicating the disk readability detection result, it can judge whether the second member disk has an unreadable region. The storage controller 912a can recover the data of the first member disk based on the data of the second member disks in the RAID group and store the recovered data to the target disk. The disk region information can include disk stripe or disk block information. The data recovery processing can be performed based on the information or rules indicating the distribution of data blocks and parity blocks.
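The pieces of control information listed here can be grouped into one metadata record. A sketch with assumed field names, intended only to show what the controller consults when selecting and driving a reconstruction:

from dataclasses import dataclass

@dataclass
class RaidControlInfo:
    member_table: dict            # disk member information (id -> active?)
    region_map: dict              # disk region info: stripe/block -> disk offset
    recoverable: dict             # disk id -> is the disk state recoverable?
    readability: dict             # (disk id, stripe) -> readable?
    layout_rule: str = "RAID-5"   # distribution rule for data and parity blocks

info = RaidControlInfo(member_table={"disk4": False, "spare": True},
                       region_map={}, recoverable={"disk4": True},
                       readability={("disk1", 12): False})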
The storage controller 912a is typically a RAID controller that performs the various disk reconstruction methods of the present invention described above by executing a program stored in a computer-readable storage medium.
As illustrated in FIG. 9A, the apparatus 910a can also include the following components:

one or more communication adapters, such as the communication adapters 911a and 911b, which serve as network adapters for the FC network or IP network and communicate through it with network devices such as the hosts 901a and 901b and the database 902;

one or more buffers, such as the buffer 914, which can cache data exchanged between a RAID group (RAID group 915a or 915b) and a free disk (disk 916a or 916b), or between the RAID group and other network devices;

a management controller 917, which can manage the apparatus 910a through a user management interface; a memory 918, which can store the system parameters of the apparatus 910a; and

a bus bridge 919a, which can provide a set of data buses and control buses to carry data and control-command exchanges between components. The bus bridge 919a can also include a power bus that supplies power to the components.

Specifically, the storage controller 912a can control some or all of the processing steps of the methods shown in FIG. 1 to FIG. 3.
The system 900b of FIG. 9B is similar to the system 900a and includes an apparatus 910b, a storage apparatus or storage system similar to the apparatus 910a, which connects through the FC network 905a and/or the IP network 905b to network devices such as the hosts 901a and 901b and the database 902.

The apparatus 910b includes the following components:

a storage controller 912b, connected to multiple disks through one or more disk adapters 913b. At least some of the disks can form RAID groups, such as the RAID groups 915a and 915b. At least some disks serve as free disks, such as the disks 916a and 916b, which can serve as target disks for disk reconstruction. A RAID group can be a software-based RAID group or a hardware-based RAID group.

The disk adapter 913b is the interface between the RAID group and the target disk, providing input and output adaptation; it can act as an intermediary between the RAID group and the storage controller 912b.

The basic functions of the storage controller 912b are the same as those performed by the storage controller 912a of FIG. 9A and are not described again.
As illustrated in FIG. 9B, at least one of the following components can also be included:

one or more buffers, such as the buffer 914, which can cache data exchanged between a RAID group (RAID group 915a or 915b) and a free disk (disk 916a or 916b), or between the RAID group and other network devices;

a management controller 917, which can manage the apparatus 910b through a user management interface; and

one or more memories coupled to the storage controller 912b, such as the memories 918a and 918b. The memory 918a is mainly responsible for storing the system parameters of the apparatus 910b, and the memory 918b can provide the information needed for RAID group control; see the information involved in the control by the storage controller 912a for details.

A bus bridge 919b can provide a set of data buses and control buses to carry data and control-command exchanges between components. The bus bridge 919b can also include a power bus that supplies power to the components.

In the illustration of FIG. 9B, the storage controller 912b and the disk adapter 913b can be integrated together to form a disk reconstruction apparatus 920. The disk reconstruction apparatus 920 can also integrate the memory 918b. The memory 918b can be used to store at least some of the following information: disk member information, disk region information, information indicating whether the disk state is recoverable, information indicating the disk readability detection result, and the like. The memory 918b stores the disk member information in the form of a RAID group member disk table and the disk region information in the form of a RAID group region mapping table. The memory 918b can store a computer-readable program that performs at least one method of the present invention, so that one or more processors (not shown) in the apparatus 910b can execute the program to complete disk reconstruction; these processors can be integrated inside the apparatus 920 or connected to the apparatus 920 through an interface.

One or more communication adapters, such as the communication adapters 911a and 911b, can also be included. In FIG. 9B, the communication adapters 911a and 911b are an FC communication adapter and an IP communication adapter, respectively.
In the several embodiments provided in this application, it should be understood that the disclosed methods and devices can be implemented in other manners. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed can be indirect couplings or communication connections through interfaces, apparatuses, or units, and can be electrical, mechanical, or of other forms.

In addition, in the embodiments of the present invention, the functional units can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit. The units can be implemented in the form of hardware or in the form of hardware plus software functional units.

All or some of the steps of the method embodiments can be completed by hardware related to program instructions. The foregoing program can be stored in a computer-readable storage medium; when executed, the program performs the steps of the method embodiments. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A disk reconstruction method, wherein, when a first member disk in a redundant array of independent disks (RAID) group fails, the method comprises:

recovering the data of the first member disk from the data of a second member disk in the RAID group other than the first member disk, and storing the recovered data to a target disk;

before the fault of the first member disk is recovered, replacing the first member disk with the target disk containing the recovered data as a member disk of the RAID group; and

after the fault of the first member disk is recovered, performing corresponding reconstruction processing according to the recovered first member disk.
2. The disk reconstruction method according to claim 1, wherein the performing corresponding reconstruction processing according to the recovered first member disk comprises:

replacing the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group.

3. The disk reconstruction method according to claim 1, wherein the data of a first region of the first member disk is recovered from the data of readable regions of the second member disk in the RAID group other than the first member disk, and the recovered data is stored to the target disk, wherein no data is written to the region on the target disk corresponding to a second region of the first member disk, the region of the second member disk corresponding to the first region is readable, and the second region corresponds to an unreadable region of the second member disk.

4. The disk reconstruction method according to claim 3, wherein the performing corresponding reconstruction processing according to the recovered first member disk comprises: storing the data of the second region of the recovered first member disk to the target disk.
5. An apparatus for implementing disk reconstruction, comprising:

a data acquisition unit, configured to recover the data of a first member disk from the data of a second member disk in a redundant array of independent disks (RAID) group other than the first member disk, the first member disk being a failed member disk in the RAID group;

a write processing unit, configured to write the data recovered by the data acquisition unit to a target disk; and

a reconstruction control unit, configured to switch the member disk of the RAID group from the first member disk to the target disk containing the recovered data before the fault of the first member disk is recovered, and to complete reconstruction processing according to the recovered first member disk after the fault of the first member disk is recovered.

6. The apparatus according to claim 5, wherein, after the fault of the first member disk is recovered, the reconstruction control unit switches the member disk of the RAID group from the target disk containing the recovered data to the recovered first member disk.

7. The apparatus according to claim 6, wherein, when judging that the fault of the first member disk is recoverable, the reconstruction control unit selects the reconstruction mode in which the member disk of the RAID group is switched, after the fault of the first member disk is recovered, from the target disk containing the recovered data to the recovered first member disk.

8. The apparatus according to claim 5, wherein, after the fault of the first member disk is recovered, the reconstruction control unit switches the member disk of the RAID group from the target disk containing the recovered data to the recovered first member disk.
9. A storage apparatus, comprising:

the apparatus for implementing disk reconstruction according to any one of claims 5 to 8; and

one or more redundant array of independent disks (RAID) groups and/or target disks coupled to the apparatus for implementing disk reconstruction.
10. A disk reconstruction apparatus, comprising:

a disk adapter, serving as the interface between a redundant array of independent disks (RAID) group and a target disk; and

a storage controller, configured to judge whether the fault of a first member disk is recoverable, process in a first reconstruction mode if the fault of the first member disk is recoverable, and process in a second reconstruction mode if the fault of the first member disk is unrecoverable;

wherein, in the first reconstruction mode, the data of the first member disk is recovered from the data of a second member disk in the RAID group other than the first member disk and the recovered data is stored to the target disk; before the fault of the first member disk is recovered, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group; and after the fault of the first member disk is recovered, reconstruction processing is completed according to the recovered first member disk;

wherein, in the second reconstruction mode, the data of the first member disk is recovered from the data of the second member disk and the recovered data is stored to the target disk; after the reconstruction from the first member disk to the target disk completes, the first member disk is replaced with the target disk containing the recovered data as a member disk of the RAID group, and the first member disk is removed from the RAID group to complete the disk reconstruction.
11. The disk reconstruction apparatus according to claim 10, wherein the operation of completing reconstruction processing according to the recovered first member disk comprises: replacing the target disk containing the recovered data with the recovered first member disk as a member disk of the RAID group to complete the reconstruction processing.

12. The disk reconstruction apparatus according to claim 10, wherein the operation of completing reconstruction processing according to the recovered first member disk comprises: storing the data of the region of the recovered first member disk corresponding to the unreadable region of the second member disk to the target disk to complete the reconstruction processing.
13. A storage apparatus, comprising:

the disk reconstruction apparatus according to any one of claims 10 to 12; and

one or more redundant array of independent disks (RAID) groups and/or target disks coupled to the disk reconstruction apparatus.
PCT/CN2013/080582 2012-12-27 2013-08-01 Disk reconstruction method and apparatus therefor WO2014101412A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210580816.7 2012-12-27
CN201210580816.7A CN103049400B (zh) Disk reconstruction method and apparatus therefor

Publications (1)

Publication Number Publication Date
WO2014101412A1 true WO2014101412A1 (zh) 2014-07-03

Family

ID=48062047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/080582 WO2014101412A1 (zh) 2013-08-01 2012-12-27 Disk reconstruction method and apparatus therefor

Country Status (2)

Country Link
CN (1) CN103049400B (zh)
WO (1) WO2014101412A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114063929A (zh) Partial RAID reconstruction system and method based on a dual-controller hard disk array

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049400B (zh) 2015-12-23 Disk reconstruction method and apparatus therefor
CN104679623A (zh) 2015-06-03 Server hard disk maintenance method and system, and server monitoring device
CN105094684B (zh) 2018-03-09 Method and system for reusing problem disks in a disk array system
CN104461791B (zh) 2017-02-01 Information processing method and processing apparatus
CN106126378A (zh) 2016-11-16 Method and apparatus for triggering disk array reconstruction
CN106371947B (zh) 2019-07-26 Multi-failure-disk data recovery method for RAID and system therefor
CN107315662A (zh) 2017-11-03 Method and system for preventing hard disk data loss
CN108874312B (zh) 2021-09-17 Data storage method and storage device
CN109496292B (zh) 2022-02-22 Disk management method, disk management apparatus, and electronic device
CN111124263B (zh) 2023-10-27 Method, electronic device, and computer program product for managing multiple disks
CN109871186B (zh) 2021-12-07 Multi-target fast reconstruction system for reconfigurable RAID
CN112800493A (zh) 2021-05-14 Information processing method and device
CN113391941B (zh) 2022-07-22 RAID read/write timeout handling method, apparatus, device, and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276302A (zh) 2008-10-01 Disk failure handling and data reconstruction method in a disk array system
CN102081559A (zh) 2011-06-01 Data recovery method and apparatus for a redundant array of independent disks
CN102207895A (zh) 2011-10-05 RAID data rebuilding method and apparatus
CN103049400A (zh) 2013-04-17 Disk reconstruction method and apparatus therefor

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276302A (zh) 2008-10-01 Disk failure handling and data reconstruction method in a disk array system
CN102081559A (zh) 2011-06-01 Data recovery method and apparatus for a redundant array of independent disks
CN102207895A (zh) 2011-10-05 RAID data rebuilding method and apparatus
CN103049400A (zh) 2013-04-17 Disk reconstruction method and apparatus therefor

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114063929A (zh) 2022-02-18 Partial RAID reconstruction system and method based on a dual-controller hard disk array
CN114063929B (zh) 2023-10-20 Partial RAID reconstruction system and method based on a dual-controller hard disk array

Also Published As

Publication number Publication date
CN103049400B (zh) 2015-12-23
CN103049400A (zh) 2013-04-17

Similar Documents

Publication Publication Date Title
WO2014101412A1 (zh) 一种磁盘重构方法及其装置 Disk reconstruction method and apparatus therefor
CN100407121C (zh) Information processing system and primary storage device
US9053075B2 (en) Storage control device and method for controlling storages
US7587631B2 (en) RAID controller, RAID system and control method for RAID controller
CN102024044B (zh) Distributed file system
US8464094B2 (en) Disk array system and control method thereof
US7783922B2 (en) Storage controller, and storage device failure detection method
JP3618529B2 (ja) Disk array device
US8554734B1 (en) Continuous data protection journaling in data storage systems
US7895162B2 (en) Remote copy system, remote environment setting method, and data restore method
CN109582443A (zh) Virtual machine backup system based on distributed storage technology
KR100711165B1 (ko) Storage control device, control method, and recording medium
JP2008046986A (ja) Storage system
US20070101188A1 (en) Method for establishing stable storage mechanism
JP3573032B2 (ja) Disk array device
JP2001337792A (ja) Disk array device
CN111984365B (zh) Method and system for implementing active-active virtual machine virtual disks
US7653831B2 (en) Storage system and data guarantee method
US20210303178A1 (en) Distributed storage system and storage control method
WO2017097233A1 (zh) Fault tolerance method for data storage loads and IPTV system
US8433949B2 (en) Disk array apparatus and physical disk restoration method
US20090177916A1 (en) Storage system, controller of storage system, control method of storage system
US7529776B2 (en) Multiple copy track stage recovery in a data storage system
CN116204137B (zh) DPU-based distributed storage system, control method, apparatus, and device
US7529966B2 (en) Storage system with journaling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13867242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13867242

Country of ref document: EP

Kind code of ref document: A1