CN112084060A - Reducing data loss events in RAID arrays of different RAID levels - Google Patents

Reducing data loss events in RAID arrays of different RAID levels

Info

Publication number
CN112084060A
Authority
CN
China
Prior art keywords
raid
array
storage
risk
storage drives
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010511980.7A
Other languages
Chinese (zh)
Inventor
L. M. Gupta
M. G. Borlick
K. A. Nielsen
C. A. Hardy
B. A. Rinaldi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/442,498 external-priority patent/US20200394112A1/en
Priority claimed from US16/442,503 external-priority patent/US10929037B2/en
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Publication of CN112084060A publication Critical patent/CN112084060A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1084Degraded mode, e.g. caused by single or multiple storage removals or disk failures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1092Rebuilding, e.g. when physically replacing a failing disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1096Parity calculation or recalculation after configuration or reconfiguration of the system

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the present disclosure relate to reducing data loss events in RAID arrays of different RAID levels. A method for reducing data loss events in RAIDs of different RAID levels is disclosed. Such a method identifies a first set of RAIDs and a second set of RAIDs in a data storage environment. The first group contains RAIDs (e.g., RAID-6 arrays) that provide more robust data protection and the second group contains RAIDs (e.g., RAID-5 arrays) that provide less robust data protection. The method identifies, in the data storage environment, higher risk storage drives having a risk of failure above a threshold and lower risk storage drives having a risk of failure below the threshold. The method swaps higher risk storage drives in the second set of RAIDs with lower risk storage drives in the first set of RAIDs. In some embodiments, the swap may be performed such that no more than a selected number of higher risk storage drives are included in the RAIDs of the second group. Corresponding systems and computer program products are also disclosed.

Description

Reducing data loss events in RAID arrays of different RAID levels
Technical Field
The present invention relates to a system and method for reducing data loss events in a redundant array of independent disks.
Background
RAID (i.e., redundant array of independent disks) is a storage technology that provides enhanced storage functionality and reliability through redundancy. RAID is created by combining multiple storage drive components (e.g., disk drives and/or solid state drives) into one logical unit. The data is then distributed across the drives using various techniques called "RAID levels". The standard RAID levels, including RAID levels 1 through 6, are currently a basic set of RAID configurations that use striping, mirroring and/or parity to provide data redundancy. Each configuration provides a balance between two key goals: (1) improved data reliability and (2) improved I/O performance.
Currently, the most common RAID levels are RAID-5 and RAID-6, both of which utilize block-level striping with distributed parity values. A RAID-5 array is configured to recover from a single drive failure, while a RAID-6 array may recover from two simultaneous drive failures. Thus, a RAID-6 array provides greater protection of redundant data than a RAID-5 array.
In the field, it has been observed that drive failures in combination with media errors cause most data loss events. For example, a drive failure in a RAID-5 array combined with a media error on another storage drive in the array can result in data loss. Although a RAID-5 array will also lose data when two storage drives fail simultaneously, the most common cause of data loss is a single drive failure combined with a media error. In contrast, a RAID-6 array would prevent data loss in both cases because of the extra parity values it maintains.
In view of the foregoing, there is a need for systems and methods for reducing data loss events in redundant arrays of independent disks. There is also a need for systems and methods for providing better reporting and statistics regarding data loss caused or prevented by a particular RAID level (e.g., RAID-5, RAID-6, etc.). In some cases, such systems and methods may be used to encourage users to transition to a more robust RAID level (e.g., RAID-6), or to provide evidence that a prior transition to a more robust RAID level has prevented data loss.
Disclosure of Invention
The present invention has been developed in response to the present state of the art, and in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available systems and methods. Accordingly, embodiments of the present invention have been developed to reduce data loss events in a Redundant Array of Independent Disks (RAID). The features and advantages of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.
Consistent with the foregoing, a method for reducing data loss events in Redundant Arrays of Independent Disks (RAIDs) having different RAID levels is disclosed. In one embodiment, the method identifies a first and a second set of RAIDs in a data storage environment. The first group contains RAIDs (e.g., RAID-6 arrays) that provide more robust data protection and the second group contains RAIDs (e.g., RAID-5 arrays) that provide less robust data protection. The method identifies, in the data storage environment, higher risk storage drives having a risk of failure above a threshold and lower risk storage drives having a risk of failure below the threshold. The method swaps higher risk storage drives in the second set of RAIDs with lower risk storage drives in the first set of RAIDs. In some embodiments, the swap may be performed such that no more than a selected number of higher risk storage drives are included in the second set of RAIDs.
Also consistent with the foregoing, a method for converting a Redundant Array of Independent Disks (RAID) to a more robust RAID level is disclosed. This method identifies higher risk storage drives in the data storage environment having a risk of failure above a first threshold. The method determines a number of higher risk storage drives included in a RAID array of the data storage environment. The method determines whether the number exceeds a second threshold. The method also determines whether a destage rate associated with the RAID array is below a third threshold. If the number exceeds the second threshold and the destage rate is below the third threshold, the method converts the RAID array to a more robust RAID level.
Corresponding system and computer program products are also disclosed and claimed herein.
Drawings
In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments that are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings, in which:
FIG. 1 is a high-level block diagram illustrating one example of a network environment in which systems and methods in accordance with the present invention may be implemented;
FIG. 2 is a high-level block diagram illustrating one embodiment of a storage system in which one or more RAIDs may be implemented;
FIG. 3 illustrates a reporting module configured to report information related to storage drive failures in a storage environment, including whether a data loss could have been prevented, or was in fact prevented, by using a RAID that provides more robust data protection;
FIG. 4 is a high-level block diagram showing a reporting module and various related sub-modules;
FIG. 5 illustrates an action module that swaps storage drives between a RAID array that provides less robust data protection and a RAID array that provides more robust data protection to reduce the risk of data loss;
FIG. 6 illustrates more evenly distributing higher risk storage drives across RAID arrays of the same RAID level to reduce the risk of data loss;
FIG. 7 illustrates an action module removing a higher risk storage drive from a RAID array to reduce the risk of data loss;
FIG. 8 is a high-level block diagram showing an action module and various related sub-modules;
FIG. 9 is a flow diagram illustrating one embodiment of a method for reducing the risk of data loss in a storage environment comprising RAID arrays of different RAID levels; and
FIG. 10 is a flow diagram illustrating one embodiment of a method for determining whether a RAID array may be converted to a more robust RAID level.
Detailed Description
It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the present invention, as represented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to perform aspects of the invention.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring to fig. 1, an example of a network environment 100 is shown. Network environment 100 is presented to illustrate one example of an environment in which systems and methods consistent with the invention may be implemented. Network environment 100 is presented by way of example and not limitation. Indeed, the systems and methods disclosed herein may be applicable to a variety of different network environments in addition to the illustrated network environment 100.
As shown, the network environment 100 includes one or more computers 102, 106 interconnected by a network 104. The network 104 may include, for example, a Local Area Network (LAN) 104, a Wide Area Network (WAN) 104, the Internet 104, an intranet 104, and so forth. In certain embodiments, the computers 102, 106 may include a client computer 102 and a server computer 106 (also referred to herein as a "host" 106 or a "host system" 106). Typically, the client computer 102 initiates a communication session, while the server computer 106 waits for and responds to requests from the client computer 102. In certain embodiments, the computer 102 and/or server 106 may be connected to one or more internal or external directly attached storage systems 112 (e.g., an array of hard disk drives, solid state drives, tape drives, etc.). These computers 102, 106 and the directly attached storage system 112 may communicate using protocols such as ATA, SATA, SCSI, SAS, Fibre Channel, and the like.
In certain embodiments, the network environment 100 may include a storage network 108, such as a Storage Area Network (SAN)108 or a LAN 108 (e.g., when network-attached storage is used), located behind the servers 106. The network 108 may connect the servers 106 to one or more storage systems 110, such as an array of hard disk drives or solid state drives 110a, a tape library 110b, a single hard disk drive 110c or solid state drive 110c, a tape drive 110d, a CD-ROM library, or the like. To access the storage system 110, the host system 106 may communicate through a physical connection from one or more ports on the host 106 to one or more ports on the storage system 110. The connection may be through a switch, fiber, or similar direct connection. In some embodiments, the servers 106 and the storage system 110 may communicate using networking standards such as Fibre Channel (FC) or iSCSI.
Referring to FIG. 2, one example of a storage system 110a containing an array of hard disk drives 204 and/or solid state drives 204 is shown. Internal components of the storage system 110a are shown, as in some embodiments, a RAID array may be implemented in whole or in part within such a storage system 110 a. As shown, the storage system 110a includes a storage controller 200, one or more switches 202, and one or more storage drives 204, such as a hard disk drive 204 and/or a solid state drive 204 (e.g., a flash-based storage drive 204). The storage controller 200 may enable one or more hosts 106 (e.g., open systems and/or mainframe servers 106 running operating systems such as z/OS, zVM, etc.) to access data in one or more storage drives 204.
In selected embodiments, the storage controller 200 includes one or more servers 206. The storage controller 200 may also include a host adapter 208 and a device adapter 210 to connect the storage controller 200 to the host device 106 and the storage drive 204, respectively. Multiple servers 206a, 206b may provide redundancy to ensure that data is always available to a connected host 106. Thus, when one server 206a fails, the other server 206b can assume the I/O load of the failed server 206a to ensure that I/O can continue between the host 106 and the storage drive 204. This process may be referred to as "failover".
In selected embodiments, each server 206 may include one or more processors 212 and memory 214. The memory 214 may include volatile memory (e.g., RAM) as well as non-volatile memory (e.g., ROM, EPROM, EEPROM, hard disk, flash memory, etc.). In some embodiments, the volatile and non-volatile memories may store software modules that run on the processor 212 and are used to access data in the storage drive 204. The server 206 may host at least one instance of these software modules. These software modules may manage all read and write requests to logical volumes in the storage drives 204.
One example of a storage system 110a having an architecture similar to that shown in FIG. 2 is the IBM DS8000™ enterprise storage system. The DS8000™ is a high-performance, high-capacity storage controller that provides disk and solid-state storage and is designed to support continuous operation. However, the techniques disclosed herein are not limited to the IBM DS8000™ enterprise storage system 110a, but may be implemented in any comparable or similar storage system 110, regardless of the manufacturer, product name, or component name associated with the system 110. Any storage system that may benefit from one or more embodiments of the present invention is considered to be within the scope of the present invention. Thus, the IBM DS8000™ is presented by way of example only and not limitation.
In certain embodiments, the storage drives 204 of the storage system 110a may be configured in one or more RAID arrays to provide a desired level of reliability and/or I/O performance. As previously mentioned, the RAID levels most commonly used today are RAID-5 and RAID-6. These RAID levels utilize block-level striping with distributed parity values. A RAID-5 array is configured to recover from a single drive failure, while a RAID-6 array recovers from two simultaneous drive failures. Thus, a RAID-6 array provides more robust protection of redundant data than a RAID-5 array.
In the field, it has been observed that drive failures in combination with media errors result in most data loss events. For example, a drive failure in a RAID-5 array combined with a media error on another storage drive 204 in the same array will result in a data loss. Although a RAID-5 array may also lose data when two storage drives 204 fail simultaneously, the most common is data loss due to single drive failure and media errors. In contrast, the RAID-6 array may prevent data loss in both cases because of the extra parity values used in the RAID-6 array. In view of the foregoing, there is a need for systems and methods to reduce data loss events in redundant arrays of independent disks. There is also a need for systems and methods to provide better reporting and statistics regarding data loss caused by a particular RAID level. In some cases, such systems and methods may be used to encourage users to transition to a more robust RAID level (e.g., RAID-6), or to provide evidence that a prior transition to a more robust RAID level has generated revenue in terms of protecting the data.
Referring to FIG. 3, in some embodiments according to the invention, a reporting module 300 may be provided in the host system 102 or other system to reduce data loss events in a RAID array. The reporting module 300 may be configured to provide better reporting and statistics regarding data loss caused by a particular RAID level (e.g., a RAID-5 or RAID-6 array). In certain embodiments, the reports and statistics provided by the reporting module 300 may be used to encourage users to transition to a more robust RAID level (e.g., RAID-6), or to provide evidence that a prior transition to a more robust RAID level has prevented data loss.
For example, as shown in FIG. 3, when a failure 306a occurs in a RAID-5 array 304a (i.e., a storage drive failure 306a), the reporting module 300 may determine whether a data loss occurs due to the failure. For example, if a failed storage drive 204 is accompanied by a media error on another storage drive 204 in the RAID-5 array 304a, a data loss may result. In this case, the reporting module 300 may log the failure 306a and the resulting loss of data in the RAID-5 array 304a. The reporting module 300 may report the event to a user. In certain embodiments, the reporting module 300 may indicate whether the data loss could have been prevented if the RAID-5 array 304a had been converted to a RAID-6 array 304b.
Similarly, in certain embodiments, when a failure 306b occurs in the RAID-6 array 304b (i.e., a storage drive failure 306b), the reporting module 300 may determine whether a data loss was prevented by the RAID-6 architecture. For example, if the failed storage drive 204 is accompanied by a media error or another failed storage drive 204, but the RAID-6 array 304b is still able to recover, rebuild, and prevent data loss, the reporting module 300 may record this information. The reporting module 300 may report the event 306b to a user, indicate that data loss was prevented because the RAID array is a RAID-6 array 304b, and indicate that data loss would have occurred had the RAID array been a RAID-5 array 304a.
The action module 302, in turn, may take various actions to mitigate the risk of data loss in a storage environment that includes multiple RAID arrays 304. Examples of such actions will be discussed in connection with FIGS. 5-7. FIG. 8 is a high-level block diagram showing the action module 302 and various related sub-modules.
Referring to fig. 4, a high-level block diagram of the reporting module 300 and related sub-modules is shown. The reporting module 300 and related sub-modules may be implemented in hardware, software, firmware, or a combination thereof. The reporting module 300 and related sub-modules are presented by way of example and not limitation. In different embodiments, more or fewer sub-modules may be provided. For example, the functionality of some sub-modules may be combined into a single or fewer number of sub-modules, or the functionality of a single sub-module may be distributed across multiple sub-modules.
As shown, the reporting module 300 may include one or more of a failure detection module 402, a data collection module 404, a data loss determination module 406, a prevention determination module 408, an aggregation module 410, and a communication module 412.
The failure detection module 402 may be configured to detect a failure 306 in the storage system 110a, such as a failure 306 of one of the storage drives 204 participating in a RAID array 304. When such a failure 306 occurs, the data collection module 404 may collect data about the failure 306. For example, the data collection module 404 may determine the number of failed storage drives 204, the type of the failed storage drives 204 (e.g., make, model, storage capacity, performance characteristics, manufacturer specifications, etc.), the age of the failed storage drives 204, and the type of RAID in which the failure 306 occurred (e.g., whether the RAID is a RAID-5 or RAID-6 array).
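As a purely illustrative sketch (not part of the patent), the information gathered by the data collection module 404 for a single failure event might be grouped into a record such as the following; all field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class FailureRecord:
    """Data collected about one storage drive failure event (illustrative fields)."""
    raid_level: str            # e.g., "RAID-5" or "RAID-6"
    failed_drive_count: int    # number of failed storage drives
    drive_type: str            # make, model, capacity, etc.
    drive_age_days: int        # age of the failed drive
    data_loss: bool = False        # later filled in by the data loss determination module
    loss_preventable: bool = False # later filled in by the prevention determination module
```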
The data loss determination module 406 may be configured to determine whether the failure 306 results in data loss. For example, if a storage drive failure 306 is accompanied by a media error on another storage drive 204 in the RAID-5 array 304a or a failure 306 of another storage drive 204, data may be lost. The data loss determination module 406 may determine whether such a data loss has occurred.
The prevention determination module 408, in turn, may determine whether a data loss detected by the data loss determination module 406 could have been prevented. For example, the prevention determination module 408 may detect that a data loss occurring in a RAID-5 array 304a would not have occurred had the RAID-5 array 304a been converted to a RAID-6 array 304b. Alternatively or additionally, the prevention determination module 408 may determine whether the configuration of a RAID array 304 actually prevented a data loss. For example, if a RAID-6 array 304b experiences a failure 306 that does not result in data loss but would have resulted in data loss had the RAID array been a RAID-5 array 304a, the prevention determination module 408 may detect this.
The aggregation module 410 may aggregate statistics across the storage environment and its RAID arrays 304. For example, for each storage drive failure 306 occurring in the storage environment, the aggregation module 410 may aggregate information such as the RAID level involved (e.g., RAID-5, RAID-6, etc.), the number of failed storage drives 204, whether data loss was prevented, the type of the failed storage drive 204, the age of the failed storage drive 204, and so forth. In certain embodiments, the aggregation module 410 may aggregate information such as whether a data loss occurring in a RAID-5 array 304a could have been prevented had the RAID-5 array 304a been converted to a RAID-6 array 304b. Similarly, the aggregation module 410 may aggregate information such as whether a storage drive failure 306 that occurred in a RAID-6 array 304b and did not result in data loss would have caused data loss had it occurred in a RAID-5 array 304a.
The communication module 412 may communicate information generated and collected by the other sub-modules 402, 404, 406, 408, 410 to a user. This may help the user determine how to configure the storage environment, and in particular how to configure the RAID arrays 304 in the storage environment. For example, a user may decide to convert various RAID-5 arrays 304a to RAID-6 arrays 304b upon observing that various data loss events could have been prevented using RAID-6 arrays 304b. Similarly, upon observing a data loss event that a RAID-6 array 304b prevented, the information provided by the communication module 412 may validate the user's previous decision to convert a RAID-5 array 304a to a RAID-6 array 304b. In some embodiments, a provider of storage services or hardware may use this information to encourage customers to convert to or utilize RAID-6 arrays 304b by showing real-world examples where RAID-6 arrays 304b have prevented, or could have prevented, data loss.
Referring to FIG. 5, as previously described, the action module 302 may take various actions to mitigate the risk of data loss in a storage environment. In certain embodiments, the action module 302 may maintain statistics about the storage drives 204 in the storage environment (e.g., the storage system 110a) in order to determine which storage drives 204 have the greatest risk of failure. For example, the action module 302 may determine how likely a storage drive 204 is to fail within a given period of time (e.g., one month). In some embodiments, this likelihood is expressed as the percentage chance that the storage drive 204 will fail within the given period of time. The action module 302 may then determine which storage drives 204 have a risk of failure exceeding a selected threshold (e.g., twenty-five percent) during the time period. As shown in FIG. 5, these storage drives 204 may be designated as higher risk storage drives 204, while storage drives 204 below the threshold may be designated as lower risk storage drives 204.
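A minimal sketch of this risk classification follows, assuming a hypothetical estimate_risk() function that returns the probability that a drive fails within the evaluation window; the function name and the 25% threshold are illustrative, not prescribed by the patent:

```python
HIGH_RISK_THRESHOLD = 0.25  # e.g., more than a 25% chance of failing this month

def classify_drives(drives, estimate_risk):
    """Split storage drives into higher-risk and lower-risk groups (illustrative)."""
    higher_risk, lower_risk = [], []
    for drive in drives:
        risk = estimate_risk(drive)  # probability of failure in the given period
        if risk > HIGH_RISK_THRESHOLD:
            higher_risk.append(drive)
        else:
            lower_risk.append(drive)
    return higher_risk, lower_risk
```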
The action module 302 may then take action to mitigate the risk of data loss in the storage environment, particularly in specific RAID arrays 304 of the storage environment. In doing so, the action module 302 may consider the RAID level of each RAID array 304. For example, a RAID-5 array 304a is less robust than a RAID-6 array 304b in protecting data. Thus, a higher risk storage drive 204 in a RAID-5 array 304a is more likely to cause data loss than a higher risk storage drive 204 in a RAID-6 array 304b. Accordingly, the action module 302 may take action to reduce or balance the risk across the RAID arrays 304 in order to minimize the chance of data loss.
For example, as shown in FIG. 5, in certain embodiments, the action module 302 may analyze the RAID arrays 304 in the storage environment to determine which RAID arrays 304 contain higher risk storage drives 204 and the number of higher risk storage drives 204 contained. Using this information, the action module 302 may swap storage drives 204 between the RAID-5 array 304a and the RAID-6 array 304b in a manner that reduces the risk of data loss and/or more evenly distributes the risk of data loss among the RAID arrays 304.
For example, FIG. 5 shows a RAID-5 array 304a containing a higher risk storage drive 204a. Because the RAID-5 array 304a provides less protection than the RAID-6 array 304b and can only withstand a single storage drive failure without losing data, the action module 302 may swap the higher risk storage drive 204a in the RAID-5 array 304a with a lower risk storage drive 204b in the RAID-6 array 304b. This results in a single higher risk storage drive 204a in the RAID-6 array 304b. As previously described, the RAID-6 array 304b may recover from two concurrent drive failures without losing data, and is thus better able to handle a failure of the higher risk storage drive 204a.
When swapping storage drives 204 between RAID arrays 304, the action module 302 may proceed in three steps using a spare storage drive 204c. In the example of FIG. 5, data on the higher-risk storage drive 204a may first be copied to the spare storage drive 204c, and the spare storage drive 204c may then be incorporated into the RAID-5 array 304a in place of the higher-risk storage drive 204a. The data on the lower risk storage drive 204b may then be copied to the higher risk storage drive 204a (now a spare), and the higher risk storage drive 204a may then be incorporated into the RAID-6 array 304b. The data on the storage drive 204c (now part of the RAID-5 array 304a) may then be copied to the lower risk storage drive 204b (now a spare), and the lower risk storage drive 204b may then be incorporated into the RAID-5 array 304a. This completes the swap of the higher risk storage drive 204a with the lower risk storage drive 204b.
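The three-step swap described above can be sketched as follows. The copy_drive() and replace_member() helpers are hypothetical stand-ins for the storage controller's copy and array-membership operations and are not defined by the patent:

```python
def swap_via_spare(raid5, raid6, higher_risk, lower_risk, spare,
                   copy_drive, replace_member):
    """Swap a higher-risk drive out of a RAID-5 array and into a RAID-6 array,
    using a spare drive as the intermediate (illustrative sketch)."""
    # Step 1: copy the higher-risk drive's data to the spare and incorporate the
    # spare into the RAID-5 array; the higher-risk drive becomes free.
    copy_drive(src=higher_risk, dst=spare)
    replace_member(array=raid5, old=higher_risk, new=spare)

    # Step 2: copy the lower-risk drive's data to the now-free higher-risk drive
    # and incorporate it into the RAID-6 array; the lower-risk drive becomes free.
    copy_drive(src=lower_risk, dst=higher_risk)
    replace_member(array=raid6, old=lower_risk, new=higher_risk)

    # Step 3: copy the data now on the original spare (the RAID-5 data) to the
    # lower-risk drive and incorporate it into the RAID-5 array, restoring the spare.
    copy_drive(src=spare, dst=lower_risk)
    replace_member(array=raid5, old=spare, new=lower_risk)
```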
In some embodiments, a smart rebuild process may be used to copy data from one storage drive 204 to another. This smart rebuild process may reduce the risk of data loss by preserving the ability of a storage drive 204 to function as a spare drive even while data is being copied to it. In certain embodiments, when data is copied from a first storage drive 204 to a second storage drive 204 (e.g., a spare storage drive 204), the smart rebuild process may create a bitmap for the first storage drive 204. Each bit may represent a portion of storage space (e.g., a one-megabyte region) on the first storage drive 204. The smart rebuild process may then begin copying data from the first storage drive 204 to the second storage drive 204. As each portion is copied, its associated bit may be set in the bitmap.
If a write to a portion of the first storage drive 204 is received while the data copy process is in progress, the smart rebuild process may check the bitmap to determine whether the data in the associated portion has already been copied to the second storage drive 204. If not, the smart rebuild process may simply write the data to the corresponding portion of the first storage drive 204. Otherwise, after writing the data to the first storage drive 204, the data may also be copied to the second storage drive 204. Once all portions have been copied from the first storage drive 204 to the second storage drive 204, the RAID array 304 may begin using the second storage drive 204 in place of the first storage drive 204. This frees the first storage drive 204 from the RAID array 304.
Alternatively, the smart rebuild process may utilize a watermark rather than a bitmap to track which data has been copied from the first storage drive 204 to the second storage drive 204. In such an embodiment, the portions may be copied from the first storage drive 204 to the second storage drive 204 in a specified order. The watermark may track how far the copy process has progressed through the portions. If a write to a portion of the first storage drive 204 is received during the copy process, the smart rebuild process may check the watermark to determine whether the data in that portion has already been copied to the second storage drive 204. If not, the smart rebuild process may simply write the data to the first storage drive 204. Otherwise, the smart rebuild process may also copy the data to the second storage drive 204 after writing it to the first storage drive 204. Once all portions have been copied from the first storage drive 204 to the second storage drive 204, the RAID array 304 may begin using the second storage drive 204 in place of the first storage drive 204. This frees the first storage drive 204 from the RAID array 304.
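A simplified sketch of the bitmap variant of the smart rebuild process is shown below. The region size, the drive interface (capacity, read_region, write_region), and the class name are assumptions for illustration only:

```python
class SmartRebuild:
    """Copy a source drive to a target drive while tracking copied regions in a
    bitmap, so that in-flight host writes remain consistent (illustrative sketch)."""

    REGION_BYTES = 1 << 20  # assume one bit per 1 MiB region

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.regions = -(-source.capacity // self.REGION_BYTES)  # ceiling division
        self.copied = [False] * self.regions  # the "bitmap"

    def run(self):
        for i in range(self.regions):
            data = self.source.read_region(i)
            self.target.write_region(i, data)
            self.copied[i] = True  # record that this region has been copied

    def on_host_write(self, region, data):
        # Writes always go to the source drive; mirror them to the target drive
        # only if that region has already been copied.
        self.source.write_region(region, data)
        if self.copied[region]:
            self.target.write_region(region, data)
```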
Referring to FIG. 6, in certain embodiments, the action module 302 may distribute higher risk storage drives 204 across the RAID arrays 304 in a manner that reduces the risk of data loss in the storage environment. For example, FIG. 6 shows two RAID-5 arrays 304a, each including one or more higher risk storage drives 204. In this example, RAID-5 array 304a1 includes a single higher risk storage drive 204, while RAID-5 array 304a2 includes three higher risk storage drives 204. To reduce the risk of data loss in the RAID-5 arrays 304a1, 304a2, the action module 302 may more evenly distribute the higher risk storage drives 204 across the RAID-5 arrays 304a by swapping a higher risk storage drive 204e from the RAID-5 array 304a2 with a lower risk storage drive 204d from the RAID-5 array 304a1. After the swap, each RAID-5 array 304a will contain two higher risk storage drives 204. Such swaps may occur between RAID arrays 304 of the same RAID level (as shown in this example) and/or of different RAID levels.
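One way to sketch this balancing step is shown below; the helper functions (counting, picking, and swapping drives) are hypothetical and would ultimately rely on the spare-drive swap procedure described earlier:

```python
def balance_high_risk(arrays, count_high_risk, pick_high, pick_low, swap):
    """Repeatedly move a higher-risk drive from the most loaded array to the
    least loaded array until the counts differ by at most one (illustrative)."""
    while True:
        arrays.sort(key=count_high_risk)
        least, most = arrays[0], arrays[-1]
        if count_high_risk(most) - count_high_risk(least) <= 1:
            break  # as evenly distributed as possible
        # swap() is assumed to update the membership of both arrays.
        swap(pick_high(most), pick_low(least))
```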
Referring to FIG. 7, in some embodiments, if the risk of failure of a storage drive 204 exceeds a specified threshold (e.g., fifty percent), the action module 302 may simply remove the storage drive 204 from its RAID array 304 without placing it in another RAID array 304. To this end, the action module 302 may swap the higher-risk storage drive 204 with a spare storage drive 204. For example, as shown in FIG. 7, if the higher-risk storage drive 204f has a risk of failure greater than fifty percent, the action module 302 may copy data from the higher-risk storage drive 204f to the spare storage drive 204g and incorporate the spare storage drive 204g into the RAID array 304b. The higher risk storage drive 204f may then be flagged to be replaced with a new spare storage drive 204.
Referring to FIG. 8, a high-level block diagram of the action module 302 and related sub-modules is shown. The action module 302 and related sub-modules may be implemented in hardware, software, firmware, or a combination thereof. The action module 302 and related sub-modules are presented by way of example and not limitation. In different embodiments, more or fewer sub-modules may be provided. For example, the functionality of some sub-modules may be combined into a single or fewer number of sub-modules, or the functionality of a single sub-module may be distributed across multiple sub-modules.
As shown, the action module 302 may include one or more of a statistics collection module 800, a failure prediction module 802, a threshold module 804, a parameter module 806, a swap module 808, an allocation module 810, a conversion module 812, and a deletion module 814.
The statistics collection module 800 may be configured to collect statistics about the storage drives 204 in the storage environment. For example, the statistics collection module 800 may be configured to collect information such as the age of the storage drives 204 in the storage environment, the type of storage drives 204 in the storage environment (e.g., make, model, storage capacity, performance characteristics, etc.), the workload of the storage drives 204, and so forth. Using these statistics, the failure prediction module 802 may predict when a storage drive 204 in the storage environment will fail. In some embodiments, this is expressed as the percentage chance that the storage drive 204 will fail within a specified period of time (e.g., one month). For example, the action module 302 may use the statistical information to determine that a storage drive 204 has a twenty-five percent chance of failing within a month.
The threshold module 804 may specify a threshold above which a storage drive 204 is treated as a higher risk storage drive 204. For example, any storage drive 204 with a risk of failure of more than twenty-five percent in the next month may be considered a higher risk storage drive 204. The parameter module 806 may establish various parameters associated with reducing the risk of data loss in the storage environment. For example, a parameter may indicate that a RAID array 304 of a certain RAID level should not contain more than a certain number of higher risk storage drives 204. For example, the parameter module 806 may indicate that a RAID-5 array 304a should contain zero higher risk storage drives 204, while a RAID-6 array 304b may contain up to two higher risk storage drives 204 due to its more robust data protection.
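The parameters described here might be expressed as a per-RAID-level cap on higher risk storage drives. The values below simply mirror the example in the text (zero for RAID-5, two for RAID-6) and are not fixed by the patent:

```python
# Example policy: RAID-5 arrays should hold no higher-risk drives, while
# RAID-6 arrays may hold up to two (illustrative values only).
MAX_HIGH_RISK_PER_LEVEL = {"RAID-5": 0, "RAID-6": 2}

def violates_policy(raid_level, high_risk_count):
    """Return True if an array exceeds the allowed number of higher-risk drives."""
    return high_risk_count > MAX_HIGH_RISK_PER_LEVEL.get(raid_level, 0)
```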
The action module 302 may then attempt to enforce the parameters. For example, the swap module 808 may attempt to swap storage drives 204 between RAID-5 arrays 304a and RAID-6 arrays 304b to reduce the risk of data loss in the storage environment. In some cases, this may involve moving higher risk storage drives 204 from RAID-5 arrays 304a to RAID-6 arrays 304b, and moving lower risk storage drives 204 from RAID-6 arrays 304b to RAID-5 arrays 304a. In some embodiments, the swap module 808 may attempt to move all higher risk storage drives 204 from RAID-5 arrays 304a to RAID-6 arrays 304b. In addition, the allocation module 810 may attempt to more evenly allocate higher risk storage drives 204 among the RAID arrays 304 of a particular RAID level. For example, if the higher risk storage drives 204 cannot all be moved to RAID-6 arrays 304b, the allocation module 810 may attempt to more evenly distribute the higher risk storage drives 204 across the RAID-5 arrays 304a.
To further reduce the risk of data loss in the storage environment, the conversion module 812 may be used to convert a RAID array 304 from one RAID level to another RAID level. For example, if the risk of data loss is too high for a RAID-5 array 304a and cannot otherwise be reduced, the conversion module 812 may convert the RAID-5 array 304a to a RAID-6 array 304b. Finally, if the risk of a higher-risk storage drive 204 failing is too high (e.g., above fifty percent), the deletion module 814 may replace the higher-risk storage drive 204 with a spare storage drive 204 and mark the higher-risk storage drive 204 for removal from the storage environment.
FIG. 9 illustrates one embodiment of a method 900 for reducing the risk of data loss in a storage environment comprised of RAID arrays of different RAID levels. In certain embodiments, such a method 900 may be performed by the action module 302 previously described. In this example, the storage environment includes a set of RAID-5 arrays 304a and RAID-6 arrays 304b, although the method 900 may also be used with RAID arrays 304 of other RAID levels. Method 900 is merely one example of a method that may be performed by action module 302 and is not intended to be limiting.
Once the storage drives 204 are categorized as either higher risk storage drives 204 or lower risk storage drives 204, the method 900 may attempt to move the storage drives 204 between RAID arrays 304 or perform other actions to reduce the risk of data loss in the storage environment. As shown, the method 900 initially determines 902 whether any RAID-5 arrays 304a in the storage environment contain higher risk storage drives 204 (e.g., storage drives 204 having a risk of failure above a certain percentage). If so, the method 900 determines 904 whether any RAID-6 array 304b in the storage environment contains lower risk storage drives 204. If so, the method 900 attempts to swap storage drives 204 between RAID arrays 304.
More specifically, the method 900 finds 906 the RAID-6 array 304b in the storage environment having the fewest higher risk storage drives 204. The method 900 also finds 908 the RAID-5 array 304a having the most higher risk storage drives 204. The method 900 then swaps 910 a higher risk storage drive 204 in the RAID-5 array 304a with a lower risk storage drive 204 in the RAID-6 array 304b. The method 900 then repeats steps 902, 904, 906, 908, 910 until no RAID-5 array 304a in the storage environment contains any higher risk storage drives 204, or until no RAID-6 array 304b in the storage environment contains any lower risk storage drives 204. If, at step 902, no RAID-5 arrays 304a in the storage environment contain higher risk storage drives 204, the method 900 ends.
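A compact sketch of this loop (steps 902-910) follows. The high_risk() and low_risk() helpers are assumed to return the higher-risk and lower-risk drives of an array, and swap() is assumed to update array membership; none of these names come from the patent:

```python
def reduce_risk(raid5_arrays, raid6_arrays, high_risk, low_risk, swap):
    """Move higher-risk drives from RAID-5 arrays into RAID-6 arrays until one
    side runs out of candidates (illustrative sketch of method 900)."""
    while True:
        donors = [a for a in raid5_arrays if high_risk(a)]    # step 902
        receivers = [a for a in raid6_arrays if low_risk(a)]  # step 904
        if not donors or not receivers:
            break
        worst_r5 = max(donors, key=lambda a: len(high_risk(a)))    # step 908
        best_r6 = min(receivers, key=lambda a: len(high_risk(a)))  # step 906
        swap(high_risk(worst_r5)[0], low_risk(best_r6)[0])         # step 910
```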
On the other hand, if the method 900 determines at step 904 that no RAID-6 arrays 304b in the storage environment contain lower risk storage drives 204 (i.e., the RAID-6 arrays 304b contain only higher risk storage drives 204), the method 900 attempts to distribute 912 the higher risk storage drives 204 among the RAID-5 arrays 304a in the storage environment. That is, the method 900 attempts to more evenly distribute 912 the higher risk storage drives 204 among the RAID-5 arrays 304a in the storage environment by swapping storage drives 204 among the RAID-5 arrays 304a. This operation may be performed to further reduce the risk of data loss in the storage environment.
After more evenly distributing 912 the higher risk storage drives 204 among the RAID-5 arrays 304a, the method 900 may determine 913 whether there are still some RAID-5 arrays 304a containing too many higher risk storage drives 204 (e.g., more than one). In such a case, the method 900 may convert 914 the RAID-5 array 304a to a RAID-6 array 304b to reduce the risk of data loss. One embodiment of a method 1000 for determining whether a RAID array may be converted to a more robust RAID level is shown in FIG. 10.
Similarly, the method 900 may also determine 916 whether any RAID-5 arrays 304a in the storage environment contain very high risk storage drives 204 (e.g., storage drives 204 with a failure risk above a high threshold). If so, the method 900 may swap 918 these very high risk storage drives 204 with spare storage drives 204. After performing these actions, the method 900 ends. Method 900 may be repeated periodically or in response to certain conditions to reduce/balance the risk of data loss in the storage environment.
FIG. 10 is a flow diagram illustrating one embodiment of a method 1000 for determining whether a RAID array 304 may be converted to a more robust RAID level (e.g., whether a RAID-5 array 304a may be converted to a RAID-6 array 304b). As shown, the method 1000 initially determines 1002 whether the number of higher risk storage drives 204 in a RAID array (e.g., a RAID-5 array 304a) is above a threshold (e.g., one). If not, the method 1000 refrains 1012 from converting the RAID array 304 to a more robust RAID level.
If the number of higher risk storage drives 204 in the RAID array 304 is above the threshold in step 1002, the method 1000 determines 1004 whether the storage environment contains a sufficient number of spare storage drives 204 of the type used in the RAID array 304. If not, the method 1000 cannot convert the RAID array 304 to a more robust RAID level and therefore refrains 1012 from doing so.
If, in step 1004, the storage environment contains a sufficient number of spare storage drives 204 to convert the RAID array 304 to a more robust RAID level, the method 1000 may check other criteria. For example, assuming that the RAID array 304 is a RAID-5 array 304a and the more robust RAID level is RAID-6, the method 1000 may determine 1006 whether the destage rate to the RAID-5 array 304a is below a threshold (e.g., 500K I/O operations per second). In certain embodiments, the destage rate refers to the rate at which data is destaged from cache (in memory 214) to the RAID array 304. Generally, destaging to a RAID-5 array 304a is more efficient than destaging to a RAID-6 array 304b, because four operations (stage data, stage parity, destage data, destage parity) are required to destage to the RAID-5 array 304a, while six operations (stage data, stage first parity, stage second parity, destage data, destage first parity, destage second parity) are required to destage to the RAID-6 array 304b. Thus, if the destage rate associated with the RAID-5 array 304a is high, converting the RAID-5 array 304a to a RAID-6 array 304b may negatively impact the performance of the RAID array. Accordingly, in certain embodiments, if the destage rate to the RAID-5 array 304a is above the selected threshold, the method 1000 may refrain 1012 from converting the RAID-5 array 304a to a RAID-6 array 304b.
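To illustrate the overhead comparison, the following back-of-the-envelope calculation uses the operation counts from the preceding paragraph (four operations per destage for RAID-5, six for RAID-6); the function itself is only a rough illustration:

```python
RAID5_OPS_PER_DESTAGE = 4  # stage data, stage parity, destage data, destage parity
RAID6_OPS_PER_DESTAGE = 6  # stage data, stage P, stage Q, destage data, destage P, destage Q

def extra_ops_per_second(destage_rate):
    """Estimate the additional drive operations per second incurred if a RAID-5
    array with the given destage rate were converted to RAID-6 (rough estimate)."""
    return destage_rate * (RAID6_OPS_PER_DESTAGE - RAID5_OPS_PER_DESTAGE)
```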
If the destage rate of the RAID array 304 is below the threshold in step 1006, the method 1000 determines 1008 whether the RAID array 304 is associated with a high performance class. Such a high performance class may be associated with high performance data. As described above, destaging to a RAID-6 array 304b may be less efficient than destaging to a RAID-5 array 304a. Thus, converting a RAID-5 array 304a to a RAID-6 array 304b may compromise the I/O performance of the data on the RAID-5 array 304a, especially if the data is high performance data. Accordingly, in certain embodiments, if the RAID array 304 is associated with a high performance class, the method 1000 may refrain 1012 from converting the RAID array 304 to a more robust RAID level.
In the illustrated embodiment, if each of the criteria 1002, 1004, 1006, 1008 is met, the method 1000 converts 1010 the RAID array 304 to a more robust RAID level, such as converting a RAID-5 array 304a to a RAID-6 array 304b. The criteria shown are given by way of example only and not by way of limitation. In other embodiments, the method 1000 may include fewer, additional, or different criteria to determine whether and when to convert a RAID array of a certain RAID level to a more robust RAID level.
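Putting the four checks of method 1000 together, a hedged sketch of the conversion decision might look like the following; the helper functions and the default threshold values (one higher-risk drive, 500K destage operations per second) are assumptions drawn from the examples above:

```python
def should_convert_to_raid6(array, environment,
                            high_risk_count, spares_available,
                            destage_rate, is_high_performance,
                            risk_threshold=1, destage_threshold=500_000):
    """Return True only if all criteria 1002-1008 are satisfied (illustrative)."""
    if high_risk_count(array) <= risk_threshold:        # criterion 1002
        return False
    if not spares_available(environment, array):        # criterion 1004
        return False
    if destage_rate(array) >= destage_threshold:        # criterion 1006
        return False
    if is_high_performance(array):                      # criterion 1008
        return False
    return True
```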
The systems and methods disclosed herein have been discussed primarily in connection with reducing the risk of data loss in a storage environment made up of RAID-5 and RAID-6 arrays 304. However, the systems and methods disclosed herein are not limited to RAID-5 and RAID-6 arrays 304, and may also be used with RAID arrays 304 of other RAID levels (e.g., RAID-10 arrays).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (20)

1. A method for reducing data loss events in Redundant Arrays of Independent Disks (RAID) having different RAID levels, the method comprising:
identifying, in a data storage environment, a first set of RAIDs and a second set of RAIDs, the first set including RAIDs that provide more robust data protection and the second set including RAIDs that provide less robust data protection;
in a data storage environment, identifying higher risk storage drives having a risk of failure above a threshold and lower risk storage drives having a risk of failure below a threshold; and
swapping higher risk storage drives in the second set of RAIDs with lower risk storage drives in the first set of RAIDs to reduce the risk of data loss in the data storage environment.
2. The method of claim 1, wherein the swapping comprises swapping using a smart rebuild process that enables a storage drive to be used as a spare drive even though data is being copied to the storage drive.
3. The method of claim 1, wherein the swapping comprises swapping such that no more than a selected number of higher risk storage drives are included in the second set of RAIDs.
4. The method of claim 1, further comprising: more evenly distributing higher risk storage drives among the second set of RAIDs.
5. The method of claim 1, further comprising: more evenly distributing higher risk storage drives among the first set of RAIDs.
6. The method of claim 1, wherein identifying higher risk storage drives comprises identifying storage drives with higher failure risk using statistical information.
7. The method of claim 6, wherein the statistical information comprises at least one of storage drive age and storage drive type.
8. A computer program product for reducing data loss events in Redundant Arrays of Independent Disks (RAID) with different RAID levels, comprising a computer readable medium having computer usable program code embodied therein, the computer usable program code configured to, when executed by at least one processor, perform operations according to the method of any one of claims 1 to 7.
9. A system for reducing data loss events in Redundant Arrays of Independent Disks (RAID) with different RAID levels, the system comprising:
at least one processor;
at least one storage device coupled to the at least one processor and storing code for execution on the at least one processor to cause the at least one processor to perform operations according to the method of any of claims 1 to 7.
10. An apparatus for reducing data loss events in a Redundant Array of Independent Disks (RAID) having different RAID levels, comprising means to perform operations in accordance with any one of the methods of claims 1 to 7.
11. A method for converting a Redundant Array of Independent Disks (RAID) array to a more robust RAID level, the method comprising:
in a data storage environment, identifying higher risk storage drives having a risk of failure above a first threshold;
determining a number of higher risk storage drives included in a RAID array of the data storage environment;
determining whether the number exceeds a second threshold;
determining whether a degradation rate associated with the RAID array is below a third threshold; and
converting the RAID array to a more robust RAID level if the number exceeds the second threshold and the degradation rate is below the third threshold.
12. The method of claim 11, wherein the RAID array is a RAID-5 array.
13. The method of claim 12, wherein converting the RAID array to a more robust RAID level comprises converting a RAID-5 array to a RAID-6 array.
14. The method of claim 11, further comprising determining whether the RAID array is associated with a high performance class.
15. The method of claim 14, further comprising converting the RAID array to a more robust RAID level only if the RAID array is not associated with the high performance class.
16. The method of claim 11, further comprising determining whether the data storage environment contains a sufficient number of spare storage drives to convert the RAID array to a more robust RAID level.
17. The method of claim 16, further comprising converting the RAID array to a more robust RAID level only if the data storage environment contains the sufficient number of spare storage drives.
18. A computer program product for converting a Redundant Array of Independent Disks (RAID) array to a more robust RAID level, comprising a computer readable medium having computer usable program code embodied therein, the computer usable program code configured to, when executed by at least one processor, perform operations according to the method of any one of claims 11 to 17.
19. A system for converting a Redundant Array of Independent Disks (RAID) array to a more robust RAID level, the system comprising:
at least one processor;
at least one storage device coupled to the at least one processor and storing code that, when executed by the at least one processor, causes the at least one processor to perform operations according to the method of any one of claims 11 to 17.
20. An apparatus for converting a Redundant Array of Independent Disks (RAID) array to a more robust RAID level, comprising means for performing the method of any one of claims 11 to 17.
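
The drive-swapping method of claims 1 to 7 can be illustrated with a short sketch. The Python fragment below is a hypothetical illustration only: the Drive and RaidArray classes, the RISK_THRESHOLD value, and the swap_higher_risk_drives function are names and assumptions chosen for readability, not the claimed implementation or any real storage controller API; an actual system would perform each exchange with a smart rebuild process as recited in claim 2.

```python
# Minimal sketch of the swapping idea in claims 1-7 (hypothetical names and values).
from dataclasses import dataclass, field
from typing import List

RISK_THRESHOLD = 0.5  # assumed failure-risk threshold (the "threshold" of claim 1)

@dataclass
class Drive:
    drive_id: str
    failure_risk: float  # e.g. estimated from drive age and type statistics (claims 6-7)

@dataclass
class RaidArray:
    array_id: str
    raid_level: int      # e.g. 6 = more robust, 5 = less robust
    drives: List[Drive] = field(default_factory=list)

def swap_higher_risk_drives(arrays: List[RaidArray]) -> None:
    """Swap higher-risk drives out of less-robust arrays and into more-robust ones."""
    first_set = [a for a in arrays if a.raid_level >= 6]   # more robust protection
    second_set = [a for a in arrays if a.raid_level < 6]   # less robust protection

    for weak_array in second_set:
        for i, drive in enumerate(weak_array.drives):
            if drive.failure_risk <= RISK_THRESHOLD:
                continue  # drive is lower risk; leave it in place
            # Look for a lower-risk drive in a more robust array to trade with.
            for strong_array in first_set:
                swapped = False
                for j, candidate in enumerate(strong_array.drives):
                    if candidate.failure_risk < RISK_THRESHOLD:
                        # A production system would perform this exchange with a
                        # smart rebuild so each drive remains usable as a spare
                        # while its data is copied (claim 2).
                        weak_array.drives[i], strong_array.drives[j] = candidate, drive
                        swapped = True
                        break
                if swapped:
                    break
```

Under these assumptions, a scheduler could call swap_higher_risk_drives periodically after refreshing each drive's failure_risk from health statistics, so that at-risk drives end up in the arrays best able to tolerate their failure.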
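Similarly, the conversion decision of claims 11 to 17 can be sketched as a predicate over a RAID-5 array. All names and threshold values below (Raid5Array, COUNT_THRESHOLD, DEGRADATION_THRESHOLD, should_convert_to_raid6) are likewise hypothetical placeholders, not the claimed implementation.

```python
# Minimal sketch of the RAID-level conversion check in claims 11-17 (hypothetical names).
from dataclasses import dataclass, field
from typing import List

RISK_THRESHOLD = 0.5          # first threshold: per-drive failure risk (claim 11)
COUNT_THRESHOLD = 2           # second threshold: higher-risk drives per array (claim 11)
DEGRADATION_THRESHOLD = 0.1   # third threshold: array degradation rate (claim 11)

@dataclass
class Drive:
    drive_id: str
    failure_risk: float

@dataclass
class Raid5Array:
    array_id: str
    degradation_rate: float
    high_performance: bool
    drives: List[Drive] = field(default_factory=list)

def should_convert_to_raid6(array: Raid5Array, spare_drives: List[Drive]) -> bool:
    """Return True if the RAID-5 array qualifies for conversion to RAID-6."""
    higher_risk = [d for d in array.drives if d.failure_risk > RISK_THRESHOLD]
    if len(higher_risk) <= COUNT_THRESHOLD:
        return False          # too few at-risk drives to justify conversion (claim 11)
    if array.degradation_rate >= DEGRADATION_THRESHOLD:
        return False          # degradation rate is not below the third threshold (claim 11)
    if array.high_performance:
        return False          # skip arrays in a high performance class (claims 14-15)
    if not spare_drives:
        return False          # a second parity drive requires spare capacity (claims 16-17)
    return True
```

An array that passes this check would then be migrated to the more robust level, for example from RAID-5 to RAID-6 as in claims 12 and 13, consuming one of the spare drives for the additional parity stripe.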
CN202010511980.7A 2019-06-15 2020-06-08 Reducing data loss events in RAID arrays of different RAID levels Pending CN112084060A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US16/442,498 US20200394112A1 (en) 2019-06-15 2019-06-15 Reducing incidents of data loss in raid arrays of differing raid levels
US16/442,503 US10929037B2 (en) 2019-06-15 2019-06-15 Converting a RAID to a more robust RAID level
US16/442,503 2019-06-15
US16/442,498 2019-06-15

Publications (1)

Publication Number Publication Date
CN112084060A 2020-12-15

Family

ID=73735055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010511980.7A Pending CN112084060A (en) 2019-06-15 2020-06-08 Reducing data loss events in RAID arrays of different RAID levels

Country Status (1)

Country Link
CN (1) CN112084060A (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1574957A1 (en) * 2004-03-11 2005-09-14 Hitachi, Ltd. Disk array including plural exchangeable magnetic disk unit
US20050204206A1 (en) * 2004-03-11 2005-09-15 Masahiro Arai Disk array including plural exchangeable magnetic disk unit
US20070067668A1 (en) * 2005-09-21 2007-03-22 Fujitsu Limited Information processing apparatus and information processing recovery method
JP2009059280A (en) * 2007-09-03 2009-03-19 Mitsubishi Electric Corp Storage control system
US20130132641A1 (en) * 2011-11-22 2013-05-23 Hitachi, Ltd. Storage system and control method of storage system
US20150286531A1 (en) * 2012-12-20 2015-10-08 Hewlett-Packard Development Company, L.P. Raid storage processing
US20140244927A1 (en) * 2013-02-24 2014-08-28 Alexander Goldberg Storage system and a method for allocating disk drives to redundancy array of independent disks
US20140304548A1 (en) * 2013-04-03 2014-10-09 International Business Machines Corporation Intelligent and efficient raid rebuild technique
US20160070628A1 (en) * 2014-09-09 2016-03-10 Dell Products, Lp Member Replacement in an Array of Information Storage Devices
CN106250055A (en) * 2016-07-12 2016-12-21 乐视控股(北京)有限公司 A kind of date storage method and system
CN108205423A (en) * 2016-12-20 2018-06-26 华为技术有限公司 A kind of physical hard disk abrasion equilibrium method, apparatus and system

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115469819A (en) * 2022-11-11 2022-12-13 苏州浪潮智能科技有限公司 Storage management method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
US20200394112A1 (en) Reducing incidents of data loss in raid arrays of differing raid levels
CN109726033B (en) Method, data storage system and computer readable medium for providing RAID data protection
US9304860B2 (en) Arranging data handling in a computer-implemented system in accordance with reliability ratings based on reverse predictive failure analysis in response to changes
US7574623B1 (en) Method and system for rapidly recovering data from a “sick” disk in a RAID disk group
US10664367B2 (en) Shared storage parity on RAID
US9104604B2 (en) Preventing unrecoverable errors during a disk regeneration in a disk array
JP2007265409A (en) Computer implementation method, data processing system and computer program (amorphous raid)
US20070101188A1 (en) Method for establishing stable storage mechanism
US10929037B2 (en) Converting a RAID to a more robust RAID level
CN110750213A (en) Hard disk management method and device
US11157361B2 (en) Efficient utilization of storage space in arrays of storage drives
US20180357141A1 (en) Data storage system comprising an array of drives
US11074118B2 (en) Reporting incidents of data loss in RAID arrays
CN113377569A (en) Method, apparatus and computer program product for recovering data
US11442826B2 (en) Reducing incidents of data loss in raid arrays having the same raid level
CN112084060A (en) Reducing data loss events in RAID arrays of different RAID levels
US10768822B2 (en) Increasing storage capacity in heterogeneous storage arrays
US11137915B2 (en) Dynamic logical storage capacity adjustment for storage drives
US11748196B2 (en) Adaptive parity rotation for redundant arrays of independent disks
CN113391945A (en) Method, electronic device and computer program product for storage management
CN112084061A (en) Reducing data loss events in RAID arrays of the same RAID level
CN102147714B (en) A kind of management method of network store system and device
CN115087962A (en) Preemptive upgrade for full-span demotion
US20210072916A1 (en) Storage performance enhancement
CN113811846B (en) Dynamic write-per-day adjustment of storage drives

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination