CN102184129B - Fault tolerance method and device for disk arrays - Google Patents

Info

Publication number
CN102184129B
Authority
CN
China
Prior art keywords
band
disk
tape identification
data
disk array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110106601.7A
Other languages
Chinese (zh)
Other versions
CN102184129A (en
Inventor
郑辉
曹庭华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou H3C Technologies Co Ltd filed Critical Hangzhou H3C Technologies Co Ltd
Priority to CN201110106601.7A priority Critical patent/CN102184129B/en
Publication of CN102184129A publication Critical patent/CN102184129A/en
Application granted granted Critical
Publication of CN102184129B publication Critical patent/CN102184129B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a fault tolerance method and device for disk arrays, wherein the method comprises the following steps: when a disk in a disk array goes wrong, adding a hot spare into the disk array so as to replace the disk which goes wrong, and carrying out reconstruction on the disk array into which the hot spare is added in strips; when a reconstructed current strip has a reconstruction read error, recording the identifier of the current strip into a non-volatile random access memory (NVRAM), and skipping over the current strip, and continuing to carry out reconstruction from the next strip until the reconstruction on the disk array is completed; and aiming at the identifier of each strip recorded in the NVRAM, repairing the reconstruction read error of the strip corresponding to the identifier of the strip by writing, and deleting the identifier of the strip from the NVRAM after the repairing is completed. By using the method and device provided by the invention, the occurrence of problems caused by the reconstruction read errors or service read errors of the disk array in a degraded state can be avoided.

Description

Fault tolerance method and device for disk arrays
Technical field
The present invention relates to the field of storage, and in particular to a fault tolerance method and device for disk arrays.
Background art
A Redundant Array of Independent Disks (RAID), referred to as a disk array for short, combines multiple independent disks into one array and provides good redundancy together with higher storage performance than a single disk. In the storage field, the redundancy of the disk array itself is used to store data, directly or indirectly, across multiple independent disks, so that data is not lost when one or more disks fail. This achieves data fault tolerance.
When a disk array loses redundancy for some reason, such as the failure of a disk in the array, the array is in a degraded state. Take as an example a disk array that has lost redundancy and entered the degraded state because one of its disks failed. In the prior art, the usual way to restore the redundancy of such an array is to add a hot spare disk and reconstruct, that is, to replace the failed disk with the hot spare disk. However, if a reconstruction read error occurs during reconstruction (a reconstruction read error being a disk read error caused by reconstruction I/O), reconstruction stops, and the disk array can only remain in the degraded state and cannot return to the redundant state. If another disk in the array then fails, the whole disk array fails and its I/O channel is closed. This not only stops the array from providing service but also causes the data previously stored on it to be lost.
In addition, when a service read is performed on a disk array in the degraded state and a service read error occurs (a service read error being a disk read error caused by service I/O during service reads and writes), the disk array fails and its I/O channel is closed. The array stops providing service and the previously stored data is lost.
Summary of the invention
The present invention provides a fault tolerance method and device for disk arrays, which avoid the problems caused when a reconstruction read error or a service read error occurs in a disk array in the degraded state.
The technical solution provided by the invention comprises:
A fault tolerance method for a disk array, in which, when a disk in the disk array fails, a hot spare disk is added to the disk array to replace the failed disk, and the disk array to which the hot spare disk has been added is reconstructed stripe by stripe. The key point is that the method comprises:
When a reconstruction read error occurs on the stripe currently being reconstructed, recording the identifier of the current stripe in non-volatile memory, skipping the current stripe, and continuing reconstruction from the next stripe until reconstruction of the disk array is complete;
For each stripe identifier recorded in the non-volatile memory, repairing the reconstruction read error of the stripe corresponding to that identifier by writing, and deleting the stripe identifier from the non-volatile memory once the repair is complete.
A fault tolerance method for a disk array, the method comprising:
While service reads are performed on the stripes of a disk array in the degraded state, when a service read error occurs on the stripe currently being read, recording the identifier of the current stripe in non-volatile memory, controlling the disk array to remain in the degraded state, and continuing to provide service;
For each stripe identifier recorded in the non-volatile memory, repairing the service read error of the stripe corresponding to that identifier by writing, and deleting the stripe identifier from the non-volatile memory once the repair is complete.
A fault tolerance device for a disk array, the device comprising a replacement unit and a reconstruction unit. The replacement unit is configured to add a hot spare disk to the disk array when a disk in the disk array fails, so as to replace the failed disk. The reconstruction unit is configured to reconstruct, stripe by stripe, the disk array to which the hot spare disk has been added. The key point is that the device further comprises:
A recording unit, configured to, when a reconstruction read error occurs on the stripe currently being reconstructed by the reconstruction unit, record the identifier of the current stripe in non-volatile memory and trigger the reconstruction unit to skip the current stripe and continue reconstruction from the next stripe until reconstruction of the disk array is complete;
A repair unit, configured to, for each stripe identifier recorded in the non-volatile memory, repair the reconstruction read error of the stripe corresponding to that identifier by writing, and delete the stripe identifier from the non-volatile memory once the repair is complete.
A fault tolerance device for a disk array, comprising a service read processing unit, a recording unit, a control unit and a repair unit, wherein:
The service read processing unit is configured to perform service reads on the stripes of a disk array in the degraded state;
The recording unit is configured to, when a service read error occurs on the stripe currently being read, record the identifier of the current stripe in non-volatile memory and trigger the control unit to control the disk array to remain in the degraded state and continue to provide service;
The repair unit is configured to, for each stripe identifier recorded in the non-volatile memory, repair the service read error of the stripe corresponding to that identifier by writing, and delete the stripe identifier from the non-volatile memory once the repair is complete.
As can be seen from the above technical solution, in the present invention, when a reconstruction read error occurs on the current stripe, reconstruction is not stopped. Instead, the stripe on which the reconstruction read error occurred is recorded in non-volatile memory, the current stripe is skipped, and reconstruction continues from the next stripe. For each stripe on which a reconstruction read error occurred, the error is repaired by writing, so that the redundancy of the disk array is restored as soon as possible. In this way, even if several disks fail during reconstruction, the whole disk array does not fail.
Furthermore, in the present invention, when a service read error occurs on the current stripe, the whole disk array is not made to fail either. Instead, the identifier of the current stripe is recorded in non-volatile memory, an error response is returned for the command, and the disk array is controlled to continue to provide service reads and service writes. This ensures service continuity on the one hand and avoids the risk of data loss on the other.
Brief description of the drawings
Fig. 1 is a schematic diagram of a disk array provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the disk array provided by an embodiment of the present invention when a disk fails;
Fig. 3 is a schematic diagram of the implementation of Embodiment 1 of the present invention;
Fig. 4 is a schematic diagram of the implementation of Embodiment 2 of the present invention;
Fig. 5 is a structural diagram of a device provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of another device provided by an embodiment of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described below with reference to the drawings and specific embodiments.
In practice, a disk array achieves concurrent reads and writes across multiple disks by dividing the disks into stripes. Fig. 1 is a schematic diagram of a disk array divided into multiple stripes. In Fig. 1, each cylinder represents one disk, and the disks are connected in parallel to form the disk array. Fig. 1 takes as an example a disk array divided into 9 stripes of equal size; each stripe occupies storage space on every disk in the array.
Based on the above, the fault tolerance method for disk arrays provided by the invention is described below through two embodiments:
Embodiment 1:
In Embodiment 1, when a disk in the disk array shown in Fig. 1 fails, for example disk 3, a hot spare disk is added to the disk array to replace the failed disk 3, as shown in Fig. 2.
The disk array to which the hot spare disk has been added, that is, the disk array shown in Fig. 2, is then reconstructed stripe by stripe.
During reconstruction of the disk array shown in Fig. 2, if a reconstruction read error occurs on the stripe currently being reconstructed, the identifier of the current stripe is recorded in non-volatile memory, the current stripe is skipped, and reconstruction continues from the next stripe until reconstruction of the disk array is complete; see Fig. 3. In Fig. 3, the hatched stripes, numbered 1, 3, 5 and 6, suffered reconstruction read errors, were not reconstructed successfully, and are recorded in the non-volatile memory. Compared with the prior art, the invention does not let a stripe with a reconstruction read error affect reconstruction of the array as a whole, but continues from the next stripe until reconstruction is complete. For applications in which disk failures are few and have little impact on service, such as surveillance storage, this minimizes the risk that a disk failure poses to the whole system.
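The record-and-skip loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `rebuild`, `read_chunk`, `write_spare` and `ReadError` are hypothetical, the non-volatile memory is modelled as a plain Python set, stripes are indexed from 0, and single-parity XOR is assumed as the redundancy calculation.

```python
from functools import reduce


class ReadError(Exception):
    """Raised by the hypothetical read_chunk callback when a disk read fails."""


def xor_chunks(chunks):
    """XOR equal-length byte strings column by column (assumed parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild(num_stripes, surviving_disks, read_chunk, write_spare, nvram):
    """Reconstruct stripe by stripe; record and skip stripes that fail to read."""
    for stripe in range(num_stripes):
        try:
            chunks = [read_chunk(disk, stripe) for disk in surviving_disks]
        except ReadError:
            nvram.add(stripe)  # keep the stripe identifier for later repair
            continue           # skip this stripe and carry on with the next
        write_spare(stripe, xor_chunks(chunks))
    return nvram
```

With nine stripes and read failures on stripes 1, 3, 5 and 6, the loop finishes with those four identifiers in `nvram` and the remaining five stripes written to the hot spare.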
It should be noted that the invention records the identifiers of stripes with reconstruction read errors in non-volatile memory for two main purposes:
First, because stripes with reconstruction read errors are skipped during reconstruction, the data on a skipped stripe is not redundant. When a read command is later to be issued to a disk occupied by such a stripe, there are two cases. If the target is disk 1 and/or disk 2 in Fig. 2, the read command is issued normally and the data is read according to it. If the target is the hot spare disk in Fig. 2, no read command is issued, and the data that would have been read from the hot spare disk is instead calculated from the data on the other disks occupied by the stripe, such as disk 1 and disk 2 in Fig. 2. In other words, the data on the disks other than the hot spare disk occupied by a skipped stripe can be read, whereas the data on the hot spare disk cannot and must be obtained by a corresponding calculation, such as XOR, over the data on the other disks occupied by the stripe. This makes it easy to judge accurately whether a disk is readable. Moreover, by recording the identifiers of stripes with reconstruction read errors in non-volatile memory, the invention makes the stripe the smallest unit of disk failure, which minimizes the risk that reconstruction read errors on a few stripes cause data loss and instability across the whole disk array.
Second, the recorded identifiers make it easy to identify the stripes on which reconstruction read errors occurred, so that the errors can be repaired, and the data redundancy of those stripes restored, as soon as possible.
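The read path for a stripe whose identifier is still recorded can be sketched as below. The function and parameter names (`read_recorded_stripe`, `read_chunk`, `spare`, `other_disks`) are hypothetical, and XOR is used because the text names it as one possible calculation; a real array would use whatever redundancy calculation its RAID level defines.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR equal-length byte strings column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def read_recorded_stripe(stripe, disk, spare, other_disks, read_chunk):
    """Serve a read that targets a stripe whose identifier is still recorded."""
    if disk != spare:
        # Disks other than the hot spare are read with an ordinary read command.
        return read_chunk(disk, stripe)
    # The hot spare holds no valid data for a skipped stripe, so no read is
    # issued to it; its content is computed from the other disks instead.
    return xor_chunks([read_chunk(d, stripe) for d in other_disks])
```

The caller is assumed to have already checked that the stripe identifier is present in non-volatile memory; stripes that were reconstructed normally would be read directly from every disk, including the hot spare.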
In Embodiment 1, the reconstruction read error of a stripe whose identifier is recorded in the non-volatile memory can be repaired by writing. The repair can be performed during reconstruction or after reconstruction has finished, depending on the actual service conditions. The repair operation is described below in two modes:
Mode 1:
Mode 1 is implemented by writing the whole stripe. Specifically, when data is written to a stripe whose identifier is recorded in the non-volatile memory (stripe 1 in Fig. 3 is taken as the example; the principle is the same for the other stripes, such as stripes 3, 5 and 6 in Fig. 3), if the data exactly fills the space on every disk occupied by stripe 1 (including the hot spare disk), the corresponding data is written to each of those disks, for example to disk 1, disk 2 and the hot spare disk in Fig. 3. The repair of stripe 1 is then complete, and stripe 1 has regained data redundancy.
It should be noted that the data written to stripe 1 may not completely fill the space on each disk occupied by stripe 1. For example, suppose stripe 1 occupies 16k of space on each of disk 1, disk 2 and the hot spare disk, and the data to be written is only 4k in size, so that it can only be written to the first 4k of the space on disk 1. The invention then performs the corresponding calculation on the data in the first 4k of disk 1 and of disk 2 to obtain the data for the first 4k of the hot spare disk space occupied by stripe 1, and writes the calculated data there. In this case the invention does not consider the repair of stripe 1 complete, and still regards stripe 1 as not having regained data redundancy. That is, only when stripe 1 has been written completely full does the invention consider the repair of stripe 1 complete and its data redundancy restored.
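The distinction between a partial write and a full-stripe write in Mode 1 can be sketched as follows. The layout is a simplifying assumption, not something the text specifies: the stripe's data area is modelled as the data-disk chunks laid end to end, the hot spare chunk is assumed to hold the XOR of the data chunks, and `write_to_stripe` and its parameters are hypothetical names.

```python
def write_to_stripe(stripe_id, offset, payload, data_chunks, spare_chunk, nvram):
    """Write payload into a recorded stripe and refresh the hot spare chunk.

    data_chunks: list of bytearray, one per data disk occupied by the stripe.
    spare_chunk: bytearray on the hot spare, assumed to hold the XOR of them.
    Returns True only if the stripe is considered repaired after the write.
    """
    chunk_size = len(spare_chunk)
    capacity = chunk_size * len(data_chunks)
    if offset < 0 or offset + len(payload) > capacity:
        raise ValueError("write falls outside the stripe")
    for i, byte in enumerate(payload):
        disk, pos = divmod(offset + i, chunk_size)
        data_chunks[disk][pos] = byte
        # Recalculate the hot spare byte from all data disks at this position.
        value = 0
        for chunk in data_chunks:
            value ^= chunk[pos]
        spare_chunk[pos] = value
    if offset == 0 and len(payload) == capacity:
        # Only a write that fills the whole stripe completes the repair.
        nvram.discard(stripe_id)
    return stripe_id not in nvram
```

A 4-byte write into a stripe with 16-byte chunks updates the first 4 bytes of the hot spare but leaves the identifier recorded, mirroring the 4k-in-16k example above at a smaller scale.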
Mode 2:
Mode 2 is implemented by writing preset data, such as 0, to a stripe whose identifier is recorded in the non-volatile memory (stripe 1 in Fig. 3 is taken as the example; the principle is the same for the other stripes, such as stripes 3, 5 and 6 in Fig. 3). Mode 2 forcibly repairs the reconstruction read error of stripe 1 in order to restore its data redundancy. Specifically, the address of stripe 1 is determined from the identifier of stripe 1 recorded in the non-volatile memory, and the importance of the data stored on stripe 1 is analyzed according to that address. If the importance of the data is below a preset threshold, the reconstruction read error of stripe 1 is repaired as follows: the disks other than the hot spare disk occupied by stripe 1 are filled with the preset data, and the hot spare disk space occupied by stripe 1 is written with data calculated from the preset data on those other disks. The repair of stripe 1 is then complete, and stripe 1 has regained data redundancy. If the importance of the data is greater than or equal to the preset threshold, Mode 1 above can be used instead.
Modes 1 and 2 above thus implement the repair of a stripe's reconstruction read error. Once a stripe has been repaired, its identifier is deleted from the non-volatile memory.
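A minimal sketch of the forced repair in Mode 2 is given below. How the importance of the stored data is determined is not described in the text, so it is simply passed in as a number; `force_repair`, `importance`, `threshold` and `fill` are hypothetical names, and XOR is again assumed as the redundancy calculation.

```python
def force_repair(stripe_id, importance, threshold, data_chunks, spare_chunk,
                 nvram, fill=0):
    """Overwrite a recorded stripe with preset data if its data is unimportant.

    Returns True if the stripe was repaired, False if it was left for Mode 1.
    """
    if importance >= threshold:
        return False  # important data: wait for a full-stripe write instead
    for chunk in data_chunks:
        chunk[:] = bytes([fill]) * len(chunk)  # fill the non-spare disks
    # The hot spare gets the value calculated from the preset data.
    parity = 0
    for _ in data_chunks:
        parity ^= fill
    spare_chunk[:] = bytes([parity]) * len(spare_chunk)
    nvram.discard(stripe_id)  # repair complete: delete the identifier
    return True
```

The original content of the stripe is discarded by this mode, which is why the sketch, like the text, gates it on the importance check.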
Embodiment 1 has been described above; Embodiment 2 is described below.
Embodiment 2:
Embodiment 2 differs from Embodiment 1: Embodiment 1 mainly deals with reconstruction read errors, whereas Embodiment 2 mainly describes the process of performing service reads on a disk array in the degraded state.
The disk array in the degraded state in Embodiment 2 may be any disk array that has lost redundancy, for example a disk array before or during reconstruction, or a disk array whose reconstruction has stopped because of a reconstruction read error; the embodiment of the invention places no limitation on this. While service reads are performed on the stripes of such a disk array, when a service read error occurs on the stripe currently being read, the identifier of the current stripe is recorded in non-volatile memory, an error response is returned for the command, and the disk array is controlled to continue to provide service reads and service writes and to remain in the degraded state; see Fig. 4. In Fig. 4, a service read error occurred on the hatched stripe 1, which is recorded in the non-volatile memory. In the existing approach, once a disk array is in the degraded state, a further service read error causes the array to fail; it can no longer provide service, and the previously stored data is at risk of being lost. In the present invention, by contrast, when a service read error occurs on a disk array in the degraded state, the data in question cannot be read, but the disk array is controlled to continue to provide service externally: its read and write channels remain open and its state remains unchanged. This ensures the continuity of subsequent service and removes the risk of losing previously stored data. The advantage is very clear for applications in which disk failures are few and the failure of some data reads has little impact, such as surveillance applications.
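The service-read behaviour of Embodiment 2 can be sketched as a small class. All names here (`DegradedArray`, `service_read`, `ReadError`, the `("ok", data)` / `("error", None)` return convention) are hypothetical, and the non-volatile memory is modelled as a set.

```python
class ReadError(Exception):
    """Raised by the hypothetical stripe reader when a disk read fails."""


class DegradedArray:
    """Toy model of a degraded array that survives service read errors."""

    def __init__(self, read_stripe):
        self.state = "degraded"
        self.io_open = True
        self.nvram = set()  # stands in for non-volatile memory
        self._read_stripe = read_stripe

    def service_read(self, stripe):
        try:
            return ("ok", self._read_stripe(stripe))
        except ReadError:
            # Record the stripe identifier and report the error to the caller;
            # the array state and the I/O channel are deliberately left as-is.
            self.nvram.add(stripe)
            return ("error", None)
```

After a failed read the object still reports `state == "degraded"` and `io_open is True`, so later reads and writes to other stripes proceed normally.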
In Embodiment 2, the stripe on which a service read error occurred is recorded in non-volatile memory mainly so that it can be identified easily and its service read error repaired as soon as possible.
In Embodiment 2, the service read error of a stripe can be repaired by writing. The repair operation is similar to Mode 1 and Mode 2 in Embodiment 1, specifically:
First method:
The first method is implemented by writing the whole stripe. Specifically, when data is written to a stripe whose identifier is recorded in the non-volatile memory (stripe 1 in Fig. 4 is taken as the example) and the data exactly fills the disk space occupied by stripe 1, the repair of stripe 1 is determined to be complete, and stripe 1 has regained data redundancy.
Second method:
The second method is implemented by writing preset data, such as 0, to a stripe whose identifier is recorded in the non-volatile memory (stripe 1 in Fig. 4 is taken as the example). The second method forcibly repairs the service read error of stripe 1 in order to restore its data redundancy. Specifically, the address of stripe 1 is determined from the identifier of stripe 1 recorded in the non-volatile memory, and the importance of the data stored on stripe 1 is analyzed according to that address. If the importance of the data is below a preset threshold, the service read error of stripe 1 is repaired by filling each disk occupied by stripe 1 (disk 1 and disk 2 in Fig. 4) with the preset data. The repair of stripe 1 is then complete, and stripe 1 has regained data redundancy. If the importance of the data is greater than or equal to the preset threshold, the first method is used.
The first and second methods above thus implement the repair of a stripe's service read error. Once a stripe has been repaired, its identifier is deleted from the non-volatile memory.
This completes the description of Embodiment 2.
The methods provided by the embodiments of the invention have been described above; the devices provided by the embodiments of the invention are described below.
Referring to Fig. 5, Fig. 5 is a structural diagram of a device provided by an embodiment of the present invention. This device corresponds to Embodiment 1 and comprises a replacement unit and a reconstruction unit;
The replacement unit is configured to add a hot spare disk to the disk array when a disk in the disk array fails, so as to replace the failed disk;
The reconstruction unit is configured to reconstruct, stripe by stripe, the disk array to which the hot spare disk has been added;
Crucially, as shown in Fig. 5, the device further comprises:
A recording unit, configured to, when a reconstruction read error occurs on the stripe currently being reconstructed by the reconstruction unit, record the identifier of the current stripe in non-volatile memory and trigger the reconstruction unit to skip the current stripe and continue reconstruction from the next stripe until reconstruction of the disk array is complete;
A repair unit, configured to, for each stripe identifier recorded in the non-volatile memory, repair the reconstruction read error of the stripe corresponding to that identifier by writing, and delete the stripe identifier from the non-volatile memory once the repair is complete.
In the present invention, the repair unit repairs the reconstruction read error of the stripe corresponding to a stripe identifier by writing data to the whole of that stripe; or it determines the importance of the data stored on the stripe corresponding to the stripe identifier and, if the importance is below a preset threshold, repairs the reconstruction read error of the stripe as follows: the disks other than the hot spare disk occupied by the stripe are filled with preset data, and the hot spare disk space occupied by the stripe is written with data calculated from the preset data on those other disks.
Preferably, as shown in Fig. 5, the device further comprises:
A processing unit, configured to, when data needs to be read from the hot spare disk occupied by a stripe whose identifier is recorded in the non-volatile memory, issue no read command and instead calculate the data that would have been read from the hot spare disk from the data on the other disks occupied by the stripe; and, when data needs to be read from a disk other than the hot spare disk occupied by such a stripe, issue a read command to that disk and read the data according to the read command.
This completes the description of the device shown in Fig. 5.
Referring to Fig. 6, Fig. 6 is a structural diagram of another device provided by an embodiment of the present invention. This device corresponds to Embodiment 2. As shown in Fig. 6, it comprises a service read processing unit, a recording unit, a control unit and a repair unit, wherein:
The service read processing unit is configured to perform service reads on the stripes of a disk array in the degraded state;
The recording unit is configured to, when a service read error occurs on the stripe currently being read, record the identifier of the current stripe in non-volatile memory and trigger the control unit to control the disk array to remain in the degraded state and continue to provide service;
The repair unit is configured to, for each stripe identifier recorded in the non-volatile memory, repair the service read error of the stripe corresponding to that identifier by writing, and delete the stripe identifier from the non-volatile memory once the repair is complete.
The repair unit repairs the service read error of the stripe corresponding to a stripe identifier by writing data to the whole of that stripe; or,
It determines the importance of the data stored on the stripe corresponding to the stripe identifier and, if the importance is below a preset threshold, repairs the service read error of that stripe by writing preset data to the disks occupied by the stripe.
This completes the description of the device shown in Fig. 6.
As can be seen from the above technical solution, in the present invention, when a reconstruction read error occurs on the current stripe, reconstruction is not stopped. Instead, the stripe on which the reconstruction read error occurred is recorded in non-volatile memory and reconstruction continues from the next stripe. For each stripe on which a reconstruction read error occurred, the error is repaired by writing, so that the redundancy of the disk array is restored as soon as possible. In this way, even if several disks fail during reconstruction, the whole disk array does not fail.
Furthermore, in the present invention, when a service read error occurs on the current stripe, the whole disk array is not made to fail either. Instead, the identifier of the current stripe is recorded in non-volatile memory, an error response is returned for the command, and the disk array is controlled to continue to provide service reads and service writes. This ensures service continuity on the one hand and avoids the risk of data loss on the other.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the invention.

Claims (10)

1. the fault-tolerance approach of a disk array, in the method, when the disk in disk array breaks down, in described disk array, increase HotSpare disk, the disk breaking down to replace this, and take band and to having increased the disk array of HotSpare disk, rebuild as unit; It is characterized in that, the method comprises:
When rebuilt current band occurs to rebuild read error, the identification record of band before deserving, in Nonvolatile memory, and is skipped to current band, from next band, continue to rebuild, until complete the reconstruction of disk array;
For each tape identification recording in described Nonvolatile memory, by the reconstruction read error of the WriteMode reparation band corresponding with this tape identification, and after completing reparation, from described Nonvolatile memory, delete this tape identification.
2. method according to claim 1, is characterized in that, the described reconstruction read error by the WriteMode reparation band corresponding with this tape identification comprises:
By writing to the whole band corresponding with this tape identification the reconstruction read error that data are repaired the band corresponding with this tape identification; Or,
Determine the significance level of band the store data corresponding with this tape identification, if determine the significance level of these data, be less than setting threshold, the reconstruction read error of repairing this band by following operation: write full setting data to other disks except HotSpare disk that this band is shared, and write to the shared HotSpare disk of this band the data that calculate according to the setting data in described other disks.
3. method according to claim 1, is characterized in that, the method further comprises:
When need to be shared to the band corresponding with described Nonvolatile memory discal patch tape identification HotSpare disk reading out data time, do not issue read command, utilize the data in shared other disks except HotSpare disk of this band to calculate the data that need to read from HotSpare disk;
When need to be to the band corresponding with described Nonvolatile memory discal patch tape identification shared other disk reading out datas except HotSpare disk, to these other disks, issue read command, with according to this read command reading out data.
4. a fault-tolerance approach for disk array, is characterized in that, the method comprises:
Band in the disk array in degrading state carries out in process that business reads, when read current band generation business read error time, the identification record of band before deserving, in Nonvolatile memory, is controlled to this disk array and kept reduction state, and continue to provide business;
For each tape identification recording in described Nonvolatile memory, by the business read error of the WriteMode reparation band corresponding with this tape identification, and after completing reparation, from described Nonvolatile memory, delete this tape identification.
5. method according to claim 4, is characterized in that, the described business read error by the WriteMode reparation band corresponding with this tape identification comprises:
By writing to the whole band corresponding with this tape identification the business read error that data are repaired the band corresponding with this tape identification; Or,
Determine the significance level of band the store data corresponding with this tape identification, if determine the significance level of these data, be less than setting threshold, the business read error of repairing the band corresponding with this tape identification by following operation: to writing setting data with the shared disk of band corresponding to this tape identification.
6. A fault tolerance device for a disk array, the device comprising a replacement unit and a reconstruction unit; the replacement unit being configured to, when a disk in the disk array fails, add a hot spare disk to the disk array to replace the failed disk; the reconstruction unit being configured to rebuild, stripe by stripe, the disk array to which the hot spare disk has been added; characterized in that the device further comprises:
A recording unit, configured to, when a rebuild read error occurs on the stripe currently being rebuilt by the reconstruction unit, record the identifier of the current stripe in a non-volatile memory and trigger the reconstruction unit to skip the current stripe and continue rebuilding from the next stripe until the rebuild of the disk array is completed;
A repair unit, configured to, for each stripe identifier recorded in the non-volatile memory, repair, by writing, the rebuild read error of the stripe corresponding to that stripe identifier, and delete the stripe identifier from the non-volatile memory after the repair is completed.
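The interaction between the reconstruction, recording and repair units in claim 6 might look like the following Python sketch. The function names, the callbacks `rebuild_stripe` and `write_stripe`, and the use of `OSError` to signal a rebuild read error are all assumptions chosen for illustration; a `set` again stands in for the non-volatile memory.

```python
def rebuild_array(stripe_ids, rebuild_stripe, nvram):
    """Rebuild stripe by stripe; record and skip any stripe that fails to read."""
    for stripe_id in stripe_ids:
        try:
            rebuild_stripe(stripe_id)
        except OSError:
            # Rebuild read error: record the identifier and move on to the
            # next stripe instead of aborting the whole rebuild.
            nvram.add(stripe_id)


def repair_recorded_stripes(nvram, write_stripe):
    """Repair each recorded stripe by writing, then delete its identifier."""
    for stripe_id in sorted(nvram):    # sorted() copies, so removal is safe
        write_stripe(stripe_id)
        nvram.discard(stripe_id)
```

The point of the sketch is that a single unreadable stripe no longer aborts the rebuild: the loop always reaches the last stripe, and the skipped stripes are handled afterwards.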
7. The device according to claim 6, characterized in that the repair unit repairs the rebuild read error of the stripe corresponding to the stripe identifier by writing data to the entire stripe; or determines the importance of the data stored in the stripe corresponding to the stripe identifier and, if the importance of the data is determined to be less than a set threshold, repairs the rebuild read error of the stripe by the following operation: filling the other disks occupied by the stripe, other than the hot spare disk, with preset data, and writing to the hot spare disk occupied by the stripe the data calculated from the preset data on the other disks.
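The second repair option of claim 7 can be sketched as follows. The claim only says the hot spare receives data "calculated from" the preset data; this sketch assumes single-parity XOR as in RAID 5, which is an assumption rather than something the claim states. The names `repair_by_fill` and `xor_blocks`, the dict-based stripe model, and the numeric `importance` value are likewise hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (assumed RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_by_fill(stripe, hot_spare, importance, threshold, fill_byte=0x00):
    """Repair a low-importance stripe by overwriting it with preset data.

    `stripe` maps a disk name to that disk's chunk of the stripe.
    Returns True if the stripe was repaired this way, False otherwise.
    """
    if importance >= threshold:
        return False  # too important to overwrite; needs a real whole-stripe write
    size = len(stripe[hot_spare])
    others = [disk for disk in stripe if disk != hot_spare]
    for disk in others:
        stripe[disk] = bytes([fill_byte]) * size   # fill with the preset data
    # The hot spare gets the value computed from the other disks' preset data.
    stripe[hot_spare] = xor_blocks([stripe[disk] for disk in others])
    return True
```

After such a repair the stripe is self-consistent again (the XOR of all its chunks is zero), at the cost of discarding the original low-importance data.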
8. The device according to claim 6, characterized in that the device further comprises:
A processing unit, configured to: when data needs to be read from the hot spare disk occupied by a stripe whose stripe identifier is recorded in the non-volatile memory, issue no read command and instead calculate the data to be read from the hot spare disk using the data on the other disks, other than the hot spare disk, occupied by the stripe; and when data needs to be read from one of the other disks, other than the hot spare disk, occupied by such a stripe, issue a read command to that disk and read the data according to the read command.
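The read routing performed by the processing unit of claim 8 might be sketched as below. As before, XOR reconstruction is an assumption (the claim only says the data is calculated from the other disks), and `read_chunk`, `read_disk` and the argument layout are hypothetical names introduced for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (assumed RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_chunk(stripe_id, disk, stripe_disks, hot_spare, nvram, read_disk):
    """Read one disk's chunk of a stripe, bypassing the hot spare if needed.

    `read_disk(disk, stripe_id)` is a hypothetical callback that issues the
    actual read command and returns the chunk as bytes.
    """
    if stripe_id in nvram and disk == hot_spare:
        # The stripe was skipped during rebuild, so the hot spare holds no
        # valid data for it: issue no read command to the hot spare and
        # compute the chunk from the other disks instead.
        others = [d for d in stripe_disks if d != hot_spare]
        return xor_blocks([read_disk(d, stripe_id) for d in others])
    # Any other disk, or a stripe that is not recorded: issue a normal read.
    return read_disk(disk, stripe_id)
```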
9. A fault tolerance device for a disk array, characterized in that the device comprises a service read processing unit, a recording unit, a control unit and a repair unit, wherein:
The service read processing unit is configured to perform service reads on stripes of a disk array in a degraded state;
The recording unit is configured to, when a service read error occurs on the stripe currently being read, record the identifier of the current stripe in a non-volatile memory and trigger the control unit to keep the disk array in the degraded state and continue to provide service;
The repair unit is configured to, for each stripe identifier recorded in the non-volatile memory, repair, by writing, the service read error of the stripe corresponding to that stripe identifier, and delete the stripe identifier from the non-volatile memory after the repair is completed.
10. The device according to claim 9, characterized in that the repair unit repairs the service read error of the stripe corresponding to the stripe identifier by writing data to the entire stripe; or,
Determines the importance of the data stored in the stripe corresponding to the stripe identifier and, if the importance of the data is determined to be less than a set threshold, repairs the service read error of the stripe by the following operation: writing preset data to the disks occupied by the stripe.
CN201110106601.7A 2011-04-27 2011-04-27 Fault tolerance method and device for disk arrays Active CN102184129B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110106601.7A CN102184129B (en) 2011-04-27 2011-04-27 Fault tolerance method and device for disk arrays

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110106601.7A CN102184129B (en) 2011-04-27 2011-04-27 Fault tolerance method and device for disk arrays

Publications (2)

Publication Number Publication Date
CN102184129A CN102184129A (en) 2011-09-14
CN102184129B CN102184129B (en) 2014-03-12

Family

ID=44570309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110106601.7A Active CN102184129B (en) 2011-04-27 2011-04-27 Fault tolerance method and device for disk arrays

Country Status (1)

Country Link
CN (1) CN102184129B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981921A (en) * 2012-12-17 2013-03-20 浙江宇视科技有限公司 Restoring method and device for IO read failure in a RAID5 array
CN103186437A (en) * 2013-04-02 2013-07-03 浪潮电子信息产业股份有限公司 Method for upgrading hybrid hard disk array system
CN103823728B (en) * 2014-03-13 2015-11-18 深圳市迪菲特科技股份有限公司 Method for intelligent reconstruction of a RAID array
CN104035840B (en) * 2014-06-12 2017-10-31 浙江宇视科技有限公司 Method and apparatus for recovering from stripe read errors
CN104268028B (en) * 2014-09-22 2017-12-15 浙江宇视科技有限公司 Method and apparatus for reading disk data based on disk read error classification
CN104850359B (en) * 2015-05-29 2019-01-15 浙江宇视科技有限公司 RAID array reconstruction method and device
CN105183589A (en) * 2015-08-31 2015-12-23 安徽欧迈特数字技术有限责任公司 Disk array fault tolerance apparatus
CN105183590A (en) * 2015-08-31 2015-12-23 安徽欧迈特数字技术有限责任公司 Disk array fault tolerance processing method
CN105094712B (en) * 2015-09-30 2019-01-11 浙江宇视科技有限公司 Data processing method and device
CN106933708B (en) * 2015-12-29 2020-03-20 伊姆西Ip控股有限责任公司 Method and device for facilitating storage system recovery and storage system
CN106528342A (en) * 2016-11-11 2017-03-22 安徽维德工业自动化有限公司 Disk array fault tolerance apparatus with cloud server backup function
CN107391042A (en) * 2017-07-28 2017-11-24 郑州云海信息技术有限公司 Design method and system for a disk array
CN108279993B (en) * 2018-01-03 2021-08-24 创新先进技术有限公司 Method and device for realizing service degradation and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276302A (en) * 2007-03-29 2008-10-01 中国科学院计算技术研究所 Magnetic disc fault processing and data restructuring method in magnetic disc array system
CN101436149A (en) * 2008-12-19 2009-05-20 华中科技大学 Method for rebuilding data of magnetic disk array
CN102023902A (en) * 2010-12-28 2011-04-20 创新科存储技术有限公司 Disc array reconstruction method

Also Published As

Publication number Publication date
CN102184129A (en) 2011-09-14

Similar Documents

Publication Publication Date Title
CN102184129B (en) Fault tolerance method and device for disk arrays
US9189311B2 (en) Rebuilding a storage array
US8239714B2 (en) Apparatus, system, and method for bad block remapping
US9009526B2 (en) Rebuilding drive data
US7911840B2 (en) Logged-based flash memory system and logged-based method for recovering a flash memory system
CN102023815B (en) Implementing RAID in solid-state memory
KR100701563B1 (en) Storage control apparatus and method
US8356292B2 (en) Method for updating control program of physical storage device in storage virtualization system and storage virtualization controller and system thereof
CN104035830A (en) Method and device for recovering data
CN103718162A (en) Method and apparatus for flexible raid in ssd
US20100070796A1 (en) Storage utilization to improve reliability using impending failure triggers
US20080126840A1 (en) Method for reconstructing data in case of two disk drives of raid failure and system therefor
CN102508620B (en) Method for processing RAID5 (Redundant Array of Independent Disks) bad sector
CN104050056A (en) File system backup of multi-storage-medium device
CN104484251A (en) Method and device for processing faults of hard disk
CN101840360A (en) Rapid reconstruction method and device of RAID (Redundant Array of Independent Disk) system
CN103019894B (en) Reconstruction method for redundant array of independent disks
CN102968361A (en) RAID (Redundant Array of Independent Disk) data self-repairing method
CN107885620B (en) Method and system for improving performance and reliability of solid-state disk array
US10860446B2 (en) Failed storage device rebuild using dynamically selected locations in overprovisioned space
CN105183590A (en) Disk array fault tolerance processing method
JP4951493B2 (en) Disk array device
US20130179726A1 (en) Automatic remapping in redundant array of independent disks and related raid
CN106528342A (en) Disk array fault tolerance apparatus with cloud server backup function
CN102945191A (en) RAID5 (redundant array of independent disk 5) data transfer method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 466 Changhe Road, Binjiang District, Hangzhou, Zhejiang 310052, China

Patentee after: New H3C Technologies Co., Ltd.

Address before: Huawei Hangzhou Production Base, No. 310 Liuhe Road, Zhijiang Science and Technology Industrial Park, Hangzhou Hi-Tech Industrial Development Zone, Zhejiang 310053, China

Patentee before: Hangzhou H3C Technologies Co., Ltd.
