CN107844381A - Fault handling method and device for a storage system - Google Patents

Fault handling method and device for a storage system

Info

Publication number
CN107844381A
Authority
CN
China
Prior art keywords
disk
error rate
threshold value
reading
rate threshold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610837841.7A
Other languages
Chinese (zh)
Inventor
郑文武
李先绪
黄植勤
吴家隐
邱红飞
陈泳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN201610837841.7A priority Critical patent/CN107844381A/en
Publication of CN107844381A publication Critical patent/CN107844381A/en
Pending legal-status Critical Current


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079: Root cause analysis, i.e. error or fault diagnosis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766: Error or fault reporting or storing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/3003: Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034: Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/3058: Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a fault handling method and device for a storage system, and relates to the field of computer technology. The invention assesses the health of a disk before the disk fails. Once an at-risk disk is found, a hot spare disk is enabled immediately, but it is not yet added to the disk array. Because the at-risk disk can still work normally at this point, its data can be copied online to the hot spare within a short time, and subsequent write operations keep the hot spare consistent with the at-risk disk. Once the at-risk disk becomes a failed disk, the hot spare is added to the disk array immediately. Because the data on the hot spare and on the at-risk disk are fully consistent, the hot spare can take over from the failed disk at once. This avoids the lengthy reconstruction process that recovers data from parity data, which reduces the time during which data is unprotected and improves data security.

Description

Fault handling method and device for a storage system
Technical field
The present invention relates to the field of computer technology, and in particular to a fault handling method and device for a storage system.
Background
Disk arrays (Redundant Arrays of Independent Disks, RAID) are a technology commonly used in current storage systems to keep data safe and reliable. A RAID group consists of two or more disks. In addition to service data, the disks also hold parity data. After one or two disks in a RAID group fail, a new disk can be added to the RAID group manually or automatically, and the system uses the parity data on the healthy disks to recover the lost data onto the new disk. Today the new disk is usually added automatically; such a disk is called a hot spare disk. A hot spare is normally idle and is added to the RAID group only after a disk in the group fails. Once the hot spare has been added to the RAID group, the system restores the data onto it from the parity data and rebuilds the RAID group. This process is called reconstruction.
In a RAID group, each disk is divided into multiple data blocks by stripe, and reconstruction likewise recovers data block by block along the stripes. The system reads the parity data blocks, computes the recovered data, and then writes that recovered data to the hot spare. After one block has been recovered, it continues with the remaining blocks. This process is generally lengthy: in real production environments reconstruction takes at least 2 hours, and with larger data volumes it can take more than 10 hours or even several days. Even with the newer RAID 2.0 technology, reconstruction often still takes several hours.
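The block-by-block recovery described above can be illustrated with a minimal sketch. It assumes a single-parity layout in which the parity block of each stripe is the XOR of its data blocks; the patent does not specify the parity scheme, and the function names below are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_stripe(surviving_blocks):
    """Recompute the missing block of one stripe from the surviving
    data and parity blocks (single-parity assumption)."""
    return xor_blocks(surviving_blocks)


def rebuild_disk(stripes):
    """Rebuild the failed disk one stripe at a time, producing the
    blocks that would be written to the hot spare."""
    return [rebuild_stripe(stripe) for stripe in stripes]


# One stripe: three data blocks plus their parity block.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"
parity = xor_blocks([d0, d1, d2])

# The disk holding d1 has failed; rebuild its block from the others.
hot_spare_blocks = rebuild_disk([[d0, d2, parity]])
```

Every stripe requires reading all surviving blocks before one block can be written, which is why reconstruction time grows with the amount of data in the group.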
Because current disk array reconstruction is so time-consuming, and because no further disk failure may occur during reconstruction (otherwise the data recovery process is interrupted and data is lost), data remains unprotected for a long time during reconstruction and data security is seriously threatened.
Summary of the invention
One technical problem to be solved by the present invention is how to ensure that no data is lost when a disk fails while also reducing the time during which data is unprotected in the reconstruction process, thereby improving data security.
According to one aspect of the present invention, a fault handling method for a storage system is provided, including: obtaining the operating parameters of a disk; judging from the operating parameters whether the disk is at risk; synchronizing the data of an at-risk disk to a hot spare disk in real time; and, when the at-risk disk fails, using the hot spare to take over the work of the failed disk.
In one embodiment, the operating parameters of the disk are obtained from the disk's Self-Monitoring, Analysis and Reporting Technology (SMART) data.
In one embodiment, the operating parameters of the disk include the disk's read error rate and/or write error rate. Judging from the operating parameters whether the disk is at risk includes: comparing the read error rate with a read error rate threshold, and/or comparing the write error rate with a write error rate threshold; if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, the disk is determined to be at risk.
In one embodiment, the read error rate threshold ranges from 15% to 25%, and the write error rate threshold ranges from 30% to 50%.
In one embodiment, the operating parameters of the disk include read speed and/or write speed. Judging from the operating parameters whether the disk is at risk includes: comparing the read speed with a read speed threshold, and/or comparing the write speed with a write speed threshold; if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, the disk is determined to be at risk.
According to another aspect of the present invention, a fault handling device for a storage system is provided, including: a disk parameter acquisition unit, configured to obtain the operating parameters of a disk; a disk state judging unit, configured to judge from the operating parameters whether the disk is at risk; a disk data synchronization unit, configured to synchronize the data of an at-risk disk to a hot spare disk in real time; and a disk replacement unit, configured to use the hot spare to take over the work of the failed disk when the at-risk disk fails.
In one embodiment, the disk parameter acquisition unit is configured to obtain the operating parameters of the disk from the disk's Self-Monitoring, Analysis and Reporting Technology (SMART) data.
In one embodiment, the operating parameters of the disk include the disk's read error rate and/or write error rate. The disk state judging unit is configured to compare the read error rate with a read error rate threshold, and/or compare the write error rate with a write error rate threshold; if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, the disk is determined to be at risk.
In one embodiment, the read error rate threshold ranges from 15% to 25%, and the write error rate threshold ranges from 30% to 50%.
In one embodiment, the operating parameters of the disk include read speed and/or write speed. The disk state judging unit is configured to compare the read speed with a read speed threshold, and/or compare the write speed with a write speed threshold; if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, the disk is determined to be at risk.
The present invention assesses the health of a disk before the disk fails. Once an at-risk disk is found, a hot spare is enabled immediately, but it is not yet added to the disk array. Because the at-risk disk can still work normally at this point, its data can be copied online to the hot spare within a short time, and subsequent write operations keep the hot spare consistent with the at-risk disk. Once the at-risk disk becomes a failed disk, the hot spare is added to the disk array immediately. Because the data on the hot spare and on the at-risk disk are fully consistent, the hot spare can take over from the failed disk at once. This avoids the lengthy reconstruction process that recovers data from parity data, which reduces the time during which data is unprotected and improves data security.
Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of the fault handling method for a storage system according to one embodiment of the present invention.
Fig. 2 is a schematic flow chart of the fault handling method for a storage system according to an application example of the present invention.
Fig. 3 is a schematic structural diagram of the fault handling device for a storage system according to one embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its use. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
This solution is proposed to address the problem that, in the prior art, disk array reconstruction is time-consuming and data remains unprotected for a long time during reconstruction, so that data security is seriously threatened.
The fault handling method for a storage system of the present invention is described below with reference to Fig. 1.
Fig. 1 is a flow chart of one embodiment of the fault handling method for a storage system of the present invention. As shown in Fig. 1, the method of this embodiment includes:
Step S102: obtain the operating parameters of the disk.
The operating parameters of the disk can be obtained, for example, from SMART (Self-Monitoring, Analysis and Reporting Technology) data, which records information such as the disk's error rates and read/write speeds.
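As an illustration of what step S102 might hand to the later steps, the sketch below packs the error rates and read/write speeds into one record. The record layout, the counter keys, and the way the rates are derived from counters are all assumptions made for this example; the patent only states that SMART data records this kind of information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskParams:
    """Operating parameters of one disk (hypothetical record layout)."""
    disk_id: str
    read_error_rate: float   # fraction of failed reads, 0.0 to 1.0
    write_error_rate: float  # fraction of failed writes, 0.0 to 1.0
    read_speed: float        # for example MB/s
    write_speed: float       # for example MB/s


def _rate(errors, total):
    """Error fraction, treating a disk with no operations as error-free."""
    return errors / total if total else 0.0


def params_from_counters(disk_id, counters):
    """Build DiskParams from a dict of counters (assumed, illustrative keys)."""
    return DiskParams(
        disk_id=disk_id,
        read_error_rate=_rate(counters["read_errors"], counters["reads"]),
        write_error_rate=_rate(counters["write_errors"], counters["writes"]),
        read_speed=counters["read_speed"],
        write_speed=counters["write_speed"],
    )


sample = params_from_counters(
    "disk-0",
    {"read_errors": 5, "reads": 100, "write_errors": 1, "writes": 50,
     "read_speed": 120.0, "write_speed": 90.0},
)
```

In a real system the counters would come from whatever SMART reader the platform provides; that part is deliberately left out here.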
Step S104: judge from the operating parameters whether the disk is at risk.
The present invention provides two reference schemes for judging whether a disk is at risk.
First, the judgment can be based on the disk's read error rate and/or write error rate. Specifically, the read error rate is compared with a read error rate threshold, and/or the write error rate is compared with a write error rate threshold; if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, the disk is determined to be at risk. Either rate alone can be used, or the two can be combined, which is more accurate than using only one of them as the criterion. The inventors found that if the thresholds are set too high (too large a read or write error rate is tolerated), detections may be missed, so that some disks that are about to fail are not identified as at-risk disks. If the thresholds are set too low (only a very small read or write error rate is tolerated), the system may flag too many at-risk disks, and flagged disks may continue to run normally for a long time, so that hot spares are consumed prematurely. When the read error rate threshold is set to a value between 15% and 25%, for example 20%, and the write error rate threshold is set to a value between 30% and 50%, for example 40%, an impending disk failure can be predicted more accurately.
Second, the judgment can be based on the disk's read speed and/or write speed. Specifically, the read speed is compared with a read speed threshold, and/or the write speed is compared with a write speed threshold; if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, the disk is determined to be at risk. Either speed alone can be used, or the two can be combined, which is more accurate than using only one of them as the criterion.
Besides these two methods, other operating parameters can be chosen, according to requirements or practical experience, to judge whether a disk is at risk.
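The two judging schemes can be sketched as follows. The default error rate thresholds use the example values of 20% and 40% given above; the speed thresholds are not specified in the patent, so the caller must supply them. The function names and the `require_both` switch (which selects the combined criterion) are hypothetical.

```python
def _combine(checks, require_both):
    """Combine the individual checks with 'and' or 'or'."""
    if not checks:
        return False  # nothing measured, nothing to flag
    return all(checks) if require_both else any(checks)


def at_risk_by_error_rate(read_error_rate=None, write_error_rate=None,
                          read_threshold=0.20, write_threshold=0.40,
                          require_both=False):
    """Scheme 1: a rate that exceeds its threshold marks the disk at risk."""
    checks = []
    if read_error_rate is not None:
        checks.append(read_error_rate > read_threshold)
    if write_error_rate is not None:
        checks.append(write_error_rate > write_threshold)
    return _combine(checks, require_both)


def at_risk_by_speed(read_speed=None, write_speed=None, *,
                     read_threshold, write_threshold, require_both=False):
    """Scheme 2: a speed that falls below its threshold marks the disk at risk."""
    checks = []
    if read_speed is not None:
        checks.append(read_speed < read_threshold)
    if write_speed is not None:
        checks.append(write_speed < write_threshold)
    return _combine(checks, require_both)
```

With `require_both=False` a single bad parameter is enough to flag the disk, which catches more failures but consumes hot spares sooner; `require_both=True` corresponds to the stricter combined criterion.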
The inventors found that once an operating parameter reaches its defined threshold, the disk can be considered at risk even though it can still be read and written correctly: it will most likely become a failed disk that cannot be read or written normally within a short time. After a new disk has entered a stable working state in which chance failures are rare, failure prediction based on SMART data is highly accurate; the better the disk quality and the more stable its operation, the higher the prediction accuracy. In practice, the accuracy exceeds 90%.
Step S106: synchronize the data of the at-risk disk to the hot spare in real time.
Real-time synchronization includes fully synchronizing the data already on the disk to the hot spare and, whenever the disk performs a write operation, synchronizing the data being written to the hot spare in real time.
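The two parts of real-time synchronization, a background full copy plus mirroring of new writes, can be sketched with in-memory lists standing in for the disks. The class and method names are hypothetical, and the model ignores concurrency and I/O errors.

```python
class MirroredDisk:
    """Keeps a hot spare consistent with an at-risk disk (in-memory model)."""

    def __init__(self, at_risk_blocks, spare_blocks):
        self.at_risk = at_risk_blocks   # list of blocks on the at-risk disk
        self.spare = spare_blocks       # list of the same length on the spare
        self.next_to_copy = 0           # progress marker of the full copy

    def copy_step(self, count=1):
        """Copy up to `count` not-yet-copied blocks to the spare.
        Returns True once the full copy has finished."""
        end = min(self.next_to_copy + count, len(self.at_risk))
        for i in range(self.next_to_copy, end):
            self.spare[i] = self.at_risk[i]
        self.next_to_copy = end
        return end == len(self.at_risk)

    def write(self, index, data):
        """Apply a write to the at-risk disk and mirror it to the spare."""
        self.at_risk[index] = data
        self.spare[index] = data

    def in_sync(self):
        """True when the full copy is done and both disks hold the same data."""
        return (self.next_to_copy == len(self.at_risk)
                and self.spare == self.at_risk)
```

Mirroring every write to the spare, even for blocks the full copy has not reached yet, is harmless: the later copy step simply rewrites the same data.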
Step S108: when the at-risk disk fails, use the hot spare to take over the work of the failed disk.
Because the hot spare has already synchronized the disk's data before the disk fails, it can take over as soon as the disk fails, with no need for data reconstruction.
If the disk's operating data does not accurately predict an impending failure, data is restored after the failure using the original reconstruction technique. In addition, because a disk's operating data may fluctuate, it can be monitored periodically. If a disk has been identified as at risk but, after a preset time, its operating data returns to normal and it no longer qualifies as at risk, the hot spare no longer needs to synchronize its data and is released, which avoids excessive consumption of hot spare resources.
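The cases described above (predicted failure, unpredicted failure, and recovery of an at-risk disk) can be summarized as a small state machine that is evaluated at each periodic check. This is a sketch only: the state and action names are hypothetical, and it assumes the full copy to the hot spare has already finished when a predicted failure occurs.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    AT_RISK = "at_risk"        # hot spare attached and kept in sync
    REPLACED = "replaced"      # hot spare has taken over
    REBUILDING = "rebuilding"  # conventional reconstruction from parity


def next_state(state, at_risk, failed):
    """Return (new_state, action) for one periodic monitoring round."""
    if state in (State.REPLACED, State.REBUILDING):
        return state, "none"
    if failed:
        if state is State.AT_RISK:
            return State.REPLACED, "swap_in_hot_spare"
        # The failure was not predicted: fall back to reconstruction.
        return State.REBUILDING, "reconstruct_from_parity"
    if state is State.HEALTHY and at_risk:
        return State.AT_RISK, "start_sync"
    if state is State.AT_RISK and not at_risk:
        # Parameters are back to normal after the preset time.
        return State.HEALTHY, "release_hot_spare"
    return state, "none"
```

Calling `next_state` once per monitoring interval keeps the preset-time check outside the function, so the interval can be tuned without touching the transition logic.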
The method of the above embodiment assesses the health of a disk before the disk fails. Once an at-risk disk is found, a hot spare is enabled immediately, but it is not yet added to the disk array. Because the at-risk disk can still work normally at this point, its data can be copied online to the hot spare within a short time, and subsequent write operations keep the hot spare consistent with the at-risk disk. Once the at-risk disk becomes a failed disk, the hot spare is added to the disk array immediately. Because the data on the hot spare and on the at-risk disk are fully consistent, the hot spare can take over from the failed disk at once. This avoids the lengthy reconstruction process that recovers data from parity data, which reduces the time during which data is unprotected and improves data security.
An application example of the fault handling method for a storage system of the present invention is described below with reference to Fig. 2.
Fig. 2 is a flow chart of an application example of the fault handling method for a storage system of the present invention. As shown in Fig. 2, the method of this application example includes:
Step S202: read the SMART parameters of one disk in the disk array.
Specifically, a DiskState(Re, We) function can be written to read the read error rate and the write error rate from the SMART parameters, and the read error rate threshold and write error rate threshold are set.
Step S204: judge whether the disk is an at-risk disk, that is, whether the read error rate exceeds the read error rate threshold of 20% and whether the write error rate exceeds the write error rate threshold of 40%. If the read error rate is greater than 20% and the write error rate is greater than 40%, perform step S206; otherwise, determine that the disk is not an at-risk disk and return to step S202 to read the SMART parameters of the next disk in the disk array.
Step S206: copy the data on the at-risk disk to the hot spare.
Step S208: keep the data on the hot spare and the at-risk disk synchronized.
Step S210: judge whether the at-risk disk has failed. If it has failed, perform step S212; otherwise, continue with step S208.
Step S212: add the hot spare to the disk array to replace the failed disk.
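Steps S202 to S212 can be strung together in a short sketch. Disks are modeled as in-memory lists of blocks and the failure signal as a sequence of booleans, one per monitoring round; the function names and the `hot_spare` key are hypothetical, and the patent's DiskState(Re, We) function is not reproduced because its definition is not given.

```python
def find_at_risk_disk(smart_by_disk, re_threshold=0.20, we_threshold=0.40):
    """S202/S204: scan the array and return the first disk whose read error
    rate AND write error rate both exceed their thresholds, else None."""
    for disk_id, (read_err, write_err) in smart_by_disk.items():
        if read_err > re_threshold and write_err > we_threshold:
            return disk_id
    return None


def protect_and_replace(array, spare, disk_id, failure_checks):
    """S206 to S212 for one at-risk disk.

    array: dict mapping disk id to its list of blocks
    spare: list that receives the mirrored blocks
    failure_checks: iterable of booleans, one per monitoring round
    Returns True if the hot spare replaced the failed disk.
    """
    spare[:] = array[disk_id]            # S206: initial full copy
    for failed in failure_checks:        # S210: has the disk failed?
        if failed:
            del array[disk_id]           # S212: swap the spare into the array
            array["hot_spare"] = spare
            return True
        spare[:] = array[disk_id]        # S208: keep the spare in sync
    return False
```

For example, with `smart = {"d0": (0.05, 0.10), "d1": (0.25, 0.45)}` the scan selects `"d1"`, and `protect_and_replace` then mirrors that disk until a failure is reported.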
The present invention also provides a fault handling device for a storage system, which is described below with reference to Fig. 3.
Fig. 3 is a structural diagram of one embodiment of the fault handling device for a storage system of the present invention. As shown in Fig. 3, the device 30 includes:
A disk parameter acquisition unit 302, configured to obtain the operating parameters of the disk.
Specifically, the disk parameter acquisition unit 302 is configured to obtain the operating parameters of the disk from the disk's Self-Monitoring, Analysis and Reporting Technology (SMART) data.
A disk state judging unit 304, configured to judge from the operating parameters whether the disk is at risk.
When the operating parameters of the disk include the disk's read error rate and/or write error rate, the disk state judging unit 304 is configured to compare the read error rate with a read error rate threshold, and/or compare the write error rate with a write error rate threshold; if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, the disk is determined to be at risk. The read error rate threshold ranges from 15% to 25%, for example 20%; the write error rate threshold ranges from 30% to 50%, for example 40%.
When the operating parameters of the disk include read speed and/or write speed, the disk state judging unit 304 is configured to compare the read speed with a read speed threshold, and/or compare the write speed with a write speed threshold; if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, the disk is determined to be at risk.
A disk data synchronization unit 306, configured to synchronize the data of the at-risk disk to the hot spare in real time.
A disk replacement unit 308, configured to use the hot spare to take over the work of the failed disk when the at-risk disk fails.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

  1. A fault handling method for a storage system, characterized by including:
    obtaining the operating parameters of a disk;
    judging from the operating parameters of the disk whether the disk is at risk;
    synchronizing the data of an at-risk disk to a hot spare disk in real time;
    when the at-risk disk fails, using the hot spare disk to take over the work of the failed disk.
  2. The method according to claim 1, characterized in that
    the operating parameters of the disk are obtained from the disk's Self-Monitoring, Analysis and Reporting Technology data.
  3. The method according to claim 1, characterized in that
    the operating parameters of the disk include the disk's read error rate and/or write error rate;
    judging from the operating parameters of the disk whether the disk is at risk includes:
    comparing the read error rate of the disk with a read error rate threshold, and/or comparing the write error rate with a write error rate threshold;
    if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, determining that the disk is at risk.
  4. The method according to claim 3, characterized in that
    the read error rate threshold ranges from 15% to 25%, and the write error rate threshold ranges from 30% to 50%.
  5. The method according to claim 1, characterized in that
    the operating parameters of the disk include read speed and/or write speed;
    judging from the operating parameters of the disk whether the disk is at risk includes:
    comparing the read speed of the disk with a read speed threshold, and/or comparing the write speed with a write speed threshold;
    if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, determining that the disk is at risk.
  6. A fault handling device for a storage system, characterized by including:
    a disk parameter acquisition unit, configured to obtain the operating parameters of a disk;
    a disk state judging unit, configured to judge from the operating parameters of the disk whether the disk is at risk;
    a disk data synchronization unit, configured to synchronize the data of an at-risk disk to a hot spare disk in real time;
    a disk replacement unit, configured to use the hot spare disk to take over the work of the failed disk when the at-risk disk fails.
  7. The device according to claim 6, characterized in that
    the disk parameter acquisition unit is configured to obtain the operating parameters of the disk from the disk's Self-Monitoring, Analysis and Reporting Technology data.
  8. The device according to claim 6, characterized in that
    the operating parameters of the disk include the disk's read error rate and/or write error rate;
    the disk state judging unit is configured to compare the read error rate of the disk with a read error rate threshold, and/or compare the write error rate with a write error rate threshold, and, if the read error rate exceeds the read error rate threshold and/or the write error rate exceeds the write error rate threshold, determine that the disk is at risk.
  9. The device according to claim 8, characterized in that
    the read error rate threshold ranges from 15% to 25%, and the write error rate threshold ranges from 30% to 50%.
  10. The device according to claim 6, characterized in that
    the operating parameters of the disk include read speed and/or write speed;
    the disk state judging unit is configured to compare the read speed of the disk with a read speed threshold, and/or compare the write speed with a write speed threshold, and, if the read speed is below the read speed threshold and/or the write speed is below the write speed threshold, determine that the disk is at risk.
CN201610837841.7A 2016-09-21 2016-09-21 The fault handling method and device of storage system Pending CN107844381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610837841.7A CN107844381A (en) 2016-09-21 2016-09-21 The fault handling method and device of storage system

Publications (1)

Publication Number Publication Date
CN107844381A true CN107844381A (en) 2018-03-27

Family

ID=61657642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610837841.7A Pending CN107844381A (en) 2016-09-21 2016-09-21 The fault handling method and device of storage system

Country Status (1)

Country Link
CN (1) CN107844381A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109445708A (en) * 2018-11-02 2019-03-08 南方电网调峰调频发电有限公司 A kind of transparent fault transfer method based on the privately owned cloud platform of database
CN110515756A (en) * 2019-07-26 2019-11-29 济南浪潮数据技术有限公司 A kind of trouble-saving method, apparatus, equipment and the storage medium of storage system
CN110795276A (en) * 2018-08-01 2020-02-14 阿里巴巴集团控股有限公司 Storage medium repairing method, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1959647A (en) * 2005-11-04 2007-05-09 英业达股份有限公司 Method for establishing stable memory mechanism
US20080178038A1 (en) * 2004-05-06 2008-07-24 International Business Machines Corporation Low cost raid with seamless disk failure recovery
CN102129397A (en) * 2010-12-29 2011-07-20 深圳市永达电子股份有限公司 Method and system for predicating self-adaptive disk array failure
US8473779B2 (en) * 2008-02-29 2013-06-25 Assurance Software And Hardware Solutions, Llc Systems and methods for error correction and detection, isolation, and recovery of faults in a fail-in-place storage array
CN105468484A (en) * 2014-09-30 2016-04-06 伊姆西公司 Method and apparatus for determining fault location in storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180327