CN106528342A - Disk array fault tolerance apparatus with cloud server backup function - Google Patents

Disk array fault tolerance apparatus with cloud server backup function

Info

Publication number
CN106528342A
Authority
CN
China
Prior art keywords: unit, stripe, disk, data, disk array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610992459.3A
Other languages
Chinese (zh)
Inventor
陈家满
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Weide Industrial Automation Co Ltd
Original Assignee
Anhui Weide Industrial Automation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Weide Industrial Automation Co Ltd
Priority to CN201610992459.3A
Publication of CN106528342A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1456 Hardware arrangements for backup
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a disk array fault tolerance apparatus with a cloud server backup function. The apparatus comprises a replacement unit, a rebuilding unit, a recording unit, a processing unit, a repairing unit, and a recovering unit. The recording unit is connected to the rebuilding unit, the processing unit, the repairing unit, and the recovering unit, and is connected to a cloud server backup storage through a connection device. The rebuilding unit is connected to the replacement unit. A rebuild read error in a stripe is repaired by writing, and the redundancy of the disk array is restored as soon as possible, which prevents the whole disk array from failing because multiple disks fail during a rebuild. The apparatus also has a cloud server backup function: when a local disk suffers an irreparable fault, the backed-up data can be recovered from the storage of the cloud server.

Description

Disk array fault tolerance apparatus with cloud server backup
Technical field:
The present invention relates to the field of disk storage systems, and specifically to a disk array fault tolerance apparatus with cloud server backup.
Background art:
RAID (Redundant Array of Independent Disks), disk array for short, combines multiple independent disks into one array, providing good redundancy and higher storage performance than a single disk. In the storage field, the redundancy of the disk array itself is used to store data directly or indirectly on multiple separate disks, so that data is not lost when one or more disks fail; that is, data fault tolerance is achieved.
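As an illustration of how such redundancy lets a lost block be recomputed, here is a minimal sketch that assumes single XOR parity (as in RAID 5). The patent does not name a RAID level, so the parity scheme and the helper name `xor_blocks` are assumptions made only for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe, each on a different disk (illustrative values).
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data)

# If the disk holding data[1] fails, its block is the XOR of all surviving blocks.
recovered = xor_blocks([data[0], data[2], parity])
```

With single parity, only one missing block per stripe can be recomputed this way, which is why a second failure during a rebuild is so damaging.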
When a disk array loses redundancy for some reason, such as a disk failure within the array, the array enters a degraded state. Take as an example an array that is degraded because a disk failure has cost it its redundancy. In the prior art, the usual way to restore the redundancy of such an array is to rebuild onto an added hot spare disk, that is, to replace the failed disk with the hot spare disk. However, if a rebuild read error occurs during the rebuild (a rebuild read error being a disk read error caused by rebuild I/O during the rebuild process), the rebuild stops, and the array remains in the degraded state, unable to return to the redundant state. If another disk in the array then fails, the whole disk array fails, that is, its I/O channel is closed. This not only stops the array from providing service but also loses the data previously stored in it.
In addition, when a service read is performed on a degraded disk array and a service read error occurs (a service read error being a disk read error caused by service I/O during service reads and writes), the disk array fails, that is, its I/O channel is closed. The array stops providing service and the previously stored data is lost.
Meanwhile, the spread of networks and high-speed fiber in major cities in China now makes it possible to back up data to a cloud server at regular intervals. Enterprises with very high data security requirements can preserve the data they need through cloud server backup.
Summary of the invention:
The object of the present invention is to provide a disk array fault tolerance apparatus with cloud server backup that avoids the problems caused when a degraded disk array encounters a rebuild read error or a service read error, and that can restore data from a cloud server backup.
The technical solution adopted by the present invention is:
A disk array fault tolerance apparatus with cloud server backup comprises a replacement unit, a rebuilding unit, a recording unit, a processing unit, a repairing unit, and a recovering unit. The recording unit is connected to the rebuilding unit, the processing unit, the repairing unit, and the recovering unit respectively; the recording unit is connected to a cloud server backup storage through a connection device; and the rebuilding unit is in turn connected to the replacement unit.
The replacement unit, when a disk in the disk array fails, adds a hot spare disk to the disk array to replace the failed disk.
The rebuilding unit rebuilds, stripe by stripe, the disk array to which the hot spare disk has been added.
The recording unit, when the stripe currently being rebuilt by the rebuilding unit encounters a rebuild read error, records the identifier of the current stripe in non-volatile memory and triggers the rebuilding unit to skip the current stripe and continue rebuilding from the next stripe, until the rebuild of the disk array is complete.
The processing unit, when data needs to be read from the hot spare disk occupied by a stripe whose identifier is recorded in the non-volatile memory, does not issue a read command, and instead uses the data on the other disks occupied by that stripe to compute the data that would have been read from the hot spare disk. When data needs to be read from one of the other disks occupied by such a stripe, it issues a read command to that disk and reads the data according to the read command.
The repairing unit, for each stripe identifier recorded in the non-volatile memory, repairs the rebuild read error of the corresponding stripe by writing, and deletes the stripe identifier from the non-volatile memory once the repair is complete.
The recovering unit, when an error occurs in the repaired disk data or a stripe identifier is deleted in error, restores the data to its state before the repair.
The repairing unit repairs the service read error of the stripe corresponding to a stripe identifier by writing data to the whole stripe. Alternatively, it determines the importance of the data stored in the stripe corresponding to the stripe identifier and, if the importance is below a set threshold, repairs the service read error of that stripe by writing preset data to the disks occupied by the stripe.
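The skip-and-record behaviour of the rebuilding and recording units, and the read path of the processing unit, can be sketched as follows. This is an illustrative in-memory model under stated assumptions, not the patented implementation: disks are Python lists of blocks, `None` stands for an unreadable sector, redundancy is single XOR parity, a plain `set` stands in for the non-volatile memory, and all function names are hypothetical.

```python
from functools import reduce


class ReadError(Exception):
    """Raised when a block cannot be read from a member disk."""


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_block(disks, disk, stripe):
    """Read one block; None models an unreadable sector (assumption)."""
    block = disks[disk][stripe]
    if block is None:
        raise ReadError((disk, stripe))
    return block


def rebuild(disks, spare, bad_stripes):
    """Rebuild the hot spare stripe by stripe, skipping stripes that fail to read."""
    others = [d for d in range(len(disks)) if d != spare]
    for stripe in range(len(disks[spare])):
        try:
            survivors = [read_block(disks, d, stripe) for d in others]
        except ReadError:
            # Recording unit: remember the stripe identifier and move on
            # instead of aborting the whole rebuild.
            bad_stripes.add(stripe)
            continue
        disks[spare][stripe] = xor_blocks(survivors)


def degraded_read(disks, spare, bad_stripes, disk, stripe):
    """Processing unit: serve a read that may touch a recorded stripe."""
    if stripe in bad_stripes and disk == spare:
        # No read command goes to the hot spare; the block is computed
        # from the other disks occupied by the stripe.
        others = [d for d in range(len(disks)) if d != spare]
        return xor_blocks([read_block(disks, d, stripe) for d in others])
    return read_block(disks, disk, stripe)
```

In this model the computed read still raises `ReadError` while the faulty sector stays unreadable, which is why the recorded stripes are later repaired by writing.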
Compared with the prior art, the beneficial effects of the present invention are:
The present invention repairs the rebuild read error of a stripe by writing and restores the redundancy of the disk array as soon as possible, avoiding the situation where multiple disks fail during the rebuild and bring down the whole disk array. When a service read error occurs in the current stripe, the identifier of that stripe can be recorded in non-volatile memory and the disk array can be controlled to continue serving reads and writes, which both maintains service continuity and avoids the risk of data loss. The apparatus also provides a cloud server backup function: when the local disks suffer a serious, irreparable failure, the backed-up data can be recovered from the storage of the cloud server.
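The cloud fallback described above can be pictured with a small sketch. The patent does not describe the backup protocol, so the `BackupStore` class below is a purely hypothetical in-memory stand-in for the cloud server backup storage, and `recover_from_backup` is an assumed helper name.

```python
import copy


class BackupStore:
    """Hypothetical stand-in for the cloud server backup storage."""

    def __init__(self):
        self._snapshots = []

    def backup(self, stripes):
        """Store a point-in-time copy of the stripe contents."""
        self._snapshots.append(copy.deepcopy(stripes))

    def latest(self):
        """Return a copy of the most recent backup."""
        if not self._snapshots:
            raise LookupError("no backup available")
        return copy.deepcopy(self._snapshots[-1])


def recover_from_backup(local_stripes, irreparable, store):
    """Replace irreparable local stripes with their most recently backed-up copies."""
    snapshot = store.latest()
    for stripe in irreparable:
        local_stripes[stripe] = snapshot[stripe]
    return local_stripes
```

Data written after the most recent backup is not covered by this kind of recovery, so the backup interval bounds how much can be lost.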
Brief description of the drawings:
Fig. 1 is a schematic structural diagram of the present invention.
Detailed description:
The present invention is described in further detail below through an embodiment, with reference to the accompanying drawing:
A disk array fault tolerance apparatus with cloud server backup comprises a replacement unit, a rebuilding unit, a recording unit, a processing unit, a repairing unit, and a recovering unit. The recording unit is connected to the rebuilding unit, the processing unit, the repairing unit, and the recovering unit respectively; the recording unit is connected to a cloud server backup storage through a connection device; and the rebuilding unit is in turn connected to the replacement unit.
The replacement unit, when a disk in the disk array fails, adds a hot spare disk to the disk array to replace the failed disk.
The rebuilding unit rebuilds, stripe by stripe, the disk array to which the hot spare disk has been added.
The recording unit, when the stripe currently being rebuilt by the rebuilding unit encounters a rebuild read error, records the identifier of the current stripe in non-volatile memory and triggers the rebuilding unit to skip the current stripe and continue rebuilding from the next stripe, until the rebuild of the disk array is complete.
The processing unit, when data needs to be read from the hot spare disk occupied by a stripe whose identifier is recorded in the non-volatile memory, does not issue a read command, and instead uses the data on the other disks occupied by that stripe to compute the data that would have been read from the hot spare disk. When data needs to be read from one of the other disks occupied by such a stripe, it issues a read command to that disk and reads the data according to the read command.
The repairing unit, for each stripe identifier recorded in the non-volatile memory, repairs the rebuild read error of the corresponding stripe by writing, and deletes the stripe identifier from the non-volatile memory once the repair is complete.
The recovering unit, when an error occurs in the repaired disk data or a stripe identifier is deleted in error, restores the data to its state before the repair.
The repairing unit repairs the service read error of the stripe corresponding to a stripe identifier by writing data to the whole stripe. Alternatively, it determines the importance of the data stored in the stripe corresponding to the stripe identifier and, if the importance is below a set threshold, repairs the service read error of that stripe by writing preset data to the disks occupied by the stripe.
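The repair-by-write step and the rollback performed by the recovering unit can be sketched as below. This is a minimal model under assumptions the patent does not state: single XOR parity held on one disk, a `set` in place of the non-volatile memory, a pre-repair copy of the stripe kept in memory, and the hypothetical names `repair_by_write` and `preset_blocks`.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def preset_blocks(n_data, block_size, fill=0):
    """Preset data for stripes whose importance is below the set threshold."""
    return [bytes([fill]) * block_size for _ in range(n_data)]


def repair_by_write(disks, parity_disk, stripe, bad_stripes, data_blocks):
    """Overwrite a whole recorded stripe, then delete its identifier.

    If the write or the deletion goes wrong, the stripe is restored to its
    pre-repair contents (the role of the recovering unit) and the error is
    re-raised.
    """
    snapshot = [disk[stripe] for disk in disks]  # state before the repair
    data_disks = [d for d in range(len(disks)) if d != parity_disk]
    try:
        if len(data_blocks) != len(data_disks):
            raise ValueError("need exactly one block per data disk")
        for d, block in zip(data_disks, data_blocks):
            disks[d][stripe] = block
        disks[parity_disk][stripe] = xor_blocks(data_blocks)
        bad_stripes.remove(stripe)  # raises KeyError if the identifier is missing
    except (ValueError, KeyError):
        for disk, old in zip(disks, snapshot):
            disk[stripe] = old
        raise
```

A caller would pass fresh service data for important stripes, or `preset_blocks(...)` when the importance of the stored data falls below the threshold.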
The embodiment described above merely describes a preferred embodiment of the present invention and does not limit its spirit and scope. Various modifications and improvements made to the technical solution of the present invention by engineers and technicians in the field, without departing from the design concept of the present invention, shall fall within the scope of protection of the present invention. The technical content for which protection is claimed is fully set out in the claims.

Claims (2)

1. A disk array fault tolerance apparatus with cloud server backup, characterized in that it comprises a replacement unit, a rebuilding unit, a recording unit, a processing unit, a repairing unit, and a recovering unit; the recording unit is connected to the rebuilding unit, the processing unit, the repairing unit, and the recovering unit respectively, the recording unit is connected to a cloud server backup storage through a connection device, and the rebuilding unit is in turn connected to the replacement unit; the replacement unit, when a disk in the disk array fails, adds a hot spare disk to the disk array to replace the failed disk; the rebuilding unit rebuilds, stripe by stripe, the disk array to which the hot spare disk has been added; the recording unit, when the stripe currently being rebuilt by the rebuilding unit encounters a rebuild read error, records the identifier of the current stripe in non-volatile memory and triggers the rebuilding unit to skip the current stripe and continue rebuilding from the next stripe, until the rebuild of the disk array is complete; the processing unit, when data needs to be read from the hot spare disk occupied by a stripe whose identifier is recorded in the non-volatile memory, does not issue a read command and uses the data on the other disks occupied by that stripe to compute the data that would have been read from the hot spare disk, and, when data needs to be read from one of the other disks occupied by such a stripe, issues a read command to that disk and reads the data according to the read command; the repairing unit, for each stripe identifier recorded in the non-volatile memory, repairs the rebuild read error of the corresponding stripe by writing, and deletes the stripe identifier from the non-volatile memory once the repair is complete; the recovering unit, when an error occurs in the repaired disk data or a stripe identifier is deleted in error, restores the data to its state before the repair.
2. The disk array fault tolerance apparatus with cloud server backup according to claim 1, characterized in that the repairing unit repairs the service read error of the stripe corresponding to a stripe identifier by writing data to the whole stripe; or determines the importance of the data stored in the stripe corresponding to the stripe identifier and, if the importance is below a set threshold, repairs the service read error of that stripe by writing preset data to the disks occupied by the stripe.
CN201610992459.3A 2016-11-11 2016-11-11 Disk array fault tolerance apparatus with cloud server backup function Pending CN106528342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610992459.3A CN106528342A (en) 2016-11-11 2016-11-11 Disk array fault tolerance apparatus with cloud server backup function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610992459.3A CN106528342A (en) 2016-11-11 2016-11-11 Disk array fault tolerance apparatus with cloud server backup function

Publications (1)

Publication Number Publication Date
CN106528342A true CN106528342A (en) 2017-03-22

Family

ID=58351067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610992459.3A Pending CN106528342A (en) 2016-11-11 2016-11-11 Disk array fault tolerance apparatus with cloud server backup function

Country Status (1)

Country Link
CN (1) CN106528342A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184129A (en) * 2011-04-27 2011-09-14 杭州华三通信技术有限公司 Fault tolerance method and device for disk arrays
CN105183589A (en) * 2015-08-31 2015-12-23 安徽欧迈特数字技术有限责任公司 Disk array fault tolerance apparatus
CN105959356A (en) * 2016-04-26 2016-09-21 华中科技大学 Method of realizing multi-cloud storage fault-tolerance conversion mechanism

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334280A (en) * 2017-12-28 2018-07-27 创新科存储技术(深圳)有限公司 RAID5 disk group fast rebuild method and device
CN108334280B (en) * 2017-12-28 2021-01-08 深圳创新科技术有限公司 RAID5 disk group fast rebuild method and device
CN109445708A (en) * 2018-11-02 2019-03-08 南方电网调峰调频发电有限公司 Transparent failover method based on a database private cloud platform
CN116149576A (en) * 2023-04-20 2023-05-23 北京大学 Serverless-computing-oriented RAID rebuild method and system
CN116149576B (en) * 2023-04-20 2023-07-25 北京大学 Serverless-computing-oriented RAID rebuild method and system

Similar Documents

Publication Publication Date Title
CN102012847B (en) Improved disk array reconstruction method
CN102184129B (en) Fault tolerance method and device for disk arrays
CN103970481B (en) Method and apparatus for rebuilding a memory array
US9009526B2 (en) Rebuilding drive data
CN102023815B (en) Implementing RAID in solid-state memory
CN104484251B (en) Method and device for handling hard disk failure
CN100504795C (en) Computer RAID array early-warning system and method
CN101887351B (en) Fault-tolerance method and system for redundant array of independent disk
CN103309775A (en) Fault-tolerance method for high-reliability disk array
CN102508620B (en) Method for processing RAID5 (Redundant Array of Independent Disks) bad sector
CN102508733B (en) Data processing method based on a disk array, and disk array manager
CN104035830A (en) Method and device for recovering data
CN105302667A (en) Cluster architecture based high-reliability data backup and recovery method
CN103699457A (en) Method and device for restoring disk arrays based on stripping
CN104407821B (en) Method and device for implementing RAID rebuild
US20120084610A1 (en) Method and device for reading and writing a memory card
CN103019894B (en) Reconstruction method for redundant array of independent disks
CN110399247A (en) Data reconstruction method, device, equipment and computer readable storage medium
CN106528342A (en) Disk array fault tolerance apparatus with cloud server backup function
US8886993B2 (en) Storage device replacement method, and storage sub-system adopting storage device replacement method
CN100462931C (en) Method, system and computer program product for recovery of formatting in repair of bad sectors in disk drives
CN108874312B (en) Data storage method and storage device
US8423776B2 (en) Storage systems and data storage method
CN102999399A (en) Method and device of automatically restoring storage of JBOD (just bundle of disks) array
CN105183590A (en) Disk array fault tolerance processing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170322
