CN102368222A - Online repairing method of multiple-copy storage system - Google Patents

Online repairing method of multiple-copy storage system

Info

Publication number
CN102368222A
Authority
CN
China
Prior art keywords
copy
mds
primary
primary copy
record
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011103283174A
Other languages
Chinese (zh)
Inventor
付根希
姜国梁
彭成
杨浩
王勇
苗艳超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN2011103283174A
Publication of CN102368222A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an online repair method for a multiple-copy storage system. Objects are stored as multiple copies to improve system reliability, and the different copies of the same object are stored on different OSDs (object storage devices). One copy of each object is the primary copy; modification operations are sent to the primary copy, which forwards them to the slave copies. When the primary copy fails, the MDS (metadata server) selects a new primary copy through a primary-change operation and records the fault information. When a slave copy fails, the primary copy informs the MDS, which records the information of the faulty object. After the fault is cleared, data repair is triggered and carried out under the overall control of the MDS. When an OSD node goes down, the client applies to the MDS for a primary change and continues operating on the object once the change is complete. The invention repairs copy consistency online and improves the reliability and availability of the system.

Description

Method for online repair of a multiple-copy storage system
Technical field
The present invention relates to the field of computer storage, and specifically to an online repair method for an object-based storage system.
Background
In an object storage system, storing multiple copies improves system reliability. In a distributed storage system built from commodity storage devices, disk failures, network failures, and node crashes occur frequently, so the system must be able to handle faults online in order to provide stable and reliable service.
At present, most systems repair faults offline, which greatly reduces system availability.
As systems continue to grow in scale, network complexity increases sharply and makes network fault handling a major challenge. At the same time, the use of large numbers of inexpensive disks and low-cost servers greatly increases the probability of disk failures and node crashes.
Summary of the invention
The object of the present invention is to provide a highly reliable, highly available online repair method for objects in an object storage system.
A method for online repair of a multiple-copy storage system, in which:
Objects are stored as multiple copies, and the different copies of the same object are stored on different OSDs;
One of the copies of an object is selected as the primary copy; each modification operation is performed on the primary copy, and after completing the modification the primary copy synchronizes the operation to the slave copies;
When the primary copy fails, a primary-change request is sent to the MDS; the MDS selects one of the slave copies as the new primary copy and records the fault information in a log;
When a slave copy fails, the primary copy reports the fault information to the MDS, and the MDS records the information of the faulty object;
After the fault is cleared, online fault repair is triggered. Under MDS control, the primary copy leads the repair of each object. Objects are independent of one another: each time an object is repaired, its latest information is updated on the MDS. If the repair process stops because of a new fault, objects that have already been repaired are not repaired again when repair restarts;
When the OSD primary node crashes, the client applies to the MDS for a primary change and continues operating on the object after the change completes; when an OSD slave node crashes, the primary node records a consistency log and reports the copy state to the MDS.
Preferably, the user reads and writes data in the system through the client, which provides a general-purpose file system interface. The client obtains the storage information and copy information of an object from the MDS and sends the write data to the primary copy. The primary copy performs the in-memory operation and records a memory log, then forwards the write operation to the slave copies together with the log number. After the in-memory operation completes, the primary copy acknowledges the client. The primary copy then carries out the primary and slave disk operations; if an error occurs, it records a consistency log and clears the memory log.
Preferably, for a modification operation, the client is acknowledged as soon as the data has been forwarded to the memory of the copies; copy consistency cannot be guaranteed only when all copies fail simultaneously.
Preferably, the OSD records log information about inconsistencies between copies. When the repair of an object completes, a marker log entry is appended to record that the earlier log entries for that object have been applied. Once all applied log entries are confirmed to be invalid, they can be deleted to reclaim disk space.
Preferably, copy repair is started by a trigger, either manual or automatic; automatic triggering requires trigger conditions to be set.
Preferably, the trigger conditions include disk failure, network reconnection, and an oversized consistency log.
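The trigger rule described above can be illustrated with a minimal sketch. The names `ClusterEvents`, `should_trigger_repair`, and `MAX_LOG_BYTES` are hypothetical, and the 64 MiB threshold is an assumption; the text only says that the consistency log is too large.

```python
from dataclasses import dataclass

# Hypothetical threshold: the source does not give a size limit.
MAX_LOG_BYTES = 64 * 1024 * 1024


@dataclass
class ClusterEvents:
    """Observed conditions that may start an automatic repair."""
    disk_failure: bool = False
    network_reconnected: bool = False
    consistency_log_bytes: int = 0


def should_trigger_repair(events, manual=False, max_log_bytes=MAX_LOG_BYTES):
    """Return True if repair should start, manually or automatically."""
    if manual:
        return True  # a manual trigger always starts repair
    return bool(
        events.disk_failure
        or events.network_reconnected
        or events.consistency_log_bytes > max_log_bytes
    )
```

A deployment would presumably tune the threshold and add its own conditions; the three shown are the ones the text lists.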
The present invention repairs copy consistency online and thereby improves the reliability and availability of the system.
Description of drawings
Fig. 1 is a flow chart of the interaction model of the system.
Fig. 2 is a diagram of slave-copy network fault handling.
Fig. 3 is a diagram of primary-copy fault handling.
Fig. 4 is a diagram of primary OSD fault handling.
Fig. 5 is a diagram of data repair control.
Embodiment
1. Objects are stored as multiple copies to improve system reliability, and the different copies of the same object are stored on different OSDs;
2. One of the copies of an object is the primary copy. Modification operations are sent to this copy, and the primary copy forwards them to the slave copies. During a modification, the client is acknowledged as soon as the data has been forwarded to the memory of the copies, so the consistency of the copies can be guaranteed as long as at least one copy is available; when all copies fail simultaneously, copy consistency cannot be guaranteed;
3. When the primary copy fails, the MDS selects a new primary copy through a primary-change operation and records the fault information;
4. When a slave copy fails, the primary copy informs the MDS, and the MDS records the information of the faulty object;
5. After the fault is cleared, data repair is triggered. Under the overall control of the MDS, the primary copy leads the repair of each object. Objects are independent of one another: each time an object is repaired, its latest information is updated on the MDS. If the repair process stops because of a new fault, objects that have already been repaired are not repaired again when repair restarts. Data repair is started by a trigger, which may be manual or automatic. The conditions for automatic triggering are set according to actual requirements, for example disk failure, network reconnection, or an oversized consistency log;
6. When an OSD node crashes, the client applies to the MDS for a primary change and continues operating on the object after the change completes. The OSD records log information about inconsistencies between copies. When the repair of an object completes, a special log entry is appended to indicate that the earlier log entries for that object have been applied. At an appropriate time, all applied log entries are deleted to reclaim the disk space occupied by invalid log entries.
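The log handling in step 6 can be sketched as an append-only list with marker entries. This is only an illustration under assumed data structures: `LogEntry`, `ConsistencyLog`, and the `"inconsistent"`/`"repaired"` kinds are hypothetical names, and the source does not specify the on-disk log format.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    object_id: str
    kind: str          # "inconsistent", or "repaired" for the marker entry
    detail: str = ""


@dataclass
class ConsistencyLog:
    entries: list = field(default_factory=list)

    def record_inconsistency(self, object_id, detail):
        self.entries.append(LogEntry(object_id, "inconsistent", detail))

    def mark_repaired(self, object_id):
        # Append-only marker: earlier entries for this object are now applied.
        self.entries.append(LogEntry(object_id, "repaired"))

    def pending_objects(self):
        """Objects with inconsistencies logged after their last marker."""
        pending = set()
        for entry in self.entries:
            if entry.kind == "inconsistent":
                pending.add(entry.object_id)
            else:
                pending.discard(entry.object_id)
        return pending

    def reclaim(self):
        """Delete entries covered by a later marker; return how many were removed."""
        last_marker = {}
        for index, entry in enumerate(self.entries):
            if entry.kind == "repaired":
                last_marker[entry.object_id] = index
        kept = [
            entry for index, entry in enumerate(self.entries)
            if index > last_marker.get(entry.object_id, -1)
        ]
        removed = len(self.entries) - len(kept)
        self.entries = kept
        return removed
```

Appending a marker instead of rewriting the log keeps the repair path cheap; space is only reclaimed later, in one pass.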
External data interaction: the user reads and writes data in the system through the client. The client provides a general-purpose file system interface, so to the user it is no different from a local file system.
Internal data interaction: the client obtains the storage information and copy information of an object from the MDS and sends the write data to the primary copy. The primary copy performs the in-memory operation and records a memory log, then forwards the write operation to the slave copies together with the log number. After the in-memory operation completes, the primary copy acknowledges the client. The primary copy then carries out the primary and slave disk operations; if an error occurs, it records a consistency log and clears the memory log.
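The internal write path can be sketched as a small in-process simulation. Everything here is an assumption made for illustration: the class names `Replica` and `Primary`, the use of dictionaries for memory and disk, and the choice to clear the memory log after the disk phase whether or not an error occurred (the text leaves that ordering open). No real network or disk I/O is involved.

```python
class Replica:
    """One copy of the objects held on an OSD (simulated)."""

    def __init__(self, name, disk_ok=True):
        self.name = name
        self.disk_ok = disk_ok
        self.memory = {}   # object_id -> (log_no, data)
        self.disk = {}     # object_id -> data

    def apply_in_memory(self, object_id, data, log_no):
        self.memory[object_id] = (log_no, data)

    def flush_to_disk(self, object_id):
        if not self.disk_ok:
            raise OSError(f"disk write failed on {self.name}")
        self.disk[object_id] = self.memory[object_id][1]


class Primary(Replica):
    def __init__(self, name, slaves):
        super().__init__(name)
        self.slaves = slaves
        self.next_log_no = 1
        self.memory_log = {}        # log_no -> object_id
        self.consistency_log = []   # (log_no, object_id, replica name)

    def write(self, object_id, data):
        """In-memory phase; the returned log number stands for the client ack."""
        log_no = self.next_log_no
        self.next_log_no += 1
        self.apply_in_memory(object_id, data, log_no)
        self.memory_log[log_no] = object_id
        for slave in self.slaves:
            # The log number travels with the forwarded write.
            slave.apply_in_memory(object_id, data, log_no)
        return log_no

    def flush(self, log_no):
        """Disk phase on primary and slaves; errors go to the consistency log."""
        object_id = self.memory_log[log_no]
        for replica in [self] + self.slaves:
            try:
                replica.flush_to_disk(object_id)
            except OSError:
                self.consistency_log.append((log_no, object_id, replica.name))
        del self.memory_log[log_no]  # assumption: cleared in both cases
```

Acknowledging after the in-memory phase is what lets the client proceed before any disk write; the consistency log is what later repair would consume.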
The invention is described in more detail below by way of example with reference to the accompanying drawings:
Fig. 1 is a flow chart of the interaction model of the system.
The client obtains the storage location information of the object from the MDS and initiates a write operation to the primary copy. After receiving the data, the primary copy forwards it to the slave copies. After receiving the data, each slave copy replies to the primary copy. Once the primary copy has received replies from all slave copies, it replies to the client that the write is complete.
Fig. 2 is a diagram of network fault handling in the system.
1. The primary copy receives a page from the client, obtained through the network layer.
2. It performs the local in-memory operation.
3. It forwards the page to the slave copy.
4. The reply from the slave copy fails or times out.
5. The primary copy saves the state of the slave copy and reports the copy state to the MDS.
6. Subsequent operations no longer send data operations to the failed copy until repair completes.
Fig. 3 is a diagram of primary-copy fault handling in the system.
1. A fault occurs while the primary copy is processing.
2. The primary copy initiates a primary-change request to the MDS.
3. The MDS replaces the primary copy.
4. The former primary copy changes from primary to slave.
5. The client sends write operations to the new primary copy; until repair is done, the new primary no longer sends write operations to the failed copy (the former primary).
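The MDS side of the primary change above can be sketched as follows. The class `MDS` and its methods are hypothetical, and picking the first healthy slave as the new primary is an assumed policy; the text only says the MDS selects a new primary copy and records the fault.

```python
class MDS:
    """Simulated metadata server tracking copy placement and faults."""

    def __init__(self):
        self.layout = {}     # object_id -> list of OSD names, index 0 is primary
        self.failed = {}     # object_id -> set of OSD names awaiting repair
        self.fault_log = []

    def place(self, object_id, osds):
        self.layout[object_id] = list(osds)
        self.failed[object_id] = set()

    def report_slave_failure(self, object_id, osd):
        self.failed[object_id].add(osd)
        self.fault_log.append(("slave_failed", object_id, osd))

    def change_primary(self, object_id):
        copies = self.layout[object_id]
        old = copies[0]
        healthy = [c for c in copies[1:] if c not in self.failed[object_id]]
        if not healthy:
            raise RuntimeError("no healthy slave copy available")
        new = healthy[0]  # assumed policy: first healthy slave
        rest = [c for c in copies if c not in (old, new)]
        # The former primary is demoted to slave and marked as needing repair.
        self.layout[object_id] = [new] + rest + [old]
        self.failed[object_id].add(old)
        self.fault_log.append(("primary_changed", object_id, old, new))
        return new

    def write_targets(self, object_id):
        """Slaves the current primary should still forward writes to."""
        return [c for c in self.layout[object_id][1:]
                if c not in self.failed[object_id]]
```

For an object placed on `["osd1", "osd2", "osd3"]`, a primary change promotes `osd2`, and `write_targets` then returns only `osd3` until `osd1` has been repaired.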
Fig. 4 shows the handling of a primary node crash.
1. The client discovers during operation that the primary node has crashed.
2. The client initiates a primary change to the MDS.
3. The MDS handles the primary change.
4. The client operates on the new primary copy.
5. When the former primary OSD restarts and rejoins the system, it must change from primary to slave; this is handled by the OSD rejoin procedure.
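The client side of this crash handling can be sketched as a retry around a primary change. The names `NodeDown`, `OSD`, `SimpleMDS`, and `client_write` are hypothetical, the stub MDS simply rotates the copy list, and the retry limit is an assumption not found in the source.

```python
class NodeDown(Exception):
    """Raised when the client cannot reach an OSD node."""


class OSD:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def write(self, object_id, data):
        if not self.up:
            raise NodeDown(self.name)
        self.data[object_id] = data


class SimpleMDS:
    """Stub MDS: index 0 of the list is the current primary."""

    def __init__(self, osds):
        self.osds = list(osds)

    def primary(self):
        return self.osds[0]

    def change_primary(self):
        old = self.osds.pop(0)
        self.osds.append(old)  # the former primary becomes a slave
        return self.osds[0]


def client_write(mds, object_id, data, max_changes=1):
    """Write to the primary; on a crash, request a primary change and retry."""
    target = mds.primary()
    for attempt in range(max_changes + 1):
        try:
            target.write(object_id, data)
            return target.name
        except NodeDown:
            if attempt == max_changes:
                raise
            target = mds.change_primary()
```

From the user's point of view the write simply succeeds on the new primary; the demotion of the crashed node is bookkeeping on the MDS.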
Fig. 5 is the overall flow of online repair.
Under the overall control of the MDS, repairer threads on the OSDs are responsible for repairing individual objects.
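The resumable, per-object nature of this repair can be sketched in a few lines. This is a single-threaded simplification under stated assumptions: `run_repair` and `repair_one` are hypothetical names, a plain set stands in for the per-object state kept on the MDS, and a new fault is modelled as an `OSError`.

```python
def run_repair(pending, repaired_on_mds, repair_one):
    """Repair each pending object once; stop early if a new fault occurs.

    pending         -- iterable of object ids recorded as faulty
    repaired_on_mds -- set of object ids the MDS already marks as repaired
    repair_one      -- callable that repairs one object, raising OSError on a new fault
    Returns True if every pending object is repaired, False if interrupted.
    """
    for object_id in sorted(pending):
        if object_id in repaired_on_mds:
            continue  # repaired in an earlier run, so not repeated
        try:
            repair_one(object_id)
        except OSError:
            return False  # interrupted by a new fault; progress so far is kept
        repaired_on_mds.add(object_id)  # MDS updates this object's latest state
    return True
```

Because progress is recorded per object, calling `run_repair` again after an interruption resumes with the first object that was not yet repaired.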

Claims (6)

1. the method for the online reparation of copy storage system more than a kind is characterized in that:
Adopt many copies mode to preserve object, the different copies of same target are stored on the different OSD;
Selected primary copy in the copy of same target, each retouching operation carries out on primary copy, and primary copy after revising and accomplishing is synchronized to retouching operation from copy;
When primary copy breaks down, initiate to change the primary copy request to MDS, MDS select one from copy as new primary copy, and record trouble information log;
When copy broke down, primary copy was informed MDS with failure message, the information of this fault object of MDS record;
After the trouble shooting, trigger the On-line Fault reparation, under MDS control; By the leading object of repairing of primary copy, be separate between the object, object of every reparation; The up-to-date information of this object is revised on MDS; If because of new failure stopping, when restarting to repair, no longer repeat to repair the object of having repaired in the repair process;
When the OSD host node was delayed machine, client was changed primary copy to the MDS application, after replacing is accomplished, continued operand; When OSD delays machine from node, the daily record of host node record consistance, and report MDS copy state.
2. the method for claim 1, it is characterized in that: the user carries out reading and writing data through said client and system, and said client provides the universal document system interface, and said client is obtained the canned data and the copy information of object to MDS; The data of writing are issued primary copy, and primary copy carries out internal memory operation, and the daily record of record internal memory, and primary copy is transmitted to write operation from copy, also the daily record numbering is brought from copy simultaneously; After accomplishing internal memory operation, primary copy acknowledged client end; Primary copy carries out principal and subordinate's disk operating, if mistake then writes down the consistance daily record, removes the daily record of internal memory.
3. the method for claim 1 is characterized in that: the retouching operation of said copy when data forwarding to acknowledged client end just behind the internal memory of copy, have only when all copy simultaneous faultss, just can not guarantee the consistance of copy.
4. the method for claim 1; It is characterized in that: said OSD has write down inconsistent log information between the copy; When the object reparation is accomplished; Append the object daily record before this of a sign journal entries record and be employed, after definite all daily records of using are invalid, can delete the recovery disk space daily record.
5. the method for claim 1 is characterized in that: said copy reparation starts through triggering, and comprises manual triggers and triggers automatically, triggers automatically and must set trigger condition.
6. method as claimed in claim 5 is characterized in that: said trigger condition comprises disk failure, and network reconnects with the consistance daily record excessive.
CN2011103283174A 2011-10-25 2011-10-25 Online repairing method of multiple-copy storage system Pending CN102368222A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103283174A CN102368222A (en) 2011-10-25 2011-10-25 Online repairing method of multiple-copy storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103283174A CN102368222A (en) 2011-10-25 2011-10-25 Online repairing method of multiple-copy storage system

Publications (1)

Publication Number Publication Date
CN102368222A true CN102368222A (en) 2012-03-07

Family

ID=45760787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103283174A Pending CN102368222A (en) 2011-10-25 2011-10-25 Online repairing method of multiple-copy storage system

Country Status (1)

Country Link
CN (1) CN102368222A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050283655A1 (en) * 2004-06-21 2005-12-22 Dot Hill Systems Corporation Apparatus and method for performing a preemptive reconstruct of a fault-tolerand raid array
CN102023816A (en) * 2010-11-04 2011-04-20 天津曙光计算机产业有限公司 Object storage policy and access method of object storage system
CN102033786A (en) * 2010-11-04 2011-04-27 天津曙光计算机产业有限公司 Method for repairing consistency of copies in object storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120307