CN102368222A - Online repairing method of multiple-copy storage system - Google Patents
- Publication number
- CN102368222A
- Authority
- CN
- China
- Prior art keywords
- copy
- mds
- primary
- primary copy
- record
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an online repair method for a multi-copy storage system. Objects are stored as multiple copies to improve system reliability. Different copies of the same object are stored on different OSDs (object storage devices), and one copy of each object is the primary copy. Modification operations are sent to the primary copy, which forwards them to the secondary copies. When the primary copy fails, the MDS (metadata server) selects a new primary copy through a primary-change operation and records the fault information. When a secondary copy fails, the primary copy informs the MDS, and the MDS records information about the faulty object. After the fault is cleared, data repair is triggered and completed under the overall control of the MDS. When an OSD node goes down, the client asks the MDS to change the primary; once the change completes, the client continues operating on the object. The invention repairs copy consistency online and improves the reliability and availability of the system.
Description
Technical field
The present invention relates to the field of computer storage, and specifically to an online repair method for object storage systems.
Background
In an object storage system, storing multiple copies of each object improves system reliability. In a distributed storage system built from commodity storage devices, disk failures, network failures, and node crashes occur frequently, so the system must be able to handle faults online in order to provide stable and reliable service.
At present, most systems repair faults offline, which greatly reduces system availability.
As systems grow in scale, network complexity increases greatly, which makes network fault handling a major challenge. At the same time, the use of large numbers of inexpensive disks and low-cost servers greatly increases the probability of disk failures and node crashes.
Summary of the invention
The object of the present invention is to provide a highly reliable, highly available online repair method for objects in an object storage system.
A method for online repair of a multi-copy storage system:
Objects are stored as multiple copies, and the different copies of the same object are stored on different OSDs;
One of the copies of each object is selected as the primary copy. Every modification operation is carried out on the primary copy, and after the modification completes the primary copy synchronizes it to the secondary copies;
When the primary copy fails, a primary-change request is sent to the MDS. The MDS selects one of the secondary copies as the new primary copy and records the fault information in a log;
When a secondary copy fails, the primary copy reports the fault to the MDS, and the MDS records information about the faulty object;
After the fault is cleared, online fault repair is triggered. Under the control of the MDS, the primary copy leads the repair of each object. Objects are independent of one another; each time an object is repaired, its latest information is updated on the MDS. If repair is interrupted by a new fault, objects that have already been repaired are not repaired again when repair restarts;
When an OSD primary node goes down, the client asks the MDS to change the primary copy and, once the change completes, continues operating on the object. When an OSD secondary node goes down, the primary node records a consistency log and reports the copy state to the MDS.
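The secondary-copy failure case in the last two steps can be illustrated with a minimal Python sketch. All names here (`Mds`, `PrimaryCopy`, `report_copy_fault`, the `reachable` set that stands in for real network replies) are hypothetical and chosen only for illustration; the patent does not specify any interface.

```python
from dataclasses import dataclass, field


@dataclass
class Mds:
    """Hypothetical metadata server: remembers which copies of which objects are faulty."""
    faulty: dict = field(default_factory=dict)  # object_id -> set of OSD names

    def report_copy_fault(self, object_id, osd):
        self.faulty.setdefault(object_id, set()).add(osd)


@dataclass
class PrimaryCopy:
    """Hypothetical primary copy of a single object."""
    object_id: str
    secondaries: list
    mds: Mds
    reachable: set = field(default_factory=set)   # assumption: OSDs that currently answer
    failed: set = field(default_factory=set)      # secondaries known to be faulty
    consistency_log: list = field(default_factory=list)

    def write(self, seq, data):
        for osd in self.secondaries:
            if osd in self.failed:
                # Known-faulty copy: send nothing, only log the missed write.
                self.consistency_log.append((seq, osd))
            elif osd not in self.reachable:
                # First failure: remember it, log it, and report it to the MDS.
                self.failed.add(osd)
                self.consistency_log.append((seq, osd))
                self.mds.report_copy_fault(self.object_id, osd)
            # A reachable secondary would receive `data` here.
        return True
```

In this sketch the consistency log doubles as the list of writes a later repair would have to replay on the failed copy; that reading of the log's purpose is an assumption.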
Preferably, the user reads and writes data through the client, which provides a general-purpose file system interface. The client obtains the object's storage information and copy information from the MDS. Write data is sent to the primary copy, which performs the in-memory operation and records a memory log. The primary copy forwards the write operation to the secondary copies, passing the log number along with it. After the in-memory operations complete, the primary copy acknowledges the client. The primary copy then carries out the disk operations on the primary and secondary copies; if an error occurs, it records a consistency log. The memory log is then cleared.
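A rough Python sketch of this write path follows. The class and method names (`Replica`, `Primary`, `commit_to_disk`) are invented for illustration, the disk is simulated with a dictionary, and the sketch assumes the memory log is cleared once the disk step has been attempted on every copy.

```python
class Replica:
    """Hypothetical copy of one object held on an OSD."""

    def __init__(self, name, disk_ok=True):
        self.name = name
        self.disk_ok = disk_ok   # simulates whether disk writes succeed
        self.memory = {}         # log number -> data held in memory
        self.disk = {}           # log number -> data written to disk

    def apply_in_memory(self, log_no, data):
        self.memory[log_no] = data

    def flush(self, log_no):
        if not self.disk_ok:
            return False
        self.disk[log_no] = self.memory[log_no]
        return True


class Primary(Replica):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries
        self.memory_log = {}        # log number -> data awaiting the disk step
        self.consistency_log = []   # (log number, replica name) for failed disk writes
        self.next_log_no = 1

    def write(self, data):
        log_no = self.next_log_no
        self.next_log_no += 1
        # 1. In-memory operation and memory log on the primary.
        self.apply_in_memory(log_no, data)
        self.memory_log[log_no] = data
        # 2. Forward the write, together with the log number, to the secondaries.
        for secondary in self.secondaries:
            secondary.apply_in_memory(log_no, data)
        # 3. Returning here corresponds to acknowledging the client.
        return log_no

    def commit_to_disk(self, log_no):
        # 4. Disk operation on every copy; record a consistency log entry on error.
        for replica in [self, *self.secondaries]:
            if not replica.flush(log_no):
                self.consistency_log.append((log_no, replica.name))
        # 5. Clear the memory log entry.
        del self.memory_log[log_no]
```

Because the client is acknowledged after step 3, the disk step can run later without adding to write latency, which appears to be the point of separating the two logs.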
Preferably, for a modification operation, the client is acknowledged as soon as the data has been forwarded to the memory of the secondary copies; copy consistency cannot be guaranteed only when all copies fail simultaneously.
Preferably, the OSD records log information about inconsistencies between copies. When repair of an object completes, a marker log entry is appended to record that the object's earlier log entries have been applied. Once the applied log entries are confirmed to be invalid, they can be deleted to reclaim disk space.
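The marker-and-reclaim scheme can be sketched as follows. This is only one possible reading: the entry layout, the `APPLIED` record type, and the `compact` method are assumptions made for the example, and the log is kept in memory rather than on disk.

```python
APPLIED = "applied-marker"  # hypothetical record type for the marker entry


class ConsistencyLog:
    """In-memory stand-in for the on-disk log of inconsistencies between copies."""

    def __init__(self):
        self.entries = []  # list of (object_id, kind, payload)

    def record(self, object_id, payload):
        self.entries.append((object_id, "missed-write", payload))

    def mark_repaired(self, object_id):
        # Appended when repair of the object completes: everything logged
        # earlier for this object has now been applied.
        self.entries.append((object_id, APPLIED, None))

    def compact(self):
        """Delete entries at or before the last marker of their object."""
        last_marker = {}
        for index, (object_id, kind, _) in enumerate(self.entries):
            if kind == APPLIED:
                last_marker[object_id] = index
        kept = [
            entry
            for index, entry in enumerate(self.entries)
            if index > last_marker.get(entry[0], -1)
        ]
        freed = len(self.entries) - len(kept)
        self.entries = kept
        return freed  # number of log entries reclaimed
```

Appending a marker instead of deleting entries immediately keeps repair completion cheap; the actual deletion can be deferred to a quieter moment.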
Preferably, copy repair is started by a trigger, which may be manual or automatic; automatic triggering requires trigger conditions to be set.
Preferably, the trigger conditions include a disk failure, a network reconnection, and an excessively large consistency log.
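A trigger check along these lines could look like the sketch below. The `RepairTrigger` class, its keyword arguments, and the 64 MiB default threshold are all assumptions; the patent names the conditions but gives no thresholds or interface.

```python
from dataclasses import dataclass


@dataclass
class RepairTrigger:
    # Assumed threshold for "consistency log too large"; the source gives no value.
    max_log_bytes: int = 64 * 1024 * 1024

    def reasons(self, *, manual=False, disk_failure=False,
                network_reconnected=False, log_bytes=0):
        """Return the list of conditions that currently call for a repair."""
        found = []
        if manual:
            found.append("manual")
        if disk_failure:
            found.append("disk-failure")
        if network_reconnected:
            found.append("network-reconnect")
        if log_bytes > self.max_log_bytes:
            found.append("log-too-large")
        return found
```

A caller would start repair whenever the returned list is non-empty.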
The present invention repairs copy consistency online, improving the reliability and availability of the system.
Description of drawings
Fig. 1 is a flow chart of the system's interaction model.
Fig. 2 is a diagram of secondary-copy network fault handling.
Fig. 3 is a diagram of primary-copy fault handling.
Fig. 4 is a diagram of primary OSD fault handling.
Fig. 5 is a diagram of data repair control.
Embodiment
1. Objects are stored as multiple copies to improve system reliability; the different copies of the same object are stored on different OSDs.
2. One of the copies of each object is the primary copy. Modification operations are sent to this copy, and the primary copy forwards them to the secondary copies. During a modification, the client is acknowledged as soon as the data has been forwarded to the memory of the secondary copies. Therefore, as long as at least one copy is available, the consistency of the copies can be guaranteed; it cannot be guaranteed only when all copies fail simultaneously.
3. When the primary copy fails, the MDS selects a new primary copy through a primary-change operation and records the fault information.
4. When a secondary copy fails, the primary copy informs the MDS, and the MDS records information about the faulty object.
5. After the fault is cleared, data repair is triggered. Under the overall control of the MDS, the primary copy leads the repair of each object. Objects are independent of one another; each time an object is repaired, its latest information is updated on the MDS. If repair is interrupted by a new fault, objects that have already been repaired are not repaired again when repair restarts. Data repair is started by a trigger, which may be manual or automatic. The automatic trigger conditions are set according to actual requirements, for example a disk failure, a network reconnection, or an excessively large consistency log.
6. When an OSD node goes down, the client asks the MDS to change the primary and, once the change completes, continues operating on the object. The OSD records log information about inconsistencies between copies. When repair of an object completes, a special log entry is appended to indicate that the object's earlier log entries have been applied. At an appropriate time, all applied log entries are deleted, reclaiming the disk space occupied by invalid logs.
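The resumable repair described in step 5 can be sketched in Python as below. The `Mds` class, its `pending` and `repaired` lists, and the use of `OSError` to signal a new fault are assumptions made for the example.

```python
class Mds:
    """Hypothetical metadata server that tracks repair progress per object."""

    def __init__(self, faulty_objects):
        self.pending = list(faulty_objects)  # objects still needing repair
        self.repaired = []                   # objects already repaired

    def mark_repaired(self, object_id):
        self.pending.remove(object_id)
        self.repaired.append(object_id)


def run_repair(mds, repair_object):
    """Repair pending objects one at a time; stop at the first new fault.

    Progress is recorded on the MDS after every object, so a later call
    resumes with the objects that are still pending.
    """
    for object_id in list(mds.pending):
        try:
            repair_object(object_id)   # led by the object's primary copy
        except OSError:
            return False               # interrupted by a new fault
        mds.mark_repaired(object_id)
    return True
```

Because each object is independent and progress is stored per object, restarting after an interruption only costs the one object that was in flight.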
External data interaction: the user reads and writes data through the client. The client provides a general-purpose file system interface, so that to the user it is no different from a local file system.
Internal data interaction: the client obtains the object's storage information and copy information from the MDS. Write data is sent to the primary copy, which performs the in-memory operation and records a memory log. The primary copy forwards the write operation to the secondary copies, passing the log number along with it. After the in-memory operations complete, the primary copy acknowledges the client. The primary copy then carries out the disk operations on the primary and secondary copies; if an error occurs, it records a consistency log. The memory log is then cleared.
The invention is described in more detail below, by way of example, with reference to the accompanying drawings:
Fig. 1 is a flow chart of the system's interaction model.
The client obtains the object's storage location from the MDS and initiates a write operation to the primary copy. After receiving the data, the primary copy forwards it to the secondary copies. Each secondary copy replies to the primary copy after receiving the data. Once the primary copy has received replies from all secondary copies, it replies to the client that the write is complete.
Fig. 2 is the network fault handling diagram of the system.
1. The primary copy receives a page from the client through the network layer.
2. It performs the local in-memory operation.
3. It forwards the page to the secondary copy.
4. The secondary copy's reply fails or times out.
5. The primary copy saves the secondary copy's state and reports the copy state to the MDS.
6. Subsequent operations no longer send data operations to the failed copy until repair completes.
Fig. 3 is the primary copy fault handling diagram of the system.
1. A fault occurs while the primary copy is processing an operation.
2. The primary copy sends a primary-change request to the MDS.
3. The MDS replaces the primary copy.
4. The former primary copy changes from primary to secondary.
5. The client sends write operations to the new primary copy. Until repair completes, the new primary does not send write operations to the failed copy (the former primary).
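The primary-change steps above can be sketched on the MDS side as follows. The layout dictionary, the `change_primary` and `write_targets` methods, and the policy of picking the first healthy secondary are all assumptions; the patent says only that the MDS selects one secondary copy as the new primary.

```python
class Mds:
    """Hypothetical metadata server holding the copy layout of each object."""

    def __init__(self):
        self.layout = {}     # object_id -> {"primary", "secondaries", "failed"}
        self.fault_log = []  # (object_id, failed OSD) records

    def add_object(self, object_id, primary, secondaries):
        self.layout[object_id] = {
            "primary": primary,
            "secondaries": list(secondaries),
            "failed": set(),
        }

    def change_primary(self, object_id):
        entry = self.layout[object_id]
        healthy = [s for s in entry["secondaries"] if s not in entry["failed"]]
        if not healthy:
            raise RuntimeError("no healthy secondary copy available")
        old, new = entry["primary"], healthy[0]  # assumed policy: first healthy
        entry["secondaries"].remove(new)
        entry["secondaries"].append(old)  # former primary becomes a secondary
        entry["failed"].add(old)          # and is excluded until repaired
        entry["primary"] = new
        self.fault_log.append((object_id, old))
        return new

    def write_targets(self, object_id):
        """Secondaries the new primary should still forward writes to."""
        entry = self.layout[object_id]
        return [s for s in entry["secondaries"] if s not in entry["failed"]]
```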
Fig. 4 shows the handling procedure when the primary node goes down.
1. The client discovers during an operation that the primary node is down.
2. The client sends a primary-change request to the MDS.
3. The MDS handles the primary change.
4. The client operates on the new primary copy.
5. When the former primary OSD restarts and rejoins the system, it must change from primary to secondary; this is handled by the OSD rejoin procedure.
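From the client's point of view, steps 1 to 4 amount to a retry after a primary change. The sketch below assumes a `NodeDown` exception stands in for a timeout or connection error, and uses small stand-in `Osd` and `Mds` classes; none of these names come from the patent.

```python
class NodeDown(Exception):
    """Raised when an OSD node does not answer (stand-in for a timeout)."""


class Osd:
    def __init__(self, name, up=True):
        self.name, self.up, self.data = name, up, {}

    def write(self, object_id, data):
        if not self.up:
            raise NodeDown(self.name)
        self.data[object_id] = data
        return self.name


class Mds:
    def __init__(self, primary, secondaries):
        self.primary, self.secondaries = primary, list(secondaries)

    def lookup_primary(self, object_id):
        return self.primary

    def change_primary(self, object_id):
        # Promote the first secondary; the old primary is demoted to secondary.
        old = self.primary
        self.primary = self.secondaries.pop(0)
        self.secondaries.append(old)
        return self.primary


def client_write(mds, osds, object_id, data):
    primary = mds.lookup_primary(object_id)
    try:
        return osds[primary].write(object_id, data)
    except NodeDown:
        # Steps 2 to 4: ask the MDS for a new primary, then continue.
        new_primary = mds.change_primary(object_id)
        return osds[new_primary].write(object_id, data)
```

The sketch retries only once; a second failure propagates to the caller, which is a simplification rather than something the source prescribes.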
Fig. 5 shows the overall flow of online repair.
Under the overall control of the MDS, repairer threads on the OSD are each responsible for repairing a single object.
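Since objects are independent, per-object repair lends itself to a pool of worker threads. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in for the OSD's repairer threads; `repair_all` and the `repair_object` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def repair_all(faulty_objects, repair_object, workers=4):
    """Repair each object on its own worker thread; collect the outcomes.

    `repair_object` is a caller-supplied function that repairs one object
    and raises an exception if the repair fails.
    """
    repaired, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {oid: pool.submit(repair_object, oid) for oid in faulty_objects}
        for oid, future in futures.items():
            # exception() waits for the task and returns None on success.
            if future.exception() is None:
                repaired.append(oid)
            else:
                failed.append(oid)
    return repaired, failed
```

The coordinator (the MDS in the patent) would then record the `repaired` objects and leave the `failed` ones pending for a later attempt.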
Claims (6)
1. A method for online repair of a multi-copy storage system, characterized in that:
objects are stored as multiple copies, and the different copies of the same object are stored on different OSDs;
one of the copies of each object is selected as the primary copy, every modification operation is carried out on the primary copy, and after the modification completes the primary copy synchronizes it to the secondary copies;
when the primary copy fails, a primary-change request is sent to the MDS, and the MDS selects one of the secondary copies as the new primary copy and records the fault information in a log;
when a secondary copy fails, the primary copy reports the fault to the MDS, and the MDS records information about the faulty object;
after the fault is cleared, online fault repair is triggered; under the control of the MDS, the primary copy leads the repair of each object; objects are independent of one another; each time an object is repaired, its latest information is updated on the MDS; if repair is interrupted by a new fault, objects that have already been repaired are not repaired again when repair restarts;
when an OSD primary node goes down, the client asks the MDS to change the primary copy and, once the change completes, continues operating on the object; when an OSD secondary node goes down, the primary node records a consistency log and reports the copy state to the MDS.
2. the method for claim 1, it is characterized in that: the user carries out reading and writing data through said client and system, and said client provides the universal document system interface, and said client is obtained the canned data and the copy information of object to MDS; The data of writing are issued primary copy, and primary copy carries out internal memory operation, and the daily record of record internal memory, and primary copy is transmitted to write operation from copy, also the daily record numbering is brought from copy simultaneously; After accomplishing internal memory operation, primary copy acknowledged client end; Primary copy carries out principal and subordinate's disk operating, if mistake then writes down the consistance daily record, removes the daily record of internal memory.
3. the method for claim 1 is characterized in that: the retouching operation of said copy when data forwarding to acknowledged client end just behind the internal memory of copy, have only when all copy simultaneous faultss, just can not guarantee the consistance of copy.
4. the method for claim 1; It is characterized in that: said OSD has write down inconsistent log information between the copy; When the object reparation is accomplished; Append the object daily record before this of a sign journal entries record and be employed, after definite all daily records of using are invalid, can delete the recovery disk space daily record.
5. the method for claim 1 is characterized in that: said copy reparation starts through triggering, and comprises manual triggers and triggers automatically, triggers automatically and must set trigger condition.
6. method as claimed in claim 5 is characterized in that: said trigger condition comprises disk failure, and network reconnects with the consistance daily record excessive.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011103283174A CN102368222A (en) | 2011-10-25 | 2011-10-25 | Online repairing method of multiple-copy storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011103283174A CN102368222A (en) | 2011-10-25 | 2011-10-25 | Online repairing method of multiple-copy storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102368222A (en) | 2012-03-07 |
Family
ID=45760787
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2011103283174A Pending CN102368222A (en) | 2011-10-25 | 2011-10-25 | Online repairing method of multiple-copy storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102368222A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050283655A1 (en) * | 2004-06-21 | 2005-12-22 | Dot Hill Systems Corporation | Apparatus and method for performing a preemptive reconstruct of a fault-tolerand raid array |
CN102023816A (en) * | 2010-11-04 | 2011-04-20 | 天津曙光计算机产业有限公司 | Object storage policy and access method of object storage system |
CN102033786A (en) * | 2010-11-04 | 2011-04-27 | 天津曙光计算机产业有限公司 | Method for repairing consistency of copies in object storage system |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102880522A (en) * | 2012-09-21 | 2013-01-16 | 中国人民解放军国防科学技术大学 | Hardware fault-oriented method and device for correcting faults in key files of system |
CN102880522B (en) * | 2012-09-21 | 2014-12-31 | 中国人民解放军国防科学技术大学 | Hardware fault-oriented method and device for correcting faults in key files of system |
CN103370692A (en) * | 2012-11-21 | 2013-10-23 | 华为技术有限公司 | Method and apparatus for restoring data |
US9983941B2 (en) | 2012-11-21 | 2018-05-29 | Huawei Technologies Co., Ltd. | Method and apparatus for recovering data |
CN103370692B (en) * | 2012-11-21 | 2016-06-29 | 华为技术有限公司 | A kind of method of repair data and device |
WO2014078997A1 (en) * | 2012-11-21 | 2014-05-30 | 华为技术有限公司 | Method and device for repairing data |
CN103019886A (en) * | 2012-12-11 | 2013-04-03 | 曙光信息产业(北京)有限公司 | Method and device for restoring log system in multivariate data server |
CN103019886B (en) * | 2012-12-11 | 2016-03-30 | 曙光信息产业(北京)有限公司 | The restoration methods of log system in multivariate data server and device |
CN102981934A (en) * | 2012-12-21 | 2013-03-20 | 曙光信息产业(北京)有限公司 | Log transition method and log transition device |
CN104281631A (en) * | 2013-07-12 | 2015-01-14 | 中兴通讯股份有限公司 | Distributed database system and data synchronization method and nodes thereof |
CN103544081A (en) * | 2013-10-23 | 2014-01-29 | 曙光信息产业(北京)有限公司 | Management method and device for double metadata servers |
CN103544081B (en) * | 2013-10-23 | 2015-08-12 | 曙光信息产业(北京)有限公司 | The management method of double base data server and device |
CN103530205A (en) * | 2013-10-23 | 2014-01-22 | 曙光信息产业(北京)有限公司 | Method and device for processing fault duplicate in multiple duplicates |
CN103607448A (en) * | 2013-11-18 | 2014-02-26 | 四川川大智胜软件股份有限公司 | Method for storage of ATC system dynamic data |
CN103607448B (en) * | 2013-11-18 | 2016-08-24 | 四川川大智胜软件股份有限公司 | A kind of method of ATC system dynamic data storage |
CN104239182A (en) * | 2014-09-03 | 2014-12-24 | 北京鲸鲨软件科技有限公司 | Cluster file system split-brain processing method and device |
CN104239182B (en) * | 2014-09-03 | 2017-05-03 | 北京鲸鲨软件科技有限公司 | Cluster file system split-brain processing method and device |
CN107153671A (en) * | 2016-03-02 | 2017-09-12 | 阿里巴巴集团控股有限公司 | A kind of method and apparatus for realizing the read-write of multifile copy in a distributed system |
CN107153671B (en) * | 2016-03-02 | 2020-11-24 | 阿里巴巴集团控股有限公司 | Method and equipment for realizing multi-file copy reading and writing in distributed system |
CN106201788A (en) * | 2016-07-26 | 2016-12-07 | 乐视控股(北京)有限公司 | Copy restorative procedure and system for distributed storage cluster |
CN107291591A (en) * | 2017-06-14 | 2017-10-24 | 郑州云海信息技术有限公司 | One kind storage fault repairing method and device |
CN107864209A (en) * | 2017-11-17 | 2018-03-30 | 北京联想超融合科技有限公司 | The method, apparatus and server of data write-in |
CN107864209B (en) * | 2017-11-17 | 2021-05-18 | 北京联想超融合科技有限公司 | Data writing method and device and server |
CN108235751B (en) * | 2017-12-18 | 2020-04-14 | 华为技术有限公司 | Method and device for identifying sub-health of object storage equipment and data storage system |
CN108235751A (en) * | 2017-12-18 | 2018-06-29 | 华为技术有限公司 | Identify the method, apparatus and data-storage system of object storage device inferior health |
US11320991B2 (en) | 2017-12-18 | 2022-05-03 | Huawei Technologies Co., Ltd. | Identifying sub-health object storage devices in a data storage system |
CN109189738A (en) * | 2018-09-18 | 2019-01-11 | 郑州云海信息技术有限公司 | Choosing method, the apparatus and system of main OSD in a kind of distributed file system |
CN109992452B (en) * | 2019-03-29 | 2021-06-18 | 新华三技术有限公司 | Fault processing method and device |
CN109992452A (en) * | 2019-03-29 | 2019-07-09 | 新华三技术有限公司 | A kind of fault handling method and device |
CN112711376A (en) * | 2019-10-25 | 2021-04-27 | 北京金山云网络技术有限公司 | Method and device for determining object master copy file in object storage system |
CN111125024A (en) * | 2019-11-29 | 2020-05-08 | 浪潮电子信息产业股份有限公司 | Method, device, equipment and storage medium for deleting distributed system files |
CN111125024B (en) * | 2019-11-29 | 2022-05-24 | 浪潮电子信息产业股份有限公司 | Method, device, equipment and storage medium for deleting distributed system files |
US12001397B2 (en) | 2019-11-29 | 2024-06-04 | Inspur Electronic Information Industry Co., Ltd. | Method, apparatus and device for deleting distributed system file, and storage medium |
CN112506710A (en) * | 2020-12-16 | 2021-03-16 | 深信服科技股份有限公司 | Distributed file system data repair method, device, equipment and storage medium |
CN112506710B (en) * | 2020-12-16 | 2024-02-23 | 深信服科技股份有限公司 | Distributed file system data restoration method, device, equipment and storage medium |
CN117093406A (en) * | 2023-10-18 | 2023-11-21 | 浙江印象软件有限公司 | Log center maintenance method and system |
CN117093406B (en) * | 2023-10-18 | 2024-02-09 | 浙江印象软件有限公司 | Log center maintenance method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20120307 |