CN109981741A - A maintenance method for a distributed storage system - Google Patents

A maintenance method for a distributed storage system

Info

Publication number
CN109981741A
Authority
CN
China
Prior art keywords
node
service mode
offline
write
memory system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910140854.2A
Other languages
Chinese (zh)
Inventor
金辉
严刚
侯玉斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enlightenment Cloud Computing Co Ltd
Original Assignee
Enlightenment Cloud Computing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/59 Providing operational support to end devices by off-loading in the network or by emulation, e.g. when they are unavailable

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a maintenance method for a distributed storage system, comprising the following steps: Step 1, in maintenance mode, a node going offline does not trigger the recovery process; Step 2, when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as valid replicas plus write marks, i.e. in degraded write mode; Step 3, when the offline node comes back online, only that node triggers the recovery process, and only the data objects that received a write mark while it was offline are synchronized. By introducing a maintenance mode, the invention improves the maintainability of the storage cluster and allows operations such as disk replacement and program upgrades to be carried out while the cluster stays online, with read and write operations continuing to work normally during maintenance.

Description

A maintenance method for a distributed storage system
Technical field
The present invention relates to the field of computer technology, and in particular to a maintenance method for a distributed storage system.
Background
A distributed storage system starts a recovery process whenever its topology changes, in order to rebalance data automatically. Normally the recovery process is fully automatic and cannot be interrupted.
During recovery, some operations cannot be executed, in order to guarantee data consistency, and this affects normal use by the user. As system capacity grows, the amount of data to recover grows with it, recovery takes longer, and system availability drops. Engineering best practice shows that reducing the frequency of recovery as far as possible is an important part of operating and maintaining a distributed storage system.
In practice, however, replacing a disk or upgrading a version requires first taking the corresponding data node offline and then, after the disk has been replaced or the program upgraded, bringing a new node online. This means recovery must run twice, which is time-consuming and laborious.
Summary of the invention
In view of the above drawbacks of the prior art, the technical problem to be solved by the invention is to provide a maintenance method for a distributed storage system that overcomes the deficiencies of the prior art.
To achieve the above object, the invention provides a maintenance method for a distributed storage system, comprising the following steps:
Step 1: in maintenance mode, a node going offline does not trigger the recovery process;
Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as valid replicas plus write marks, i.e. in degraded write mode;
Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that received a write mark while it was offline are synchronized.
In the above maintenance method for a distributed storage system, the degraded write mode of Step 2 proceeds as follows:
1. The gateway receives the client's write request;
2. The gateway determines, from the current topology, the data node to which the request must be forwarded;
3. The data node receives the forwarded write request;
4. Determine whether the cluster is in maintenance mode; if so, enter the degraded write flow and obtain the maintenance-mode topology; otherwise follow the normal write flow;
5. Calculate the replica locations from the maintenance-mode topology;
6. Compare the current topology with the maintenance-mode topology;
7. Determine whether a node is offline under the current topology; if so, write a write mark, otherwise write the replica. After the write mark has been written, determine whether the current node belongs to the maintenance-mode topology; if so, write the replica, otherwise end.
In the above maintenance method, in maintenance mode only nodes that already existed in the cluster when maintenance mode was turned on can be taken offline.
In the above maintenance method, in maintenance mode only nodes that have gone offline can be brought back online; new nodes cannot be brought online.
In the above maintenance method, in maintenance mode the working directory of an offline node must not be removed or changed.
In the above maintenance method, in maintenance mode a node that comes back online must use start-up parameters exactly the same as those it used before going offline.
The beneficial effects of the invention are as follows:
By introducing a maintenance mode, the invention improves the maintainability of the storage cluster and allows operations such as disk replacement and program upgrades to be carried out while the cluster stays online, with read and write operations continuing to work normally during maintenance.
The concept, specific structure, and technical effects of the invention are further described below with reference to the accompanying drawing, so that the purpose, features, and effects of the invention can be fully understood.
Brief description of the drawings
Fig. 1 is a flow chart of the degraded write in maintenance mode according to the invention.
Detailed description of the embodiments
A maintenance method for a distributed storage system, comprising the following steps:
Step 1: in maintenance mode, a node going offline does not trigger the recovery process;
Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as valid replicas plus write marks, i.e. in degraded write mode;
Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that received a write mark while it was offline are synchronized.
First, the maintenance-mode topology and the current topology must both be recorded; they are used to calculate the replica locations and the write-mark locations of a data object. A degraded write presupposes that no recovery is in progress; if the data object is currently being recovered, the write operation waits. The degraded write flow in maintenance mode is shown in Fig. 1, and the degraded write mode proceeds as follows:
1. The gateway receives the client's write request;
2. The gateway determines, from the current topology, the data node to which the request must be forwarded;
3. The data node receives the forwarded write request;
4. Determine whether the cluster is in maintenance mode; if so, enter the degraded write flow and obtain the maintenance-mode topology; otherwise follow the normal write flow;
5. Calculate the replica locations from the maintenance-mode topology;
6. Compare the current topology with the maintenance-mode topology;
7. Determine whether a node is offline under the current topology; if so, write a write mark, otherwise write the replica. After the write mark has been written, determine whether the current node belongs to the maintenance-mode topology; if so, write the replica, otherwise end.
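The degraded write flow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Cluster` class, the `replica_locations` placement function (a hashed walk over a sorted node list), and the in-memory dictionaries for replicas and write marks are all hypothetical, and the per-node decisions of step 7 are flattened into a single loop. The point it tries to show is that, in maintenance (service) mode, replica locations are computed from the topology recorded when the mode was turned on, so an offline replica holder receives a write mark instead of causing the object to move.

```python
import hashlib
from dataclasses import dataclass, field


def replica_locations(object_id: str, topology: list[str], copies: int) -> list[str]:
    """Hypothetical placement: walk a sorted node ring from a hashed start index."""
    ring = sorted(topology)
    start = int(hashlib.sha256(object_id.encode()).hexdigest(), 16) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(copies, len(ring)))]


@dataclass
class Cluster:
    maintenance_topology: list[str]   # nodes present when maintenance mode was turned on
    current_topology: list[str]       # nodes that are online right now
    maintenance_mode: bool = False
    copies: int = 3                   # assumed maximum replica count
    replicas: dict = field(default_factory=dict)     # (node, object_id) -> data
    write_marks: dict = field(default_factory=dict)  # offline node -> set of object ids

    def write(self, object_id: str, data: bytes) -> int:
        """Return the number of valid replicas plus write marks recorded."""
        if not self.maintenance_mode:
            # Normal write flow: place replicas using the current topology.
            targets = replica_locations(object_id, self.current_topology, self.copies)
            for node in targets:
                self.replicas[(node, object_id)] = data
            return len(targets)

        # Degraded write flow: placement comes from the maintenance-mode topology,
        # so replica locations stay fixed while nodes are offline.
        targets = replica_locations(object_id, self.maintenance_topology, self.copies)
        online = set(self.current_topology)
        recorded = 0
        for node in targets:
            if node in online:
                self.replicas[(node, object_id)] = data                  # valid replica
            else:
                self.write_marks.setdefault(node, set()).add(object_id)  # write mark
            recorded += 1
        return recorded
```

In this sketch a write succeeds once every computed location has either a replica or a write mark; where a real system stores the marks is not stated in the text and is left open here.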
In addition, in maintenance mode only nodes that already existed in the cluster when maintenance mode was turned on can be taken offline. In maintenance mode, only nodes that have gone offline can be brought back online; new nodes cannot be brought online. In maintenance mode, the working directory of an offline node must not be removed or changed. In maintenance mode, a node that comes back online must use start-up parameters exactly the same as those it used before going offline.
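These four restrictions lend themselves to a simple membership check. The sketch below is illustrative only: `NodeConfig`, `MaintenanceSession`, and the representation of start-up parameters as a tuple are hypothetical names chosen for the example, and nothing in the text says the restrictions are enforced this way. It records the cluster membership at the moment maintenance (service) mode is turned on and rejects any offline or online request that would violate a restriction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    node_id: str
    work_dir: str
    start_params: tuple  # hypothetical: the node's start-up arguments


class MaintenanceSession:
    """Tracks which nodes may go offline or rejoin while maintenance mode is on."""

    def __init__(self, members: list[NodeConfig]):
        # Snapshot of the cluster at the moment maintenance mode was turned on.
        self.members = {m.node_id: m for m in members}
        self.offline: set[str] = set()

    def take_offline(self, node_id: str) -> None:
        if node_id not in self.members:
            raise ValueError(f"{node_id} was not in the cluster when maintenance mode started")
        self.offline.add(node_id)

    def bring_online(self, node: NodeConfig) -> None:
        if node.node_id not in self.offline:
            raise ValueError(f"{node.node_id} is new or not offline; it cannot join now")
        if node != self.members[node.node_id]:
            raise ValueError(f"{node.node_id} must keep its working directory and parameters")
        self.offline.discard(node.node_id)
```

Comparing the whole frozen `NodeConfig` covers both the working-directory rule and the identical-parameters rule in one equality test.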
I. Principle of the invention:
A distributed storage system applies a strong consistency policy to replica writes, so while a replica is missing and recovery has not yet finished, write operations cannot be executed. However, if no recovery runs while the node is offline, a write to an object with a missing replica does not write the missing replica (it could not anyway, because the node is offline) but instead records a write mark for the missing replica. As long as the number of valid replicas plus write marks equals the maximum replica count, the strong consistency of the write operation is not broken. When the offline node comes back online, the data replicas located on that node that carry a write mark are synchronized, which guarantees strong consistency of the data.
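The counting rule and the targeted synchronization described in this principle can be illustrated with a short sketch. It is a simplified model under assumptions: the function names, the dictionary-based stores, and the choice of a single peer as the copy source are hypothetical, and a real system would also have to pick a valid replica holder and handle failures during the copy.

```python
def write_is_strongly_consistent(valid_replicas: int, write_marks: int, max_copies: int) -> bool:
    """A degraded write is acceptable when replicas plus marks reach the maximum replica count."""
    return valid_replicas + write_marks == max_copies


def recover_rejoined_node(node: str, node_store: dict, peer_store: dict, write_marks: dict) -> list:
    """Copy only the write-marked objects from a peer to the rejoined node.

    Returns the ids that were synchronized; unmarked objects are left untouched.
    """
    marked = write_marks.pop(node, set())
    for object_id in sorted(marked):
        node_store[object_id] = peer_store[object_id]
    return sorted(marked)
```

Because only marked objects are copied, the cost of recovery in this model scales with the number of objects written during the offline window rather than with the total data held by the node.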
A specific embodiment is given below to illustrate the operating method and principle of the invention:
Maintenance mode makes it possible to implement functions such as online disk replacement and online gray (node-by-node) upgrade. The implementations are largely the same with minor differences; online disk replacement is used here as an example of how maintenance mode is used:
First, with the cluster in a normal state (no node offline and no unfinished recovery), start maintenance mode with the command "dog cluster mmode on";
Kill the data object service process corresponding to the disk to be replaced, and wait for the topology update to complete (that is, the topology update caused by the node going offline can be seen with the command "dog cluster info");
Copy the entire contents of the disk to be replaced onto the new disk;
Restart the data node corresponding to the original disk (its start-up parameters must be the same as before) and wait for recovery (synchronization of the data objects carrying write marks) to complete;
Close maintenance mode.
A gray upgrade is implemented in a similar way: replacing the disk-replacement step with installation of the upgrade package gives the procedure for upgrading one node. This procedure is then repeated node by node until all nodes have been upgraded to the latest version of the program.
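The procedure above, and its node-by-node repetition for a gray upgrade, can be written down as an ordered checklist. The sketch below only builds the list of steps as text; it does not execute anything. The two `dog` commands are the ones quoted in the text, while the function names and the wording of the remaining steps are illustrative assumptions. The command for closing maintenance (service) mode is not given in the text, so that step is left as a description rather than a command.

```python
def maintenance_steps(node: str, action: str) -> list[str]:
    """Ordered steps for one node; `action` is the work done while the node is offline."""
    return [
        "check the cluster is healthy (no offline nodes, no unfinished recovery)",
        "run: dog cluster mmode on",
        f"stop the data object service process on {node}",
        "run: dog cluster info  # wait until the topology update is visible",
        f"{action} on {node}",
        f"restart {node} with the same start-up parameters as before",
        "wait for recovery of the write-marked data objects to finish",
        "close maintenance mode",
    ]


def rolling_upgrade_plan(nodes: list[str]) -> list[str]:
    """Repeat the single-node procedure for every node, one node at a time."""
    plan: list[str] = []
    for node in nodes:
        plan.extend(maintenance_steps(node, "install the upgrade package"))
    return plan
```

For example, `maintenance_steps("node1", "copy the old disk to the new disk")` yields the disk-replacement checklist, and `rolling_upgrade_plan(["node1", "node2"])` yields the same eight steps twice, once per node.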
The preferred embodiments of the invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of the invention shall fall within the scope of protection defined by the claims.

Claims (6)

1. A maintenance method for a distributed storage system, characterized by comprising the following steps:
Step 1: in maintenance mode, a node going offline does not trigger the recovery process;
Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as valid replicas plus write marks, i.e. in degraded write mode;
Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that received a write mark while it was offline are synchronized.
2. The maintenance method for a distributed storage system according to claim 1, characterized in that the degraded write mode of Step 2 proceeds as follows:
1. The gateway receives the client's write request;
2. The gateway determines, from the current topology, the data node to which the request must be forwarded;
3. The data node receives the forwarded write request;
4. Determine whether the cluster is in maintenance mode; if so, enter the degraded write flow and obtain the maintenance-mode topology; otherwise follow the normal write flow;
5. Calculate the replica locations from the maintenance-mode topology;
6. Compare the current topology with the maintenance-mode topology;
7. Determine whether a node is offline under the current topology; if so, write a write mark, otherwise write the replica. After the write mark has been written, determine whether the current node belongs to the maintenance-mode topology; if so, write the replica, otherwise end.
3. The maintenance method for a distributed storage system according to claim 1, characterized in that: in maintenance mode, only nodes that already existed in the cluster when maintenance mode was turned on can be taken offline.
4. The maintenance method for a distributed storage system according to claim 1, characterized in that: in maintenance mode, only nodes that have gone offline can be brought back online; new nodes cannot be brought online.
5. The maintenance method for a distributed storage system according to claim 1, characterized in that: in maintenance mode, the working directory of an offline node must not be removed or changed.
6. The maintenance method for a distributed storage system according to claim 1, characterized in that: in maintenance mode, a node that comes back online must use start-up parameters exactly the same as those it used before going offline.
CN201910140854.2A 2019-02-26 2019-02-26 A maintenance method for a distributed storage system Pending CN109981741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910140854.2A CN109981741A (en) 2019-02-26 2019-02-26 A maintenance method for a distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910140854.2A CN109981741A (en) 2019-02-26 2019-02-26 A maintenance method for a distributed storage system

Publications (1)

Publication Number Publication Date
CN109981741A true CN109981741A (en) 2019-07-05

Family

ID=67077337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910140854.2A Pending CN109981741A (en) 2019-02-26 2019-02-26 A maintenance method for a distributed storage system

Country Status (1)

Country Link
CN (1) CN109981741A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990339A (en) * 2019-10-15 2020-04-10 平安科技(深圳)有限公司 Distributed storage file reading and writing method, device and platform and readable storage medium

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599032A (en) * 2009-05-31 2009-12-09 成都市华为赛门铁克科技有限公司 Storage node management method, control subsystem and storage system
WO2011023134A1 (en) * 2009-08-28 2011-03-03 Beijing Innovation Works Technology Company Limited Method and system for managing distributed storage system through virtual file system
CN102202044A (en) * 2011-02-25 2011-09-28 北京兴宇中科科技开发股份有限公司 Portable cloud storage method and device
US20120166390A1 (en) * 2010-12-23 2012-06-28 Dwight Merriman Method and apparatus for maintaining replica sets
CN103546579A (en) * 2013-11-07 2014-01-29 陈靓 Method for improving availability of distributed storage system through data logs
US20140089912A1 (en) * 2012-09-21 2014-03-27 Silver Spring Networks, Inc. System and method for efficiently updating firmware for nodes in a mesh network
EP2755161A1 (en) * 2013-01-14 2014-07-16 Accenture Global Services Limited Secure online distributed data storage services
CN104216719A (en) * 2013-05-30 2014-12-17 深圳创维无线技术有限公司 Method and device for updating android system
CN104615606A (en) * 2013-11-05 2015-05-13 阿里巴巴集团控股有限公司 Hadoop distributed file system and management method thereof
CN104618487A (en) * 2015-02-06 2015-05-13 杭州华三通信技术有限公司 Internet protocol storage on-line upgrading method and device
CN105094913A (en) * 2015-07-31 2015-11-25 广东欧珀移动通信有限公司 System, base band fastener and system application upgrading method and device
CN105659213A (en) * 2013-10-18 2016-06-08 日立数据系统工程英国有限公司 Target-driven independent data integrity and redundancy recovery in a shared-nothing distributed storage system
CN106406758A (en) * 2016-09-05 2017-02-15 华为技术有限公司 Data processing method based on distributed storage system, and storage equipment
US20170116302A1 (en) * 2015-10-22 2017-04-27 Maxta, Inc. Replica Checkpointing Without Quiescing
CN106776142A (en) * 2016-12-23 2017-05-31 深圳市深信服电子科技有限公司 A kind of date storage method and data storage device
CN107526536A (en) * 2016-06-22 2017-12-29 伊姆西公司 For managing the method and system of storage system
CN107943510A (en) * 2017-11-23 2018-04-20 郑州云海信息技术有限公司 Distributed memory system upgrade method, system, device and readable storage medium storing program for executing
US9983823B1 (en) * 2016-12-09 2018-05-29 Amazon Technologies, Inc. Pre-forking replicas for efficient scaling of a distribued data storage system
CN108319618A (en) * 2017-01-17 2018-07-24 阿里巴巴集团控股有限公司 A kind of data distribution control method, system and the device of distributed memory system
CN108427537A (en) * 2018-01-12 2018-08-21 上海凯翔信息科技有限公司 Distributed memory system and its file write-in optimization method, client process method
US10069914B1 (en) * 2014-04-21 2018-09-04 David Lane Smith Distributed storage system for long term data storage
CN108780460A (en) * 2016-03-25 2018-11-09 英特尔公司 For the distribution index in distributed memory system and the method and apparatus of repositioning object segmentation
US20180349071A1 (en) * 2017-05-30 2018-12-06 Kyocera Document Solutions Inc. Image forming apparatus management system including plural image forming apparatuses and management server for remotely managing plural image forming apparatuses via network, and image forming apparatus management method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
施超: "基于Android平台OTA增量升级系统研究与设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990339A (en) * 2019-10-15 2020-04-10 平安科技(深圳)有限公司 Distributed storage file reading and writing method, device and platform and readable storage medium
CN110990339B (en) * 2019-10-15 2023-09-19 平安科技(深圳)有限公司 File read-write method, device and platform for distributed storage and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20220830)