CN109981741A

CN109981741A - A maintenance method for a distributed storage system

Info

Publication number: CN109981741A
Application number: CN201910140854.2A
Authority: CN
Inventors: 金辉; 严刚; 侯玉斌
Original assignee: Enlightenment Cloud Computing Co Ltd
Current assignee: Enlightenment Cloud Computing Co Ltd
Priority date: 2019-02-26
Filing date: 2019-02-26
Publication date: 2019-07-05

Abstract

The invention discloses a maintenance method for a distributed storage system, comprising the following steps. Step 1: in service mode, a node going offline does not trigger the recovery process. Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as "valid replicas + write mark", that is, in degraded write mode. Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that were given a write mark while it was offline are synchronized. By introducing a service mode, the invention improves the maintainability of the storage cluster and allows operations such as disk replacement and program upgrades to be carried out while the cluster stays online, with read and write operations working normally throughout.

Description

A maintenance method for a distributed storage system

Technical field

The present invention relates to the field of computer technology, and in particular to a maintenance method for a distributed storage system.

Background art

A distributed storage system starts the recovery process whenever its topology changes and rebalances data automatically. Normally, the recovery process is fully automatic and cannot be interrupted.

During recovery, some operations cannot be executed in order to guarantee data consistency, which affects normal use. As system capacity grows, the amount of data to recover grows, recovery takes longer, and system availability falls. Engineering best practice shows that minimizing how often recovery occurs is an important part of operating a distributed storage system.

In practical operations, however, replacing a disk or upgrading the software version requires first taking the corresponding data node offline and then, once the disk has been replaced or the program upgraded, bringing a new node online. This means recovery must run twice, which is time-consuming and laborious.

Summary of the invention

In view of the above drawbacks of the prior art, the technical problem to be solved by the invention is to provide a maintenance method for a distributed storage system that overcomes the deficiencies of the prior art.

To achieve the above object, the present invention provides a maintenance method for a distributed storage system, comprising the following steps:

Step 1: in service mode, a node going offline does not trigger the recovery process;

Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as "valid replicas + write mark", that is, in degraded write mode;

Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that were given a write mark while it was offline are synchronized.
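Step 3 can be pictured with a minimal Python sketch. The function `rejoin`, the `marks` map, and the in-memory dictionaries are illustrative assumptions, not part of the described method; the sketch only shows that the rejoining node synchronizes the objects marked while it was offline and leaves everything else untouched.

```python
def rejoin(node, marks, healthy_store, node_store):
    """Synchronize only the objects marked for `node` while it was offline.

    marks:         hypothetical map of offline node name -> set of marked object ids
    healthy_store: object id -> data held by an up-to-date replica (assumed reachable)
    node_store:    object id -> data on the rejoining node (possibly stale)
    """
    to_sync = marks.pop(node, set())      # the marks are consumed by this recovery
    for object_id in sorted(to_sync):
        node_store[object_id] = healthy_store[object_id]
    return sorted(to_sync)


# Node "n2" was offline while objects "b" and "d" were written.
marks = {"n2": {"b", "d"}}
healthy = {"a": 1, "b": 20, "c": 3, "d": 40}
stale = {"a": 1, "b": 2, "c": 3, "d": 4}
synced = rejoin("n2", marks, healthy, stale)
print(synced)   # ['b', 'd']
```

Objects "a" and "c" are never copied, which is the saving compared with a full recovery.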

In the above maintenance method for a distributed storage system, the degraded write mode of step 2 comprises the following specific steps:

(1) The gateway receives the client's write request;

(2) The gateway determines, from the current topology, the data node to which the request must be forwarded;

(3) The data node receives the forwarded write request;

(4) Determine whether the cluster is in service mode; if so, enter the degraded write flow and obtain the service-mode topology; otherwise follow the normal write flow;

(5) Compute the replica locations from the service-mode topology;

(6) Compare the current topology with the service-mode topology;

(7) Determine whether any node is offline under the current topology; if so, write the mark, otherwise write the replica. After the mark is written, determine whether the current node is in the service-mode topology; if so, write the replica, otherwise end.
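The decision part of this flow (steps 4 to 7) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the placement rule `replica_nodes`, the in-memory `stores` and `marks` dictionaries, and the function names are all hypothetical, and the per-node checks of step 7 are collapsed into a single loop over the replica locations.

```python
def replica_nodes(object_id, topology, copies):
    """Toy placement rule (an assumption): `copies` consecutive nodes from a hash slot."""
    start = sum(object_id.encode()) % len(topology)
    return [topology[(start + i) % len(topology)] for i in range(copies)]


def handle_write(object_id, data, in_service_mode, service_topology,
                 current_topology, copies, stores, marks):
    """Steps 4-7, seen after a data node has received the forwarded request."""
    if not in_service_mode:                                    # step 4: normal flow
        for node in replica_nodes(object_id, current_topology, copies):
            stores[node][object_id] = data
        return "normal"
    targets = replica_nodes(object_id, service_topology, copies)   # step 5
    offline = set(service_topology) - set(current_topology)        # step 6
    for node in targets:                                            # step 7
        if node in offline:
            marks.setdefault(node, set()).add(object_id)   # record a mark, no data
        else:
            stores[node][object_id] = data                 # write the replica
    return "degraded"


stores = {"n1": {}, "n2": {}, "n3": {}}
marks = {}
mode = handle_write("obj-1", b"v2", True,
                    ["n1", "n2", "n3"], ["n1", "n3"], 3, stores, marks)
print(mode, marks)   # degraded {'n2': {'obj-1'}}
```

Because replica locations are computed from the service-mode topology rather than the current one, the object keeps its original placement while a node is away, so nothing has to move when that node returns.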

In the above maintenance method for a distributed storage system, in service mode, only nodes that already existed in the cluster when service mode was turned on can be taken offline.

In the above maintenance method for a distributed storage system, in service mode, only nodes that have gone offline can be brought back online; new nodes cannot be brought online.

In the above maintenance method for a distributed storage system, in service mode, the working directory of an offline node must not be removed or changed.

In the above maintenance method for a distributed storage system, in service mode, when a node comes back online its startup parameters must be exactly the same as before it went offline.
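The four restrictions above amount to a membership check when a node leaves or returns. The following Python sketch is one possible way to express them; the class `ServiceModeGuard` is hypothetical, and treating the working directory as one of the recorded startup parameters is an assumption made for brevity.

```python
class ServiceModeGuard:
    """Hypothetical checker for the service-mode restrictions."""

    def __init__(self, members):
        # node name -> startup parameters captured when service mode is turned on
        self.members = {name: dict(params) for name, params in members.items()}
        self.offline = set()

    def take_offline(self, name):
        if name not in self.members:
            raise ValueError(f"{name} was not in the cluster when service mode started")
        self.offline.add(name)

    def bring_online(self, name, params):
        if name not in self.offline:
            raise ValueError(f"{name} is not an offline member; new nodes cannot join")
        if params != self.members[name]:
            raise ValueError(f"{name} must restart with its original parameters")
        self.offline.remove(name)


guard = ServiceModeGuard({"n1": {"workdir": "/data/n1", "port": 7000}})
guard.take_offline("n1")
guard.bring_online("n1", {"workdir": "/data/n1", "port": 7000})   # accepted
```

A restart with a different working directory, or an attempt to bring up a node the guard has never seen, raises `ValueError` in this sketch.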

The beneficial effects of the present invention are:

By introducing a service mode, the invention improves the maintainability of the storage cluster and allows operations such as disk replacement and program upgrades to be carried out while the cluster stays online, with read and write operations working normally throughout.

The concept, specific structure, and technical effects of the invention are further described below with reference to the accompanying drawing, so that the purpose, features, and effects of the invention can be fully understood.

Brief description of the drawings

Fig. 1 is a flow chart of the degraded write in service mode according to the invention.

Specific embodiment

A maintenance method for a distributed storage system, comprising the following steps:

Step 1: in service mode, a node going offline does not trigger the recovery process;

First, the service-mode topology and the current topology must both be recorded; they are used to compute the replica locations and the write-mark locations of data objects. Degraded write presupposes that no recovery is in progress; if the data object is undergoing recovery, the write operation waits. The degraded write flow in service mode is shown in Fig. 1, and the degraded write mode comprises the following specific steps:

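(1) The gateway receives the client's write request;

(2) The gateway determines, from the current topology, the data node to which the request must be forwarded;

(3) The data node receives the forwarded write request;

(4) Determine whether the cluster is in service mode; if so, enter the degraded write flow and obtain the service-mode topology; otherwise follow the normal write flow;

(5) Compute the replica locations from the service-mode topology;

(6) Compare the current topology with the service-mode topology;

(7) Determine whether any node is offline under the current topology; if so, write the mark, otherwise write the replica. After the mark is written, determine whether the current node is in the service-mode topology; if so, write the replica, otherwise end.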
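The precondition stated before these steps, that a write waits while its data object is undergoing recovery, could be modelled with a condition variable. The sketch below is an assumption about one way to do this in Python; the class `RecoveryGate` and its method names are hypothetical and not taken from the described system.

```python
import threading


class RecoveryGate:
    """Hypothetical per-object gate: writes wait while the object is being recovered."""

    def __init__(self):
        self._cond = threading.Condition()
        self._recovering = set()

    def begin_recovery(self, object_id):
        with self._cond:
            self._recovering.add(object_id)

    def end_recovery(self, object_id):
        with self._cond:
            self._recovering.discard(object_id)
            self._cond.notify_all()          # wake writers blocked on this gate

    def wait_until_writable(self, object_id, timeout=None):
        """Return True once the object is not recovering, False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: object_id not in self._recovering, timeout=timeout)


gate = RecoveryGate()
gate.begin_recovery("obj-1")
# Recovery finishes on another thread shortly afterwards.
threading.Timer(0.05, gate.end_recovery, args=("obj-1",)).start()
ready = gate.wait_until_writable("obj-1", timeout=2.0)
print(ready)   # True
```

Writes to objects that are not being recovered pass through the gate immediately.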

In addition, in service mode, only nodes that already existed in the cluster when service mode was turned on can be taken offline. In service mode, only nodes that have gone offline can be brought back online; new nodes cannot be brought online. In service mode, the working directory of an offline node must not be removed or changed. In service mode, when a node comes back online its startup parameters must be exactly the same as before it went offline.

I. Principle of the invention:

A distributed storage system applies a strong-consistency policy to replica writes, so a write cannot be executed while a replica is missing and recovery has not yet finished. However, if no recovery is performed while a node is offline, a write to an object with a missing replica does not write the missing replica (which could not be written anyway, because the node is offline) but instead records a write mark for it. As long as the number of valid replicas plus the number of write marks equals the maximum replica count, the strong consistency of the write operation is not broken. When the offline node comes back online, only those data replicas placed on it that carry a write mark are synchronized, which guarantees the strong consistency of the data.
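The counting condition in this paragraph can be checked with a few lines of Python. The function names below are illustrative assumptions; the sketch only restates the rule that valid replicas plus write marks must equal the maximum replica count.

```python
def account_write(target_nodes, offline_nodes):
    """Count valid replicas and write marks produced by one degraded write."""
    valid = sum(1 for node in target_nodes if node not in offline_nodes)
    marked = sum(1 for node in target_nodes if node in offline_nodes)
    return valid, marked


def keeps_strong_consistency(valid_replicas, write_marks, max_copies):
    """The write is acceptable when every replica slot is either written or marked."""
    return valid_replicas + write_marks == max_copies


# Three replicas, with node "n2" offline for maintenance.
valid, marked = account_write(["n1", "n2", "n3"], {"n2"})
print(valid, marked)                                  # 2 1
print(keeps_strong_consistency(valid, marked, 3))     # True
print(keeps_strong_consistency(valid, 0, 3))          # False
```

The last line shows the case the mark exists to prevent: two replicas written and nothing recorded for the third leaves one replica slot unaccounted for.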

A specific embodiment is given below to illustrate the operating method and principle of the invention:

Service mode makes functions such as online disk replacement and online gray upgrade possible. Their implementations are largely the same, so online disk replacement is used here as an example of how to use service mode:

First, with the cluster in a normal state (no node offline and no unfinished recovery), start service mode with the "dog cluster mmode on" command;

Kill the data object service process corresponding to the disk to be replaced, and wait for the topology update to complete (that is, until the "dog cluster info" command shows the topology update caused by the node going offline);

Copy the entire contents of the disk to be replaced to the new disk;

Restart the data node corresponding to the original disk (its startup parameters must be the same as before), and wait for recovery (synchronization of the data objects carrying a write mark) to complete;

Turn off service mode.

Gray upgrade is carried out in much the same way: replacing the disk-replacement step with upgrading the installation package gives the procedure for upgrading one node. The procedure is then repeated node by node until every node has been upgraded to the latest version of the program.
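The order of operations in the procedure above can be sketched as a small Python driver. Everything here is hypothetical: the `cluster` control object and its method names stand in for whatever commands an operator actually runs, and the sketch assumes service mode is turned on and off once per node, which is one reading of "repeated node by node".

```python
def maintain_node(node, do_work, cluster):
    """Run one maintenance pass on `node`; `cluster` is a hypothetical control object."""
    cluster.service_mode_on()
    cluster.stop(node)                     # kill the node's service process
    cluster.wait_topology_update(node)
    do_work(node)                          # copy the disk, or upgrade the package
    cluster.start(node)                    # same startup parameters as before
    cluster.wait_recovery(node)            # only marked objects are synchronized
    cluster.service_mode_off()


def gray_upgrade(nodes, upgrade, cluster):
    """Upgrade the cluster one node at a time."""
    for node in nodes:
        maintain_node(node, upgrade, cluster)


class RecordingCluster:
    """Stand-in that only records the order of calls made to it."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(*args):
            self.calls.append((name, *args))
        return record


cluster = RecordingCluster()
gray_upgrade(["n1", "n2"],
             lambda node: cluster.calls.append(("upgrade", node)), cluster)
print(len(cluster.calls))   # 14
```

Each node contributes seven recorded steps, and the second node is not touched until the first has finished recovery and service mode has been turned off.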

The preferred embodiments of the invention have been described in detail above. It should be understood that those of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning, or limited experimentation on the basis of the prior art under the concept of the invention shall fall within the scope of protection defined by the claims.

Claims

1. A maintenance method for a distributed storage system, characterized by comprising the following steps:

Step 1: in service mode, a node going offline does not trigger the recovery process;

Step 2: when a write operation targets a data object that has lost a replica because a node is offline, the write is handled as "valid replicas + write mark", that is, in degraded write mode;

Step 3: when the offline node comes back online, only that node triggers the recovery process, and only the data objects that were given a write mark while it was offline are synchronized.

2. The maintenance method for a distributed storage system according to claim 1, characterized in that the degraded write mode of step 2 comprises the following specific steps:

(1) The gateway receives the client's write request;

(2) The gateway determines, from the current topology, the data node to which the request must be forwarded;

(3) The data node receives the forwarded write request;

(4) Determine whether the cluster is in service mode; if so, enter the degraded write flow and obtain the service-mode topology; otherwise follow the normal write flow;

(5) Compute the replica locations from the service-mode topology;

(6) Compare the current topology with the service-mode topology;

(7) Determine whether any node is offline under the current topology; if so, write the mark, otherwise write the replica. After the mark is written, determine whether the current node is in the service-mode topology; if so, write the replica, otherwise end.

3. The maintenance method for a distributed storage system according to claim 1, characterized in that, in service mode, only nodes that already existed in the cluster when service mode was turned on can be taken offline.

4. The maintenance method for a distributed storage system according to claim 1, characterized in that, in service mode, only nodes that have gone offline can be brought back online, and new nodes cannot be brought online.

5. The maintenance method for a distributed storage system according to claim 1, characterized in that, in service mode, the working directory of an offline node must not be removed or changed.

6. The maintenance method for a distributed storage system according to claim 1, characterized in that, in service mode, when a node comes back online its startup parameters must be exactly the same as before it went offline.