CN109639794B - Stateful cluster recovery method, apparatus, device and readable storage medium - Google Patents

Stateful cluster recovery method, apparatus, device and readable storage medium

Info

Publication number
CN109639794B
CN109639794B (application CN201811507350.1A)
Authority
CN
China
Prior art keywords
identity
node
master
cluster
state
Prior art date
Legal status
Active
Application number
CN201811507350.1A
Other languages
Chinese (zh)
Other versions
CN109639794A (en)
Inventor
杜鹏飞 (Du Pengfei)
Current Assignee
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201811507350.1A
Publication of CN109639794A
Application granted
Publication of CN109639794B

Classifications

    • H ELECTRICITY · H04 ELECTRIC COMMUNICATION TECHNIQUE · H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/51 Network services: discovery or management thereof, e.g. service location protocol [SLP] or web services
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5041 Network service management characterised by the time relationship between creation and deployment of a service
    • H04L61/5007 Address allocation: Internet protocol [IP] addresses
    • H04L67/01 Protocols
    • H04L69/163 In-band adaptation of TCP data exchange; in-band control procedures

Abstract

The invention discloses a stateful cluster recovery method, which comprises the following steps: after a target node restarts, acquiring the identity file recorded by the distributed coordination service; determining the master node identity from the identity file, and judging whether the master node identity is the same as the local identity; if so, acquiring a distributed lock of the distributed coordination service, and setting, on the local network card, the VIP (virtual IP address) through which the stateful cluster provides access services externally; if not, after the master node acquires the distributed lock, joining the stateful cluster as a slave node and joining the master-identity application queue. The method guarantees the integrity of the stateful cluster's data while the cluster runs, when the whole cluster restarts, and when a single node restarts. The invention also discloses a stateful cluster recovery apparatus, device and readable storage medium, which have corresponding technical effects.

Description

Stateful cluster recovery method, apparatus, device and readable storage medium
Technical Field
The present invention relates to the field of computer application technologies, and in particular, to a method, an apparatus, a device, and a readable storage medium for recovering a stateful cluster.
Background
In IT systems such as cloud computing, big data and artificial intelligence, many key services store the core data of the business, and their normal operation is a prerequisite for the stable running of the system. To avoid single points of failure and data loss, a cluster of redundantly backed-up nodes is generally formed to provide the service externally as a unit. Services whose data changes over time are called stateful services; examples include a MariaDB cluster based on Galera technology as the database service, active-standby clusters of ovn-db and MongoDB, and rabbitmq-server as the message forwarding service. When a node providing the service fails (e.g., power loss or network failure), the service on the other nodes can continue to work.
Multiple nodes form a stateful cluster; each node stores its own copy of the data, and data consistency across nodes is maintained through cluster heartbeats and synchronization. Some clusters let multiple nodes provide read-write capability simultaneously, such as Galera-MariaDB and rabbitmq-server; others are divided into master and slave roles, where only the master node provides read-write capability and the slaves only provide reads. In terms of cluster recovery, each of these clusters can easily handle a single-node failure and rejoin. However, it is difficult to restore the cluster to normal when several nodes fail (e.g., power loss or network oscillation), or even all nodes fail, or the cluster is shut down deliberately (e.g., taken down for maintenance). The problem is especially prominent in scenarios that require automatic recovery after a full power-on. Specifically: when the cluster restarts, the node that went down last should be started first, because its data is the most complete (the data of nodes shut down earlier may be incomplete). That is, when the whole cluster restarts, an arbitration module is needed to decide which node starts first; this is usually determined by the shutdown order, and the arbitration module probes to find which node was shut down last and lets that node start first to guarantee data integrity. As shown in fig. 1 (the start-up sequence follows the hollow arrows and is the reverse of the shutdown order), a MariaDB cluster is commonly started through a MariaDB agent managed by Pacemaker. However, managing the start-up and operation of the cluster through an additional Pacemaker module has the following disadvantages:
the first disadvantage is that: the pacemaker itself relies on corosyn, which has poor stability when the network oscillates, and the complexity of the system is increased.
Disadvantage two: when Pacemaker manages each business module, an agent must be configured; each agent is implemented differently, and when the business module's version is upgraded, the agent may become incompatible.
Disadvantage three: Pacemaker suits business modules started natively, but it handles containerized business modules poorly.
Disadvantage four: the Pacemaker state machine is complex, and the agents are tightly coupled to the business modules, which makes maintenance difficult.
In summary, how to effectively guarantee data integrity when a cluster restarts is a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The object of the present invention is to provide a stateful cluster recovery method, apparatus, device and readable storage medium, so as to guarantee data integrity when the cluster restarts.
In order to solve the technical problems, the invention provides the following technical scheme:
a stateful cluster recovery method, comprising:
after a target node restarts, acquiring the identity file recorded by the distributed coordination service;
determining the master node identity from the identity file, and judging whether the master node identity is the same as the local identity;
if so, acquiring a distributed lock of the distributed coordination service, and setting, on the local network card, the VIP (virtual IP address) through which the stateful cluster provides access services externally;
and if not, after the master node acquires the distributed lock, joining the stateful cluster as a slave node and joining the master-identity application queue.
Preferably, setting, on the local network card, the VIP through which the stateful cluster provides access services externally includes:
setting the master-slave service state of the local machine to the master state, and adding the VIP to the local network card.
Preferably, after joining the stateful cluster as a slave node and joining the master-identity application queue, the method further includes:
cyclically listening for the distributed lock of the distributed coordination service and for state change messages of the master-slave services.
Preferably, after cyclically listening for the distributed lock of the distributed coordination service and the state change messages of the master-slave services, the method further includes:
upon acquiring the distributed lock, executing the step of acquiring the distributed lock of the distributed coordination service and setting, on the local network card, the VIP through which the stateful cluster provides access services externally;
and writing the local identity into the identity file as the master node identity.
Preferably, acquiring the distributed lock includes:
acquiring the distributed lock by contention.
Preferably, acquiring the identity file recorded by the distributed coordination service after the target node restarts includes:
after the target node restarts, judging whether the VIP exists on the local network card;
if the VIP exists, deleting the VIP from the local network card, initializing the local state to the initialization state, and writing that state into the identity file;
and acquiring the identity file through an interface of the distributed coordination service.
Preferably, the distributed coordination service continuously maintains the identity file and updates it when the master identity changes; the identity file saves the current state of the target node, the current state being any one of an initialization state, a master-ready state, a master-waiting state, a slave-ready state and a slave state.
A stateful cluster recovery apparatus, comprising:
an identity file acquisition module, configured to acquire the identity file recorded by the distributed coordination service after the target node restarts;
an identity judgment module, configured to determine the master node identity from the identity file and judge whether the master node identity is the same as the local identity;
a master identity determining module, configured to, if they are the same, acquire a distributed lock of the distributed coordination service and set, on the local network card, the VIP (virtual IP address) through which the stateful cluster provides access services externally;
and a slave identity determining module, configured to, if they are different, join the stateful cluster as a slave node and join the master-identity application queue after the master node acquires the distributed lock.
A stateful cluster recovery device, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the above stateful cluster recovery method when executing the computer program.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above stateful cluster recovery method.
By applying the method provided by the embodiment of the present invention: after the target node restarts, the identity file recorded by the distributed coordination service is acquired; the master node identity is determined from the identity file and compared with the local identity; if they are the same, the distributed lock of the distributed coordination service is acquired, and the VIP through which the stateful cluster provides access services externally is set on the local network card; if not, after the master node acquires the distributed lock, the node joins the stateful cluster as a slave and joins the master-identity application queue.
Whether only the target node restarts or the whole cluster restarts, the target node first obtains the identity file after restarting and then uses it to determine the master node identity. It judges whether the master node identity is the same as the local identity; if so, the target node was the master node before the restart, so it acquires the distributed lock and sets, on the local network card, the VIP through which the stateful cluster provides external access. The master node is locked by means of the distributed lock, and when the master node fails, the other nodes contend for the master identity by contending for the lock. If the identities differ, the target node was a slave node before the restart, and it can join the stateful cluster and the master-identity application queue once the master node has acquired the distributed lock. Because the VIP through which the stateful cluster serves the outside still sits on the master node after the restart, no particular restart order is required even when the whole cluster restarts: each node directly rejoins the cluster with the master or slave identity it held in the stateful service before the restart. In addition, if the master node fails while the service is running, another node in the master-identity application queue can obtain the master identity by contending for the distributed lock. The data integrity of the stateful cluster can therefore be guaranteed while the cluster runs, when the whole cluster restarts, and when a single node restarts. Compared with managing the start-up and operation of the cluster through an additional Pacemaker module, the method provided by the embodiment of the present invention is simple in structure and easy to implement.
In addition, in the embodiment of the present invention, because the VIP through which the stateful cluster serves the outside is placed on the master node, a TCP connection can be established directly between the service and the client, bypassing the SLB (server load balancer) and achieving better maintainability and more efficient communication.
Accordingly, the embodiments of the present invention further provide a stateful cluster recovery apparatus, device and readable storage medium corresponding to the above stateful cluster recovery method, which have the above technical effects and are not described again here.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a restart diagram of a stateful cluster recovery method according to the prior art;
FIG. 2 is a flowchart of an implementation of a stateful cluster recovery method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a stateful cluster in operation before a restart in an embodiment of the present invention;
FIG. 4 is a restart diagram of a stateful cluster being restarted in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the start-up of the master-election module in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the operation of the master-election module on a slave node when the master node fails in an embodiment of the present invention;
FIG. 7 is a schematic diagram of a master node reboot in an embodiment of the present invention;
FIG. 8 is a schematic diagram of a slave node reboot in an embodiment of the present invention;
FIG. 9 is a schematic diagram of node IP planning in an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a stateful cluster recovery apparatus in an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a stateful cluster recovery device in an embodiment of the present invention;
FIG. 12 is another schematic structural diagram of a stateful cluster recovery device in an embodiment of the present invention.
Detailed Description
The core of the present invention is to provide a stateful cluster recovery method.
in order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment 1:
referring to fig. 2, fig. 2 is a flowchart of a stateful cluster restoration method in an embodiment of the present invention, where the method is applicable to each node in a stateful cluster, and the method includes the following steps:
s101, after the target node is restarted, acquiring an identity identification file of the distributed coordination service record.
It should be noted that the target node in the embodiment of the present invention may be any node in the stateful cluster. The restart of the target node may be caused by a power-down restart, a fault restart or another planned restart, and it may be a single-point restart or the restart of each node when the whole stateful cluster restarts. That is, when the stateful cluster restarts, every node may be regarded as a target node and execute the recovery method provided by the embodiment of the present invention. The stateful cluster may be, for example, MariaDB + Galera, rabbitmq-server, ovn-db or MongoDB. In addition, the recovery method provided by the embodiment of the present invention is also applicable to restart recovery of stateful cluster software; for the specific recovery of stateful cluster software, refer to the stateful cluster recovery method described herein.
After the target node restarts, it first acquires the identity file recorded by the distributed coordination service. Specifically, the distributed coordination service may be a common distributed coordination service such as etcd, Consul or ZooKeeper, which is not detailed here.
The distributed coordination service continuously maintains the identity file and updates it when the master identity changes. The identity file saves the current state of the target node: one of the initialization state, master state, master-ready state, master-waiting state, slave-ready state and slave state. Here, init represents the initialization state, master the master state, to_master the master-ready state, wait_master the master-waiting state, to_slave the slave-ready state (also called the standby-ready state), and slave the slave state (also called the standby state). Whenever the state of the target node changes, its current state is written into the identity file. Specifically, the identity file may be the /var/run/leader file.
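As an illustration, the two records might look as follows. This is a minimal sketch assuming etcd v3 as the coordination service and the key name master-ID used in the later embodiment; the method itself does not prescribe this layout.
etcdctl get master-ID                  # prints the key and its value, e.g. host1
cat /var/run/leader                    # one of: init | to_master | wait_master | master | to_slave | slave
echo master > /var/run/leader          # written by the master-election module on each state change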
S102, determining the master node identity from the identity file, and judging whether the master node identity is the same as the local identity.
The node ID corresponding to the master state can be read from the identity file and compared with the local ID; if they are the same, the master node identity is the same as the local identity. Of course, if the target node restarts together with the whole cluster, the current state recorded for the local identity can also be read from the identity file (i.e., the state of the target node last saved by the distributed coordination service before the restart); if that state is the master state, the master node identity can likewise be considered the same as the local identity, and otherwise different. In this way it can be determined from the judgment result whether the target node was the master node before the stateful cluster restarted, or whether a new master node appeared after the target node went offline.
If the judgment result is yes, the target node is still the master node; if no, the target node is a slave node. The node then joins the stateful cluster with the corresponding identity according to the judgment result: if yes, the operation of step S103 is executed; if no, the operation of step S104 is executed.
S103, acquiring a distributed lock of the distributed coordination service, and setting, on the local network card, the VIP through which the stateful cluster provides access services externally.
In the embodiment of the present invention, the master node identity is locked by means of a distributed lock; that is, when the target node is determined to be the master node, it can acquire the distributed lock of the distributed coordination service. A Virtual IP Address, namely the VIP through which the stateful cluster provides external access, can then be set on the local network card. A VIP differs from a real IP address; for example, a proxy server assigns a range of virtual IP addresses and, according to certain rules and the number of clients, hands one virtual IP address to each client so that the client can connect to the Internet indirectly. Here, the VIP is mainly used for switching between different hosts, chiefly for master-slave switching of servers. Specifically, the concrete access process via the VIP can refer to the VIP of today's common SLB services: when a client accesses the access service provided by the stateful cluster, it does so through the VIP. Deploying the VIP on the master node means that all service and business processing lands on the master node, and the SLB can be removed. It should be noted that, in the embodiment of the present invention, the services deployed in the stateful cluster refer to services that can restart automatically after an abnormal exit, implemented via systemd or docker. The distributed lock is bound to the master node, and when the master node fails, the other nodes acquire the master identity by contending for the lock.
Specifically, when adding the VIP, the master-slave service state of the local machine can also be set to the master state. That is, besides adding the VIP to the local network card, the master-slave service state on the local node is set to the master state, so that external access via the VIP works and the services on the other slave nodes can be managed. A master-slave service is a service with two selectable states, master and slave; for example, the ovn-db service has a master state and a slave state, and its state on a given node is master when deployed on the master node and slave when deployed on a slave node.
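As a concrete illustration, attaching and announcing a VIP on a Linux network card can be sketched as below; the interface name eth0, the /24 prefix and the address 10.0.0.254 (taken from the IP planning of the later embodiment) are assumptions, not prescriptions of the method.
ip addr add 10.0.0.254/24 dev eth0     # attach the cluster VIP on the node holding the master identity
arping -c 3 -U -I eth0 10.0.0.254      # optional gratuitous ARP so peers learn the VIP's new location quickly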
S104, after the master node acquires the distributed lock, joining the stateful cluster as a slave node and joining the master-identity application queue.
After the master node acquires the distributed lock, the target node can join the stateful cluster as a slave node and, at the same time, join the master-identity application queue, so that if the master node later fails, it can obtain the master identity, guaranteeing the service stability of the stateful cluster.
By applying the method provided by the embodiment of the present invention: after the target node restarts, the identity file recorded by the distributed coordination service is acquired; the master node identity is determined from the identity file and compared with the local identity; if they are the same, the distributed lock of the distributed coordination service is acquired, and the VIP through which the stateful cluster provides access services externally is set on the local network card; if not, after the master node acquires the distributed lock, the node joins the stateful cluster as a slave and joins the master-identity application queue.
Whether only the target node restarts or the whole cluster restarts, the target node first obtains the identity file after restarting and then uses it to determine the master node identity. It judges whether the master node identity is the same as the local identity; if so, the target node was the master node before the restart, so it acquires the distributed lock and sets, on the local network card, the VIP through which the stateful cluster provides external access. The master node is locked by means of the distributed lock, and when the master node fails, the other nodes contend for the master identity by contending for the lock. If the identities differ, the target node was a slave node before the restart, and it can join the stateful cluster and the master-identity application queue once the master node has acquired the distributed lock. Because the VIP through which the stateful cluster serves the outside still sits on the master node after the restart, no particular restart order is required even when the whole cluster restarts: each node directly rejoins the cluster with the master or slave identity it held in the stateful service before the restart. In addition, if the master node fails while the service is running, another node in the master-identity application queue can obtain the master identity by contending for the distributed lock. The data integrity of the stateful cluster can therefore be guaranteed while the cluster runs, when the whole cluster restarts, and when a single node restarts. Compared with managing the start-up and operation of the cluster through an additional Pacemaker module, the method provided by the embodiment of the present invention is simple in structure and easy to implement.
In addition, in the embodiment of the present invention, because the VIP through which the stateful cluster serves the outside is placed on the master node, a TCP connection can be established directly between the service and the client, bypassing the SLB and achieving better maintainability and more efficient communication.
It should be noted that, based on the above embodiment, the embodiments of the present invention also provide corresponding improvements. Steps in the preferred/improved embodiments below that are the same as, or correspond to, steps in the above embodiment may be cross-referenced, as may the corresponding beneficial effects; they are not listed again in the preferred/improved embodiments.
Preferably, considering that the master node may fail while the cluster is running, and to avoid the cluster being left without a master, the embodiment of the present invention further provides the following solution:
after step S104 is executed, that is, after the slave node is added to the stateful cluster and added to the application master identity queue, the following steps may also be executed:
step one, cyclically listening for the distributed lock of the distributed coordination service and for state change messages of the master-slave services;
step two, upon acquiring the distributed lock, executing the step of acquiring the distributed lock of the distributed coordination service and setting, on the local network card, the VIP through which the stateful cluster provides access services externally;
and step three, writing the local identity into the identity file as the master node identity.
For convenience of description, the above three steps are described together below.
Cyclically listening for the distributed lock of the distributed coordination service allows the node to know the running state of the cluster in real time, such as whether the master node is normal. By obtaining the master-slave service state change messages, the node can adjust the state of its own master-slave services in time, so as to better provide access services externally. When the master node fails, the distributed lock can be acquired by contention, specifically in the following ways:
the first method is as follows: zookeeper implementing distributed lock
Implementations include realizing a shared lock through the uniqueness of node names and realizing a shared lock through ephemeral sequential nodes. The idea of the name-uniqueness approach: because node names are unique, when locking, all contenders try to create the /test/Lock node together; only one creation succeeds, and that node obtains the lock. To unlock, the /test/Lock node is simply deleted, and the remaining nodes contend to create it again, until every node has acquired the lock. The idea of the ephemeral-sequential-node approach: to lock, every contender creates an ephemeral sequential node under the /lock directory; if a contender finds that its own node has the smallest sequence number under /lock/, it obtains the lock. Otherwise, it watches the node whose sequence number immediately precedes its own (the largest node smaller than its own) and waits.
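A minimal sketch of the name-uniqueness variant, typed into the stock ZooKeeper zkCli shell; the /test/Lock path follows the description above, while the stored value host1 is an assumption.
create -e /test/Lock "host1"           # every contender runs this; exactly one creation succeeds
                                       # -e makes the node ephemeral, so the lock is released
                                       # automatically if its holder's session dies
delete /test/Lock                      # explicit unlock; the others contend to create it again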
Method two: implementing a distributed lock with Redis
Redis implements a distributed lock mainly with four commands. setnx (SET if Not eXists, maintaining an optimistic lock): sets the value for a key only when the key does not exist; setnx differs from set in that set overwrites the value when the key exists, while setnx assigns the key and value only when the key does not exist. getset: returns the old value of a key while setting a new one. expire: sets an expiration time. del: deletes a key. The concrete scheme: when acquiring the lock, lock with setnx and add a timeout to the lock with the expire command, so that the lock is released automatically once the timeout is exceeded; the lock's value is a randomly generated UUID, which is checked on release. In addition, an acquisition timeout is set when acquiring the lock, and acquisition is abandoned once it is exceeded. When releasing the lock, the UUID is used to judge whether the lock is one's own; if so, delete is executed to release it.
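A minimal sketch with redis-cli, assuming the key name master_lock and a 10-second timeout; modern Redis lets a single SET combine the setnx and expire steps atomically.
LOCK_ID=$(uuidgen)                               # the randomly generated UUID used as the lock's value
redis-cli SET master_lock "$LOCK_ID" NX EX 10    # NX: only if absent; EX 10: auto-release after 10 s
# Release only if the lock is still our own, judging by the UUID (a server-side
# Lua script would make this check-and-delete atomic):
[ "$(redis-cli GET master_lock)" = "$LOCK_ID" ] && redis-cli DEL master_lock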
Method three: implementing a distributed lock with a database
The implementation uses optimistic locks and pessimistic locks. Optimistic lock: add a version-number field to the table; before each update, query the data together with its version number, then attach the version number as a condition after the WHERE clause of the update statement. A successful update means the lock was obtained; an unsuccessful one means it was not. Pessimistic lock: use SELECT ... FOR UPDATE (X lock) or SELECT ... LOCK IN SHARE MODE (S lock); X locks are generally used more, because write operations usually follow.
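A minimal sketch of the optimistic-lock variant, issued through the mysql client; the table and column names (lock_table, name, owner, version) and the previously read version 42 are illustrative assumptions.
mysql -e "UPDATE lock_table SET owner = 'host1', version = version + 1 \
          WHERE name = 'master_lock' AND version = 42;"
# "1 row affected" means this node obtained the lock; "0 rows affected" means
# another node bumped the version first, so acquisition failed.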
Once the distributed lock is acquired, the step of acquiring the distributed lock of the distributed coordination service and setting, on the local network card, the VIP through which the stateful cluster provides access services externally can be executed, i.e., the operation of step S103 above. The local identity is then written into the identity file as the master node identity, ensuring that the identity file always records the latest state information.
Preferably, considering that the target node may have been the master node before the restart but, because of an overly long failure time or a similar reason, a new master node may already be running normally by the time the target node restarts, acquiring the identity file recorded by the distributed coordination service after the target node restarts specifically includes:
step one, after the target node restarts, judging whether the VIP exists on the local network card;
step two, if the VIP exists, deleting the VIP from the local network card, initializing the local state to the initialization state, and writing that state into the identity file;
and step three, acquiring the identity file through an interface of the distributed coordination service.
For convenience of description, the above three steps are described together below.
To avoid two VIPs existing in the same stateful cluster and destroying the cluster's data consistency, after the target node restarts it can judge whether the VIP exists on the local network card; if so, it deletes the VIP from the local network card, initializes the local state to the init state, and writes that state into the identity file. It then acquires the identity file through an interface of the distributed coordination service.
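Condensed into shell, the restart check might look as follows, under the same assumptions as above (interface eth0, VIP 10.0.0.254, etcd key master-ID).
if ip addr show dev eth0 | grep -q '10\.0\.0\.254'; then
    ip addr del 10.0.0.254/24 dev eth0 # never allow two VIPs in one stateful cluster
fi
echo init > /var/run/leader            # reset the local state to the initialization state
etcdctl get master-ID                  # then fetch the identity record through the service interface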
Embodiment 2:
To help those skilled in the art better understand the technical solutions provided by the embodiments of the present invention, they are described in detail below, taking etcd as the concrete distributed coordination service.
It should be noted that the premise of this embodiment is that the SLB's load-balancing function is cancelled and all service processing lands on the master node; this places certain requirements on the overall scale and service pressure of the system and is only suitable for small and medium-scale service architectures. All services can restart automatically after an abnormal exit, implemented via systemd or docker. When the cluster restarts, the previous master node must be able to start normally; for example, if the cluster contains nodes 1, 2 and 3 and node 1 is the master, then after the cluster restarts as a whole, node 1 must come up, while nodes 2 and 3 need not.
Based on the stateful cluster recovery method provided in Embodiment 1, a master-election module is designed; that is, the master-election module implements the stateful cluster recovery method provided in Embodiment 1.
Referring to fig. 3, fig. 4, fig. 5 and fig. 6: fig. 3 is a schematic diagram of a stateful cluster in operation before a restart in an embodiment of the present invention; fig. 4 is a restart diagram of the stateful cluster being restarted; fig. 5 is a schematic diagram of the start-up of the master-election module; and fig. 6 is a schematic diagram of the operation of the master-election module on a slave node when the master node fails.
First, a master-election module is deployed on every node in the stateful cluster. The module is initialized; for example, any node can be designated in etcd as the master node, i.e., some node in the cluster serves as the initial master node. After the master-election module on each node starts, the following steps can be executed:
if the local network card has VIP, deleting the VIP; setting the state of a local machine to be an initialization state, and setting the initialization state to var/run/leader; checking the ID of the host node through the ETCD interface, and judging whether the ID of the host node is the same as the ID of the host node; specifically, if the two nodes are the same, the node is set as a master node in the initialization setting; if not, it means that the present node is set as a slave node.
When the node is set as the master node: acquire the leader lock through the etcd interface, change the local state to the master-ready state, and write the state into /var/run/leader; then wait 5 seconds and notify the local standby services to switch to the master state (e.g., ovn-db; multi-master services such as Galera + MariaDB need no change); write the local node's ID into etcd as the master node identity, change the local state to the master state, and write the master state into /var/run/leader; finally add the VIP to the network card, and cyclically listen for etcd's leader lock and handle identity changes, such as changing from the master identity to the slave identity.
When the node is set as a slave node: change the local state to the master-waiting state and write it into /var/run/leader; cyclically query etcd to see whether the leader lock has been acquired; if not, the master node has not started yet, so wait 1 second and query again until the master starts. Once the master is observed to have started, notify the local master services to switch to the standby state (this is a no-op for multi-master services and for standby services already in the standby state). Then change the local state to the slave state and write it into /var/run/leader. Accordingly, like the master node, the slave node then cyclically listens for etcd's leader lock and handles identity changes, such as changing from the master identity to the slave identity.
The slave nodes cyclically listen for etcd's leader lock; when the master node fails, they contend for the leader lock in etcd, and the winner waits 5 seconds after obtaining it. It then changes its state to the master-ready state and writes it to /var/run/leader; notifies the local standby services to switch to the master state (multi-master services need no change); writes its node ID into etcd as the master node identity, changes its state to the master state, and writes the master state into /var/run/leader; and finally adds the VIP to the network card and continues to cyclically listen for etcd's leader lock and handle identity changes.
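The promotion sequence on the winning node, condensed into shell steps under the assumptions already introduced (etcd key master-ID, VIP 10.0.0.254 on eth0); the service-switch step is left as a comment because it is service-specific.
echo to_master > /var/run/leader       # enter the master-ready state
sleep 5                                # quiet period before promoting local services
# (notify local active-standby services, e.g. ovn-db, to switch to the master
#  state here; multi-master services such as Galera + MariaDB need no change)
etcdctl put master-ID "$(hostname)"    # record this node as the master identity
echo master > /var/run/leader          # enter the master state
ip addr add 10.0.0.254/24 dev eth0     # attach the cluster VIP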
In fig. 5 and fig. 6, the 5-second quiet time and the 1-second wait in the cyclic query can be replaced by other durations, provided the quiet time is at least twice the query wait time. All nodes in the cluster run the master-election module (i.e., the master-election service shown in fig. 3), which contends for the master identity through etcd's distributed lock, writes identity information into the /var/run/leader file (the state information includes master, slave, to_slave and to_master), continuously maintains that file, and updates its value, i.e., changes the state, whenever the master identity changes.
The master-election module guarantees that a node's identity before exiting is consistent with its identity after restarting: the master node's election module still obtains the master identity after exiting and restarting, and a non-master node's election module can only obtain a non-master identity after exiting and restarting.
The master-election module also switches the VIP whenever the master identity changes, keeping the node carrying the VIP bound to the master node.
When a node restarts, /var/run/leader does not exist, because it is an in-memory file; each cluster service (such as MariaDB or rabbitmq-server) must wait for the file before initializing and running again. When the master-election module starts, it sets the value of /var/run/leader according to the steps above. When the value becomes slave or master, each cluster service chooses its start-up mode according to that value.
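A hedged reconstruction of that wait, as the start scripts might implement it; the actual loop appears only as images in the original publication, so this shape is an assumption.
while [ ! -f /var/run/leader ]; do     # block until the master-election module has decided
    sleep 1
done
role=$(cat /var/run/leader)            # master or slave selects the start-up mode below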
Referring to fig. 7, fig. 7 is a schematic diagram of a master node restart in an embodiment of the present invention, taking as an example the master node starting MariaDB and rabbitmq-server; fig. 8 is a schematic diagram of a slave node restart in an embodiment of the present invention.
For convenience of explanation, the following three scenarios, initialization, master failure, and cluster power-off restart, illustrate the steps of restarting the MariaDB and rabbitmq-server clusters:
Fig. 9 shows the node IP planning; fig. 9 is a schematic diagram of node IP planning in the embodiment of the present invention. It should be noted that the IPs in the planning diagram of fig. 9 could equally be other concrete IP addresses and are not limited to those shown. The start scripts are as follows:
the starting scripts of the main selecting module, mariardb and rabbitmq-server are as follows: the system comprises a leader _ start.sh, a maridb _ start.sh and a rabbitstart.sh, wherein all three services are managed by a system md and are automatically pulled up again after exception, the maridb of the three nodes form a galera + maridb cluster, and the rabbitmq-server of the three nodes form a rabbitmq cluster.
First, initialization during deployment:
1) When deploying, set the KV value in etcd: "master-ID": "host1"
2) Three nodes start up simultaneously and three services per node start up simultaneously.
3) /var/run/leader does not exist on nodes 1, 2 and 3.
4) mariadb_start.sh and rabbit_start.sh stay in a state of waiting for this file. (The wait loop itself is shown only as images in the original publication; a hedged reconstruction is sketched above, after the description of /var/run/leader.)
5) When leader_start.sh on node 1 queries etcd and finds the value of master-ID the same as the local ID, it obtains the leader lock and becomes the master node, writes master into /var/run/leader, and adds 10.0.0.254 to the 10.0.0.11 network card.
6) After mariadb_start.sh and rabbit_start.sh on node 1 detect that /var/run/leader is master, they each start their service process with the master identity, respectively:
mariadb.sh:
sed -i -e '/^safe_to_bootstrap/s/0/1/' /var/lib/mysql/grastate.dat    # mark this node safe to bootstrap the Galera cluster
mysqld_safe --wsrep-new-cluster
rabbit.sh:
rabbitmqctl force_boot                 # boot without waiting for the other cluster members
/usr/sbin/rabbitmq-server
7) leader_start.sh on nodes 2 and 3 writes slave into /var/run/leader.
8) After mariadb_start.sh and rabbit_start.sh on nodes 2 and 3 detect that /var/run/leader is slave, they each start their service process with the standby identity, respectively:
mariadb.sh:
sed -i -e '/^safe_to_bootstrap/s/1/0/' /var/lib/mysql/grastate.dat    # a standby node must not bootstrap the cluster
mysqld_safe
rabbit.sh:
/usr/sbin/rabbitmq-server
9) At this point the MariaDB cluster and the RabbitMQ cluster are established, and clients can access the MariaDB service and the rabbitmq-server service through 10.0.0.254.
Second, abnormal power failure of node 1:
1) Node 2 wins the contention for the leader lock resource and becomes the master identity, while node 3 keeps the slave (standby) identity.
2) leader_start.sh on node 2 changes the value of /var/run/leader to "master".
3) leader_start.sh on node 2 changes the KV value in etcd to "master-ID": "host2"
4) leader_start.sh on node 2 adds 10.0.0.254 to the 10.0.0.12 network card.
5) The MariaDB and RabbitMQ clusters are already established and support multi-master reads and writes, so nothing needs to change. Clients can access the MariaDB service and the rabbitmq-server service through 10.0.0.254.
Third, whole power-off restart of the three nodes:
1) The KV value in etcd is: "master-ID": "host2"
The subsequent operations are the same as in "initialization during deployment": since host2 was the master node before the restart, host2 obtains the master identity on restart, and the other nodes obtain the standby identity.
In this way, the traffic on a single node in the cluster can be accessed directly through the VIP. After the nodes in the cluster contend for the master identity through the distributed coordination component, the master's node information is written into shared storage, and when the cluster later restarts, the master identity is still obtained by the previous master node. During service operation, if the master node fails, another node can acquire the master identity and update the new node ID into shared storage. When the whole cluster service recovers, no attempt is made to restore the cluster state from before the shutdown; instead, the master node forcibly bootstraps the cluster and the slave nodes join it as members, which removes any requirement on the start-up order during the restart.
Embodiment 3:
Corresponding to the above method embodiment, the embodiment of the present invention further provides a stateful cluster recovery apparatus; the stateful cluster recovery apparatus described below and the stateful cluster recovery method described above may be cross-referenced.
Referring to fig. 10, the apparatus includes the following modules:
the identity file acquisition module 101 is used for acquiring the identity file recorded by the distributed coordination service after the target node restarts;
the identity judgment module 102 is used for determining the master node identity from the identity file and judging whether the master node identity is the same as the local identity;
the master identity determining module 103 is used for, if they are the same, acquiring a distributed lock of the distributed coordination service and setting, on the local network card, the VIP through which the stateful cluster provides access services externally;
and the slave identity determining module 104 is used for, if they are different, joining the stateful cluster as a slave node and joining the master-identity application queue after the master node acquires the distributed lock.
By applying the apparatus provided by the embodiment of the present invention: after the target node restarts, the identity file recorded by the distributed coordination service is acquired; the master node identity is determined from the identity file and compared with the local identity; if they are the same, the distributed lock of the distributed coordination service is acquired, and the VIP through which the stateful cluster provides access services externally is set on the local network card; if not, after the master node acquires the distributed lock, the node joins the stateful cluster as a slave and joins the master-identity application queue.
Whether only the target node restarts or the whole cluster restarts, the target node first obtains the identity file after restarting and then uses it to determine the master node identity. It judges whether the master node identity is the same as the local identity; if so, the target node was the master node before the restart, so it acquires the distributed lock and sets, on the local network card, the VIP through which the stateful cluster provides external access. The master node is locked by means of the distributed lock, and when the master node fails, the other nodes contend for the master identity by contending for the lock. If the identities differ, the target node was a slave node before the restart, and it can join the stateful cluster and the master-identity application queue once the master node has acquired the distributed lock. Because the VIP through which the stateful cluster serves the outside still sits on the master node after the restart, no particular restart order is required even when the whole cluster restarts: each node directly rejoins the cluster with the master or slave identity it held in the stateful service before the restart. In addition, if the master node fails while the service is running, another node in the master-identity application queue can obtain the master identity by contending for the distributed lock. The data integrity of the stateful cluster can therefore be guaranteed while the cluster runs, when the whole cluster restarts, and when a single node restarts. Compared with managing the start-up and operation of the cluster through an additional Pacemaker module, the apparatus provided by the embodiment of the present invention is simple in structure and easy to implement.
In addition, in the embodiment of the present invention, because the VIP through which the stateful cluster serves the outside is placed on the master node, a TCP connection can be established directly between the service and the client, bypassing the SLB and achieving better maintainability and more efficient communication.
In an embodiment of the present invention, the master identity determining module 103 is specifically configured to set the master-slave service state of the local machine to the master state and add the VIP to the local network card.
In one embodiment of the present invention, the apparatus further comprises:
a cyclic listening module, configured to cyclically listen for the distributed lock of the distributed coordination service and for state change messages of the master-slave services after the node joins the stateful cluster as a slave and joins the master-identity application queue.
In one embodiment of the present invention, the apparatus further comprises:
a master identity contention module, configured to, after the cyclic listening for the distributed lock of the distributed coordination service and the state change messages of the master-slave services, acquire the distributed lock, execute the step of acquiring the distributed lock of the distributed coordination service and setting, on the local network card, the VIP through which the stateful cluster provides access services externally, and write the local identity into the identity file as the master node identity.
In an embodiment of the present invention, the master identity contention module is specifically configured to acquire the distributed lock by contention.
In a specific embodiment of the present invention, the identity file acquisition module 101 is specifically configured to judge, after the target node restarts, whether the VIP exists on the local network card; if so, delete the VIP from the local network card, initialize the local state to the initialization state, and write that state into the identity file; and acquire the identity file through an interface of the distributed coordination service.
In one embodiment of the present invention, the distributed coordination service continuously maintains the identity file and updates it when the master identity changes; the identity file saves the current state of the target node, which is any one of an initialization state, a master-ready state, a master-waiting state, a slave-ready state and a slave state.
Embodiment 4:
Corresponding to the above method embodiment, the embodiment of the present invention further provides a stateful cluster recovery device; the stateful cluster recovery device described below and the stateful cluster recovery method described above may be cross-referenced.
Referring to fig. 11, the stateful cluster restoring apparatus includes:
a memory D1 for storing computer programs;
processor D2, configured to, when executing the computer program, implement the steps of the stateful cluster restoring method of the above-described method embodiment.
Specifically, fig. 12 is a schematic structural diagram of the stateful cluster recovery device provided in this embodiment. The stateful cluster recovery device may vary considerably in configuration and performance, and may include one or more processors (CPUs) 322 and a memory 332, as well as one or more storage media 330 (for example, one or more mass storage devices) storing an application 342 or data 344. The memory 332 and the storage media 330 may provide transient or persistent storage. The program stored on a storage medium 330 may include one or more modules (not shown), each of which may include a series of instructions operating on the data processing device. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and to execute, on the stateful cluster recovery device 301, the series of instruction operations stored in the storage medium 330.
The stateful cluster recovery device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™ and FreeBSD™.
The steps of the stateful cluster recovery method described above may be implemented by this structure of the stateful cluster recovery device.
Example five:
Corresponding to the above method embodiment, an embodiment of the present invention further provides a readable storage medium; the readable storage medium described below and the stateful cluster recovery method described above may be referred to in correspondence with each other.
A readable storage medium has a computer program stored thereon; when executed by a processor, the computer program implements the steps of the stateful cluster recovery method of the above method embodiment.
The readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any of various other readable storage media capable of storing program code.
Those skilled in the art will further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

Claims (10)

1. A method for stateful cluster recovery, comprising:
after a target node is restarted, acquiring an identity file recorded by a distributed coordination service;
determining a master node identity by using the identity file, and judging whether the master node identity is the same as a local identifier;
if yes, acquiring a distributed lock of the distributed coordination service, and setting, in a local network card, a VIP through which the stateful cluster provides access services to the outside, wherein the VIP is a virtual IP address;
and if not, joining the stateful cluster with a slave node identity and joining a master identity application queue after the master node acquires the distributed lock.
2. The stateful cluster recovery method according to claim 1, wherein the setting, in the local network card, of the VIP through which the stateful cluster provides access services to the outside comprises:
setting the master-slave service state of the local machine to a master state, and adding the VIP to the local network card.
3. The stateful cluster recovery method according to claim 2, wherein after the joining the stateful cluster with the slave node identity and the joining the master identity application queue, the method further comprises:
cyclically monitoring the distributed lock of the distributed coordination service and the state change messages of the master-slave service.
4. The stateful cluster recovery method according to claim 3, wherein after the cyclically monitoring the distributed lock of the distributed coordination service and the state change messages of the master-slave service, the method further comprises:
acquiring the distributed lock, and executing the step of acquiring the distributed lock of the distributed coordination service and setting, in the local network card, the VIP through which the stateful cluster provides access services to the outside;
and writing the local identifier into the identity file as the master node identity.
5. The stateful cluster recovery method according to claim 4, wherein the acquiring the distributed lock comprises:
acquiring the distributed lock in a contention manner.
6. The stateful cluster recovery method according to claim 1, wherein the acquiring, after the target node is restarted, the identity file recorded by the distributed coordination service comprises:
after the target node is restarted, judging whether the VIP exists in the local network card;
if the VIP exists, deleting the VIP from the local network card, initializing the local state to an initialization state, and writing the state into the identity file;
and acquiring the identity file through an interface of the distributed coordination service.
7. The stateful cluster recovery method according to any one of claims 1 to 6, wherein the distributed coordination service continuously maintains the identity file and updates the identity file when the master identity changes; the identity file stores the current state of the target node, the current state being any one of an initialization state, a master preparation state, a master waiting state, a slave preparation state and a slave state.
8. A stateful cluster recovery apparatus, comprising:
an identity file obtaining module, configured to acquire, after a target node is restarted, an identity file recorded by a distributed coordination service;
an identity judging module, configured to determine a master node identity by using the identity file, and judge whether the master node identity is the same as a local identifier;
a master identity determining module, configured to, if the identities are the same, acquire a distributed lock of the distributed coordination service, and set, in a local network card, a VIP through which the stateful cluster provides access services to the outside, wherein the VIP is a virtual IP address;
and a slave identity determining module, configured to, if the identities differ, join the stateful cluster with a slave node identity and join a master identity application queue after the master node acquires the distributed lock.
9. A stateful cluster recovery device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the stateful cluster recovery method of any one of claims 1 to 7 when executing the computer program.
10. A readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the stateful cluster recovery method according to any one of claims 1 to 7.
CN201811507350.1A 2018-12-10 2018-12-10 State cluster recovery method, device, equipment and readable storage medium Active CN109639794B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811507350.1A CN109639794B (en) 2018-12-10 2018-12-10 State cluster recovery method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN109639794A CN109639794A (en) 2019-04-16
CN109639794B true CN109639794B (en) 2021-07-13

Family

ID=66072601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811507350.1A Active CN109639794B (en) 2018-12-10 2018-12-10 State cluster recovery method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN109639794B (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110417600B (en) * 2019-08-02 2022-10-25 秒针信息技术有限公司 Node switching method and device of distributed system and computer storage medium
CN111026807A (en) * 2019-11-25 2020-04-17 深圳壹账通智能科技有限公司 Distributed lock synchronization method and device, computer equipment and readable storage medium
CN111212127A (en) * 2019-12-29 2020-05-29 浪潮电子信息产业股份有限公司 Storage cluster, service data maintenance method, device and storage medium
CN111131492A (en) * 2019-12-31 2020-05-08 中国联合网络通信集团有限公司 Node access method and system
CN111405015B (en) * 2020-03-09 2022-09-30 中国建设银行股份有限公司 Data processing method, device, equipment and storage medium
CN111400107B (en) * 2020-04-21 2023-03-03 贵州新致普惠信息技术有限公司 Self-starting recovery system and method for database multi-master cluster
CN111949214B (en) * 2020-08-07 2022-07-26 苏州浪潮智能科技有限公司 Disk misoperation method, system, equipment and medium for preventing HANA cluster
CN111970148A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Distributed task scheduling method and system
CN112073265B (en) * 2020-08-31 2022-05-13 帷幄匠心科技(杭州)有限公司 Internet of things monitoring method and system based on distributed edge computing
CN112037873B (en) * 2020-08-31 2022-09-13 合肥工业大学 Single-point optimization method based on cluster selection and consensus mechanism
CN112272220B (en) * 2020-10-16 2022-05-13 苏州浪潮智能科技有限公司 Cluster software start control method, system, terminal and storage medium
CN112269683B (en) * 2020-10-22 2022-12-06 苏州浪潮智能科技有限公司 Off-line node on-line service recovery method and related components
CN112269693B (en) * 2020-10-23 2024-03-01 北京浪潮数据技术有限公司 Node self-coordination method, device and computer readable storage medium
CN112202687B (en) 2020-12-03 2021-05-25 苏州浪潮智能科技有限公司 Node synchronization method, device, equipment and storage medium
CN112887367B (en) * 2021-01-11 2022-11-01 华云数据控股集团有限公司 Method, system and computer readable medium for realizing high availability of distributed cluster
CN113326511B (en) * 2021-06-25 2024-04-09 深信服科技股份有限公司 File repair method, system, equipment and medium
CN113949691A (en) * 2021-10-15 2022-01-18 湖南麒麟信安科技股份有限公司 ETCD-based virtual network address high-availability implementation method and system
CN113660350A (en) * 2021-10-18 2021-11-16 恒生电子股份有限公司 Distributed lock coordination method, device, equipment and storage medium
CN114124903A (en) * 2021-11-15 2022-03-01 新华三大数据技术有限公司 Virtual IP address management method and device
CN114501094A (en) * 2022-02-09 2022-05-13 浙江博采传媒有限公司 Sequence playing method and device based on virtual production and storage medium
CN114900449B (en) * 2022-03-30 2024-02-23 网宿科技股份有限公司 Resource information management method, system and device
CN115277114B (en) * 2022-07-08 2023-07-21 北京城市网邻信息技术有限公司 Distributed lock processing method and device, electronic equipment and storage medium
CN115277712B (en) * 2022-07-08 2023-07-25 北京城市网邻信息技术有限公司 Distributed lock service providing method, device and system and electronic equipment
CN115269248B (en) * 2022-07-28 2023-08-08 安超云软件有限公司 Method and device for preventing brain fracture under double-node cluster, electronic equipment and storage medium
CN115484139B (en) * 2022-09-02 2024-03-15 武汉众智数字技术有限公司 Video strategy management decentralization method based on video network monitoring
CN115617917B (en) * 2022-12-16 2023-03-10 中国西安卫星测控中心 Method, device, system and equipment for controlling multiple activities of database cluster

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102467508A (en) * 2010-11-04 2012-05-23 中兴通讯股份有限公司 Method for providing database service and database system
CN102868736A (en) * 2012-08-30 2013-01-09 浪潮(北京)电子信息产业有限公司 Design and implementation method of cloud computing monitoring framework, and cloud computing processing equipment
CN103763378A (en) * 2014-01-24 2014-04-30 中国联合网络通信集团有限公司 Task processing method and system and nodes based on distributive type calculation system
CN105159818A (en) * 2015-08-28 2015-12-16 东北大学 Log recovery method in memory data management and log recovery simulation system in memory data management
CN106713250A (en) * 2015-11-18 2017-05-24 杭州华为数字技术有限公司 Data access method and device based on distributed system
CN106844092A (en) * 2016-12-09 2017-06-13 武汉烽火信息集成技术有限公司 A kind of method of the MariaDB Galera Cluster of automatic recovery power down
CN106656624A (en) * 2017-01-04 2017-05-10 合肥康捷信息科技有限公司 Optimization method based on Gossip communication protocol and Raft election algorithm
CN106911524A (en) * 2017-04-27 2017-06-30 紫光华山信息技术有限公司 A kind of HA implementation methods and device
CN108881489A (en) * 2018-08-03 2018-11-23 高新兴科技集团股份有限公司 A kind of coordination system and method for Distributed Services

Similar Documents

Publication Publication Date Title
CN109639794B (en) State cluster recovery method, device, equipment and readable storage medium
US9405530B2 (en) System and method for supporting patching in a multitenant application server environment
CN109491776B (en) Task arranging method and system
EP3127018B1 (en) Geographically-distributed file system using coordinated namespace replication
US5822531A (en) Method and system for dynamically reconfiguring a cluster of computer systems
CN111131146B (en) Multi-supercomputing center software system deployment and incremental updating method in wide area environment
US8533525B2 (en) Data management apparatus, monitoring apparatus, replica apparatus, cluster system, control method and computer-readable medium
CN109173270B (en) Game service system and implementation method
CN110569085A (en) configuration file loading method and system
CN114064414A (en) High-availability cluster state monitoring method and system
CN110543335A (en) Application program configuration management method and system
CN114116912A (en) Method for realizing high availability of database based on Keepalived
WO2018004403A1 (en) Managing a lifecycle of a software container
CN111818188B (en) Load balancing availability improving method and device for Kubernetes cluster
CN116132530A (en) Method for realizing MQTT Broker server by applying Raft algorithm based on Netty framework
Cisco CWM to CWM Communications
CN114124700A (en) Cluster parameter configuration method and device, electronic equipment and readable storage medium
US10608882B2 (en) Token-based lightweight approach to manage the active-passive system topology in a distributed computing environment
Masetti et al. Increasing Availability by Implementing Software Redundancy in the CMS Detector Control System
CN111124428A (en) Application automatic publishing method based on middleware creating and related device
CN114356214B (en) Method and system for providing local storage volume for kubernetes system
CN116431291B (en) Deployment method, system, equipment and storage medium of virtualization management platform
CN114915545B (en) Application scheduling deployment management method based on DHCP network cluster
CN115617917B (en) Method, device, system and equipment for controlling multiple activities of database cluster
Eberhardt et al. Smac: State management for geo-distributed containers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant