CN113377702B - Method and device for starting two-node cluster, electronic equipment and storage medium - Google Patents

Method and device for starting two-node cluster, electronic equipment and storage medium

Info

Publication number
CN113377702B
Authority
CN
China
Prior art keywords
node
cluster
state
master
access state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110764775.6A
Other languages
Chinese (zh)
Other versions
CN113377702A (en)
Inventor
吴业亮
朱正东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anchao Cloud Software Co Ltd
Original Assignee
Anchao Cloud Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anchao Cloud Software Co Ltd
Priority to CN202110764775.6A
Publication of CN113377702A
Application granted
Publication of CN113377702B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/161Computing infrastructure, e.g. computer clusters, blade chassis or hardware partitioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/4401Bootstrapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses a method and a device for starting a two-node cluster, an electronic device, and a storage medium. The method comprises the following steps: a first node acquires a first path state and a second path state to a second node, wherein the first path state is the connection state between the first node and the second node over a heartbeat line, and the second path state is the connection state between the first node and the second node over a management network; the first node acquires a third path state to a third-party IP address; and the first node and the second node perform master node switching according to the first path state, the second path state, and the third path state to start the cluster. By providing both a management network and a heartbeat line in the cluster, the method can judge whether the cluster is normal, which avoids split-brain caused by misjudgment during cluster startup and effectively avoids false detection.

Description

Method and device for starting two-node cluster, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computers, and in particular, to a method and apparatus for starting a two-node cluster, an electronic device, and a storage medium.
Background
A two-node cluster is made up of two computers, each of which is referred to as a node of the cluster. Because of network failures, a cluster may split into two groups, a phenomenon called split-brain. When the cluster splits, each of the two resulting node groups cannot detect the existence of the other group through heartbeat information or lease information and therefore considers the nodes of the other group to be faulty. As a result, nodes in both groups may access the same shared storage resource during the same time period, which causes data access errors. In this situation, cluster management software generally uses some algorithm to arbitrate which node group wins and continues the work of the original cluster; the nodes in the losing group must restart and rejoin the cluster. During cluster startup, many states are unstable, so the arbitration program easily misjudges and causes split-brain.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a method and a device for starting a two-node cluster, an electronic device, and a storage medium, which can judge whether the cluster is normal by providing a management network and a heartbeat line in the cluster, thereby avoiding split-brain caused by misjudgment during cluster startup and effectively avoiding false detection.
In order to achieve the above objective, an embodiment of the present invention provides a method for starting a two-node cluster.
In one or more embodiments of the invention, the cluster includes a first node and a second node, and the method comprises: the first node acquires a first path state and a second path state to the second node, wherein the first path state is the connection state between the first node and the second node over a heartbeat line, and the second path state is the connection state between the first node and the second node over a management network; the first node acquires a third path state to a third-party IP address; and the first node and the second node perform master node switching according to the first path state, the second path state, and the third path state to start the cluster.
In one or more embodiments of the present invention, the first node and the second node performing master node switching according to the first path state, the second path state, and the third path state includes: when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node switches itself to the master node; and/or when the first path state is disconnected and the second path state is connected, the first node and the second node perform master node switching according to preset node roles.
In one or more embodiments of the invention, the method further comprises: after the cluster startup is completed, the first node and the second node each create a marker file.
In one or more embodiments of the invention, the method further comprises: when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node checks whether it has the marker file; if so, the first node does not perform master node switching; if not, the first node switches itself to the master node.
In one or more embodiments of the invention, the method further comprises: when the first path state is disconnected and the second path state is connected, the first node and the second node each check whether they have the marker file; if so, the first node and the second node do not perform master node switching; if not, the first node and the second node perform master node switching according to the preset node roles.
In one or more embodiments of the invention, the method further comprises: when the first node and the second node form a cluster, the first node and the second node each delete their own marker file.
In one or more embodiments of the present invention, the judgment of master node switching by the first node and the second node and the judgment of marker file deletion are performed at a set period.
In one or more embodiments of the present invention, before the first node and the second node judge master node switching, the method further includes: acquiring the cluster startup time and the current time; judging whether the difference between the cluster startup time and the current time is smaller than a set duration; if so, performing master node switching after waiting for the duration of the difference; if not, performing master node switching directly.
In another aspect of the present invention, an apparatus for two-node cluster startup is provided, which includes an acquisition module and a switching module.
The acquisition module is used for the first node to acquire a first path state and a second path state to the second node, wherein the first path state is the connection state between the first node and the second node over a heartbeat line, and the second path state is the connection state between the first node and the second node over a management network; and for the first node to acquire a third path state to a third-party IP address.
The switching module is used for the first node and the second node to perform master node switching according to the first path state, the second path state, and the third path state so as to start the cluster.
In one or more embodiments of the present invention, the first node and the second node performing master node switching according to the first path state, the second path state, and the third path state includes: when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node switches itself to the master node; and/or when the first path state is disconnected and the second path state is connected, the first node and the second node perform master node switching according to preset node roles.
In one or more embodiments of the present invention, after the cluster startup is completed, the first node and the second node each create a marker file.
In one or more embodiments of the present invention, the switching module is further configured to: when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, have the first node check whether it has the marker file; if so, the first node does not perform master node switching; if not, the first node switches itself to the master node.
In one or more embodiments of the present invention, the switching module is further configured to: when the first path state is disconnected and the second path state is connected, have the first node and the second node each check whether they have the marker file; if so, the first node and the second node do not perform master node switching; if not, the first node and the second node perform master node switching according to the preset node roles.
In one or more embodiments of the present invention, when the first node and the second node form a cluster, the first node and the second node each delete their own marker file.
In one or more embodiments of the present invention, the judgment of master node switching by the first node and the second node and the judgment of marker file deletion are performed at a set period.
In one or more embodiments of the present invention, before the first node and the second node judge master node switching, the following is further performed: acquiring the cluster startup time and the current time; judging whether the difference between the cluster startup time and the current time is smaller than a set duration; if so, performing master node switching after waiting for the duration of the difference; if not, performing master node switching directly.
In another aspect of the present invention, there is provided an electronic device including: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method for starting a two-node cluster as described above.
In another aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the method for starting a two-node cluster as described above.
Compared with the prior art, the method and device for starting a two-node cluster, the electronic device, and the storage medium according to the embodiments of the invention can judge whether the cluster is normal by providing a management network and a heartbeat line in the cluster, and can judge whether the cluster has been started by creating a marker file in the cluster startup process. This avoids split-brain caused by misjudgment during startup and effectively avoids false detection. In addition, basing the waiting time on the cluster startup time shortens the time the judging program has to wait, improves detection efficiency, and allows faults during startup to be recovered quickly and effectively.
Drawings
FIG. 1 is a partial schematic diagram of a two-node cluster start-up method according to one embodiment of the invention;
FIG. 2 is an overall architecture diagram of a two-node cluster start-up method according to one embodiment of the invention;
FIG. 3 is a schematic diagram of a two-node cluster start-up method according to an embodiment of the invention;
FIG. 4 is a flow chart of a two node cluster start-up method according to one embodiment of the invention;
FIG. 5 is a block diagram of a two-node cluster starting device according to one embodiment of the invention;
FIG. 6 is a hardware block diagram of a computing device for two-node cluster startup in accordance with one embodiment of the invention.
Detailed Description
The following detailed description of embodiments of the invention is to be read in conjunction with the accompanying drawings, and it is to be understood that the scope of the invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the term "comprise" or variations thereof such as "comprises" or "comprising", etc. will be understood to include the stated element or component without excluding other elements or components.
Some of the concepts involved in the embodiments of the present invention are described below.
A high-availability cluster is a cluster in which, when one node or server fails, another node can automatically and immediately provide service externally; that is, the resources on the failed node are transferred to another node so that the other node holds the resources and provides service externally. A high-availability cluster automatically switches resources and services when a single node fails, which ensures that services stay online.
Split-brain means that, in a high-availability cluster, when the heartbeat line connecting the two nodes is disconnected, the cluster, which originally acted as a coordinated whole, splits into two independent nodes. Because they have lost contact with each other, each node considers the other to have failed, and the high-availability services on the two nodes contend for shared resources and application services. This leads to serious consequences: either the shared resources are carved up and neither node can provide service, or both nodes provide service and read and write the shared storage at the same time, resulting in data corruption.
The heartbeat line is a network cable that connects the working machine and the backup machine, that is, the master server and the slave server. Software monitors the working machine over this line, and once the working machine is found to have stopped serving for some reason, the backup machine is put into use immediately to keep the network unobstructed and the service running normally.
Example 1
As shown in fig. 1, the method for starting up a two-node cluster in one embodiment of the present invention includes the following steps.
In step S101, the first node acquires a first path state and a second path state of the second node, and a third path state to a third party IP address.
Two physical connection paths are provided between the first node and the second node: one connects the first node and the second node directly through a heartbeat line, and the other connects them through a third-party IP address. The first node and the second node are connected to the third-party IP address through a management network; in this embodiment, the third-party IP address is a gateway IP address configured on the switch.
In this embodiment, the first path state is defined as the connection state between the first node and the second node over the heartbeat line, the second path state is defined as the connection state between the first node and the second node over the management network, and the third path state is defined as the connection state from the first node to the third-party IP address.
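The three path states defined above can be held in a small record. The following is a minimal sketch, not the patented implementation: the names `PathStates`, `collect_path_states`, and `probe` are hypothetical, and the source does not say how reachability is actually tested, so the probe is left as a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PathStates:
    heartbeat_up: bool  # first path state: peer reachable over the heartbeat line
    mgmt_up: bool       # second path state: peer reachable over the management network
    gateway_up: bool    # third path state: third-party (gateway) IP reachable


def collect_path_states(
    probe: Callable[[str], bool],
    peer_heartbeat_ip: str,
    peer_mgmt_ip: str,
    gateway_ip: str,
) -> PathStates:
    """Probe the three addresses once and record which paths are connected.

    `probe` is a hypothetical reachability check supplied by the caller
    (for example an ICMP or TCP probe); the source does not specify one.
    """
    return PathStates(
        heartbeat_up=probe(peer_heartbeat_ip),
        mgmt_up=probe(peer_mgmt_ip),
        gateway_up=probe(gateway_ip),
    )
```

Passing the probe in as a parameter keeps the record independent of any particular network tool and makes the logic easy to exercise with a fake probe.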
In step S102, the first node and the second node perform a master node switching according to the first path state, the second path state, and the third path state.
From the combination of the first path state, the second path state, and the third path state, the state of each node, of the third-party IP address, and of the heartbeat line can be judged, and the corresponding master node switching strategy is determined accordingly. Specifically:
(1) When the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node switches itself to the master node.
At this time, the first node cannot communicate with the second node through either the heartbeat line or the management network, but the first node can connect normally to the third-party IP address. This indicates that the first node is normal, while the second node has failed and the heartbeat line is broken. In this case, only the first node in the two-node cluster can work normally, so the first node switches itself to the master node, enters stand-alone mode, and provides service.
(2) When the first path state is disconnected and the second path state is connected, the first node and the second node perform master node switching according to the preset node roles.
At this time, the first node cannot communicate with the second node through the heartbeat line, but it can communicate with the second node through the management network. Therefore, regardless of whether the first node can connect normally to the third-party IP address, it can be determined that both the first node and the second node are normal and that the heartbeat line is broken. In this case, both nodes in the two-node cluster can work normally, so master node switching has to be decided according to the preset node roles of the first node and the second node.
In this embodiment, one node is configured in advance in the first node and the second node of the two-node cluster as a default master node, so that the default master node can switch itself to the master node at this time and provide services, and the other node can join the cluster with the identity of the slave node.
(3) When the first path state is connected and the second path state is disconnected, no master node switching is performed.
At this time, the first node can communicate with the second node through the heartbeat line but not through the management network. Therefore, regardless of whether the first node can connect normally to the third-party IP address, it can be determined that both the first node and the second node are normal and that the management network is disconnected. In this case, both nodes in the two-node cluster can work normally and the cluster is normal, so no master node switching is needed, and only the state check on the second node is performed.
(4) When the first path state is connected and the second path state is connected, no master node switching is performed.
At this time, the first node can communicate with the second node through both the heartbeat line and the management network. Therefore, regardless of whether the first node can connect normally to the third-party IP address, it can be determined that the first node and the second node are normal and that the heartbeat line and the management network are also normal. In this case, both nodes in the two-node cluster can work normally and the cluster is normal, so no master node switching is needed, and only the state check on the second node is performed.
(5) The first path state, the second path state, and the third path state are all disconnected.
At this time, the first node cannot communicate with the second node through any physical path; the first node has failed and cannot determine the state of the second node, and the cluster network is completely disconnected. In this case, no node in the cluster performs any operation.
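The five cases above amount to a small decision table over the three path states. The sketch below is one way to write that table, assuming the states are already available as booleans; the `Action` names and the `decide` function are illustrative and do not come from the source.

```python
from enum import Enum


class Action(Enum):
    BECOME_MASTER = "become_master"      # case (1): peer failed, go stand-alone
    USE_PRESET_ROLE = "use_preset_role"  # case (2): heartbeat line broken, peer alive
    NO_SWITCH = "no_switch"              # cases (3) and (4): cluster is normal
    DO_NOTHING = "do_nothing"            # case (5): all three paths disconnected


def decide(heartbeat_up: bool, mgmt_up: bool, gateway_up: bool) -> Action:
    """Map the three path states seen by one node to an action."""
    if heartbeat_up:
        # Cases (3) and (4): the heartbeat line works, whatever the other paths do.
        return Action.NO_SWITCH
    if mgmt_up:
        # Case (2): the third path state does not matter here.
        return Action.USE_PRESET_ROLE
    if gateway_up:
        # Case (1): only the third-party IP address is reachable.
        return Action.BECOME_MASTER
    # Case (5): nothing is reachable, so no operation is performed.
    return Action.DO_NOTHING
```

Checking the heartbeat line first mirrors the text: whenever the first path is connected, the third path state is never consulted.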
Example 2
As shown in fig. 2 to 4, a method for starting up a two-node cluster according to another embodiment of the present invention is described, and includes the following steps.
In step S201, after the cluster startup is completed, the first node 101 and the second node 102 in the cluster each create a marker file.
The marker file is created on each node. After the cluster has started, checking whether the marker file exists on a node makes it possible to judge whether the cluster has really failed during the startup process.
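A marker file of this kind needs only three operations: create, check, and delete (deletion is used later, in step S205). The following sketch uses Python's `pathlib`; the helper names and the file location are hypothetical, since the source does not name the file or its directory.

```python
from pathlib import Path


def create_marker(path: Path) -> None:
    """Create the marker file (and its parent directory) if it is missing."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()


def marker_exists(path: Path) -> bool:
    """Return True if this node currently holds a marker file."""
    return path.exists()


def delete_marker(path: Path) -> None:
    """Delete the marker file; a missing file is not an error (Python 3.8+)."""
    path.unlink(missing_ok=True)
```

A call such as `create_marker(Path("/var/run/cluster/started.marker"))` would be typical, with that path being purely an example.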
In step S202, the cluster startup time and the current time are acquired, and it is judged whether the difference between the cluster startup time and the current time is smaller than a set duration. If the difference is smaller than the set duration, master node switching is performed after waiting for the duration of the difference; if the difference is not smaller than the set duration, master node switching is performed directly.
In this embodiment, the set duration is preferably 180 seconds. Waiting 180 seconds ensures that a cluster that can start normally has enough time to complete its startup process. At the same time, to avoid the long switching time that would result from waiting the full set duration every time master node switching is performed, the waiting time before master node switching is further optimized.
Specifically, the difference between the cluster startup time and the current time is taken as the reference. If the difference is smaller than the set duration, the set duration has not yet elapsed since the cluster started, so the cluster is given some further time to verify whether it can start normally. In this embodiment, the node waits for the duration of the difference, and if the cluster still cannot start normally after that, master node switching is performed immediately.
Of course, in some alternative embodiments, this judgment is not made each time master node switching is performed, and master node switching is performed directly after waiting for the set duration.
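The waiting rule of step S202 can be sketched as a single function. The source says only to wait "the difference duration"; this sketch assumes that means the part of the set duration that has not yet elapsed since cluster startup, which is an interpretation and not something the text states explicitly. The function and constant names are hypothetical.

```python
SET_DURATION_S = 180.0  # preferred set duration in this embodiment


def wait_before_switch(
    cluster_start: float,
    now: float,
    set_duration: float = SET_DURATION_S,
) -> float:
    """Return how many seconds to wait before master node switching.

    Both timestamps are in seconds on the same clock.
    """
    elapsed = now - cluster_start
    if elapsed < set_duration:
        # Assumption: wait only for the remainder of the set duration.
        return set_duration - elapsed
    # The set duration has already elapsed: switch directly.
    return 0.0
```

For example, a check made 60 seconds after cluster startup would wait another 120 seconds under this reading, while a check made after 200 seconds would not wait at all.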
In step S203, the first node 101 acquires a first path state 103 and a second path state 104 to the second node 102, where the first path state 103 is the connection state between the first node 101 and the second node 102 over a heartbeat line, and the second path state 104 is the connection state between the first node 101 and the second node 102 over a management network.
In this embodiment, the first node 101 and the second node 102 each have two ports: one port is connected to the switch and the other is connected to the other node. The network cable that directly interconnects the first node 101 and the second node 102 is called the heartbeat line, and the network from the first node 101 and the second node 102 to the switch is called the management network.
The heartbeat line is a network cable connecting the first node 101 and the second node 102. Of the two, the first node 101 is preconfigured as the master node and the second node 102 as the standby node, and the two are connected through the heartbeat line. Software installed on the nodes monitors the running state of the peer in real time over the heartbeat line. Once the working host, the first node 101, fails, the failure is reflected over the heartbeat line to the second node 102, its mutual backup, and the second node 102 can immediately take over the work, which keeps the network running normally to the greatest extent.
The first node 101 acquires a third path state 105 to the third-party IP address. The third-party IP address is provided on the switch and is network-reachable from both the first node 101 and the second node 102.
In step S204, the first node 101 and the second node 102 perform a master node switching according to the first path state 103, the second path state 104, and the third path state 105, including the following states:
In this embodiment, misjudgment can occur while the cluster is starting. When the second path is at first disconnected and only later connected, the second node 102 cannot detect the first node 101, considers the first node 101 to have failed, and switches itself to the master node. When cluster startup completes, the first node 101 has not actually failed, so two master nodes exist in the cluster; both can provide services, the data becomes inconsistent, and split-brain occurs. Whether the cluster has really failed during startup is therefore judged by checking whether the marker file exists on each node.
When the first path state 103 is disconnected and the second path state 104 is connected, the first node 101 is normal and the second node 102 is normal, but the heartbeat line 103 is faulty, and the first node 101 and the second node 102 each check whether they have a marker file. If the first node 101 and the second node 102 have marker files, they do not perform master node switching; if they do not have marker files, they perform master node switching according to the preset node roles.
In this embodiment, when the first node 101 and the second node 102 are configured in advance, the first node 101 is configured as a master node, and when the cluster cannot select the master node, the first node 101 switches to the master node role to take over the master node function.
When the first path state 103 is disconnected, the second path state 104 is disconnected, and the third path state 105 is connected, the first node 101 is normal, the second node 102 has failed, and the heartbeat line 103 is faulty. When one node fails, the normal node switches to stand-alone mode and provides the service. The first node 101 checks whether it has a marker file: if it has a marker file, the first node 101 does not perform master node switching; if it does not, the first node 101 switches itself to the master node.
When the first path state 103 is connected, the first node 101 is normal, the second node 102 is also normal, and the cluster is fault-free, so that the first node 101 and the second node 102 can both provide services, and the cluster can normally operate.
When the first path state 103 is off, the second path state 104 is off, and the third path state 105 is off, the first node 101 and the second node 102 are normal at this time, and all the networks fail, and the first node 101 and the second node 102 cannot provide service, and cannot perform a failure recovery operation.
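The states of step S204 combine the path states with the marker file check. A minimal sketch of that combined decision follows, seen from a single node; the function name, the string results, and the `is_preset_master` flag are hypothetical names chosen for illustration.

```python
def decide_startup_action(
    heartbeat_up: bool,
    mgmt_up: bool,
    gateway_up: bool,
    marker_exists: bool,
    is_preset_master: bool,
) -> str:
    """Return the action one node takes for the given path and marker state."""
    if heartbeat_up:
        # First path connected: the cluster is fault-free, nothing to switch.
        return "no_switch"
    if not mgmt_up and not gateway_up:
        # All three paths disconnected: no recovery operation is possible.
        return "do_nothing"
    if marker_exists:
        # A marker file is present, so master node switching is not performed.
        return "no_switch"
    if mgmt_up:
        # Heartbeat line faulty but both nodes alive: use the preset roles.
        return "become_master" if is_preset_master else "join_as_slave"
    # Peer failed and only the third-party IP is reachable: stand-alone mode.
    return "become_master"
```

Placing the marker check after the path checks but before any switch keeps the rule from the text intact: a node that holds a marker file never switches, whichever fault it observes.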
In step S205, when the first node 101 and the second node 102 form a cluster, the first node 101 and the second node 102 delete their own marker files, respectively, and exit the startup procedure.
The judgment of master node switching by the first node 101 and the second node 102 and the judgment of marker file deletion are performed periodically; in this embodiment, preferably once every 3 seconds.
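The periodic judgment can be sketched as a simple polling loop. This is an illustration under stated assumptions: `check` is a hypothetical callback that performs one round of the switching and deletion judgments and returns `True` once the cluster has formed, and the `sleep` and `max_rounds` parameters exist only to make the loop easy to test.

```python
import time
from typing import Callable, Optional

CHECK_PERIOD_S = 3.0  # preferred period in this embodiment


def run_periodic(
    check: Callable[[], bool],
    period: float = CHECK_PERIOD_S,
    sleep: Callable[[float], None] = time.sleep,
    max_rounds: Optional[int] = None,
) -> int:
    """Call `check` once per period until it returns True; return the round count."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        if check():
            # Cluster formed: the startup program can exit.
            break
        sleep(period)
    return rounds
```

In a real startup program `check` would gather the path states, apply the switching rules, and delete the marker file once the cluster has formed.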
The method for starting a two-node cluster provided by this embodiment of the invention can judge whether the cluster is normal by providing a management network and a heartbeat line in the cluster, and can judge whether the cluster has been started by creating a marker file in the cluster startup process. This avoids split-brain caused by misjudgment during startup and effectively avoids false detection. Basing the waiting time on the cluster startup time shortens the time the judging program has to wait, improves detection efficiency, and allows faults during startup to be recovered quickly and effectively.
As shown in fig. 5, a specific embodiment of the two-node cluster starting device of the present invention is described.
In the embodiment of the invention, the device for starting the two-node cluster comprises an acquisition module 501 and a switching module 502.
The obtaining module 501 is configured to enable a first node to obtain a first path state and a second path state of a second node, where the first path state is a connection state of the first node and the second node through a jumper, and the second path state is a connection state of the first node and the second node through a management network; and a third path state for the first node to acquire a third party IP address.
The switching module 502 is configured to enable the first node and the second node to switch the master node according to the first path state, the second path state, and the third path state to start the cluster.
The switching module 502 is specifically configured so that, after the cluster is started, the first node and the second node each create a marker file.
The switching module 502 is further configured so that, when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node checks whether it has a marker file; if the first node has a marker file, the first node does not perform master node switching; if the first node does not have a marker file, the first node switches itself to the master node.
The switching module 502 is further configured so that, when the first path state is disconnected and the second path state is connected, the first node and the second node each check whether they have a marker file; if the first node and the second node each have a marker file, the first node and the second node do not perform master node switching; if the first node and the second node do not have a marker file, the first node and the second node perform master node switching according to the preset node roles.
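The two cases handled by the switching module can be summarized as a small decision function. This is a minimal sketch seen from one node: the names `Action` and `decide_switch` are hypothetical, the per-node view is a simplification of the two-node check in the second case, and the behavior for combinations the text does not cover (heartbeat line connected, or all three paths disconnected) is assumed here to be "do nothing".

```python
from enum import Enum


class Action(Enum):
    NO_SWITCH = "no_switch"      # leave the roles unchanged
    SELF_MASTER = "self_master"  # this node switches itself to master
    PRESET_ROLE = "preset_role"  # nodes take the roles preset for them


def decide_switch(
    heartbeat_up: bool, mgmt_up: bool, gateway_up: bool, marker_exists: bool
) -> Action:
    """Return the switching action for one node.

    heartbeat_up  -- first path state (heartbeat line to the peer)
    mgmt_up       -- second path state (peer over the management network)
    gateway_up    -- third path state (third-party IP address)
    marker_exists -- whether this node currently has its marker file
    """
    if heartbeat_up:
        # Not covered by the text; assumed: nothing to do.
        return Action.NO_SWITCH
    if mgmt_up:
        # Heartbeat line disconnected, management network connected.
        return Action.NO_SWITCH if marker_exists else Action.PRESET_ROLE
    if gateway_up:
        # Both paths to the peer disconnected, third-party IP address reachable.
        return Action.NO_SWITCH if marker_exists else Action.SELF_MASTER
    # All three paths disconnected: not covered by the text; assumed: do nothing.
    return Action.NO_SWITCH
```

For example, `decide_switch(False, False, True, marker_exists=False)` yields `Action.SELF_MASTER`, while the same path states with a marker file present yield `Action.NO_SWITCH`.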
When the first node and the second node form a cluster, each of them deletes its own marker file. The judgment of master node switching by the first node and the second node, and the judgment of marker file deletion, are performed periodically, every 3 seconds.
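The marker file lifecycle and the 3-second period can be sketched as below. The file location, the callbacks `cluster_formed` and `on_check`, and the function names are hypothetical; only the create-on-start, delete-on-cluster-formation and 3-second period come from the text.

```python
import time
from pathlib import Path
from typing import Callable

CHECK_INTERVAL_S = 3.0  # period stated in the embodiment


def create_marker(path: Path) -> None:
    """Create the marker file (and its parent directory) if it is missing."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch(exist_ok=True)


def delete_marker(path: Path) -> None:
    """Delete the marker file; do nothing if it is already gone (Python 3.8+)."""
    path.unlink(missing_ok=True)


def run_checks(
    marker: Path,
    cluster_formed: Callable[[], bool],
    on_check: Callable[[], None],
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run `rounds` periodic checks, one every CHECK_INTERVAL_S seconds."""
    for _ in range(rounds):
        on_check()  # hypothetical hook for the master node switching judgment
        if cluster_formed():
            delete_marker(marker)  # the two nodes have formed a cluster
        sleep(CHECK_INTERVAL_S)
```

A real daemon would loop indefinitely; the `rounds` and `sleep` parameters exist only so the sketch can be exercised without waiting.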
The two-node cluster starting device provided by the embodiment of the invention further comprises a waiting module 503, configured so that, before the master node switching is judged, the first node and the second node acquire the cluster start time and the current time and judge whether the difference between them is smaller than a set duration, which is 180 seconds in this embodiment. If the difference between the cluster start time and the current time is less than 180 seconds, the master node switching is performed after waiting for the duration of the difference; if the difference is greater than 180 seconds, the master node switching is performed directly.
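The waiting rule can be illustrated with a short calculation. The wording "waiting for the duration of the difference" is ambiguous; the sketch below assumes it means waiting for the time still remaining until the set duration has elapsed since cluster start, which is only one possible reading. The function name and the treatment of a difference of exactly 180 seconds (no wait) are likewise assumptions.

```python
SET_DURATION_S = 180.0  # set duration used in the embodiment


def wait_before_switch(
    cluster_start: float, now: float, set_duration: float = SET_DURATION_S
) -> float:
    """Return how many seconds to wait before the master node switching.

    Assumed reading: wait for the remainder of `set_duration` measured from
    `cluster_start`; return 0.0 once that duration has already elapsed.
    """
    elapsed = now - cluster_start
    if elapsed < set_duration:
        return set_duration - elapsed
    return 0.0
```

Under this reading, a node that checks 60 seconds after cluster start waits a further 120 seconds, and a node that checks 200 seconds after cluster start switches without waiting.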
Fig. 6 shows a hardware block diagram of a computing device 60 for two-node cluster failure recovery, according to an embodiment of the present description. As shown in fig. 6, computing device 60 may include at least one processor 601, storage 602 (e.g., non-volatile memory), memory 603, and communication interface 606, and the at least one processor 601, storage 602, memory 603, and communication interface 606 are connected together via bus 606. The at least one processor 601 executes at least one computer readable instruction stored or encoded in the storage 602.
It should be appreciated that the computer-executable instructions stored in the storage 602, when executed, cause the at least one processor 601 to perform the various operations and functions described above in connection with fig. 1-6 in various embodiments of the present specification.
In embodiments of the present description, computing device 60 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronic devices, and the like.
According to one embodiment, a program product, such as a machine-readable medium, is provided. The machine-readable medium may have instructions (i.e., elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1-6 in various embodiments of the specification. In particular, a system or apparatus provided with a readable storage medium having stored thereon software program code implementing the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute instructions stored in the readable storage medium.
According to the method and device for starting the two-node cluster, the electronic equipment and the storage medium, a management network and a heartbeat line are set up in the cluster so that whether the cluster is normal can be judged, and a marker file is created during cluster startup so that whether the cluster has started can be judged. This avoids cluster split-brain caused by misjudgment during startup and effectively avoids false detection. The waiting time is determined from the start time of the cluster, which shortens the wait required by the judging program, improves detection efficiency, and allows fault recovery during startup to be carried out quickly and effectively.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention are presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application to thereby enable one skilled in the art to make and utilize the invention in various exemplary embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (10)

1. A method for two-node cluster initiation, the cluster comprising a first node and a second node, the method comprising:
the first node acquires a first path state and a second path state with respect to the second node, wherein the first path state is a connection state of the first node and the second node through a heartbeat line, and the second path state is a connection state of the first node and the second node through a management network;
the first node acquires a third path state with respect to a third-party IP address;
the first node and the second node perform master node switching according to the first path state, the second path state and the third path state to start the cluster;
wherein performing the master node switching to start the cluster comprises:
when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node switches itself to the master node;
when the first path state is disconnected and the second path state is connected, the first node and the second node perform master node switching according to the preset node roles;
the first node and the second node are directly connected through the heartbeat line, the first node and the second node are respectively connected to the third party IP address through the management network, and the third party IP address is a gateway IP address configured on the switch.
2. The two-node cluster starting method of claim 1, further comprising:
after the cluster is started, the first node and the second node each create a marker file.
3. The two-node cluster starting method of claim 2, further comprising:
when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node checks whether it has the marker file; if so,
the first node does not perform master node switching; if not,
the first node switches itself to the master node.
4. The two-node cluster starting method of claim 2, further comprising:
when the first path state is disconnected and the second path state is connected, the first node and the second node each check whether the marker file exists; if so,
the first node and the second node do not perform master node switching; if not,
the first node and the second node perform master node switching according to the preset node roles.
5. The two-node cluster starting method of claim 2, further comprising:
when the first node and the second node form a cluster, the first node and the second node each delete their own marker file.
6. The method of claim 5, wherein the judgment of master node switching by the first node and the second node and the judgment of marker file deletion are performed according to a set period.
7. The method for starting up a two-node cluster according to any one of claims 2 to 6, wherein, before the first node and the second node perform the master node switching, the method further comprises:
acquiring the start time of the cluster and the current time;
judging whether the difference between the start time of the cluster and the current time is smaller than a set duration;
if yes, performing the master node switching after waiting for the duration of the difference.
8. An apparatus for two-node cluster initiation, the cluster comprising a first node and a second node, the apparatus comprising:
the acquisition module is used for enabling the first node to acquire a first path state and a second path state with respect to the second node, wherein the first path state is a connection state of the first node and the second node through a heartbeat line, and the second path state is a connection state of the first node and the second node through a management network; and for enabling the first node to acquire a third path state with respect to a third-party IP address;
the switching module is used for enabling the first node and the second node to perform master node switching according to the first path state, the second path state and the third path state so as to start the cluster;
wherein performing the master node switching to start the cluster comprises:
when the first path state is disconnected, the second path state is disconnected, and the third path state is connected, the first node switches itself to the master node;
when the first path state is disconnected and the second path state is connected, the first node and the second node perform master node switching according to the preset node roles;
the first node and the second node are directly connected through the heartbeat line, the first node and the second node are respectively connected to the third party IP address through the management network, and the third party IP address is a gateway IP address configured on the switch.
9. An electronic device, comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the two-node cluster start-up method of any of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the two-node cluster start-up method according to any of claims 1 to 7.
CN202110764775.6A 2021-07-06 2021-07-06 Method and device for starting two-node cluster, electronic equipment and storage medium Active CN113377702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110764775.6A CN113377702B (en) 2021-07-06 2021-07-06 Method and device for starting two-node cluster, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110764775.6A CN113377702B (en) 2021-07-06 2021-07-06 Method and device for starting two-node cluster, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113377702A CN113377702A (en) 2021-09-10
CN113377702B true CN113377702B (en) 2024-03-22

Family

ID=77581144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110764775.6A Active CN113377702B (en) 2021-07-06 2021-07-06 Method and device for starting two-node cluster, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113377702B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115269248B (en) * 2022-07-28 2023-08-08 安超云软件有限公司 Method and device for preventing brain fracture under double-node cluster, electronic equipment and storage medium
CN116248484B (en) * 2023-03-09 2024-03-22 安超云软件有限公司 Management method and device of cloud primary integrated machine, electronic equipment and storage medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101291243A (en) * 2007-04-16 2008-10-22 广东省新支点技术服务有限公司 Split brain preventing method for highly available cluster system
CN105681074A (en) * 2015-12-29 2016-06-15 北京同有飞骥科技股份有限公司 Method and device for enhancing reliability and availability of dual-computer clusters
CN105934929A (en) * 2014-12-31 2016-09-07 华为技术有限公司 Post-cluster brain split quorum processing method and quorum storage device and system
CN107147528A (en) * 2017-05-23 2017-09-08 郑州云海信息技术有限公司 One kind stores gateway intelligently anti-fissure system and method
CN107948249A (en) * 2017-11-02 2018-04-20 华南理工大学 Big data plateau elastic telescopic method based on service discovery and container technique
CN108984320A (en) * 2018-06-27 2018-12-11 郑州云海信息技术有限公司 A kind of anti-fissure method and device of message queue cluster
CN109495312A (en) * 2018-12-05 2019-03-19 广州鼎甲计算机科技有限公司 The method and system of high-availability cluster based on arbitration disk and dual link
CN110620684A (en) * 2019-08-31 2019-12-27 苏州浪潮智能科技有限公司 Storage double-control split-brain-preventing method, system, terminal and storage medium
CN110750393A (en) * 2019-09-03 2020-02-04 北京字节跳动网络技术有限公司 Method, device, medium and equipment for avoiding network service dual-computer hot standby split brain
CN110784350A (en) * 2019-10-25 2020-02-11 北京计算机技术及应用研究所 Design method of real-time available cluster management system
CN112084072A (en) * 2020-09-11 2020-12-15 重庆紫光华山智安科技有限公司 Method, system, medium and terminal for improving disaster tolerance capability of PostgreSQL cluster
CN112104727A (en) * 2020-09-10 2020-12-18 华云数据控股集团有限公司 Method and system for deploying simplified high-availability Zookeeper cluster
CN112367198A (en) * 2020-10-30 2021-02-12 新华三大数据技术有限公司 Main/standby node switching method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8671218B2 (en) * 2009-06-16 2014-03-11 Oracle America, Inc. Method and system for a weak membership tie-break
US9930140B2 (en) * 2015-09-15 2018-03-27 International Business Machines Corporation Tie-breaking for high availability clusters
US10547499B2 (en) * 2017-09-04 2020-01-28 International Business Machines Corporation Software defined failure detection of many nodes

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
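Title
"Design and implementation of heartbeat in multi-machine environment"; Zonghao Hou et al.; 17th International Conference on Advanced Information Networking and Applications; full text *
Research on the mechanism and performance analysis of Oracle Clusterware; 许庆炜; Journal of Hubei University of Education (No. 02); full text *
High-availability scheme for power system management gateways; 黄波; 颜慧; 王建新; Electronic Technology & Software Engineering (No. 20); full text *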

Also Published As

Publication number Publication date
CN113377702A (en) 2021-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant