CN109617761B - Method and device for switching main server and standby server - Google Patents

Method and device for switching main server and standby server

Info

Publication number
CN109617761B
Authority
CN
China
Prior art keywords
bridge
standby
server
state
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811506780.1A
Other languages
Chinese (zh)
Other versions
CN109617761A (en)
Inventor
崔义芳
喻波
王志海
韩振国
安鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wondersoft Technology Co Ltd
Original Assignee
Beijing Wondersoft Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wondersoft Technology Co Ltd
Priority to CN201811506780.1A
Publication of CN109617761A
Application granted
Publication of CN109617761B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/2028 Failover techniques eliminating a faulty processor or activating a spare
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention provides a method and a device for switching between a main server and a standby server. The method comprises the following steps: starting a first Keepalived service; starting a first bridge service to determine a first bridge state of the local bridge; starting a first bridge check thread when the first bridge state is the working state; checking a second bridge state of the local bridge with the first bridge check thread to generate a first bridge state file; checking the first bridge state file with the first Keepalived service; and, when the check result shows that the local bridge is in the non-working state, stopping the first Keepalived service and executing the main/standby switching process. The invention avoids the loops caused by bridges running in parallel and provides a main/standby switching mechanism for Keepalived-hosted services that have no port; the implementation is comparatively simple and can be reused.

Description

Method and device for switching main server and standby server
Technical Field
The present invention relates to the field of computer communication technologies, and in particular to a method and a device for switching between a main server and a standby server.
Background
Keepalived is used to detect the state of servers. If a web server goes down or malfunctions, Keepalived detects this and removes the faulty server from the system, while other servers take over its work. Once the server is working normally again, Keepalived automatically adds it back to the server group. All of this happens automatically without manual intervention; only the repair of the faulty server itself is manual.
To improve reliability, two or more bridges are sometimes placed in parallel between LANs (Local Area Networks). This configuration introduces loops into the topology, which can cause frames to circulate indefinitely. The spanning tree algorithm is the common solution at present: the bridges communicate with each other and overlay the actual topology with a spanning tree that reaches every LAN. With a spanning tree, there is exactly one path between any two LANs. Once the bridges have defined the spanning tree, all forwarding between LANs follows it. Because there is a unique path from each source to each destination, loops are no longer possible.
To build a spanning tree, one bridge must first be chosen as the root. Each bridge broadcasts its serial number (set by the manufacturer and guaranteed to be globally unique), and the bridge with the smallest serial number becomes the root. The spanning tree is then constructed from the shortest path from the root to each bridge. If a bridge or LAN fails, the tree is recalculated.
Keepalived works with one main server and a plurality of standby servers that carry the same service configuration and provide service externally through a virtual IP address; when the main server fails, the virtual IP address automatically drifts to a standby server. The services that Keepalived hosts have ports, so the port can be written into the Keepalived configuration file, and Keepalived can judge whether the service is normal by whether the port exists and switch between the main and standby servers accordingly. The bridge service, however, has no such port, so some other indicator of the bridge service, such as the bridge state information, is needed. The spanning tree algorithm, for its part, is too complex to implement.
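The root election and tree construction described in this background can be illustrated with a small sketch. It is a centralized simplification and not the distributed protocol that real bridges run: it assumes the whole topology is known in one place, treats every link as having cost 1, and uses made-up integer serial numbers. The names `build_spanning_tree` and `topology` are hypothetical.

```python
from collections import deque


def build_spanning_tree(links):
    """Elect a root and build a shortest-path tree over a bridge topology.

    links maps each bridge serial number to the set of serial numbers of
    its directly connected neighbours (an undirected graph, unit link cost).
    Returns (root, parent), where parent[b] is the next bridge towards the root.
    """
    root = min(links)  # the smallest serial number becomes the root
    parent = {root: None}
    queue = deque([root])
    while queue:  # breadth-first search gives shortest paths for unit costs
        bridge = queue.popleft()
        for neighbour in sorted(links[bridge]):
            if neighbour not in parent:
                parent[neighbour] = bridge
                queue.append(neighbour)
    return root, parent


# Hypothetical topology: bridges 3, 5 and 7 form a loop, 9 hangs off 7.
topology = {3: {5, 7}, 5: {3, 7}, 7: {3, 5, 9}, 9: {7}}
root, parent = build_spanning_tree(topology)
```

In this example the link between bridges 5 and 7 does not appear in `parent`, so it carries no traffic and the loop is broken.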
Disclosure of Invention
The invention provides a method and a device for switching between a main server and a standby server, to solve the problems in the prior art that the main and standby servers cannot be switched according to the state of a bridge and that the bridge service on a device that is no longer in use cannot be stopped.
In order to solve the above problems, the present invention is realized by:
in a first aspect, an embodiment of the present invention provides a method for switching between a master server and a standby server, where the method includes: starting a first Keepalived service; starting a first bridge service to determine a first bridge state of a native bridge; starting a first bridge check thread under the condition that the first bridge state is in a working state; checking the second bridge state of the native bridge by using the first bridge checking thread to generate a first bridge state file; checking the first bridge state file with the first Keepalived service; and under the condition that the native network bridge is determined to be in the non-working state according to the checking result, stopping the first Keepalived service and executing the main/standby switching process.
Preferably, after the step of starting the first Keepalived service, the method further includes: starting a first main/standby communication thread; sending a host startup state message to a standby server through the first main/standby communication thread; receiving standby state information returned by the standby server; under the condition that the standby server is determined to be in the working state according to the standby machine state information, sending a bridge service stopping message to the standby server so that the standby server stops the bridge service according to the bridge service stopping message; and executing the step of starting the first bridge service to judge the first bridge state of the local bridge under the condition that the standby server is determined to be in the non-working state according to the standby machine state information.
Preferably, the step of executing the active/standby switching process includes: acquiring a bridge state value corresponding to a local bridge by using the first main/standby communication thread; and sending the bridge state value to the standby server, and executing the process of switching from the main server to the standby server.
Preferably, after the step of stopping the first Keepalived service and executing the active/standby switching process, the method further includes: and after the state of the local network bridge is recovered to be normal, sending server normal operation state information to the standby server through the first main/standby communication thread, and executing a process of switching from the standby server to the main server.
In a second aspect, an embodiment of the present invention provides a method for switching between a master server and a standby server, where the method includes: starting a second Keepalived service; starting a second main/standby communication thread; receiving first host state information sent by a main server by utilizing the second main/standby communication thread; when the main server is determined to be abnormal according to the first host state information, sending a bridge service stop instruction to the main server; starting a second bridge service to determine a second bridge state of the native bridge; starting a second bridge check thread under the condition that the second bridge state is in a working state; receiving second host state information sent by a main server by utilizing the second main/standby communication thread; and when the main server is determined to be recovered to the normal state according to the second host state information, stopping the second bridge service and the second keepalive service, and sending service stop state information to the main server.
Preferably, after the step of initiating the second bridge service, the method further comprises: judging whether the native bridge is in a working state; under the condition that the native bridge is in a non-working state, the second bridge service is started again; and repeatedly executing the steps of judging whether the local network bridge is in the working state or not and restarting the second network bridge service under the condition that the local network bridge is not in the working state.
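The start, check and restart cycle described in the preceding paragraph can be sketched as a simple retry loop. This is only an illustration under stated assumptions: `start_bridge` and `is_working` are hypothetical callbacks standing in for the second bridge service and its state check, and the `retry_delay` and `max_attempts` parameters are additions of this sketch, not part of the described method.

```python
import time


def start_bridge_until_working(start_bridge, is_working,
                               retry_delay=1.0, max_attempts=None):
    """Start the bridge service, restarting it until the bridge works.

    Returns True once is_working() reports a working bridge, or False if
    max_attempts starts were made without success (None means no limit).
    """
    start_bridge()
    attempts = 1
    while not is_working():
        if max_attempts is not None and attempts >= max_attempts:
            return False
        time.sleep(retry_delay)  # pause before restarting the service
        start_bridge()
        attempts += 1
    return True
```

Leaving `max_attempts` as `None` matches the unbounded repetition in the text; a finite limit is offered only as a practical safeguard.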
In a third aspect, an embodiment of the present invention provides a device for switching between a master server and a standby server, including: the first Keepalived starting module is used for starting the first Keepalived service; the first bridge state judging module is used for starting a first bridge service to judge the first bridge state of the local bridge; a first bridge thread starting module, configured to start a first bridge check thread when the first bridge state is in a working state; the first bridge file generation module is used for checking the second bridge state of the local bridge by using the first bridge check thread to generate a first bridge state file; a first bridge file checking module, configured to check the first bridge state file using the first Keepalived service; and the master-slave switching execution module is used for stopping the first Keepalived service and executing the master-slave switching process under the condition that the local network bridge is determined to be in the non-working state according to the check result.
Preferably, the device further includes: a first main/standby thread starting module, configured to start a first main/standby communication thread; a host state message sending module, configured to send a host startup state message to the standby server through the first main/standby communication thread; a standby state information receiving module, configured to receive the standby state information returned by the standby server; and a bridge judgment execution module, configured to invoke the first bridge state judging module when the standby server is determined to be in the non-working state according to the standby state information.
Preferably, the active/standby switching execution module includes: the bridge state value acquisition submodule is used for acquiring a bridge state value corresponding to the local bridge by using the first main/standby communication thread; and the main/standby switching execution submodule is used for sending the bridge state value to the standby server and executing the process of switching from the main server to the standby server.
Preferably, the device further includes: a standby-to-main switching module, configured to send bridge normal state information to the standby server through the first main/standby communication thread after the state of the local bridge returns to normal, and to execute the process of switching from the standby server to the main server.
In a fourth aspect, an embodiment of the present invention provides a device for switching between a master server and a standby server, including: the second Keepalived starting module is used for starting a second Keepalived service; a second active-standby thread starting module, configured to start a second active-standby communication thread; a first host state receiving module, configured to receive, by using the second active/standby communication thread, first host state information sent by a master server; the network bridge stopping instruction sending module is used for sending a network bridge service stopping instruction to the main server when the main server is determined to be abnormal according to the first host state information; the second bridge state judging module is used for starting a second bridge service to judge the second bridge state of the local bridge; the second bridge thread starting module is used for starting a second bridge check thread under the condition that the second bridge state is in a working state; a second host state receiving module, configured to receive, by using the second active/standby communication thread, second host state information sent by a master server; and the second bridge service and keepalive stopping module is used for stopping the second bridge service and the second keepalive service when the main server is determined to be recovered to the normal state according to the second host state information, and sending service stopping state information to the main server.
Preferably, the device further includes: a bridge state judging module, configured to judge whether the local bridge is in the working state; a second bridge service restarting module, configured to restart the second bridge service when the local bridge is in the non-working state; and a repeated execution module, configured to repeatedly invoke the bridge state judging module and the second bridge service restarting module.
In a fifth aspect, an embodiment of the present invention provides a terminal, including: a memory, a processor, and a computer program that is stored on the memory and can run on the processor, where the computer program, when executed by the processor, implements the steps of any one of the main/standby server switching methods described above.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the steps in the active/standby server switching method described in any one of the foregoing descriptions.
Compared with the prior art, the invention has the following advantages:
In the embodiment of the invention, a first Keepalived service is started; a first bridge service is started to determine a first bridge state of the local bridge; a first bridge check thread is started when the first bridge state is the working state; the first bridge check thread checks a second bridge state of the local bridge and generates a first bridge state file; the first Keepalived service checks the first bridge state file; and, when the check result shows that the local bridge is in the non-working state, the first Keepalived service is stopped and the main/standby switching process is executed. The embodiment of the invention avoids the loops caused by bridges running in parallel and provides a main/standby switching mechanism for Keepalived-hosted services that have no port; the implementation is comparatively simple and can be reused.
Drawings
Fig. 1 is a flowchart illustrating steps of a method for switching between a main server and a standby server according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating steps of a method for switching between a main server and a standby server according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram illustrating a main/standby server switching device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a main/standby server switching device according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The technical terms used in the examples of the present invention are explained as follows:
Keepalived: Keepalived is used to detect the state of servers. If a web server goes down or malfunctions, Keepalived detects this and removes the faulty server from the system, while other servers take over its work. Once the server is working normally again, Keepalived automatically adds it back to the server group. All of this happens automatically without manual intervention; only the repair of the faulty server itself is manual.
Bridge: a bridge connects two networks and manages the flow of network data. It operates at the data link layer and forwards data according to MAC addresses. In network interconnection, a bridge receives data, filters addresses and forwards data, and is used to exchange data among multiple network systems.
Example one
Referring to Fig. 1, a flowchart of the steps of a method for switching between a main server and a standby server according to an embodiment of the present invention is shown. The method may be applied to the main server and may specifically include the following steps:
Step 101: start the first Keepalived service.
In this embodiment of the present invention, the first Keepalived service refers to the Keepalived service on the main server side.
A self-check function can be added in advance to the Keepalived service configuration of the main server. This function is later used to check the local bridge state file, so that main/standby switching can be performed when the bridge state is abnormal.
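One way to picture such a self-check is a small script that reads the bridge state file and reports the result as a status code, since script-based Keepalived checks conventionally signal health through their exit status (zero for healthy). The sketch below is an assumption-laden illustration: the file location `/var/run/bridge_state`, the `"1"` encoding for a working bridge, and the function name `check_state_file` are all hypothetical and do not come from the original text.

```python
from pathlib import Path

STATE_FILE = Path("/var/run/bridge_state")  # hypothetical location
WORKING = "1"                               # hypothetical encoding of "working"


def check_state_file(path: Path = STATE_FILE) -> int:
    """Return 0 if the state file reports a working bridge, otherwise 1."""
    try:
        value = path.read_text().strip()
    except OSError:
        # A missing or unreadable file is treated as "not working".
        return 1
    return 0 if value == WORKING else 1

# When run as a standalone check script, the return value would be used
# as the process exit status, for example sys.exit(check_state_file()).
```

Treating an unreadable file as a failure is a design choice of this sketch: it errs towards triggering a switch rather than hiding a broken check thread.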
After the first Keepalived service on the main server side is started, it hosts the main server and monitors the state of the main server in real time, for example whether the main server has failed.
After the first Keepalived service is initiated, step 102 is performed.
Step 102: a first bridge service is initiated to determine a first bridge state of the native bridge.
The first bridge service refers to a bridge service on the host server side. The first bridge state refers to the state of the bridge on the host server side, and may include two states, an active state and an inactive state.
The first bridge service may determine the state of the native bridge, i.e., determine whether the native bridge is operational using the first bridge service.
In this embodiment of the present invention, after the first Keepalived service is started, the active/standby communication thread may be further started to implement communication between the active server and the standby server through the active/standby communication thread, and specifically, the following detailed description is provided in the following preferred embodiments.
In a preferred embodiment of the present invention, after the step 101, the method may further include:
Step A1: start a first main/standby communication thread.
In this embodiment of the present invention, the first main/standby communication thread refers to the thread through which the main server communicates with the standby server.
A main/standby communication thread is a thread through which the host (that is, the main server) and the standby machine (that is, the standby server) report their bridge states to each other; each side decides whether to start or stop its local bridge according to the bridge state of the peer.
After the first Keepalived service is started, the first main/standby communication thread may be started to enable communication between the main server and the standby server and the mutual reporting of bridge states.
After the first main/standby communication thread is started, step A2 is performed.
Step A2: send a host startup state message to the standby server through the first main/standby communication thread.
The host startup state message is a message indicating that the host has started up or has restarted after fault recovery.
After the first main/standby communication thread is started, the main server sends the host startup state message to the standby server through it. If the standby server is currently providing service, it can stop the bridge and Keepalived services on the standby server side in response to this message, which avoids the loop caused by bridges running in parallel.
After the host startup state message has been sent to the standby server through the first main/standby communication thread, step A3 is performed.
Step A3: receive the standby state information returned by the standby server.
After the host startup state message has been sent, the main server receives the standby state information returned by the standby server through the first main/standby communication thread, and then performs step A4 or step A5 according to that information.
Step A4: when the standby server is determined to be in the working state according to the standby state information, send a bridge service stop message to the standby server, so that the standby server stops its bridge service according to that message.
The standby state information received by the main server describes whether the standby server has stopped the bridge service and the Keepalived service on the standby server side. When this information shows that the standby server is in the working state, the main server sends a bridge service stop message to the standby server through the first main/standby communication thread. On receiving the message, the standby server stops the bridge service on its side and then sends standby state information to the main server again, to report that the bridge service on the standby server side has been stopped.
Step A5: when the standby server is determined to be in the non-working state according to the standby state information, perform the step of starting the first bridge service to determine the first bridge state of the local bridge.
As above, the standby state information describes whether the standby server has stopped the bridge service and the Keepalived service on the standby server side. If the standby server is determined to be in the non-working state according to the standby state information it returned, step 102 is performed.
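Steps A2 to A5 amount to a short handshake, which the following sketch illustrates from the main server's point of view. The transport is abstracted into two hypothetical callbacks, `send` and `receive`, and the message names `HOST_STARTED` and `STOP_BRIDGE_SERVICE`, the `StandbyState` enum and the `max_rounds` limit are assumptions made for illustration; the original text does not specify a wire format or a retry limit.

```python
from enum import Enum


class StandbyState(Enum):
    WORKING = "working"
    NOT_WORKING = "not_working"


def master_startup_handshake(send, receive, max_rounds=5):
    """Run steps A2 to A5; return True when step 102 may proceed.

    send(message) delivers a message to the standby server and receive()
    returns the StandbyState it reports back.
    """
    send("HOST_STARTED")                       # step A2
    for _ in range(max_rounds):
        state = receive()                      # step A3
        if state is StandbyState.NOT_WORKING:  # step A5: safe to start the bridge
            return True
        send("STOP_BRIDGE_SERVICE")            # step A4: ask the standby to stop
    return False
```

The loop reflects the description that the standby server reports its state again after stopping its bridge service, so the main server starts its own bridge only once the standby side is confirmed to be stopped.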
After determining the first bridge state of the native bridge, step 103 is performed.
Step 103: start a first bridge check thread when the first bridge state is the working state.
A bridge check thread is a thread that obtains the bridge state in a loop and writes it into the state file, so that Keepalived can check the state value in that file.
The first bridge check thread refers to the bridge check thread on the main server side.
If the first bridge service determines that the local bridge is in the working state, the first bridge check thread is started and step 104 is performed.
Step 104: check the second bridge state of the local bridge with the first bridge check thread to generate a first bridge state file.
The second bridge state refers to the bridge state on the main server side as detected by the first bridge check thread.
After the first bridge check thread is started, it checks the second bridge state of the local bridge and records the result in a bridge state file, thereby generating the first bridge state file.
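A bridge check thread of this kind can be sketched as a loop that samples the bridge state and rewrites the state file. The sketch below makes several assumptions that are not in the original text: it reads the Linux sysfs attribute `/sys/class/net/<interface>/operstate` and treats the value `up` as "working", it encodes the state as `"1"` or `"0"`, and the function names, the polling interval and the state file path passed by the caller are all hypothetical.

```python
import os
import threading
from pathlib import Path


def read_bridge_state(bridge: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return "1" if the bridge interface reports operstate "up", else "0"."""
    try:
        operstate = Path(sysfs_root, bridge, "operstate").read_text().strip()
    except OSError:
        return "0"  # a missing interface counts as not working
    return "1" if operstate == "up" else "0"


def write_state_file(path: str, value: str) -> None:
    """Write the state value atomically so a reader never sees a partial file."""
    tmp = f"{path}.tmp"
    with open(tmp, "w") as fh:
        fh.write(value + "\n")
    os.replace(tmp, path)  # atomic rename on the same filesystem


def bridge_check_loop(bridge: str, state_file: str, stop: threading.Event,
                      interval: float = 1.0,
                      sysfs_root: str = "/sys/class/net") -> None:
    """Poll the bridge state and refresh the state file until stop is set."""
    while not stop.is_set():
        write_state_file(state_file, read_bridge_state(bridge, sysfs_root))
        stop.wait(interval)
```

The loop would typically run in its own thread, created with `threading.Thread(target=bridge_check_loop, args=(...))` and ended by setting the `stop` event. Writing to a temporary file and renaming it means the Keepalived-side check never reads a half-written value.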
After the first bridge state file is generated, step 105 is performed.
Step 105: checking the first bridge state file with the first Keepalived service.
After the first bridge state file is generated, the first Keepalived service may check it and determine the state of the local bridge from the values in the file.
After checking the first bridge state file with the first Keepalived service, step 106 is performed.
Step 106: when the check result shows that the local bridge is in the non-working state, stop the first Keepalived service and execute the main/standby switching process.
When the check result shows that the local bridge is in the non-working state, the first Keepalived service is stopped and the main/standby switching process is executed, that is, service is switched from the main server to the standby server, and the standby server provides the service.
The main/standby switching process is described in detail in the following preferred embodiment.
In a preferred embodiment of the present invention, step 106 may include:
Substep B1: acquiring a bridge state value corresponding to the local bridge by using the first main/standby communication thread;
Substep B2: sending the bridge state value to the standby server, and executing the process of switching from the main server to the standby server.
In this embodiment of the invention, the main server obtains the bridge state value of the local bridge through the first main/standby communication thread and sends it to the standby server. If sending the bridge state value fails, the failure is recorded on the device, and the first Keepalived service sends an alarm email, according to the state value, to the mailbox configured in the Keepalived configuration file, so that an operator can intervene on the standby server, that is, stop the bridge and Keepalived services on the standby server side.
If sending succeeds, the standby server stops the bridge and Keepalived services on the standby server side according to the bridge state value of the host side.
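Substeps B1 and B2, together with the failure path described above, can be sketched as follows. Every name here is hypothetical: `get_bridge_state`, `send_to_standby`, `record_failure` and `send_alert` are callbacks standing in for the communication thread, the device's record keeping and the alarm email, and treating a send failure as an `OSError` is an assumption of this sketch.

```python
def report_bridge_state(get_bridge_state, send_to_standby,
                        record_failure, send_alert):
    """Send the local bridge state value to the standby server.

    Returns True if the value was delivered. On a send failure the
    failure is recorded, an alert is raised for an operator, and the
    function returns False.
    """
    value = get_bridge_state()      # substep B1
    try:
        send_to_standby(value)      # substep B2
    except OSError as exc:          # covers socket and connection errors
        record_failure(f"could not send bridge state {value!r}: {exc}")
        send_alert(value)           # stands in for the alarm email
        return False
    return True
```

Passing the collaborators in as callbacks keeps the sketch independent of any particular transport or mail setup.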
After the bridge state on the main server side returns to normal, the main server may also inform the standby server that the local bridge has recovered, as described in detail in the following preferred embodiment.
In a preferred embodiment of the present invention, after the step 106, the method may further include:
Substep C1: after the state of the local bridge returns to normal, sending server normal operation state information to the standby server through the first main/standby communication thread, and executing a process of switching from the standby server to the main server.
In the embodiment of the present invention, the server normal operation state information refers to information indicating that the main server is in a normal operation state.
After the state of the local bridge returns to normal, the main server sends the server normal operation state information to the standby server through the first main/standby communication thread to indicate that the main server is working again. The standby server then stops the bridge service and the Keepalived service on the standby server side according to this information, which avoids the loop caused by bridges running in parallel. The process of switching from the standby server to the main server is then executed, that is, the main server resumes providing the service.
In the method for switching between the main server and the standby server provided by the embodiment of the invention, a first Keepalived service is started; a first bridge service is started to judge the first bridge state of the local bridge; when the first bridge state is the working state, a first bridge check thread is started; the first bridge check thread checks the second bridge state of the local bridge and generates a first bridge state file; the first Keepalived service checks the first bridge state file; and when the check result shows that the local bridge is in a non-working state, the first Keepalived service is stopped and the main/standby switching process is executed. The embodiment of the invention can solve the loop problem caused by bridges running in parallel, gives Keepalived a way to host a service that has no port under a main/standby switching mechanism, is simple to implement, and can be reused.
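As a minimal sketch of how the check of the bridge state file might look, assume the bridge check thread writes the single character 1 when the bridge is working and 0 otherwise. The file format, the function names and the exit-code convention are assumptions made for illustration; the embodiment does not specify them.

```python
def bridge_is_working(state_file: str) -> bool:
    """Return True only when the state file exists and contains "1"."""
    try:
        with open(state_file, encoding="utf-8") as handle:
            return handle.read().strip() == "1"
    except OSError:
        # A missing or unreadable file is treated as a non-working bridge.
        return False


def check_exit_code(state_file: str) -> int:
    """Map the bridge state to an exit code: 0 for working, 1 otherwise."""
    return 0 if bridge_is_working(state_file) else 1
```

A non-zero result would be the trigger for stopping the first Keepalived service and starting the main/standby switching process.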
Example two
Referring to fig. 2, a flowchart illustrating steps of a method for switching between a master server and a standby server according to an embodiment of the present invention is shown, where the method for switching between the master server and the standby server may be applied to the standby server, and specifically may include the following steps:
Step 201: starting a second Keepalived service.
The embodiment of the invention can be applied to the standby server.
The second Keepalived service refers to the Keepalived service on the standby server side.
After the second Keepalived service is started to host the standby server, step 202 is performed.
Step 202: starting a second main/standby communication thread.
The second main/standby communication thread is the thread through which the standby server communicates with the main server. Through this thread, the standby server and the main server report their bridge states to each other, and each side decides whether to start or stop its local bridge according to the state of the peer.
After the second Keepalived service is started, the second main/standby communication thread may be started, and step 203 is performed.
Step 203: receiving, by using the second main/standby communication thread, the first host state information sent by the main server.
The first host state information indicates the current state of the main server and may include state information such as an abnormality or a failure of the main server.
After the second main/standby communication thread is started, the standby server may receive the first host state information sent by the main server through that thread, and step 204 is performed.
Step 204: when the main server is determined to be abnormal according to the first host state information, sending a bridge service stop instruction to the main server.
After the standby server receives the first host state information sent by the main server, it can judge from that information whether the main server is in an abnormal or a normal state.
When the standby server determines that the main server is abnormal according to the first host state information, it sends a bridge service stop instruction to the main server; the instruction tells the main server to stop its bridge service.
After the bridge service stop instruction is sent to the main server, step 205 is performed.
Step 205: starting a second bridge service to determine a second bridge state of the local bridge.
After the standby server sends the bridge service stop instruction, the main server may stop the bridge service on the main server side according to the instruction, and the standby server is then notified, through the second main/standby communication thread, that the bridge service on the main server side has been stopped.
In a preferred embodiment of the present invention, after step 205, the method may further include:
Substep D1: judging whether the local bridge is in a working state;
Substep D2: when the local bridge is in a non-working state, starting the second bridge service again;
Substep D3: repeating the steps of judging whether the local bridge is in a working state and restarting the second bridge service when it is not.
In this preferred embodiment, after the second bridge service is started, it may further be used to judge whether the local bridge is in a working state. If the local bridge is judged to be in a non-working state, the second bridge service is started again and the state of the local bridge is judged again; substeps D1 and D2 are repeated a set number of times. This prevents the bridge check thread from misjudging the bridge state because the bridge service starts too slowly.
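Substeps D1 to D3 amount to a bounded retry loop. The sketch below shows one way to write it; the parameter names, the default of three attempts and the settle delay are assumptions, since the embodiment only says that the number of repetitions is set in advance.

```python
import time
from typing import Callable


def start_bridge_with_retry(start_bridge: Callable[[], None],
                            bridge_is_up: Callable[[], bool],
                            max_attempts: int = 3,
                            settle_seconds: float = 1.0,
                            sleep: Callable[[float], None] = time.sleep) -> bool:
    """Start the bridge service and re-check it a bounded number of times.

    Returns True as soon as the bridge is seen in the working state and
    False when every attempt has been used up.
    """
    for _ in range(max_attempts):
        start_bridge()          # step 205, or substep D2 on later passes
        sleep(settle_seconds)   # give a slow bridge service time to come up
        if bridge_is_up():      # substep D1
            return True
    return False
```

Only after this function returns True would the bridge check thread of step 206 be started, so a slow start is not reported as a failed bridge.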
When the standby server receives the message that the bridge service of the main server side has been stopped, the standby server starts the second bridge service, that is, the local bridge service of the standby server, and determines the second bridge state of the local bridge by using the second bridge service, thereby executing step 206.
Step 206: starting a second bridge check thread when the second bridge state is the working state.
When the standby server judges, through the second bridge service, that the local bridge is in the working state, it starts the second bridge check thread to check the state of the standby-side bridge in real time and records the check result in a state file.
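A bridge check thread of this kind can be sketched as a loop that probes the bridge and rewrites the state file on each pass. The `probe` callable, the 1/0 file format and the polling interval are hypothetical; the embodiment does not say how the bridge is probed or how the state file is laid out.

```python
import os
import tempfile
import threading
from typing import Callable


def write_state_file(path: str, working: bool) -> None:
    """Write "1" or "0" to the state file, replacing it atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write("1\n" if working else "0\n")
    # os.replace swaps the file in one step, so a reader never sees
    # a half-written state file.
    os.replace(tmp_path, path)


def bridge_check_loop(probe: Callable[[], bool],
                      path: str,
                      stop: threading.Event,
                      interval: float = 1.0) -> None:
    """Record the probed bridge state until the stop event is set."""
    while not stop.is_set():
        write_state_file(path, probe())
        stop.wait(interval)
```

The loop would typically be run with `threading.Thread(target=bridge_check_loop, args=(probe, path, stop))` and ended by calling `stop.set()`.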
Step 207: receiving, by using the second main/standby communication thread, second host state information sent by the main server.
While the state of the local bridge is being checked in real time, the standby server may also receive the second host state information sent by the main server through the second main/standby communication thread, and step 208 is performed.
Step 208: when the main server is determined to have returned to the normal state according to the second host state information, stopping the second bridge service and the second Keepalived service, and sending service stop state information to the main server.
When the standby server determines, according to the second host state information sent by the main server, that the main server has returned to the normal state, it stops the second bridge service and the second Keepalived service and sends service stop state information to the main server. After receiving the service stop state information, the main server can carry out the process of switching from the standby server to the main server, that is, the main server provides the service after the switch.
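Steps 203 to 208 can be read as a small state machine on the standby server. The sketch below models it with in-memory flags and an outbox of messages; the class name, the message strings and the state names "abnormal" and "normal" are assumptions for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandbyNode:
    """Toy model of the standby server's reaction to main server messages."""
    bridge_running: bool = False
    keepalived_running: bool = True
    awaiting_stop_ack: bool = False
    outbox: List[str] = field(default_factory=list)

    def on_host_state(self, state: str) -> None:
        if (state == "abnormal" and not self.bridge_running
                and not self.awaiting_stop_ack):
            # Step 204: ask the main server to stop its bridge service.
            self.outbox.append("STOP_BRIDGE")
            self.awaiting_stop_ack = True
        elif state == "normal" and self.bridge_running:
            # Step 208: stop local services, then tell the main server.
            self.bridge_running = False
            self.keepalived_running = False
            self.outbox.append("SERVICES_STOPPED")

    def on_main_bridge_stopped(self) -> None:
        # Step 205: start the local bridge only after the main server
        # has confirmed that its own bridge service is stopped.
        if self.awaiting_stop_ack:
            self.awaiting_stop_ack = False
            self.bridge_running = True
```

Starting the local bridge only in `on_main_bridge_stopped` reflects the point made in the text that two bridges must not run in parallel.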
In the method for switching between the main server and the standby server provided by the embodiment of the invention, a second Keepalived service is started; a second main/standby communication thread is started; first host state information sent by the main server is received by using the second main/standby communication thread; when the main server is determined to be abnormal according to the first host state information, a bridge service stop instruction is sent to the main server; a second bridge service is started to judge the second bridge state of the local bridge; when the second bridge state is the working state, a second bridge check thread is started; second host state information sent by the main server is received by using the second main/standby communication thread; and when the main server is determined to have returned to the normal state according to the second host state information, the second bridge service and the second Keepalived service are stopped and service stop state information is sent to the main server. The embodiment of the invention can solve the loop problem caused by bridges running in parallel, gives Keepalived a way to host a service that has no port under a main/standby switching mechanism, is simple to implement, and can be reused.
Example three
Referring to fig. 3, a schematic structural diagram of a main/standby server switching device according to an embodiment of the present invention is shown, where the main/standby server switching device may be applied to a main server, and specifically may include:
a first Keepalived starting module 310, configured to start a first Keepalived service;
a first bridge state determining module 320, configured to start a first bridge service to determine a first bridge state of the local bridge;
a first bridge thread starting module 330, configured to start a first bridge check thread when the first bridge state is the working state;
a first bridge file generating module 340, configured to check a second bridge state of the local bridge by using the first bridge check thread and generate a first bridge state file;
a first bridge file checking module 350, configured to check the first bridge state file by using the first Keepalived service;
a main/standby switching execution module 360, configured to stop the first Keepalived service and execute the main/standby switching process when the local bridge is determined to be in a non-working state according to the check result.
Preferably, the device further includes:
a first main/standby thread starting module, configured to start a first main/standby communication thread;
a host state message sending module, configured to send a host startup state message to the standby server through the first main/standby communication thread;
a standby state information receiving module, configured to receive standby state information returned by the standby server;
a bridge judgment module execution module, configured to execute the first bridge state determining module when the standby server is determined to be in a non-working state according to the standby state information.
Preferably, the main/standby switching execution module 360 includes:
a bridge state value acquisition submodule, configured to acquire the bridge state value corresponding to the local bridge by using the first main/standby communication thread;
a main/standby switching execution submodule, configured to send the bridge state value to the standby server and execute the process of switching from the main server to the standby server.
Preferably, the device further includes:
a standby-to-main server switching module, configured to send bridge normal state information to the standby server through the first main/standby communication thread after the state of the local bridge returns to normal, and to execute the process of switching from the standby server to the main server.
The main/standby server switching device provided by the embodiment of the invention starts a first Keepalived service; starts a first bridge service to judge the first bridge state of the local bridge; starts a first bridge check thread when the first bridge state is the working state; checks the second bridge state of the local bridge by using the first bridge check thread and generates a first bridge state file; checks the first bridge state file by using the first Keepalived service; and, when the check result shows that the local bridge is in a non-working state, stops the first Keepalived service and executes the main/standby switching process. The embodiment of the invention can solve the loop problem caused by bridges running in parallel, gives Keepalived a way to host a service that has no port under a main/standby switching mechanism, is simple to implement, and can be reused.
Example four
Referring to fig. 4, a schematic structural diagram of a main/standby server switching device according to an embodiment of the present invention is shown, where the main/standby server switching device may be applied to a standby server, and specifically may include:
a second Keepalived starting module 410, configured to start a second Keepalived service;
a second main/standby thread starting module 420, configured to start a second main/standby communication thread;
a first host state receiving module 430, configured to receive, by using the second main/standby communication thread, first host state information sent by the main server;
a bridge stop instruction sending module 440, configured to send a bridge service stop instruction to the main server when the main server is determined to be abnormal according to the first host state information;
a second bridge state determining module 450, configured to start a second bridge service to determine a second bridge state of the local bridge;
a second bridge thread starting module 460, configured to start a second bridge check thread when the second bridge state is the working state;
a second host state receiving module 470, configured to receive, by using the second main/standby communication thread, second host state information sent by the main server;
a second bridge service and Keepalived stopping module 480, configured to stop the second bridge service and the second Keepalived service when the main server is determined to have returned to the normal state according to the second host state information, and to send service stop state information to the main server.
Preferably, the device further includes:
a bridge state judging module, configured to judge whether the local bridge is in a working state;
a second bridge service restarting module, configured to restart the second bridge service when the local bridge is in a non-working state;
a repeated execution module, configured to repeatedly execute the bridge state judging module and the second bridge service restarting module.
The main/standby server switching device provided by the embodiment of the invention starts a second Keepalived service; starts a second main/standby communication thread; receives, by using the second main/standby communication thread, first host state information sent by the main server; sends a bridge service stop instruction to the main server when the main server is determined to be abnormal according to the first host state information; starts a second bridge service to judge the second bridge state of the local bridge; starts a second bridge check thread when the second bridge state is the working state; receives, by using the second main/standby communication thread, second host state information sent by the main server; and, when the main server is determined to have returned to the normal state according to the second host state information, stops the second bridge service and the second Keepalived service and sends service stop state information to the main server. The embodiment of the invention can solve the loop problem caused by bridges running in parallel, gives Keepalived a way to host a service that has no port under a main/standby switching mechanism, is simple to implement, and can be reused.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Since the device embodiment is basically similar to the method embodiment, it is described only briefly; for relevant details, refer to the corresponding description of the method embodiment.
Preferably, an embodiment of the present invention further provides a terminal, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the foregoing active/standby server switching method embodiment, and can achieve the same technical effect, and details are not repeated here to avoid repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the foregoing active/standby server switching method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminals (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction system that implements the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method and the device for switching between the main server and the standby server provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (14)

1. A method for switching between a main server and a standby server is applied to the main server, and is characterized by comprising the following steps:
starting a first Keepalived service;
under the condition that the standby server is in a non-working state, starting a first bridge service to judge the first bridge state of the local bridge;
starting a first bridge check thread under the condition that the first bridge state is a working state;
checking the second bridge state of the native bridge by using the first bridge checking thread to generate a first bridge state file;
checking the first bridge state file with the first Keepalived service;
stopping the first keepalive service and executing a main-standby switching process under the condition that the native network bridge is determined to be in a non-working state according to the checking result;
when the main server is abnormal, first host abnormal state information is sent to the standby server through a second main/standby communication thread, so that the standby server starts a bridge on the side of the standby server according to the first host abnormal state information.
2. The method of claim 1, further comprising, after the step of initiating the first Keepalived service:
starting a first main/standby communication thread;
sending a host startup state message to a standby server through the first main/standby communication thread;
receiving standby state information returned by the standby server;
under the condition that the standby server is determined to be in the working state according to the standby machine state information, sending a bridge service stop message to the standby server so that the standby server stops the bridge service of the standby server side according to the bridge service stop message;
and executing the step of starting the first bridge service to judge the first bridge state of the local bridge under the condition that the standby server is determined to be in the non-working state according to the standby machine state information.
3. The method according to claim 2, wherein the step of performing the active/standby switching process comprises:
acquiring a bridge state value corresponding to a local bridge by using the first main/standby communication thread;
and sending the bridge state value to the standby server, and executing the process of switching from the main server to the standby server.
4. The method according to claim 2, wherein after the step of stopping the first Keepalived service and performing the active-standby switching procedure, further comprising:
and after the state of the local network bridge is recovered to be normal, sending server normal operation state information to the standby server through the first main/standby communication thread, and executing a process of switching from the standby server to the main server.
5. A method for switching between a main server and a standby server is applied to the standby server and is characterized by comprising the following steps:
starting a second Keepalived service;
starting a second main/standby communication thread;
receiving first host state information sent by a main server by utilizing the second main/standby communication thread;
when the main server is determined to be abnormal according to the first host state information, sending a bridge service stop instruction to the main server;
starting a network bridge on a standby server side;
starting a second bridge service to determine a second bridge state of the native bridge;
starting a second bridge check thread under the condition that the second bridge state is a working state;
receiving second host state information sent by a main server by utilizing the second main/standby communication thread;
and when the main server is determined to be recovered to the normal state according to the second host state information, stopping the second bridge service and the second keepalive service, and sending service stop state information to the main server.
6. The method of claim 5, further comprising, after the step of initiating the second bridge service:
judging whether the native bridge is in a working state;
under the condition that the native bridge is in a non-working state, the second bridge service is started again;
and repeatedly executing the steps of judging whether the local network bridge is in a working state or not and restarting the service of the second network bridge under the condition that the local network bridge is in a non-working state.
7. A master-slave server switching device is applied to a master server and is characterized by comprising:
the first Keepalived starting module is used for starting the first Keepalived service;
the first bridge state judging module is used for starting a first bridge service under the condition that the standby server is in a non-working state so as to judge the first bridge state of the local bridge;
a first bridge thread starting module, configured to start a first bridge check thread when the first bridge state is a working state;
the first bridge file generation module is used for checking the second bridge state of the local bridge by using the first bridge check thread to generate a first bridge state file;
a first bridge file checking module, configured to check the first bridge state file using the first Keepalived service;
the main/standby switching execution module is used for stopping the first Keepalived service and executing a main/standby switching process under the condition that the local network bridge is determined to be in a non-working state according to the check result;
and the information sending module is used for sending the abnormal state information of the first host to the standby server through a second main/standby communication thread when the main server is abnormal, so that the standby server starts a bridge at the side of the standby server according to the abnormal state information of the first host.
8. The apparatus of claim 7, further comprising:
a first active/standby thread starting module, configured to start a first active/standby communication thread;
the host state message sending module is used for sending a host startup state message to the standby server through the first main/standby communication thread;
the standby machine state information receiving module is used for receiving standby machine state information returned by the standby server;
and the bridge judgment module execution module is used for executing the first bridge state judgment module under the condition that the standby server is determined to be in the non-working state according to the standby machine state information.
9. The apparatus of claim 8, wherein the active-standby switching execution module comprises:
the bridge state value acquisition submodule is used for acquiring a bridge state value corresponding to the local bridge by using the first main/standby communication thread;
and the main/standby switching execution submodule is used for sending the bridge state value to the standby server and executing the process of switching from the main server to the standby server.
10. The apparatus of claim 8, further comprising:
and the standby main server switching module is used for sending bridge normal state information to the standby server through the first standby communication thread after the state of the local bridge is recovered to be normal, and executing a process of switching from the standby server to the main server.
11. A master-backup server switching device is applied to a backup server and is characterized by comprising:
the second Keepalived starting module is used for starting a second Keepalived service;
a second active-standby thread starting module, configured to start a second active-standby communication thread;
a first host state receiving module, configured to receive, by using the second active/standby communication thread, first host state information sent by a master server;
the network bridge stopping instruction sending module is used for sending a network bridge service stopping instruction to the main server when the main server is determined to be abnormal according to the first host state information;
the network bridge starting module is used for starting the network bridge at the side of the standby server;
the second bridge state judging module is used for starting a second bridge service to judge the second bridge state of the local bridge;
the second bridge thread starting module is used for starting a second bridge check thread under the condition that the second bridge state is a working state;
a second host state receiving module, configured to receive, by using the second active/standby communication thread, second host state information sent by a master server;
and the second bridge service and keepalive stopping module is used for stopping the second bridge service and the second keepalive service when the main server is determined to be recovered to the normal state according to the second host state information, and sending service stopping state information to the main server.
12. The device according to claim 11, further comprising:
a bridge state judging module, configured to judge whether the local bridge is in a working state;
a second bridge service restarting module, configured to restart the second bridge service when the local bridge is in a non-working state;
and a repeated execution module, configured to repeat the steps of judging whether the local bridge is in a working state and restarting the second bridge service when the local bridge is in a non-working state.
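The judge-and-restart cycle of claim 12 might look like the following Python loop. This is a hedged sketch under stated assumptions: the function `bridge_check_loop`, its callable parameters, the bounded `max_rounds`, and the `interval_s` delay are hypothetical and are not taken from the patent, which specifies neither a polling interval nor a stopping condition.

```python
import time
from typing import Callable


def bridge_check_loop(
    is_bridge_working: Callable[[], bool],
    restart_bridge_service: Callable[[], None],
    max_rounds: int,
    interval_s: float = 0.0,
) -> int:
    """Judge the local bridge state max_rounds times and restart the bridge
    service each time the bridge is found in a non-working state.

    Returns the number of restarts performed. The round limit exists only so
    the sketch terminates; a real check thread would presumably run until stopped.
    """
    restarts = 0
    for _ in range(max_rounds):
        if not is_bridge_working():
            restart_bridge_service()
            restarts += 1
        time.sleep(interval_s)  # assumed pause between checks
    return restarts
```

Passing the probe and the restart action in as callables keeps the loop independent of how the bridge is actually inspected or restarted on a given system.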
13. A terminal, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the active/standby server switching method according to any one of claims 1 to 6.
14. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the active/standby server switching method according to any one of claims 1 to 6.
CN201811506780.1A 2018-12-10 2018-12-10 Method and device for switching main server and standby server Active CN109617761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811506780.1A CN109617761B (en) 2018-12-10 2018-12-10 Method and device for switching main server and standby server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811506780.1A CN109617761B (en) 2018-12-10 2018-12-10 Method and device for switching main server and standby server

Publications (2)

Publication Number Publication Date
CN109617761A CN109617761A (en) 2019-04-12
CN109617761B CN109617761B (en) 2020-02-21

Family

ID=66008844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811506780.1A Active CN109617761B (en) 2018-12-10 2018-12-10 Method and device for switching main server and standby server

Country Status (1)

Country Link
CN (1) CN109617761B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101060391A (en) * 2007-05-16 2007-10-24 华为技术有限公司 Master and spare server switching method and system and master server and spare server
CN102006189A (en) * 2010-11-25 2011-04-06 中兴通讯股份有限公司 Primary access server determination method and device for dual-machine redundancy backup
CN102546135A (en) * 2010-12-31 2012-07-04 富泰华工业(深圳)有限公司 System and method for switching between active and standby servers
CN102638389A (en) * 2011-02-15 2012-08-15 中兴通讯股份有限公司 Redundancy backup method and system of TRILL (Transparent Interconnection over Lots of Links) network
CN102647288A (en) * 2011-02-16 2012-08-22 中兴通讯股份有限公司 VM (Virtual Machine) data access protection method and system
CN108200124A (en) * 2017-12-12 2018-06-22 武汉烽火众智数字技术有限责任公司 High-availability application architecture and construction method
EP3352433A1 (en) * 2016-11-28 2018-07-25 Wangsu Science & Technology Co., Ltd. Node connection method and distributed computing system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7251745B2 (en) * 2003-06-11 2007-07-31 Availigent, Inc. Transparent TCP connection failover
CN108768883B (en) * 2018-05-18 2022-04-22 新华三信息安全技术有限公司 Network traffic identification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
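Research on high-availability applications based on keepalived; Wang Haiyang et al.; Electronic Technology (《电子技术》); 2014-07-25 (No. 7); main text, pp. 21-24 *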

Also Published As

Publication number Publication date
CN109617761A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
US10491671B2 (en) Method and apparatus for switching between servers in server cluster
CN109286529B (en) Method and system for recovering RabbitMQ network partition
CN105933407B (en) Method and system for realizing high availability of Redis cluster
US20090257431A1 (en) Global broadcast communication system
WO2015169199A1 (en) Anomaly recovery method for virtual machine in distributed environment
CN102882704B (en) Link protection method and device in the ISSU soft-reboot upgrade process
CN111385107B (en) Main/standby switching processing method and device for server
CN102355368A (en) Fault processing method of network equipment and system
CN104036043A (en) High availability method of MYSQL and managing node
WO2022088861A1 (en) Database fault handling method and apparatus
CN111338858B (en) Disaster recovery method and device for double machine rooms
CN107566036B (en) Automatically detecting an error in a communication and automatically determining a source of the error
CN113377702B (en) Method and device for starting two-node cluster, electronic equipment and storage medium
JP6421516B2 (en) Server device, redundant server system, information takeover program, and information takeover method
CN109617761B (en) Method and device for switching main server and standby server
CN115842860B (en) Monitoring method, device and system for data link
CN111078454A (en) Cloud platform configuration recovery method and device
CN112491633B (en) Fault recovery method, system and related components of multi-node cluster
CN114840495A (en) Database cluster split-brain prevention method, storage medium and device
CN113590434A (en) Cluster alarm method, system, device and medium
CN112468330A (en) Method, system, equipment and medium for setting fault node
CN107783855B (en) Fault self-healing control device and method for virtual network element
CN115190040B (en) High-availability realization method and device for virtual machine
CN116506327B (en) Physical node monitoring method, device, computer equipment and storage medium
CN115499296B (en) Cloud desktop hot standby management method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant