CN110752955A - Seat invariant fault migration system and method - Google Patents

Info

Publication number
CN110752955A
CN110752955A CN201911041633.6A CN201911041633A CN110752955A CN 110752955 A CN110752955 A CN 110752955A CN 201911041633 A CN201911041633 A CN 201911041633A CN 110752955 A CN110752955 A CN 110752955A
Authority
CN
China
Prior art keywords
server unit
unit
seat
network
standby
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911041633.6A
Other languages
Chinese (zh)
Inventor
韩琼
尚晓东
濮约刚
张明庆
孙大东
陈通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Computer Technology and Applications
Original Assignee
Beijing Institute of Computer Technology and Applications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Computer Technology and Applications
Priority to CN201911041633.6A
Publication of CN110752955A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1048 Departure or maintenance mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1053 Group management mechanisms with pre-configuration of logical or physical connections with a determined number of other peers
    • H04L67/1057 Group management mechanisms with pre-configuration of logical or physical connections with a determined number of other peers involving pre-assessment of levels of reputation of peers

Abstract

The invention relates to a seat-invariant fault migration system and method. In the system, the active server unit periodically saves memory snapshots of its virtual machines in the background, while the standby server unit monitors the active server unit by continuously sending heartbeat packets. When a network disconnection is detected, the standby server unit immediately starts another virtual machine to restore the most recent successfully saved virtual machine snapshot, and the display, USB and other interfaces of the standby server unit are switched to the seat of the failed server unit, so that the seat remains unchanged and the state is restored. When one server unit fails, the invention automatically migrates its system and tasks, together with the external devices connected to it (display, keyboard and mouse, serial port and the like), to another preset server unit, which continues running from the original state. This meets operators' requirements that the seat and the connections of external devices remain unchanged.

Description

Seat invariant fault migration system and method
Technical Field
The invention relates to fault maintenance technology for multi-unit servers based on domestic processors, and in particular to a seat-invariant fault migration system and a seat-invariant fault migration method.
Background
With steadily growing national support for the autonomous and controllable computing industry, multi-unit servers based on domestic processors are beginning to be widely deployed in various fields. However, current domestic multi-unit servers generally suffer from low hardware stability and frequent failures, which seriously limits the use of autonomous controllable servers in scenarios that require high reliability.
At present, fault migration for domestic multi-unit servers mainly migrates application services: a service cluster is built and high-availability cluster software runs on it. When the hardware or an application program of one server unit fails, other server units automatically restart the application program, thereby achieving fault migration.
Fault migration in this high-availability cluster mode is suitable only for application services. The application service must be executed again during migration, so continuity of the running state cannot be guaranteed, and the connections of external devices cannot be kept unchanged after migration.
Disclosure of Invention
It is an object of the present invention to provide a seat-invariant fault migration system that solves the above-mentioned problems of the prior art.
The seat-invariant fault migration system of the invention comprises: a plurality of server units, a shared storage unit, a network switching unit and a KVM switching unit. Each server unit connects its display output interface and USB interface to the KVM switching unit, and the KVM switching unit exposes the display output interface, the USB interface and the serial port externally. Each server unit is interconnected with the KVM switching unit through the network switching unit. The network switching unit can be connected to the network switching unit of another server through an external network interface of the whole machine. The active server unit periodically saves memory snapshots of its virtual machines in the background, and the standby server unit monitors the active server unit by continuously sending heartbeat packets. When a network disconnection is detected, the standby server unit immediately starts another virtual machine to restore the most recent successfully saved virtual machine snapshot, and the display, USB and other interfaces of the standby server unit are switched to the seat of the failed server unit, so that the seat remains unchanged and the state is restored.
In an embodiment of the seat-invariant fault migration system according to the present invention, the KVM switching unit supports interface switching by network command and provides a network-to-serial-port function.
In an embodiment of the seat-invariant fault migration system according to the present invention, a domestic operating system and a virtual machine system are installed on each server unit and on the shared storage unit.
In an embodiment of the seat-invariant fault migration system according to the present invention, all virtual machine images are stored on the shared storage unit, and the server units access these images through network sharing.
In an embodiment of the seat-invariant fault migration system according to the present invention, virtual network-to-serial-port software is installed in the virtual machine system.
In an embodiment of the seat-invariant fault migration system according to the present invention, the virtual machine system starts automatically with the physical machine.
In an embodiment of the seat-invariant fault migration system according to the present invention, remote desktop management software is installed in the virtual machine system on the shared storage unit.
The seat-invariant fault migration method of the invention comprises the following steps:
Step 1: start the server;
Step 2: set the active server unit and the standby server unit, and their corresponding priorities;
Step 3: each server unit checks its own role and priority. If its role is not set, it enters a standby state; if it is set as an active server unit, it proceeds to step 4; if it is set as a standby server unit, it proceeds to step 5;
Step 4: periodically save memory snapshots of the virtual machine;
Step 5: send network heartbeat packets to the active server unit to monitor its state, and proceed to step 6 when the standby server unit detects a network interruption;
Step 6: the standby server unit checks its own priority. If it has the first priority, it proceeds directly to step 8; otherwise it proceeds to step 7;
Step 7: send a heartbeat packet to the first-priority standby server unit. If the network of the first-priority standby server unit is abnormal, proceed to step 8; if it is normal, enter a standby state;
Step 8: the standby server unit starts a virtual machine and restores the most recently generated virtual machine snapshot of the highest-priority failed server unit;
Step 9: the standby server unit switches its display and USB interfaces to the seat of the failed server unit by sending a switching command to the KVM switching unit over the network.
The invention provides a fault migration system and method based on a domestic three-unit server. When one server unit fails, its system and tasks, together with the external devices connected to it (display, keyboard, mouse, serial port and the like), are automatically migrated to another preset server unit, which continues running from the original state. This meets operators' requirements that the seat and the connections of external devices remain unchanged.
Drawings
FIG. 1 is a schematic diagram of the seat-invariant fault migration system of the present invention;
FIG. 2 is a flowchart of the seat-invariant fault migration method of the present invention.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
FIG. 1 is a schematic diagram of the seat-invariant fault migration system of the present invention. As shown in FIG. 1, the seat-invariant fault migration system for a domestic three-unit server comprises three server units, a shared storage unit, a network switching unit and a KVM switching unit. Each server unit connects its display output interface and USB interface to the KVM switching unit, which exposes the display output interface, the USB interface, the serial port and the like externally. Each server unit is interconnected with the KVM switching unit through the network switching unit.
As shown in FIG. 1, the KVM switching unit supports interface switching by network command and provides a network-to-serial-port function. Each server unit has a domestic operating system and a virtual machine system installed. All virtual machine images are stored on the shared storage unit, and the server units access them through network sharing. Virtual network-to-serial-port software is installed in the virtual machine system, and the virtual machine system starts automatically with the physical machine.
As shown in FIG. 1, in operation the active server unit periodically saves memory snapshots of its virtual machines in the background, and the standby server unit monitors the active server unit by continuously sending heartbeat packets. When a network disconnection is detected, the standby server unit immediately starts another virtual machine to restore the most recent successfully saved virtual machine snapshot, and the display, USB and other interfaces of the standby server unit are switched to the seat of the failed server unit, so that the seat remains unchanged and the state is restored. During this process the network connection is restored automatically, so the serial port connection is restored automatically as well.
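The heartbeat monitoring described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `probe` callable, the `HeartbeatMonitor` name and the threshold of three consecutive missed replies are all assumptions, since the source only says that a network disconnection is detected.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat replies from one active server unit (illustrative only)."""

    probe: Callable[[], bool]  # hypothetical: True if the active unit answered
    max_missed: int = 3        # assumed threshold; the source gives no number
    missed: int = 0

    def tick(self) -> bool:
        """Send one heartbeat; return True once the unit counts as disconnected."""
        if self.probe():
            self.missed = 0    # any reply resets the counter
        else:
            self.missed += 1
        return self.missed >= self.max_missed


# Usage: a scripted sequence of replies stands in for the real network.
replies = iter([True, False, True, False, False, False])
monitor = HeartbeatMonitor(probe=lambda: next(replies))
results = [monitor.tick() for _ in range(6)]
```

Requiring several consecutive misses is one common way to avoid triggering a takeover on a single dropped packet; a design that reacts to the first miss would set `max_missed` to 1.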
As shown in FIG. 1, the system can be configured with one standby server unit monitoring two active server units, or with two standby server units monitoring one active server unit, and performs fault migration and recovery according to preset priorities.
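The two layouts of a three-unit server can be written down as plain configuration data. The sketch below is hypothetical: the `UnitConfig` type, the unit names and the convention that priority 1 is the highest are assumptions made for illustration, and the source does not specify a configuration format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class UnitConfig:
    name: str       # hypothetical unit identifier
    role: str       # "active" or "standby"
    priority: int   # assumed convention: 1 is the first (highest) priority


def first_priority_standby(units: List[UnitConfig]) -> Optional[UnitConfig]:
    """Return the standby unit that should take over first, if any."""
    standbys = [u for u in units if u.role == "standby"]
    return min(standbys, key=lambda u: u.priority, default=None)


def unit_to_recover(failed: List[UnitConfig]) -> Optional[UnitConfig]:
    """Among failed active units, pick the one with the highest priority."""
    actives = [u for u in failed if u.role == "active"]
    return min(actives, key=lambda u: u.priority, default=None)


# Layout A: one active unit watched by two standby units.
ONE_ACTIVE_TWO_STANDBY = [
    UnitConfig("unit-1", "active", 1),
    UnitConfig("unit-2", "standby", 1),
    UnitConfig("unit-3", "standby", 2),
]

# Layout B: two active units watched by one standby unit.
TWO_ACTIVE_ONE_STANDBY = [
    UnitConfig("unit-1", "active", 1),
    UnitConfig("unit-2", "active", 2),
    UnitConfig("unit-3", "standby", 1),
]
```

In layout B a single standby unit can host only one recovered virtual machine at a time in this sketch, which is why `unit_to_recover` picks the highest-priority failed unit.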
Fig. 2 is a flowchart of a seat invariant fault migration method of the present invention, and as shown in fig. 2, a processing flow of the fault migration method of the present invention includes the following steps:
Step 1: power up the server and start each functional unit normally;
Step 2: set the active server unit and the standby server unit, and their corresponding priorities;
Step 3: each server unit checks its own role and priority. If its role is not set, it enters a standby state; if it is set as an active server unit, it proceeds to step 4; if it is set as a standby server unit, it proceeds to step 5;
Step 4: immediately begin periodically saving memory snapshots of the virtual machine;
Step 5: send network heartbeat packets to the active server unit to monitor its state. When the standby server unit detects a network interruption, proceed to the next step;
Step 6: the standby server unit checks its own priority. If it has the first priority, it proceeds directly to step 8; otherwise it proceeds to the next step;
Step 7: send a heartbeat packet to the first-priority standby server unit. If the network of the first-priority standby server unit is abnormal, proceed to step 8; if it is normal, enter a standby state;
Step 8: the standby server unit starts a virtual machine and restores the most recently generated virtual machine snapshot of the highest-priority failed server unit;
Step 9: the standby server unit switches its display, USB and other interfaces to the seat of the failed server unit by sending a switching command to the KVM switching unit over the network.
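The branching in steps 3 to 8 can be condensed into a single decision function. The sketch below is one reading of the flow, with assumed names (`Action`, `next_action`) and with the network checks reduced to boolean inputs; it does not start virtual machines or talk to a KVM switching unit.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    STAND_BY = "enter standby state"
    SAVE_SNAPSHOTS = "save memory snapshots periodically"
    KEEP_MONITORING = "keep sending heartbeats"
    TAKE_OVER = "restore snapshot, then switch KVM interfaces"


def next_action(
    role: Optional[str],            # "active", "standby", or None if unset
    priority: Optional[int],        # 1 means first priority (assumed convention)
    active_reachable: bool,         # result of heartbeats to the active unit
    first_standby_reachable: bool,  # result of a heartbeat to the first-priority standby
) -> Action:
    # Step 3: dispatch on the configured role.
    if role is None:
        return Action.STAND_BY
    if role == "active":
        return Action.SAVE_SNAPSHOTS        # step 4
    # Step 5: a standby unit keeps monitoring while the active unit answers.
    if active_reachable:
        return Action.KEEP_MONITORING
    # Step 6: the first-priority standby takes over directly.
    if priority == 1:
        return Action.TAKE_OVER             # steps 8 and 9
    # Step 7: a lower-priority standby defers if the first-priority one is alive.
    if first_standby_reachable:
        return Action.STAND_BY
    return Action.TAKE_OVER                 # steps 8 and 9
```

For example, `next_action("standby", 2, False, True)` yields `Action.STAND_BY`, because a healthy first-priority standby is expected to handle the failure.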
The key point in implementing fault migration on the domestic three-unit server is that the virtual machine system can take online memory snapshots in near real time and immediately restore the latest snapshot when needed. Combined with interface switching through the remotely controlled KVM switching unit, this achieves fault recovery while keeping the seat unchanged.
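Because a fault can interrupt a snapshot that is still being written, the standby unit needs the latest snapshot that was saved successfully, not merely the newest one. A minimal sketch of that selection follows; the `Snapshot` record and its `complete` flag are assumptions, as the source does not describe how snapshots are catalogued.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    vm: str           # hypothetical virtual machine identifier
    saved_at: float   # assumed: save time in seconds since the epoch
    complete: bool    # False if the save was interrupted, e.g. by the fault


def latest_good_snapshot(snapshots: Iterable[Snapshot], vm: str) -> Optional[Snapshot]:
    """Return the most recent successfully saved snapshot of the given VM."""
    good = [s for s in snapshots if s.vm == vm and s.complete]
    return max(good, key=lambda s: s.saved_at, default=None)


catalogue = [
    Snapshot("vm-a", 10.0, True),
    Snapshot("vm-a", 20.0, True),
    Snapshot("vm-a", 30.0, False),  # interrupted save, must be skipped
    Snapshot("vm-b", 25.0, True),
]
chosen = latest_good_snapshot(catalogue, "vm-a")
```

Under this scheme the state lost in a failover is bounded by the time since the last completed save, so a shorter snapshot interval brings recovery closer to the near-real-time behaviour the text describes.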
Compared with the prior art, the technical method provided by the invention improves the operational reliability of domestic servers. When a fault occurs, the system automatically restores the running state from before the fault and keeps the connections of peripheral devices unchanged, so the user's work is not affected.
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make modifications and variations without departing from the technical principle of the present invention, and such modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (8)

1. A seat-invariant fault migration system, comprising: a plurality of server units, a shared storage unit, a network switching unit and a KVM switching unit; each server unit connects its display output interface and USB interface to the KVM switching unit, and the KVM switching unit exposes the display output interface, the USB interface and the serial port externally; each server unit is interconnected with the KVM switching unit through the network switching unit; the network switching unit can be connected to the network switching unit of another server through an external network interface of the whole machine;
the active server unit periodically saves memory snapshots of its virtual machines in the background, and the standby server unit monitors the active server unit by continuously sending heartbeat packets; when a network disconnection is detected, the standby server unit immediately starts another virtual machine to restore the most recent successfully saved virtual machine snapshot, and the display, USB and other interfaces of the standby server unit are switched to the seat of the failed server unit, so that the seat remains unchanged and the state is restored.
2. The seat-invariant fault migration system of claim 1, wherein the KVM switching unit supports interface switching by network command and provides a network-to-serial-port function.
3. The seat-invariant fault migration system of claim 1, wherein a domestic operating system and a virtual machine system are installed on each server unit and on the shared storage unit.
4. The seat-invariant fault migration system of claim 1, wherein all virtual machine images are stored on the shared storage unit, and the server units access the images through network sharing.
5. The seat-invariant fault migration system of claim 1, wherein virtual network-to-serial-port software is installed in the virtual machine system.
6. The seat-invariant fault migration system of claim 1, wherein the virtual machine system starts automatically with the physical machine.
7. The seat-invariant fault migration system of claim 1, wherein remote desktop management software is installed in the virtual machine system on the shared storage unit.
8. A seat-invariant fault migration method using the system of any one of claims 1-7, comprising:
step 1: starting the server;
step 2: setting the active server unit and the standby server unit, and their corresponding priorities;
step 3: each server unit checks its own role and priority, and enters a standby state if its role is not set; if the server unit is set as an active server unit, proceeding to step 4; if the server unit is set as a standby server unit, proceeding to step 5;
step 4: periodically saving memory snapshots of the virtual machine;
step 5: sending network heartbeat packets to the active server unit to monitor its state, and proceeding to step 6 when the standby server unit detects a network interruption;
step 6: the standby server unit checks its own priority, and proceeds directly to step 8 if it has the first priority; otherwise proceeding to step 7;
step 7: sending a heartbeat packet to the first-priority standby server unit; if the network of the first-priority standby server unit is abnormal, proceeding to step 8; if it is normal, entering a standby state;
step 8: the standby server unit starts a virtual machine and restores the most recently generated virtual machine snapshot of the highest-priority failed server unit;
step 9: the standby server unit switches its display and USB interfaces to the seat of the failed server unit by sending a switching command to the KVM switching unit over the network.
CN201911041633.6A 2019-10-30 2019-10-30 Seat invariant fault migration system and method Pending CN110752955A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911041633.6A CN110752955A (en) 2019-10-30 2019-10-30 Seat invariant fault migration system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911041633.6A CN110752955A (en) 2019-10-30 2019-10-30 Seat invariant fault migration system and method

Publications (1)

Publication Number Publication Date
CN110752955A true CN110752955A (en) 2020-02-04

Family

ID=69281065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911041633.6A Pending CN110752955A (en) 2019-10-30 2019-10-30 Seat invariant fault migration system and method

Country Status (1)

Country Link
CN (1) CN110752955A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112462955A (en) * 2021-01-25 2021-03-09 北京小鸟科技股份有限公司 Multi-output node control method, system and equipment of distributed KVM (keyboard video mouse) seat

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101136838A (en) * 2006-08-29 2008-03-05 华为技术有限公司 Bridge mode elastic grouping ring transannular bridge equipment redundancy protecting method
US20080189700A1 (en) * 2007-02-02 2008-08-07 Vmware, Inc. Admission Control for Virtual Machine Cluster
CN101729426A (en) * 2009-12-29 2010-06-09 中兴通讯股份有限公司 Method and system for quickly switching between master device and standby device of virtual router redundancy protocol (VRRP)
CN103067176A (en) * 2013-01-11 2013-04-24 浪潮集团有限公司 Safety authentication method applied to multi-unit server management
CN108255639A (en) * 2017-12-12 2018-07-06 深圳市科思科技股份有限公司 A kind of server system
CN108469996A (en) * 2018-03-13 2018-08-31 山东超越数控电子股份有限公司 A kind of system high availability method based on auto snapshot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200204