CN102043688B - Hot standby method and device used for blade server - Google Patents


Info

Publication number
CN102043688B
Authority
CN
China
Prior art keywords
management module
operational management
standby
data transmission
module
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010596201.4A
Other languages
Chinese (zh)
Other versions
CN102043688A (en)
Inventor
王峰
郑谦
张考华
李华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-12-10
Filing date
2010-12-10
Publication date
2014-04-30
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201010596201.4A
Publication of CN102043688A
Application granted
Publication of CN102043688B

Landscapes

  • Hardware Redundancy (AREA)
  • Small-Scale Networks (AREA)

Abstract

The invention relates to a hot standby method and device for a blade server. The hot standby method comprises the following steps: synchronizing synchronization data between an operating management module and a standby management module via an Ethernet interface; monitoring the operating management module for faults; and, when the operating management module fails, having the standby management module take over its work according to the synchronization data. The monitoring step comprises the following substeps: first, checking whether data is being transmitted on the Ethernet interface between the operating management module and the standby management module; second, when no data is transmitted on the Ethernet interface within a first predetermined time, checking whether data is being transmitted on a serial port between the operating management module and the standby management module; and third, when no data is transmitted on the serial port within a second predetermined time, determining that the operating management module has failed. The invention also provides a hot standby device.

Description

Method and device for dual-machine hot standby of a blade server
Technical field
The present invention relates generally to the field of networking and, more specifically, to a method and device for dual-machine hot standby of a blade server.
Background
In current blade server applications, the importance of the management module is self-evident. However, because of the stability of the system itself and factors such as software and hardware, the management module may fail. Once a fault occurs, the blade server can no longer operate normally, and repairing the system takes time. For important service entry points or access points (enterprises, banks, and so on), the current system therefore carries considerable risk when the management module fails. The market thus urgently needs a blade server with a backup mechanism to avoid this situation.
Backup mechanisms from other fields, if applied directly to the present invention, yield a two-machine mechanism with an active/standby relationship, that is, a dual-machine hot standby mechanism. Such a scheme still has significant defects. For example: faults of the operational management module cannot be monitored automatically; the standby management module cannot be started automatically; and the data resources previously received and processed cannot be retained when the operational management module fails and the standby management module is started. These deficiencies waste considerable time and data resources and significantly harm the operation of the blade server.
Summary of the invention
The present invention has been made in view of the above problems.
The invention provides a dual-machine hot standby method comprising the following steps: synchronizing the synchronization data of the operational management module and the standby management module through an Ethernet interface; monitoring the operational management module for faults; and, when a fault occurs, having the standby management module take over the work of the operational management module according to the synchronization data. The monitoring comprises: step 1, checking whether data is being transmitted on the Ethernet interface between the operational management module and the standby management module; step 2, when no data is transmitted on the Ethernet interface within a first predetermined time, checking whether data is being transmitted on the serial port between the operational management module and the standby management module; step 3, when no data is transmitted on the serial port within a second predetermined time, determining that the operational management module has failed.
Step 2 further comprises: when data is transmitted on the Ethernet interface within the first predetermined time, returning to step 1.
Step 3 further comprises: when data is transmitted on the serial port within the second predetermined time, returning to step 1.
The synchronization data comprises IP and time.
The monitoring further comprises: checking whether the processor of the operational management module is working within a third predetermined time, and determining that a fault has occurred when the processor is not working within the third predetermined time.
In addition, the present invention provides a dual-machine hot standby device comprising: a synchronization module for synchronizing the synchronization data of the operational management module and the standby management module through an Ethernet interface; a monitoring module for monitoring the operational management module for faults; and a substitution module for having the standby management module take over the work of the operational management module according to the synchronization data when a fault occurs. The monitoring module comprises: a first checking submodule for checking whether data is transmitted on the Ethernet interface between the operational management module and the standby management module within a first predetermined time; a second checking submodule for checking, when no data is transmitted on the Ethernet interface, whether data is transmitted on the serial port between the operational management module and the standby management module within a second predetermined time; and a fault determination submodule for determining that the operational management module has failed when no data is transmitted on the serial port.
The synchronization data comprises IP and time.
The monitoring module further comprises: a third checking submodule for checking whether the processor of the operational management module is working within a third predetermined time, and for determining that a fault has occurred when the processor is not working within the third predetermined time.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the specification or be understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained by the structure particularly pointed out in the written specification, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the dual-machine hot standby method according to an exemplary embodiment of the present invention;
Fig. 2 is a flowchart of the monitoring step in the dual-machine hot standby method according to an exemplary embodiment of the present invention;
Fig. 3 is a block diagram of the dual-machine hot standby device according to an exemplary embodiment of the present invention; and
Fig. 4 is a flowchart of the monitoring module in the dual-machine hot standby device according to an exemplary embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the dual-machine hot standby method according to an exemplary embodiment of the present invention. As shown in Fig. 1, the method may comprise: S101, synchronizing the synchronization data of the operational management module and the standby management module through an Ethernet interface; S103, monitoring the operational management module for faults; S105, when a fault occurs, having the standby management module take over the work of the operational management module according to the synchronization data. Step S103 may have several sub-steps.
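As a rough illustration of how steps S101, S103 and S105 fit together, the following Python sketch runs one pass of the method. It is not the patented implementation: the `ManagementModule` class, the `role` strings and the `has_failed` callback are hypothetical names introduced here, and the in-memory copy merely stands in for synchronization over the Ethernet interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ManagementModule:
    """Hypothetical stand-in for one management module of the blade server."""
    name: str
    role: str  # "operational", "standby" or "failed" (labels are assumptions)
    state: Dict[str, str] = field(default_factory=dict)


def synchronize(operational: ManagementModule, standby: ManagementModule) -> None:
    # S101: copy the synchronization data to the standby module.
    # A real system would send this over the Ethernet interface.
    standby.state = dict(operational.state)


def hot_standby_cycle(
    operational: ManagementModule,
    standby: ManagementModule,
    has_failed: Callable[[ManagementModule], bool],
) -> ManagementModule:
    """Run one S101 -> S103 -> S105 pass and return the module now in charge."""
    synchronize(operational, standby)   # S101
    if has_failed(operational):         # S103
        standby.role = "operational"    # S105: take over using the synced state
        operational.role = "failed"
        return standby
    return operational
```

Because the standby module already holds a copy of the synchronization data before the fault is detected, the takeover in S105 needs no further contact with the failed module.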
Fig. 2 is a flowchart of the monitoring step in the dual-machine hot standby method according to an exemplary embodiment of the present invention. As shown in Fig. 2, monitoring step S103 comprises: S1031, checking whether data is being transmitted on the Ethernet interface between the operational management module and the standby management module; S1033, when no data is transmitted on the Ethernet interface within a first predetermined time, checking whether data is being transmitted on the serial port between the operational management module and the standby management module; S1035, when no data is transmitted on the serial port within a second predetermined time, determining that the operational management module has failed.
Step S1033 further comprises: when data is transmitted on the Ethernet interface within the first predetermined time, returning to step S1031. Step S1035 further comprises: when data is transmitted on the serial port within the second predetermined time, returning to step S1031.
Monitoring step S103 further comprises: checking whether the processor of the operational management module is working within a third predetermined time, and determining that a fault has occurred when the processor is not working within the third predetermined time.
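The Ethernet and serial-port sub-steps can be sketched as one pass of a small decision routine. This is a minimal illustration, not the patented implementation: the `probe` callback, the link names `"ethernet"` and `"serial"`, and the function name `monitor_once` are hypothetical, and the default timeouts are the example values of 60 and 5 seconds given later in this description.

```python
from typing import Callable

# Hypothetical probe: returns True if any data is seen on the named link
# before the timeout (in seconds) expires, otherwise False.
Probe = Callable[[str, float], bool]


def monitor_once(
    probe: Probe,
    first_timeout_s: float = 60.0,
    second_timeout_s: float = 5.0,
) -> bool:
    """Return True if the operational management module is judged to have failed."""
    # S1031/S1033: Ethernet traffic within the first predetermined time means
    # the module is alive, so monitoring goes back to S1031.
    if probe("ethernet", first_timeout_s):
        return False
    # S1033/S1035: Ethernet was silent, so fall back to the serial port.
    if probe("serial", second_timeout_s):
        return False
    # S1035: both links stayed silent for their predetermined times.
    return True
```

A caller would invoke `monitor_once` in a loop; a `False` result corresponds to returning to S1031, and a `True` result would trigger the takeover in S105. Checking the serial port only after Ethernet goes silent avoids declaring a fault when only the Ethernet link itself is down.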
The above method provides the following functions:
1) Data such as IP and time are synchronized between the two management modules through the Ethernet interface.
2) The two management modules monitor each other's status over two channels, Ethernet and the serial port, and handle faults correctly when they occur.
3) When the operational management module fails, the standby management module can be smoothly promoted to operational management module according to the saved information, ensuring stable system operation.
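The synchronization and promotion functions listed above can be sketched as follows. The source only states that the synchronization data includes IP and time; the JSON encoding, the ISO 8601 time string, and the names `SyncData`, `encode`, `decode` and `StandbyStore` are assumptions made for this illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SyncData:
    ip: str
    time: str  # ISO 8601 timestamp (the format is an assumption)


def encode(sync: SyncData) -> bytes:
    """Serialize the synchronization data for sending over Ethernet."""
    return json.dumps(asdict(sync), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> SyncData:
    """Rebuild the synchronization data on the receiving side."""
    return SyncData(**json.loads(payload.decode("utf-8")))


class StandbyStore:
    """Keeps the most recent payload so the standby can take over with it."""

    def __init__(self) -> None:
        self._last: Optional[bytes] = None

    def on_sync(self, payload: bytes) -> None:
        self._last = payload

    def promote(self) -> SyncData:
        # Promotion relies only on previously saved information.
        if self._last is None:
            raise RuntimeError("no synchronization data received yet")
        return decode(self._last)
```

Keeping the last payload on the standby side is what allows promotion to proceed even though the failed module can no longer be queried.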
A control mechanism is also provided for the situation in which both management modules are activated at the same time in hardware. That is, when both management modules are active, at least one of the following mechanisms can be used:
1) stop all management modules and restart them;
2) keep the primary management module according to a preset active/standby priority and stop the other management module;
3) score the current performance of the two management modules, select one as the operational management module according to a predetermined policy, and stop the other (for example, if the first management module performs better than the second, the first becomes the operational management module).
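The three mechanisms above can be sketched as a single selection function. The source does not say how priority or performance is represented, so the `Policy` enum, the `Module` fields and the "lower number wins" priority convention are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Policy(Enum):
    RESTART_ALL = 1      # mechanism 1: stop every module and restart
    PRESET_PRIORITY = 2  # mechanism 2: keep the preset primary module
    PERFORMANCE = 3      # mechanism 3: keep the better-scoring module


@dataclass
class Module:
    name: str
    priority: int  # lower number = preferred primary (assumed convention)
    score: float   # performance score; how it is computed is not specified


def resolve_dual_active(a: Module, b: Module, policy: Policy) -> List[str]:
    """Return the names of the modules that stay operational."""
    if policy is Policy.RESTART_ALL:
        return []  # both are stopped; roles are decided again after restart
    if policy is Policy.PRESET_PRIORITY:
        return [min((a, b), key=lambda m: m.priority).name]
    return [max((a, b), key=lambda m: m.score).name]
```

For example, with `a = Module("mm-a", 0, 0.7)` and `b = Module("mm-b", 1, 0.9)`, the preset-priority policy keeps `mm-a` while the performance policy keeps `mm-b`.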
In addition, Fig. 3 is a block diagram of the dual-machine hot standby device according to an exemplary embodiment of the present invention. As shown in Fig. 3, the device may comprise: a synchronization module 301 for synchronizing the synchronization data of the operational management module and the standby management module through an Ethernet interface; a monitoring module 303 for monitoring the operational management module for faults; and a substitution module 305 for having the standby management module take over the work of the operational management module according to the synchronization data when a fault occurs.
Fig. 4 is a flowchart of the monitoring module in the dual-machine hot standby device according to an exemplary embodiment of the present invention. As shown in Fig. 4, monitoring module 303 comprises: a first checking submodule 3031 for checking whether data is transmitted on the Ethernet interface between the operational management module and the standby management module within a first predetermined time; a second checking submodule 3033 for checking, when no data is transmitted on the Ethernet interface, whether data is transmitted on the serial port between the operational management module and the standby management module within a second predetermined time; and a fault determination submodule 3035 for determining that the operational management module has failed when no data is transmitted on the serial port.
In addition, monitoring module 303 may further comprise: a third checking submodule (not shown) for checking whether the processor of the operational management module is working within a third predetermined time, and for determining that a fault has occurred when the processor is not working within the third predetermined time.
In the present invention, the first predetermined time may be 60 seconds, the second predetermined time may be 5 seconds, and the third predetermined time may be less than 1 second. Depending on design requirements, the three predetermined times may be equal or unequal.
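One way to carry these three values through an implementation is a small configuration object. The sketch below is illustrative only: the class and field names are hypothetical, and 0.9 seconds is an assumed concrete value for "less than 1 second".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorTimeouts:
    ethernet_s: float = 60.0   # first predetermined time
    serial_s: float = 5.0      # second predetermined time
    processor_s: float = 0.9   # third predetermined time (assumed value below 1 s)

    def __post_init__(self) -> None:
        # Reject non-positive timeouts, which would make every check fail at once.
        for name in ("ethernet_s", "serial_s", "processor_s"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")


def processor_stalled(seconds_since_heartbeat: float, timeouts: MonitorTimeouts) -> bool:
    """True if the processor showed no activity for the third predetermined time."""
    return seconds_since_heartbeat >= timeouts.processor_s
```

Since the three times may be equal or unequal, a deployment could construct, for instance, `MonitorTimeouts(ethernet_s=5.0, serial_s=5.0)` without changing the monitoring logic.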
The above are merely preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A dual-machine hot standby method, comprising: synchronizing synchronization data of an operational management module and a standby management module; monitoring the operational management module for faults; and, when a fault occurs, having the standby management module take over the work of the operational management module according to the synchronization data;
characterized in that:
the step of synchronizing the synchronization data of the operational management module and the standby management module is specifically performed through an Ethernet interface between the operational management module and the standby management module;
wherein the monitoring comprises:
step 1, checking whether data is being transmitted on the Ethernet interface between the operational management module and the standby management module;
step 2, when no data is transmitted on the Ethernet interface within a first predetermined time, checking whether data is being transmitted on a serial port between the operational management module and the standby management module;
step 3, when no data is transmitted on the serial port within a second predetermined time, determining that the operational management module has failed.
2. The method according to claim 1, characterized in that step 2 further comprises:
when data is transmitted on the Ethernet interface within the first predetermined time, returning to step 1.
3. The method according to claim 1, characterized in that step 3 further comprises:
when data is transmitted on the serial port within the second predetermined time, returning to step 1.
4. The method according to any one of claims 1 to 3, characterized in that the synchronization data comprises IP and time.
5. The method according to any one of claims 1 to 3, characterized in that the monitoring further comprises:
checking whether the processor of the operational management module is working within a third predetermined time, and determining that a fault has occurred when the processor is not working within the third predetermined time.
6. A dual-machine hot standby device, comprising: a synchronization module for synchronizing synchronization data of an operational management module and a standby management module; a monitoring module for monitoring the operational management module for faults; and a substitution module for having the standby management module take over the work of the operational management module according to the synchronization data when a fault occurs;
characterized in that:
the synchronization module specifically synchronizes the synchronization data of the operational management module and the standby management module through an Ethernet interface between the operational management module and the standby management module;
wherein the monitoring module comprises:
a first checking submodule for checking whether data is transmitted on the Ethernet interface between the operational management module and the standby management module within a first predetermined time;
a second checking submodule for checking, when no data is transmitted on the Ethernet interface, whether data is transmitted on a serial port between the operational management module and the standby management module within a second predetermined time; and
a fault determination submodule for determining that the operational management module has failed when no data is transmitted on the serial port.
7. The device according to claim 6, characterized in that the synchronization data comprises IP and time.
8. The device according to claim 6, characterized in that the monitoring module further comprises:
a third checking submodule for checking whether the processor of the operational management module is working within a third predetermined time, and for determining that a fault has occurred when the processor is not working within the third predetermined time.
CN201010596201.4A 2010-12-10 2010-12-10 Hot standby method and device used for blade server Active CN102043688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010596201.4A CN102043688B (en) 2010-12-10 2010-12-10 Hot standby method and device used for blade server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010596201.4A CN102043688B (en) 2010-12-10 2010-12-10 Hot standby method and device used for blade server

Publications (2)

Publication Number Publication Date
CN102043688A CN102043688A (en) 2011-05-04
CN102043688B CN102043688B (en) 2014-04-30

Family

ID=43909841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010596201.4A Active CN102043688B (en) 2010-12-10 2010-12-10 Hot standby method and device used for blade server

Country Status (1)

Country Link
CN (1) CN102043688B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9229825B2 (en) 2013-06-28 2016-01-05 International Business Machines Corporation Quick failover of blade server

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1308278A (en) * 2001-02-15 2001-08-15 华中科技大学 IP fault-tolerant method for colony server
CN2726011Y (en) * 2004-09-14 2005-09-14 中国人民解放军上海警备区司令部指挥自动化工作站 Running control system for data fault tolerance back-up

Also Published As

Publication number Publication date
CN102043688A (en) 2011-05-04

Similar Documents

Publication Publication Date Title
CN106331098B (en) Server cluster system
CN103199972B (en) The two-node cluster hot backup changing method realized based on SOA, RS485 bus and hot backup system
US9141491B2 (en) Highly available server system based on cloud computing
CN102546135B (en) Active/standby server switched system and method
CN100426751C (en) Method for ensuring accordant configuration information in cluster system
CN109286529B (en) Method and system for recovering RabbitMQ network partition
CN105471622A (en) High-availability method and system for main/standby control node switching based on Galera
CN103546914A (en) HSS (home subscriber server) master-slave management method and HSS master-slave management device
CN103905247B (en) Two-unit standby method and system based on multi-client judgment
CN102394914A (en) Cluster brain-split processing method and device
US7925761B2 (en) System and method for implementing a dead man dependency technique for cluster resources
CN1322422C (en) Automatic startup of cluster system after occurrence of recoverable error
CN112527567A (en) System disaster tolerance method, device, equipment and storage medium
CN107729213B (en) Background task monitoring method and device
CN103414739B (en) Use Cloud Server automatic monitored control system and the method for automatic drift
CN111726388A (en) Cross-cluster high-availability implementation method, device, system and equipment
CN106294795A (en) A kind of data base's changing method and system
CN105812161A (en) Controller fault backup method and system
CN102043688B (en) Hot standby method and device used for blade server
CN117076196A (en) Database disaster recovery management and control method and device
CN103001802A (en) Method and system for automatically correcting faults of Ethernet ports
CN104346233A (en) Fault recovery method and device for computer system
CN107423167A (en) A kind of ISCSI target redundancy control methods and system based on dual control storage
CN105007293A (en) Double master control network system and double writing method for service request therein
CN101453354A (en) High availability system based on ATCA architecture

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB02 Change of applicant information

Address after: 100193 Beijing, Haidian District, northeast Wang West Road, building 8, No. 36

Applicant after: Dawning Information Industry (Beijing) Co.,Ltd.

Address before: 100084 Beijing Haidian District City Mill Street No. 64

Applicant before: Dawning Information Industry (Beijing) Co.,Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220725

Address after: 100089 building 36, courtyard 8, Dongbeiwang West Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: DAWNING INFORMATION INDUSTRY Co.,Ltd.

Address before: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.