CN106293874A

CN106293874A - Method and device for monitoring a high-availability cluster

Info

Publication number: CN106293874A
Application number: CN201610619309.8A
Authority: CN
Inventors: 余乐宽
Original assignee: Inspur Beijing Electronic Information Industry Co Ltd
Current assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2016-07-29
Filing date: 2016-07-29
Publication date: 2017-01-04

Abstract

The invention discloses a method and device for monitoring a high-availability cluster. The method monitors the operating state of each server in the cluster and of the virtual machines running on those servers; if a server or virtual machine is detected to be in an abnormal operating state, it is determined to be a failed server or a faulty virtual machine; the faulty virtual machine is then rapidly recovered, or the virtual machines running on the failed server are migrated to other servers in the cluster. By monitoring servers and virtual machines and feeding their operating state back to a virtual machine high-availability listener, the application enables the listener to recover a failed or crashed server or virtual machine quickly and accurately. The architecture is simple, economical, practical and efficient; it scales well, continuing to meet functional and performance requirements as the cluster grows; it detects failed nodes and completes switchover quickly; and it improves operation and maintenance efficiency while reducing maintenance cost.

Description

Method and device for monitoring a high-availability cluster

Technical field

The present invention relates to the technical field of virtualization, and in particular to a method and device for monitoring a high-availability cluster.

Background technology

With the rapid development of virtualization technology, virtual machines are used ever more widely. As customers' requirements for the reliability and continuous operation of virtual machines keep rising, ensuring that their applications run without interruption has become a technical problem that those skilled in the art urgently need to solve. To address this, virtualization products offer a variety of high-availability schemes for virtual machines, but most of these schemes still fail to meet customer needs and cannot guarantee uninterrupted operation.

Summary of the invention

The object of the present invention is to provide a method and device for monitoring a high-availability cluster which, by monitoring servers and virtual machines and feeding their operating state back to a virtual machine high-availability listener, enable the listener to recover a failed or crashed server or virtual machine quickly and accurately.

To solve the above technical problem, the present invention provides a method for monitoring a high-availability cluster, including:

Monitoring the operating state of each server in the cluster and of the virtual machines running on the server;

If the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine;

Rapidly recovering the faulty virtual machine, or migrating the virtual machines running on the failed server to other servers in the cluster.

Optionally, monitoring the operating state of each server in the cluster and of the virtual machines running on the server includes:

Using a heartbeat mechanism, monitoring the heartbeat messages of the server, and receiving, within a preset time, the heartbeat messages of the virtual machines running on the server.

Optionally, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine includes:

When no heartbeat message sent by the virtual machine is received within the preset time, determining that the virtual machine is a faulty virtual machine.

If the virtual machine is detected to be in an abnormal operating state, sending a probe confirmation message to the virtual machine;

If the virtual machine responds to the probe confirmation message, confirming that the virtual machine is normal and taking no action; if the virtual machine does not respond to the probe confirmation message, confirming that the virtual machine is a faulty virtual machine.

If the server or the virtual machine is detected to be in an abnormal operating state, sending fault information directly from the bottom layer to the upper-layer virtualization management platform.

The present invention also provides a device for monitoring a high-availability cluster, including:

A monitoring module, configured to monitor the operating state of each server in the cluster and of the virtual machines running on the server;

A determining module, configured to determine, when the server or the virtual machine is detected to be in an abnormal operating state, that the server or the virtual machine is a failed server or a faulty virtual machine;

A processing module, configured to rapidly recover the faulty virtual machine, or to migrate the virtual machines running on the failed server to other servers in the cluster.

Optionally, the monitoring module is specifically configured to:

Optionally, the determining module is specifically configured to:

If the virtual machine is detected to be in an abnormal operating state, send a probe confirmation message to the virtual machine; if the virtual machine responds to the probe confirmation message, confirm that the virtual machine is normal and take no action; if the virtual machine does not respond to the probe confirmation message, confirm that the virtual machine is a faulty virtual machine.

Optionally, the determining module is further configured to:

The method and device for monitoring a high-availability cluster provided by the present invention monitor the operating state of each server in the cluster and of the virtual machines running on the server; if a server or virtual machine is detected to be in an abnormal operating state, it is determined to be a failed server or a faulty virtual machine; the faulty virtual machine is rapidly recovered, or the virtual machines running on the failed server are migrated to other servers in the cluster. By monitoring servers and virtual machines and feeding their operating state back to a virtual machine high-availability listener, the method and device enable the listener to recover a failed or crashed server or virtual machine quickly and accurately. The application is suited to environments that require highly available system services. The architecture is simple, economical, practical and efficient; it scales well, continuing to meet functional and performance requirements as the cluster grows; it detects failed nodes and completes switchover quickly; and it improves operation and maintenance efficiency while reducing maintenance cost.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of one specific embodiment of the method for monitoring a high-availability cluster provided by the present invention;

Fig. 2 is a flow chart of another specific embodiment of the method for monitoring a high-availability cluster provided by the present invention;

Fig. 3 is a structural block diagram of the device for monitoring a high-availability cluster provided by an embodiment of the present invention.

Detailed description of the embodiments

To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

The flow chart of one specific embodiment of the method for monitoring a high-availability cluster provided by the present invention is shown in Fig. 1. The method includes:

Step S101: monitor the operating state of each server in the cluster and of the virtual machines running on the server;

Step S102: if the server or the virtual machine is detected to be in an abnormal operating state, determine that the server or the virtual machine is a failed server or a faulty virtual machine;

An abnormal operating state of a server or virtual machine means a crash or a fault, that is, a state of the virtual machine or server in which the customer's related applications cannot be used normally.

Step S103: rapidly recover the faulty virtual machine, or migrate the virtual machines running on the failed server to other servers in the cluster.

The method for monitoring a high-availability cluster provided by the present invention monitors the operating state of each server in the cluster and of the virtual machines running on the server; if a server or virtual machine is detected to be in an abnormal operating state, it is determined to be a failed server or a faulty virtual machine; the faulty virtual machine is rapidly recovered, or the virtual machines running on the failed server are migrated to other servers in the cluster. By monitoring servers and virtual machines and feeding their operating state back to a virtual machine high-availability listener, the method enables the listener to recover a failed or crashed server or virtual machine quickly and accurately. The application is suited to environments that require highly available system services. The architecture is simple, economical, practical and efficient; it scales well, continuing to meet functional and performance requirements as the cluster grows; it detects failed nodes and completes switchover quickly; and it improves operation and maintenance efficiency while reducing maintenance cost.

On the basis of the above embodiment, in the method for monitoring a high-availability cluster provided by the present invention, monitoring the operating state of each server in the cluster and of the virtual machines running on the server may specifically be:

It should be noted that the heartbeat mechanism works as follows: each server is both a supervisor and a supervised party. While monitoring the virtual machines (VMs) running on it, it must also send heartbeat messages to the platform's main monitoring service. As a supervisor, when it does not receive the heartbeat message of a VM running on it within the specified time, it pushes fault information to the platform's main monitoring service through a fast channel. As a supervised party, the server must periodically send heartbeat messages to the platform's main monitoring service to show that it is alive. Heartbeats are the basis on which the cluster servers maintain high availability.
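By way of illustration only, the heartbeat bookkeeping described above might be sketched in Python as follows. The class name HeartbeatTracker, the 5-second timeout and the identifiers vm-1 and vm-2 are assumptions made for this sketch; the disclosure does not specify data structures or time values. Time is passed in explicitly so that the behaviour is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatTracker:
    """Records the last heartbeat time per sender and reports overdue senders."""

    timeout: float  # hypothetical "specified time", in seconds
    last_seen: dict[str, float] = field(default_factory=dict)

    def beat(self, sender: str, now: float) -> None:
        # Record that `sender` was alive at time `now`.
        self.last_seen[sender] = now

    def overdue(self, now: float) -> list[str]:
        # Senders whose most recent heartbeat is older than the timeout.
        return sorted(s for s, t in self.last_seen.items() if now - t > self.timeout)


# A server acting as supervisor of the VMs running on it.
vm_tracker = HeartbeatTracker(timeout=5.0)
vm_tracker.beat("vm-1", now=0.0)
vm_tracker.beat("vm-2", now=0.0)
vm_tracker.beat("vm-1", now=4.0)  # vm-2 stays silent after its first heartbeat

suspects = vm_tracker.overdue(now=6.0)  # only vm-2 has exceeded the timeout
```

The same tracker could equally be held by the platform's main monitoring service to supervise the servers themselves, which reflects the dual supervisor and supervised role of each server.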

In this embodiment, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine specifically includes: when no heartbeat message sent by the virtual machine is received within the preset time, determining that the virtual machine is a faulty virtual machine.

Further, if the server or the virtual machine is detected to be in an abnormal operating state, the process of determining that the server or the virtual machine is a failed server or a faulty virtual machine may include:

On the basis of the above embodiment, in the method for monitoring a high-availability cluster provided by the present invention, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine may further include:

In this embodiment, when a server or virtual machine enters an abnormal operating state such as a fault or crash, a mechanism that pushes messages directly from the bottom layer to the upper-layer virtualization management platform is used. The message passes through no intermediate processing, so the information is fed back as soon as it is obtained and feedback is not delayed. In virtualized cluster technology, by monitoring servers and virtual machines, the operating state of servers and virtual machines is fed back rapidly through a fast feedback channel to the virtual machine high-availability listener, enabling it to recover a failed or crashed server or virtual machine quickly and accurately.
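As a rough sketch of the push mechanism described above, the following Python fragment delivers a fault report to the upper layer by direct callback, with no queue or polling loop in between. The name FaultChannel, its methods and the report fields are hypothetical; the disclosure does not describe the transport actually used.

```python
from typing import Callable

# A handler receives (source_id, reason); both fields are illustrative.
FaultHandler = Callable[[str, str], None]


class FaultChannel:
    """Hypothetical fast channel: pushes fault reports to subscribers at once."""

    def __init__(self) -> None:
        self._handlers: list[FaultHandler] = []

    def subscribe(self, handler: FaultHandler) -> None:
        # The upper-layer management platform registers its handler here.
        self._handlers.append(handler)

    def push(self, source_id: str, reason: str) -> None:
        # No queueing or intermediate processing: call every handler directly.
        for handler in self._handlers:
            handler(source_id, reason)


received: list[tuple[str, str]] = []
channel = FaultChannel()
channel.subscribe(lambda source, reason: received.append((source, reason)))

# The bottom layer reports a fault as soon as it is observed.
channel.push("vm-2", "heartbeat timeout")
```

In a real deployment the callback would cross a process or network boundary; the sketch shows only the design choice of pushing rather than waiting to be polled.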

The flow chart of another specific embodiment of the method for monitoring a high-availability cluster provided by the present invention is shown in Fig. 2. The method includes:

Step S201: the servers are initialized, and a monitoring system is installed on each physical node;

Step S202: each physical machine periodically sends heartbeat messages to the platform's main monitoring service and, within a specified time, receives the heartbeat messages sent by the VMs running on it;

Step S203: when a physical machine does not receive the heartbeat message sent by a virtual machine within the specified time, it sends a fault report to the main monitoring service;

Step S204: after receiving the fault report, the main monitoring service immediately sends a probe confirmation message to the faulty virtual machine;

Step S205: if the VM reported as faulty responds to the probe message of the main monitoring service, the VM is normal and the main monitoring service takes no action; if the VM does not respond to the probe message, it is confirmed that the VM has failed;

Step S206: the main monitoring service rapidly recovers or migrates the failed VM.
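Steps S204 to S206 can be illustrated with a minimal Python sketch. The function handle_fault_report and its probe and recover callbacks are assumptions introduced here; how the probe is sent and how recovery or migration is carried out is not specified in the disclosure, so both are passed in as stand-ins.

```python
from typing import Callable


def handle_fault_report(
    vm_id: str,
    probe: Callable[[str], bool],
    recover: Callable[[str], None],
) -> str:
    """Confirm a reported VM fault with a probe before acting on it."""
    if probe(vm_id):
        # S205: the VM answered the probe, so it is treated as normal.
        return "normal"
    # S205/S206: no answer, so the fault is confirmed and recovery is started.
    recover(vm_id)
    return "recovered"


responsive = {"vm-1"}        # stand-in for VMs that answer the probe
recovered: list[str] = []    # stand-in for the recovery or migration action

result_ok = handle_fault_report("vm-1", lambda v: v in responsive, recovered.append)
result_bad = handle_fault_report("vm-2", lambda v: v in responsive, recovered.append)
```

The second confirmation step means that a single missed heartbeat report does not by itself trigger recovery; only a VM that also fails to answer the probe is acted upon.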

The embodiment of the present application uses a VM high-availability service listener to monitor the operating state of virtual machines and servers fed back through the fast feedback channel. If a server or virtual machine fails or crashes, the VM high-availability service quickly and accurately recovers the crashed or faulty virtual machine, or migrates it to another server in the cluster, thereby achieving high availability of the cluster.

In the present application, the VM high-availability service listener is a system service of the virtualization management platform. The listener runs continuously, monitors the operating state of virtual machines and servers, and protects virtual machines quickly and accurately.

The device for monitoring a high-availability cluster provided by the embodiment of the present invention is introduced below. The device described below and the method described above may be cross-referenced with each other.

Fig. 3 is a structural block diagram of the device for monitoring a high-availability cluster provided by the embodiment of the present invention. With reference to Fig. 3, the device for monitoring a high-availability cluster may include:

A monitoring module 100, configured to monitor the operating state of each server in the cluster and of the virtual machines running on the server;

A determining module 200, configured to determine, when the server or the virtual machine is detected to be in an abnormal operating state, that the server or the virtual machine is a failed server or a faulty virtual machine;

A processing module 300, configured to rapidly recover the faulty virtual machine, or to migrate the virtual machines running on the failed server to other servers in the cluster.
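One possible way to arrange the three modules in software is sketched below in Python for the server-failure case. The class names, the Cluster data model and the round-robin choice of migration target are all assumptions made for illustration; the disclosure does not state how target servers are selected.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Server name -> VMs placed on it (illustrative data model).
    placement: dict[str, list[str]] = field(default_factory=dict)


class MonitoringModule:  # corresponds to module 100
    def __init__(self) -> None:
        self.silent: set[str] = set()

    def report_silent(self, server: str) -> None:
        # Called when a server's heartbeat has not arrived in time.
        self.silent.add(server)


class DeterminingModule:  # corresponds to module 200
    def failed_servers(self, monitor: MonitoringModule, cluster: Cluster) -> list[str]:
        return sorted(s for s in monitor.silent if s in cluster.placement)


class ProcessingModule:  # corresponds to module 300
    def migrate_all(self, cluster: Cluster, failed: list[str]) -> None:
        # Assumption: spread the VMs round-robin over the remaining servers.
        targets = sorted(s for s in cluster.placement if s not in failed)
        if not targets:
            raise RuntimeError("no healthy server left to migrate to")
        i = 0
        for server in failed:
            for vm in cluster.placement.pop(server):
                cluster.placement[targets[i % len(targets)]].append(vm)
                i += 1


cluster = Cluster({"host-a": ["vm-1", "vm-2", "vm-3"], "host-b": [], "host-c": ["vm-4"]})
monitor = MonitoringModule()
monitor.report_silent("host-a")

failed = DeterminingModule().failed_servers(monitor, cluster)
ProcessingModule().migrate_all(cluster, failed)
```

After the run, host-a no longer appears in the placement and its three VMs are distributed over host-b and host-c.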

As a specific embodiment, in the device for monitoring a high-availability cluster provided by the present invention, the monitoring module 100 may be specifically configured to:

In this embodiment, the determining module 200 may be specifically configured to:

On the basis of the above embodiment, in the device for monitoring a high-availability cluster provided by the present application, the determining module 200 is specifically configured to:

On the basis of any of the above embodiments, in the device for monitoring a high-availability cluster provided by the present invention, the determining module may also be specifically configured to: if the server or the virtual machine is detected to be in an abnormal operating state, send fault information directly from the bottom layer to the upper-layer virtualization management platform.

By monitoring servers and virtual machines and feeding their operating state back to a virtual machine high-availability listener, the device for monitoring a high-availability cluster provided by the present application enables the listener to recover a failed or crashed server or virtual machine quickly and accurately. The application is suited to environments that require highly available system services. The architecture is simple, economical, practical and efficient; it scales well, continuing to meet functional and performance requirements as the cluster grows; it detects failed nodes and completes switchover quickly; and it improves operation and maintenance efficiency while reducing maintenance cost.

The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

The method and device for monitoring a high-availability cluster provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principle, and such improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims

1. A method for monitoring a high-availability cluster, characterized by comprising:

If the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine;

Rapidly recovering the faulty virtual machine, or migrating the virtual machines running on the failed server to other servers in the cluster.

2. The method for monitoring a high-availability cluster according to claim 1, characterized in that monitoring the operating state of each server in the cluster and of the virtual machines running on the server comprises:

Using a heartbeat mechanism, monitoring the heartbeat messages of the server, and receiving, within a preset time, the heartbeat messages of the virtual machines running on the server.

3. The method for monitoring a high-availability cluster according to claim 2, characterized in that, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine comprises:

4. The method for monitoring a high-availability cluster according to claim 3, characterized in that, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine comprises:

If the virtual machine is detected to be in an abnormal operating state, sending a probe confirmation message to the virtual machine;

If the virtual machine responds to the probe confirmation message, confirming that the virtual machine is normal and taking no action; if the virtual machine does not respond to the probe confirmation message, confirming that the virtual machine is a faulty virtual machine.

5. The method for monitoring a high-availability cluster according to any one of claims 1 to 4, characterized in that, if the server or the virtual machine is detected to be in an abnormal operating state, determining that the server or the virtual machine is a failed server or a faulty virtual machine comprises:

If the server or the virtual machine is detected to be in an abnormal operating state, sending fault information directly from the bottom layer to the upper-layer virtualization management platform.

6. A device for monitoring a high-availability cluster, characterized by comprising:

A monitoring module, configured to monitor the operating state of each server in the cluster and of the virtual machines running on the server;

A processing module, configured to rapidly recover the faulty virtual machine, or to migrate the virtual machines running on the failed server to other servers in the cluster.

7. The device for monitoring a high-availability cluster according to claim 6, characterized in that the monitoring module is specifically configured to:

8. The device for monitoring a high-availability cluster according to claim 7, characterized in that the determining module is specifically configured to:

9. The device for monitoring a high-availability cluster according to claim 8, characterized in that the determining module is specifically configured to:

If the virtual machine is detected to be in an abnormal operating state, send a probe confirmation message to the virtual machine; if the virtual machine responds to the probe confirmation message, confirm that the virtual machine is normal and take no action; if the virtual machine does not respond to the probe confirmation message, confirm that the virtual machine is a faulty virtual machine.

10. The device for monitoring a high-availability cluster according to any one of claims 6 to 9, characterized in that the determining module is further configured to: