
CN103595771A - Method for controlling and managing parallel service groups in cluster

Info

Publication number: CN103595771A
Application number: CN201310530438.6A
Authority: CN
Inventors: 王婷
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2013-11-01
Filing date: 2013-11-01
Publication date: 2014-02-19

Abstract

Disclosed is a method for controlling and managing parallel service groups in a cluster. The method aims to correctly control and manage the parallel service groups that exist in a cluster: first, the parallel groups can be started and stopped simultaneously; second, the state of each service group on each node is recorded correctly; third, exception handling can be carried out when a service group becomes abnormal on a node. One node in the cluster serves as the management node while also acting as an ordinary service node. When a parallel service group is started or stopped, the management node sends the start or stop command to the designated servers according to the user's configuration information, notifying the servers that need to start or stop the service group to do so. The management node collects the state of the service group on each server and records each state in the flag bits of the group's state. When a parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel group, so that the dependent service groups continue to provide service normally.

Description

Method for controlling and managing parallel service groups in a cluster

Technical field

The present invention relates to the field of computer application technology, and specifically to a method for controlling and managing parallel service groups in a cluster.

Background art

A high-availability cluster system is one that keeps business services uninterrupted around the clock: when a system fault or an application software fault occurs, it restores service as quickly as possible and keeps downtime to the order of minutes. From the bottom up, a high-availability cluster system mainly comprises the following modules: a heartbeat module, a resource monitoring module, a resource management module, a distributed console module and a web services module.

Of these, the resource management module is responsible for operating and managing service groups and their resources; it runs and monitors the group resources to ensure their availability and reliability. In current high-availability cluster systems, a service group runs on one server in the cluster, so the cluster management module records a single state for the group, on the corresponding server. In some special cases, however, a service group may need to be enabled on several designated servers or on all servers in the cluster. The state of the group on all of those servers must then be recorded at the same time, and the group must be managed and controlled on those servers simultaneously; the original management and control scheme cannot meet this requirement. It therefore becomes necessary to introduce the concept of a parallel service group, together with a method for managing and controlling it. A parallel service group is a service group that needs to run on several or all servers in the cluster. Its management and control must cover every server that runs the group, so that the group on each server is monitored promptly and operations are carried out promptly.

Summary of the invention

The object of the present invention is to provide a method for controlling and managing parallel service groups in a cluster.

The object of the invention is achieved as follows. When parallel service groups exist in a cluster, the service groups are controlled and managed correctly: first, the parallel groups can be started and stopped simultaneously; second, the state of each service group on each node is recorded correctly; third, exception handling can be carried out when a service group becomes abnormal on a node. One node in the cluster serves as the management node while also acting as an ordinary service node. When a parallel service group is started or stopped, the management node sends the start or stop command to the designated servers according to the user's configuration information, notifying the servers that need to start or stop the service group to do so. The management node collects the state of the service group on each server and records each state in the flag bits of the group's state. When a parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel group, so that the dependent service groups continue to provide service normally. The individual flows are as follows:

The start flow of a parallel service group comprises the following steps:

Step 1: The user creates a cluster and, according to business needs, creates a parallel service group. Once this is done, the configuration file is uploaded to each node and the cluster service is started. During startup, the cluster decides on the best-suited node, which becomes the management node;

Step 2: The management node receives the command to start the parallel group and, according to the configuration file generated by the user, finds the servers on which the parallel group needs to be started;

Step 3: The management node composes the message #dest=all#rd=..., where rd denotes the set of servers on which the parallel group is to be started, then sends the message to all nodes to notify the target servers to start the parallel group;

Step 4: On receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action. If it is one of the target servers, it starts the parallel group immediately and returns the start result to the management node;

Step 5: The management node receives the start result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;

Step 6: The management node informs the user of the result of starting the group;

The stop flow of a parallel service group comprises the following steps:

Step 1: The management node receives the command to stop the parallel group and, according to the configuration file generated by the user, finds the servers on which the parallel group needs to be stopped;

Step 2: The management node composes the message #dest=all#rd=... to notify the target servers to stop the parallel group;

Step 3: On receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action. If it is one of the target servers, it stops the parallel group immediately and returns the stop result to the management node;

Step 4: The management node receives the stop result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;

Step 5: The management node informs the user of the result of stopping the group;
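The round trip of the stop flow (broadcast, per-node filtering, result collection) can be simulated in memory as below. This is a sketch under stated assumptions rather than the patented implementation: the `Node` class, its `stop_ok` switch for simulating a failed stop, and the comma-separated rd encoding are all hypothetical.

```python
class Node:
    """Hypothetical stand-in for a cluster server running the parallel group."""

    def __init__(self, name, stop_ok=True):
        self.name = name
        self.running = True      # the parallel group starts out running
        self.stop_ok = stop_ok   # simulates whether the local stop succeeds

    def handle(self, message):
        """Step 3: act only if this node is listed in rd; otherwise return None."""
        # Assumption: rd is a comma-separated list of server names.
        targets = message.split("#rd=", 1)[1].split(",")
        if self.name not in targets:
            return None
        if self.stop_ok:
            self.running = False
        return self.name, self.stop_ok


def stop_parallel_group(nodes, targets):
    """Steps 2 and 4: broadcast the stop message and collect per-server results."""
    message = "#dest=all#rd=" + ",".join(targets)
    results = {}
    for node in nodes:                 # dest=all: every node sees the message
        reply = node.handle(message)
        if reply is not None:
            name, succeeded = reply
            results[name] = succeeded
    return results


cluster = [Node("a"), Node("b", stop_ok=False), Node("c")]
print(stop_parallel_group(cluster, ["a", "b"]))
```

The returned dictionary is what the management node would use in step 4 to update each server's state bit and, in step 5, to report the outcome to the user.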

The management node sets the state of a parallel group in the following steps:

Step 1: When the management node receives a message that the state of a parallel group has changed, it starts the group state setting flow;

Step 2: It obtains from the message the name of the group and the server on which the group's state has changed;

Step 3: It checks whether the group state currently recorded for that server matches the state to be set. If they match, the flow exits directly; if they do not, an XOR operation is applied to the state bit to set the new state;

Step 4: Once the setting is complete, the latest state of the group is saved;
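The compare-then-XOR update in step 3 can be illustrated with a small bitmask sketch. The layout is an assumption: the source says only that flag bits record the state on each server, so the choice of one bit per server (1 meaning the group is running there) and the name `set_group_state` are hypothetical.

```python
def set_group_state(state_bits, server_index, running):
    """Return the updated bitmask for one parallel group.

    Assumption: bit i of state_bits is 1 when the group is running on server i.
    """
    mask = 1 << server_index
    currently_running = bool(state_bits & mask)
    if currently_running == running:
        # Step 3, consistent case: nothing to change, exit directly.
        return state_bits
    # Step 3, inconsistent case: XOR with the mask flips exactly this server's bit.
    return state_bits ^ mask


state = 0b000
state = set_group_state(state, 1, True)    # group comes up on server 1
state = set_group_state(state, 1, True)    # repeated report, no change
print(format(state, "03b"))
```

Since XOR toggles a bit, the consistency check beforehand is what keeps a duplicate status report from flipping the bit back to the wrong value.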

When a parallel group becomes abnormal, the management node handles the exception in the following steps:

Step 1: A server running the parallel group detects that the group has become abnormal, composes an exception message and sends it to the management node;

Step 2: The management node receives the exception report and sets the state of the group as in Embodiment 3;

Step 3: The management node checks whether any service group that depends on the parallel group is running on the abnormal server;

Step 4: If there is none, processing ends; otherwise, the management node traverses the service groups and finds all those that satisfy the condition of step 3;

Step 5: The management node sends a message notifying the abnormal node to stop the service groups that depend on the parallel group;

Step 6: The abnormal node stops the service groups identified in step 5 and returns the result to the management node;

Step 7: The management node receives the result returned in step 6, finds another server on which the parallel group is running normally, and has that server start the service groups found in step 4;

Step 8: The normal node starts the service groups and returns the result;

Step 9: The management node sets the state of the service groups, and the flow ends.
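The exception-handling steps above can be condensed into the following sketch of the management node's bookkeeping. It is illustrative only: the dictionaries, the function name `handle_parallel_group_failure`, and the policy of picking the first healthy server are assumptions, since the source does not say how the new server is chosen, and the network messages of steps 5 to 8 are reduced to in-memory updates.

```python
def handle_parallel_group_failure(failed_server, parallel_ok, dependent_location):
    """Move dependent service groups off a server whose parallel group failed.

    parallel_ok: server name -> True if the parallel group is healthy there.
    dependent_location: dependent service group -> server currently running it.
    Returns a mapping of each moved group to its new server (None if none is healthy).
    """
    parallel_ok[failed_server] = False                     # step 2: record the new state
    affected = [group for group, server in dependent_location.items()
                if server == failed_server]                # steps 3-4: find dependents
    moves = {}
    for group in affected:
        dependent_location[group] = None                   # steps 5-6: stopped on failed node
        # Step 7 (assumed policy): take the first server where the parallel group is healthy.
        target = next((s for s, ok in parallel_ok.items() if ok), None)
        dependent_location[group] = target                 # steps 8-9: started and recorded
        moves[group] = target
    return moves


parallel_ok = {"n1": True, "n2": True, "n3": True}
dependent_location = {"web": "n2", "db": "n1"}
print(handle_parallel_group_failure("n2", parallel_ok, dependent_location))
```

Only the dependent groups on the failed server are touched; the parallel group itself keeps running on the remaining servers, which is what lets the switchover stay short.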

The present invention not only broadens the environments in which a high-availability cluster can be applied, but can also speed up exception handling for high-availability cluster services. Low-level groups that support upper-layer services can exist as parallel groups, provided they do not conflict with one another. When an upper-layer service becomes abnormal, only that upper-layer service needs to be handled, which shortens the switchover time and better guarantees continuity of service.

The beneficial effects of the invention are as follows: the invention fully implements the control and management of parallel service groups in a cluster. Compared with traditional cluster group management methods, this method extends the range of application of the cluster's high-availability service and improves the reliability and continuity of business services, thereby increasing the value of the software.

Brief description of the drawings

Fig. 1 is a flow diagram of starting a parallel group in the high-availability cluster system of Embodiment 1;

Fig. 2 is a flow diagram of stopping a parallel group in the high-availability cluster system of Embodiment 2;

Fig. 3 is a flow diagram of setting the parallel group state in Embodiment 3;

Fig. 4 is a flow diagram of parallel group exception handling in Embodiment 4.

Detailed description of the embodiments

The method of the present invention is described in detail below with reference to the accompanying drawings.

The technical problem to be solved by the invention is to provide a method for controlling and managing parallel service groups, so that service groups are controlled and managed correctly when parallel service groups exist in a cluster. First, the parallel groups can be started and stopped simultaneously; second, the state of each service group on each node is recorded correctly; third, exception handling can be carried out when a service group becomes abnormal on a node.

A method for controlling and managing parallel service groups: one node in the cluster serves as the management node while also acting as an ordinary service node. When a parallel service group is started or stopped, the management node sends the start or stop command to the designated servers according to the user's configuration information, notifying the servers that need to start or stop the service group to do so. The management node collects the state of the service group on each server and records each state in the flag bits of the group's state. When a parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel group, so that the dependent service groups continue to provide service normally.

The invention is further described below with reference to the drawings and embodiments.

Embodiment 1, as shown in Fig. 1

Embodiment 1 is the start flow of a parallel service group in a cluster. The flow mainly comprises the following steps:

Step 6: The management node informs the user of the result of starting the group.

Embodiment 2, as shown in Fig. 2

Embodiment 2 is the stop flow of a parallel service group in a cluster. The flow mainly comprises the following steps:

Step 5: The management node informs the user of the result of stopping the group.

Embodiment 3, as shown in Fig. 3

Embodiment 3 is the flow by which the management node sets the state of a parallel group. Each step of the flow is described in detail below.

Step 4: Once the setting is complete, the latest state of the group is saved.

Embodiment 4, as shown in Fig. 4

Embodiment 4 is the flow by which the management node handles the exception when a parallel group becomes abnormal on a node. The flow mainly comprises the following steps.

Step 8: The normal node starts the service groups and returns the result;

Step 9: The management node sets the state of the service groups, and the flow ends.

This completes the control and management of parallel service groups in a cluster. Compared with traditional cluster group management methods, this method extends the range of application of the cluster's high-availability service and improves the reliability and continuity of business services, thereby increasing the value of the software.

Everything other than the technical features described in this specification is known technology to those skilled in the art.

Claims

1. A method for controlling and managing parallel service groups in a cluster, characterized in that, when parallel service groups exist in the cluster, the service groups are controlled and managed correctly: first, the parallel groups can be started and stopped simultaneously; second, the state of each service group on each node is recorded correctly; third, exception handling can be carried out when a service group becomes abnormal on a node; one node in the cluster serves as the management node while also acting as an ordinary service node; when a parallel service group is started or stopped, the management node sends the start or stop command to the designated servers according to the user's configuration information, notifying the servers that need to start or stop the service group to do so; the management node collects the state of the service group on each server and records each state in the flag bits of the group's state; when a parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel group, so that the dependent service groups continue to provide service normally; wherein:

Step 1: The user creates a cluster and, according to business needs, creates a parallel service group; once this is done, the configuration file is uploaded to each node and the cluster service is started; during startup, the cluster decides on the best-suited node, which becomes the management node;

Step 5: The management node informs the user of the result of stopping the group;

Step 4: Once the setting is complete, the latest state of the group is saved;

Step 8: The normal node starts the service groups and returns the result;

Step 9: The management node sets the state of the service groups, and the flow ends.