CN103595771A - Method for controlling and managing parallel service groups in cluster - Google Patents

Info

Publication number
CN103595771A
Authority
CN
China
Prior art keywords
group
node
server
management node
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310530438.6A
Other languages
Chinese (zh)
Inventor
王婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd

Landscapes

  • Hardware Redundancy (AREA)

Abstract

Disclosed is a method for controlling and managing parallel service groups in a cluster. The method aims to correctly control and manage the parallel service groups that exist in the cluster: first, the parallel groups can be started and stopped simultaneously; second, the state of the service groups on each node is correctly recorded; third, exception handling can be carried out when a service group becomes abnormal on a node. One node in the cluster serves as the management node as well as an ordinary service node. When a parallel service group is started or stopped, the management node sends the start or stop command to the designated servers according to the user's configuration information, notifying each server concerned to start or stop the service group. The management node collects the state of the service group on each server and records each state separately in the flag bits of the group's state. When a parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel group, so that the dependent service groups continue to provide service normally.

Description

A method for controlling and managing parallel service groups in a cluster
Technical field
The present invention relates to the field of computer application technology, and specifically to a method for controlling and managing parallel service groups in a cluster.
Background art
A high-availability cluster system is one that keeps services running around the clock without interruption. When a system fault or an application software fault occurs, it restores service as quickly as possible and keeps downtime to the order of minutes. From the bottom up, a high-availability cluster system mainly comprises the following modules: heartbeat module, resource monitoring module, resource management module, distributed console module, and web services module.
Among these, the resource management module is responsible for operating and managing service groups and group resources; it operates and monitors service-group resources to guarantee their availability and reliability. In current high-availability cluster systems, a service group runs on one server in the cluster, so the cluster management module records a single state for the service group, on the corresponding server. In some special cases, however, a service group may need to be enabled on several designated servers, or on all servers, in the cluster. The state of the service group on every such server must then be recorded, and the service group on these servers must be managed and controlled simultaneously. The original management and control approach cannot meet these requirements. It is therefore necessary to introduce the concept of a parallel service group, together with a method for managing and controlling it. A parallel service group is a service group that needs to run on several or all servers in the cluster. Its management and control must extend to every server that runs the service group, so that the service group on each server is monitored promptly and operations are carried out promptly.
Summary of the invention
The object of the present invention is to provide a method for controlling and managing parallel service groups in a cluster.
The object of the invention is achieved as follows. When parallel service groups exist in a cluster, the service groups are correctly controlled and managed: first, a parallel group can be started and stopped on its servers simultaneously; second, the state of the service group on each node is correctly recorded; third, when the service group becomes abnormal on a node, exception handling can be carried out. One node in the cluster acts as the management node while also serving as an ordinary service node. When a parallel service group is started or stopped, the management node of the cluster sends the start or stop command to the designated servers according to the user's configuration information, notifying each server concerned to start or stop the service group. The management node collects the state of the service group on each server and records each state separately in the flag bits of the group's state. When the parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel service group, so that the dependent service groups continue to provide service normally. Simultaneous starting and stopping of the parallel group is controlled, wherein:
The flow for starting a parallel service group comprises the following steps:
Step 1: the user creates the cluster and creates a parallel service group according to business requirements, then uploads the configuration file to each node and starts the cluster service. During startup, the cluster chooses one optimal node to become the management node;
Step 2: the management node receives the command to start the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be started;
Step 3: the management node composes the message #dest=all#rd=..., where rd denotes the set of servers on which the parallel group is to be started, then sends the message to all nodes to notify the target servers to start the parallel group;
Step 4: on receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action; if it is one of the target servers, it starts the parallel group immediately and returns the start result to the management node;
Step 5: the management node receives the start result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 6: the management node informs the user of the result of starting the group;
The flow for stopping a parallel service group comprises the following steps:
Step 1: the management node receives the command to stop the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be stopped;
Step 2: the management node composes the message #dest=all#rd=... and notifies the target servers to stop the parallel group;
Step 3: on receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action; if it is one of the target servers, it stops the parallel group immediately and returns the stop result to the management node;
Step 4: the management node receives the stop result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 5: the management node informs the user of the result of stopping the group;
The management node sets the state of a parallel group through the following steps:
Step 1: when the management node receives a message that the state of a parallel group has changed, it starts the group-state setting flow;
Step 2: from the message, it obtains the name of the group and the server on which the group's state has changed;
Step 3: it checks whether the group state currently recorded for that server matches the state to be set. If they match, it exits directly; if they do not, it applies an XOR operation to the state bit to set the new state;
Step 4: once the setting is complete, it saves the group's latest state;
When a parallel group becomes abnormal, the management node handles the exception through the following steps:
Step 1: a server running the parallel group detects that the parallel group has become abnormal, composes an exception message, and sends it to the management node;
Step 2: the management node receives the exception report and sets the group's state following the state-setting flow (Embodiment 3);
Step 3: the management node checks whether any service group that depends on the parallel group is running on the abnormal server;
Step 4: if there is none, processing ends; otherwise, the management node traverses the groups and finds all service groups that satisfy the condition in step 3;
Step 5: the management node sends a message notifying the abnormal node to stop the service groups that depend on the parallel group;
Step 6: the abnormal node stops the service groups named in step 5, then returns the result to the management node;
Step 7: the management node receives the result returned in step 6, finds another server on which the parallel group is running normally, and has that server start the service groups found in step 4;
Step 8: the normal node starts the service groups and returns the result;
Step 9: the management node sets the state of the service groups, and the flow ends.
The present invention not only broadens the application environments of high-availability clusters but also speeds up exception handling for high-availability cluster services. Low-level groups that support upper-layer services can exist as parallel groups, provided they do not conflict with one another. When an upper-layer service becomes abnormal, only the upper-layer service needs to be handled, which shortens switchover time and better guarantees service continuity.
The beneficial effects of the invention are as follows: the invention fully implements the control and management of parallel service groups in a cluster. Compared with traditional cluster group management methods, this method broadens the scope of application of high-availability cluster services and improves the reliability and continuity of services, which increases the value of the software.
Brief description of the drawings
Fig. 1 is a flow diagram of starting a parallel group in the high-availability cluster system of Embodiment 1;
Fig. 2 is a flow diagram of stopping a parallel group in the high-availability cluster system of Embodiment 2;
Fig. 3 is a flow diagram of setting the parallel group state in Embodiment 3;
Fig. 4 is a flow diagram of parallel group exception handling in Embodiment 4.
Detailed description of the embodiments
The method of the present invention is described in detail below with reference to the accompanying drawings.
The technical problem to be solved by the invention is to provide a method for controlling and managing parallel service groups, so that service groups are correctly controlled and managed when parallel service groups exist in a cluster. First, a parallel group can be started and stopped on its servers simultaneously; second, the state of the service group on each node is correctly recorded; third, when the service group becomes abnormal on a node, exception handling can be carried out.
A method for controlling and managing parallel service groups: one node in the cluster acts as the management node while also serving as an ordinary service node. When a parallel service group is started or stopped, the management node of the cluster sends the start or stop command to the designated servers according to the user's configuration information, notifying each server concerned to start or stop the service group. The management node collects the state of the service group on each server and records each state separately in the flag bits of the group's state. When the parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel service group, so that the dependent service groups continue to provide service normally.
The invention is further described below with reference to the drawings and embodiments.
Embodiment 1, as shown in Fig. 1
Embodiment 1 is the flow for starting a parallel service group in the cluster. The flow mainly comprises the following steps:
Step 1: the user creates the cluster and creates a parallel service group according to business requirements, then uploads the configuration file to each node and starts the cluster service. During startup, the cluster chooses one optimal node to become the management node;
Step 2: the management node receives the command to start the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be started;
Step 3: the management node composes the message #dest=all#rd=..., where rd denotes the set of servers on which the parallel group is to be started, then sends the message to all nodes to notify the target servers to start the parallel group;
Step 4: on receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action; if it is one of the target servers, it starts the parallel group immediately and returns the start result to the management node;
Step 5: the management node receives the start result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 6: the management node informs the user of the result of starting the group.
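By way of non-limiting illustration, steps 3 and 4 above can be sketched in Python as follows. The specification gives only the message shape #dest=all#rd=... and does not define how rd is encoded; the comma-separated list of server names used here, and the function names build_group_message, parse_fields and should_act, are assumptions introduced for this sketch only.

```python
from typing import Dict, List


def build_group_message(targets: List[str]) -> str:
    """Compose a broadcast message in the '#dest=all#rd=...' shape.

    ASSUMPTION: rd is encoded as a comma-separated list of server names.
    The specification does not state the actual encoding.
    """
    return "#dest=all#rd=" + ",".join(targets)


def parse_fields(message: str) -> Dict[str, str]:
    """Split a '#key=value#key=value' message into a dictionary."""
    fields: Dict[str, str] = {}
    for part in message.split("#"):
        if not part:
            continue  # skip the empty piece before the leading '#'
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


def should_act(message: str, local_node: str) -> bool:
    """Step 4: a node acts only if it is listed in rd; otherwise it does nothing."""
    rd = parse_fields(message).get("rd", "")
    targets = {name for name in rd.split(",") if name}
    return local_node in targets


if __name__ == "__main__":
    msg = build_group_message(["node1", "node3"])
    for node in ("node1", "node2", "node3"):
        print(node, "starts the parallel group" if should_act(msg, node) else "takes no action")
```

Because the message is addressed to all nodes (dest=all), the filtering happens on the receiving side, which is why every node parses rd before deciding whether to act.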
Embodiment 2, as shown in Fig. 2
Embodiment 2 is the flow for stopping a parallel service group in the cluster. The flow mainly comprises the following steps:
Step 1: the management node receives the command to stop the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be stopped;
Step 2: the management node composes the message #dest=all#rd=... and notifies the target servers to stop the parallel group;
Step 3: on receiving the message, each node parses rd to check whether it is among the listed servers. If it is not, it takes no action; if it is one of the target servers, it stops the parallel group immediately and returns the stop result to the management node;
Step 4: the management node receives the stop result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 5: the management node informs the user of the result of stopping the group.
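By way of non-limiting illustration, steps 4 and 5 above might be sketched as follows. The specification does not define the layout of the group's state bits; this sketch assumes one bit per server, with a set bit meaning that the group is running on that server, and it assumes that a failed stop leaves the bit unchanged. The names record_stop_results, summarize and bit_of are hypothetical.

```python
from typing import Dict


def record_stop_results(state_bits: int, bit_of: Dict[str, int],
                        results: Dict[str, bool]) -> int:
    """Return the group's state bits after applying per-server stop results.

    ASSUMPTION: bit i set means the group is running on the server mapped to i.
    A successful stop clears the bit; a failed stop leaves it as it was.
    """
    for server, stopped in results.items():
        if stopped:
            state_bits &= ~(1 << bit_of[server])
    return state_bits


def summarize(results: Dict[str, bool]) -> str:
    """Build the result text reported to the user in step 5."""
    failed = sorted(server for server, ok in results.items() if not ok)
    if not failed:
        return "stopped on all target servers"
    return "stop failed on: " + ", ".join(failed)


if __name__ == "__main__":
    bit_of = {"node1": 0, "node2": 1, "node3": 2}  # hypothetical bit assignment
    results = {"node1": True, "node3": False}      # stop results returned by the targets
    print(bin(record_stop_results(0b111, bit_of, results)))
    print(summarize(results))
```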
Embodiment 3, as shown in Fig. 3
Embodiment 3 is the flow by which the management node sets the state of a parallel group. Each step of the flow is described below.
Step 1: when the management node receives a message that the state of a parallel group has changed, it starts the group-state setting flow;
Step 2: from the message, it obtains the name of the group and the server on which the group's state has changed;
Step 3: it checks whether the group state currently recorded for that server matches the state to be set. If they match, it exits directly; if they do not, it applies an XOR operation to the state bit to set the new state;
Step 4: once the setting is complete, it saves the group's latest state.
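By way of non-limiting illustration, the compare-then-XOR update in step 3 can be sketched as follows. The one-bit-per-server layout and the name set_group_state are assumptions; the specification states only that the state bit is XORed when the recorded state differs from the state to be set.

```python
def set_group_state(state_bits: int, server_bit: int, want_running: bool) -> int:
    """Return the updated state bits for one server of a parallel group.

    ASSUMPTION: each server owns one bit; a set bit means 'running'.
    """
    mask = 1 << server_bit
    currently_running = bool(state_bits & mask)
    if currently_running == want_running:
        return state_bits      # recorded state already matches: exit directly
    return state_bits ^ mask   # states differ: XOR flips exactly this server's bit


if __name__ == "__main__":
    bits = 0b101                             # running on servers 0 and 2
    bits = set_group_state(bits, 1, True)    # server 1 reports the group started
    print(bin(bits))
    bits = set_group_state(bits, 2, False)   # server 2 reports the group stopped
    print(bin(bits))
```

Checking for equality first is what makes XOR safe here: XOR toggles the bit, so applying it only when the recorded and requested states differ always lands on the requested state.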
Embodiment 4, as shown in Fig. 4
Embodiment 4 is the flow by which the management node handles the exception when a parallel group becomes abnormal on a node. The flow mainly comprises the following steps.
Step 1: a server running the parallel group detects that the parallel group has become abnormal, composes an exception message, and sends it to the management node;
Step 2: the management node receives the exception report and sets the group's state as in Embodiment 3;
Step 3: the management node checks whether any service group that depends on the parallel group is running on the abnormal server;
Step 4: if there is none, processing ends; otherwise, the management node traverses the groups and finds all service groups that satisfy the condition in step 3;
Step 5: the management node sends a message notifying the abnormal node to stop the service groups that depend on the parallel group;
Step 6: the abnormal node stops the service groups named in step 5, then returns the result to the management node;
Step 7: the management node receives the result returned in step 6, finds another server on which the parallel group is running normally, and has that server start the service groups found in step 4;
Step 8: the normal node starts the service groups and returns the result;
Step 9: the management node sets the state of the service groups, and the flow ends.
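By way of non-limiting illustration, the decision made by the management node in steps 3, 4 and 7 can be sketched as follows. The data structures (running_on, depends_on, parallel_healthy), the function name plan_failover, and the rule of picking the first healthy server in sorted order are all assumptions; the specification says only that a server on which the parallel group is running normally is found.

```python
from typing import Dict, Optional, Set


def plan_failover(failed_server: str,
                  parallel_group: str,
                  running_on: Dict[str, Set[str]],
                  depends_on: Dict[str, Set[str]],
                  parallel_healthy: Set[str]) -> Dict[str, Optional[str]]:
    """Map each dependent group on the failed server to a new server.

    running_on:       server -> ordinary service groups running there (hypothetical)
    depends_on:       service group -> groups it depends on (hypothetical)
    parallel_healthy: servers where the parallel group runs normally (hypothetical)
    """
    # Steps 3 and 4: collect the groups on the failed server that depend on the parallel group.
    dependents = sorted(
        group for group in running_on.get(failed_server, set())
        if parallel_group in depends_on.get(group, set())
    )
    if not dependents:
        return {}  # nothing depends on the parallel group here: processing ends

    # Step 7: choose another server where the parallel group is still normal.
    # ASSUMPTION: the first candidate in sorted order is taken.
    candidates = sorted(parallel_healthy - {failed_server})
    target = candidates[0] if candidates else None
    return {group: target for group in dependents}


if __name__ == "__main__":
    plan = plan_failover(
        failed_server="node1",
        parallel_group="pg",
        running_on={"node1": {"web", "db"}, "node2": set()},
        depends_on={"web": {"pg"}, "db": set()},
        parallel_healthy={"node2", "node3"},
    )
    print(plan)
```

The returned plan covers only the selection; stopping the groups on the abnormal node (steps 5 and 6) and starting them on the chosen node (step 8) would be carried out through the cluster's messaging, which this sketch does not model.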
This completes the control and management of parallel service groups in a cluster. Compared with traditional cluster group management methods, this method broadens the scope of application of high-availability cluster services and improves the reliability and continuity of services, which increases the value of the software.
Apart from the technical features described in this specification, everything else is known to those skilled in the art.

Claims (1)

1. A method for controlling and managing parallel service groups in a cluster, characterized in that, when parallel service groups exist in the cluster, the service groups are correctly controlled and managed: first, a parallel group can be started and stopped on its servers simultaneously; second, the state of the service group on each node is correctly recorded; third, when the service group becomes abnormal on a node, exception handling can be carried out; one node in the cluster acts as the management node while also serving as an ordinary service node; when a parallel service group is started or stopped, the management node of the cluster sends the start or stop command to the designated servers according to the user's configuration information, notifying each server concerned to start or stop the service group; the management node collects the state of the service group on each server and records each state separately in the flag bits of the group's state; when the parallel service group becomes abnormal on a server, the management node triggers a switchover of the other service groups on that server that depend on the parallel service group, so that the dependent service groups continue to provide service normally; simultaneous starting and stopping of the parallel group is controlled, wherein:
The flow for starting a parallel service group comprises the following steps:
Step 1: the user creates the cluster and creates a parallel service group according to business requirements, then uploads the configuration file to each node and starts the cluster service; during startup, the cluster chooses one optimal node to become the management node;
Step 2: the management node receives the command to start the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be started;
Step 3: the management node composes the message #dest=all#rd=..., where rd denotes the set of servers on which the parallel group is to be started, then sends the message to all nodes to notify the target servers to start the parallel group;
Step 4: on receiving the message, each node parses rd to check whether it is among the listed servers; if it is not, it takes no action; if it is one of the target servers, it starts the parallel group immediately and returns the start result to the management node;
Step 5: the management node receives the start result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 6: the management node informs the user of the result of starting the group;
The flow for stopping a parallel service group comprises the following steps:
Step 1: the management node receives the command to stop the parallel group and, from the user-generated configuration file, finds the servers on which the parallel group needs to be stopped;
Step 2: the management node composes the message #dest=all#rd=... and notifies the target servers to stop the parallel group;
Step 3: on receiving the message, each node parses rd to check whether it is among the listed servers; if it is not, it takes no action; if it is one of the target servers, it stops the parallel group immediately and returns the stop result to the management node;
Step 4: the management node receives the stop result returned by each target server and, according to success or failure, sets that server's value in the group's state bits;
Step 5: the management node informs the user of the result of stopping the group;
The management node sets the state of a parallel group through the following steps:
Step 1: when the management node receives a message that the state of a parallel group has changed, it starts the group-state setting flow;
Step 2: from the message, it obtains the name of the group and the server on which the group's state has changed;
Step 3: it checks whether the group state currently recorded for that server matches the state to be set; if they match, it exits directly; if they do not, it applies an XOR operation to the state bit to set the new state;
Step 4: once the setting is complete, it saves the group's latest state;
When a parallel group becomes abnormal, the management node handles the exception through the following steps:
Step 1: a server running the parallel group detects that the parallel group has become abnormal, composes an exception message, and sends it to the management node;
Step 2: the management node receives the exception report and sets the group's state following the state-setting flow (Embodiment 3);
Step 3: the management node checks whether any service group that depends on the parallel group is running on the abnormal server;
Step 4: if there is none, processing ends; otherwise, the management node traverses the groups and finds all service groups that satisfy the condition in step 3;
Step 5: the management node sends a message notifying the abnormal node to stop the service groups that depend on the parallel group;
Step 6: the abnormal node stops the service groups named in step 5, then returns the result to the management node;
Step 7: the management node receives the result returned in step 6, finds another server on which the parallel group is running normally, and has that server start the service groups found in step 4;
Step 8: the normal node starts the service groups and returns the result;
Step 9: the management node sets the state of the service groups, and the flow ends.
CN201310530438.6A 2013-11-01 2013-11-01 Method for controlling and managing parallel service groups in cluster Pending CN103595771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310530438.6A CN103595771A (en) 2013-11-01 2013-11-01 Method for controlling and managing parallel service groups in cluster

Publications (1)

Publication Number Publication Date
CN103595771A true CN103595771A (en) 2014-02-19

Family

ID=50085751

Country Status (1)

Country Link
CN (1) CN103595771A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016033806A1 (en) * 2014-09-05 2016-03-10 华为技术有限公司 Service flow control method and apparatus
CN106874138A (en) * 2015-12-11 2017-06-20 成都华为技术有限公司 The distribution method and device of a kind of service node
CN109587223A (en) * 2018-11-20 2019-04-05 北京奇艺世纪科技有限公司 Data aggregation method, device and system
CN111782231A (en) * 2020-07-14 2020-10-16 厦门市美亚柏科信息股份有限公司 Service deployment method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090158083A1 (en) * 2007-12-17 2009-06-18 Electronics And Telecommunications Research Institute Cluster system and method for operating the same
CN102571960A (en) * 2012-01-12 2012-07-11 浪潮(北京)电子信息产业有限公司 Method and device for monitoring high-availability cluster state
CN102983996A (en) * 2012-11-21 2013-03-20 浪潮电子信息产业股份有限公司 Dynamic allocation method and system for high-availability cluster resource management
CN103118121A (en) * 2013-02-19 2013-05-22 浪潮电子信息产业股份有限公司 Application method of high availability cluster in virtualization technology

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016033806A1 (en) * 2014-09-05 2016-03-10 华为技术有限公司 Service flow control method and apparatus
CN106874138A (en) * 2015-12-11 2017-06-20 成都华为技术有限公司 The distribution method and device of a kind of service node
CN109587223A (en) * 2018-11-20 2019-04-05 北京奇艺世纪科技有限公司 Data aggregation method, device and system
CN111782231A (en) * 2020-07-14 2020-10-16 厦门市美亚柏科信息股份有限公司 Service deployment method and device
CN111782231B (en) * 2020-07-14 2022-10-11 厦门市美亚柏科信息股份有限公司 Service deployment method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140219
