CN110784350A

CN110784350A - Design method of real-time high-availability cluster management system

Info

Publication number: CN110784350A
Application number: CN201911022253.8A
Authority: CN
Inventors: 詹少博
Original assignee: Beijing Institute of Computer Technology and Applications
Current assignee: Beijing Institute of Computer Technology and Applications
Priority date: 2019-10-25
Filing date: 2019-10-25
Publication date: 2020-02-11
Anticipated expiration: 2039-10-25
Also published as: CN110784350B

Abstract

The invention relates to a design method of a real-time high-availability cluster management system, and belongs to the technical field of high-availability cluster management. The real-time high-availability cluster management system designed by the invention runs in a real-time operating system, supports visual configuration, and realizes resource isolation, dynamic reconfiguration and application migration. It provides high-availability guarantees for the applications on the computing nodes. Distributed in-memory data management is integrated into the system, and key data is synchronized through a multi-copy redundancy mechanism. The system decouples software from hardware, improves the utilization of hardware resources, automatically migrates business applications to available devices when a software or hardware fault occurs, masks faults automatically, and keeps services uninterrupted.

Description

Design method of real-time high-availability cluster management system

Technical Field

The invention relates to the technical field of high-availability cluster management, in particular to a design method of a real-time high-availability cluster management system.

Background

With the development of technology, the traditional management mode, in which physical devices and business applications are checked manually one by one, is no longer applicable. Its main defects are as follows:

The number of business applications and physical devices keeps growing, and they can be combined in many ways. Manually recording how business applications are deployed and logging in to physical devices one by one to start and stop a specific business system is therefore an inefficient way to manage services and consumes a large amount of time and effort.

As business applications and physical devices increase, the frequency of software and hardware failures increases linearly. In particular, when a software or hardware fault occurs in a large system consisting of many business applications, locating and resolving the problem can take a long time, which cannot meet the increasingly urgent requirements of scientific research tasks.

Meanwhile, a software or hardware failure can not only stop business applications from running normally but also cause permanent data loss that cannot be fully recovered, which cannot meet real business requirements.

The high-availability cluster management system can solve the problems and has the following characteristics:

1) Supporting application and hardware decoupling

Applications are decoupled from hardware in a non-intrusive manner, so that a business application can be migrated among multiple physical devices without affecting the business process.

2) Supporting uninterrupted services

A high-availability guarantee is provided for business applications, protecting the business system from software and hardware faults and masking faults automatically.

3) Supporting service monitoring management

Start-stop management of application services, network management, scheduling management and resource monitoring are supported with comprehensive visualization.

4) Supporting high availability of data

Redundant data backup is supported; the data automatically survives a fault and heals itself after the fault is recovered.

At present, high-availability cluster management systems are deployed on non-real-time systems. They realize high availability of files through an internally integrated distributed file system and a multi-copy redundancy mechanism, and they provide real-time database synchronization to make key data damage-resistant and disaster-tolerant. However, file-system-based access is limited in both access speed and access mode and cannot meet the requirements of a real-time system, and the database in turn depends on the file system. High-availability cluster management designed for a non-real-time system therefore cannot be used on a real-time system.

Disclosure of Invention

(I) Technical problem to be solved

The technical problem to be solved by the invention is as follows: how to design a real-time high-availability cluster management system.

(II) Technical scheme

In order to solve the technical problem, the invention provides a design method of a real-time high-availability cluster management system, wherein the system runs on computing nodes and a management node and is designed to perform high-availability management of the user applications running on the computing nodes.

Preferably, the real-time high-availability cluster management system is designed to include a data communication module and an application management module, the data communication module is used for providing FC communication and gigabit Ethernet communication data support for the computing nodes and the management node, and the application management module is used for performing data distribution management on the computing nodes and the management node and also for managing interaction control between the computing nodes and the management node.

Preferably, the data communication module is designed to consist of a driver module. The driver module provides an FC driver, a network card driver and a communication protocol, and merges the data of the two communication types by creating an in-memory data queue, so that data is sent and received through a unified virtual communication device.

Preferably, the application management module is designed to comprise a data synchronization module, a monitoring module, a loading module, a management module and a human-computer interaction module;

the data synchronization module is designed to provide a real-time data synchronization mechanism: task data generated by a task system is stored in a local database and uploaded to the management node over the network, and the management node distributes it to the computing nodes to realize real-time data backup; when a fault occurs, the integrated computing combination switches the database instance and the service route through which tasks access the database to a backup node in real time; after the fault is removed, the recovered node is automatically added back to the available sequence and data is backed up to it in real time, so that data synchronization continues without interruption; meanwhile, both the main application and the standby application can receive external data, which is sent to both at the same time; the main application can back up key data to all the computing nodes in order to synchronize control flow and data;

the monitoring module is designed to provide a state monitoring function for the outside, runs on each computing node, communicates with the management node, and is used for periodically acquiring the hardware resource state, the application working state and the module self-checking information of each computing node in the system, and forming the monitoring information into heartbeat messages to be periodically sent to the management node;

the loading module runs on each computing node and is specifically realized by adopting the following design:

a) reading the script configuration file information at startup and loading the .out application;

b) receiving a .out application transmitted by the management node, loading and running it as a process, running its task on the CPU core assigned by the management node, storing the application on the electronic disk, and updating the loaded-application information in the configuration file;

c) after the operation is finished, sending loading completion information to the management node;

d) receiving a VxWorks mapping file transmitted by the management node and storing it in the boot partition of the electronic disk;

the management module runs on the management node, manages the main application and the standby application through mutual communication with each computing node, and responds to the man-machine interaction information;

the man-machine interaction module is designed to provide a computing node management information display function for a user.

Preferably, the monitoring module is specifically realized by adopting the following design:

a) periodically monitoring the state of each application running on the computing node, forming a heartbeat message and sending the heartbeat message to the management node;

b) periodically monitoring the in-place state of the hardware environment resource operated by the computing node and the FC and Ethernet communication states, forming a heartbeat message and sending the heartbeat message to the management node, wherein the monitoring period can be set by taking 5 milliseconds as a unit;

c) receiving a resource monitoring query sent by the management node, collecting the CPU utilization, CPU temperature, memory capacity, electronic disk capacity, running state of each application and resource occupation of each application, and returning them to the management node in a message;

d) receiving a self-test result query sent by a management node, and sending a power-on self-test result of the computing node equipment to the management node;

e) acquiring, in real time, switching information sent by the management node and switching the standby application to the main application so that it carries out external communication; the original main application is deleted, then re-created and started, and runs as the standby application;

f) providing an API interface for querying the working state of the currently running application, that is, whether it is the main application or the standby application.
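Items a) and b) above can be illustrated with a minimal sketch of how a computing node might assemble a heartbeat message. The field names, the JSON encoding and the helper `period_ms` are assumptions made for illustration only; the text specifies just the monitored items and that the period is settable in units of 5 milliseconds.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum

PERIOD_UNIT_MS = 5  # the text states the period is set in units of 5 ms


class AppState(str, Enum):
    """Application running states named in the text."""
    NORMAL = "normal"
    FAULT = "fault"
    SUSPENDED = "suspended"


def period_ms(units: int) -> int:
    """Convert a count of 5 ms units into milliseconds (hypothetical helper)."""
    if units < 1:
        raise ValueError("period must be at least one 5 ms unit")
    return units * PERIOD_UNIT_MS


@dataclass
class Heartbeat:
    """Assumed message layout; the wire format is not given in the text."""
    node_id: int
    seq: int
    hardware_ok: dict  # e.g. {"ethernet": True, "disk": True, "fc": True}
    app_states: dict   # application name -> AppState

    def encode(self) -> bytes:
        body = asdict(self)
        # Replace enum members by their plain string values before encoding.
        body["app_states"] = {k: v.value for k, v in self.app_states.items()}
        return json.dumps(body, sort_keys=True).encode("utf-8")
```

A monitoring task would build one `Heartbeat` per period and hand the encoded bytes to whatever transport the node uses to reach the management node.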

Preferably, the management module is specifically implemented by the following design:

a) module initialization: sending power-on self-check monitoring information to the computing nodes, receiving the self-check information, acquiring the equipment state of each computing node, alarming the computing nodes in the fault state, carrying out corresponding processing, and sending the equipment state information to an information recording task for recording;

b) human-computer interaction task: receiving human-computer interaction information, including submitted application information, updated mapping information, monitoring information and the like, and sending it to the information processing task for processing;

c) information processing task: processing the submitted application information; if the application information assigns the computing nodes for the main application and the standby application, deploying according to the application information; if no computing nodes are assigned, sending resource monitoring queries (CPU, memory, electronic disk and the like) to the computing nodes and, once the information is obtained, selecting the computing nodes with the least resource occupation as the running nodes of the main application and the standby application; sending the configuration information to the corresponding computing nodes, and sending the computing node resource occupation information and the newly allocated main and standby application running information to the information recording task; processing the updated mapping information and sending the mapping file to the computing nodes to be updated;

d) switching processing task: periodically acquiring the heartbeat information of the computing nodes, the main applications and the standby applications; when no heartbeat is received for more than 2 periods, or the hardware state of a computing node in the heartbeat message is a fault, judging that the computing node has failed, raising an alarm for the failed computing node, and migrating the applications running on it to computing nodes with sufficient resources according to the current resource occupation of the remaining computing nodes; when the state of the main application in the heartbeat message is fault or suspended, judging that the main application has failed, sending a switching instruction to the standby application to switch it to the main application, and sending the switching instruction to the node where the original main application is located so that it is deleted, restarted and becomes the standby application; sending the switched computing node information, main application information and standby application information to the information recording task; the switching time between the main application and the standby application is one heartbeat period;

e) information recording task: receiving state information and resource information of the computing nodes and the main application information and standby application information running on the computing nodes, and recording the information on an electronic disk to form a log;
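The node selection described in the information processing task can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the equal-weight occupancy score, the `free_app_slots` field and the rule that the main and standby applications land on different nodes are all hypothetical choices; the text only says that the nodes with the least resource occupation are selected when none are assigned.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NodeResources:
    """Result of a resource monitoring query (assumed fields)."""
    node_id: int
    cpu_percent: float
    mem_percent: float
    disk_percent: float
    free_app_slots: int


def occupancy(r: NodeResources) -> float:
    # Assumption: CPU, memory and disk usage are weighted equally.
    return (r.cpu_percent + r.mem_percent + r.disk_percent) / 3.0


def place_main_and_standby(
    nodes: List[NodeResources],
    assigned: Optional[Tuple[int, int]] = None,
) -> Tuple[int, int]:
    """Return (main_node_id, standby_node_id)."""
    if assigned is not None:
        # The submitted application information already names the nodes.
        return assigned
    candidates = sorted(
        (n for n in nodes if n.free_app_slots > 0),
        key=lambda n: (occupancy(n), n.node_id),
    )
    if len(candidates) < 2:
        raise RuntimeError("need two nodes with free capacity")
    # Assumption: the two least-occupied nodes host main and standby.
    return candidates[0].node_id, candidates[1].node_id
```

Ties are broken by node ID so that repeated queries with the same resource data give the same placement.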

Preferably, the human-computer interaction module is specifically configured to present, as graphical data, the CPU usage, memory usage, network traffic and disk usage of each node at each moment.

Preferably, when the monitoring of the state of each application running on the computing node is performed, the monitoring period is set in units of 5 milliseconds.

Preferably, the running hardware environment resources include an ethernet card, an electronic disk, and an FC.

Preferably, the running states of the applications include normal, failure and suspension.

(III) Advantageous effects

The real-time high-availability cluster management system designed by the invention runs in a real-time operating system, supports visual configuration, and realizes resource isolation, dynamic reconfiguration and application migration. It provides high-availability guarantees for the applications on the computing nodes. Distributed in-memory data management is integrated into the system, and key data is synchronized through a multi-copy redundancy mechanism. The system decouples software from hardware, improves the utilization of hardware resources, automatically migrates business applications to available devices when a software or hardware fault occurs, masks faults automatically, and keeps services uninterrupted.

Drawings

FIG. 1 is a diagram of a real-time high availability cluster management system architecture designed by the present invention;

FIG. 2 is a diagram of an operation scenario of a real-time high availability cluster management system designed by the present invention;

FIG. 3 is a data flow diagram of a real-time high availability cluster management system designed by the present invention;

FIG. 4 is a structural diagram of a real-time high-availability cluster management system designed by the present invention.

Detailed Description

In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.

Multiple applications run as processes, with one process implementing one application. At most 4 applications are supported on one computing node, each application is allocated one CPU core to run on, and the applications are physically isolated from one another in terms of CPU core and memory.
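The one-core-per-application rule can be sketched as simple bookkeeping. The class `CoreAllocator` and its method names are hypothetical, and the sketch only tracks ownership; actually pinning a process to a core is platform-specific and is not shown here.

```python
from typing import Dict, Optional

MAX_APPS_PER_NODE = 4  # limit stated in the text


class CoreAllocator:
    """Tracks which application owns which CPU core on one computing node."""

    def __init__(self, cores: int = MAX_APPS_PER_NODE) -> None:
        self._owner: Dict[int, Optional[str]] = {c: None for c in range(cores)}

    def assign(self, app: str, core: Optional[int] = None) -> int:
        """Give `app` a core, either the requested one or the lowest free one."""
        if app in self._owner.values():
            raise ValueError(f"{app} already has a core")
        if core is None:
            free = [c for c, owner in self._owner.items() if owner is None]
            if not free:
                raise RuntimeError("node already runs the maximum number of applications")
            core = free[0]
        elif core not in self._owner:
            raise ValueError(f"core {core} does not exist")
        elif self._owner[core] is not None:
            raise RuntimeError(f"core {core} is already taken")
        self._owner[core] = app
        return core

    def release(self, app: str) -> None:
        """Free the core held by `app`, if any."""
        for c, owner in self._owner.items():
            if owner == app:
                self._owner[c] = None
```

Passing `core` models the case where the management node assigns the core; omitting it models a local default choice, which is an assumption of this sketch.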

The management node is independent of the computing nodes. It runs the management software on its own and exchanges information with the software resident on the computing nodes, so that state monitoring and hot switching of the computing nodes and of the applications running on them can be performed automatically. Scheduling and switching can be configured statically through a configuration file, and human-computer interaction is also supported, which gives high flexibility: a user can start and stop applications at run time to realize dynamic migration, and the current state of every node and application can be monitored visually. The software resident on a computing node only collects state and executes management commands; it occupies few resources and is relatively simple in design, so that most of the node's resources remain available for running applications.

The real-time high-availability cluster management system adopts a distributed in-memory data management mechanism to realize data sharing, storage and backup. Data is stored redundantly in multiple copies on different computing nodes over the network. In the user memory space of each computing node, as many storage spaces are allocated as there are computing node applications; the number and capacity of these spaces are configurable. Whenever the data of an application on any computing node is updated, it is synchronously backed up to the storage space corresponding to that application on all computing nodes. To ensure data consistency among the computing nodes, the backup to the other computing nodes is completed first, and the backup on the local computing node is performed afterwards.
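The backup order described above (other computing nodes first, the local node last) can be sketched with in-process objects standing in for nodes. `NodeMemory` and `replicate` are hypothetical names, and the direct method call on a peer stands in for a network transfer whose protocol the text does not specify.

```python
from typing import Dict, List


class NodeMemory:
    """One fixed-size storage space per application in a node's user memory."""

    def __init__(self, node_id: int, app_names: List[str], slot_size: int) -> None:
        self.node_id = node_id
        self.slots: Dict[str, bytearray] = {
            name: bytearray(slot_size) for name in app_names
        }

    def write(self, app: str, data: bytes) -> None:
        slot = self.slots[app]
        if len(data) > len(slot):
            raise ValueError("data exceeds the configured slot capacity")
        slot[: len(data)] = data

    def read(self, app: str, length: int) -> bytes:
        return bytes(self.slots[app][:length])


def replicate(app: str, data: bytes, local: NodeMemory,
              peers: List[NodeMemory]) -> List[int]:
    """Back up `data` on every peer first, then locally; return the write order."""
    order = []
    for peer in peers:
        peer.write(app, data)  # stands in for a transfer over the network
        order.append(peer.node_id)
    local.write(app, data)     # the local copy is updated last
    order.append(local.node_id)
    return order
```

Because every node holds a slot for every application, an application that takes over on another node can read the latest data from its own node's slot.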

The applications on the computing nodes are monitored and switched when a fault occurs. The application that takes over reads the latest data of the failed application from the corresponding address in the user space of its own computing node, which realizes application synchronization. When an application is dynamically migrated, the latest data is likewise read from the corresponding address to realize synchronous migration.

When several applications on a computing node need to access a communication device at the same time, a virtual communication device is allocated to each application, and the applications access only their virtual communication devices. For sending, the virtual communication device stores the communication data in an in-memory data queue, and the cluster management system takes the data out of the queue and sends it through the physical communication device. For receiving, the physical communication device stores the received data in the queue, and the cluster management system takes the data out and delivers it to the corresponding virtual communication device, from which the application receives it.
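The queue-based multiplexing described above can be sketched as follows. All names are hypothetical, and tagging each frame with an application ID to route received data is an assumption; the text does not say how received data is matched to a virtual device.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple

Frame = Tuple[str, bytes]  # (application ID, payload); the tagging is assumed


class VirtualDevice:
    """The only communication object an application touches."""

    def __init__(self, app_id: str, tx_queue: Deque[Frame]) -> None:
        self.app_id = app_id
        self._tx = tx_queue
        self.rx: Deque[bytes] = deque()

    def send(self, payload: bytes) -> None:
        self._tx.append((self.app_id, payload))

    def receive(self) -> Optional[bytes]:
        return self.rx.popleft() if self.rx else None


class CommMultiplexer:
    """Moves data between the in-memory queues and one physical device."""

    def __init__(self, physical_send: Callable[[str, bytes], None]) -> None:
        self._tx: Deque[Frame] = deque()
        self._rx: Deque[Frame] = deque()
        self._devices: Dict[str, VirtualDevice] = {}
        self._physical_send = physical_send

    def open(self, app_id: str) -> VirtualDevice:
        dev = VirtualDevice(app_id, self._tx)
        self._devices[app_id] = dev
        return dev

    def on_physical_receive(self, app_id: str, payload: bytes) -> None:
        self._rx.append((app_id, payload))

    def pump(self) -> None:
        """Drain both queues once, as the cluster management system would."""
        while self._tx:
            self._physical_send(*self._tx.popleft())
        while self._rx:
            app_id, payload = self._rx.popleft()
            dev = self._devices.get(app_id)
            if dev is not None:
                dev.rx.append(payload)
```

In a real-time system `pump` would run in a dedicated task; here it is called explicitly to keep the sketch deterministic.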

FIG. 1 shows the architecture of the real-time high-availability cluster management system. The system runs on the computing nodes and the management node, performs high-availability management of the user applications running on the computing nodes, and comprises a data communication module and an application management module. The data communication module provides FC communication and gigabit Ethernet communication data support for the computing nodes and the management node. The application management module performs data distribution management on the computing nodes and the management node and also manages the interaction control between them. The computing nodes and the management node together form the hardware platform on which the real-time high-availability cluster management system operates.

FIG. 2 shows an operation scenario of the real-time high-availability cluster management system. The application management module interacts with user applications through a visual interface and an API interface, and the data communication module supports two communication modes, namely the FC network and gigabit Ethernet. The application management module performs data distribution management of the computing nodes by controlling the data distribution of the data communication module.

FIG. 3 shows the data flow of the real-time high-availability cluster management system. A user submits an application and configures its resource attributes; the system then automatically allocates a main application and a standby application according to the CPU load and synchronizes their data. The main application can communicate externally, while the standby application only receives passively. Both the main application and the standby application send application heartbeat messages to the management node, each computing node sends hardware resource heartbeat messages to the management node, and the management node monitors the applications and the computing nodes. When the main application is found to be abnormal, the standby application takes over as the main application and, after the data is synchronized, continues external communication, which realizes high availability of the application. When a computing node is found to be abnormal, the management node raises an alarm, and the applications running on the failed computing node are migrated to other normal computing nodes through manual operation. Static application deployment can also be realized through the human-computer interaction interface: the main and standby applications are allocated to designated computing nodes through configuration to realize high availability.

The API calling interface comprises a virtual communication device interface, a data synchronization interface and an application state monitoring interface. The virtual communication device interface supports network communication for multiple applications on one computing node, the data synchronization interface supports synchronization among applications across computing nodes, and the application state monitoring interface can obtain the state of the main hardware resources of any computing node as well as application running state information.

The data communication module virtualizes communication devices so that multiple virtual communication devices correspond to one physical communication device, allowing multiple applications to access one network device at the same time. By distinguishing the main application from the standby application, it allows the main application to send and receive data, while the standby application can only passively receive data.
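The send/receive rule for main and standby applications can be sketched as a role check in front of the transmit path. `Role` and `RoleGatedEndpoint` are hypothetical names; how the data communication module actually identifies the role is not described in the text.

```python
from enum import Enum
from typing import Callable, List


class Role(Enum):
    MAIN = "main"
    STANDBY = "standby"


class RoleGatedEndpoint:
    """Both roles receive external data; only the main role may transmit."""

    def __init__(self, role: Role, transmit: Callable[[bytes], None]) -> None:
        self.role = role
        self._transmit = transmit
        self.inbox: List[bytes] = []

    def deliver(self, payload: bytes) -> None:
        # External data is delivered to main and standby alike.
        self.inbox.append(payload)

    def send(self, payload: bytes) -> bool:
        # A standby application is passive: its send requests are dropped.
        if self.role is not Role.MAIN:
            return False
        self._transmit(payload)
        return True

    def promote(self) -> None:
        """Called when the management node switches standby to main."""
        self.role = Role.MAIN
```

Since the standby endpoint has been receiving the same external data all along, promoting it only needs to open the transmit path.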

FIG. 4 is a block diagram of a real-time high availability cluster management system configuration showing system module configuration, wherein the data communication module is composed of a driver module; the application management module comprises a data synchronization module, a monitoring module, a loading module, a management module and a man-machine interaction module.

The driver module provides an FC driver, a network card driver and a communication protocol, and merges the data of the two communication types by creating an in-memory data queue, so that data is sent and received through a unified virtual communication device.

The data synchronization module provides a real-time data synchronization mechanism. Task data generated by a task system is stored in a local database and uploaded to the management node over the network, and the management node distributes it to the computing nodes to realize real-time data backup. When a fault occurs, the integrated computing combination switches the database instance and the service route through which tasks access the database to a backup node in real time. After the fault is removed, the recovered node is automatically added back to the available sequence and data is backed up to it in real time, so that data synchronization continues without interruption. Meanwhile, both the main application and the standby application can receive external data, which is sent to both at the same time. The main application can back up key data to all computing nodes in order to synchronize control flow and data.

The monitoring module provides a comprehensive state monitoring function to the outside. It runs on each computing node and communicates with the management node, periodically acquires the hardware resource state, the application working state and the module self-check information of each computing node in the system, and periodically sends the monitoring information to the management node as heartbeat messages. It is specifically realized by adopting the following design:

a) the state of each application running on the computing node is periodically monitored, a heartbeat message is formed and sent to the management node, the state information is provided by the application and the system, and the monitoring period can be set in units of 5 milliseconds;

b) periodically monitoring the in-place state of the hardware environment resources (Ethernet card, electronic disk, FC, etc.) on which the computing node runs and the communication state of FC and Ethernet, forming a heartbeat message and sending it to the management node, wherein the monitoring period can be set in units of 5 milliseconds;

c) receiving a resource monitoring query sent by the management node, collecting the CPU utilization, CPU temperature, memory capacity, electronic disk capacity, running state (normal, fault or suspended) of each application and resource occupation of each application, and returning them to the management node in a message;

d) receiving a self-test result query sent by the management node, and sending the power-on self-test result of the computing node equipment to the management node;

e) acquiring, in real time, switching information sent by the management node and switching the standby application to the main application so that it carries out external communication; the original main application is deleted, then re-created and started, and runs as the standby application;

f) providing an API interface for querying the working state of the currently running application, that is, whether it is the main application or the standby application.

The loading module runs on each computing node and is specifically realized by adopting the following design:

a) reading the script configuration file information at startup and loading the .out application;

b) receiving a .out application transmitted by the management node, loading and running it as a process, running its task on the CPU core assigned by the management node, storing the application on the electronic disk, and updating the loaded-application information in the configuration file;

c) after the operation is finished, sending loading completion information to the management node;

d) receiving the VxWorks mapping file transmitted by the management node and storing it in the boot partition of the electronic disk.
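The loading steps above can be sketched with an ordinary directory standing in for the electronic disk. Everything concrete here is an assumption: the JSON configuration file `apps.json`, the `.out` file suffix, and the `start_app` and `notify_manager` callbacks, which stand in for the platform loader and for the link to the management node. The image is written before it is started so that the callback has a file to load.

```python
import json
from pathlib import Path
from typing import Callable, Dict


class Loader:
    CONFIG_NAME = "apps.json"  # hypothetical configuration file name

    def __init__(self, disk_dir,
                 start_app: Callable[[str, Path, int], None],
                 notify_manager: Callable[[dict], None]) -> None:
        self.disk = Path(disk_dir)
        self.config_path = self.disk / self.CONFIG_NAME
        self._start = start_app
        self._notify = notify_manager

    def _read_config(self) -> Dict[str, dict]:
        if self.config_path.exists():
            return json.loads(self.config_path.read_text())
        return {}

    def boot(self) -> None:
        """Startup: load every application recorded in the configuration file."""
        for name, entry in self._read_config().items():
            self._start(name, self.disk / entry["file"], entry["core"])

    def install(self, name: str, image: bytes, core: int) -> None:
        """Handle an application received from the management node."""
        path = self.disk / f"{name}.out"
        path.write_bytes(image)              # store on the electronic disk
        self._start(name, path, core)        # run on the assigned CPU core
        config = self._read_config()
        config[name] = {"file": path.name, "core": core}
        self.config_path.write_text(json.dumps(config, indent=2))
        self._notify({"event": "load_complete", "app": name, "core": core})
```

Recording the core in the configuration file lets a restarted node bring its applications back on the same cores without asking the management node again, which is a design choice of this sketch rather than something the text states.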

The management module runs on the management node, manages the main application and the standby application through mutual communication with each computing node, responds to human-computer interaction information, and is specifically realized by adopting the following design:

a) module initialization: sending power-on self-check monitoring information to the computing nodes, receiving the self-check information, acquiring the equipment state of each computing node, raising an alarm for computing nodes in the fault state, carrying out corresponding processing, and sending the equipment state information to the information recording task for recording;

b) human-computer interaction task: receiving human-computer interaction information, including submitted application information, updated mapping information, monitoring information and the like, and sending it to the information processing task for processing;

c) information processing task: processing the submitted application information; if the application information assigns the computing nodes for the main application and the standby application, deploying according to the application information; if no computing nodes are assigned, sending resource monitoring queries (CPU, memory, electronic disk and the like) to the computing nodes and, once the information is obtained, selecting the computing nodes with the least resource occupation as the running nodes of the main application and the standby application; sending the configuration information to the corresponding computing nodes, and sending the computing node resource occupation information and the newly allocated main and standby application running information to the information recording task; processing the updated mapping information and sending the mapping file to the computing nodes to be updated;

d) switching processing task: periodically acquiring the heartbeat information of the computing nodes, the main applications and the standby applications; when no heartbeat is received for more than 2 periods, or the hardware state of a computing node in the heartbeat message is a fault, judging that the computing node has failed, raising an alarm for the failed computing node, and migrating the applications running on it to computing nodes with sufficient resources according to the current resource occupation of the remaining computing nodes; when the state of the main application in the heartbeat message is fault or suspended, judging that the main application has failed, sending a switching instruction to the standby application to switch it to the main application, and sending the switching instruction to the node where the original main application is located so that it is deleted, restarted and becomes the standby application; sending the switched computing node information, main application information and standby application information to the information recording task; the switching time between the main application and the standby application is one heartbeat period;

e) information recording task: receiving state information and resource information of the computing nodes and the main application information and standby application information running on the computing nodes, and recording the information on an electronic disk to form a log.
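The fault judgment in the switching processing task can be sketched as follows. The class and function names and the command strings are hypothetical; only the two rules come from the text: a node is judged failed when no heartbeat arrives for more than 2 periods or its hardware state is a fault, and a main application is switched when its state is fault or suspended.

```python
from typing import Dict, List, Tuple

MISSED_PERIODS_LIMIT = 2  # "more than 2 periods" in the text


class SwitchMonitor:
    """Tracks node heartbeats on the management node."""

    def __init__(self, period_ms: int) -> None:
        self.period_ms = period_ms
        self._last_seen: Dict[int, int] = {}
        self._hardware_ok: Dict[int, bool] = {}

    def on_node_heartbeat(self, node_id: int, now_ms: int, hardware_ok: bool) -> None:
        self._last_seen[node_id] = now_ms
        self._hardware_ok[node_id] = hardware_ok

    def failed_nodes(self, now_ms: int) -> List[int]:
        """Nodes silent for more than 2 periods or reporting a hardware fault."""
        limit = MISSED_PERIODS_LIMIT * self.period_ms
        return sorted(
            n for n, seen in self._last_seen.items()
            if now_ms - seen > limit or not self._hardware_ok[n]
        )


def main_needs_switch(state: str) -> bool:
    """A main application in the fault or suspended state triggers a switch."""
    return state in ("fault", "suspended")


def switch_commands(app: str, main_node: int, standby_node: int) -> List[Tuple[int, str]]:
    """Commands for one switch; the command strings are illustrative only."""
    return [
        (standby_node, f"promote {app} to main"),
        (main_node, f"delete and restart {app} as standby"),
    ]
```

With a 10 ms period, a node last heard from 20 ms ago is still considered alive, and it is judged failed once the silence exceeds 20 ms.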

The human-computer interaction module provides a computing node management information display function for the user. Through the visual interface of the human-computer interaction module, the user can see the running state of each node in the whole system at a glance. The module also presents, as graphical data, detailed information such as the CPU usage, memory usage, network traffic and disk usage of each node at each moment, so that the user can conveniently grasp the overall state of the system.

In summary, the design implementation of the high-availability cluster management based on the real-time system provided by the invention realizes the high-availability cluster management on the real-time operating system through the device virtualization technology and the data synchronization technology, and performance indexes such as application switching, data migration, fault perception and the like all meet the requirements of the real-time system.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for designing a real-time high-availability cluster management system, wherein the system runs on computing nodes and a management node and is designed to perform high-availability management of the user applications running on the computing nodes.

2. The method of claim 1, wherein the real-time high availability cluster management system is designed to include a data communication module and an application management module, the data communication module is used for providing FC communication and gigabit ethernet communication data support for the computing nodes and the management nodes, and the application management module is used for performing data distribution management on the computing nodes and the management nodes and performing management on interaction control between the computing nodes and the management nodes.

3. The method according to claim 2, wherein the data communication module is designed to consist of a driver module, the driver module provides an FC driver, a network card driver and a communication protocol, and merges the data of the two communication types by creating an in-memory data queue, so that data is sent and received through a unified virtual communication device.

4. The method of claim 3, wherein the application management module is designed to include a data synchronization module, a monitoring module, a loading module, a management module, and a human-machine interaction module;

5. The method of claim 4, wherein the monitoring module is implemented by specifically adopting the following design:

6. The method of claim 5, wherein the management module is implemented by specifically adopting the following design:

e) information recording task: receiving the state information and the resource information of the computing nodes and the main application information and the standby application information running on the computing nodes, and recording the information on the electronic disk to form a log.

7. The method of claim 6, wherein the human-computer interaction module is specifically configured to present, as graphical data, the CPU usage, memory usage, network traffic and disk usage of each node at each moment.

8. The method of claim 5, wherein the monitoring period is settable in units of 5 milliseconds while monitoring the status of each application running on the compute node.

9. The method of claim 5, wherein the running hardware environment resources comprise an Ethernet card, an electronic disk, and FC.

10. The method of claim 5, wherein the application running states include normal, fault, and suspended.