CN111865714A - Cluster management method based on multi-cloud environment

Info

Publication number: CN111865714A
Application number: CN202010585865.4A
Authority: CN
Inventors: 伏伟任; 蒋秋明
Original assignee: Shanghai Shangshi Longchuang Intelligent Technology Co Ltd
Current assignee: Shanghai Shangshi Longchuang Intelligent Technology Co Ltd
Priority date: 2020-06-24
Filing date: 2020-06-24
Publication date: 2020-10-30
Anticipated expiration: 2040-06-24
Also published as: CN111865714B

Abstract

The invention relates to a cluster management method based on a multi-cloud environment. The development framework design is the core of the multi-cloud framework and its most abstract part; it requires an understanding of the operation and main components of a micro-service runtime framework. For most middle-platform systems, the runtime framework they depend on is an RPC framework together with the service governance capabilities built on it, which include service registration and discovery, circuit breaking and fault tolerance, flow control, and similar mechanisms. The core business-logic code is decoupled from the micro-service framework capabilities. The invention establishes detailed monitoring of cluster node information in a multi-cloud environment, allows individual nodes to be specified, and allows the individual data of each node to be compared graphically so that specific faults can be handled.

Description

Cluster management method based on multi-cloud environment

Technical Field

The invention relates to cloud architecture management, in particular to a cluster management method based on a multi-cloud environment.

Background

A multi-cloud environment is a cloud architecture formed by combining multiple cloud services from multiple cloud providers; the clouds may be public or private. Multi-cloud means that the same type of cloud solution is deployed across multiple providers, whereas hybrid cloud means that several cloud deployment types are combined through integration or orchestration. A multi-cloud solution may involve, for example, two public cloud environments or two private cloud environments.

A hybrid cloud scenario may involve one public cloud environment and one private cloud environment, together with infrastructure (implemented by application programming interfaces, middleware, or containers) that facilitates workload portability. More and more enterprises are choosing multi-cloud deployments (including public and private clouds), hoping to improve security and performance by expanding into more environments.

Existing cloud environments are managed in many different ways, which makes systematic cluster management inconvenient.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a cluster management method based on a multi-cloud environment.

The purpose of the invention can be achieved by the following technical solution:

A cluster management method based on a multi-cloud environment comprises the following steps:

Step 1: setting up a development framework based on the operating requirements and main components of a micro-service runtime framework, and decoupling the corresponding core code from the micro-service runtime framework capabilities;

Step 2: designing an architecture within the development framework;

Step 3: defining micro-service interfaces in the architecture to complete the overall deployment;

Step 4: performing different cluster management operations based on the deployed structure.

Further, the development framework in step 1 comprises an RPC framework and the service governance capabilities based on the RPC framework.

Further, the service governance capabilities based on the RPC framework include service registration and discovery, circuit breaking and fault tolerance, and flow control.
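
As an illustration of one of these governance capabilities, the following is a minimal sketch of a circuit breaker ("fusing"). It is not taken from the invention or from any of the named frameworks; the class name `CircuitBreaker`, the `failure_threshold` parameter, and the fallback convention are assumptions made only for this example, and recovery (a half-open state) is deliberately left out.

```python
class CircuitBreaker:
    """Hypothetical minimal circuit breaker: opens after consecutive failures."""

    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold  # assumed default
        self.failures = 0

    @property
    def is_open(self) -> bool:
        # The breaker is open once enough consecutive calls have failed.
        return self.failures >= self.failure_threshold

    def call(self, func, fallback):
        if self.is_open:
            return fallback()       # fail fast without calling the service
        try:
            result = func()
        except Exception:
            self.failures += 1      # count the failure and degrade
            return fallback()
        self.failures = 0           # a success resets the counter
        return result
```

Because the breaker wraps the remote call rather than living inside the business logic, a sketch like this is also consistent with keeping core code decoupled from framework capabilities.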

Further, the technical foundation of the architecture in step 2 is the Spring, Spring Boot, ServiceComb, HSF, and Spring Cloud micro-service frameworks.

Further, the different cluster management operations in step 4 include node joining, node leaving, normal operation of the node, node configuration, and thread synchronization.

Further, the node joining process specifically includes: on startup, each node reads its own configuration file and then periodically sends a join request message until it has received join confirmation messages from all other nodes.
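
The joining process can be sketched as follows. This is an in-memory simulation under stated assumptions, not the implementation of the invention: the `Node` class, the `run_join` helper, and the idea that a node knows the IDs of its peers in advance are all hypothetical, and the `online` mapping merely stands in for the multicast group.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a cluster node."""
    node_id: int
    peers: set                                  # IDs of the other nodes (assumed known)
    acks: set = field(default_factory=set)
    node_list: set = field(default_factory=set)

    def __post_init__(self):
        self.node_list.add(self.node_id)        # a node lists itself first

    def joined(self) -> bool:
        # Joining completes once every other node has confirmed.
        return self.acks >= self.peers

    def on_join_request(self, sender_id: int) -> int:
        # Record the sender if unknown, then confirm by returning our ID.
        self.node_list.add(sender_id)
        return self.node_id

    def on_join_ack(self, sender_id: int) -> None:
        self.node_list.add(sender_id)
        self.acks.add(sender_id)


def run_join(new_node: Node, online: dict, max_periods: int = 10) -> int:
    """Send one join request per period until all peers have confirmed.

    `online` maps a period number to the nodes reachable in that period.
    Returns the number of periods that were needed.
    """
    for period in range(1, max_periods + 1):
        for peer in online.get(period, []):
            new_node.on_join_ack(peer.on_join_request(new_node.node_id))
        if new_node.joined():
            return period
    raise TimeoutError("not all peers confirmed the join request")
```

In this sketch a node that is unreachable in one period simply confirms in a later one, which is why the request is repeated every period rather than sent once.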

Further, the node leaving process specifically includes: the states of all nodes are monitored through the heartbeat messages that the nodes send to one another. If no heartbeat message is received from a node within a set number of periods, that node is considered to have left. If the node that left or failed is a backup node, it is simply deleted from the node list; if it is the master node, a new master node is elected from the remaining nodes.
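
A minimal sketch of this departure handling is given below. It borrows two details from the embodiment described later in this document (a limit of three missed periods, and election of the remaining node with the smallest ID number); the function names and signatures are assumptions for illustration only.

```python
def has_departed(last_seen_period: int, current_period: int,
                 missed_limit: int = 3) -> bool:
    """True if no heartbeat has arrived for `missed_limit` periods."""
    return current_period - last_seen_period >= missed_limit


def handle_departure(node_list: set, master_id: int, departed_id: int) -> int:
    """Remove a departed node and return the (possibly new) master ID."""
    node_list.discard(departed_id)
    if departed_id != master_id:
        return master_id            # a backup node left: master unchanged
    if not node_list:
        raise RuntimeError("no nodes left to elect a master from")
    return min(node_list)           # the master left: smallest ID takes over
```

Choosing the smallest ID lets every surviving node reach the same result from its own node list without exchanging further messages.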

Further, the normal operation process of a node specifically includes: each node periodically sends heartbeat messages to signal its presence, and the other nodes periodically receive these heartbeat messages, so that the nodes jointly maintain the cluster node list.

Further, the node configuration process specifically includes: on startup, each node reads its configuration file, initializes itself and the messages it will send from that file, and then adds itself to the cluster node list.
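
The configuration step might look like the following sketch. The four fields (node ID number, own IP address, multicast IP address, port number) are those named in the embodiment described later; the INI layout, the key names, the sample addresses, and the heartbeat message format are assumptions, since the document does not specify a file format.

```python
import configparser
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    node_id: int
    own_ip: str
    multicast_ip: str
    port: int


def load_node_config(text: str) -> NodeConfig:
    """Parse a node's configuration from INI text (hypothetical format)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["node"]
    return NodeConfig(
        node_id=section.getint("id"),
        own_ip=section["ip"],
        multicast_ip=section["multicast_ip"],
        port=section.getint("port"),
    )


# Sample values are made up for the example.
SAMPLE = """
[node]
id = 2
ip = 10.0.0.12
multicast_ip = 239.0.0.1
port = 5000
"""

config = load_node_config(SAMPLE)
node_list = {config.node_id}                               # the node adds itself first
heartbeat = f"HEARTBEAT {config.node_id} {config.own_ip}"  # message prepared in advance
```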

Further, the threads used to implement thread synchronization include a gm_listener thread, a heartbeat thread, an add_flag thread, and a test thread, where:

the gm_listener thread listens for received multicast messages and processes them accordingly;

the heartbeat thread checks the node's state and sends either a join request message or a heartbeat message once every heartbeat period;

the add_flag thread periodically decrements the flag variable that marks the state of each node;

and the test thread periodically checks, for each node in the list, whether its flag variable is less than 0.
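
The interplay of the flag variable across these threads can be sketched with a single-threaded simulation, in which each method plays the role of one thread. The `NodeTable` class, its method names, and the starting flag value of 2 are assumptions; the starting value is chosen here only so that a silent node drops below 0 after three periods.

```python
class NodeTable:
    """Per-node flag counters touched by the listener, add_flag and test roles."""

    def __init__(self, initial_flag: int = 2):
        self.initial_flag = initial_flag    # assumed starting value
        self.flags = {}

    def register(self, node_id: int) -> None:
        # Called when a node is added to the node list.
        self.flags[node_id] = self.initial_flag

    def on_heartbeat(self, node_id: int) -> None:
        # gm_listener role: a heartbeat raises the sender's flag.
        if node_id in self.flags:
            self.flags[node_id] += 1

    def decrement_all(self) -> None:
        # add_flag role: every period, lower each node's flag by one.
        for node_id in self.flags:
            self.flags[node_id] -= 1

    def sweep(self) -> list:
        # test role: nodes whose flag fell below 0 are treated as gone.
        gone = sorted(n for n, f in self.flags.items() if f < 0)
        for node_id in gone:
            del self.flags[node_id]
        return gone
```

For a node that keeps sending heartbeats, the increment and the decrement cancel out each period, so its flag stays at the starting value; only a node that falls silent drifts toward a negative flag.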

Compared with the prior art, the invention has the following advantages:

(1) The method comprises the following steps: step 1, setting up a development framework based on the operating requirements and main components of a micro-service runtime framework, and decoupling the corresponding core code from the micro-service runtime framework capabilities; step 2, designing an architecture within the development framework; step 3, defining micro-service interfaces in the architecture to complete the overall deployment; and step 4, performing different cluster management operations based on the deployed structure. In this way, detailed monitoring of cluster node information is established in a multi-cloud environment, individual nodes can be specified, and the individual data of each node can be compared graphically so that specific faults can be handled;

(2) With the steps of the method, a system administrator can configure the system's automatic response to events through an event service, achieving task distribution, load balancing, and high availability; a user-friendly management interface can be developed, and management becomes safer and more convenient.

Drawings

FIG. 1 is a flow chart of the method steps of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

DETAILED DESCRIPTION OF EMBODIMENT(S) OF THE INVENTION

Fig. 1 shows a cluster management method based on a multi-cloud environment, which includes the following steps:

S1, selecting a development framework: the development framework design is the core of the multi-cloud framework and its most abstract part; it requires an understanding of the operation and main components of a micro-service runtime framework. For most middle-platform systems, the runtime framework they depend on is an RPC framework together with the service governance capabilities built on it, which include service registration and discovery, circuit breaking and fault tolerance, flow control, and similar mechanisms. The core business-logic code is decoupled from the micro-service framework capabilities;

S2, architecture design: the technical foundation is the Spring, Spring Boot, ServiceComb, HSF, and Spring Cloud micro-service frameworks, and the architecture is built on the Spring and Spring Boot technology stack;

S3, defining micro-service interfaces: microservice-endpoint-ServiceComb is published as a ServiceComb micro-service project, microservice-endpoint-HSF is published as an HSF micro-service project, and microservice-endpoint-Spring is published as a Spring Cloud micro-service project. Parameters and return values are defined using Integer, String, and Boolean, or using POJO Beans that conform to the Java Bean specification. Interfaces, abstract classes, base classes with multiple implementations, and template classes are not used as parameters or return values, and objects strongly tied to the runtime environment are not used as interface parameters or return values;

S4, managing the node architecture.

In this cluster management method based on a multi-cloud environment, step S4 includes:

Node joining: on startup, each node reads its own configuration file, which contains the node ID number, the node's own IP address, the multicast IP address, and the port number; the node initializes its messages and itself, and then periodically sends a join request message until it receives join confirmation messages from the other nodes;

Node departure: the state of each node is monitored through the heartbeat messages that the nodes send to one another; if no heartbeat message is received from a node for three periods, that node is considered to have left. There are two basic cases. If a backup node leaves or fails, it is simply deleted from the node list. If the master node leaves or fails, a new master node is elected from the remaining nodes, namely the remaining node with the smallest ID number, and the departed or failed master node is deleted;

Normal operation: while a node is operating normally, it periodically sends heartbeat messages to signal its presence, and the other nodes periodically receive these heartbeat messages, thereby maintaining the cluster node list;

Node configuration: each node has a configuration file, which is stored under the node type in the configuration directory. On startup, the node first reads the configuration file, uses the node ID number, its own IP address, the multicast IP address, the port number, and so on to initialize itself and the messages it will send, and first adds itself to the node list;

Thread synchronization: all threads in a process share the same global memory, which lets them share information. Besides global variables, the threads of a process also share the process instructions, most data, open files (e.g., descriptors), signal handlers and signal settings, the current working directory, and the user ID and group ID. The implementation involves several threads running simultaneously. The gm_listener thread listens for received multicast messages and processes them accordingly: on receiving a join request message, it checks whether the sender is in the node list, adds it if not, and sends a join confirmation message; on receiving a join confirmation message, it checks whether the sender is in the node list and adds it if not; on receiving a heartbeat message, it increments the flag variable of the corresponding node. The heartbeat thread checks the node's state and sends either a join request message or a heartbeat message once every heartbeat period. The add_flag thread periodically decrements the flag variable that marks the state of each node, and the test thread periodically checks, for each node in the list, whether its flag variable is less than 0, that is, whether the node has failed or left.
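
Because these threads update the same flag variables in shared memory, each read-modify-write has to be protected. The following sketch shows one way to do this with a mutex; the use of Python's `threading.Lock`, the function names, and the single-node table are assumptions for illustration and are not stated in the embodiment.

```python
import threading

flags = {1: 0}                  # shared node-flag table (global memory)
flags_lock = threading.Lock()   # guards every read-modify-write


def listener(rounds: int) -> None:
    # Stands in for gm_listener: each received heartbeat adds to the flag.
    for _ in range(rounds):
        with flags_lock:
            flags[1] += 1


def add_flag(rounds: int) -> None:
    # Stands in for add_flag: each period subtracts from the flag.
    for _ in range(rounds):
        with flags_lock:
            flags[1] -= 1


threads = [
    threading.Thread(target=listener, args=(10_000,)),
    threading.Thread(target=add_flag, args=(10_000,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around each update, the equal numbers of increments and decrements cancel exactly and the flag ends at its starting value; without such protection, concurrent updates could be lost and a live node could be misjudged as departed.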

The method establishes detailed monitoring of cluster node information in a multi-cloud environment, allows individual nodes to be specified, and allows the individual data of each node to be compared graphically so that specific faults can be handled. A system administrator can configure the system's automatic response to events through an event service, thereby achieving task distribution, load balancing, and high availability; a user-friendly management interface can be developed, and management becomes safer and more convenient.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

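1. A cluster management method based on a multi-cloud environment, characterized by comprising the following steps:

Step 1: setting up a development framework based on the operating requirements and main components of a micro-service runtime framework, and decoupling the corresponding core code from the micro-service runtime framework capabilities;

Step 2: designing an architecture within the development framework;

Step 3: defining micro-service interfaces in the architecture to complete the overall deployment;

Step 4: performing different cluster management operations based on the deployed structure.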

2. The method of claim 1, wherein the development framework in step 1 comprises an RPC framework and the service governance capabilities based on the RPC framework.

3. The method of claim 2, wherein the service governance capabilities based on the RPC framework include service registration and discovery, circuit breaking and fault tolerance, and flow control.

4. The method according to claim 1, wherein the technical foundation of the architecture in step 2 is the Spring, Spring Boot, ServiceComb, HSF, and Spring Cloud micro-service frameworks.

5. The method according to claim 1, wherein the different cluster management operations in step 4 include node joining, node leaving, normal node operation, node configuration, and thread synchronization.

6. The method according to claim 5, wherein the node joining process specifically comprises: on startup, each node reads its own configuration file and then periodically sends a join request message until it has received join confirmation messages from all other nodes.

7. The method according to claim 5, wherein the node leaving process specifically comprises: the states of all nodes are monitored through the heartbeat messages that the nodes send to one another; if no heartbeat message is received from a node within a set number of periods, that node is considered to have left; if the node that left or failed is a backup node, it is simply deleted from the node list; and if it is the master node, a new master node is elected from the remaining nodes.

8. The method according to claim 5, wherein the normal operation process of a node specifically comprises: each node periodically sends heartbeat messages to signal its presence, and the other nodes periodically receive these heartbeat messages, so that the nodes jointly maintain the cluster node list.

9. The method according to claim 5, wherein the node configuration process specifically comprises: on startup, each node reads its configuration file, initializes itself and the messages it will send from that file, and then adds itself to the cluster node list.

10. The method of claim 5, wherein the threads used to implement thread synchronization comprise a gm_listener thread, a heartbeat thread, an add_flag thread, and a test thread, wherein: