CN113067850B

CN113067850B - Cluster arrangement system under multi-cloud scene

Info

Publication number: CN113067850B
Application number: CN202110192828.1A
Authority: CN
Inventors: 吕冬兵; 李英俊; 李志伟; 杜晋秀; 张业达; 符敦威; 张兴峻; 阿尔曼; 阳万里; 肖志华
Original assignee: Kirin Software Co Ltd
Current assignee: Kirin Software Co Ltd
Priority date: 2021-02-20
Filing date: 2021-02-20
Publication date: 2023-04-07
Anticipated expiration: 2041-02-20
Also published as: CN113067850A

Abstract

The invention relates to a cluster orchestration system for multi-cloud scenarios, comprising an orchestration entry, an orchestration engine module, an alarm module and a cloud agent module. The orchestration entry receives user requests and sends them to the orchestration engine module. The alarm module receives user requests from the message queue, manages cluster alarms, periodically acquires monitoring information for all cluster nodes on each cloud according to the alarm policy, and notifies the orchestration engine module to execute the corresponding operation. The orchestration engine module receives requests from the orchestration entry and the alarm module and executes the specific cluster operations. The cloud agent module receives requests from the alarm module or the orchestration engine module, calls the interfaces of each cloud, and acquires the monitoring data and health status of all cluster nodes. The invention realizes cross-cloud management of application clusters, including cluster orchestration, life cycle management, high availability, automatic scaling and load balancing.

Description

Cluster arrangement system under multi-cloud scene

Technical Field

This patent application belongs to the technical field of multi-cloud cluster orchestration, and in particular relates to a cluster orchestration system for multi-cloud scenarios.

Background

In recent years, with the continuous development of cloud computing technology, cloud service providers have emerged at home and abroad, and a multi-cloud landscape has gradually taken shape.

As the cloud computing market has developed, many enterprises have begun moving their business to the cloud. They rarely adopt a single cloud; instead they combine several, such as public, private and hybrid clouds. Multi-cloud adoption has become the mainstream trend among enterprises.

For various reasons (cost, compliance, avoiding vendor lock-in and so on), fewer and fewer enterprises today use a single public or private cloud, and most prefer to use multiple clouds. According to Flexera's 2020 cloud computing survey report, the vast majority of enterprises (93%) use multiple clouds, up 9 percentage points from 2019. On average, enterprises use more than 4 different clouds in production (2.2 public clouds and 2.2 private clouds) while experimenting with another 4.

When multiple clouds are used (in IaaS scenarios), users want to manage application clusters as conveniently as in the traditional model, and to exploit the characteristics of multiple clouds to orchestrate clusters across cloud platforms, including load balancing, high availability and automatic scaling of the cluster. This is a pressing concern for users at the present stage and a difficult problem in multi-cloud environments.

In production environments, applications usually run as clusters, ranging from a few nodes to thousands of nodes. Most existing cluster orchestration systems and technologies target container cloud scenarios (the invention mainly targets IaaS scenarios, which are different), and the rest usually consider only a single data center or cloud platform. Since most enterprises now face a multi-cloud scenario, these systems cannot orchestrate and manage clusters across cloud platforms.

An existing invention, "A cluster management method and a cluster management system" (application number CN 201911243130.7), provides a cluster management method and system that accesses multiple clusters through a container gateway using a single entry address, so that information about the individual clusters need not be maintained. However, that patent mainly targets the K8S container cloud platform (K8S is short for Kubernetes, an open-source distributed platform for managing containerized applications across multiple hosts, which provides cluster management) and does not support multiple clouds.

Disclosure of Invention

The invention provides a cluster arrangement system in a multi-cloud scene to realize cross-cloud management of application clusters.

In order to solve the problems, the technical scheme adopted by the invention is as follows:

A cluster orchestration system for multi-cloud scenarios, in which several modules for cluster orchestration are built through modularization according to the requirements of cluster orchestration, and the interaction among these modules realizes cluster orchestration, life cycle management, high availability, automatic scaling and load balancing.

The technical scheme of the invention is further improved in that the cluster arrangement system comprises an arrangement entrance, an alarm module, an arrangement engine module and a cloud agent module, wherein:

the orchestration entry is used for receiving user requests and sending them to the orchestration engine module through a message queue;

the alarm module is responsible for setting the alarm policy of the cluster, including the alarm conditions and the cluster operations to be executed after an alarm; the alarm module listens on the message queue, receives user requests from the message queue, manages cluster alarms, periodically acquires, through the cloud agent module and according to the alarm policy, monitoring information for all cluster nodes on each cloud (recent monitoring data or health status), and evaluates against the cluster's alarm policy whether to raise an alarm or notify the orchestration engine module to execute the corresponding operation;

the orchestration engine module is used for listening on the message queue, receiving requests from the orchestration entry and the alarm module, and executing the specific cluster operations, which comprise creation, expansion, contraction, recovery and rebuilding of a cluster;

the cloud agent module is used for receiving requests from the alarm module or the orchestration engine module, calling the interfaces of each cloud, operating cluster nodes (such as virtual machines or bare-metal machines) on each cloud, and acquiring the monitoring data and health status of the cluster nodes.

The technical scheme of the invention is further improved in that the alarm policy in the alarm module is as follows: when the comprehensive load of the cluster stays above or below a set threshold for a certain time, the orchestration engine module is notified to execute an expansion or contraction operation; when a node in the cluster is down for a certain time, the orchestration engine module is notified to execute the corresponding recovery or rebuild operation, both of which belong to the recovery policy.

The technical scheme of the invention is further improved in that the cloud agent module manages each cloud in plug-in form, that is, the cloud back ends are managed as plug-ins, so that the cloud agent module can be extended easily.

The technical scheme of the invention is further improved in that the system further comprises a timer module for managing the timed tasks in the system: when a user creates a timed task through the orchestration entry, the timer module maintains a task queue and sends each task that falls due to the orchestration engine module through the message queue, so that the specific cluster operation is executed.

The technical proposal of the invention is further improved in that the cluster arranging process of the cluster arranging system comprises the following steps:

S1, supporting the various cloud back ends in plug-in form and acquiring detailed information about each cloud through the cloud agent module;

acquiring cloud information: the connection and authentication information of each cloud is registered, the corresponding cloud agent service is started, and detailed information about each cloud, including the current CPU, memory and disk capacity, is then acquired through the cloud agent module;

S2, registering images, networks and their mappings on the various clouds with the cluster orchestration system;

registering images: the images required by the cluster are built, converted into the format required by each cloud and distributed to each cloud (existing images can be used directly), and then registered with the cluster orchestration system; the unique identifier of the image on each cloud is set at registration time;

registering networks: the networks to be used are created on each cloud platform and registered with the cluster orchestration system; as with images, the unique identifier of the network on each cloud is set;

S3, creating a cluster definition template, that is, defining a cluster template in YAML format using a specific syntax, to ensure that the nodes in a cluster have the same configuration;

creating a cluster definition template: the template file uses the YAML format and specifies the cluster node type and configuration information (image, CPU, memory, disk, network and so on), which ensures that the nodes in the cluster are configured consistently;

S4, creating a cluster from the cluster template and setting the capacity of the cluster and the multi-cloud scheduling policy.

The cluster is started from the cluster definition template created above. The minimum and maximum number of nodes in the cluster are specified, and a multi-cloud scheduling policy is set, which selects the cloud on which nodes are created and on which nodes are later added or deleted. Once the cluster has been created successfully, its life cycle can be managed through the orchestration entry, nodes can be added to or deleted from it, and the power state and life cycle of its nodes can be managed.

The multi-cloud scheduling policy in step S4 requires the weight W and the capacity N of each cloud to be set, and may be a greedy policy or an opportunistic policy, where:

the greedy policy selects the cloud with the highest weight W each time a node is added, until that cloud's capacity N is reached, and selects the cloud with the lowest weight W when a node is deleted;

the opportunistic policy schedules according to a probability F calculated from the weights W of all clouds, by the following formula:
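The scheduling formula is not reproduced in this text. The sketch below shows how the two policies could behave, assuming that the probability F of a cloud is its weight W divided by the sum of the weights of all clouds that still have spare capacity. The `Cloud` class and the function names are hypothetical and are not part of the original disclosure.

```python
import random
from dataclasses import dataclass


@dataclass
class Cloud:
    name: str
    weight: float   # W: scheduling weight of the cloud
    capacity: int   # N: maximum number of cluster nodes on the cloud
    nodes: int = 0  # cluster nodes currently placed on the cloud


def greedy_add(clouds):
    """Pick the highest-weight cloud that still has spare capacity."""
    candidates = [c for c in clouds if c.nodes < c.capacity]
    return max(candidates, key=lambda c: c.weight) if candidates else None


def greedy_remove(clouds):
    """Pick the lowest-weight cloud that still hosts a node."""
    candidates = [c for c in clouds if c.nodes > 0]
    return min(candidates, key=lambda c: c.weight) if candidates else None


def opportunistic_add(clouds, rng=random):
    """Pick a cloud at random with probability W_i / sum(W) (assumed formula)."""
    candidates = [c for c in clouds if c.nodes < c.capacity]
    if not candidates:
        return None
    weights = [c.weight for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With clouds weighted 3 and 1, the greedy policy fills the first cloud completely before touching the second, whereas the opportunistic policy would place roughly three nodes in four on the first cloud while both have capacity.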

S5, setting alarm policies for the cluster as needed and binding the specific execution policies they require, namely a health policy, a load balancing policy and a scaling policy, which respectively realize HA (high availability), load balancing and automatic scaling of the cluster.

The health policy means that the alarm module periodically acquires, through the cloud agent module, the health status of the cluster nodes on each cloud; once a node is detected to be down for a period of time, the orchestration engine is notified to execute the corresponding recovery policy;

correspondingly, the recovery policy recovers the node using a specified task flow, which comprises restarting, migrating, rebuilding and switching cloud platform (Fig. 2). That is, recovery first tries to restart the node; if restarting does not recover it, the node is migrated to another physical host; if migration does not recover it, the node is deleted and rebuilt; and if the node still cannot be recovered on that cloud, the cloud is marked as unavailable and the node is rebuilt on another cloud.

The scaling policy means that the alarm module periodically acquires, through the cloud agent module, recent monitoring data for the cluster nodes on each cloud, comprising the average CPU utilization C, the average memory utilization M, the average disk IO load D and the average network load N, and then calculates the comprehensive load Z of the cluster by combining the weights W set for each monitoring item in the multi-cloud scheduling policy. The calculation formula is as follows:

If the comprehensive load Z stays above a preset threshold for a certain time, the corresponding cluster expansion policy is executed; otherwise the corresponding cluster contraction policy is executed.
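The formula referred to above is not shown in this text. One reading consistent with the description, offered as an assumption rather than as the formula of record, is a weighted sum of the four monitored items. The per-item weight symbols $W_C$, $W_M$, $W_D$ and $W_N$, and the normalization of the weights to 1, are introduced here for illustration only.

```latex
Z = W_C \cdot C + W_M \cdot M + W_D \cdot D + W_N \cdot N,
\qquad W_C + W_M + W_D + W_N = 1
```

Under this reading, weights of 0.4, 0.3, 0.2 and 0.1 with loads C = 0.9, M = 0.8, D = 0.5 and N = 0.5 give Z = 0.36 + 0.24 + 0.10 + 0.05 = 0.75.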

The load balancing policy means that the required binding configuration is specified for a dedicated load balancing component. After binding is completed, the load balancing component automatically creates the corresponding load balancer and adds the nodes in the cluster to it, and the load balancer is updated accordingly during automatic scaling.

The process further comprises step S6:

S6, binding timed tasks to the cluster as needed so that cluster operations are executed on a schedule: a timing policy is set, one or more timed tasks are created and bound to the cluster through the timing policy, and the cluster operations are executed at the scheduled times.

Due to the adoption of the technical scheme, the invention has the beneficial effects that:

(1) Cluster orchestration in a multi-cloud environment is supported.

(2) Orchestration is template-based, and flexible to define and extend.

(3) A flexible policy mechanism binds the corresponding policies to a cluster as needed.

(4) Automatic scaling, load balancing and HA of the cluster are supported at the same time.

(5) In the multi-cloud architecture, cloud platforms that are not yet supported can be added in plug-in form.

Drawings

FIG. 1 is a schematic diagram of a system architecture.

Fig. 2 is a node recovery flow of the health policy.

Detailed Description

The present invention will be described in further detail with reference to examples.

To address the above problems, the invention discloses a cluster orchestration system for multi-cloud scenarios that realizes cross-cloud management of application clusters, including cluster orchestration, life cycle management, high availability, automatic scaling and load balancing.

Before describing the present invention, certain abbreviations and key terms are defined.

IaaS: Infrastructure as a Service, one of the three service models of cloud computing (the other two are PaaS and SaaS). It mainly provides compute, storage and network virtualization, delivered to users in the form of cloud hosts (virtual machines and bare-metal machines).

The other cloud computing service models are described below.

PaaS: Platform as a Service.

SaaS: Software as a Service.

Infrastructure (such as virtual machines, servers, storage space, network bandwidth and security) is at the bottom, platforms (such as databases, development tools, web servers and software runtime environments) are in the middle, and software (such as CRM, email, virtual desktops, unified communications and online gaming) is at the top.

IaaS: Infrastructure as a Service is the first layer.

PaaS: Platform as a Service is the second layer, sometimes called middleware.

SaaS: Software as a Service is the third layer.

The YAML format: YAML originally stood for "Yet Another Markup Language". It is an intuitive, machine-readable data serialization format that is highly readable for humans, interacts easily with scripting languages, and is used to express structured data. It is a data description language comparable to XML (itself a subset of the Standard Generalized Markup Language), but with a much simpler syntax. Because it is simple to implement and cheap to parse, YAML is particularly well suited to scripting languages. YAML parsers are available for languages including Ruby, Java, Perl, Python, PHP, OCaml and JavaScript.

HA: High Availability refers to server clustering techniques that aim to reduce service interruption time. By protecting the user's business programs so that they keep providing service without interruption, it minimizes the impact of software, hardware and human-caused faults on the business.

The system architecture of the invention is shown in fig. 1, and comprises an arrangement entrance, an arrangement engine module, an alarm module and a cloud agent module.

The orchestration entry receives user requests, which mainly concern management of cluster templates, life cycle management of clusters and alarm management of clusters, and sends them to the orchestration engine module through a message queue.

The alarm module is responsible for setting the alarm policy of a cluster, including the alarm conditions and the cluster operations to be executed after an alarm. The alarm module listens on the message queue, receives user requests from the message queue, manages cluster alarms, periodically acquires monitoring information for all cluster nodes on each cloud through the cloud agent module according to the alarm policy, and evaluates against the cluster's alarm policy whether to raise an alarm or notify the orchestration engine module to execute the corresponding operation. When the comprehensive load of the cluster stays above or below a set threshold for a certain time, the orchestration engine is notified to execute an expansion or contraction operation; when a node in the cluster is down for a certain time, the orchestration engine is notified to execute a recovery or rebuild operation.

The orchestration engine module also listens on the message queue, receives requests from the orchestration entry and the alarm module, and executes the specific cluster operations, such as creation, expansion, contraction, recovery and rebuilding of a cluster.

The cloud agent receives requests from the alarm module or the orchestration engine, calls the interfaces of each cloud, operates cluster nodes (such as virtual machines or bare-metal machines) on each cloud, and acquires the monitoring data and health status of the cluster nodes. In the implementation, the cloud agent module manages each cloud in plug-in form, that is, the cloud back ends are plug-ins, so that the cloud agent module can be extended easily.

The orchestration system also has a timer module, which manages the timed tasks in the system. When a user creates a timed task through the orchestration entry, the timer module maintains a task queue and sends each task that falls due to the orchestration engine through the message queue, so that the specific cluster operation is executed.
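The message flow between the modules can be sketched with an in-process queue. This is a minimal illustration that assumes a single shared queue and a dictionary message format; the message fields, the operation names and the `OrchestrationEngine` class are hypothetical, and a real deployment would presumably use a message broker rather than `queue.Queue`.

```python
import queue


def submit(bus, source, op, cluster, **params):
    """Put a request on the shared queue (used by entry, alarm and timer)."""
    bus.put({"source": source, "op": op, "cluster": cluster, "params": params})


class OrchestrationEngine:
    """Consumes requests from the queue and runs the matching operation."""

    OPERATIONS = {"create", "scale_out", "scale_in", "recover", "rebuild"}

    def __init__(self, bus):
        self.bus = bus
        self.executed = []  # stands in for real cluster operations

    def drain(self):
        """Handle every request currently waiting on the queue."""
        while True:
            try:
                msg = self.bus.get_nowait()
            except queue.Empty:
                return
            # Unknown operations are dropped in this sketch.
            if msg["op"] in self.OPERATIONS:
                self.executed.append((msg["source"], msg["op"], msg["cluster"]))
```

Because every producer writes to the same queue, the engine does not need to know whether a scale-out request came from a user, an alarm or a timed task.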

The implementation of a specific embodiment of the present invention is as follows.

(1) The connection and authentication information of each cloud is registered with the cluster orchestration system and the corresponding cloud agent service is started. The cloud agent module manages each cloud in plug-in form, and detailed information about each cloud, including the current CPU, memory and disk capacity, is then acquired through the cloud agent module;

(2) Registering images: the images required by the cluster are built, converted into the format required by each cloud and distributed to each cloud (existing images can be used directly), and then registered with the cluster orchestration system; the unique identifier of the image on each cloud is set at registration time;

(3) Registering networks: the networks to be used are created on each cloud platform (existing networks can be used directly), the nodes on the different clouds are connected through the public network, a dedicated line or a large layer-2 network, and the networks are registered with the cluster orchestration system; as with images, the unique identifier of the network on each cloud must be set;
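The mapping kept for images and networks in steps (2) and (3) can be pictured as a lookup from a logical name to a per-cloud identifier. The sketch below is one possible shape for it; the `ResourceRegistry` class and the example names and identifiers are hypothetical.

```python
class ResourceRegistry:
    """Maps a logical image or network name to its identifier on each cloud."""

    KINDS = ("image", "network")

    def __init__(self):
        self._ids = {}  # (kind, name) -> {cloud name: identifier on that cloud}

    def register(self, kind, name, per_cloud_ids):
        if kind not in self.KINDS:
            raise ValueError(f"unknown resource kind: {kind}")
        self._ids[(kind, name)] = dict(per_cloud_ids)

    def resolve(self, kind, name, cloud):
        """Return the identifier to use on `cloud`; KeyError if unregistered."""
        return self._ids[(kind, name)][cloud]

    def clouds_for(self, kind, name):
        """List the clouds on which the resource has been registered."""
        return sorted(self._ids[(kind, name)])
```

A cluster template can then refer to an image or network by its logical name only, and the identifier is resolved once the scheduling policy has chosen a cloud for the node.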

(4) Creating a cluster definition template: the template file uses the YAML format and specifies the cluster node type and the configuration of image, CPU, memory, disk, network and so on; the template ensures that the nodes in the cluster are configured consistently. An example of a cluster definition template is as follows:
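The example template is not reproduced in this text. As a stand-in, the sketch below shows the kind of data such a template might hold once the YAML file has been parsed, together with a small validation step. Every field name and value here is hypothetical; only the categories (node type, image, CPU, memory, disk, network) come from the description.

```python
# Parsed form of a hypothetical cluster definition template.
TEMPLATE = {
    "type": "vm",                 # node type, e.g. virtual machine or bare metal
    "image": "web-base",          # logical image name registered in step (2)
    "cpu": 4,                     # number of virtual CPUs
    "memory_mb": 8192,
    "disk_gb": 100,
    "networks": ["cluster-net"],  # logical network names registered in step (3)
}

REQUIRED_FIELDS = {
    "type": str,
    "image": str,
    "cpu": int,
    "memory_mb": int,
    "disk_gb": int,
    "networks": list,
}


def validate_template(template):
    """Return a list of problems; an empty list means the template is usable."""
    errors = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in template:
            errors.append(f"missing field: {key}")
        elif not isinstance(template[key], expected):
            errors.append(f"{key} must be {expected.__name__}")
    return errors


def node_spec(template, index):
    """Derive the spec of one node; all nodes share the template settings."""
    spec = dict(template)
    spec["name"] = f"node-{index}"
    return spec
```

Deriving every node from the same template is what keeps the configuration of the cluster nodes consistent.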

(5) The cluster is started from the cluster definition template created in (4). The minimum and maximum number of nodes in the cluster are specified, and a multi-cloud scheduling policy is set, which selects the cloud on which nodes are created and on which nodes are later added or deleted. The scheduling policy requires the weight W and the capacity N of each cloud to be set. Two scheduling policies are supported, greedy and opportunistic: the greedy policy selects the cloud with the highest weight each time a node is added, until that cloud's capacity N is reached, and selects the cloud with the lowest weight when a node is deleted; the opportunistic policy schedules according to a probability F calculated from the weights of the clouds, by the following formula:

After the cluster has been created successfully, its life cycle can be managed through the system entry, nodes can be added to or deleted from it, and the power state and life cycle of its nodes can be managed.

(6) The alarm module is responsible for setting the alarm policies of the cluster, including the alarm conditions and the cluster operations to be executed after an alarm. The alarm module does not perform monitoring itself; instead it periodically acquires, through the cloud agent module, recent monitoring data or the health status of the cluster nodes on each cloud, and evaluates them against the alarm policy to decide whether to raise an alarm. A user can bind several alarm policies to a cluster. The alarm policies support health policies, which realize HA (high availability) of the cluster, and scaling policies, which realize automatic scaling of the cluster.
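The requirement that a condition must last for a certain time before an alarm fires can be sketched as a small stateful check fed by the periodic samples. The `SustainedCondition` class below is a hypothetical illustration; time is passed in explicitly (in seconds) so that the behaviour is easy to follow.

```python
class SustainedCondition:
    """Fires only after a condition has held continuously for `duration` seconds."""

    def __init__(self, predicate, duration):
        self.predicate = predicate  # callable: sample -> bool
        self.duration = duration
        self._since = None          # time at which the condition first became true

    def observe(self, value, now):
        """Feed one sample taken at time `now`; return True if the alarm fires."""
        if not self.predicate(value):
            self._since = None      # condition broken: restart the clock
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.duration
```

A single sample below the threshold resets the clock, so short spikes do not trigger a cluster operation.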

(7) For the health policy, the alarm module periodically acquires, through the cloud agent module, the health status of the cluster nodes on each cloud; once a node is detected to be down for a period of time, the orchestration engine is notified to execute the corresponding recovery policy. The recovery policy specifies a task flow for recovering the node. The default task flow comprises restarting, migrating, rebuilding and switching cloud platform (Fig. 2): recovery first tries to restart the node; if restarting does not recover it, the node is migrated to another physical host; if migration does not recover it, the node is deleted and rebuilt; and if the node still cannot be recovered on that cloud, the cloud is marked as unavailable and the node is rebuilt on another cloud. The user can also define a custom recovery task flow, or specify that node downtime is not handled immediately and that recovery or node addition is performed only when the number of live nodes in the cluster falls below a certain threshold; nodes are added according to the scheduling policy specified in (5).
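The default task flow can be sketched as an ordered list of steps with a fallback to other clouds. In the sketch below, `try_step` is a hypothetical caller-supplied function that performs one step and reports whether the node is healthy afterwards; the function name, return format and step names are assumptions, not the interface of the described system.

```python
DEFAULT_FLOW = ("restart", "migrate", "rebuild")


def recover_node(node, cloud, other_clouds, try_step, flow=DEFAULT_FLOW):
    """Run the recovery steps in order, then fall back to another cloud.

    `try_step(step, node, cloud)` performs one step and returns True when
    the node is healthy afterwards.
    Returns (action_that_worked, cloud_used, clouds_marked_unavailable).
    """
    for step in flow:
        if try_step(step, node, cloud):
            return step, cloud, []
    # Nothing worked on the original cloud: mark it unavailable and
    # try to rebuild the node on each of the other clouds in turn.
    unavailable = [cloud]
    for target in other_clouds:
        if try_step("rebuild", node, target):
            return "switch_cloud", target, unavailable
    return None, None, unavailable
```

Passing a different `flow` tuple corresponds to a user-defined recovery task flow; in practice the order of `other_clouds` would follow the scheduling policy from (5).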

(8) For the scaling policy, the alarm module periodically acquires, through the cloud agent module, recent monitoring data for the cluster nodes on each cloud, comprising the average CPU utilization C, the average memory utilization M, the average disk IO load D and the average network load N, and then calculates the comprehensive load Z of the cluster by combining the weights W set for each monitoring item in the policy. The calculation formula is as follows:

If the comprehensive load stays above a preset threshold for a certain time, the corresponding cluster expansion policy is executed; otherwise the corresponding contraction policy is executed. Besides the threshold and the corresponding operation, the scaling policy also specifies the number of nodes to add or remove each time and the cooldown time of the operation. Note that during expansion or contraction the number of nodes in the cluster must not fall below the minimum or exceed the maximum defined for the cluster, and that cloud selection follows the scheduling policy in (5).
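The scaling decision can be sketched as follows. Since the formula is not shown here, the sketch assumes that Z is a weighted sum of the monitored items, and it uses separate upper and lower thresholds so that a load in between leaves the cluster unchanged. The policy fields (`high`, `low`, `step`, `cooldown`, `min_nodes`, `max_nodes`) are hypothetical names, and the check that the load has lasted for a certain time is assumed to have been done by the caller.

```python
def composite_load(metrics, weights):
    """Weighted sum of the monitored items (assumed form of Z)."""
    return sum(weights[key] * metrics[key] for key in weights)


def scaling_step(size, z, policy, now, last_action_at=None):
    """Return the change in node count (+n, -n or 0) for the current load.

    `now` and `last_action_at` are timestamps in seconds.
    """
    if last_action_at is not None and now - last_action_at < policy["cooldown"]:
        return 0  # still cooling down after the previous operation
    if z > policy["high"]:
        # Expand, but never beyond the maximum defined for the cluster.
        target = min(size + policy["step"], policy["max_nodes"])
    elif z < policy["low"]:
        # Contract, but never below the minimum defined for the cluster.
        target = max(size - policy["step"], policy["min_nodes"])
    else:
        target = size
    return target - size
```

For example, a cluster of 4 nodes with a step of 2 and a maximum of 5 nodes grows by only one node when the load is high, because the step is clamped to the cluster bounds.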

(9) In addition, a load balancing policy can be bound to the cluster. The system does not implement load balancing itself but works with a dedicated load balancing component (such as Octavia in OpenStack). The load balancing policy specifies settings such as the virtual machine IP, connection limits, protocol, port, load balancing algorithm and health check. After the policy is bound, the corresponding load balancer is created automatically and the nodes in the cluster are added to it, and the load balancer is updated accordingly during automatic scaling. For the health policy in (7), the health check can also directly use the load balancer's own health check.

(10) One or more timed tasks can be created and bound to the cluster to execute cluster operations (such as expansion or contraction) on a schedule. A timed task is a cron-like task and can be configured as flexibly as cron. Timed tasks are generally used to expand or contract the cluster (or to power nodes on and off) according to plan, in anticipation of a known load pattern. For example, heavy daytime traffic requires more nodes while light night-time traffic requires fewer; the schedule can even be tuned for each time slot in order to save cost.
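The task queue kept by the timer module can be sketched as a priority queue ordered by due time. The sketch below uses fixed repeat intervals rather than cron expressions, which is a simplifying assumption; the `TimerModule` class, its methods and the task format are hypothetical.

```python
import heapq
import itertools


class TimerModule:
    """Keeps timed tasks in a priority queue ordered by due time."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for tasks with equal due times

    def schedule(self, due, op, cluster, interval=None):
        """Queue a task; a task with an `interval` repeats after it has run."""
        heapq.heappush(self._heap, (due, next(self._seq), op, cluster, interval))

    def pop_due(self, now):
        """Return the tasks due at `now`, re-queuing the periodic ones."""
        due_tasks = []
        while self._heap and self._heap[0][0] <= now:
            due, _, op, cluster, interval = heapq.heappop(self._heap)
            due_tasks.append({"op": op, "cluster": cluster})
            if interval:
                self.schedule(due + interval, op, cluster, interval)
        return due_tasks
```

Each task returned by `pop_due` would be placed on the message queue for the orchestration engine, for example an expansion at 08:00 and a contraction at 20:00 every day.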

The advantages of the invention lie in its modular cluster configuration, cluster policy mechanism, health policy execution flow, system architecture and multi-cloud scheduling policy. It realizes cluster orchestration in a multi-cloud environment with template-based orchestration that is flexible to define and extend, and its flexible policy mechanism allows policies to be bound to clusters as needed, giving it high application value.

Claims

1. A cluster orchestration system for a multi-cloud scenario, characterized in that: according to the requirements of cluster orchestration, several modules for cluster orchestration are built through modularization, and the interaction among the modules realizes cluster orchestration, life cycle management, high availability, automatic scaling and load balancing;

the system comprises an orchestration entry, an alarm module, an orchestration engine module and a cloud agent module, wherein:

the alarm module is responsible for setting the alarm policies of the cluster, including the alarm conditions and the cluster operations to be executed after an alarm; it listens on the message queue, receives user requests from the message queue, manages cluster alarms, periodically acquires monitoring information for all cluster nodes on each cloud through the cloud agent module, and evaluates against the alarm policies of the cluster whether to raise an alarm or notify the orchestration engine module to execute the corresponding operation;

the orchestration engine module is used for listening on the message queue, receiving requests from the orchestration entry and the alarm module, and executing the specific cluster operations, which comprise creation, expansion, contraction, recovery and rebuilding of a cluster;

the cloud agent module is used for receiving requests from the alarm module or the orchestration engine module, calling the interfaces of each cloud, operating cluster nodes on each cloud, and acquiring the monitoring data and health status of the cluster nodes;

the cluster arranging process of the cluster arranging system comprises the following steps:

S1, managing each cloud through the cloud agent module in plug-in form, and acquiring detailed information about each cloud through the cloud agent module;

S2, registering images, networks and their mappings on the various clouds;

S3, defining a cluster template in YAML format using a syntax, to ensure that the nodes in a cluster have the same configuration;

S4, creating a cluster from the cluster template, setting the capacity of the cluster, and setting a multi-cloud scheduling policy;

2. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that: the alarm policy in the alarm module is that when the comprehensive load of the cluster stays above or below a set threshold for a certain time, the orchestration engine module is notified to execute an expansion or contraction operation; and when a node in the cluster is down for a certain time, the orchestration engine module is notified to execute the corresponding recovery or rebuild operation.

3. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that: the system further comprises a timer module for managing the timed tasks in the system; when a user creates a timed task through the orchestration entry, the timer module maintains a task queue and sends each task that falls due to the orchestration engine module through the message queue, so that the specific cluster operation is executed.

4. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that:

in step S4, the multi-cloud scheduling policy requires the weight W and the capacity N of each cloud to be set, and comprises a greedy policy or an opportunistic policy, where:

the opportunistic policy schedules according to a probability F calculated from the weights W of the clouds, by the following formula:

5. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that: in step S5, the health policy means that the alarm module periodically acquires, through the cloud agent module, the health status of the cluster nodes on each cloud, and once a node is detected to be down for a period of time, the orchestration engine is notified to execute the corresponding recovery policy;

correspondingly, the recovery policy recovers the node using a specified task flow, which comprises restarting, migrating, rebuilding and switching cloud platform; that is, recovery first tries to restart the node; if restarting does not recover it, the node is migrated to another physical host; if migration does not recover it, the node is deleted and rebuilt; and if the node still cannot be recovered on that cloud, the cloud is marked as unavailable and the node is rebuilt on another cloud.

6. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that: in step S5, the scaling policy means that the alarm module periodically acquires, through the cloud agent module, recent monitoring data for the cluster nodes on each cloud, comprising the average CPU utilization C, the average memory utilization M, the average disk IO load D and the average network load N, and then calculates the comprehensive load Z of the cluster by combining the weights W set for each monitoring item in the multi-cloud scheduling policy, the calculation formula being as follows:

7. The cluster orchestration system for a multi-cloud scenario according to claim 1, characterized in that: in step S5, the load balancing policy means that the required binding configuration is specified for a dedicated load balancing component; after binding is completed, the load balancing component automatically creates the corresponding load balancer and adds the nodes in the cluster to it, and the load balancer is updated accordingly during automatic scaling.

8. The system of claim 1, wherein the cluster orchestration system under a multi-cloud scenario is: further comprising a step S6 of carrying out,