CN112328359A - Scheduling method for avoiding container cluster starting congestion and container cluster management platform - Google Patents


Info

Publication number
CN112328359A
CN112328359A (application CN202011188530.5A)
Authority
CN
China
Prior art keywords
pod
starting
resource
pods
pool
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011188530.5A
Other languages
Chinese (zh)
Other versions
CN112328359B (en)
Inventor
陈凯 (Chen Kai)
王成龙 (Wang Chenglong)
郭子敏 (Guo Zimin)
Current Assignee
Fiberhome Telecommunication Technologies Co Ltd
Original Assignee
Fiberhome Telecommunication Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Fiberhome Telecommunication Technologies Co Ltd filed Critical Fiberhome Telecommunication Technologies Co Ltd
Priority to CN202011188530.5A priority Critical patent/CN112328359B/en
Publication of CN112328359A publication Critical patent/CN112328359A/en
Application granted granted Critical
Publication of CN112328359B publication Critical patent/CN112328359B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06F9/45558: Hypervisor-specific management and integration aspects
    • G06F9/5022: Allocation of resources; mechanisms to release resources
    • G06F9/5038: Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/5077: Logical partitioning of resources; management or configuration of virtualized resources
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L67/60: Scheduling or organising the servicing of application requests using the analysis and optimisation of the required network resources
    • G06F2009/45575: Starting, stopping, suspending or resuming virtual machine instances
    • G06F2009/45595: Network integration; enabling network access in virtual machine instances
    • G06F2209/5011: Indexing scheme relating to resource allocation: pool
    • G06F2209/5021: Indexing scheme relating to resource allocation: priority


Abstract

The invention discloses a scheduling method for avoiding container cluster startup congestion, and a container cluster management platform. In the scheduling method, while a compute node is starting Pods: if a preset suppression condition is met, the node selects some Pods from a start Pod pool according to the Pod start order and adds them to a suppression Pod pool, thereby suppressing them; if a preset recovery condition is met, the node determines the Pods to recover in the current round according to the Pod start order, the resource occupancy of the Pods, and the resource capacity the compute node can provide, and adds those Pods back into the start Pod pool to restart them, repeating until all Pods have finished starting. In the invention, application container startup is scheduled in an orderly manner, container startup efficiency is improved, startup time is saved, and startup deadlock is prevented.

Description

Scheduling method for avoiding container cluster starting congestion and container cluster management platform
Technical Field
The invention belongs to the field of container scheduling, and particularly relates to a scheduling method for avoiding starting congestion of a container cluster and a container cluster management platform.
Background
With the continuous development of virtualization technology, more and more enterprises deploy applications onto cloud platforms. A cloud platform mainly provides two ways of deploying an application: virtual machines and containers. Containers have developed especially rapidly: because a container carries no guest operating system, it is lighter, occupies fewer resources, starts faster, and is easier to deploy than a virtual machine, and it shields the application from the differences between operating systems.
These advantages have attracted a large number of applications to be deployed as containers, which in turn has stimulated demand for unified container cluster management platforms. At present the industry generally adopts Kubernetes, an open-source project originated by Google, and builds its own container cluster management platforms on top of it. The Kubernetes platform provides large-scale unified deployment of containers, resource and state monitoring, flexible cross-node scheduling, smooth upgrade of application containers, capacity expansion and contraction, load balancing, fault detection, self-repair, and so on, so that developers can concentrate on their services and leave platform deployment work to the container management platform.
The Kubernetes platform adopts a distributed architecture based on container technology, comprising control nodes (master nodes) and compute nodes (worker nodes). Each node may be a physical server or a virtual machine. The container group scheduling module of the control node schedules each target container to an appropriate compute node for creation and operation according to the remaining resource capacity of each compute node and the container's resource requirements (including CPU and memory resources). The compute node is the operating carrier of containers and is responsible for their life-cycle management, state monitoring, and so on. To facilitate container management, Kubernetes introduces the concept of a Pod on the compute nodes: a Pod is the minimum unit managed by Kubernetes, consisting of a plurality of containers that together form the container set of an application, and the Kubernetes platform performs deployment scheduling in units of Pods. The invention likewise schedules at Pod granularity.
At present more and more cloud platforms are deployed as integrated cabinets, where abnormal power-off may occur, the cabinet may be physically relocated, or the all-in-one machine is pre-deployed with applications before leaving the factory. Because the container management platform lacks Pod start-priority management and start-order management, after each node powers on, all deployed application Pods start simultaneously and out of order. During startup the Pods simultaneously preempt the compute node's resources, which drives up node resource utilization, reduces the resources available to each Pod, and prolongs startup time. The platform's Pod health-check module then judges that Pods have timed out during startup and restarts the containers under those Pods according to policy.
In addition, if there are deployment dependencies between application Pods, a dependency deadlock may occur and the service cannot start. For example, suppose the service of Pod A depends on Pod B and the service of Pod B depends on Pod C, while Pod A, Pod B, and Pod C all start at the same time and contend for node resources. Even if Pod A finishes starting at the Pod level first, its service must still wait for Pod B. When Pod A's wait times out, Pod A is restarted, so it keeps occupying compute node resources, Pod B and Pod C get fewer resources, and the overall startup time of the application is prolonged. Fig. 1 shows how the number of running Pods in a compute node and the node's resource occupancy vary: when the number of running Pods is large, the Pods preempt the compute node's resources from one another.
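The A → B → C dependency chain in this example is exactly the ordering that an orderly start schedule must respect. As a minimal sketch (the graph representation and function name are hypothetical, not from the patent), a dependency-respecting start order can be derived like this:

```python
def start_order(deps):
    """Return Pods in an order that starts each dependency before its
    dependents. `deps` maps a Pod to the Pod its service depends on
    (or None if it depends on nothing)."""
    order = []

    def visit(pod):
        if pod in order:
            return
        if deps.get(pod):
            visit(deps[pod])  # start the dependency first
        order.append(pod)

    for pod in deps:
        visit(pod)
    return order

# Pod A depends on Pod B, Pod B on Pod C: C must start first.
chain = start_order({"PodA": "PodB", "PodB": "PodC", "PodC": None})
```

Starting in this order avoids the wait-timeout-restart cycle described above, since no Pod waits on a service that has not yet been scheduled.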
Disclosure of Invention
In view of the defects or improvement needs of the prior art, the invention provides a scheduling method for avoiding container cluster startup congestion, and a container cluster management platform. The aim is to add deployment-order and priority management for application containers, schedule application container startup in an orderly manner, improve container startup efficiency, save startup time, and prevent startup deadlock, thereby solving the technical problem that, when a container cluster management platform starts a large number of containers simultaneously, preemption of compute node resources drives node resource utilization high, reduces the resources available to each Pod, and prolongs startup time.
In order to achieve the above object, according to one aspect of the present invention, there is provided a scheduling method for avoiding container cluster startup congestion. The scheduling method is applied to a container cluster management platform comprising a control node and at least one compute node, each compute node being provided with a start Pod pool and a suppression Pod pool. The scheduling method comprises:
while the compute node is starting Pods, if a preset suppression condition is met, selecting some Pods from the start Pod pool according to the Pod start order and adding the selected Pods to the suppression Pod pool so as to suppress them;
if a preset recovery condition is met, determining the Pods to recover in the current round according to the Pod start order, the resource occupancy of the Pods, and the resource capacity the compute node can provide, and adding those Pods to the start Pod pool so as to restart them, until all Pods have finished starting.
Preferably, if the preset recovery condition is met, determining the Pods to recover in the current round according to the Pod start order, the resource occupancy of the Pods, and the resource capacity the compute node can provide, and adding them to the start Pod pool to restart them until all Pods have finished starting, comprises:
if the preset recovery condition is met, selecting the Pods to recover according to the Pod start order, subject to the requirement that the sum of the resource demand of Pods that have finished starting, the platform's resource limits on Pods still starting, and the platform's resource limits on the Pods to be recovered is less than the resource capacity the compute node can provide;
removing the Pods that have completed startup from the start Pod pool, and adding the Pods to be recovered to the start Pod pool so as to restart them, until all Pods have finished starting.
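The suppression and recovery steps above can be sketched as one scheduling round. This is an illustrative model only: the function and field names are invented, the utilization bookkeeping is a crude stand-in for real node metrics, and the Ratio_started part of the recovery condition is omitted for brevity.

```python
from collections import deque

def schedule_round(start_pool, suppress_pool, completed_demand,
                   starting_limit, node_capacity, utilization, util_threshold):
    """One suppress-or-recover round of the scheduling method sketched above.

    start_pool / suppress_pool are deques of (start_order, name, limit)
    tuples, kept sorted by start order (earliest first). completed_demand
    is the resource demand of Pods that finished starting; starting_limit
    is the platform's summed resource limit over Pods still starting.
    """
    # Suppression condition: utilization over threshold AND demand of
    # completed Pods plus limits of starting Pods exceeds node capacity.
    if utilization > util_threshold and completed_demand + starting_limit > node_capacity:
        while start_pool and utilization > util_threshold:
            order, name, limit = start_pool.pop()  # suppress latest-starting Pods first
            suppress_pool.append((order, name, limit))
            starting_limit -= limit
            utilization -= limit / node_capacity   # rough model of freed resources
    # Recovery: return suppressed Pods, in start order, while their
    # limits still fit under the node's capacity.
    elif utilization <= util_threshold:
        while suppress_pool:
            order, name, limit = suppress_pool[0]
            if completed_demand + starting_limit + limit > node_capacity:
                break
            suppress_pool.popleft()
            start_pool.append((order, name, limit))
            starting_limit += limit
    return start_pool, suppress_pool
```

A round either frees resources by moving the lowest-ranked starting Pods into the suppression pool, or returns suppressed Pods as capacity allows; repeating rounds until both pools drain matches "until all Pods have finished starting".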
Preferably, the resource demand of Pods that have finished starting, the platform's resource limits on Pods still starting, and the platform's resource limits on the Pods to be recovered are obtained as follows.

The CPU resource demand of a Pod that has finished starting is obtained according to the following formula:

Demand_cpu = [ Σ_{i=1}^{n} (Time_end^i − Time_begin^i) / (Time_end − Time_begin) ] × Capacity_cpu

where n is the number of containers contained in the Pod; Time_end^i is the end time of container i's use of CPU time slices; Time_begin^i is the start time of container i's use of CPU time slices; Time_end is the end time of the Pod's use of the CPU; Time_begin is the start time of the Pod's use of the CPU; and Capacity_cpu is the total amount of CPU resources provided by the compute node.

The memory resource demand of a Pod that has finished starting is obtained through a platform interface.

The overall resource demand of a Pod that has finished starting is obtained according to the following formula, blending the latest measurement with the historical average:

Demand = A × Demand_latest + B × (1/m) Σ_{j=1}^{m} Demand_j

where 0 ≤ A ≤ 1, 0 ≤ B ≤ 1, A + B = 1, and m is the number of historical deployments of the Pod.

The platform's resource limits on Pods still starting and on the Pods to be recovered are obtained either by reading a configuration file or from historical deployment data.
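To make the two quantities concrete, here is a small numeric sketch. The formulas above are reconstructed from the patent's variable definitions, so treat the exact forms as assumptions; the function and parameter names are invented.

```python
def cpu_demand(containers, t_begin, t_end, capacity_cpu):
    """CPU demand of a Pod that has finished starting: the fraction of the
    Pod's CPU window [t_begin, t_end] actually consumed by its containers'
    CPU time slices, scaled by the node's total CPU capacity.

    containers: list of (slice_begin, slice_end) pairs, one per container."""
    used = sum(end - begin for begin, end in containers)
    return used / (t_end - t_begin) * capacity_cpu

def blended_demand(latest, history, a=0.5):
    """Overall demand as a weighted blend of the most recent measurement
    and the average over the m historical deployments, with weights
    A + B = 1 (the weight values here are assumptions)."""
    b = 1.0 - a
    return a * latest + b * sum(history) / len(history)
```

For example, two containers using 2 and 3 seconds of a 10-second window on a 4-core node yield a demand of 2.0 cores; blending a latest reading of 10 with history [6, 8] at A = B = 0.5 gives 8.5.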
Preferably, the preset recovery condition comprises: the resource utilization of the compute node is not greater than a set utilization threshold, and in the start Pod pool the ratio of Pods that have finished starting to the total number of Pods is greater than a set proportion threshold Ratio_started.
Preferably, the preset suppression condition comprises: the resource utilization of the compute node is greater than the set utilization threshold, and the sum of the resource demand of Pods that have finished starting and the platform's resource limits on Pods still starting is greater than the resource capacity the compute node can provide.
Preferably, while the compute node is starting Pods, if the preset suppression condition is met, selecting some Pods from the start Pod pool according to the Pod start order and adding them to the suppression Pod pool to suppress them comprises:
while the compute node is starting Pods, if the resource utilization of the compute node is greater than the set utilization threshold and the sum of the resource demand of Pods that have finished starting and the platform's resource limits on Pods still starting is greater than the resource capacity the compute node can provide, judging whether any Pod still starting exists in the start Pod pool;
if so, selecting some of those Pods from the start Pod pool according to the Pod start order and adding them to the suppression Pod pool to suppress them, until the resource utilization of the compute node is no longer greater than the set utilization threshold.
Preferably, the Pod start order is determined by the deployment priority and/or the deployment sequence of the Pods: Pods with higher deployment priority start first; among Pods with the same deployment priority, or Pods without a priority, the Pod deployed earlier starts first.
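The ordering rule above (priority first, then deployment sequence as tie-breaker) can be expressed as a sort key. This is a sketch: the field names and the larger-value-wins priority convention are assumptions, not taken from the patent.

```python
def start_order_key(pod):
    """Sort key for Pod start order: Pods with a priority come before Pods
    without one; among prioritized Pods, a larger priority value starts
    first (assumed convention); ties and priority-less Pods fall back to
    the earlier deployment sequence number."""
    has_priority = pod.get("priority") is not None
    priority = pod.get("priority") or 0
    return (0 if has_priority else 1, -priority, pod["deploy_seq"])

pods = [
    {"name": "db",  "priority": 10,   "deploy_seq": 3},
    {"name": "app", "priority": None, "deploy_seq": 1},
    {"name": "mq",  "priority": 10,   "deploy_seq": 2},
]
ordered = sorted(pods, key=start_order_key)
```

Here "mq" and "db" share priority 10, so their deployment order decides between them, and the priority-less "app" starts last despite being deployed first.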
According to another aspect of the present invention, a container cluster management platform is provided. The platform comprises a control node and at least one compute node, each compute node being provided with a start Pod pool and a suppression Pod pool, and the control node and the compute nodes cooperate to implement the scheduling method of the invention.
Preferably, a recording module and an analysis module are arranged on the control node.
The recording module is used for recording deployment information of the Pods on all compute nodes, the deployment information comprising the deployment priority of the Pods, the deployment sequence of the Pods, and the resource usage of the Pods.
The analysis module is used for exchanging deployment information with the recording module, calculating the resource occupancy of the Pods on each compute node, and storing that resource occupancy in the recording module.
Preferably, a scheduling module is arranged on each compute node.
The scheduling module is used for determining which Pods need to be suppressed or recovered according to the resource utilization of the compute node and the resource occupancy of the Pods, so as to selectively start the corresponding Pods.
The start Pod pool is used for recording Pods that are starting.
The suppression Pod pool is used for recording Pods that have been suppressed.
Generally, compared with the prior art, the technical scheme of the invention has the following beneficial effects. The invention provides a scheduling method for avoiding container cluster startup congestion, and a container cluster management platform. The scheduling method is applied to a container cluster management platform comprising a control node and at least one compute node, each compute node being provided with a start Pod pool and a suppression Pod pool. While a compute node is starting Pods: if a preset suppression condition is met, some Pods are selected from the start Pod pool according to the Pod start order and added to the suppression Pod pool so as to suppress them; if a preset recovery condition is met, the Pods to recover in the current round are determined according to the Pod start order, the resource occupancy of the Pods, and the resource capacity the compute node can provide, and are added to the start Pod pool so as to restart them, until all Pods have finished starting.
The invention adds deployment-order and priority management of application containers. On the premise of fully utilizing compute node resources, low-priority starting containers are suppressed and part of the node's resources are released, ensuring that high-priority starting containers can be allocated more computing resources; application container startup is scheduled in an orderly manner, container startup efficiency is improved, startup time is saved, and startup deadlock is prevented. In the scenario of recovery from an abnormal power-off of the container cluster management platform, the method provides a reliability guarantee for cluster recovery. It also supports pre-deployment of applications on cloud-platform all-in-one machines: container applications can be preset before delivery, the machine can be powered on directly during field engineering, and the application containers start in sequence, reducing manual intervention, lowering service-activation time and complexity, and improving service-activation efficiency.
Drawings
Fig. 1 is a schematic diagram of a change curve of node resource utilization according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a container cluster management platform according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of another container cluster management platform according to an embodiment of the present invention;
fig. 4 is a data flow diagram of Pod scheduling provided in an embodiment of the present invention;
fig. 5 is a flowchart illustrating a scheduling method for avoiding container cluster startup congestion according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Before introducing this solution, the following concepts are explained first:
in the management of the kubernets platform, Pod is a set of application containers, the Pod is composed of a plurality of containers and provides a certain function together, and the kubernets platform performs deployment and scheduling by taking Pod as a unit.
The Pod state object comprises five stages of running, pending, failed, reserved and unwown, wherein running indicates that the Pod is bound to a certain computing node, all containers in the Pod are created, and the containers are in a running, starting or restarting state; pending indicates that a container in a Pod is not completely created, and may be in a process of scheduling to a computing node, and cannot provide a service to the outside; failed indicates that all containers in the Pod are terminated and one container is terminated because of failure, and the platform determines whether to restart the Pod according to a restart policy configured by the Pod; suceded indicates that Pod has successfully started and exited after normal operation, and all containers in Pod have terminated normally. This state corresponds to an application Pod that needs to be run only once, for example, a Pod that implements a configuration action; unknown represents that the Pod state is unknown, and when a control node and a computing node network have problems, the control node cannot acquire the Pod state.
The phases of Pod in startup described below and corresponding to running, pending and failed; pod states that have completed boot-up include suceded and unbown. And the Pod in the running stage judges whether the startup is finished according to the managed container state and the service state. The container is ready to start and the service provided is ready indicating that the start is complete.
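The phase grouping used below can be captured in a small helper, using standard Kubernetes phase names. This is an illustration; the Running-phase readiness check follows the rule stated above.

```python
# Per the text above: Running/Pending/Failed count as "in startup";
# Succeeded/Unknown count as "completed startup".
STARTING_PHASES = {"Running", "Pending", "Failed"}
COMPLETED_PHASES = {"Succeeded", "Unknown"}

def has_completed(phase, containers_ready=False, service_ready=False):
    """A Running Pod only counts as having completed startup once its
    containers have started AND the service they provide is ready; other
    phases are classified by the sets above."""
    if phase == "Running":
        return containers_ready and service_ready
    return phase in COMPLETED_PHASES
```

This predicate is what a scheduler would use to compute the fraction of finished Pods against the Ratio_started threshold in the recovery condition.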
Example 1:
referring to fig. 2 to 4, the present embodiment provides a container cluster management platform, where the container cluster management platform includes a control node and at least one computing node, where the node may be a physical server or a virtual machine.
The control node is used for scheduling a target container to a proper computing node to create and operate according to the resource residual capacity of each computing node and the requirement of the container on resources, wherein the resources comprise CPU resources and memory resources.
The computing node is an operation carrier of the container and is responsible for life cycle management, state monitoring and the like of the container, and the minimum unit of kubernets management on the computing node is Pod.
The control node is provided with a recording module and an analysis module, and each compute node is provided with a scheduling module, a start Pod pool, and a suppression Pod pool. The recording module is used for recording deployment information of the Pods on all compute nodes; the deployment information comprises the Pod start order and the Pod resource usage, where the Pod start order is given by the deployment priority and/or the deployment sequence of the Pods.
The deployment priority is configured by the user according to the actual application: a Pod that is depended upon has a high deployment priority, while a Pod that must depend on other Pods has a relatively low priority. In actual use, Pods with a priority are started first, and Pods without a priority are started in deployment order.
The deployment sequence is numbered with increasing integers. If a Pod is deleted, its deployment information is deleted accordingly, and the sequence numbers of all later-deployed Pods are decremented by one, moving them forward.
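The renumbering rule on deletion can be illustrated as follows; the pod-name-to-sequence-number mapping is a hypothetical representation of the deployment records.

```python
def delete_pod(deploy_records, seq_to_delete):
    """Drop the record with the given deployment sequence number, then
    move every later Pod's sequence number forward by one, as described
    above. `deploy_records` maps pod name -> sequence number."""
    remaining = {name: seq for name, seq in deploy_records.items()
                 if seq != seq_to_delete}
    return {name: (seq - 1 if seq > seq_to_delete else seq)
            for name, seq in remaining.items()}
```

For example, deleting sequence 2 from {a: 1, b: 2, c: 3} leaves {a: 1, c: 2}, keeping the sequence dense so deployment order stays a valid tie-breaker.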
In an actual application scenario, the recorded information further comprises the Pod instance ID, the Pod name, the resource requirements and limits of the containers in the Pod, the deployment date, the actual CPU usage and memory occupancy during startup, the compute node identifier, the compute node's CPU frequency and core count, the compute node's resource capacity, the number of containers running on the compute node, and so on.
The recording module may record deployment history data across multiple startups, where a startup action includes restarting a Pod instance or migrating a Pod instance; when the user deletes an application Pod, the corresponding deployment information record is deleted.
The analysis module is used for exchanging deployment information with the recording module, calculating the resource occupancy of the Pods on each compute node, and storing that resource occupancy in the recording module.
The scheduling module is used for determining the Pod needing to be suppressed or the Pod needing to be recovered according to the resource utilization rate of the computing node and the resource occupation condition of the Pod, so as to selectively start the corresponding Pod, dynamically control the starting and stopping of the container, and ensure the ordered and efficient starting of the application container in the node.
In this embodiment, a start Pod pool and a suppression Pod pool are added to a computer node, where the start Pod pool is used to record a Pod in start; and the suppression Pod pool is used for recording the Pod with the stored container state.
Pods that are being started in the computing node are placed in the start Pod pool, and suppressed Pods are placed in the suppression Pod pool. A suppressed Pod is not probed by the health monitoring function of the container cluster management platform, and the restart rule of its application containers is reconfigured so that they do not start automatically when the computing node powers on.
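As an illustration only, the two pools and the transfer operations described above can be sketched in Python (the class names, field names, and methods here are hypothetical, not part of the patent):

```python
from dataclasses import dataclass, field

@dataclass
class Pod:
    name: str
    priority: int = 0      # 0 means no configured deployment priority
    sequence_id: int = 0   # deployment order

@dataclass
class NodePools:
    """Per-node bookkeeping for the start pool and the suppression pool."""
    starting: list = field(default_factory=list)    # start Pod pool
    suppressed: list = field(default_factory=list)  # suppression Pod pool

    def suppress(self, pod):
        # Move a Pod out of the start pool; a suppressed Pod is skipped by
        # health probing and is not auto-restarted on node power-on.
        self.starting.remove(pod)
        self.suppressed.append(pod)

    def resume(self, pod):
        # Move a suppressed Pod back for restart; health probing resumes.
        self.suppressed.remove(pod)
        self.starting.append(pod)
```

A scheduler acting on these pools only ever moves Pods between the two lists, which is what keeps the externally visible Pod states unchanged.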
In this embodiment, the scheduling module periodically monitors the resource usage by taking the computing node as a unit, and schedules the start of the Pod according to the resource usage to selectively start the Pod, where the monitoring period may be configured and adjusted.
Specifically, before scheduling a Pod, the scheduling module queries the deployment information by computing node identifier to obtain the Pod's historical resource occupation on that node; a simple approach is to take the average over the history. The Pod is then scheduled according to its historical resource occupation on the computing node.
In practical use, the scheduling module transfers the Pods that need to be suppressed from the start Pod pool to the suppression Pod pool according to the suppression condition, and sets the containers of the transferred Pods to a stopped state (Stopped) through the kubelet component of the Kubernetes platform. A suppressed Pod is not probed by the health monitoring function of the container cluster management platform and is not evicted when node resources run short. If a Pod is scheduled from the suppression Pod pool back to the start Pod pool, the containers in that Pod are restarted and health probing of the Pod is restored. FIG. 4 depicts the movement of Pods between Pool_starting (the start Pod pool) and Pool_stop (the suppression Pod pool).
Whether Pod startup needs to be suppressed is judged by checking whether the node resource utilization exceeds Threshold_max. When the proportion of started Pods in the start Pod pool exceeds the set ratio threshold Ratio_started, the resource occupation of the Pods still starting in the start Pod pool, the resource occupation of the started Pods, and the resource occupation of the Pods planned for recovery are analyzed; when their sum is smaller than the resource capacity of the computing node, the Pods planned for recovery are started.
Pods of a computing node that are in the starting state are added to the start Pod pool for management; Pods in the start Pod pool whose state has changed to Succeeded, or to Running with service ready, are removed. Part of the starting Pods are selected from the start Pod pool according to condition 1 below and transferred to the suppression Pod pool. Part of the Pods are selected from the suppression Pod pool according to condition 2 below and transferred to the start Pod pool for recovery.
Condition 1: the resource utilization Ratio_node of the current computing node is greater than the utilization threshold Threshold_max, and the resource quota of all running containers exceeds the resource capacity of the computing node. In that case, starting Pods in the start Pod pool are suppressed: the suppressed Pods are removed from the start Pod pool and added to the suppression Pod pool. The suppression order follows the Pod deployment order and priority in the deployment information record table: Pods configured without priority and deployed later are suppressed first, then Pods with low priority, until the resource demand of all Pods that have completed startup plus the resource limits of the Pods still starting (Running but service not ready, Pending, and Failed) is less than or equal to the resource capacity of the computing node, and the node's resource utilization falls.
Condition 2: when the started Pods in the start Pod pool reach the set start ratio Ratio_started, the container deployment information module is queried and the Pod with the highest priority is selected from the suppression Pod pool for recovery; if no Pod has a configured priority, the Pod deployed earliest is recovered according to the deployment sequence number.
If the sum of the resource demand of the containers of the Pods that have completed startup, the platform's resource limits (resource quota) for the Pods still starting, and the platform's resource limits for the Pods planned for recovery is less than the resource capacity of the computing node; or if all Pods in the start Pod pool have started and the sum of the resource demand of the started Pods' containers and the resource limits of the Pods planned for recovery is greater than the resource capacity of the computing node, then the already-started Pods are removed from the start Pod pool and the Pods planned for recovery are selected from the suppression Pod pool for recovery. The containers in a suppressed Pod restart, and the Pod again accepts the health monitoring management of the container cluster management platform. To keep the container cluster management platform consistent to the outside, no new externally visible Pod state is added: Pod state is managed internally through the start Pod pool and the suppression Pod pool. Containers under Pods in the start Pod pool are in the Running or Stopped (non-zero, abnormal exit) state, and containers under Pods in the suppression Pod pool are in the Stopped (zero, normal exit) state.
For the specific scheduling process of the container cluster management platform, see the scheduling method of embodiment 2 below.
Example 2:
based on the container cluster management platform in the foregoing embodiment, referring to fig. 5, this embodiment provides a scheduling method for avoiding container cluster startup congestion, where the scheduling method is applied to a container cluster management platform, the container cluster management platform includes a control node and at least one computing node, each computing node is provided with a startup Pod pool and a suppression Pod pool, and the scheduling method includes the following steps:
step 101: and the control node schedules the Pod to the corresponding computing node according to the resource condition of each computing node and the demand condition of the Pod on the resource.
In this embodiment, the control node schedules Pods to the corresponding computing nodes according to each node's resource condition and the Pods' resource demands. The computing node queries the Pods being started, adds them to the start Pod pool, and obtains their start order from the control node. Specifically, the scheduling module of the computing node obtains the deployment priority and deployment order of the starting Pods from the deployment information by computing node identifier, and schedules the Pods accordingly. If a starting Pod is already recorded in the suppression Pod pool, it does not need to be added to the start Pod pool.
Step 102: in the process of starting the Pod, if a preset inhibition condition is met, the computing node selects a part of pods from the starting Pod pool according to the starting sequence of the pods, and adds the selected pods to the inhibition Pod pool to inhibit the selected pods.
The starting sequence of the Pod is determined by the deployment priority and/or the deployment sequence of the Pod, wherein the Pod with high deployment priority is started preferentially; for the Pod with the same deployment priority or the Pod without the priority, the Pod deployed in advance is started preferentially.
Wherein the preset inhibition condition comprises: the resource utilization rate of the computing node is greater than a set utilization rate threshold, and the sum of the resource demand of the Pod which has finished starting and the resource limitation amount of the platform to the Pod which is starting is greater than the resource capacity which can be provided by the computing node.
In this embodiment, the CPU utilization and the memory utilization of the compute node are monitored; and calculating according to the weight factors corresponding to the CPU utilization rate and the memory utilization rate to obtain the resource utilization rate of the computing node.
The resource utilization of each computing node is monitored periodically, node by node. Specifically, the CPU utilization Ratio_cpu and the memory utilization Ratio_mem of a computing node (node_i) are queried through the resource query interface of the container cluster management platform, and the node's resource utilization Ratio_node is calculated from them.

The specific calculation method is as follows:

Ratio_node = A × Ratio_cpu + B × Ratio_mem, where 0 ≤ A ≤ 1, 0 ≤ B ≤ 1, and A + B = 1.

The specific values of the weight factors A and B can be adjusted by the user according to the characteristics of the applications deployed on the container cluster platform: if a compute-bound application cares more about CPU resources, A is set greater than B; if a memory-intensive application cares more about memory resources, B is set greater than A. In the extreme, A may be configured as 1 and B as 0, or A as 0 and B as 1. When CPU or memory usage exceeds its limit, application Pods can be suppressed to release node computing resources. The same weight values are used when evaluating the resource usage of an application Pod.
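The weighted-utilization formula can be sketched as a small helper (the function name and default weights are illustrative):

```python
def node_utilization(cpu_ratio, mem_ratio, a=0.5, b=0.5):
    """Ratio_node = A * Ratio_cpu + B * Ratio_mem, with A + B == 1."""
    assert 0 <= a <= 1 and 0 <= b <= 1 and abs(a + b - 1) < 1e-9
    return a * cpu_ratio + b * mem_ratio
```

Setting `a=1.0, b=0.0` reproduces the CPU-only extreme the text mentions; `a=0.0, b=1.0` gives the memory-only extreme.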
In this embodiment, a utilization threshold Threshold_max must be preset for the computing node. When the computing node has just started, if the node's resource utilization is greater than the set Threshold_max, the node's resource utilization is considered too high.
When the resource utilization of a computing node is greater than the set utilization threshold, it is judged whether the sum of the resource demand of the started Pods and the platform's resource limits for the Pods still starting exceeds the resource capacity the computing node can provide; if so, it is judged whether any Pod in the start Pod pool is still starting. If there is, part of the Pods are selected from the start Pod pool according to the Pod start order and added to the suppression Pod pool for suppression, until the node's resource utilization is no longer greater than the set utilization threshold.
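The suppression check described above reduces to a simple predicate; a minimal sketch, with illustrative parameter names:

```python
def should_suppress(ratio_node, threshold_max, demand_started,
                    limit_starting, capacity):
    """Suppression condition from the text: node utilization is over the
    threshold AND the committed resources (demand of finished Pods plus
    limits of in-flight Pods) exceed the node's capacity."""
    over_utilized = ratio_node > threshold_max
    over_committed = (demand_started + limit_starting) > capacity
    return over_utilized and over_committed
```

Both conditions must hold: high utilization alone, or over-commitment alone, does not trigger suppression.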
Step 103: and if the preset recovery conditions are met, determining the Pod recovered by the plan according to the starting sequence of the Pods, the resource occupation amount of the Pods and the resource capacity which can be provided by the computing node, and adding the Pod recovered by the plan into a starting Pod pool to restart the Pod recovered by the plan until the starting of all the Pods is completed.
Wherein the preset recovery condition comprises: the resource utilization of the computing node is not greater than the set utilization threshold, and, in the start Pod pool, the ratio of Pods that have completed startup to the total number of Pods is greater than the set ratio threshold Ratio_started.
The resource occupation amount of the Pod comprises the resource demand amount of the Pod which has completed the startup, the resource limit amount of the platform on the Pod in the startup and the resource limit amount of the platform on the Pod planned to be recovered. The resource demand of the Pod which has finished being started is the resource amount actually occupied in the Pod operation process; the resource limit amount of the Pod in startup by the platform is the maximum resource amount that the platform can provide for the Pod, and for the same Pod, the resource limit amount is greater than the resource demand amount, so when the state of the Pod is changed from "in startup" to "startup completed", the computing node will release capacity for other pods to start.
For example, a resource capacity of 1000 for the computing node means that at most 1000 units of resources are available to application Pods. Suppose an application Pod has a minimum resource requirement of 100 and a maximum resource limit of 200; then 10 such Pods can be deployed, each with up to 200 units available, and they can operate normally by staggering their peaks. For instance, if 6 applications are idle and together use only 200 units, the other 4 heavily loaded Pods can each use up to their 200-unit limit, and the total CPU usage still does not exceed the node's CPU resource capacity of 1000.
In actual use, the resource capacity of the computing node is configured by a container cluster platform administrator according to actual hardware performance, and the minimum resource demand and the maximum resource limit of the application Pod are configured by a user according to the characteristics of the application. Wherein the minimum resource requirement of the Pod is a fully guaranteed resource that is desired to be allocated to; the maximum resource limit is the upper limit of the resources that can be used at most. If the user does not configure the minimum resource requirement and the maximum resource limit, the default configuration amount of the system will be used, and the minimum resource requirement is equal to the maximum resource limit.
In this embodiment, if the preset recovery condition is met, the Pods planned for recovery are selected according to the Pod start order, where the sum of the resource demand of the Pods that have completed startup, the platform's resource limits for the Pods still starting, and the platform's resource limits for the Pods planned for recovery must be less than the resource capacity the computing node can provide. The currently started Pods are removed from the start Pod pool, and the Pods planned for recovery are added to it for restart, until all Pods have completed startup. As more Pods complete startup, if the completed Pods were not removed from the start Pod pool, the comparison against the ratio threshold Ratio_started would prevent the Pod recovery process from executing; therefore, after determining the Pods planned for recovery, the currently completed Pods must be removed from the start Pod pool.
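A hedged sketch of the recovery precondition (parameter names are illustrative; `ratio_started` stands for the threshold Ratio_started):

```python
def can_resume(ratio_node, threshold_max, started, total, ratio_started,
               demand_started, limit_starting, limit_resume, capacity):
    """Recovery condition: utilization at or below the threshold, a large
    enough share of the start pool already finished, and the committed
    resources plus the planned-recovery limits still fit in capacity."""
    if ratio_node > threshold_max or total == 0:
        return False
    if started / total <= ratio_started:
        return False
    return demand_started + limit_starting + limit_resume < capacity
```

Note that the completed-Pod ratio is computed over the start Pod pool, which is why the text insists that finished Pods be removed from that pool once the planned recovery is determined.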
In another embodiment, when all the Pod in the Pod starting pool are converted into the "already started" state, and the resource demand amount of the Pod already started and the resource limitation amount of the platform on the Pod recovered by the current plan are greater than the resource capacity which can be provided by the computing node, the Pod recovered by the current plan is still recovered, and the Pod recovered by the current plan completes starting by preempting the resources.
In an actual application scenario, the resource demand of the Pods that have completed startup, the platform's resource limits for the Pods still starting, and the platform's resource limits for the Pods planned for recovery are acquired as follows. The CPU resource demand of a Pod that has completed startup is acquired according to the following formula:

Demand_cpu = [ Σ_{i=1}^{n} (Timeslice_end,i − Timeslice_begin,i) / (Time_end − Time_begin) ] × Capacity_cpu

wherein n is the number of containers contained in the Pod, Timeslice_end,i is the cut-off time of the CPU time slice used by container i, Timeslice_begin,i is the starting time of the CPU time slice used by container i, Time_end is the cut-off time of the Pod's use of the CPU, Time_begin is the starting time of the Pod's use of the CPU, and Capacity_cpu is the total amount of CPU resources provided by the computing node. The memory resource demand of a started Pod is acquired through the platform interface.
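Assuming the reading of the formula given above — the Pod's containers' CPU-busy fraction over the observation window, scaled by the node's CPU capacity — a sketch (names illustrative):

```python
def pod_cpu_demand(slices, time_begin, time_end, capacity_cpu):
    """CPU demand of a started Pod: the fraction of the observation window
    [time_begin, time_end] that its n containers spent on-CPU (one
    (t_begin, t_end) time slice per container), times node CPU capacity."""
    busy = sum(t_end - t_begin for (t_begin, t_end) in slices)
    return busy / (time_end - time_begin) * capacity_cpu
```

With two containers busy for 2 time units each over a 10-unit window on a node of capacity 1000, the demand comes out to 400.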
The overall resource demand of a Pod that has completed startup is acquired according to the following formula:

Demand = (1/m) × Σ_{j=1}^{m} (A × Demand_cpu,j + B × Demand_mem,j)

wherein 0 ≤ A ≤ 1, 0 ≤ B ≤ 1, A + B = 1, and m is the number of historical deployments of the Pod. The specific values of the weight factors A and B can be adjusted by the user according to the characteristics of the applications deployed on the container cluster platform: if a compute-bound application cares more about CPU resources, A is set greater than B; if a memory-intensive application cares more about memory resources, B is set greater than A. In the extreme, A may be configured as 1 and B as 0, or A as 0 and B as 1.
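A sketch of the weighted average over m historical deployments (names and the (cpu, mem) pair representation are illustrative):

```python
def pod_resource_demand(history, a=0.5, b=0.5):
    """history: m samples of (cpu_demand, mem_demand) from past
    deployments. Returns the average A/B-weighted demand, using the same
    weight factors as the Ratio_node calculation."""
    m = len(history)
    return sum(a * cpu + b * mem for cpu, mem in history) / m
```

For two historical samples (400, 200) and (600, 400) with equal weights, the weighted demand is 400.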
The platform's resource limits for the Pods in startup and for the Pods planned for recovery are obtained either by reading the configuration file or from historical deployment data. Specifically, the resource usage limits of all containers in the Pod are read from the deployment configuration file; if no resource requirement configuration data exists, the Pod's historical deployment data in the deployment information recording module is read to obtain its resource usage limits on the computing node; if neither resource requirement configuration data nor historical deployment data exists, the deployment priority is considered low, and the lowest resource limit value stored in the resource configuration is taken as the Pod's resource limit and requirement.
The following describes, in combination with the foregoing description, an implementation process of the scheduling method of this embodiment. Pods in Pool_starting are in one of several states: (1) Pods being started (Pod_starting), specifically Pods in the Running state whose service is not ready, Pods in the Pending state, and Pods in the Failed state; (2) Pods that have transitioned from being started to start-completed (Pod_started), specifically Pods in the Running state with service ready and Pods in the Succeeded state. In actual use, Pods whose state has changed from being started to start-completed are removed from the start Pod pool.
The resource occupation of the Pod includes the resource demand of the Pod that has completed booting and the resource limit of the platform to the Pod in the booting.
In this embodiment, the resource utilization of the computing node is queried and analyzed. If it exceeds the set threshold Threshold_max, the total resource demand ΣDemand_started of the Pods that have completed startup and the total resource limit ΣLimit_starting of the Pods still in startup are calculated, where Demand_started denotes the resource demand a started Pod places on the node, and Limit_starting denotes the resource limit the platform reserves for a Pod that is still starting.
After the resource demand and resource limits are obtained, it is judged whether their sum is greater than the resource capacity of the computing node. If so, it is judged whether the number of Pods being started in the computing node is greater than 1, and if so, part of the Pods are suppressed. That is, the start Pod pool is queried for Pods that are still starting (Pod_starting); if there are none, nothing is done; if there are, part of the Pods are selected in start order from low to high, transferred from the start Pod pool to the suppression Pod pool, and the containers under those Pods are stopped.
In conjunction with the foregoing description, the suppression condition is triggered as follows. First, Ratio_node > Threshold_max, i.e. the resource utilization is greater than the set resource utilization threshold. Second, if ΣDemand_started + ΣLimit_starting > Capacity and StartingNum_total > 1, part of the Pods are suppressed based on deployment priority and deployment order, where StartingNum_total is the number of Pods being started in the computing node, Demand_started is the resource demand a started Pod places on the node, and Limit_starting is the resource limit of a Pod in startup, i.e. the maximum amount of resources that Pod can use.

n Pods are then selected for suppression in order of start priority from low to high. The n suppressed Pods must satisfy: after suppression, ΣDemand_started + ΣLimit_starting (summed over the remaining starting Pods) ≤ Capacity, and StartingNum_total > n. Pods are suppressed until Ratio_node ≤ Threshold_max.
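The selection of the n Pods to suppress can be sketched as follows, assuming each Pod is represented by a dict with hypothetical keys `priority`, `seq`, and `limit`:

```python
def select_pods_to_suppress(starting_pods, demand_started, capacity):
    """Pick Pods to suppress, lowest start-priority first (no-priority,
    late-deployed Pods go first), until the limits of the remaining
    starting Pods plus the finished Pods' demand fit the node capacity.
    At least one starting Pod is kept (StartingNum_total > n)."""
    # Sort ascending by (priority, -seq): lowest priority and latest
    # deployment are suppressed first.
    order = sorted(starting_pods, key=lambda p: (p["priority"], -p["seq"]))
    suppressed = []
    remaining_limit = sum(p["limit"] for p in starting_pods)
    for pod in order:
        if demand_started + remaining_limit <= capacity:
            break  # constraint satisfied, stop suppressing
        if len(starting_pods) - len(suppressed) <= 1:
            break  # keep at least one Pod starting
        suppressed.append(pod)
        remaining_limit -= pod["limit"]
    return suppressed
```

With finished-Pod demand 600, capacity 1000, and three starting Pods of limit 200 each, suppressing the single lowest-priority, latest-deployed Pod already brings the total down to the capacity.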
In an actual application scenario, the Pods being started are labeled with a start priority according to their deployment order and deployment priority, for example "$priority$sequenceid", where $priority is the priority, which may be set to eight levels in the range 0-7, a larger value meaning a higher priority (a Pod with no configured deployment priority defaults to 0), and $sequenceid is the deployment order. In actual scheduling, Pods are started in priority order, higher priority first; Pods without a deployment priority are started in deployment order.
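The "$priority$sequenceid" ordering can be expressed as a sort key (a sketch; the field names are illustrative):

```python
def start_key(pod):
    """Start order implied by the label: higher priority first; within
    equal (or absent, i.e. 0) priority, smaller deployment sequence
    first."""
    return (-pod["priority"], pod["seq"])

pods = [{"name": "a", "priority": 0, "seq": 2},
        {"name": "b", "priority": 7, "seq": 5},
        {"name": "c", "priority": 0, "seq": 1}]
start_order = [p["name"] for p in sorted(pods, key=start_key)]
# "b" (priority 7) starts first; "c" and "a" follow in deployment order
```

Reversing this key gives the suppression order used in condition 1 (no-priority, late-deployed Pods first).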
In this embodiment, the scheduling method further includes: if the resource utilization of the computing node is not greater than the set utilization threshold, the computing node judges whether the proportion of started Pods in the start Pod pool is greater than the set ratio threshold. If it is, the Pods planned for recovery are determined according to the Pod start order, the resource demand of the started Pods, the platform's resource limits for the Pods still starting, and the platform's resource limits for the Pods planned for recovery; during recovery, the sum of these three quantities must remain less than the resource capacity the computing node can provide. The Pods that have completed startup are then removed from the start Pod pool, and the Pods planned for recovery are added to it for restart, until all Pods have completed startup.
Specifically, a ratio threshold Ratio_started is set for the proportion of Pods that have completed startup in the start Pod pool relative to the total number of Pods in it. When the proportion of started Pods exceeds the set Ratio_started, the resource occupation of the Pods still starting in the start Pod pool is analyzed, and Pod startup is resumed when the resource occupation is smaller than the resource capacity of the computing node.
The Pods with high start priority are selected for gradual recovery, with the resource quota of the Pods planned for recovery not exceeding the node's remaining resource capacity. The selected Pods are transferred from the suppression Pod pool to the start Pod pool, health monitoring of a Pod is restored after it starts, and the records of successfully started Pods are deleted from Pool_starting. The next round of recovery then proceeds.
In conjunction with the foregoing description, the recovery condition is triggered as follows: Ratio_node ≤ Threshold_max, and the proportion of started Pods in Pool_starting relative to the total number of Pods in it reaches the set value Ratio_started. Then, if ΣDemand_started + ΣLimit_starting + ΣLimit_resume < Capacity, resource utilization has leveled off after the earlier Pods started, and part of the suppressed Pods can be recovered. From Pool_stop, m Pods are selected for recovery in order of start priority from high to low, satisfying ΣLimit_resume ≤ ΔCapacity, where ΔCapacity = Capacity − ΣDemand_started − ΣLimit_starting is the amount of resource left on the computing node, and Limit_resume denotes the maximum resource usage limit of a Pod planned for recovery.

In this embodiment, deployment order and priority management of application containers is added. When computing node resources are fully utilized, low-priority starting containers are suppressed and part of the node's resources are released, so that high-priority starting containers can be allocated more computing resources. The startup of application containers is scheduled in order, which improves container startup efficiency, saves startup time, and prevents startup deadlock. In a scenario where the container cluster management platform recovers from an abnormal power-off, this provides a reliability guarantee for cluster recovery. It also supports pre-deployment of applications on cloud-platform all-in-one machines: container applications can be preset before delivery and powered on directly during on-site engineering, with the application containers starting in order, reducing manual intervention, shortening service provisioning time and complexity, and improving provisioning efficiency.
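The selection of the m Pods to recover can be sketched similarly, again with hypothetical dict keys `priority`, `seq`, and `limit`:

```python
def select_pods_to_resume(suppressed_pods, delta_capacity):
    """Pick suppressed Pods to resume, highest start-priority first (then
    earliest deployed), for as long as their cumulative resource limits
    fit within the node's remaining capacity (ΔCapacity)."""
    order = sorted(suppressed_pods, key=lambda p: (-p["priority"], p["seq"]))
    resumed, used = [], 0
    for pod in order:
        if used + pod["limit"] > delta_capacity:
            break  # the next Pod no longer fits the remaining capacity
        resumed.append(pod)
        used += pod["limit"]
    return resumed
```

With ΔCapacity = 450 and three suppressed Pods of limit 200, the two highest-ranked Pods are resumed and the third waits for the next round.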
Example 3:
the Pod scheduling process is exemplified below. Assume that the CPU resource capacity of a compute node is 1000 this capacity represents the amount of resources that can be made available to an application container. The CPU resource requirement of each application container Pod is 100 and the resource usage limit is 200. In the initial starting stage, if resource preemption occurs, the CPU utilization rate of the computing node is increased, and the CPU resource allocated to 10 Pod is only 100 or less and can be also dropped by the container management platform kill. After the throttling, the starting of 5 Pod is throttled, releasing the CPU resource 500.
The recovery process is shown step by step in a table that appears only as an image in the source document.
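The arithmetic of this Example 3 scenario can be checked directly (numbers from the text: capacity 1000, ten Pods, demand 100, limit 200):

```python
capacity = 1000
n_pods, demand, limit = 10, 100, 200

# All ten Pods started at once: even at minimum demand the node is
# saturated, and the per-Pod limit of 200 cannot be honoured for all.
assert n_pods * demand == capacity
assert n_pods * limit > capacity

# Suppressing five Pods releases their share (5 * 100 = 500), letting the
# remaining five each run up to their 200-unit limit within capacity.
suppressed = 5
released = suppressed * demand
remaining = n_pods - suppressed
assert released == 500
assert remaining * limit == capacity
```

This is exactly the head-room that lets the five remaining Pods finish startup quickly before the suppressed five are recovered in turn.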
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. The scheduling method for avoiding the starting congestion of the container cluster is applied to a container cluster management platform, the container cluster management platform comprises a control node and at least one computing node, each computing node is provided with a starting Pod pool and a suppressing Pod pool, and the scheduling method comprises the following steps:
in the process of starting the Pod, if a preset inhibition condition is met, the computing node selects a part of pods from the starting Pod pool according to the starting sequence of the pods, and adds the selected pods to an inhibition Pod pool to inhibit the selected pods;
and if the preset recovery conditions are met, determining the Pod recovered by the plan according to the starting sequence of the Pods, the resource occupation amount of the Pods and the resource capacity which can be provided by the computing node, and adding the Pod recovered by the plan into a starting Pod pool to restart the Pod recovered by the plan until the starting of all the Pods is completed.
2. The scheduling method according to claim 1, wherein if the preset recovery condition is met, determining the Pod recovered by the current plan according to a starting sequence of the pods, a resource occupation amount of the pods, and a resource capacity that can be provided by the computing node, and adding the Pod recovered by the current plan to a starting Pod pool to restart the Pod recovered by the current plan until the completion of the starting of all the pods includes:
if the preset recovery conditions are met, selecting the Pod for plan recovery according to the starting sequence of the pods, wherein the sum of the resource demand of the Pod which is started, the resource limitation of the platform on the Pod in starting and the resource limitation of the platform on the Pod for plan recovery is required to be less than the resource capacity which can be provided by the computing node;
and removing the currently started Pod from the starting Pod pool, and adding the Pod recovered by the plan into the starting Pod pool to restart the Pod recovered by the plan until the start of all the pods is completed.
3. The scheduling method of claim 1, wherein the resource demand of the Pod that has completed booting, the resource limit amount of the Pod that the platform is currently booting, and the resource limit amount of the Pod that the platform is scheduled to resume are obtained as follows:
acquiring the CPU resource demand of the Pod which has completed startup according to the following formula:

Demand_cpu = [ Σ_{i=1}^{n} (Timeslice_end,i − Timeslice_begin,i) / (Time_end − Time_begin) ] × Capacity_cpu

wherein n is the number of containers contained in the Pod, Timeslice_end,i is the cut-off time of the CPU time slice used by container i, Timeslice_begin,i is the starting time of the CPU time slice used by container i, Time_end is the cut-off time of the Pod's use of the CPU, Time_begin is the starting time of the Pod's use of the CPU, and Capacity_cpu is the total amount of CPU resources provided by the computing node;
acquiring the memory resource demand of the Pod which is started through a platform interface;
acquiring the resource demand of the Pod which has completed startup according to the following formula:

Demand = (1/m) × Σ_{j=1}^{m} (A × Demand_cpu,j + B × Demand_mem,j)

wherein 0 ≤ A ≤ 1, 0 ≤ B ≤ 1, A + B = 1, and m is the number of historical deployments of the Pod;
and selectively adopting a mode of reading the configuration file or a mode of historically deploying data to obtain the resource limitation quantity of the platform on the Pod in starting and the resource limitation quantity of the platform on the Pod planned to be recovered.
4. The scheduling method of claim 1, wherein the preset recovery condition comprises: the resource utilization of the computing node is not greater than the set utilization threshold, and, in the start Pod pool, the ratio of Pods that have completed startup to the total number of Pods is greater than the set ratio threshold Ratio_started.
5. The scheduling method of claim 1, wherein the preset suppression condition comprises: the resource utilization of the computing node is greater than a set utilization threshold, and the sum of the resource demand of the Pods that have completed starting and the platform's resource limit on the Pods that are starting is greater than the resource capacity the computing node can provide.
6. The scheduling method of claim 5, wherein, in the process of starting Pods, if the preset suppression condition is met, the computing node selecting some Pods from the starting Pod pool according to the Pods' starting order and adding the selected Pods to a suppression Pod pool to suppress them comprises:
while the computing node is starting Pods, if the resource utilization of the computing node is greater than the set utilization threshold, and the sum of the resource demand of the Pods that have completed starting and the platform's resource limit on the Pods that are starting is greater than the resource capacity the computing node can provide, judging whether any Pod that is starting exists in the starting Pod pool;
and if so, selecting some Pods from the starting Pod pool according to the Pods' starting order, and adding the selected Pods to the suppression Pod pool to suppress them, until the resource utilization of the computing node is not greater than the set utilization threshold.
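The suppression test in claims 5 and 6 reduces to a two-part predicate. Below is a hedged sketch of that check; the parameter names are assumptions for illustration, and the units (utilization as a fraction, demand and capacity in the same resource unit) are not specified by the patent.

```python
def should_suppress(node_util, util_threshold,
                    started_demand, starting_limit, node_capacity):
    # Suppression fires only when BOTH hold: node utilization exceeds the
    # threshold, and the demand of started Pods plus the platform's limit
    # on starting Pods exceeds what the node can provide.
    return (node_util > util_threshold
            and started_demand + starting_limit > node_capacity)
```

Each condition alone is insufficient: a busy node whose starting Pods still fit, or an over-committed but idle node, does not trigger suppression.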
7. The scheduling method of claim 1, wherein the Pod starting order is determined by Pod deployment priority and/or deployment order, wherein a Pod with a higher deployment priority is started first; among Pods with the same deployment priority, or Pods without a priority, the Pod deployed earlier is started first.
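The claim-7 ordering can be expressed as a single stable sort. A minimal sketch, assuming Pods are dicts with a "priority" key (optional) and a "deploy_seq" key recording deployment order; both key names are illustrative assumptions.

```python
def start_order(pods):
    # Higher deployment priority first; Pods without a priority sort last;
    # ties (or absent priority) are broken by earlier deployment order.
    return sorted(pods, key=lambda p: (-p.get("priority", float("-inf")),
                                       p["deploy_seq"]))
```

Negating the priority lets a single ascending sort put the highest-priority Pods first while keeping deployment order as the tiebreaker.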
8. A container cluster management platform, characterized by comprising a control node and at least one computing node, wherein each computing node is provided with a starting Pod pool and a suppression Pod pool, and the control node and the computing nodes cooperate with each other to implement the scheduling method according to any one of claims 1 to 7.
9. The container cluster management platform according to claim 8, wherein a recording module and an analyzing module are disposed on the control node;
the recording module is used for recording deployment information of the Pod in all the computing nodes, wherein the deployment information comprises deployment priority of the Pod, deployment sequence of the Pod and resource use condition of the Pod;
the analysis module is used for exchanging deployment information with the recording module, calculating the resource occupation of the Pods on each computing node, and storing the resource occupation of the Pods on each computing node in the recording module.
10. The container cluster management platform according to claim 8, wherein a scheduling module is disposed on the compute node;
the scheduling module is used for determining the Pod needing to be suppressed or the Pod needing to be recovered according to the resource utilization rate of the computing node and the resource occupation condition of the Pod so as to selectively start the corresponding Pod;
the starting Pod pool is used for recording the Pods that are starting;
the suppression Pod pool is used for recording the suppressed Pods.
CN202011188530.5A 2020-10-30 2020-10-30 Scheduling method for avoiding container cluster starting congestion and container cluster management platform Active CN112328359B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011188530.5A CN112328359B (en) 2020-10-30 2020-10-30 Scheduling method for avoiding container cluster starting congestion and container cluster management platform


Publications (2)

Publication Number Publication Date
CN112328359A true CN112328359A (en) 2021-02-05
CN112328359B CN112328359B (en) 2022-06-17

Family

ID=74296756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011188530.5A Active CN112328359B (en) 2020-10-30 2020-10-30 Scheduling method for avoiding container cluster starting congestion and container cluster management platform

Country Status (1)

Country Link
CN (1) CN112328359B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109104334A (en) * 2018-08-23 2018-12-28 郑州云海信息技术有限公司 The management method and device of monitoring system interior joint
CN110221845A (en) * 2019-06-06 2019-09-10 江苏满运软件科技有限公司 Using dispositions method, device, equipment and medium
CN110457135A (en) * 2019-08-09 2019-11-15 重庆紫光华山智安科技有限公司 A kind of method of resource regulating method, device and shared GPU video memory
US20190391844A1 (en) * 2018-11-06 2019-12-26 Beijing Baidu Netcom Science And Technology Co., Ltd. Task orchestration method and system
CN111279319A (en) * 2017-09-30 2020-06-12 甲骨文国际公司 Dynamic migration of a group of containers
US10768973B1 (en) * 2019-04-24 2020-09-08 Huazhong University Of Science And Technology Method for implementation and optimization of online migration system for dockers container


Also Published As

Publication number Publication date
CN112328359B (en) 2022-06-17

Similar Documents

Publication Publication Date Title
US9319281B2 (en) Resource management method, resource management device, and program product
US9405563B2 (en) Resource management method and apparatus for virtual machine system, and virtual machine system
JP6219512B2 (en) Virtual hadoop manager
CN107924328B (en) Technique for selecting virtual machine for migration
TWI287713B (en) System and method for computer cluster virtualization using dynamic boot images and virtual disk
US9329889B2 (en) Rapid creation and reconfiguration of virtual machines on hosts
US7844853B2 (en) Methods and apparatus for restoring a node state
US8458712B2 (en) System and method for multi-level preemption scheduling in high performance processing
CN111290834A (en) Method, device and equipment for realizing high availability of service based on cloud management platform
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
CN104508634A (en) Dynamic resource allocation for virtual machines
CN110825495A (en) Container cloud platform recovery method, device, equipment and readable storage medium
CN114064414A (en) High-availability cluster state monitoring method and system
CN116569137A (en) Preserving and restoring pre-provisioned virtual machine state
US9021481B2 (en) Computer-readable recording medium, migration control method and control device
WO2021212967A1 (en) Task scheduling for distributed data processing
CN112328359B (en) Scheduling method for avoiding container cluster starting congestion and container cluster management platform
CN116881012A (en) Container application vertical capacity expansion method, device, equipment and readable storage medium
US20210173698A1 (en) Hosting virtual machines on a secondary storage system
CN116360865A (en) Cluster management method, device and computing system
CN113608798A (en) Method and device for configuring resources used by non-JAVA application and cloud-native application
WO2016041202A1 (en) Deployment method and apparatus based on cloud environment system
CN108153484B (en) Shared storage system in virtualization environment and management method thereof
CN115934186A (en) K8S Pod starting method, customized plug-in and electronic equipment
CN114390106B (en) Scheduling method, scheduler and scheduling system based on Kubernetes container resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant