CN112214303A

CN112214303A - Kubernetes cluster automatic scaling system

Info

Publication number: CN112214303A
Application number: CN201910612910.8A
Authority: CN
Inventors: 伍强; 俞嘉地; 薛广涛
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2019-07-09
Filing date: 2019-07-09
Publication date: 2021-01-12

Abstract

A Kubernetes cluster auto-scaling system comprising: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to both the QoS module and the scaling module; the QoS module calculates the upper limit of the CPU utilization rate that guarantees the service quality and outputs this upper limit to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, obtains an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to this ideal value. The invention can reduce the deployment and operation cost of Web services and applications, reduce the waste of resources, dynamically adjust the size of the Kubernetes cluster and improve the utilization rate of cluster resources.

Description

Kubernetes cluster automatic scaling system

Technical Field

The invention relates to a technology in the field of internet information processing, in particular to a Kubernetes (k8s for short) cluster automatic scaling system.

Background

Kubernetes is a container management system open-sourced by Google that makes it easy for users to deploy services in containers. More and more enterprises and developers are migrating their web applications to Kubernetes clusters. However, the workload of a web application varies greatly, so the application's demand for server cluster resources also changes greatly. The usual approach at present is to size the Kubernetes cluster for the peak workload, which leaves most cluster resources idle under ordinary conditions, so resource utilization is low and considerable waste results.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a Kubernetes cluster automatic scaling system that can reduce the deployment and operation cost of web applications and reduce the waste of resources. The system automatically adjusts the size of the Kubernetes cluster according to changes in its workload, improving the utilization rate of cluster resources while ensuring the service quality of web applications.

The invention is realized by the following technical scheme:

The invention comprises: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to both the QoS module and the scaling module; the QoS module calculates the upper limit of the CPU utilization rate that guarantees the service quality and outputs this upper limit to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, obtains an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to this ideal value.

Technical effects

Compared with the prior art, the invention can improve the resource utilization rate of the Kubernetes cluster while ensuring the service quality of the application.

Drawings

FIG. 1 is a system architecture diagram;

FIG. 2 is a monitoring module architecture diagram;

FIG. 3 is a QoS module architecture diagram;

FIG. 4 is a graph of CPU utilization versus service response time;

FIG. 5 is a flow chart of cluster scaling;

FIG. 6 is a workload variation diagram;

FIG. 7 is a graph of system accuracy variation;

FIG. 8 is a graph of service response time variation for a k8s cluster;

FIG. 9 is a cumulative probability distribution plot of response time of the system;

FIG. 10 is a bar graph of average CPU utilization for a native Kubernetes cluster and a Kubernetes cluster using the present invention.

Detailed Description

As shown in FIG. 1, the present embodiment relates to a Kubernetes cluster automatic scaling system based on the MAPE (monitoring-analysis-planning-execution) architecture, which includes: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to both the QoS module and the scaling module; the QoS module calculates the upper limit of the CPU utilization rate that guarantees the service quality and outputs this upper limit to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, obtains an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to this ideal value.
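The data flow between the four modules can be sketched as a single control-loop pass. This is only an illustration under stated assumptions: the class name `AutoScaler`, its field names, and the use of plain callables for each module are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical alias for illustration: node name -> CPU utilization in [0, 1].
NodeUtilization = Dict[str, float]


@dataclass
class AutoScaler:
    """Wires the four modules together; every name here is an assumption."""
    monitor: Callable[[], NodeUtilization]                 # monitoring module
    qos_upper_limit: Callable[[NodeUtilization], float]    # QoS module
    plan: Callable[[NodeUtilization, float], int]          # scaling module
    execute: Callable[[int], None]                         # execution module

    def run_once(self) -> int:
        data = self.monitor()                  # monitoring data feeds both modules
        u_upper = self.qos_upper_limit(data)   # CPU utilization upper limit
        ideal = self.plan(data, u_upper)       # ideal value for the cluster
        self.execute(ideal)                    # apply it to the cluster
        return ideal
```

Passing the modules in as callables keeps the loop testable without a live cluster; each one can be replaced by a stub.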

As shown in FIG. 2, the monitoring module includes: a control center unit (Monitor center), a time-series database unit (InfluxDB) and a data monitoring unit (Heapster), wherein: the time-series database unit and the data monitoring unit are deployed in different Pods, the control center unit ensures the normal operation of the time-series database unit and the data monitoring unit, and the control center unit sends a list query request for all nodes in the Kubernetes cluster to the data monitoring unit. For each node, the data monitoring unit requests the cAdvisor in the Kubelet to acquire the CPU utilization rate of the current node, and outputs the acquired monitoring data to the time-series database unit for storage.
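One monitoring pass, as described above, might look like the following sketch. It deliberately avoids the real Heapster, cAdvisor and InfluxDB interfaces: `list_nodes`, `read_node_cpu` and the in-memory `store` list are hypothetical stand-ins for the node list query, the per-node cAdvisor request and the time-series database.

```python
import time
from typing import Callable, Dict, List, Tuple

# (timestamp, node name, CPU utilization) rows standing in for the time-series database.
Sample = Tuple[float, str, float]


def collect_cpu_samples(
    list_nodes: Callable[[], List[str]],
    read_node_cpu: Callable[[str], float],
    store: List[Sample],
    now: Callable[[], float] = time.time,
) -> Dict[str, float]:
    """One monitoring pass: query every node and persist its CPU utilization."""
    latest: Dict[str, float] = {}
    for node in list_nodes():                 # list query for all nodes in the cluster
        utilization = read_node_cpu(node)     # stand-in for the per-node cAdvisor request
        store.append((now(), node, utilization))
        latest[node] = utilization
    return latest
```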

As shown in FIG. 3, the QoS module includes: a control center unit, a pressure tool unit and an application program unit, wherein: the pressure tool unit and the application program unit are deployed in different Pods on the same node, and the pressure tool unit changes the CPU utilization rate of the server according to its operation parameter (CPUs). The control center unit sends a request to the application program and calculates the response time from the received response. By repeatedly changing the CPU parameter of the pressure tool unit, different CPU utilization rates are set and the corresponding response times are measured, which yields the relation graph of response time and CPU utilization rate shown in FIG. 4.
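The measurement sweep can be sketched as a loop over load levels. The function name and its parameters are assumptions for illustration: `set_cpu_load` stands in for reconfiguring the pressure tool, and `send_request` for the control center's request to the application.

```python
import time
from typing import Callable, Iterable, List, Tuple


def measure_response_curve(
    cpu_levels: Iterable[float],
    set_cpu_load: Callable[[float], None],
    send_request: Callable[[], None],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Tuple[float, float]]:
    """Return (CPU utilization, response time) pairs, one per load level."""
    curve = []
    for level in cpu_levels:
        set_cpu_load(level)      # stand-in for changing the pressure tool parameter
        start = clock()
        send_request()           # stand-in for the request sent to the application
        curve.append((level, clock() - start))
    return curve
```

In practice one would likely average several requests per level to smooth out noise; the sketch takes a single measurement per level to stay short.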

The upper limit of the response time is $T_{limit} = \alpha \times T_{normal}$, wherein: $\alpha$ is a coefficient greater than 1, and $T_{normal}$ is the response time obtained in an environment where Kubernetes cluster resources are abundant. Then $T_{limit}$ is located on the relation graph of response time and CPU utilization rate to obtain the corresponding upper limit of the CPU utilization rate, $U_{limit}$.
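As a rough sketch of this lookup, the function below reads $U_{limit}$ off a measured curve by linear interpolation. The interpolation, the function name and the assumption that response time is non-decreasing in CPU utilization are choices made here for illustration; the patent only says the value is obtained from the relation graph.

```python
from typing import List, Tuple


def cpu_upper_limit(curve: List[Tuple[float, float]], t_normal: float, alpha: float) -> float:
    """Find U_limit where the curve reaches T_limit = alpha * T_normal.

    `curve` holds (CPU utilization, response time) pairs sorted by utilization;
    response time is assumed to be non-decreasing along the curve.
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    t_limit = alpha * t_normal
    if t_limit <= curve[0][1]:
        return curve[0][0]
    for (u0, t0), (u1, t1) in zip(curve, curve[1:]):
        if t0 <= t_limit <= t1:
            if t1 == t0:
                return u1
            # Linear interpolation between the two neighbouring measurements.
            return u0 + (u1 - u0) * (t_limit - t0) / (t1 - t0)
    return curve[-1][0]  # T_limit is never reached within the measured range
```

For example, with measurements `[(0.25, 1.0), (0.5, 2.0), (1.0, 4.0)]`, `t_normal = 1.0` and `alpha = 3`, the limit $T_{limit} = 3.0$ falls halfway between the last two points, so the function returns a utilization of 0.75.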

The cluster scaling algorithm performs a threshold judgment between the upper limit of the CPU utilization rate provided by the QoS module and the current CPU utilization rate of each node on the cluster provided by the monitoring module: the cluster is enlarged when the utilization exceeds the upper limit and reduced when it falls below the lower threshold. The specific steps are as follows:

1) Acquire the current CPU utilization rate of each node on the cluster through the monitoring module.

2) Compare the CPU utilization rate of each node with $U_{upper}$ and 0.4: when it lies in the interval $[0.4, U_{upper}]$, the scaling size is 0 (i.e., no cluster scaling is performed); when it is greater than $U_{upper}$, the cluster is enlarged; and when it is less than 0.4, the cluster is reduced.

3) When the current execution and the previous execution are both cluster enlargements, or both cluster reductions, the current scaling size is 2 times the previous scaling size; otherwise, the scaling size is 1.
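Steps 2) and 3) can be sketched as one decision function. Several details are assumptions rather than statements from the patent: the function takes a single utilization reading (how per-node readings are combined is not specified), it encodes direction as the sign of the result, and the name `scaling_step` is hypothetical.

```python
LOWER = 0.4  # lower CPU utilization threshold stated in step 2)


def scaling_step(utilization: float, u_upper: float, last_step: int) -> int:
    """Return a signed scaling size: positive enlarges, negative reduces, 0 does nothing.

    `last_step` is the signed size returned by the previous call (0 if there was none).
    """
    if LOWER <= utilization <= u_upper:
        return 0                                   # inside the band: no scaling
    direction = 1 if utilization > u_upper else -1
    same_direction = last_step * direction > 0     # previous action had the same sign
    size = 2 * abs(last_step) if same_direction else 1
    return direction * size
```

With `u_upper = 0.8`, three consecutive readings of 0.9 give steps of 1, 2 and 4; a later reading of 0.1 switches direction and resets the size, giving -1.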

As shown in FIG. 5, the execution module controls the Kubernetes cluster to perform the corresponding cluster scaling through the kubectl command line interface.
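The patent does not say which kubectl subcommand the execution module issues. Purely as an illustration, the sketch below assumes the scaling target is the replica count of a Deployment and builds a `kubectl scale` command for it; the function name, the Deployment name and the namespace default are hypothetical, and the command is only constructed, not run.

```python
from typing import List


def build_scale_command(deployment: str, replicas: int, namespace: str = "default") -> List[str]:
    """Build (but do not run) a kubectl command that sets a Deployment's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]
```

The returned list could be handed to `subprocess.run(..., check=True)` on a machine where kubectl is configured for the target cluster.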

In this embodiment, five four-core servers are used, and workload values in the interval [0, 50000] are generated by a normal distribution function to simulate a real workload. As shown in FIG. 6, the abscissa represents the processed time, where T denotes the current time and $T_{dur}$ represents the duration of each load, and the ordinate represents the request rate.
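A workload of this kind could be generated roughly as follows. The mean, standard deviation, seed and the clipping of samples to the interval are assumptions chosen for illustration; the patent states only that a normal distribution function and the interval [0, 50000] are used.

```python
import random
from typing import List


def generate_workload(steps: int, mean: float = 25000.0, stddev: float = 8000.0,
                      low: float = 0.0, high: float = 50000.0, seed: int = 0) -> List[float]:
    """Draw one request rate per load period from a normal distribution, clipped to [low, high]."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [min(high, max(low, rng.gauss(mean, stddev))) for _ in range(steps)]
```

Each returned value would then be held for one load period of length $T_{dur}$ before moving to the next.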

FIG. 7 shows how the accuracy of the present invention varies with the duration $T_{dur}$. It follows that an accuracy of 0.96 can be obtained when $T_{dur}$ is 30.

FIG. 8 shows the relationship between the service quality on the Kubernetes cluster and the coefficient α in the QoS module. When the coefficient α is 2, a service response time close to that of the native Kubernetes cluster is obtained, so the service quality is ensured.

FIG. 9 shows the cumulative probability distribution of the response time of the present invention. It indicates that the average response time of the invention is about 15 s, which is due to the time required on the Kubernetes cluster for Pod start-up, image download, and so on.

FIG. 10 is a bar graph of the average CPU utilization under different loads for the native Kubernetes cluster and the Kubernetes cluster using the present invention. It shows that under the present invention the CPU utilization of the k8s cluster is improved by about 30%.

The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims

1. A Kubernetes cluster automatic scaling system, comprising: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to both the QoS module and the scaling module; the QoS module calculates the upper limit of the CPU utilization rate that guarantees the service quality and outputs this upper limit to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, obtains an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to this ideal value.

2. The Kubernetes cluster automatic scaling system of claim 1, wherein the monitoring module includes: a control center unit, a time-series database unit and a data monitoring unit, wherein: the time-series database unit and the data monitoring unit are deployed in different Pods, the control center unit ensures the normal operation of the time-series database unit and the data monitoring unit, the control center unit sends a list query request for all nodes in the Kubernetes cluster to the data monitoring unit, and, for each node, the data monitoring unit requests the cAdvisor in the Kubelet to acquire the CPU utilization rate of the current node and outputs the acquired monitoring data to the time-series database unit for storage.

3. The Kubernetes cluster automatic scaling system of claim 1, wherein the QoS module includes: a control center unit, a pressure tool unit and an application program unit, wherein: the pressure tool unit and the application program unit are deployed in different Pods on the same node, the pressure tool unit changes the CPU utilization rate of the server according to its operation parameters, and the control center unit sends a request to the application program and calculates the response time from the received response.

4. The Kubernetes cluster automatic scaling system of claim 3, wherein the relation graph of response time and CPU utilization rate is obtained by adjusting the CPU utilization rate through the pressure tool unit and measuring the corresponding response time.

5. The Kubernetes cluster automatic scaling system according to claim 3 or 4, characterized in that the upper limit of the response time is $T_{limit} = \alpha \times T_{normal}$, wherein: $\alpha$ is a coefficient greater than 1, and $T_{normal}$ is the response time obtained in an environment where Kubernetes cluster resources are abundant; $T_{limit}$ is then located on the relation graph of response time and CPU utilization rate to obtain the corresponding upper limit of the CPU utilization rate, $U_{limit}$.

6. The Kubernetes cluster automatic scaling system of claim 1, wherein the cluster scaling algorithm performs a threshold judgment between the upper limit of the CPU utilization rate provided by the QoS module and the current CPU utilization rate of each node on the cluster provided by the monitoring module, enlarging the cluster when the utilization exceeds the upper limit and reducing it when the utilization falls below the lower threshold.

7. The Kubernetes cluster automatic scaling system according to claim 1 or 6, wherein the cluster scaling algorithm comprises the specific steps of:

1) acquiring the current CPU utilization rate of each node on the cluster through a monitoring module;

2) comparing the CPU utilization rate of each node with $U_{upper}$ and 0.4: when it lies in the interval $[0.4, U_{upper}]$, no cluster scaling is executed; when it is greater than $U_{upper}$, the cluster is enlarged; and when it is less than 0.4, the cluster is reduced.

8. The Kubernetes cluster automatic scaling system of claim 1, wherein the execution module controls the Kubernetes cluster to execute the corresponding cluster scaling through the kubectl command line interface.