CN112214303A - Kubernetes cluster automatic scaling system - Google Patents

Kubernetes cluster automatic scaling system

Publication number
CN112214303A
CN112214303A CN201910612910.8A CN201910612910A CN112214303A CN 112214303 A CN112214303 A CN 112214303A CN 201910612910 A CN201910612910 A CN 201910612910A CN 112214303 A CN112214303 A CN 112214303A
Authority
CN
China
Prior art keywords
cluster
module
unit
cpu utilization
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910612910.8A
Other languages
Chinese (zh)
Inventor
伍强
俞嘉地
薛广涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201910612910.8A
Publication of CN112214303A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Abstract

A Kubernetes cluster automatic scaling system comprises: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to the QoS module and to the scaling module; the QoS module calculates the upper limit of CPU utilization that still guarantees service quality and outputs it to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, derives an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to that ideal value. The invention can reduce the deployment and operation cost of web services and applications, reduce the waste of resources, dynamically adjust the size of the Kubernetes cluster and improve the utilization of cluster resources.

Description

Kubernetes cluster automatic scaling system
Technical Field
The invention relates to a technology in the field of internet information processing, in particular to a Kubernetes (k8s for short) cluster automatic scaling system.
Background
Kubernetes is a container management system open-sourced by Google that makes it easier for users to deploy services in containers. More and more enterprises and developers are migrating their web applications to Kubernetes clusters. However, the workload of a web application varies greatly, so the application's demand for server cluster resources also changes considerably. The usual approach today is to size the Kubernetes cluster for the workload peak, which leaves most cluster resources idle under normal conditions, so that resource utilization is low and a great deal of capacity is wasted.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a Kubernetes cluster automatic scaling system that can reduce the deployment and operation cost of web applications and reduce the waste of resources. The system automatically adjusts the size of the Kubernetes cluster according to changes in the cluster's workload, improving the utilization of cluster resources while still ensuring the service quality of web applications.
The invention is realized by the following technical scheme:
The invention comprises: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to the QoS module and to the scaling module; the QoS module calculates the upper limit of CPU utilization that still guarantees service quality and outputs it to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, derives an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to that ideal value.
Technical effects
Compared with the prior art, the invention improves the resource utilization of the Kubernetes cluster while still ensuring the service quality of the applications running on it.
Drawings
FIG. 1 is a system architecture diagram;
FIG. 2 is a monitoring module architecture diagram;
FIG. 3 is a QoS module architecture diagram;
FIG. 4 is a graph of CPU utilization versus service response time;
FIG. 5 is a flow chart of cluster scaling;
FIG. 6 is a workload variation diagram;
FIG. 7 is a graph of system accuracy variation;
FIG. 8 is a graph of service response time variation for a k8s cluster;
FIG. 9 is a cumulative probability distribution plot of response time of the system;
FIG. 10 is a bar graph of average CPU utilization for a native Kubernetes cluster and a Kubernetes cluster using the present invention.
Detailed Description
As shown in FIG. 1, this embodiment relates to a Kubernetes cluster automatic scaling system based on the MAPE (monitoring-analysis-planning-execution) architecture, which includes: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to the QoS module and to the scaling module; the QoS module calculates the upper limit of CPU utilization that still guarantees service quality and outputs it to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, derives an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to that ideal value.
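The data flow between the four modules can be sketched as a single control-loop step. This is only an illustrative sketch, not the patented implementation: the class name `AutoScaler`, its field names, and the choice to treat the "ideal value" as an integer cluster size are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AutoScaler:
    """Hypothetical wiring of the four modules (all names are assumptions)."""

    # Monitoring module: returns {node name: CPU utilization in [0, 1]}.
    monitor: Callable[[], Dict[str, float]]
    # QoS module: returns the CPU utilization upper limit from monitoring data.
    qos_upper_limit: Callable[[Dict[str, float]], float]
    # Scaling module: (monitoring data, upper limit, current size) -> ideal size.
    plan: Callable[[Dict[str, float], float, int], int]
    # Execution module: applies the ideal size to the cluster.
    execute: Callable[[int], None]
    # Current cluster size (assumed here to be an integer count).
    size: int

    def step(self) -> int:
        """Run one monitor -> analyze -> plan -> execute pass."""
        usage = self.monitor()
        upper = self.qos_upper_limit(usage)
        ideal = self.plan(usage, upper, self.size)
        if ideal != self.size:
            # Only touch the cluster when the plan asks for a change.
            self.execute(ideal)
            self.size = ideal
        return self.size
```

Passing the modules in as callables keeps the loop itself independent of how monitoring data is collected or how the cluster is actually resized.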
As shown in FIG. 2, the monitoring module includes: a control center unit (Monitor center), a time-series database unit (InfluxDB) and a data monitoring unit (Heapster), wherein: the time-series database unit and the data monitoring unit are deployed in different Pods; the control center unit keeps the time-series database unit and the data monitoring unit running normally, and sends the data monitoring unit a request to list all nodes in the Kubernetes cluster. For each node, the data monitoring unit queries the cAdvisor in the Kubelet for the CPU utilization of that node and outputs the acquired monitoring data to the time-series database unit for storage.
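As a rough illustration of what the monitoring data looks like, the sketch below turns per-node CPU readings into a dictionary. Note the substitution: instead of the Heapster, cAdvisor and InfluxDB pipeline described above, it parses the text printed by `kubectl top nodes`, which relies on the Metrics Server being installed. The function name `parse_top_nodes` is hypothetical.

```python
from typing import Dict


def parse_top_nodes(output: str) -> Dict[str, float]:
    """Parse `kubectl top nodes` text into {node name: CPU utilization in [0, 1]}.

    Expected columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
    """
    usage: Dict[str, float] = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and rows whose CPU% column is not a percentage.
        if len(fields) < 3 or not fields[2].endswith("%"):
            continue
        try:
            usage[fields[0]] = float(fields[2].rstrip("%")) / 100.0
        except ValueError:
            # The header row ("CPU%") lands here and is ignored.
            continue
    return usage
```

In practice the text could come from running `kubectl top nodes` with `subprocess.run(..., capture_output=True, text=True)`; the readings would then be written to whatever time-series store is in use.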
As shown in FIG. 3, the QoS module includes: a control center unit, a pressure tool unit and an application program unit, wherein: the pressure tool unit and the application program unit are deployed in different Pods on the same node; the pressure tool unit changes the CPU utilization of the server according to its operation parameter (CPUs); and the control center unit sends requests to the application program and calculates the response time from the responses received. By repeatedly changing the CPU parameter of the pressure tool unit, different CPU utilizations are set and the corresponding response times are measured, which yields the relation graph of response time and CPU utilization shown in FIG. 4.
The upper limit of the response time is $T_{limit} = \alpha \times T_{normal}$, where $\alpha$ is a coefficient greater than 1 and $T_{normal}$ is the response time obtained in an environment where Kubernetes cluster resources are abundant. $T_{limit}$ is then located on the relation graph of response time and CPU utilization to obtain the corresponding CPU utilization $U_{limit}$, whose upper limit is given by
[Equation image BDA0002122942410000021, not reproduced]
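A minimal sketch of the lookup described here, assuming the relation graph is available as measured (CPU utilization, response time) samples and that response time does not decrease as utilization rises. Linear interpolation between samples is an assumption of this sketch, and both function names are hypothetical. The final step from $U_{limit}$ to the upper limit used by the scaling module is defined by an equation image that is not reproduced in this text, so it is not modeled.

```python
from typing import List, Tuple


def response_time_limit(t_normal: float, alpha: float) -> float:
    """Compute T_limit = alpha * T_normal, with alpha required to exceed 1."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    return alpha * t_normal


def utilization_at_limit(curve: List[Tuple[float, float]], t_limit: float) -> float:
    """Find the CPU utilization U_limit at which response time reaches t_limit.

    `curve` holds (cpu_utilization, response_time) samples. Response time is
    assumed to be non-decreasing in utilization.
    """
    points = sorted(curve)
    if t_limit <= points[0][1]:
        # The limit is already reached at the lowest measured utilization.
        return points[0][0]
    for (u0, t0), (u1, t1) in zip(points, points[1:]):
        if t0 <= t_limit <= t1 and t1 > t0:
            # Interpolate linearly inside the segment that crosses t_limit.
            return u0 + (u1 - u0) * (t_limit - t0) / (t1 - t0)
    # The limit is never reached within the measured range.
    return points[-1][0]
```

For example, with samples `[(0.2, 1.0), (0.4, 1.0), (0.6, 2.0), (0.8, 4.0)]`, a normal response time of 1.0 and a coefficient of 2, the limit of 2.0 is reached at a utilization of about 0.6.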
The cluster scaling algorithm compares the upper limit of CPU utilization provided by the QoS module with the current CPU utilization of each node in the cluster provided by the monitoring module: the cluster is enlarged when utilization exceeds the upper limit and reduced when it falls below a lower threshold of 0.4. The specific steps are as follows:
1) Acquire the current CPU utilization of each node in the cluster through the monitoring module.
2) Compare the CPU utilization of each node with $U_{upper}$ and 0.4: when it lies in the interval $[0.4, U_{upper}]$, the scaling size is 0 (i.e., no cluster scaling is performed); when it is greater than $U_{upper}$, the cluster is enlarged; and when it is less than 0.4, the cluster is reduced.
3) When the current execution and the last execution are both cluster enlargements or both cluster reductions, the current scaling size is 2 times the last scaling size; otherwise, the scaling size is 1.
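The three steps can be sketched as follows. Two points are assumptions of this sketch rather than statements of the source: the per-node readings are aggregated by their mean before comparison (the text does not say how the per-node values are combined), and a round with no scaling resets the doubling, so the next scaling action starts again at size 1. The names `ScalingState` and `decide` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

LOWER_THRESHOLD = 0.4  # lower bound of the no-scaling interval, from step 2


@dataclass
class ScalingState:
    """Remembers the previous decision so the step size can be doubled."""

    last_direction: int = 0  # +1 enlarge, -1 reduce, 0 no scaling
    last_step: int = 0


def decide(utilizations: Iterable[float], u_upper: float, state: ScalingState) -> int:
    """Return the signed change in cluster size (+ enlarge, - reduce, 0 none)."""
    values = list(utilizations)
    if not values:
        raise ValueError("at least one node utilization is required")
    mean = sum(values) / len(values)  # assumption: aggregate nodes by mean

    if mean > u_upper:
        direction = 1
    elif mean < LOWER_THRESHOLD:
        direction = -1
    else:
        direction = 0

    if direction == 0:
        step = 0
    elif direction == state.last_direction:
        step = 2 * state.last_step  # same direction twice in a row: double
    else:
        step = 1  # first action, or direction changed

    state.last_direction, state.last_step = direction, step
    return direction * step
```

Under sustained overload the changes are +1, +2, +4 and so on, so the cluster catches up with a load spike in a logarithmic number of rounds, while a single in-range reading returns the step size to 1.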
As shown in FIG. 5, the execution module controls the Kubernetes cluster through the kubectl command-line interface to perform the corresponding cluster scaling.
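The source does not state which kubectl subcommand or which kind of object the execution module acts on. Purely as an assumption, the sketch below builds a `kubectl scale` command that sets the replica count of a Deployment; the function name, the Deployment name and the namespace are hypothetical. Changing the number of nodes would instead require provider-specific tooling that is not covered here.

```python
from typing import List


def build_scale_command(deployment: str, replicas: int, namespace: str = "default") -> List[str]:
    """Build (but do not run) a kubectl command that sets a Deployment's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl",
        "scale",
        f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace",
        namespace,
    ]
```

The returned list can be handed to `subprocess.run(command, check=True)`; keeping command construction separate from execution makes the scaling decision easy to test without a live cluster.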
In this embodiment, 5 four-core servers are used, and values in the interval [0, 50000] are generated by a normal distribution function to simulate a real workload. In FIG. 6, the abscissa represents the processed time
[Equation image BDA0002122942410000031, not reproduced]
where $T$ denotes the current time and $T_{dur}$ denotes the duration of each load, and the ordinate represents the request rate.
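A workload of this kind might be generated as below. The mean, the standard deviation, and the choice to clamp out-of-range samples to the interval are assumptions; the source gives only the interval [0, 50000] and the use of a normal distribution. The function name `generate_workload` is hypothetical.

```python
import random
from typing import List, Optional


def generate_workload(
    n_periods: int,
    mean: float = 25000.0,     # assumed: centre of the interval
    stddev: float = 10000.0,   # assumed: not given in the source
    low: float = 0.0,
    high: float = 50000.0,
    seed: Optional[int] = None,
) -> List[float]:
    """Draw one request rate per load period from a normal distribution.

    Samples falling outside [low, high] are clamped to the nearest bound.
    """
    rng = random.Random(seed)
    return [min(high, max(low, rng.gauss(mean, stddev))) for _ in range(n_periods)]
```

Each generated value would be held for one load duration $T_{dur}$ before moving on to the next; passing a fixed `seed` makes a run repeatable.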
FIG. 7 shows how the accuracy of the present invention varies with the duration $T_{dur}$. Accordingly, taking $T_{dur} = 30$ yields an accuracy of 0.96.
FIG. 8 shows the relationship between the service quality on the Kubernetes cluster and the coefficient $\alpha$ in the QoS module. When $\alpha$ is 2, the service response time is close to that of the native Kubernetes cluster, so service quality is ensured.
FIG. 9 shows the cumulative probability distribution of the response time of the present invention. From it, the average response time of the invention is about 15 s, which is due to the time required for Pod startup, image download and similar operations on the Kubernetes cluster.
FIG. 10 is a bar graph of the average CPU utilization under different loads for the native Kubernetes cluster and the Kubernetes cluster using the present invention. It can be seen that, with the present invention, the CPU utilization of the k8s cluster is improved by about 30%.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (8)

1. A Kubernetes cluster automatic scaling system, comprising: a monitoring module for monitoring the status of the entire Kubernetes cluster, a QoS (quality of service) module, a scaling module and an execution module, wherein: the monitoring module outputs monitoring data to the QoS module and to the scaling module; the QoS module calculates the upper limit of CPU utilization that still guarantees service quality and outputs it to the scaling module; the scaling module, which has a built-in cluster scaling algorithm, derives an ideal value for the Kubernetes cluster from the monitoring data and the CPU utilization upper limit and outputs it to the execution module; and the execution module scales the cluster according to that ideal value.
2. The Kubernetes cluster automatic scaling system of claim 1, wherein the monitoring module includes: a control center unit, a time-series database unit and a data monitoring unit, wherein: the time-series database unit and the data monitoring unit are deployed in different Pods; the control center unit keeps the time-series database unit and the data monitoring unit running normally and sends the data monitoring unit a request to list all nodes in the Kubernetes cluster; and, for each node, the data monitoring unit queries the cAdvisor in the Kubelet for the CPU utilization of that node and outputs the acquired monitoring data to the time-series database unit for storage.
3. The Kubernetes cluster automatic scaling system of claim 1, wherein the QoS module includes: a control center unit, a pressure tool unit and an application program unit, wherein: the pressure tool unit and the application program unit are deployed in different Pods on the same node; the pressure tool unit changes the CPU utilization of the server according to its operation parameters; and the control center unit sends requests to the application program and calculates the response time from the responses received.
4. The Kubernetes cluster automatic scaling system of claim 3, wherein the relation graph of response time and CPU utilization is obtained by adjusting the CPU utilization through the pressure tool unit and measuring the corresponding response time.
5. The Kubernetes cluster automatic scaling system according to claim 3 or 4, characterized in that the upper limit of the response time is $T_{limit} = \alpha \times T_{normal}$, wherein: $\alpha$ is a coefficient greater than 1 and $T_{normal}$ is the response time obtained in an environment where Kubernetes cluster resources are abundant; $T_{limit}$ is then located on the relation graph of response time and CPU utilization to obtain the corresponding CPU utilization $U_{limit}$, whose upper limit is given by
[Equation image FDA0002122942400000011, not reproduced]
6. The Kubernetes cluster automatic scaling system of claim 1, wherein the cluster scaling algorithm compares the upper limit of CPU utilization provided by the QoS module with the current CPU utilization of each node in the cluster provided by the monitoring module, enlarges the cluster when utilization exceeds the upper limit, and reduces the cluster when utilization falls below the lower threshold.
7. The Kubernetes cluster automatic scaling system according to claim 1 or 6, wherein the cluster scaling algorithm comprises the following specific steps:
1) acquiring the current CPU utilization of each node in the cluster through the monitoring module;
2) comparing the CPU utilization of each node with $U_{upper}$ and 0.4: when it lies in the interval $[0.4, U_{upper}]$, no cluster scaling is performed; when it is greater than $U_{upper}$, the cluster is enlarged; when it is less than 0.4, the cluster is reduced;
3) when the current execution and the last execution are both cluster enlargements or both cluster reductions, the current scaling size is 2 times the last scaling size; otherwise, the scaling size is 1.
8. The Kubernetes cluster automatic scaling system of claim 1, wherein the execution module controls the Kubernetes cluster through the kubectl command-line interface to perform the corresponding cluster scaling.
CN201910612910.8A 2019-07-09 2019-07-09 Kubernetes cluster automatic scaling system Pending CN112214303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910612910.8A CN112214303A (en) 2019-07-09 2019-07-09 Kubernetes cluster automatic scaling system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910612910.8A CN112214303A (en) 2019-07-09 2019-07-09 Kubernetes cluster automatic scaling system

Publications (1)

Publication Number Publication Date
CN112214303A (en) 2021-01-12

Family

ID=74048329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910612910.8A Pending CN112214303A (en) 2019-07-09 2019-07-09 Kubernetes cluster automatic scaling system

Country Status (1)

Country Link
CN (1) CN112214303A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10044640B1 (en) * 2016-04-26 2018-08-07 EMC IP Holding Company LLC Distributed resource scheduling layer utilizable with resource abstraction frameworks
CN107734035A (en) * 2017-10-17 2018-02-23 华南理工大学 A kind of Virtual Cluster automatic telescopic method under cloud computing environment
CN108469989A (en) * 2018-03-13 2018-08-31 广州西麦科技股份有限公司 A kind of reaction type based on clustering performance scalable appearance method and system automatically
CN108848157A (en) * 2018-06-12 2018-11-20 郑州云海信息技术有限公司 A kind of method and apparatus of Kubernetes cluster container monitors
CN109960585A (en) * 2019-02-02 2019-07-02 浙江工业大学 A kind of resource regulating method based on kubernetes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
F. AL-HAIDARI: "Impact of CPU Utilization Thresholds and Scaling Size on Autoscaling Cloud Resources", 《2013 IEEE 5TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE》 *
TIAN YE: "An Auto-Scaling Framework for Containerized Elastic Applications", 《2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM)》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11868802B2 (en) 2021-07-09 2024-01-09 Red Hat, Inc. Application lifecycle management based on real-time resource usage
CN114816750A (en) * 2022-04-24 2022-07-29 江苏鼎集智能科技股份有限公司 Big data management task operation method
CN114816750B (en) * 2022-04-24 2022-12-23 江苏鼎集智能科技股份有限公司 Big data management task operation method

Similar Documents

Publication Publication Date Title
JP6457447B2 (en) Data center network traffic scheduling method and apparatus
US9191463B2 (en) Stream processing using a client-server architecture
US8341439B2 (en) Power management apparatus and method thereof and power control system
CN106557369A (en) A kind of management method and system of multithreading
US20090327459A1 (en) On-Demand Capacity Management
CN110677274A (en) Event-based cloud network service scheduling method and device
US9851988B1 (en) Recommending computer sizes for automatically scalable computer groups
US20160080267A1 (en) Monitoring device, server, monitoring system, monitoring method and program recording medium
CN105808341A (en) Method, apparatus and system for scheduling resources
CN109117244B (en) Method for implementing virtual machine resource application queuing mechanism
CN112214303A (en) Kubernetes cluster automatic scaling system
JP2007257163A (en) Operation quality management method in distributed program execution environment
CN103442087B (en) A kind of Web service system visit capacity based on response time trend analysis controls apparatus and method
CN116643844B (en) Intelligent management system and method for automatic expansion of power super-computing cloud resources
CN114490091B (en) Method and device for monitoring rule engine performance in industrial data acquisition management system
Zhou et al. AHPA: adaptive horizontal pod autoscaling systems on alibaba cloud container service for kubernetes
CN114443262A (en) Computing resource management method, device, equipment and system
Huaijun et al. Research and implementation of mobile cloud computing offloading system based on Docker container
CN111026553B (en) Resource scheduling method and server system for offline mixed part operation
Li et al. CODEC: Cost-Effective Duration Prediction System for Deadline Scheduling in the Cloud
CN116526678B (en) Intelligent computing center power supply elastic scheduling system and control method thereof
CN112182363B (en) Intelligent auditing method, device, equipment and storage medium based on micro-service framework
CN111258710B (en) System maintenance method and device
Cheng et al. Design and Implement for Reducing the Temporary High Load of Device in Industrial Networks
CN117201339A (en) Data acquisition method, system, equipment and storage medium based on ai decision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210112
