CN110825520A - Cluster top-speed elastic expansion method for realizing efficient resource utilization - Google Patents


Info

Publication number
CN110825520A
CN110825520A (application CN201910994328.2A)
Authority
CN
China
Prior art keywords
resource
pod
resources
task
executing
Prior art date
Legal status
Granted
Application number
CN201910994328.2A
Other languages
Chinese (zh)
Other versions
CN110825520B (en)
Inventor
单朋荣
杨美红
赵志刚
陈静
厉承轩
刘凯
于焕焕
Current Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date
Filing date
Publication date
Application filed by Shandong Computer Science Center National Super Computing Center in Jinan filed Critical Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN201910994328.2A priority Critical patent/CN110825520B/en
Publication of CN110825520A publication Critical patent/CN110825520A/en
Application granted granted Critical
Publication of CN110825520B publication Critical patent/CN110825520B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5027 — the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038 — considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 — Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 — Configuration management of networks or network elements
    • H04L 41/0896 — Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a cluster top-speed elastic scaling method for realizing efficient resource utilization, comprising the following steps. Step 1: create an intelligent elastic scaling compensation module (IACM) and a compensation queue. Step 2: judge whether the service cluster will need additional nodes in a coming period. Step 3: add Pod resources to the compensation queue. Step 4: set the priority of the compensation queue. Step 5: perform recovery operations on Pod resources. Step 6: maintain Pod resources. Step 7: repeat the judgment periodically. With this method, when the resource demand of the service cluster grows, the required new replica already exists in the compensation queue and can be added to the cluster by pulling it up directly, saving the time of building a new replica and achieving extremely fast expansion of cluster node resources. When the resource pool runs short, resources are adjusted; tasks that occupy a large share of resources and cannot be adjusted, or that have low priority, can be evicted or killed to achieve resource scheduling and resource expansion.

Description

Cluster top-speed elastic expansion method for realizing efficient resource utilization
Technical Field
The invention relates to a cluster top-speed elastic expansion method, in particular to a cluster top-speed elastic expansion method for realizing efficient resource utilization, and belongs to the technical field of big data and cloud computing platforms.
Background
At present, methods for automatically scaling a Kubernetes container cluster based on a resource threshold or a user-defined metric must resolve the contradiction between rapidly bringing up nodes and avoiding cluster oscillation. Because newly created nodes take a long time to join the cluster, these methods cannot respond to scaling demands in time. The intelligent scaling compensation technique proposed here effectively shortens the time for a newly created node to join the cluster and provide service, accelerates the scaling response, and guarantees quality of service.
In addition, in scheduling and placing automatically scaled-out nodes, the existing scheduling strategy supports only the dimension of resource requirements and gives insufficient support to the behavioral characteristics of the application. The present method considers both aspects and proposes a "hybrid scheduling soft segmentation" process that makes full use of resources and thereby minimizes cost.
Disclosure of Invention
To overcome the defects described above, the invention provides a cluster top-speed elastic scaling method for realizing efficient resource utilization.
The invention discloses a cluster top-speed elastic scaling method for realizing efficient resource utilization, which comprises an intelligent scaling compensation process and a hybrid scheduling soft segmentation process; the intelligent scaling compensation process is realized through the following steps:
Step 1: the service cluster creates an intelligent elastic scaling compensation module (IACM) as a custom resource in Kubernetes, and defines a compensation queue inside the IACM; the compensation queue is maintained by a DaemonSet that the IACM deploys on each node;
Step 2: judge whether the service cluster will need additional nodes in a coming period; if so, create a Pod and execute step 3; if no nodes need to be added, enter the next prediction period. A Pod is an aggregation of containers sharing a namespace and network space, and is the basic unit of cluster scheduling;
Step 3: add the Pod resources generated by the prediction algorithm of step 2 to the compensation queue, and execute step 4;
Step 4: set the priority of the compensation queue, and execute step 5;
Step 5: when the length of the compensation queue exceeds a set threshold, periodically clean and recycle Pod resources in the compensation queue, and execute step 6;
Step 6: maintain the enqueuing and dequeuing of Pod resources and their cleaning from the compensation queue, and execute step 7;
Step 7: periodically perform the judgment of step 2.
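Steps 1 to 7 above can be sketched as a periodic control loop. The following Python is a minimal illustration only; the class and function names, and the clean-up threshold, are ours rather than the patent's:

```python
from collections import deque

class CompensationQueue:
    """Illustrative compensation queue maintained by the IACM."""

    def __init__(self, max_len=8):
        self.max_len = max_len          # step 5: clean-up threshold
        self.items = deque()

    def enqueue(self, pod):             # step 3: add a pre-created Pod
        self.items.append(pod)

    def pull(self):                     # dequeue a prepared replica, if any
        return self.items.popleft() if self.items else None

    def clean(self):                    # step 5: recycle Pods over the threshold
        while len(self.items) > self.max_len:
            self.items.pop()

def iacm_cycle(queue, predict_need_node, new_pod):
    """One prediction period: step 2 (judge), step 3 (enqueue), steps 5-6."""
    if predict_need_node():
        queue.enqueue(new_pod())
    queue.clean()
```

Step 7 then corresponds to invoking `iacm_cycle` once per prediction period.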
The invention discloses a cluster top-speed elastic expansion method for realizing efficient resource utilization, wherein the prediction algorithm logic in the step 2 is realized through the following steps:
Step 2-1: fit the real-time monitoring data and the historical load data to judge the scaling demand; when nodes will need to be added, create a new Pod resource and add it to the compensation queue; when no nodes will be needed, do not create one;
Step 2-2: judge whether there are Pod resources to be recycled, then run the prediction algorithm; if the algorithm predicts those Pod resources will be needed in a coming period, add them to the compensation queue; if not, discard them;
Step 2-3: analyze the Pod resource demands submitted by users for the coming period, and create new Pod resources according to those demands and add them to the compensation queue.
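Steps 2-1 and 2-2 can be illustrated with a simple weighted fit. The patent leaves the concrete fitting model open, so the 50/50 blend and the capacity test below are assumptions for illustration only:

```python
def fit_load(history, realtime, w=0.5):
    """Step 2-1: blend historical load and real-time load (illustrative
    weighted average; the patent's actual fitting model is unspecified)."""
    return w * sum(history) / len(history) + (1 - w) * realtime

def plan_pods(history, realtime, capacity, recyclable):
    """Steps 2-1/2-2: returns (need_new_pod, keep_recyclable_pod).

    A Pod is pre-created when the fitted load exceeds capacity, and a
    recyclable Pod is kept in the queue only if it will be needed again."""
    predicted = fit_load(history, realtime)
    need_new = predicted > capacity
    keep = need_new and recyclable
    return need_new, keep
```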
The invention discloses a cluster top-speed elastic expansion method for realizing efficient resource utilization, wherein the priority of a compensation queue in step 4 is obtained through the following steps:
Step 4-1: compute the priority of a Pod resource in the compensation queue according to formula (1):

P_n = W_0·F_n + W_1·P_0 + W_2·C_n    (1)

where P_n is the priority of the newly enqueued Pod resource; W_0, W_1, W_2 are weight ratios with 0 < W_0, W_1, W_2 < 1 and W_0 + W_1 + W_2 = 1; F_n is the fitting factor, a value representing the degree of fit and the likelihood of the Pod being scheduled in the future; P_0 is the service's preset priority; and C_n is the credit factor.
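Formula (1) is straightforward to compute; the weight values in this sketch are illustrative examples, not values given by the patent:

```python
def pod_priority(f_n, p0, c_n, w=(0.4, 0.3, 0.3)):
    """Formula (1): P_n = W0*F_n + W1*P0 + W2*C_n, weights summing to 1.
    The default weights are illustrative placeholders."""
    w0, w1, w2 = w
    assert abs(w0 + w1 + w2 - 1.0) < 1e-9, "weights must sum to 1"
    return w0 * f_n + w1 * p0 + w2 * c_n
```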
Step 4-2: compute the fitting factor F_n in formula (1) by formula (2). Formula (2) appears as an image in the original; consistent with the surrounding definitions it reads:

F_{i+1} = ( Σ_{m=1..i′} OE_m ) / ( Σ_{m=1..i} IE_m ),  {IE, OE} ∈ S    (2)

where S denotes the service cluster; i is the total number of Pod-enqueue events of the cluster in history, and i′ the total number of Pod dequeues in the service cluster; OE_m is the Pod resource of the m-th dequeue from the compensation queue, and IE_m the Pod of the m-th enqueue; {IE, OE} ∈ S means the enqueued Pods IE_m and dequeued Pods OE_m belong to the same service cluster. F_{i+1}, the fitting factor of the newly added Pod resource, is thus judged from the historical ratio between the Pods of the same service cluster that left the compensation queue (the numerator) and all Pods it added to the compensation queue (the denominator).
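With formula (2) read as the ratio of historically dequeued to enqueued Pods of the same service cluster, the fitting factor can be computed as follows (a sketch under that reading; the per-Pod "resource" values are taken as simple counts):

```python
def fitting_factor(enqueued, dequeued):
    """Formula (2) as reconstructed: sum of OE_m over sum of IE_m for one
    service cluster. Returns 0.0 when the cluster has no enqueue history."""
    total_in = sum(enqueued)    # Σ IE_m, m = 1..i
    total_out = sum(dequeued)   # Σ OE_m, m = 1..i′
    return total_out / total_in if total_in else 0.0
```

A cluster whose prepared Pods are usually pulled up (high ratio) is thus predicted as more likely to be scheduled again.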
Step 4-3: the credit factor C_n in formula (1) is obtained through the following steps:
Step 4-3-1: obtain the historical resource-quota utilization. Suppose the service cluster has added Pod resources i times in history and n types of resources in a Pod participate in the calculation; the resource-quota utilizations of the i historical Pod additions are written U_1(t_1, d_1), …, U_m(t_m, d_m), …, U_i(t_i, d_i), where:

U_m(t_m, d_m) = ( u_{m,1}(t_m, d_m), …, u_{m,k}(t_m, d_m), …, u_{m,n}(t_m, d_m) )

U_m(t_m, d_m) is the resource utilization of the m-th added Pod; u_{m,k}(t_m, d_m) is the resource-quota utilization of the k-th resource in the m-th added Pod, k = 1, 2, …, n; t_m is the time at which the Pod starts providing service, and d_m the duration of the service it provides;
Step 4-3-2: compute the weights. First compute, by formula (3), the weight w_{m,k} occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in the added Pod. [Formula (3) appears as an image in the original; it defines w_{m,k} in terms of u_{m,k}(t_m, d_m), a decay factor α applied to d_m with 0 < α < 1, and a window D whose size equals the period T of the periodic dynamic resource-quota adjustment, m = 1, 2, …, i.] The weights of the resource-quota utilizations in the m-th added Pod are written as:

W_m = ( w_{m,1}, …, w_{m,k}, …, w_{m,n} )
Step 4-3-3: compute the credit C of the cluster the Pod belongs to. For the various resources in a Pod, after i historical Pod additions in the same service cluster, the credit factor of each added Pod resource is computed by formula (4). [Formulas (4) and (5) appear as images in the original.] For all resource types, after i resource-quota applications, when a new resource quota is allocated for the (i+1)-th time, the credit factor of the service cluster is the value C given by formula (5), so the credit factor of the (i+1)-th added Pod resource is c_n = c_{i+1} = c.
The cluster top-speed elastic expansion method for realizing the efficient utilization of resources is characterized in that the hybrid scheduling soft segmentation process is realized by the following steps:
a) divide the Pod resources created in the intelligent scaling compensation process into online-task resources and offline-task resources, and obtain the online-task resource ratio R_x and the offline-task resource ratio R_y through formula (6), where x and y are intermediate variables solved by formula (7). [Formulas (6) and (7) appear as images in the original.] α_1, α_2, …, α_n are the weights of the online tasks and α_1′, α_2′, …, α_n′ the weights of the offline tasks over the n resource types k_1, k_2, …, k_n; α_i + α_i′ = 1, and α_i, α_i′ are determined by the ratio of the respective sums of each type of resource of the online and offline tasks, i = 1, 2, …, n;
b) when the remaining rate of online-task resources is no more than 10%, or the task dequeue rate falls below a set threshold, resources are judged to be short and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority P_y > P_0, where P_0 is the eviction threshold of offline tasks; if yes, the offline task cannot be evicted, and step d) is executed; if not, the offline task can be evicted to release resources, and step i) is executed;
Let the new resource-ratio variables of the online and offline tasks be R′_x and R′_y respectively, with R′_x + R′_y = 1; then:

R′_x = R_x + λ    (8)
R_x = R′_x, R_y = R′_y    (9)

where λ is the resource adjustment factor, given by formula (10). [Formula (10) appears as an image in the original; it expresses λ in terms of the resources k_1′, k_2′, …, k_n′ newly released by the offline tasks.]
d) judge whether the utilization of the resources allocated to online tasks is at least 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources, by at most 50% of the total, and execute step j); if not, execute step k);
g) encroach on online-task resources, by at most 50% of the online tasks' total resources, and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the lowest-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R′_x, R′_y, and execute step k);
k) if the resources meet the demand, end the loop; otherwise execute steps b) to j) cyclically with period T.
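The shortage test of step b), the eviction decision of steps c) and i), and the ratio adjustment of formulas (8) and (9) can be sketched as follows. The thresholds are illustrative, and since formula (10) defining λ is an image in the original, λ is taken here simply as the fraction of resources released by the offline side:

```python
def need_adjustment(online_remaining, wait_period, t0=5):
    """Step b): resources are short when the online remainder is <= 10%
    or a task has waited longer than the threshold T0 (t0 illustrative)."""
    return online_remaining <= 0.10 or wait_period > t0

def shortage_action(offline_priority, p0):
    """Steps c)/i): evict the offline task only if its priority does not
    exceed the eviction threshold P0; otherwise fall through to steps d)-h)."""
    if offline_priority > p0:
        return "reschedule-or-encroach"   # cannot evict: steps d)-h)
    return "evict-offline"                # step i)

def adjust_ratios(r_x, lam):
    """Formulas (8)-(9): R'_x = R_x + lambda, with R'_x + R'_y = 1."""
    r_x_new = min(1.0, r_x + lam)
    return r_x_new, 1.0 - r_x_new
```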
In step c), the online-task priority P_x and the offline-task priority P_y are solved through formula (11). [Formula (11) appears as an image in the original; consistent with the variables defined here, a plausible reading is P_x = ω_0·P_s + ω_1·R_t + ξ·E_t and P_y = ω_0·P_s + ω_1·R_t + ξ′·E_t.] Here ω_0 and ω_1 are weights with ω_0 + ω_1 + ξ = 1 and ω_0 + ω_1 + ξ′ = 1; P_s is the task's system priority, R_t the task's running time, E_t the task's deadline, and ξ, ξ′ are period impact factors.
The cluster top-speed elastic scaling method for realizing efficient resource utilization is characterized in that, in step b), whether the task dequeue rate has fallen below the set threshold is measured by the task waiting period: when a task's waiting period exceeds the threshold T_0, resources are judged to be short and in need of adjustment.
When dynamic resource expansion is triggered, a new Pod resource is not started directly; instead the compensation queue is traversed first to check whether a prepared Pod resource exists. If one exists, it is pulled up directly; if not, the intelligent scaling compensation process is executed. When dynamic scale-in is triggered, the removed Pod is not killed directly after a cooling period but is placed in the compensation queue to wait, and is destroyed later by the intelligent compensation process. Because of the intelligent scaling compensation process, the original cooling period can be shortened appropriately, yielding a faster and finer-grained dynamic scaling scheme.
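The pull-from-queue-first behaviour just described can be sketched in a few lines (function names are illustrative):

```python
def scale_up(queue, create_pod):
    """On scale-out, traverse the compensation queue first and pull a
    prepared Pod; only fall back to creating one from scratch."""
    return queue.pop(0) if queue else create_pod()

def scale_down(queue, pod):
    """On scale-in, park the replica in the compensation queue instead of
    killing it immediately; the IACM clean-up destroys it later."""
    queue.append(pod)
```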
The beneficial effects of the invention are as follows. In the intelligent scaling compensation process, the method predicts the service's resource demand for the coming period from historical and real-time load information, pre-creates Pod resources, and adds them to the compensation queue; when the service cluster's resource demand grows, the required new replica already exists in the compensation queue and can be pulled up directly, saving the preparation time of creating a replica from scratch and achieving extremely fast expansion of cluster node resources. In the hybrid scheduling soft segmentation process, services are first divided into online and offline task types and the priorities of the offline and online tasks are computed; when resources cannot meet demand, the size of the resource pool is adjusted first, and when the resource pool itself runs short, tasks that occupy large amounts of resources and can hardly be adjusted, or that have low priority, can be evicted or killed to achieve resource scheduling and resource expansion.
Drawings
FIG. 1 is a schematic diagram of an architecture of a conventional cluster scaling method;
FIG. 2 is a schematic diagram of an architecture of the cluster top-speed elastic scaling method of the present invention;
fig. 3 is a flow chart of the hybrid scheduling soft segmentation process in the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
The automatic scaling architecture can be roughly divided into three parts: metric collection, the scaling component, and resource scheduling; fig. 1 gives a schematic diagram of the existing cluster scaling method. Adding the "intelligent scaling compensation" and "hybrid scheduling soft segmentation" processes forms a new scheduling architecture, shown in fig. 2, the schematic of the cluster top-speed elastic scaling method. The system forms a more complete automatic scaling flow, improves the efficiency of scaling nodes and the quality of service, makes full use of cluster resources, and saves cost.
The intelligent scaling compensation process consists mainly of two parts: the compensation queue and the control logic. The control logic incorporates a prediction-algorithm analysis step that predicts the service's resource demand for a coming period from historical and real-time load information, pre-creates resources, and adds them to the compensation queue. Because the intelligent scaling process has its own control logic and life cycle, it is designed as a Controller and is managed and maintained by the Controller Manager of Kubernetes.
The prediction-algorithm model of the "control logic" part can be established according to the characteristics of the data to be predicted, chiefly: correlation with time and service, volatility, and burstiness (discrete points). A prediction algorithm, a self-built algorithm, or a combined algorithm is established from these data characteristics together with the cluster's features and the application's service target; the candidates are then tested experimentally to determine the best one. Algorithms with good results at present include grey prediction, exponential smoothing, BP neural networks, and regression and autoregressive algorithms.
The service cluster and main logic of the intelligent scaling compensation process are as follows. The services the process faces are stateless application clusters created and maintained by a replica-control policy, and the main logic is designed around the life-cycle changes of a queue, chiefly the actions of enqueuing, dequeuing, and clearing. Enqueuing: what is enqueued is a Pod resource (a Pod can be regarded as an aggregation of containers sharing a namespace, network space, and so on, and is the basic unit of cluster scheduling) predicted by the prediction algorithm and generated by the replica-control policy; the prediction algorithm forecasts the change in the service cluster's resource demand over a coming period, and if the cluster needs more Pods, new Pod resources are built and enqueued. Dequeuing: when the cluster's demand falls within the effective prediction window, that is, the predicted Pods of the cluster have not yet been cleared from the queue and the cluster size needs to grow, the queue is traversed and the dequeue action executed. Clearing: when memory is short and resources must be recycled, or when a period ends, the queue is cleared in priority order.
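The clearing action, which drops the lowest-priority prepared Pods first, can be sketched as follows (the function and its `keep` parameter are illustrative):

```python
def clear_queue(queue, priority, keep):
    """Queue clearing: when memory is short or a period ends, retain only
    the `keep` highest-priority prepared Pods, dropping the rest."""
    ranked = sorted(queue, key=lambda p: priority[p], reverse=True)
    return ranked[:keep]
```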
The constraints of the algorithm are: the service clusters the system faces are stateless application clusters generated under replica-controller control, whose Pods come from the same deployment template and provide the same service type externally; in addition, for each stateless application service cluster facing prediction, the constraint on newly generated replicas is that only one may be created and enqueued at a time, and only after that replica has been scheduled or cleaned may a new resource be created and enqueued according to the prediction algorithm.
The advantage of the algorithm: when the service cluster's resource demand grows, the required new replicas already exist in the compensation queue and can be pulled up directly to join the cluster, saving the preparation time of creating replicas from scratch and achieving extremely fast expansion of cluster node resources. The disadvantage: Pod resources held in the queue consume memory. The time saved lies mainly in the Pod's life cycle: if cluster resources are sufficient and the running state is good, most of the time consumed in creating a replica is spent in the pending state, since downloading images and preparing the Pod's base environment take long and that state accounts for a large share of the Pod's life cycle; the design thus trades space for gains in time. The state required of a Pod joining the queue is: base environment ready and in the unbound, to-be-scheduled state, where base-environment preparation covers the basic isolation environment, file system, network, storage, and container-image download. Because the replica controller must maintain and control the number of cluster nodes, and the Pods added to the "compensation queue" are generated by the replica controller under the same service cluster with resources ready but not yet in use, a special flag can be set in a field of the resource manifest for Pods of the compensation queue to distinguish this state from existing ones, facilitating unified management by Kubernetes. In summary, when the cluster must expand, directly pulling up a Pod from the queue saves considerable time.
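The special flag distinguishing compensation-queue Pods could, for illustration, be a label on the Pod manifest. The label key below is an assumption for the sketch, not an actual Kubernetes convention or a key named by the patent:

```python
def mark_compensation_pod(pod_manifest):
    """Set an illustrative label marking a prepared-but-unbound Pod that
    sits in the compensation queue (label key is hypothetical)."""
    labels = pod_manifest.setdefault("metadata", {}).setdefault("labels", {})
    labels["iacm/compensation-queue"] = "true"
    return pod_manifest
```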
The intelligent expansion compensation process is realized by the following steps:
step 1: the service cluster creates an intelligent elastic expansion compensation module IACM of a user-defined resource in Kubernets, and defines a compensation queue in the intelligent elastic expansion compensation module IACM, wherein the compensation queue is maintained by a thread DaemonSet which is deployed on each node by the IACM;
step 2: judging whether the service cluster needs to add nodes in a future period of time, if so, creating Pod and executing the step 3); if the node does not need to be added, entering the next prediction period; pod is an aggregation of containers sharing a name space and a network space, and is a basic unit of cluster scheduling;
and step 3: adding the Pod resources predicted and generated by the prediction algorithm in the step 2) into a compensation queue, and executing the step 4);
and 4, step 4: setting the priority of the compensation queue, and executing the step 5;
and 5: when the length of the compensation queue exceeds a set threshold value, cleaning and recycling operation is performed on Pod resources in the compensation queue periodically, and step 6 is performed;
step 6: maintaining the operations of entering and exiting the compensation queue of the Pod resources and cleaning the Pod resources from the compensation queue, and executing the step 7;
and 7: the judgment process of step 2 is performed periodically.
The prediction algorithm logic described in step 2 is implemented by:
step 2-1: fitting the real-time monitoring data and the historical load data, judging the expansion requirement, newly building a Pod resource and adding the Pod resource into a compensation queue when nodes need to be added in the future, and not executing the operation of newly building the Pod resource when the nodes do not need to be added in the future;
step 2-2: judging whether Pod resources needing to be recycled exist or not, then executing a prediction algorithm, if the algorithm predicts that the Pod resources are needed in a future period of time, adding the Pod resources needing to be recycled into a compensation queue, and if the Pod resources are needed in the future period of time, discarding the Pod resources needing to be recycled;
step 2-3: and analyzing the Pod resource requirements submitted by the user in the future period of time, and establishing a new Pod resource according to the requirements submitted by the user and adding the new Pod resource to the compensation queue.
The priority of the compensation queue in step 4 is found by the following steps:
step 4-1: and (3) solving the priority of the Pod resources in the compensation queue according to the formula (1):
Pn=W0*Fn+W1*P0+W2*Cn(1)
wherein, PnIndicating the priority, W, of the newly enqueued Pod resource0、W1、W2Denotes a weight ratio, FnThe fitting factor is a value representing the fitting degree and representing the possibility of being scheduled in the future; p0Representing service preset priority, CnRepresenting a credit factor; w is more than 00,W1,W2<1,W0+W1+W2=1;
Step 4-2: calculating the fitting factor F in the formula (1) by using the formula (2)n
Wherein S represents a service cluster, i represents the total i newly added Pod events of the cluster in history, i' represents the total pop resource out-queue number in the service cluster, and OEmPod resource, IE, representing mth out-of-compensation queuemPod representing the m-th enqueue; fi+1The fitting factor representing the newly added Pod resource is judged according to the ratio of the Pod resource which belongs to the same service cluster to the compensation queue in history;
Figure BDA0002239274910000102
representing the sum of Pod resources historically enqueued and dequeued by a certain service cluster,
Figure BDA0002239274910000111
representing the sum of Pod resources historically added to a compensation queue by a certain service cluster; { IE, OE }. epsilon.S denotes a queue Pod resource IE in the setmAnd dequeue Pod resource OEmBelong to a service cluster;
step 4-3: signalling factor C in equation (1)nThe method is realized by the following steps:
step 4-3-1: obtaining historical resource quota utilization rate, setting that the service cluster has i times of newly-added Pod resources in history, and has n types of resources participating in calculation in Pod, wherein the resource quota utilization rates in the i times of newly-added Pod resources in the service cluster in history are respectively expressed as U1(t1,d1),...,Um(tm,dm),...,Ui(ti,di) Wherein:
Um(tm,dm)=(um,1(tm,dm),...,um,k(tm,dm),...,um,n(tm,dm))
Um(tm,dm) Indicates the resource utilization rate of the mth newly added Pod, um,k(tm,dm) Representing the resource quota utilization rate of the kth resource in the mth newly-added Pod, wherein k is 1, 2. t is tmIndicating the time at which the Pod starts providing service, dmIndicating the service duration provided by the Pod;
step 4-3-2: calculate the weights. First, the weight w_{m,k} occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in a newly added Pod is computed by formula (3).
[Formula (3) appears only as an image in the original and is not reproduced here.]
w_{m,k} is the weight occupied by each resource-quota utilization u_{m,k}(t_m, d_m) in the newly added Pod; α is a coefficient applied to d_m, with 0 < α < 1; D equals the cycle time T of the periodic dynamic resource-quota adjustment; m = 1, 2, ..., i.
The weights of the resource-quota utilizations in the m-th newly added Pod are written as:
W_m = (w_{m,1}, ..., w_{m,k}, ..., w_{m,n})
step 4-3-3: calculate the credit factor C of the cluster where the Pod is located; for the various resources in a Pod, after the same service cluster has added Pods i times in history, the credit factor of each newly added Pod resource is computed by formula (4):
[Formula (4) appears only as an image in the original and is not reproduced here.]
For all resource types, after resource quotas have been applied for i times, when a new resource quota is allocated to the user for the (i+1)-th time, the credit factor C of the service cluster is:
[Formula (5) appears only as an image in the original and is not reproduced here.]
Hence the credit factor of the (i+1)-th newly added Pod resource is C_n = c_{i+1} = C.
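Formula (1) (stated explicitly in claim 3 as P_n = W0*F_n + W1*P_0 + W2*C_n, with 0 < W0, W1, W2 < 1 and W0 + W1 + W2 = 1) combines the three factors derived above. A minimal sketch; the default weights here are illustrative, not values from the patent:

```python
def pod_priority(f_n, p_0, c_n, w=(0.4, 0.3, 0.3)):
    """Priority of a newly enqueued Pod per formula (1):
    P_n = W0*F_n + W1*P_0 + W2*C_n

    f_n: fitting factor (likelihood of being scheduled again)
    p_0: service preset priority
    c_n: credit factor of the service cluster
    w:   (W0, W1, W2); each in (0, 1), summing to 1
    """
    w0, w1, w2 = w
    assert all(0 < x < 1 for x in w) and abs(w0 + w1 + w2 - 1.0) < 1e-9
    return w0 * f_n + w1 * p_0 + w2 * c_n
```

The convex combination keeps P_n on the same scale as its inputs, so Pods from different service clusters can be ranked directly within one compensation queue.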
Background of the hybrid scheduling soft-segmentation process: the basic scheduling unit on Kubernetes is the Pod. In a model-training scenario, the resources required by the parameter-server cluster and the training cluster must be allocated in one pass; if only part of the resources are allocated, or resources that were successfully scheduled are occupied by other clusters, the cluster starts incompletely or fails to start. Many open-source projects propose gang-scheduling schemes (e.g., Volcano), but the difficulty of allocating large blocks of resources remains hard to solve. The "hybrid scheduling soft segmentation" process proposed in this patent addresses this problem directly. The basic idea is to partition the resource pool, placing online services (with the Pod as scheduling granularity) and offline batch jobs separately; giving batch jobs an independent resource space reduces the problem that small jobs occupy resources so that large jobs can never be scheduled. When a batch job releases its resources, they are released as a whole block and are easily reallocated when jobs reuse them.
In addition, the module can dynamically adjust the resource pool: the resource-partition water line is moved when resource demand changes. This, too, embodies the "soft slicing" idea. Definitions: the macroscopic characteristic behavior dimensions of an application mainly include difference, timeliness, and basic planning. Difference is mainly reflected in the clearly different requirements that different application types place on idle container (e.g., Pod) resources, time sensitivity, cost planning, and so on; timeliness means an application behaves differently in different time periods; basic planning is the part agreed in advance by humans, beyond which an unpredictable application load may sharply exceed the capacity plan.
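The "soft slicing" idea described above, one physical resource pool with a movable division water line between the online and offline shares, can be sketched as a small data structure. The class name, fields, and clamping policy are illustrative assumptions, not part of the patent:

```python
class SoftPartitionedPool:
    """A resource pool split between online and offline tasks by a
    movable water line, per the 'soft slicing' idea."""

    def __init__(self, total, online_ratio):
        assert 0.0 <= online_ratio <= 1.0
        self.total = total              # total pool capacity
        self.online_ratio = online_ratio  # current water line Rx

    @property
    def online_share(self):
        return self.total * self.online_ratio

    @property
    def offline_share(self):
        return self.total * (1.0 - self.online_ratio)

    def move_water_line(self, delta):
        # Shift the division when demand changes; clamp to [0, 1]
        # so the two shares always cover exactly the whole pool.
        self.online_ratio = min(1.0, max(0.0, self.online_ratio + delta))
```

Because the split is a ratio rather than a hard node assignment, adjusting it is a cheap metadata change; actual Pod movement only happens when the later steps of the process decide to evict or reschedule.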
As shown in fig. 3, which gives a flow chart of the hybrid scheduling soft-segmentation process, the process is specifically implemented by the following steps:
a) Pod resources established in the intelligent scaling compensation process are divided into online-task resources and offline-task resources, and the online-task resource ratio Rx and the offline-task resource ratio Ry are obtained through formula (6):
[Formula (6) appears only as an image in the original and is not reproduced here.]
where x, y are intermediate variables, solved by formula (7):
[Formula (7) appears only as an image in the original and is not reproduced here.]
α1, α2, ..., αn denote the weights of online tasks and α1', α2', ..., αn' the weights of offline tasks; there are n resource types, denoted k1, k2, ..., kn; αi + αi' = 1; αi and αi' are determined by the ratio of the respective sums of each resource type for online and offline tasks, i = 1, 2, ..., n;
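Formulas (6) and (7) are only available as images; a plausible reading of the surrounding text is that x and y are the weighted sums of the n resource types under the online weights αi and the offline weights αi' = 1 − αi, and that Rx and Ry normalize them. A sketch under that assumption (not the patent's exact formulas):

```python
def resource_ratios(k, alpha):
    """Compute (Rx, Ry): the online and offline resource shares.

    k:     amounts of the n resource types k1..kn
    alpha: online-task weights a1..an; the offline weight of type i
           is 1 - alpha[i], since ai + ai' = 1.

    x, y are intermediate weighted sums (assumed form of formula (7));
    Rx = x/(x+y), Ry = y/(x+y) (assumed form of formula (6)).
    """
    x = sum(a * ki for a, ki in zip(alpha, k))         # online weighted sum
    y = sum((1.0 - a) * ki for a, ki in zip(alpha, k))  # offline weighted sum
    s = x + y
    return (x / s, y / s) if s else (0.0, 0.0)
```

Under this reading Rx + Ry = 1 by construction, which is consistent with the later constraint R'x + R'y = 1 on the adjusted ratios.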
b) when the remaining rate of online-task resources is less than or equal to 10%, or the task dequeue rate falls below a set threshold, the resources are judged insufficient and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority Py > P0; if yes, the offline task cannot be evicted, so execute step d); if not, the offline task can be evicted to release resources, so execute step i); P0 is the eviction threshold for offline tasks;
Let the new resource-ratio variables of the online and offline tasks be R'x and R'y respectively, with R'x + R'y = 1; then:
R'x = λ + Rx (8)
Rx = R'x, Ry = R'y (9)
where λ is the resource adjustment factor, given by formula (10):
[Formula (10) appears only as an image in the original and is not reproduced here.]
where k1', k2', ..., kn' are the resources newly released by the offline task;
d) judge whether the utilization of the resources allocated to online tasks is greater than or equal to 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources (by at most 50% of the total) and execute step j); if not, execute step k);
g) encroach on online-task resources (by at most 50% of the online tasks' total resources) and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the low-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R'x, R'y, and execute step k);
k) if the resources meet the requirement, the loop ends; otherwise, steps b) to j) are executed cyclically with period T.
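Steps b) through k) form a periodic control loop. Its core, the shortage check of step b) and the ratio update of formulas (8)-(9), can be sketched as follows; the 10% threshold comes from step b), while treating λ as a given input sidesteps formula (10), which is only available as an image:

```python
def adjust_ratios(rx, remaining_rate, lam, low=0.10):
    """One iteration of the shortage check (step b) and the ratio
    update of formulas (8)-(9).

    rx:             current online share Rx (with Ry = 1 - Rx)
    remaining_rate: fraction of online resources still free
    lam:            resource adjustment factor lambda (formula (10))

    Returns the new (Rx, Ry); unchanged when no shortage is detected.
    """
    if remaining_rate <= low:            # step b): online resources tight
        rx_new = min(1.0, rx + lam)      # formula (8): R'x = lambda + Rx
        return rx_new, 1.0 - rx_new      # formula (9): adopt the new split
    return rx, 1.0 - rx
```

In the full process this function would run once per period T, with the eviction branches (steps c, i) deciding whether λ worth of offline resources can actually be reclaimed before the split is moved.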
In step c), the online-task priority Px and the offline-task priority Py are obtained through formula (11):
[Formula (11) appears only as an image in the original and is not reproduced here.]
where ω0, ω1 are weight values with ω0 + ω1 + ξ = 1 and ω0 + ω1 + ξ' = 1; Ps is the task's system priority, Rt the task's running time, Et the task's deadline, and ξ, ξ' are period impact factors.
The task dequeue rate in step b) falling below the set threshold is characterized by the task waiting period: when the task waiting period exceeds the threshold T0, the resources are judged to be tight and need adjustment.
When dynamic scale-out is triggered, a new Pod is not started directly; instead, the compensation queue of the intelligent scaling compensation process is traversed first to check whether a prepared Pod resource exists. If yes, it is pulled up directly; if not, the original scaling strategy is executed. When dynamic scale-in is triggered, the scaled-down Pod is not killed directly after the cooling period, but is placed in the compensation queue for a certain period, after which the intelligent compensation module calls the corresponding component to destroy it. Because of the intelligent scaling compensation module, the original cooling period can be appropriately shortened, realizing a faster and finer-grained dynamic scaling scheme.
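The scale-out/scale-in behavior described here, reuse a prepared Pod from the compensation queue before starting a new one, and park scaled-down Pods in the queue instead of killing them, can be sketched as below. The class, its method names, and the string Pod identifiers are illustrative stand-ins for the IACM's actual components:

```python
from collections import deque

class CompensationScaler:
    """Scale-out reuses prepared Pods from the compensation queue;
    scale-in parks Pods there instead of destroying them."""

    def __init__(self):
        self.queue = deque()   # prepared / parked Pod resources
        self.created = 0       # Pods started from scratch

    def scale_out(self):
        if self.queue:                     # prepared Pod available:
            return self.queue.popleft()    # pull it up directly
        self.created += 1                  # else fall back to the
        return f"new-pod-{self.created}"   # original scaling strategy

    def scale_in(self, pod):
        self.queue.append(pod)             # park instead of killing

    def reap(self, max_len):
        # Periodic cleanup once the queue exceeds a set threshold,
        # mirroring step 5 of the compensation process.
        while len(self.queue) > max_len:
            self.queue.pop()               # destroy the excess Pod
```

Because a parked Pod is already scheduled and warmed up, pulling it from the queue is much faster than a cold start, which is what lets the cooling period be shortened.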

Claims (7)

1. A cluster top-speed elastic scaling method for realizing efficient resource utilization, comprising an intelligent scaling compensation process and a hybrid scheduling soft-segmentation process; the intelligent scaling compensation process is characterized by being realized through the following steps:
step 1: the service cluster creates an intelligent elastic scaling compensation module IACM as a custom resource in Kubernetes, and defines a compensation queue in the IACM, the compensation queue being maintained by a DaemonSet that the IACM deploys on each node;
step 2: judge whether the service cluster needs to add nodes in a future period of time; if nodes need to be added, create Pods and execute step 3; if not, enter the next prediction period; a Pod is an aggregation of containers sharing a namespace and network space, and is the basic unit of cluster scheduling;
step 3: add the Pod resources predicted and generated by the prediction algorithm of step 2 to the compensation queue, and execute step 4;
step 4: set the priority of the compensation queue, and execute step 5;
step 5: when the length of the compensation queue exceeds a set threshold, periodically clean up and recycle the Pod resources in the compensation queue, and execute step 6;
step 6: maintain the enqueue and dequeue operations of Pod resources on the compensation queue and the cleanup of Pod resources from the compensation queue, and execute step 7;
step 7: periodically execute the judgment process of step 2.
2. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1, wherein the prediction-algorithm logic of step 2 is implemented by the following steps:
step 2-1: fit the real-time monitoring data and the historical load data to judge the scaling demand; when nodes need to be added in the future, create a new Pod resource and add it to the compensation queue; when no nodes need to be added, do not create a new Pod resource;
step 2-2: judge whether there are Pod resources to be recycled, then execute the prediction algorithm; if the algorithm predicts that the Pod resources will be needed in a future period of time, add the Pod resources to be recycled to the compensation queue; if they will not be needed, discard the Pod resources to be recycled;
step 2-3: analyze the Pod resource demands submitted by users for the future period of time, and create new Pod resources according to the demands submitted by the users and add them to the compensation queue.
3. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1 or 2, wherein the priority of the compensation queue in step 4 is obtained by the following steps:
step 4-1: compute the priority of a Pod resource in the compensation queue according to formula (1):
P_n = W0*F_n + W1*P_0 + W2*C_n (1)
where P_n denotes the priority of the newly enqueued Pod resource; W0, W1, W2 denote weight ratios; F_n is the fitting factor, a value representing the degree of fit, i.e., the likelihood of being scheduled in the future; P_0 denotes the service's preset priority; C_n denotes the credit factor; 0 < W0, W1, W2 < 1 and W0 + W1 + W2 = 1;
Step 4-2: calculate the fitting factor F_n in formula (1) using formula (2), where S denotes a service cluster, i denotes the total number of Pod-addition events of the cluster in history, i' denotes the total number of Pod resources dequeued in the service cluster, OE_m denotes the Pod resource of the m-th dequeue from the compensation queue, and IE_m denotes the Pod of the m-th enqueue; F_{i+1}, the fitting factor of the newly added Pod resource, is determined by the historical ratio at which Pod resources belonging to the same service cluster entered the compensation queue;
[Formula (2) appears only as an image in the original and is not reproduced here. It relates the sum of Pod resources the service cluster has historically enqueued and dequeued to the sum of Pod resources it has historically added to the compensation queue; {IE, OE} ∈ S denotes that the enqueued Pod resource IE_m and the dequeued Pod resource OE_m belong to the same service cluster.]
step 4-3: signalling factor C in equation (1)nThe method is realized by the following steps:
step 4-3-1: obtaining historical resource quota utilization rate, setting that the service cluster has i times of newly-added Pod resources in history, and has n types of resources participating in calculation in Pod, wherein the resource quota utilization rates in the i times of newly-added Pod resources in the service cluster in history are respectively expressed as U1(t1,d1),...,Um(tm,dm),...,Ui(ti,di) Wherein:
Um(tm,dm)=(um,1(tm,dm),...,um,k(tm,dm),...,um,n(tm,dm))
Um(tm,dm) Indicates the resource utilization rate of the mth newly added Pod, um,k(tm,dm) Representing the resource quota utilization rate of the kth resource in the mth newly-added Pod, wherein k is 1, 2. t is tmIndicating the time at which the Pod starts providing service, dmIndicating the service duration provided by the Pod;
step 4-3-2: calculating the weight; firstly, calculating the utilization rate u of each resource quota in the newly added Pod by a formula (3)m,k(tm,dm) Occupied weight wm,k
Figure FDA0002239274900000031
wm,kFor each resource quota utilization rate u in newly-increased Podm,k(tm,dm) The weight occupied, α, is for dm0 < α < 1, and the size of D is equal to the period time T of the periodic dynamic adjustment resource quota, m is 1, 2.. times.i;
the weight of the usage rate of each resource quota in the mth newly added Pod is represented as:
Wm=(wm,1,...,wm,k,...,wm,m)
step 4-3-3: calculate the credit factor C of the cluster where the Pod is located; for the various resources in a Pod, after the same service cluster has added Pods i times in history, the credit factor of each newly added Pod resource is computed by formula (4) (rendered as an image in the original);
for all resource types, after resource quotas have been applied for i times, when a new resource quota is allocated to the user for the (i+1)-th time, the credit factor C of the service cluster is:
[Formula (5) appears only as an image in the original and is not reproduced here.]
Hence the credit factor of the (i+1)-th newly added Pod resource is C_n = c_{i+1} = C.
4. The cluster top-speed elastic scaling method for efficient resource utilization according to claim 1, wherein the hybrid scheduling soft-segmentation process is specifically realized by the following steps:
a) Pod resources established in the intelligent scaling compensation process are divided into online-task resources and offline-task resources, and the online-task resource ratio Rx and the offline-task resource ratio Ry are obtained through formula (6) (rendered as an image in the original), where x, y are intermediate variables, solved by formula (7):
[Formula (7) appears only as an image in the original and is not reproduced here.]
α1, α2, ..., αn denote the weights of online tasks and α1', α2', ..., αn' the weights of offline tasks; there are n resource types, denoted k1, k2, ..., kn; αi + αi' = 1; αi and αi' are determined by the ratio of the respective sums of each resource type for online and offline tasks, i = 1, 2, ..., n;
b) when the remaining rate of online-task resources is less than or equal to 10%, or the task dequeue rate falls below a set threshold, the resources are judged insufficient and need adjustment; if adjustment is needed, execute step c); if not, execute step f);
c) judge whether the offline-task service priority Py > P0; if yes, the offline task cannot be evicted, so execute step d); if not, the offline task can be evicted to release resources, so execute step i); P0 is the eviction threshold for offline tasks;
let the new resource-ratio variables of the online and offline tasks be R'x and R'y respectively, with R'x + R'y = 1; then:
R'x = λ + Rx (8)
Rx = R'x, Ry = R'y (9)
where λ is the resource adjustment factor, given by formula (10):
[Formula (10) appears only as an image in the original and is not reproduced here.]
where k1', k2', ..., kn' are the resources newly released by the offline task;
d) judge whether the utilization of the resources allocated to online tasks is greater than or equal to 100%; if yes, execute step e); if not, execute step f);
e) judge whether the online task can only be scheduled to the current node; if yes, execute step i); if not, execute step h);
f) judge whether the remaining rate of offline-task resources is below 10%; if yes, encroach on online-task resources (by at most 50% of the total) and execute step j); if not, execute step k);
g) encroach on online-task resources (by at most 50% of the online tasks' total resources) and judge whether the encroachment succeeded; if yes, execute step j); if not, execute step k);
h) schedule to another node or a newly added node and execute step a); if scheduling fails, execute step i);
i) forcibly evict the low-priority offline tasks according to offline-task priority, and execute step j);
j) dynamically adjust system resources according to the resource ratios R'x, R'y, and execute step k);
k) if the resources meet the requirement, the loop ends; otherwise, steps b) to j) are executed cyclically with period T.
5. The method according to claim 4, wherein in step c) the online-task priority Px and the offline-task priority Py are obtained through formula (11):
[Formula (11) appears only as an image in the original and is not reproduced here.]
where ω0, ω1 are weight values with ω0 + ω1 + ξ = 1 and ω0 + ω1 + ξ' = 1; Ps is the task's system priority, Rt the task's running time, Et the task's deadline, and ξ, ξ' are period impact factors.
6. The method according to claim 4, wherein the task dequeue rate in step b) falling below the set threshold is characterized by a task waiting period; when the task waiting period exceeds a threshold T0, the resources are judged to be tight and need adjustment.
7. The method according to claim 4, wherein when dynamic resource scale-out is triggered, a new Pod resource is not started directly; instead, the compensation queue is traversed first to check whether a prepared Pod resource exists; if yes, the Pod resource is pulled up directly, and if not, the intelligent scaling compensation process is executed; when dynamic scale-in is triggered, the scaled-down Pod is not killed directly after the cooling period, but is placed in the compensation queue to wait, and is later destroyed in the intelligent compensation process; because of the intelligent scaling compensation process, the original cooling period can be appropriately shortened, realizing a faster, finer-grained dynamic scaling scheme.
CN201910994328.2A 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization Active CN110825520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910994328.2A CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910994328.2A CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Publications (2)

Publication Number Publication Date
CN110825520A true CN110825520A (en) 2020-02-21
CN110825520B CN110825520B (en) 2023-08-29

Family

ID=69549553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910994328.2A Active CN110825520B (en) 2019-10-18 2019-10-18 Cluster extremely-fast elastic telescoping method for realizing efficient resource utilization

Country Status (1)

Country Link
CN (1) CN110825520B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008101366A1 (en) * 2007-02-17 2008-08-28 Zte Corporation A packet dispatching method in wireless communication system
CN101951395A (en) * 2010-08-30 2011-01-19 中国科学院声学研究所 Access prediction-based data cache strategy for P2P Video-on-Demand (VoD) system server
CN103425535A (en) * 2013-06-05 2013-12-04 浙江大学 Agile elastic telescoping method in cloud environment
US20140237477A1 (en) * 2013-01-18 2014-08-21 Nec Laboratories America, Inc. Simultaneous scheduling of processes and offloading computation on many-core coprocessors
US20170011327A1 (en) * 2015-07-12 2017-01-12 Spotted, Inc Method of computing an estimated queuing delay
CN108874542A (en) * 2018-06-07 2018-11-23 桂林电子科技大学 Kubernetes method for optimizing scheduling neural network based
CN109117265A (en) * 2018-07-12 2019-01-01 北京百度网讯科技有限公司 The method, apparatus, equipment and storage medium of schedule job in the cluster
CN109960591A (en) * 2019-03-29 2019-07-02 神州数码信息系统有限公司 A method of the cloud application resource dynamic dispatching occupied towards tenant's resource
CN110096349A (en) * 2019-04-10 2019-08-06 山东科技大学 A kind of job scheduling method based on the prediction of clustered node load condition
CN110287003A (en) * 2019-06-28 2019-09-27 北京九章云极科技有限公司 The management method and management system of resource


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
程振京; 李海波; 黄秋兰; 程耀东; 陈刚: "Elastic computing resource management mechanism in a high-energy physics cloud platform" (高能物理云平台中的弹性计算资源管理机制) *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111352717B (en) * 2020-03-24 2023-04-07 广西梯度科技股份有限公司 Method for realizing kubernets self-defined scheduler
CN111352717A (en) * 2020-03-24 2020-06-30 广西梯度科技有限公司 Method for realizing kubernets self-defined scheduler
CN111399989A (en) * 2020-04-10 2020-07-10 中国人民解放军国防科技大学 Task preemption scheduling method and system for container cloud
CN113765949A (en) * 2020-06-02 2021-12-07 华为技术有限公司 Resource allocation method and device
CN113961335A (en) * 2020-07-01 2022-01-21 中兴通讯股份有限公司 Resource scheduling method, resource scheduling system and equipment
CN112104486A (en) * 2020-08-31 2020-12-18 中国—东盟信息港股份有限公司 Kubernetes container-based network endpoint slicing method and system
CN112199194A (en) * 2020-10-14 2021-01-08 广州虎牙科技有限公司 Container cluster-based resource scheduling method, device, equipment and storage medium
CN112199194B (en) * 2020-10-14 2024-04-19 广州虎牙科技有限公司 Resource scheduling method, device, equipment and storage medium based on container cluster
CN112698947B (en) * 2020-12-31 2022-03-29 山东省计算中心(国家超级计算济南中心) GPU resource flexible scheduling method based on heterogeneous application platform
CN112698947A (en) * 2020-12-31 2021-04-23 山东省计算中心(国家超级计算济南中心) GPU resource flexible scheduling method based on heterogeneous application platform
CN113419831A (en) * 2021-06-23 2021-09-21 上海观安信息技术股份有限公司 Sandbox task scheduling method and system
CN113419831B (en) * 2021-06-23 2023-04-11 上海观安信息技术股份有限公司 Sandbox task scheduling method and system
CN113961361A (en) * 2021-11-10 2022-01-21 重庆紫光华山智安科技有限公司 Control method and system for cache resources
CN113961361B (en) * 2021-11-10 2024-04-16 重庆紫光华山智安科技有限公司 Control method and system for cache resources
TWI831159B (en) * 2022-03-22 2024-02-01 新加坡商鴻運科股份有限公司 Storage expansion method, apparatus, storage media and electronic device
CN114513530A (en) * 2022-04-19 2022-05-17 山东省计算中心(国家超级计算济南中心) Cross-domain storage space bidirectional supply method and system
CN114968601A (en) * 2022-07-28 2022-08-30 合肥中科类脑智能技术有限公司 Scheduling method and scheduling system for AI training jobs with resources reserved according to proportion
CN116610534B (en) * 2023-07-18 2023-10-03 贵州海誉科技股份有限公司 Improved predictive elastic telescoping method based on Kubernetes cluster resources
CN116610534A (en) * 2023-07-18 2023-08-18 贵州海誉科技股份有限公司 Improved predictive elastic telescoping method based on Kubernetes cluster resources
CN118550716A (en) * 2024-07-30 2024-08-27 杭州老板电器股份有限公司 Big data task scheduling method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110825520B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN110825520A (en) Cluster top-speed elastic expansion method for realizing efficient resource utilization
CN103617062B (en) The render farm Dynamic Deployment System of a kind of flexibility and method
CN104951372B (en) A kind of Map/Reduce data processing platform (DPP) memory source dynamic allocation methods based on prediction
CN109788046B (en) Multi-strategy edge computing resource scheduling method based on improved bee colony algorithm
CN112269641B (en) Scheduling method, scheduling device, electronic equipment and storage medium
US20140165061A1 (en) Statistical packing of resource requirements in data centers
CN107911478A (en) Multi-user based on chemical reaction optimization algorithm calculates discharging method and device
CN110489217A (en) A kind of method for scheduling task and system
CN111427679A (en) Computing task scheduling method, system and device facing edge computing
CN111427675B (en) Data processing method and device and computer readable storage medium
CN113806018B (en) Kubernetes cluster resource mixed scheduling method based on neural network and distributed cache
CN113138860B (en) Message queue management method and device
CN113867959A (en) Training task resource scheduling method, device, equipment and medium
CN112732444A (en) Distributed machine learning-oriented data partitioning method
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
CN113641417A (en) Vehicle safety task unloading method based on branch-and-bound method
Yang et al. A novel distributed task scheduling framework for supporting vehicular edge intelligence
CN116069512A (en) Serverless efficient resource allocation method and system based on reinforcement learning
CN109428950B (en) Automatic scheduling method and system for IP address pool
Balla et al. Reliability enhancement in cloud computing via optimized job scheduling implementing reinforcement learning algorithm and queuing theory
CN116069496A (en) GPU resource scheduling method and device
CN113766037B (en) Task unloading control method and system for large-scale edge computing system
CN114416355A (en) Resource scheduling method, device, system, electronic equipment and medium
CN116302578B (en) QoS (quality of service) constraint stream application delay ensuring method and system
Li et al. Communications satellite multi-satellite multi-task scheduling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant