CN115328647A - Compute intensive service elastic expansion method based on task queue - Google Patents

Compute intensive service elastic expansion method based on task queue

Info

Publication number
CN115328647A
CN115328647A
Authority
CN
China
Prior art keywords
task
instances
time
tasks
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210871366.0A
Other languages
Chinese (zh)
Inventor
于劲松
梁思远
周金浛
唐荻音
周倜
苗毅
陶来发
刘浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202210871366.0A
Publication of CN115328647A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F 9/00
    • G06F 2209/54 Indexing scheme relating to G06F 9/54
    • G06F 2209/548 Queue
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a task-queue-based elastic scaling method for computationally intensive services, intended for services that are compute-intensive but delay-tolerant. The method comprises: (S101) storing independently executable computing tasks in a task queue; (S102) forming a task time consumption estimation mapping; (S103) estimating the time consumption of the tasks; (S104) adding and removing instances and predicting whether the tasks will be completed on time, to obtain the minimum number of instances; (S105) increasing or decreasing the instances according to the minimum number of instances. By arranging the tasks as a queue and combining prediction with adjustment, the method provides the minimum number of instances that still meets the specified delay requirement, thereby saving computing resources as far as possible while satisfying the business requirement and meeting the elastic scaling needs of compute-intensive services.

Description

Compute intensive service elastic expansion method based on task queue
Technical Field
The invention relates to a computer resource elastic scaling method, and in particular to a compute-intensive service elastic scaling method based on a task queue.
Background
To satisfy business requirements while saving computer resources as much as possible, a computer system or cluster needs to automatically adjust the number of service instances used for the business according to a certain policy; this is called elastic scaling. When business demand increases, elastic scaling automatically adds service instances and allocates more computing resources to handle the demand; when business demand decreases, it automatically removes service instances and releases the occupied computing resources to other applications, achieving efficient utilization of resources. Existing elastic scaling methods include: 1) alarm-based methods, which continuously monitor the occupancy of resources such as the CPU of the service instances, adding instances if the occupancy stays too high and removing instances if it stays too low; 2) plan-based methods, which analyze and predict the peak periods of the business, adding service instances during peak periods and removing them at other times.
I/O-intensive services such as Web services can use the two elastic scaling methods described above. In such services, waiting for reads and writes accounts for most of the business processing time and the CPU utilization of a single thread is low, so multithreading is commonly used to raise CPU utilization. When the processing load increases, the CPU occupancy rises, so the alarm-based method can achieve timely and effective elastic scaling by continuously monitoring CPU occupancy. Moreover, Web service logic is simple, a single request is processed quickly, and the load depends mainly on quantities such as the volume of user accesses; by analyzing and estimating the time and size of the access peaks, the plan-based method can also achieve timely and effective elastic scaling.
Computationally intensive services, however, cannot use either of these elastic scaling methods. In compute-intensive services such as periodic log analysis, data mining and machine learning model training, computation occupies most of the processing time and the CPU runs continuously at a high utilization regardless of the business load. The alarm-based approach therefore cannot increase or decrease service instances by monitoring CPU occupancy. Likewise, the processing time of an individual task in a compute-intensive service is difficult to estimate, so the plan-based approach can hardly allocate service instances by analyzing and predicting load peaks. How to scale elastically in a reasonable way is therefore the key for compute-intensive services to allocate computer resources properly.
Disclosure of Invention
The invention provides a task-queue-based elastic scaling method for compute-intensive services, which uses prediction to determine the minimum number of instances that satisfies the specified business processing requirement and is therefore suitable for the elastic scaling of compute-intensive services.
In order to achieve the purpose, the invention proposes the following technical scheme:
the compute-intensive service elastic scaling method based on the task queue comprises the following steps:
(S101) storing independently executable computing tasks in a task queue, wherein each task is given a latest allowed completion time, and the tasks are sorted according to the latest allowed completion time in the queue;
(S102) recording the historical task time consumption, the task type and the quantized values of the task time consumption influence factors to form a task time consumption estimation mapping;
(S103) estimating task time consumptions of all tasks being executed and in the queue by using the task time consumption estimation map;
(S104) predicting the completion time of all tasks according to the completion time of the last task on each current instance and the estimated time consumption of all tasks;
(S105) adding and removing instances and predicting whether the tasks will be completed on time, until the minimum number of instances with which all tasks in the queue are completed on time is obtained;
(S106) increasing or decreasing the instances according to the minimum number of instances.
Specifically, the task time consumption estimation mapping in step (S102) is formed by performing a separate multiple linear regression for each type of computing task.
Further, step (S105) comprises: if the prediction shows a task whose predicted completion time is later than its latest allowed completion time, repeatedly adding one instance and predicting again until the predicted completion time of every task in the task queue is earlier than its latest allowed completion time, and taking the number of instances at loop exit as the minimum number of instances; otherwise, repeatedly removing one instance and predicting again until there is a task whose predicted completion time is later than its latest allowed completion time, and taking the number of instances at loop exit plus one as the minimum number of instances.
The beneficial effects of the present application are as follows:
the compute-intensive service elastic expansion method based on the task queue provided by the invention arranges the tasks in a queue form, provides the minimum number of instances within the range of meeting the specified delay requirement by combining prediction and adjustment, saves computing resources to the greatest extent while meeting the business requirement, and meets the compute-intensive service elastic expansion requirement.
Drawings
FIG. 1 is a flowchart of a compute intensive service elastic scaling method based on task queues according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for forming a task time consumption estimation map according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for predicting the predicted completion time of all tasks according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the accompanying drawings, which are provided by way of illustration only and are not intended to limit the scope of the invention.
In order that the above objects, features and advantages of the present application can be more clearly understood, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the described embodiments are only some, not all, of the embodiments of the present disclosure; they are provided for illustration only and are not intended to limit the present application. All other embodiments derived by a person of ordinary skill in the art from the described embodiments fall within the scope of the present disclosure.
One embodiment of the invention is applied to data mining services whose task types comprise threshold mining, cluster analysis and prediction analysis. The time consumption of a threshold mining task is related only to the sample size, the time consumption of a cluster analysis task is related to the sample size and the sample dimension, and the time consumption of a prediction analysis task is related to the number of prediction steps.
As shown in FIG. 1, the compute-intensive service elastic scaling method based on a task queue includes the following steps:
(S101) storing independently executable computing tasks in a task queue, wherein each task is given a latest allowed completion time, and the tasks are sorted according to the latest allowed completion time in the queue;
(S102) recording the historical task time consumption, the task type and the quantized values of the task time consumption influence factors to form a task time consumption estimation mapping;
(S103) estimating task time consumptions of all tasks being executed and in the queue by using the task time consumption estimation map;
(S104) predicting the completion time of all tasks according to the completion time of the last task on each current instance and the estimated time consumption of all tasks;
(S105) adding and removing instances and predicting whether the tasks will be completed on time, until the minimum number of instances with which all tasks in the queue are completed on time is obtained;
(S106) increasing or decreasing the instances according to the minimum number of instances.
Specifically, in step (S101), the information stored for each computing task in the queue includes: 1. the task type A, which is threshold mining G_0, cluster analysis G_1 or prediction analysis G_2; 2. the quantized values of the factors that influence the task time consumption, namely the sample size n_0 for threshold mining, the sample size n_1 and the sample dimension N_1 for cluster analysis, and the number of prediction steps s_2 for prediction analysis; 3. the latest allowed completion time t_p of the task. When a computing task is inserted into the queue, it is inserted in order of t_p, so that the task with the smallest t_p is at the head of the queue and the task with the largest t_p is at the tail of the queue.
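For illustration only, the following sketch (not part of the original disclosure) shows one way such an ordered task queue could be maintained in Python, with tasks kept sorted by the latest allowed completion time t_p; the class name and field names are assumptions made for the example.

```python
# Illustrative sketch (assumption, not the patent's code): a task queue kept
# sorted by the latest allowed completion time t_p using the bisect module.
import bisect
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    t_p: float                                        # latest allowed completion time
    task_type: str = field(compare=False)             # "G0", "G1" or "G2"
    factors: dict = field(compare=False, default_factory=dict)  # e.g. {"n0": 10000}

task_queue: list[Task] = []

def enqueue(task: Task) -> None:
    """Insert so that the task with the smallest t_p stays at the head."""
    bisect.insort(task_queue, task)
```

A call such as enqueue(Task(t_p=120.0, task_type="G1", factors={"n1": 5000, "N1": 16})) then keeps the head of task_queue as the most urgent task.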
Specifically, in step (S102), the time consumption T_i of the i-th historical task, its task type A_i and the quantized values of its time-consumption influencing factors (n_0, or n_1 and N_1, or s_2, according to the task type) are used to form the task time consumption estimation mapping. The flow is shown in FIG. 2 and comprises the following steps:
(S201) classifying the historical task information according to the task type A_i to form the historical task information sets S_0 (threshold mining), S_1 (cluster analysis) and S_2 (prediction analysis);
(S202) applying the least squares method separately to S_0, S_1 and S_2 to fit the multiple linear models T = a_0 n_0 + b_0, T = a_1 n_1 + b_1 N_1 + c_1 and T = a_2 s_2 + b_2, obtaining the coefficients a_0, b_0, a_1, b_1, c_1, a_2 and b_2, where T denotes the task time consumption;
(S203) establishing the task time consumption estimation mapping from the task type and its influencing-factor values to the estimated task time consumption, namely a_0 n_0 + b_0 for A = G_0, a_1 n_1 + b_1 N_1 + c_1 for A = G_1, and a_2 s_2 + b_2 for A = G_2.
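By way of illustration only (this sketch is not taken from the patent), the three multiple linear models of step (S202) could be fitted with ordinary least squares, for example using numpy; the function names and the layout of the historical records are assumptions.

```python
# Illustrative sketch (assumption): least-squares fits for the three task types.
import numpy as np

def fit_threshold_mining(records):
    """Fit T = a0*n0 + b0 from (n0, T) pairs; returns (a0, b0)."""
    n0 = np.array([r[0] for r in records], dtype=float)
    T = np.array([r[1] for r in records], dtype=float)
    A = np.column_stack([n0, np.ones_like(n0)])
    (a0, b0), *_ = np.linalg.lstsq(A, T, rcond=None)
    return a0, b0

def fit_cluster_analysis(records):
    """Fit T = a1*n1 + b1*N1 + c1 from (n1, N1, T) triples; returns (a1, b1, c1)."""
    X = np.array([[r[0], r[1], 1.0] for r in records], dtype=float)
    T = np.array([r[2] for r in records], dtype=float)
    (a1, b1, c1), *_ = np.linalg.lstsq(X, T, rcond=None)
    return a1, b1, c1

def fit_prediction_analysis(records):
    """Fit T = a2*s2 + b2 from (s2, T) pairs; returns (a2, b2)."""
    s2 = np.array([r[0] for r in records], dtype=float)
    T = np.array([r[1] for r in records], dtype=float)
    A = np.column_stack([s2, np.ones_like(s2)])
    (a2, b2), *_ = np.linalg.lstsq(A, T, rcond=None)
    return a2, b2

# The fitted coefficients can then be collected, e.g.
# coeffs = {"G0": fit_threshold_mining(S0),
#           "G1": fit_cluster_analysis(S1),
#           "G2": fit_prediction_analysis(S2)}
```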
specifically, in step (S104), the values of the influencing factors of the currently executed and the tasks in the task queue are substituted
Figure BDA0003760825160000041
Figure BDA0003760825160000042
In the method, the estimated task time consumption of the current execution task is obtained
Figure BDA0003760825160000043
And estimated task time in task queue
Figure BDA0003760825160000044
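The sketch below, again illustrative only and not part of the original disclosure, evaluates the fitted mapping of step (S203) for the currently executing tasks and the queued tasks as required by step (S103); it reuses the Task fields and the coeffs dictionary assumed in the previous sketches.

```python
# Illustrative sketch (assumption): evaluating the task time consumption
# estimation mapping for running and queued tasks (step S103).

def estimate_time(task, coeffs):
    """Estimated time consumption of a single task, per the S203 mapping."""
    if task.task_type == "G0":                      # threshold mining
        a0, b0 = coeffs["G0"]
        return a0 * task.factors["n0"] + b0
    if task.task_type == "G1":                      # cluster analysis
        a1, b1, c1 = coeffs["G1"]
        return a1 * task.factors["n1"] + b1 * task.factors["N1"] + c1
    a2, b2 = coeffs["G2"]                           # prediction analysis
    return a2 * task.factors["s2"] + b2

def estimate_all(running_tasks, task_queue, coeffs):
    """Estimated time consumption of every running task and every queued task."""
    running_estimates = [estimate_time(t, coeffs) for t in running_tasks]
    queued_estimates = [estimate_time(t, coeffs) for t in task_queue]
    return running_estimates, queued_estimates
```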
Specifically, in step (S104), given the completion time of the last task on the j-th of the current M service instances and the current time t_now, the predicted completion times of all tasks are obtained as shown in FIG. 3, with the following steps:
(S301) calculating, for every instance, the predicted completion time of the task it is currently executing, from the completion time of its last task and the estimated time consumption of the current task;
(S302) setting the current task to the task at the head of the task queue and recording its estimated time consumption;
(S303) finding the instance whose predicted completion time of the current task is the earliest and recording that predicted completion time;
(S304) updating the predicted completion time of that instance by adding the estimated time consumption of the current task, and recording the updated time as the predicted completion time of the task being traversed;
(S305) if the task queue has been traversed completely, ending the loop; otherwise setting the current task to the next task in the queue, recording its estimated time consumption, and returning to step (S303).
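Steps (S301) to (S305) amount to assigning each queued task, in queue order, to the instance predicted to become free earliest. The following sketch (an illustration under stated assumptions, not the patent's code) does this with a min-heap of per-instance predicted completion times, assuming that the predicted completion time of a running task can be taken as the current time plus its estimated time consumption.

```python
# Illustrative sketch (assumption): predicted completion times of all queued
# tasks, obtained by always handing the next task to the earliest-free instance.
import heapq

def predict_completion_times(running_estimates, queued_estimates, t_now):
    """running_estimates: estimated time consumption of the task currently
    executing on each instance; queued_estimates: estimated time consumption
    of the queued tasks in queue order (head first). Returns the predicted
    completion time of every queued task, in queue order."""
    # (S301) predicted completion time of the task running on each instance
    # (here assumed to be t_now plus the task's estimated time consumption)
    instance_free = [t_now + t for t in running_estimates]
    heapq.heapify(instance_free)
    predicted = []
    for t_est in queued_estimates:                 # (S302)/(S305) walk the queue
        earliest = heapq.heappop(instance_free)    # (S303) earliest-free instance
        finish = earliest + t_est                  # (S304) updated completion time
        predicted.append(finish)
        heapq.heappush(instance_free, finish)
    return predicted
```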
Specifically, in step (S105), if the prediction shows a task whose predicted completion time is later than its latest allowed completion time, one instance is repeatedly added and the prediction is repeated until the predicted completion time of every task in the task queue is earlier than its latest allowed completion time, and the number of instances at loop exit is taken as the minimum number of instances; otherwise, one instance is repeatedly removed and the prediction is repeated until there is a task whose predicted completion time is later than its latest allowed completion time, and the number of instances at loop exit plus one is taken as the minimum number of instances.
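For illustration only, step (S105) can be sketched as a search over the instance count that reuses predict_completion_times() above; checking one fewer instance before actually removing it is an equivalent reformulation of the "remove until a task is late, then add one back" loop described in the text, and the max_instances cap and the assumption that a newly added instance is immediately free are additions made for the sketch.

```python
# Illustrative sketch (assumption): finding the minimum number of instances
# with which every queued task is predicted to meet its deadline (step S105).

def all_on_time(num_instances, running_estimates, queued_estimates, deadlines, t_now):
    # Removed instances are dropped from the tail of the running list;
    # added instances are modelled as free immediately (both are assumptions).
    running = list(running_estimates[:num_instances])
    running += [0.0] * (num_instances - len(running))
    finish = predict_completion_times(running, queued_estimates, t_now)
    return all(f <= d for f, d in zip(finish, deadlines))

def minimum_instances(current_m, running_estimates, queued_estimates,
                      deadlines, t_now, max_instances=256):
    m = current_m
    if not all_on_time(m, running_estimates, queued_estimates, deadlines, t_now):
        # add one instance at a time until every task is predicted on time
        while m < max_instances and not all_on_time(
                m, running_estimates, queued_estimates, deadlines, t_now):
            m += 1
        return m
    # otherwise remove instances while every task remains predicted on time
    while m > 1 and all_on_time(m - 1, running_estimates, queued_estimates,
                                deadlines, t_now):
        m -= 1
    return m
```

Here deadlines is assumed to be the list of latest allowed completion times t_p of the queued tasks, in the same order as queued_estimates.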
The task-queue-based elastic scaling method for compute-intensive services described in this embodiment arranges the tasks as a queue and, by combining prediction with adjustment, provides the minimum number of instances that still meets the specified delay requirement, thereby saving computing resources as far as possible while satisfying the business requirement and meeting the elastic scaling needs of compute-intensive services.
Although the embodiments of the present application have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the application, and any equivalent modification or substitution that a person skilled in the art can readily conceive within the technical scope of the present disclosure is intended to be covered. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (3)

1. A compute-intensive service elastic scaling method based on a task queue, characterized by comprising the following steps:
(S101) storing independently executable computing tasks in a task queue, wherein each task is given a latest allowed completion time, and the tasks are sorted according to the latest allowed completion time in the queue;
(S102) recording the historical task time consumption, the task type and the quantized values of the task time consumption influence factors to form a task time consumption estimation mapping;
(S103) estimating task time consumptions of all tasks being executed and in the queue by using the task time consumption estimation map;
(S104) predicting the completion time of all tasks according to the completion time of the last task on each current instance and the estimated time consumption of all tasks;
(S105) adding and removing instances and predicting whether the tasks will be completed on time, until the minimum number of instances with which all tasks in the queue are completed on time is obtained;
(S106) increasing or decreasing the instances according to the minimum number of instances.
2. The method according to claim 1, characterized in that the task time consumption estimation mapping in step (S102) is formed by a separate multiple linear regression for each type of computing task.
3. The method according to claim 1, characterized in that, in step (S105), if the prediction shows a task whose predicted completion time is later than its latest allowed completion time, one instance is repeatedly added and the prediction is repeated until the predicted completion time of every task in the task queue is earlier than its latest allowed completion time, and the number of instances at loop exit is taken as the minimum number of instances; otherwise, one instance is repeatedly removed and the prediction is repeated until there is a task whose predicted completion time is later than its latest allowed completion time, and the number of instances at loop exit plus one is taken as the minimum number of instances.
CN202210871366.0A 2022-07-22 2022-07-22 Compute intensive service elastic expansion method based on task queue Pending CN115328647A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210871366.0A CN115328647A (en) 2022-07-22 2022-07-22 Compute intensive service elastic expansion method based on task queue

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210871366.0A CN115328647A (en) 2022-07-22 2022-07-22 Compute intensive service elastic expansion method based on task queue

Publications (1)

Publication Number Publication Date
CN115328647A 2022-11-11

Family

ID=83919746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210871366.0A Pending CN115328647A (en) 2022-07-22 2022-07-22 Compute intensive service elastic expansion method based on task queue

Country Status (1)

Country Link
CN (1) CN115328647A (en)

Similar Documents

Publication Publication Date Title
JP5946068B2 (en) Computation method, computation apparatus, computer system, and program for evaluating response performance in a computer system capable of operating a plurality of arithmetic processing units on a computation core
CN109324875B (en) Data center server power consumption management and optimization method based on reinforcement learning
US7844968B1 (en) System for predicting earliest completion time and using static priority having initial priority and static urgency for job scheduling
US7752622B1 (en) Method and apparatus for flexible job pre-emption
US7743378B1 (en) Method and apparatus for multi-dimensional priority determination for job scheduling
US20130030785A1 (en) Computer resource utilization modeling for multiple workloads
US20160132359A1 (en) Abnormality detection apparatus, control method, and program
CN110362388B (en) Resource scheduling method and device
US8214836B1 (en) Method and apparatus for job assignment and scheduling using advance reservation, backfilling, and preemption
CN111104211A (en) Task dependency based computation offload method, system, device and medium
US10628214B2 (en) Method for scheduling entity in multicore processor system
CN111046091B (en) Operation method, device and equipment of data exchange system
CN105955809B (en) Thread scheduling method and system
US20140109100A1 (en) Scheduling method and system
CN114579270A (en) Task scheduling method and system based on resource demand prediction
CN113010289A (en) Task scheduling method, device and system
KR20120106089A (en) Method for reducing power consumption of system software using query scheduling of application and apparatus for reducing power consumption using said method
US9983911B2 (en) Analysis controller, analysis control method and computer-readable medium
EP4300305A1 (en) Methods and systems for energy-efficient scheduling of periodic tasks on a group of processing devices
CN115328647A (en) Compute intensive service elastic expansion method based on task queue
Leulseged et al. Probabilistic analysis of multi-processor scheduling of tasks with uncertain parameters
CN116185584A (en) Multi-tenant database resource planning and scheduling method based on deep reinforcement learning
CN111930485B (en) Job scheduling method based on performance expression
US8607245B2 (en) Dynamic processor-set management
Murad et al. Priority Based Fair Scheduling: Enhancing Efficiency in Cloud Job Distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination