CN113867972B - Container memory load prediction method based on combination of memory resources and service performance - Google Patents


Info

Publication number
CN113867972B
Authority
CN
China
Prior art keywords
performance
memory
sliding window
container
service
Prior art date
Legal status
Active
Application number
CN202111471717.0A
Other languages
Chinese (zh)
Other versions
CN113867972A (en)
Inventor
刘东海
徐育毅
庞辉富
Current Assignee
Hangzhou Youyun Software Co ltd
Beijing Guangtong Youyun Technology Co ltd
Original Assignee
Hangzhou Youyun Software Co ltd
Beijing Guangtong Youyun Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Youyun Software Co ltd, Beijing Guangtong Youyun Technology Co ltd
Priority to CN202111471717.0A
Publication of CN113867972A
Application granted
Publication of CN113867972B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3433Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment for load management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a container memory load prediction method based on the combination of memory resources and service performance, comprising the following steps: (1) performance index selection; (2) performance window size selection: the size of the current sliding window is determined from the data variance of the last sliding window and the current fixed sliding window; (3) load prediction: the performance of the container service at the current stage is detected each time and compared with the expected performance of the container service to obtain a performance difference, after which the memory mapping value of the next stage is calculated and the final prediction adjustment value is output. The advantages of the invention are that the load prediction algorithm gives the elastic scaling mechanism foresight; the garbage collection duration serves as the measurement index of service performance, solving the problem of detecting the quality of service of large-scale jobs during scaling; and the memory usage of the next stage of the container is predicted from the monitoring data of the historical time series, saving memory resources while guaranteeing quality of service.

Description

Container memory load prediction method based on combination of memory resources and service performance
Technical Field
The invention relates to the technical field of containers, and in particular to a container memory load prediction method based on the combination of memory resources and service performance.
Background
With the rise of artificial intelligence and big data services, the underlying supporting role of cloud computing platforms has become increasingly important. On cloud platforms, container virtualization technology represented by Docker has developed rapidly; thanks to characteristics such as light weight and convenience, it is gradually replacing traditional virtual machine technology in some fields.
Container technology isolates and packages the program running environment, which facilitates the whole process of developing, launching, testing, and maintaining a program, and containers can share the host operating system, so they occupy fewer resources than virtual machines. Container on-demand service provides appropriate computing resources to an application according to its workload, controlling resource cost while ensuring well-performing service. However, the workload of a cloud computing application varies over time: static resource allocation sized for peak demand causes serious waste of resources, whereas provisioning only average computing resources degrades service performance and service level. At present, many big data analysis systems use memory resources to the maximum extent, yet memory remains a relatively expensive resource. Reducing the waste of memory resources while maintaining service performance, reasonably predicting the memory resources a container requires, and allocating container resources accordingly has therefore become a problem to be solved urgently.
A few solutions to this problem exist at present. For example, the invention patent CN109271232A provides a cluster resource allocation method based on a cloud computing platform, which first collects load data, calculates the load variation and load duration over a future time period, and judges whether they exceed thresholds in order to make decisions, including copy destination decisions and migration destination decisions; it plans the mapping between virtual machines and physical machines through the allocation of virtualized cluster resources and adjusts according to the system running condition, so as to reasonably allocate virtualized resources to the service nodes in the cluster and optimize system performance and energy consumption. However, its description of the collected load data is vague, the threshold settings used for comparison are highly subjective, how the original load is calculated is not clarified, and a cluster resource allocation scheme designed for virtual machines does not transfer directly to container resource allocation.
Likewise, the invention patent CN110231976A discloses a method and system for deploying edge computing platform containers based on load prediction. Each computing node carries a raw-load monitoring system to better observe the state of the node, which is uploaded to a central server through the node load prediction system; the central node carries a node load prediction system and a computing task management system, where the node load prediction system maintains an LSTM model for each computing node, receives the raw load information of the node, and sends the prediction result to the computing task management system. The computing task management system is responsible for deploying containers: according to the received information, it feeds back node numbers and task times to the node load prediction system and issues containers to the available computing nodes. However, the LSTM model in that invention is relatively primitive, its evaluation index is not clearly defined, and how the penalty value of the system load prediction is judged is not described in detail.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art and provides a container memory load prediction method based on the combination of memory resources and service performance, in which the garbage collection duration serves as the measurement index of service performance, solving the problem of detecting the quality of service of large-scale jobs during scaling.
The object of the present invention is achieved by the following technical means. A container memory load prediction method based on memory resource and service performance combination comprises the following steps:
(1) selecting the performance index: the garbage collection (GC) duration is selected as the performance evaluation index of the subsequent algorithm;
(2) selecting the size of a performance window: determining the size of the current sliding window according to the data variance of the last sliding window and the current fixed sliding window;
(3) a load prediction algorithm: the performance of the container service at the current stage is detected each time and compared with the expected performance of the container service to obtain a performance difference; the memory mapping value of the next stage is then calculated from the change in performance and the current memory resource usage; and the memory mapping value outputs the final prediction adjustment value according to the actual conditions of the container and the host. A compact sketch of this loop follows.
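For concreteness, the following minimal Python sketch strings the three stages together as one prediction cycle. The update and clamp follow formulas (3-3), (3-4) and (3-6) as stated in the claims; the linear weighting, the piecewise form of ω_i, and all parameter values (pole, α, the window size, the memory bounds) are illustrative assumptions rather than values fixed by the invention.

```python
def expected_performance(history, w):
    """Weighted average of the last w samples; newer samples weigh more (formula (3-2))."""
    window = history[-w:]
    weights = range(1, len(window) + 1)           # linear decay toward older samples
    return sum(p * k for p, k in zip(window, weights)) / sum(weights)

def predict_memory(history, p_now, r_prev, u_mem, m_min, m_max,
                   w=5, pole=0.1, alpha=1.0):
    """One prediction cycle: returns the new mapping index R and memory quota M."""
    e = p_now - expected_performance(history, w)   # performance difference (3-3)
    omega = u_mem if e > 0 else 1.0 - u_mem        # reconstructed piecewise form (3-5)
    r = r_prev + pole * alpha * omega * e          # mapping-index update (3-4)
    r = min(max(r, 1e-6), 1.0)                     # keep R in the claimed range (0, 1]
    m = r * (m_max - m_min) + m_min                # map index onto allowed range (3-6)
    return r, m

# Rising GC pauses on a nearly full heap: the quota is pushed upward.
gc_history = [0.10, 0.12, 0.11, 0.13, 0.15]        # average GC pause per interval, seconds
r, m = predict_memory(gc_history, p_now=0.30, r_prev=0.5,
                      u_mem=0.9, m_min=256, m_max=4096)
print(f"mapping index R = {r:.3f}, next-stage memory = {m:.0f} MiB")
```

The individual steps are detailed below; each is sketched separately in its own section.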
Further, the size of the performance window is selected, and the specific steps are as follows:
(2.1) receiving the averages of the last sliding window and the current fixed sliding window:
the average value in the last sliding window is denoted oldAvg, the average value in the current fixed sliding window is denoted curAvg, and direct denotes the forward average change of the current fixed sliding window relative to the last sliding window:
formula (3-7) [expression image not reproduced in the source]
inverse denotes the reverse average change of the current fixed sliding window relative to the last sliding window:
formula (3-8) [expression image not reproduced in the source]
variance denotes the variance of the two:
variance = (direct - inverse)²    formula (3-9)
(2.2) comparing the variance with a confidence interval: if the variance is smaller than the threshold of the confidence interval, the size of the sliding window is not changed; if the variance is larger than the threshold, the sliding window is stretched according to the averages of the current fixed sliding window and the last sliding window, adding one to the sliding window size if the average of the current fixed sliding window is larger than that of the last sliding window and subtracting one otherwise; the final sliding window size is denoted wSize:
wSize = wSize + 1 if variance > threshold and curAvg > oldAvg; wSize - 1 if variance > threshold and curAvg ≤ oldAvg; unchanged otherwise    formula (3-10)
(2.3) applying the final sliding window size to the sliding window used for calculating the expected performance, the window size being dynamically adjusted according to historical data at each load prediction.
Furthermore, the load prediction algorithm specifically comprises the following steps:
(3.1) P denotes the container service performance index, and the GC duration during container service operation is used as the detection index; the average duration GC_n of the garbage collection operations executed within the most recent interval is monitored to express the service performance index of the current elastic scaling, denoted P_i:
formula (3-1) [expression image not reproduced in the source]
(3.2) P̄_i denotes the expected performance during container service operation; the sliding window over the past w time is taken as a time series, a weighted average is computed over the actual performance indexes of the container service detected in the window, and the weights decrease with the distance in time from the prediction point:
formula (3-2) [expression image not reproduced in the source]
(3.3) e_i denotes the difference between the observed performance and the expected performance; the performance difference reflects whether the container service has been facing insufficient or excess memory in the immediately past period, and hence determines whether the container memory should undergo expansion or contraction:
e_i = P_i - P̄_i    formula (3-3)
(3.4) the prediction model calculates the mapping index R_i of the memory allocation size of the next stage from two influence factors, the performance difference e_i and the current memory usage U; using the mapping index of the previous stage and combining the performance of the service in the stage just past, i.e. the performance difference e_i, it determines whether the memory mapping index is adjusted upwards or downwards; when the garbage collection duration of the service increases, e_i > 0, memory space needs to be expanded for the container and the memory mapping index should be adjusted up:
R_i = R_{i-1} + pole·(α·ω_i·e_i)    formula (3-4)
where R_i denotes the memory mapping value to which the container should be adjusted in the next stage, calculated from the mapping value R_{i-1} of the previous stage according to the change in the performance difference and the memory resource usage, with R_i ∈ (0, 1]; the parameter pole is the elasticity coefficient; the parameter α represents a first-order model between the garbage collection time and the memory mapping value; the parameter ω_i is calculated from the memory utilization U_mem of the container:
ω_i = U_mem if e_i > 0; ω_i = 1 - U_mem if e_i < 0    formula (3-5)
(3.5) determining the size M_i of the memory provided in the next stage from the memory mapping index output by the prediction model and the real memory allowed to be allocated to the program on the host:
M_i = R_i·(M_max - M_min) + M_min    formula (3-6)
where M_max and M_min denote the maximum and minimum memory that can be allocated to the program in the host settings.
The invention has the following beneficial effects: for memory-related load prediction in a container cloud environment, the invention provides a load prediction algorithm based on the combination of memory resource usage and garbage collection service performance, giving the elastic scaling mechanism foresight. The garbage collection duration serves as the measurement index of service performance, which solves the problem of detecting the quality of service of large-scale jobs during scaling, and the memory usage of the next stage of the container is predicted from the monitoring data of the historical time series, saving memory resources while guaranteeing quality of service.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a flow chart illustrating the selection of the size of the performance window.
FIG. 3 is a flowchart of the memory elastic scaling prediction algorithm.
Detailed Description
The invention will be described in detail below with reference to the following drawings:
the invention designs and realizes a container memory load prediction method based on the combination of memory resources and service performance.
1. Performance index selection
Memory resource usage is obtained straightforwardly, but the performance index of a container service is more abstract. QoS (Quality of Service) is commonly used to indicate service quality: the higher this value, the higher the quality of the service provided and the better the service performance. There are many performance measurement indexes, usually response time, throughput, request volume, failure time, accuracy, and so on. For application services with short single-processing times and frequent interaction, such as WEB-type application services, response time measures service performance effectively, and resource allocation can be adjusted dynamically according to changes in the volume of user requests, thereby guaranteeing quality of service.
For a big data analysis program, however, the data processing volume is large and the computational logic complex, so a single run is long, reaching the minute or even hour level; different data analysis jobs often load different data sets and computational logic, so neither response time nor throughput can serve as the performance measurement index. A big data analysis program also interacts infrequently: a job is usually submitted to the analysis system to run and the result is then awaited, so elastic scaling can only work by detecting changes in quality of service while the program runs and adjusting accordingly. The influence of GC on program performance, by contrast, is consistent: when a GC operation executes, program execution must be suspended to guarantee the integrity of the object tree being examined.
According to the characteristics of large-scale computing in an actual cloud environment, and aiming mainly at big data analysis systems that require long-term interaction (with response-speed requirements at the minute and hour level), the GC duration is selected as the performance evaluation index of the subsequent algorithm.
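As an illustration of how such an index might be collected in practice, the sketch below times collection pauses on a Python service through CPython's gc.callbacks hook and averages them over an interval; a JVM-based analysis system would obtain the same index from its GC logs instead. This is an assumed monitoring mechanism, not one specified by the invention.

```python
import gc
import time

_pauses = []
_start = None

def _gc_timer(phase, info):
    """Record the wall-clock length of every collection pause."""
    global _start
    if phase == "start":
        _start = time.perf_counter()
    elif phase == "stop" and _start is not None:
        _pauses.append(time.perf_counter() - _start)
        _start = None

gc.callbacks.append(_gc_timer)

def average_gc_pause():
    """Average pause since the last call: the performance index P_i of formula (3-1)."""
    if not _pauses:
        return 0.0
    avg = sum(_pauses) / len(_pauses)
    _pauses.clear()
    return avg

gc.collect()                     # force one collection so the demo records a pause
print(f"P_i = {average_gc_pause():.6f} s")
```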
2. Performance window size selection method
In the invention, the expected performance of the container service is obtained by analyzing historical data through a sliding window. Because of dynamic changes in workload, heterogeneous infrastructure, and the influence of multi-tenant shared systems, selecting an appropriate window size is important. Previous workload prediction tools typically consider a large amount of historical data and are therefore classified as long-range dependent (LRD) methods; however, LRD-based techniques are not best suited for online prediction in cloud computing systems, because cloud workloads lack the periodic behavior of traditional networks. Past approaches mostly used observation windows of fixed size, but in a change-prone cloud environment the coupling between historical data may change over time, and a fixed-size window cannot adequately limit the data range to capture local trends in the data.
The method is therefore designed to adaptively select the sliding window size of the optimal observation values, capturing the latest trend of expected performance and improving its accuracy. If the sliding window is large, abnormal performance data are smoothed out better, which suits a severely fluctuating time series; if the window is small, the method is more sensitive to performance changes and the workload of the prediction algorithm is reduced, which suits stable fluctuation. The size of the current sliding window is thus determined from the data variance of the last window and the current fixed sliding window: if the variance is small, the expected value stays close to the historical mean; if the variance is large, the window is stretched according to the historical data.
The algorithm receives the averages of the last window and the current fixed sliding window and compares them. To avoid unnecessary algorithm overhead, 5% is chosen as the boundary of the confidence interval, meaning that if the two differ by less than 5% the sliding window size need not change; otherwise the window size is changed according to the fluctuation.
The average value in the last sliding window is denoted oldAvg and the average value in the current fixed-size sliding window is denoted curAvg. To eliminate differences between data scenarios, direct denotes the forward average change of the current fixed sliding window relative to the last sliding window:
formula (3-7) [expression image not reproduced in the source]
inverse denotes the reverse average change of the current fixed sliding window relative to the last sliding window:
formula (3-8) [expression image not reproduced in the source]
variance denotes the variance of the two:
variance = (direct - inverse)²    formula (3-9)
Next, the variance is compared with the confidence interval. If the variance is smaller than the threshold of the confidence interval, the size of the sliding window is not changed; if the variance is larger than the threshold, the sliding window is stretched according to the averages of the current fixed sliding window and the last sliding window, adding one to the sliding window size if the average of the current fixed sliding window is larger than that of the last sliding window and subtracting one otherwise. The final sliding window size is denoted wSize:
wSize = wSize + 1 if variance > threshold and curAvg > oldAvg; wSize - 1 if variance > threshold and curAvg ≤ oldAvg; unchanged otherwise    formula (3-10)
Finally, the final sliding window size is applied to the sliding window used for calculating the expected performance, and the window size is dynamically adjusted according to historical data at each load prediction. This strengthens the short-range dependence on historical data, which suits the cloud computing environment. A sketch of this adaptation rule follows.
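The sketch below implements the window-adaptation rule, assuming ratio forms for direct and inverse (the original expressions for formulas (3-7) and (3-8) are images that did not survive extraction, so these forms are a reconstruction consistent with the scale-normalization purpose stated above) and applying the 5% boundary to the variance; the window bounds w_min and w_max are added assumptions.

```python
def select_window_size(old_avg, cur_avg, w_size, threshold=0.05, w_min=2, w_max=20):
    """Adapt the sliding-window size from the last and current window averages."""
    direct = cur_avg / old_avg            # forward average change, reconstructed (3-7)
    inverse = old_avg / cur_avg           # reverse average change, reconstructed (3-8)
    variance = (direct - inverse) ** 2    # fluctuation measure (3-9)

    if variance <= threshold:             # within the confidence boundary:
        return w_size                     # keep the window size unchanged
    if cur_avg > old_avg:                 # strong upward fluctuation: widen the
        return min(w_size + 1, w_max)     # window to smooth out abnormal data
    return max(w_size - 1, w_min)         # downward: shrink, reacting faster (3-10)

# A sharp rise in the average GC pause widens a 5-sample window to 6.
print(select_window_size(old_avg=0.10, cur_avg=0.16, w_size=5))
```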
3. Load prediction algorithm
In this section, the prediction algorithm is applied to the container management system and analyzed in combination with the container cloud environment. The flow of the prediction algorithm is roughly as follows: each time, the performance of the container service at the current stage is detected and compared with the expected performance of the container service, yielding a performance difference. This difference reflects whether the container service status at the present stage is good or bad, and whether the load on the service has increased or decreased. The memory mapping value of the next stage is then calculated from the change in performance and the current memory resource usage, and the mapping value outputs the final predicted adjustment value according to the actual conditions of the container and the host.
P denotes the container service performance index, selected as described above: the garbage collection (GC) duration during container service operation is used as the detection index. When new computing demands arrive and the available memory space is insufficient, garbage collection operations are executed, causing a certain time delay. If the container's available memory space is small, the program executes garbage collection frequently while running, degrading service performance. The system monitors the average duration GC_n of the garbage collection operations executed within the most recent period (interval) to express the service performance index of the current elastic scaling, denoted P_i:
formula (3-1) [expression image not reproduced in the source]
P̄_i denotes the expected performance during service operation. A data analysis program shows strong stages during its run, and data closer to the prediction time point are more indicative. To capture this trend, the sliding window over the past w time is taken as a time series and a weighted average is computed over the actual performance indexes of the container service detected in the window, the weights decreasing with the distance in time from the prediction point:
formula (3-2) [expression image not reproduced in the source]
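A sketch of formula (3-2) under a linear-decay weighting; the invention only requires that the weights decrease with distance from the prediction point, so the exact decay schedule here is an assumption.

```python
def expected_performance(history, w):
    """Weighted mean of the w most recent performance samples (formula (3-2))."""
    window = history[-w:]
    weights = range(1, len(window) + 1)    # weight 1 for the oldest, w for the newest
    return sum(p * k for p, k in zip(window, weights)) / sum(weights)

pauses = [0.10, 0.11, 0.12, 0.11, 0.18]    # most recent observation last
print(round(expected_performance(pauses, w=4), 4))   # 0.14: the newest sample dominates
```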
e_i denotes the difference between the observed performance and the expected performance; the garbage collection duration of a service may fluctuate after an unknown workload is processed. The performance difference therefore reflects well whether the container has been short of or has had excess memory over the past period, and determines whether the container memory should be expanded or contracted:
e_i = P_i - P̄_i    formula (3-3)
The prediction model then calculates the mapping index R_i of the memory allocation size of the next stage from two influence factors, the performance difference e_i and the current memory usage U. Using the mapping index of the previous stage and combining the performance of the service in the stage just past, i.e. the performance difference e_i, it determines whether the memory mapping index is adjusted up or down. When the garbage collection duration of the service increases, e_i > 0, meaning that memory space should be expanded for the container to help it reach the desired state, and the memory mapping index should be adjusted up:

R_i = R_{i-1} + pole·(α·ω_i·e_i)    formula (3-4)

where R_i denotes the memory mapping value to which the container should be adjusted in the next stage, calculated from the mapping value R_{i-1} of the previous stage according to the change in the performance difference and the memory resource usage, with R_i ∈ (0, 1]. Because hosts and containers face different usage scenarios and the actual memory usage values differ from scenario to scenario, the memory mapping ratio eliminates this difference, and the ratio is converted into an actual predicted value in combination with the container's actual situation. The parameter pole is the elasticity coefficient of the model; different scenarios have different requirements on scaling speed, and controlling that speed is a relatively complicated problem, so this elasticity parameter can be optimized automatically according to the scaling requirements of users and administrators and is not discussed further in the invention. The parameter α represents a first-order model between the garbage collection time and the memory mapping value and is set by the user in order to control the mapping value R_i. The parameter ω_i is calculated from the memory utilization U_mem of the container:

ω_i = U_mem if e_i > 0; ω_i = 1 - U_mem if e_i < 0    formula (3-5)
When e_i > 0, if the memory utilization is at a low value, the probability that the performance degradation was caused by a shortage of memory space is small; conversely, if the utilization is at a high value, the probability that performance deteriorated because of a memory shortage is large, so the parameter ω_i allows the memory quota of the next stage to be determined more accurately. In the same way, when e_i < 0, the performance index has become optimistic and memory can be reduced appropriately to save resources; in this case ω_i takes 1 - U_mem, the idle rate of the memory. When the idle rate is high, a large amount of memory space can be released without affecting the performance of the service; when the idle rate is low, care should be taken to avoid affecting service performance.
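A sketch of the update of formulas (3-4) and (3-5); the piecewise ω_i is reconstructed from the description above, and the pole and α values are illustrative.

```python
def update_mapping_index(r_prev, e, u_mem, pole=0.1, alpha=1.0):
    """R_i from the previous index, performance difference and utilization (3-4)/(3-5)."""
    omega = u_mem if e > 0 else 1.0 - u_mem    # utilization weights an expansion signal,
                                               # the idle rate weights a contraction signal
    r = r_prev + pole * alpha * omega * e      # first-order update (3-4)
    return min(max(r, 1e-6), 1.0)              # clamp to the claimed range (0, 1]

# The same upward pressure (e > 0) moves R far more on a nearly full heap
# than on a mostly idle one.
print(update_mapping_index(0.5, e=0.2, u_mem=0.95))   # ≈ 0.519
print(update_mapping_index(0.5, e=0.2, u_mem=0.20))   # ≈ 0.504
```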
Finally, the size M_i of the memory provided in the next stage is determined from the memory mapping index output by the prediction model and the real memory allowed to be allocated to the program on the host. Because R_i ∈ (0, 1], the memory provided at any time never exceeds the resource limit set by the system:

M_i = R_i·(M_max - M_min) + M_min    formula (3-6)

where M_max and M_min denote the maximum and minimum memory that can be allocated to the program in the host settings. The minimum value is set so that the service always has the most basic resources it needs and its start-up is not restricted; the maximum memory is set to prevent the resource-preemption behavior of a single service in a cloud computing environment from affecting other services on the host, improving the safety of the system. The container's allocated memory is then adjusted according to the recommended memory size output by the prediction model and the elastic scaling strategy. The workload comes from outside and is unpredictable; load prediction forecasts the memory resource demand by the above method so as to cope with future workload and achieve high performance and high memory utilization.
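A sketch of formula (3-6): because R_i lies in (0, 1], the output interpolates between the host-configured minimum and maximum and can never leave that range. The bound values below are illustrative.

```python
def next_memory_quota(r, m_min, m_max):
    """Formula (3-6): interpolate the mapping index R over [m_min, m_max]."""
    return r * (m_max - m_min) + m_min

print(next_memory_quota(0.52, 256, 4096))   # ≈ 2252.8 MiB
print(next_memory_quota(1.00, 256, 4096))   # 4096.0 MiB: never exceeds the cap
```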
In summary, the invention provides a load prediction algorithm based on the combination of memory resource usage and garbage collection service performance: the garbage collection duration serves as the measurement index of service performance, and the memory usage of the container's next stage is predicted from the monitoring data of the historical time series, so that the subsequent dynamic adjustment of the memory elastic scaling mechanism has foresight. The invention solves the problem of detecting quality of service while large-scale jobs scale, and saves memory resources while guaranteeing quality of service.
It should be understood that equivalent substitutions and changes made by those skilled in the art according to the technical solution and the inventive concept of the present invention shall fall within the protection scope of the appended claims.

Claims (2)

1. A container memory load prediction method based on the combination of memory resources and service performance, characterized in that the method comprises the following steps:
(1) selecting the performance index: selecting the garbage collection (GC) duration as the performance evaluation index of the subsequent algorithm;
(2) selecting the size of a performance window: determining the size of the current sliding window according to the data variance of the last sliding window and the current fixed sliding window;
(3) a load prediction algorithm: detecting the performance of the container service at the current stage each time and comparing it with the expected performance of the container service to obtain a performance difference; then calculating the memory mapping value of the next stage from the change in performance and the current memory resource usage; and outputting the final prediction adjustment value from the memory mapping value according to the actual conditions of the container and the host;
the size of the performance window is selected, and the specific steps are as follows:
(2.1) receiving the averages of the last sliding window and the current fixed sliding window:
the average value in the last sliding window is denoted oldAvg, the average value in the current fixed sliding window is denoted curAvg, and direct denotes the forward average change of the current fixed sliding window relative to the last sliding window:
formula (3-7) [expression image not reproduced in the source]
inverse denotes the reverse average change of the current fixed sliding window relative to the last sliding window:
formula (3-8) [expression image not reproduced in the source]
variance denotes the variance of the two:
variance = (direct - inverse)²    formula (3-9)
(2.2) comparing the variance with a confidence interval: if the variance is smaller than the threshold of the confidence interval, the size of the sliding window is not changed; if the variance is larger than the threshold, the sliding window is stretched according to the averages of the current fixed sliding window and the last sliding window, adding one to the sliding window size if the average of the current fixed sliding window is larger than that of the last sliding window and subtracting one otherwise; the final sliding window size is denoted wSize:
wSize = wSize + 1 if variance > threshold and curAvg > oldAvg; wSize - 1 if variance > threshold and curAvg ≤ oldAvg; unchanged otherwise    formula (3-10)
(2.3) applying the final sliding window size to the sliding window used for calculating the expected performance, the window size being dynamically adjusted according to historical data at each load prediction.
2. The container memory load prediction method based on the combination of memory resources and service performance according to claim 1, wherein the load prediction algorithm comprises the following specific steps:
(3.1) P denotes the container service performance index, and the garbage collection (GC) duration during container service operation is used as the detection index; the average duration GC_n of the garbage collection operations executed within the most recent interval is monitored to express the service performance index of the current elastic scaling, denoted P_i:
formula (3-1) [expression image not reproduced in the source]
(3.2) P̄_i denotes the expected performance of the container service during operation; the sliding window over the past w time is taken as a time series, a weighted average is computed over the actual performance indexes of the container service detected in the window, and the weights decrease with the distance in time from the prediction point:
formula (3-2) [expression image not reproduced in the source]
(3.3) e_i denotes the difference between the observed performance and the expected performance; the performance difference reflects whether the container service has faced insufficient or excess memory over the past period, and hence determines whether the container memory should be expanded or contracted:
e_i = P_i - P̄_i    formula (3-3)
(3.4) the prediction model calculates the mapping index R ∈ (0, 1] of the memory allocation size of the next stage from two influence factors, the performance difference e_i and the current memory usage U; using the mapping index of the previous stage and combining the performance of the service in the stage just past, i.e. the performance difference e_i, it determines whether the memory mapping index is adjusted upwards or downwards; when the garbage collection duration of the service increases, e_i > 0, memory space should be expanded for the container and the memory mapping index should be adjusted up:
R_i = R_{i-1} + pole·(α·ω_i·e_i)    formula (3-4)
where R_i denotes the memory mapping value to which the container should be adjusted in the next stage, calculated from the mapping value R_{i-1} of the previous stage according to the change in the performance difference and the memory resource usage, with R_i ∈ (0, 1]; the parameter pole is the elasticity coefficient, the parameter α represents a first-order model between the garbage collection time and the memory mapping value, and the parameter ω_i is calculated from the memory utilization U_mem of the container:
ω_i = U_mem if e_i > 0; ω_i = 1 - U_mem if e_i < 0    formula (3-5)
(3.5) determining the size M_i of the memory provided in the next stage from the memory mapping index output by the prediction model and the real memory allowed to be allocated to the program on the host:
M_i = R_i·(M_max - M_min) + M_min    formula (3-6)
where M_max and M_min denote the maximum and minimum memory that can be allocated to the program in the host settings.
Application CN202111471717.0A, priority date 2021-12-06, filing date 2021-12-06: Container memory load prediction method based on combination of memory resources and service performance; granted as CN113867972B (Active).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111471717.0A | 2021-12-06 | 2021-12-06 | Container memory load prediction method based on combination of memory resources and service performance (CN113867972B)


Publications (2)

Publication Number | Publication Date
CN113867972A (en) | 2021-12-31
CN113867972B (en) | 2022-03-15

Family

ID=78985855

Family Applications (1)

Application Number | Title | Status
CN202111471717.0A | Container memory load prediction method based on combination of memory resources and service performance | Active (CN113867972B)

Country Status (1)

Country | Link
CN | CN113867972B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045455A (en) * 2017-06-19 2017-08-15 华中科技大学 A kind of Docker Swarm cluster resource method for optimizing scheduling based on load estimation
US20180084270A1 (en) * 2016-09-20 2018-03-22 Advanced Micro Devices, Inc. Dynamic reload of video encoder motion estimation search window under performance/power constraints
CN111198808A (en) * 2019-12-25 2020-05-26 东软集团股份有限公司 Method, device, storage medium and electronic equipment for predicting performance index
CN111813548A (en) * 2020-06-30 2020-10-23 北京金山云网络技术有限公司 Resource scheduling method and device, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN113867972A (en) 2021-12-31


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant