WO2023015788A1 - Serverless computing resource allocation system for energy consumption optimization - Google Patents

Serverless computing resource allocation system for energy consumption optimization

Info

Publication number
WO2023015788A1
WO2023015788A1 · PCT/CN2021/135610 · CN2021135610W
Authority
WO
WIPO (PCT)
Prior art keywords
function
resource
energy consumption
serverless computing
resource allocation
Prior art date
Application number
PCT/CN2021/135610
Other languages
French (fr)
Chinese (zh)
Inventor
赵来平 (Zhao Laiping)
李克秋 (Li Keqiu)
贾雪超 (Jia Xuechao)
Original Assignee
天津大学 (Tianjin University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天津大学 (Tianjin University)
Publication of WO2023015788A1 publication Critical patent/WO2023015788A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention relates to the technical field of cloud computing, in particular to a technology for reducing system energy consumption and ensuring function performance under the serverless computing architecture of a cloud data center.
  • Serverless computing is a rapidly evolving cloud application architecture. Serverless computing does not require users to configure and manage resources, and can automatically expand according to user needs, greatly improving development efficiency.
  • current serverless frameworks generally only scale functions horizontally based on queries per second (QPS) or resource utilization in a specific dimension, and do not dynamically modify the resource amount of a function.
  • existing serverless platforms do not consider energy consumption when allocating resources.
  • the present study shows that, for the same function, different combinations of multidimensional resource allocation can yield the same processing delay but very different energy consumption. The concept of energy swappability is therefore introduced: configurations with different energy consumption can deliver the same processing delay. Energy swappability makes it possible to reduce energy consumption while guaranteeing function performance.
  • the present invention proposes a serverless computing resource allocation system oriented to energy consumption optimization: an independently running, function-level resource allocation system and method for serverless platforms based on energy swappability.
  • a serverless computing resource allocation system oriented to energy consumption optimization includes a resource explorer, a resource configurator, a serverless computing system, a system monitor, and a resource coordinator; wherein:
  • the resource explorer is used to apply a machine learning prediction model trained offline to perform resource exploration for each newly started function in the serverless computing system, finding, among the user resources to be allocated, the configurations that meet the function's performance requirements and, among those, the optimal configuration that minimizes energy consumption;
  • that is, within the critical region it finds the resource configuration that minimizes energy consumption, i.e., the configuration minimizing the product of the function's runtime power consumption P and the request execution time T;
  • the resource configurator is used to find all resource configurations that meet the function performance requirements in the resource configuration scheme
  • the system monitor is used to monitor three indicators: (1) whether a newly deployed function exists in the serverless computing platform; (2) whether the current power of the server exceeds the thermal design power threshold, in which case the monitor sends a power consumption overload alarm to the coordinator; (3) whether the delay of each function exceeds its respective delay threshold; once a function's delay is found to exceed its set threshold, the system monitor sends a delay violation alarm to the resource coordinator;
  • the resource coordinator is configured to make corresponding adjustments after receiving alarm information from the system monitor: after receiving a power consumption overload alarm, it throttles the function with the largest power consumption in the serverless computing system, lowering the frequency of the CPU cores owned by that function one step at a time until the overall power falls below the set threshold.
  • a serverless computing resource allocation system oriented to energy consumption optimization of the present invention can achieve the following beneficial technical effects:
  • Figure 1 is a schematic diagram of exchangeable energy consumption under different resource combinations
  • FIG. 2 is the first schematic diagram of the architecture of a serverless computing resource allocation system oriented to energy consumption optimization according to the present invention
  • FIG. 3 is the second schematic diagram of the architecture of a serverless computing resource allocation system oriented to energy consumption optimization according to the present invention
  • FIG. 4 is a schematic diagram comparing energy consumption and delay under different workloads between the present invention and operating-system-level energy control.
  • FIG. 1 is a schematic diagram of exchangeable energy consumption under different resource combinations, showing function runtime power for combinations of CPU core count with core frequency, and of CPU core count with instance count.
  • energy swappability specifically means that different combinations of multidimensional resource allocation can result in the same processing delay but different energy consumption.
  • all resource configurations in region 1, to the right of the dotted line, meet the function's performance requirements; within this region, dotted box 2 marks the optimal resource allocation scheme with the lowest runtime power.
  • the present invention utilizes a machine learning model constructed offline to find a resource allocation scheme that minimizes energy consumption under the current load intensity; meanwhile, it maintains good operation under the influence of uncontrollable factors in the serverless computing architecture.
  • FIGS. 2 and 3 are the first and second schematic diagrams of the architecture of a serverless computing resource allocation system oriented to energy consumption optimization according to the present invention.
  • the system includes a resource explorer 100 , a resource configurator 200 , a serverless computing system 300 , a system monitor 400 and a resource coordinator 500 .
  • the specific description is as follows:
  • the resource explorer 100 uses the machine learning prediction model trained offline to perform resource exploration for each newly started function in the serverless computing system 300: among the user resources to be allocated, it finds the configurations that meet the function's performance requirements and, among the many candidates, the optimal configuration that minimizes energy consumption.
  • to prevent a function from running longer in a low-power state and thereby consuming more energy overall, the function's power consumption predictor and the predictor of request processing time are used together to find the global optimum that minimizes energy consumption.
  • the best resource combination is then sent to the resource configurator 200 .
  • the machine learning prediction model used in the present invention is described as follows:
  • in order to quickly find the optimal configuration for a function, the resource explorer first prunes the resource space, eliminating configurations that far exceed the function's own resource requirements; for example, for a commercial data-center server, the candidate number of CPU cores is reduced from a maximum of 80 to 16. A binary search is then applied to the filtered configuration schemes to locate the critical region that meets the function's performance requirements. Finally, the critical region is traversed to find the resource allocation scheme that meets the performance requirements with the least energy consumption.
  • the specific working process of the resource explorer is as follows: First, use the function performance model built offline to find all resource configurations that meet the function performance requirements in the filtered resource configuration scheme.
  • the function performance model uses the number of requests per second, the calculation amount of each request, the memory size, the size of the last level cache, the number of CPU cores, the main frequency of the CPU cores, and the number of copies of the function instance as the input of the model.
  • the binary search algorithm is used for each dimension of resources, which speeds up the positioning of the critical section.
  • to avoid settling on a suboptimal solution, an exhaustive traversal is used to locate the optimal resource configuration within the critical region, using the function's power consumption model and the per-request execution time model.
  • the power consumption model of the function uses the number of requests per second, the calculation amount of each request, the size of the last level cache, the number of CPU cores, the main frequency of the CPU core, and the number of copies of the function instance as the input of the power consumption model when the function is running.
  • the execution time model takes the computation amount of each request, the last-level cache size, the number of CPU cores, and the CPU core frequency as inputs of the request processing time model, and finds the resource configuration that minimizes energy consumption in the critical region, i.e., the configuration minimizing the product of the function's runtime power consumption P and the request execution time T.
  • the resource configurator 200 is configured to find all resource configurations satisfying function performance requirements in the resource configuration scheme.
  • the specific working process of the resource configurator 200 is: after receiving the optimal resource combination, it interacts with the platform and the operating system and actually performs the operation of allocating resources to the function.
  • the system monitor 400 is used to monitor three indicators: (1) whether a newly deployed function exists in the serverless computing platform; (2) whether the current power of the server exceeds the thermal design power threshold, in which case the monitor sends a power consumption overload alarm to the coordinator; (3) whether the delay of each function exceeds its respective delay threshold; once a function's delay is found to exceed its set threshold, the system monitor 400 sends a delay violation alarm to the resource coordinator 500 .
  • the presence of the system monitor 400 not only tolerates slight deviations in the predictors but also damps fluctuations in function performance caused by uncontrollable system disturbances.
  • the specific working process of the system monitor 400 is: it periodically checks the total power of the current node; if the power exceeds the thermal design power threshold, the system monitor 400 issues a power consumption overload alarm.
  • the system monitor 400 also checks whether a newly deployed function exists in the serverless computing platform; if so, it first asynchronously starts a thread that records function information, collecting the function's startup power consumption, the container startup time, and the container's idle power consumption; after collection, the maximum survival time of the function is calculated from this information, the formula is as follows:
  • the system monitor 400 triggers subsequent resource exploration and resource allocation.
  • in order to strictly guarantee function performance, the system monitor 400 collects the runtime performance of all functions deployed in the system in real time; if a function's performance is found to fall below the minimum performance requirement set by the system, the system monitor 400 issues a function performance violation alarm.
  • the resource coordinator 500 is configured to make corresponding adjustments according to the corresponding alarm after receiving the alarm information sent by the system monitor 400 .
  • after a power consumption overload alarm, the function with the largest power consumption in the serverless computing system 300 is throttled, and the frequency of the CPU cores owned by that function is lowered one step at a time until the overall power falls below the set threshold.
  • in order to make functions run stably in a real production environment, the resource coordinator 500 must make corresponding adjustments after receiving alarm information from the monitor.
  • the power threshold setting allows power overloads to be handled more gently, with far less impact on function performance than the measures the server would otherwise take automatically.
  • the coordinator uses a heuristic exploration method, trying to increase the function's resources (number of CPU cores, core frequency, last-level cache, etc.) by one unit at a time.
  • the specific working process of the resource coordinator 500 is:
  • after receiving a power consumption overload alarm, the resource coordinator 500 first finds the function with the highest power consumption in the system and then tries to reduce the frequency of the cores owned by that function; to limit the impact on performance, it adjusts the frequency by only one step at a time. After receiving a function performance violation alarm, each iteration tries to increase the function's resources in a single dimension (for example, one CPU core, 100 MHz of core frequency, 100 MB of memory, a unit of last-level cache, or another physical server resource). After each iteration, the function's runtime performance is checked; if performance improves, the same dimension is increased again in the next iteration.
  • otherwise, a resource from another dimension is selected for allocation. That is, after the resource coordinator 500 receives a function delay violation alarm, each iteration tries to increase the function's resources in a certain dimension; if the adjustment proves effective, that dimension is increased further; otherwise another dimension is tried, until the function's performance requirements are guaranteed.
  • in order to reduce resource waste, the resource coordinator 500 also reclaims some resources from functions whose delays are far below the set target. Meanwhile, once the resource coordinator 500 finds that the current QPS deviates from the initial value by more than 20%, it re-triggers resource exploration for that function. The presence of the system monitor 400 and the resource coordinator 500 tolerates small biases in the predictors and eliminates function delay violations caused by uncontrollable system disturbances.
  • the system needs to be initialized before use, including the following operations:
  • the present invention starts from the energy consumption of the serverless computing workload, and tries to reduce the energy consumption by using some simple and effective methods.
  • the concept of energy exchange is introduced to minimize the energy consumption of serverless workloads.
  • a function-level runtime system is designed to manage the resource allocation of functions and minimize the energy consumption of functions while ensuring the performance requirements of functions.
  • the present invention effectively finds a resource allocation scheme that minimizes energy consumption while ensuring function performance requirements.
  • FIG. 4 is a schematic diagram comparing energy consumption and delay under different workloads between the present invention and operating-system-level energy control.
  • the present invention comprehensively considers issues such as energy consumption and function runtime performance, and enables the function to run stably in a disturbed environment through feedback adjustment to the workload.
  • the system is a runtime system that actively manages the resource allocation of functions, and is able to reduce the overall energy consumption by coordinating the length of each stage of the workload to minimize energy consumption and guarantee the performance of functions.
  • the system is not strongly coupled with serverless computing platforms, so it can run on most platforms.
  • the evaluation results show that compared with the state-of-the-art technology, the invention can reduce the energy consumption of computing-intensive serverless workloads by up to 21.2%, while strictly guaranteeing the function runtime performance.
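The explorer workflow described in the bullets above — prune the space, binary-search each dimension to locate the critical region, then traverse the region minimizing the power–time product — can be sketched as follows. The two predictor functions stand in for the offline-trained models, whose internals this publication does not disclose; they are purely illustrative assumptions.

```python
# Illustrative surrogates for the offline-trained models (NOT the patent's models):
def predict_latency_ms(cores, freq_ghz):
    """Per-request execution time T: more cores / higher frequency -> faster."""
    return 400.0 / (cores * freq_ghz)

def predict_power_w(cores, freq_ghz):
    """Runtime power P: grows with core count and superlinearly with frequency."""
    return 5.0 * cores * freq_ghz ** 2

def min_cores_meeting_slo(freq_ghz, max_cores, slo_ms):
    """Binary-search the core dimension for the critical-region boundary."""
    lo, hi, ans = 1, max_cores, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if predict_latency_ms(mid, freq_ghz) <= slo_ms:
            ans, hi = mid, mid - 1   # feasible: remember it, try fewer cores
        else:
            lo = mid + 1             # infeasible: need more cores
    return ans

def find_min_energy_config(max_cores, freq_options, slo_ms):
    """Traverse the critical region exhaustively, minimizing energy E = P * T."""
    best = None
    for freq in freq_options:
        start = min_cores_meeting_slo(freq, max_cores, slo_ms)
        if start is None:
            continue                 # no feasible core count at this frequency
        for cores in range(start, max_cores + 1):
            t = predict_latency_ms(cores, freq)
            e = predict_power_w(cores, freq) * t   # per-request energy E = P * T
            if best is None or e < best[0]:
                best = (e, cores, freq)
    return best

# Pruned space: 16 candidate cores (down from e.g. 80), four frequency steps.
best = find_min_energy_config(16, [1.2, 1.6, 2.0, 2.4], slo_ms=100.0)
print(best)
```

With these toy surrogates E = P·T collapses to 2000·freq, so the minimizer lands at the lowest frequency that still has a feasible core count — consistent with the swappability effect the bullets describe.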

Abstract

Disclosed in the present invention is a serverless computing resource allocation system for energy consumption optimization. The system comprises a resource explorer (100), a resource configurator (200), a serverless computing system (300), a system monitor (400) and a resource coordinator (500), wherein the resource explorer (100) is used for performing resource exploration for a newly started function in the serverless computing system (300), so as to find the configurations that meet the performance requirements of the function and the optimal configuration scheme with minimum energy consumption; the resource configurator (200) is used for finding, in the resource configuration scheme, all resource configurations that meet the performance requirements of the function; the system monitor (400) is used for monitoring indicators; and the resource coordinator (500) is used for making corresponding adjustments after receiving alarm information sent by the system monitor (400). Compared with the prior art, the present invention can reduce energy consumption and improve the energy efficiency of a data center while ensuring function performance in a serverless computing environment; moreover, the system is not strongly coupled to the platform and can be applied to any serverless computing system.

Description

A Serverless Computing Resource Allocation System Oriented to Energy Consumption Optimization

Technical Field
The invention relates to the technical field of cloud computing, and in particular to a technology for reducing system energy consumption while guaranteeing function performance under the serverless computing architecture of a cloud data center.
Background Art
With the rapid expansion of data centers, the proportion of total energy that they consume is steadily increasing, and this huge energy consumption has become an urgent problem for cloud service providers. Reducing energy consumption not only responds to energy conservation and emission reduction policies and the call for green data centers, but also ensures that servers operate safely within their rated power. More importantly, for cloud service providers, lower energy consumption means lower data center operation and maintenance costs. These reasons drive cloud service providers to strive to improve energy efficiency.
Serverless computing is a rapidly evolving cloud application architecture. It does not require users to configure and manage resources and scales automatically with demand, greatly improving development efficiency. However, current serverless frameworks generally only scale functions horizontally based on queries per second (QPS) or the resource utilization of a single dimension, and do not dynamically adjust the amount of resources given to a function. In addition, to strictly guarantee function performance, existing serverless platforms do not consider energy consumption when allocating resources. The study underlying the present invention shows that, for the same function, different combinations of multidimensional resource allocation can yield the same processing delay but very different energy consumption. The concept of energy swappability is therefore introduced: configurations with different energy consumption can deliver the same processing delay. Energy swappability makes it possible to reduce energy consumption while guaranteeing function performance.
Mixed deployment of applications can overload a server's power budget, so to keep applications stable on power-constrained servers, many research efforts have turned to energy efficiency. Although this addresses mixed deployment on power-constrained servers, server energy consumption remains high. Techniques such as dynamic voltage and frequency scaling (DVFS) and Intel P-states achieve only limited savings by presetting core frequencies, so a more aggressive approach to reducing energy consumption is needed. While previous work performs well at guaranteeing application performance on power-constrained servers, how to reduce energy consumption under a serverless computing architecture remains an open problem.
Summary of the Invention
To reduce the energy consumption of serverless workloads while guaranteeing the runtime performance of functions, the present invention proposes a serverless computing resource allocation system oriented to energy consumption optimization: an independently running, function-level resource allocation system and method for serverless platforms based on energy swappability.
The technical solution of the present invention is as follows:
A serverless computing resource allocation system oriented to energy consumption optimization, comprising a resource explorer, a resource configurator, a serverless computing system, a system monitor, and a resource coordinator, wherein:
the resource explorer uses a machine learning prediction model trained offline to perform resource exploration for each newly started function in the serverless computing system: among the user resources to be allocated, it finds the configurations that meet the function's performance requirements and, among those, the optimal configuration that minimizes energy consumption; that is, within the critical region it finds the resource configuration minimizing energy consumption, i.e., the configuration minimizing the product of the function's runtime power consumption P and the request execution time T;
the resource configurator is used to find, within the resource configuration scheme, all resource configurations that meet the function's performance requirements;
the system monitor is used to monitor three indicators: (1) whether a newly deployed function exists in the serverless computing platform; (2) whether the current power of the server exceeds the thermal design power threshold, in which case the monitor sends a power consumption overload alarm to the coordinator; (3) whether the delay of each function exceeds its respective delay threshold; once a function's delay is found to exceed its set threshold, the system monitor sends a delay violation alarm to the resource coordinator;
the resource coordinator is configured to make corresponding adjustments after receiving alarm information from the system monitor: after receiving a power consumption overload alarm, it throttles the function with the largest power consumption in the serverless computing system, lowering the frequency of the CPU cores owned by that function one step at a time until the overall power falls below the set threshold.
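The monitor and coordinator clauses above amount to a small control loop: periodically check node power and per-function delay, raise alarms, and throttle the hottest function on overload. A minimal sketch follows; the power readings, frequency ladder, and per-function power model are illustrative assumptions, not platform APIs.

```python
TDP_THRESHOLD_W = 180.0                         # assumed thermal design power threshold
FREQ_STEPS_GHZ = [2.4, 2.0, 1.6, 1.2, 0.8]      # descending, P-state-like frequency ladder

def monitor_tick(node_power_w, functions):
    """One monitoring pass: node power check plus per-function delay check."""
    alarms = []
    if node_power_w > TDP_THRESHOLD_W:
        alarms.append(("power_overload", None))          # -> resource coordinator
    for fn in functions:
        if fn["latency_ms"] > fn["latency_slo_ms"]:
            alarms.append(("delay_violation", fn["name"]))
    return alarms

def handle_power_overload(functions, node_power_w, power_model):
    """Throttle the highest-power function one frequency step at a time
    until total node power drops back under the threshold."""
    while node_power_w > TDP_THRESHOLD_W:
        hottest = max(functions, key=lambda f: f["power_w"])
        idx = FREQ_STEPS_GHZ.index(hottest["freq_ghz"])
        if idx == len(FREQ_STEPS_GHZ) - 1:
            break                                        # already at the lowest step
        hottest["freq_ghz"] = FREQ_STEPS_GHZ[idx + 1]
        new_power = power_model(hottest)
        node_power_w -= hottest["power_w"] - new_power   # node power falls accordingly
        hottest["power_w"] = new_power
    return node_power_w

# Toy per-function power model (an assumption for this sketch):
power_model = lambda f: 2.0 * f["cores"] * f["freq_ghz"] ** 2

funcs = [{"name": "ocr",    "cores": 8, "freq_ghz": 2.4, "power_w": 110.0,
          "latency_ms": 140.0, "latency_slo_ms": 100.0},
         {"name": "resize", "cores": 4, "freq_ghz": 2.0, "power_w": 40.0,
          "latency_ms": 95.0, "latency_slo_ms": 100.0}]

print(monitor_tick(195.0, funcs))                       # overload + delay violation
print(handle_power_overload(funcs, 195.0, power_model)) # -> 149.0
```

One throttling step (2.4 GHz to 2.0 GHz on the hottest function) is enough here to bring node power back under the threshold, matching the one-step-at-a-time policy described above.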
Compared with the prior art, the serverless computing resource allocation system oriented to energy consumption optimization of the present invention can achieve the following beneficial technical effects:
1) It reduces energy consumption and improves data center energy efficiency while guaranteeing function performance in a serverless computing environment;
2) It is not strongly coupled to the platform and can be used with any serverless computing system;
3) It reduces the energy consumption of compute-intensive workloads by 21.2% while guaranteeing the runtime performance of functions;
4) It enables finer-grained resource scheduling in an energy-aware environment.
Brief Description of the Drawings
Figure 1 is a schematic diagram of exchangeable energy consumption under different resource combinations;
Figure 2 is the first schematic diagram of the architecture of a serverless computing resource allocation system oriented to energy consumption optimization according to the present invention;
Figure 3 is the second schematic diagram of the architecture of a serverless computing resource allocation system oriented to energy consumption optimization according to the present invention;
Figure 4 is a schematic diagram comparing energy consumption and delay under different workloads between the present invention and operating-system-level energy control.
Detailed Description
The framework, functions, and effects of the present invention are described in detail below with reference to the accompanying drawings.
Through detailed, in-depth analysis and characterization of serverless workloads, the parts of a serverless computing workload whose energy consumption is both significant and controllable can be identified. Figure 1 is a schematic diagram of exchangeable energy consumption under different resource combinations, showing function runtime power for combinations of CPU core count with core frequency, and of CPU core count with instance count. Energy swappability means that different combinations of multidimensional resource allocation can yield the same processing delay but different energy consumption. All resource configurations in region 1, to the right of the dotted line, meet the function's performance requirements; within this region, dotted box 2 marks the optimal configuration with the lowest runtime power. The present invention uses machine learning models built offline to find the resource allocation that minimizes energy consumption under the current load intensity, while continuing to operate well under the uncontrollable factors of a serverless computing architecture.
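The swappability effect that Figure 1 illustrates can be shown with per-request energy E = P·T. The two configurations and their numbers below are illustrative assumptions, not measurements from the publication.

```python
# Two hypothetical configurations with the same predicted request latency but
# different runtime power; the numbers are illustrative only.
def energy_mj(power_w, latency_ms):
    """Per-request energy E = P * T (watts * milliseconds = millijoules)."""
    return power_w * latency_ms

# Config A: many cores at low frequency; Config B: few cores at high frequency.
cfg_a = {"cores": 8, "freq_ghz": 1.2, "power_w": 46.0, "latency_ms": 50.0}
cfg_b = {"cores": 4, "freq_ghz": 2.4, "power_w": 92.0, "latency_ms": 50.0}

# Same latency, so both meet the performance requirement, yet energy differs 2x:
print(energy_mj(cfg_a["power_w"], cfg_a["latency_ms"]))   # -> 2300.0 mJ
print(energy_mj(cfg_b["power_w"], cfg_b["latency_ms"]))   # -> 4600.0 mJ
```

Under these assumed numbers both configurations sit inside "region 1" of Figure 1, and the low-frequency configuration plays the role of "dotted box 2": equal delay, half the energy.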
FIGS. 2 and 3 are the first and second schematic diagrams of the architecture of the energy-consumption-optimized serverless computing resource allocation system of the present invention. The system includes a resource explorer 100, a resource configurator 200, a serverless computing system 300, a system monitor 400, and a resource coordinator 500, described as follows:
The resource explorer 100 uses machine learning prediction models trained offline to perform resource exploration for each newly launched function in the serverless computing system 300: among the user resources to be allocated, it finds the configurations that satisfy the function's performance requirement and, among those candidates, selects the configuration that minimizes energy consumption. An in-depth analysis of serverless workload energy consumption identifies the parts of the serverless computing system where energy can be optimized. To prevent a function from running longer in a low-power state and thereby consuming more total energy, the explorer combines the function's power predictor with the predicted per-request processing time to locate the global optimum among the candidate configurations. The best resource combination is then sent to the resource configurator 200. The machine learning prediction models used by the present invention are described as follows:
To find the optimal resource configuration that minimizes energy consumption while meeting the function's performance requirement, power and latency models of the function must be built. Many metrics affect a function's power and latency, including load intensity, the number of function instances, and various system-level resources (CPU core count, last-level cache, CPU frequency, memory, memory bandwidth, network bandwidth, disk, etc.). Pearson and Spearman correlation coefficients are used to evaluate the correlation between these metrics and the function's runtime power and latency. Finally, the six metrics most strongly correlated with function latency (last-level cache, CPU core frequency, CPU core count, number of function instances, QPS, and per-request computation) and the four most strongly correlated with energy consumption (last-level cache, CPU core frequency, CPU core count, and per-request computation) are selected, and a data set is built from them to train the latency and energy consumption models offline.
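The correlation screening described above can be sketched as follows. This is an illustrative pure-Python version, not the patent's actual implementation; the metric names are assumptions, and the rank function breaks ties arbitrarily rather than averaging them (adequate for a sketch).

```python
# Rank candidate metrics by the stronger of |Pearson| and |Spearman|
# correlation with a target series (runtime power or latency).
from statistics import mean

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman(xs, ys):
    # Spearman = Pearson on ranks; ties get arbitrary order here.
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0.0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson(ranks(xs), ranks(ys))

def rank_metrics(samples, target):
    """samples: {metric_name: [values]}; returns names, strongest first."""
    scores = {name: max(abs(pearson(v, target)), abs(spearman(v, target)))
              for name, v in samples.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A feature-selection step like this would keep only the top-ranked metrics as model inputs.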
To find the optimal configuration quickly, the resource space is first pruned by eliminating configurations that far exceed the function's own resource demand; for example, on a commodity data-center server the CPU core count is reduced from a maximum of 80 to 16. A binary search algorithm is then used to locate, within the pruned space, the critical region of configurations that satisfy the function's performance requirement. Finally, the critical region is traversed to find the configuration that meets the performance requirement with minimal energy consumption. The resource explorer works as follows: first, using the function performance model built offline, it finds all configurations in the pruned space that satisfy the function's performance requirement. The performance model takes as input the requests per second, the per-request computation, the memory size, the last-level cache size, the CPU core count, the CPU core frequency, and the number of replicas of the function instance. While searching for these configurations, a binary search is applied to each resource dimension, accelerating the localization of the critical region.
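The per-dimension binary search described above can be sketched as follows. This is a hedged illustration: `predict_latency` stands in for the offline-trained performance model, and the sketch assumes latency is monotonically non-increasing as the resource amount grows.

```python
# Find the smallest allocation level of one resource dimension (e.g. CPU
# cores) whose predicted latency still meets the function's target.
def min_feasible(candidates, predict_latency, target_latency):
    """candidates: allocation levels sorted ascending.
    Returns the smallest feasible level, or None if none meets the target."""
    lo, hi, best = 0, len(candidates) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if predict_latency(candidates[mid]) <= target_latency:
            best = candidates[mid]   # feasible: try a smaller allocation
            hi = mid - 1
        else:
            lo = mid + 1             # infeasible: need more resources
    return best
```

Running this search independently per dimension yields the boundary of the critical region in O(log n) model evaluations per dimension instead of a full scan.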
To find the energy-minimizing configuration among all those satisfying the function's performance requirement, an exhaustive traversal is used to avoid settling on a suboptimal solution. The optimum is located within the critical region using the function's power model and a per-request execution time model. The power model takes as input the requests per second, the per-request computation, the last-level cache size, the CPU core count, the CPU core frequency, and the number of replicas of the function instance; the execution time model takes as input the per-request computation, the last-level cache size, the CPU core count, and the CPU core frequency. Within the critical region, the configuration minimizing energy consumption is found, i.e., the configuration minimizing the product of the function's runtime power P and the request execution time T.
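The exhaustive traversal of the critical region can be sketched as follows. The two predictor callables stand in for the offline-trained power and execution-time models, and the three dimensions shown (cores, frequency, last-level cache) are an illustrative subset.

```python
# Among configurations already known to meet the latency target, pick the
# one minimizing predicted energy per request: E = P (runtime power) * T
# (request execution time).
from itertools import product

def best_config(cores_opts, freq_opts, llc_opts, predict_power, predict_time):
    best, best_energy = None, float("inf")
    for cfg in product(cores_opts, freq_opts, llc_opts):
        energy = predict_power(cfg) * predict_time(cfg)  # E = P * T
        if energy < best_energy:
            best, best_energy = cfg, energy
    return best, best_energy
```

Because the candidate set has already been narrowed to the critical region, the brute-force product stays small enough to enumerate while still guaranteeing the global minimum within that region.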
The resource configurator 200 puts the chosen resource configuration into effect. Its specific working process is as follows: upon receiving the optimal resource combination, it interacts with the platform and the operating system and actually performs the operation of allocating resources to the function.
The system monitor 400 monitors three indicators: (1) whether a newly deployed function has appeared in the serverless computing platform; (2) whether the server's current power exceeds the thermal design power threshold, in which case the monitor sends a power overload alert to the coordinator; and (3) whether each function's latency exceeds its respective latency threshold, in which case the system monitor 400 sends a latency violation alert to the resource coordinator 500. The system monitor 400 not only tolerates small deviations in the predictors but also reduces function performance fluctuations caused by uncontrollable system interference. Its specific working process is as follows: it periodically checks the total power of the current node; if the server's total power exceeds the power threshold set by the system, it issues a power overload alert. It also monitors the serverless computing platform for newly deployed functions; when one is found, it first asynchronously starts a thread for recording function information, collecting the function's cold-start power, container startup time, and container idle power. Once collection is complete, the function's maximum keep-alive time is computed from this information as:
maximum function keep-alive time = cold-start power × cold-start time / function idle power
The system monitor 400 then triggers the subsequent resource exploration and resource allocation.
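The keep-alive formula above transcribes directly into code. The rationale: a warm instance is worth keeping only while its accumulated idle energy stays below the energy of one cold start; beyond that point, shutting it down and paying the cold start again is cheaper.

```python
# Longest time (seconds) a warm function instance should be kept alive,
# per the formula: cold-start power * cold-start time / idle power.
def max_keep_alive(cold_start_power_w, cold_start_time_s, idle_power_w):
    return cold_start_power_w * cold_start_time_s / idle_power_w
```

For example, a 20 W, 3 s cold start against a 2 W idle container gives a 30 s keep-alive budget.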
To guarantee function performance strictly, the system monitor 400 collects in real time the runtime performance of every function deployed in the system; if a function's performance falls below the minimum performance requirement set by the system, the system monitor 400 issues a function performance violation alert.
The resource coordinator 500 makes the corresponding adjustment after receiving alert information from the system monitor 400. Upon a power overload alert, it throttles the function with the highest power consumption in the serverless computing system 300, lowering the frequency of the CPU cores owned by that function one level at a time until the overall power falls below the set threshold. For functions to run stably in a real production environment, the resource coordinator 500 must make these adjustments in response to the monitor's alerts. Setting a power threshold allows power overloads to be handled more gently, an approach whose impact on function performance is far smaller than the measures a server takes automatically. If the coordinator receives a function performance violation alert, it uses heuristic exploration, each time attempting to grant the function one additional unit of a resource (CPU cores, core frequency, last-level cache, etc.). The specific working process of the resource coordinator 500 is as follows:
After receiving a power overload alert, the resource coordinator 500 first finds the function with the highest current power consumption in the system, then attempts to lower the frequency of the cores that function owns; to balance performance, it adjusts the frequency by only one level at a time, repeating until the overall power drops below the threshold. After receiving a function performance violation alert, each iteration attempts to grant the function one unit of a single resource dimension (for example, one CPU core, 100 MHz of core frequency, 100 MB of memory, one way of last-level cache, or another physical server resource). After each iteration, the function's runtime performance is checked: if it improved, the same dimension is grown further in the next iteration; if it did not change, another resource dimension is selected for allocation. That is, after a latency violation alert the resource coordinator 500 keeps increasing resources along a dimension while the operation proves effective, and otherwise switches to a different dimension, until the function's performance requirement is satisfied.
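The coordinator's escalation heuristic described above can be sketched as the following feedback loop. This is an illustrative sketch: `measure_latency`, the dimension names, and the iteration cap are assumptions, and it assumes granting resources never makes latency worse.

```python
# On a latency violation, add one unit of one resource dimension per
# iteration; keep growing a dimension while it helps, otherwise rotate
# to the next dimension.
def escalate(alloc, dims, measure_latency, target, max_iters=20):
    """alloc: {dimension_name: units}; mutated in place and returned."""
    last = measure_latency(alloc)
    d = 0
    for _ in range(max_iters):
        if last <= target:
            break                    # performance requirement restored
        alloc[dims[d]] += 1          # grant one unit in this dimension
        now = measure_latency(alloc)
        if now >= last:
            d = (d + 1) % len(dims)  # no improvement: try another dimension
        last = now
    return alloc
```

With a latency model that depends only on cores, the loop grows the core count until the target is met and never touches the frequency dimension.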
To reduce resource waste, the resource coordinator 500 also reclaims part of the resources from functions whose latency is far below the set target. In addition, once the resource coordinator 500 finds that the current QPS has deviated from its initial value by more than 20%, it re-triggers resource exploration for that function. Together, the system monitor 400 and resource coordinator 500 tolerate small predictor deviations and eliminate function latency violations caused by uncontrollable system interference.
The system must be initialized before use, which includes the following operations:
Initialize the detection period for newly arriving functions, the server power threshold and each function's latency threshold, the server power monitoring period and the function performance monitoring period, and the coordinator's resource allocation granularity.
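The initialization parameters listed above can be grouped as in the sketch below. The field names and default values are illustrative assumptions, not values specified by the present invention.

```python
# Illustrative grouping of the runtime system's initialization parameters.
from dataclasses import dataclass

@dataclass
class RuntimeConfig:
    new_function_poll_s: float = 1.0       # detection period for new functions
    server_power_cap_w: float = 250.0      # TDP-based power threshold
    default_latency_slo_ms: float = 100.0  # per-function latency threshold
    power_monitor_period_s: float = 0.5    # server power monitoring period
    perf_monitor_period_s: float = 0.5     # function performance monitoring period
    allocation_granularity: int = 1        # units granted per coordinator step
```

A deployment would override only the fields that differ from its defaults, e.g. `RuntimeConfig(server_power_cap_w=300.0)`.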
Starting from the energy consumption of serverless computing workloads, the present invention reduces that consumption with simple and effective methods. First, the concept of energy exchangeability is introduced to minimize the energy consumption of serverless workloads. Then, a function-level runtime system is designed that manages resource allocation for functions, minimizing their energy consumption while guaranteeing their performance requirements. Using machine learning models built offline, the present invention effectively finds a resource allocation scheme that minimizes energy consumption while guaranteeing function performance. FIG. 4 shows the comparison of energy consumption and latency between the present invention and operating-system power control under different workloads.
Compared with the prior art, the present invention jointly considers energy consumption and function runtime performance, and its feedback-based workload adjustment allows functions to run stably in a disturbed environment. The system is a runtime that actively manages the resource allocation of functions; by coordinating the duration of each workload phase, it reduces overall energy consumption while guaranteeing function performance. It is not tightly coupled to any serverless computing platform and can therefore run on most platforms. Evaluation shows that, compared with the state of the art, the present invention reduces the energy consumption of compute-intensive serverless workloads by up to 21.2% while strictly guaranteeing function runtime performance.

Claims (5)

  1. A serverless computing resource allocation system for energy consumption optimization, characterized in that the system comprises a resource explorer (100), a resource configurator (200), a serverless computing system (300), a system monitor (400), and a resource coordinator (500), wherein:
    the resource explorer (100) is configured to use machine learning prediction models trained offline to perform resource exploration for newly launched functions in the serverless computing system (300), find, among the user resources to be allocated, the configurations that satisfy the function's performance requirement, and at the same time find the optimal configuration that minimizes energy consumption, namely the configuration within the critical region minimizing the product of the function's runtime power P and the request execution time T;
    the resource configurator (200) is configured to put into effect, from among the resource configuration schemes, the resource configurations found to satisfy the function's performance requirement;
    the system monitor (400) is configured to monitor three indicators: (1) whether a newly deployed function exists in the serverless computing platform; (2) whether the server's current power exceeds the thermal design power threshold, in which case the monitor sends a power overload alert to the coordinator; and (3) whether each function's latency exceeds its respective latency threshold, wherein once a function's latency is found to exceed the set threshold, the system monitor (400) sends a latency violation alert to the resource coordinator (500);
    the resource coordinator (500) is configured to make a corresponding adjustment after receiving alert information from the system monitor (400), namely: after receiving a power overload alert, throttling the function with the highest power consumption in the serverless computing system (300) by lowering the frequency of the CPU cores owned by that function one level at a time until the overall power is below the set threshold.
  2. The serverless computing resource allocation system for energy consumption optimization according to claim 1, characterized in that the machine learning prediction models use Pearson and Spearman correlation coefficients to evaluate the correlation between candidate metrics and the function's runtime power and latency; six metrics correlated with function latency, comprising last-level cache, CPU core frequency, CPU core count, number of function instances, QPS, and per-request computation, and four metrics correlated with energy consumption, comprising last-level cache, CPU core frequency, CPU core count, and per-request computation, are selected to build a data set for offline training of the latency and energy consumption models.
  3. The serverless computing resource allocation system for energy consumption optimization according to claim 1, characterized in that the resource explorer (100) further performs the following processing: first pruning the resources by eliminating resource configurations that far exceed the function's own resource demand; then using a binary search algorithm to find, within the pruned resource configuration schemes, the critical region satisfying the function's performance requirement; and finally traversing the critical region to find the resource configuration scheme that satisfies the function's performance requirement with minimal energy consumption.
  4. The serverless computing resource allocation system for energy consumption optimization according to claim 1, characterized in that the resource configurator (200) further performs the following processing: after receiving the optimal resource combination, interacting with the platform and the operating system to perform the operation of allocating resources to the function.
  5. The serverless computing resource allocation system for energy consumption optimization according to claim 1, characterized in that the system monitor (400) further performs the following processing: periodically checking the total power of the current node; if the server's total power exceeds the power threshold set by the system, issuing a power overload alert; monitoring whether a newly deployed function exists in the serverless computing platform (300), and if so, first asynchronously starting a thread for recording function information to collect the function's cold-start power, container startup time, and container idle power; and, after collection is complete, computing the function's maximum keep-alive time from this information according to the formula:
    maximum function keep-alive time = cold-start power × cold-start time / function idle power;
    meanwhile, the system monitor (400) triggers the subsequent resource exploration and resource allocation.
PCT/CN2021/135610 2021-08-10 2021-12-06 Serverless computing resource allocation system for energy consumption optimization WO2023015788A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110912481.3 2021-08-10
CN202110912481.3A CN113535409B (en) 2021-08-10 2021-08-10 Server-free computing resource distribution system oriented to energy consumption optimization

Publications (1)

Publication Number Publication Date
WO2023015788A1 true WO2023015788A1 (en) 2023-02-16

Family

ID=78091398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135610 WO2023015788A1 (en) 2021-08-10 2021-12-06 Serverless computing resource allocation system for energy consumption optimization

Country Status (2)

Country Link
CN (1) CN113535409B (en)
WO (1) WO2023015788A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113535409B (en) * 2021-08-10 2022-08-05 天津大学 Server-free computing resource distribution system oriented to energy consumption optimization
CN116382881A (en) * 2021-11-27 2023-07-04 华为技术有限公司 Function resource dynamic configuration method and function management platform under no server architecture
CN115086189B (en) * 2022-05-20 2023-11-07 中国科学院软件研究所 Service resource elastic expansion method and system oriented to serverless computing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210184941A1 (en) * 2019-12-13 2021-06-17 Hewlett Packard Enterprise Development Lp Proactively accomodating predicted future serverless workloads using a machine learning prediction model and a feedback control system
CN113114758A (en) * 2021-04-09 2021-07-13 北京邮电大学 Method and device for scheduling tasks for server-free edge computing
CN113176947A (en) * 2021-05-08 2021-07-27 武汉理工大学 Dynamic task placement method based on delay and cost balance in serverless computing
CN113205128A (en) * 2021-04-28 2021-08-03 华东师范大学 Distributed deep learning performance guarantee method based on serverless computing
CN113535409A (en) * 2021-08-10 2021-10-22 天津大学 Server-free computing resource distribution system oriented to energy consumption optimization

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109491760B (en) * 2018-10-29 2021-10-19 中国科学院重庆绿色智能技术研究院 High-performance data center cloud server resource autonomous management method
CN109714400B (en) * 2018-12-12 2020-09-22 华南理工大学 Container cluster-oriented energy consumption optimization resource scheduling system and method thereof
CN111178641B (en) * 2020-01-03 2023-11-24 江南大学 Short-term power load prediction method based on feature extraction and multi-core RSVR (reactive resource reservation Rate) combined model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZUK PAWEL; RZADCA KRZYSZTOF: "Scheduling Methods to Reduce Response Latency of Function as a Service", 2020 IEEE 32ND INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), IEEE, 9 September 2020 (2020-09-09), pages 132 - 140, XP033845416, DOI: 10.1109/SBAC-PAD49847.2020.00028 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116401055A (en) * 2023-04-07 2023-07-07 天津大学 Resource efficiency optimization-oriented server non-perception computing workflow arrangement method
CN116401055B (en) * 2023-04-07 2023-10-03 天津大学 Resource efficiency optimization-oriented server non-perception computing workflow arrangement method

Also Published As

Publication number Publication date
CN113535409A (en) 2021-10-22
CN113535409B (en) 2022-08-05

Similar Documents

Publication Publication Date Title
WO2023015788A1 (en) Serverless computing resource allocation system for energy consumption optimization
US9250680B2 (en) Method and apparatus for power-efficiency management in a virtualized cluster system
CN106528266B (en) Method and device for dynamically adjusting resources in cloud computing system
CN101488098B (en) Multi-core computing resource management system based on virtual computing technology
Saxe Power-efficient software
Arshad et al. Utilizing power consumption and SLA violations using dynamic VM consolidation in cloud data centers
CN113672383A (en) Cloud computing resource scheduling method, system, terminal and storage medium
Khargharia et al. Autonomic power & performance management for large-scale data centers
Bhalachandra et al. An adaptive core-specific runtime for energy efficiency
Kumar et al. Heterogeneity and thermal aware adaptive heuristics for energy efficient consolidation of virtual machines in infrastructure clouds
Garg et al. Task deadline-aware energy-efficient scheduling model for a virtualized cloud
JP3896352B2 (en) Distributed computing system
Zhang et al. Toward qos-awareness and improved utilization of spatial multitasking gpus
Terzopoulos et al. Bag-of-task scheduling on power-aware clusters using a dvfs-based mechanism
Kaushar et al. Comparison of SLA based energy efficient dynamic virtual machine consolidation algorithms
EP3295276B1 (en) Reducing power by vacating subsets of cpus and memory
Tian et al. Modeling and analyzing power management policies in server farms using stochastic petri nets
Terzopoulos et al. Performance evaluation of a real-time grid system using power-saving capable processors
Chen et al. Power and thermal-aware virtual machine scheduling optimization in cloud data center
Ji et al. A joint energy efficiency optimization scheme based on marginal cost and workload prediction in data centers
Kant et al. Enhancing data center sustainability through energy-adaptive computing
Friis et al. Strategies for minimization of energy consumption in data Centers
CN106227600B (en) A kind of multidimensional virtual resource allocation method based on Energy-aware
Xiong et al. Energy-saving optimization of application server clusters based on mixed integer linear programming
CN110308991B (en) Data center energy-saving optimization method and system based on random tasks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953393

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE