CN115454598B

CN115454598B - Service deployment and resource allocation method for partial decoupling data center

Info

Publication number: CN115454598B
Application number: CN202211102743.0A
Authority: CN
Inventors: 沈纲祥; 刘志豪
Original assignee: Suzhou University
Current assignee: Suzhou University
Priority date: 2022-09-09
Filing date: 2022-09-09
Publication date: 2023-06-06
Anticipated expiration: 2042-09-09
Also published as: CN115454598A; WO2024051012A1

Abstract

The invention discloses a service deployment and resource allocation method for a partially decoupled data center. The method first judges whether the current service is a resource-intensive service. If so, it first searches the resource pool to determine whether the pool can be used for service deployment and resource allocation, and searches for a usable server only if the resources in the resource pool cannot meet the service demand. Otherwise, it first searches for a server on which the service can be deployed, and attempts to deploy the service using the resource pool only when no server can meet the demand. The method can improve the service carrying capacity of the data center, make maximum use of the servers in the data center, and improve the resource utilization of the data center.

Description

Service deployment and resource allocation method for partial decoupling data center

Technical Field

The invention relates to the technical field of service deployment, in particular to a service deployment and resource allocation method of a partially decoupled data center.

Background

A conventional data center is composed of servers and a private network interconnecting them, as shown in fig. 1 (a), in which each server integrates its own various resources such as CPU, memory and disk space. While a data center can accomplish complex tasks and serve thousands of users at the same time, its resource utilization is often inefficient because the resources are tightly coupled within each server. For example, while one type of resource (e.g., a CPU) may be fully utilized in a server, other types of resources (e.g., disk space) may be rarely used. This can lead to wastage of server resources. In addition, as the size of data centers increases, conventional data centers suffer from high cost, high power consumption, and may not be easily upgradeable when needed.

In view of these problems, an alternative data center architecture, called a fully decoupled data center, has recently been proposed, which better utilizes resources by decoupling them as shown in fig. 1 (b). Specifically, the resources in each server are decomposed and the same type of resources are arranged/grouped into a resource pool. These resource pools are interconnected using a private network with high capacity and low latency. The decoupling enables different types of resources to be independently upgraded and expanded, and the overall resource utilization rate is greatly improved.

For this evolution of data centers from traditional data centers to fully decoupled data centers, there may be intermediate stages in which some resources are still provided by old servers as before, while others are provided as a pool of resources formed after decoupling. We refer to this type of data center as a partially decoupled data center, as shown in fig. 1 (c).

Currently, there is a great deal of literature on fully decoupled data centers, and resource decoupling has been shown to improve data center performance in various respects. In recent years, however, China has still been using traditional data centers built on traditional servers. In this process of evolution, no one has considered a partially decoupled data center, in which old servers and new decoupled resources coexist. How to achieve efficient service deployment and resource allocation in a partially decoupled data center is therefore a problem to be solved urgently.

Disclosure of Invention

The invention aims to provide a service deployment and resource allocation method of a partial decoupling data center with high feasibility and high efficiency.

In order to solve the above problems, the present invention provides a service deployment and resource allocation method for a partially decoupled data center, which includes the following steps:

S1: receiving a group of services, and sorting the services according to their demands for different resources to obtain a plurality of service lists;

S2: taking one service from the service list at a time, and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4;

S3: attempting to deploy the service using decoupling modules in the resource pool corresponding to the service, which comprises:

S31: first excluding all fully loaded decoupling modules in the corresponding resource pool;

S32: searching the corresponding resource pool for a single decoupling module that meets the resource demand of the service; if one exists, deploying the service using that decoupling module; otherwise, executing step S33;

S33: judging whether the remaining decoupling modules in the corresponding resource pool can together meet the resource demand of the service; if so, distributing the current service across a plurality of decoupling modules for deployment; otherwise, the resource pool deployment fails and step S4 is executed;

S4: attempting to deploy the service using a server, which comprises:

S41: first excluding servers whose remaining resources cannot meet the resource demand of the service;

S42: sorting the servers according to the different remaining resources in the servers to obtain a plurality of server lists;

S43: judging whether a server meeting the resource demand of the service exists in the server list corresponding to the service; if so, deploying the service using that server; otherwise, the server deployment fails and step S3 is executed.

As a further improvement of the present invention, in step S1, the services are sorted according to their demands for different resources to obtain three service lists: the first service list sorts the services by CPU demand from small to large, the second service list sorts the services by memory demand from small to large, and the third service list sorts the services by external memory demand from small to large.

As a further improvement of the present invention, step S2 includes: proceeding through the first service list, the second service list and the third service list in that order and, within each list, in ascending order of resource demand, taking one service from the current service list at a time and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4.

As a further improvement of the present invention, in step S3, the first service list corresponds to a CPU resource pool containing CPU modules, the second service list corresponds to a memory resource pool containing memory modules, and the third service list corresponds to an external memory resource pool containing external memory modules.

As a further improvement of the present invention, in step S42, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers by remaining CPU resources from small to large, the second server list sorts the servers by remaining memory resources from small to large, and the third server list sorts the servers by remaining external memory resources from small to large.

As a further improvement of the invention, the method further comprises the following steps:

If the current service fails to be deployed in both the resource pool and the server, the deployment of the current service is ended and deployment continues with the next service.

If the current service fails to be deployed in both the resource pool and the server, the current service is deleted from the three service lists.

As a further improvement of the present invention, before step S1, the following steps are further included:

According to their demands for the different resource types, the services are divided into CPU-intensive services, memory-intensive services, IO-intensive services and low-load services, wherein the CPU-intensive services, the memory-intensive services and the IO-intensive services are resource-intensive services.

The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any one of the methods described above when executing the program.

The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods described above.

The invention has the beneficial effects that:

The service deployment and resource allocation method for the partially decoupled data center can improve the service carrying capacity of the data center, make maximum use of the servers in the data center, and improve the resource utilization of the data center.

The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.

Drawings

FIG. 1 (a) shows a conventional data center architecture;

FIG. 1 (b) shows a fully decoupled data center architecture;

FIG. 1 (c) shows a partially decoupled data center architecture;

FIG. 2 is a flow chart of a method for service deployment and resource allocation for a partially decoupled data center in accordance with a preferred embodiment of the present invention;

FIG. 3 is a block diagram of a method for service deployment and resource allocation for a partially decoupled data center in accordance with a preferred embodiment of the present invention;

FIG. 4 is a graph showing the results of the integer linear programming model and the method for service deployment and resource allocation of a partially decoupled data center in accordance with the preferred embodiment of the present invention in terms of the total number of deployed services;

FIG. 5 is a graph showing the performance of the service deployment and resource allocation method of the partially decoupled data center according to the preferred embodiment of the present invention and of the first-fit (FF) scheme as the degree of decoupling increases.

Detailed Description

The present invention will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the invention and practice it.

As shown in fig. 2, a method for service deployment and resource allocation of a partially decoupled data center according to a preferred embodiment of the present invention includes the following steps:

Specifically, in step S1, the services are sorted according to their demands for different resources to obtain three service lists: the first service list sorts the services by CPU demand from small to large, the second by memory demand from small to large, and the third by external memory demand from small to large. The resources comprise CPU, memory and external memory. Alternatively, each list may be generated by giving a high weight to the CPU, the memory or the external memory, respectively.
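The three service lists of step S1 can be sketched as follows. This is a minimal illustration only: the `Service` record, its field names and the sample demands are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Service:
    """Hypothetical service record; field names are assumptions."""
    name: str
    cpu: int    # CPU cores requested
    mem: int    # memory requested, in GB
    disk: int   # external memory (disk) requested, in GB

def build_service_lists(services):
    """Return one list per resource type, each sorted by ascending demand."""
    return {
        "cpu": sorted(services, key=lambda s: s.cpu),
        "mem": sorted(services, key=lambda s: s.mem),
        "disk": sorted(services, key=lambda s: s.disk),
    }

# Illustrative demands only.
services = [
    Service("a", cpu=20, mem=8, disk=100),
    Service("b", cpu=4, mem=96, disk=300),
    Service("c", cpu=8, mem=16, disk=900),
]
lists = build_service_lists(services)
```

Each service appears in all three lists, which is why a service that has been handled must later be removed from every list.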

Specifically, in step S2, the first service list, the second service list and the third service list are processed in that order and, within each list, in ascending order of resource demand; one service is taken from the current service list at a time, and whether the service is a resource-intensive service is judged; if yes, step S3 is executed; otherwise, step S4 is executed.

Whether a service is resource-intensive is determined by its demands for the different resource types. Each service has different demands for the various resources; a CPU-intensive service, for example, has a large demand for CPU while its demands for other resources are ordinary. Given limited data center resources, our goal is to maximize the number of deployed services. The criteria for classifying a service as resource-intensive may be defined manually or adjusted according to the actual scenario.
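Since the classification criteria are left open, one possible threshold rule is sketched below. The default thresholds (16 cores, 64 GB, 512 GB) are assumptions that mirror the demand ranges used in the simulation embodiment of this description; the function names, the order in which the resources are checked, and the treatment of a demand exactly at the threshold are likewise assumptions.

```python
def classify(cpu, mem, disk, cpu_hi=16, mem_hi=64, disk_hi=512):
    """Label a service by the first resource whose demand exceeds its threshold.

    Thresholds are adjustable; the defaults are illustrative assumptions.
    """
    if cpu > cpu_hi:
        return "cpu-intensive"
    if mem > mem_hi:
        return "memory-intensive"
    if disk > disk_hi:
        return "io-intensive"
    return "low-load"

def is_resource_intensive(label):
    """The three intensive classes count as resource-intensive; low-load does not."""
    return label != "low-load"
```

Keeping the thresholds as parameters reflects the statement that the criteria may be tuned to the actual scenario.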

Optionally, before step S1, the method further includes the following steps:

S3: attempting to deploy the service using decoupling modules in the resource pool corresponding to the service; the first service list corresponds to a CPU resource pool containing CPU modules, the second service list corresponds to a memory resource pool containing memory modules, and the third service list corresponds to an external memory resource pool containing external memory modules.

Specifically, step S3 includes:

Specifically, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers by remaining CPU resources from small to large, the second by remaining memory resources from small to large, and the third by remaining external memory resources from small to large.
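A minimal sketch of server-side deployment (filter, sort, select) is given below. The `Server` record and `pick_server` are hypothetical names. The description does not state which feasible server is chosen from the sorted list; taking the first one, that is, the feasible server with the least remaining amount of the relevant resource, is an assumption of this sketch.

```python
from dataclasses import dataclass

RESOURCES = ("cpu", "mem", "disk")

@dataclass
class Server:
    """Hypothetical server record holding remaining resources."""
    name: str
    cpu: int
    mem: int
    disk: int

def pick_server(servers, demand, key):
    """Try to place a service on a server.

    demand: dict with 'cpu', 'mem' and 'disk' amounts.
    key: the resource whose server list corresponds to the service.
    Returns the chosen Server (its remaining resources reduced) or None.
    """
    # Exclude servers whose remaining resources cannot cover the demand.
    feasible = [s for s in servers
                if all(getattr(s, r) >= demand[r] for r in RESOURCES)]
    # Sort by remaining amount of the relevant resource, small to large.
    feasible.sort(key=lambda s: getattr(s, key))
    if not feasible:
        return None  # server deployment fails
    chosen = feasible[0]  # assumption: tightest feasible server
    for r in RESOURCES:
        setattr(chosen, r, getattr(chosen, r) - demand[r])
    return chosen

servers = [Server("s1", 32, 128, 1024), Server("s2", 10, 64, 512),
           Server("s3", 2, 8, 100)]
placed = pick_server(servers, {"cpu": 4, "mem": 16, "disk": 200}, key="cpu")
```

Choosing the tightest feasible server tends to leave larger servers free for larger services, which is one plausible motivation for the ascending sort.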

In some embodiments, the method further comprises the steps of:

Further, the method also comprises the following steps:

In the invention, when a service is a resource-intensive service, the resource pool is searched first to determine whether it can be used for service deployment and resource allocation, and the servers are searched for a usable one only when the resources in the resource pool cannot meet the service demand. When a service is not resource-intensive, the servers are searched first for one on which the service can be deployed, and the resource pool is tried only when no server can meet the demand.
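The order of attempts described above can be sketched as a small dispatcher. The names `deploy`, `try_pool` and `try_server` are hypothetical; the two callables stand in for the pool and server deployment procedures and simply return whether they succeeded. Each target is tried at most once, so a service that fails on both is abandoned.

```python
def deploy(is_intensive, try_pool, try_server):
    """Return 'pool', 'server', or None depending on where deployment succeeds.

    Resource-intensive services try the resource pool first; all other
    services try the servers first. Each target is tried at most once.
    """
    order = [("pool", try_pool), ("server", try_server)]
    if not is_intensive:
        order.reverse()
    for target, attempt in order:
        if attempt():
            return target
    return None  # failed on both: move on to the next service
```

For example, `deploy(True, lambda: False, lambda: True)` falls back to a server after the pool attempt fails.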

Specifically, referring to fig. 3, the services are first sorted according to the number of CPUs required. For services with lower resource demands, we first try to provision servers for them, examining the sorted server list to select a server with sufficient resources. For memory-intensive and CPU-intensive services, we try to deploy them using a resource pool. If a single module in the resource pool can meet the resource demand, we use it first. If not, we use the remaining resources across the modules of the resource pool to deploy the service.
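The pool-side logic (skip full modules, prefer a single module, otherwise split across modules) can be sketched for one resource type as follows. `allocate_from_pool` is a hypothetical name. Which single module is preferred and the order in which modules are drained when splitting are not specified in the description; the tightest-fit and largest-first choices below are assumptions. A complete deployment would presumably repeat this for every resource type the service needs.

```python
def allocate_from_pool(free, demand):
    """Allocate `demand` units from a pool of modules of one resource type.

    free: list of remaining capacity per module (updated in place on success).
    Returns a list of (module_index, amount) pairs, or None on failure.
    """
    # Exclude fully loaded modules.
    candidates = [i for i, cap in enumerate(free) if cap > 0]
    # Prefer a single module that can hold the whole demand.
    fitting = [i for i in candidates if free[i] >= demand]
    if fitting:
        i = min(fitting, key=lambda j: free[j])  # assumption: tightest fit
        free[i] -= demand
        return [(i, demand)]
    # Otherwise split across modules if their combined capacity suffices.
    if sum(free[i] for i in candidates) < demand:
        return None  # pool deployment fails
    plan, left = [], demand
    for i in sorted(candidates, key=lambda j: free[j], reverse=True):
        take = min(free[i], left)  # assumption: drain largest modules first
        free[i] -= take
        plan.append((i, take))
        left -= take
        if left == 0:
            break
    return plan

cpu_pool = [0, 10, 32]                     # remaining cores per CPU module
single = allocate_from_pool(cpu_pool, 8)   # fits in one module
split = allocate_from_pool(cpu_pool, 33)   # must span two modules
failed = allocate_from_pool(cpu_pool, 5)   # not enough capacity left
```

The pool is left untouched when allocation fails, so the caller can fall back to a server without undoing anything.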

According to the invention, a plurality of service lists and server lists are maintained in which the services and the servers are kept sorted, so that the efficiency of service deployment and resource allocation can be effectively improved.

In one embodiment, we consider three types of resource modules: a 32-core CPU module, a 128 GB memory module and a 1024 GB disk module. It is assumed that each server has one CPU module, one memory module and one disk module, and that each decoupled resource pool is a collection of resource modules. We consider decoupling levels of 0%, 10%, ..., 100%.

The simulation was performed for two cases. In case 1, we have 10 servers, corresponding to 30 resource modules, 10 for each resource type. Case 2 contains 1000 servers, corresponding to 3000 modules, 1000 for each resource type. Depending on the decoupling level, a proportion of the servers is decomposed and their resource modules are aggregated into resource pools of the different types. For example, for case 1, if the decoupling level is 30%, then 3 servers are decomposed into 9 resource modules, 3 for each resource type.

Furthermore, we divide the four service types into two main categories: conventional services and resource-intensive services. The resource demands of a conventional service are distributed over the ranges of [1,16] CPU cores, [1,64] GB of memory and [1,512] GB of disk. In contrast, a resource-intensive service has a high demand for one type of resource, while its other resource demands remain normal. For example, the CPU demand of a CPU-intensive service is in the range of [16,32] cores, and its memory and disk demands are normal. Memory-intensive services and IO-intensive services require memory in the range of [64,128] GB and disk space in the range of [512,1024] GB, respectively. We generate the same number of services for each of the four types.
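The decoupling-level arithmetic described above can be checked with a short sketch. The function name `partition` and the returned keys are hypothetical. The percentage is taken as an integer, and rounding down for levels that do not divide evenly is an assumption; it does not arise for the 10% steps and server counts used here.

```python
RESOURCE_TYPES = ("cpu", "mem", "disk")  # one module of each type per server

def partition(num_servers, decoupling_percent):
    """Split a data center into intact servers and per-type module pools."""
    decomposed = num_servers * decoupling_percent // 100
    return {
        "servers": num_servers - decomposed,       # servers left intact
        "modules_per_pool": decomposed,            # modules in each resource pool
        "total_pool_modules": decomposed * len(RESOURCE_TYPES),
    }

case1_30 = partition(10, 30)
```

For case 1 at a 30% decoupling level this gives 7 intact servers and 9 pooled modules, 3 per resource type, matching the example in the text.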

Based on case 1, fig. 4 compares the results of the integer linear programming model and the method of the present invention in terms of the total number of services deployed, where the total number of services that need to be deployed is 50.

The objective of the integer linear programming model is to maximize the number of services deployed, subject to the following constraints.

(1) Virtual data center service constraints: each virtual data center service is considered to be successfully deployed only if it is provided with sufficient resources.

(2) Server allocation constraints: although a server may accommodate multiple services, the total resources allocated to these services cannot exceed the capacity of the server.

(3) Resource module allocation constraints: one resource module may provide resources for multiple services, and if a service requires more resources, it may use the resources of multiple modules. However, the total resources allocated to services from a resource module should not exceed the capacity of that module.

(4) Server and resource pool separation constraints: if a service has been allocated resources from resource modules, it cannot also be allocated resources from a server, and vice versa.
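The description states the objective and the four constraints only in words. One way they might be written formally is sketched below; every symbol is our own notation, introduced here as an assumption and not taken from the patent. Let $I$ be the set of services, $S$ the set of servers, $R$ the set of resource types and $M_r$ the set of pooled modules of type $r$. Let $d_{i,r}$ be the demand of service $i$ for resource $r$, $C_{s,r}$ the capacity of server $s$ for resource $r$, and $C_m$ the capacity of module $m$. The binary variable $x_i$ indicates that service $i$ is deployed, $y_{i,s}$ that it is placed on server $s$, and $z_i$ that it is served from the resource pools; $a_{i,m}$ is the amount of module $m$ allocated to service $i$.

```latex
\begin{align}
\max \quad & \sum_{i \in I} x_i \\
\text{s.t.} \quad
& x_i = z_i + \sum_{s \in S} y_{i,s} \le 1, && \forall i \in I \\
& \sum_{i \in I} d_{i,r}\, y_{i,s} \le C_{s,r}, && \forall s \in S,\ \forall r \in R \\
& \sum_{m \in M_r} a_{i,m} = d_{i,r}\, z_i, && \forall i \in I,\ \forall r \in R \\
& \sum_{i \in I} a_{i,m} \le C_m, && \forall r \in R,\ \forall m \in M_r \\
& x_i,\ y_{i,s},\ z_i \in \{0,1\}, \quad a_{i,m} \in \mathbb{Z}_{\ge 0}
\end{align}
```

In this sketch the first constraint covers both (1) and (4): a service counts as deployed only if it is placed on exactly one server or served entirely from the pools, never both. The second constraint corresponds to (2), and the third and fourth together correspond to (3), allowing a service to draw on several modules while no module is overcommitted.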

We note that as the decoupling proportion increases, the total number of deployed services increases gradually. This is because a higher degree of decoupling frees more resource modules from the servers to form larger resource pools, which allows more services to share these resources so that they are better utilized. Furthermore, we note that the performance of the method of the present invention is very close to that of the integer linear programming model, which demonstrates the efficiency of the method. We also compared the method of the present invention with a first-fit (FF) scheme. The FF scheme does not sort the services, the servers or the resource modules in advance; it first attempts to deploy a service using a server and, if that fails, deploys the service using a resource pool. We note that the proposed method is clearly superior to the first-fit scheme, deploying up to 31.6% more services.

We also evaluated the efficiency of the proposed method on the large-scale data center scenario of case 2 (with 5000 services). In this case, since the integer linear programming model is difficult to solve, we do not provide its results and instead compare with the first-fit scheme. Fig. 5 shows the performance of the proposed method and the first-fit scheme as the degree of decoupling increases, where "IR" denotes the rate of increase in the number of deployed services for the proposed method compared with the first-fit scheme. As in case 1, the number of successfully deployed services increases with the decoupling proportion. Furthermore, we note that the proposed method is more than 10% better than the first-fit scheme, and this improvement becomes more pronounced as the decoupling proportion increases. This is because the proposed method performs the sorting process before attempting to deploy a service on a server, whereas the first-fit scheme performs no such sorting.
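The description does not give a formula for IR. A natural reading, assumed here, is the relative increase in deployed services over the first-fit baseline. The function name and the numbers in the example are purely illustrative and are not simulation results.

```python
def increase_rate(deployed_proposed, deployed_ff):
    """Relative increase of the proposed method over the first-fit baseline."""
    if deployed_ff <= 0:
        raise ValueError("baseline must deploy at least one service")
    return (deployed_proposed - deployed_ff) / deployed_ff

# Hypothetical counts: 110 services deployed versus 100 under first fit.
ir = increase_rate(110, 100)
```

With these hypothetical counts the rate is 10%.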

The preferred embodiment of the invention also discloses an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which processor implements the steps of the method described in the above embodiments when executing the program.

The preferred embodiment of the present invention also discloses a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the above embodiments.

The above embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims

1. The service deployment and resource allocation method of the partial decoupling data center is characterized by comprising the following steps:

2. The method for service deployment and resource allocation of a partially decoupled data center according to claim 1, wherein in step S1, three service lists are obtained by sorting the services according to their demands for different resources, the first service list sorts the services by CPU demand from small to large, the second service list sorts the services by memory demand from small to large, and the third service list sorts the services by external memory demand from small to large.

3. The method for service deployment and resource allocation of a partially decoupled data center according to claim 2, wherein step S2 comprises: according to the sequence of the first service list, the second service list and the third service list and according to the small to large demand of the service on the resource, one service is taken out from the current service list each time, and whether the service is a resource intensive service is judged; if yes, executing step S3; otherwise, step S4 is performed.

4. The method for service deployment and resource allocation of a partially decoupled data center according to claim 2, wherein in step S3, the first service list corresponds to a CPU resource pool including CPU modules, the second service list corresponds to a memory resource pool including memory modules, and the third service list corresponds to an external memory resource pool including external memory modules.

5. The service deployment and resource allocation method of the partially decoupled data center according to claim 2, wherein in step S42, the servers are ranked according to different remaining resources in the servers to obtain three server lists, the servers are ranked according to the remaining CPU resources from small to large in the first server list, the servers are ranked according to the remaining memory resources from small to large in the second server list, and the servers are ranked according to the remaining external memory resources from small to large in the third server list.

6. The service deployment and resource allocation method of a partially decoupled data center of claim 1, further comprising the steps of:

7. The service deployment and resource allocation method of a partially decoupled data center of claim 6, further comprising the steps of:

8. The method for service deployment and resource allocation of a partially decoupled data center according to claim 1, further comprising the step of, prior to step S1:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1-8 when the program is executed by the processor.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1-8.