CN115454598A - Service deployment and resource allocation method of partially decoupled data center

Info

Publication number: CN115454598A
Application number: CN202211102743.0A
Authority: CN
Inventors: 沈纲祥; 刘志豪
Original assignee: Suzhou University
Current assignee: Suzhou University
Priority date: 2022-09-09
Filing date: 2022-09-09
Publication date: 2022-12-09
Anticipated expiration: 2042-09-09
Also published as: CN115454598B; WO2024051012A1

Abstract

The invention discloses a service deployment and resource allocation method for a partially decoupled data center. The method first determines whether the current service is a resource-intensive service. If so, it first tries to deploy the service and allocate resources from a resource pool, and looks for an available server only when the resources in the pool cannot meet the service demand. Otherwise, it first looks for a server on which the service can be deployed, and tries to deploy the service using the resource pool only when no server can meet the demand. The method can improve the service carrying capacity of the data center while making maximum use of the servers in the data center, thereby improving the resource utilization of the data center.

Description

Service deployment and resource allocation method of partially decoupled data center

Technical Field

The invention relates to the technical field of service deployment, in particular to a service deployment and resource allocation method of a partially decoupled data center.

Background

A conventional data center consists of servers and a private network interconnecting them, as shown in fig. 1 (a), where each server integrates its own resources, such as CPU, memory and disk space. Although such a data center can accomplish complex tasks and serve thousands of users simultaneously, its resources are often used inefficiently because they are tightly coupled within each server. For example, one type of resource (e.g., CPU) may be fully utilized in a server while other types (e.g., disk space) are rarely used, which wastes server resources. In addition, as data centers grow, conventional data centers suffer from high cost and high energy consumption, and may not be easy to upgrade when needed.

In view of these problems, an alternative data center architecture, called a fully decoupled data center, has recently been proposed, as shown in fig. 1 (b), which better utilizes resources through decoupling of resources. Specifically, the resources in each server are decomposed, and the same type of resources are arranged/grouped into a resource pool. These resource pools are interconnected using a dedicated network with high capacity and low latency. The decoupling enables the resources of different types to be upgraded and expanded independently, and greatly improves the utilization rate of the whole resources.

During the evolution from a traditional data center to a fully decoupled data center, there may be intermediate stages in which some resources are still provided by old servers as before, while other resources are provided by resource pools formed after decoupling. We refer to this type of data center as a partially decoupled data center, as shown in fig. 1 (c).

At present, a large body of literature studies fully decoupled data centers and has verified the performance improvements that resource decoupling brings. In recent years, however, data centers in China have still been built on the traditional architecture based on conventional servers. For the transition between the two architectures, no one has yet considered a partially decoupled data center, in which old servers and newly decoupled resources coexist. How to implement efficient service deployment and resource allocation in a partially decoupled data center is therefore an urgent problem to be solved.

Disclosure of Invention

The invention aims to provide a service deployment and resource allocation method of a partially decoupled data center, which is high in feasibility and efficiency.

In order to solve the above problems, the present invention provides a method for service deployment and resource allocation in a partially decoupled data center, which comprises the following steps:

S1, receiving a group of services, and sorting the services according to their requirements for different resources to obtain a plurality of service lists;

S2, taking one service out of the service lists at a time, and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4;

S3, attempting to deploy the service using a decoupling module in the resource pool corresponding to the service, which comprises the following steps:

S31, excluding all fully loaded decoupling modules in the corresponding resource pool;

S32, searching the corresponding resource pool for a single decoupling module that meets the resource requirement of the service; if one exists, deploying the service using that decoupling module; otherwise, executing step S33;

S33, judging whether the remaining decoupling modules in the corresponding resource pool can together meet the resource requirement of the service; if so, distributing the current service across a plurality of decoupling modules for deployment; otherwise, deployment on the resource pool fails, and step S4 is executed;

S4, attempting to deploy the service using a server, which comprises the following steps:

S41, excluding servers whose remaining resources cannot meet the resource requirement of the service;

S42, sorting the servers according to the different remaining resources in the servers to obtain a plurality of server lists;

S43, judging whether a server meeting the resource requirement of the service exists in the server list corresponding to the service; if so, deploying the service using that server; otherwise, deployment on the server fails, and step S3 is executed.
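By way of non-limiting illustration, the pool-side steps S31 to S33 can be sketched as follows for a single resource type. The function name `deploy_on_pool`, the representation of a pool as a list of remaining module capacities, the choice of the tightest fitting module in S32, and the largest-first splitting order in S33 are all assumptions made for this sketch; the text above only requires that a fitting module be found or that the remaining modules jointly suffice.

```python
from typing import List, Optional


def deploy_on_pool(modules: List[int], demand: int) -> Optional[List[int]]:
    """Try to place `demand` units of one resource type on a resource pool.

    `modules` holds the remaining capacity of each decoupling module and is
    updated in place on success. Returns the amount taken from each module,
    or None when the pool cannot host the service (the caller then tries S4).
    """
    taken = [0] * len(modules)

    # S31: exclude fully loaded modules.
    candidates = [i for i, free in enumerate(modules) if free > 0]

    # S32: look for a single module that covers the whole demand.
    # Assumption: among fitting modules, use the one with the least free capacity.
    fitting = [i for i in candidates if modules[i] >= demand]
    if fitting:
        best = min(fitting, key=lambda i: modules[i])
        modules[best] -= demand
        taken[best] = demand
        return taken

    # S33: check whether the remaining modules together can meet the demand.
    if sum(modules[i] for i in candidates) < demand:
        return None  # deployment on the resource pool fails

    # Assumption: split across modules starting with the largest free capacity.
    remaining = demand
    for i in sorted(candidates, key=lambda i: modules[i], reverse=True):
        share = min(modules[i], remaining)
        modules[i] -= share
        taken[i] = share
        remaining -= share
        if remaining <= 0:
            break
    return taken
```

A return value of `None` corresponds to the branch in S33 where deployment on the resource pool fails and the server is tried instead.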

As a further improvement of the present invention, in step S1, services are ordered according to requirements of the services for different resources to obtain three service lists, where the first service list orders the services from small to large according to the requirements of the services for a CPU, the second service list orders the services from small to large according to the requirements of the services for a memory, and the third service list orders the services from small to large according to the requirements of the services for an external memory.
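As a minimal sketch of the three sorted service lists described above, the following builds one list per resource type. The `Service` record, its field names, and the function name `build_service_lists` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    name: str
    cpu: int      # requested CPU cores
    memory: int   # requested memory in GB
    storage: int  # requested external memory (disk) in GB


def build_service_lists(services: List[Service]) -> Dict[str, List[Service]]:
    """Return one list per resource type, each sorted by ascending demand."""
    return {
        resource: sorted(services, key=lambda s, r=resource: getattr(s, r))
        for resource in ("cpu", "memory", "storage")
    }
```

Each service appears in all three lists, which is why a service that has been handled must later be removed from every list.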

As a further improvement of the present invention, step S2 includes: taking out a service from the current service list each time according to the sequence of the first service list, the second service list and the third service list and the requirement of the service on resources from small to large, and judging whether the service is a resource intensive service; if yes, executing step S3; otherwise, step S4 is executed.

As a further improvement of the present invention, in step S3, the first service list corresponds to a CPU resource pool including CPU modules, the second service list corresponds to a memory resource pool including memory modules, and the third service list corresponds to an external memory resource pool including external memory modules.

As a further improvement of the present invention, in step S42, the servers are sorted according to different remaining resources in the servers to obtain three server lists, the first server list sorts the servers from small to large according to the remaining CPU resources, the second server list sorts the servers from small to large according to the remaining memory resources, and the third server list sorts the servers from small to large according to the remaining external memory resources.
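The server-side steps S41 to S43 can be sketched in a similar way. This is only an illustration under stated assumptions: the dictionary layout of remaining resources, the function name `deploy_on_server`, and the rule of taking the first server of the ascending list (that is, the tightest fit for the sorted resource) are not specified in the text, which only asks whether a suitable server exists in the corresponding list.

```python
from typing import Dict, List, Optional

Resources = Dict[str, int]  # e.g. {"cpu": 8, "memory": 32, "storage": 200}


def deploy_on_server(servers: List[Resources], demand: Resources,
                     sort_by: str) -> Optional[int]:
    """Place a service on one server; return its index, or None on failure.

    `servers` holds the remaining resources of each server and is updated
    in place on success. `sort_by` names the resource of the server list
    that corresponds to the service ("cpu", "memory" or "storage").
    """
    # S41: exclude servers whose remaining resources cannot meet the demand.
    feasible = [i for i, free in enumerate(servers)
                if all(free[r] >= demand[r] for r in demand)]
    if not feasible:
        return None  # deployment on the server fails

    # S42: sort the servers by the remaining amount of the chosen resource,
    # from small to large.
    ranked = sorted(feasible, key=lambda i: servers[i][sort_by])

    # S43 (assumption): take the first server of the sorted list.
    chosen = ranked[0]
    for r in demand:
        servers[chosen][r] -= demand[r]
    return chosen
```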

As a further improvement of the invention, the method also comprises the following steps:

if the current service fails to be deployed in both the resource pool and the server, ending the deployment of the current service and continuing with the deployment of the next service;

and if the current service fails to be deployed in both the resource pool and the server, deleting the current service from the three service lists.

As a further improvement of the present invention, before step S1, the following steps are further included:

according to the requirements of services on different resource types, the services are divided into a CPU intensive service, a memory intensive service, an IO intensive service and a low-load demand service, wherein the CPU intensive service, the memory intensive service and the IO intensive service are resource intensive services.

The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any one of the above methods when executing the program.

The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the methods described above.

The invention has the following beneficial effects:

The service deployment and resource allocation method of the partially decoupled data center can improve the service carrying capacity of the data center while making maximum use of the servers in the data center, thereby improving the resource utilization of the data center.

The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features, and advantages of the invention become more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.

Drawings

FIG. 1 (a) shows a conventional data center architecture;

FIG. 1 (b) shows a fully decoupled data center architecture;

FIG. 1 (c) shows a partially decoupled data center architecture;

FIG. 2 is a flow chart of a method for service deployment and resource allocation in a partially decoupled data center in a preferred embodiment of the present invention;

FIG. 3 is an architecture diagram of a method for service deployment and resource allocation in a partially decoupled data center in a preferred embodiment of the present invention;

FIG. 4 compares the integer linear programming model with the service deployment and resource allocation method of the partially decoupled data center in a preferred embodiment of the present invention in terms of the total number of services deployed;

FIG. 5 shows the performance of the service deployment and resource allocation method of the partially decoupled data center in a preferred embodiment of the present invention and of the first-fit scheme as the decoupling level increases.

Detailed Description

The present invention is further described below in conjunction with the drawings and the embodiments so that those skilled in the art can better understand the present invention and can carry out the present invention, but the embodiments are not to be construed as limiting the present invention.

As shown in fig. 2, a method for service deployment and resource allocation of a partially decoupled data center in a preferred embodiment of the present invention includes the following steps:

S1, receiving a group of services, and sorting the services according to their requirements for different resources to obtain a plurality of service lists;

Specifically, in step S1, the services are sorted according to their requirements for different resources to obtain three service lists: the first service list sorts the services from small to large according to their requirements for the CPU, the second service list sorts the services from small to large according to their requirements for the memory, and the third service list sorts the services from small to large according to their requirements for the external memory. The resources include a CPU, a memory and an external memory. Optionally, each list is generated by giving a high weight to the CPU, the memory, or the external memory, respectively.

S2, taking one service out of the service lists at a time, and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4;

Specifically, following the order of the first service list, the second service list and the third service list, and in ascending order of the services' resource requirements, one service is taken out of the current service list at a time, and it is judged whether the service is a resource-intensive service; if yes, step S3 is executed; otherwise, step S4 is executed.

Whether a service is resource-intensive is determined by its requirements for the different resource types. Each service has different requirements for the various resources; a CPU-intensive service, for example, has a large demand for CPU while its demands for other resources are ordinary. With limited data center resources, our goal is to maximize the number of services deployed. The criterion for classifying a service as resource-intensive can be defined manually and adjusted to the actual scenario.
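Since the classification criterion is left open, the following is only one possible sketch. It assumes thresholds of 16 CPU cores, 64 GB of memory and 512 GB of external memory, borrowed from the demand ranges used in the simulation embodiment, treats a demand as intensive only when it strictly exceeds the threshold, and checks CPU, memory and IO in that order; the function name `classify` and the returned labels are hypothetical.

```python
def classify(cpu: int, memory: int, storage: int,
             cpu_limit: int = 16, memory_limit: int = 64,
             storage_limit: int = 512) -> str:
    """Label a service by the first resource whose demand exceeds its limit.

    The limits are assumed values, not a criterion fixed by the method;
    they can be adjusted to the actual scenario.
    """
    if cpu > cpu_limit:
        return "cpu-intensive"
    if memory > memory_limit:
        return "memory-intensive"
    if storage > storage_limit:
        return "io-intensive"
    return "low-load"


def is_resource_intensive(cpu: int, memory: int, storage: int) -> bool:
    """True for CPU-, memory- and IO-intensive services."""
    return classify(cpu, memory, storage) != "low-load"
```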

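Optionally, before step S1, the following steps are further included:

According to the requirements of services for different resource types, the services are divided into CPU-intensive services, memory-intensive services, IO-intensive services and low-load services, wherein the CPU-intensive, memory-intensive and IO-intensive services are resource-intensive services.

S3, attempting to deploy the service using a decoupling module in the resource pool corresponding to the service; the first service list corresponds to a CPU resource pool containing CPU modules, the second service list corresponds to a memory resource pool containing memory modules, and the third service list corresponds to an external memory resource pool containing external memory modules.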

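Specifically, step S3 includes:

S31, excluding all fully loaded decoupling modules in the corresponding resource pool;

S32, searching the corresponding resource pool for a single decoupling module that meets the resource requirement of the service; if one exists, deploying the service using that decoupling module; otherwise, executing step S33;

S33, judging whether the remaining decoupling modules in the corresponding resource pool can together meet the resource requirement of the service; if so, distributing the current service across a plurality of decoupling modules for deployment; otherwise, deployment on the resource pool fails, and step S4 is executed;

S4, attempting to deploy the service using a server;

Specifically, step S4 includes:

S41, excluding servers whose remaining resources cannot meet the resource requirement of the service;

S42, sorting the servers according to the different remaining resources in the servers to obtain a plurality of server lists;

Specifically, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers from small to large according to their remaining CPU resources, the second server list sorts the servers from small to large according to their remaining memory resources, and the third server list sorts the servers from small to large according to their remaining external memory resources.

S43, judging whether a server meeting the resource requirement of the service exists in the server list corresponding to the service; if so, deploying the service using that server; otherwise, deployment on the server fails, and step S3 is executed.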

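In some embodiments, the method further comprises the following step:

If the current service fails to be deployed in both the resource pool and the server, the deployment of the current service is ended and the deployment of the next service continues.

Further, the method also comprises the following step:

If the current service fails to be deployed in both the resource pool and the server, the current service is deleted from the three service lists.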

In the invention, when a service is resource-intensive, the method first checks whether a resource pool can be used for service deployment and resource allocation, and looks for an available server only when the resources in the pool cannot meet the service demand. When a service is not resource-intensive, the method first looks for a server on which it can be deployed, and tries to deploy it using the resource pool only when no server can meet the demand.

Specifically, referring to fig. 3, the services are first sorted according to the amount of CPU they require. For services with a low demand for resources, we first try to deploy them on servers: we examine the sorted list of servers and select a server with sufficient resources. For memory-intensive and CPU-intensive services, we first try to deploy them using a resource pool. If a single module in the resource pool can satisfy the resource requirement, we use that module to deploy the service. If not, we deploy the service using the remaining resources of several modules in the resource pool.
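The order of attempts described above can be sketched as a small dispatcher. The function name `deploy_service` and the use of two callables standing in for the pool attempt and the server attempt are assumptions of this sketch. Each target is tried at most once, so a service that fails on both is simply rejected and the mutual references between the pool step and the server step cannot loop.

```python
from typing import Callable


def deploy_service(is_intensive: bool,
                   try_pool: Callable[[], bool],
                   try_server: Callable[[], bool]) -> bool:
    """Try the preferred target first and fall back to the other one once.

    Resource-intensive services prefer the resource pool; all other
    services prefer a server. Returns True if either attempt succeeds.
    """
    if is_intensive:
        first, second = try_pool, try_server
    else:
        first, second = try_server, try_pool
    # `or` short-circuits, so the fallback runs only if the first attempt fails.
    return first() or second()
```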

By maintaining a plurality of service lists and a plurality of server lists in which the services and the servers are kept sorted, the invention effectively improves the efficiency of service deployment and resource allocation.

The service deployment and resource allocation method of the partially decoupled data center can improve the service carrying capacity of the data center while making maximum use of the servers in the data center, thereby improving the resource utilization of the data center.

In one embodiment, we consider three types of resource modules: a 32-core CPU module, a 128 GB memory module and a 1024 GB disk module. Each server is assumed to contain one CPU module, one memory module and one disk module, and each decoupled resource pool is a collection of resource modules of one type. We consider decoupling levels of 0%, 10%, ..., 100% and simulate two cases. Case 1 has 10 servers, corresponding to 30 resource modules, 10 of each resource type. Case 2 has 1000 servers, corresponding to 3000 modules, 1000 of each resource type. According to the decoupling level, a certain proportion of the servers is decomposed and their resource modules are aggregated into resource pools of the respective types. For example, in case 1 with a decoupling level of 30%, 3 servers are decomposed into 9 resource modules, 3 of each resource type.

Furthermore, we group the four service types into two major categories: regular services and resource-intensive services. The resource requirements of regular services are distributed in the ranges of [1,16] CPU cores, [1,64] GB of memory and [1,512] GB of disk space. In contrast, a resource-intensive service has a high demand for one type of resource while its other demands remain normal. For example, a CPU-intensive service requires CPU resources in the range of [16,32] cores, with normal memory and disk space requirements. Memory-intensive and IO-intensive services require memory in the [64,128] GB range and disk space in the [512,1024] GB range, respectively. The four service types are generated in equal numbers.
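The decomposition arithmetic of the simulation setup can be checked with a short sketch. The module sizes come from the embodiment; the function name `partition`, the integer percentage argument, and the rounding down of fractional server counts are assumptions (the levels and server counts used in the embodiment always give whole numbers).

```python
from typing import Dict

# Module sizes from the embodiment: 32 cores, 128 GB memory, 1024 GB disk.
MODULE_CAPACITY = {"cpu": 32, "memory": 128, "disk": 1024}


def partition(num_servers: int, decoupling_percent: int) -> Dict[str, object]:
    """Split a data center into intact servers and pooled resource modules."""
    if not 0 <= decoupling_percent <= 100:
        raise ValueError("decoupling_percent must be between 0 and 100")
    # Each decomposed server contributes one module to each of the three pools.
    decomposed = num_servers * decoupling_percent // 100
    return {
        "servers": num_servers - decomposed,
        "modules_per_pool": decomposed,
        "pool_capacity": {r: decomposed * c for r, c in MODULE_CAPACITY.items()},
    }
```

For case 1 at a decoupling level of 30%, `partition(10, 30)` leaves 7 intact servers and 3 modules in each of the three pools, that is, 9 modules in total, matching the example in the text.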

Based on case 1, fig. 4 compares the integer linear programming model and the method of the present invention in terms of the total number of services deployed, where 50 services are offered for deployment.

The objective of the integer linear programming model is to maximize the number of services deployed, subject to the following constraints.

(1) Virtual data center service constraints: each virtual data center service is considered to be successfully deployed only if it is provided with sufficient resources.

(2) Server allocation restriction: although a server can accommodate multiple services, the total resources allocated to these services cannot exceed the capacity of the server.

(3) Resource module allocation constraints: a resource module may provide resources for multiple services and if a service requires more resources, it may use the resources of multiple modules. However, the total resources allocated to the service should not exceed the total capacity of the resource modules.

(4) Server and resource pool separation constraints: if a service has been allocated resources from resource modules, it cannot also be allocated resources from a server, and vice versa.
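The text does not give the model in symbols. One possible formalization consistent with the objective and constraints (1) to (4) is sketched below; all notation is assumed for illustration. Here S is the set of services, V the set of servers, R the set of resource types, and M_r the set of modules of type r; d_s^r is the demand of service s for resource r, C_v^r and C_m^r are the capacities of server v and module m; x_{s,v} = 1 if service s is deployed on server v, y_s = 1 if service s is deployed on the resource pools, and a_{s,m}^r is the amount of resource r that module m allocates to service s.

```latex
\begin{align}
\max\quad & \sum_{s \in S} \Big( y_s + \sum_{v \in V} x_{s,v} \Big) \\
\text{s.t.}\quad
& y_s + \sum_{v \in V} x_{s,v} \le 1 && \forall s \in S \\
& \sum_{s \in S} d_s^r \, x_{s,v} \le C_v^r && \forall v \in V,\ r \in R \\
& \sum_{m \in M_r} a_{s,m}^r = d_s^r \, y_s && \forall s \in S,\ r \in R \\
& \sum_{s \in S} a_{s,m}^r \le C_m^r && \forall r \in R,\ m \in M_r \\
& x_{s,v},\, y_s \in \{0,1\}, \quad a_{s,m}^r \ge 0
\end{align}
```

In this sketch the second line expresses the separation constraint (4), the third line the server allocation constraint (2), and the fourth and fifth lines together express constraints (1) and (3): a pool-deployed service receives its full demand, possibly from several modules, without exceeding any module's capacity.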

We note that as the decoupling ratio increases, the total number of services deployed gradually increases. This is because a higher degree of decoupling frees more resource modules from the servers to form larger resource pools, which lets more services share these resources so that they are better utilized. Furthermore, the performance of the method of the present invention is very close to that of the integer linear programming model, which confirms the efficiency of the method. We also compared the method of the present invention with the first-fit (FF) scheme. Here, the FF scheme first attempts to deploy each service using a server, without sorting the services, servers, or resource modules in advance, and uses a resource pool only if that attempt fails. The proposed method is clearly superior to the FF scheme, deploying up to 31.6% more services.

We also evaluated the efficiency of the proposed method in the large-scale data center scenario of case 2 (with 5000 services). In this case, since the integer linear programming model is difficult to solve, we do not provide its results and compare the proposed method only with the first-fit scheme. Fig. 5 shows the performance of the proposed method and the first-fit scheme as the degree of decoupling increases, where "IR" denotes the relative increase in the number of services deployed by the proposed method compared with the first-fit scheme. As in case 1, the number of successfully deployed services increases with the decoupling ratio. Furthermore, the proposed method outperforms the first-fit scheme by more than 10%, and this improvement becomes more pronounced as the decoupling ratio increases. This is because the proposed method performs the sorting process before attempting to deploy a service using a server, whereas the first-fit scheme does not.
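The "IR" metric is described only in words. Under the assumption that it is the relative gain in deployed services over the baseline scheme, expressed as a percentage, it could be computed as in the following sketch; the function name `improvement_ratio` is hypothetical, and the counts in the usage note are illustrative rather than results from the figures.

```python
def improvement_ratio(proposed: int, baseline: int) -> float:
    """Relative increase (in percent) of deployed services over the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (proposed - baseline) / baseline
```

For instance, with illustrative counts of 110 services deployed by the proposed method and 100 by the baseline, the function returns 10.0.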

The preferred embodiment of the present invention also discloses an electronic device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the program to implement the steps of the method in the above embodiments.

The preferred embodiment of the present invention also discloses a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method described in the above embodiments.

The above embodiments are merely preferred embodiments for fully illustrating the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions or modifications made by those skilled in the art on the basis of the invention all fall within the protection scope of the invention. The protection scope of the invention is subject to the claims.

Claims

1. The service deployment and resource allocation method of the partially decoupled data center is characterized by comprising the following steps of:

S31, eliminating all full-load decoupling modules in the corresponding resource pool;

2. The method for service deployment and resource allocation of the partially decoupled data center according to claim 1, wherein in step S1, the services are ordered according to the requirements of the services for different resources to obtain three service lists, the first service list is ordered according to the requirements of the services for the CPU from small to large, the second service list is ordered according to the requirements of the services for the memory from small to large, and the third service list is ordered according to the requirements of the services for the external memory from small to large.

3. The method for service deployment and resource allocation in a partially decoupled data center of claim 2, wherein step S2 comprises: taking out a service from the current service list each time according to the sequence of the first service list, the second service list and the third service list and the requirement of the service on resources from small to large, and judging whether the service is a resource intensive service; if yes, executing step S3; otherwise, step S4 is executed.

4. The service deployment and resource allocation method for the partially decoupled data center according to claim 2, wherein in step S3, the first service list corresponds to a CPU resource pool including CPU modules, the second service list corresponds to a memory resource pool including memory modules, and the third service list corresponds to an external memory resource pool including external memory modules.

5. The method for service deployment and resource allocation in a partially decoupled data center according to claim 2, wherein in step S42, the servers are sorted according to different remaining resources in the servers to obtain three server lists, a first server list is used to sort the servers according to the remaining CPU resources from small to large, a second server list is used to sort the servers according to the remaining memory resources from small to large, and a third server list is used to sort the servers according to the remaining external memory resources from small to large.

6. The method for service deployment and resource allocation of the partially decoupled data center according to claim 1, further comprising the steps of:

7. The method for service deployment and resource allocation in a partially decoupled data center of claim 6, further comprising the steps of:

8. The method for service deployment and resource allocation of a partially decoupled data center according to claim 1, wherein before step S1, further comprising the steps of:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1-8 are implemented when the program is executed by the processor.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.