CN115454598B - Service deployment and resource allocation method for partial decoupling data center

Info

Publication number
CN115454598B
Authority
CN
China
Prior art keywords
service
resource
server
data center
deployment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211102743.0A
Other languages
Chinese (zh)
Other versions
CN115454598A (en)
Inventor
沈纲祥
刘志豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202211102743.0A
Priority to PCT/CN2022/137172 (WO2024051012A1)
Publication of CN115454598A
Application granted
Publication of CN115454598B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a service deployment and resource allocation method for a partially decoupled data center. The method first judges whether the current service is a resource-intensive service. If so, it preferentially checks whether a resource pool can be used for service deployment and resource allocation, and looks for a usable server only if the resources in the resource pool cannot meet the service demand. Otherwise, it first looks for a server on which the service can be deployed, and attempts to use the resource pool only when no server can meet the demand. The method can increase the number of services the data center can carry, make maximal use of the servers in the data center, and improve the resource utilization of the data center.

Description

Service deployment and resource allocation method for partial decoupling data center
Technical Field
The invention relates to the technical field of service deployment, in particular to a service deployment and resource allocation method of a partially decoupled data center.
Background
A conventional data center is composed of servers and a private network interconnecting them, as shown in fig. 1 (a), in which each server integrates its own various resources such as CPU, memory and disk space. While a data center can accomplish complex tasks and serve thousands of users at the same time, its resource utilization is often inefficient because the resources are tightly coupled within each server. For example, while one type of resource (e.g., a CPU) may be fully utilized in a server, other types of resources (e.g., disk space) may be rarely used. This can lead to wastage of server resources. In addition, as the size of data centers increases, conventional data centers suffer from high cost, high power consumption, and may not be easily upgradeable when needed.
In view of these problems, an alternative data center architecture, called a fully decoupled data center, has recently been proposed, which better utilizes resources by decoupling them as shown in fig. 1 (b). Specifically, the resources in each server are decomposed and the same type of resources are arranged/grouped into a resource pool. These resource pools are interconnected using a private network with high capacity and low latency. The decoupling enables different types of resources to be independently upgraded and expanded, and the overall resource utilization rate is greatly improved.
For this evolution of data centers from traditional data centers to fully decoupled data centers, there may be intermediate stages in which some resources are still provided by old servers as before, while others are provided as a pool of resources formed after decoupling. We refer to this type of data center as a partially decoupled data center, as shown in fig. 1 (c).
Currently, there is a great deal of literature on fully decoupled data centers, and resource decoupling has been shown to improve data center performance in various respects. In recent years, however, data centers in China have still been built as traditional data centers based on traditional servers, so the evolution towards full decoupling is ongoing. During this evolution, partially decoupled data centers, in which old servers and newly decoupled resources coexist, have not been considered. How to achieve efficient service deployment and resource allocation in a partially decoupled data center is therefore a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to provide a highly feasible and efficient service deployment and resource allocation method for a partially decoupled data center.
In order to solve the above problems, the present invention provides a service deployment and resource allocation method for a partially decoupled data center, which includes the following steps:
S1, receiving a group of services, and sorting the services according to their demands for different resources to obtain a plurality of service lists;
S2, taking one service at a time from the service lists, and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4;
S3, attempting to deploy the service using decoupling modules in the resource pool corresponding to the service, which includes the following steps:
S31, first excluding all fully loaded decoupling modules in the corresponding resource pool;
S32, searching the corresponding resource pool for a single decoupling module that meets the resource demand of the service; if one exists, deploying the service using that decoupling module; otherwise, executing step S33;
S33, judging whether the remaining decoupling modules in the corresponding resource pool can together meet the resource demand of the service; if so, distributing the current service across a plurality of decoupling modules for deployment; otherwise, the resource pool deployment fails, and step S4 is executed;
S4, attempting to deploy the service using a server, which includes the following steps:
S41, first excluding servers whose remaining resources cannot meet the resource demand of the service;
S42, sorting the servers according to the different remaining resources in the servers to obtain a plurality of server lists;
S43, judging whether the server list corresponding to the service contains a server that meets the resource demand of the service; if so, deploying the service using that server; otherwise, the server deployment fails, and step S3 is executed.
As a further improvement of the present invention, in step S1, the services are sorted according to their demands for different resources to obtain three service lists: the first service list sorts the services by CPU demand from small to large, the second service list sorts them by memory demand from small to large, and the third service list sorts them by external memory demand from small to large.
As a further improvement of the present invention, step S2 includes: proceeding through the first service list, the second service list and the third service list in that order, and within each list in order of increasing resource demand, taking one service at a time from the current service list and judging whether it is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4.
As a further improvement of the present invention, in step S3, the first service list corresponds to a CPU resource pool containing CPU modules, the second service list corresponds to a memory resource pool containing memory modules, and the third service list corresponds to an external memory resource pool containing external memory modules.
As a further improvement of the present invention, in step S42, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers by remaining CPU resources from small to large, the second server list sorts them by remaining memory resources from small to large, and the third server list sorts them by remaining external memory resources from small to large.
As a further improvement of the invention, the method further comprises the following steps:
if the current service fails to be deployed in both the resource pool and the server, ending the deployment of the current service and continuing to deploy the next service.
As a further improvement of the invention, the method further comprises the following steps:
if the current service fails to be deployed in both the resource pool and the server, the current service is deleted from the three service lists.
As a further improvement of the present invention, before step S1, the following steps are further included:
according to the demands of the services on different resource types, the services are divided into CPU-intensive services, memory-intensive services, IO-intensive services and low-load demand-type services, wherein the CPU-intensive services, the memory-intensive services and the IO-intensive services are resource-intensive services.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any one of the methods described above when executing the program.
The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods described above.
The invention has the beneficial effects that:
The service deployment and resource allocation method for the partially decoupled data center can increase the number of services the data center can carry, make maximal use of the servers in the data center, and improve the resource utilization of the data center.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 (a) shows a conventional data center architecture;
FIG. 1 (b) shows a fully decoupled data center architecture;
FIG. 1 (c) shows a partially decoupled data center architecture;
FIG. 2 is a flow chart of a method for service deployment and resource allocation for a partially decoupled data center in accordance with a preferred embodiment of the present invention;
FIG. 3 is a block diagram of a method for service deployment and resource allocation for a partially decoupled data center in accordance with a preferred embodiment of the present invention;
FIG. 4 compares the integer linear programming model with the service deployment and resource allocation method of the preferred embodiment of the present invention in terms of the total number of deployed services;
FIG. 5 shows the performance of the service deployment and resource allocation method of the preferred embodiment of the present invention and of the first-fit (FF) scheme as the degree of decoupling increases.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the invention and practice it.
As shown in fig. 2, a method for service deployment and resource allocation of a partially decoupled data center according to a preferred embodiment of the present invention includes the following steps:
S1, receiving a group of services, and sorting the services according to their demands for different resources to obtain a plurality of service lists;
Specifically, in step S1, the services are sorted according to their demands for different resources to obtain three service lists: the first service list sorts the services by CPU demand from small to large, the second service list sorts them by memory demand from small to large, and the third service list sorts them by external memory demand from small to large. The resources comprise CPU, memory and external memory. Alternatively, the corresponding lists are generated by giving a high weight to the CPU, memory and external memory, respectively.
S2, taking one service at a time from the service lists, and judging whether the service is a resource-intensive service; if yes, executing step S3; otherwise, executing step S4;
Specifically, proceeding through the first service list, the second service list and the third service list in that order, and within each list in order of increasing resource demand, one service at a time is taken from the current service list and it is judged whether the service is a resource-intensive service; if yes, step S3 is executed; otherwise, step S4 is executed.
Whether a service is resource-intensive is determined by its demands for the different resource types. Each service has different demands for the various resources; a CPU-intensive service, for example, has a high demand for CPU, while its demands for other resources are ordinary. With limited data center resources, our goal is to maximize the number of deployed services. The criteria for classifying a service as resource-intensive may be defined manually or adjusted according to the actual scenario.
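The three service lists of step S1 can be sketched as follows. This is only an illustrative sketch: the `Service` record, its field names and the `build_service_lists` helper are hypothetical and do not appear in the patent, and the optional weighting variant is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical service record; field names are assumptions."""
    name: str
    cpu: int   # CPU cores requested
    mem: int   # GB of memory requested
    disk: int  # GB of external memory (disk) requested


def build_service_lists(services):
    """Return three views of the same services, each sorted from small
    to large demand on one resource (step S1)."""
    return {
        "cpu": sorted(services, key=lambda s: s.cpu),
        "mem": sorted(services, key=lambda s: s.mem),
        "disk": sorted(services, key=lambda s: s.disk),
    }


services = [
    Service("a", cpu=20, mem=8, disk=100),
    Service("b", cpu=4, mem=96, disk=50),
    Service("c", cpu=8, mem=16, disk=900),
]
lists = build_service_lists(services)
print([s.name for s in lists["cpu"]])
```

Each service appears in all three lists, which is why the method later deletes a failed service from every list.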
Optionally, before step S1, the method further includes the following steps:
according to the demands of the services on different resource types, the services are divided into CPU-intensive services, memory-intensive services, IO-intensive services and low-load demand-type services, wherein the CPU-intensive services, the memory-intensive services and the IO-intensive services are resource-intensive services.
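One possible way to express this classification in code is shown below. The thresholds are assumptions borrowed from the demand ranges used later in the simulation (16 cores, 64 GB of memory, 512 GB of disk); the text itself says the criteria may be defined manually. The tie-breaking order when more than one threshold is exceeded is also an assumption.

```python
# Assumed thresholds, taken from the simulation ranges; not fixed by the method.
CPU_THRESHOLD = 16    # cores
MEM_THRESHOLD = 64    # GB
DISK_THRESHOLD = 512  # GB


def classify(cpu, mem, disk):
    """Label a service by its dominant resource demand.

    Assumption: CPU is checked first, then memory, then disk.
    """
    if cpu > CPU_THRESHOLD:
        return "cpu-intensive"
    if mem > MEM_THRESHOLD:
        return "memory-intensive"
    if disk > DISK_THRESHOLD:
        return "io-intensive"
    return "low-load"


def is_resource_intensive(kind):
    """The three intensive classes count as resource-intensive services."""
    return kind != "low-load"


print(classify(cpu=24, mem=32, disk=100))
```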
S3, attempting to deploy the service using decoupling modules in the resource pool corresponding to the service; the first service list corresponds to a CPU resource pool containing CPU modules, the second service list corresponds to a memory resource pool containing memory modules, and the third service list corresponds to an external memory resource pool containing external memory modules.
Specifically, step S3 includes:
S31, first excluding all fully loaded decoupling modules in the corresponding resource pool;
S32, searching the corresponding resource pool for a single decoupling module that meets the resource demand of the service; if one exists, deploying the service using that decoupling module; otherwise, executing step S33;
S33, judging whether the remaining decoupling modules in the corresponding resource pool can together meet the resource demand of the service; if so, distributing the current service across a plurality of decoupling modules for deployment; otherwise, the resource pool deployment fails, and step S4 is executed;
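Steps S31 to S33 can be sketched for a single resource pool as below. The function name and data layout are hypothetical. The patent does not state which module is chosen when several single modules fit, nor in what order a service is split across modules; the best-fit choice and the largest-first split here are assumptions made only to keep the sketch concrete.

```python
def deploy_in_pool(free, demand):
    """Try to place `demand` units of one resource in a pool.

    free: list of remaining capacity per decoupling module (mutated on success).
    Returns {module_index: amount} on success, or None if the pool fails.
    """
    # S31: exclude fully loaded modules.
    candidates = [i for i, f in enumerate(free) if f > 0]

    # S32: look for a single module that can hold the whole demand.
    # Assumption: pick the smallest sufficient module (best fit).
    fitting = [i for i in candidates if free[i] >= demand]
    if fitting:
        i = min(fitting, key=lambda j: free[j])
        free[i] -= demand
        return {i: demand}

    # S33: otherwise check whether the remaining modules together suffice.
    if sum(free[i] for i in candidates) < demand:
        return None  # pool deployment fails; the caller falls back to S4

    # Assumption: split across modules starting with the largest remainder.
    allocation, remaining = {}, demand
    for i in sorted(candidates, key=lambda j: free[j], reverse=True):
        take = min(free[i], remaining)
        allocation[i] = take
        free[i] -= take
        remaining -= take
        if remaining == 0:
            break
    return allocation


cpu_pool = [0, 10, 32]  # remaining cores per CPU module
print(deploy_in_pool(cpu_pool, 8))
```

The sketch handles one resource type; applying the same routine to each resource pool in turn would be a further assumption about how the remaining demands of the service are served.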
S4, attempting to deploy the service using a server, which includes the following steps:
S41, first excluding servers whose remaining resources cannot meet the resource demand of the service;
S42, sorting the servers according to the different remaining resources in the servers to obtain a plurality of server lists;
Specifically, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers by remaining CPU resources from small to large, the second server list sorts them by remaining memory resources from small to large, and the third server list sorts them by remaining external memory resources from small to large.
S43, judging whether the server list corresponding to the service contains a server that meets the resource demand of the service; if so, deploying the service using that server; otherwise, the server deployment fails, and step S3 is executed.
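A minimal sketch of steps S41 to S43 follows. The `Server` record and `deploy_on_server` helper are hypothetical names. For brevity the sketch builds only the one server list that corresponds to the service (selected by `key`) instead of all three, and it assumes that the first server in the ascending list is the one chosen.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server record holding remaining resources."""
    name: str
    cpu: int   # remaining cores
    mem: int   # remaining GB of memory
    disk: int  # remaining GB of external memory


def deploy_on_server(servers, demand, key):
    """demand: dict with 'cpu', 'mem', 'disk'; key: resource whose server
    list corresponds to the service. Returns the chosen server name or None."""
    # S41: exclude servers whose remaining resources cannot meet the demand.
    feasible = [
        s for s in servers
        if s.cpu >= demand["cpu"] and s.mem >= demand["mem"] and s.disk >= demand["disk"]
    ]
    if not feasible:
        return None  # server deployment fails; the caller falls back to S3

    # S42: sort from small to large by the remaining amount of the key resource.
    feasible.sort(key=lambda s: getattr(s, key))

    # S43: assumption: take the first (tightest) server in the list.
    chosen = feasible[0]
    chosen.cpu -= demand["cpu"]
    chosen.mem -= demand["mem"]
    chosen.disk -= demand["disk"]
    return chosen.name


fleet = [Server("s1", 32, 128, 1024), Server("s2", 8, 64, 512), Server("s3", 4, 16, 100)]
print(deploy_on_server(fleet, {"cpu": 6, "mem": 32, "disk": 200}, key="cpu"))
```

Choosing the tightest feasible server leaves the larger servers free for later, more demanding services, which is one plausible reading of why the lists are sorted from small to large.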
In some embodiments, the method further comprises the steps of:
if the current service fails to be deployed in both the resource pool and the server, ending the deployment of the current service and continuing to deploy the next service.
Further, the method also comprises the following steps:
if the current service fails to be deployed in both the resource pool and the server, the current service is deleted from the three service lists.
In the invention, when a service is a resource-intensive service, the method preferentially checks whether the resource pool can be used for service deployment and resource allocation, and looks for a usable server only when the resources in the resource pool cannot meet the service demand. If the service is not resource-intensive, the method first looks for a server on which it can be deployed, and attempts to use the resource pool only when no server can meet the demand.
Specifically, referring to fig. 3, the services are first sorted according to the number of CPUs required. For services with lower resource demands, we first attempt to provision servers for them, examining the sorted server list to select a server with sufficient resources. For memory-intensive and CPU-intensive services, we try to deploy them using a resource pool. If a single module in the resource pool can meet the resource demand, we use it first to serve the service. If not, we use the remaining resources in the resource pool to deploy the service.
By maintaining a plurality of sorted service lists and server lists, the invention can effectively improve the efficiency of service deployment and resource allocation.
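The ordering logic described above (pool first for resource-intensive services, server first otherwise, each with a single fallback) can be sketched as a small dispatcher. The function and parameter names are hypothetical; `try_pool` and `try_server` stand in for the pool and server deployment procedures and are assumed to return a truthy value on success.

```python
def place(is_intensive, try_pool, try_server):
    """Return 'pool' or 'server' for the target that accepted the service,
    or None if both attempts fail (the service is then skipped)."""
    attempts = [("pool", try_pool), ("server", try_server)]
    if not is_intensive:
        # Non-intensive services try servers first, then the resource pool.
        attempts.reverse()
    for label, attempt in attempts:
        if attempt():
            return label
    return None


# A resource-intensive service whose pool attempt fails falls back to a server.
print(place(True, try_pool=lambda: False, try_server=lambda: True))
```

Because each target is tried at most once per service, the mutual fallback between steps S3 and S4 cannot loop.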
The service deployment and resource allocation method for the partially decoupled data center can improve the service bearing capacity of the data center, maximally utilize the servers in the data center and improve the resource utilization rate of the data center.
In one embodiment, we consider three types of resource modules: a 32-core CPU module, a 128 GB memory module and a 1024 GB disk module. It is assumed that each server has one CPU module, one memory module and one disk module, and that each decoupled resource pool is a collection of resource modules of one type. We consider decoupling levels of 0%, 10%, ..., 100%. Two cases are simulated. Case 1 has 10 servers, corresponding to 30 resource modules, 10 for each resource type. Case 2 has 1000 servers, corresponding to 3000 modules, 1000 for each resource type. Depending on the decoupling level, a proportion of the servers are decomposed and their resource modules are aggregated into resource pools of the corresponding types. For example, in case 1 with a decoupling level of 30%, 3 servers are decomposed into 9 resource modules, 3 for each resource type. Furthermore, we divide the four service types into two main categories: conventional services and resource-intensive services. The resource demands of a conventional service are distributed in the ranges of [1,16] CPU cores, [1,64] GB of memory and [1,512] GB of disk. In contrast, a resource-intensive service has a high demand for one type of resource, while its other resource demands remain normal. For example, a CPU-intensive service requires CPU resources in the range of [16,32] cores, while its memory and disk demands are normal. Memory-intensive services and IO-intensive services require memory in the range of [64,128] GB and disk space in the range of [512,1024] GB, respectively. We generate the same number of services for each of the four types.
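The simulation setup can be reproduced roughly as follows. Module sizes and demand ranges come from the paragraph above; the helper names, the round-down rule for fractional server counts and the use of uniformly distributed integer demands are assumptions, since the text does not state them.

```python
import random

# Per-module capacities quoted in the text.
MODULE_CAPACITY = {"cpu": 32, "mem": 128, "disk": 1024}

# Demand ranges quoted in the text for conventional and intensive demands.
NORMAL = {"cpu": (1, 16), "mem": (1, 64), "disk": (1, 512)}
INTENSIVE = {"cpu": (16, 32), "mem": (64, 128), "disk": (512, 1024)}


def partition(num_servers, decoupling_percent):
    """Split a data center into remaining servers and resource pools."""
    decomposed = num_servers * decoupling_percent // 100  # assumption: round down
    return {
        "servers": num_servers - decomposed,
        "modules_per_pool": decomposed,
        "pool_capacity": {r: decomposed * c for r, c in MODULE_CAPACITY.items()},
    }


def generate_demand(intensive_resource, rng):
    """intensive_resource: 'cpu', 'mem', 'disk', or None for a conventional service."""
    demand = {}
    for resource in NORMAL:
        low, high = INTENSIVE[resource] if resource == intensive_resource else NORMAL[resource]
        demand[resource] = rng.randint(low, high)  # assumption: uniform integers
    return demand


print(partition(10, 30))  # case 1 at a 30% decoupling level
print(generate_demand("cpu", random.Random(0)))
```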
Based on case 1, fig. 4 compares the results of the integer linear programming model and the method of the present invention in terms of the total number of services deployed, where the total number of services that need to be deployed is 50.
The objective of the integer linear programming model is to maximize the number of deployed services, subject to the following constraints.
(1) Virtual data center service constraint: each virtual data center service is considered to be successfully deployed only if it is provided with sufficient resources.
(2) Server allocation constraint: although a server may accommodate multiple services, the total resources allocated to these services cannot exceed the capacity of the server.
(3) Resource module allocation constraint: one resource module may provide resources for multiple services, and a service that requires more resources may use the resources of multiple modules. However, the total resources allocated to the services should not exceed the total capacity of the resource module.
(4) Server and resource pool separation constraint: if a service has been allocated the resources of a resource module, it cannot also be allocated the resources of a server, and vice versa.
We note that as the decoupling proportion increases, the total number of deployed services increases gradually. This is because a higher degree of decoupling frees more resource modules from the servers to form a larger resource pool, which allows more services to share these resources so that they are better utilized. Furthermore, we note that the performance of the method of the present invention is very close to that of the integer linear programming model, which demonstrates the efficiency of the method. We also compared the method of the present invention with a first-fit (FF) scheme. Here, the FF scheme first attempts to deploy a service using a server, without sorting the services, the servers or the resource modules in advance; if unsuccessful, it deploys the service using a resource pool. We note that the proposed method is clearly superior to the first-fit scheme, deploying up to 31.6% more services.
We also evaluated the efficiency of the proposed method in the large-scale data center scenario of case 2 (with 5000 services). In this case, since the integer linear programming model is difficult to solve, we do not provide its results and instead compare with the first-fit scheme. Fig. 5 shows the performance of the proposed method and the first-fit scheme as the degree of decoupling increases, where "IR" denotes the rate of increase in the number of deployed services for the proposed method compared with the first-fit scheme. As in case 1, the number of successfully deployed services increases with the decoupling proportion. Furthermore, we note that the proposed method is more than 10% better than the first-fit scheme, and this improvement becomes more pronounced as the decoupling proportion increases. This is because the proposed method performs the sorting process before attempting to deploy services using the servers, whereas the first-fit scheme does not.
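The patent does not give the model in symbols. One possible formulation consistent with constraints (1) to (4) is sketched below; all notation is assumed for illustration. Here $I$ is the set of services, $S$ the set of servers, $R$ the set of resource types, $M_r$ the set of modules in the pool for resource $r$, $d_i^r$ the demand of service $i$ for resource $r$, and $C_s^r$, $C_m^r$ the capacities of server $s$ and module $m$. The binary variables $x_i$, $y_{i,s}$ and $z_i$ indicate that service $i$ is deployed, is deployed on server $s$, or is deployed on the resource pools, and $a_{i,m}^r$ is the amount of resource $r$ that module $m$ gives to service $i$.

```latex
\begin{align}
\max \quad & \sum_{i \in I} x_i \\
\text{s.t.} \quad
& x_i = \sum_{s \in S} y_{i,s} + z_i \le 1, && \forall i \in I \\
& \sum_{i \in I} d_i^r \, y_{i,s} \le C_s^r, && \forall s \in S,\ r \in R \\
& \sum_{m \in M_r} a_{i,m}^r = d_i^r \, z_i, && \forall i \in I,\ r \in R \\
& \sum_{i \in I} a_{i,m}^r \le C_m^r, && \forall r \in R,\ m \in M_r \\
& x_i,\, y_{i,s},\, z_i \in \{0,1\}, \quad a_{i,m}^r \in \mathbb{Z}_{\ge 0}
\end{align}
```

In this sketch the second line covers the separation constraint (4) and, together with the fourth line, the sufficiency constraint (1); the third and fifth lines correspond to constraints (2) and (3).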
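The FF baseline described above can be sketched in a few lines. The data layout is hypothetical, and as a simplifying assumption each resource pool is modelled by its aggregate remaining capacity, ignoring module boundaries (a service may span several modules, so only the total matters for feasibility in this sketch).

```python
def first_fit(services, servers, pools):
    """Count services deployed by a first-fit baseline.

    services: list of (cpu, mem, disk) demands, processed in arrival order.
    servers:  list of [cpu, mem, disk] remaining capacities (mutated).
    pools:    [cpu, mem, disk] aggregate remaining pool capacity (mutated).
    """
    deployed = 0
    for demand in services:  # no sorting of services, servers or modules
        for server in servers:
            if all(free >= need for free, need in zip(server, demand)):
                for k, need in enumerate(demand):
                    server[k] -= need
                deployed += 1
                break
        else:
            # No server fits: fall back to the resource pools.
            if all(free >= need for free, need in zip(pools, demand)):
                for k, need in enumerate(demand):
                    pools[k] -= need
                deployed += 1
    return deployed


demands = [(16, 64, 512), (20, 64, 512), (20, 64, 512)]
print(first_fit(demands, servers=[[32, 128, 1024]], pools=[32, 128, 1024]))
```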
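The text does not spell out how the increase rate is computed. A natural reading, assumed here, is the relative gain in deployed services over the baseline, expressed as a percentage. The function name and the sample counts are purely illustrative and are not results from the patent.

```python
def increase_rate(deployed_proposed, deployed_baseline):
    """Relative gain of the proposed method over the baseline, in percent.

    Assumed definition: (proposed - baseline) / baseline * 100.
    """
    if deployed_baseline <= 0:
        raise ValueError("baseline must deploy at least one service")
    return (deployed_proposed - deployed_baseline) / deployed_baseline * 100


# Illustrative counts only.
print(increase_rate(110, 100))
```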
The preferred embodiment of the invention also discloses an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which processor implements the steps of the method described in the above embodiments when executing the program.
The preferred embodiment of the present invention also discloses a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the above embodiments.
The above embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims (10)

1. The service deployment and resource allocation method of the partial decoupling data center is characterized by comprising the following steps:
s1, receiving a group of services, and sequencing the services according to the requirements of the services on different resources to obtain a plurality of service lists;
s2, extracting a service from the service list each time, and judging whether the service is a resource-intensive service or not; if yes, executing step S3; otherwise, executing the step S4;
s3, attempting to use a decoupling module in a service corresponding resource pool to perform service deployment; the method comprises the following steps:
s31, firstly eliminating all full-load decoupling modules in the corresponding resource pool;
s32, searching whether a single decoupling module meeting the service resource requirement exists in the corresponding resource pool; if yes, deploying the service by using the decoupling module; otherwise, step S33 is performed;
s33, judging whether the remaining decoupling modules in the corresponding resource pools can meet the service resource requirements, if so, distributing the current service to a plurality of decoupling modules for deployment; otherwise, the resource pool deployment fails, and the step S4 is executed;
s4, attempting to use a server to perform service deployment; the method comprises the following steps:
s41, firstly removing servers of which the residual resources can not meet the service resource requirements;
s42, sequencing the servers according to different residual resources in the servers to obtain a plurality of server lists;
s43, judging whether a server meeting the service resource requirement exists in a server list corresponding to the service, if so, deploying the service by using the server; otherwise, the server deployment fails, and step S3 is executed.
2. The method for allocating service and resources of a partially decoupled data center according to claim 1, wherein in step S1, three service lists are obtained by sorting the service according to the demands of the service for different resources, the first service list sorts the service according to the demands of the service for the CPU from small to large, the second service list sorts the service according to the demands of the service for the memory from small to large, and the third service list sorts the service according to the demands of the service for the external.
3. The method for service deployment and resource allocation of a partially decoupled data center according to claim 2, wherein step S2 comprises: according to the sequence of the first service list, the second service list and the third service list and according to the small to large demand of the service on the resource, one service is taken out from the current service list each time, and whether the service is a resource intensive service is judged; if yes, executing step S3; otherwise, step S4 is performed.
4. The method for service deployment and resource allocation of a partially decoupled data center according to claim 2, wherein in step S3, the first task list corresponds to a CPU resource pool including a CPU module, the second task list corresponds to a memory resource pool including a memory module, and the third task list corresponds to a memory resource pool including a memory module.
5. The service deployment and resource allocation method of a partially decoupled data center according to claim 2, wherein in step S42, the servers are sorted according to the different remaining resources in the servers to obtain three server lists: the first server list sorts the servers by remaining CPU resources from small to large, the second server list sorts the servers by remaining memory resources from small to large, and the third server list sorts the servers by remaining external memory resources from small to large.
6. The service deployment and resource allocation method of a partially decoupled data center of claim 1, further comprising the steps of:
if the current service fails to be deployed in both the resource pool and the server, ending the deployment of the current service and continuing to deploy the next service.
7. The service deployment and resource allocation method of a partially decoupled data center of claim 6, further comprising the steps of:
if the current service fails to be deployed in both the resource pool and the server, the current service is deleted from the three service lists.
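The handling of a single service across claims 3, 6 and 7 can be sketched as follows. The callbacks `try_pool` and `try_server` are hypothetical stand-ins for steps S3 and S4, the `intensive` flag is an assumed field, and the fallback from a failed resource pool attempt to a server attempt is inferred from the phrase "fails to be deployed in both" rather than quoted from the claims.

```python
from typing import Callable, Dict, List, Optional


def deploy_one(service: dict,
               service_lists: Dict[str, List[dict]],
               try_pool: Callable[[dict], bool],
               try_server: Callable[[dict], bool]) -> Optional[str]:
    """Return "pool", "server", or None if both attempts fail."""
    # Claim 3: resource-intensive services start at step S3 (resource pool);
    # all other services start at step S4 (server).
    if service["intensive"]:
        attempts = [("pool", try_pool), ("server", try_server)]
    else:
        attempts = [("server", try_server), ("pool", try_pool)]
    for label, attempt in attempts:
        if attempt(service):
            return label
    # Claims 6 and 7: both attempts failed, so stop deploying this service
    # and delete it from all three service lists.
    for lst in service_lists.values():
        lst[:] = [s for s in lst if s is not service]
    return None
```

The caller would then continue with the next service in the current list.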
8. The method for service deployment and resource allocation of a partially decoupled data center according to claim 1, further comprising the step of, prior to step S1:
according to the demands of the services on different resource types, the services are divided into CPU-intensive services, memory-intensive services, IO-intensive services and low-load demand-type services, wherein the CPU-intensive services, the memory-intensive services and the IO-intensive services are resource-intensive services.
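One possible way to realize this classification is sketched below in Python. The numeric thresholds, the rule of picking the resource with the largest demand-to-threshold ratio, and the mapping of IO-intensive services to external memory demand are all assumptions; the claim does not state how the categories are decided.

```python
from typing import Dict

# Hypothetical thresholds; the source does not specify numeric values.
THRESHOLDS = {"cpu": 16, "mem": 64, "ext": 1000}
LABELS = {
    "cpu": "CPU-intensive",
    "mem": "memory-intensive",
    "ext": "IO-intensive",  # assumption: IO intensity tracks external memory demand
}


def classify(demand: Dict[str, int]) -> str:
    """Label a service by its dominant resource demand."""
    # Ratio of demand to threshold for each resource type.
    ratios = {r: demand[r] / THRESHOLDS[r] for r in THRESHOLDS}
    dominant = max(ratios, key=ratios.get)
    if ratios[dominant] < 1.0:
        return "low-load"  # no demand reaches its threshold
    return LABELS[dominant]


def is_resource_intensive(demand: Dict[str, int]) -> bool:
    """CPU-, memory- and IO-intensive services are resource-intensive."""
    return classify(demand) != "low-load"
```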
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1-8 when executing the program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1-8.
CN202211102743.0A 2022-09-09 2022-09-09 Service deployment and resource allocation method for partial decoupling data center Active CN115454598B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211102743.0A CN115454598B (en) 2022-09-09 2022-09-09 Service deployment and resource allocation method for partial decoupling data center
PCT/CN2022/137172 WO2024051012A1 (en) 2022-09-09 2022-12-07 Service deployment and resource allocation method for partially-decoupled data center

Publications (2)

Publication Number Publication Date
CN115454598A (en) 2022-12-09
CN115454598B (en) 2023-06-06

Family

ID=84303685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211102743.0A Active CN115454598B (en) 2022-09-09 2022-09-09 Service deployment and resource allocation method for partial decoupling data center

Country Status (2)

Country Link
CN (1) CN115454598B (en)
WO (1) WO2024051012A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106033373A (en) * 2015-03-11 2016-10-19 苏宁云商集团股份有限公司 A method and a system for scheduling virtual machine resources in a cloud computing platform
CN107135123A (en) * 2017-05-10 2017-09-05 郑州云海信息技术有限公司 A kind of concocting method in the dynamic pond of RACK server resources
CN110647394A (en) * 2018-06-27 2020-01-03 阿里巴巴集团控股有限公司 Resource allocation method, device and equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060242647A1 (en) * 2005-04-21 2006-10-26 Kimbrel Tracy J Dynamic application placement under service and memory constraints
JP4129988B2 (en) * 2005-11-10 2008-08-06 インターナショナル・ビジネス・マシーンズ・コーポレーション How to provision resources
US10129169B2 (en) * 2016-04-07 2018-11-13 International Business Machines Corporation Specifying a highly-resilient system in a disaggregated compute environment
CN108616553B (en) * 2016-12-13 2020-08-04 中国移动通信有限公司研究院 Method and device for resource scheduling of cloud computing resource pool
CN110858161B (en) * 2018-08-24 2023-05-12 阿里巴巴集团控股有限公司 Resource allocation method, device, system, equipment and medium
CN112698952A (en) * 2021-01-05 2021-04-23 广州品唯软件有限公司 Unified management method and device for computing resources, computer equipment and storage medium
CN113626162A (en) * 2021-07-09 2021-11-09 西安电子科技大学 Data center task hybrid deployment method and system based on dynamic resource sharing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Disaggregated Data Centers: Challenges and Trade-offs; Rui Lin et al.; IEEE Communications Magazine; pp. 20-26 *
Exploring the Benefits of Resource Disaggregation for Service Reliability in Data Centers; Guo, Chao et al.; IEEE Transactions on Cloud Computing; pp. 1-17 *

Also Published As

Publication number Publication date
WO2024051012A1 (en) 2024-03-14
CN115454598A (en) 2022-12-09

Similar Documents

Publication Publication Date Title
CN110096336B (en) Data monitoring method, device, equipment and medium
US20190319895A1 (en) Resource Scheduling Method And Apparatus
CN110489217A (en) A kind of method for scheduling task and system
CN113296792B (en) Storage method, device, equipment, storage medium and system
CN111506404A (en) Kubernetes-based shared GPU (graphics processing Unit) scheduling method
CN111309440B (en) Method and equipment for managing and scheduling multiple types of GPUs
CN104506669B (en) The IP address distribution system and method for a kind of Based on Distributed network simulation platform
CN114153580A (en) Cross-multi-cluster work scheduling method and device
CN115391023A (en) Computing resource optimization method and device for multitask container cluster
CN115454598B (en) Service deployment and resource allocation method for partial decoupling data center
CN112650449B (en) Method and system for releasing cache space, electronic device and storage medium
CN112698947B (en) GPU resource flexible scheduling method based on heterogeneous application platform
CN112748997A (en) Workflow scheduling method and system
CN115964176B (en) Cloud computing cluster scheduling method, electronic equipment and storage medium
CN115866059A (en) Block chain link point scheduling method and device
Fan et al. A scheduler for serverless framework base on kubernetes
CN108874798B (en) Big data sorting method and system
CN116010051A (en) Federal learning multitasking scheduling method and device
CN112968962B (en) Cloud platform storage resource scheduling method based on distributed computer cluster architecture
CN112073501A (en) Tenant separation type storage and management method
CN113204434B (en) Planned task execution method and device based on k8s and computer equipment
CN111950869A (en) Iterative solution method and system for improving initial solution of scheduling problem of space measurement and control network
CN112925852B (en) Distributed database designated node capacity reduction method
CN110851199A (en) Information protection system in power system and initialization method thereof
CN111949407B (en) Resource allocation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant