CN112162864B - Cloud resource allocation method, device and storage medium - Google Patents


Info

Publication number
CN112162864B
CN112162864B
Authority
CN
China
Prior art keywords
cloud host
cloud
elastic expansion
utilization rate
cpu resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011158622.9A
Other languages
Chinese (zh)
Other versions
CN112162864A (en)
Inventor
Lan Tian (兰天)
Current Assignee
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN202011158622.9A priority Critical patent/CN112162864B/en
Publication of CN112162864A publication Critical patent/CN112162864A/en
Application granted granted Critical
Publication of CN112162864B publication Critical patent/CN112162864B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/505 Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The disclosure provides a cloud resource allocation method, apparatus and storage medium for improving the utilization of cloud resources. By setting a GPU acceleration policy and an associated threshold, the cloud operating system is triggered to add GPU resources to a cloud host when that host's CPU resource utilization becomes too high, improving the host's overall compute performance. By reasonably and effectively scheduling and allocating GPU resources together with CPU resources, the disclosure improves the elastic-scaling capability of the cloud operating system, saves physical resources while still meeting user demands, and increases the utilization of cloud resources.

Description

Cloud resource allocation method, device and storage medium
Technical Field
The disclosure relates to the technical field of cloud computing, and in particular relates to a cloud resource allocation method, a cloud resource allocation device and a storage medium.
Background
The elastic-scaling function of cloud resources automatically adjusts computing resources according to a user's services and policies, meeting the user's varying demands as traffic continuously fluctuates. Cloud vendors currently support elastic scaling based on physical resources such as the CPU and memory. The CPU and the GPU are today's mainstream processors, and both are relatively expensive physical resources; if these two kinds of resources cannot be integrated and allocated reasonably and effectively, cloud resources are wasted and user experience suffers.
Disclosure of Invention
In view of the above, the present disclosure provides a cloud resource allocation method, apparatus and storage medium for improving the utilization of cloud resources.
Based on an embodiment of the present disclosure, the present disclosure provides a cloud resource allocation method, the method including:
monitoring the CPU resource utilization rate of a first cloud host allocated to a user, and triggering and executing a first elastic expansion strategy for allocating GPU resources when the CPU resource utilization rate of the first cloud host exceeds a first threshold;
and distributing GPU resources for the first cloud host according to a first elastic expansion strategy.
Further, the method further comprises: judging whether the CPU resource utilization rate of the first cloud host exceeds a second threshold value, wherein the second threshold value is larger than the first threshold value, and triggering and executing a second elastic expansion strategy for expanding the cloud host when judging that the CPU resource utilization rate of the first cloud host exceeds the second threshold value; and expanding a new cloud host for the user according to a second elastic expansion strategy.
Further, the method further comprises: if, when the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, it is found that no GPU resources are available in the GPU resource pool, then execution of the second elastic scaling policy for expanding the cloud host is triggered directly.
Further, the method further comprises: after GPU resources are allocated to the first cloud host, injecting a Compute Unified Device Architecture (CUDA) package into the first cloud host.
Further, the method further comprises: triggering execution of an elastic scaling policy for reclaiming the GPU resources when the CPU resource utilization of the first cloud host is lower than the first threshold and the difference between them is greater than a preset margin.
Based on another aspect of the disclosure, the disclosure further provides a cloud resource allocation apparatus, which includes:
the monitoring module is used for monitoring the CPU resource utilization rate of the first cloud host allocated to the user, and triggering and executing a first elastic expansion strategy for allocating GPU resources when the CPU resource utilization rate of the first cloud host is judged to exceed a first threshold value;
and the elastic module is used for distributing GPU resources to the first cloud host according to a first elastic expansion strategy.
Further, the monitoring module is further configured to determine whether a CPU resource usage rate of the first cloud host exceeds a second threshold, where the second threshold is greater than the first threshold, and trigger to execute a second elastic expansion policy for expanding the cloud host when it is determined that the CPU resource usage rate of the first cloud host exceeds the second threshold;
the elastic module is further configured to extend a new cloud host for the user according to a second elastic expansion policy.
Further, the monitoring module is further configured to, when it is determined that the CPU resource usage rate of the first cloud host exceeds a first threshold and triggers execution of a first elastic expansion policy for allocating GPU resources, directly trigger execution of the second elastic expansion policy for expanding the cloud host if it is found that GPU resources are not available in the GPU resource pool.
Further, the monitoring module is further configured to trigger execution of an elastic scaling policy for reclaiming the GPU resources when it is determined that the CPU resource utilization of the first cloud host is lower than the first threshold and the difference between them is greater than a preset margin.
By setting a GPU acceleration policy and an associated threshold, the cloud operating system is triggered to add GPU resources to a cloud host when that host's CPU resource utilization becomes too high, improving the host's overall compute performance. By reasonably and effectively scheduling and allocating GPU resources together with CPU resources, the disclosure improves the elastic-scaling performance of the cloud operating system, saves physical resources while still meeting user demands, and improves both cloud resource scheduling capability and cloud resource utilization.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings required by the embodiments or by the prior-art description are briefly introduced below. The drawings described below show only some embodiments of the present disclosure; those skilled in the art can derive other drawings from them.
Fig. 1 is a flowchart of steps of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 2 is a process schematic diagram of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure.
Detailed Description
The terminology used in the embodiments of the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the disclosure. As used in the embodiments of the present disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" if used in this disclosure is intended to encompass any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in embodiments of the present disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the embodiments of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Furthermore, depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
Fig. 1 is a flowchart of the steps of a cloud resource allocation method according to an embodiment of the present disclosure. The method is applied to a cloud operating system (the system running on a cloud management platform is called the cloud operating system) in a cloud computing scenario. By setting a threshold and a policy for GPU acceleration, the method triggers the cloud operating system to add GPU resources to a cloud host when that host's CPU resource utilization is too high, improving the host's overall compute performance. By properly scheduling and allocating GPU and CPU resources, the elastic-scaling performance of the cloud operating system is improved, physical resources are saved while user demands are still met, and cloud resource scheduling capability is improved. The method comprises the following steps:
step 101, monitoring the CPU resource utilization rate of a first cloud host allocated to a user, and triggering and executing a first elastic expansion strategy for allocating GPU resources when the CPU resource utilization rate of the first cloud host is judged to exceed a first threshold;
before this step is performed, a first threshold needs to be preset for the first cloud host, and an elastic expansion policy for allocating GPU resources is newly created for the user through an elastic expansion management platform (which is a sub-platform of the cloud management platform). The first threshold is used to trigger execution of a first elastic scaling policy that allocates GPU resources for the cloud host.
The cloud operating system monitors the CPU resource utilization of the cloud host in real time. When the utilization exceeds the first threshold, the computing resources of the first cloud host are strained, and to keep the service running normally, execution of the first elastic scaling policy is triggered, adding GPU resources to the first cloud host. The GPU's strong parallel-computing capability then improves the processing performance of the user's cloud host, so the user's service demands can be expected to be met without adding cloud host instances.
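The monitor-and-trigger logic described above can be sketched as follows. This is a minimal illustrative sketch; the names (`check_and_trigger`, `FIRST_THRESHOLD`, the `allocate_gpu` callback) are assumptions for illustration and do not appear in the disclosure.

```python
FIRST_THRESHOLD = 0.80  # e.g. 80% CPU utilization (user-configurable)

def check_and_trigger(cpu_usage, gpu_attached, allocate_gpu):
    """Fire the first elastic scaling policy when the host's CPU
    utilization exceeds the first threshold and no GPU is attached yet."""
    if cpu_usage > FIRST_THRESHOLD and not gpu_attached:
        allocate_gpu()  # add GPU resources to the cloud host
        return True
    return False
```

For example, a sample of 85% utilization on a host without a GPU would fire the policy once; later samples with the GPU already attached would not fire it again.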
Step 102, GPU resources are allocated to the first cloud host according to a first elastic expansion strategy.
In an embodiment of the disclosure, the method further includes: triggering execution of an elastic scaling policy for reclaiming the GPU resources when the CPU resource utilization of the first cloud host is lower than the first threshold and the difference between them is greater than a preset margin.
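The reclaim condition above, utilization below the first threshold by more than a preset margin, acts as a hysteresis band that avoids thrashing when utilization hovers near the threshold. A minimal sketch, where the function name and default values are illustrative assumptions:

```python
def should_reclaim_gpu(cpu_usage, first_threshold=0.80, margin=0.20):
    """Reclaim the host's GPU resources only once CPU utilization has
    fallen more than `margin` below the first threshold."""
    return cpu_usage < first_threshold and (first_threshold - cpu_usage) > margin
```

With these defaults, a host at 75% utilization keeps its GPU even though it is below the 80% threshold; only a drop well below the band (e.g. 50%) triggers reclamation.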
The specific policy for how GPU resources are allocated to the cloud host may be configured on the corresponding elastic-scaling management platform; for example, GPU resources may be added or reclaimed in a certain proportion to the cloud host's existing CPU resources. This disclosure does not specifically limit the policy.
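One proportional policy of the kind mentioned, sizing the GPU grant against the host's existing CPU resources, could look like the sketch below. The 1:4 ratio and the floor of one GPU are purely illustrative assumptions; the disclosure leaves the exact proportion to the platform configuration.

```python
import math

def gpus_for_host(cpu_cores, gpus_per_cpu=0.25):
    """Grant GPUs in proportion to the host's CPU cores, with a floor of
    one GPU so the policy always attaches something when it is triggered."""
    return max(1, math.ceil(cpu_cores * gpus_per_cpu))
```

Under this assumed ratio, an 8-core host receives 2 GPUs and a 2-core host still receives 1.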
In an embodiment of the present disclosure, after GPU resources are allocated to the cloud host, if the CPU resource utilization of the cloud host continues to increase beyond a preset threshold, the method further includes a step of expanding a new cloud host for the user to meet the service needs of the user. Based thereon, the method further comprises:
step 103, judging whether the CPU resource utilization rate of the first cloud host exceeds a second threshold, wherein the second threshold is larger than a first threshold, for example, the first threshold is 80%, the second threshold is 90%, and triggering and executing a second elastic expansion strategy for expanding the cloud host when judging that the CPU resource utilization rate of the first cloud host exceeds the second threshold;
step 104, expanding a new cloud host for the user according to the second elastic expansion strategy.
By expanding a new cloud host for the user, part of the business on the first cloud host can be migrated to the newly expanded host, or newly arriving business can be directed to it. This relieves the resource pressure on the first cloud host, ensures its quality of service, and avoids, as far as possible, business responses slowing down or even being interrupted because cloud resources are strained.
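Taken together, the two thresholds in steps 101 through 104 amount to a tiered decision. A sketch follows; the 80%/90% defaults match the example values in the text, while the function and return-value names are illustrative assumptions:

```python
def select_policy(cpu_usage, first=0.80, second=0.90):
    """Pick which elastic scaling policy to run for the current
    CPU utilization sample of the first cloud host."""
    if cpu_usage > second:
        return "scale_out_new_host"  # second policy: expand a new cloud host
    if cpu_usage > first:
        return "attach_gpu"          # first policy: add GPU resources
    return "no_action"
```

Checking the second (higher) threshold first ensures that a badly overloaded host goes straight to host expansion rather than stopping at GPU attachment.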
To clearly present the technical content and technical effects of the disclosed solution, the following description refers to the accompanying drawings. Fig. 2 is a process schematic diagram of a cloud resource allocation method according to an embodiment of the present disclosure. In general, the elastic-scaling process of the cloud operating system is as follows: when cloud host 1 reaches a CPU utilization threshold (e.g., 90%), the elastic-scaling management platform automatically expands one or more cloud host instances, such as "cloud host 2" and "cloud host 3".
The implementation of the embodiment shown in fig. 2 combines GPU acceleration to improve cloud host performance: first, a "GPU acceleration" option is added to the elastic-scaling management platform, and policy 1 is set.
Step 1: when the CPU utilization of the cloud host reaches a first threshold, for example 80% (the threshold is set by the user so that GPU acceleration is triggered before the cloud host is expanded), execution of GPU acceleration policy 1 is triggered.
Step 2: GPU resources are automatically allocated to cloud host 1 from a GPU resource pool according to cloud host 1's resource load (provided the pool has sufficient GPU resources).
After adding GPU resources to cloud host 1, a Compute Unified Device Architecture (CUDA) component package is injected into cloud host 1 through an initialization component (e.g., cloud-init).
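The injection step can be pictured as handing the host a first-boot configuration that fetches and installs the CUDA component package. The sketch below renders a hypothetical cloud-init user-data document; the package URL, file paths, and installer flags are all assumptions, not details given in the disclosure.

```python
def cuda_userdata(cuda_pkg_url):
    """Render a minimal cloud-init user-data document that downloads and
    installs a CUDA component package on the next boot of the cloud host."""
    return "\n".join([
        "#cloud-config",
        "runcmd:",
        f"  - curl -fsSL {cuda_pkg_url} -o /tmp/cuda-installer.run",
        "  - sh /tmp/cuda-installer.run --silent",
    ])
```

The cloud operating system would pass a document like this as instance user-data when it attaches the GPU, so the guest picks up the runtime without manual intervention.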
Step 3: after the GPU resources are added, if the service throughput of cloud host 1 keeps growing and its CPU utilization reaches a second threshold, for example 90%, execution of the cloud host expansion policy is triggered; that is, one or more cloud hosts are expanded for the user, and the expansion operation allocates CPU resources to the newly expanded cloud hosts from a CPU resource pool.
In some cases the following may also occur: the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, but no GPU resources are available in the GPU resource pool. In this case, execution of the second elastic scaling policy for expanding the cloud host is triggered directly, expanding a new cloud host for the user.
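The fallback path just described, going straight to host expansion when the GPU pool is exhausted, extends the threshold decision with one pool check. A sketch under the same illustrative naming assumptions as before:

```python
def policy_with_fallback(cpu_usage, free_gpus, first=0.80):
    """If the first policy fires but the GPU pool has no free GPUs,
    trigger the second policy (expand a new cloud host) directly."""
    if cpu_usage <= first:
        return "no_action"
    return "attach_gpu" if free_gpus > 0 else "scale_out_new_host"
```

The pool check keeps the user's service from stalling while waiting for GPU capacity that may never free up.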
It should be appreciated that embodiments of the present disclosure may be implemented or realized by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory computer readable storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Furthermore, the operations of the processes described in the present disclosure may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes (or variations and/or combinations thereof) described in this disclosure may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications), by hardware, or combinations thereof, collectively executing on one or more processors. The computer program includes a plurality of instructions executable by one or more processors.
Further, the methods provided by the present disclosure may be implemented in any suitable type of computing platform, including, but not limited to, a personal computer, mini-computer, mainframe, workstation, network or distributed computing environment, a separate or integrated computer platform, or one in communication with a charged-particle tool or other imaging device, and so forth. Aspects of the disclosure may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, an optically read and/or written storage medium, RAM, ROM, etc., such that it is readable by a programmable computer; when the code is read by the computer, it configures and operates the computer to perform the processes described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. When such media include instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above, the invention described in this disclosure includes these and other types of non-transitory computer-readable storage media. The present disclosure also includes the computer itself when programmed according to the methods and techniques described herein.
Fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure. Each functional module in the apparatus may be implemented as a software module, as a hardware unit, or as a combination of software and hardware. The functions of the apparatus's modules correspond to the steps of the method provided by the embodiments of the present disclosure. The apparatus 300 includes: a monitoring module 310 and an elastic module 320.
The monitoring module 310 is configured to monitor a CPU resource utilization of a first cloud host allocated to a user, and trigger to execute a first elastic expansion policy for allocating GPU resources when it is determined that the CPU resource utilization of the first cloud host exceeds a first threshold.
The elastic module 320 is configured to allocate GPU resources to the first cloud host according to a first elastic expansion policy.
In an embodiment of the present disclosure, the monitoring module 310 may be further configured to determine whether a CPU resource usage rate of the first cloud host exceeds a second threshold, where the second threshold is greater than the first threshold, and trigger to execute a second elastic expansion policy for expanding the cloud host when it is determined that the CPU resource usage rate of the first cloud host exceeds the second threshold. The elastic module 320 is further configured to extend a new cloud host for the user according to a second elastic scaling policy.
In an embodiment of the present disclosure, the monitoring module 310 may be further configured to, when it is determined that the CPU resource usage of the first cloud host exceeds the first threshold and triggers execution of the first elastic scaling policy for allocating GPU resources, directly trigger execution of the second elastic scaling policy for expanding the cloud host if it is found that GPU resources are not available in the GPU resource pool.
In an embodiment of the present disclosure, the monitoring module 310 may be further configured to trigger execution of the elastic scaling policy for reclaiming the GPU resources when it is determined that the CPU resource utilization of the first cloud host is lower than the first threshold and the difference between them is greater than the preset margin.
Fig. 4 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure, where the apparatus 400 includes: a processor 410, such as a Central Processing Unit (CPU), an internal bus 420, a network interface 440, and a computer readable storage medium 430. Wherein the processor 410 and the computer-readable storage medium 430 may communicate with each other via an internal bus 420. The computer-readable storage medium 430 may store therein a computer program provided by the present disclosure for implementing the above-described cloud resource allocation method, where the computer program may implement the functions of each step of the method provided by the present disclosure when executed by the processor 410.
The machine-readable storage medium may include random access memory (RAM), or may include non-volatile memory (NVM), such as at least one disk memory. Additionally, the machine-readable storage medium may be at least one storage device located remotely from the processor. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The apparatus provided by the embodiments of the present disclosure and the method provided by the embodiments share the same technical concept, and the apparatus has the same beneficial effects as the method it adopts, runs, or implements.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (6)

1. A cloud resource allocation method, the method comprising:
monitoring the CPU resource utilization rate of a first cloud host allocated to a user, and triggering and executing a first elastic expansion strategy for allocating GPU resources when the CPU resource utilization rate of the first cloud host exceeds a first threshold;
GPU resources are distributed to the first cloud host according to a first elastic expansion strategy;
judging whether the CPU resource utilization rate of the first cloud host exceeds a second threshold value, wherein the second threshold value is larger than the first threshold value, and triggering and executing a second elastic expansion strategy for expanding the cloud host when judging that the CPU resource utilization rate of the first cloud host exceeds the second threshold value;
according to a second elastic expansion strategy, expanding a new cloud host for the user;
if, when the CPU resource utilization rate of the first cloud host exceeds the first threshold value and execution of the first elastic expansion strategy for distributing GPU resources is triggered, it is found that no GPU resources are available in the GPU resource pool, then execution of the second elastic expansion strategy for expanding the cloud host is directly triggered.
2. The method according to claim 1, wherein the method further comprises:
and after GPU resources are allocated to the first cloud host, injecting a computing unified device architecture package into the first cloud host.
3. The method according to claim 1, wherein the method further comprises:
and triggering execution of an elastic expansion strategy for recovering the GPU resources when the CPU resource utilization rate of the first cloud host is lower than the first threshold value and the difference between them is greater than a preset margin.
4. A cloud resource allocation apparatus, comprising:
the monitoring module is used for monitoring the CPU resource utilization rate of the first cloud host allocated to the user, and triggering execution of a first elastic expansion strategy for allocating GPU resources when it is judged that the CPU resource utilization rate of the first cloud host exceeds a first threshold value; the monitoring module is further used for directly triggering execution of a second elastic expansion strategy for expanding the cloud host when it is judged that the CPU resource utilization rate of the first cloud host exceeds the first threshold value, execution of the first elastic expansion strategy for distributing GPU resources is triggered, and no GPU resources are found to be available in the GPU resource pool;
the elastic module is used for distributing GPU resources to the first cloud host according to a first elastic expansion strategy;
the monitoring module is further configured to determine whether a CPU resource usage rate of the first cloud host exceeds a second threshold, where the second threshold is greater than the first threshold, and trigger to execute a second elastic expansion policy for expanding the cloud host when it is determined that the CPU resource usage rate of the first cloud host exceeds the second threshold;
the elastic module is further configured to extend a new cloud host for the user according to a second elastic expansion policy.
5. The apparatus according to claim 4, wherein:
the monitoring module is further configured to trigger execution of an elastic scaling policy for reclaiming GPU resources when it is determined that the CPU resource utilization of the first cloud host is lower than the first threshold and the difference between the first threshold and the CPU resource utilization is greater than a preset margin.
6. A storage medium having stored thereon a computer program which, when executed by a processor, implements the method steps of any one of claims 1 to 3.
CN202011158622.9A 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium Active CN112162864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011158622.9A CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112162864A CN112162864A (en) 2021-01-01
CN112162864B true CN112162864B (en) 2023-06-09

Family

ID=73864662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011158622.9A Active CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112162864B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114138449A (en) * 2021-12-14 2022-03-04 河南省儿童医院郑州儿童医院 Rehabilitation training system based on virtual reality

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US8825550B2 (en) * 2012-08-23 2014-09-02 Amazon Technologies, Inc. Scaling a virtual machine instance
CN104778080A (en) * 2014-01-14 2015-07-15 中兴通讯股份有限公司 Job scheduling processing method and device based on coprocessor
US10186007B2 (en) * 2014-08-25 2019-01-22 Intel Corporation Adaptive scheduling for task assignment among heterogeneous processor cores
US9871857B2 (en) * 2015-04-29 2018-01-16 Microsoft Technology Licensing, Llc Optimal allocation of dynamic cloud computing platform resources
CN104954478A (en) * 2015-06-23 2015-09-30 普元信息技术股份有限公司 System and method for realizing automatic longitudinal scaling of server in cloud computing platform
CN106254459A (en) * 2016-05-13 2016-12-21 江苏云途腾科技有限责任公司 A kind of resource elasticity allocation strategy for cloud platform user and device
CN107688495B (en) * 2017-06-22 2020-11-03 平安科技(深圳)有限公司 Method and apparatus for scheduling processors
CN111158852A (en) * 2019-12-14 2020-05-15 苏州浪潮智能科技有限公司 Training resource dynamic allocation method, system, terminal and storage medium

Non-Patent Citations (1)

Title
You Yongkang. "Basic Architecture Principles," in Private Cloud Architecture Design and Practice (《私有云架构设计与实践》). Shanghai Jiao Tong University Press, 2019, pp. 16-29. *

Also Published As

Publication number Publication date
CN112162864A (en) 2021-01-01

Similar Documents

Publication Publication Date Title
JP6138774B2 (en) Computer-implemented method and computer system
US9928245B2 (en) Method and apparatus for managing memory space
US7587492B2 (en) Dynamic performance management for virtual servers
CN108334396B (en) Data processing method and device, and resource group creation method and device
CN111104208B (en) Process scheduling management method, device, computer equipment and storage medium
CN111694633A (en) Cluster node load balancing method and device and computer storage medium
EP2701074A1 (en) Method, device, and system for performing scheduling in multi-processor core system
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
US20120239952A1 (en) Information processing apparatus, power control method, and recording medium
JP2014517434A (en) Computer-implemented method and computer system
JP2014523022A (en) Computer-implemented method and computer system
CN103473142A (en) Virtual machine transferring method and device under cloud computing operating system
CN111338779B (en) Resource allocation method, device, computer equipment and storage medium
US20160154676A1 (en) Method of Resource Allocation in a Server System
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
WO2017092823A1 (en) Technique for optimizing the scaling of an application having a set of virtual machines
CN112162864B (en) Cloud resource allocation method, device and storage medium
CN112667380A (en) Multiprocessor task scheduling method, device and storage medium
US9128754B2 (en) Resource starvation management in a computer system
CN115617497A (en) Thread processing method, scheduling component, monitoring component, server and storage medium
CN114546587A (en) Capacity expansion and reduction method of online image recognition service and related device
US20190235921A1 (en) System for allocating resources for use in data processing operations
CN112463361A (en) Method and equipment for distributing elastic resources of distributed computation
CN109445863B (en) Data processing method, device, equipment and medium based on FPGA
CN106330595B (en) Heartbeat detection method and device for distributed platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant