CN112559182B

CN112559182B - Resource allocation method, device, equipment and storage medium

Info

Publication number: CN112559182B
Application number: CN202011488619.3A
Authority: CN
Inventors: 李鸿斌
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-12-16
Filing date: 2020-12-16
Publication date: 2024-04-09
Anticipated expiration: 2040-12-16
Also published as: CN112559182A

Abstract

The application discloses a resource allocation method, apparatus, device and storage medium, relates to the field of big data, and can be applied to cloud computing and other cloud fields. The specific implementation scheme is as follows: when first idle resources exist in a plurality of first containers, acquiring the size of the first idle resources, wherein the first containers are used for processing online services; and when the size of the first idle resources is greater than or equal to a first threshold, allocating the first idle resources to an offline service so as to process the offline service. Frequent capacity expansion and contraction is therefore not required under a serverless architecture, which avoids the problem of capacity expansion failing to keep pace with demand.

Description

Resource allocation method, device, equipment and storage medium

Technical Field

The present disclosure relates to the field of cloud computing, and in particular, to a method, an apparatus, a device, and a storage medium for resource allocation.

Background

Serverless is a new service architecture that lets service developers focus only on business logic, without having to consider operations and capacity, thereby improving the efficiency of service iteration.

The Serverless service architecture outsources host management, operating system management, resource allocation, capacity expansion and contraction, and so on to a third party that provides them as services, and it is triggered by events. When the service volume is large, Serverless can automatically expand computing capacity to handle more user requests; when the service volume decreases, it contracts resources to avoid waste. This is the elastic expansion and contraction mechanism of Serverless.

For online services, the service volume has multiple peaks and troughs, so a Serverless service must expand and contract capacity repeatedly, and capacity expansion can easily fail to keep up with incoming requests.

Disclosure of Invention

The application provides a resource allocation method, a device, equipment and a storage medium.

According to a first aspect of the present application, there is provided a resource allocation method, including:

when first idle resources exist in a plurality of first containers, acquiring the size of the first idle resources, wherein the first containers are used for processing online services;

and when the size of the first idle resources is greater than or equal to a first threshold, allocating the first idle resources to an offline service so as to process the offline service.

According to a second aspect of the present application, there is provided a resource allocation apparatus comprising:

an acquisition module, configured to acquire the size of first idle resources when it is determined that the first idle resources exist in a plurality of first containers, wherein the first containers are used for processing online services;

and an allocation module, configured to allocate the first idle resources to an offline service so as to process the offline service when the size of the first idle resources is greater than or equal to a first threshold.

According to a third aspect of the present application, there is provided an electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first aspects.

According to a fourth aspect of the present application, there is provided a computer program product comprising: a computer program stored in a readable storage medium, from which it can be read by at least one processor of an electronic device, the at least one processor executing the computer program causing the electronic device to perform the method of the first aspect.

According to a fifth aspect of the present application, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of the first aspects.

According to the resource allocation method, apparatus, device and storage medium provided by the embodiments of the application, when first idle resources exist in a plurality of first containers, the size of the first idle resources is first acquired, wherein the first containers are used for processing online services; then, when the size of the first idle resources is greater than or equal to a first threshold, the first idle resources are allocated to an offline service to process the offline service. Under this scheme, when the size of the first idle resources is greater than or equal to the first threshold, the traffic of the online service can be determined to be in a trough, and the surplus idle resources are used to process the offline service, which avoids wasting resources without requiring capacity reduction. Because capacity is not reduced during the trough, when the traffic of the online service increases, processing of the offline service can be suspended and the previously idle resources can be reused for the online service without capacity expansion. This avoids the problem of capacity expansion failing to keep pace when capacity is frequently expanded and contracted in response to changes in online traffic, and broadens the application scenarios of serverless.

It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.

Drawings

The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:

Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;

Fig. 2 is a flow chart of a resource allocation method according to an embodiment of the present application;

Fig. 3 is a flow chart of a resource allocation method according to another embodiment of the present application;

Fig. 4 is a schematic diagram of service processing provided in an embodiment of the present application;

Fig. 5 is a schematic diagram of resource scheduling according to an embodiment of the present application;

Fig. 6 is a schematic flow chart of resource allocation according to an embodiment of the present application;

Fig. 7 is a schematic flow chart of periodic capacity reduction according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a resource allocation device according to an embodiment of the present application;

Fig. 9 is a block diagram of an electronic device of a resource allocation method according to an embodiment of the present application.

Detailed Description

Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The concepts related to the present application are explained first.

Serverless: a serverless service architecture in which host management, operating system management, resource allocation, capacity expansion and contraction, and even entire components of the application logic are outsourced and provided as services by a third party. Serverless runs a framework on the server; the application is broken down into microservices or micro-functions, each of which responds to an event. When a function is accessed, the relevant resources are invoked and it starts running; once the run completes, all of its overhead is released.

The Serverless architecture lets developers concentrate on the product without managing or operating cloud or local servers, and without having to consider server specifications, storage type, network bandwidth, automatic expansion and contraction, or server operation and maintenance.

Container: under the Serverless architecture, a virtualized resource that the server creates when deploying a service; it may exist in the form of a virtual machine. When a container is created, a corresponding request value and limit value are set, wherein the request value represents the minimum resource requirement of the container, and the limit value represents the maximum amount of resources the container may use. The resources here may span multiple dimensions, for example central processing unit (CPU) resources, network card resources, memory resources, and the like.

The application scenario of serverless is described below in conjunction with fig. 1.

Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application, as shown in fig. 1, including a server 10, where the server 10 may create a plurality of containers and allocate a corresponding resource to each container.

In fig. 1, the server 10 is illustrated as creating 2 containers, namely a first container 11 and a second container 12, and when creating the first container 11 and the second container 12, resource allocation is performed for the first container 11 and the second container 12. When the server 10 creates the first container 11 and the second container 12, corresponding parameters are set for the first container 11 and the second container 12, where the parameters include a minimum resource requirement, i.e. a request value, used by the container, and a maximum value, i.e. a limit value, of resources that can be used by the container.

The request value and limit value may differ between containers; the request value serves as the basis for judging resource allocation when containers are scheduled.

The server may allocate initial resources for each container based on the request value of the container, which may include a CPU, memory, network card, etc.
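The relationship between the request value, the limit value and the initial allocation can be illustrated with a minimal sketch. The class name `ContainerSpec`, the function `initial_allocation` and the numeric values are hypothetical and chosen only for illustration; the description does not prescribe any particular data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSpec:
    """Hypothetical per-container settings; names are illustrative only."""
    name: str
    request: float  # minimum resource requirement of the container
    limit: float    # maximum resources the container may use

    def __post_init__(self) -> None:
        # A container can never be promised more than it is allowed to use.
        if self.request < 0 or self.limit < self.request:
            raise ValueError("expected 0 <= request <= limit")


def initial_allocation(spec: ContainerSpec) -> float:
    """Initial resources are allocated based on the request value."""
    return spec.request


# Illustrative values only: two containers with different settings.
first_container = ContainerSpec("first-container-11", request=4, limit=4)
second_container = ContainerSpec("second-container-12", request=2, limit=3)
```

A single number stands in for resources here; in practice each dimension (CPU, memory, network card) would carry its own request and limit values.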

After the resources are allocated, the container may receive service requests and then process the service requests using the allocated resources. For example, in fig. 1, the first client 13 sends a service request to the first container 11, and the first container 11 may process the service of the first client 13 according to the allocated resources; the second client 14 sends a service request to the second container 12, the second container 12 can process the service of the second client 14 according to the allocated resources, and so on.

The services may include an online service, which is a service that needs to be processed in time, and an offline service, which is a service that does not need to be processed in time. For online service, since the traffic of online service may change in different time periods, there are time periods with larger traffic, i.e., peak time periods of online service, and time periods with smaller traffic, i.e., trough time periods of online service.

The resources required also vary with the traffic volume. To handle this, the serverless architecture adopts an elastic expansion and contraction mechanism: resources are expanded when traffic increases and contracted when traffic decreases. For online services this requires repeated capacity expansion and contraction, and capacity expansion can easily fail to keep pace. Therefore, the current serverless architecture is not well suited to processing online services or services with larger code packages. These drawbacks severely limit the application scenarios of serverless.

There are currently two main solutions to the above problems. The first is to perform elastic capacity expansion in advance according to the historical behavior of the service, so that expansion is not triggered only at the moment a large amount of resources is needed, which avoids the problem of expansion failing to keep pace. The second is manual intervention: based on known resource requirements, operators intervene in advance to guarantee the resources needed for service processing.

The first scheme requires a large amount of accumulated historical data and therefore cannot be applied at an early stage; moreover, traffic may change irregularly, and early historical data does not necessarily fit later traffic, which leads to inaccurate predictions. The second scheme incurs high labor costs, has poor timeliness, and cannot be rolled out on a large scale.

Based on this, the embodiments of the application provide a resource allocation scheme under the serverless architecture, so as to solve the problem that capacity expansion cannot keep up with requests because of repeated expansion and contraction, and to broaden the usage scenarios of the serverless architecture. Aspects of the present application are described below with reference to the accompanying drawings.

Fig. 2 is a flow chart of a resource allocation method provided in an embodiment of the present application, and as shown in fig. 2, the method may include:

S21, when first idle resources exist in a plurality of first containers, acquiring the size of the first idle resources, wherein the first containers are used for processing online services.

The first container is a container created by the server for processing online services. When the first containers are created, resources are allocated to each first container, and these resources can be used for processing online services. There may be a plurality of first containers, and the resources allocated to different first containers may differ.

The traffic of online services has peaks and troughs, i.e. the traffic differs between time periods. When the traffic of the online service is in a trough, idle resources, i.e. the first idle resources, may exist in the first containers.

Upon determining that a first free resource exists in the plurality of first containers, a size of the first free resource may be obtained.

S22, when the size of the first idle resource is larger than or equal to a first threshold value, the first idle resource is allocated to offline service so as to process the offline service.

After the size of the first idle resources is obtained, when the size of the first idle resources reaches a first threshold value, the first idle resources can be allocated to offline service to process the offline service. The first threshold is a preset value, and the size of the first threshold can be determined according to actual needs. Generally, when the size of the first idle resource is greater than or equal to the first threshold, the size of the first idle resource is capable of meeting the need to process offline traffic.

In the resource allocation method provided by the embodiment of the application, when first idle resources exist in a plurality of first containers, the size of the first idle resources is first acquired, wherein the first containers are used for processing online services; then, when the size of the first idle resources is greater than or equal to a first threshold, the first idle resources are allocated to an offline service to process the offline service. Under this scheme, when the size of the first idle resources is greater than or equal to the first threshold, the traffic of the online service can be determined to be in a trough, and the surplus idle resources are used to process the offline service, which avoids wasting resources without requiring capacity reduction. Because capacity is not reduced during the trough, when the traffic of the online service increases, processing of the offline service can be suspended and the previously idle resources can be reused for the online service without capacity expansion. This avoids the problem of capacity expansion failing to keep pace when capacity is frequently expanded and contracted in response to changes in online traffic.

The following describes aspects of the present application in connection with specific embodiments.

Fig. 3 is a flow chart of a resource allocation method according to another embodiment of the present application, as shown in fig. 3, including:

S31, when the first idle resources exist in the first containers, acquiring the machine resources of each first container and the resources used by the corresponding online service.

The machine resource of each first container is the resource allocated by the server when the first container is created, i.e. the machine resource to which the first container applies. The server sets the corresponding request value and limit value as each container is created. Wherein the request value indicates the minimum resource requirement for the container to use and the limit value indicates the maximum value of the resources that the container can use.

When resources are initially allocated to a container, resources equal to its request value are typically allocated; the request value is the minimum amount of resources the container can be allocated. In some cases the resources used by the container may exceed the request value, but they can never exceed the limit value.

For any first container, after the machine resource of the first container, that is, the corresponding request value is acquired, the machine resource applied by the first container is known, and the machine resource can be fully or partially used for processing the online service.

In some cases, only a portion of the resources, i.e., the resources that have been used by the online service, may be needed to process the corresponding online service, and the other unused resources are free resources within the first container.

S32, acquiring the size of the first idle resources according to the machine resources of each first container and the used resources.

For any one first container, after the machine resources of the first container and the resources used by the online service processed in the first container are acquired, the idle resources in the first container can be obtained. When a plurality of first containers exist, the size of the first idle resources can be obtained according to the idle resources in each first container, and the first idle resources are the whole idle resources in the system.

Fig. 4 is a schematic diagram of service processing provided in the embodiment of the present application, as shown in fig. 4, including service a, where service a is an online service, and the service a needs to be processed by using a resource in the first container.

In fig. 4, the request value of the first container corresponding to the service a is 4 units, and the limit value is also 4 units. When the first container is created, 4 units of machine resources are allocated to the first container according to the request value of the first container.

When the service a is processed, 2 units of resources are needed, as shown in fig. 4, and according to 4 units of machine resources in the first container and 2 units of resources used by the service a, the idle resources of the first container can be obtained to be 2 units.

For each first container, the calculation of the free resources can be performed in this way, and then the size of the first free resources common to the plurality of first containers can be obtained.
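The calculation in steps S31 and S32 can be sketched as follows, reproducing the Fig. 4 numbers. The class `FirstContainer` and both function names are hypothetical; a single scalar stands in for resources, which is a simplification.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FirstContainer:
    """Hypothetical record for a container that processes an online service."""
    machine_resources: float  # resources allocated at creation (the request value)
    used_by_online: float     # resources currently used by the online service


def idle_in_container(container: FirstContainer) -> float:
    """Idle resources inside one first container (never negative)."""
    return max(container.machine_resources - container.used_by_online, 0)


def first_idle_resources(containers: Iterable[FirstContainer]) -> float:
    """S32: total idle resources across all first containers."""
    return sum(idle_in_container(c) for c in containers)


# Fig. 4: service A runs in a first container with 4 units and uses 2 of them.
service_a_container = FirstContainer(machine_resources=4, used_by_online=2)
total_idle = first_idle_resources([service_a_container])
```

With several first containers, the same sum yields the overall first idle resources of the system.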

S33, when the size of the first idle resource is larger than or equal to a first threshold value, acquiring the offline service from a message queue, and determining the size of a first resource required for processing the offline service, wherein the size of the first resource is smaller than or equal to the first threshold value.

In the embodiment of the application, the number of the offline services may be one or more, and the offline services may be pre-stored in the message queue. When there is a first free resource, the first free resource is not immediately allocated to the offline service.

Since the traffic of the online service may change at any time, the size of the resources required for processing the online service may also change at any time, resulting in that the size of the first free resources also changes dynamically.

Therefore, the offline service is acquired from the message queue and processed only when the size of the first idle resources is greater than or equal to the first threshold. When that condition holds, the first idle resources are generally sufficient to process these offline services; that is, the size of the first resource is less than or equal to the first threshold, which in turn is less than or equal to the size of the first idle resources.

S34, distributing the first idle resources to the offline service according to the size of the first resources.

Specifically, a maximum resource value of each second container may be obtained, where the second container is a container for performing processing of offline service. Each second container is also created by the server, and when creating the second container, a request value and a limit value need to be set for each second container, where the request value is the size of the machine resource of the second container, and the limit value is the size of the maximum resource value of the second container.

Since the second container is a container for performing processing of an offline service, and the offline service is a service that does not need immediate processing, the machine resource of the second container is 0, i.e. the corresponding request value is 0. And the maximum resource value of the second container, i.e. limit value, is greater than 0.

After the maximum resource value of the second container is obtained, for the offline service processed by each second container, the first idle resource can be allocated to the offline service in the second container according to the maximum resource value of the second container and the size of the first resource required by the processed offline service. Wherein, for any one second container, the size of the allocated free resource is larger than or equal to the size of the first resource required by the processed offline service and smaller than or equal to the maximum resource value of the second container.

For example, in fig. 4, service B is an offline service, where the request value of the corresponding second container is 0, the limit value is 2, and the maximum resource value of the second container is 2 units. The first resource required for processing the service B has a size of 1 unit, and then a resource greater than or equal to 1 unit and less than or equal to 2 units may be allocated to the second container. In fig. 4, 2 units of free resources are allocated for the second container, wherein the processing of service B requires 1 unit of free resources to be consumed, and finally 1 unit of resources is not used.
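The bound on the allocation in step S34 can be sketched as below, using the Fig. 4 values. The function name is hypothetical, and granting as much as the limit value allows is an assumed policy: the description only requires the grant to lie between the size of the first resource and the maximum resource value of the second container.

```python
from typing import Optional


def grant_for_second_container(first_idle: float,
                               first_resource: float,
                               limit: float) -> Optional[float]:
    """Return the idle resources to grant to a second container, or None.

    The grant must satisfy first_resource <= grant <= limit, and it cannot
    exceed the idle resources that actually exist.
    """
    if first_resource > limit or first_resource > first_idle:
        return None  # the offline service cannot be served right now
    # Assumed policy: grant as much as allowed, up to the limit value.
    return min(first_idle, limit)


# Fig. 4: 2 idle units, service B needs 1 unit, the second container's limit is 2.
grant = grant_for_second_container(first_idle=2, first_resource=1, limit=2)
unused = grant - 1  # units granted but not consumed by service B
```

Granting exactly the first resource (1 unit) would equally satisfy the stated bound; the choice here simply mirrors the figure.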

Fig. 5 is a schematic diagram of resource scheduling provided in the embodiment of the present application. As shown in fig. 5, the execution of an offline service is triggered by a message queue. Specifically, the scheduler first collects the current machine resource consumption to obtain the first idle resources, and then decides whether to process offline services according to that consumption (i.e. the size of the first idle resources). If offline services are to be processed, the offline tasks are acquired from the message queue and the corresponding idle resources are allocated to them for processing.
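One pass of the scheduler in Fig. 5 might look like the following sketch. The function `schedule_once`, the task tuple layout, the first-in-first-out order and the use of the first threshold as the dispatch budget are all assumptions made for illustration; the description states only that the first resource does not exceed the first threshold.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

OfflineTask = Tuple[str, float]  # (hypothetical task id, resources required)


def schedule_once(collect_first_idle: Callable[[], float],
                  queue: Deque[OfflineTask],
                  first_threshold: float) -> List[OfflineTask]:
    """Measure idle resources, then decide whether to dispatch offline tasks."""
    first_idle = collect_first_idle()
    if first_idle < first_threshold:
        return []  # online traffic is not in a trough; leave the queue untouched
    dispatched: List[OfflineTask] = []
    budget = first_threshold  # the first resource must not exceed the first threshold
    while queue and queue[0][1] <= budget:
        task = queue.popleft()
        budget -= task[1]
        dispatched.append(task)
    return dispatched


pending: Deque[OfflineTask] = deque([("B", 1.0), ("C", 1.5)])
skipped = schedule_once(lambda: 1.0, pending, first_threshold=2.0)
started = schedule_once(lambda: 2.0, pending, first_threshold=2.0)
```

In the first call the idle resources are below the threshold, so nothing is dispatched; in the second, task B fits the budget and task C stays queued for a later pass.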

When the traffic of the online service is in the trough, the traffic is reduced, and the first idle resources are left for processing the offline service, so that the waste of resources is avoided, and the capacity reduction is not needed. Since the traffic of the online service may change at any time, the traffic of the online service may gradually increase over a period of time, that is, a new online service needs to be processed, and then resources need to be further allocated.

Fig. 6 is a schematic flow chart of resource allocation according to an embodiment of the present application, as shown in fig. 6, including:

S61, acquiring a service request of a new online service.

When the service volume of the online service increases, it indicates that a new online service needs to be processed, at this time, a service request of the new online service is acquired first, and then resources are allocated to the new online service according to the service request of the new online service for corresponding processing.

S62, obtaining the size of a second idle resource according to the size of the first idle resource and the size of the first resource used by the offline service.

Before this, first idle resources exist in the plurality of first containers, and part of them has been allocated to processing the offline service; the resources required for processing the offline service are the first resource.

After the size of the first idle resources and the size of the first resource are determined, the second idle resources can be determined, wherein the second idle resources are the idle resources remaining in the system after the existing online service and the offline service are served. For example, in fig. 4, the resources required for processing the online service A are 2 units, the corresponding first idle resources are 2 units, and the size of the first resource required for the offline service B is 1 unit, so the size of the second idle resources remaining after serving the online service A and the offline service B is 1 unit.

S63, allocating resources to the new online service according to the service request and the size of the second idle resources so as to process the new online service.

After the second idle resource size is obtained, resources can be allocated to the new online service according to the service request of the new online service and the second idle resource size so as to process the new online service.

Specifically, when the size of the second idle resource is greater than or equal to the second threshold, it indicates that the remaining second idle resource is large enough to handle the new online traffic. At this point, therefore, the second free resources may be allocated directly to new online services to handle these new online services.

When the size of the second idle resources is smaller than the second threshold, the remaining second idle resources are not sufficient to process the new online service. Because online services must be processed immediately while offline services need not be, processing of the offline service is stopped first, which releases the first resource originally used for the offline service; the second idle resources and the first resource are then allocated to the new online service to process it.
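Steps S62 and S63 can be sketched as follows, again with the Fig. 4 numbers. The function names and the returned tuple are hypothetical, and the threshold values in the usage lines are illustrative only.

```python
from typing import Tuple


def second_idle_resources(first_idle: float, first_resource: float) -> float:
    """S62: idle resources left after the offline service took its share."""
    return first_idle - first_resource


def resources_for_new_online(first_idle: float,
                             first_resource: float,
                             second_threshold: float) -> Tuple[float, bool]:
    """S63: return (resources offered to the new online service, offline_stopped)."""
    second_idle = second_idle_resources(first_idle, first_resource)
    if second_idle >= second_threshold:
        # Enough is left over: the offline service keeps running.
        return second_idle, False
    # Not enough: stop the offline service and reclaim its first resource.
    return second_idle + first_resource, True


# Fig. 4: 2 first idle units, offline service B uses 1 unit.
enough = resources_for_new_online(2, 1, second_threshold=1)
reclaimed = resources_for_new_online(2, 1, second_threshold=2)
```

In the second call the offline service is stopped, so the new online service can draw on the full 2 units that were idle before service B started.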

Under this scheme, when the traffic of the online service is in a trough, idle resources are allocated to the offline service by time-sharing multiplexing, which avoids wasting resources without reducing capacity; when the traffic of the online service is at a peak, processing of the offline service can be suspended and the idle resources reused for new online services, so no capacity expansion is needed. The scheme therefore does not require frequent capacity expansion and contraction, avoids the problem of expansion failing to keep pace, and broadens the usage scenarios of serverless.

In some cases, the resource utilization of the system may stay at a low level for a long period; capacity can then be reduced periodically, as appropriate, to save resources. This is described below with reference to fig. 7.

Fig. 7 is a schematic flow chart of periodic capacity reduction according to an embodiment of the present application, as shown in fig. 7, including:

S71, acquiring the resource utilization rate of the online service in the first period.

In this embodiment of the present application, the resource utilization rate of the online service may be obtained periodically, for example, the resource utilization rate of each first period may be obtained, and the length of the first period may be determined according to actual needs.

The resource utilization rate may be an average resource utilization rate of the online service in the first period, or may be a resource utilization rate when the online service in the first period is in a peak state, which is not particularly limited in the embodiment of the present application.

S72, if the resource utilization rate is smaller than or equal to a preset threshold value, reducing the number of the first containers.

Specifically, a target resource utilization rate is acquired first; the number of first containers to be removed is then determined from the current resource utilization rate and the target resource utilization rate, and that number of first containers is removed.

For example, when the current resource utilization is 10% and the target resource utilization after capacity reduction is 30%, the amount of resources to be released can be determined from the current utilization of 10%, the target utilization of 30%, and the amount of resources currently consumed by the online service; the number of first containers to be removed is determined accordingly, and those containers are removed.
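One way to turn the 10% and 30% example into a container count is sketched below. It assumes that all first containers are the same size, so capacity is proportional to their number, and that consumption stays constant during the reduction; neither assumption comes from the description, and the function name and the 20% preset threshold are hypothetical. `Fraction` is used to keep the percentage arithmetic exact.

```python
import math
from fractions import Fraction


def containers_to_remove(num_containers: int,
                         current_util: Fraction,
                         target_util: Fraction,
                         preset_threshold: Fraction) -> int:
    """Number of first containers to remove in one capacity-reduction pass."""
    if not 0 < target_util <= 1:
        raise ValueError("target_util must be in (0, 1]")
    if current_util > preset_threshold:
        return 0  # S72 applies only at or below the preset threshold
    # Consumption is unchanged, so keep enough containers that utilization
    # after the reduction does not exceed the target.
    keep = math.ceil(num_containers * current_util / target_util)
    keep = min(max(keep, 1), num_containers)
    return num_containers - keep


# 30 equally sized containers at 10% utilization, target 30%, preset threshold 20%.
to_remove = containers_to_remove(30, Fraction(10, 100), Fraction(30, 100), Fraction(20, 100))
```

Here 10 containers are kept, so the same consumption corresponds to 30% utilization after the reduction.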

In some cases, besides overall resource utilization being low enough to warrant capacity reduction, the utilization of the individual resource dimensions may be mismatched. For example, resources include dimensions such as CPU, memory, and network card, and each container has a corresponding request value and limit value when it is created. When these containers process online services, CPU consumption may be high while memory and network card consumption is low, yet the containers may have been allocated few CPU resources and many memory and network card resources, so that utilization across the dimensions does not match. To address this problem, the resource package can be modified and adjusted in a timely manner according to the resource consumption of the service in each dimension, so that utilization across the dimensions matches as closely as possible.
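One way to picture the resource package adjustment is the sketch below. It is hypothetical throughout: the application does not specify how the package is recomputed, so the function name `adjust_resource_package`, the dimension keys, and the fixed headroom factor are all assumptions made for illustration.

```python
from typing import Dict


def adjust_resource_package(package: Dict[str, float],
                            observed_usage: Dict[str, float],
                            headroom: float = 1.2) -> Dict[str, float]:
    """Return a new package sized from observed per-dimension usage.

    Each dimension is set to observed usage times a headroom factor
    (an assumed policy). Dimensions with no observation are left unchanged.
    """
    adjusted = {}
    for dimension, size in package.items():
        if dimension in observed_usage:
            adjusted[dimension] = round(observed_usage[dimension] * headroom, 3)
        else:
            adjusted[dimension] = size
    return adjusted


if __name__ == "__main__":
    # CPU is under-provisioned; memory and network are over-provisioned.
    package = {"cpu_cores": 2.0, "memory_gb": 16.0, "network_mbps": 1000.0}
    usage = {"cpu_cores": 3.0, "memory_gb": 4.0, "network_mbps": 100.0}
    print(adjust_resource_package(package, usage))
```

In this example the CPU dimension grows while memory and network shrink, which brings the utilization of the three dimensions closer together.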

Fig. 8 is a schematic structural diagram of a resource allocation device according to an embodiment of the present application. As shown in Fig. 8, the resource allocation device 80 includes:

an obtaining module 81, configured to obtain the size of a first idle resource when it is determined that the first idle resource exists in a plurality of first containers, where the first containers are used for processing online services;

an allocation module 82, configured to allocate the first idle resource to an offline service to process the offline service when the size of the first idle resource is greater than or equal to a first threshold.

In one possible implementation, the obtaining module 81 includes:

a first acquisition unit, configured to acquire the machine resources of each first container and the resources used by the corresponding online service;

a second acquisition unit, configured to acquire the size of the first idle resource according to the machine resources of each first container and the used resources.
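The two acquisition units above can be sketched as a simple computation. This is a minimal illustration, assuming a single scalar resource dimension and assuming that the first idle resource is the sum over first containers of machine resource minus used resource; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FirstContainer:
    machine_resource: int  # resource reserved for the container (e.g. millicores)
    used_resource: int     # resource currently consumed by its online service


def first_idle_resource(containers: Iterable[FirstContainer]) -> int:
    """Sum the unused part of every first container (assumed definition)."""
    return sum(max(0, c.machine_resource - c.used_resource) for c in containers)


def can_serve_offline(containers: Iterable[FirstContainer],
                      first_threshold: int) -> bool:
    """True when the first idle resource reaches the first threshold."""
    return first_idle_resource(containers) >= first_threshold


if __name__ == "__main__":
    pool = [FirstContainer(4000, 1000), FirstContainer(4000, 3000),
            FirstContainer(2000, 2000)]
    print(first_idle_resource(pool), can_serve_offline(pool, 4000))
```

The threshold comparison corresponds to the condition under which the allocation module hands the first idle resource to the offline service.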

In one possible implementation, the allocation module 82 includes:

a determining unit, configured to obtain the offline service from a message queue, and determine a size of a first resource required for processing the offline service, where the size of the first resource is less than or equal to the first threshold;

a first allocation unit, configured to allocate the first idle resource to the offline service according to the size of the first resource.

In one possible implementation, the first allocation unit includes:

a first obtaining subunit, configured to obtain the maximum resource value of each second container, where the second containers are containers used for processing the offline service, the machine resources of the second containers are 0, and the maximum resource value is greater than 0;

a first allocation subunit, configured to allocate the first idle resource to the offline service in each second container according to the maximum resource value of the second container and the size of the first resource required for processing the offline service.
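The role of the maximum resource value can be illustrated with the sketch below. The application does not state how the first resource is split among second containers, so the greedy, in-order split and the name `allocate_to_second_containers` are assumptions. The sketch only shows that a second container reserves nothing (machine resource 0) and borrows idle resource up to its maximum resource value.

```python
from typing import List


def allocate_to_second_containers(first_idle: int,
                                  required: int,
                                  max_values: List[int]) -> List[int]:
    """Split the required first resource across second containers.

    Each second container receives at most its maximum resource value.
    The greedy in-order split is an assumption of this sketch. The
    returned grants may sum to less than `required` if the caps are
    too small.
    """
    if required > first_idle:
        raise ValueError("required resource exceeds the first idle resource")
    remaining = required
    grants = []
    for cap in max_values:
        grant = min(cap, remaining)  # never exceed the container's cap
        grants.append(grant)
        remaining -= grant
    return grants


if __name__ == "__main__":
    # 10 idle units, offline service needs 6, three second containers capped at 4.
    print(allocate_to_second_containers(10, 6, [4, 4, 4]))
```

Because the second containers reserve no machine resources of their own, everything they consume comes out of the first idle resource and can be handed back to the online service later.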

In one possible implementation, the allocation module 82 further includes:

a third obtaining unit, configured to obtain a service request of a new online service;

a fourth obtaining unit, configured to obtain a size of a second idle resource according to the size of the first idle resource and a size of a first resource that has been used by the offline service;

a second allocation unit, configured to allocate resources to the new online service according to the service request and the size of the second idle resource, so as to process the new online service.

In one possible implementation, the second allocation unit includes:

a second allocation subunit, configured to allocate the second idle resource to the new online service to process the new online service when the size of the second idle resource is greater than or equal to a second threshold;

a third allocation subunit, configured to stop processing the offline service to obtain the first resource when the size of the second idle resource is smaller than the second threshold, and to allocate the second idle resource and the first resource to the new online service to process the new online service.
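The decision made by the second and third allocation subunits can be sketched as follows. This is a simplified illustration with hypothetical names. It assumes that the second idle resource equals the first idle resource minus the first resource in use by the offline service, and that stopping the offline service returns all of that first resource.

```python
from typing import Dict, Union


def allocate_for_new_online(first_idle: int,
                            offline_used: int,
                            second_threshold: int) -> Dict[str, Union[bool, int]]:
    """Decide how much resource a new online service receives.

    Returns whether the offline service must be stopped and the amount
    of resource made available to the new online service.
    """
    # Second idle resource: what is left after the offline service's share.
    second_idle = first_idle - offline_used
    if second_idle >= second_threshold:
        # Enough is left over: the offline service keeps running.
        return {"offline_stopped": False, "granted": second_idle}
    # Not enough: stop the offline service and reclaim its first resource.
    return {"offline_stopped": True, "granted": second_idle + offline_used}


if __name__ == "__main__":
    print(allocate_for_new_online(first_idle=10, offline_used=4, second_threshold=5))
    print(allocate_for_new_online(first_idle=10, offline_used=8, second_threshold=5))
```

In the first call the offline service keeps running; in the second it is stopped so that its resource is reclaimed for the new online service, and no capacity expansion is triggered in either case.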

In one possible implementation, the obtaining module 81 further includes:

a third obtaining unit, configured to obtain a resource utilization rate of an online service in a first period;

a processing unit, configured to reduce the number of the first containers if the resource utilization rate is less than or equal to a preset threshold.

In one possible implementation, the processing unit includes:

a second acquisition subunit, configured to acquire a target resource utilization rate;

a processing subunit, configured to determine the number of first containers to be removed according to the resource utilization rate and the target resource utilization rate, and to remove that number of first containers.

The resource allocation device provided in the embodiment of the present application is configured to execute the above method embodiments. Its implementation principle and technical effects are similar and are not repeated here.

According to embodiments of the present application, an electronic device and a readable storage medium are also provided.

According to an embodiment of the present application, there is also provided a computer program product comprising: a computer program stored in a readable storage medium, from which at least one processor of an electronic device can read, the at least one processor executing the computer program causing the electronic device to perform the solution provided by any one of the embodiments described above.

Fig. 9 is a block diagram of an electronic device for implementing the resource allocation method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 9, the electronic device 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.

The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, such as a resource allocation method. For example, in some embodiments, the resource allocation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into RAM 903 and executed by the computing unit 901, one or more steps of the resource allocation method described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the resource allocation method by any other suitable means (e.g. by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and virtual private server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims

1. A resource allocation method, comprising:

when the size of the first idle resource is larger than or equal to a first threshold value, acquiring offline service from a message queue, and determining the size of a first resource required for processing the offline service, wherein the size of the first resource is smaller than or equal to the first threshold value;

obtaining a maximum resource value of each second container, wherein the second containers are containers for processing the offline service, the machine resources of the second containers are 0, and the maximum resource value is greater than 0;

and allocating the first idle resource to the offline service in each second container according to the maximum resource value of the second container and the size of the first resource required for processing the offline service.

2. The method of claim 1, wherein obtaining the size of the first free resource comprises:

acquiring machine resources of each first container and resources used by corresponding online business;

and acquiring the size of the first idle resources according to the machine resources of each first container and the used resources.

3. The method of claim 1, wherein the method further comprises:

acquiring a service request of a new online service;

acquiring the size of a second idle resource according to the size of the first idle resource and the size of the first resource used by the offline service;

and allocating resources to the new online service according to the service request and the size of the second idle resources so as to process the new online service.

4. The method of claim 3, wherein allocating resources to the new online service to process the new online service according to the service request and the size of the second free resources comprises:

when the size of the second idle resource is larger than or equal to a second threshold value, the second idle resource is allocated to the new online service so as to process the new online service;

and stopping processing the offline service to obtain the first resource when the size of the second idle resource is smaller than the second threshold value, and distributing the second idle resource and the first resource to the new online service to process the new online service.

5. The method of any of claims 1-4, wherein the method further comprises:

acquiring the resource utilization rate of the online service in a first period;

and if the resource utilization rate is smaller than or equal to a preset threshold value, reducing the number of the first containers.

6. The method of claim 5, wherein reducing the number of first containers if the resource utilization is less than or equal to a preset threshold comprises:

obtaining a target resource utilization rate;

and determining to reduce the number of the first containers according to the resource utilization rate and the target resource utilization rate, and reducing the first containers according to the number.

7. A resource allocation apparatus, comprising:

the allocation module is used for allocating the first idle resources to offline service to process the offline service when the size of the first idle resources is larger than or equal to a first threshold value;

wherein the allocation module comprises:

a first allocation unit, configured to allocate the first idle resource to the offline service according to the size of the first resource;

the first allocation unit includes:

8. The apparatus of claim 7, wherein the acquisition module comprises:

9. The apparatus of claim 7, wherein the allocation module further comprises:

10. The apparatus of claim 9, wherein the second allocation unit comprises:

11. The apparatus of any of claims 7-10, wherein the acquisition module further comprises:

12. The apparatus of claim 11, wherein the processing unit comprises:

13. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.

14. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-6.