CN111193802A - Dynamic resource allocation method, system, terminal and storage medium based on user group

Info

Publication number: CN111193802A
Application number: CN201911423198.3A
Authority: CN
Inventors: 王德奎
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-05-22

Abstract

The invention provides a dynamic resource allocation method, system, terminal and storage medium based on user groups. The method comprises the following steps: setting user groups and setting a weight for each user group; collecting task information under each user group and generating the user group's resource demand from the task information; generating a user group resource allocation order by sorting the resource demands from largest to smallest; calculating each user group's resource allocation amount from the currently available resources and the user group weight; and allocating resources to the user groups according to the allocation order and the allocation amounts. The invention can allocate resources to different user groups fairly and reasonably.

Description

Dynamic resource allocation method, system, terminal and storage medium based on user group

Technical Field

The invention relates to the technical field of cluster resource allocation, in particular to a dynamic resource allocation method, system, terminal and storage medium based on user groups.

Background

In the field of artificial intelligence, model training requires large amounts of computing power to improve training speed, and more and more enterprises and research institutions are purchasing GPU servers as infrastructure for artificial intelligence workloads. Existing deep learning frameworks and classical algorithm models need considerable GPU memory and many GPU cards at run time, and a single training run occupies one or more GPUs. GPUs are a scarce resource and GPU servers are expensive, so an enterprise cannot purchase enough GPU servers to satisfy the GPU requirements of all algorithm engineers at the same time. From the perspective of resource utilization, the operation and maintenance personnel of the infrastructure platform want allocated resources to be fully used so that cluster utilization improves, whereas the algorithm engineers using the GPUs want as many GPU cards as possible so that training tasks finish quickly and model iteration speeds up. When there are different algorithm engineers or different departments, allocating resources fairly and effectively is a major difficulty for the infrastructure operation and maintenance personnel. In addition, in some scenarios, for example when researchers temporarily and urgently need a batch of GPU resources to complete model training and do not want to wait for allocation, the platform needs to support resource allocation for urgent tasks so that a training result can be produced as soon as possible.

Disclosure of Invention

In view of the above-mentioned deficiencies of the prior art, the present invention provides a method, a system, a terminal and a storage medium for dynamically allocating resources based on user groups, so as to solve the above-mentioned technical problems.

In a first aspect, the present invention provides a method for dynamically allocating resources based on user groups, including:

setting a user group and setting the weight of the user group;

acquiring task information under a user group and generating user group resource demand according to the task information;

generating a user group resource allocation sequence according to the resource demand from large to small;

calculating the resource allocation amount of the user group according to the current resource available amount and the user group weight;

and allocating resources to the user group according to the resource allocation sequence and the user group resource allocation quantity.

Further, the method further comprises:

collecting task information under a user group;

removing emergency tasks from the tasks of the user group and adding them to an emergency queue, the emergency queue being allocated resources preferentially;

moving any task that has already been allocated its minimum executable resources to the end of the user group's task queue;

and sorting the tasks in the user group's task queue in descending order of priority, and sorting tasks of equal priority by creation time from earliest to latest.
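The queue-handling steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Task` fields, the `build_queues` name, and the convention that a larger `priority` value means higher priority are assumptions made for the example. It also reads "move to the end" and "sort" as a single ordering in which tasks that already hold their minimum resources always come last.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    priority: int                # assumed convention: larger value = higher priority
    created_at: float            # creation timestamp; smaller = earlier
    urgent: bool = False         # marked as an emergency task by the submitter
    min_satisfied: bool = False  # already received its minimum executable resources


def build_queues(tasks: List[Task]) -> Tuple[List[Task], List[Task]]:
    """Split one user group's tasks into (emergency tasks, ordered group queue)."""
    emergency = [t for t in tasks if t.urgent]
    normal = [t for t in tasks if not t.urgent]
    # Ordering: tasks that already hold their minimum resources go last,
    # then higher priority first, then earlier creation time first.
    normal.sort(key=lambda t: (t.min_satisfied, -t.priority, t.created_at))
    return emergency, normal
```

Folding the three rules into one sort key keeps the ordering deterministic no matter in which sequence the tasks were submitted.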

Further, the method further comprises:

and sequentially issuing the distributed resources of the user group to the tasks in the task queue.

Further, the method further comprises:

collecting the amount of resources already allocated to a user group;

judging whether the resource demand of the user group exceeds the amount of resources already allocated to it:

if yes, adding the user group into a resource queue to be distributed;

and if not, removing the user group from the resource queue to be allocated.
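As a small sketch of this membership test, assuming demand and allocation are each tracked as a single number per user group (the function name `update_pending` and the dictionary layout are illustrative only):

```python
from typing import Dict, Set


def update_pending(demand: Dict[str, float], allocated: Dict[str, float]) -> Set[str]:
    """Return the user groups that belong in the to-be-allocated queue.

    A group stays in the queue only while its demand exceeds what it has
    already been allocated; groups with no allocation record count as 0.
    """
    return {group for group, need in demand.items()
            if need > allocated.get(group, 0.0)}
```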

Further, the calculating the user group resource allocation amount according to the current resource available amount and the user group weight includes:

collecting the current resource available amount of the cluster;

calculating the weight ratio of the user group, wherein the weight ratio is the ratio of the user group weight to the sum of all the user group weights;

calculating the product of the current resource available quantity and the weight ratio, and taking the product as the resource allocation quantity of the user group;

and collecting historical resource allocation quantity of the user group, and taking the sum of the resource allocation quantity and the historical resource allocation quantity as the allocated resource quantity.
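The calculation described above can be illustrated with a short sketch. The function name, argument names and returned dictionary keys are hypothetical; only the arithmetic (weight ratio times available resources, plus historical allocation) comes from the text.

```python
from typing import Dict


def allocation_amounts(weights: Dict[str, float],
                       available: float,
                       history: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Compute, per user group, the new allocation amount and the allocated total."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("sum of user group weights must be positive")
    result = {}
    for group, weight in weights.items():
        # weight ratio = group weight / sum of all group weights
        allocation = available * weight / total_weight
        result[group] = {
            "allocation": allocation,
            # allocated amount = new allocation + historical allocation
            "allocated_total": allocation + history.get(group, 0.0),
        }
    return result
```

For example, with weights `{"a": 1, "b": 3}`, 8 available GPUs and 2 GPUs previously allocated to `a`, group `a` receives an allocation of 2 (allocated total 4) and group `b` an allocation of 6.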

Further, the method further comprises:

judging whether the resource allocation amount of the user group is larger than the resource demand amount:

and if so, allocating to the user group an amount of resources equal to its resource demand, and releasing the surplus, namely the difference between the resource allocation amount and the resource demand.
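A minimal sketch of this capping rule, assuming scalar amounts (the name `cap_to_demand` is illustrative):

```python
from typing import Tuple


def cap_to_demand(allocation: float, demand: float) -> Tuple[float, float]:
    """Return (granted, released) for one user group.

    If the computed allocation exceeds the demand, only the demand is granted
    and the difference is released back to the pool of available resources.
    """
    if allocation > demand:
        return demand, allocation - demand
    return allocation, 0.0
```

Releasing the surplus immediately lets groups that are processed later in the same scheduling period draw on it.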

In a second aspect, the present invention provides a system for dynamically allocating resources based on user groups, including:

a grouping setting unit configured to set a user group and set weights of the user group;

the demand generation unit is configured for acquiring task information under a user group and generating a user group resource demand according to the task information;

the sequence generating unit is configured to generate a user group resource allocation sequence according to the resource demand from large to small;

the quantity value calculation unit is configured for calculating the resource allocation quantity of the user group according to the current resource available quantity and the user group weight;

and the allocation execution unit is configured to allocate resources to the user group according to the resource allocation sequence and the user group resource allocation amount.

Further, the system further comprises:

the information acquisition unit is configured to acquire task information under a user group;

the emergency processing unit is configured to remove emergency tasks from the tasks of the user group and add them to an emergency queue, the emergency queue being allocated resources preferentially;

the task scheduling unit is configured to move any task that has already been allocated its minimum executable resources to the tail of the user group's task queue;

and the queue sorting unit is configured to sort the tasks in the user group's task queue in descending order of priority and to sort tasks of equal priority by creation time from earliest to latest.

In a third aspect, a terminal is provided, including:

a processor, a memory, wherein,

the memory is used for storing a computer program, and

the processor is used for calling and running the computer program from the memory, so that the terminal executes the method described above.

In a fourth aspect, a computer storage medium is provided having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.

The beneficial effect of the invention is that,

The invention provides a method, a system, a terminal and a storage medium for dynamically allocating resources based on user groups. Separate user group scheduling queues are established according to the training tasks submitted by each user group, and each user group's scheduling queue carries information such as weight, priority, the ratio of the user group's resources to the cluster's resources, and whether the queue is an emergency queue. In each scheduling period, after one task has been scheduled, a suitable user group is calculated and selected again, so that resources are allocated to different user groups fairly and reasonably. The invention ensures that each user can obtain resources fairly. When an algorithm engineer needs to create a training task urgently, that task is scheduled and allocated resources first. When different departments or users apply for GPU resources for model training at the same time, the degree to which each department or user is starved of resources is calculated by an algorithm, so that their resource requirements can be prioritized accordingly.

In addition, the invention has reliable design principle, simple structure and very wide application prospect.

Drawings

In order to more clearly illustrate the embodiments or technical solutions in the prior art of the present invention, the drawings used in the description of the embodiments or prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.

FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.

FIG. 2 is a schematic block diagram of a system of one embodiment of the present invention.

Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the drawings in the embodiment of the present invention, and it is obvious that the described embodiment is only a part of the embodiment of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention. The execution subject in fig. 1 may be a system for dynamically allocating resources based on user groups.

As shown in fig. 1, the method 100 includes:

step 110, setting a user group and setting the weight of the user group;

step 120, collecting task information under a user group and generating a user group resource demand according to the task information;

step 130, generating a user group resource allocation sequence according to the resource demand from large to small;

step 140, calculating the resource allocation of the user group according to the current resource available amount and the user group weight;

step 150, allocating resources to the user group according to the resource allocation sequence and the user group resource allocation quantity.
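Steps 120 and 130 can be sketched as below. The tuple layout of a task request, the function names, and above all the choice of GPU count as the sort key are assumptions for illustration; the text does not state how CPU, GPU and memory demand are combined into a single ordering.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One pending task: (user_group, cpu_cores, gpu_cards, memory_gib). Hypothetical layout.
TaskRequest = Tuple[str, float, int, float]


def group_demand(tasks: List[TaskRequest]) -> Dict[str, Dict[str, float]]:
    """Step 120: sum the requested resources of all tasks per user group."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"cpu": 0.0, "gpu": 0, "memory": 0.0})
    for group, cpu, gpu, memory in tasks:
        totals[group]["cpu"] += cpu
        totals[group]["gpu"] += gpu
        totals[group]["memory"] += memory
    return dict(totals)


def allocation_order(demand: Dict[str, Dict[str, float]], key: str = "gpu") -> List[str]:
    """Step 130: order user groups by demand, largest first.

    Sorting on a single resource type (GPU by default) is an assumption.
    """
    return sorted(demand, key=lambda group: demand[group][key], reverse=True)
```

With two `vision` tasks requesting 2 and 1 GPUs, one `nlp` task requesting 4 and one `speech` task requesting 1, the resulting order is `nlp`, `vision`, `speech`.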

To facilitate understanding of the present invention, the user-group-based dynamic resource allocation method is further described below in terms of its principle, in combination with the dynamic allocation process of cluster resources in the embodiment.

Specifically, the method for dynamically allocating resources based on the user group includes:

S1. Scheduling platform construction stage: a Kubernetes cluster and a self-developed service system are deployed. Based on the platform, the operation and maintenance personnel create different user groups for different users or departments, and the user groups comprise the following attributes:

S2. Users belonging to different user groups can submit deep learning training tasks. Each training task specifies the amount of resources it requests, including CPU, memory, storage and GPU. A user group contains users, each user submits training tasks, and a task usually comprises multiple Pods (each Pod can be regarded as one or more containers).
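The relationship between user groups, users, training tasks and Pods might be modeled roughly as follows. All class and field names are hypothetical, and apart from the weight no user group attributes are shown because the text does not list them here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    cpu: float          # requested CPU cores
    memory_gib: float   # requested memory
    gpu: int            # requested GPU cards


@dataclass
class TrainingTask:
    name: str
    user: str                               # submitting user
    pods: List[Pod] = field(default_factory=list)

    def requested_gpu(self) -> int:
        """Total GPU cards requested by all Pods of this task."""
        return sum(pod.gpu for pod in self.pods)


@dataclass
class UserGroup:
    name: str
    weight: float                           # set by operation and maintenance personnel
    users: List[str] = field(default_factory=list)
    tasks: List[TrainingTask] = field(default_factory=list)
```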

S3. When several users belonging to the same user group submit training tasks at the same time (a user may designate a training task as an emergency task), a scheduling queue for that user group is constructed according to the rules below, so that the training tasks of each user group form a priority queue. If a training task is an emergency task, it is placed in the emergency task queue built into the system. The emergency task queue does not distinguish between user groups, so tasks belonging to different user groups can be placed in it.

If the training task already completed resource allocation for its minimum number of Pods in the previous scheduling period, it is placed at the tail of the queue.

Otherwise, training tasks with higher priority are placed nearer the head of the queue; if two training tasks have the same priority, they are ordered by creation time, with the earlier-created task nearer the head.

When a scheduling period begins, the emergency task queue is processed first. The system checks whether the emergency task queue contains tasks to be scheduled; if so, each task is taken out and allocated resources. After the emergency task queue has been traversed, if it still contains tasks in the Pending state, the scheduling period is terminated and the tasks in the user group queues are not processed.
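The start of a scheduling period, as described above, can be sketched like this. The `try_allocate` callback is a hypothetical stand-in for the actual resource allocation against the cluster; it is assumed to return `True` when a task received its resources.

```python
from collections import deque
from typing import Callable, Deque


def process_emergency_queue(queue: Deque[str],
                            try_allocate: Callable[[str], bool]) -> bool:
    """Traverse the emergency queue once.

    Returns True if the user group queues may be processed in this period,
    i.e. no emergency task is left pending after the traversal.
    """
    still_pending: Deque[str] = deque()
    while queue:
        task = queue.popleft()
        if not try_allocate(task):
            still_pending.append(task)   # task stays in the Pending state
    queue.extend(still_pending)          # keep unscheduled tasks for the next period
    return not queue
```

Terminating the period while emergency tasks remain pending means that resources freed later go to those tasks before any regular user group task.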

S4, grouping all deep learning training tasks in the system according to the user group, and in each scheduling period, selecting and processing the training tasks of the user group according to a certain rule to achieve the fair resource allocation effect of the user group, wherein the specific rule is as follows:

Collect the task information under each user group (CPU demand, GPU demand and memory demand) and compute the resource demand $Q_X$ of each user group. Generate the resource allocation order of the user groups by sorting the demands from largest to smallest.

When the historically allocated resource amount of a user group exceeds its resource demand, the user group is removed from the to-be-allocated resource queue and receives no further resources. Otherwise, the user group is added to the to-be-allocated resource queue.

Resource allocation amount of the user group:

$$Q_f = \frac{r}{R} \times Q_k$$

where $r$ is the weight of the user group to be allocated, $R$ is the sum of the weights of all user groups, and $Q_k$ is the amount of resources currently available in the cluster.

The allocated resource amount of the user group is $Q_p = Q_f + Q_l$, where $Q_l$ is the historical resource allocation amount.

Judge whether the resource allocation amount of the user group is larger than its resource demand. If so, allocate to the user group an amount of resources equal to its resource demand and release the surplus, namely the difference between the resource allocation amount and the resource demand.

Calculating a share value of a user group queue: selecting the minimum value of the three as the share value of the queue

Select the user group queue with the smallest share value, take out the task at the head of that queue, and schedule it and allocate resources to it. After the selected training task has been allocated resources, the resource allocation of each user group is calculated again, and a suitable user group queue and training task are selected again. This is repeated until all training tasks of all user groups have been traversed.
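The selection loop described above can be sketched as follows. Because the three quantities that make up the share value are not spelled out in this text, the share computation is left as a caller-supplied function `share_of`; that function, the `allocate` callback and the function name `schedule_cycle` are all hypothetical.

```python
from typing import Callable, Deque, Dict, List, Tuple


def schedule_cycle(queues: Dict[str, Deque[str]],
                   share_of: Callable[[str], float],
                   allocate: Callable[[str, str], None]) -> List[Tuple[str, str]]:
    """Repeatedly serve the non-empty user group queue with the smallest share.

    `share_of(group)` is re-evaluated after every allocation, so a group that
    has just been served may lose its turn to a group with a smaller share.
    Returns the (group, task) pairs in the order they were scheduled.
    """
    order: List[Tuple[str, str]] = []
    while any(queues.values()):
        candidates = [group for group, queue in queues.items() if queue]
        group = min(candidates, key=share_of)
        task = queues[group].popleft()      # task at the head of the chosen queue
        allocate(group, task)
        order.append((group, task))
    return order
```

A caller could, purely as an assumption, define the share as resources already allocated divided by group weight; groups with larger weights would then be served proportionally more often.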

As shown in fig. 2, the system 200 includes:

a group setting unit 210 configured to set a user group and set weights of the user group;

the requirement generation unit 220 is configured to collect task information under a user group and generate a user group resource requirement amount according to the task information;

a sequence generating unit 230 configured to generate a user group resource allocation sequence according to the resource demand from large to small;

a quantity value calculating unit 240 configured to calculate a user group resource allocation quantity according to the current resource available quantity and the user group weight;

an allocation performing unit 250 configured to allocate resources to the user groups according to the resource allocation order and the user group resource allocation amount.

Optionally, as an embodiment of the present invention, the system further includes:

Fig. 3 is a schematic structural diagram of a terminal system 300 according to an embodiment of the present invention, where the terminal system 300 may be used to execute the method for dynamically allocating resources based on user groups according to the embodiment of the present invention.

The terminal system 300 may include: a processor 310, a memory 320, and a communication unit 330. The components communicate via one or more buses, and those skilled in the art will appreciate that the architecture of the servers shown in the figures is not intended to be limiting, and may be a bus architecture, a star architecture, a combination of more or less components than those shown, or a different arrangement of components.

The memory 320 may be used for storing instructions executed by the processor 310, and the memory 320 may be implemented by any type of volatile or non-volatile storage device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk. The executable instructions in memory 320, when executed by processor 310, enable terminal 300 to perform some or all of the steps in the method embodiments described above.

The processor 310 is a control center of the storage terminal, connects various parts of the entire electronic terminal using various interfaces and lines, and performs various functions of the electronic terminal and/or processes data by operating or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory. The processor may be composed of an Integrated Circuit (IC), for example, a single packaged IC, or a plurality of packaged ICs connected with the same or different functions. For example, the processor 310 may include only a Central Processing Unit (CPU). In the embodiment of the present invention, the CPU may be a single operation core, or may include multiple operation cores.

The communication unit 330 is configured to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to other terminals.

The present invention also provides a computer storage medium, wherein the computer storage medium may store a program, and the program may include some or all of the steps in the embodiments provided by the present invention when executed. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a Random Access Memory (RAM).

Therefore, according to the invention, separate user group scheduling queues are established according to the training tasks submitted by each user group, and each user group's scheduling queue carries information such as weight, priority, the ratio of the user group's resources to the cluster's resources, and whether the queue is an emergency queue. In each scheduling period, after one task has been scheduled, a suitable user group is calculated and selected again, so that resources are allocated to different user groups fairly and reasonably. The method ensures that each user can obtain resources fairly. When an algorithm engineer needs to create a training task urgently, that task is scheduled and allocated resources first. When different departments or users apply for GPU resources for model training at the same time, the degree to which each department or user is starved of resources is calculated by an algorithm, so that their resource requirements can be prioritized accordingly. For the technical effects achievable by this embodiment, refer to the description above; details are not repeated here.

Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be embodied in the form of a software product, where the computer software product is stored in a storage medium, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like, and the storage medium can store program codes, and includes instructions for enabling a computer terminal (which may be a personal computer, a server, or a second terminal, a network terminal, and the like) to perform all or part of the steps of the method in the embodiments of the present invention.

The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the terminal embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the description in the method embodiment.

In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, systems or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

Although the present invention has been described in detail with reference to the drawings in connection with the preferred embodiments, the present invention is not limited thereto. Various equivalent modifications or substitutions can be made to the embodiments of the present invention by those skilled in the art without departing from the spirit and scope of the present invention, and such modifications or substitutions, as well as any changes or substitutions that a person skilled in the art can readily conceive of within the technical scope disclosed herein, fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for dynamically allocating resources based on user groups is characterized by comprising the following steps:

setting a user group and setting the weight of the user group;

2. The method of claim 1, further comprising:

collecting task information under a user group;

3. The method of claim 2, further comprising:

4. The method of claim 1, further comprising:

collecting the amount of resources distributed by a user group;

if yes, adding the user group into a resource queue to be distributed;

and if not, removing the user group from the resource queue to be allocated.

5. The method of claim 1, wherein calculating the allocation of the user group resource based on the current resource availability and the user group weight comprises:

collecting the current resource available amount of the cluster;

6. The method of claim 1, further comprising:

7. A system for dynamic resource allocation based on user groups, comprising:

8. The system of claim 7, further comprising:

9. A terminal, comprising:

a processor;

a memory for storing instructions for execution by the processor;

wherein the processor is configured to perform the method of any one of claims 1-6.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.