CN113765949A - Resource allocation method and device - Google Patents


Publication number
CN113765949A
CN113765949A CN202010490013.7A CN202010490013A CN113765949A CN 113765949 A CN113765949 A CN 113765949A CN 202010490013 A CN202010490013 A CN 202010490013A CN 113765949 A CN113765949 A CN 113765949A
Authority
CN
China
Prior art keywords
user
resource
jobs
resource usage
influence factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010490013.7A
Other languages
Chinese (zh)
Inventor
徐华
包小明
王国威
孙宏伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method and a device for resource allocation, wherein the method comprises the following steps: determining characteristics of a plurality of jobs submitted within a preset time period; determining a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, wherein the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster; and determining a target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information and the second influence factor. The technical scheme of the application can realize fair allocation of resources and avoid the problem of unfair resource allocation caused by improper parameter setting.

Description

Resource allocation method and device
Technical Field
The present application relates to the field of network communication technologies, and in particular, to a method and an apparatus for resource allocation.
Background
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results.
During the AI training process, resources need to be allocated for the jobs submitted by the user. In the context of a private cloud data center, multiple users share resources in a node cluster, and therefore how to achieve fair allocation of resources among the multiple users becomes a problem to be solved urgently.
Disclosure of Invention
The application provides a method and a device for resource allocation, which can realize fair allocation of resources and avoid the problem of unfair resource allocation caused by improper parameter setting.
In a first aspect, a method for resource allocation is provided, including: determining characteristics of a plurality of jobs submitted within a preset time period; determining a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, wherein the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster; and determining a target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information and the second influence factor.
In this technical solution, the weight corresponding to the historical resource usage information of the cluster and the weight corresponding to the real-time resource usage information of the cluster are determined according to the characteristics of a plurality of jobs submitted in the cluster within a preset time period. On the one hand, the weights determined in this way are accurate. On the other hand, the weights are flexible: when the type or characteristics of the cluster jobs change, the two weights change correspondingly. This avoids problems such as unfair resource allocation and starvation of other users.
In one possible implementation, the characteristics of the plurality of jobs include the proportion of high-resource-consumption jobs among the plurality of jobs, a high-resource-consumption job being a job that occupies resources greater than a first threshold.
In another possible implementation, the proportion of high-resource-consumption jobs includes one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
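The ratio named above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `(user, resources)` tuple layout, the function name, and the sample values are assumptions made for the example.

```python
def high_consumption_user_ratio(jobs, first_threshold):
    """Ratio of users owning a high-resource-consumption job to all submitting users.

    jobs: iterable of (user, resources) tuples for the preset time period
          (hypothetical layout; resources could be, e.g., a GPU count).
    """
    jobs = list(jobs)
    all_users = {user for user, _ in jobs}
    if not all_users:
        return 0.0
    # A job counts as high-resource-consumption when it occupies more
    # resources than the first threshold.
    heavy_users = {user for user, resources in jobs if resources > first_threshold}
    return len(heavy_users) / len(all_users)


# Jobs submitted within the window: users a and c each own one heavy job.
window = [("a", 8), ("a", 1), ("b", 2), ("c", 16), ("d", 1)]
ratio = high_consumption_user_ratio(window, first_threshold=4)
print(ratio)  # 0.5
```

Two of the four submitting users own a job above the threshold, so the ratio is 0.5.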
In another possible implementation, if the proportion of high-resource-consumption jobs is greater than a second threshold, the first influence factor is set to a value positively correlated with the proportion; if the proportion of high-resource-consumption jobs is less than or equal to the second threshold, the first influence factor is set to a value positively correlated with the proportion.
In another possible implementation, the method further includes: displaying, through a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first influence factor, and the second influence factor; wherein the UI is configured for adjusting the first influence factor and the second influence factor.
In another possible implementation, the historical resource usage proportion of each of a plurality of users in the cluster is determined according to the historical resource usage information of the plurality of users; the real-time resource usage proportion of each of at least one user in the cluster is determined according to the real-time resource usage information of the at least one user; and the target user is determined according to the historical resource usage proportions of the plurality of users, the first influence factor, the real-time resource usage proportion of the at least one user, and the second influence factor.
In another possible implementation manner, the first influence factor is positively correlated with the proportion of the high-resource-consumption job, and the second influence factor is negatively correlated with the proportion of the high-resource-consumption job.
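One way to obtain a pair of factors with these correlations is sketched below. The linear mapping, the `floor` and `ceiling` bounds, and the choice to make the two factors sum to 1 are all assumptions for illustration; the text only requires the stated positive and negative correlations.

```python
def influence_factors(proportion, floor=0.1, ceiling=0.9):
    """Map the proportion of high-resource-consumption jobs to two weights.

    Assumed scheme (not from the source): the first factor follows the
    proportion linearly, clamped to [floor, ceiling]; the second factor is
    its complement, so it decreases as the proportion grows.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    first = min(max(proportion, floor), ceiling)   # weight of historical usage
    second = 1.0 - first                           # weight of real-time usage
    return first, second


low = influence_factors(0.3)
high = influence_factors(0.8)
```

With this sketch, a cluster dominated by high-resource-consumption jobs gives more weight to historical usage, and a cluster dominated by small jobs gives more weight to real-time usage.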
In another possible implementation manner, the target user is determined according to the historical resource usage information of the user, the first influence factor, the real-time resource usage information, the second influence factor and the weight of the user.
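The selection step can be sketched as a weighted blend of usage proportions. The scoring formula below (blended usage divided by the user's weight, lowest score wins) is an assumption chosen for illustration, and every name in it is hypothetical; the text does not fix a specific formula.

```python
def usage_shares(usage):
    """Convert absolute per-user usage into proportions of the cluster total."""
    total = sum(usage.values())
    return {user: (value / total if total else 0.0) for user, value in usage.items()}


def select_target_user(historical, realtime, first_factor, second_factor, weights=None):
    """Pick the user with the lowest weighted usage score (assumed rule)."""
    weights = weights or {}
    hist_share = usage_shares(historical)
    rt_share = usage_shares(realtime)
    users = sorted(set(historical) | set(realtime) | set(weights))

    def score(user):
        blended = (first_factor * hist_share.get(user, 0.0)
                   + second_factor * rt_share.get(user, 0.0))
        # A larger user weight entitles the user to more resources,
        # so it lowers the score.
        return blended / weights.get(user, 1.0)

    return min(users, key=score)


historical = {"a": 100, "b": 50, "c": 50}   # e.g. GPU-hours used so far
realtime = {"a": 0, "b": 6, "c": 2}         # e.g. GPUs currently held

history_heavy = select_target_user(historical, realtime, 0.8, 0.2)
realtime_heavy = select_target_user(historical, realtime, 0.2, 0.8)
print(history_heavy, realtime_heavy)  # c a
```

The same usage data yields a different target user once the two influence factors change, which is why the choice of factors matters for fairness.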
In a second aspect, an apparatus for resource allocation is provided, including:
a determining module, configured to determine characteristics of a plurality of jobs submitted within a preset time period;
the determining module is further configured to determine a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, where the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster;
the determining module is further configured to determine a target user according to historical resource usage information of the user, the first influence factor, real-time resource usage information, and the second influence factor.
In one possible implementation, the characteristics of the plurality of jobs include the proportion of high-resource-consumption jobs among the plurality of jobs, a high-resource-consumption job being a job that occupies resources greater than a first threshold.
In another possible implementation, the proportion of high-resource-consumption jobs includes one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
In another possible implementation, the determining module is specifically configured to: set the first influence factor to a value positively correlated with the proportion if the proportion of high-resource-consumption jobs is greater than a second threshold; and set the first influence factor to a value positively correlated with the proportion if the proportion of high-resource-consumption jobs is less than or equal to the second threshold.
In another possible implementation, the apparatus further includes: a display module, configured to display, through a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first influence factor, and the second influence factor; wherein the UI is configured for adjusting the first influence factor and the second influence factor.
In another possible implementation, the determining module is specifically configured to: determine the historical resource usage proportion of each of a plurality of users in the cluster according to the historical resource usage information of the plurality of users; determine the real-time resource usage proportion of each of at least one user in the cluster according to the real-time resource usage information of the at least one user; and determine the target user according to the historical resource usage proportions of the plurality of users, the first influence factor, the real-time resource usage proportion of the at least one user, and the second influence factor.
In another possible implementation manner, the first influence factor is positively correlated with the proportion of the high-resource-consumption job, and the second influence factor is negatively correlated with the proportion of the high-resource-consumption job.
In another possible implementation, the determining module is specifically configured to: determine the target user according to the historical resource usage information of the user, the first influence factor, the real-time resource usage information, the second influence factor, and the weight of the user.
It will be appreciated that the extensions, definitions, and explanations of relevant content in the first aspect also apply to the same content in the second aspect.
In a third aspect, an apparatus for resource allocation is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform:
determining characteristics of a plurality of jobs submitted within a preset time period; determining a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, wherein the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster; and determining a target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information and the second influence factor.
In one possible implementation, the characteristics of the plurality of jobs include the proportion of high-resource-consumption jobs among the plurality of jobs, a high-resource-consumption job being a job that occupies resources greater than a first threshold.
In another possible implementation, the proportion of high-resource-consumption jobs includes one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
In another possible implementation, if the proportion of high-resource-consumption jobs is greater than a second threshold, the first influence factor is set to a value positively correlated with the proportion;
if the proportion of high-resource-consumption jobs is less than or equal to the second threshold, the first influence factor is set to a value positively correlated with the proportion.
In another possible implementation, the processor is further configured to perform: displaying, through a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first influence factor, and the second influence factor; wherein the UI is configured for adjusting the first influence factor and the second influence factor.
In another possible implementation, the historical resource usage proportion of each of a plurality of users in the cluster is determined according to the historical resource usage information of the plurality of users; the real-time resource usage proportion of each of at least one user in the cluster is determined according to the real-time resource usage information of the at least one user; and the target user is determined according to the historical resource usage proportions of the plurality of users, the first influence factor, the real-time resource usage proportion of the at least one user, and the second influence factor.
In another possible implementation manner, the first influence factor is positively correlated with the proportion of the high-resource-consumption job, and the second influence factor is negatively correlated with the proportion of the high-resource-consumption job.
It will be appreciated that the extensions, definitions, and explanations of relevant content in the first aspect also apply to the same content in the third aspect.
In a fourth aspect, a computer storage medium is provided, which stores program code comprising instructions for performing the steps in the method for resource allocation in the first aspect and any one of the implementations of the first aspect.
The storage medium may specifically be a nonvolatile storage medium.
In a fifth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory, and performs the method for resource allocation in the first aspect or any one of the implementations of the first aspect.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor performs the method for resource allocation in the first aspect or any one of the implementations of the first aspect.
Drawings
Fig. 1 is a schematic diagram of an exemplary fully-connected network model provided in an embodiment of the present application.
Fig. 2 is a schematic diagram of a training process of a neural network model provided in an embodiment of the present application.
Fig. 3 is a schematic diagram of a possible system architecture suitable for use in embodiments of the present application.
Fig. 4 is a schematic flow chart of a method for resource allocation according to an embodiment of the present application.
Fig. 5 is a schematic flow chart of a specific implementation of step 420.
Fig. 6 is a schematic flow chart of a specific implementation of step 430.
Fig. 7 is a schematic block diagram of an apparatus 700 for resource allocation provided by an embodiment of the present application.
Fig. 8 is a hardware structure diagram of an apparatus 800 for resource allocation according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be understood that in the embodiments of the present application, "first", "second", "third", "fourth", etc. are only intended to refer to different objects, and do not indicate other limitations on the objects referred to.
Since the embodiments of the present application relate to a large number of terms in the art, the following description will first describe terms and concepts related to the embodiments of the present application for easy understanding.
1. Job
A job may be a collection of program instances that need to be executed to complete a particular computing service, and typically corresponds to a set of processes, containers, or other runtime entities on one or more computers. A job may contain multiple tasks.
2. Task
A task is an individual instance in the collection of program instances within a job, and typically corresponds to a process, container, or other runtime entity on one computer.
3. Deep neural network
Deep neural networks (DNNs), also called multi-layer neural networks, can be understood as neural networks with multiple hidden layers. According to their positions, the layers inside a DNN fall into three categories: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers. Adjacent layers are fully connected, that is, any neuron of the i-th layer is connected to every neuron of the (i+1)-th layer.
For example, as shown in fig. 1, a typical fully-connected network model includes an input layer 110, a hidden layer 120, a hidden layer 130, and an output layer 140. Data flows in from the input layer 110 and is computed layer by layer until a result is obtained from the output layer 140. Each intermediate layer has a number of parameters, and the output of a layer is computed from these parameters and the output of the previous layer. The model parameters must be fitted to a large amount of data through model training in order to obtain the best model performance.
In the embodiments of the present application, a training job in the deep neural network training process may also be referred to as a deep learning job.
4. High-resource-consumption job
A high-resource-consumption job is a job that occupies resources greater than a first threshold. The high-resource-consumption jobs in the embodiments of the present application may include, but are not limited to, long-time jobs, large jobs, and the like.
5. Long-time job
A long-time job may be a job whose single execution takes longer than a preset threshold time. In one example, a single deep learning job in a deep neural network runs for a long time and can therefore be understood as a long-time job.
For convenience of description, the following first describes a training procedure of the deep neural network model.
Illustratively, fig. 2 is a schematic diagram of a training process of a deep neural network model provided in an embodiment of the present application. The training process includes steps S210 to S280, and the steps S210 to S280 are described in detail below.
S210: Load the network model for the first time.
S220: Input training data into the network model.
S230: Initialize the parameters of the network model according to the training data.
S240: Perform forward propagation.
The forward propagation algorithm performs a series of linear operations and activation operations using the weight coefficient matrices W, the bias vectors b, and the input value vector x. The calculation proceeds layer by layer, starting from the input layer, until it reaches the output layer, where the output result is obtained.
S250: Calculate the loss according to the result.
For example, in the process of training the deep neural network, because the output of the deep neural network is expected to be as close to the value really expected to be predicted as possible, the weight vector of each layer of neural network can be updated according to the difference between the predicted value of the current network and the target value really expected (of course, an initialization process is usually performed before the first update, that is, parameters are configured in advance for each layer in the deep neural network); for example, if the predicted value of the network is high, the weight vector is adjusted to make the predicted value lower, and the adjustment is continued until the deep neural network can predict the real desired target value or a value very close to the real desired target value.
Therefore, it is necessary to define in advance how to measure the difference between the predicted value and the target value. This is the role of the loss function or objective function, an important equation for measuring that difference. Taking the loss function as an example, a higher output value (loss) indicates a larger difference, so training the deep neural network becomes a process of reducing the loss as much as possible.
S260: Perform back propagation.
For example, the neural network may use the back propagation (BP) algorithm to correct the values of the parameters in the initial neural network model during training, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, the input signal is propagated forward until an output is produced, which yields an error loss, and the parameters of the initial neural network model are updated by propagating the error loss information backward, so that the error loss converges. The back propagation algorithm is a backward pass driven by the error loss, aimed at obtaining the optimal parameters of the neural network model, such as the weight matrices.
S270: Continuously update the parameters of the network model.
S280: Save the parameters or weights of the network model.
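Steps S230 to S280 can be illustrated with a deliberately small stand-in for a deep neural network: a single linear neuron trained by gradient descent on a mean squared error loss. The model, the loss, the learning rate, and the toy data are all assumptions chosen to keep the sketch short; a real training job would use many layers and far more data.

```python
def train(data, epochs=500, learning_rate=0.05):
    """Train y = w * x + b on (x, y) pairs with plain gradient descent."""
    w, b = 0.0, 0.0                      # S230: initialize the parameters
    loss = float("inf")
    n = len(data)
    for _ in range(epochs):
        # S240: forward propagation (a linear operation, no activation here)
        predictions = [w * x + b for x, _ in data]
        # S250: loss = mean squared difference between prediction and target
        loss = sum((p - y) ** 2 for p, (_, y) in zip(predictions, data)) / n
        # S260: back propagation, i.e. gradients of the loss w.r.t. w and b
        grad_w = sum(2 * (p - y) * x for p, (x, y) in zip(predictions, data)) / n
        grad_b = sum(2 * (p - y) for p, (_, y) in zip(predictions, data)) / n
        # S270: update the parameters against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return {"w": w, "b": b, "loss": loss}  # S280: parameters to be saved


# Toy data generated from y = 2x + 1 (hypothetical).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
params = train(samples)
```

Even this toy model needs hundreds of iterations to converge, which hints at why full-scale training jobs run for a long time.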
Because training a network model requires a large number of iterations (thousands) to obtain final model parameter values that meet the requirements of the corresponding task, model training of a deep neural network is very time-consuming. Therefore, deep learning jobs include many long-time jobs.
When resources are allocated to such deep learning jobs, once a job starts running on its allocated resources it may take several days before the resources are released. If mechanisms such as preemption are not used, other users cannot use those resources during that period. Other queued jobs may need to wait a long time to obtain resources, and the jobs of other users may be starved.
6. Large job
A large job may be a job whose single run requires more resources than a preset threshold. In one example, a single deep learning job in a deep neural network requires a large amount of resources for a single run and can therefore be understood as a large job.
In deep neural networks, as network models become more complex and data volumes grow, the computation required for model training becomes extremely large, so distributed training is used to meet the timeliness requirement of model generation. Distributed training means that multiple nodes or multiple processors cooperate to process one deep learning job. Thus, a single deep learning job in a deep neural network requires a large amount of resources.
A deep learning job differs from a big data job in that the different subtasks of a deep learning job cannot be processed in batches and require group scheduling (also known as gang scheduling). Group scheduling is a scheduling algorithm for parallel systems that schedules tasks on multiple nodes or multiple processors to run simultaneously. Therefore, when resources are allocated to such a deep learning job, either all the resources required to run the tasks in the job are provided, or no resources are allocated, because partially satisfying the request only wastes resources. Therefore, deep learning jobs include many large jobs.
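The all-or-nothing rule can be sketched as follows. The first-fit-decreasing placement and all names are assumptions for illustration; as a heuristic it may reject a job that a more exhaustive search could place, but it never allocates a partial set of tasks.

```python
def gang_allocate(free_per_node, task_demands):
    """Place every task of a job, or place none of them.

    free_per_node: {node: free resource units}  (hypothetical layout)
    task_demands:  {task: required resource units}
    Returns {task: node} if all tasks fit, otherwise None. The caller's
    free_per_node is never modified, so a failed attempt holds no resources.
    """
    remaining = dict(free_per_node)          # work on a copy
    placement = {}
    # Place the most demanding tasks first (first-fit decreasing).
    for task, demand in sorted(task_demands.items(), key=lambda item: -item[1]):
        node = next((name for name, free in sorted(remaining.items())
                     if free >= demand), None)
        if node is None:
            return None                      # one task cannot run: allocate nothing
        remaining[node] -= demand
        placement[task] = node
    return placement


free = {"n1": 4, "n2": 2}
fits = gang_allocate(free, {"t1": 2, "t2": 2, "t3": 2})
too_big = gang_allocate(free, {"t1": 2, "t2": 2, "t3": 2, "t4": 2})
print(fits)     # {'t1': 'n1', 't2': 'n1', 't3': 'n2'}
print(too_big)  # None
```

The second job needs 8 units while only 6 are free, so none of its tasks is placed and the free capacity stays available to other jobs.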
7. Resource pooling
Because deep learning jobs include many large jobs, the amount of resources they require is large. Meanwhile, because the resources for running deep learning jobs, such as the central processing unit (CPU) or graphics processing unit (GPU) devices of computing nodes, are expensive, the cloud data center has become the best carrier for running deep learning jobs. In a cloud environment, time sharing, resource elasticity, and similar mechanisms reduce each user's cost for the software and hardware environment and maximize resource utilization.
8. Resource allocation
In a private cloud data center scenario, that is, a resource pool shared by multiple users, no single user has exclusive use of private resources, and multiple users/tenants share the cluster resources. A dedicated scheduler is therefore needed to schedule the jobs of different users and to select appropriate nodes on which to run the different tasks of those jobs. On the one hand, this meets the jobs' requirements for hardware and software environments; on the other hand, it improves resource utilization and achieves resource sharing and time-division multiplexing.
Basic resource allocation means that, when users in a cluster have resource requirements, resource scheduling is performed for the different jobs of multiple users, and appropriate nodes and resources are selected on which to place the tasks of those jobs. Specifically, the free resources in the cluster may be allocated to users according to a certain rule, so that the users obtain the resources needed to complete their jobs/tasks.
Fair resource allocation applies when multiple users/tenants share cluster resources and there is no pricing lever. Fair allocation among users must then be ensured so that malicious competition for resources among users is avoided.
As an example, in a private cloud or enterprise data center, the total cluster resources are limited and are shared by multiple users. On the one hand, deep learning workloads process huge amounts of data, so demand for cluster resources often exceeds supply. On the other hand, users of a shared cluster tend to apply for a large amount of resources in the hope of obtaining more resources in the competition among users. Thus, in such shared cluster scenarios, fair scheduling has become a common method and mechanism for data center resource management and scheduling.
Generally, the amount of resources that different users are expected to obtain can be defined by weights. For example, if the weights of users A, B, and C are 1:1:2, then when their jobs compete fully, A, B, and C obtain 1/4, 1/4, and 1/2 of the cluster resources respectively. The jobs of users such as A and B no longer have absolute, statically configured priorities; instead, dynamic priorities are calculated from the users' weights and the resources they have already acquired, thereby achieving fair resource allocation.
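The 1:1:2 example can be worked through directly. The share computation follows the text; the deficit-based ordering used for the dynamic priority is an assumption for illustration, since the text does not give a formula.

```python
from fractions import Fraction


def fair_shares(weights):
    """Entitled fraction of the cluster for each user, from integer weights."""
    total = sum(weights.values())
    return {user: Fraction(weight, total) for user, weight in weights.items()}


def dynamic_priority_order(weights, acquired):
    """Order users by how far below their entitled share they are (assumed rule)."""
    shares = fair_shares(weights)
    total_used = sum(acquired.values())

    def deficit(user):
        used = Fraction(acquired.get(user, 0), total_used) if total_used else Fraction(0)
        return shares[user] - used

    # Largest deficit first; ties are broken by user name.
    return sorted(weights, key=lambda user: (-deficit(user), user))


weights = {"A": 1, "B": 1, "C": 2}
shares = fair_shares(weights)
order = dynamic_priority_order(weights, acquired={"A": 4, "B": 2, "C": 2})
print(order)  # ['C', 'B', 'A']
```

Here C is entitled to half of the cluster but holds a quarter, so it is served first, while A holds twice its entitled share and is served last.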
In the fair resource allocation scenario, the weight of each user is defined by the administrator. The cluster administrator assigns weights to different users or departments based on various factors (such as the headcount of each department or the fees it has paid) and uses the weights as input to fair scheduling. Users can submit job or task resource requirements at will, and the scheduling system automatically allocates resources to them with a fair resource allocation method, according to the weights of the different users, the current cluster information, and so on. Each user thus obtains resources in proportion to the corresponding weight, which guarantees fair use of the cluster resources.
On the one hand, fair resource allocation encourages full sharing of resources and improves resource utilization. On the other hand, it provides a multi-user allocation scheme for scenarios in which cluster resources are scarce, avoiding malicious competition and preventing users' jobs from being starved. Without a fair resource allocation mechanism, unrestricted competition among user jobs in the cluster can let a malicious user obtain excessive resources and occupy the whole cluster. The jobs of other users are then starved and fail, resources are allocated in a disorderly manner, and the enterprise's business execution requirements (such as periodic model training and updating) cannot be met in time, which in turn affects the business operation of the whole enterprise.
9. Hierarchical user model
In the actual resource usage and management scenarios of an enterprise or a large organization, users are not managed in a flat structure: several users form a group, several groups form a department, and several departments form a large organizational structure. A hierarchical user model is therefore better suited to the management requirements of an enterprise or organization. Accordingly, a fair resource allocation scheme needs to consider the fair allocation requirements of each organizational level, and the hierarchical resource management model of the organization and the weight of each organizational unit can be defined by an administrator. The resource allocation scheme needs to allocate resources according to the usage of cluster resources and the hierarchical management model, and to meet the fair allocation requirement of each level.
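A hierarchical model of this kind can be sketched as a weighted tree in which each level divides its parent's share among its children. The tree layout, the names, and the weights below are hypothetical, and multiplying normalized weights down the tree is one common way to derive per-user entitlements, not necessarily the one used in this application.

```python
def leaf_shares(children, parent_share=1.0):
    """Split parent_share among children by weight, recursing into sub-levels.

    children: {name: (weight, sub_children or None)}  (hypothetical layout)
    Returns {leaf_user: fraction of the whole cluster}.
    """
    total_weight = sum(weight for weight, _ in children.values())
    shares = {}
    for name, (weight, sub_children) in children.items():
        share = parent_share * weight / total_weight
        if sub_children:
            shares.update(leaf_shares(sub_children, share))
        else:
            shares[name] = share
    return shares


# Two departments weighted 1:3; users are weighted within their department.
organization = {
    "dept1": (1, {"alice": (1, None), "bob": (1, None)}),
    "dept2": (3, {"carol": (2, None), "dave": (1, None)}),
}
entitlements = leaf_shares(organization)
print(entitlements)
```

In this example dept1 is entitled to a quarter of the cluster, split evenly between its two users, while dept2's three quarters are split 2:1.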
In the related technical solution, the dynamic priority of each user is calculated according to the historical resource usage information and the real-time resource usage information of each user, and a suitable node (for example, a server) is selected for different tasks of the job of the high-priority user and resources are allocated, so that the node is used for running the tasks included in the job of the high-priority user.
However, in this related technical solution, the weight of the historical resource usage information (which may also be referred to as an impact factor) and the weight of the real-time resource usage information (which may also be referred to as an impact factor) are set based on the experience of the administrator. On the one hand, weights configured from the administrator's experience are not very accurate, which may lead to unfair resource allocation. If the weight of the real-time resource usage information is set too large, then in a scenario with many jobs, for example, training of a deep neural network model, a user with less real-time resource usage may obtain more resources, while other users cannot obtain the resources needed to complete their jobs/tasks, so that the jobs of other users starve. If the weight of the historical resource usage information is set too large, the real-time resource usage of the user is ignored, and resources are continuously allocated to the jobs/tasks of the high-priority user (the user with the least historical resource usage has the highest priority). In a scenario with few large jobs and many small jobs, a large amount of resources may then be allocated to the high-priority user at one time, so that a single user is allocated too many resources in a single cycle and other users queue for a long time. On the other hand, the weights are configured manually by the administrator when the cluster is started, so flexibility is poor and the configuration only suits scenarios in which the cluster has a single job type that does not change for a long time. If the job type in the cluster changes, the weights cannot be adjusted in real time, resulting in unfair resource allocation or starvation of other jobs.
In view of this, the present application provides a resource allocation method and a resource allocation device, which automatically sense and analyze the characteristics of the cluster job in real time, and adjust the weight of the historical resource usage information and the weight of the real-time resource usage information in real time according to the characteristics of the cluster job. The accuracy of parameter configuration is improved, the experience dependence of parameter setting of cluster resource allocation on an administrator is avoided, and the configuration difficulty is reduced, so that the problems of operation starvation, unfairness and the like in different scenes are reduced.
Fig. 3 is a schematic diagram of a possible system architecture suitable for use in embodiments of the present application.
As shown in fig. 3, the system architecture may include a control node 310 and underlying hardware resources 320. Control node 310 may include a graphical user interface/client 311, a job management service 312, and a resource management service 313.
Illustratively, the user graphical interface/client 311 may be used to receive jobs submitted from different users. Job management service 312 may be used to manage and submit jobs received from different users. The resource management service 313 may include a resource management device 3131 and a resource allocation device 3132, wherein the resource management device 3131 may be configured to bind and release resources, and the resource allocation device 3132 may schedule resources for jobs according to the requirements of different jobs. The underlying hardware resources 320 may include, but are not limited to, a Central Processing Unit (CPU), memory, network, GPU, and Remote Direct Memory Access (RDMA).
Illustratively, a user may submit a job via the user graphical interface/client 311. Upon receiving the request, the job management service 312 may parse the job and submit a resource request to the resource management service 313. Upon receiving the resource request, the resource management service 313 may, through the resource allocation device 3132, select an appropriate node from the managed underlying hardware resources 320 (i.e., the underlying physical resources) on which to place the job. After the resource allocation device 3132 completes the selection of the node, the corresponding job is started on the corresponding node and occupies this part of the resources until the job finishes and the resources are released through the resource management device 3131.
The method for resource allocation according to the embodiment of the present application is described in detail below with reference to fig. 4.
The method of resource allocation shown in fig. 4 can be performed by the resource allocation device 3132 shown in fig. 3, and is applied to the system architecture shown in fig. 3. The method shown in FIG. 4 includes steps 410-440, and the steps 410-440 are described in detail below.
Step 410: characteristics of a plurality of jobs submitted within a preset time period are determined.
In the embodiment of the application, the characteristics of a plurality of jobs submitted in a past period of time can be determined. In particular, as an example, the characteristics of the plurality of jobs may include the proportion of jobs with high resource consumption among the plurality of jobs. The proportion of jobs with high resource consumption may include one or more of the following: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
It should be understood that the proportion of the number of users of the job with high resource consumption to the number of users may be a ratio of the number of users of the job with high resource consumption to the number of users corresponding to all jobs. The all jobs may be a historical job currently running and a currently submitted job, or may also be a historical job and a currently submitted job within a historical period of time, which is not specifically limited in this application.
Step 420: determining a first impact factor and a second impact factor based on characteristics of the plurality of jobs.
The first influence factor is a weight corresponding to historical resource usage information of the cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster. It is understood that in some embodiments, the sum of the first impact factor and the second impact factor is less than or equal to 1.
In the embodiment of the application, the characteristic of the jobs is positively correlated with the first influence factor and negatively correlated with the second influence factor. As an example, the characteristic is the proportion of jobs with high resource consumption among the plurality of jobs. That is, if this proportion is large, the first influence factor may be set large and the second influence factor may be set small; if this proportion is small, the first influence factor may be set small and the second influence factor may be set large.
In one possible implementation, if the proportion of high-resource-consumption jobs is greater than the second threshold, the first impact factor is set to a value that is positively correlated with the proportion of high-resource-consumption jobs, and the second impact factor is set to a value that is negatively correlated with the proportion of high-resource-consumption jobs. In another possible implementation, if the proportion of high-resource-consumption jobs is less than or equal to the second threshold, the first impact factor is set to a value positively correlated with the proportion of high-resource-consumption jobs, and the second impact factor is set to a value negatively correlated with the proportion of high-resource-consumption jobs.
Step 430: determining a target user according to the historical resource usage information of the user, the first influence factor, the real-time resource usage information, and the second influence factor.
The target user in this embodiment of the present application may be a high-priority user, that is, for a plurality of users in the cluster, the plurality of users may be prioritized according to the historical resource usage information of each user, the first impact factor, the real-time resource usage information of each user, and the second impact factor, and a high-priority user is selected from the plurality of users.
It should be noted that the first influence factor and the second influence factor determined in the embodiment of the present application are for the entire cluster, that is, for a plurality of users in the cluster, the first influence factor and the second influence factor used by each user are the same.
Optionally, in some embodiments, the historical resource usage proportion of each of the plurality of users in the cluster may be further determined according to the historical resource usage information of the plurality of users, and the real-time resource usage proportion of each of the at least one user in the cluster is determined according to the real-time resource usage information of the at least one user, so that the target user is determined according to the historical resource usage proportion of the plurality of users, the first impact factor, the real-time resource usage proportion of the at least one user, and the second impact factor.
Step 440: resources are allocated for at least one job or task submitted by a target user.
In the embodiment of the application, after the target user is determined, one job is selected from at least one job submitted by the target user, and a node is selected for the job and a task is distributed. It is also possible to select one task from a plurality of tasks included in one job of the target user, and select a node for the task and distribute the task.
In the technical scheme, the weight corresponding to the historical resource use information of the cluster and the weight corresponding to the real-time resource use information of the cluster are determined according to the characteristics of a plurality of jobs submitted in the cluster within a preset time period. On one hand, the determined weight corresponding to the historical resource use information of the cluster and the weight corresponding to the real-time resource use information of the cluster are accurate. On the other hand, the above-mentioned weights have better flexibility, and when the cluster job type or feature changes, the weight corresponding to the historical resource usage information of the cluster and the weight corresponding to the real-time resource usage information of the cluster also change correspondingly. Thus, the problems of unfair resource distribution, starvation of other users and the like can be avoided.
Optionally, in some embodiments, one or more of the following may also be displayed through a user interface (UI): a characteristic of the plurality of jobs, the first impact factor, and the second impact factor. The UI is configured to adjust the first impact factor and the second impact factor.
Optionally, the UI may also display one or more of: cluster resource usage, job queuing, etc.
A specific implementation of step 420 is described below with reference to the specific example in fig. 5 by taking the system architecture shown in fig. 3 as an example.
It should be understood that the example of fig. 5 is only for assisting the skilled person in understanding the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific values or specific scenarios of fig. 5. It will be apparent to those skilled in the art from the examples given that various equivalent modifications or variations can be made, and such modifications and variations also fall within the scope of the embodiments of the application.
The method shown in FIG. 5 may include steps 510-530, and the steps 510-530 are described in detail below.
Step 510: acquiring and recording the task running conditions of all users in the cluster in real time.
In this embodiment of the present application, the task operation conditions of the users in the cluster may include, but are not limited to: the resource usage amount of the completed task and the running task of the user, the starting time of the running task and the like.
Optionally, in some embodiments, the historical resource usage of each user over a past period of time may also be processed in a sliding average or the like manner, increasing the impact of recent resource usage on priority assignment.
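The sliding-average idea can be sketched as follows. This is only an illustration: the application does not specify the averaging scheme, so the exponentially decaying weights, the `decay` parameter, and the function name `smoothed_usage` are assumptions chosen to show how recent periods can be given more influence than older ones.

```python
def smoothed_usage(period_usage: list[float], decay: float = 0.5) -> float:
    """Weighted average of per-period usage, ordered oldest to newest.

    Assumption: a period that is `age` periods old gets weight decay**age,
    so the newest period (age 0) always has weight 1.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    if not period_usage:
        return 0.0
    last = len(period_usage) - 1
    weights = [decay ** (last - i) for i in range(len(period_usage))]
    weighted_sum = sum(w * u for w, u in zip(weights, period_usage))
    return weighted_sum / sum(weights)


# A user who was idle for two periods and then used 8 units most recently.
recent_heavy = [0.0, 0.0, 8.0]
print(smoothed_usage(recent_heavy))             # weighted toward the newest period
print(smoothed_usage(recent_heavy, decay=1.0))  # decay=1.0 is the plain mean
```

With `decay` below 1, the recent burst counts for more than it would in a plain mean, which is the effect the paragraph above describes.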
Optionally, in some embodiments, only the resource usage amount of the user in each period may be recorded in a periodic calculation manner, and real-time updating is not required, so that the processing efficiency may be improved.
Step 520: counting and analyzing the job characteristics of the cluster in the latest period of time.
In the embodiment of the present application, the job characteristics of the cluster in the latest period of time may be determined according to the task running conditions of all users in the cluster collected and recorded in step 510. The job characteristics may include, but are not limited to: the number and proportion of jobs with high resource consumption demand, the average waiting time of jobs, the number of queued jobs, and the required resources.
It should be understood that the jobs with high resource consumption demand in the embodiments of the present application may include, but are not limited to: large jobs and long-time jobs.
One possible method of determining whether a user-submitted job is a large job is described below.
In the embodiment of the application, the amount of resources $R_i$ that each user should theoretically acquire can be calculated. If the total amount of resources is Total, in one possible implementation, in a scenario where users have no weights, the amount of resources that each user should acquire is
$$R_i = \frac{\mathrm{Total}}{n}$$
where n is the number of users. In another possible implementation, in a scenario where users have weights, if the weight of user i is $w_i$, the amount of resources that each user should acquire is
$$R_i = \mathrm{Total} \times \frac{w_i}{\sum_{j=1}^{n} w_j}$$
If the resource demand of a task, or of a plurality of tasks in a job that need to be scheduled simultaneously, exceeds the amount of resources $R_i$ that the user should acquire, the job may be determined to be a large job.
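The large-job test can be sketched as below. This is a minimal sketch under stated assumptions: a single resource dimension, user weights that are normalised by their sum (with equal weights this reduces to Total divided by the number of users), and hypothetical names `fair_share` and `is_large_job` that do not appear in the application.

```python
def fair_share(total: float, weights: dict[str, float]) -> dict[str, float]:
    """Amount of resources each user should theoretically acquire.

    Assumption: each user's share is proportional to its weight.
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return {user: total * w / weight_sum for user, w in weights.items()}


def is_large_job(simultaneous_demand: float, user: str,
                 shares: dict[str, float]) -> bool:
    """A job is large if the tasks it needs scheduled at the same time
    demand more than the submitting user's theoretical share."""
    return simultaneous_demand > shares[user]


# 120 GPUs shared by three users with equal weights: 40 GPUs each.
shares = fair_share(120, {"alice": 1, "bob": 1, "carol": 1})
print(shares["alice"])                      # 40.0
print(is_large_job(64, "alice", shares))    # True: 64 > 40
print(is_large_job(8, "alice", shares))     # False: 8 <= 40
```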
One possible method of determining whether a user-submitted job is a long-term job is described below.
In the embodiment of the application, if the running time of a certain task or a certain job exceeds a preset threshold, the job can be determined as a long-time job.
It should be noted that the preset threshold may be a period of historical resource usage statistics, or may also be a value set by an administrator in a self-defined manner, which is not specifically limited in this application.
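The long-time job check reduces to a comparison of running time against the preset threshold. The sketch below assumes the threshold is a one-hour statistics period; the function name `is_long_job` and the example timestamps are illustrative only.

```python
from datetime import datetime, timedelta


def is_long_job(started_at: datetime, now: datetime,
                threshold: timedelta) -> bool:
    """A job whose running time exceeds the preset threshold is long-time."""
    return now - started_at > threshold


# Assumption: the threshold equals a one-hour historical statistics period.
stats_period = timedelta(hours=1)
now = datetime(2020, 6, 2, 12, 0)
print(is_long_job(datetime(2020, 6, 2, 9, 30), now, stats_period))   # True
print(is_long_job(datetime(2020, 6, 2, 11, 45), now, stats_period))  # False
```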
In this embodiment of the application, the job characteristics of the cluster obtained by the analysis in the recent period of time may also be presented to the administrator in a visual manner, for example, presented to the administrator through the user graphical interface/client 311.
Step 530: determining the impact factor parameter of historical resource usage and the impact factor parameter of real-time resource usage according to the job characteristics of the cluster in the latest period of time.
In the embodiment of the application, the job characteristics of the cluster in the latest period of time, such as the number and proportion of jobs with high resource consumption demand and the average waiting time of jobs, can be acquired, and the reference value of the impact factor parameter of historical resource usage and the reference value of the impact factor parameter of real-time resource usage are calculated from them.
As an example, job duty information of high resource consumption in a cluster in a past period of time may be determined, and a reference value of an impact factor parameter of historical resource usage may be determined based on the job duty information of high resource consumption. And determining the reference value of the real-time resource use influence factor parameter according to the reference value of the history resource use influence factor parameter.
Specifically, if the proportion of high-resource-consumption jobs in the cluster in the past period of time is larger, resource allocation tends to be performed based on historical resource usage, and the reference value of the impact factor parameter of historical resource usage may be set higher, which can prevent the large jobs of users from starving. Conversely, if the proportion of high-resource-consumption jobs is smaller, resource allocation tends to be performed based on the real-time resource usage proportion, and the reference value of the impact factor parameter of real-time resource usage can be set higher. In a scenario with many small jobs, this avoids problems such as a large amount of resources being allocated to the high-priority user at one time, a single user being allocated too many resources in a single cycle, and other users queuing for a long time.
It should be understood that there are various implementations of determining job proportion information of high resource consumption in a cluster in the past period, and this is not specifically limited in the embodiment of the present application. In one possible implementation, the job duty of high resource consumption may be a ratio of the number of jobs of high resource consumption in a past period of time to the total number of jobs in the past period of time. In another possible implementation, the job proportion of high resource consumption may also be a ratio of the number of users who submitted the job of high resource consumption in the past period of time to the total number of users.
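The two ways of computing the proportion described above can be sketched as follows. The `Job` record and both function names are hypothetical, and the sketch assumes that the "total number of users" means the users who submitted at least one job in the window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    user: str
    high_consumption: bool  # True if the job is a large or long-time job


def proportion_by_jobs(jobs: list[Job]) -> float:
    """High-resource-consumption jobs divided by all jobs in the window."""
    if not jobs:
        return 0.0
    return sum(j.high_consumption for j in jobs) / len(jobs)


def proportion_by_users(jobs: list[Job]) -> float:
    """Users with at least one high-resource-consumption job divided by
    all users who submitted a job in the window."""
    all_users = {j.user for j in jobs}
    if not all_users:
        return 0.0
    heavy_users = {j.user for j in jobs if j.high_consumption}
    return len(heavy_users) / len(all_users)


window = [
    Job("alice", True),
    Job("alice", False),
    Job("bob", False),
    Job("carol", False),
]
print(proportion_by_jobs(window))   # 1 of 4 jobs
print(proportion_by_users(window))  # 1 of 3 users
```

The two definitions can differ noticeably when one user submits many jobs, so which one is used affects the resulting impact factors.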
Specifically, assuming that the proportion of high-resource-consumption jobs is β, in the embodiment of the present application, a reference value α of the impact factor of historical resource usage may be calculated based on β, where the reference value α and the proportion β are linearly related. For example, α = kβ + v, where k and v may be set by the algorithm itself, provided that α ≤ 1.
It should be noted that k and v need not be configured by an administrator and are set when the algorithm is developed. For example, k is 0.8 and v is 0.2. For another example, k is 1 and v is 0.
In the embodiment of the application, the reference value of the influence factor of the historical resource usage is alpha, and the reference value of the influence factor of the real-time resource usage is 1-alpha.
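The linear mapping from the proportion β to the two reference values can be sketched as below. The defaults k = 0.8 and v = 0.2 are one of the example settings given above; the function name `impact_factors` and the clamping of α to [0, 1] are assumptions added here to enforce the α ≤ 1 condition for arbitrary k and v.

```python
def impact_factors(beta: float, k: float = 0.8,
                   v: float = 0.2) -> tuple[float, float]:
    """Return (alpha, 1 - alpha) for a high-resource-consumption proportion beta.

    alpha is the reference impact factor of historical resource usage and
    1 - alpha is the reference impact factor of real-time resource usage.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be a proportion in [0, 1]")
    alpha = k * beta + v
    alpha = min(max(alpha, 0.0), 1.0)  # assumption: clamp to keep alpha <= 1
    return alpha, 1.0 - alpha


for beta in (0.0, 0.5, 1.0):
    alpha, realtime = impact_factors(beta)
    print(f"beta={beta:.1f} -> alpha={alpha:.2f}, 1-alpha={realtime:.2f}")
```

A larger β raises α and lowers 1 − α, which matches the positive and negative correlations described for the first and second impact factors.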
In the embodiment of the present application, the reference value of the impact factor of the historical resource usage and the reference value of the impact factor of the real-time resource usage may also be visually presented to the administrator, for example, presented to the administrator through the user graphical interface/client 311.
In a possible implementation manner, an administrator may obtain job characteristics of the cluster in a recent period of time through the gui/client 311, and modify or adjust the reference value of the impact factor of the historical resource usage and the reference value of the impact factor of the real-time resource usage based on the job characteristics, so as to determine the priority of the user according to the modified or adjusted reference values of the impact factors of the historical resource usage and the reference values of the impact factors of the real-time resource usage. Therefore, an administrator can adjust the values of the influence factor parameters of historical resource use and real-time resource use according to the cluster running condition, so that the values of the influence factor parameters are more accurate, and fair distribution of resources in the cluster is realized. In another possible implementation manner, the administrator may also choose not to modify or adjust the reference value of the impact factor of the historical resource usage and the reference value of the impact factor of the real-time resource usage.
In another possible implementation manner, the administrator may also select an automatic execution manner through the user graphical interface/client 311 interface, and directly determine the priority of the user according to the reference value of the impact factor of the historical resource usage and the reference value of the impact factor of the real-time resource usage. In this way, the parameters can be automatically updated, and the manual participation of the administrator is reduced.
Taking the system architecture shown in fig. 3 as an example, a specific implementation of step 430 will be described with reference to the specific example in fig. 6.
It should be understood that the example of fig. 6 is only for assisting the skilled person in understanding the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific values or specific scenarios of fig. 6. It will be apparent to those skilled in the art from the examples given that various equivalent modifications or variations can be made, and such modifications and variations also fall within the scope of the embodiments of the application.
The method shown in fig. 6 may include steps 610-650, and the steps 610-650 are respectively described in detail below.
Step 610: the historical resource usage fraction of each user is calculated.
In the embodiment of the application, historical resource usage information of all tasks of each user in a past period of time can be counted. The usage information of the historical resource may be the resource usage amount of the completed task × the resource usage time. And calculating the historical resource usage ratio of each user according to the usage information of the historical resources.
As an example, for each user, each resource dimension can be counted separately, and the historical usage proportion of each single-dimension resource of the user in the whole cluster (which may also be referred to as the historical resource usage proportion) is calculated. It should be understood that, among the usage proportions of the user's multiple resource dimensions, the dominant resource dimension of the user can be found. Generally, the maximum of the usage proportions over the resource dimensions can be taken as the final historical resource usage proportion of the user.
Step 620: calculating the real-time resource usage proportion of each user.
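Taking the maximum over the resource dimensions can be sketched as follows. The resource names, the capacity figures (expressed as amount × time over the statistics window), and the function name `dominant_share` are illustrative assumptions.

```python
def dominant_share(usage: dict[str, float],
                   capacity: dict[str, float]) -> float:
    """Largest per-dimension usage proportion of one user in the cluster.

    usage and capacity are both in amount x time over the same window;
    a dimension the user did not use counts as 0.
    """
    return max(usage.get(dim, 0.0) / capacity[dim] for dim in capacity)


# Assumed cluster capacity over the window, in resource-hours.
cluster_capacity = {"cpu": 1000.0, "gpu": 200.0, "mem": 4000.0}
alice_usage = {"cpu": 100.0, "gpu": 50.0, "mem": 200.0}

# Per-dimension proportions are 0.10 (cpu), 0.25 (gpu) and 0.05 (mem),
# so GPU is the dominant dimension.
print(dominant_share(alice_usage, cluster_capacity))  # 0.25
```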
In the embodiment of the application, the resource usage amount of each user running the task can be obtained, and the real-time resource usage proportion of the user is determined according to the resource usage amount.
It should be understood that the method for calculating the real-time resource usage ratio of each user is similar to the above method for calculating the historical resource usage ratio of each user, and the real-time resource usage ratio of the single-dimensional resource of each user in the whole cluster can also be calculated for each user.
Step 630: calculating the final resource usage proportion of each user according to the historical resource usage proportion and the real-time resource usage proportion of each user.
In the embodiment of the application, the final resource usage proportion of each user can be determined by weighted summation according to the historical resource usage proportion, the impact factor α of historical resource usage, the real-time resource usage proportion, and the impact factor 1 − α of real-time resource usage. As an example, the final resource usage proportion of the user = historical resource usage proportion × α + real-time resource usage proportion × (1 − α).
Step 640: finding the user with the highest priority.
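As a worked illustration of the weighted summation with assumed values (α = 0.6, a historical resource usage proportion of 0.30, and a real-time resource usage proportion of 0.10; the symbols S and all numbers are illustrative and do not come from the application):

```latex
\begin{aligned}
S_{\text{final}} &= \alpha \, S_{\text{hist}} + (1-\alpha)\, S_{\text{real}} \\
                 &= 0.6 \times 0.30 + 0.4 \times 0.10 \\
                 &= 0.18 + 0.04 = 0.22
\end{aligned}
```

With a smaller α, the same user's low real-time usage would pull the final proportion further down and raise the user's priority.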
In the embodiment of the application, the priority of the user can be dynamically determined according to the final resource utilization ratio of the user. There are various specific implementations, and two possible implementations are described below.
In one possible implementation, when every user has the same weight, the user with the smallest final resource usage proportion has the highest priority, and resource allocation is attempted first for the jobs/tasks submitted by the highest-priority user.
In another possible implementation, when the weights of the users are not the same, the weight of each user is the proportion of resources that the user should obtain. The user with the highest priority can be found by comparing the final resource usage proportion of each user with the proportion of resources that the user should obtain. For example, the ratio of the final resource usage proportion of the user to the proportion of resources that the user should obtain is calculated; the smaller the ratio, the higher the priority of the user, and resource allocation can be attempted first for the jobs/tasks submitted by the user with the highest priority.
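Steps 630 and 640 can be sketched together as below. The function name `pick_target_user`, the example proportions, and the tie-breaking rule (the first user in dictionary order wins a tie) are assumptions for illustration; `due` holds the proportion of resources each user should obtain and is omitted when all users have the same weight.

```python
from typing import Optional


def pick_target_user(hist: dict[str, float], live: dict[str, float],
                     alpha: float,
                     due: Optional[dict[str, float]] = None) -> str:
    """Return the highest-priority user.

    final = hist * alpha + live * (1 - alpha); the smallest final proportion
    wins, or the smallest final / due ratio when users have weights.
    """
    final = {u: alpha * hist[u] + (1.0 - alpha) * live.get(u, 0.0)
             for u in hist}
    if due is None:
        score = final
    else:
        score = {u: final[u] / due[u] for u in final}
    return min(score, key=score.get)


hist = {"alice": 0.6, "bob": 0.2, "carol": 0.2}
live = {"alice": 0.0, "bob": 0.5, "carol": 0.3}

# Equal weights: final proportions are about 0.30, 0.35 and 0.25.
print(pick_target_user(hist, live, alpha=0.5))  # carol

# Weighted: alice is entitled to 60% of the cluster, so her ratio is lowest.
due = {"alice": 0.6, "bob": 0.2, "carol": 0.2}
print(pick_target_user(hist, live, alpha=0.5, due=due))  # alice
```

Because the same α is applied to every user, changing α shifts the whole ranking between history-driven and real-time-driven fairness rather than favouring any single user.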
Step 650: resources are allocated for jobs/tasks submitted by the highest priority user.
In the embodiment of the present application, after the user with the highest priority is determined, the task/job with the highest priority among the tasks/jobs submitted by that user may also be determined, and, according to the cluster operating conditions, an attempt may be made first to select a suitable node from the managed underlying hardware resources 320 (i.e., the underlying physical resources) on which to place that task/job.
It should be understood that the above cluster operating conditions may include, but are not limited to: total cluster resources, number of nodes, resource utilization rate, and the like.
In this technical solution, a mechanism for automatically analyzing and sensing cluster job load characteristics is introduced, and the cluster load characteristics and the suggested impact factor parameters are presented to the administrator, so that the optimal cluster configuration decision is made in a semi-automatic or fully automatic manner. The administrator can let the cluster calculate and update the algorithm parameter configuration automatically, or can manually adjust the configuration parameters of the cluster according to the suggested values given by the system together with the administrator's own experience. This reduces the dependence on the administrator's experience, improves the accuracy of parameter configuration, and reduces the unfair resource allocation that improper parameter settings may cause.
It is to be understood that the above description is intended to assist those skilled in the art in understanding the embodiments of the present application and is not intended to limit the embodiments of the present application to the particular values or particular scenarios illustrated. It will be apparent to those skilled in the art from the foregoing description that various equivalent modifications or changes may be made, and such modifications or changes are intended to fall within the scope of the embodiments of the present application.
The method for resource allocation in the embodiment of the present application is described in detail above with reference to fig. 1 to 6, and the apparatus embodiments of the present application are described in detail below with reference to fig. 7 and 8. It should be understood that the apparatus for resource allocation in the embodiment of the present application may perform the foregoing methods for resource allocation; that is, for the specific working processes of the following products, reference may be made to the corresponding processes in the foregoing method embodiments.
Fig. 7 is a schematic block diagram of an apparatus 700 for resource allocation provided by an embodiment of the present application.
It should be understood that the apparatus 700 for resource allocation can perform the steps in the method for resource allocation shown in fig. 4 to 6, which are not described in detail here to avoid repetition. The apparatus 700 for resource allocation comprises:
a determining module 710 for determining characteristics of a plurality of jobs submitted within a preset time period;
the determining module 710 is further configured to determine a first impact factor and a second impact factor according to the characteristics of the plurality of jobs, where the first impact factor is a weight corresponding to historical resource usage information of a cluster, and the second impact factor is a weight corresponding to real-time resource usage information of the cluster;
the determining module 710 is further configured to determine a target user according to the historical resource usage information of the user, the first impact factor, the real-time resource usage information, and the second impact factor.
Optionally, the characteristics of the plurality of jobs include the proportion of jobs with high resource consumption among the plurality of jobs, the jobs with high resource consumption being jobs whose occupied resources are greater than a first threshold.
Optionally, the proportion of high-resource-consumption jobs comprises one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
Optionally, the determining module 710 is specifically configured to: set the first impact factor to a value positively correlated with the proportion if the proportion of jobs with high resource consumption is greater than a second threshold; and set the first impact factor to a value positively correlated with the proportion if the proportion of jobs with high resource consumption is less than or equal to the second threshold.
Optionally, the apparatus 700 further comprises: a display module 720, configured to display, through a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first impact factor, and the second impact factor; wherein the UI is configured to adjust the first impact factor and the second impact factor.
Optionally, the determining module 710 is specifically configured to: determining the historical resource usage proportion of each user of a plurality of users in a cluster according to the historical resource usage information of the plurality of users; determining real-time resource usage proportion of each user of the at least one user in the cluster according to the real-time resource usage information of the at least one user; and determining the target user according to the historical resource usage ratios of the plurality of users, the first influence factor, the real-time resource usage ratio of the at least one user and the second influence factor.
Optionally, the first impact factor is positively correlated with the proportion of high-resource-consumption jobs, and the second impact factor is negatively correlated with the proportion of high-resource-consumption jobs.
Optionally, the determining module 710 is specifically configured to: and determining the target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information, the second influence factor and the weight of the user.
It should be understood that the apparatus 700 for resource allocation herein is embodied in the form of a functional unit. The term "unit" herein may be implemented in software and/or hardware, and is not particularly limited thereto.
For example, a "unit" may be a software program, a hardware circuit, or a combination of both that implement the above-described functions. The hardware circuitry may include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
Accordingly, the units of the respective examples described in the embodiments of the present application can be realized in electronic hardware, or a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 8 is a hardware structure diagram of an apparatus 800 for resource allocation according to an embodiment of the present application.
The apparatus 800 of resource allocation shown in fig. 8 may include a memory 801, a processor 802, a communication interface 803, and a bus 804. The memory 801, the processor 802, and the communication interface 803 are communicatively connected to each other via a bus 804.
The memory 801 may be a read-only memory (ROM), a static memory device, or a random access memory (RAM). The memory 801 may store programs, and when the programs stored in the memory 801 are executed by the processor 802, the processor 802 and the communication interface 803 are used for executing the steps of the method for resource allocation of the embodiments of the present application, for example, the steps of the method for resource allocation shown in fig. 4 to 6.
The processor 802 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is configured to execute the relevant programs to implement the functions required by the units in the resource allocation apparatus shown in fig. 7 of the present application, or to execute the resource allocation method of the method embodiments of the present application.
The processor 802 may also be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the resource allocation method according to the embodiment of the present application may be implemented by integrated logic circuits of hardware in the processor 802 or instructions in the form of software.
The processor 802 may also be a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM or an EPROM, or a register. The storage medium is located in the memory 801; the processor 802 reads information in the memory 801 and, in combination with its hardware, performs the functions required of the units included in the apparatus for resource allocation according to the embodiments of the present application, or performs the method for resource allocation according to the method embodiments of the present application.
For example, the processor 802 may correspond to the determination module 710 in the apparatus for resource allocation shown in fig. 7.
The communication interface 803 uses a transceiver apparatus, such as but not limited to a transceiver, to enable communication between the apparatus 800 for resource allocation and other devices or communication networks.
The bus 804 may include a path that conveys information between the components of the apparatus 800 for resource allocation (e.g., the memory 801, the processor 802, and the communication interface 803).
It should be noted that although the apparatus 800 for resource allocation described above only shows memories, processors, and communication interfaces, in a specific implementation process, those skilled in the art should understand that the apparatus 800 for resource allocation may also include other devices necessary for normal operation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 800 for resource allocation described above may also include hardware devices for implementing other additional functions. Furthermore, it should be understood by those skilled in the art that the apparatus 800 for resource allocation described above may also include only the components necessary to implement the embodiments of the present application, and not necessarily all of the components shown in fig. 8.
The embodiment of the application also provides a chip, which comprises a transceiver unit and a processing unit. The transceiver unit can be an input/output circuit and a communication interface; the processing unit is a processor or a microprocessor or an integrated circuit integrated on the chip; the chip may perform the method of resource allocation in the above method embodiments.
Embodiments of the present application further provide a computer-readable storage medium, on which instructions are stored, and when executed, the instructions perform the method for resource allocation in the above method embodiments.
Embodiments of the present application further provide a computer program product containing instructions, which when executed perform the method for resource allocation in the foregoing method embodiments.
It should be understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It will also be appreciated that the memory in the embodiments of the present application can be either volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory can be a random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or the computer programs are loaded or executed on a computer, the procedures or functions according to the embodiments of the present application are wholly or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more collections of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone, wherein A and B can be singular or plural. In addition, the "/" in this document generally indicates that the former and latter associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, which may be understood with reference to the surrounding text.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of singular or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, and c may each be single or multiple.
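The seven combinations listed above are exactly the non-empty subsets of {a, b, c}. As a quick check, the short Python sketch below enumerates them with the standard-library `itertools.combinations`; the variable names are illustrative only.

```python
from itertools import combinations

items = ["a", "b", "c"]

# Every non-empty selection of the items, smallest selections first.
selections = [
    "-".join(combo)
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
]
```

The result contains a, b, c, a-b, a-c, b-c, and a-b-c, matching the list in the text.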
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only one logical division, and other divisions are possible in practice; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A method of resource allocation, comprising:
determining characteristics of a plurality of jobs submitted within a preset time period;
determining a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, wherein the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster;
determining a target user according to historical resource use information of the user, the first influence factor, real-time resource use information and the second influence factor;
and allocating resources for at least one job or task submitted by the target user.
2. The method of claim 1, wherein the characteristics of the plurality of jobs comprise a proportion of high-resource-consumption jobs in the plurality of jobs, the high-resource-consumption jobs being jobs that occupy resources greater than a first threshold.
3. The method of claim 2, wherein the proportion of the high-resource-consumption jobs comprises one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
4. The method of claim 2 or 3, wherein determining a first impact factor and a second impact factor from characteristics of the plurality of jobs comprises:
setting the first impact factor to a value positively correlated with the proportion of the high-resource-consumption jobs;
setting the second impact factor to a value negatively correlated with the proportion of the high-resource-consumption jobs.
5. The method according to any one of claims 1 to 4, further comprising:
displaying, via a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first impact factor, and the second impact factor;
wherein the UI is configured to adjust the first and second impact factors.
6. The method of any one of claims 1 to 5, wherein determining a target user based on historical resource usage information of the user, the first impact factor, real-time resource usage information, and the second impact factor comprises:
determining the historical resource usage proportion of each user of a plurality of users in a cluster according to the historical resource usage information of the plurality of users;
determining real-time resource usage proportion of each user of the at least one user in the cluster according to the real-time resource usage information of the at least one user;
and determining the target user according to the historical resource usage ratios of the plurality of users, the first influence factor, the real-time resource usage ratio of the at least one user and the second influence factor.
7. The method of any one of claims 1 to 6, wherein determining a target user based on historical resource usage information of the user, the first impact factor, real-time resource usage information, and the second impact factor comprises:
and determining the target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information, the second influence factor and the weight of the user.
8. An apparatus for resource allocation, comprising:
a determining module, configured to determine characteristics of a plurality of jobs submitted within a preset time period;
the determining module is further configured to determine a first influence factor and a second influence factor according to the characteristics of the plurality of jobs, where the first influence factor is a weight corresponding to historical resource usage information of a cluster, and the second influence factor is a weight corresponding to real-time resource usage information of the cluster;
the determining module is further configured to determine a target user according to historical resource usage information of the user, the first impact factor, real-time resource usage information, and the second impact factor.
9. The apparatus of claim 8, wherein the characteristics of the plurality of jobs comprise a proportion of high-resource-consumption jobs in the plurality of jobs, the high-resource-consumption jobs being jobs that occupy resources greater than a first threshold.
10. The apparatus of claim 9, wherein the proportion of the high-resource-consumption jobs comprises one or more of: a ratio of the number of users of the high-resource-consumption jobs to the number of users of the plurality of jobs.
11. The apparatus according to claim 9 or 10, wherein the determining module is specifically configured to:
set the first impact factor to a value positively correlated with the proportion of the high-resource-consumption jobs;
set the second impact factor to a value negatively correlated with the proportion of the high-resource-consumption jobs.
12. The apparatus of any one of claims 8 to 11, further comprising:
a display module, configured to display, through a user interface (UI), one or more of: a characteristic of the plurality of jobs, the first impact factor, and the second impact factor;
wherein the UI is configured to adjust the first and second impact factors.
13. The apparatus according to any one of claims 8 to 12, wherein the determining module is specifically configured to:
determining the historical resource usage proportion of each user of a plurality of users in a cluster according to the historical resource usage information of the plurality of users;
determining real-time resource usage proportion of each user of the at least one user in the cluster according to the real-time resource usage information of the at least one user;
and determining the target user according to the historical resource usage ratios of the plurality of users, the first influence factor, the real-time resource usage ratio of the at least one user and the second influence factor.
14. The apparatus according to any one of claims 9 to 13, wherein the determining module is specifically configured to:
and determining the target user according to the historical resource use information of the user, the first influence factor, the real-time resource use information, the second influence factor and the weight of the user.
15. An apparatus for resource allocation, comprising a processor, a memory and a communication interface, wherein the memory stores computer-executable instructions, and the processor executes the computer-executable instructions in the memory to perform the method of resource allocation according to any one of claims 1 to 7.
16. A computer-readable storage medium, comprising a computer program which, when run on a computer, causes the computer to perform the method of resource allocation of any one of claims 1 to 7.
CN202010490013.7A 2020-06-02 2020-06-02 Resource allocation method and device Pending CN113765949A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010490013.7A CN113765949A (en) 2020-06-02 2020-06-02 Resource allocation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010490013.7A CN113765949A (en) 2020-06-02 2020-06-02 Resource allocation method and device

Publications (1)

Publication Number Publication Date
CN113765949A true CN113765949A (en) 2021-12-07

Family

ID=78782934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010490013.7A Pending CN113765949A (en) 2020-06-02 2020-06-02 Resource allocation method and device

Country Status (1)

Country Link
CN (1) CN113765949A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051106A (en) * 2022-07-29 2023-05-02 荣耀终端有限公司 Abnormal order processing method and device
CN117215799A (en) * 2023-11-03 2023-12-12 天津市职业大学 Management method, system, computer equipment and storage medium of software module
WO2024000859A1 (en) * 2022-06-28 2024-01-04 深圳前海微众银行股份有限公司 Job scheduling method, job scheduling apparatus, job scheduling system, and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104168318A (en) * 2014-08-18 2014-11-26 中国联合网络通信集团有限公司 Resource service system and resource distribution method thereof
US20150178135A1 (en) * 2012-09-12 2015-06-25 Salesforce.Com, Inc. Facilitating tiered service model-based fair allocation of resources for application servers in multi-tenant environments
US20160379306A1 (en) * 2015-06-24 2016-12-29 Christopher Sean Slotterback Optimized resource allocation
CN107239329A (en) * 2016-03-29 2017-10-10 西门子公司 Unified resource dispatching method and system under cloud environment
CN108460618A (en) * 2018-01-09 2018-08-28 北京三快在线科技有限公司 A kind of resource allocation method and device, electronic equipment
CN108924221A (en) * 2018-06-29 2018-11-30 华为技术有限公司 The method and apparatus for distributing resource
CN109189563A (en) * 2018-07-25 2019-01-11 腾讯科技(深圳)有限公司 Resource regulating method, calculates equipment and storage medium at device
CN109783236A (en) * 2019-01-16 2019-05-21 北京百度网讯科技有限公司 Method and apparatus for output information
CN110363416A (en) * 2019-06-29 2019-10-22 上海淇馥信息技术有限公司 Financial resources distribution method, device and electronic equipment
CN110363319A (en) * 2018-03-26 2019-10-22 阿里巴巴集团控股有限公司 Resource allocation methods, server, resource claim method and client
CN110825520A (en) * 2019-10-18 2020-02-21 山东省计算中心(国家超级计算济南中心) Cluster top-speed elastic expansion method for realizing efficient resource utilization
CN111193802A (en) * 2019-12-31 2020-05-22 苏州浪潮智能科技有限公司 Dynamic resource allocation method, system, terminal and storage medium based on user group

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DEYANG TENG; NA YE: ""Cell clustering-based resource allocation in ultra-dense networks"", 《2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC)》, 31 December 2017 (2017-12-31) *
何文婷;崔慧敏;冯晓兵;: "HDAS:异构集群上Hadoop+框架中的动态亲和性调度", 高技术通讯, no. 04, 15 April 2016 (2016-04-15) *
陈重韬;: "面向多用户环境的MapReduce集群调度算法研究", 高技术通讯, no. 04, 15 April 2017 (2017-04-15) *

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20220215

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Applicant after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd.

SE01 Entry into force of request for substantive examination