WO2022006830A1 - 一种多队列多集群的任务调度方法及系统 - Google Patents

一种多队列多集群的任务调度方法及系统 Download PDF

Info

Publication number
WO2022006830A1
WO2022006830A1 PCT/CN2020/101185 CN2020101185W WO2022006830A1 WO 2022006830 A1 WO2022006830 A1 WO 2022006830A1 CN 2020101185 W CN2020101185 W CN 2020101185W WO 2022006830 A1 WO2022006830 A1 WO 2022006830A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
reward
energy consumption
delay
proportion
Prior art date
Application number
PCT/CN2020/101185
Other languages
English (en)
French (fr)
Inventor
崔得龙
林建鹏
彭志平
李启锐
何杰光
邱金波
Original Assignee
广东石油化工学院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东石油化工学院 filed Critical 广东石油化工学院
Priority to PCT/CN2020/101185 priority Critical patent/WO2022006830A1/zh
Priority to US17/277,816 priority patent/US11954526B2/en
Publication of WO2022006830A1 publication Critical patent/WO2022006830A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention relates to the technical field of cloud computing, in particular to a task scheduling method and system with multiple queues and multiple clusters.
  • from the perspective of resource provisioning, virtual unit placement must on the one hand consider energy consumption, that is, reducing the number of active physical machines and network devices, in which case it can be abstracted as a bin-packing problem, an NP-complete problem;
  • on the other hand, it must consider data transmission between virtual units, that is, reducing the use of network bandwidth, in which case it can be abstracted as a quadratic assignment problem, which is also an NP-complete problem.
  • Cloud service supply and demand parties negotiate the workload and service level agreement to be executed.
  • Cloud service providers pay more attention to how to maximize resource utilization and minimize operating costs.
  • Cloud service users pay more attention to how to schedule tasks so as to minimize lease time and thereby minimize payment costs.
  • One of the core indicators of both operating costs and payment costs is energy consumption. In actual cloud task scheduling and resource allocation, there is a conflict between cloud service providers, who aim to minimize energy consumption, and users, who seek to optimize service quality: cloud service users want smaller task delays, while cloud service providers want lower energy consumption.
  • the existing cloud task scheduling and resource allocation methods can optimize only a single objective, that is, they take either minimizing task delay or minimizing energy consumption as the optimization goal of the cloud system.
  • such methods cannot, according to specific requirements, effectively weigh the relationship between the two optimization goals of energy consumption and task completion time (i.e., task delay) so that the sum of task delay and energy consumption (i.e., the overall optimization goal) is minimized,
  • and therefore cannot generate an optimal scheduling strategy that takes minimizing task delay and energy consumption together as the optimization goal of the cloud system.
  • the technical problem to be solved by the present invention is to provide a multi-queue multi-cluster task scheduling method and system, which can generate an optimal scheduling strategy with minimizing task delay and energy consumption as the optimization goal of the cloud system.
  • the present invention provides a multi-queue multi-cluster task scheduling method and system.
  • Step S1 constructing a training data set;
  • the training data set includes a one-to-one correspondence between state spaces and action decisions;
  • the state space includes multiple task attribute groups in multiple queues arranged in sequence;
  • the task attribute group includes the amount of task data and the number of CPU cycles required by the task;
  • Step S2 using the training data set to train and optimize multiple parallel deep neural networks to obtain multiple parallel trained and optimized deep neural networks;
  • Step S3 setting a reward function; the reward function minimizes the sum of the task delay and energy consumption by adjusting the proportion of the reward value of the task delay and the proportion of the reward value of the energy consumption;
  • Step S4 inputting the state space to be scheduled into a plurality of parallel deep neural networks after training and optimization, to obtain a plurality of action decisions to be scheduled;
  • Step S5 according to the reward function, determine an optimal action decision among the plurality of action decisions to be scheduled for output;
  • Step S6 Scheduling a plurality of the task attribute groups to a plurality of clusters according to the optimal action decision.
  • Step S7 store the state space to be scheduled and the optimal action decision as a sample in the experience playback pool; repeat steps S4-S7 until the number of samples in the experience playback pool reaches the threshold;
  • Step S8 Randomly extract a set number of samples from the experience playback pool, further train and optimize a plurality of parallel deep neural networks after training and optimization, and obtain a plurality of parallel further trained and optimized deep neural networks.
  • Step S9 Update the multiple parallel trained and optimized deep neural networks in step S4 to multiple parallel further trained and optimized deep neural networks.
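  • As an illustrative sketch of steps S7-S9 (not the patented implementation itself), the experience playback pool can be kept as a bounded buffer that stores (state space, best action decision) samples and yields random mini-batches once the sample count reaches the threshold; the names ReplayPool, capacity and the batch sizes below are assumptions made for the example.

```python
import random
from collections import deque

class ReplayPool:
    """Minimal experience playback pool holding (state, best_action) samples."""

    def __init__(self, capacity=10000):
        # Oldest samples are discarded automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, best_action):
        # Step S7: store the scheduled state space and its best action decision.
        self.buffer.append((state, best_action))

    def ready(self, threshold):
        # Further training (step S8) starts only after the threshold is reached.
        return len(self.buffer) >= threshold

    def sample(self, batch_size):
        # Step S8: randomly extract a set number of samples for retraining.
        return random.sample(list(self.buffer), batch_size)
```

  • Once ready() returns true, a mini-batch is drawn, the parallel DNNs are further trained on it, and the retrained networks replace the previous ones (step S9).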
  • the setting reward function specifically includes:
  • Step S31 adding the time consumed by each task transmission process and the time consumed by the task calculation process to obtain the task delay of each task;
  • Step S32 determine the maximum task delay among all task delays
  • Step S33 adding up the energy consumed by all task transmission processes and the energy consumed by all task calculation processes to obtain the energy consumption of all tasks;
  • Step S34 setting the first reward value proportion occupied by task delay and the second reward value proportion occupied by energy consumption; the sum of the first reward value proportion and the second reward value proportion is 1;
  • Step S35 Set a reward function according to the maximum task delay, the proportion of the first reward value, the energy consumption, and the second reward value.
  • the setting of the reward function according to the maximum task delay, the proportion of the first reward value, the energy consumption and the second reward value specifically includes:
  • Step S351 Multiply the maximum task delay by the ratio of the first reward value to obtain a first product
  • Step S352 multiply the energy consumption by the proportion of the second reward value to obtain a second product
  • Step S353 Add the first product and the second product to obtain a reward function.
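  • To make steps S31-S353 concrete, the following sketch computes the reward (Q) value of one candidate action decision from per-task transmission and computation times and energies; the function and argument names (reward_value, xi_d, xi_e) are illustrative assumptions rather than names used by the patent.

```python
def reward_value(transmit_times, compute_times, transmit_energies, compute_energies,
                 xi_d=0.5, xi_e=0.5):
    """Weighted sum of the maximum task delay and the total energy consumption.

    The lists are aligned per task; xi_d and xi_e are the first and second
    reward value proportions and must sum to 1.
    """
    assert abs(xi_d + xi_e - 1.0) < 1e-9
    # Step S31: task delay = transmission time + computation time, per task.
    delays = [t_tx + t_cp for t_tx, t_cp in zip(transmit_times, compute_times)]
    # Step S32: maximum task delay among all tasks.
    max_delay = max(delays)
    # Step S33: total energy = all transmission energy + all computation energy.
    total_energy = sum(transmit_energies) + sum(compute_energies)
    # Steps S34-S353: first product + second product (smaller is better).
    return xi_d * max_delay + xi_e * total_energy
```

  • The action decision whose reward function value is smallest is then selected as the best action decision (steps S51-S53).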
  • determining an optimal action decision among the plurality of action decisions to be scheduled for output according to the reward function specifically including:
  • Step S51 Calculate the reward function value of each action decision to be scheduled according to the reward function
  • Step S52 Select the minimum reward function value among all reward function values
  • Step S53 Select the action decision to be scheduled corresponding to the minimum reward function value as the best action decision for output.
  • scheduling a plurality of the task attribute groups to a plurality of clusters according to the optimal action decision further includes:
  • Step S10 Evenly distribute the number of CPU cycles of each cluster to all the task attribute groups in the cluster.
  • the multi-queue multi-cluster task scheduling system includes:
  • a training data set building module is used to construct a training data set;
  • the training data set includes a one-to-one correspondence between state spaces and action decisions;
  • the state space includes a plurality of task attribute groups in a plurality of queues arranged in sequence;
  • the task attribute group includes the amount of task data and the number of CPU cycles required by the task;
  • a training and optimization module is used to train and optimize multiple parallel deep neural networks using the training data set to obtain multiple parallel trained and optimized deep neural networks;
  • the reward function setting module is used to set the reward function; the reward function minimizes the sum of the task delay and the energy consumption by adjusting the proportion of the reward value of the task delay and the proportion of the reward value of the energy consumption;
  • an action decision acquisition module used for inputting the state space to be scheduled into a plurality of parallel deep neural networks after training and optimization to obtain a plurality of action decisions to be scheduled;
  • an optimal action decision obtaining module configured to determine an optimal action decision among the plurality of action decisions to be scheduled for output according to the reward function
  • a scheduling module configured to schedule a plurality of the task attribute groups to a plurality of clusters according to the optimal action decision.
  • the sample storage module is used to store the state space to be scheduled and the optimal action decision as a sample in the experience playback pool; repeatedly execute the action decision acquisition module, the best action decision acquisition module, the scheduling module, and the sample storage module , until the number of samples in the experience playback pool reaches the threshold;
  • a further training and optimization module is used to randomly extract a set number of samples from the experience playback pool and further train and optimize the multiple parallel trained and optimized deep neural networks, to obtain multiple parallel further trained and optimized deep neural networks;
  • the updating module is configured to update the plurality of parallel deep neural networks after training and optimization in the action decision acquiring module to the plurality of parallel deep neural networks after further training and optimization.
  • the reward function setting module specifically includes:
  • a task delay calculation unit configured to add the time consumed by each task transmission process and the time consumed by the task calculation process to obtain the task delay of each task;
  • the maximum task delay determination unit is used to determine the maximum task delay among all task delays
  • the energy consumption calculation unit is used to add up the energy consumed by all task transmission processes and the energy consumed by all task calculation processes to obtain the energy consumption of all tasks;
  • the reward value proportion setting unit is used to set the first reward value proportion occupied by task delay and the second reward value proportion occupied by energy consumption; the sum of the first reward value proportion and the second reward value proportion is 1 ;
  • a reward function setting unit configured to set a reward function according to the maximum task delay, the proportion of the first reward value, the energy consumption and the second reward value.
  • the reward function setting unit specifically includes:
  • a first product obtaining subunit configured to multiply the maximum task delay by the ratio of the first reward value to obtain a first product
  • a second product obtaining subunit configured to multiply the energy consumption by the proportion of the second reward value to obtain a second product
  • a reward function acquiring subunit configured to add the first product and the second product to obtain a reward function.
  • the beneficial effects of the present invention are as follows: the multi-queue multi-cluster task scheduling method and system disclosed by the present invention set a reward function for the conflict between cloud service providers, who aim to minimize energy consumption, and users, who pursue optimal service quality.
  • the reward function can adjust, according to specific requirements, the proportion of the reward value of task delay and the proportion of the reward value of energy consumption, so as to minimize the sum of task delay and energy consumption.
  • when a smaller task delay is desired, the proportion of the reward value of task delay is increased; when lower energy consumption is desired, the proportion of the reward value of energy consumption is increased.
  • by adjusting the reward value proportions of the different optimization objectives, the relationship between the two optimization objectives of energy consumption and task delay can be effectively weighed so that the sum of task delay and energy consumption is minimized.
  • the optimization process uses the reward function to calculate a reward function value for the action decision output by each deep neural network, selects the action decision corresponding to the minimum reward function value as the best action decision, and performs multi-queue multi-cluster task scheduling according to that best action decision, so that the optimal scheduling strategy can be generated with minimizing task delay and energy consumption as the optimization goal of the cloud system.
  • FIG. 1 is a flowchart of Embodiment 1 of a task scheduling method for multiple queues and multiple clusters of the present invention
  • FIG. 2 is a schematic flowchart of Embodiment 2 of the multi-queue multi-cluster task scheduling method of the present invention;
  • FIG. 3 is the cloud system framework diagram of the present invention;
  • FIG. 4 is a structural diagram of an embodiment of a multi-queue multi-cluster task scheduling system according to the present invention.
  • FIG. 1 is a flowchart of Embodiment 1 of a task scheduling method for multiple queues and multiple clusters according to the present invention.
  • the task scheduling method of the multi-queue multi-cluster includes:
  • Step S1 constructing a training data set; the training data set includes a one-to-one correspondence between state spaces and action decisions; the state space includes multiple task attribute groups in multiple queues arranged in sequence; the task attribute group includes tasks The amount of data and the number of CPU cycles required for the task.
  • Step S2 using the training data set to train and optimize a plurality of parallel deep neural networks to obtain a plurality of parallel trained and optimized deep neural networks.
  • Step S3 Setting a reward function; the reward function minimizes the sum of the task delay and energy consumption by adjusting the proportion of the reward value of the task delay and the proportion of the reward value of the energy consumption.
  • the step S3 specifically includes:
  • Step S31 Add the time consumed by each task transmission process and the time consumed by the task calculation process to obtain the task delay of each task.
  • Step S32 Determine the maximum task delay among all task delays.
  • Step S33 Add up the energy consumed by all task transmission processes and the energy consumed by all task calculation processes to obtain the energy consumption of all tasks.
  • Step S34 setting the first reward value proportion occupied by task delay and the second reward value proportion occupied by energy consumption; the sum of the first reward value proportion and the second reward value proportion is 1.
  • Step S35 Set a reward function according to the maximum task delay, the proportion of the first reward value, the energy consumption, and the second reward value.
  • the step S35 specifically includes:
  • Step S351 Multiply the maximum task delay by the ratio of the first reward value to obtain a first product.
  • Step S352 Multiply the energy consumption by the proportion of the second reward value to obtain a second product.
  • Step S353 Add the first product and the second product to obtain a reward function.
  • Step S4 Input the state space to be scheduled into a plurality of parallel deep neural networks after training and optimization, and obtain a plurality of action decisions to be scheduled.
  • Step S5 According to the reward function, an optimal action decision is determined among the plurality of action decisions to be scheduled for output.
  • the step S5 specifically includes:
  • Step S51 Calculate the reward function value of each of the action decisions to be scheduled according to the reward function.
  • Step S52 Select the minimum reward function value among all reward function values.
  • Step S6 Scheduling a plurality of the task attribute groups to a plurality of clusters according to the optimal action decision.
  • after step S6, the method further includes:
  • Step S10 Evenly distribute the number of CPU cycles of each cluster to all the task attribute groups in the cluster.
  • the multi-queue multi-cluster task scheduling method further includes:
  • Step S7 Store the to-be-scheduled state space and the optimal action decision as a sample in an experience playback pool; repeat steps S4-S7 until the number of samples in the experience playback pool reaches a threshold.
  • Step S8 Randomly extract a set number of samples from the experience playback pool, further train and optimize the multiple parallel trained and optimized deep neural networks, and obtain multiple parallel further trained and optimized deep neural networks.
  • Step S9 Update the multiple parallel trained and optimized deep neural networks in step S4 to the multiple parallel further trained and optimized deep neural networks.
  • FIG. 2 is a schematic flowchart of Embodiment 2 of a task scheduling method for multiple queues and multiple clusters according to the present invention.
  • the task scheduling method of the multi-queue multi-cluster includes:
  • Step 1 Initialize the network parameters θ_x of the X neural networks (DNNs) and the size of the experience playback pool (replay memory).
  • θ_x denotes the parameters of the x-th neural network and includes the node parameters and the parameters of the connections between nodes.
  • the experience playback pool stores the strategy obtained before, which is one of the characteristics of the DNN algorithm that distinguishes it from the previous neural network algorithm.
  • the initialization of neural network parameters is random.
  • Step 2 Represent the multiple task attribute groups in the multiple queues as a state space s_t = {task_1, task_2, ..., task_n1}, which is used as the input of the X heterogeneous DNNs.
  • n1 represents the total number of tasks, that is, the number of waiting task queues n multiplied by the number of tasks m contained in each queue; task_1 ... task_n1 represent the multiple task attribute groups of the multiple queues arranged in sequence in the state space; each task attribute group includes the amount of task data and the number of CPU cycles required by the task.
  • the task of the cloud system is to schedule atomic tasks in multiple queues, that is, the task set in Figure 2, into multiple clusters.
  • the number of waiting task queues in the system is n, 1 ⁇ n ⁇ N, where N represents the maximum number of waiting task queues in the system.
  • the number of tasks contained in each queue in the system is m, 1 ⁇ m ⁇ M, where M represents the maximum number of tasks contained in each queue in the system, and the total number of tasks is m*n.
  • the number of designed computing clusters is k, 1 ⁇ k ⁇ K, where K represents the maximum number of computing clusters in the system.
  • the task T_nm represents the m-th task in the n-th queue, and the attributes of the task T_nm are expressed as a two-tuple (α_nm, β_nm), where α_nm represents the data volume of the m-th task in the n-th queue and β_nm represents the number of CPU cycles required by the m-th task in the n-th queue.
  • the number of CPU cycles required by each task is set to be linearly related to the task data volume, i.e., β_nm = q·α_nm, where q represents the computation-to-data ratio.
  • the attributes of cluster J_k are represented by a triple (C_k, p_k^comm, p_k^comp), where C_k represents the computing capability of cluster k, that is, its number of CPU cycles, p_k^comm represents the communication power consumption of cluster k, and p_k^comp represents the computing power consumption of cluster k.
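  • To keep the notation concrete, the sketch below models a task by its two-tuple (α_nm, β_nm) and a cluster by its triple (C_k, p_k^comm, p_k^comp) as plain data classes and flattens all queues into the state space; the field and function names are assumptions chosen to mirror the symbols above.

```python
from dataclasses import dataclass

@dataclass
class Task:
    data_volume: float   # alpha_nm: amount of task data (e.g. in MB)
    cpu_cycles: float    # beta_nm: CPU cycles required by the task

@dataclass
class Cluster:
    capability: float    # C_k: CPU cycles the cluster can provide
    p_comm: float        # communication power consumption of the cluster
    p_comp: float        # computing power consumption of the cluster

def build_state(queues):
    """Flatten the task attribute groups of all waiting queues, in order.

    queues: list of lists of Task, one inner list per waiting queue.
    """
    state = []
    for queue in queues:
        for task in queue:
            state.extend([task.data_volume, task.cpu_cycles])
    return state
```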
  • Step 3 Each DNN outputs a different action decision (d_1, d_2, ..., d_X).
  • d_x represents the action decision output by the x-th DNN.
  • the action decision is which cluster the task is scheduled to, and the action decision is also called the scheduling policy.
  • Step 4 Calculate the Q value corresponding to each action decision, and select the action decision that obtains the smallest Q value as the best action decision for the task set: d_opt = argmin_{d_x} Q(s, d_x), where s is the current task-set state space.
  • the communication model includes the transmission time and the energy consumption required for task data transmission.
  • when multiple tasks of the same queue are scheduled to the same cluster at the same time, the bandwidth is equally distributed to each task, so the bandwidth that task m in queue n can occupy is w_nm = w_nk / A_nk,
  • where w_nk represents the bandwidth from queue n to cluster k
  • and A_nk represents the number of tasks in queue n that are scheduled to cluster k.
  • the communication delay T_comm is the time it takes for the task data to be uploaded to the server: T_nm^comm = α_nm / w_nm.
  • the communication energy consumption E_comm is the energy consumed in the process of task transmission: E_nm^comm = p_k^comm · α_nm, where p_k^comm is the power consumed to transmit a unit of task data (for example, 1 MB); the communication energy consumption of all tasks in queue n is E_n^comm = Σ_m E_nm^comm.
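  • A sketch of the communication model as reconstructed above: the queue-to-cluster bandwidth is shared equally among the tasks of that queue sent to the same cluster, the upload time is data volume over allocated bandwidth, and the transmission energy is taken as the per-unit transmission power times the data volume; that last form is inferred from the definitions in the text and should be read as an assumption.

```python
def comm_delay_and_energy(alpha_nm, w_nk, tasks_on_cluster, p_comm):
    """Transmission time and energy for one task.

    alpha_nm:         data volume of task m in queue n
    w_nk:             bandwidth from queue n to cluster k
    tasks_on_cluster: A_nk, number of tasks of queue n scheduled to cluster k
    p_comm:           power consumed to transmit one unit of data (assumed form)
    """
    w_alloc = w_nk / tasks_on_cluster   # bandwidth equally shared among the tasks
    t_comm = alpha_nm / w_alloc         # upload time T_nm^comm
    e_comm = p_comm * alpha_nm          # transmission energy E_nm^comm (assumed)
    return t_comm, e_comm
```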
  • the computational model contains the computational latency and the computational energy consumption of the task.
  • the computing power of the cluster is equally distributed to the tasks scheduled to that cluster, that is, each task obtains CPU cycles C_nm = C_k / (Σ_n Σ_m a_nmk), where a_nmk indicates whether the m-th task of the n-th queue is scheduled to cluster k.
  • the calculation delay T_comp is the time consumed by the task calculation: T_nm^comp = β_nm / C_nm.
  • the calculation energy consumption E_comp is the energy consumed in the task calculation process: E_nm^comp = p_k^comp · T_nm^comp; the calculation energy consumption of all tasks in queue n is E_n^comp = Σ_m E_nm^comp.
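  • Likewise, a sketch of the computation model: each task scheduled to a cluster receives an equal share of the cluster's CPU cycles, the computation delay is required cycles over allocated cycles, and the computation energy is taken here as computing power times that delay, an assumed form consistent with the definitions above.

```python
def comp_delay_and_energy(beta_nm, c_k, tasks_on_cluster, p_comp):
    """Computation time and energy for one task.

    beta_nm:          CPU cycles required by the task
    c_k:              C_k, computing capability (CPU cycles) of cluster k
    tasks_on_cluster: number of tasks scheduled to cluster k
    p_comp:           computing power consumption of cluster k
    """
    c_alloc = c_k / tasks_on_cluster    # equal share of the cluster's CPU cycles
    t_comp = beta_nm / c_alloc          # computation delay T_nm^comp
    e_comp = p_comp * t_comp            # computation energy E_nm^comp (assumed)
    return t_comp, e_comp
```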
  • the factors considered in this embodiment are task delay and energy consumption, so the reward function of the system, that is, the Q value, is defined as Q(s, d) = ξ_d · max_{n,m}(T_nm^comm + T_nm^comp) + ξ_e · Σ_n (E_n^comm + E_n^comp),
  • where ξ_d represents the optimization proportion of task delay
  • and ξ_e represents the optimization proportion of energy consumption,
  • with ξ_d ∈ [0, 1], ξ_e ∈ [0, 1] and ξ_d + ξ_e = 1.
  • d represents the action decision output by a DNN. ξ_d and ξ_e are set according to specific requirements so that the sum of task delay and energy consumption, that is, the optimization goal, is minimized.
  • the final optimization goal of the system is to obtain the optimal scheduling strategy.
  • after the DNNs make their action decisions, according to the formula d_opt = argmin_{d_x} Q(s, d_x),
  • the action decision with the minimum Q value is taken as the best action decision for the task set, so as to obtain the optimal scheduling strategy and minimize task delay and energy consumption, that is, to minimize the expected return value R = min Q(s, d).
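  • Putting the pieces together, a sketch of the selection step: every candidate action decision produced by the parallel DNNs is scored with the Q value and the decision with the smallest Q value is kept as d_opt; q_value is assumed to combine the communication, computation and reward sketches shown earlier.

```python
def best_action(state, action_decisions, q_value):
    """Return (d_opt, Q_min): the decision minimizing Q(s, d) over all DNN outputs."""
    best_d, best_q = None, float("inf")
    for d in action_decisions:      # one candidate decision per heterogeneous DNN
        q = q_value(state, d)       # weighted sum of max delay and total energy
        if q < best_q:
            best_d, best_q = d, q
    return best_d, best_q
```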
  • the optimization process of the system is the training process of the scheduling model.
  • the scheduling model consists of multiple heterogeneous DNNs.
  • the optimization process of the system includes:
  • multiple task attribute groups in multiple queues are represented as a state space s, expressed as {α_11, β_11, α_12, β_12, ..., α_nm, β_nm}, which is used as the input of the X DNNs; each DNN outputs a different action decision (d_1, d_2, ..., d_X).
  • d_x represents the action decision output by the x-th DNN.
  • at time step t, the system state s_t is used as input, and the action decision output by the b-th DNN is expressed as d_b^t = f_{θ_b}(s_t), where f_{θ_b} is a function of the network parameters of the b-th DNN; the action decision is a binary sequence d = {a_111, a_121, ..., a_nmk} with a_nmk ∈ {0, 1}, and a_nmk = 1 means that task m in queue n is scheduled to cluster k.
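  • One possible realization of a single DNN f_θ is a small fully connected network whose sigmoid outputs are turned into the binary sequence a_nmk by assigning each task to its highest-scoring cluster; this is a sketch assuming PyTorch, and the layer sizes and hidden width are arbitrary choices, not values given by the patent.

```python
import torch
import torch.nn as nn

class SchedulerDNN(nn.Module):
    """Maps a state vector (2 values per task) to n_tasks x n_clusters scores."""

    def __init__(self, state_dim, n_tasks, n_clusters, hidden=128):
        super().__init__()
        self.n_tasks, self.n_clusters = n_tasks, n_clusters
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_tasks * n_clusters), nn.Sigmoid(),
        )

    def forward(self, state):
        # Probabilities for every (task, cluster) pair, i.e. for each a_nmk.
        return self.net(state).view(-1, self.n_tasks, self.n_clusters)

    def decide(self, state):
        # Binary action decision: each task goes to its highest-probability cluster.
        probs = self.forward(state)
        choice = probs.argmax(dim=-1)
        return torch.nn.functional.one_hot(choice, self.n_clusters)
```

  • Heterogeneity between the X parallel DNNs can then come simply from different hidden widths, depths or random initializations.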
  • Step 5 Store the current task set state space s and the best action decision d opt as samples (s, d opt ) in the experience playback pool.
  • when the number of samples in the experience playback pool reaches the threshold, a mini-batch (the batch size, that is, the number of samples selected before each parameter adjustment) of samples is randomly drawn from it for model training.
  • the goal is to minimize the expected return value, that is, to minimize task delay and energy consumption. Since the final optimization goal of the system is to obtain the optimal scheduling strategy, the model is continuously trained and optimized with the samples in the experience playback pool so that its accuracy becomes higher; as a result, once the task-set state space is input into the model, it can output the best action decision, that is, the optimal scheduling strategy.
  • the gradient descent algorithm optimizes the parameter values θ_x of each DNN (the parameters of a DNN are the weights on the nodes and on the connections between the nodes) by minimizing the cross-entropy loss until the reward function converges. This process follows the cross-entropy loss function L(θ_x) = -[d_t^T · log f_{θ_x}(s_t) + (1 - d_t)^T · log(1 - f_{θ_x}(s_t))],
  • where T represents the mathematical matrix transpose,
  • d_t represents the action decision finally output at time step t with the system state s_t as input,
  • and f_{θ_x}(s_t) represents the output of the x-th DNN, a function of its network parameters, with the system state s_t as input.
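  • A sketch of the gradient descent update with the cross-entropy loss described above, again assuming PyTorch; the best action decisions drawn from the experience playback pool serve as the labels for every parallel DNN, and the optimizer choice is an assumption.

```python
import torch

def train_step(dnn, optimizer, states, best_actions):
    """One mini-batch update of a single DNN by minimizing the cross-entropy loss.

    states:       tensor of shape (batch, state_dim)
    best_actions: binary tensor of shape (batch, n_tasks, n_clusters), the d_opt labels
    """
    probs = dnn(states)   # f_theta(s_t), sigmoid outputs in (0, 1)
    loss = torch.nn.functional.binary_cross_entropy(probs, best_actions.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()      # gradient descent on the DNN parameters theta_x
    return loss.item()

# Example wiring (illustrative): one optimizer per heterogeneous DNN.
# optimizer = torch.optim.Adam(dnn.parameters(), lr=1e-3)
```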
  • Step 6 Test the scheduling model composed of multiple heterogeneous DNNs.
  • the first part is to compare and verify the key parameters of the HDLL model, and observe the influence of the parameters on the optimization effect of the model.
  • the key parameters of the model include the number of heterogeneous DNNs, the learning rate, and the batch size (that is, the number of samples selected for one training pass; the batch size affects the degree and speed of model optimization and directly affects GPU memory usage, so if GPU memory is limited the value should be set smaller).
  • the second part compares and verifies the optimization results of this embodiment against benchmark algorithms, including the random selection algorithm (Random), the round-robin algorithm (Round-Robin, RR), multi-objective particle swarm optimization (MoPSO), the distributed learning algorithm (DLL), and the greedy algorithm (Greedy).
  • the experimental results show that the model can effectively balance the two optimization objectives of energy consumption and task completion time, and has obvious optimization effect.
  • FIG. 3 is a frame diagram of the cloud system of the present invention.
  • the scheduling model composed of multiple heterogeneous DNNs in this embodiment, that is, the distributed deep learning model in FIG. 3, is set at the second layer of the cloud system framework; this scheduling model is the basic architecture of the heterogeneous distributed deep learning model.
  • the cloud system framework mainly has three layers.
  • the first layer is the user load layer. Due to the huge number of cloud users and the diversity of user types, the user load is diverse, and the user load includes multiple tasks, dependencies between tasks, and data transmission. Therefore, in the process of task scheduling, it is necessary to ensure the execution order and dependencies between tasks.
  • the cloud system framework uses a task decoupler at the user load layer to decouple the user load into subtasks and assign them to multiple task waiting queues, while ensuring that the parent tasks of the subtasks in the waiting queues have already been executed and the required data has been transmitted, so that the tasks in the queues are atomic and can run independently.
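  • A minimal sketch of the bookkeeping such a task decoupler needs, assuming the user load is given as a dependency map; only subtasks whose parent tasks have finished (and whose data has therefore been produced) are released into the waiting queues. All names here are illustrative.

```python
def ready_subtasks(dependencies, completed):
    """Return the subtasks that may be placed into the waiting queues.

    dependencies: dict mapping each subtask to the set of its parent subtasks
    completed:    set of subtasks that have already finished executing
    """
    return [task for task, parents in dependencies.items()
            if task not in completed and parents <= completed]
```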
  • the second layer is the core layer of the entire framework - the scheduling layer, which is responsible for task scheduling and resource supply to achieve the optimization goal of minimizing task delay and system energy consumption.
  • This layer contains the following four components: 1) Scheduling model: consists of multiple heterogeneous DNNs. 2) Energy consumption model: includes communication consumption and computing consumption. 3) Service level agreement (SLA): the service agreement signed by the user and the cloud service provider, which mainly considers the completion time of a task, that is, the task delay, including the task communication delay and the computing delay. 4) Controller: the core component of the task scheduling layer, responsible for coordinating the other components and generating task scheduling and resource allocation strategies that guarantee the SLA and minimize system energy consumption.
  • the third layer is the data center layer.
  • a large number of infrastructure equipment forms a large-scale data center, which can cluster adjacent servers into computing clusters according to geographical location.
  • multiple computing clusters are connected by optical fibers, and the transmission speed is extremely fast, so the data transmission delay and energy consumption between them can be ignored.
  • however, the bandwidth and the distance with which cloud tasks from different users connect to different clusters differ significantly, so both are important considerations in the optimization problem.
  • the computing capability and computing power of the cluster are also key factors that affect the system scheduling efficiency.
  • this embodiment proposes a two-stage scheduling framework for the task scheduling and resource allocation of multi-user and multi-cloud providers (the first layer in FIG. 3 ).
  • the framework consists of a task scheduling stage (the second layer in FIG. 3) and a resource configuration stage (the third layer in FIG. 3), and the scheduling task is completed according to the optimization goals of the different stages.
  • the optimization objective in the task scheduling phase is the reward function of this embodiment.
  • in the resource configuration stage, the computing power of the cluster is evenly distributed to the tasks scheduled to that cluster; different schedulers are used in the different stages.
  • the two-stage scheduling framework consists of a task scheduling stage and a resource configuration stage. The scheduling tasks are completed according to the optimization goals of different stages.
  • the scheduler in the task scheduling phase is called the task scheduler, and the scheduler in the resource scheduling phase is called the resource scheduler.
  • in the task scheduling stage, a heterogeneous distributed deep learning model is used to complete the task of scheduling jobs to the data center.
  • in the resource configuration stage, a DQN-based deep reinforcement learning model is used to complete the task of configuring virtual machine resources for the tasks and deploying them to the servers.
  • resource configuration means that the computing power of the cluster is equally distributed to the tasks scheduled to that cluster, that is, each task obtains CPU cycles C_nm = C_k / (Σ_n Σ_m a_nmk).
  • the invention proposes a cloud task scheduling and resource allocation method based on Deep Q-network, aiming at the conflict problem between cloud service providers aiming at minimizing energy consumption and users pursuing service quality optimization.
  • the reward function designed by this method is the weighted sum of task delay and energy consumption, with ξ_d, ξ_e ∈ [0, 1] and ξ_d + ξ_e = 1. When a smaller task delay is desired, ξ_d is increased; when lower energy consumption is desired, ξ_e is increased. By adjusting the proportion of the return value of the different optimization objectives, the relationship between the two optimization objectives of energy consumption and task completion time is balanced.
  • according to the experimental results, the present invention can select a job delay and a system energy consumption acceptable to both parties of the cloud service; the reward function parameters for that state are then determined accordingly.
  • the weight parameters ξ_d and ξ_e in the reward function are adjusted to dynamically tune the system optimization objective and meet actual scheduling needs.
  • the invention proposes a cloud task scheduling and resource allocation method based on heterogeneous distributed deep learning.
  • the method solves the task scheduling and resource allocation problem of multiple queues and multiple clusters by combining multiple heterogeneous DNNs as the scheduling model of the cloud system. Taking minimizing task delay and energy consumption as the optimization goal of cloud system, the optimal scheduling strategy is generated.
  • through the steps provided by the present invention, it is possible to determine which cluster a given task is allocated to, so that the optimization goal designed by the present invention is achieved to the greatest extent.
  • FIG. 4 is a structural diagram of an embodiment of a multi-queue multi-cluster task scheduling system according to the present invention.
  • the multi-queue multi-cluster task scheduling system includes:
  • a training data set construction module 401 is used to construct a training data set; the training data set includes a one-to-one correspondence between state spaces and action decisions; the state space includes multiple task attribute groups in multiple queues arranged in sequence; The task attribute group includes the amount of task data and the number of CPU cycles required by the task.
  • the training and optimization module 402 is configured to use the training data set to train and optimize multiple parallel deep neural networks to obtain multiple parallel trained and optimized deep neural networks.
  • the reward function setting module 403 is used for setting a reward function; the reward function minimizes the sum of the task delay and the energy consumption by adjusting the proportion of the reward value of the task delay and the proportion of the reward value of the energy consumption.
  • the reward function setting module 403 specifically includes:
  • the task delay calculation unit is configured to add the time consumed by each task transmission process and the time consumed by the task calculation process to obtain the task delay of each task.
  • the maximum task delay determination unit is used to determine the maximum task delay among all task delays.
  • the energy consumption calculation unit is used for adding up the energy consumed by all task transmission processes and the energy consumed by all task calculation processes to obtain the energy consumption of all tasks.
  • the reward value proportion setting unit is used to set the first reward value proportion occupied by task delay and the second reward value proportion occupied by energy consumption; the sum of the first reward value proportion and the second reward value proportion is 1 .
  • a reward function setting unit configured to set a reward function according to the maximum task delay, the proportion of the first reward value, the energy consumption and the second reward value.
  • the reward function setting unit specifically includes:
  • the first product obtaining subunit is configured to multiply the maximum task delay by the proportion of the first reward value to obtain a first product.
  • the second product obtaining subunit is configured to multiply the energy consumption by the proportion of the second reward value to obtain a second product.
  • a reward function acquiring subunit configured to add the first product and the second product to obtain a reward function.
  • the action decision obtaining module 404 is configured to input the state space to be scheduled into a plurality of parallel deep neural networks after training and optimization to obtain a plurality of action decisions to be scheduled.
  • the optimal action decision obtaining module 405 is configured to determine an optimal action decision among the plurality of action decisions to be scheduled according to the reward function and output it.
  • the optimal action decision obtaining module 405 specifically includes:
  • a reward function value calculation unit configured to calculate a reward function value of each of the action decisions to be scheduled according to the reward function.
  • the minimum reward function value selection unit is used to select the minimum reward function value among all reward function values.
  • the best action decision selection unit is configured to select the action decision to be scheduled corresponding to the minimum reward function value as the best action decision for output.
  • the scheduling module 406 is configured to schedule a plurality of the task attribute groups to a plurality of clusters according to the optimal action decision.
  • the scheduling module 406 further includes:
  • a resource configuration module configured to evenly distribute the number of CPU cycles of each cluster to all the task attribute groups in the cluster.
  • the multi-queue multi-cluster task scheduling system further includes:
  • the sample storage module is used to store the state space to be scheduled and the optimal action decision as a sample in the experience playback pool; repeatedly execute the action decision acquisition module, the best action decision acquisition module, the scheduling module, and the sample storage module , until the number of samples in the experience playback pool reaches the threshold.
  • a further training and optimization module is used to randomly extract a set number of samples from the experience playback pool and further train and optimize the multiple parallel trained and optimized deep neural networks, to obtain multiple parallel further trained and optimized deep neural networks.
  • the updating module is configured to update the plurality of parallel deep neural networks after training and optimization in the action decision acquiring module to the plurality of parallel deep neural networks after further training and optimization.


Abstract

一种多队列多集群的任务调度方法及系统,涉及云计算技术领域,方法包括:构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数(S1);利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络(S2);设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小(S3);将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策(S4);根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出(S5);根据所述最佳动作决策将多个所述任务属性组调度到多个集群(S6)。上述方法能够以最小化任务延迟和能源消耗作为云系统的优化目标,生成最优调度策略。

Description

一种多队列多集群的任务调度方法及系统 技术领域
本发明涉及云计算技术领域,特别是涉及一种多队列多集群的任务调度方法及系统。
背景技术
目前的云计算环境,以Amazon、IBM、微软、Yahoo为例,其所建设的数据中心均拥有几十万台服务器,Google拥有的服务器数量甚至超过了100万台,各种物理资源虚拟化后数目更加庞大,物理节点和虚拟化单元宕机、动态加入和撤销等时有发生,管理起来技术难度大、复杂性高。又如,以多层Web服务工作流为例,由于突发事件引起的负载变化规律,常常无法预测。从任务优化分配角度来说,各种类型的云工作流任务在多个处理单元上的调度已被证明是NP完全难题。从资源优化供给角度来说,虚拟单元放置一方面需考虑能源消耗,即减少激活物理机和使用网络设备的数量,此时虚拟化单元放置可抽象为装箱问题,这是一个NP完全难题;另一方面需考虑数据在虚拟单元之间的传输,即减少对网络带宽的使用,此时虚拟单元放置可抽象为二次分配问题,这同样是一个NP完全难题。
云服务供需双方协商好待执行的工作量和服务等级协议,云服务提供商更关注以怎样的资源组合方案尽可能提高资源利用率,从而最大限度降低运营成本;而云服务使用者更关注以怎样的任务调度方式尽可能减少租用时间,从而最大限度降低支付成本。其中运营成本和支付成本最核心的指标之一,便是能源消耗。在实际的云任务调度与资源配置中,以能源消耗最小化为目标的云服务供应商和追求服务质量最优化的用户之间存在着冲突问题,这种冲突体现在云服务使用者希望得到较小的任务延迟,而云服务提供商则希望得到较小的能源消耗。现有的云任务调度与资源配置方法即多队列多集群的任务调度与资源配置方法仅能对一个优化目标进行优化,即以最小化任务延迟或最小化能源消耗作为云系统的优化目标,生成最优调度策略,无法根据具体要求有效地权衡能源消耗与任务完工时间(即任务延迟)这两个优化目标之间的关系,使得任务延迟与能源消耗 两者之和(即优化目标)最小,以最小化任务延迟和能源消耗作为云系统的优化目标,生成最优调度策略。
发明内容
本发明要解决的技术问题是提供一种多队列多集群的任务调度方法及系统,能够以最小化任务延迟和能源消耗作为云系统的优化目标,生成最优调度策略。
为解决上述技术问题,本发明提供了一种多队列多集群的任务调度方法及系统。
该多队列多集群的任务调度方法,包括:
步骤S1:构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数;
步骤S2:利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络;
步骤S3:设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小;
步骤S4:将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策;
步骤S5:根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出;
步骤S6:根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
可选的,还包括:
步骤S7:将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行步骤S4-S7,直至所述经验回放池中的样本数达到阈值;
步骤S8:从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络;
步骤S9:将步骤S4中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
可选的,所述设置回报函数,具体包括:
步骤S31:将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟;
步骤S32:确定所有任务延迟中的最大任务延迟;
步骤S33:将所有任务传输过程所消耗的能源和所有任务计算过程所消耗的能源相加,得到所有任务的能源消耗;
步骤S34:设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1;
步骤S35:根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
可选的,所述根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数,具体包括:
步骤S351:将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积;
步骤S352:将所述能源消耗与所述第二回报值比重相乘,得到第二乘积;
步骤S353:将所述第一乘积与所述第二乘积相加,得到回报函数。
可选的,所述根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出,具体包括:
步骤S51:根据所述回报函数计算每个所述待调度的动作决策的回报函数值;
步骤S52:选取所有回报函数值中的最小回报函数值;
步骤S53:选取所述最小回报函数值对应的待调度的动作决策为最佳动作决策进行输出。
可选的,所述根据所述最佳动作决策将多个所述任务属性组调度到多个集群,之后还包括:
步骤S10:将每个集群的CPU周期数平均分配给所述集群中的所有所述任务属性组。
该多队列多集群的任务调度系统,包括:
训练数据集构建模块,用于构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数;
训练和优化模块,用于利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络;
回报函数设置模块,用于设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小;
动作决策获取模块,用于将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策;
最佳动作决策获取模块,用于根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出;
调度模块,用于根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
可选的,还包括:
样本存储模块,用于将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行动作决策获取模块、最佳动作决策获取模块、调度模块、样本存储模块,直至所述经验回放池中的样本数达到阈值;
进一步训练和优化模块,用于从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络;
更新模块,用于将动作决策获取模块中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
可选的,所述回报函数设置模块,具体包括:
任务延迟计算单元,用于将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟;
最大任务延迟确定单元,用于确定所有任务延迟中的最大任务延迟;
能源消耗计算单元,用于将所有任务传输过程所消耗的能源和所有任务计算过程所消耗的能源相加,得到所有任务的能源消耗;
回报值比重设置单元,用于设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1;
回报函数设置单元,用于根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
可选的,所述回报函数设置单元,具体包括:
第一乘积获取子单元,用于将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积;
第二乘积获取子单元,用于将所述能源消耗与所述第二回报值比重相乘,得到第二乘积;
回报函数获取子单元,用于将所述第一乘积与所述第二乘积相加,得到回报函数。
与现有技术相比,本发明的有益效果在于:本发明公开的多队列多集群的任务调度方法及系统,针对以能源消耗最小化为目标的云服务供应商和追求服务质量最优化的用户之间存在的冲突问题设置回报函数,该回报函数可根据具体要求调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小,当希望得到较小的任务延迟时,则增加任务延迟的回报值比重,当希望得到较小的能源消耗时,则增加能源消耗的回报值比重,通过调整不同优化目标的回报值比重,有效地权衡能源消耗与任务延迟这两个优化目标之间的关系,使得任务延迟与能源消耗之和最小。优化过程采用回报函数对各深度神经网络输出的动作决策计算其回报函数值,选取最小回报函数值对应的动作决策为最佳动作决策,根据最佳动作决策进行多队列多集群的任务调度,从而能够以最小化任务延迟和能源消耗作为云系统的优化目标,生成最优调度策略。
说明书附图
下面结合附图对本发明作进一步说明:
图1为本发明多队列多集群的任务调度方法实施例1的流程图;
图2为本发明多队列多集群的任务调度方法实施例2的流程示意图;
图3为本发明的云系统框架图;
图4为本发明多队列多集群的任务调度系统实施例的结构图。
具体实施方式
实施例1:
图1为本发明多队列多集群的任务调度方法实施例1的流程图。参见图1,该多队列多集群的任务调度方法包括:
步骤S1:构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数。
步骤S2:利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络。
步骤S3:设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小。
该步骤S3具体包括:
步骤S31:将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟。
步骤S32:确定所有任务延迟中的最大任务延迟。
步骤S33:将所有任务传输过程所消耗的能源和所有任务计算过程所消耗的能源相加,得到所有任务的能源消耗。
步骤S34:设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1。
步骤S35:根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
该步骤S35具体包括:
步骤S351:将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积。
步骤S352:将所述能源消耗与所述第二回报值比重相乘,得到第二乘积。
步骤S353:将所述第一乘积与所述第二乘积相加,得到回报函数。
步骤S4:将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策。
步骤S5:根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出。
该步骤S5具体包括:
步骤S51:根据所述回报函数计算每个所述待调度的动作决策的回报函数值。
步骤S52:选取所有回报函数值中的最小回报函数值。
步骤S53:选取所述最小回报函数值对应的待调度的动作决策为最佳动作决策进行输出。
步骤S6:根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
该步骤S6之后还包括:
步骤S10:将每个集群的CPU周期数平均分配给所述集群中的所有所述任务属性组。
该多队列多集群的任务调度方法还包括:
步骤S7:将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行步骤S4-S7,直至所述经验回放池中的样本数达到阈值。
步骤S8:从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络。
步骤S9:将步骤S4中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
实施例2:
图2为本发明多队列多集群的任务调度方法实施例2的流程示意图。参见图2,该多队列多集群的任务调度方法包括:
步骤1:初始化X个神经网络DNN的网络参数θ x和经验回放池(Replay Memory)规模。
其中,θ x为神经网络的参数,θ x包括节点参数以及节点之间的连接线参数。经验回放池存储之前获得的策略,该特性是DNN算法区别于之前神经网络算法的特征之一。神经网络参数的初始化是随机的。
步骤2:将多个队列中的多个任务属性组表示成状态空间s t,表示为s t={task 1,task 2,...,task n1}作为X个异构神经网络DNN的输入。
其中,n1表示的是总的任务数,即等待任务队列数n乘以每个队列包含的任务数m,task 1...task n1表示状态空间中依次排列的多个队列中的多个任务属性组;每个任务属性组均包括任务数据量和任务所需CPU周期数。
云系统的任务是将多个队列中的原子任务,即图2中的任务集调度到多个集群中。假设系统中的等待任务队列数为n个,1≤n≤N,其中,N表示系统中的最大等待任务队列数。设系统中每个队列包含的任务数为m个,1≤m≤M,其中,M表示系统中每个队列包含的最大任务数,则总任务数为m*n个。设计算集群数为k个,1≤k≤K,其中,K表示系统中最大计算集群数。
任务T nm表示第n个队列中第m个任务,任务T nm的属性用二元组表示为(α nmnm),其中,α nm表示第n个队列中第m个任务的数据量,β nm表示第n个队列中第m个任务所需CPU周期数。另外,设定每个任务所需要CPU周期与任务数据量呈线性相关,即:β nm=q*α nm,其中q表示计算力与数据的比率(Computationto DataRatio)。
集群J_k的属性用三元组表示为 (C_k, p_k^comm, p_k^comp)，其中C_k表示集群k的计算能力，即是CPU的周期数，p_k^comm表示集群k的通信功耗，p_k^comp表示集群k的计算功耗。
另外,多个队列到多个集群之间的带宽表示为{w 12,...,w nk},w nk表示队列n到集群k的带宽大小。
步骤3:每个DNN输出不同的动作决策(d 1,d 2,...,d x)。d x表示第X个DNN输出的动作决策。
动作决策即这个任务调度到哪一个集群,动作决策也称为调度策略。
步骤4:计算每个动作决策对应的Q值，选择获取最小Q值的动作决策作为该任务集的最佳动作决策:
d_opt = argmin_{d_x} Q(s, d_x)
式中，s为当前任务集状态空间，即步骤2中的状态空间s_t={task_1,task_2,...,task_n1}；d_opt为状态空间s_t={task_1,task_2,...,task_n1}对应的最佳动作决策。
在本实施例中主要考虑调度过程的两个关键因素:任务延迟和能源消耗。下面将通过公式阐明本实施例提到的通信模型和计算模型的定义。
通信模型包含任务数据传输需要的传输时间以及能耗。当同个队列中多个任务同时调度到同个集群时，带宽是均分给每个任务的，因此队列n中的任务m所能占用的带宽w_nm为:
w_nm = w_nk / A_nk
式中，w_nk表示队列n到集群k的带宽大小，A_nk表示队列n中调度到集群k的任务数。
通信延迟T_comm即是任务数据上传到服务器所消耗的时间:
T_nm^comm = α_nm / w_nm
其中，T_nm^comm表示队列n中的任务m上传到服务器所消耗的时间，α_nm表示队列n的任务m的数据量。
通信能耗E_comm即是任务传输过程中所消耗的能源:
E_nm^comm = p_k^comm · α_nm
其中，E_nm^comm是队列n中的任务m传输过程中所消耗的能源，p_k^comm是单位任务(例如1MB字节)传输所消耗的功耗。
队列n中所有任务的通信能源消耗E_n^comm为:
E_n^comm = Σ_{m=1}^{M} E_nm^comm
计算模型包含任务的计算延迟和计算能耗。集群计算能力将均分给调度到该集群的任务，即每个任务获得CPU周期:
C_nm = C_k / (Σ_{n=1}^{N} Σ_{m=1}^{M} a_nmk)
式中，C_nm表示队列n中的任务m获得的CPU周期，a_nmk表示第n个队列中第m个任务是否调度到集群k。
计算延迟T_comp即是任务计算所消耗的时间:
T_nm^comp = β_nm / C_nm
式中，T_nm^comp表示队列n中的任务m计算所消耗的时间。
计算能耗E_comp即是任务计算过程中所消耗的能源:
E_nm^comp = p_k^comp · T_nm^comp
式中，E_nm^comp表示队列n中的任务m计算过程中所消耗的能源。
队列n中所有任务的计算能源消耗E_n^comp为:
E_n^comp = Σ_{m=1}^{M} E_nm^comp
本实施例考虑的因素是任务延迟与能源消耗，因此系统的回报函数，即Q值定义如下:
Q(s,d) = ξ_d · max_{n,m}(T_nm^comm + T_nm^comp) + ξ_e · Σ_{n=1}^{N}(E_n^comm + E_n^comp)
式中，ξ_d表示任务延迟所占的优化比重，ξ_e表示能源消耗所占的优化比重，ξ_d∈[0,1]，ξ_e∈[0,1]，且ξ_d+ξ_e=1，根据需要调整ξ_d和ξ_e这两个参数，即该发明更希望得到较小的任务延迟，则增加ξ_d，反之增加ξ_e。d表示DNN输出的动作决策。ξ_d与ξ_e的设置是根据具体要求，使得任务延迟与能源消耗两者之和，即优化目标最小。
系统的最终优化目标是获得最优的调度策略，在DNN作出动作决策后，根据公式 d_opt = argmin_{d_x} Q(s, d_x) 获取最小Q值的动作决策作为该任务集的最佳动作决策，从而获得最优的调度策略，最小化任务延迟与能源消耗，即是最小化期望回报值R:
R = min Q(s,d)
系统的优化过程即调度模型的训练过程，调度模型由多个异构的DNN组成，系统的优化过程包括:
首先将多个队列中的多个任务属性组表示成状态空间s，表示为{α_11,β_11,α_12,β_12,...,α_nm,β_nm}，作为X个DNN的输入，每个DNN输出不同的动作决策(d_1,d_2,...,d_x)。d_x表示第x个DNN输出的动作决策。在时间步t，系统状态s_t作为输入，输出每个DNN的动作决策d_b^t，表示为:
d_b^t = f_{θ_b}(s_t)
f_{θ_b}是表示第b个DNN的网络参数的函数。动作决策d为一串二进制序列，表示为d={a_111,a_121,..,a_nmk}，a_nmk∈{0,1}，1≤n≤N，1≤m≤M，1≤k≤K，若a_nmk=1，表示队列n中作业m调度到集群k中，紧接着，采用公式 Q(s,d) = ξ_d · max_{n,m}(T_nm^comm + T_nm^comp) + ξ_e · Σ_{n=1}^{N}(E_n^comm + E_n^comp) 计算每个DNN输出的动作决策的Q值，选择获取最小Q值的动作决策作为该任务集的最佳动作决策:
d_opt = argmin_{d_x} Q(s_t, d_x)
步骤5:将当前任务集状态空间s和最佳动作决策d_opt作为样本(s,d_opt)存储到经验回放池中，待经验回放池中的样本数达到阈值，从中随机抽取Mini-batch(批大小，即每次调整参数前所选取的样本数量)数的样本，进行模型训练，目标是最小化期望回报值，即最小化任务延迟与能源消耗。由于系统的最终优化目标是获得最优的调度策略，因此通过经验回放池中的样本不断地对模型进行训练和优化，使得模型的精度更高，从而实现将任务集状态空间输入模型后即可输出最佳的动作决策，即最优的调度策略，该最佳动作决策能够最小化期望回报值，即最小化任务延迟与能源消耗。梯度下降算法通过最小化交叉熵损失(minimizing the cross-entropy loss)来优化各DNN的参数值θ_x(DNN的参数即节点及节点间的连线上的权重)，直至回报函数收敛，该过程按照最小化交叉熵损失函数公式
L(θ_x) = -[d_t^T · log f_{θ_x}(s_t) + (1-d_t)^T · log(1-f_{θ_x}(s_t))]
计算，式中，L(θ_x)表示各DNN的参数值θ_x的最小化交叉熵损失函数，T表示数学上的矩阵转置，d_t表示在时间步t，系统状态s_t作为输入，最终输出的动作决策，f_{θ_x}(s_t)表示系统状态s_t作为输入，X个神经网络DNN的网络参数的函数。
步骤6:对多个异构的DNN组成的调度模型进行测试。为了验证本实施例提出的模型的有效性与性能,设计两部分仿真实验。第一部分是针对HDLL模型的关键参数进行对比验证,观察参数对模型的优化效果的影响。模型关键参数包括异构DNN个数、学习率、Batch-size(即一次训练所选取的样本数,Batch-Size的大小影响模型的优化程度和速度,同时其直接影响到GPU内存的使用情况,若GPU内存不大,该数值最好设置小一点)。第二部分是对本实施例与基准算法,包括随机选择算法(Random)、循环选择算法(Round-Robin,RR)、MoPSO多目标粒子群优化、DLL分布式学习算法和Greedy贪婪算法的优化结果进行对比验证。实验结果表明该模型能够有效的权衡能源消耗与任务完工时间这两个优化目标,具有较明显的优化效果。
图3为本发明的云系统框架图。参见图3,本实施例中由多个异构的DNN组成的调度模型,即图3中的分布式深度学习模型设置于云系统框架中的第二层,该调度模型即异构分布深度学习模型基本架构。
云系统框架主要有三层。第一层是用户负载层,由于云用户的数量的庞大,用户种类的多样性,因此用户负载存在多样性,用户负载中包含多个任务,任务之间存在依赖性、以及数据的传输。因此在任务调度的过程中需要保证任务之间的执行顺序和依赖关系。云系统框架在用户负载层采用任务解耦器对用户负载解耦成子任务分配到多个任务等待队列中,同时确保等待队列中的子任务的父任务已执行完成以及所需的数据已传输完成,保证队列中的任务具有原子性,均能独立运行。第二层是整个框架的核心层-调度层,该层是负责任务的调度与资源的供给,以达到最小化任务延迟和系统能源消耗的优化目标。该层包含以下四个组件:1)调度模型:由多个异构的DNN组成。2)能源消耗模型:包含通信消耗和计算消耗。3)服务水平协议(SLA):是用户与云服务供应商签署的服务协议,主要考虑任务的完成时间,即任务延迟,包括任务通信延迟和计算延迟。 4)控制器(Controller),是任务调度层的核心组件,负责协调各个组件;生成任务调度和资源配置策略,保证SLA和最小系统能耗。第三层是数据中心层。数量众多的基础设备组成规模庞大的数据中心,可按照地理位置将邻近的服务器聚类成计算集群。在通信方面,多个计算集群之间通过光纤连接,传输速度极快,因此可忽略其之间的数据传输延迟和能耗。然而来自不同用户的云任务连接到不同集群的带宽和距离有明显的差距,因此这两者是优化问题的重要考虑因素。另外,由于硬件设备的差异,集群的计算能力和计算功率也是影响系统调度效率的关键因素。
如图3所示,本实施例针对多用户多云供应商(图3中第一层)的任务调度与资源配置问题,提出了一种两阶段的调度框架,该框架由任务调度阶段(图3中第二层)和资源配置阶段(图3中第三层)组成,根据不同阶段的优化目标来完成调度任务。其中,任务调度阶段的优化目标是本实施例的回报函数。资源调度阶段是将集群计算能力均分给调度到该集群的任务。不同阶段采用不同的调度器,两阶段的调度框架由任务调度阶段和资源配置阶段组成,根据不同阶段的优化目标来完成调度任务。任务调度阶段的调度器称之为任务调度器,资源调度阶段的调度器称之为资源调度器。任务调度阶段,采用基于异构分布式深度学习模型来完成将作业调度到数据中心的任务调度任务。资源配置阶段,采用基于深度强化学习模型DQN来完成为任务配置虚拟机资源,并部署到服务器中的资源配置任务。资源配置即集群计算能力将均分给调度到该集群的任务,即每个任务获得CPU周期:
C_nm = C_k / (Σ_{n=1}^{N} Σ_{m=1}^{M} a_nmk)
本发明针对以能源消耗最小化为目标的云服务供应商和追求服务质量最优化的用户之间的冲突问题,提出一种基于Deep Q-network的云任 务调度与资源配置方法。该方法设计的回报函数为任务延迟与能源消耗的和,定义为:
Q(s,d) = ξ_d · max_{n,m}(T_nm^comm + T_nm^comp) + ξ_e · Σ_{n=1}^{N}(E_n^comm + E_n^comp)
式中，ξ_d、ξ_e∈[0,1]，且ξ_d+ξ_e=1，可根据需要调整ξ_d和ξ_e这两个参数，当更希望得到较小的任务延迟时，增加ξ_d，当更希望得到较小的能源消耗时，增加ξ_e，通过调整不同优化目标的回报值比重，来权衡能源消耗与任务完工时间这两个优化目标关系。本发明能够根据实验结果，选取云服务双方都能接受的作业延时和系统能耗，则该状态下的回报函数参数也就相应确定了，通过调整回报函数中的权重参数ξ_d和ξ_e来动态调整系统优化目标，以满足实际的调度需求。
本发明提出一种基于异构分布深度学习的云任务调度与资源配置方法,该方法通过联合多个异构的DNN作为云系统的调度模型以解决多队列多集群的任务调度与资源配置问题,以最小化任务延迟和能耗消耗作为云系统的优化目标,生成最优调度策略。根据本发明的应用范围,通过本发明提供的步骤,能够得到某个任务分配到哪个集群,使得本发明设计的优化目标最大。
图4为本发明多队列多集群的任务调度系统实施例的结构图。参见图4,该多队列多集群的任务调度系统包括:
训练数据集构建模块401,用于构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数。
训练和优化模块402,用于利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络。
回报函数设置模块403,用于设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小。
该回报函数设置模块403具体包括:
任务延迟计算单元,用于将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟。
最大任务延迟确定单元,用于确定所有任务延迟中的最大任务延迟。
能源消耗计算单元,用于将所有任务传输过程所消耗的能源和所有任务计算过程所消耗的能源相加,得到所有任务的能源消耗。
回报值比重设置单元,用于设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1。
回报函数设置单元,用于根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
该回报函数设置单元具体包括:
第一乘积获取子单元,用于将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积。
第二乘积获取子单元,用于将所述能源消耗与所述第二回报值比重相乘,得到第二乘积。
回报函数获取子单元,用于将所述第一乘积与所述第二乘积相加,得到回报函数。
动作决策获取模块404,用于将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策。
最佳动作决策获取模块405,用于根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出。
该最佳动作决策获取模块405具体包括:
回报函数值计算单元,用于根据所述回报函数计算每个所述待调度的动作决策的回报函数值。
最小回报函数值选取单元,用于选取所有回报函数值中的最小回报函数值。
最佳动作决策选取单元,用于选取所述最小回报函数值对应的待调度的动作决策为最佳动作决策进行输出。
调度模块406,用于根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
该调度模块406之后还包括:
资源配置模块,用于将每个集群的CPU周期数平均分配给所述集群中的所有所述任务属性组。
该多队列多集群的任务调度系统还包括:
样本存储模块,用于将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行动作决策获取模块、最佳动作决策获取模块、调度模块、样本存储模块,直至所述经验回放池中的样本数达到阈值。
进一步训练和优化模块,用于从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络。
更新模块,用于将动作决策获取模块中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
上述实施方式旨在举例说明本发明能够被本领域专业技术人员实现或使用,对上述实施方式进行常规修改对本领域技术人员来说将是显而易见的,故本发明包括但不限于上述实施方式,任何符合本申请文件的描述,符合与本文所公开的原理相同或相似的方法、工艺、产品,均落入本发明的保护范围之内。

Claims (10)

  1. 一种多队列多集群的任务调度方法,其特征在于,包括:
    步骤S1:构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周期数;
    步骤S2:利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络;
    步骤S3:设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小;
    步骤S4:将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策;
    步骤S5:根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出;
    步骤S6:根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
  2. 根据权利要求1所述的多队列多集群的任务调度方法,其特征在于,还包括:
    步骤S7:将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行步骤S4-S7,直至所述经验回放池中的样本数达到阈值;
    步骤S8:从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络;
    步骤S9:将步骤S4中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
  3. 根据权利要求1所述的多队列多集群的任务调度方法,其特征在于,所述设置回报函数,具体包括:
    步骤S31:将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟;
    步骤S32:确定所有任务延迟中的最大任务延迟;
    步骤S33:将所有任务传输过程所消耗的能源和所有任务计算过程所 消耗的能源相加,得到所有任务的能源消耗;
    步骤S34:设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1;
    步骤S35:根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
  4. 根据权利要求3所述的多队列多集群的任务调度方法,其特征在于,所述根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数,具体包括:
    步骤S351:将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积;
    步骤S352:将所述能源消耗与所述第二回报值比重相乘,得到第二乘积;
    步骤S353:将所述第一乘积与所述第二乘积相加,得到回报函数。
  5. 根据权利要求1所述的多队列多集群的任务调度方法,其特征在于,所述根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出,具体包括:
    步骤S51:根据所述回报函数计算每个所述待调度的动作决策的回报函数值;
    步骤S52:选取所有回报函数值中的最小回报函数值;
    步骤S53:选取所述最小回报函数值对应的待调度的动作决策为最佳动作决策进行输出。
  6. 根据权利要求1所述的多队列多集群的任务调度方法,其特征在于,所述根据所述最佳动作决策将多个所述任务属性组调度到多个集群,之后还包括:
    步骤S10:将每个集群的CPU周期数平均分配给所述集群中的所有所述任务属性组。
  7. 一种多队列多集群的任务调度系统,其特征在于,包括:
    训练数据集构建模块,用于构建训练数据集;所述训练数据集包括一一对应的状态空间和动作决策;所述状态空间包括依次排列的多个队列中的多个任务属性组;所述任务属性组包括任务数据量和任务所需CPU周 期数;
    训练和优化模块,用于利用所述训练数据集对多个并行的深度神经网络进行训练和优化,得到多个并行的训练和优化后的深度神经网络;
    回报函数设置模块,用于设置回报函数;所述回报函数通过调整任务延迟的回报值比重与能源消耗的回报值比重,使任务延迟与能源消耗之和最小;
    动作决策获取模块,用于将待调度的状态空间输入多个并行的所述训练和优化后的深度神经网络中,得到多个待调度的动作决策;
    最佳动作决策获取模块,用于根据所述回报函数在多个所述待调度的动作决策中确定一个最佳动作决策进行输出;
    调度模块,用于根据所述最佳动作决策将多个所述任务属性组调度到多个集群。
  8. 根据权利要求7所述的多队列多集群的任务调度系统,其特征在于,还包括:
    样本存储模块,用于将所述待调度状态空间和所述最佳动作决策作为一个样本存储到经验回放池中;重复执行动作决策获取模块、最佳动作决策获取模块、调度模块、样本存储模块,直至所述经验回放池中的样本数达到阈值;
    进一步训练和优化模块,用于从所述经验回放池中随机抽取设定数量的样本,对多个并行的所述训练和优化后的深度神经网络进一步训练和优化,得到多个并行的进一步训练和优化后的深度神经网络;
    更新模块,用于将动作决策获取模块中多个并行的所述训练和优化后的深度神经网络更新为多个所述并行的进一步训练和优化后的深度神经网络。
  9. 根据权利要求7所述的多队列多集群的任务调度系统,其特征在于,所述回报函数设置模块,具体包括:
    任务延迟计算单元,用于将每个任务传输过程所消耗的时间和所述任务计算过程所消耗的时间相加,得到每个任务的任务延迟;
    最大任务延迟确定单元,用于确定所有任务延迟中的最大任务延迟;
    能源消耗计算单元,用于将所有任务传输过程所消耗的能源和所有任 务计算过程所消耗的能源相加,得到所有任务的能源消耗;
    回报值比重设置单元,用于设置任务延迟所占的第一回报值比重以及能源消耗所占的第二回报值比重;所述第一回报值比重和所述第二回报值比重之和为1;
    回报函数设置单元,用于根据所述最大任务延迟、所述第一回报值比重、所述能源消耗以及所述第二回报值设置回报函数。
  10. 根据权利要求9所述的多队列多集群的任务调度系统,其特征在于,所述回报函数设置单元,具体包括:
    第一乘积获取子单元,用于将所述最大任务延迟与所述第一回报值比重相乘,得到第一乘积;
    第二乘积获取子单元,用于将所述能源消耗与所述第二回报值比重相乘,得到第二乘积;
    回报函数获取子单元,用于将所述第一乘积与所述第二乘积相加,得到回报函数。
PCT/CN2020/101185 2020-07-10 2020-07-10 一种多队列多集群的任务调度方法及系统 WO2022006830A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/101185 WO2022006830A1 (zh) 2020-07-10 2020-07-10 一种多队列多集群的任务调度方法及系统
US17/277,816 US11954526B2 (en) 2020-07-10 2020-07-10 Multi-queue multi-cluster task scheduling method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/101185 WO2022006830A1 (zh) 2020-07-10 2020-07-10 一种多队列多集群的任务调度方法及系统

Publications (1)

Publication Number Publication Date
WO2022006830A1 true WO2022006830A1 (zh) 2022-01-13

Family

ID=79553676

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/101185 WO2022006830A1 (zh) 2020-07-10 2020-07-10 一种多队列多集群的任务调度方法及系统

Country Status (2)

Country Link
US (1) US11954526B2 (zh)
WO (1) WO2022006830A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114629906A (zh) * 2022-03-14 2022-06-14 浙江大学 一种可靠的基于深度强化学习的云容器集群资源调度方法及装置
CN116680062A (zh) * 2023-08-03 2023-09-01 湖南博信创远信息科技有限公司 一种基于大数据集群的应用调度部署方法及存储介质
WO2023184939A1 (zh) * 2022-03-28 2023-10-05 福州大学 基于深度强化学习的云数据中心自适应高效资源分配方法
WO2024027413A1 (zh) * 2022-08-01 2024-02-08 华为技术有限公司 一种协同调度方法和相关设备

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220118047A (ko) * 2021-02-18 2022-08-25 삼성전자주식회사 어플리케이션의 모델 파일을 초기화하는 프로세서 및 이를 포함하는 전자 장치
CN115237581B (zh) * 2022-09-21 2022-12-27 之江实验室 一种面向异构算力的多策略智能调度方法和装置
CN116048820B (zh) * 2023-03-31 2023-06-06 南京大学 面向边缘云的dnn推断模型部署能耗优化方法和系统
CN117349026B (zh) * 2023-12-04 2024-02-23 环球数科集团有限公司 一种用于aigc模型训练的分布式算力调度系统


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190362227A1 (en) * 2018-05-23 2019-11-28 Microsoft Technology Licensing, Llc Highly performant pipeline parallel deep neural network training
US20200067637A1 (en) * 2018-08-21 2020-02-27 The George Washington University Learning-based high-performance, energy-efficient, fault-tolerant on-chip communication design framework
US11868880B2 (en) * 2018-11-20 2024-01-09 Microsoft Technology Licensing, Llc Mitigating communication bottlenecks during parameter exchange in data-parallel DNN training
US11232368B2 (en) * 2019-02-20 2022-01-25 Accenture Global Solutions Limited System for predicting equipment failure events and optimizing manufacturing operations
US11106943B2 (en) * 2019-07-23 2021-08-31 Microsoft Technology Licensing, Llc Model-aware synthetic image generation

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200201677A1 (en) * 2018-04-11 2020-06-25 Shenzhen University Cloud computing task allocation method and device, apparatus, and storage medium
CN110351348A (zh) * 2019-06-27 2019-10-18 广东石油化工学院 一种基于dqn的云计算资源调度优化方法
CN110737529A (zh) * 2019-09-05 2020-01-31 北京理工大学 一种面向短时多变大数据作业集群调度自适应性配置方法
CN110580196A (zh) * 2019-09-12 2019-12-17 北京邮电大学 一种实现并行任务调度的多任务强化学习方法
CN110768827A (zh) * 2019-10-17 2020-02-07 广州大学 一种基于群智能算法的任务卸载方法
CN111158912A (zh) * 2019-12-30 2020-05-15 天津大学 云雾协同计算环境下一种基于深度学习的任务卸载决策方法
CN111722910A (zh) * 2020-06-19 2020-09-29 广东石油化工学院 一种云作业调度及资源配置的方法

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114629906A (zh) * 2022-03-14 2022-06-14 浙江大学 一种可靠的基于深度强化学习的云容器集群资源调度方法及装置
CN114629906B (zh) * 2022-03-14 2023-09-29 浙江大学 一种可靠的基于深度强化学习的云容器集群资源调度方法及装置
WO2023184939A1 (zh) * 2022-03-28 2023-10-05 福州大学 基于深度强化学习的云数据中心自适应高效资源分配方法
WO2024027413A1 (zh) * 2022-08-01 2024-02-08 华为技术有限公司 一种协同调度方法和相关设备
CN116680062A (zh) * 2023-08-03 2023-09-01 湖南博信创远信息科技有限公司 一种基于大数据集群的应用调度部署方法及存储介质
CN116680062B (zh) * 2023-08-03 2023-12-01 湖南博创高新实业有限公司 一种基于大数据集群的应用调度部署方法及存储介质

Also Published As

Publication number Publication date
US11954526B2 (en) 2024-04-09
US20220269536A1 (en) 2022-08-25


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20944394

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20944394

Country of ref document: EP

Kind code of ref document: A1