CN109960578A - A data center resource offline scheduling method based on deep reinforcement learning - Google Patents

A data center resource offline scheduling method based on deep reinforcement learning Download PDF

Info

Publication number
CN109960578A
Authority
CN
China
Prior art keywords
offline
layer
resource
deep reinforcement learning
data center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711399661.6A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201711399661.6A
Publication of CN109960578A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources

Abstract

The present invention relates to the field of computer technology, and in particular to a data center resource offline scheduling method based on deep reinforcement learning. Deep reinforcement learning can provide a viable alternative to hand-crafted heuristics for resource scheduling and management. Through continuous learning, a deep reinforcement learning method can be optimized for a particular workload (such as a periodic load or a random load) and maintain high-quality scheduling results under various conditions. Taking minimization of the average job slowdown (the system slowdown time) as the optimization objective, the reward value of each scheduling decision in offline scheduling is computed to guide the deep network toward the objective, finally training it toward the optimal goal. The results show that, in a large number of embodiment tests of the present invention, the slowdown of the offline scheduling method using deep reinforcement learning is far below that of traditional optimized job scheduling methods such as SJF (shortest job first), demonstrating the advantage of deep reinforcement learning methods in this field.

Description

A data center resource offline scheduling method based on deep reinforcement learning
Technical field
The present invention relates to the field of computer technology, and in particular to a data center resource offline scheduling method based on deep reinforcement learning.
Background technique
Resource management is a fundamental problem in computer networks and operating systems. Resource allocation is usually a combinatorial problem that can be mapped to various NP-hard problems. Although every resource allocation scheme is specific, the general approach is to design efficient heuristic algorithms with performance guarantees under certain conditions. It has recently been shown that machine learning can provide a viable alternative to hand-crafted heuristics for resource management, especially deep reinforcement learning, which has become an active area of machine learning research.
In fact, deep reinforcement learning methods are particularly suitable for resource management systems. First, the decisions these systems make are often highly repetitive, which generates abundant training data for deep reinforcement learning. Second, deep reinforcement learning can model complex systems and decision strategies as deep neural networks. Third, even in the absence of an accurate model, targets that are difficult to optimize directly can be trained as long as a return signal correlated with the objective is available. Finally, through continuous learning, deep reinforcement learning methods can be optimized for a particular workload (for example, small jobs, low load, periodic load) and remain efficient under various conditions.
Summary of the invention
The technical problem to be solved by the present invention is to provide a deep reinforcement learning algorithm applied to offline scheduling of data center resources, as an optimal alternative to the current efficient heuristic algorithms.
A data center resource offline scheduling method based on deep reinforcement learning, characterized in that the data center resource offline scheduling system comprises a data source module, a running environment module, an evaluation mechanism learning module, and a control strategy learning module;
The data source module is used to generate the data of offline scheduled jobs. The data source includes the resource types required by a job (for example, CPU, memory, I/O), the resource sizes required by a job, and the total number of offline jobs.
The running environment module is used to construct the running environment model. The running environment includes the allocated cluster resources and the waiting job slots. All parts of the running environment module are represented as cell-grid images. The cluster-resource image shows, for each kind of resource, the allocation to jobs scheduled for service over T time steps from the current time onward. The job-slot images represent the resource demands of the waiting jobs.
The evaluation mechanism learning module combines the information obtained from the data source module and the running environment module with the evaluation mechanism to obtain the required reward function during operation. The reward function, as the feedback data of the evaluation mechanism, is delivered by the evaluation mechanism learning module to the control strategy learning module to optimize the network parameters.
The control strategy learning module learns the optimization strategy of the deep reinforcement learning method: the obtained reward function is used to guide the subsequent job scheduling sequence, and the neural network parameters are updated through the policy to obtain the final physical control strategy for the resource-scheduled jobs.
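The specification gives no code for these modules; the following is a minimal, hypothetical Python skeleton of the four modules and their data flow, with every class and method name invented for illustration rather than taken from the patent.

```python
class DataSourceModule:
    """Generates the offline job set: required resource types, sizes, and job count."""
    def generate_jobs(self, num_jobs):
        raise NotImplementedError

class RunningEnvironmentModule:
    """Holds the cluster-resource grid images and the waiting job slots."""
    def observe(self):
        raise NotImplementedError   # returns the grid-image state
    def step(self, action):
        raise NotImplementedError   # returns the next state and a done flag

class EvaluationMechanismModule:
    """Combines data-source and environment information into a reward signal."""
    def reward(self, environment):
        raise NotImplementedError

class ControlStrategyModule:
    """Policy network; consumes rewards to update its parameters."""
    def select_action(self, state):
        raise NotImplementedError
    def update_parameters(self, trajectory):
        raise NotImplementedError
```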
The prospects of the present invention are broad: it can address problems such as the high energy consumption and serious resource waste that are widespread in data centers. The present invention therefore has good applicability and can bring economic benefits to many industries. Compared with existing algorithms, the advantages of the deep reinforcement learning algorithm used by the present invention are its real-time performance, its speed, and its ability to keep learning.
Detailed description of the invention
Fig. 1 is a framework schematic diagram of the deep reinforcement learning of an embodiment of the present invention.
Fig. 2 is a state diagram of the offline system of an embodiment of the present invention.
Fig. 3 is a flow chart of resource offline scheduling based on deep reinforcement learning of an embodiment of the present invention.
Specific embodiment
Specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples are intended to illustrate the present invention, not to limit its scope.
Fig. 1 is a framework schematic diagram of the deep reinforcement learning of an embodiment of the present invention.
As shown in Figure 1, the agent and the environment interact. At each time step t, the agent observes some state s_t and selects an action a_t. After the action, the environment state transitions to s_{t+1} and the agent receives reward r_t. State transitions and rewards are stochastic and are assumed to have the Markov property.
Further, the agent can only control its own behavior; it has no prior knowledge of which state the environment will transition to or what the reward may be. By interacting with the environment during training, the agent can observe these quantities. The goal of learning is to maximize the expected cumulative discounted reward E[Σ_{t=0}^∞ γ^t r_t], where γ ∈ (0,1] is the discount factor.
Further, the present invention uses a policy-search-based reinforcement learning method, a reinforcement learning algorithm learned by performing gradient descent on the policy parameters. The goal is to maximize the expected cumulative discounted reward; the gradient of this objective is given by: ∇_θ E_{π_θ}[Σ_t γ^t r_t] = E_{π_θ}[∇_θ log π_θ(s,a) Q^{π_θ}(s,a)]
Further, Q^{π_θ}(s,a) is the expected cumulative reward of selecting action a in state s and then following policy π_θ. The key idea of policy-gradient methods is to estimate the gradient by observing execution trajectories obtained by following the policy. In the simple Monte Carlo method, the agent samples multiple trajectories and uses the empirically computed cumulative discounted reward v_t as an unbiased estimate of Q^{π_θ}(s_t,a_t). The policy parameters are then updated by gradient steps: θ ← θ + α Σ_t ∇_θ log π_θ(s_t,a_t) v_t
Further, α is the step size. This equation yields the well-known REINFORCE algorithm and can be understood intuitively as follows. The direction ∇_θ log π_θ(s_t,a_t) gives how to change the policy parameters in order to increase π_θ(s_t,a_t) (the probability of action a_t at state s_t). The equation takes a step in this direction; the size of the step depends on how large the return v_t is. In our design, we use a slight variant that subtracts a baseline value from each return v_t to reduce the variance of the gradient estimate.
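As a concrete illustration of the update above, the following Python sketch implements REINFORCE with a per-time-step baseline for a simple linear-softmax policy. The policy form, array shapes, and function names are assumptions for illustration (the embodiment's policy is the CNN described below), not the patent's implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, trajectories, alpha=0.01, gamma=1.0):
    """theta: (n_features, n_actions); trajectories: list of (states, actions, rewards)."""
    # Monte Carlo returns v_t for every trajectory.
    all_returns = []
    for _, _, rewards in trajectories:
        v, vs = 0.0, []
        for r in reversed(rewards):
            v = r + gamma * v
            vs.append(v)
        all_returns.append(vs[::-1])
    # Baseline b_t: average return at time step t across trajectories.
    max_len = max(len(vs) for vs in all_returns)
    sums, counts = np.zeros(max_len), np.zeros(max_len)
    for vs in all_returns:
        for t, v in enumerate(vs):
            sums[t] += v
            counts[t] += 1
    baseline = sums / np.maximum(counts, 1)
    # Gradient step: sum_t grad log pi_theta(a_t|s_t) * (v_t - b_t).
    grad = np.zeros_like(theta)
    for (states, actions, _), vs in zip(trajectories, all_returns):
        for t, (s, a) in enumerate(zip(states, actions)):
            probs = softmax(s @ theta)        # pi_theta(.|s), shape (n_actions,)
            dlog = -np.outer(s, probs)        # d log pi(a|s) / d theta
            dlog[:, a] += s
            grad += dlog * (vs[t] - baseline[t])
    return theta + alpha * grad               # ascent on expected cumulative reward
```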
Fig. 2 is a state diagram of the offline system of an embodiment of the present invention.
We represent the state of the offline system as different grid images, including the grid image of the currently allocated cluster resources and the resource-demand grid images of the waiting job slots. The two leftmost grids of Fig. 2 show, for each kind of resource, the allocation to jobs scheduled for service, covering T time steps from the current time onward. Different colors in these images represent different jobs. A job-slot grid image represents the resource demand of a waiting job; the number of job slots equals the number of generated offline jobs, so that jobs correspond one-to-one with job slots.
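A minimal sketch of assembling this grid-image state in Python, assuming the embodiment's dimensions (two resources, a 10-cell capacity per resource, a 20-step horizon) and assuming demands are quantized to grid cells; the array layout is an assumption, since the patent describes the images only pictorially.

```python
import numpy as np

NUM_RES, CAP, T = 2, 10, 20   # resources, cells per resource (1r), time horizon

def build_state(cluster_image, job_slots):
    """cluster_image: (NUM_RES, T, CAP) 0/1 occupancy of the scheduled jobs.
    job_slots: list of (duration_in_t, demand_cells_per_resource) or None if empty."""
    images = [cluster_image[r] for r in range(NUM_RES)]
    for slot in job_slots:
        img = np.zeros((NUM_RES, T, CAP))
        if slot is not None:
            duration, demands = slot
            for r, d in enumerate(demands):
                img[r, :duration, :d] = 1.0   # demand drawn as a filled rectangle
        images.extend(img[r] for r in range(NUM_RES))
    # Concatenate the per-resource images side by side into one CNN input plane.
    return np.concatenate(images, axis=1)
```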
Fig. 3 is a flow chart of resource scheduling based on deep reinforcement learning of an embodiment of the present invention.
As shown in Fig. 3, the resource scheduling based on deep reinforcement learning comprises the following steps:
Step S301, randomly generate offline jobs.
Further, we assume two kinds of resources, i.e., capacity {1r; 1r}. Job durations and resource demands are chosen as follows: 80% of job durations are chosen uniformly between 1t and 3t; the rest are chosen uniformly between 10t and 15t. Each job has one dominant resource picked independently at random. The demand for the dominant resource is chosen between 0.25r and 0.5r, and the demand for the other resource is chosen uniformly between 0.05r and 0.1r.
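The distribution above translates directly into a small generator; this is a sketch under the stated parameters, with the total job count an illustrative choice (the patent does not fix it).

```python
import random

def generate_job():
    # 80% short jobs (1t-3t), the rest long jobs (10t-15t).
    if random.random() < 0.8:
        duration = random.randint(1, 3)
    else:
        duration = random.randint(10, 15)
    # One dominant resource picked at random: demand 0.25r-0.5r for it,
    # 0.05r-0.1r for the other resource.
    dominant = random.randrange(2)
    demands = [0.0, 0.0]
    demands[dominant] = random.uniform(0.25, 0.5)
    demands[1 - dominant] = random.uniform(0.05, 0.1)
    return duration, demands

# Illustrative total; the number of job slots is set equal to this count.
jobs = [generate_job() for _ in range(60)]
```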
Step S302, load all offline jobs into job slots. In offline job scheduling, the number of jobs and their demands are known in advance. Therefore, the number of job slots is set equal to the number of generated random offline jobs, so that jobs correspond one-to-one with job slots.
Step S303, the deep learning network selects the action value A.
Further, the deep neural network we use is a convolutional neural network (CNN): the first layer is the input layer, the second layer is convolutional layer Conv1, the third layer is max-pooling layer Pool1, the fourth layer is convolutional layer Conv2, the fifth layer is max-pooling layer Pool2, the sixth layer is fully connected layer Local3, the ninth layer is fully connected layer Local4, and the tenth layer is the Softmax output layer. The action value A is selected according to the probabilities of the output layer.
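The specification names the layer order but gives no channel counts, kernel sizes, hidden widths, or the contents of the seventh and eighth layers, so every size in the following PyTorch sketch is an assumption; only the layer order follows the description.

```python
import torch
import torch.nn as nn

class PolicyCNN(nn.Module):
    def __init__(self, in_channels=1, num_actions=61):  # e.g. N job slots + a no-op (assumed)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),  # Conv1
            nn.ReLU(),
            nn.MaxPool2d(2),                                       # Pool1
            nn.Conv2d(16, 32, kernel_size=3, padding=1),           # Conv2
            nn.ReLU(),
            nn.MaxPool2d(2),                                       # Pool2
        )
        self.classifier = nn.Sequential(
            nn.LazyLinear(256),           # Local3 (input size inferred at first call)
            nn.ReLU(),
            nn.Linear(256, num_actions),  # Local4
            nn.Softmax(dim=-1),           # output layer: action probabilities
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

# Sampling the action value A from the output-layer probabilities:
# probs = PolicyCNN()(state_batch); A = torch.multinomial(probs, 1)
```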
Step S304, judge whether job slot A is empty.
Further, the job-slot grid width equals the maximum resource capacity 1r, and its height equals 20t, i.e., 20 time steps.
Step S305, judge whether the job in job slot A can be loaded into the cluster schedule.
Further, the cluster grid size equals the job-slot grid size.
Step S306, the job is loaded into the cluster, and job slot A is set to empty.
Further, the resource demand of the loaded job is shown on the cluster grid image.
Step S307, the cluster runs for one time step.
Further, the system time increases by one time step and the cluster grid image shifts up by one row: the former first row of the image is overwritten, and the last row is set to empty.
Step S308, judge whether job scheduling is complete.
Further, job scheduling is complete only when all of the following are satisfied: all jobs have been loaded into the system, no job is running, and no job is waiting in a job slot.
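Steps S303 through S308 form one scheduling episode; the following Python sketch mirrors that control flow under the assumption that an environment object exposes the helpers named below (all invented for illustration, not defined by the patent).

```python
def run_episode(policy, env):
    """One offline-scheduling episode following the S303-S308 flowchart."""
    transitions = []
    done = False
    while not done:
        state = env.observe()
        a = policy.select_action(state)            # S303: pick action value A
        slot = env.job_slots[a] if a < len(env.job_slots) else None
        if slot is not None and env.fits(slot):    # S304/S305: slot non-empty, job fits
            env.load(slot)                         # S306: load job into the cluster,
            env.job_slots[a] = None                #       set job slot A to empty
        else:
            env.advance_time()                     # S307: run one time step; the cluster
                                                   #       image shifts up by one row
        transitions.append((state, a, env.reward()))
        done = env.all_scheduled()                 # S308: all jobs loaded, none running
                                                   #       or waiting
    return transitions                             # consumed by the S309 update
```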
Step S309, update the offline neural network parameters by the reward value.
Further, in one embodiment of the invention, we take minimizing the average job slowdown as the optimization objective. For each job j, the slowdown is given by S_j = C_j/T_j, where C_j is the completion time of the job (i.e., the time between arrival and completion of execution) and T_j is the (ideal) duration of the job; note that S_j ≥ 1. Accordingly, we set the reward at each time step to r = Σ_{j∈J} (−1/T_j), where J is the set of jobs currently in the system (scheduled or waiting for service). Observe that, setting the discount factor γ = 1, the cumulative reward over time coincides with the negative sum of the slowdowns, so maximizing the cumulative reward minimizes the average slowdown.
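A one-function sketch of this reward, assuming each job object exposes its ideal duration T_j as an attribute; with γ = 1, summing this reward over an episode yields the negative sum of the slowdowns.

```python
def timestep_reward(jobs_in_system):
    """Per-time-step reward: -1/T_j for every job currently scheduled or waiting."""
    return sum(-1.0 / job.duration for job in jobs_in_system)
```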
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that can readily be conceived by those skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A data center resource offline scheduling method based on deep reinforcement learning, characterized in that the data center resource offline scheduling system comprises:
a data source module, for generating the data of offline scheduled jobs, the data including the resource types required by a job (for example, CPU, memory, I/O), the resource sizes required by a job, and the total number of offline jobs;
a running environment module, for constructing the running environment model, the running environment including the allocated cluster resource Cluster and the waiting job slots JobSlot, all parts of the running environment module being represented as cell-grid images;
an evaluation mechanism learning module, for combining the acquired information with the evaluation mechanism to obtain the required reward function, the reward function being delivered as feedback to the control strategy learning module to optimize the network parameters;
a control strategy learning module, for learning the optimization strategy of the deep reinforcement learning method, the obtained reward function being used to guide the subsequent offline job scheduling sequence, and the neural network parameters being updated through the policy to obtain the final physical control strategy for the resource offline scheduled jobs.
2. A data center resource offline scheduling method based on deep reinforcement learning according to claim 1, characterized in that the method of generating offline scheduled jobs is as follows: we assume two kinds of resources, i.e., capacity {1r; 1r}; job durations and resource demands are chosen as follows: 80% of job durations are chosen uniformly between 1t and 3t, and the rest are chosen uniformly between 10t and 15t; each job has one dominant resource picked independently at random, the demand for the dominant resource being chosen between 0.25r and 0.5r, and the demand for the other resource being chosen uniformly between 0.05r and 0.1r.
3. A data center resource offline scheduling method based on deep reinforcement learning according to claim 1, characterized in that the offline running environment comprises 1 allocated cluster resource Cluster and N waiting job slots JobSlot, where N is the number of offline jobs; for each kind of resource, the Cluster grid is 10 cells wide and 20 cells high, and each waiting JobSlot grid is 10 cells wide and 20 cells high.
4. A data center resource offline scheduling method based on deep reinforcement learning according to claim 1, characterized in that the goal of deep reinforcement learning is to maximize the expected cumulative reward E[Σ_{t=0}^∞ γ^t r_t], where γ ∈ (0,1] is the discount factor; the present invention uses a policy-search-based reinforcement learning method, a reinforcement learning algorithm learned by performing gradient descent on the policy parameters; the goal is to maximize the expected cumulative discounted reward, the gradient of this objective being given by: ∇_θ E_{π_θ}[Σ_t γ^t r_t] = E_{π_θ}[∇_θ log π_θ(s,a) Q^{π_θ}(s,a)]
Further, Q^{π_θ}(s,a) is the expected cumulative reward of selecting action a in state s and then following policy π_θ; the key idea of policy-gradient methods is to estimate the gradient by observing execution trajectories obtained by following the policy; in the simple Monte Carlo method, the agent samples multiple trajectories and uses the empirically computed cumulative discounted reward v_t as an unbiased estimate of Q^{π_θ}(s_t,a_t), the policy parameters then being updated by gradient steps: θ ← θ + α Σ_t ∇_θ log π_θ(s_t,a_t) v_t
Further, α is the step size; this equation yields the well-known REINFORCE algorithm and can be understood intuitively as follows: the direction ∇_θ log π_θ(s_t,a_t) gives how to change the policy parameters to increase π_θ(s_t,a_t) (the probability of action a_t at state s_t); the equation takes a step in this direction, and the size of the step depends on how large the return v_t is; in our design, we use a slight variant that subtracts a baseline value from each return v_t to reduce the variance of the gradient estimate.
5. A data center resource offline scheduling method based on deep reinforcement learning according to claim 1, characterized in that, in one embodiment of the invention, minimizing the average job slowdown is taken as the optimization objective: for each job j, the slowdown is given by S_j = C_j/T_j, where C_j is the completion time of the job (i.e., the time between arrival and completion of execution) and T_j is the (ideal) duration of the job, noting that S_j ≥ 1; accordingly, the reward at each time step is set to r = Σ_{j∈J} (−1/T_j), where J is the set of jobs currently in the system (scheduled or waiting for service); setting the discount factor γ = 1, the cumulative reward over time coincides with the negative sum of the slowdowns, so maximizing the cumulative reward minimizes the average slowdown.
6. A data center resource offline scheduling method based on deep reinforcement learning according to claim 1, characterized in that, in one embodiment of the invention, the deep neural network used is a convolutional neural network (CNN) with the following structure: the first layer is the input layer, the second layer is convolutional layer Conv1, the third layer is max-pooling layer Pool1, the fourth layer is convolutional layer Conv2, the fifth layer is max-pooling layer Pool2, the sixth layer is fully connected layer Local3, the ninth layer is fully connected layer Local4, and the tenth layer is the Softmax output layer.
CN201711399661.6A 2017-12-22 2017-12-22 A data center resource offline scheduling method based on deep reinforcement learning Pending CN109960578A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711399661.6A CN109960578A (en) 2017-12-22 2017-12-22 A data center resource offline scheduling method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711399661.6A CN109960578A (en) 2017-12-22 2017-12-22 A data center resource offline scheduling method based on deep reinforcement learning

Publications (1)

Publication Number Publication Date
CN109960578A (en) 2019-07-02

Family

ID=67018801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711399661.6A Pending CN109960578A (en) 2017-12-22 2017-12-22 A data center resource offline scheduling method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN109960578A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110347478A (en) * 2019-07-08 2019-10-18 白紫星 A kind of model-free data center resource dispatching algorithm based on intensified learning
CN110443412A (en) * 2019-07-18 2019-11-12 华中科技大学 The intensified learning method of Logistic Scheduling and path planning in dynamic optimization process
CN110609474A (en) * 2019-09-09 2019-12-24 创新奇智(南京)科技有限公司 Data center energy efficiency optimization method based on reinforcement learning
CN111651220A (en) * 2020-06-04 2020-09-11 上海电力大学 Spark parameter automatic optimization method and system based on deep reinforcement learning
CN112035251A (en) * 2020-07-14 2020-12-04 中科院计算所西部高等技术研究院 Deep learning training system and method based on reinforcement learning operation layout
CN113157422A (en) * 2021-04-29 2021-07-23 清华大学 Cloud data center cluster resource scheduling method and device based on deep reinforcement learning
CN114116183A (en) * 2022-01-28 2022-03-01 华北电力大学 Data center service load scheduling method and system based on deep reinforcement learning
CN115271130A (en) * 2022-09-30 2022-11-01 合肥工业大学 Dynamic scheduling method and system for maintenance order of ship main power equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577265A (en) * 2012-07-25 2014-02-12 田文洪 Method and device of offline energy-saving dispatching in cloud computing data center
CN108595267A (en) * 2018-04-18 2018-09-28 中国科学院重庆绿色智能技术研究院 A kind of resource regulating method and system based on deeply study

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577265A (en) * 2012-07-25 2014-02-12 田文洪 Method and device of offline energy-saving dispatching in cloud computing data center
CN108595267A (en) * 2018-04-18 2018-09-28 中国科学院重庆绿色智能技术研究院 A kind of resource regulating method and system based on deeply study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HONGZI MAO et al.: "Resource Management with Deep Reinforcement Learning", HotNets '16: Proceedings of the 15th ACM Workshop on Hot Topics in Networks *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110347478A (en) * 2019-07-08 2019-10-18 白紫星 A kind of model-free data center resource dispatching algorithm based on intensified learning
CN110443412A (en) * 2019-07-18 2019-11-12 华中科技大学 The intensified learning method of Logistic Scheduling and path planning in dynamic optimization process
CN110609474A (en) * 2019-09-09 2019-12-24 创新奇智(南京)科技有限公司 Data center energy efficiency optimization method based on reinforcement learning
CN111651220A (en) * 2020-06-04 2020-09-11 上海电力大学 Spark parameter automatic optimization method and system based on deep reinforcement learning
CN111651220B (en) * 2020-06-04 2023-08-18 上海电力大学 Spark parameter automatic optimization method and system based on deep reinforcement learning
CN112035251A (en) * 2020-07-14 2020-12-04 中科院计算所西部高等技术研究院 Deep learning training system and method based on reinforcement learning operation layout
CN112035251B (en) * 2020-07-14 2023-09-26 中科院计算所西部高等技术研究院 Deep learning training system and method based on reinforcement learning operation layout
CN113157422A (en) * 2021-04-29 2021-07-23 清华大学 Cloud data center cluster resource scheduling method and device based on deep reinforcement learning
CN114116183A (en) * 2022-01-28 2022-03-01 华北电力大学 Data center service load scheduling method and system based on deep reinforcement learning
CN114116183B (en) * 2022-01-28 2022-04-29 华北电力大学 Data center service load scheduling method and system based on deep reinforcement learning
CN115271130A (en) * 2022-09-30 2022-11-01 合肥工业大学 Dynamic scheduling method and system for maintenance order of ship main power equipment

Similar Documents

Publication Publication Date Title
CN109960578A (en) A data center resource offline scheduling method based on deep reinforcement learning
Rahimian et al. A hybrid integer programming and variable neighbourhood search algorithm to solve nurse rostering problems
Shyalika et al. Reinforcement learning in dynamic task scheduling: A review
US20220027817A1 (en) Deep reinforcement learning for production scheduling
CN111274036B (en) Scheduling method of deep learning task based on speed prediction
Risbeck et al. Unification of closed-loop scheduling and control: State-space formulations, terminal constraints, and nominal theoretical properties
Wu et al. A multi-model estimation of distribution algorithm for energy efficient scheduling under cloud computing system
Tang et al. Online operations of automated electric taxi fleets: An advisor-student reinforcement learning framework
CN109960573A (en) A cross-domain computing task scheduling method and system based on intelligent sensing
Lin Context-aware task allocation for distributed agile team
Zafeiropoulos et al. Reinforcement learning-assisted autoscaling mechanisms for serverless computing platforms
Stølevik et al. A hybrid approach for solving real-world nurse rostering problems
CN113094159A (en) Data center job scheduling method, system, storage medium and computing equipment
Peng et al. Critical chain based Proactive-Reactive scheduling for Resource-Constrained project scheduling under uncertainty
CN106371924A (en) Task scheduling method for maximizing MapReduce cluster energy consumption
Moazeni et al. Dynamic resource allocation using an adaptive multi-objective teaching-learning based optimization algorithm in cloud
Annear et al. Dynamic assignment of a multi-skilled workforce in job shops: An approximate dynamic programming approach
Cai et al. Deep reinforcement learning for solving resource constrained project scheduling problems with resource disruptions
CN113743761A (en) Intern shift-by-shift scheduling method and system based on random neighborhood search algorithm
Kang et al. Cooperative distributed gpu power capping for deep learning clusters
CN113535365A (en) Deep learning training operation resource placement system and method based on reinforcement learning
Tian et al. Towards critical region reliability support for Grid workflows
Risbeck et al. Closed-loop economic model predictive control for scheduling and control problems
Prasad et al. Adaptive smoothed functional algorithms for optimal staffing levels in service systems
CN110347478A (en) A model-free data center resource scheduling algorithm based on reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190702