CN110515735A - Multi-objective cloud resource scheduling method based on an improved Q-learning algorithm - Google Patents
- Publication number
- CN110515735A (application number CN201910807351.6A)
- Authority
- CN
- China
- Prior art keywords
- value
- learning algorithm
- task
- weight factor
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides a multi-objective cloud resource scheduling method based on an improved Q-learning algorithm. In this method, an Agent interacts continuously with the environment and learns an optimal policy. Using the Cloudsim cloud computing simulation platform, tasks and virtual machines of varying scale are generated at random, and the task completion time and operating cost are optimized simultaneously as the optimization objectives. A heuristic action selection strategy that automatically updates weight factors accelerates the convergence of the Q-learning algorithm and improves its optimization ability, thereby raising cloud resource utilization, improving user satisfaction, and reducing operator cost.
Description
Technical field
The present invention relates to the field of cloud resource scheduling, and in particular to a multi-objective cloud resource scheduling method based on an improved Q-learning algorithm.
Background art
Cloud resource scheduling refers to the process by which different resource users adjust resources on a cloud service platform according to resource usage rules. A well-designed resource scheduling optimization algorithm is essential for improving the overall performance of a cloud computing system. The QoS constraints considered in scheduling include operating cost, completion time, security, availability, and so on. In practice, cost and completion time are the key factors affecting operator and user satisfaction respectively, so a scheduling algorithm should take both reducing execution time and reducing operating cost as optimization objectives. The present invention therefore adopts a multi-objective cloud resource scheduling model that optimizes execution time and operating cost simultaneously.
Reinforcement learning, as a model-free, unsupervised intelligent search algorithm with learning ability, performs well on cloud resource scheduling problems, so a reinforcement learning algorithm is applied here to the cloud resource scheduling problem. Among such algorithms, Q-learning solves cloud resource scheduling problems with relatively stable performance, but it suffers from a large state space and slow convergence. To speed up convergence, the present invention combines weight factors with a heuristic function: after each training step, the immediate reward received by the Agent is used to automatically update the weight factors of the executed actions, which in turn determines the action selection strategy and improves the convergence speed of the algorithm.
Summary of the invention
To solve the cloud resource scheduling problem, the invention discloses a scheduling method that reduces task execution time and system operating cost while also taking the convergence speed, operational efficiency, and optimization ability of the algorithm into account.
To this end, the present invention provides the following technical scheme:
1. A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm, characterized in that the algorithm learns through the interaction of an Agent with the environment: the Agent selects an action through the action selection strategy, updates the Q table, and updates the state, iterating these steps until the Q table converges, at which point the Agent has obtained the optimal policy. The method specifically includes:
Defining the state space: the state space consists of different states s and is represented by a dynamic array. Each state s is a one-dimensional array in which the index denotes the task number and the value denotes the virtual machine number. For example, when 5 tasks are distributed over 3 virtual machines, the state is an integer array of 5 elements, and the value of each element indicates which virtual machine the corresponding task is assigned to.
Defining the action space: an action is defined as an integer variable. When the action of assigning the i-th task to the j-th virtual machine is executed, the integer j is written into the i-th element of the state array s. For example, the one-dimensional array [1, 0, 0, 2, 1] indicates that task 0 is assigned to virtual machine 1, task 1 is assigned to virtual machine 0, and so on.
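For illustration, the sketch below shows this state and action encoding in Python; the patent itself targets the Cloudsim platform, so the names used here (`empty_state`, `apply_action`) are hypothetical stand-ins rather than part of the invention.

```python
from typing import List

# A state is a list of length n_tasks; state[i] is the index of the
# virtual machine that task i is assigned to (-1 means "not yet assigned").
def empty_state(n_tasks: int) -> List[int]:
    return [-1] * n_tasks

def apply_action(state: List[int], task_idx: int, vm_idx: int) -> List[int]:
    """Execute the action 'assign task task_idx to virtual machine vm_idx'."""
    next_state = list(state)          # keep states immutable for clarity
    next_state[task_idx] = vm_idx
    return next_state

# Example from the description: [1, 0, 0, 2, 1] means task 0 -> VM 1,
# task 1 -> VM 0, task 2 -> VM 0, task 3 -> VM 2, task 4 -> VM 1.
s = empty_state(5)
for task_idx, vm_idx in enumerate([1, 0, 0, 2, 1]):
    s = apply_action(s, task_idx, vm_idx)
print(s)  # [1, 0, 0, 2, 1]
```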
Defining the immediate reward: r = ω*(Etc - T_i) + (1 - ω)*(Cst - C_i), where T_i and C_i denote, respectively, the total execution time of the tasks already assigned to the i-th virtual machine in the current state and the total cost of executing those tasks. Etc and Cst are both large constants: Etc is set to the total execution time of all tasks on all virtual machines, and Cst is set to the total cost of all tasks on all virtual machines.
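A minimal sketch of this immediate reward, assuming ω, Etc, and Cst are supplied by the caller; the helper name `immediate_reward` is illustrative and not taken from the patent.

```python
def immediate_reward(exec_time_i: float, cost_i: float,
                     etc: float, cst: float, omega: float) -> float:
    """r = omega*(Etc - T_i) + (1 - omega)*(Cst - C_i), as in the description.

    exec_time_i: total execution time of tasks assigned to VM i (T_i)
    cost_i:      total cost of executing those tasks (C_i)
    etc, cst:    large constants (totals over all tasks on all VMs)
    omega:       user's weighting between time and cost, in [0, 1]
    """
    return omega * (etc - exec_time_i) + (1.0 - omega) * (cst - cost_i)
```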
Defining the Q-value update formula: Q_t = (1 - α)*Q_t + α*(r + γ*Q_{t+1}), where α ∈ (0, 1) denotes the learning rate, γ denotes the discount factor, and Q_t denotes the Q value at time t (Q_{t+1} is the Q value at the next time step).
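A minimal sketch of this tabular update follows. The Q table is stored as a nested dictionary keyed by state and action, and Q_{t+1} is taken as the best value attainable from the next state, as in standard Q-learning; both choices are assumptions for illustration, since the patent only states the update rule itself.

```python
from collections import defaultdict

def make_table():
    """Nested dict: table[state][action] -> float (states must be hashable, e.g. tuples)."""
    return defaultdict(lambda: defaultdict(float))

def update_q(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Q_t = (1 - alpha)*Q_t + alpha*(r + gamma*Q_{t+1}), as in the description above."""
    q_next = max(Q[next_state].values(), default=0.0)   # best value reachable from the next state
    Q[state][action] = (1 - alpha) * Q[state][action] + alpha * (reward + gamma * q_next)

Q = make_table()   # the Q table; a G table for weight-factor bookkeeping could be built the same way
```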
Defining the update formula of the weight factor: s_i and a_i denote the state and the action whose weight factor needs to be updated; f(s_i, a_i) denotes the weight factor of executing action a_i in state s_i; r_max denotes the maximum reward value obtained in state s_i; a_t denotes the action selected by the Agent in state s_i in the current period; and r_t denotes the reward fed back by executing action a_t in state s_i in the current period.
Defining the update rule of the heuristic function: π_f(s_t) denotes the optimal action selected under the guidance of the weight factor function f when the state is s_t; the ratio of the maximum weight factor to the total weight factor indicates the importance of that action; and the value U expresses how strongly the weight factors influence action selection, with a larger U giving the weight factors stronger guidance over action selection.
Defining the heuristic action selection strategy that automatically updates the weight factors: a_random denotes a randomly selected action; p, q ∈ [0, 1], and p determines the probability with which the Agent explores; the larger p is, the smaller the probability that the Agent explores.
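The exact heuristic term and weight-factor update are given by formulas in the original document that are not reproduced in this text, so the sketch below only illustrates the general shape of the rule described above: with probability p the Agent exploits, choosing the action that maximizes the Q value plus a heuristic bonus derived from the weight factors, and otherwise it selects a random action. The bonus form `U * f / sum(f)` is an assumption based on the surrounding description, not the patent's exact formula.

```python
import random

def select_action(state, actions, Q, f, U=1.0, p=0.8):
    """Heuristic epsilon-greedy-style selection guided by weight factors.

    Q[state][a] is the learned value; f[state][a] is the weight factor.
    With probability p, exploit Q plus a heuristic bonus proportional to the
    action's share of the total weight factor (scaled by U); otherwise explore.
    The bonus form is an illustrative assumption, not the patent's formula.
    """
    if random.random() >= p:                       # explore with probability 1 - p
        return random.choice(actions)
    total_f = sum(f[state][a] for a in actions) or 1.0
    def score(a):
        return Q[state][a] + U * f[state][a] / total_f
    return max(actions, key=score)                 # exploit with heuristic guidance
```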
The multi-objective cloud resource scheduling method based on the improved Q-learning algorithm includes:
Step 1: set the parameters of the simulation platform and the algorithm parameters;
Step 2: randomly generate tasks and virtual machines of a given scale with the Cloudsim simulation platform;
Step 3: initialize the Q table and the G table, and initialize the state space S;
Step 4: iteratively execute steps 4-1 to 4-5;
Step 4-1: set s as the current state;
Step 4-2: select an action from the action set A using the ε-greedy-based heuristic action selection strategy that automatically updates the weight factors;
Step 4-3: execute the chosen action and record the immediate reward of executing that action in the current state; update the Q value according to Q_t = (1 - α)*Q_t + α*(r + γ*Q_{t+1}) while also updating the weight factor, and then update the G value according to its update formula; the state is transferred from s to the next state s′;
Step 4-4: compute error = max(error, |Q_t - Q_{previous-t}|), where Q_{previous-t} denotes the Q value at the moment before t;
Step 4-5: if error < θ, terminate the learning process of the Agent; otherwise return to step 4-1, where θ is a fixed value set according to demand.
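Putting the pieces together, the following compact sketch mirrors the outer loop of steps 4-1 to 4-5, reusing the illustrative helpers sketched earlier (`select_action`, `update_q`); the environment object `env` with `reset` and `step` methods, and the omitted weight-factor and G-table updates, are assumptions standing in for the patent's own formulas.

```python
def train(env, actions, Q, f, theta=1e-3, p=0.8):
    """Sketch of steps 4-1 to 4-5: learn until the Q-value change falls below theta."""
    state = env.reset()                                      # step 4-1: current state s
    while True:
        action = select_action(state, actions, Q, f, p=p)    # step 4-2
        next_state, reward = env.step(state, action)         # step 4-3: execute action, observe r
        q_old = Q[state][action]
        update_q(Q, state, action, reward, next_state)       # Q-value update
        # The weight-factor table f and the G table would also be updated here,
        # following the patent's formulas (not reproduced in this text).
        error = abs(Q[state][action] - q_old)                # step 4-4: change in Q
        state = next_state
        if error < theta:                                    # step 4-5: converged
            return Q
```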
In the present invention the execution time of a task is defined as ect_ij = size_i / mip_j, where size_i denotes the size of the i-th task and mip_j denotes the processing speed of the j-th virtual machine.
The total run time of the j-th virtual machine is defined as T_j = Σ_i ect_ij, summed over the tasks assigned to that virtual machine.
The total execution time of a complete scheduling scheme P_i is Time(P_i) = max_j T_j, i.e., the run time of the most heavily loaded virtual machine.
The total operating cost consumed by executing the tasks is defined as Cost(P_i) = Σ_j cst_j * T_j, where cst_j denotes the resource cost consumed per unit time by the j-th virtual machine when executing tasks.
According to the above definitions, the optimization objective of the invention can be defined as min[Time(P_i), Cost(P_i)], meaning that the goal of the invention is to minimize both the total execution time of the tasks and the operating cost.
To evaluate the multi-objective scheduling more clearly, the evaluation function of a schedule P_i is defined as est(P_i) = ω*log Time(P_i) + (1 - ω)*log Cost(P_i), where ω ∈ [0, 1] denotes the user's relative concern for execution time versus operating cost, and adjusting ω meets different user demands on execution time and operating cost; Time(P_i) denotes the total execution time of scheduling scheme P_i; Cost(P_i) denotes the total operating cost consumed by executing the tasks under scheduling scheme P_i. The quality of a scheduling strategy is judged by the value of the evaluation function: given the optimization objective of the invention, the smaller the evaluation function, the better the scheduling strategy.
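As a worked illustration of these definitions, the sketch below computes Time(P), Cost(P), and the evaluation function est(P) for a schedule encoded as in the state definition above; it assumes, as stated above, that the total execution time of a schedule is the run time of the busiest virtual machine, and the function name `schedule_metrics` is illustrative.

```python
import math
from typing import List

def schedule_metrics(schedule: List[int], sizes: List[float],
                     mips: List[float], unit_costs: List[float],
                     omega: float = 0.5):
    """Compute Time(P), Cost(P) and est(P) for a task -> VM assignment.

    schedule[i]   = index of the VM that task i runs on
    sizes[i]      = size of task i; mips[j] = processing speed of VM j
    unit_costs[j] = cost per unit time of VM j (cst_j)
    """
    n_vms = len(mips)
    vm_time = [0.0] * n_vms
    for i, j in enumerate(schedule):
        vm_time[j] += sizes[i] / mips[j]               # ect_ij = size_i / mip_j
    total_time = max(vm_time)                          # Time(P): run time of the busiest VM
    total_cost = sum(c * t for c, t in zip(unit_costs, vm_time))   # Cost(P)
    est = omega * math.log(total_time) + (1 - omega) * math.log(total_cost)
    return total_time, total_cost, est

# e.g. schedule_metrics([1, 0, 0, 2, 1], sizes, mips, unit_costs, omega=0.5)
```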
Compared with the prior art, the invention has the following beneficial effects:
1. The multi-objective cloud resource scheduling model proposed by the present invention considers operator interests and user demand together, reducing operating cost while also reducing task completion time.
2. When solving the multi-objective cloud resource scheduling problem, the improved Q-learning algorithm proposed by the present invention uses a heuristic action selection strategy based on automatically updated weight factors, improving optimization ability, convergence, and load balancing, and thereby effectively improving the overall performance of cloud resource scheduling.
Brief description of the drawings
Fig. 1 is the flow chart of the invention;
Fig. 2 compares different job scheduling methods in terms of algorithm optimization ability in the embodiment of the present invention;
Fig. 3 compares different job scheduling methods in terms of algorithm convergence speed in the embodiment of the present invention;
Fig. 4 compares different job scheduling methods in terms of algorithm load balancing in the embodiment of the present invention.
Specific embodiment
To make the technical solution in the embodiment of the present invention clear and fully described, the present invention is further described in detail below with reference to the embodiment and the accompanying drawings.
Embodiment:
As shown in Fig. 1, the multi-objective cloud resource scheduling method based on improved Q-learning learns the optimal policy through the interaction between the Agent and the environment, and the Agent's learning process terminates when the condition error < θ is met. The method specifically includes:
Step 1: set the parameters of the simulation platform and the algorithm parameters;
Step 2: randomly generate tasks and virtual machines of a given scale with the Cloudsim simulation platform;
Step 3: initialize the Q table and the G table, and initialize the state space S;
Step 4: iteratively execute steps 4-1 to 4-5;
Step 4-1: set s as the current state;
Step 4-2: select an action from the action set A using the ε-greedy-based heuristic action selection strategy that automatically updates the weight factors;
Step 4-3: execute the chosen action and record the immediate reward of executing that action in the current state; update the Q value according to Q_t = (1 - α)*Q_t + α*(r + γ*Q_{t+1}) while also updating the weight factor, and then update the G value according to its update formula; the state is transferred from s to the next state s′;
Step 4-4: compute error = max(error, |Q_t - Q_{previous-t}|), where Q_{previous-t} denotes the Q value at the moment before t;
Step 4-5: if error < θ, terminate the learning process of the Agent; otherwise return to step 4-1, where θ is a fixed value set according to demand.
In step 1 of the present embodiment, the Cloudsim simulation platform and the algorithm parameters are configured. The Cloudsim simulation platform settings are listed in Table 1, and the settings of the improved Q-learning algorithm are listed in Table 2, where α denotes the learning rate, γ denotes the discount factor, ε balances exploitation and exploration in the ε-greedy algorithm, ω denotes the user's degree of concern for execution time versus operating cost, and U denotes how strongly the weight factors influence action selection.
Table 1: Experiment parameter settings
Table 2: Algorithm parameter settings
In step 2 of the present embodiment, the data set is generated at random with the Cloudsim simulation platform: task sizes are drawn from the interval [60000, 120000] and virtual machine processing speeds from [400, 1200]. The task scale starts at 10 and increases in steps of 5 up to a maximum of 30; the number of virtual machines is set to 5.
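For illustration, a minimal Python stand-in for this random workload generation is given below; the patent uses the Cloudsim platform (which is Java-based), so this sketch only reproduces the ranges described above, and the function name `generate_workload` is hypothetical.

```python
import random

def generate_workload(n_tasks, n_vms=5, seed=None):
    """Draw task sizes and VM processing speeds uniformly from the embodiment's ranges."""
    rng = random.Random(seed)
    sizes = [rng.uniform(60000, 120000) for _ in range(n_tasks)]   # task sizes
    mips = [rng.uniform(400, 1200) for _ in range(n_vms)]          # VM processing speeds
    return sizes, mips

# Task scales in the embodiment run from 10 to 30 in steps of 5, with 5 virtual machines:
for n_tasks in range(10, 31, 5):
    sizes, mips = generate_workload(n_tasks)
```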
In the present embodiment the execution time of a task is defined as ect_ij = size_i / mip_j, where size_i denotes the size of the i-th task and mip_j denotes the processing speed of the j-th virtual machine.
The total run time of the j-th virtual machine is defined as T_j = Σ_i ect_ij, summed over the tasks assigned to that virtual machine.
The total execution time of a complete scheduling scheme P_i is Time(P_i) = max_j T_j, i.e., the run time of the most heavily loaded virtual machine.
The total operating cost consumed by executing the tasks is defined as Cost(P_i) = Σ_j cst_j * T_j, where cst_j denotes the resource cost consumed per unit time by the j-th virtual machine when executing tasks.
According to the above definitions, the optimization objective of the present embodiment can be defined as min[Time(P_i), Cost(P_i)], meaning that the goal of the present embodiment is to minimize both the total execution time of the tasks and the operating cost.
To evaluate the multi-objective scheduling more clearly, the evaluation function of a schedule P_i is defined as est(P_i) = ω*log Time(P_i) + (1 - ω)*log Cost(P_i), where ω ∈ [0, 1] denotes the user's relative concern for execution time versus operating cost, and adjusting ω meets different user demands on execution time and operating cost; Time(P_i) denotes the total execution time of scheduling scheme P_i; Cost(P_i) denotes the total operating cost consumed by executing the tasks under scheduling scheme P_i. The quality of a scheduling strategy is judged by the value of the evaluation function: given the optimization objective of the present embodiment, the smaller the evaluation function, the better the scheduling strategy.
Fig. 2 compares the improved Q-learning algorithm of the present invention with other scheduling algorithms in terms of optimization ability on the multi-objective cloud resource scheduling problem. The following four algorithms are compared:
1. The scheduling scheme that assigns tasks to the virtual machines in order, i.e., the first task is assigned to the first virtual machine, the second task to the second virtual machine, and so on, denoted Equ.
2. The genetic algorithm (GA).
3. The Q-learning algorithm (QL).
4. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL).
In Fig. 2 the abscissa indicates the task scale and the ordinate the evaluation function value; the smaller the evaluation function value, the stronger the optimization ability of the algorithm.
As Fig. 2 shows, the scheduling scheme obtained by the WHAQL algorithm used in the present invention minimizes the evaluation function, demonstrating strong optimization ability.
Fig. 3 compares the improved Q-learning algorithm of the present invention with other scheduling algorithms in terms of convergence speed on the multi-objective cloud resource scheduling problem. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL) is compared with the Q-learning algorithm (QL) and the heuristic Q-learning algorithm (HAQL).
Fig. 3 shows the iteration processes of the three algorithms when the task scale is 20 and ω = 0.5. The total number of iterations is set to 5000; every 500 iterations form one learning stage, after which the result is recorded once, giving 10 learning stages in total. The abscissa indicates the learning stage and the ordinate the evaluation function value.
As Fig. 3 shows, the WHAQL algorithm used in the present invention converges faster than the other two algorithms.
Fig. 4 compares the improved Q-learning algorithm of the present invention with other scheduling algorithms in terms of load balancing on the multi-objective cloud resource scheduling problem. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL) is compared with the in-order scheduling method (Equ), the genetic algorithm (GA), and the Q-learning algorithm (QL).
The abscissa indicates the task scale and the ordinate the load balancing value; the closer the load balancing value is to 1, the more balanced the system load.
The load balancing function of the system is defined as the ratio of the shortest virtual machine execution time to the longest, i.e., LB = min_j T_j / max_j T_j.
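A one-line sketch of this load balancing value, reusing the per-virtual-machine run times T_j computed in the metrics sketch above (illustrative only):

```python
def load_balance(vm_times):
    """LB = shortest VM run time / longest VM run time; values closer to 1 mean a more balanced load."""
    return min(vm_times) / max(vm_times)
```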
As Fig. 4 shows, the WHAQL algorithm used in the present invention achieves a better load balance than the other algorithms, demonstrating that WHAQL not only uses resources with higher utilization but also effectively lightens the workload of the virtual machines.
This embodiment demonstrates that the multi-objective cloud resource scheduling method based on the improved Q-learning algorithm of the present invention performs well in all three respects: optimization ability, convergence speed, and load balancing.
The above describes the embodiment of the present invention in detail with reference to the accompanying drawings; the embodiment herein is only intended to help understand the method of the invention. Those skilled in the art may, following the idea of the present invention, make changes and modifications to the specific embodiment and the scope of application, so this specification should not be construed as limiting the invention.
Claims (2)
1. A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm, characterized in that an Agent interacts with the environment and selects the action with the largest return value to execute; in the action selection phase, the method combines weight factors with a heuristic function, using the immediate reward after each training step of the Agent to automatically update the weight factors of the executed actions, thereby determining the action selection strategy and improving the convergence speed of the algorithm. The detailed process is as follows:
Step 1: generate task data and virtual machine data at random using the Cloudsim simulation platform;
Step 2: define the state space S of Q-learning: the state space is represented by a dynamic array in which each state s is a one-dimensional array, the index of s denotes the task number, and the value of s denotes the virtual machine number;
Step 3: define the action set A of Q-learning: an action is defined as an integer variable; when the action of assigning the i-th task to the j-th virtual machine is executed, the integer j is written into the i-th element of the state array s;
Step 4: define the immediate reward function of the Q-learning algorithm: r = ω*(Etc - T_i) + (1 - ω)*(Cst - C_i), where T_i and C_i respectively denote the total execution time of the tasks already assigned to the i-th virtual machine in the current state and the total cost of executing those tasks; Etc and Cst are large constants, Etc being set to the total execution time of all tasks on all virtual machines and Cst to the total cost of all tasks on all virtual machines;
Step 5: schedule and allocate the generated task data and virtual machine data using the Q-learning algorithm based on automatically updated weight factors.
2. The multi-objective cloud resource scheduling method based on the improved Q-learning algorithm according to claim 1, characterized in that in said step 5 the generated task data and virtual machine data are scheduled and allocated using the Q-learning algorithm based on automatically updated weight factors, with the following specific steps:
Step 5-1: initialize the Q table and the G table, where the Q table stores the value of each action in each state and the G table stores the information related to the weight factors;
Step 5-2: initialize the state space S;
Step 5-3: iteratively execute steps 5-3-1 to 5-3-6;
Step 5-3-1: set s as the current state;
Step 5-3-2: select an action from the action set A using the ε-greedy-based heuristic action selection strategy that automatically updates the weight factors;
Step 5-3-3: execute the chosen action and record the immediate reward of executing that action in the current state and the next state s′;
Step 5-3-4: update the Q value according to Q_t = (1 - α)*Q_t + α*(r + γ*Q_{t+1}), where α ∈ (0, 1) denotes the learning rate, γ denotes the discount factor, and Q_t denotes the Q value at time t; update the weight factor f(s_t, a_t), and then update the G value according to its update formula;
Step 5-3-5: compute error = max(error, |Q_t - Q_{previous-t}|), where Q_{previous-t} denotes the Q value at the moment before t;
Step 5-3-6: if error < θ, terminate the learning process of the Agent; otherwise return to step 5-3-1 (where θ is a fixed value set according to demand).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910807351.6A CN110515735A (en) | 2019-08-29 | 2019-08-29 | Multi-objective cloud resource scheduling method based on an improved Q-learning algorithm
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910807351.6A CN110515735A (en) | 2019-08-29 | 2019-08-29 | Multi-objective cloud resource scheduling method based on an improved Q-learning algorithm
Publications (1)
Publication Number | Publication Date |
---|---|
CN110515735A (en) | 2019-11-29
Family
ID=68627856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910807351.6A Pending CN110515735A (en) | 2019-08-29 | 2019-08-29 | Multi-objective cloud resource scheduling method based on an improved Q-learning algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110515735A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102868972A (en) * | 2012-09-05 | 2013-01-09 | 河海大学常州校区 | Internet of things (IoT) error sensor node location method based on improved Q learning algorithm |
CN103064817A (en) * | 2012-12-21 | 2013-04-24 | 桂林电子科技大学 | Simplified two-line serial data bus transport method |
CN105930214A (en) * | 2016-04-22 | 2016-09-07 | 广东石油化工学院 | Q-learning-based hybrid cloud job scheduling method |
CN108139930A (en) * | 2016-05-24 | 2018-06-08 | 华为技术有限公司 | Resource regulating method and device based on Q study |
CN108508745A (en) * | 2018-01-22 | 2018-09-07 | 中国铁道科学研究院通信信号研究所 | A kind of multiple target cycle tests collection optimization generation method |
Non-Patent Citations (6)
Title |
---|
Luiz A. Celiberto Jr.: "Using Transfer Learning to Speed-Up Reinforcement Learning: A Case-Based Approach", 2010 Latin American Robotics Symposium and Intelligent Robotics Meeting |
Reinaldo A. C. Bianchi: "Heuristically-Accelerated Multiagent Reinforcement Learning", IEEE Transactions on Cybernetics, Vol. 44, Issue 2, February 2014 |
Wu Haolin et al.: "Heuristic Q-learning guided by online updated information intensity", Application Research of Computers |
Zhang Wenxu: "Research on reinforcement learning based on consistency and event-driven approaches", China Master's Theses Full-text Database, Information Science and Technology |
Li Chengyan: "Cloud computing resource scheduling with uncertain execution time", Journal of Harbin University of Science and Technology |
Wang Hongyan: "A new heuristic Q-learning algorithm", Computer Engineering |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191934A (en) * | 2019-12-31 | 2020-05-22 | 北京理工大学 | Multi-target cloud workflow scheduling method based on reinforcement learning strategy |
CN111191934B (en) * | 2019-12-31 | 2022-04-15 | 北京理工大学 | Multi-target cloud workflow scheduling method based on reinforcement learning strategy |
CN111427688A (en) * | 2020-03-23 | 2020-07-17 | 武汉轻工大学 | Cloud task multi-target scheduling method and device, electronic equipment and storage medium |
CN111427688B (en) * | 2020-03-23 | 2023-08-11 | 武汉轻工大学 | Cloud task multi-target scheduling method and device, electronic equipment and storage medium |
CN111637444B (en) * | 2020-06-05 | 2021-10-22 | 沈阳航空航天大学 | Nuclear power steam generator water level control method based on Q learning |
CN111637444A (en) * | 2020-06-05 | 2020-09-08 | 沈阳航空航天大学 | Nuclear power steam generator water level control method based on Q learning |
CN112543038A (en) * | 2020-11-02 | 2021-03-23 | 杭州电子科技大学 | Intelligent anti-interference decision method of frequency hopping system based on HAQL-PSO |
CN112256422A (en) * | 2020-11-17 | 2021-01-22 | 中国人民解放军战略支援部队信息工程大学 | Heterogeneous platform task scheduling method and system based on Q learning |
CN112327786A (en) * | 2020-11-19 | 2021-02-05 | 哈尔滨理工大学 | Comprehensive scheduling method for dynamically adjusting non-occupied time period of equipment |
CN113163447A (en) * | 2021-03-12 | 2021-07-23 | 中南大学 | Communication network task resource scheduling method based on Q learning |
CN113190081A (en) * | 2021-04-26 | 2021-07-30 | 中国科学院近代物理研究所 | Method and device for adjusting time synchronism of power supply |
CN113190081B (en) * | 2021-04-26 | 2022-12-13 | 中国科学院近代物理研究所 | Method and device for adjusting time synchronism of power supply |
CN113222253A (en) * | 2021-05-13 | 2021-08-06 | 珠海埃克斯智能科技有限公司 | Scheduling optimization method, device and equipment and computer readable storage medium |
CN113222253B (en) * | 2021-05-13 | 2022-09-30 | 珠海埃克斯智能科技有限公司 | Scheduling optimization method, device, equipment and computer readable storage medium |
CN113326135A (en) * | 2021-06-21 | 2021-08-31 | 桂林航天工业学院 | Cloud resource scheduling method under multiple targets |
CN113326135B (en) * | 2021-06-21 | 2023-08-22 | 桂林航天工业学院 | Cloud resource scheduling method under multiple targets |
CN113535365A (en) * | 2021-07-30 | 2021-10-22 | 中科计算技术西部研究院 | Deep learning training operation resource placement system and method based on reinforcement learning |
CN116932164A (en) * | 2023-07-25 | 2023-10-24 | 和光舒卷(广东)数字科技有限公司 | Multi-task scheduling method and system based on cloud platform |
CN116932164B (en) * | 2023-07-25 | 2024-03-29 | 和光舒卷(广东)数字科技有限公司 | Multi-task scheduling method and system based on cloud platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110515735A (en) | Multi-objective cloud resource scheduling method based on an improved Q-learning algorithm | |
Wei | Task scheduling optimization strategy using improved ant colony optimization algorithm in cloud computing | |
Torabi et al. | A dynamic task scheduling framework based on chicken swarm and improved raven roosting optimization methods in cloud computing | |
Sun et al. | PACO: A period ACO based scheduling algorithm in cloud computing | |
CN109656702A (en) | A kind of across data center network method for scheduling task based on intensified learning | |
CN108182115A (en) | A kind of virtual machine load-balancing method under cloud environment | |
CN110109753A (en) | Resource regulating method and system based on various dimensions constraint genetic algorithm | |
CN108694090A (en) | A kind of cloud computing resource scheduling method of Based on Distributed machine learning | |
CN103699446A (en) | Quantum-behaved particle swarm optimization (QPSO) algorithm based multi-objective dynamic workflow scheduling method | |
CN110351348B (en) | Cloud computing resource scheduling optimization method based on DQN | |
CN111722910A (en) | Cloud job scheduling and resource allocation method | |
CN109067834A (en) | Discrete particle cluster dispatching algorithm based on oscillatory type inertia weight | |
Petropoulos et al. | A particle swarm optimization algorithm for balancing assembly lines | |
CN108170530A (en) | A kind of Hadoop Load Balancing Task Scheduling methods based on mixing meta-heuristic algorithm | |
Thaman et al. | Current perspective in task scheduling techniques in cloud computing: a review | |
Manikandan et al. | LGSA: Hybrid task scheduling in multi objective functionality in cloud computing environment | |
Yu et al. | Fluid: Resource-aware hyperparameter tuning engine | |
Mojab et al. | iCATS: Scheduling big data workflows in the cloud using cultural algorithms | |
Zhou et al. | Deep reinforcement learning-based algorithms selectors for the resource scheduling in hierarchical cloud computing | |
Balla et al. | Reliability-aware: task scheduling in cloud computing using multi-agent reinforcement learning algorithm and neural fitted Q. | |
Han et al. | A DEA based hybrid algorithm for bi-objective task scheduling in cloud computing | |
Hu et al. | A two-stage multi-objective task scheduling framework based on invasive tumor growth optimization algorithm for cloud computing | |
CN108958919A (en) | More DAG task schedule expense fairness assessment models of limited constraint in a kind of cloud computing | |
Sharma et al. | Multi-Faceted Job Scheduling Optimization Using Q-learning With ABC In Cloud Environment | |
Yamazaki et al. | Implementation and evaluation of the JobTracker initiative task scheduling on Hadoop |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20191129 |