CN110515735A - A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm - Google Patents

A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm

Info

Publication number
CN110515735A
Authority
CN
China
Prior art keywords
value
learning algorithm
task
weight factor
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910807351.6A
Other languages
Chinese (zh)
Inventor
Li Chengyan
Sun Wei
Song Yue
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN201910807351.6A
Publication of CN110515735A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a multi-objective cloud resource scheduling method based on an improved Q-learning algorithm. In this method, an Agent interacts continuously with the environment and learns the optimal policy. Using the CloudSim cloud computing simulation platform, tasks and virtual machines of varying sizes are generated at random, and the completion time and operating cost of the tasks are optimized simultaneously as the two objectives. A heuristic action selection strategy with automatically updated weight factors accelerates the convergence of the Q-learning algorithm while improving its optimization ability, thereby raising the utilization of cloud resources, improving user satisfaction, and reducing operator cost.

Description

A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm
Technical field
The present invention relates to the field of cloud resource scheduling, and in particular to a multi-objective cloud resource scheduling method based on an improved Q-learning algorithm.
Background art
Cloud resource scheduling is the process by which different resource users adjust resources on a cloud service platform according to resource usage rules. A well-designed resource scheduling optimization algorithm is essential to the overall performance of a cloud computing system. QoS constraints in scheduling include operating cost, completion time, security, and availability. In practice, cost and completion time are the key factors affecting operator and user satisfaction respectively, so a scheduling algorithm must treat reducing execution time and reducing operating cost as joint optimization objectives. The present invention therefore adopts a multi-objective cloud resource scheduling model that optimizes execution time and operating cost simultaneously.
Reinforcement learning, an unsupervised, model-free intelligent search method with learning ability, performs well on cloud resource scheduling problems, so this invention applies a reinforcement learning algorithm to the problem. Among such algorithms, Q-learning solves cloud resource scheduling with relatively stable performance, but it suffers from a large state space and slow convergence. To speed up convergence, the present invention combines a weight factor with a heuristic function: after each training step, the weight factors of the executed actions are updated automatically according to the Agent's immediate reward, which determines the action selection strategy and improves the convergence speed of the algorithm.
Summary of the invention
To solve the cloud resource scheduling problem, the invention discloses a scheduling method that reduces task execution time and system operating cost while taking convergence speed, operational efficiency, and optimization ability into account.
To this end, the present invention provides the following technical scheme:
1. A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm, characterized in that the algorithm learns through the interaction between an Agent and the environment: the Agent selects an action by the action selection strategy, updates the Q table, and updates the state, iterating these steps until the Q table converges and the Agent obtains the optimal policy. The method specifically includes:
Define the state space: the state space consists of different states s and is represented by a dynamic array, where each state s is represented by a one-dimensional array; the index of s denotes the task number, and the value of s denotes the virtual machine number. For example, if 5 tasks are assigned to 3 virtual machines, s is an integer array of 5 elements, and the value of each element indicates which virtual machine the corresponding task executes on.
Define the action space: an action is defined as an integer variable; when the action of assigning the i-th task to the j-th virtual machine is executed, the integer j is assigned to the i-th element of the state array s. For example, the one-dimensional array [1, 0, 0, 2, 1] means that task 0 is assigned to virtual machine 1, task 1 is assigned to virtual machine 0, and so on.
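As a minimal sketch (not part of the original filing), the state and action encoding described above can be written in Python; all names here are illustrative:

```python
# Sketch of the state/action encoding described above (illustrative).
num_tasks, num_vms = 5, 3

# State: one int per task; state[i] = index of the VM that task i runs on.
# -1 marks a task that has not been assigned yet.
state = [-1] * num_tasks

def assign(state, task_i, vm_j):
    """Action (i, j): assign task i to VM j by writing j into slot i."""
    state[task_i] = vm_j
    return state

assign(state, 0, 1)  # task 0 -> VM 1
assign(state, 3, 2)  # task 3 -> VM 2
print(state)         # [1, -1, -1, 2, -1]
```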
Define the immediate reward: r = ω·(Etc − T_i) + (1 − ω)·(Cst − C_i), where T_i and C_i respectively denote the total execution time and the total cost of the tasks already allocated to the i-th virtual machine in the current state. Etc and Cst are both large constants: here Etc is set to the total execution time of all tasks on all virtual machines, and Cst to the total cost of all tasks on all virtual machines.
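A hedged sketch of this reward in Python, assuming per-VM running totals are tracked separately (variable names are illustrative):

```python
def immediate_reward(vm_time_i, vm_cost_i, etc, cst, omega):
    """r = w*(Etc - T_i) + (1 - w)*(Cst - C_i).

    vm_time_i / vm_cost_i: total execution time / cost of the tasks
    already allocated to the chosen VM i; etc and cst are the large
    constants defined above (totals over all tasks on all VMs).
    """
    return omega * (etc - vm_time_i) + (1.0 - omega) * (cst - vm_cost_i)
```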
Define the Q-value update formula as: Q_t = (1 − α)·Q_t + α·(r + γ·Q_{t+1}), where α ∈ (0, 1) denotes the learning rate, γ denotes the discount factor, and Q_t denotes the Q value at time t.
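The update rule as a one-line helper (a sketch; the patent restates the same formula in step 4-3 below):

```python
def q_update(q, alpha, gamma, r, q_next):
    """Q_t <- (1 - alpha) * Q_t + alpha * (r + gamma * Q_{t+1})."""
    return (1.0 - alpha) * q + alpha * (r + gamma * q_next)
```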
Define the update formula of the weight factor in terms of the following quantities: s_i and a_i respectively denote the state and the action whose weight factor needs to be updated; f(s_i, a_i) denotes the weight factor of executing action a_i in state s_i; r_max denotes the maximal reward value in state s_i; a_t denotes the action selected by the Agent in state s_i in the current period; r_t denotes the reward fed back by executing action a_t in state s_i in the current period.
Define the update rule of the heuristic function in terms of the following quantities: π_f(s_t) denotes the optimal action selected under the guidance of the weight-factor function f when the state is s_t; the ratio of the maximal weight factor to the total weight factor expresses the importance of the action through the size of this value; the value U expresses how strongly the weight factors influence action selection: the larger U is, the stronger the guidance of the weight factors on action selection.
Define the heuristic action selection strategy with automatically updated weight factors in terms of the following quantities: a_random denotes a randomly selected action; p, q ∈ [0, 1], where the value of p determines the Agent's exploration probability: the larger p is, the smaller the probability that the Agent explores.
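The formula images for the weight-factor update, the heuristic function, and the selection rule are not reproduced in this text, so the following is only a plausible HAQL-style sketch under the assumption that the heuristic bonus is derived from the weight factors stored in the G table; all names are illustrative:

```python
import random

def select_action(q_table, g_table, state, actions, p, u):
    """Heuristic epsilon-greedy selection (a sketch, assuming a
    HAQL-style rule: exploit Q plus a heuristic bonus derived from
    the weight factors; explore uniformly otherwise).

    p: exploitation threshold (larger p -> less exploration).
    u: strength of the weight-factor guidance (the U above).
    """
    q = random.random()  # the random draw q in [0, 1] above
    if q <= p:  # exploit: Q value plus weight-factor-guided bonus
        total_w = sum(g_table[(state, a)] for a in actions) or 1.0
        return max(
            actions,
            key=lambda a: q_table[(state, a)]
                          + u * g_table[(state, a)] / total_w,
        )
    return random.choice(actions)  # explore: a_random
```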
The multi-objective cloud resource scheduling method based on the improved Q-learning algorithm includes:
Step 1: set the parameters of the simulation platform and the algorithm parameters;
Step 2: randomly generate tasks and virtual machines of a given scale through the CloudSim simulation platform;
Step 3: initialize the Q table and the G table, and initialize the state space S;
Step 4: iteratively execute steps 4-1 to 4-5;
Step 4-1: set s as the current state;
Step 4-2: select an action from the action set A using the heuristic action selection strategy with automatically updated weight factors based on the ε-greedy algorithm;
Step 4-3: execute the chosen action and record the immediate reward of executing that action in that state; update the Q value according to the formula Q_t = (1 − α)·Q_t + α·(r + γ·Q_{t+1}), update the weight factor at the same time, and then update the G value according to the weight-factor formula; the state transfers from s to the next state s';
Step 4-4: compute error = max(error, |Q_t − Q_{t−1}|), where Q_{t−1} denotes the Q value at the moment before t;
Step 4-5: if error < θ, terminate the learning process of the Agent; otherwise return to step 4-1. Here θ is a fixed value set according to demand.
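Putting steps 3 through 4-5 together, the learning loop might be sketched as follows; the environment interface and the weight-factor update are stated assumptions, not the patent's definitions:

```python
from collections import defaultdict

def train(env, actions, alpha, gamma, p, u, theta, max_iters=5000):
    """Sketch of the improved Q-learning loop (steps 3 to 4-5).

    Assumptions (not from the patent text): env exposes
    reset() -> state and step(state, action) -> (reward, next_state),
    standing in for the CloudSim-generated tasks and VMs; states are
    hashable (e.g. tuples); select_action is the illustrative helper
    sketched above.
    """
    q_table = defaultdict(float)   # Q table: (state, action) -> value
    g_table = defaultdict(float)   # G table: weight-factor bookkeeping
    state = env.reset()            # step 3: initialize the state space

    for _ in range(max_iters):                                      # step 4
        a = select_action(q_table, g_table, state, actions, p, u)   # step 4-2
        r, next_state = env.step(state, a)                          # step 4-3
        q_next = max(q_table[(next_state, b)] for b in actions)
        old = q_table[(state, a)]
        q_table[(state, a)] = (1 - alpha) * old + alpha * (r + gamma * q_next)
        g_table[(state, a)] += r   # placeholder weight-factor update (the
                                   # formula image is not reproduced above)
        error = abs(q_table[(state, a)] - old)                      # step 4-4
        state = next_state
        if error < theta:                                           # step 4-5
            break
    return q_table
```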
In the present invention, the execution time of a task is defined as ect_ij = size_i / mip_j, where size_i denotes the size of the i-th task and mip_j denotes the processing speed of the j-th virtual machine.
The total running time of the j-th virtual machine is defined as the sum of the execution times of the tasks assigned to it: T_j = Σ_i ect_ij, taken over the tasks i allocated to virtual machine j.
The total execution time of a complete scheduling scheme P_i is the longest running time over all virtual machines: Time(P_i) = max_j T_j.
The total operating cost consumed by executing the tasks is defined as Cost(P_i) = Σ_j cst_j · T_j, where cst_j denotes the resource cost consumed per unit time by the j-th virtual machine while executing tasks.
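Under the definitions above (reconstructed here, since the original formulas appear as images in the filing), the two schedule metrics can be computed as in the following sketch; all names are illustrative:

```python
def schedule_metrics(schedule, sizes, mips, costs):
    """Makespan and total cost of a schedule (a sketch of the
    reconstructed definitions above).

    schedule[i] = j  : task i runs on VM j
    sizes[i]         : size of task i
    mips[j]          : processing speed of VM j
    costs[j]         : cost per unit time of VM j (cst_j)
    """
    vm_time = [0.0] * len(mips)
    for i, j in enumerate(schedule):
        vm_time[j] += sizes[i] / mips[j]             # ect_ij = size_i / mip_j
    time_p = max(vm_time)                            # Time(P) = max_j T_j
    cost_p = sum(c * t for c, t in zip(costs, vm_time))  # Cost(P)
    return time_p, cost_p
```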
According to the definitions above, the optimization objective of the invention can be defined as min[Time(P_i), Cost(P_i)], meaning that the goal of the invention is to minimize both the total execution time and the operating cost of the tasks.
To evaluate multi-objective scheduling more clearly, the evaluation function of a schedule P_i is defined as est(P_i) = ω·log Time(P_i) + (1 − ω)·log Cost(P_i), where ω ∈ [0, 1] expresses the user's relative concern for execution time versus operating cost, and adjusting ω meets different user demands for execution time and operating cost; Time(P_i) denotes the total execution time of scheduling scheme P_i, and Cost(P_i) denotes the total operating cost consumed by executing the tasks under P_i. The quality of a scheduling strategy is judged by the value of the evaluation function: by the optimization objective above, the smaller the evaluation function, the better the scheduling strategy.
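The evaluation function in code form, reusing the `schedule_metrics` sketch above (smaller values indicate better schedules):

```python
import math

def evaluate(schedule, sizes, mips, costs, omega):
    """est(P) = w * log Time(P) + (1 - w) * log Cost(P)."""
    time_p, cost_p = schedule_metrics(schedule, sizes, mips, costs)
    return omega * math.log(time_p) + (1.0 - omega) * math.log(cost_p)
```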
Compared with the prior art, the invention has the following beneficial effects:
1. The multi-objective cloud resource scheduling model proposed by the present invention considers operator interests and user demands jointly, reducing operating cost while also reducing task completion time.
2. In solving the multi-objective cloud resource scheduling problem, the improved Q-learning algorithm proposed by the present invention adopts a heuristic action selection strategy based on automatically updated weight factors, improving optimization ability, convergence, and load-balancing capability, and thereby effectively improving the overall performance of cloud resource scheduling.
Detailed description of the invention
Fig. 1 is the flow chart of the present invention;
Fig. 2 is a schematic comparison of different job scheduling methods in terms of optimization ability in the embodiment of the present invention;
Fig. 3 is a schematic comparison of different job scheduling methods in terms of convergence speed in the embodiment of the present invention;
Fig. 4 is a schematic comparison of different job scheduling methods in terms of load balancing in the embodiment of the present invention.
Specific embodiment
In order to describe the technical solutions in the embodiments of the present invention clearly and completely, the present invention is further described in detail below with reference to the drawings in the embodiments.
Embodiment:
As shown in Fig. 1, a multi-objective cloud resource scheduling method based on improved Q-learning learns the optimal policy through the interaction between an Agent and the environment; when the policy satisfies the condition error < θ, the Agent's learning process terminates. It specifically includes:
Step 1: set the parameters of the simulation platform and the algorithm parameters;
Step 2: randomly generate tasks and virtual machines of a given scale through the CloudSim simulation platform;
Step 3: initialize the Q table and the G table, and initialize the state space S;
Step 4: iteratively execute steps 4-1 to 4-5;
Step 4-1: set s as the current state;
Step 4-2: select an action from the action set A using the heuristic action selection strategy with automatically updated weight factors based on the ε-greedy algorithm;
Step 4-3: execute the chosen action and record the immediate reward of executing that action in that state; update the Q value according to the formula Q_t = (1 − α)·Q_t + α·(r + γ·Q_{t+1}), update the weight factor at the same time, and then update the G value according to the weight-factor formula; the state transfers from s to the next state s';
Step 4-4: compute error = max(error, |Q_t − Q_{t−1}|), where Q_{t−1} denotes the Q value at the moment before t;
Step 4-5: if error < θ, terminate the learning process of the Agent; otherwise return to step 4-1. Here θ is a fixed value set according to demand.
In step 1 of the present embodiment, the CloudSim simulation platform and the algorithm parameters are configured. The CloudSim platform settings are listed in Table 1 and the improved Q-learning settings in Table 2, where α denotes the learning rate, γ the discount factor, and ε balances exploitation and exploration in the ε-greedy algorithm; ω denotes the user's concern for execution time versus operating cost; U denotes the strength of the weight factors' influence on action selection.
Table 1: Experiment parameter settings
Table 2: Algorithm parameter settings
In step 2 of the present embodiment, the data set is generated at random using the CloudSim simulation platform: task sizes are defined on the interval [60000, 120000], and the processing speeds of the virtual machines on [400, 1200]. The task scale starts at 10 and increases in steps of 5 up to a maximum of 30; the number of virtual machines is set to 5.
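A sketch of this workload generation (function and parameter names are illustrative):

```python
import random

def generate_dataset(num_tasks, num_vms=5, seed=None):
    """Random workload as described in the embodiment (a sketch).

    Task sizes are drawn from [60000, 120000] and VM processing
    speeds from [400, 1200], matching the ranges in step 2.
    """
    rng = random.Random(seed)
    sizes = [rng.uniform(60000, 120000) for _ in range(num_tasks)]
    mips = [rng.uniform(400, 1200) for _ in range(num_vms)]
    return sizes, mips

# Task scales from 10 to 30 in steps of 5, as in the embodiment.
for n in range(10, 31, 5):
    sizes, mips = generate_dataset(n, seed=42)
```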
In the present embodiment, the execution time of a task is defined as ect_ij = size_i / mip_j, where size_i denotes the size of the i-th task and mip_j denotes the processing speed of the j-th virtual machine.
The total running time of the j-th virtual machine is defined as the sum of the execution times of the tasks assigned to it: T_j = Σ_i ect_ij, taken over the tasks i allocated to virtual machine j.
The total execution time of a complete scheduling scheme P_i is the longest running time over all virtual machines: Time(P_i) = max_j T_j.
The total operating cost consumed by executing the tasks is defined as Cost(P_i) = Σ_j cst_j · T_j, where cst_j denotes the resource cost consumed per unit time by the j-th virtual machine while executing tasks.
According to the definitions above, the optimization objective of the present embodiment can be defined as min[Time(P_i), Cost(P_i)], meaning that the goal of the present embodiment is to minimize both the total execution time and the operating cost of the tasks.
To evaluate multi-objective scheduling more clearly, the evaluation function of a schedule P_i is defined as est(P_i) = ω·log Time(P_i) + (1 − ω)·log Cost(P_i), where ω ∈ [0, 1] expresses the user's relative concern for execution time versus operating cost, and adjusting ω meets different user demands for execution time and operating cost; Time(P_i) denotes the total execution time of scheduling scheme P_i, and Cost(P_i) denotes the total operating cost consumed by executing the tasks under P_i. The quality of a scheduling strategy is judged by the value of the evaluation function: by the optimization objective of the present embodiment, the smaller the evaluation function, the better the scheduling strategy.
Fig. 2 compares the optimization ability of the improved Q-learning algorithm of the present invention with other scheduling algorithms on the multi-objective cloud resource scheduling problem. Four algorithms are compared:
1. The scheduling scheme that assigns tasks to virtual machines in order, i.e., the first task to the first virtual machine, the second task to the second virtual machine, and so on, denoted Equ.
2. The genetic algorithm (GA).
3. The Q-learning algorithm (QL).
4. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL).
In Fig. 2, the abscissa denotes the task scale and the ordinate the evaluation function value; the smaller the evaluation function value, the stronger the algorithm's optimization ability.
As Fig. 2 shows, the scheduling scheme obtained by the WHAQL algorithm used in the present invention minimizes the evaluation function and has strong optimization ability.
Fig. 3 compares the convergence speed of the improved Q-learning algorithm of the present invention with other scheduling algorithms on the multi-objective cloud resource scheduling problem. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL) is compared with the Q-learning algorithm (QL) and the heuristic Q-learning algorithm (HAQL).
Fig. 3 shows the iteration processes of the three algorithms when the task scale is 20 and ω = 0.5. The total number of iterations is set to 5000; every 500 iterations form one learning stage, with one result recorded per stage, for 10 learning stages in total. The abscissa denotes the learning stage and the ordinate the evaluation function value.
As Fig. 3 shows, the WHAQL algorithm used in the present invention converges faster than the other two algorithms.
Fig. 4 compares the load balancing of the improved Q-learning algorithm of the present invention with other scheduling algorithms on the multi-objective cloud resource scheduling problem. The heuristic Q-learning algorithm based on automatically updated weight factors (WHAQL) is compared with the in-order scheduling method (Equ), the genetic algorithm (GA), and the Q-learning algorithm (QL).
The abscissa denotes the task scale and the ordinate the load balancing value; the closer the load balancing value is to 1, the more balanced the system load.
The load balancing function of the system is defined as the ratio of the shortest virtual machine execution time to the longest, computed as LB = min_j T_j / max_j T_j.
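A sketch of this load-balancing measure (the original formula appears as an image in the filing), reusing the per-VM running times defined earlier:

```python
def load_balance(schedule, sizes, mips):
    """LB = min_j T_j / max_j T_j; values closer to 1 are more balanced."""
    vm_time = [0.0] * len(mips)
    for i, j in enumerate(schedule):
        vm_time[j] += sizes[i] / mips[j]
    return min(vm_time) / max(vm_time)
```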
As Fig. 4 shows, the load balancing of the WHAQL algorithm used in the present invention is better than that of the other algorithms, demonstrating that WHAQL not only achieves higher resource utilization but also effectively relieves the workload of the virtual machines.
This embodiment demonstrates that the multi-objective cloud resource scheduling method based on an improved Q-learning algorithm of the present invention performs well in optimization ability, convergence speed, and load balancing.
The above describes the embodiments of the present invention in detail with reference to the drawings; the embodiments herein are intended only to aid understanding of the method of the invention. Those skilled in the art may, following the idea of the present invention, make changes and modifications within the specific embodiments and the scope of application; therefore the contents of this specification should not be construed as limiting the invention.

Claims (2)

1. A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm, characterized in that an Agent interacts with the environment and selects the action with the maximal reward to execute; in the action selection phase, the method combines a weight factor with a heuristic function and, according to the immediate reward after each training step of the Agent, automatically updates the weight factors of the executed actions, thereby determining the action selection strategy and improving the convergence speed of the algorithm. The detailed process is as follows:
Step 1: randomly generate task data and virtual machine data using the CloudSim simulation platform;
Step 2: define the state space S of Q-learning: it is represented by a dynamic array in which each state s is represented by a one-dimensional array; the index of s denotes the task number and the value of s denotes the virtual machine number;
Step 3: define the action set A of Q-learning: an action is defined as an integer variable; when the action of assigning the i-th task to the j-th virtual machine is executed, the integer j is assigned to the i-th element of the state array s;
Step 4: define the immediate reward function of the Q-learning algorithm: r = ω·(Etc − T_i) + (1 − ω)·(Cst − C_i), where T_i and C_i respectively denote the total execution time and the total cost of the tasks already allocated to the i-th virtual machine in the current state; Etc and Cst are large constants: here Etc is set to the total execution time of all tasks on all virtual machines and Cst to the total cost of all tasks on all virtual machines;
Step 5: schedule and allocate the generated task data and virtual machine data using the Q-learning algorithm based on automatically updated weight factors.
2. The multi-objective cloud resource scheduling method based on an improved Q-learning algorithm according to claim 1, characterized in that scheduling and allocating the generated task data and virtual machine data in step 5 using the Q-learning algorithm based on automatically updated weight factors comprises the following specific steps:
Step 5-1: initialize the Q table and the G table, where the Q table stores the value of each action in each state and the G table stores the information related to the weight factors;
Step 5-2: initialize the state space S;
Step 5-3: iteratively execute steps 5-3-1 to 5-3-6;
Step 5-3-1: set s as the current state;
Step 5-3-2: select an action from the action set A using the heuristic action selection strategy with automatically updated weight factors based on the ε-greedy algorithm;
Step 5-3-3: execute the chosen action and record the immediate reward of executing that action in that state and the next state s';
Step 5-3-4: update the Q value according to the formula Q_t = (1 − α)·Q_t + α·(r + γ·Q_{t+1}), where α ∈ (0, 1) denotes the learning rate, γ denotes the discount factor, and Q_t denotes the Q value at time t; update the weight factor f(s_t, a_t), and then update the G value according to the weight-factor formula;
Step 5-3-5: compute error = max(error, |Q_t − Q_{t−1}|), where Q_{t−1} denotes the Q value at the moment before t;
Step 5-3-6: if error < θ, terminate the learning process of the Agent; otherwise return to step 5-3-1 (where θ is a fixed value set according to demand).
CN201910807351.6A 2019-08-29 2019-08-29 A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm Pending CN110515735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910807351.6A CN110515735A (en) 2019-08-29 2019-08-29 A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910807351.6A CN110515735A (en) 2019-08-29 2019-08-29 A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm

Publications (1)

Publication Number Publication Date
CN110515735A (en) 2019-11-29

Family

ID=68627856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910807351.6A Pending CN110515735A (en) A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm

Country Status (1)

Country Link
CN (1) CN110515735A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868972A (en) * 2012-09-05 2013-01-09 河海大学常州校区 Internet of things (IoT) error sensor node location method based on improved Q learning algorithm
CN103064817A (en) * 2012-12-21 2013-04-24 桂林电子科技大学 Simplified two-line serial data bus transport method
CN105930214A (en) * 2016-04-22 2016-09-07 广东石油化工学院 Q-learning-based hybrid cloud job scheduling method
CN108139930A (en) * 2016-05-24 2018-06-08 华为技术有限公司 Resource regulating method and device based on Q study
CN108508745A (en) * 2018-01-22 2018-09-07 中国铁道科学研究院通信信号研究所 A kind of multiple target cycle tests collection optimization generation method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
LUIZ A. CELIBERTO JR: "Using Transfer Learning to Speed-Up Reinforcement Learning: A Case-Based Approach", 2010 Latin American Robotics Symposium and Intelligent Robotics Meeting *
REINALDO A. C. BIANCHI: "Heuristically-Accelerated Multiagent Reinforcement Learning", IEEE Transactions on Cybernetics, vol. 44, no. 2, February 2014 *
Wu Haolin et al.: "Heuristic Q-learning guided by online-updated information strength", Application Research of Computers *
Zhang Wenxu: "Research on consensus-based and event-driven reinforcement learning", China Masters' Theses Full-text Database, Information Science and Technology *
Li Chengyan: "Cloud computing resource scheduling with uncertain execution times", Journal of Harbin University of Science and Technology *
Wang Hongyan: "A new heuristic Q-learning algorithm", Computer Engineering *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191934A (en) * 2019-12-31 2020-05-22 北京理工大学 Multi-target cloud workflow scheduling method based on reinforcement learning strategy
CN111191934B (en) * 2019-12-31 2022-04-15 北京理工大学 Multi-target cloud workflow scheduling method based on reinforcement learning strategy
CN111427688A (en) * 2020-03-23 2020-07-17 武汉轻工大学 Cloud task multi-target scheduling method and device, electronic equipment and storage medium
CN111427688B (en) * 2020-03-23 2023-08-11 武汉轻工大学 Cloud task multi-target scheduling method and device, electronic equipment and storage medium
CN111637444B (en) * 2020-06-05 2021-10-22 沈阳航空航天大学 Nuclear power steam generator water level control method based on Q learning
CN111637444A (en) * 2020-06-05 2020-09-08 沈阳航空航天大学 Nuclear power steam generator water level control method based on Q learning
CN112543038A (en) * 2020-11-02 2021-03-23 杭州电子科技大学 Intelligent anti-interference decision method of frequency hopping system based on HAQL-PSO
CN112256422A (en) * 2020-11-17 2021-01-22 中国人民解放军战略支援部队信息工程大学 Heterogeneous platform task scheduling method and system based on Q learning
CN112327786A (en) * 2020-11-19 2021-02-05 哈尔滨理工大学 Comprehensive scheduling method for dynamically adjusting non-occupied time period of equipment
CN113163447A (en) * 2021-03-12 2021-07-23 中南大学 Communication network task resource scheduling method based on Q learning
CN113190081A (en) * 2021-04-26 2021-07-30 中国科学院近代物理研究所 Method and device for adjusting time synchronism of power supply
CN113190081B (en) * 2021-04-26 2022-12-13 中国科学院近代物理研究所 Method and device for adjusting time synchronism of power supply
CN113222253A (en) * 2021-05-13 2021-08-06 珠海埃克斯智能科技有限公司 Scheduling optimization method, device and equipment and computer readable storage medium
CN113222253B (en) * 2021-05-13 2022-09-30 珠海埃克斯智能科技有限公司 Scheduling optimization method, device, equipment and computer readable storage medium
CN113326135A (en) * 2021-06-21 2021-08-31 桂林航天工业学院 Cloud resource scheduling method under multiple targets
CN113326135B (en) * 2021-06-21 2023-08-22 桂林航天工业学院 Cloud resource scheduling method under multiple targets
CN113535365A (en) * 2021-07-30 2021-10-22 中科计算技术西部研究院 Deep learning training operation resource placement system and method based on reinforcement learning
CN116932164A (en) * 2023-07-25 2023-10-24 和光舒卷(广东)数字科技有限公司 Multi-task scheduling method and system based on cloud platform
CN116932164B (en) * 2023-07-25 2024-03-29 和光舒卷(广东)数字科技有限公司 Multi-task scheduling method and system based on cloud platform

Similar Documents

Publication Publication Date Title
CN110515735A (en) A multi-objective cloud resource scheduling method based on an improved Q-learning algorithm
Wei Task scheduling optimization strategy using improved ant colony optimization algorithm in cloud computing
Torabi et al. A dynamic task scheduling framework based on chicken swarm and improved raven roosting optimization methods in cloud computing
Sun et al. PACO: A period ACO based scheduling algorithm in cloud computing
CN109656702A (en) A kind of across data center network method for scheduling task based on intensified learning
CN108182115A (en) A kind of virtual machine load-balancing method under cloud environment
CN110109753A (en) Resource regulating method and system based on various dimensions constraint genetic algorithm
CN108694090A (en) A kind of cloud computing resource scheduling method of Based on Distributed machine learning
CN103699446A (en) Quantum-behaved particle swarm optimization (QPSO) algorithm based multi-objective dynamic workflow scheduling method
CN110351348B (en) Cloud computing resource scheduling optimization method based on DQN
CN111722910A (en) Cloud job scheduling and resource allocation method
CN109067834A (en) Discrete particle cluster dispatching algorithm based on oscillatory type inertia weight
Petropoulos et al. A particle swarm optimization algorithm for balancing assembly lines
CN108170530A (en) A kind of Hadoop Load Balancing Task Scheduling methods based on mixing meta-heuristic algorithm
Thaman et al. Current perspective in task scheduling techniques in cloud computing: a review
Manikandan et al. LGSA: Hybrid task scheduling in multi objective functionality in cloud computing environment
Yu et al. Fluid: Resource-aware hyperparameter tuning engine
Mojab et al. iCATS: Scheduling big data workflows in the cloud using cultural algorithms
Zhou et al. Deep reinforcement learning-based algorithms selectors for the resource scheduling in hierarchical cloud computing
Balla et al. Reliability-aware: task scheduling in cloud computing using multi-agent reinforcement learning algorithm and neural fitted Q.
Han et al. A DEA based hybrid algorithm for bi-objective task scheduling in cloud computing
Hu et al. A two-stage multi-objective task scheduling framework based on invasive tumor growth optimization algorithm for cloud computing
CN108958919A (en) More DAG task schedule expense fairness assessment models of limited constraint in a kind of cloud computing
Sharma et al. Multi-Faceted Job Scheduling Optimization Using Q-learning With ABC In Cloud Environment
Yamazaki et al. Implementation and evaluation of the JobTracker initiative task scheduling on Hadoop

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191129