CN116450308A - Multi-strategy learning-based adaptive DAG task scheduling method - Google Patents
Multi-strategy learning-based adaptive DAG task scheduling method
Info
- Publication number: CN116450308A
- Application number: CN202211596732.2A
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F2209/483—Multiproc
- G06F2209/484—Precedence
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention mainly aims to solve the problem of low search efficiency in existing task scheduling algorithms, and discloses a multi-strategy learning-based adaptive DAG task scheduling method comprising the following stages: a state updating stage; a reward updating stage; an action selection stage; and a simulation stage. The simulation stage is repeated until the iteration count limit or time limit is reached, and the minimum makespan value is finally returned. The invention effectively balances the relation between exploration and exploitation, thereby finding a better makespan value faster, reducing search time cost, and improving the search efficiency of the algorithm; the method is universal, is suitable for new applications and new hardware systems, and improves system efficiency.
Description
Technical Field
The invention relates to the technical field of task scheduling systems, in particular to a self-adaptive DAG task scheduling method based on multi-strategy learning.
Background
In distributed heterogeneous computing systems, various computing resources are interconnected by high-speed networks to support compute-intensive parallel and distributed applications. Efficient task scheduling is critical to improving system performance. How to schedule parallel computing tasks for efficient execution in heterogeneous computing systems is a hotspot problem in systems research. Parallel computing tasks in application fields such as big data and artificial intelligence generally represent the data dependencies and parallel relations among tasks with a DAG (directed acyclic graph) task graph model. DAG task scheduling in heterogeneous computing systems is a classical problem in computer architecture research.
DAG task scheduling in heterogeneous computing systems is NP-complete, and practical scheduling systems are more complex still. Many heuristic algorithms have been proposed, such as list scheduling algorithms, genetic- and evolution-based random search algorithms, and task-replication-based algorithms. Most of these methods are heuristic and lack versatility across application scenarios. As the software and hardware environment iterates, traditional heuristic scheduling methods that depend on expert experience are difficult to apply universally to novel application scenarios, so they cannot fully exploit system efficiency on new applications and new hardware systems. In particular, the prior art cannot balance the relation between exploration and exploitation, so a better makespan value cannot be found quickly and the search time cost increases.
Disclosure of Invention
The invention mainly aims to solve the problem of low searching efficiency of the existing task scheduling algorithm, and provides a multi-strategy learning-based adaptive DAG task scheduling method, which effectively balances the relation between exploration and utilization, thereby accelerating the finding of a better makespan value, reducing the searching time cost and improving the algorithm searching efficiency; the method has universality, is suitable for new application and new hardware systems, and improves the system efficiency.
In order to achieve the above object, the present invention adopts the following technical scheme.
An adaptive DAG task scheduling method based on multi-strategy learning comprises the following steps:
step S1: a state updating stage: initializing the ready queue to contain only the entry task, selecting a scheduled task from the ready queue, and updating the state of the task;
step S2: a reward updating stage: scheduling the execution of a task in the ready queue, updating the reward value of the task after execution finishes, and back-propagating the access count and reward value to the entry node;
step S3: an action selection stage: starting from the scheduled task, calculating the A(n_i) value of every candidate next task n_i, and putting the task with the largest A(n_i) value into the ready queue; the task with the maximum A(n_i) value computed by the formula serves as the next node to be scheduled;
step S4: a simulation stage: repeating the steps S1-S3 until the exit task is scheduled, finally obtaining a makespan value;
step S5: repeating the step S4 until the iteration count limit or the time limit is reached, finally returning the minimum makespan value;
the invention provides a self-adaptive DAG task scheduling method based on multi-strategy learning, which effectively balances the relation between exploration and utilization, and finds out a better makespan value in the actual DAG scheduling process, thereby effectively reducing the searching time cost and improving the algorithm searching efficiency; the method has universality, is suitable for new applications and new hardware systems, and can fully exert the system efficiency.
Preferably, the step S1 includes: setting a task node status flag S(n_i) and a task node access count N(n_i) on the basis of the DAG model, each state update comprising: S(n_i)=1, N(n_i)=N(n_i)+1. The change logic of the status flag S(n_i) and the access count N(n_i) is simple and effective, and the execution efficiency is high.
Preferably, the specific process of the step S1 is as follows: the task node status flags S(n_i) are all initialized to 0, and S(n_i) may be set to 1 only when S(n_j)=1 for all j ∈ pred(i), where j denotes a predecessor node of i; the task node access counts N(n_i) are all initialized to 0 and updated as N(n_i)=N(n_i)+1. The constraint on setting S(n_i) to 1 preserves the precedence relationships among the DAG tasks.
Preferably, in the step S2, the processor is selected by the insertion-based policy of the HEFT algorithm. This processor selection policy makes full use of the processors' idle time and avoids wasting processor resources, further shortening the total schedule length.
Preferably, the step S2 further includes: setting a task node cumulative reward Q(n_i) on the basis of the DAG model; the cumulative rewards Q(n_i) are all initialized to 0 and updated as Q(n_i)=Q(n_i)+EST(n_i), where EST(n_i) denotes the earliest start time of task n_i; the access count N(n_i) is updated as N(n_i)=N(n_i)+1; if the scheduled task is the exit task, the current round of scheduling ends. The value of the cumulative reward Q(n_i) affects A(n_i) and thereby the selection of the next scheduled task; the cumulative reward Q(n_i) designed by the invention ensures that tasks n_i with higher utilization value are scheduled for execution first, shortening the total schedule length.
Preferably, in the step S3, the A(n_i) value is calculated as:
A(n_i) = V(n_i) + E(n_i)
where V(n_i) denotes the utilization (exploitation) value part and E(n_i) denotes the exploration value part; c is a constant parameter used mainly to balance the weight between exploration and utilization; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; the average upward rank is taken over the n DAG tasks, n being the number of DAG tasks; the upward average reward value of task node n_i is the average reward value of the tasks on the critical path from n_i to the exit task; N(n_i) denotes the access count of the current task node n_i, and N(n_j) denotes the access count of the parent node n_j of the current task node. The larger the utilization value part V(n_i), the more valuable the current node. If the access count of the current node is small, the exploration value part E(n_i) increases, indicating that the current node is more worth exploring. A well-set key hyperparameter c balances exploration and utilization, improving algorithm performance while searching as much of the state space as possible.
Preferably, in step S4, after a simulation ends and a makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling. After the status flags S(n_i) are reset to 0, the access counts N(n_i) and cumulative rewards Q(n_i) retained from the round guide the execution of the next round of scheduling; after multiple rounds of iterative scheduling, the algorithm obtains a better makespan value.
Therefore, the invention has the advantages that:
(1) The relation between exploration and utilization is effectively balanced, a better makespan value is found faster in the actual DAG scheduling process, the search time cost is effectively reduced, and the search efficiency of the algorithm is improved;
(2) The method has universality, is suitable for new applications and new hardware systems, and can fully exert the system efficiency.
Drawings
FIG. 1 is a flow chart of an adaptive DAG task scheduling method based on multi-strategy learning in an embodiment of the invention.
1. State updating stage; 2. Reward updating stage; 3. Action selection stage; 4. Simulation stage.
Detailed Description
The invention is further described below with reference to the drawings and detailed description.
An adaptive DAG task scheduling method based on multi-strategy learning, as shown in figure 1, comprises the following steps:
Step S1: a state updating stage: initializing the ready queue to contain only the entry task, selecting a scheduled task from the ready queue, and updating the state of the task;
setting task node status flags S (n) on the basis of DAG model i ) Task node access count N (N) i ) Each update state includes: s (n) i )=1,N(n i )=N(n i ) +1. Specifically, task node status flags S (n i ) Initialization is all 0,S (n) i ) The setting of 1 needs to satisfy:j is epsilon pred (i), where j represents the precursor node of i; task node access count N (N) i ) Initialization is all 0, update mode is N (N i )=N(n i )+1。
Step S2: a reward updating stage: scheduling the execution of a task in the ready queue, the processor being selected by the insertion-based policy of the HEFT algorithm; after execution finishes, updating the reward value of the task and back-propagating the access count and reward value to the entry node;
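The insertion-based policy referenced from HEFT can be sketched as follows (Python; `slots`, a sorted list of a processor's busy intervals, is a hypothetical representation, since the patent cites the policy without restating it):

```python
def earliest_insertion_start(slots, ready_time, duration):
    """Insertion-based policy (as in HEFT): scan a processor's busy
    intervals (sorted (start, end) pairs) for the earliest idle gap, at
    or after ready_time, wide enough to hold the task."""
    t = ready_time
    for start, end in slots:
        if start - t >= duration:   # the gap before this interval fits the task
            return t
        t = max(t, end)             # otherwise wait until the interval ends
    return t                        # schedule after the last busy interval
```

Using idle gaps between already-scheduled tasks, rather than always appending at the end, is what avoids wasted processor time and shortens the total schedule length.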
setting task node jackpot Q (n) based on DAG model i ) Task node cumulative rewards Q (n) i ) Initialization is all 0, update mode is Q (n i )=Q(n i )+EST(n i ) Wherein EST (n) i ) Representing task n i Is the earliest start time of (2); task node access count N (N) i ) The update mode of (a) is N (N) i )=N(n i ) +1; if the scheduled task is an exit task, the end of the round of scheduling is indicated.
Step S3: an action selection stage: starting from the scheduled task, calculating the A(n_i) value of every candidate next task n_i, and putting the task with the largest A(n_i) value into the ready queue; the task with the maximum A(n_i) value computed by the formula serves as the next node to be scheduled;
The A(n_i) value is calculated as:
A(n_i) = V(n_i) + E(n_i)
where V(n_i) denotes the utilization (exploitation) value part and E(n_i) denotes the exploration value part; c is a constant parameter used mainly to balance the weight between exploration and utilization; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; the average upward rank is taken over the n DAG tasks, n being the number of DAG tasks; the upward average reward value of task node n_i is the average reward value of the tasks on the critical path from n_i to the exit task; N(n_i) denotes the access count of the current task node n_i, and N(n_j) denotes the access count of the parent node n_j of the current task node.
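The exact sub-expressions for V(n_i) and E(n_i) appear as formula images in the original filing; the sketch below (Python) uses a rank-weighted exploitation term plus a UCT-style exploration term that matches the surrounding description — the precise combination, the function `a_value`, and the default value of c are all assumptions:

```python
import math

def a_value(i, j, rank_u, rank_avg, q_bar, N, c=1.4):
    """A(n_i) = V(n_i) + E(n_i): V grows with the node's upward rank
    rank_u(n_i) (normalized by the average rank) and its upward average
    reward q_bar; E grows when the access count N(n_i) is small relative
    to that of the parent node n_j, steering the search toward
    under-explored nodes."""
    if N[i] == 0:
        return float("inf")                       # explore unvisited nodes first
    v = (rank_u[i] / rank_avg) * q_bar[i]         # utilization value part V(n_i)
    e = c * math.sqrt(math.log(N[j]) / N[i])      # exploration value part E(n_i)
    return v + e
```

With equal exploitation values, a node visited once scores higher than one visited five times, which is the intended balance: rarely visited nodes are explored, frequently rewarded ones are exploited.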
Step S4: a simulation stage: repeating the steps S1-S3 until the exit task is scheduled, finally obtaining a makespan value; after the simulation ends and the makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling.
Step S5: repeating the step S4 until the iteration count limit or the time limit is reached, finally returning the minimum makespan value.
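Putting steps S1-S5 together, the overall search loop can be sketched as follows (Python; random tie-breaking stands in for the A(n_i) selection rule and `makespan` is a caller-supplied evaluation — both are simplifying assumptions):

```python
import random

def schedule_once(tasks, pred, succ, entry, exit_task, select):
    """One simulation round (steps S1-S4): repeatedly pick a ready task
    via the selection rule until the exit task is scheduled; returns the
    schedule order (a stand-in for a full processor assignment)."""
    S = {t: 0 for t in tasks}       # status flags, all initialized to 0
    ready = [entry]                 # ready queue starts with the entry task only
    order = []
    while ready:
        t = select(ready)
        ready.remove(t)
        S[t] = 1                    # state update (step S1)
        order.append(t)
        if t == exit_task:          # scheduling the exit task ends the round
            break
        for s in succ[t]:           # a successor becomes ready once all
            if all(S[p] == 1 for p in pred[s]):  # its predecessors are done
                ready.append(s)
    return order

def search(tasks, pred, succ, entry, exit_task, makespan, iters=100):
    """Step S5: repeat the simulation until the iteration limit is met
    and return the minimum makespan value found."""
    best = float("inf")
    for _ in range(iters):
        order = schedule_once(tasks, pred, succ, entry, exit_task,
                              select=lambda r: random.choice(r))
        best = min(best, makespan(order))
    return best
```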
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (7)
1. An adaptive DAG task scheduling method based on multi-strategy learning, characterized by comprising the following steps:
step S1: a state updating stage: initializing the ready queue to contain only the entry task, selecting a scheduled task from the ready queue, and updating the state of the task;
step S2: a reward updating stage: scheduling the execution of a task in the ready queue, updating the reward value of the task after execution finishes, and back-propagating the access count and reward value to the entry node;
step S3: an action selection stage: starting from the scheduled task, calculating the A(n_i) value of every candidate next task n_i, and putting the task with the largest A(n_i) value into the ready queue;
step S4: a simulation stage: repeating the steps S1-S3 until the exit task is scheduled, finally obtaining a makespan value;
step S5: repeating the step S4 until the iteration count limit or the time limit is reached, finally returning the minimum makespan value.
2. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein the step S1 comprises: setting a task node status flag S(n_i) and a task node access count N(n_i) on the basis of the DAG model, each state update comprising: S(n_i)=1, N(n_i)=N(n_i)+1.
3. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 2, wherein the specific process of the step S1 is as follows: the task node status flags S(n_i) are all initialized to 0, and S(n_i) may be set to 1 only when S(n_j)=1 for all j ∈ pred(i), where j denotes a predecessor node of i; the task node access counts N(n_i) are all initialized to 0 and updated as N(n_i)=N(n_i)+1.
4. The adaptive DAG task scheduling method based on multi-policy learning according to claim 1, wherein in step S2, the selection of the processor adopts an insertion-based policy in the HEFT algorithm.
5. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein the step S2 further comprises: setting a task node cumulative reward Q(n_i) on the basis of the DAG model; the cumulative rewards Q(n_i) are all initialized to 0 and updated as Q(n_i)=Q(n_i)+EST(n_i), where EST(n_i) denotes the earliest start time of task n_i; the access count N(n_i) is updated as N(n_i)=N(n_i)+1; if the scheduled task is the exit task, the current round of scheduling ends.
6. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein in step S3, the A(n_i) value is calculated as:
A(n_i) = V(n_i) + E(n_i)
where V(n_i) denotes the utilization value part and E(n_i) denotes the exploration value part; c is a constant parameter used mainly to balance the weight between exploration and utilization; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; the average upward rank is taken over the n DAG tasks, n being the number of DAG tasks; the upward average reward value of task node n_i is the average reward value of the tasks on the critical path from n_i to the exit task; N(n_i) denotes the access count of the current task node n_i, and N(n_j) denotes the access count of the parent node n_j of the current task node.
7. The adaptive DAG task scheduling method based on multi-strategy learning according to any one of claims 1 to 6, wherein in step S4, after the simulation ends and the makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211596732.2A | 2022-12-12 | 2022-12-12 | Multi-strategy learning-based adaptive DAG task scheduling method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116450308A | 2023-07-18 |
Family
ID=87122567
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211596732.2A | Multi-strategy learning-based adaptive DAG task scheduling method | 2022-12-12 | 2022-12-12 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN116450308A (en), Pending |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |