CN116450308A - Multi-strategy learning-based adaptive DAG task scheduling method - Google Patents

Multi-strategy learning-based adaptive DAG task scheduling method

Info

Publication number
CN116450308A
Authority
CN
China
Prior art keywords
task
value
node
dag
scheduling method
Prior art date
Legal status
Pending
Application number
CN202211596732.2A
Other languages
Chinese (zh)
Inventor
程雨夏
舒浪
黄凡丁
李宋晨
陈一飞
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
2022-12-12
Filing date
2022-12-12
Publication date
2023-07-18
Application filed by Hangzhou Dianzi University
Priority to CN202211596732.2A
Publication of CN116450308A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 2209/00: Indexing scheme relating to G06F 9/00
    • G06F 2209/48: Indexing scheme relating to G06F 9/48
    • G06F 2209/483: Multiproc
    • G06F 2209/484: Precedence
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention mainly aims to solve the low search efficiency of existing task scheduling algorithms, and discloses a multi-strategy learning-based adaptive DAG task scheduling method comprising the following stages: a state update stage; a reward update stage; an action selection stage; and a simulation stage. The simulation stage is repeated until an iteration-count limit or a time limit is met, and finally the minimum makespan value found is returned. The invention effectively balances the relationship between exploration and exploitation, thereby finding a better makespan value faster, reducing search time cost, and improving the search efficiency of the algorithm; the method is general-purpose, is applicable to new applications and new hardware systems, and improves system efficiency.

Description

Multi-strategy learning-based adaptive DAG task scheduling method
Technical Field
The invention relates to the technical field of task scheduling systems, and in particular to an adaptive DAG task scheduling method based on multi-strategy learning.
Background
In distributed heterogeneous computing systems, diverse computing resources are interconnected by high-speed networks to support compute-intensive parallel and distributed applications. Efficient task scheduling is critical to improving system performance, and how to schedule parallel computing tasks for efficient execution in heterogeneous computing systems is a hot problem in systems research. Parallel computing tasks in application fields such as big data and artificial intelligence generally represent the data dependencies and parallel relations among tasks with a DAG (directed acyclic graph) task graph model. DAG task scheduling in heterogeneous computing systems is a classical problem in computer architecture research.
DAG task scheduling on heterogeneous computing systems is NP-complete, and is even more complex in practical scheduling systems. Many heuristic algorithms have been proposed, such as list scheduling algorithms, random search algorithms based on genetic and evolutionary methods, and task-duplication-based algorithms. Most of these methods are hand-crafted heuristics and lack generality across application scenarios. As software and hardware environments evolve, traditional heuristic scheduling methods that depend on expert-designed rules are difficult to apply universally to novel application scenarios, so traditional scheduling methods cannot fully exploit system capability on new applications and new hardware. In practice, the prior art fails to balance the relationship between exploration and exploitation, so a better makespan value cannot be found quickly and the search time cost increases.
Disclosure of Invention
The invention mainly aims to solve the low search efficiency of existing task scheduling algorithms, and provides a multi-strategy learning-based adaptive DAG task scheduling method that effectively balances the relationship between exploration and exploitation, thereby finding a better makespan value faster, reducing search time cost, and improving the search efficiency of the algorithm; the method is general-purpose, is applicable to new applications and new hardware systems, and improves system efficiency.
To achieve the above object, the present invention adopts the following technical solution.
An adaptive DAG task scheduling method based on multi-strategy learning comprises the following steps:
Step S1: state update stage: initialize the ready queue so that it contains only the entry task, select a task to schedule from the ready queue, and update the state of that task;
Step S2: reward update stage: schedule the tasks in the ready queue for execution, update the reward value of each task after execution finishes, and propagate the visit count and reward value back to the entry node;
Step S3: action selection stage: starting from the scheduled task, compute the A(n_i) value of every candidate next task n_i and put the task with the largest A(n_i) value into the ready queue; the node with the maximum A(n_i) value computed by the formula becomes the next node to be scheduled;
Step S4: simulation stage: repeat steps S1-S3 until the exit task has been scheduled, finally obtaining a makespan value;
Step S5: repeat step S4 until the iteration-count limit or the time limit is met, finally returning the minimum makespan value.
the invention provides a self-adaptive DAG task scheduling method based on multi-strategy learning, which effectively balances the relation between exploration and utilization, and finds out a better makespan value in the actual DAG scheduling process, thereby effectively reducing the searching time cost and improving the algorithm searching efficiency; the method has universality, is suitable for new applications and new hardware systems, and can fully exert the system efficiency.
Preferably, step S1 comprises: on the basis of the DAG model, setting a task node status flag S(n_i) and a task node visit count N(n_i); each state update comprises: S(n_i) = 1 and N(n_i) = N(n_i) + 1. The update logic of the status flag S(n_i) and the visit count N(n_i) is simple and effective, with high execution efficiency.
Preferably, the specific process of step S1 is as follows: the task node status flags S(n_i) are all initialized to 0; setting S(n_i) = 1 requires that S(n_j) = 1 for every j ∈ pred(i), where pred(i) denotes the set of predecessor nodes of i; the task node visit counts N(n_i) are all initialized to 0 and are updated as N(n_i) = N(n_i) + 1. The constraint for setting S(n_i) to 1 enforces the precedence relations among the DAG tasks.
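As a concrete illustration, the following minimal Python sketch implements the status-flag and visit-count bookkeeping described above; the class and variable names (SchedulerState, pred, ready, and so on) are illustrative assumptions, not identifiers from the patent.

```python
from collections import defaultdict

class SchedulerState:
    """Bookkeeping for step S1: status flags S(n_i) and visit counts N(n_i)."""

    def __init__(self, pred, entry_task):
        self.pred = pred               # pred[i]: list of predecessor tasks of i
        self.S = defaultdict(int)      # status flags S(n_i), all initialized to 0
        self.N = defaultdict(int)      # visit counts N(n_i), all initialized to 0
        self.ready = [entry_task]      # ready queue initially holds only the entry task

    def schedule(self, task):
        # S(n_i) may be set to 1 only when S(n_j) = 1 for every j in pred(i),
        # which enforces the precedence constraints between DAG tasks.
        assert task in self.ready
        assert all(self.S[j] == 1 for j in self.pred[task])
        self.ready.remove(task)
        self.S[task] = 1               # S(n_i) = 1
        self.N[task] += 1              # N(n_i) = N(n_i) + 1
```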
Preferably, in step S2, processor selection adopts the insertion-based policy of the HEFT algorithm. This processor selection policy makes full use of processor idle time and avoids wasting processor resources, further shortening the total schedule length.
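For illustration, the sketch below shows the core of an insertion-based slot search in the spirit of HEFT: rather than always appending a task after the last job on a processor, it scans for the earliest idle gap large enough to hold the task. The data layout (a sorted list of busy intervals per processor) and the function name are assumptions for this sketch.

```python
def earliest_slot(busy, ready_time, duration):
    """busy: sorted list of (start, end) intervals already placed on one
    processor. Returns the earliest feasible start time for a new task that
    becomes ready at ready_time and runs for duration."""
    t = ready_time
    for start, end in busy:
        if start - t >= duration:   # an idle gap before this interval fits the task
            return t
        t = max(t, end)             # otherwise skip past the busy interval
    return t                        # no gap fits: append after the last interval
```

A scheduler would call earliest_slot for each processor and pick the processor minimizing earliest_slot(...) + duration, i.e. the earliest finish time.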
Preferably, step S2 further comprises: on the basis of the DAG model, setting a task node cumulative reward Q(n_i); the cumulative rewards Q(n_i) are all initialized to 0 and are updated as Q(n_i) = Q(n_i) + EST(n_i), where EST(n_i) denotes the earliest start time of task n_i; the visit count N(n_i) is updated as N(n_i) = N(n_i) + 1; if the scheduled task is the exit task, the current round of scheduling ends. The value of the cumulative reward Q(n_i) affects the value of A(n_i) and thus the selection of the next scheduled task; the cumulative reward Q(n_i) designed by the invention ensures that tasks n_i with higher exploitation value are scheduled for execution first, thereby shortening the total schedule length.
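The text states that the visit count and reward value are propagated back to the entry node but does not spell out the propagation rule, so the sketch below follows the usual Monte Carlo tree search convention of updating every node on the path from the entry node to the just-executed task. The function and parameter names are illustrative.

```python
def backpropagate(Q, N, path, est):
    """Q, N: dicts of cumulative rewards Q(n_i) and visit counts N(n_i);
    path: nodes from the entry node to the just-executed task n_i;
    est: the earliest start time EST(n_i) of that task."""
    for node in path:
        Q[node] += est   # Q(n_i) = Q(n_i) + EST(n_i)
        N[node] += 1     # N(n_i) = N(n_i) + 1
```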
Preferably, in step S3, the A(n_i) value is computed as:
A(n_i) = V(n_i) + E(n_i)
where c is a constant parameter used mainly to balance the weight between exploration and exploitation; V(n_i) denotes the exploitation-value part; E(n_i) denotes the exploration-value part; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; n denotes the number of DAG tasks; the upward average reward value of task node n_i under its upward rank refers to the average reward value of the tasks on the critical path from task n_i to the exit task; N(n_i) denotes the visit count of the current task node n_i, and N(n_j) denotes the visit count of the parent node n_j of the current task node. The larger the exploitation part V(n_i) is, the more valuable the current node is. If the visit count of the current node is small, the exploration part E(n_i) increases, indicating that the current node is more worth exploring. The key hyperparameter c balances exploration and exploitation well, improving algorithm performance while searching as much of the state space as possible.
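The displayed formulas defining V(n_i) and E(n_i) do not survive in this text, so the sketch below assumes UCT/UCB1-style forms built only from the quantities the description does define (rank_u, Q, N, the parent visit count N(n_j), the task count n, and the constant c). It is illustrative, not the patent's exact formula.

```python
import math

def a_value(i, j, rank_u, Q, N, n, c=1.4):
    """Score a candidate next task n_i whose parent (the just-scheduled
    node) is n_j; the candidate with the largest score is selected."""
    # Exploitation part V(n_i): assumed here to combine the upward rank
    # rank_u(n_i), normalized by the task count n (the original normalization
    # did not survive in the text), with the average reward Q(n_i) / N(n_i).
    avg_reward = Q[i] / N[i] if N[i] else 0.0
    v = rank_u[i] / n + avg_reward
    # Exploration part E(n_i): assumed UCB1-style, growing when the candidate
    # visit count N(n_i) is small relative to the parent visit count N(n_j).
    e = c * math.sqrt(math.log(N[j] + 1) / (N[i] + 1))
    return v + e
```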
Preferably, in step S4, after the simulation ends and a makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling. After the status flags S(n_i) are reset to 0, the visit counts N(n_i) and cumulative rewards Q(n_i) retained from the current round guide the execution of the next round of scheduling well; after multiple rounds of iterative scheduling, the algorithm obtains a better makespan value.
Therefore, the invention has the following advantages:
(1) It effectively balances the relationship between exploration and exploitation, finds a better makespan value faster in the actual DAG scheduling process, effectively reduces the search time cost, and improves the search efficiency of the algorithm;
(2) It is general-purpose, is applicable to new applications and new hardware systems, and can fully exploit system efficiency.
Drawings
FIG. 1 is a flow chart of an adaptive DAG task scheduling method based on multi-strategy learning in an embodiment of the invention.
1. State update stage; 2. Reward update stage; 3. Action selection stage; 4. Simulation stage.
Detailed Description
The invention is further described below with reference to the drawings and detailed description.
An adaptive DAG task scheduling method based on multi-strategy learning, as shown in FIG. 1, comprises the following steps:
Step S1: state update stage: initialize the ready queue so that it contains only the entry task, select a task to schedule from the ready queue, and update the state of that task;
On the basis of the DAG model, a task node status flag S(n_i) and a task node visit count N(n_i) are set; each state update comprises: S(n_i) = 1 and N(n_i) = N(n_i) + 1. Specifically, the task node status flags S(n_i) are all initialized to 0; setting S(n_i) = 1 requires that S(n_j) = 1 for every j ∈ pred(i), where pred(i) denotes the set of predecessor nodes of i; the task node visit counts N(n_i) are all initialized to 0 and are updated as N(n_i) = N(n_i) + 1.
Step S2: reward update stage: schedule the tasks in the ready queue for execution, with processor selection adopting the insertion-based policy of the HEFT algorithm; update the reward value of each task after execution finishes, and propagate the visit count and reward value back to the entry node;
On the basis of the DAG model, a task node cumulative reward Q(n_i) is set; the cumulative rewards Q(n_i) are all initialized to 0 and are updated as Q(n_i) = Q(n_i) + EST(n_i), where EST(n_i) denotes the earliest start time of task n_i; the visit count N(n_i) is updated as N(n_i) = N(n_i) + 1; if the scheduled task is the exit task, the current round of scheduling ends.
Step S3: action selection stage: starting from the scheduled task, compute the A(n_i) value of every candidate next task n_i and put the task with the largest A(n_i) value into the ready queue; the node with the maximum A(n_i) value computed by the formula becomes the next node to be scheduled;
The A(n_i) value is computed as:
A(n_i) = V(n_i) + E(n_i)
where c is a constant parameter used mainly to balance the weight between exploration and exploitation; V(n_i) denotes the exploitation-value part; E(n_i) denotes the exploration-value part; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; n denotes the number of DAG tasks; the upward average reward value of task node n_i under its upward rank refers to the average reward value of the tasks on the critical path from task n_i to the exit task; N(n_i) denotes the visit count of the current task node n_i, and N(n_j) denotes the visit count of the parent node n_j of the current task node.
Step S4: simulation stage: repeat steps S1-S3 until the exit task has been scheduled, finally obtaining a makespan value; after the simulation ends and the makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling.
Step S5: repeat step S4 until the iteration-count limit or the time limit is met, finally returning the minimum makespan value.
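To make the flow of steps S1-S5 concrete, the following self-contained Python toy runs the whole loop on a four-task diamond DAG with a single processor (so processor selection and communication costs are collapsed away) and an assumed UCT-style A(n_i); every name in it is illustrative, and the reward update Q(n_i) = Q(n_i) + EST(n_i) is taken literally from the text.

```python
import math

DAG = {"entry": (1, ["a", "b"]),   # node: (execution time, successors)
       "a": (3, ["exit"]),
       "b": (2, ["exit"]),
       "exit": (1, [])}
PRED = {v: [u for u in DAG if v in DAG[u][1]] for v in DAG}

N = {v: 0 for v in DAG}            # visit counts N(n_i), persist across rounds
Q = {v: 0.0 for v in DAG}          # cumulative rewards Q(n_i), persist across rounds

def a_value(i, j, c=1.4):          # assumed UCT-style A(n_i) = V(n_i) + E(n_i)
    v = Q[i] / N[i] if N[i] else 0.0
    e = c * math.sqrt(math.log(N[j] + 1) / (N[i] + 1))
    return v + e

def simulate():                    # one round: steps S1-S4
    S = {v: 0 for v in DAG}        # status flags reset every round (step S4)
    ready, clock, last = ["entry"], 0, "entry"
    while ready:
        task = max(ready, key=lambda i: a_value(i, last))    # step S3
        ready.remove(task)
        S[task] = 1                                          # step S1: S(n_i) = 1
        N[task] += 1                                         # step S1: N(n_i) += 1
        Q[task] += clock                                     # step S2: Q(n_i) += EST(n_i)
        clock += DAG[task][0]                                # single-processor "execution"
        for s in DAG[task][1]:                               # successors that become ready
            if all(S[p] == 1 for p in PRED[s]) and s not in ready:
                ready.append(s)
        last = task
    return clock                                             # makespan of this round

best = min(simulate() for _ in range(200))                   # step S5
print("best makespan:", best)
```

On a single processor every valid topological order of this toy DAG yields the same makespan, so the example demonstrates only the bookkeeping mechanics, not the makespan improvement that multi-processor insertion-based scheduling provides.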
The foregoing describes merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can easily conceive within the technical scope disclosed by the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (7)

1. An adaptive DAG task scheduling method based on multi-strategy learning, characterized by comprising the following steps:
Step S1: state update stage: initialize the ready queue so that it contains only the entry task, select a task to schedule from the ready queue, and update the state of that task;
Step S2: reward update stage: schedule the tasks in the ready queue for execution, update the reward value of each task after execution finishes, and propagate the visit count and reward value back to the entry node;
Step S3: action selection stage: starting from the scheduled task, compute the A(n_i) value of every candidate next task n_i and put the task with the largest A(n_i) value into the ready queue;
Step S4: simulation stage: repeat steps S1-S3 until the exit task has been scheduled, finally obtaining a makespan value;
Step S5: repeat step S4 until the iteration-count limit or the time limit is met, finally returning the minimum makespan value.
2. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein step S1 comprises: on the basis of the DAG model, setting a task node status flag S(n_i) and a task node visit count N(n_i); each state update comprises: S(n_i) = 1 and N(n_i) = N(n_i) + 1.
3. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 2, wherein the specific process of step S1 is as follows: the task node status flags S(n_i) are all initialized to 0; setting S(n_i) = 1 requires that S(n_j) = 1 for every j ∈ pred(i), where pred(i) denotes the set of predecessor nodes of i; the task node visit counts N(n_i) are all initialized to 0 and are updated as N(n_i) = N(n_i) + 1.
4. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein in step S2, processor selection adopts the insertion-based policy of the HEFT algorithm.
5. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein step S2 further comprises: on the basis of the DAG model, setting a task node cumulative reward Q(n_i); the cumulative rewards Q(n_i) are all initialized to 0 and are updated as Q(n_i) = Q(n_i) + EST(n_i), where EST(n_i) denotes the earliest start time of task n_i; the visit count N(n_i) is updated as N(n_i) = N(n_i) + 1; if the scheduled task is the exit task, the current round of scheduling ends.
6. The adaptive DAG task scheduling method based on multi-strategy learning according to claim 1, wherein in step S3, the A(n_i) value is computed as:
A(n_i) = V(n_i) + E(n_i)
where c is a constant parameter used mainly to balance the weight between exploration and exploitation; V(n_i) denotes the exploitation-value part; E(n_i) denotes the exploration-value part; rank_u(n_i) denotes the upward rank of task node n_i, i.e. the critical-path length from task n_i to the exit task; n denotes the number of DAG tasks; the upward average reward value of task node n_i under its upward rank refers to the average reward value of the tasks on the critical path from task n_i to the exit task; N(n_i) denotes the visit count of the current task node n_i, and N(n_j) denotes the visit count of the parent node n_j of the current task node.
7. The adaptive DAG task scheduling method based on multi-strategy learning according to any one of claims 1 to 6, wherein in step S4, after the simulation ends and the makespan value is obtained, the task node status flags S(n_i) are all reset to 0 for the next round of simulated scheduling.
CN202211596732.2A 2022-12-12 2022-12-12 Multi-strategy learning-based adaptive DAG task scheduling method Pending CN116450308A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211596732.2A CN116450308A (en) 2022-12-12 2022-12-12 Multi-strategy learning-based adaptive DAG task scheduling method

Publications (1)

Publication Number Publication Date
CN116450308A true CN116450308A (en) 2023-07-18

Family

ID=87122567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211596732.2A Pending CN116450308A (en) 2022-12-12 2022-12-12 Multi-strategy learning-based adaptive DAG task scheduling method

Country Status (1)

Country Link
CN (1) CN116450308A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination