WO2023241000A1

WO2023241000A1 - Dag task scheduling method and apparatus, device, and storage medium

Info

Publication number: WO2023241000A1
Application number: PCT/CN2022/142437
Authority: WO
Inventors: 胡克坤; 鲁璐; 赵坤; 董刚; 赵雅倩; 李仁刚
Original assignee: 苏州元脑智能科技有限公司
Priority date: 2022-06-15
Filing date: 2022-12-27
Publication date: 2023-12-21
Also published as: CN114756358B; CN114756358A

Abstract

A DAG task scheduling method and apparatus, a device, and a storage medium. The method comprises: constructing a network model according to the sequence of a directed graph neural network and a sequential decoder, and defining an objective function of the network model by taking the minimum task scheduling length as an objective; obtaining a DAG task data set, and generating a corresponding information matrix for each DAG task in the DAG task data set; training the network model by using the information matrix, and updating model parameters of the network model by using reinforcement learning according to the objective function to obtain a trained DAG task scheduling model; and determining, by using the DAG task scheduling model, the scheduling sequence of sub-tasks in a DAG task to be executed, and executing, by using a parallel computing system according to the scheduling sequence, the DAG task to be executed. The method can shorten the DAG task scheduling length and improve the parallel execution efficiency of DAG tasks.

Description

A DAG task scheduling method, device, equipment and storage medium

Cross-references to related applications

This application requests the priority of the Chinese patent application submitted to the China Patent Office on June 15, 2022, with the application number 202210671115.8, and the application name is "A DAG task scheduling method, device, equipment and storage medium", and its entire content is approved by This reference is incorporated into this application.

Technical field

The present application relates to the field of task scheduling technology, and in particular to a DAG task scheduling method, device, equipment and storage medium.

Background technique

At present, driven by the demand for high performance and complex functions, parallel computing systems are increasingly used to execute real-time applications, such as those with complex functional components such as sensing, planning, and control that require extremely high performance and real-time performance. Autonomous driving tasks. DAG (Directed Acyclic Graph) tasks are often used to represent complex dependencies between multiple task components (subtasks) of similar real-time applications and to formally describe fine-grained parallel task scheduling problems, that is, DAG tasks Scheduling issues. Considering that the non-preemptive task model can avoid task migration and switching overhead, priority-based non-preemptive scheduling for DAG tasks has received widespread attention. This problem studies how to non-preemptively schedule a given DAG task to a parallel computing system. Executing on it to minimize the processing time is a typical NP (Non-deterministic Polynomial Complete) complete problem. In the existing technology, long-term parallel computing practice has accumulated a large number of excellent heuristic scheduling algorithms, such as table scheduling algorithms and clustering scheduling algorithms. However, due to the nature of heuristic strategies, these algorithms cannot establish basic design principles for DAG task schedulers, such as how to use DAG task execution time and DAG task graph topological structure characteristics to allocate priorities to each sub-task under different DAG task sizes and configurations. level, and the scheduling performance is not ideal.

Contents of the invention

In view of this, the purpose of this application is to provide a DAG task scheduling method, device, equipment and medium that can shorten the DAG task scheduling length and improve the parallel execution efficiency of DAG tasks. The specific plan is as follows:

In the first aspect, this application discloses a DAG task scheduling method, including:

Construct a network model in the order of directed graph neural network and sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal;

Obtain the DAG task data set and generate the corresponding information matrix for each DAG task in the DAG task data set;

Use the information matrix to train the network model, and use reinforcement learning to update the model parameters of the network model according to the objective function to obtain the trained DAG task scheduling model;

The DAG task scheduling model is used to determine the scheduling order of subtasks within the DAG task to be executed, and the parallel computing system is used to execute the DAG task to be executed according to the scheduling order.

In some embodiments of the present application, before constructing the network model in the order of directed graph neural network and sequential decoder, it also includes:

Construct a graph convolution layer for DAG task feature learning based on aggregation functions and nonlinear activation functions;

A directed graph neural network is constructed in the order of input layer, K-layer graph convolution layer, and output layer.

Use the priority allocation status of subtasks within the DAG task as a variable to define a vector expression of the context environment for the DAG task;

Building a sequential decoder for prioritization based on attention mechanisms and vector representations of context.

In some embodiments of this application, the objective function of the network model is defined with the minimum task scheduling length as the goal, including:

Using the task scheduling length corresponding to the priority sorting of DAG tasks at different time steps and the lower limit of the task scheduling length as independent variables, the scheduling length deceleration rate evaluation index of the DAG task is generated; the lower limit of the task scheduling length is determined based on the path length of the critical path of the DAG task. ;

Construct a reward function based on the policy gradient algorithm and scheduling length deceleration rate evaluation index;

Construct the objective function of the network model based on the reward function.

In some embodiments of this application, the DAG task data set is obtained, including:

Configure DAG task parameters; DAG task parameters include the number of task layers, the number of child nodes of the target node, the probability of generating child nodes of the target node, the probability of adding a connecting edge between two adjacent task layers, and the number of each subtask. Computational load;

Generate a DAG task according to the DAG task parameters to obtain the DAG task data set.

In some embodiments of this application, a corresponding information matrix is generated for each DAG task in the DAG task data set, including:

Generate a node feature matrix based on the characteristics of each subtask in the DAG task in the DAG task data set;

Generate an adjacency matrix based on the connection relationships between different subtasks in the DAG task data set;

Based on the node feature matrix and adjacency matrix, the information matrix corresponding to the DAG task is obtained.

In some embodiments of this application, the information matrix is used to train the network model, and reinforcement learning is used to update the model parameters of the network model according to the objective function, including:

Input the information matrix into the network model, and use the directed graph neural network to output the vector representation of each subtask based on the characteristics of the subtask and the dependency relationship between the subtasks;

Using the sequential decoder, the subtasks within the DAG task are prioritized based on the attention mechanism and the context of the DAG task according to the vector representation of the subtask;

Use the DAG task scheduling simulator to calculate the task scheduling length of the DAG task based on priority sorting;

According to the task scheduling length and objective function, reinforcement learning is used to update the model parameters of the network model until the network model converges.

In the second aspect, this application discloses a DAG task scheduling device, including:

The network building module is used to build the network model in the order of directed graph neural network and sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal;

The data set acquisition module is used to obtain the DAG task data set and generate the corresponding information matrix for each DAG task in the DAG task data set;

The training module is used to train the network model using the information matrix, and update the model parameters of the network model using reinforcement learning according to the objective function to obtain the trained DAG task scheduling model;

The scheduling sequence determination module is used to use the DAG task scheduling model to determine the scheduling sequence of subtasks within the DAG task to be executed, and to use the parallel computing system to execute the DAG task to be executed according to the scheduling sequence.

In some embodiments of this application, the DAG task scheduling device also includes:

The graph convolution layer building unit is used to construct a graph convolution layer for DAG task feature learning based on aggregation functions and non-linear activation functions;

The directed graph neural network construction unit is used to construct the directed graph neural network in the order of the input layer, K-layer graph convolution layer, and output layer.

The vector expression definition unit is used to use the priority allocation status of subtasks within the DAG task as a variable to define a vector expression of the context environment for the DAG task;

The sequential decoder building unit is used to build a sequential decoder for prioritization based on the vector expression of the attention mechanism and the context environment to obtain the decoder.

In some embodiments of this application, network building modules include:

The scheduling length deceleration rate evaluation index construction unit is used to generate the scheduling length deceleration rate evaluation index of the DAG task using the task scheduling length corresponding to the priority sorting of the DAG task at different time steps and the lower limit of the task scheduling length as independent variables; the task scheduling length The lower limit is determined based on the path length of the critical path of the DAG task;

The reward function construction unit is used to construct the reward function based on the policy gradient algorithm and the scheduling length deceleration rate evaluation index;

The objective function building unit is used to build the objective function of the network model based on the reward function.

In some embodiments of this application, the data set acquisition module includes:

Task parameter configuration unit, used to configure DAG task parameters; DAG task parameters include the number of task layers, the number of child nodes of the target node, the generation probability of child nodes of the target node, and the connecting edges between two adjacent task layers. Add probabilities and computational loads for individual subtasks;

A task generation unit is used to generate a DAG task according to the DAG task parameters to obtain a DAG task data set.

The node feature matrix generation unit is used to generate the node feature matrix based on the characteristics of each sub-task in the DAG task in the DAG task data set;

The adjacency matrix generation unit is used to generate an adjacency matrix based on the connection relationship between different subtasks in the DAG task data set;

The information matrix determination unit is used to obtain the information matrix corresponding to the DAG task based on the node feature matrix and the adjacency matrix.

In a third aspect, this application discloses an electronic device, including:

Memory, used to hold computer programs;

A processor is used to execute a computer program to implement the aforementioned DAG task scheduling method.

In a fourth aspect, the present application discloses a non-volatile readable storage medium for storing a computer program; when the computer program is executed by a processor, the aforementioned DAG task scheduling method is implemented.

In this application, the network model is constructed according to the order of directed graph neural network and sequential decoder, and the objective function of the network model is defined with the minimum task scheduling length as the goal; the DAG task data set is obtained, and each of the DAG task data sets is The DAG task generates the corresponding information matrix; uses the information matrix to train the network model, and uses reinforcement learning to update the model parameters of the network model according to the objective function to obtain the trained DAG task scheduling model; use the DAG task scheduling model to determine the DAG to be executed The scheduling order of subtasks within the task, and the parallel computing system is used to execute the DAG tasks to be executed according to the scheduling order. In this application, a DAG task scheduling model is obtained based on directed graph neural network and reinforcement learning. The directed graph neural network can automatically identify rich features related to subtasks within the DAG task, and the sequential decoder can use these features to prioritize subtasks. , At the same time, the reinforcement learning optimization model is used to achieve the scheduling goal of minimizing the DAG task scheduling length, which can shorten the DAG task scheduling length, improve the parallel execution efficiency of DAG tasks, and use reinforcement learning to solve the difficulty of allocating optimal priorities for DAG tasks. Enough supervision labeling issues.

Description of the drawings

In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings in the following description are only This is an embodiment of the present application. For those of ordinary skill in the art, other drawings can be obtained based on the provided drawings without exerting creative efforts.

Figure 1 is a flow chart of a DAG task scheduling method provided by this application;

Figure 2 is a specific DAG task scheduling system structure diagram provided by this application;

Figure 3 is a specific directed graph neural network structure diagram provided by this application;

Figure 4 is a flow chart of a specific DAG task scheduling model training method provided by this application;

Figure 5 is a schematic structural diagram of a DAG task scheduling device provided by this application;

Figure 6 is a structural diagram of an electronic device provided by this application.

Detailed ways

In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments These are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of this application.

In the existing technology, long-term parallel computing practice has accumulated a large number of excellent heuristic scheduling algorithms, such as table scheduling algorithms and clustering scheduling algorithms. However, due to the nature of the heuristic strategy, these algorithms cannot establish a basic design for the DAG task scheduler. Principles, such as how to use DAG task execution time and DAG task graph topological structure characteristics to assign priorities to each sub-task under different DAG task sizes and configurations, and the scheduling performance is not ideal. In order to overcome the above technical problems, this application proposes a DAG task scheduling method, which can shorten the DAG task scheduling length and improve the parallel execution efficiency of DAG tasks.

The embodiment of the present application discloses a DAG task scheduling method, as shown in Figure 1. The method may include the following steps:

Step S11: Construct a network model in the order of directed graph neural network and sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal.

In this embodiment, the network model is first constructed in the order of Directed Graph Neural Network (DGNN) and sequential decoder, and the objective function of the network model is defined with the minimum task scheduling length as the goal; wherein, the above are The directed graph neural network is used to identify the task characteristics of the subtasks within the DAG task and output the embedded representation corresponding to each subtask, that is, a vector representation. The task characteristics include execution time and dependencies; the above-mentioned sequential decoder is used to output according to the directed graph neural network The embedded representation of sorts the priorities of all subtasks and outputs the priority sorting of the subtasks. The objective function is used to guide the learning of the network model so that the network model can finally output the minimum task scheduling length of the DAG task according to the input DAG task. Among them, the above-mentioned network model also includes a DAG task scheduling simulator, which is used to calculate the scheduling length of DAG tasks on a given parallel computing system.

Before describing this embodiment in detail, the basic concepts involved in this embodiment will first be explained. A parallel computing system can generally be described as a four-tuple ARC = (P, L, V, B). Among them: P={p _i |i=1,2,....,m} is the set of processing nodes; L={l _ij |p _i ,p _j ∈P} is the set of communication links between processing nodes; V＝{v _i |i＝1,2,….,m} is the calculation speed set of processing nodes, _vi represents the calculation speed of p _i , and satisfies v ₁ ≤v ₂ ≤…≤v _k ; B＝{ b _ij ∣l _ij ∈L} is a set of communication link bandwidths, and b _ij represents the bandwidth of communication link l _ij .

DAG tasks refer to multiple subtasks with complex dependencies that can be executed in parallel on a parallel computing system. They are commonly represented by weighted directed acyclic graphs and are recorded as DAG = (T, E, C). Among them, T = {t _i |i = 1, 2, 3,..., n} is a set of nodes, each node represents a subtask, and n is the total number of subtasks.

E＝{∣i,j＝1,2,3,…,n} is the set of directed edges. The directed edges

Represents the communication and data dependency relationship from the subtask _ti connected by this edge to its other subtask t _j . T _j must start execution after receiving the calculation result of _ti . C＝{c _i∣i ＝1,2,…,n} is a set of computing loads, c _i represents the computing load of subtask t _i , and the sum of the computing loads of all subtasks is w, then we have

Let Pred(t _i ) and Succ(t _i ) be the direct predecessor subtask set and direct successor subtask set of t _i respectively. The set of connecting edges between t _i and all subtasks in Pred(t _i ) and Succ(t _i ) is called the incident edge set E _in (t _i ) and the outgoing edge set E _out (t _i ) of t _i respectively; Denote the in-degree and out-degree of t _i as deg _in (t _i ) and deg _out (t _i ) respectively, then deg _in (t _i )=|E _in (t _i )| and deg _out (t _i )= |E _out (t _i )|. like

Then _ti is called the entry subtask and is recorded as tentry; if

Then _ti is said to be an exit subtask and is recorded as texit. The path λ = [t ₁ , t ₂ ,..., t _k ] is a finite sequence of subtask nodes and satisfies

have

If a path contains both entry subtasks and exit subtasks, the path is called a complete path. The path length Λ(λ) of λ is the sum of the computational loads of all subtasks on the path, that is

The complete path with the longest path length is called the critical path. In addition to task load, in-degree, and out-degree node-level features, the original features xi of each subtask node vi also include critical path length and non-critical path length. The non-critical path length can be subtracted from the total computing load of the DAG task. The critical path length is obtained. For example, Figure 2 shows a specific DAG task scheduling system, which shows a DAG task with 9 nodes and a unique entrance and exit. The strings in the nodes represent the subtask ID and computing load. In fact, a DAG task may have multiple entrances and exits, which can be turned into a DAG task with unique entrances and exits by adding a dummy entry subtask or exit subtask and corresponding connecting edges. Unless otherwise specified, this embodiment refers to the latter type of DAG tasks, and the number of subtasks n satisfies the requirement that it is much larger than the number m of nodes in the parallel computing system.

In this embodiment, before constructing the network model in the order of directed graph neural network and sequential decoder, it may also include: constructing a graph convolution layer for DAG task feature learning based on the aggregation function and nonlinear activation function; according to the input layer , K-layer graph convolution layer, and output layer are sequentially constructed to obtain a directed graph neural network. Use the above directed graph neural network to learn the vector representation of each sub-task within the above DAG task;

The above vector representation is recorded as

As shown in Figure 3, a directed graph neural network is designed, which consists of an input layer, a K-layer graph convolution layer, and an output layer. The input layer reads the results of the DAG task. Point feature matrix

Among them, the aggregate function aggregates messages from the immediate predecessors of subtask _ti , and the update function performs nonlinear transformation on the aggregated messages; Pred(t _i ) is the set of direct predecessor subtasks of _ti . For the aggregate function, it can use a variety of methods such as taking the maximum value, taking the average, etc. This patent uses the attention mechanism to implement:

Among them, α _ij represents the attention coefficient of subtask t _j to t _i , which is learned through training. For the update function, it can be any nonlinear activation function. It is not general. In this embodiment, the ReLu function is used to implement:

Right now

The output layer directly outputs the vertex embedded representation learned by the Kth graph convolution layer. It can be understood that the graph convolution layer constructed by using aggregation functions and nonlinear activation functions can better adapt to the directed characteristics of the DAG task graph, extract the dependencies between subtasks, and be able to identify the characteristics of the subtasks themselves and the characteristics of the subtasks. The DAG task has dependencies on other subtasks, so as to learn the embedded representation of subtask nodes more effectively, so as to provide richer features for subsequent subtask prioritization, thereby improving the accuracy of prioritization.

In this embodiment, before constructing the network model in the order of directed graph neural network and sequential decoder, it may also include: using the priority allocation status of subtasks within the DAG task as a variable to define a vector expression of the context environment for the DAG task; Building a sequential decoder for prioritization based on attention mechanisms and vector representations of context. It can be understood that the above decoder is a sequential decoder used to sort all subtasks of the DAG task, specifically based on the embedded representation of all n subtask nodes of the DAG task learned by the directed graph neural network.

The sequential decoder sequentially selects subtask nodes based on the attention mechanism to generate a priority ranking of size n π = [π ₁ , π ₂ ,..., π _n ], which corresponds to a priority-ordered subtask arrangement.

And satisfy: priority

In this embodiment, the sequential decoder can formally describe the DAG task priority allocation as a probability distribution defined by the following formula:

Among them, θ is the network parameter to be optimized. The sequential decoder first samples the subtask node with the highest priority from the probability distribution p(π ₁ ∣θ), and then samples the subtask node with the second highest priority from p(π ₂ ∣π ₁ ,θ), And so on until all subtask nodes are sampled.

At each time step τ(τ∈[1,n]), the sequential decoder selects subtask nodes according to the following rules

And give it priority (n-τ):

π _τ =argmax(p(π _τ ∣π ₁ ,π ₂ ,…,π _τ-1 ,θ)) (5)

Among them, the argmax function is used to determine the maximum independent variable point set; the conditional probability distribution p(π∣π ₁ ,π ₂ ,…,π _τ-1 ,θ) is calculated according to the following formula:

Among them, softmax is the normalized exponential function; W _θ is the feature transformation matrix to be trained; σ is the nonlinear activation function; att is the attention function; U _τ is the set of subtasks that have not yet been assigned priorities at time step τ. The set of priority subtasks is recorded as O _τ , and satisfies O _τ ∪U _τ =T. During the priority sorting process for a DAG task, the sets O _τ and U _τ will be updated in real time according to the priority allocation status; cont is the order. The context of the real-time subtask selection of the decoder. It can be understood that in formula (6), the vector representation of the subtask is compared with the vector representation of the context, and then the attention mechanism is used to allocate the weight of the subtask; the context environment The vector representation calculation formula is as follows:

cont＝W[cont _O ;cont _U ]+b (7)

Among them, W represents linear transformation, "[;]" represents tensor connector; cont _O and cont _U are the embedded representations corresponding to O _τ and U _τ respectively, which are calculated by the following formula:

Among them, σ is a nonlinear activation function. Thus, a sequential decoder is built based on the attention mechanism, and the selection problem of subtask nodes is reduced to the problem of randomly selecting subnode numbers from the conditional probability distribution, so as to determine the subtask node more accurately based on the vector representation of the subtask node. Prioritization of tasks.

In this embodiment, defining the objective function of the network model with the minimum task scheduling length as the goal may include: using the task scheduling length corresponding to the priority ordering of DAG tasks at different time steps and the lower limit of the task scheduling length as independent variables to generate a DAG task The scheduling length deceleration rate evaluation index; the lower limit of the task scheduling length is determined according to the path length of the critical path of the DAG task; the reward function is constructed based on the policy gradient algorithm and the scheduling length deceleration rate evaluation index; the objective function of the network model is constructed based on the reward function. It can be understood that the DAG task scheduling problem can be modeled and expressed as a Markov Decision Process (MDP, Markov Decision Process); a basic MDP can usually be described by a five-tuple: MDP = (A, S, R, Π,Δ).

Among them, A is a set of n time step actions of the sequential decoder. The action A _τ at time step τ (τ∈[1,n]) represents the sequential decoder’s selection of subtask nodes from the DAG task.

And give it priority (n-τ); S is the set of environmental states of n time steps; the environmental state Sτ of time step τ is represented by the embedded representation of all sub-task nodes of the DAG task

Assigned priority subtask sequence

And under the premise of specifying the priority sequence of some subtasks, the scheduling length estimate of the DAG task Makespan (π, τ) is composed of these three quantities;

Right now

R represents the immediate return value of the environment, which is used to evaluate the effect of the previous selection action of the sequential decoder. Since the goal of DAG task scheduling is to minimize the task scheduling length, in this embodiment, a reward function r(π,τ) is designed based on the scheduling length deceleration rate evaluation index:

On a parallel computing system composed of m computing nodes,

Indicates the task scheduling length estimate corresponding to DAG task priority sorting π at time step τ(τ∈[1,n]). Among them, Λ(λ _critical (π,τ)) is the critical path length of the DAG task determined according to the priority sorting π, and w is the sum of the computing load of all subtasks of the DAG. Makespan _low (DAG) represents the lower bound of the scheduling length of DAG tasks on a parallel computing system. The specific calculation formula is as follows:

Among them, λ _critical (DAG) represents the DAG task critical path length.

In MDP, it is assumed that the state transition matrix Π is deterministic, because for a given state and action, there is no randomness in the determination of the next state. This is because the scheduling action does not execute the task, it only affects the scheduling policy and changes the task. arrangement. Finally, let the discount factor Δ be a constant 1. According to formulas (4) and (10), use the policy gradient algorithm to maximize the expectation of cumulative rewards corresponding to the DAG task priority sorting π to define the objective function of the network model. J is:

Among them, π～p(τ,θ) indicates that the priority order of the specified DAG tasks is sampled from the learning strategy.

Step S12: Obtain the DAG task data set, and generate a corresponding information matrix for each DAG task in the DAG task data set.

In this embodiment, the DAG task data set used for model training is obtained, and then the information matrix of each DAG task in the DAG task data set is extracted, including the node feature matrix and the adjacency matrix. Specifically, in this embodiment, generating a corresponding information matrix for each DAG task in the DAG task data set may include: generating a node feature matrix based on the characteristics of each sub-task in the DAG task in the DAG task data set; node features The matrix is a feature matrix that represents the computing load of the subtask, the normalized computing load, and the in-degree and out-degree of the subtask; an adjacency matrix is generated based on the connection relationship between different subtasks in the DAG task data set; based on the node feature matrix and the adjacency matrix to obtain the information matrix corresponding to the DAG task.

In this embodiment, obtaining the DAG task data set may include: configuring DAG task parameters; the DAG task parameters include the number of task layers, the number of child nodes of the target node, the generation probability of child nodes of the target node, and the number of adjacent two nodes. The connection edge adding probability between each task layer and the computing load of each sub-task are generated; a DAG task is generated according to the DAG task parameters to obtain a DAG task data set. Due to the lack of publicly available large-scale DAG task data sets, in this embodiment, the DAG task is first generated, which can be obtained using a parallel task generation model based on the DAG task parameters. For example, the nested fork-join task model synthesizes a DAG task; this model is controlled by four parameters, namely n _depth , n _child , p _fork and p _pert . Among them, n _depth represents the number or depth of DAG task layers; n _child represents the number of child nodes of a certain node; p _fork represents the probability of generating a child node for a certain node; p _pert is between two adjacent nodes. The probability of randomly adding connecting edges. For each subtask node ti in the kth layer, its child node t _j and edge e _ij are generated based on probability p _fork ; the number of child nodes in the k+1th layer is determined by the uniform distribution n _child , that is Determine the number of child nodes under the target node through uniform distribution. This process starts from the entry subtask node and is repeated n _depth times, thereby creating a DAG task with n _depth levels. In addition, connecting edges are randomly added between the k-th layer and k+1-th layer nodes of the DAG task with probability p _pert . The larger the value of p _pert , the higher the degree of parallelism of the generated DAG task. Finally, add edges from the last layer node to the exit node and assign computational load to each subtask. Among them, the computing load of the subtask obeys the normal distribution with parameters μ (μ>0) and δ, where μ represents the average computing load of the subtask, and δ represents the standard deviation of the computing load of each subtask. Of course, it can also Assuming other distributions, it is sufficient to ensure that the computing load of each subtask is positive, and there is no limit here. For the constructed DAG task, extract the characteristics of each subtask node and construct the node feature matrix X; according to the interconnection relationship between nodes, construct the adjacency matrix A.

Step S13: Use the information matrix to train the network model, and use reinforcement learning to update the model parameters of the network model according to the objective function to obtain the trained DAG task scheduling model.

In this embodiment, after the network model is constructed, the model parameters are first initialized. According to a specific strategy such as normal distribution random initialization, Xavier initialization or He Initialization initialization, the parameters W of each layer of the directed graph neural network are initialized, and the sequential decoder model parameter p(θ) is initialized.

Then, use the information matrix corresponding to the above-mentioned DAG task data set to train the network model. After obtaining the above-mentioned DAG task data set, the above-mentioned DAG task data set is divided into a training set and a test set. Specifically, the cross-validation method and retention method can be used. Partitioning methods such as the leave-one-out method or the leave-one-out method are used, in which the test set is used to train the above-mentioned network model, and the test set is used to test the trained network model.

In this embodiment, the information matrix is used to train the network model, and reinforcement learning is used to update the model parameters of the network model according to the objective function, which may include:

S130: Input the information matrix into the network model, and use the directed graph neural network to output the vector representation of each subtask based on the characteristics of the subtask and the dependency relationship between the subtasks;

S131: Use the sequential decoder to prioritize the subtasks within the DAG task based on the attention mechanism and the context of the DAG task according to the vector representation of the subtask;

S132: Use the DAG task scheduling simulator to calculate the task scheduling length of the DAG task according to priority sorting;

S133: Based on the task scheduling length and objective function, use reinforcement learning to update the model parameters of the network model until the network model converges.

That is, the node feature matrix and adjacency matrix contained in the information matrix are used as the input of the network model, forward propagation is performed, the vector representation of all subtasks is obtained through the directed graph neural network, and the sequential decoder outputs the subtask priority ranking, which can Schedule the execution of subtasks in sequence through the DAG task simulation scheduler, and calculate the corresponding scheduling length, and then calculate the model objective function value according to formula (12); and according to a certain strategy such as stochastic gradient descent or Adam and other algorithms, back propagation corrects each layer Network parameter values. Therefore, the reinforcement learning algorithm is used to minimize the DAG task scheduling length, and the network model is continuously optimized by rewarding DAG task priority ranking with shorter scheduling length; thus, the resulting scheduling length is shorter and the parallel computing efficiency is higher. It can effectively avoid the difficulty of collecting enough supervision labels for optimal priority allocation of DAG tasks.

Specifically, the network model is trained by finding the gradient about θ for the objective function J defined by formula (11):

in,

is the gradient operator. The model gradient in Equation (12) can be estimated using the Monte Carlo stochastic gradient descent method:

Among them, B represents the set of subtasks of the DAG task randomly sampled from the data set. Use stochastic gradient descent method or Adam algorithm to optimize the objective function. When the objective function value no longer decreases or reaches the maximum number of iterations, the model training is terminated. The scheduling plan obtained at this time is the optimal scheduling plan. That is, the gradient of the objective function is estimated based on the Monte Carlo stochastic gradient descent method. Therefore, in this embodiment, deep reinforcement learning for the DAG task is implemented based on the directed graph neural network and the objective function. Among them, the DAG task scheduling length can be obtained by scheduling all subtasks in sequence through the DAG task scheduling simulator to be executed in parallel on the parallel computing system ARC, and recording the completion time of the exit task to obtain the DAG task scheduling length.

Step S14: Use the DAG task scheduling model to determine the scheduling order of subtasks within the DAG task to be executed, and use the parallel computing system to execute the DAG task to be executed according to the scheduling order.

In this embodiment, after the DAG task scheduling model is trained, the node feature matrix and adjacency matrix of the DAG task to be executed are input to the model, and the optimal DAG task scheduling sequence obtained by the model is output as the result, and according to the scheduling Sequentially utilize the parallel computing system to execute the DAG tasks to be executed. Therefore, for the non-preemptive scheduling problem of DAG tasks, this embodiment is based on deep reinforcement learning and directed graph neural network to prioritize tasks, thereby determining the scheduling order of tasks, reducing task execution time, and improving task execution efficiency. .

This embodiment also proposes a DAG task scheduling system based on deep reinforcement learning and directed graph neural network. As shown in Figure 2, the system consists of an input module, a directed graph neural network, a sequential decoder, a scheduling length calculation module and a model parameter update module. The input module is responsible for reading the node feature matrix The expression is decoded by the sequential decoder, and the output is the priority ordering of all subtasks; the scheduling length calculation module schedules them to be executed on the parallel computing system based on this ordering, and uses the scheduling length as a feedback signal to update the model using a reinforcement learning algorithm. parameter. Therefore, the above-mentioned DAG task scheduling system based on deep reinforcement learning and directed graph neural network takes the DAG task as input, generates an embedded representation for each sub-task of the DAG task through the directed graph neural network, and uses the sequential decoder to generate all sub-tasks. Prioritize tasks and calculate the task scheduling length or completion time corresponding to this ranking. The system aims to minimize the scheduling length of DAG tasks, and the calculated scheduling length is used as a reward signal to update the model through a reinforcement learning algorithm.

As can be seen from the above, in this embodiment, the network model is constructed in the order of directed graph neural network and sequential decoder, and the objective function of the network model is defined with the minimum task scheduling length as the goal; the DAG task data set is obtained, and the DAG task data is Each DAG task in the set generates a corresponding information matrix; use the information matrix to train the network model, and use reinforcement learning to update the model parameters of the network model according to the objective function to obtain the trained DAG task scheduling model; use the DAG task scheduling model Determine the scheduling order of subtasks within the DAG task to be executed, and use the parallel computing system to execute the DAG task to be executed according to the scheduling order. In this application, a DAG task scheduling model is obtained based on directed graph neural network and reinforcement learning. The directed graph neural network can automatically identify rich features related to subtasks within the DAG task, and the sequential decoder can use these features to prioritize subtasks. , At the same time, using the reinforcement learning optimization model to achieve the scheduling goal of minimizing the DAG task scheduling length can shorten the DAG task scheduling length, improve the parallel execution efficiency of DAG tasks, and use reinforcement learning to solve the difficulty of allocating optimal priorities for DAG tasks. Enough supervision labeling issues.

Correspondingly, the embodiment of the present application also discloses a DAG task scheduling device, as shown in Figure 5. The device includes:

The network building module 11 is used to build a network model in the order of a directed graph neural network and a sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal;

The data set acquisition module 12 is used to obtain the DAG task data set and generate a corresponding information matrix for each DAG task in the DAG task data set;

The training module 13 is used to train the network model using the information matrix, and update the model parameters of the network model using reinforcement learning according to the objective function to obtain the trained DAG task scheduling model;

The scheduling order determination module 14 is used to use the DAG task scheduling model to determine the scheduling order of subtasks within the DAG task to be executed, and to use the parallel computing system to execute the DAG task to be executed according to the scheduling order.

In some specific embodiments, the DAG task scheduling device may include:

In some specific embodiments, the network building module 11 may specifically include:

In some specific embodiments, the data set acquisition module 12 may include:

In some specific embodiments, the training module 13 may specifically include:

The vector representation determination unit is used to input the information matrix into the network model, and uses the directed graph neural network to output the vector representation of each subtask based on the characteristics of the subtask and the dependency relationship between the subtasks;

The priority ranking determination unit is used to use the sequential decoder to prioritize the subtasks within the DAG task based on the vector representation of the subtasks based on the attention mechanism and the context environment of the DAG task;

A task scheduling length determination unit, configured to use the DAG task scheduling simulator to calculate the task scheduling length of the DAG task according to priority ranking;

The model optimization unit is used to update the model parameters of the network model using reinforcement learning according to the task scheduling length and objective function until the network model converges.

Furthermore, the embodiment of the present application also discloses an electronic device, as shown in FIG. 6 . The content in the figure cannot be considered as any limitation on the scope of the present application.

FIG. 6 is a schematic structural diagram of an electronic device 20 provided by an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input-output interface 25 and a communication bus 26. The memory 22 is used to store computer programs, and the computer programs are loaded and executed by the processor 21 to implement relevant steps in the DAG task scheduling method disclosed in any of the foregoing embodiments.

In this embodiment, the power supply 23 is used to provide working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be applicable Any communication protocol of the technical solution of this application is not specifically limited here; the input and output interface 25 is used to obtain external input data or output data to the external world, and its specific interface type can be selected according to specific application needs. Here No specific limitation is made.

In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, a random access memory, a magnetic disk, an optical disk, etc. The resources stored thereon include an operating system 221, a computer program 222, and data 223 including DAG tasks. Storage The method can be temporary storage or permanent storage.

Among them, the operating system 221 is used to manage and control each hardware device and the computer program 222 on the electronic device 20 to realize the calculation and processing of the massive data 223 in the memory 22 by the processor 21. It can be Windows Server, Netware, Unix, Linux etc. In addition to computer programs that can be used to complete the DAG task scheduling method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs that can be used to complete other specific tasks.

Further, embodiments of the present application also disclose a non-volatile readable storage medium. Computer-executable instructions are stored in the non-volatile readable storage medium. When the computer-executable instructions are loaded and executed by the processor, the The steps of the DAG task scheduling method disclosed in any of the foregoing embodiments.

Each embodiment in this specification is described in a progressive manner. Each embodiment focuses on its differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other. As for the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple. For relevant details, please refer to the description in the method section.

The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein may be implemented directly in hardware, in software modules executed by a processor, or in a combination of both. Software modules may be located in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or anywhere in the field of technology. any other known form of storage media.

Finally, it should be noted that in this article, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply that these entities or any such actual relationship or sequence between operations. Furthermore, the terms "comprises," "comprises," or any other variations thereof are intended to cover a non-exclusive inclusion such that a process, method, article, or apparatus that includes a list of elements includes not only those elements, but also those not expressly listed other elements, or elements inherent to the process, method, article or equipment. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article, or apparatus that includes the stated element.

The above is a detailed introduction to a DAG task scheduling method, device, equipment and medium provided by this application. Specific examples are used in this article to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for assistance. Understand the methods and core ideas of this application; at the same time, for those of ordinary skill in the field, there will be changes in the specific implementation methods and application scope based on the ideas of this application. In summary, the content of this specification does not should be understood as a limitation on this application.

Claims

A DAG task scheduling method, characterized by including:

Construct a network model in the order of directed graph neural network and sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal;

Obtain a DAG task data set, and generate a corresponding information matrix for each DAG task in the DAG task data set;

Use the information matrix to train the network model, and use reinforcement learning to update the model parameters of the network model according to the objective function to obtain a trained DAG task scheduling model;

The DAG task scheduling model is used to determine the scheduling order of subtasks within the DAG task to be executed, and a parallel computing system is used to execute the DAG task to be executed according to the scheduling order.
The DAG task scheduling method according to claim 1, characterized in that before constructing the network model in the order of directed graph neural network and sequential decoder, it also includes:

Construct a graph convolution layer for DAG task feature learning based on aggregation functions and nonlinear activation functions;

The directed graph neural network is constructed in the order of input layer, K-layer graph convolution layer and output layer.
The DAG task scheduling method according to claim 1, characterized in that before constructing the network model in the order of directed graph neural network and sequential decoder, it also includes:

Using the priority allocation status of subtasks within the DAG task as a variable, define a vector expression of the context environment for the DAG task;

A sequential decoder for prioritization is built based on the attention mechanism and the vector representation of the context.
The DAG task scheduling method according to claim 1, wherein defining the objective function of the network model with the minimum task scheduling length as the target includes:

Using the task scheduling length corresponding to the priority sorting of DAG tasks at different time steps and the lower limit of task scheduling length as independent variables, the scheduling length deceleration rate evaluation index of the DAG task is generated; the lower limit of the task scheduling length is based on the path of the critical path of the DAG task The length is determined;

Construct a reward function based on the policy gradient algorithm and the scheduling length deceleration rate evaluation index;

An objective function of the network model is constructed based on the reward function.
The DAG task scheduling method according to claim 1, characterized in that said obtaining the DAG task data set includes:

Configure DAG task parameters; the DAG task parameters include the number of task layers, the number of child nodes of the target node, the probability of generating child nodes of the target node, the probability of adding a connecting edge between two adjacent task layers, and the probability of each child node. The computational load of the task;

Generate a DAG task according to the DAG task parameters to obtain the DAG task data set.
The DAG task scheduling method according to claim 1, characterized in that generating a corresponding information matrix for each DAG task in the DAG task data set includes:

Generate a node feature matrix according to the characteristics of each subtask in the DAG task in the DAG task data set;

Generate an adjacency matrix according to the connection relationship between different subtasks in the DAG task data set;

The information matrix corresponding to the DAG task is obtained based on the node feature matrix and the adjacency matrix.
The DAG task scheduling method according to any one of claims 1 to 6, characterized in that the network model is trained using the information matrix, and the network model is updated using reinforcement learning according to the objective function. model parameters, including:

Input the information matrix into the network model, and use the directed graph neural network to output a vector representation of each subtask based on the characteristics of the subtask and the dependency relationship between the subtasks;

Using the sequential decoder, prioritize the subtasks within the DAG task based on the attention mechanism and the context of the DAG task according to the vector representation of the subtask;

Use a DAG task scheduling simulator to calculate the task scheduling length of the DAG task according to the priority ranking;

According to the task scheduling length and the objective function, reinforcement learning is used to update the model parameters of the network model until the network model converges.
The DAG task scheduling method according to claim 1, characterized in that using the DAG task scheduling model to determine the scheduling order of subtasks within the DAG task to be executed includes:

The node feature matrix and adjacency matrix of the DAG task to be executed are input into the DAG task scheduling model to obtain the scheduling sequence.
The DAG task scheduling method according to claim 4, wherein the critical path is the complete path with the longest path length.
The DAG task scheduling method according to claim 7, wherein the using reinforcement learning to update the model parameters of the network model includes:

With the goal of minimizing the task scheduling length, a reinforcement learning algorithm is used, and the network model is continuously optimized by rewarding the priority sorting with a shorter scheduling length.
The DAG task scheduling method according to claim 1, wherein the directed graph neural network is used to identify task characteristics of subtasks within the DAG task and output an embedded representation corresponding to each subtask.
The DAG task scheduling method according to claim 11, wherein the task characteristics include execution time and dependencies.
The DAG task scheduling method according to claim 1, wherein the objective function is used to guide the learning of the network model, so that the network model outputs the minimum value of the DAG task according to the input DAG task. Task scheduling length.
The DAG task scheduling method according to claim 1, characterized in that the parallel computing system is described as a four-tuple ARC=(P, L, V, B), where: P={p i |i=1 ,2,….,m} is the set of processing nodes; L={l ij |p i ,p j ∈P} is the set of communication links between processing nodes; V={v i ∣i=1,2,… .,m} is the calculation speed set of the processing node, vi represents the calculation speed of p i , and satisfies v 1 ≤ v 2 ≤...≤v k ; B＝{b ij ∣l ij ∈L} is the communication link bandwidth The set of , b ij represents the bandwidth of communication link l ij .
The DAG task scheduling method according to claim 2, characterized in that the graph convolution operation of the k-layer graph convolution layer is implemented by an aggregate function and an update function, as follows:

Among them, the aggregate function aggregates messages from the immediate predecessors of subtask ti , the update function performs nonlinear transformation on the aggregated messages, and Pred(t i ) is the set of direct predecessor subtasks of ti .
The DAG task scheduling method according to claim 5, wherein generating a DAG task according to the DAG task parameters to obtain the DAG task data set includes:

Based on the DAG task parameters, a parallel task generation model is used to obtain the DAG task to obtain the DAG task data set.
The DAG task scheduling method according to claim 7, characterized in that, using the sequential decoder, according to the vector representation of the sub-task, the DAG task is evaluated based on an attention mechanism and the context environment of the DAG task. The subtasks within are prioritized including:

Using the sequential decoder, subtask nodes are selected based on the attention mechanism according to the vector representation of the subtask to generate a priority ranking π=[π 1 , π 2 ,..., π n ] of size n.
A DAG task scheduling device, characterized by including:

A network building module, used to build a network model in the order of a directed graph neural network and a sequential decoder, and define the objective function of the network model with the minimum task scheduling length as the goal;

A data set acquisition module, used to obtain a DAG task data set, and generate a corresponding information matrix for each DAG task in the DAG task data set;

A training module used to train the network model using the information matrix, and update the model parameters of the network model using reinforcement learning according to the objective function to obtain a trained DAG task scheduling model;

A scheduling sequence determination module is configured to use the DAG task scheduling model to determine the scheduling sequence of subtasks within the DAG task to be executed, and to use a parallel computing system to execute the DAG task to be executed according to the scheduling sequence.
An electronic device, characterized by including:

Memory, used to hold computer programs;

A processor, configured to execute the computer program to implement the DAG task scheduling method according to any one of claims 1 to 17.
A non-volatile readable storage medium, characterized in that it is used to store a computer program; wherein when the computer program is executed by a processor, the DAG task scheduling method according to any one of claims 1 to 17 is implemented.