CN113127169B - Efficient link scheduling method for dynamic workflow in data center network - Google Patents

Efficient link scheduling method for dynamic workflow in data center network

Info

Publication number
CN113127169B
CN113127169B (application CN202110373804.6A, also published as CN202110373804A)
Authority
CN
China
Prior art keywords
coflow
node
scheduling
tasks
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110373804.6A
Other languages
Chinese (zh)
Other versions
CN113127169A (en)
Inventor
Shen Hong (沈鸿)
Wang Xin (王鑫)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat-sen University
Original Assignee
Sun Yat-sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202110373804.6A priority Critical patent/CN113127169B/en
Publication of CN113127169A publication Critical patent/CN113127169A/en
Application granted granted Critical
Publication of CN113127169B publication Critical patent/CN113127169B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00: Indexing scheme relating to G06F 9/00
    • G06F 2209/48: Indexing scheme relating to G06F 9/48
    • G06F 2209/484: Precedence
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides an efficient link scheduling method for dynamic workflows in a data center network, comprising the following steps. S1: use a directed acyclic graph neural network to process the n arriving coflows simultaneously. S2: form J incomplete job DAGs, input them as one large disconnected directed acyclic graph into the directed acyclic graph neural network, and output an embedding vector for each node. S3: take the n embedding vectors obtained in S2 as the input of the policy network in deep reinforcement learning, obtain a score for each node, and compute each node's weighted score value. S4: from the current partial DAGs of the different jobs, find all nodes whose current in-degree is 0, compute each node's probability from its weighted score value via a softmax operation, and order the nodes by probability to obtain a coflow scheduling priority list. S5: schedule tasks by priority according to the coflow scheduling priority list.

Description

Efficient link scheduling method for dynamic workflow in data center network
Technical Field
The present invention relates to the field of high performance computing, and more particularly, to a method for efficient link scheduling of dynamic workflows in a data center network.
Background
Modern parallel computing platforms (e.g., Hadoop, Spark, Dryad) support processing large data sets in data centers. A job typically consists of multiple computation and communication stages. The computation stages involve local operations on the servers, while the communication stages transmit data between servers across the data center network to initiate the next computation stage. These intermediate communication stages have a large influence on application latency. Coflow is an abstraction proposed to model such communication patterns: it represents a set of intermediate parallel data flows transmitted between servers to start the next stage.
For jobs with a single communication stage, minimizing the average coflow completion time can improve job latency. For multi-stage jobs, however, minimizing the average coflow completion time may not be the right metric and may even degrade performance, because it ignores the dependencies between the coflows of a job: Starts-After and Finishes-Before. In a multi-stage job, each job consists of multiple coflows and is typically represented by a DAG (directed acyclic graph) that captures the Starts-After dependencies between them.
Although coflow scheduling for single-stage jobs has been studied extensively, coflow scheduling for multi-stage jobs and the dependencies between coflows have been largely ignored. Coflow scheduling for multi-stage jobs has been proven NP-hard. The problem involves many inherent difficulties, including how to process different job DAGs, how to effectively extract the feature information of a job DAG (node information, edge information, dependency relationships, etc.), and the fact that different jobs contain different numbers of coflows and a single coflow contains different numbers of parallel flows. The prior art mainly has the following limitation:
Existing work, including heuristic and approximation algorithms, focuses on coflow scheduling for single-stage jobs. For the multi-stage coflow scheduling problem, the authors of Aalo only briefly discuss a straightforward heuristic to reduce the completion time of multi-stage jobs. Such manually tuned heuristics simplify the problem by introducing relaxations and only guarantee a rough approximation to the optimal solution of this NP-hard problem.
We therefore ask whether an adaptive scheduling model can be built without manual guidance, one that dynamically schedules the dependent coflows of different jobs by interacting directly with the environment, so that the sum of weighted job completion times is minimized and overall efficiency is improved. The weights capture the different priorities of different jobs; in the special case where all weights are equal, the problem reduces to minimizing the average job completion time.
Disclosure of Invention
The present invention provides an efficient link scheduling method for dynamic workflows in a data center network, which dynamically schedules the dependent coflows of different jobs without extensive manual tuning, so that the sum of weighted job completion times is minimized.
To solve the above technical problem, the technical scheme of the invention is as follows. An efficient link scheduling method for dynamic workflows in a data center network comprises the following steps:
S1: use a directed acyclic graph neural network to process the n arriving coflows simultaneously, a job consisting of multiple dependent tasks and being represented by a DAG;
S2: form J incomplete job DAGs, input them as one large disconnected directed acyclic graph into the directed acyclic graph neural network, and output an embedding vector for each node;
S3: take the n embedding vectors obtained in S2 as the input of the policy network in deep reinforcement learning, obtain a score for each node, and compute each node's weighted score value;
S4: from the current partial DAGs of the different jobs, find all nodes whose current in-degree is 0, compute each node's probability from its weighted score value via a softmax operation, and order the nodes by probability to obtain a coflow scheduling priority list; nodes whose current in-degree is not 0 are temporarily stored in a coflow waiting list;
S5: schedule tasks by priority according to the coflow scheduling priority list, updating the coflow scheduling priority list and the coflow waiting list each time a coflow is scheduled, until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions.
Further, the directed acyclic graph neural network computes the global information and node features of a job DAG with the following update formulas:

$$h_v^{\ell} = F^{\ell}\!\left(h_v^{\ell-1},\; G^{\ell}\!\left(\{h_u^{\ell} : u \in P(v)\},\, h_v^{\ell-1}\right)\right)$$

$$h_g = R\!\left(\{h_v^{L} : v \in T\}\right)$$

where $h_v^{\ell}$ denotes the representation of node $v$ at layer $\ell$, and $h_g$ denotes the representation of the entire DAG; $P(v)$ denotes the set of direct predecessors of node $v$; $T$ denotes the set of nodes without direct successors; and $G^{\ell}$, $F^{\ell}$ and $R$ are parameterized neural networks.
Still further, in step S2 the J incomplete job DAGs are constructed as follows:
construct a waiting queue W of capacity m and a pending queue D of capacity n, where m > n; arriving tasks are queued in W until their number reaches n or more, whereupon the first n tasks are taken out and placed in the pending queue D; because these n tasks come from different jobs and their arrival order respects the dependencies within each job, they form J incomplete job DAGs.
Further, the weighted score value of each node is obtained by multiplying the node's score by the task weight of the job it belongs to; each job corresponds to one task weight, and all tasks in the same job share the same weight.
Still further, step S5 is specifically:
S5.1: schedule tasks by priority according to the coflow scheduling priority list; each time a coflow is scheduled, remove the corresponding node and its edges from the DAG, update the set of nodes with in-degree 0, and accordingly update the coflow scheduling priority list and the coflow waiting list;
S5.2: repeat this operation until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions.
And further, in step S5, after the coflow scheduling priority list has been fully executed, the pending queue D is updated, i.e., the first n tasks are again selected from the waiting queue W and placed in the pending queue D, and execution returns to step S2.
Still further, in step S5 the quality of the actions is evaluated by computing the sum of weighted job completion times as the reward, and the agent is optimized through this reward so that the sum of weighted job completion times is minimized.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
1. Although a GNN could extract the features of a job DAG, the invention instead uses a directed acyclic graph neural network, which fully exploits the special structure of the DAG and can produce a more favorable vector representation of the graph. Moreover, the directed acyclic graph neural network can process nodes directly in order and output embedding vectors that obey the dependency order.
2. Processing a complete set of job DAGs at once would lead to a large action space, long training times, and a policy-network input of variable size. The invention assumes that the coflows of each job arrive in an order consistent with that job's dependencies, and only the n tasks in the pending queue D need to be processed at a time, so training is faster, the action space is small, and the policy-network input size is fixed.
3. Considering the differing importance of jobs, the optimization objective is the sum of weighted job completion times. To match this objective, the invention incorporates the weights on top of the scores output by the policy network, so the input to softmax is the weighted score value, which makes it easier for the reinforcement learning agent to learn the optimal actions.
Drawings
Fig. 1 is a schematic diagram of a dynamic Coflow scheduling method according to this embodiment.
Fig. 2 is a multi-stage job DAG provided in this embodiment, in which nodes represent computation stages and edges represent communication stages between nodes.
Fig. 3 is a schematic diagram of the directed acyclic graph neural network according to the embodiment for processing different job DAGs.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, which are only for illustration and not to be construed as limitations of the present patent. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, an efficient link scheduling method for dynamic workflows in a data center network includes the following steps:
S1: use a directed acyclic graph neural network to process the n arriving coflows simultaneously, a job consisting of multiple dependent tasks and being represented by a DAG;
S2: form J incomplete job DAGs, input them as one large disconnected directed acyclic graph into the directed acyclic graph neural network, and output an embedding vector for each node;
S3: take the n embedding vectors obtained in S2 as the input of the policy network in deep reinforcement learning, obtain a score for each node, and compute each node's weighted score value;
S4: from the current partial DAGs of the different jobs, find all nodes whose current in-degree is 0, compute each node's probability from its weighted score value via a softmax operation, and order the nodes by probability to obtain a coflow scheduling priority list; nodes whose current in-degree is not 0 are temporarily stored in a coflow waiting list;
S5: schedule tasks by priority according to the coflow scheduling priority list, updating the coflow scheduling priority list and the coflow waiting list each time a coflow is scheduled, until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions.
This embodiment does not employ an ordinary graph neural network (GNN). The most common GNN architectures aggregate information from neighbors via message passing; they handle undirected graphs effectively, but for a job DAG containing dependencies they may fail to extract the intrinsic features our network needs (the dependency, or partial, order). Therefore, to obtain higher predictive power, a directed acyclic graph neural network (DAGNN) is used to integrate this information into the representation.
In a specific embodiment, the directed acyclic graph neural network computes the global information and node features of a job DAG with the following update formulas:

$$h_v^{\ell} = F^{\ell}\!\left(h_v^{\ell-1},\; G^{\ell}\!\left(\{h_u^{\ell} : u \in P(v)\},\, h_v^{\ell-1}\right)\right)$$

$$h_g = R\!\left(\{h_v^{L} : v \in T\}\right)$$

where $h_v^{\ell}$ denotes the representation of node $v$ at layer $\ell$, and $h_g$ denotes the representation of the entire DAG; $P(v)$ denotes the set of direct predecessors of node $v$; $T$ denotes the set of nodes without direct successors; and $G^{\ell}$, $F^{\ell}$ and $R$ are parameterized neural networks.
It can be seen that the directed acyclic graph neural network uses the information of the current layer, i.e., it always updates a node's representation with the latest information: because it aggregates only the predecessors of the current node and needs no successors, the same-layer representations $h_u^{\ell}$ are already available when the current node is updated. These properties are unique to DAGs, and exploiting this structure appropriately yields a more favorable vector representation of the graph. The main idea of the directed acyclic graph neural network is to process nodes according to the partial order defined by the DAG.
In contrast to previous GNNs on undirected graphs: (1) the directed acyclic graph neural network updates the representation of node v with information from the current layer rather than the previous layer; (2) it aggregates only the predecessor nodes of node v, not all of its neighbors.
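The predecessor-only, same-layer update can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: sum-pooling and a `tanh` update stand in for the parameterized networks G^l and F^l, and all names are illustrative.

```python
import numpy as np

def dagnn_layer(order, preds, h_prev, agg_w, upd_w):
    """One DAGNN-style layer: aggregate same-layer predecessor states, then update.

    order  -- nodes in topological order
    preds  -- dict: node -> list of direct predecessors P(v)
    h_prev -- dict: node -> representation from layer l-1
    agg_w, upd_w -- toy weight matrices standing in for G^l and F^l
    """
    h_cur = {}
    for v in order:
        # G^l: aggregate current-layer predecessor info (sum-pooling here)
        if preds[v]:
            m = sum(h_cur[u] for u in preds[v]) @ agg_w
        else:
            m = np.zeros_like(h_prev[v])
        # F^l: combine the message with v's previous-layer state
        h_cur[v] = np.tanh(h_prev[v] @ upd_w + m)
    return h_cur
```

Because nodes are visited in topological order, `h_cur[u]` already exists for every predecessor `u` of `v`, which is exactly the same-layer aggregation described above.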
In a specific embodiment, for step S2, the common prior-art practice for a single job DAG is to input its topology together with the flow information of each coflow into a GNN, obtaining an embedding vector for each node. This embodiment, however, handles multiple different job DAGs, where the arrival times of jobs and coflows are random. It is assumed that within each job, coflows arrive in an order consistent with that job's dependencies. A job set comprises J jobs, and each job consists of multiple dependent tasks (coflows). A waiting queue W of capacity m and a pending queue D of capacity n are constructed, where m is much greater than n. Arriving tasks are queued in W; whenever the number of queued tasks reaches n or more, the first n tasks are moved into D. These n tasks (coflows) come from different jobs, but their arrival order respects each job's dependencies, so they form J incomplete job DAGs. The J incomplete job DAGs, viewed together as one large disconnected directed acyclic graph, are then fed into the DAGNN, which outputs an embedding vector for each node.
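The queue mechanism just described can be sketched as follows. This is an assumed minimal model: the task-tuple format, helper name, and grouping by job are illustrative, not from the patent.

```python
from collections import deque

def fill_pending_queue(wait_q, n):
    """Move the first n tasks from waiting queue W into pending queue D.

    Each task is a (job_id, coflow_id, parents) tuple; because the tasks
    of a job arrive in an order consistent with its dependencies, the n
    tasks drawn here form J partial (incomplete) job DAGs.
    """
    if len(wait_q) < n:
        return None  # not enough arrivals yet
    pending = [wait_q.popleft() for _ in range(n)]
    # group by job to expose the J partial DAGs
    partial_dags = {}
    for job_id, coflow_id, parents in pending:
        partial_dags.setdefault(job_id, []).append((coflow_id, parents))
    return pending, partial_dags
```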
In a specific embodiment, the n embedding vectors obtained in step S2 and output by the DAGNN are used as the input of the policy network (Policy-Network) in deep reinforcement learning, and the score of each node is obtained through training by the deep reinforcement learning agent. In short, the policy network maps each embedding vector to a scalar score.
Since each job corresponds to one task weight and all tasks in the same job share that weight, multiplying each node's score by its job's weight yields the node's weighted score value.
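Steps S3 and S4 can then be sketched as a small function that turns policy-network scores into the two lists. The dictionary-based interface is an illustrative assumption:

```python
import numpy as np

def priority_list(scores, weights, in_degree):
    """Turn per-node scores into a coflow scheduling priority list.

    scores    -- dict: node -> scalar score from the policy network
    weights   -- dict: node -> weight of the job the node belongs to
    in_degree -- dict: node -> current in-degree in its partial DAG
    Returns (priority_list, waiting_list): nodes with in-degree 0 sorted
    by the softmax probability of their weighted score, plus the
    not-yet-ready nodes.
    """
    ready = [v for v in scores if in_degree[v] == 0]
    waiting = [v for v in scores if in_degree[v] != 0]
    if not ready:
        return [], waiting
    ws = np.array([weights[v] * scores[v] for v in ready])
    probs = np.exp(ws - ws.max())      # numerically stable softmax
    probs /= probs.sum()
    order = sorted(zip(ready, probs), key=lambda p: -p[1])
    return [v for v, _ in order], waiting
```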
In a specific embodiment, step S5 is specifically:
S5.1: schedule tasks by priority according to the coflow scheduling priority list; each time a coflow is scheduled, remove the corresponding node and its edges from the DAG, update the set of nodes with in-degree 0, and accordingly update the coflow scheduling priority list and the coflow waiting list;
S5.2: repeat this operation until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions.
In a specific embodiment, tasks (coflows) are scheduled by priority according to the coflow scheduling priority list, and the priority list and the coflow waiting list are updated each time a coflow is scheduled, until all n coflows have been scheduled; the environment then feeds back a reward evaluating the quality of the actions. The n coflows are split between the two lists according to whether their in-degree is 0; each time a coflow is processed it is deleted from the job DAG, which may reduce the in-degree of some previously blocked nodes to 0, so both the coflow scheduling priority list and the coflow waiting list must be updated.
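A minimal sketch of this update step, with illustrative data structures (re-invoking the policy network after each removal is omitted for brevity):

```python
def schedule_round(dag_edges, in_degree, priority, waiting):
    """Schedule the head of the priority list and update both lists.

    dag_edges -- dict: node -> list of direct successors
    in_degree -- dict: node -> current in-degree (mutated in place)
    After removing the scheduled coflow's node and out-edges, successors
    whose in-degree drops to 0 migrate from the waiting list to the
    priority list.
    """
    if not priority:
        return None
    c = priority.pop(0)  # highest-priority coflow is scheduled
    for succ in dag_edges.get(c, []):
        in_degree[succ] -= 1
        if in_degree[succ] == 0 and succ in waiting:
            waiting.remove(succ)
            priority.append(succ)
    return c
```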
This embodiment abstracts the data center network as one huge non-blocking switch, so tasks compete for bandwidth only at the ports. Each coflow is converted into a demand matrix, each entry of which gives the size of the flow to be transmitted from an ingress port to an egress port. Each partial job DAG can then be treated as a complete DAG.
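Under the non-blocking-switch abstraction, building a coflow's demand matrix is straightforward; the following is a hedged sketch with an assumed flow-triple format:

```python
import numpy as np

def coflow_demand_matrix(flows, num_ports):
    """Build a coflow's demand matrix for the non-blocking-switch model.

    flows -- list of (ingress_port, egress_port, size) triples
    Entry [i, j] is the total size to be sent from ingress i to egress j.
    """
    demand = np.zeros((num_ports, num_ports))
    for src, dst, size in flows:
        demand[src, dst] += size
    return demand
```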
The reward is therefore based on the sum of weighted job completion times of the current schedule. The reward evaluates the quality of an action and guides the agent in the desired direction; by continuously interacting with the environment, the agent eventually learns the optimal policy. The agent performs an action on the environment (i.e., produces a priority list), changing the environment's state, and the environment feeds back a reward evaluating that action; the objective is to maximize the cumulative reward. At the start of training the selected actions are naturally poor, but through environmental feedback the agent gradually learns a good policy. After training, the agent executes a near-optimal policy whenever J job DAGs arrive, so that the sum of weighted job completion times is minimized.
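The reward signal described above reduces to a one-liner. Taking the negative of the weighted sum is an assumption about the sign convention, since the agent maximizes reward while completion time is minimized:

```python
def weighted_reward(completion_times, job_weights):
    """Negative sum of weighted job completion times, used as the reward.

    completion_times -- dict: job_id -> time at which the job's last
                        coflow finished
    job_weights      -- dict: job_id -> weight of that job
    With all weights equal, maximizing this reward reduces to minimizing
    the average job completion time.
    """
    return -sum(job_weights[j] * t for j, t in completion_times.items())
```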
In this embodiment, after the coflow scheduling priority list has been fully executed, the pending queue D is updated (a state transition): the first n tasks are again selected from the waiting queue W and placed in D, and execution returns to step S2.
This embodiment optimizes the sum of weighted job completion times; different jobs have different importance and are assigned different weights. On top of the per-node scores output by the policy network, the influence of the weights is taken into account, so the input to softmax is the weighted score value, consistent with the optimization objective of minimizing the sum of weighted job completion times.
It should be understood that the above examples of the present invention are provided by way of illustration only and do not limit the embodiments of the present invention. Other variations or modifications will be apparent to those of ordinary skill in the art in light of the above description; it is neither necessary nor possible to exhaust all embodiments here. Any modification, equivalent replacement, improvement, etc. within the spirit and principles of the invention is intended to be protected by the following claims.

Claims (6)

1. An efficient link scheduling method for dynamic workflows in a data center network, characterized by comprising the following steps:
S1: use a directed acyclic graph neural network to process the n arriving coflows simultaneously, a job consisting of multiple dependent tasks and being represented by a DAG;
S2: form J incomplete job DAGs, input them as one large disconnected directed acyclic graph into the directed acyclic graph neural network, and output an embedding vector for each node;
S3: take the n embedding vectors obtained in S2 as the input of the policy network in deep reinforcement learning, obtain a score for each node, and compute each node's weighted score value;
S4: from the current partial DAGs of the different jobs, find all nodes whose current in-degree is 0, compute each node's probability from its weighted score value via a softmax operation, and order the nodes by probability to obtain a coflow scheduling priority list; nodes whose current in-degree is not 0 are temporarily stored in a coflow waiting list;
S5: schedule tasks by priority according to the coflow scheduling priority list; update the coflow scheduling priority list and the coflow waiting list each time a coflow is scheduled, until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions;
wherein in step S2 the J incomplete job DAGs are constructed as follows:
construct a waiting queue W of capacity m and a pending queue D of capacity n, where m > n; arriving tasks are queued in W until their number reaches n or more, whereupon the first n tasks are taken out and placed in the pending queue D; because these n tasks come from different jobs and their arrival order respects the dependencies within each job, they form J incomplete job DAGs.
2. The efficient link scheduling method for dynamic workflows in a data center network according to claim 1, wherein the directed acyclic graph neural network computes the global information and node features of the job DAG with the following formulas:

$$h_v^{\ell} = F^{\ell}\!\left(h_v^{\ell-1},\; G^{\ell}\!\left(\{h_u^{\ell} : u \in P(v)\},\, h_v^{\ell-1}\right)\right)$$

$$h_g = R\!\left(\{h_v^{L} : v \in T\}\right)$$

where $h_v^{\ell}$ is the representation of node $v$ at layer $\ell$, and $h_g$ is the representation of the entire DAG; $P(v)$ is the set of direct predecessors of node $v$; $T$ is the set of nodes without direct successors; and $G^{\ell}$, $F^{\ell}$ and $R$ are parameterized neural networks.
3. The efficient link scheduling method for dynamic workflows in a data center network according to claim 2, wherein: the weighted score value of each node is obtained by multiplying the node's score by the task weight of the job it belongs to; each job corresponds to one task weight, and all tasks in the same job share the same weight.
4. The efficient link scheduling method for dynamic workflows in a data center network according to claim 3, wherein step S5 is specifically:
S5.1: schedule tasks by priority according to the coflow scheduling priority list; each time a coflow is scheduled, remove the corresponding node and its edges from the DAG, update the set of nodes with in-degree 0, and accordingly update the coflow scheduling priority list and the coflow waiting list;
S5.2: repeat this operation until all n coflows have been scheduled, whereupon the environment feeds back a reward evaluating the quality of the actions.
5. The efficient link scheduling method for dynamic workflows in a data center network according to claim 4, wherein: in step S5, after the coflow scheduling priority list has been fully executed, the pending queue D is updated, i.e., the first n tasks are again selected from the waiting queue W and placed in the pending queue D, and execution returns to step S2.
6. The efficient link scheduling method for dynamic workflows in a data center network according to claim 5, wherein: in step S5, the quality of the actions is evaluated by computing the sum of weighted job completion times as the reward, and the agent is optimized through this reward so that the sum of weighted job completion times is minimized.
CN202110373804.6A 2021-04-07 2021-04-07 Efficient link scheduling method for dynamic workflow in data center network Active CN113127169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110373804.6A CN113127169B (en) 2021-04-07 2021-04-07 Efficient link scheduling method for dynamic workflow in data center network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110373804.6A CN113127169B (en) 2021-04-07 2021-04-07 Efficient link scheduling method for dynamic workflow in data center network

Publications (2)

Publication Number Publication Date
CN113127169A CN113127169A (en) 2021-07-16
CN113127169B true CN113127169B (en) 2023-05-02

Family

ID=76775168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110373804.6A Active CN113127169B (en) 2021-04-07 2021-04-07 Efficient link scheduling method for dynamic workflow in data center network

Country Status (1)

Country Link
CN (1) CN113127169B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113518012B (en) * 2021-09-10 2021-12-10 之江实验室 Distributed cooperative flow simulation environment construction method and system
CN114691342B (en) * 2022-05-31 2022-09-20 蓝象智联(杭州)科技有限公司 Method and device for realizing priority scheduling of federated learning algorithm component and storage medium
CN114756358B (en) * 2022-06-15 2022-11-04 苏州浪潮智能科技有限公司 DAG task scheduling method, device, equipment and storage medium
CN116996443B (en) * 2023-09-25 2024-01-23 之江实验室 Network collaborative traffic scheduling method and system combining GNN and SAC models

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101267452A (en) * 2008-02-27 2008-09-17 华为技术有限公司 A conversion method and application server for WEB service mixing scheme
CN111756653A (en) * 2020-06-04 2020-10-09 北京理工大学 Multi-coflow scheduling method based on deep reinforcement learning of graph neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105227488B (en) * 2015-08-25 2018-05-08 上海交通大学 A kind of network flow group scheduling method for distributed computer platforms
CN111131080B (en) * 2019-12-26 2021-09-07 电子科技大学 Distributed deep learning flow scheduling method, system and equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101267452A (en) * 2008-02-27 2008-09-17 华为技术有限公司 A conversion method and application server for WEB service mixing scheme
CN111756653A (en) * 2020-06-04 2020-10-09 北京理工大学 Multi-coflow scheduling method based on deep reinforcement learning of graph neural network

Also Published As

Publication number Publication date
CN113127169A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
CN113127169B (en) Efficient link scheduling method for dynamic workflow in data center network
CN109039942B (en) Network load balancing system and balancing method based on deep reinforcement learning
CN109753751B (en) MEC random task migration method based on machine learning
CN111756653B (en) Multi-coflow scheduling method based on deep reinforcement learning of graph neural network
Sun et al. Deepweave: Accelerating job completion time with deep reinforcement learning-based coflow scheduling
CN113098714B (en) Low-delay network slicing method based on reinforcement learning
CN111325356A (en) Neural network search distributed training system and training method based on evolutionary computation
CN114697229B (en) Construction method and application of distributed routing planning model
CN108111335A (en) A kind of method and system dispatched and link virtual network function
CN110990140B (en) Method for scheduling distributed machine learning flow in photoelectric switching network
CN111917642B (en) SDN intelligent routing data transmission method for distributed deep reinforcement learning
CN114707575A (en) SDN multi-controller deployment method based on AP clustering
CN114710439B (en) Network energy consumption and throughput joint optimization routing method based on deep reinforcement learning
CN116112488A (en) Fine-grained task unloading and resource allocation method for MEC network
Zhou et al. Multi-task deep learning based dynamic service function chains routing in SDN/NFV-enabled networks
CN117670005A (en) Super-computing internet multi-objective workflow optimization method and system based on ant colony algorithm
CN115150335B (en) Optimal flow segmentation method and system based on deep reinforcement learning
Wei et al. Drl-deploy: adaptive service function chains deployment with deep reinforcement learning
CN115756646A (en) Industrial internet-based edge computing task unloading optimization method
Wang et al. Sa-ddqn: Self-attention mechanism based ddqn for sfc deployment in nfv/mec-enabled networks
CN113891401A (en) Heterogeneous network slice scheduling method based on deep reinforcement learning
CN114915665B (en) Heterogeneous task scheduling method based on hierarchical strategy
CN117527590B (en) Method, system and medium for micro-service deployment and request routing based on edge network
CN116909717B (en) Task scheduling method
Guo et al. Multidimensional Resource and Load Collaborative Scheduling Algorithm Based on Reinforcement Learning for Cloud Data Centers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant