CN113127169A - Efficient link scheduling method for dynamic workflow in data center network - Google Patents
- Publication number
- CN113127169A CN113127169A CN202110373804.6A CN202110373804A CN113127169A CN 113127169 A CN113127169 A CN 113127169A CN 202110373804 A CN202110373804 A CN 202110373804A CN 113127169 A CN113127169 A CN 113127169A
- Authority
- CN
- China
- Prior art keywords
- node
- coflow
- scheduling
- nodes
- tasks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/484—Precedence
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides an efficient link scheduling method for dynamic workflows in a data center network, which comprises the following steps. S1: process the n arriving coflows simultaneously with a directed acyclic graph neural network. S2: form J incomplete job DAGs, input them into the directed acyclic graph neural network as one large, unconnected directed acyclic graph, and output an embedding vector for each node. S3: take the n embedding vectors obtained in step S2 as the input of the policy network in deep reinforcement learning, obtain a score for each node, and compute each node's weighted score value. S4: based on the current partial DAGs of the different jobs, find all nodes whose current in-degree is 0, compute each node's probability from its weighted score value via a softmax operation, and arrange the nodes by probability into a coflow scheduling priority list. S5: perform priority scheduling of tasks based on the coflow scheduling priority list.
Description
Technical Field
The invention relates to the field of high-performance computing, and in particular to an efficient link scheduling method for dynamic workflows in a data center network.
Background
Modern parallel computing platforms (e.g., Hadoop, Spark, Dryad) support processing large data sets in data centers. A job typically consists of multiple computation and communication phases. The computation phases involve local operations on the servers, while the communication phases transfer data between servers across the data center network to start the next computation phase. These intermediate communication phases have a large impact on application latency. Coflow is an abstraction proposed to model such a communication pattern: it represents a set of intermediate parallel data flows transmitted between servers to start the next phase.
Minimizing the average coflow completion time can improve the latency of jobs that have a single communication phase. For multi-stage jobs, however, minimizing the average coflow completion time may not be the right metric, and may even lead to worse performance, because it ignores the dependencies between the coflows within a job: Starts-After and Finishes-Before. In a multi-stage setting, each job consists of multiple coflows and is typically represented by a DAG (directed acyclic graph) that captures the (Starts-After) dependencies between the coflows.
Although coflow scheduling for single-stage jobs has been studied extensively, coflow scheduling for multi-stage jobs and the dependencies between coflows have largely been ignored. The coflow scheduling problem for multi-stage jobs has been proven NP-hard. Its difficulty stems from many processing challenges and inherent factors, including how to handle different job DAGs, how to effectively extract the feature information of a job DAG (node information, edge information, dependency relations, and so on), and factors such as different numbers of coflows in different jobs and different numbers of parallel flows within a single coflow. The prior art mainly has the following limitations:
Existing work, including heuristic and approximation algorithms, concentrates on coflow scheduling for single-phase jobs. For the coflow scheduling problem of multi-phase jobs, the authors of Aalo briefly discuss a straightforward heuristic to reduce the completion time of multi-phase jobs. Such manually tuned heuristic solutions simplify the problem by relaxing some conditions and guarantee only a rough approximation of the optimal solution to the NP-hard problem.
Therefore, an adaptive scheduling model requiring no manual guidance can be constructed, which dynamically schedules dependent coflows from different jobs by interacting directly with the environment, optimizes the sum of weighted job completion times, and improves operating efficiency. The weights can capture the different priorities of different jobs; in the special case where all weights are equal, the problem is equivalent to minimizing the average job completion time.
Disclosure of Invention
To overcome the defects and shortcomings of the prior art, the invention provides an efficient link scheduling method for dynamic workflows in a data center network, which can dynamically schedule dependent coflows from different jobs without relying on extensive manual tuning, so that the sum of weighted job completion times is minimized.
In order to solve the above technical problems, the technical solution of the invention is as follows. An efficient link scheduling method for dynamic workflows in a data center network comprises the following steps:
s1: processing n arriving coflows simultaneously by using a directed acyclic graph neural network, forming a job by a plurality of tasks of the dependency relationship, and expressing by adopting DAG;
s2: forming J incomplete operation DAGs, inputting the J incomplete operation DAGs into a directed acyclic graph neural network as a large unconnected directed acyclic graph, and outputting to obtain an embedding vector of each node;
s3: obtaining n embedding vectors through step S2, taking the n embedding vectors as the input of a strategy network in the deep reinforcement learning, obtaining the score of each node, and calculating to obtain a weighted score value of each node;
s4: according to partial DAG graphs of different jobs at present, finding out all nodes with the income degree of 0 at present, calculating the probability of each node based on the weighted score value through a softmax operation, and obtaining a coflow scheduling priority list according to the probability arrangement of the nodes; temporarily storing the nodes with the current income degree not being 0 in a flow waiting list;
s5: and performing a priority scheduling task based on the coflow scheduling priority list, updating the coflow scheduling priority list and the coflow waiting list after scheduling one coflow, and feeding back the performance of the reward evaluation action by the environment until n coflow schedules are completed.
Further, the directed acyclic graph neural network computes the global information and node features of the job DAG with the following update formulas:

h_v^l = F^l( h_v^(l-1), G^l( { h_u^l : u ∈ P(v) }, h_v^(l-1) ) ),    h_G = R( { h_v^L : v ∈ T } )

where h_v^l is the representation of node v at layer l, and h_G is the representation of the entire DAG graph; P(v) is the set of direct predecessor nodes of node v; T is the set of nodes without a direct successor; and G^l, F^l and R are all parameterized neural networks.
Still further, in step S2, the J incomplete job DAGs are constructed as follows:
Construct a waiting queue W with capacity m and a pending queue D with capacity n, where m > n. As tasks arrive they are arranged in the waiting queue W in order; when the number of tasks is greater than or equal to n, the first n tasks are taken out and placed in the pending queue D. Because the n tasks come from different jobs and their arrival order satisfies the respective dependency relations, they form J incomplete job DAGs.
Furthermore, the weighted score value of each node is obtained by multiplying each node's score by the task weight of its corresponding job; each job corresponds to one task weight, and the task weights within the same job are identical.
Still further, in step S5, specifically:
s1: performing a priority scheduling task based on the coflow scheduling priority list; after each coflow is scheduled, removing corresponding nodes and edges in the DAG, updating a node set with the degree of entry of 0, and further updating a coflow scheduling priority list and a coflow waiting list;
s2: and continuously repeating the operation until the n flow schedules are finished, and feeding back the rewarded evaluation action to the environment.
Still further, in step S5, after the coflow scheduling priority list has been executed, the pending queue D is updated, that is, the first n tasks are again selected from the waiting queue W and placed in the pending queue D, and the process returns to step S2 to continue.
Still further, in step S5, the quality of an action is evaluated by computing the sum of weighted job completion times as the reward, and the agent is optimized through the reward so that the sum of weighted job completion times is minimized.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
1. While a GNN could extract the features of a job DAG, the invention instead uses a directed acyclic graph neural network, which fully exploits the special structure of a DAG and can produce a more favorable vector representation of the graph. Moreover, the directed acyclic graph neural network can process the nodes directly in order and output embedding vectors that obey the dependency order.
2. Processing a complete set of job DAGs at once would make the action space too large, require a long time for a single training run, and leave the input size of the policy network unfixed. The invention assumes that the coflow arrival order within each job follows its dependency relations, so only the n tasks in the pending queue D need to be processed at a time; training is therefore faster, the action space is small, and the input size of the policy network is fixed.
3. Considering the different importance of jobs, the optimization goal is the weighted sum of job completion times. To optimize this objective, the invention takes the weights into account on top of the policy network's output scores, so the input to softmax is also a weighted score value, which makes it easier for the reinforcement learning agent to learn the optimal action.
Drawings
Fig. 1 is a schematic model diagram of the dynamic coflow scheduling method according to this embodiment.
Fig. 2 is a multi-stage job DAG provided by this embodiment, in which nodes represent computation stages and edges represent the communication stages between nodes.
Fig. 3 is a schematic diagram of the directed acyclic graph neural network provided in this embodiment processing different job DAGs.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and are used for illustration only, and should not be construed as limiting the patent. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, a method for efficient link scheduling of dynamic workflows in a data center network includes the following steps:
s1: processing n arriving coflows simultaneously by using a directed acyclic graph neural network, forming a job by a plurality of tasks of the dependency relationship, and expressing by adopting DAG;
s2: forming J incomplete operation DAGs, inputting the J incomplete operation DAGs into a directed acyclic graph neural network as a large unconnected directed acyclic graph, and outputting to obtain an embedding vector of each node;
s3: obtaining n embedding vectors through step S2, taking the n embedding vectors as the input of a strategy network in the deep reinforcement learning, obtaining the score of each node, and calculating to obtain a weighted score value of each node;
s4: according to partial DAG graphs of different jobs at present, finding out all nodes with the income degree of 0 at present, calculating the probability of each node based on the weighted score value through a softmax operation, and obtaining a coflow scheduling priority list according to the probability arrangement of the nodes; temporarily storing the nodes with the current income degree not being 0 in a flow waiting list;
s5: and performing a priority scheduling task based on the coflow scheduling priority list, updating the coflow scheduling priority list and the coflow waiting list after scheduling one coflow, and feeding back the performance of the reward evaluation action by the environment until n coflow schedules are completed.
This embodiment does not employ an ordinary Graph Neural Network (GNN). In general, the most common GNN architectures aggregate information from neighbors via message passing. Such architectures can efficiently process undirected graphs, but for job DAGs containing dependencies they may fail to effectively extract the intrinsic features (the dependency, or partial-order, information) that our neural network requires. Therefore, to achieve higher predictive capability, a directed acyclic graph neural network (DAGNN) is used to integrate this information into the representation.
In a specific embodiment, the directed acyclic graph neural network computes the global information and node features of the job DAG with the following update formulas:

h_v^l = F^l( h_v^(l-1), G^l( { h_u^l : u ∈ P(v) }, h_v^(l-1) ) ),    h_G = R( { h_v^L : v ∈ T } )

where h_v^l is the representation of node v at layer l, and h_G is the representation of the entire DAG graph; P(v) is the set of direct predecessor nodes of node v; T is the set of nodes without a direct successor; and G^l, F^l and R are all parameterized neural networks.
It can be seen that the directed acyclic graph neural network always uses the information of the current layer, i.e., the latest information, to update the node representations. Because it aggregates only over the predecessor nodes of the current node, without needing the successors, it can use same-layer information when updating the current node. These differences reflect the special structure specific to DAGs, and this structure is well suited to producing a more favorable vector representation of the graph. The main idea of the directed acyclic graph neural network is to process the nodes in the partial order defined by the DAG.
In contrast to the undirected graphs processed by ordinary GNNs: (1) the directed acyclic graph neural network updates the representation of node v directly with information of the current layer instead of the previous layer; (2) it aggregates only the predecessor nodes of node v, not all of its neighbors.
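The two differences above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the mean aggregator and the linear combiner below are simple stand-ins for the parameterized networks G^l and F^l, and the node names and values are invented for the example.

```python
# Sketch of a DAGNN-style layer: nodes are visited in topological order and
# each node aggregates the CURRENT-layer representations of its predecessors
# P(v), while its own previous-layer state h_prev[v] is combined in.
from collections import defaultdict

def topo_order(nodes, edges):
    """Kahn's algorithm: return nodes in a dependency-respecting order."""
    indeg = {v: 0 for v in nodes}
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

def dagnn_layer(nodes, edges, h_prev):
    """One layer: h[v] uses same-layer h[u] of each predecessor u in P(v)."""
    pred = defaultdict(list)
    for u, v in edges:
        pred[v].append(u)
    h = {}
    for v in topo_order(nodes, edges):
        # stand-in for G^l: mean over current-layer predecessor states
        agg = sum(h[u] for u in pred[v]) / len(pred[v]) if pred[v] else 0.0
        # stand-in for F^l: combine previous-layer self state with aggregate
        h[v] = 0.5 * h_prev[v] + 0.5 * agg
    return h

nodes = ["a", "b", "c"]
edges = [("a", "b"), ("a", "c"), ("b", "c")]
h0 = {"a": 1.0, "b": 0.0, "c": 0.0}
h1 = dagnn_layer(nodes, edges, h0)   # {'a': 0.5, 'b': 0.25, 'c': 0.1875}
```

Because node c reads the freshly computed same-layer states of a and b, dependency information propagates through the whole DAG within a single layer, which is exactly the property the text contrasts against ordinary message-passing GNNs.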
In a specific embodiment, for step S2: the common prior-art approach for a single job DAG is to input the topology of that DAG and the flow information of each coflow into a GNN to obtain an embedding vector for each node. This embodiment, however, handles multiple different job DAGs, and the arrival times of the different jobs and coflows are random. Assume the arrival order of the coflows within each job follows its dependencies. A job set contains J jobs, each consisting of several dependent tasks (coflows). A waiting queue W of capacity m and a pending queue D of capacity n are constructed, where m is much greater than n. As tasks arrive they are placed in the waiting queue W in order; when the number of tasks reaches n, the first n tasks are taken out and put into the pending queue D. These n tasks (coflows) come from different jobs, but their order satisfies the respective dependency relations and they form J incomplete job DAGs. The J incomplete job DAGs (which together can be regarded as one large unconnected directed acyclic graph) are then fed as input to the DAGNN, which outputs an embedding vector for each node.
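The waiting-queue / pending-queue mechanism can be sketched as follows. This is a hedged illustration under the stated assumption that coflows arrive in dependency order; the job ids, coflow ids, and helper names are invented for the example.

```python
# Sketch of the W/D queue mechanism: tasks (job_id, coflow_id) accumulate in
# the waiting queue W; once n have arrived, the first n move to the pending
# queue D, whose contents define the J incomplete (partial) job DAGs.
from collections import deque

def fill_pending(waiting, n):
    """Move the first n tasks from waiting queue W to pending queue D."""
    if len(waiting) < n:
        return []                       # not enough arrivals yet
    return [waiting.popleft() for _ in range(n)]

def partial_dags(pending):
    """Group pending tasks by job id -> the J incomplete job DAGs."""
    dags = {}
    for job_id, coflow_id in pending:
        dags.setdefault(job_id, []).append(coflow_id)
    return dags

W = deque([(1, "c1"), (2, "c1"), (1, "c2"), (3, "c1"), (2, "c2")])
D = fill_pending(W, n=4)                # first 4 tasks, arrival order kept
groups = partial_dags(D)                # J = 3 partial job DAGs
```

Because arrival order within each job respects its dependencies, the per-job lists in `groups` are already in a valid scheduling order, which is what lets the DAGNN treat the union of partial DAGs as one large unconnected graph.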
In a specific embodiment, the n embedding vectors obtained in step S2 and output by the DAGNN are used as the input of the policy network (Policy-Network) in deep reinforcement learning, and the score of each node is obtained through training by the deep reinforcement learning agent. In short, the policy network maps each embedding vector to a scalar value (the score).
Because each job corresponds to one task weight, and the task weights within the same job are identical, this embodiment multiplies each node's score by its job's weight to obtain the weighted score value of each node.
In a specific embodiment, step S5, specifically:
s1: performing a priority scheduling task based on the coflow scheduling priority list; after each coflow is scheduled, removing corresponding nodes and edges in the DAG, updating a node set with the degree of entry of 0, and further updating a coflow scheduling priority list and a coflow waiting list;
s2: and continuously repeating the operation until the n flow schedules are finished, and feeding back the rewarded evaluation action to the environment.
In a specific embodiment, the priority scheduling of tasks (coflows) is performed based on the coflow scheduling priority list; each time a coflow is scheduled, the coflow scheduling priority list and the coflow waiting list are updated, until all n coflows have been scheduled and the environment feeds back a reward evaluating the actions. The two lists must be updated because the n coflows are divided between them according to whether their in-degree is 0, and each time a coflow is processed it is deleted from the job DAG; a node whose in-degree was originally not 0 may then have in-degree 0, so the coflow scheduling priority list and the coflow waiting list need to be updated.
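The list-maintenance step described above can be sketched as follows; the DAG, in-degree table, and function name are illustrative assumptions.

```python
# Sketch: after scheduling one coflow, remove its node and outgoing edges
# from the DAG; any successor whose in-degree drops to 0 moves from the
# coflow waiting list to the coflow scheduling priority list ("ready").
def schedule_one(dag, indeg, ready, waiting, chosen):
    """dag: node -> list of successors; indeg: node -> current in-degree."""
    ready.remove(chosen)
    for succ in dag.pop(chosen, []):    # delete chosen node's out-edges
        indeg[succ] -= 1
        if indeg[succ] == 0:
            waiting.remove(succ)
            ready.append(succ)          # now eligible for priority scheduling

dag = {"a": ["b", "c"], "b": ["c"]}
indeg = {"a": 0, "b": 1, "c": 2}
ready, waiting = ["a"], ["b", "c"]
schedule_one(dag, indeg, ready, waiting, "a")
# "b" becomes ready (its in-degree fell to 0); "c" still waits on "b"
```

Repeating this until both lists are empty reproduces the loop of sub-steps s1/s2 above: nodes migrate from the waiting list to the priority list exactly when their dependencies have been scheduled.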
In this embodiment, the data center network is abstracted as one giant non-blocking switch, and tasks compete only for bandwidth resources at its ports. Each coflow is converted into a demand matrix, each entry of which represents the size of the flow to be transmitted from a given ingress port to a given egress port. Each partial job DAG can be treated as a complete DAG.
This embodiment therefore derives the reward for the current overall schedule from the sum of the weighted job completion times. The reward evaluates the quality of an action and guides the agent in the desired direction; by continuously interacting directly with the environment, the agent finally learns the optimal policy. The agent performs actions (the action is the priority list) that act on the environment, the state of the environment changes, and the environment feeds back a reward evaluating the current action; the goal of this embodiment is to maximize the cumulative sum of rewards. At the start of training the selected actions will certainly be poor, but as training proceeds the agent gradually learns a good policy from the environment's feedback. Once trained, the agent can execute an optimal policy whenever J job DAGs arrive, so that the sum of weighted job completion times is minimized.
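The reward signal can be sketched in one line: the negative of the weighted sum of job completion times, so that maximizing cumulative reward minimizes the weighted objective. The completion times and weights below are invented for illustration.

```python
# Sketch of the reward: -sum_j w_j * C_j over the finished jobs, so a
# schedule with smaller weighted completion times earns a larger reward.
def reward(completion_time, job_weight):
    return -sum(job_weight[j] * completion_time[j] for j in completion_time)

completion_time = {"J1": 10.0, "J2": 4.0}   # seconds (assumed values)
job_weight = {"J1": 1.0, "J2": 2.0}
r = reward(completion_time, job_weight)     # -(1*10 + 2*4) = -18.0
```

Negating the objective is the standard trick for casting a minimization goal as reinforcement-learning reward maximization; with all weights equal it reduces to minimizing the average job completion time, as the Disclosure notes.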
In this embodiment, after the coflow scheduling priority list has been executed, the pending queue D is updated (a state transition), that is, the first n tasks are again selected from the waiting queue W and placed in the pending queue D, and the process returns to step S2 to continue.
The optimization objective of this embodiment is the weighted sum of job completion times: different jobs have different importance and are assigned different weights. On top of the scores output by the policy network for each node, the influence of the weights is taken into account, and the input to softmax is likewise a weighted score value, keeping the procedure consistent with the optimization objective of minimizing the sum of weighted job completion times.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaust all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (7)
1. An efficient link scheduling method for dynamic workflows in a data center network, characterized by comprising the following steps:
s1: processing n arriving coflows simultaneously by using a directed acyclic graph neural network, forming a job by a plurality of tasks of the dependency relationship, and expressing by adopting DAG;
s2: forming J incomplete operation DAGs, inputting the J incomplete operation DAGs into a directed acyclic graph neural network as a large unconnected directed acyclic graph, and outputting to obtain an embedding vector of each node;
s3: obtaining n embedding vectors through step S2, taking the n embedding vectors as the input of a strategy network in the deep reinforcement learning, obtaining the score of each node, and calculating to obtain a weighted score value of each node;
s4: according to partial DAG graphs of different jobs at present, finding out all nodes with the income degree of 0 at present, calculating the probability of each node based on the weighted score value through a softmax operation, and obtaining a coflow scheduling priority list according to the probability arrangement of the nodes; temporarily storing the nodes with the current income degree not being 0 in a flow waiting list;
s5: performing a priority scheduling task based on the coflow scheduling priority list; and updating the coflow scheduling priority list and the coflow waiting list after scheduling one coflow, and feeding back the performance of the reward evaluation action by the environment until the scheduling of the n coflows is completed.
2. The method for efficient link scheduling of dynamic workflows in a data center network of claim 1, wherein the directed acyclic graph neural network computes the global information and node features of a job DAG with the following formulas:

h_v^l = F^l( h_v^(l-1), G^l( { h_u^l : u ∈ P(v) }, h_v^(l-1) ) ),    h_G = R( { h_v^L : v ∈ T } )
3. The method for efficient link scheduling of dynamic workflows in a data center network of claim 2, wherein in step S2, the J incomplete job DAGs are constructed as follows:
constructing a waiting queue W with capacity m and a pending queue D with capacity n, where m > n; as tasks arrive they are arranged in the waiting queue W in order, and when the number of tasks is greater than or equal to n, the first n tasks are taken out and placed in the pending queue D; because the n tasks come from different jobs and their arrival order satisfies the respective dependency relations, they form the J incomplete job DAGs.
4. The method of claim 3, wherein the weighted score value of each node is obtained by multiplying each node's score by the task weight of its corresponding job; each job corresponds to one task weight, and the task weights within the same job are identical.
5. The method of claim 4, wherein step S5 specifically comprises:
s1: performing priority scheduling of tasks based on the coflow scheduling priority list; after each coflow is scheduled, removing the corresponding node and edges from the DAG, updating the set of nodes with in-degree 0, and accordingly updating the coflow scheduling priority list and the coflow waiting list;
s2: repeating this operation until all n coflows have been scheduled, with the environment feeding back a reward evaluating the actions.
6. The method of claim 5, wherein in step S5, after the coflow scheduling priority list has been executed, the pending queue D is updated, that is, the first n tasks are again selected from the waiting queue W and placed in the pending queue D, and the process returns to step S2 to continue.
7. The method of claim 6, wherein in step S5, the quality of an action is evaluated by computing the sum of weighted job completion times as the reward, and the agent is optimized through the reward so that the sum of weighted job completion times is minimized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110373804.6A CN113127169B (en) | 2021-04-07 | 2021-04-07 | Efficient link scheduling method for dynamic workflow in data center network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110373804.6A CN113127169B (en) | 2021-04-07 | 2021-04-07 | Efficient link scheduling method for dynamic workflow in data center network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113127169A true CN113127169A (en) | 2021-07-16 |
CN113127169B CN113127169B (en) | 2023-05-02 |
Family
ID=76775168
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110373804.6A Active CN113127169B (en) | 2021-04-07 | 2021-04-07 | Efficient link scheduling method for dynamic workflow in data center network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113127169B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113518012A (en) * | 2021-09-10 | 2021-10-19 | 之江实验室 | Distributed cooperative flow simulation environment construction method and system |
CN114691342A (en) * | 2022-05-31 | 2022-07-01 | 蓝象智联(杭州)科技有限公司 | Method and device for realizing priority scheduling of federated learning algorithm component and storage medium |
CN114756358A (en) * | 2022-06-15 | 2022-07-15 | 苏州浪潮智能科技有限公司 | DAG task scheduling method, device, equipment and storage medium |
CN116996443A (en) * | 2023-09-25 | 2023-11-03 | 之江实验室 | Network collaborative traffic scheduling method and system combining GNN and SAC models |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101267452A (en) * | 2008-02-27 | 2008-09-17 | 华为技术有限公司 | A conversion method and application server for WEB service mixing scheme |
US20190089645A1 (en) * | 2015-08-25 | 2019-03-21 | Shanghai Jiao Tong University | Dynamic Network Flows Scheduling Scheme in Data Center |
CN111131080A (en) * | 2019-12-26 | 2020-05-08 | 电子科技大学 | Distributed deep learning flow scheduling method, system and equipment |
CN111756653A (en) * | 2020-06-04 | 2020-10-09 | 北京理工大学 | Multi-coflow scheduling method based on deep reinforcement learning of graph neural network |
- 2021
- 2021-04-07: CN application CN202110373804.6A, patent CN113127169B, status: Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101267452A (en) * | 2008-02-27 | 2008-09-17 | Huawei Technologies Co., Ltd. | A conversion method and application server for WEB service mixing scheme |
US20190089645A1 (en) * | 2015-08-25 | 2019-03-21 | Shanghai Jiao Tong University | Dynamic Network Flows Scheduling Scheme in Data Center |
CN111131080A (en) * | 2019-12-26 | 2020-05-08 | University of Electronic Science and Technology of China | Distributed deep learning flow scheduling method, system and equipment |
CN111756653A (en) * | 2020-06-04 | 2020-10-09 | Beijing Institute of Technology | Multi-coflow scheduling method based on deep reinforcement learning of graph neural network |
Non-Patent Citations (3)
Title |
---|
Penghao Sun, et al.: "DeepWeave: Accelerating Job Completion Time with Deep Reinforcement Learning-based Coflow Scheduling", Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20) * |
Zheng Ying, et al.: "A Survey of Applications of Deep Reinforcement Learning in Typical Network Systems", Radio Communications Technology (《无线电通信技术》) * |
Ma Teng, et al.: "Coflow Scheduling Mechanism for Data Center Networks Based on Deep Reinforcement Learning", Acta Electronica Sinica (《电子学报》) * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113518012A (en) * | 2021-09-10 | 2021-10-19 | 之江实验室 | Distributed cooperative flow simulation environment construction method and system |
CN113518012B (en) * | 2021-09-10 | 2021-12-10 | 之江实验室 | Distributed cooperative flow simulation environment construction method and system |
CN114691342A (en) * | 2022-05-31 | 2022-07-01 | 蓝象智联(杭州)科技有限公司 | Method and device for realizing priority scheduling of federated learning algorithm component and storage medium |
CN114756358A (en) * | 2022-06-15 | 2022-07-15 | 苏州浪潮智能科技有限公司 | DAG task scheduling method, device, equipment and storage medium |
CN114756358B (en) * | 2022-06-15 | 2022-11-04 | 苏州浪潮智能科技有限公司 | DAG task scheduling method, device, equipment and storage medium |
CN116996443A (en) * | 2023-09-25 | 2023-11-03 | 之江实验室 | Network collaborative traffic scheduling method and system combining GNN and SAC models |
CN116996443B (en) * | 2023-09-25 | 2024-01-23 | 之江实验室 | Network collaborative traffic scheduling method and system combining GNN and SAC models |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN113127169B (en) | Efficient link scheduling method for dynamic workflow in data center network | |
CN111756812B (en) | Energy consumption perception edge cloud cooperation dynamic unloading scheduling method | |
CN109039942B (en) | Network load balancing system and balancing method based on deep reinforcement learning | |
CN111756653B (en) | Multi-coflow scheduling method based on deep reinforcement learning of graph neural network | |
Wang et al. | An adaptive artificial bee colony with reinforcement learning for distributed three-stage assembly scheduling with maintenance | |
CN108880663A (en) | Incorporate network resource allocation method based on improved adaptive GA-IAGA | |
CN111030835B (en) | Task scheduling model of TTFC network and message scheduling table generation method | |
CN113098714B (en) | Low-delay network slicing method based on reinforcement learning | |
CN114253735B (en) | Task processing method and device and related equipment | |
CN108111335A (en) | A kind of method and system dispatched and link virtual network function | |
CN114443249A (en) | Container cluster resource scheduling method and system based on deep reinforcement learning | |
CN113190342B (en) | Method and system architecture for multi-application fine-grained offloading of cloud-edge collaborative networks | |
CN115150335B (en) | Optimal flow segmentation method and system based on deep reinforcement learning | |
Fan et al. | Associated task scheduling based on dynamic finish time prediction for cloud computing | |
CN116112488A (en) | Fine-grained task unloading and resource allocation method for MEC network | |
CN115756646A (en) | Industrial internet-based edge computing task unloading optimization method | |
CN112506644B (en) | Task scheduling method and system based on cloud edge-side hybrid computing mode system | |
CN109298932B (en) | OpenFlow-based resource scheduling method, scheduler and system | |
Zhang et al. | Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach | |
CN112506658A (en) | Dynamic resource allocation and task scheduling method in service chain | |
CN115114030B (en) | On-line multi-workflow scheduling method based on reinforcement learning | |
CN116909717B (en) | Task scheduling method | |
EP4202682A1 (en) | Deadlock-free scheduling of a task graph on a multi-core processor | |
Laili et al. | Multi operators-based partial connected parallel evolutionary algorithm | |
Wang et al. | A Scalable Deep Reinforcement Learning Model for Online Scheduling Coflows of Multi-Stage Jobs for High Performance Computing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||