CN112448899A - Flow scheduling-based multitask training cluster network optimization method - Google Patents

Publication number
CN112448899A
CN112448899A CN201910819132.XA CN201910819132A CN112448899A CN 112448899 A CN112448899 A CN 112448899A CN 201910819132 A CN201910819132 A CN 201910819132A CN 112448899 A CN112448899 A CN 112448899A
Authority
CN
China
Prior art keywords
priority
task
training
communication
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910819132.XA
Other languages
Chinese (zh)
Inventor
孙军欢
胡水海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhixing Technology Co Ltd
Original Assignee
Shenzhen Zhixing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhixing Technology Co Ltd filed Critical Shenzhen Zhixing Technology Co Ltd
Priority to CN201910819132.XA priority Critical patent/CN112448899A/en
Publication of CN112448899A publication Critical patent/CN112448899A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H04L47/6215 Individual queue per QOS, rate or priority
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The invention provides a multi-task training cluster network optimization method based on traffic scheduling. The method determines the traffic priority of each training task from its task characteristics: a task still within its first I iterations after training starts is given the highest priority, and a task past its I-th iteration is given one of the remaining priorities according to the total amount of data it sent in all previous iterations. The method then constructs communication queues, maps the traffic of each task into a queue according to its priority, and communicates on the basis of those queues to improve communication efficiency.

Description

Flow scheduling-based multitask training cluster network optimization method
Technical Field
The invention relates to network communication in multi-task machine learning training clusters, and in particular to a multi-task training cluster network optimization method based on traffic scheduling.
Background
Deep Learning (DL) has achieved wide success in artificial-intelligence-driven services and is at the core of basic products in many related fields. Because training a Deep Neural Network (DNN) is computationally very expensive, timely training requires exploiting the parallel computing capability of distributed systems. Leading IT enterprises such as Microsoft, Facebook and Google have therefore begun running distributed Deep Learning Training (DLT) tasks on production clusters of hundreds or thousands of servers. Because DLT is compute-intensive, much effort has been focused on efficient scheduling of cluster computing resources. Meanwhile, as GPUs become faster and models become larger, the performance bottleneck of the cluster is shifting from computation to communication. However, network optimization for DLT in production environments is still at an early stage, and existing parameter exchange mechanisms have serious shortcomings.
In particular, deep learning training clusters (DL clusters) in production environments are full of uncertainty. When several, tens or even hundreds of training tasks run simultaneously on a large cluster, the tasks (especially different tasks scheduled onto the same compute node) have to share the cluster network. There is therefore strong competition for network resources between the traffic of different training tasks, and between long-lived elephant flows and delay-sensitive mice flows within that traffic.
Disclosure of Invention
In view of this, the present invention provides a multi-task training cluster network optimization method based on traffic scheduling.
In one aspect, an embodiment of the present invention provides a traffic scheduling method based on task priority queues.
The traffic scheduling method comprises the following steps:
constructing K ready queues (K is a positive integer not less than 2), wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next;
placing the traffic of each training task into the queue corresponding to its priority, and scheduling according to priority:
assigning the highest priority to the traffic of a task that is still within its first I iterations after training starts;
mapping the traffic of a task that has passed its I-th iteration to one of the priority queues other than the highest-priority queue according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task;
wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions.
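The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_queue`, the byte boundaries in `thresholds`, and the sample task names are hypothetical, and the source does not say how the total amount sent is divided among the lower-priority queues. Here K - 2 ascending boundaries are assumed to split later-stage tasks across queues 1 to K - 1.

```python
from bisect import bisect_right
from collections import deque


def assign_queue(iteration, total_bytes_sent, I, thresholds):
    """Return a queue index, where 0 is the highest-priority queue.

    iteration: current training iteration of the task, counted from 1.
    thresholds: ascending byte boundaries (hypothetical); K - 2 boundaries
        split later-stage tasks across queues 1 .. K - 1.
    """
    if iteration <= I:
        # Early-stage task: always the highest priority.
        return 0
    # More data sent so far means a larger index, i.e. a lower priority.
    return 1 + bisect_right(thresholds, total_bytes_sent)


K = 4
thresholds = [10**6, 10**8]  # K - 2 boundaries, illustrative values only
ready_queues = [deque() for _ in range(K)]

# (task name, current iteration, bytes sent in all previous iterations)
tasks = [
    ("new_job", 3, 5 * 10**9),
    ("small_job", 40, 2 * 10**5),
    ("medium_job", 40, 7 * 10**7),
    ("large_job", 40, 3 * 10**9),
]
for name, iteration, sent in tasks:
    index = assign_queue(iteration, sent, I=10, thresholds=thresholds)
    ready_queues[index].append(name)
```

With these sample values, `new_job` stays in the first queue despite its large volume because it is still within its first I iterations, while the other three tasks are spread over the remaining queues by the amount they have sent.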
In another aspect, an embodiment of the present invention provides a multi-task training cluster network optimization method.
With reference to the first aspect and based on its traffic scheduling method, the multi-task training cluster network optimization method comprises:
acquiring task characteristics, and determining the traffic priority of each training task from them:
assigning the highest priority to a task that is still within its first I iterations after training starts;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task;
wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions;
scheduling, by the traffic scheduling method of the first aspect, the traffic of each training task on each computing node of the cluster, and controlling the communication of each training task so as to minimize the average completion time of the training tasks.
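A toy calculation may help show why giving lower priority to tasks that send more data tends to reduce the average completion time. The model below is an assumption made for illustration and is not taken from the source: tasks transmit strictly one after another over a single shared link, and the function name `average_completion_time` is hypothetical.

```python
def average_completion_time(sizes, bandwidth=1.0):
    """Average finish time when tasks transmit one after another in order.

    sizes: amount of data each task must send, in transmission order.
    bandwidth: link capacity in data units per time unit (assumed constant).
    """
    finished_at = 0.0
    total = 0.0
    for size in sizes:
        finished_at += size / bandwidth  # this task finishes after all earlier ones
        total += finished_at
    return total / len(sizes)


small_first = average_completion_time([1, 10])  # (1 + 11) / 2 = 6.0
large_first = average_completion_time([10, 1])  # (10 + 11) / 2 = 10.5
```

In this simplified setting both orders finish the last task at time 11, but serving the small task first lets it complete at time 1 instead of time 11, which lowers the average from 10.5 to 6.0.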
In another aspect, an embodiment of the present invention provides a traffic scheduling module based on task priority queues.
Corresponding to the first aspect, the traffic scheduling module comprises:
a priority component for obtaining or receiving the communication priority of each task;
and a communication queue component for constructing K ready queues (K being a positive integer not less than 2), wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next;
the communication queue component maps the traffic of each task to the corresponding ready queue according to the obtained task communication priority, and schedules communication accordingly.
In another aspect, an embodiment of the present invention provides a multi-task training cluster network system based on traffic scheduling.
With reference to the second and third aspects, the multi-task training cluster network system comprises:
a communication management unit and a traffic scheduling unit;
the communication management unit is used for determining the communication priority of each task; specifically, after obtaining the task characteristics, the communication management unit determines the traffic priority of each training task according to the number of training iterations and other task characteristics:
assigning the highest priority to a task that is still within its first I iterations after training starts;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task;
wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions;
the traffic scheduling unit includes the traffic scheduling module of the third aspect, and is configured to obtain the task communication priority determined by the communication management unit and to schedule communication according to that priority.
The traffic scheduling method and module based on task priority queues, and the multi-task training cluster network optimization method and system built on them, improve communication efficiency by determining the communication priority of each task, constructing communication queues, mapping the traffic of each task into a queue according to its priority, and communicating on the basis of those queues.
The technical solution of the present invention is further described with reference to the accompanying drawings and specific embodiments.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing some of the embodiments or the prior art are briefly introduced below.
Fig. 1 is a flowchart illustrating a method for optimizing a multi-task training cluster network based on traffic scheduling according to some embodiments of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention is described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person skilled in the art can obtain from these embodiments without inventive effort fall within the scope of the present invention.
The following are some preferred embodiments of the invention.
Some of these preferred embodiments provide a traffic scheduling method based on task priority queues. The traffic scheduling method comprises the following steps:
at a host serving as a computing node, constructing, by the host's traffic scheduling system, K ready queues (K is a positive integer not less than 2), wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next;
placing the traffic of each training task running on the same host into the queue corresponding to its priority, and scheduling according to priority:
assigning the highest priority to the traffic of a task that is still within its first I iterations after training starts, so as to obtain early feedback that can be used to predict and guide subsequent training; mapping the traffic of a task that has passed its I-th iteration to one of the priority queues other than the highest-priority queue according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task; wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions.
In some of the above preferred embodiments of the traffic scheduling method based on task priority queues, the amount of traffic each task has sent changes dynamically during communication; in short, the amount of data already sent for any task keeps changing as transmission proceeds. To address this, in these embodiments the task priority is changed dynamically:
for any task, when the amount of data the task has sent (for example, the number of bytes) exceeds a preset threshold, the priority of the task is lowered and its traffic is moved to a lower-priority queue.
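The demotion rule can be sketched as follows. The sketch assumes one preset threshold per queue (a list of K - 1 values, one for each queue that still has a lower queue beneath it); the source only says "a preset threshold", so the per-queue layout, the names `TaskState` and `record_send`, and the sample values are all hypothetical. The special handling of the first I iterations is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    task_id: str
    queue: int = 0        # 0 is the highest-priority queue
    bytes_sent: int = 0   # cumulative bytes sent by this task


def record_send(task, nbytes, demotion_thresholds):
    """Account for nbytes just sent and demote the task if needed.

    demotion_thresholds[q] is the cumulative byte count above which a task
    leaves queue q for queue q + 1 (assumed layout, K - 1 entries).
    """
    task.bytes_sent += nbytes
    lowest = len(demotion_thresholds)  # index of the lowest-priority queue
    while task.queue < lowest and task.bytes_sent > demotion_thresholds[task.queue]:
        task.queue += 1
    return task.queue


limits = [100, 1000]              # K = 3 queues, illustrative values
job = TaskState("job-a")
first = record_send(job, 50, limits)     # 50 bytes in total, stays in queue 0
second = record_send(job, 60, limits)    # 110 bytes in total, moves to queue 1
third = record_send(job, 2000, limits)   # 2110 bytes in total, moves to queue 2
```

Checking the threshold on every send keeps the priority in step with the amount actually transmitted, rather than only re-evaluating it between iterations.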
In some of the traffic scheduling methods based on task priority queues provided in the above preferred embodiments, a task may be placed in the lowest-priority queue while traffic from other tasks keeps arriving afterwards and forms a steady stream of communication, so that the traffic of that task is kept waiting; therefore, the priority of low-priority traffic that has waited a long time is raised.
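A minimal sketch of this starvation safeguard is given below. The source does not state how long traffic must wait or by how much its priority is raised; the sketch assumes a fixed waiting limit `max_wait` and a promotion of one queue level at a time. The names `QueuedFlow` and `promote_starved` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueuedFlow:
    task_id: str
    queue: int            # 0 is the highest-priority queue
    waiting_since: float  # time at which the flow started waiting, in seconds


def promote_starved(flows, now, max_wait):
    """Raise by one level every flow that has waited longer than max_wait.

    Returns the ids of the promoted tasks. Flows already in queue 0 are
    left alone because they cannot be raised further.
    """
    promoted = []
    for flow in flows:
        if flow.queue > 0 and now - flow.waiting_since > max_wait:
            flow.queue -= 1
            flow.waiting_since = now  # restart the wait in the new queue
            promoted.append(flow.task_id)
    return promoted


flows = [
    QueuedFlow("old_job", queue=2, waiting_since=0.0),
    QueuedFlow("recent_job", queue=2, waiting_since=9.0),
    QueuedFlow("early_job", queue=0, waiting_since=0.0),
]
raised = promote_starved(flows, now=10.0, max_wait=5.0)
```

Only `old_job` is promoted here: it has waited 10 seconds in a low-priority queue, whereas `recent_job` has waited 1 second and `early_job` is already at the highest priority.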
Other embodiments of the present invention provide a multi-task training cluster network optimization method based on traffic scheduling. As shown in Fig. 1, the method includes:
acquiring task characteristics, analyzing them, and determining the traffic priority of each training task according to the number of training iterations, the amount of data previously sent, and the like:
assigning the highest priority to a task that is still within its first I iterations after training starts;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task;
wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions;
scheduling, by the traffic scheduling method of any of the above embodiments, the traffic of each training task on each computing node of the cluster, and controlling the communication of each training task so as to minimize the average completion time of the training tasks.
In conventional traffic priority control, the priority of a flow can easily be changed by dynamically modifying the DSCP (Differentiated Services Code Point) of that flow. However, in some of the above preferred embodiments, a high-speed network (for example, an RDMA-based network) is used to transmit the training data in order to improve communication efficiency, and such a network usually bypasses the operating system kernel to reduce or avoid CPU usage and thereby achieve high-speed communication. For this reason the DSCP of a flow cannot be modified dynamically, so the conventional approach cannot be applied directly when a high-speed network is used. Therefore, in the multi-task training cluster network optimization method based on traffic scheduling provided in these embodiments, a unique DSCP is allocated to each task, and traffic priority scheduling is implemented by periodically adjusting the DSCP-to-priority mapping (on the end hosts and the switches).
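The bookkeeping behind this scheme can be sketched as below. The sketch keeps one fixed DSCP per task and a separate, mutable DSCP-to-queue table that is rewritten on every refresh. The class name `DscpPriorityMap`, its methods, and the choice of code points 1 to 63 as the allocation pool are hypothetical; how the resulting table is installed on end hosts and switches depends on the equipment and is not shown. Because the DSCP field is 6 bits wide, at most 64 distinct values exist, which bounds the number of tasks that can hold a unique DSCP at once.

```python
class DscpPriorityMap:
    """One fixed DSCP per task plus a mutable DSCP -> queue table."""

    def __init__(self, num_queues, dscp_pool=range(1, 64)):
        self.num_queues = num_queues
        self._free = list(dscp_pool)  # unallocated code points (assumed pool)
        self.task_dscp = {}           # task id -> DSCP, never changes once set
        self.dscp_queue = {}          # DSCP -> queue index, adjusted periodically

    def register(self, task_id):
        """Allocate a unique DSCP for a new task; new tasks start in queue 0."""
        if task_id in self.task_dscp:
            return self.task_dscp[task_id]
        if not self._free:
            raise RuntimeError("no free DSCP values left")
        dscp = self._free.pop(0)
        self.task_dscp[task_id] = dscp
        self.dscp_queue[dscp] = 0
        return dscp

    def refresh(self, task_priorities):
        """Rewrite the DSCP -> queue table from the latest task priorities."""
        for task_id, queue in task_priorities.items():
            if not 0 <= queue < self.num_queues:
                raise ValueError(f"queue {queue} out of range")
            self.dscp_queue[self.task_dscp[task_id]] = queue
        return dict(self.dscp_queue)  # copy that a deployment would push out


mapping = DscpPriorityMap(num_queues=4)
dscp_a = mapping.register("job-a")
dscp_b = mapping.register("job-b")
table = mapping.refresh({"job-a": 0, "job-b": 3})
```

The packets of a task keep the same DSCP for the lifetime of the task, so nothing in the kernel-bypass data path has to change; only the table that interprets each DSCP is updated.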
Still other embodiments of the present invention provide a traffic scheduling module based on task priority queues. The traffic scheduling module comprises:
a priority component for obtaining or receiving the communication priority of each task; specifically, when the priority component is called, it obtains or receives from the communication management unit the communication priority that the unit has determined for the traffic of each cluster training task;
a communication queue component for constructing K ready queues (K is a positive integer not less than 2), wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next;
the communication queue component maps the traffic of each task to the corresponding ready queue according to the obtained task communication priority, and schedules communication accordingly.
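The two components can be sketched together in one small class. The class and method names are hypothetical, and two behaviours are assumptions rather than statements from the source: the scheduler is strict-priority (it always serves the first non-empty queue), and a task whose priority has not yet been received is treated as highest priority.

```python
from collections import deque


class TrafficSchedulingModule:
    """Priority component plus communication queue component (sketch)."""

    def __init__(self, k):
        if k < 2:
            raise ValueError("K must be an integer not less than 2")
        self.queues = [deque() for _ in range(k)]  # queue 0 is the highest
        self.priority = {}                         # task id -> queue index

    def set_priority(self, task_id, queue):
        """Priority component: store the priority received for a task."""
        if not 0 <= queue < len(self.queues):
            raise ValueError(f"queue {queue} out of range")
        self.priority[task_id] = queue

    def enqueue(self, task_id, message):
        """Queue component: map a task's traffic to its ready queue."""
        queue = self.priority.get(task_id, 0)  # assumed default: highest
        self.queues[queue].append((task_id, message))

    def next_message(self):
        """Strict-priority scheduling: serve the first non-empty queue."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


module = TrafficSchedulingModule(k=3)
module.set_priority("large_job", 2)
module.set_priority("new_job", 0)
module.enqueue("large_job", "gradients-1")
module.enqueue("new_job", "gradients-2")
order = [module.next_message(), module.next_message(), module.next_message()]
```

Although `large_job` enqueued its message first, `new_job` is served first because it sits in a higher-priority queue; the final call returns `None` because all queues are empty.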
In the traffic scheduling module provided in some of the above preferred embodiments, the amount of traffic each task has sent changes dynamically during communication; in short, the amount of data already sent for any task keeps changing as transmission proceeds. To address this, in these embodiments the communication queue component dynamically changes the task priority according to the amount of data sent:
for any task, when the amount of data the task has sent (for example, the number of bytes) exceeds a preset threshold, the priority of the task is lowered.
Some of the traffic scheduling modules provided in the above preferred embodiments raise the priority of low-priority traffic that has waited a long time.
Still other embodiments of the present invention provide a multi-task training cluster network system based on traffic scheduling. The system comprises a communication management unit and a traffic scheduling unit, wherein:
the communication management unit runs on one node of the cluster and is used for determining the communication priority of each task; specifically, after obtaining the task characteristics, the communication management unit determines the traffic priority of each training task according to the number of training iterations and other task characteristics:
assigning the highest priority to a task that is still within its first I iterations after training starts, so as to obtain early feedback;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; the larger the total amount sent, the lower the priority of the task;
wherein I is a positive integer whose value is generally set according to experience, the model, and similar conditions;
the traffic scheduling unit runs on each computing node of the cluster and schedules traffic during the parameter exchange communication of each training task; specifically, the traffic scheduling unit includes the traffic scheduling module of the above embodiments, and is configured to obtain the task communication priority determined by the communication management unit and to schedule communication according to that priority.
In some of the above preferred embodiments, a high-speed network is used in the multi-task training cluster network system based on traffic scheduling in order to improve communication efficiency; therefore, in the system provided by these embodiments, a unique DSCP is allocated to each task, and traffic priority scheduling is implemented by periodically adjusting the DSCP-to-priority mapping (on the end hosts and the switches).
The above description is only a specific embodiment of the present invention, but the scope of the present invention is not limited thereto.

Claims (10)

1. A traffic scheduling method based on task priority queues, characterized by comprising the following steps:
constructing K ready queues, wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next;
placing the traffic of each training task into the queue corresponding to its priority, and scheduling according to priority:
assigning the highest priority to the traffic of a task that is still within its first I iterations after training starts;
mapping the traffic of a task that has passed its I-th iteration to one of the priority queues other than the highest-priority queue according to the total amount of data the task sent in all previous iterations;
wherein K is a positive integer not less than 2, and I is a positive integer.
2. The traffic scheduling method according to claim 1, wherein
the task priority is changed dynamically:
for any task, when the amount of data the task has sent exceeds a preset threshold, the priority of the task is lowered.
3. The traffic scheduling method according to claim 1, wherein
the priority of low-priority traffic that has waited a long time is raised.
4. A multi-task training cluster network optimization method, characterized by comprising the following steps:
acquiring task characteristics of each training task, and determining its traffic priority from the task characteristics:
assigning the highest priority to a task that is still within its first I iterations after training starts;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; wherein I is a positive integer;
scheduling, by the traffic scheduling method according to any one of claims 1 to 3, the traffic of each training task on each computing node of the cluster, so as to control the communication of that traffic.
5. The multi-task training cluster network optimization method according to claim 4, wherein
parameter exchange is performed over a high-speed network;
and a unique DSCP is allocated to each task, and traffic priority scheduling is implemented by periodically adjusting the DSCP-to-priority mapping.
6. A traffic scheduling module based on task priority queues, comprising:
a priority component for obtaining or receiving the communication priority of each task;
and a communication queue component for constructing K ready queues, wherein each queue corresponds to one priority, the first queue has the highest priority, and the priority decreases from each queue to the next; wherein K is a positive integer not less than 2;
the communication queue component mapping the traffic of each task to the corresponding ready queue according to the obtained task communication priority, and scheduling communication accordingly.
7. The traffic scheduling module of claim 6, wherein
the task priority is changed dynamically:
for any task, when the amount of data the task has sent exceeds a preset threshold, the priority of the task is lowered.
8. The traffic scheduling module of claim 6, wherein
the priority of low-priority traffic that has waited a long time is raised.
9. A multi-task training cluster network system based on traffic scheduling, comprising:
a communication management unit and a traffic scheduling unit, wherein:
the communication management unit is used for determining the communication priority of each task; the communication management unit determines the traffic priority of each training task according to the task characteristics:
assigning the highest priority to a task that is still within its first I iterations after training starts;
assigning a task that has passed its I-th iteration one of the priorities other than the highest according to the total amount of data the task sent in all previous iterations; wherein I is a positive integer;
the traffic scheduling unit comprises the traffic scheduling module of any of claims 6 to 8, and is configured to obtain the task communication priority determined by the communication management unit and to schedule communication according to that priority.
10. The multi-task training cluster network system of claim 9, wherein
the system uses a high-speed network for parameter exchange;
and a unique DSCP is allocated to each task, and traffic priority scheduling is implemented by periodically adjusting the DSCP-to-priority mapping.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910819132.XA CN112448899A (en) 2019-08-31 2019-08-31 Flow scheduling-based multitask training cluster network optimization method

Publications (1)

Publication Number Publication Date
CN112448899A true CN112448899A (en) 2021-03-05

Family

ID=74733938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910819132.XA Pending CN112448899A (en) 2019-08-31 2019-08-31 Flow scheduling-based multitask training cluster network optimization method

Country Status (1)

Country Link
CN (1) CN112448899A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101572718A (en) * 2008-04-30 2009-11-04 张文 IP QoS unified strategic system based on oriented application and method thereof
CN103136056A (en) * 2013-03-04 2013-06-05 浪潮电子信息产业股份有限公司 Cloud computing platform scheduling method
CN106533981A (en) * 2016-12-19 2017-03-22 北京邮电大学 Multi-attribute based big data flow scheduling method and device
CN107025205A (en) * 2016-01-30 2017-08-08 华为技术有限公司 A kind of method and apparatus of training pattern in distributed system
US20180183684A1 (en) * 2016-12-28 2018-06-28 Google Inc. Auto-prioritization of device traffic across local network
CN108694090A (en) * 2018-04-16 2018-10-23 江苏润和软件股份有限公司 A kind of cloud computing resource scheduling method of Based on Distributed machine learning
CN108762896A (en) * 2018-03-26 2018-11-06 福建星瑞格软件有限公司 One kind being based on Hadoop cluster tasks dispatching method and computer equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900472A (en) * 2022-07-12 2022-08-12 之江实验室 Method and system for realizing cooperative flow scheduling by control surface facing to multiple tasks
CN114900472B (en) * 2022-07-12 2022-11-08 之江实验室 Method and system for realizing cooperative flow scheduling by control surface facing to multiple tasks

Similar Documents

Publication Publication Date Title
CN113254197B (en) Network resource scheduling method and system based on deep reinforcement learning
CN114647515A (en) GPU cluster-oriented dynamic resource scheduling method
CN114253735B (en) Task processing method and device and related equipment
CN110990140B (en) Method for scheduling distributed machine learning flow in photoelectric switching network
CN111767146A (en) Distributed machine learning system acceleration method based on network reconfiguration
CN112333234B (en) Distributed machine learning training method and device, electronic equipment and storage medium
Gentry et al. Robust dynamic resource allocation via probabilistic task pruning in heterogeneous computing systems
CN111740925B (en) Deep reinforcement learning-based flow scheduling method
US11868808B2 (en) Automatic driving simulation task scheduling method and apparatus, device, and readable medium
CN114205353A (en) Calculation unloading method based on hybrid action space reinforcement learning algorithm
Li et al. Endpoint-flexible coflow scheduling across geo-distributed datacenters
CN105740059A (en) Particle swarm scheduling method for divisible task
CN116166381A (en) Resource scheduling based on IACO algorithm in multi-cloud management platform
Wang et al. CEFS: Compute-efficient flow scheduling for iterative synchronous applications
CN113938930B (en) Construction method of virtual network function forwarding graph adapting to 5G network multi-service scene
CN115543626A (en) Power defect image simulation method adopting heterogeneous computing resource load balancing scheduling
CN112448899A (en) Flow scheduling-based multitask training cluster network optimization method
Che et al. Deep reinforcement learning in M2M communication for resource scheduling
CN112446484A (en) Multitask training cluster intelligent network system and cluster network optimization method
Chen et al. Deadline-constrained MapReduce scheduling based on graph modelling
Liu et al. 5G/B5G Network Slice Management via Staged Reinforcement Learning
Bensalem et al. Towards optimal serverless function scaling in edge computing network
CN113010319A (en) Dynamic workflow scheduling optimization method based on hybrid heuristic rule and genetic algorithm
Zhang et al. Dynamic VNF scheduling: a deep reinforcement learning approach
Wadhonkar et al. A task scheduling algorithm based on task length and deadline in cloud computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination