WO2020186836A1 - Task scheduling - Google Patents
- Publication number
- WO2020186836A1 (PCT/CN2019/124494)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- operation task
- tasks
- task
- queue
- scheduling
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/483—Multiproc
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- the present disclosure relates to the field of deep learning, in particular to task scheduling methods and devices, and storage media.
- the training of deep learning models is a key link in deep learning.
- the training process is very complicated, and the hardware resources used are also very diverse.
- each open source framework has its own set of scheduling methods to manage the training process of deep learning models. Most of them use Directed Acyclic Graphs to describe the dependencies between operation tasks so that training follows the correct process, but the task scheduling of such training platforms is not efficient.
- the present disclosure provides a task scheduling method and device and a computer storage medium.
- a task scheduling method including:
- according to the operands corresponding to multiple operation tasks in an operation task queue, the dependency relationship between the multiple operation tasks is determined; based on the dependency relationship between the multiple operation tasks, the multiple operation tasks in the operation task queue are scheduled.
- the operand corresponding to the operation task includes a read operand and/or a write operand.
- the determining the dependency relationship between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue includes: if the second operation task includes a read operation on the write operand of the first operation task, or the second operation task includes a write operation on an operand of the first operation task, determining that the second operation task depends on the first operation task; wherein the first operation task and the second operation task are different operation tasks in the operation task queue.
- the determining the dependency relationship between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue further includes: if the second operation task includes a read operation performed on the read operand of the first operation task, determining that there is no dependency between the first operation task and the second operation task.
- the scheduling the multiple operation tasks in the operation task queue based on the dependency relationship between the multiple operation tasks includes: determining the scheduling sequence of the multiple operation tasks based on the dependency relationship between the multiple operation tasks; allocating memory for the current operation task in the operation task queue; after the memory allocation is completed, scheduling the current operation task for execution on the context corresponding to the current operation task; and, according to the scheduling sequence, performing the memory allocation of the next operation task after the current operation task.
- the determining the scheduling sequence of the multiple operation tasks based on the dependency relationship between the multiple operation tasks includes: if there is no dependency between the first operation task and the second operation task among the multiple operation tasks, determining to schedule the first operation task and the second operation task in parallel; and/or, if the second operation task depends on the first operation task, determining to schedule the second operation task after the first operation task.
- the determining the dependency relationship between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue includes: performing fusion processing on M communication operation tasks included in the multiple operation tasks to obtain at least one combined communication operation task, where each combined communication operation task includes at least one communication operation task among the M communication operation tasks and M is an integer greater than or equal to 1; and determining the dependency relationship between the multiple operation tasks according to the operand corresponding to the at least one combined communication operation task and the operand corresponding to at least one non-communication operation task among the multiple operation tasks.
- the operand corresponding to the combined communication operation task includes: a set of read operands corresponding to at least one communication operation task included in the combined communication operation task, and/or a set of write operands corresponding to at least one communication operation task included in the combined communication operation task.
- the operation task queue includes a first operation task queue and a second operation task queue, wherein the first operation task queue contains the communication operation tasks among the multiple operation tasks, and the second operation task queue contains the non-communication operation tasks among the multiple operation tasks; the operation tasks contained in the first operation task queue and the second operation task queue are arranged in a scheduling order determined based on the dependency relationship between the multiple operation tasks.
- the method further includes: recording dependency information between the first operation task queue and the second operation task queue, wherein, if an operation task in the first operation task queue depends on at least one operation task in the second operation task queue, or if an operation task in the second operation task queue depends on at least one operation task in the first operation task queue, the dependency information includes information about the last operation task among the at least one operation task; and the scheduling the multiple operation tasks in the operation task queue based on the dependency relationship between the multiple operation tasks includes: scheduling the operation tasks in the first operation task queue and the second operation task queue based on the dependency information between the first operation task queue and the second operation task queue.
- the method further includes: setting the priority corresponding to the memory reclamation operation task to the highest, wherein the second operation task queue includes the non-communication operation tasks in the operation task queue other than the memory reclamation operation task.
- the method further includes: determining a context corresponding to each of the multiple operation tasks, wherein the context corresponding to an operation task includes an abstract resource and an information flow; and the scheduling the multiple operation tasks in the operation task queue based on the dependency relationship between the multiple operation tasks includes: scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependency relationship between the multiple operation tasks.
- the information flow includes a unified computing device architecture CUDA information flow and/or a host information flow.
- the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependency relationship between the multiple operation tasks includes: if there is no dependency between at least two of the multiple operation tasks and the at least two operation tasks correspond to different abstract resources, scheduling the at least two operation tasks in parallel.
- the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependency relationship between the multiple operation tasks includes: if there is a dependency between the third operation task and the fourth operation task among the multiple operation tasks, and the information flows corresponding to both the third operation task and the fourth operation task are CUDA information flows, calling the first synchronization interface to synchronize the third operation task and the fourth operation task.
- the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependency relationship between the multiple operation tasks includes: if there is a dependency between the third operation task and the fourth operation task among the multiple operation tasks, and the information flow corresponding to at least one of the third operation task and the fourth operation task is the host information flow, calling the second synchronization interface to synchronize the third operation task and the fourth operation task.
- a task scheduling device including:
- the dependency determination module is configured to determine the dependency relationship between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue; the scheduling module is configured to schedule the multiple operation tasks in the operation task queue based on the dependency relationship between the multiple operation tasks.
- a non-volatile computer-readable storage medium stores a computer program, and the computer program is used to execute the task scheduling method of any one of the above implementations of the first aspect.
- a task scheduling device including:
- a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement the task scheduling method in any possible implementation of the first aspect.
- a computer program including instructions for implementing the method in any possible implementation manner of the first aspect.
- the embodiments of the present disclosure determine the dependency relationship between multiple operation tasks according to the respective operands of the multiple operation tasks in the operation task queue, and schedule the multiple operation tasks based on the dependency relationship, thereby minimizing the dependencies between operation tasks and achieving efficient scheduling of the operation tasks.
- Fig. 1 is a flowchart showing a scheduling method according to an exemplary embodiment of the present disclosure.
- Fig. 2 is a flowchart of a task scheduling method according to another exemplary embodiment of the present disclosure.
- Fig. 3 is a flowchart of a task scheduling method according to still another exemplary embodiment of the present disclosure.
- Fig. 4 is a flowchart of a task scheduling method according to another exemplary embodiment of the present disclosure.
- Fig. 5 is a schematic diagram showing a communication overlap scene according to an exemplary embodiment of the present disclosure.
- Fig. 6 is a flowchart of a task scheduling method according to another exemplary embodiment of the present disclosure.
- Fig. 7 is a flowchart of a task scheduling method according to another exemplary embodiment of the present disclosure.
- Fig. 8 is a schematic diagram showing a hardware resource context according to an exemplary embodiment of the present disclosure.
- Fig. 9 is a schematic diagram showing division of operation tasks according to an exemplary embodiment of the present disclosure.
- Fig. 10 is a schematic diagram showing an interface between a context collection and a scheduling system according to an exemplary embodiment of the present disclosure.
- Fig. 11 is a flowchart of a task scheduling method according to another exemplary embodiment of the present disclosure.
- Fig. 12 is a block diagram showing a task scheduling device according to an exemplary embodiment of the present disclosure.
- Fig. 13 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 14 is a block diagram showing a task scheduling device according to still another exemplary embodiment of the present disclosure.
- Fig. 15 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 16 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 17 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 18 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 19 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 20 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 21 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 22 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 23 is a block diagram showing a task scheduling device according to another exemplary embodiment of the present disclosure.
- Fig. 24 is a schematic structural diagram of an apparatus for task scheduling according to an exemplary embodiment of the present disclosure.
- terms such as first, second, and third may be used in this disclosure to describe various information, but the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other.
- first information may also be referred to as second information, and similarly, the second information may also be referred to as first information.
- the word "if" as used herein can be interpreted as "when", "upon", or "in response to determining".
- the embodiments of the present disclosure provide a task scheduling method, which can be used in a deep learning training platform, such as a neural network training platform, or other devices or platforms that involve hardware resource scheduling and need to improve scheduling efficiency.
- the following only takes the deep learning training platform as an example.
- Fig. 1 shows a task scheduling method according to an exemplary embodiment, which includes the following steps.
- step 101 the dependency relationship between the multiple operation tasks is determined according to the operands (operand) corresponding to the multiple operation tasks in the operation task queue.
- step 102 the multiple operation tasks in the operation task queue are scheduled based on the dependency between the multiple operation tasks.
- the foregoing embodiment determines the dependency relationship between the operation tasks based on the operands, minimizes the dependency between the operation tasks, and achieves the purpose of fine-grained and efficient scheduling.
- the dependency relationship between multiple operation tasks is determined based on the fine-grained operands included in the operation tasks.
- the operand is the data object of the operation task.
- the operand includes a read operand corresponding to a read operation and/or a write operand corresponding to a write operation.
- the operation task may include zero, one or more read operations, and may also include zero, one or more write operations, and accordingly, the operation task may correspond to one or more operands.
- in step 101, it may be determined whether there is a dependency between two operation tasks based on whether the operands of the two operation tasks overlap. For example, if the operands of the two operation tasks have no intersection, it is determined that there is no dependency between the two operation tasks. In some embodiments, the following manner may be used to determine the dependency between multiple operation tasks based on the operands.
- the first case there is a dependency between two operational tasks.
- if the second operation task includes a read operation on the write operand of the first operation task, or includes a write operation on an operand of the first operation task, the training platform can determine that the second operation task depends on the first operation task.
- for example, if operation task B in the operation task queue needs to read the write operand of operation task A, it can be determined that operation task B depends on operation task A. For another example, if operation task B in the operation task queue needs to write to the write operand and/or the read operand of operation task A, it can be determined that operation task B depends on operation task A.
- the second case there is no dependency between the two operation tasks.
- if the second operation task only includes a read operation on the read operand of the first operation task, the training platform can determine that there is no dependency between the first operation task and the second operation task. For example, if operation task B in the operation task queue only needs to read the read operand of operation task A, it can be determined that there is no dependency between operation task A and operation task B.
- if the operands corresponding to the first operation task and the second operation task do not intersect, the training platform can determine that there is no dependency between the first operation task and the second operation task. For example, if the operands of operation task A are a and b, and the operand of operation task B is c, then operation task A does not need to operate on any operand of operation task B and operation task B does not need to operate on any operand of operation task A, so it can be determined that there is no dependency between operation task A and operation task B.
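- The dependency rules above can be summarized in a small sketch. The following C++ fragment is illustrative only; the `OpTask` structure, its operand sets, and the function names are assumptions for this example, not the data structures of the disclosed training platform.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical representation of an operation task by its operand sets.
struct OpTask {
  std::set<std::string> reads;   // read operands
  std::set<std::string> writes;  // write operands
};

static bool Intersects(const std::set<std::string>& a,
                       const std::set<std::string>& b) {
  return std::any_of(a.begin(), a.end(),
                     [&](const std::string& x) { return b.count(x) > 0; });
}

// "second" depends on "first" when second reads a write operand of first, or
// writes any operand of first. Read-after-read and disjoint operand sets imply
// no dependency, so such tasks may be scheduled in parallel or in any order.
bool DependsOn(const OpTask& second, const OpTask& first) {
  return Intersects(second.reads, first.writes) ||   // read of a write operand
         Intersects(second.writes, first.writes) ||  // write of a write operand
         Intersects(second.writes, first.reads);     // write of a read operand
}
```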
- step 102 may include the following steps.
- step 102-1 the scheduling sequence of the multiple operation tasks is determined based on the dependency relationship between the multiple operation tasks.
- if there is no dependency between the first operation task and the second operation task, the training platform can schedule the first operation task and the second operation task in parallel, or schedule them in any order. For example, if there is no dependency between operation task A and operation task B, the training platform can schedule operation task A and operation task B in parallel or in any order.
- if the second operation task depends on the first operation task, the training platform schedules the second operation task after scheduling the first operation task. For example, if operation task B depends on operation task A, the training platform first schedules operation task A and then schedules operation task B.
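- As a minimal illustration of this pairwise decision, the sketch below (reusing the assumed `OpTask` and `DependsOn()` from the previous example) returns whether two tasks may be scheduled in parallel or which one must go first.

```cpp
enum class Order { kParallel, kAThenB, kBThenA };

// Independent tasks may be scheduled in parallel (or in any order); a dependent
// task is always scheduled after the task it depends on. If the tasks depend on
// each other (e.g. both write the same operand), the original queue order (a
// first) is kept.
Order DecideOrder(const OpTask& a, const OpTask& b) {
  const bool b_needs_a = DependsOn(b, a);
  const bool a_needs_b = DependsOn(a, b);
  if (!b_needs_a && !a_needs_b) return Order::kParallel;
  return b_needs_a ? Order::kAThenB : Order::kBThenA;
}
```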
- step 102-2 memory is allocated for the current operation task in the operation task queue.
- after the training platform determines the scheduling order of the multiple operation tasks based on the dependency relationship between the multiple operation tasks, it can schedule the multiple operation tasks in the operation task queue according to the scheduling order.
- the scheduling of a certain operation task in the operation task queue includes two processes: allocating the memory space required by the operation task and scheduling the operator of the operation task to the corresponding context for execution.
- step 102-3 after the memory allocation is completed, the current operation task is scheduled for execution on the context corresponding to the current operation task, and the memory allocation of the next operation task of the current operation task is performed according to the scheduling sequence.
- after allocating memory for the current operation task, the training platform may send the operation execution request of the current operation task to the context corresponding to the current operation task. At this point there is no need to wait for the execution of the current operation task to complete, and the training platform can begin to allocate memory for the next operation task in the scheduling order. In other words, after the memory allocation of the current operation task is completed, the execution of the current operation task and the scheduling of the next operation task can proceed in parallel: the scheduling of the next operation task only needs to wait for the memory allocation of the current operation task to complete, not for the execution of the current operation task to complete. In some embodiments, when allocating memory for an operation task, if the memory space is insufficient, it is necessary to wait for at least one earlier operation task in the scheduling sequence to complete and release enough memory space before allocating memory for the current operation task.
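- A minimal sketch of this allocate-then-dispatch pipeline is shown below. `Context`, `AllocateMemory`, and `DispatchAsync` are placeholder names assumed for illustration; the point is that dispatch is asynchronous, so memory allocation for the next task overlaps with execution of the current one.

```cpp
#include <cstddef>
#include <vector>

struct Context { int device_id; };                  // abstract resource + information flow
struct OpHandle { int id; std::size_t bytes; };     // operation task and its memory requirement

// Placeholder implementations assumed for this sketch; a real platform would
// track a memory pool and issue asynchronous work on the context's stream.
bool AllocateMemory(const OpHandle&) { return true; }
void DispatchAsync(const OpHandle&, Context*) {}
void WaitForMemoryRelease() {}

struct ScheduledTask { OpHandle op; Context* ctx; };

void RunQueue(const std::vector<ScheduledTask>& tasks_in_schedule_order) {
  for (const ScheduledTask& t : tasks_in_schedule_order) {
    // 1. Allocate the memory the current task needs; if space is insufficient,
    //    wait for earlier tasks to complete and release enough memory first.
    while (!AllocateMemory(t.op)) WaitForMemoryRelease();
    // 2. Hand the task to its context for asynchronous execution.
    DispatchAsync(t.op, t.ctx);
    // 3. Immediately continue to the next task's memory allocation, without
    //    waiting for the current task to finish executing.
  }
}
```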
- improving the communication efficiency between different computing nodes is also key to improving training efficiency, and communication performance affects the scalability of the training platform.
- GPU: Graphics Processing Unit.
- NCCL: NVIDIA Collective Communications Library, a multi-GPU communication library.
- the implementation mechanism of NCCL is to connect all communication nodes end to end to form a one-way ring and to transmit data fragments in a pipeline. Ideally, the communication time does not increase as the number of nodes grows and depends only on the total amount of data and the bandwidth; however, this holds only when the number of data fragments is far greater than the number of nodes. This characteristic makes NCCL very inefficient when the amount of data is small.
- Communication tasks are tasks performed between multiple computing nodes, while non-communication tasks are tasks performed on a single computing node.
- step 101 may include the following steps.
- step 101-1 the M communication operation tasks included in the multiple operation tasks are fused to obtain at least one combined communication operation task, and each combined communication operation task includes at least one communication operation task among the M communication operation tasks.
- M can be an integer greater than or equal to 1.
- the training platform can integrate multiple communication operation tasks to obtain one or more combined communication operation tasks.
- the number of communication operation tasks included in each combined communication operation task may be one or more.
- step 101-2 the dependency between the multiple operation tasks is determined according to the operand corresponding to at least one combined communication operation task and the operand corresponding to at least one non-communication operation task among the multiple operation tasks.
- the training platform may separately merge the read operands and/or write operands corresponding to the at least one communication operation task included in the combined communication operation task to obtain a fused set of read operands and/or a fused set of write operands; the set of read operands is then used as the read operand corresponding to the combined communication operation task, and the set of write operands is used as the write operand corresponding to the combined communication operation task.
- each operation task has its own read operands and write operands. By fusing at least one communication operation task, the read operands and/or write operands of multiple operation tasks can be unioned respectively.
- the training platform can determine the dependency between different combined communication operation tasks according to the operands corresponding to the combined communication operation tasks, and can determine the dependency between a combined communication operation task and at least one non-communication operation task according to the operand corresponding to the combined communication operation task and the operand corresponding to the at least one non-communication operation task among the multiple operation tasks.
- the manner of determining the dependency relationship is the same as the manner of determining the dependency relationship between at least two operation tasks in the foregoing embodiment, and will not be repeated here.
- the training platform may merge multiple communication operation tasks into at least one combined communication operation task, and then determine the dependency relationship between the combined communication operation tasks, and/or between at least one combined communication operation task and at least one non-communication operation task, according to the operand corresponding to the at least one combined communication operation task and the operand corresponding to at least one non-communication operation task among the multiple operation tasks. In this way, each communication carries a larger amount of data, so the training platform achieves higher communication efficiency.
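- A sketch of the fusion step is given below: the combined communication operation task's read and write operand sets are simply the unions of its members' operand sets, so the dependency rules above can then be applied to the combined task as a whole. The types are assumptions for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

struct CommTask {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

// Fuse M communication operation tasks into one combined communication task
// whose operand sets are the unions of the members' read and write operands.
CommTask FuseCommunicationTasks(const std::vector<CommTask>& members) {
  CommTask fused;
  for (const CommTask& t : members) {
    fused.reads.insert(t.reads.begin(), t.reads.end());
    fused.writes.insert(t.writes.begin(), t.writes.end());
  }
  return fused;
}
```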
- the operation task queue may be divided into a first operation task queue and a second operation task queue, where the first operation task queue contains the communication operation tasks among the multiple operation tasks.
- the first operation task queue is a remote communication queue.
- the second operation task queue contains non-communication operation tasks among the multiple operation tasks.
- the second operation task queue is a local queue.
- the remote communication queue includes operation tasks for data exchange between the current computing node and other computing nodes.
- the local queue includes tasks performed on the current computing node, for example, CPU (Central Processing Unit, central processing unit) computing, GPU computing, CPU to GPU data transmission, GPU to CPU data transmission, and other operational tasks.
- CPU Central Processing Unit, central processing unit
- the operation tasks contained in each of the above two queues are arranged in a scheduling order determined based on the dependencies between the multiple operation tasks in the respective operation task queue. Therefore, an operation task does not need to record information about all the operation tasks it depends on within the same queue; the correct order can be guaranteed by the queue's first-in-first-out mechanism.
- operation task B depends on operation task A, and the scheduling sequence is operation task A first, then operation task B.
- if operation task A and operation task B are both communication operation tasks, then in the first operation task queue, operation task A is arranged before operation task B.
- operation task A will be scheduled first, and then operation task B will be scheduled, so that operation task B does not need to record the information of dependent operation task A.
- if operation task A and operation task B are both non-communication operation tasks, then in the second operation task queue, operation task A is likewise arranged before operation task B, and operation task B does not need to record information about operation task A.
- the communication operation task can be executed faster.
- the operation task E depends on the operation task A, and the operation task A and the operation task E are both communication operation tasks.
- without splitting the queue, operation task A, operation task B, operation task C, and operation task D all need to be scheduled in the operation task queue before operation task E can be scheduled.
- if the operation task queue is divided into the first operation task queue and the second operation task queue, the first operation task queue includes operation task A and operation task E, and the second operation task queue includes operation task B, operation task C, and operation task D.
- the operation task E can be executed after the operation task A is executed, without waiting for the operation task B, the operation task C, and the operation task D to be all executed.
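- The queue split in this example can be sketched as follows; `is_communication` is an assumed flag, and FIFO dispatch within each queue preserves the intra-queue dependency order without recording it explicitly.

```cpp
#include <deque>
#include <vector>

struct Task {
  int id;
  bool is_communication;  // true: data exchange with other nodes; false: local work
};

// Split one dependency-ordered task queue into a remote communication queue
// (first operation task queue) and a local queue (second operation task queue),
// preserving the original order within each queue.
void SplitQueues(const std::vector<Task>& all_tasks,
                 std::deque<Task>* remote_queue,
                 std::deque<Task>* local_queue) {
  for (const Task& t : all_tasks) {
    (t.is_communication ? remote_queue : local_queue)->push_back(t);
  }
}
```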
- the foregoing task scheduling method may further include the following steps.
- step 103 the dependency information between the first operation task queue and the second operation task queue is recorded. If an operation task in the first operation task queue depends on one or more operation tasks in the second operation task queue, the dependency information includes information about the last operation task among the one or more operation tasks; and/or, if an operation task in the second operation task queue depends on at least one operation task in the first operation task queue, the dependency information includes information about the last operation task among the at least one operation task.
- the training platform can record the information of the last operation task on which the two operation task queues depend, for example, as shown in FIG. 5. For example, if the operation task A in the first operation task queue depends on the operation tasks B and C of the second operation task queue, and the operation task C depends on the operation task B, only the dependence of the operation task A on the operation task C needs to be recorded.
- step 102 may include: scheduling the operation tasks in the first operation task queue and the second operation task queue based on the dependency information between the first operation task queue and the second operation task queue.
- the communication operation task in the first operation task queue can be executed more quickly without waiting for the completion of all non-communication operation tasks.
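- A minimal sketch of this cross-queue bookkeeping is shown below, under the assumption that tasks are identified by integer ids: only the last task depended on in the other queue is recorded, and a waiting task becomes dispatchable once that single task has completed.

```cpp
#include <optional>
#include <unordered_set>

// Dependency information between the two queues: for a waiting task, only the
// last operation task (in scheduling order) that it depends on in the other
// queue is recorded; earlier dependencies are implied by FIFO ordering.
struct CrossQueueDep {
  int waiting_task_id;
  int last_dependency_id;
};

bool CanDispatch(const std::optional<CrossQueueDep>& dep,
                 const std::unordered_set<int>& completed_task_ids) {
  return !dep.has_value() ||
         completed_task_ids.count(dep->last_dependency_id) > 0;
}
```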
- memory management is a complex and critical issue. If memory reclamation cannot be performed correctly and in time, the target operation task cannot be scheduled as soon as possible, which is equivalent to reducing the number of samples that can be computed. To solve this problem, the embodiments of the present disclosure propose setting the priority corresponding to the memory reclamation operation task to the highest.
- FIG. 6 is another task scheduling method shown on the basis of the embodiment shown in FIG. 4, and the above task scheduling method may further include the following steps.
- step 104 the priority corresponding to the memory reclamation operation task is set to be the highest, where the second operation task queue includes non-communication operation tasks in the operation task queue except for the memory reclamation operation task.
- after a target operation task is scheduled to the corresponding context, the training platform can delete the operation task from its task queue and also remove its corresponding set of memory reclamation operations. That is, after a certain target operation task is scheduled to the corresponding context, the training platform can arrange the operation of reclaiming the memory occupied by the target operation task in advance, so that the occupied memory can be quickly reclaimed after the target operation task is completed.
- the memory reclamation operation described in this step is a logical operation: it marks the memory as available for re-allocation, and operation tasks that have not yet been executed can still continue to use this memory. Since, in the embodiments of the present disclosure, any operation task performed on the re-allocated memory must be arranged after the operation tasks issued before the reclamation, the execution order on the stream ensures that the two operation tasks will not conflict.
- the memory reclamation operation is not stored in the first operation task queue or the second operation task queue like other target operation tasks, but is stored in an additional map data structure.
- the key is the non-reclamation operation task that the memory reclamation operation depends on, that is, the corresponding target operation task, and the memory reclamation operations are stored in the vector pointed to by the value.
- the priority corresponding to the memory recovery operation task is set to the highest, and the memory recovery operation is stored independently to ensure that memory resources are cleaned up in time, making the memory recovery of the training platform more efficient.
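- The description above suggests bookkeeping along the following lines; this is an illustrative sketch, not the patent's actual data structures. Memory reclamation operations are kept out of both task queues and stored in a map keyed by the operation task they depend on; once that task has been dispatched, its reclamation operations are released at the highest priority.

```cpp
#include <map>
#include <vector>

struct ReclaimOp {
  void* buffer;  // memory region to mark as available for re-allocation
};

// key: id of the (non-reclamation) operation task the reclamation depends on;
// value: the reclamation operations released once that task has run.
using ReclaimMap = std::map<int, std::vector<ReclaimOp>>;

void ReleaseReclaimOps(ReclaimMap* pending, int finished_task_id,
                       std::vector<ReclaimOp>* highest_priority_ops) {
  auto it = pending->find(finished_task_id);
  if (it == pending->end()) return;
  // Reclamation runs at the highest priority so the memory can be re-allocated
  // as soon as possible; tasks already issued on the stream may still use it,
  // because any task using the re-allocated memory is ordered after them.
  highest_priority_ops->insert(highest_priority_ops->end(),
                               it->second.begin(), it->second.end());
  pending->erase(it);
}
```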
- the foregoing task scheduling method may further include the following steps.
- step 100 a context corresponding to each of the multiple operation tasks is determined, where the context corresponding to the operation task includes abstract resources and information flows.
- the training platform can abstract the hardware resources in the system and provide unified logical management and interfaces for each hardware resource, so that the following abstract resources can be obtained: GPU computing resources, PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) uplink transmission resources, PCIE downlink transmission resources, IB (InfiniBand) network resources, and CPU computing resources.
- the operation task queue of each abstract resource can be further encapsulated.
- the operation tasks corresponding to GPU computing resources, PCIE uplink transmission resources, PCIE downlink transmission resources, and IB network resources are essentially encapsulations of asynchronous CUDA Streams, so the information flow corresponding to the GPU computing resource, the PCIE uplink transmission resource, the PCIE downlink transmission resource, and the IB network resource may be the CUDA (Compute Unified Device Architecture) information flow.
- the information flow corresponding to the CPU computing resource may be a Host Stream.
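- One possible shape for such a context is sketched below using the real CUDA runtime API for the stream handle; the struct and enum names are assumptions for illustration. GPU computing, PCIE uplink/downlink, and IB network resources pair with asynchronous CUDA streams, while the CPU computing resource pairs with a host stream.

```cpp
#include <cuda_runtime.h>

enum class ResourceKind { kGpuCompute, kPcieUplink, kPcieDownlink, kIbNetwork, kCpuCompute };

struct HostStream {
  // e.g. a worker thread plus a task queue; omitted for brevity.
};

// A context = one abstract resource + one information flow.
struct HardwareContext {
  ResourceKind kind;
  cudaStream_t cuda_stream = nullptr;  // set when the flow is a CUDA stream
  HostStream* host_stream = nullptr;   // set when the flow is a host stream
};

HardwareContext MakeGpuComputeContext() {
  HardwareContext ctx;
  ctx.kind = ResourceKind::kGpuCompute;
  cudaStreamCreate(&ctx.cuda_stream);  // asynchronous CUDA information flow
  return ctx;
}
```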
- the training platform can divide the multiple operation tasks in the operation task queue into operation tasks corresponding to each abstract resource according to the context corresponding to each hardware resource.
- multiple operation tasks can be used as the operation task library 900, which can be divided into built-in operation tasks and extensible operation tasks.
- the built-in operation task 910 can implement system built-in functions, such as uplink transmission 913, downlink transmission 912, memory recovery 911, and so on.
- Scalable operation tasks can be added by users of the training platform as needed, and optionally can be further divided into computing operation tasks 920 and communication operation tasks 930.
- the calculation operation task 920 can be further divided into a CPU calculation operation task 921 and a GPU calculation operation task 922.
- each operation task after division corresponds to a corresponding context.
- the foregoing step 102 includes: scheduling multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependency relationship between the multiple operation tasks.
- the training platform can schedule multiple operation tasks based on the context corresponding to each operation task and the dependency relationship between the multiple operation tasks.
- the specific implementation process is as follows.
- the first case there is no dependency between the two operation tasks.
- the training platform can schedule the two operation tasks in parallel when there is no dependency between two operation tasks among multiple operation tasks and the two operation tasks correspond to different abstract resources.
- the second case there is a dependency between two operational tasks.
- if there is a dependency between the third operation task and the fourth operation task, the training platform needs to determine the information flows corresponding to the third operation task and the fourth operation task, and, based on those information flows, determine the synchronization interface to use for the third operation task and the fourth operation task.
- the training platform may call the first synchronization interface to synchronize the third operation task and the fourth operation task.
- the first synchronization interface may be the cudaStreamWaitEvent() interface.
- operations such as status query and waiting for completion of the operation task in the operation task queue can be performed through CUDA Event (event).
- based on CUDA Events, this provides a fine-grained and lightweight implicit synchronization method. If the training platform detects that any two of the above operation tasks are dependent and the corresponding information flows are both CUDA Streams, the training platform can call the first synchronization interface.
- the first synchronization interface is called to determine the current state of the third operation task, for example, whether the third operation task is scheduled, so as to synchronize the third operation task and the fourth operation task.
- the synchronization here is to ensure the correctness of the calculation results of the two operation tasks that have dependencies. For example, assuming that the fourth operation task depends on the third operation task, the purpose of synchronizing the third operation task and the fourth operation task is to make the fourth operation task wait for the third operation task to be executed before starting.
- if the information flow corresponding to at least one of the third operation task and the fourth operation task is the host information flow, the second synchronization interface can be called to synchronize the third operation task and the fourth operation task.
- the second synchronization interface may be the cudaStreamSynchronize() interface.
- the first synchronization interface may be called to synchronize the third operation task and the fourth operation task.
- the delay and overhead generated by the training platform when calling the first synchronization interface are less than the delay and overhead when calling the second synchronization interface, so that more efficient hardware system utilization can be achieved.
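- The two synchronization paths can be sketched with the standard CUDA runtime calls named above; the surrounding task and stream bookkeeping is assumed. When both dependent tasks run on CUDA streams, an event recorded on the producer stream lets the consumer stream wait without blocking the host; when a host information flow is involved, the host must block on the producer stream instead.

```cpp
#include <cuda_runtime.h>

// First synchronization interface: both tasks run on CUDA streams.
// The consumer stream waits on an event recorded on the producer stream;
// the host thread is not blocked.
void SyncCudaToCuda(cudaStream_t producer, cudaStream_t consumer) {
  cudaEvent_t done;
  cudaEventCreateWithFlags(&done, cudaEventDisableTiming);
  cudaEventRecord(done, producer);
  cudaStreamWaitEvent(consumer, done, 0);
  cudaEventDestroy(done);  // destruction is deferred until the event completes
}

// Second synchronization interface: at least one task runs on the host
// information flow, so the host blocks until the producer stream has drained.
void SyncWithHost(cudaStream_t producer) {
  cudaStreamSynchronize(producer);
}
```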
- there may be a first interface and a second interface between the hardware resource contexts and the scheduling system of the training platform, as shown in FIG. 10.
- the first interface is used for the scheduling system to publish operation tasks that need to be scheduled to the specified context.
- the first interface may be a schedule() interface.
- the second interface is used for the scheduling system to synchronize any context.
- the second interface may be a synchronize() interface.
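- A hypothetical sketch of these two interfaces is shown below; the class shape and the representation of an operation task as a callable are assumptions, and only the interface names schedule() and synchronize() come from the description above.

```cpp
#include <functional>

// Interface between the scheduling system and one hardware-resource context.
class ContextInterface {
 public:
  virtual ~ContextInterface() = default;

  // Publish an operation task (modelled here as a callable) to this context
  // for asynchronous execution.
  virtual void schedule(std::function<void()> operation_task) = 0;

  // Block until every task previously published to this context has completed.
  virtual void synchronize() = 0;
};
```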
- the task scheduling method may include the following steps.
- step 201 a context corresponding to each of the multiple operation tasks in the operation task queue is obtained.
- step 202 multiple communication operation tasks included in the multiple operation tasks are fused to obtain at least one combined communication operation task.
- step 203 the dependency relationship between the multiple operation tasks is determined based on the operand corresponding to at least one combined communication operation task and the operand corresponding to at least one non-communication operation task among the multiple operation tasks.
- if the second operation task of the multiple operation tasks includes a read operation on the write operand of the first operation task, or the second operation task includes a write operation on an operand of the first operation task, it is determined that the second operation task depends on the first operation task.
- if the second operation task includes a read operation performed on the read operand of the first operation task, it is determined that there is no dependency between the first operation task and the second operation task.
- step 204 the at least one combined communication operation task is used as the first operation task queue, and the non-communication operation task among the multiple operation tasks is used as the second operation task queue.
- the operation task queue may include a first operation task queue and a second operation task queue, where the first operation task queue includes the combined communication operation tasks among the multiple operation tasks, and the second operation task queue includes the non-communication operation tasks among the multiple operation tasks. The operation tasks contained in the first operation task queue and the second operation task queue are arranged in a scheduling sequence determined based on the dependency relationship between the multiple operation tasks.
- step 205 the dependency information between the first operation task queue and the second operation task queue is recorded.
- step 206 based on the dependency information between the first operation task queue and the second operation task queue, the operation tasks in the first operation task queue and the second operation task queue are scheduled.
- step 207 the priority corresponding to the memory reclamation operation task is set to the highest.
- the foregoing second operation task queue includes non-communication operation tasks in the operation task queue except for memory recovery operation tasks.
- Step 207 may be performed after the dependent operation task is scheduled to the corresponding context.
- the execution process of all the above steps is consistent with the execution process provided in the previous embodiment, and will not be repeated here.
- in the embodiments of the present disclosure, the dependency relationship between the multiple operation tasks is determined based on operands, the dependencies between operations are minimized, and an efficient dependency analysis and scheduling strategy is realized.
- the embodiments of the present disclosure also provide efficient communication fusion and overlap solutions: communication fusion improves the efficiency of the communication operations themselves, communication overlap improves the overall efficiency of the training process, and the problems of dependency fusion and interdependence between multiple task queues introduced during fusion and overlap are solved.
- the memory recovery operation is defined as the highest priority operation to ensure the timely cleaning of memory resources and achieve the purpose of efficient memory recovery.
- the present disclosure also provides apparatus embodiments.
- FIG. 12 is a block diagram of a task scheduling device according to an exemplary embodiment of the present disclosure.
- the device includes a dependency determination module 310 and a scheduling module 320.
- the dependency determination module 310 is configured to determine the dependency between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue.
- the scheduling module 320 is configured to schedule multiple operation tasks in the operation task queue based on the dependency relationship between the multiple operation tasks.
- the operand corresponding to the operation task includes a read operand and/or a write operand.
- FIG. 13 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- This embodiment is based on the foregoing embodiment in FIG. 12, and the dependency determination module 310 includes: a first determining submodule 311 configured to determine that the second operation task depends on the first operation task if the second operation task includes a read operation on the write operand of the first operation task, or the second operation task includes a write operation on an operand of the first operation task; wherein the first operation task and the second operation task are different operation tasks in the operation task queue.
- FIG. 14 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- this embodiment further includes: a second determining submodule 312 configured to determine that there is no dependency between the first operation task and the second operation task if the second operation task includes a read operation on the read operand of the first operation task.
- FIG. 15 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- the scheduling module 320 includes: a scheduling sequence determination submodule 321 configured to determine the scheduling sequence of the multiple operation tasks based on the dependency relationship between the multiple operation tasks; a first execution submodule 322 configured to allocate memory for the current operation task in the operation task queue; and a second execution submodule 323 configured to schedule the current operation task for execution on the context corresponding to the current operation task after the memory allocation is completed, and to perform the memory allocation of the next operation task after the current operation task according to the scheduling sequence.
- FIG. 16 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- the scheduling sequence determining submodule 321 includes: a first determining unit 3211 configured to determine to schedule the first operation task and the second operation task in parallel if there is no dependency between the first operation task and the second operation task among the multiple operation tasks; and/or a second determining unit 3212 configured to determine to schedule the second operation task after the first operation task if the second operation task depends on the first operation task.
- FIG. 17 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- this embodiment includes: a fusion submodule 313 configured to perform fusion processing on the M communication operation tasks included in the multiple operation tasks to obtain at least one combined communication operation task, where each combined communication operation task includes at least one communication operation task among the M communication operation tasks and M is an integer greater than or equal to 1; and a third determining submodule 314 configured to determine the dependency between the multiple operation tasks based on the operand corresponding to the at least one combined communication operation task and the operand corresponding to at least one non-communication operation task among the multiple operation tasks.
- the operand corresponding to the combined communication operation task includes: a set of read operands corresponding to at least one communication operation task included in the combined communication operation task, and/or a set of write operands corresponding to at least one communication operation task included in the combined communication operation task.
- the operation task queue includes a first operation task queue and a second operation task queue, where the first operation task queue contains the communication operation tasks among the multiple operation tasks, and the second operation task queue contains the non-communication operation tasks among the multiple operation tasks.
- FIG. 18 is a block diagram of another task scheduling apparatus according to an exemplary embodiment of the present disclosure.
- This embodiment is based on the foregoing embodiment in FIG. 12, and the apparatus further includes: a recording module 330 configured to record the dependency information between the first operation task queue and the second operation task queue, wherein, if an operation task in the first operation task queue depends on at least one operation task in the second operation task queue, or if an operation task in the second operation task queue depends on at least one operation task in the first operation task queue, the dependency information includes information about the last operation task among the at least one operation task.
- the scheduling module 320 includes a first scheduling submodule 324 configured to schedule the first operation task queue and the second operation task queue based on the dependency information between the first operation task queue and the second operation task queue Operation tasks in the operation task queue.
- FIG. 19 is a block diagram of another task scheduling device according to an exemplary embodiment of the present disclosure. This embodiment is based on the foregoing embodiment in FIG. 18, and the device further includes: a priority setting module 340 configured to set the priority corresponding to the memory reclamation operation task to the highest, where the second operation task queue includes the non-communication operation tasks in the operation task queue other than the memory reclamation operation task.
- FIG. 20 is a block diagram of another task scheduling device according to an exemplary embodiment of the present disclosure. This embodiment is based on the aforementioned embodiment in FIG. 19, and the device further includes: a context determining module 350 configured to determine the context corresponding to each of the multiple operation tasks, where the context corresponding to an operation task includes an abstract resource and an information flow.
- the scheduling module 320 includes a second scheduling submodule 325 configured to schedule the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks.
- the information flow includes a Compute Unified Device Architecture (CUDA) information flow and/or a host information flow.
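One way to picture the context attached to each operation task is a small record pairing an abstract resource identifier with an information flow that is either a CUDA stream or the host thread. The types below are illustrative assumptions (compiling them requires the CUDA toolkit headers), not the disclosure's data structures.

```cpp
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical information flow: either a CUDA stream or the host thread.
enum class FlowKind { Cuda, Host };

struct InfoFlow {
    FlowKind kind = FlowKind::Host;
    cudaStream_t stream = nullptr;  // valid only when kind == FlowKind::Cuda
};

// Hypothetical context attached to each operation task.
struct Context {
    std::uint32_t abstract_resource = 0;  // e.g., a device or engine identifier
    InfoFlow flow;
};
```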
- FIG. 21 is a block diagram of another scheduling device according to an exemplary embodiment of the present disclosure. This embodiment is based on the foregoing embodiment in FIG. 20.
- the second scheduling submodule 325 includes: a first scheduling unit 3251 configured to schedule at least two operation tasks in parallel if there is no dependency between the at least two operation tasks among the multiple operation tasks and the at least two operation tasks correspond to different abstract resources.
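A sketch of the condition checked by such a first scheduling unit, assuming the dependency flags come from the operand-based analysis and the abstract resource identifiers come from each task's context:

```cpp
#include <cstdint>

// Two tasks may be issued concurrently only when no dependency links them
// (in either direction) and their contexts name different abstract resources.
bool canScheduleInParallel(std::uint32_t resource_a, std::uint32_t resource_b,
                           bool a_depends_on_b, bool b_depends_on_a) {
    return !a_depends_on_b && !b_depends_on_a && resource_a != resource_b;
}
```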
- FIG. 22 is a block diagram of another scheduling apparatus according to an exemplary embodiment of the present disclosure.
- This embodiment is based on the foregoing embodiment in FIG. 20, and the second scheduling submodule 325 includes: a second scheduling unit 3252 configured to, if there is a dependency between a third operation task and a fourth operation task among the multiple operation tasks and the information flows corresponding to the third operation task and the fourth operation task are both CUDA information flows, call a first synchronization interface to synchronize the third operation task and the fourth operation task.
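When both dependent tasks run on CUDA information flows, a common realization of such a synchronization interface is an event recorded on the producer's stream and waited on by the consumer's stream, which keeps the ordering on the device without blocking the host. The sketch below shows that standard CUDA runtime pattern; it is an assumed realization, not necessarily the interface used in the disclosure.

```cpp
#include <cuda_runtime.h>

// Make `consumer` wait for all work submitted to `producer` so far,
// without blocking the host thread.
cudaError_t syncStreams(cudaStream_t producer, cudaStream_t consumer) {
    cudaEvent_t done = nullptr;
    cudaError_t err = cudaEventCreateWithFlags(&done, cudaEventDisableTiming);
    if (err != cudaSuccess) return err;
    err = cudaEventRecord(done, producer);            // mark the producer's progress
    if (err == cudaSuccess) {
        err = cudaStreamWaitEvent(consumer, done, 0); // consumer waits on the device
    }
    cudaEventDestroy(done);  // safe: the stream dependency is already captured
    return err;
}
```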
- FIG. 23 is a block diagram of another scheduling apparatus according to an exemplary embodiment of the present disclosure.
- This embodiment is based on the foregoing embodiment in FIG. 20, and the second scheduling submodule 325 includes: a third scheduling unit 3253 configured to, if there is a dependency between the third operation task and the fourth operation task among the multiple operation tasks, and the information flow corresponding to at least one of the third operation task and the fourth operation task is a host information flow, call a second synchronization interface to synchronize the third operation task and the fourth operation task.
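When at least one of the dependent tasks runs on the host information flow, a device-side wait cannot express the ordering, so a blocking host-side wait is one plausible form of such a second synchronization interface. The sketch below blocks the host until the producer's CUDA stream has drained before running the host-side task; again, an assumed realization rather than the disclosed interface.

```cpp
#include <cuda_runtime.h>

// Block the host until everything previously submitted to `producer`
// has completed, so a dependent host-flow task can safely run afterwards.
template <typename HostTask>
cudaError_t runAfterStream(cudaStream_t producer, HostTask host_task) {
    cudaError_t err = cudaStreamSynchronize(producer);  // host blocks here
    if (err == cudaSuccess) {
        host_task();  // the dependent host-flow operation task
    }
    return err;
}
```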
- for relevant parts, reference may be made to the corresponding description of the method embodiments.
- the device embodiments described above are merely illustrative.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place, or they may be distributed across multiple network units.
- Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the present disclosure. Those of ordinary skill in the art can understand and implement them without creative effort.
- An embodiment of the present disclosure also provides a non-volatile computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute any of the above-mentioned task scheduling methods.
- An embodiment of the present disclosure also provides a task scheduling device, which includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement any of the above task scheduling methods.
- FIG. 24 is a schematic structural diagram of a task scheduling apparatus 2400 according to an exemplary embodiment.
- the device 2400 may be provided as a task scheduling device.
- the apparatus 2400 includes a processing component 2422, which further includes one or more processors, and a memory resource represented by a memory 2432, for storing instructions executable by the processing component 2422, such as application programs.
- the application program stored in the memory 2432 may include one or more modules each corresponding to a set of instructions.
- the processing component 2422 is configured to execute instructions to perform any of the task scheduling methods described above.
- the device 2400 may further include a power component 2426 configured to perform power management of the device 2400, a wired or wireless network interface 2450 configured to connect the device 2400 to the network, and an input/output interface 2458.
- the device 2400 can operate based on an operating system stored in the memory 2432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
- the embodiments of the present disclosure also provide a computer program, the computer program including instructions for implementing the method in any of the foregoing possible implementation manners.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Advance Control (AREA)
- Stored Programmes (AREA)
Abstract
Description
Claims (34)
- A task scheduling method, wherein the method comprises: determining dependencies between multiple operation tasks in an operation task queue according to operands corresponding to the multiple operation tasks; and scheduling the multiple operation tasks in the operation task queue based on the dependencies between the multiple operation tasks.
- The method according to claim 1, wherein the operands corresponding to an operation task comprise read operands and/or write operands.
- The method according to claim 1 or 2, wherein the determining the dependencies between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue comprises: if a second operation task includes a read operation on a write operand of a first operation task, or the second operation task includes a write operation on an operand of the first operation task, determining that the second operation task depends on the first operation task; wherein the first operation task and the second operation task are different operation tasks in the operation task queue.
- The method according to any one of claims 1 to 3, wherein the determining the dependencies between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue further comprises: if the second operation task includes a read operation on a read operand of the first operation task, determining that there is no dependency between the first operation task and the second operation task.
- The method according to any one of claims 1 to 4, wherein the scheduling the multiple operation tasks in the operation task queue based on the dependencies between the multiple operation tasks comprises: determining a scheduling order of the multiple operation tasks based on the dependencies between the multiple operation tasks; allocating memory for a current operation task in the operation task queue; after the memory allocation is completed, scheduling the current operation task to be executed on a context corresponding to the current operation task; and performing memory allocation for a next operation task of the current operation task according to the scheduling order.
- The method according to claim 5, wherein the determining the scheduling order of the multiple operation tasks based on the dependencies between the multiple operation tasks comprises: if there is no dependency between a first operation task among the multiple operation tasks and a second operation task among the multiple operation tasks, determining to invoke the first operation task and the second operation task in parallel; and/or if the second operation task depends on the first operation task, determining to schedule the second operation task after the first operation task.
- The method according to any one of claims 1 to 6, wherein the determining the dependencies between the multiple operation tasks according to the operands corresponding to the multiple operation tasks in the operation task queue comprises: performing fusion processing on M communication operation tasks included in the multiple operation tasks to obtain at least one merged communication operation task, each merged communication operation task including at least one of the M communication operation tasks, where M is an integer greater than or equal to 1; and determining the dependencies between the multiple operation tasks according to operands corresponding to the at least one merged communication operation task and operands corresponding to at least one non-communication operation task among the multiple operation tasks.
- The method according to claim 7, wherein the operands corresponding to the merged communication operation task comprise: a set of read operands corresponding to the at least one communication operation task included in the merged communication operation task, and/or a set of write operands corresponding to the at least one communication operation task included in the merged communication operation task.
- The method according to any one of claims 1 to 8, wherein the operation task queue comprises a first operation task queue and a second operation task queue, wherein the first operation task queue contains the communication operation tasks among the multiple operation tasks, and the second operation task queue contains the non-communication operation tasks among the multiple operation tasks; and the operation tasks contained in the first operation task queue and the second operation task queue are arranged in a scheduling order determined based on the dependencies between the multiple operation tasks.
- The method according to claim 9, wherein the method further comprises: recording dependency information between the first operation task queue and the second operation task queue, wherein, if an operation task in the first operation task queue depends on at least one operation task in the second operation task queue, or if an operation task in the second operation task queue depends on at least one operation task in the first operation task queue, the dependency information comprises information about the last operation task among the at least one operation task; and the scheduling the multiple operation tasks in the operation task queue based on the dependencies between the multiple operation tasks comprises: scheduling the operation tasks in the first operation task queue and the second operation task queue based on the dependency information between the first operation task queue and the second operation task queue.
- The method according to claim 9 or 10, wherein the method further comprises: setting a priority corresponding to a memory reclamation operation task to the highest, wherein the second operation task queue comprises the non-communication operation tasks in the operation task queue other than the memory reclamation operation task.
- The method according to any one of claims 1 to 11, wherein, before the determining the dependencies between the multiple operation tasks, the method further comprises: determining a context corresponding to each of the multiple operation tasks, wherein the context corresponding to an operation task comprises an abstract resource and an information flow; and the scheduling the multiple operation tasks in the operation task queue based on the dependencies between the multiple operation tasks comprises: scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks.
- The method according to claim 12, wherein the information flow comprises a Compute Unified Device Architecture (CUDA) information flow and/or a host information flow.
- The method according to claim 12 or 13, wherein the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks comprises: if there is no dependency between at least two operation tasks among the multiple operation tasks and the at least two operation tasks correspond to different abstract resources, scheduling the at least two operation tasks in parallel.
- The method according to any one of claims 12 to 14, wherein the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks comprises: if there is a dependency between a third operation task and a fourth operation task among the multiple operation tasks, and the information flows corresponding to the third operation task and the fourth operation task are both CUDA information flows, calling a first synchronization interface to synchronize the third operation task and the fourth operation task.
- The method according to any one of claims 12 to 15, wherein the scheduling the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks comprises: if there is a dependency between the third operation task and the fourth operation task among the multiple operation tasks, and the information flow corresponding to at least one of the third operation task and the fourth operation task is a host information flow, calling a second synchronization interface to synchronize the third operation task and the fourth operation task.
- A task scheduling apparatus, wherein the apparatus comprises: a dependency determining module configured to determine dependencies between multiple operation tasks in an operation task queue according to operands corresponding to the multiple operation tasks; and a scheduling module configured to schedule the multiple operation tasks in the operation task queue based on the dependencies between the multiple operation tasks.
- The apparatus according to claim 17, wherein the operands corresponding to an operation task comprise read operands and/or write operands.
- The apparatus according to claim 17 or 18, wherein the dependency determining module comprises: a first determining submodule configured to, if a second operation task includes a read operation on a write operand of a first operation task, or the second operation task includes a write operation on an operand of the first operation task, determine that the second operation task depends on the first operation task; wherein the first operation task and the second operation task are different operation tasks in the operation task queue.
- The apparatus according to any one of claims 17 to 19, wherein the dependency determining module further comprises: a second determining submodule configured to, if the second operation task includes a read operation on a read operand of the first operation task, determine that there is no dependency between the first operation task and the second operation task.
- The apparatus according to any one of claims 18 to 20, wherein the scheduling module comprises: a scheduling order determining submodule configured to determine a scheduling order of the multiple operation tasks based on the dependencies between the multiple operation tasks; a first execution submodule configured to allocate memory for a current operation task in the operation task queue; and a second execution submodule configured to, after the memory allocation is completed, schedule the current operation task to be executed on a context corresponding to the current operation task, and perform memory allocation for a next operation task of the current operation task according to the scheduling order.
- The apparatus according to claim 21, wherein the scheduling order determining submodule comprises: a first determining unit configured to, if there is no dependency between a first operation task among the multiple operation tasks and a second operation task among the multiple operation tasks, determine to invoke the first operation task and the second operation task in parallel; and/or a second determining unit configured to, if the second operation task depends on the first operation task, determine to schedule the second operation task after the first operation task.
- The apparatus according to any one of claims 17 to 22, wherein the dependency determining module comprises: a fusion submodule configured to perform fusion processing on M communication operation tasks included in the multiple operation tasks to obtain at least one merged communication operation task, each merged communication operation task including at least one of the M communication operation tasks, where M is an integer greater than or equal to 1; and a third determining submodule configured to determine the dependencies between the multiple operation tasks according to operands corresponding to the at least one merged communication operation task and operands corresponding to at least one non-communication operation task among the multiple operation tasks.
- The apparatus according to claim 23, wherein the operands corresponding to the merged communication operation task comprise: a set of read operands corresponding to the at least one communication operation task included in the merged communication operation task, and/or a set of write operands corresponding to the at least one communication operation task included in the merged communication operation task.
- The apparatus according to any one of claims 17 to 24, wherein the operation task queue comprises a first operation task queue and a second operation task queue, wherein the first operation task queue contains the communication operation tasks among the multiple operation tasks, and the second operation task queue contains the non-communication operation tasks among the multiple operation tasks; and the operation tasks contained in the first operation task queue and the second operation task queue are arranged in a scheduling order determined based on the dependencies between the multiple operation tasks in the respective operation task queues.
- The apparatus according to claim 25, wherein the apparatus further comprises: a recording module configured to record dependency information between the first operation task queue and the second operation task queue, wherein, if an operation task in the first operation task queue depends on at least one operation task in the second operation task queue, or if an operation task in the second operation task queue depends on at least one operation task in the first operation task queue, the dependency information comprises information about the last operation task among the at least one operation task; and the scheduling module comprises: a first scheduling submodule configured to schedule the operation tasks in the first operation task queue and the second operation task queue based on the dependency information between the first operation task queue and the second operation task queue.
- The apparatus according to claim 25 or 26, wherein the apparatus further comprises: a priority setting module configured to set a priority corresponding to a memory reclamation operation task to the highest, wherein the second operation task queue comprises the non-communication operation tasks in the operation task queue other than the memory reclamation operation task.
- The apparatus according to any one of claims 17 to 27, wherein the apparatus further comprises: a context determining module configured to determine a context corresponding to each of the multiple operation tasks, wherein the context corresponding to an operation task comprises an abstract resource and an information flow; and the scheduling module comprises: a second scheduling submodule configured to schedule the multiple operation tasks in the operation task queue based on the context corresponding to each of the multiple operation tasks and the dependencies between the multiple operation tasks.
- The apparatus according to claim 28, wherein the information flow comprises a Compute Unified Device Architecture (CUDA) information flow and/or a host information flow.
- The apparatus according to claim 28 or 29, wherein the second scheduling submodule comprises: a first scheduling unit configured to, if there is no dependency between at least two operation tasks among the multiple operation tasks and the at least two operation tasks correspond to different abstract resources, schedule the at least two operation tasks in parallel.
- The apparatus according to any one of claims 28 to 30, wherein the second scheduling submodule comprises: a second scheduling unit configured to, if there is a dependency between a third operation task and a fourth operation task among the multiple operation tasks and the information flows corresponding to the third operation task and the fourth operation task are both CUDA information flows, call a first synchronization interface to synchronize the third operation task and the fourth operation task.
- The apparatus according to any one of claims 28 to 31, wherein the second scheduling submodule comprises: a third scheduling unit configured to, if there is a dependency between the third operation task and the fourth operation task among the multiple operation tasks, and the information flow corresponding to at least one of the third operation task and the fourth operation task is a host information flow, call a second synchronization interface to synchronize the third operation task and the fourth operation task.
- A non-volatile computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to execute the task scheduling method according to any one of claims 1 to 16.
- A task scheduling apparatus, wherein the apparatus comprises: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement the task scheduling method according to any one of claims 1 to 16.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202010574PA SG11202010574PA (en) | 2019-03-15 | 2019-12-11 | Task scheduling |
JP2020561765A JP7050957B2 (ja) | 2019-03-15 | 2019-12-11 | タスクスケジューリング |
KR1020207030753A KR20200136468A (ko) | 2019-03-15 | 2019-12-11 | 작업 스케줄링 |
US17/077,186 US11347546B2 (en) | 2019-03-15 | 2020-10-22 | Task scheduling method and device, and computer storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910200097.3A CN111694675B (zh) | 2019-03-15 | 2019-03-15 | 任务调度方法及装置、存储介质 |
CN201910200097.3 | 2019-03-15 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/077,186 Continuation US11347546B2 (en) | 2019-03-15 | 2020-10-22 | Task scheduling method and device, and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020186836A1 true WO2020186836A1 (zh) | 2020-09-24 |
Family
ID=72475505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/124494 WO2020186836A1 (zh) | 2019-03-15 | 2019-12-11 | 任务调度 |
Country Status (7)
Country | Link |
---|---|
US (1) | US11347546B2 (zh) |
JP (1) | JP7050957B2 (zh) |
KR (1) | KR20200136468A (zh) |
CN (1) | CN111694675B (zh) |
SG (1) | SG11202010574PA (zh) |
TW (1) | TW202036306A (zh) |
WO (1) | WO2020186836A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113220480A (zh) * | 2021-04-29 | 2021-08-06 | 西安易联趣网络科技有限责任公司 | 分布式的数据任务跨云调度系统及方法 |
EP4236125A4 (en) * | 2020-11-26 | 2024-03-06 | Huawei Tech Co Ltd | METHOD FOR IMPLEMENTING COLLECTIVE COMMUNICATION, COMPUTER DEVICE AND COMMUNICATION SYSTEM |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112463334B (zh) * | 2020-12-04 | 2023-08-18 | 苏州浪潮智能科技有限公司 | 一种训练任务排队原因分析方法、系统、设备以及介质 |
CN112612615B (zh) * | 2020-12-28 | 2022-12-06 | 中孚安全技术有限公司 | 基于多线程内存分配和上下文调度的数据处理方法及系统 |
US20220269528A1 (en) * | 2021-02-24 | 2022-08-25 | Huawei Technologies Co., Ltd. | System, method and apparatus for intelligent heterogeneous computation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129390A (zh) * | 2011-03-10 | 2011-07-20 | 中国科学技术大学苏州研究院 | 片上多核计算平台的任务调度系统及进行任务并行化方法 |
CN102360309A (zh) * | 2011-09-29 | 2012-02-22 | 中国科学技术大学苏州研究院 | 片上多核异构系统的调度系统与调度执行方法 |
CN107766144A (zh) * | 2016-08-17 | 2018-03-06 | 中兴通讯股份有限公司 | 一种任务调度方法、装置及系统 |
WO2018198745A1 (ja) * | 2017-04-27 | 2018-11-01 | 日本電気株式会社 | 計算資源管理装置、計算資源管理方法、及びコンピュータ読み取り可能な記録媒体 |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1122639A3 (en) * | 1998-08-24 | 2002-02-13 | Advanced Micro Devices, Inc. | Mechanism for load block on store address generation and universal dependency vector/queue entry |
CN102354289B (zh) * | 2011-09-21 | 2012-10-10 | 苏州大学 | 一种并发事务的调度方法和相关装置 |
WO2013177765A1 (en) | 2012-05-30 | 2013-12-05 | Intel Corporation | Runtime dispatching among heterogeneous group of processors |
CA2953959C (en) * | 2014-06-30 | 2021-02-02 | Amazon Technologies, Inc. | Feature processing recipes for machine learning |
CN104156264B (zh) | 2014-08-01 | 2017-10-10 | 西北工业大学 | 一种基于多gpu的基带信号处理任务并行实时调度方法 |
WO2016041126A1 (zh) * | 2014-09-15 | 2016-03-24 | 华为技术有限公司 | 基于gpu的数据流处理方法和装置 |
CN104965761B (zh) | 2015-07-21 | 2018-11-02 | 华中科技大学 | 一种基于gpu/cpu混合架构的流程序多粒度划分与调度方法 |
US10437649B2 (en) * | 2016-03-11 | 2019-10-08 | Intel Corporation | Task mapping for heterogeneous platforms |
CN106227507B (zh) * | 2016-07-11 | 2019-10-18 | 北京深鉴智能科技有限公司 | 计算系统及其控制器 |
CN106648846A (zh) * | 2016-09-23 | 2017-05-10 | 郑州云海信息技术有限公司 | 一种改进的异构多核任务调度的方法 |
CN110298443B (zh) * | 2016-09-29 | 2021-09-17 | 中科寒武纪科技股份有限公司 | 神经网络运算装置及方法 |
CN108021563B (zh) * | 2016-10-31 | 2021-09-07 | 华为技术有限公司 | 一种指令间数据依赖的检测方法和装置 |
US10503671B2 (en) * | 2016-12-29 | 2019-12-10 | Oath Inc. | Controlling access to a shared resource |
WO2018223330A1 (en) * | 2017-06-08 | 2018-12-13 | Alibaba Group Holding Limited | Method and apparatus for distributed machine learning system |
-
2019
- 2019-03-15 CN CN201910200097.3A patent/CN111694675B/zh active Active
- 2019-12-11 SG SG11202010574PA patent/SG11202010574PA/en unknown
- 2019-12-11 WO PCT/CN2019/124494 patent/WO2020186836A1/zh active Application Filing
- 2019-12-11 JP JP2020561765A patent/JP7050957B2/ja active Active
- 2019-12-11 KR KR1020207030753A patent/KR20200136468A/ko active Search and Examination
- 2019-12-27 TW TW108148049A patent/TW202036306A/zh unknown
-
2020
- 2020-10-22 US US17/077,186 patent/US11347546B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129390A (zh) * | 2011-03-10 | 2011-07-20 | 中国科学技术大学苏州研究院 | 片上多核计算平台的任务调度系统及进行任务并行化方法 |
CN102360309A (zh) * | 2011-09-29 | 2012-02-22 | 中国科学技术大学苏州研究院 | 片上多核异构系统的调度系统与调度执行方法 |
CN107766144A (zh) * | 2016-08-17 | 2018-03-06 | 中兴通讯股份有限公司 | 一种任务调度方法、装置及系统 |
WO2018198745A1 (ja) * | 2017-04-27 | 2018-11-01 | 日本電気株式会社 | 計算資源管理装置、計算資源管理方法、及びコンピュータ読み取り可能な記録媒体 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP4236125A4 (en) * | 2020-11-26 | 2024-03-06 | Huawei Tech Co Ltd | METHOD FOR IMPLEMENTING COLLECTIVE COMMUNICATION, COMPUTER DEVICE AND COMMUNICATION SYSTEM |
CN113220480A (zh) * | 2021-04-29 | 2021-08-06 | 西安易联趣网络科技有限责任公司 | 分布式的数据任务跨云调度系统及方法 |
CN113220480B (zh) * | 2021-04-29 | 2023-03-10 | 西安易联趣网络科技有限责任公司 | 分布式的数据任务跨云调度系统及方法 |
Also Published As
Publication number | Publication date |
---|---|
TW202036306A (zh) | 2020-10-01 |
JP7050957B2 (ja) | 2022-04-08 |
CN111694675A (zh) | 2020-09-22 |
JP2021520578A (ja) | 2021-08-19 |
CN111694675B (zh) | 2022-03-08 |
KR20200136468A (ko) | 2020-12-07 |
SG11202010574PA (en) | 2020-11-27 |
US11347546B2 (en) | 2022-05-31 |
US20210042155A1 (en) | 2021-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020186836A1 (zh) | 任务调度 | |
US11886929B2 (en) | Deploying cloud-native services across control planes | |
TWI407373B (zh) | 用於管理多核心架構之資源的方法和設備 | |
Wang et al. | Optimizing load balancing and data-locality with data-aware scheduling | |
CN106933669B (zh) | 用于数据处理的装置和方法 | |
CN101248405B (zh) | 使用并发域的多线程化 | |
US9396028B2 (en) | Scheduling workloads and making provision decisions of computer resources in a computing environment | |
US7987467B2 (en) | Scale across in a grid computing environment | |
US10580190B2 (en) | Graph based heterogeneous parallel processing system | |
US8516487B2 (en) | Dynamic job relocation in a high performance computing system | |
WO2023082914A1 (zh) | 资源分配方法、装置、可读介质及电子设备 | |
WO2014110702A1 (zh) | 协同并发式消息总线、主动构件组装模型及构件拆分方法 | |
WO2013127132A1 (zh) | 用于分布式共享存储的任务同步方法、装置及系统 | |
Cannella et al. | Adaptivity support for MPSoCs based on process migration in polyhedral process networks | |
CN112256414A (zh) | 一种连接多种计算存储引擎的方法及系统 | |
CN113672240A (zh) | 一种基于容器的多机房批量自动化部署应用的方法及系统 | |
Liu et al. | Optimizing shuffle in wide-area data analytics | |
CN114637536A (zh) | 任务处理方法、计算协处理器、芯片及计算机设备 | |
CN115827250A (zh) | 一种数据存储方法、装置及设备 | |
US8977752B2 (en) | Event-based dynamic resource provisioning | |
WO2014110701A1 (zh) | 独立主动构件和可运行主动构件组装模型及构件拆分方法 | |
US11954534B2 (en) | Scheduling in a container orchestration system utilizing hardware topology hints | |
CN106681750A (zh) | 实体机器管理装置及实体机器管理方法 | |
US20200174847A1 (en) | Enabling rewire-aware mapreduce cluster in disaggregated systems | |
KR20240067674A (ko) | 데이터 로컬 정보를 고려하는 컨테이너 기반 동적 워크로드 처리 시스템, 장치 및 방법 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 20207030753 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2020561765 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19920379 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.02.2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19920379 Country of ref document: EP Kind code of ref document: A1 |