WO2017166777A1 - Task scheduling method and apparatus - Google Patents

Task scheduling method and apparatus

Info

Publication number
WO2017166777A1
WO2017166777A1 (PCT/CN2016/102055)
Authority
WO
WIPO (PCT)
Prior art keywords
task
queue
thread
threads
queues
Application number
PCT/CN2016/102055
Other languages
English (en)
French (fr)
Inventor
赵鹏
刘雷
曹玮
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to EP16896547.3A (EP3425502B1)
Publication of WO2017166777A1
Priority to US16/145,607 (US10891158B2)


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • The present invention relates to the field of computer technologies, and in particular, to a task scheduling method and apparatus.
  • Most processors on the market adopt a multi-core, multi-threaded architecture; the processors configured in many terminals have reached 4 or even 8 cores.
  • A streaming application is a common application type.
  • To make full use of a multi-core processor's parallel processing capability, the application data of a streaming application can be divided into multiple data blocks, and the multiple data blocks are processed in parallel by the multi-core processor.
  • During this parallel processing, when tasks are executed in parallel by at least two threads, the threads may access the same data block simultaneously, causing a data race.
  • To avoid such races, a "lock" is introduced into the parallel processing of the multi-core processor.
  • A thread must obtain a data block's lock before accessing the block, and must release the lock after the access completes.
  • While one thread accesses the data block, other threads therefore cannot access it at the same time, which avoids the data races caused by multiple threads simultaneously accessing one data block when multiple threads execute tasks in parallel.
  • Embodiments of the present invention provide a task scheduling method and apparatus, which can not only reduce the data races caused by multiple threads simultaneously accessing a data block, but also improve task scheduling efficiency.
  • A first aspect of the embodiments of the present invention provides a task scheduling method, including: adding each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks, where the M data blocks are in one-to-one correspondence with M task queues; and executing, by N threads in parallel, the tasks in N of the M task queues.
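  • As a concrete illustration, the following C++ sketch shows one way the first aspect's mapping could be realized; every type and function name here (`DataBlockId`, `Task`, `add_tasks`, and so on) is an assumption made for illustration, not the patent's API.

```cpp
// Minimal sketch, under assumed names: one task queue per data block,
// and each task appended to the queue of every block it will access.
#include <deque>
#include <functional>
#include <unordered_map>
#include <vector>

using DataBlockId = int;

struct Task {
    std::function<void()> body;        // the work the task performs
    std::vector<DataBlockId> blocks;   // data blocks this task accesses
};

// M data blocks in one-to-one correspondence with M task queues.
std::unordered_map<DataBlockId, std::deque<Task>> task_queues;

void add_tasks(const std::vector<Task>& tasks) {
    for (const Task& t : tasks)
        for (DataBlockId db : t.blocks)      // task -> block correspondence
            task_queues[db].push_back(t);    // add task to that block's queue
}
```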
  • The task scheduling method provided by the embodiments of the present invention is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue. In this way, even if multiple threads run in parallel, each thread executes, for a different data block, the tasks in that block's task queue; there is therefore no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • The number of threads created in a system is limited by system performance or system configuration. When no idle thread is available to execute the tasks in the task queue of a newly generated data block, that task queue must wait until a thread finishes the tasks in other data blocks' task queues before its own tasks are executed. For example, when N threads have been created in the system and the tasks in the task queues of M (2 ≤ N ≤ M) data blocks currently need to be executed, the N threads can execute in parallel the tasks in only N of the M task queues.
  • The method of the embodiments of the present invention may therefore further include: adding at least one of the M task queues to a waiting queue group, where each task queue in the at least one task queue contains at least one task, and the at least one task has not been executed by any of the N threads.
  • The waiting queue group stores task queues of the M task queues on a first-in-first-out basis.
  • At least one of the M task queues can thus be added to the waiting queue group, so that any of the N threads can, once idle, execute the tasks in the stored task queues according to the order in which the queues were placed in the waiting queue group.
  • Executing the tasks in N of the M task queues by N threads in parallel may specifically include: executing, by the N threads in parallel, the tasks in the first N task queues in the waiting queue group.
  • The first N task queues are the N task queues that were added to the waiting queue group earliest, and each of the N threads executes, on a first-in-first-out basis, the tasks in a corresponding one of those first N task queues.
  • Any thread in the system may become idle because it has executed all tasks in a data block's task queue, or because it has exited a task in a data block's task queue; such a thread is an idle thread. The idle thread can then execute the tasks in the first task queue added to the waiting queue group after the first N task queues.
  • The method of the embodiments of the present invention may further include: executing, by an idle thread, the tasks in a first queue in the waiting queue group, where the first queue is the first task queue added to the waiting queue group after the first N task queues.
  • The method may further include: deleting the executed first queue from the waiting queue group.
  • The manner in which any one of the N threads executes the tasks in a task queue may be that the thread executes the tasks in the task queue one by one.
  • The method by which the task scheduling apparatus uses one thread (a first thread) to execute the tasks in one task queue (a second queue) includes: reading, by the first thread, the k-th task in the second queue, and switching to the context of the k-th task in the second queue to start execution, where 1 ≤ k < K and K is the total number of tasks in the second queue; and if the first thread finishes executing the k-th task in the second queue, exiting the first thread from the k-th task, reading the (k+1)-th task in the second queue with the first thread, and switching to the context of the (k+1)-th task in the second queue to start execution, until all K tasks in the second queue have been executed.
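  • A minimal sketch of this per-queue execution loop, continuing the assumed types above; real switching between user-level task contexts is abstracted here into a plain call of the task body.

```cpp
// Sketch of one thread draining one task queue: read the k-th task,
// "switch to its context" (modeled as a direct call), and move on to
// the (k+1)-th task once it completes, until all K tasks are done.
void run_queue(std::deque<Task>& q) {
    while (!q.empty()) {
        Task t = std::move(q.front());
        q.pop_front();
        t.body();   // execute the k-th task to completion, then exit it
    }
}
```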
  • The thread may exit the k-th task because the k-th task in the second queue is waiting for a task execution result from another task queue (a third queue). In that case, the k-th task that exited execution can be added to the third queue, and after the k-th task obtains the task execution result, another thread (a second thread) executes that k-th task in the third queue.
  • Specifically, the foregoing multiple tasks include a first task belonging to the second queue; the first task waits, while being executed by the first thread, for a task execution result from the third queue, where the first thread is the one of the N threads used to execute the tasks in the second queue, the second queue is any task queue in the waiting queue group, and the third queue is a task queue in the waiting queue group different from the second queue.
  • The method of the embodiments of the present invention may further include: exiting the first thread from the first task that is waiting for the task execution result; adding the exited first task to the third queue; and, after the first task obtains the task execution result, executing the first task in the second queue by a second thread.
  • The second thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue.
  • The first task is the k-th task in the second queue.
  • Even if a thread exits a task because that task is waiting for the task execution result of another task queue, the task can be added to that other task queue. After the task obtains the task execution result, an idle thread can then be used to execute the task in that task queue.
  • The method of the embodiments of the present invention may further include: deleting from the waiting queue group the task queues whose tasks are being executed by threads of the N threads.
  • Task queues whose tasks have been executed are thus deleted from the waiting queue group in time, so that the waiting queue group does not contain any task queue whose tasks have already been executed by a thread.
  • A thread executes the tasks in one data block's task queue one by one within a given period of time; that is, within that period one thread keeps executing tasks for the same data block.
  • The data the thread processes before and after its task switches is therefore the same, which avoids the problem in traditional task-parallel systems of massive cache-line eviction and refill caused by processing different data before and after task switches, thereby improving memory access efficiency and program performance.
  • A second aspect of the embodiments of the present invention provides a task scheduling apparatus, including a task adding module and a task execution module.
  • The task adding module is configured to add each of the multiple tasks to the task queue of the data block corresponding to the task, according to a correspondence between multiple tasks to be executed and M data blocks to be accessed by the multiple tasks, where the M data blocks are in one-to-one correspondence with M task queues.
  • The task execution module is configured to execute, by N threads in parallel, the tasks that the task adding module added to N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M.
  • The task scheduling apparatus includes, but is not limited to, the task adding module and the task execution module of the second aspect, and the functions of those modules include, but are not limited to, the functions described above.
  • The task scheduling apparatus includes modules for performing the task scheduling method of the first aspect and its various optional manners; these modules are a logical division of the task scheduling apparatus made in order to perform that method.
  • A third aspect of the embodiments of the present invention provides a task scheduling apparatus, including one or more processors, a memory, a bus, and a communication interface.
  • The memory is configured to store computer-executable instructions, and the processor is connected to the memory through the bus.
  • When the task scheduling apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the task scheduling apparatus performs the task scheduling method described in the first aspect and its various optional manners.
  • A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing one or more program codes, where the program code includes computer-executable instructions; when a processor of a task scheduling apparatus executes the computer-executable instructions, the task scheduling apparatus performs the task scheduling method described in the first aspect and its various optional manners.
  • FIG. 1 is a schematic diagram of a DBOS scenario according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of a task scheduling method according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of an example of task queues of data blocks according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of another example of task queues of data blocks according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of an example of threads scheduling the task queues of data blocks according to an embodiment of the present invention;
  • FIG. 6 is a flowchart of another task scheduling method according to an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of another example of threads scheduling the task queues of data blocks according to an embodiment of the present invention;
  • FIG. 8 is a flowchart of another task scheduling method according to an embodiment of the present invention;
  • FIG. 9 is a flowchart of another task scheduling method according to an embodiment of the present invention;
  • FIG. 10 is a schematic structural diagram of a task scheduling apparatus according to an embodiment of the present invention;
  • FIG. 11 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention;
  • FIG. 12 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention;
  • FIG. 13 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention.
  • The terms "first" and "second" in the description and drawings of the present invention are used to distinguish different objects, not to describe a specific order of the objects.
  • For example, the first queue and the second queue are used to distinguish different task queues, not to describe a characteristic order of the task queues.
  • Unless otherwise stated, "a plurality of" or "multiple" means two or more.
  • For example, multiple processors or a multi-core processor means two or more processors.
  • The technical solutions of the embodiments of the present invention apply to the processing of "streaming applications" in the fields of media data processing, telecommunication data processing, and big data analysis, in scenarios where the parallel processing capability of a multi-core processor can be used to improve processing efficiency.
  • Media data processing includes image processing, audio processing, video processing, and the like.
  • FIG. 1 is a schematic diagram of a DBOS (Data Block Oriented Schedule) scenario in an embodiment of the present invention.
  • The software and hardware framework 10 of the DBOS scenario includes: a data block space definition interface, task-related interfaces, an application layer, and an operating system (OS) running on the multi-core processor.
  • The task-related interfaces receive the application layer's streaming-application service processing requests for M data blocks; according to the processing request corresponding to each data block, the operating system 12 creates at least one task for the data block and adds each created task to the task queue of the data block specified by the task.
  • The data block space definition interface receives user-written program code for allocating memory space for the M data blocks, and the operating system 12 allocates the memory space for the M data blocks. This memory space is used to hold the task queue of each of the M data blocks.
  • The data block space definition interface and the task-related interfaces may be provided by the foregoing operating system.
  • Through the N threads configured in the operating system, the multi-core processor can call the runtime library (Runtime Library) and, centered on the data blocks, execute in parallel the tasks in the task queues of N of the M data blocks.
  • Each of the N threads executes the tasks in one of the N task queues, and different threads of the N threads execute tasks in different task queues.
  • The M data blocks may be data stored on a hard disk or data stored in memory; of course, they may also be the to-be-processed data carried in the service processing requests of the streaming application.
  • The embodiments of the present invention impose no limitation on this.
  • The DBOS scenario shown in FIG. 1 is used here only as an example to illustrate the DBOS scenarios to which the embodiments of the present invention apply.
  • The applicable DBOS scenarios are not limited to the one shown in FIG. 1; therefore, the DBOS scenario shown in FIG. 1 does not limit the application scenarios of the technical solutions of the embodiments of the present invention.
  • The execution body of the task scheduling method provided by the embodiments of the present invention may be a computer device equipped with a multi-core processor (a multi-core device for short), or may be the component of the multi-core device that performs the task scheduling method, such as a task scheduling apparatus.
  • The task scheduling apparatus may be the central processing unit (CPU) of the multi-core device.
  • In the prior art, scheduling is task-centric: when threads execute different tasks in parallel and access data blocks, multiple threads may simultaneously compete for one data block or for the lock of one data block. In the embodiments of the present invention, scheduling is data-block-centric: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue. In this way, even if multiple threads run in parallel, each thread executes, for a different data block, the tasks in that block's task queue; there is therefore no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • The multi-core device for performing the task scheduling method in the embodiments of the present invention may be a multi-core computer device, such as a personal computer (PC) or a server, capable of processing "streaming applications" for images, audio, video, and the like.
  • A task scheduling method provided by an embodiment of the present invention is shown in FIG. 2 and includes:
  • S101. The task scheduling apparatus adds each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks.
  • The M data blocks are in one-to-one correspondence with M task queues.
  • As shown in FIG. 3, assume there are three data blocks to be accessed: DB0 corresponds to task queue 0, DB1 corresponds to task queue 1, and DB2 corresponds to task queue 2.
  • Among the current tasks, task a corresponds to DB0 and DB2; task b corresponds to DB0; task c corresponds to DB0; task d corresponds to DB0 and DB1; task e corresponds to DB1; task f corresponds to DB1; and task g corresponds to DB2.
  • As shown in FIG. 4, the task scheduling apparatus can therefore add task a, task b, task c, and task d to DB0's task queue (task queue 0); add task d, task e, and task f to DB1's task queue (task queue 1); and add task g and task a to DB2's task queue (task queue 2).
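  • Using the assumed `add_tasks` sketch above, this FIG. 3/FIG. 4 example could be written as follows, where `task_a` through `task_g` stand for illustrative work items:

```cpp
// Reproducing the FIG. 4 example, with DB0 = 0, DB1 = 1, DB2 = 2.
std::function<void()> task_a, task_b, task_c, task_d, task_e, task_f, task_g;

void build_example_queues() {
    add_tasks({
        {task_a, {0, 2}},   // task a corresponds to DB0 and DB2
        {task_b, {0}},      // task b corresponds to DB0
        {task_c, {0}},      // task c corresponds to DB0
        {task_d, {0, 1}},   // task d corresponds to DB0 and DB1
        {task_e, {1}},      // task e corresponds to DB1
        {task_f, {1}},      // task f corresponds to DB1
        {task_g, {2}},      // task g corresponds to DB2
    });
    // Result: queue 0 = {a, b, c, d}; queue 1 = {d, e, f}; queue 2 = {g, a}.
}
```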
  • Because the multiple tasks to be executed are created by the task scheduling apparatus for the M data blocks, the data block corresponding to each of the multiple tasks is specified when the tasks are created.
  • The task scheduling apparatus can create tasks for different data blocks as required; the tasks created for each data block differ, as does their number.
  • When the task scheduling apparatus creates a task for a data block, it specifies which data block the task is created for; it can therefore add each created task to the task queue of the data block specified for the task at creation time.
  • As shown in FIG. 4, the task scheduling apparatus creates four tasks for DB0 (task a, task b, task c, and task d), three tasks for DB1 (task d, task e, and task f), and two tasks for DB2 (task g and task a).
  • For the method by which the task scheduling apparatus creates tasks for data blocks, reference may be made to related prior-art methods; details are not repeated here.
  • S102. The task scheduling apparatus executes, by N threads in parallel, the tasks in N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M.
  • The task scheduling apparatus may use the N threads as N scheduler threads to schedule N task queues in parallel, each of the N threads executing the tasks in one of the N task queues.
  • Any one of the N threads may execute the tasks in a task queue by executing them one by one.
  • As shown in FIG. 5, the task scheduling apparatus can use Thread0, Thread1, and Thread2 to execute the tasks in task queue 0, task queue 1, and task queue 2 in parallel.
  • Specifically, the task scheduling apparatus can simultaneously: use Thread0 to execute task a, task b, task c, and task d in task queue 0; use Thread1 to execute task d, task e, and task f in task queue 1; and use Thread2 to execute task g and task a in task queue 2.
  • Take the parallel execution of task queue 0, task queue 1, and task queue 2 by Thread0, Thread1, and Thread2 shown in FIG. 5 as an example.
  • While the three threads execute the tasks in the three task queues in parallel, task a, task b, task c, and task d are executed for DB0, task d, task e, and task f for DB1, and task g and task a for DB2; for each data block, however, only one thread executes the tasks in that data block's task queue.
  • That is, Thread0 executes task a, task b, task c, and task d for DB0; Thread1 executes task d, task e, and task f for DB1; and Thread2 executes task g and task a for DB2. A sketch of this three-thread arrangement follows.
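  • The sketch below, continuing the assumed helpers above, shows the FIG. 5 arrangement with three OS threads; because each thread owns exactly one block's queue, the data blocks themselves need no lock.

```cpp
#include <functional>
#include <thread>

// Thread0 -> DB0's queue, Thread1 -> DB1's queue, Thread2 -> DB2's
// queue: each data block's queue is drained by exactly one thread.
void run_fig5_example() {
    std::thread thread0(run_queue, std::ref(task_queues.at(0)));
    std::thread thread1(run_queue, std::ref(task_queues.at(1)));
    std::thread thread2(run_queue, std::ref(task_queues.at(2)));
    thread0.join();
    thread1.join();
    thread2.join();
}
```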
  • The N threads in the embodiments of the present invention may be created by the task scheduling apparatus according to the system configuration.
  • To ensure full use of system hardware resources, it is usual to create as many threads as there are processor cores (or hardware threads); that is, N may be the number of processor cores of the task scheduling apparatus.
  • The threads in the embodiments of the present invention are operating system threads (OS Threads).
  • For example, if the task scheduling apparatus has 3 processor cores, it can create 3 threads, Thread0, Thread1, and Thread2, with one thread running on each processor core.
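  • In C++, the core count that determines N can be queried portably; a minimal sketch:

```cpp
#include <thread>

// One scheduler thread per processor core, as suggested above.
unsigned pick_thread_count() {
    unsigned n = std::thread::hardware_concurrency();  // e.g. 3 on a 3-core device
    return n != 0 ? n : 1;  // the call may return 0 when the count is unknown
}
```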
  • The task scheduling method provided by the embodiments of the present invention is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue; even if multiple threads run in parallel, each thread executes the tasks of a different data block's task queue, so there is no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • A thread executes the tasks in one data block's task queue one by one within a given period of time; that is, within that period one thread executes, for the same data block, the tasks corresponding to that data block.
  • The data the thread processes before and after its many task switches within that period is therefore the same. This avoids the problem in traditional task-parallel systems of massive cache-line eviction and refill caused by processing different data before and after task switches, thereby improving memory access efficiency and program performance.
  • The number of threads created in a system is limited by system performance or system configuration, whereas the number of data blocks is generally not so constrained; therefore, when the number of data blocks exceeds the number of threads, the task queues of some data blocks must wait until one of the N threads becomes idle, whereupon the idle thread executes the tasks in those task queues.
  • For example, when N threads have been created in the system and the tasks in the task queues of M (2 ≤ N ≤ M) data blocks currently need to be executed, the N threads can execute in parallel the tasks in only N of the M task queues; the remaining M-N task queues must wait for an idle thread among the N threads to execute their tasks.
  • As shown in FIG. 6, the method of the embodiment of the present invention may further include S101':
  • S101'. The task scheduling apparatus adds at least one of the M task queues to a waiting queue group.
  • Each task queue in the at least one task queue contains at least one task, and the at least one task has not been executed by any of the N threads.
  • The waiting queue group stores task queues of the M task queues on a first-in-first-out basis; a sketch of such a group follows.
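  • A minimal sketch of the waiting queue group as a FIFO of pending task queues. The group itself is shared by the N scheduler threads, so this sketch guards it with a mutex; note the mutex protects only the scheduler's own bookkeeping, not the data blocks, which the patent's scheme keeps contention-free.

```cpp
#include <deque>
#include <mutex>

// FIFO of task queues whose tasks have not yet been executed.
struct WaitQueueGroup {
    std::deque<std::deque<Task>*> fifo;  // oldest queue at the front
    std::mutex m;

    // S101': add a pending task queue to the group.
    void add(std::deque<Task>* q) {
        std::lock_guard<std::mutex> guard(m);
        fifo.push_back(q);
    }

    // An idle thread takes the oldest pending queue (first in, first
    // out); taking it also removes it from the group (cf. S103/S105).
    std::deque<Task>* take_front() {
        std::lock_guard<std::mutex> guard(m);
        if (fifo.empty()) return nullptr;
        std::deque<Task>* q = fifo.front();
        fifo.pop_front();
        return q;
    }
};
```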
  • When no idle thread is available to execute the task queue of a newly generated data block, the task scheduling apparatus may add that task queue to the waiting queue group.
  • As shown in FIG. 7, the task scheduling apparatus can add the task queue of data block DB3 (task queue 3) and the task queue of DB4 (task queue 4) to the wait queue (Wait Queue, Wait Q) group.
  • The Wait Q group stores, on a first-in-first-out basis, task queues whose tasks have been added but not yet executed.
  • Because DB3's task queue was added to the Wait Q group before DB4's, when any one of the N threads becomes idle, the task scheduling apparatus preferentially uses that idle thread to schedule DB3's task queue and execute the tasks in it.
  • After at least one of the M task queues is added to the waiting queue group, any of the N threads can, once idle, execute the tasks in the stored task queues according to the order in which the queues were placed in the waiting queue group.
  • S102 in FIG. 2 may be replaced with S102a:
  • S102a. The task scheduling apparatus executes, by the N threads in parallel, the tasks in the first N task queues in the waiting queue group.
  • The first N task queues are the N task queues that were added to the waiting queue group earliest, and each of the N threads executes, on a first-in-first-out basis, the tasks in a corresponding one of those first N task queues.
  • Once a thread begins executing a task queue in the waiting queue group, that task queue is no longer waiting to be executed; the waiting queue group should include only task queues whose tasks have not been executed by a thread.
  • The method of the embodiment of the present invention may therefore further include S103:
  • S103. The task scheduling apparatus deletes from the waiting queue group the task queues whose tasks are being executed by threads of the N threads.
  • By deleting these task queues in time, the waiting queue group will not contain any task queue whose tasks have already been executed by a thread.
  • Any thread in the system may become idle because it has executed all tasks in a data block's task queue, or because it has exited a task in a data block's task queue; such a thread is an idle thread.
  • The idle thread can then execute the tasks in the first task queue added to the waiting queue group after the first N task queues.
  • The method of the embodiment of the present invention may further include S104:
  • S104. The task scheduling apparatus uses an idle thread to execute the tasks in a first queue in the waiting queue group, where the first queue is the first task queue added to the waiting queue group after the first N task queues.
  • The idle thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue.
  • The method of the embodiment of the present invention may further include S105:
  • S105. The task scheduling apparatus deletes the executed first queue from the waiting queue group.
  • Task queues whose tasks have been executed are thus deleted from the waiting queue group in time, so that the waiting queue group includes only task queues whose tasks have not been executed by a thread. A sketch of this idle-thread loop follows.
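  • Combining the two sketches above, an idle scheduler thread's behavior under S104/S105 might look like this:

```cpp
// Each of the N scheduler threads runs this loop: take the oldest
// pending task queue (removing it from the group, per S105) and drain
// it; when no pending queue remains, the thread would suspend.
void scheduler_loop(WaitQueueGroup& wait_q) {
    while (std::deque<Task>* q = wait_q.take_front())
        run_queue(*q);  // S104: the idle thread executes the first queue
}
```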
  • Before S101, the method of the embodiment of the present invention further includes: the task scheduling apparatus creates a task queue for each of the M data blocks.
  • Each task queue the task scheduling apparatus creates for a data block is used to hold the tasks corresponding to that data block.
  • The task queue initially created for each data block is empty.
  • As shown in FIG. 3, centered on the data blocks, the task scheduling apparatus can create task queue 0 for DB0 to hold DB0's tasks, task queue 1 for DB1 to hold DB1's tasks, and task queue 2 for DB2 to hold DB2's tasks.
  • The task scheduling apparatus may allocate memory space for the M data blocks to hold the task queues of the M data blocks.
  • The method of the embodiment of the present invention may further include: the task scheduling apparatus allocates memory space for the M data blocks, where the memory space is used to hold the task queue of each of the M data blocks.
  • The task scheduling apparatus may allocate the memory space according to the data block type, the size of the data blocks, and the number of data blocks; a minimal sketch follows.
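  • A minimal sketch of that allocation, assuming the queue container used in the earlier sketches; block type and size would in practice tune per-queue capacity.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Reserve one (initially empty) task queue per data block up front.
std::vector<std::deque<Task>> queue_storage;

void allocate_queue_space(std::size_t m) {
    queue_storage.resize(m);  // one task queue for each of the M data blocks
}
```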
  • The method by which the task scheduling apparatus uses one thread (such as a first thread) to execute the tasks in one task queue (such as a second queue) may specifically include steps Sa and Sb:
  • Sa. The task scheduling apparatus uses the first thread to read the k-th task in the second queue and switches to the context of the k-th task in the second queue to start execution, where 1 ≤ k < K and K is the total number of tasks in the second queue.
  • Sb. If the first thread finishes executing the k-th task in the second queue, the first thread exits the k-th task, reads the (k+1)-th task in the second queue, and switches to the context of the (k+1)-th task to start execution, until all K tasks in the second queue have been executed.
  • The first thread may exit the k-th task because the k-th task in the second queue (i.e., the first task) is waiting for a task execution result from the third queue.
  • In this case, the method by which the task scheduling apparatus uses the first thread to execute the tasks in the second queue may further include step Sc:
  • Sc. The task scheduling apparatus exits the first thread from the first task (the k-th task in the second queue) that is waiting for the third queue's task execution result; adds the exited first task to the third queue; and, after the first task obtains the task execution result, executes the first task by a second thread.
  • The second queue is any task queue in the waiting queue group, and the third queue is a task queue in the waiting queue group different from the second queue.
  • After the first thread exits the first task that is waiting for the task execution result, the first thread no longer executes the other tasks in the task queue (the second queue) where the first task is located; at this point the first thread becomes an idle thread and can be used to execute the tasks in other task queues in the waiting queue group. A sketch of this exit-and-requeue step follows.
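  • A sketch of step Sc, under an assumed convention that a task reports whether it completed or must wait; neither the `step` signature nor the return convention comes from the patent.

```cpp
// Returns true if the task ran to completion, false if it must wait
// for a result produced via another queue (assumed convention).
bool step(Task& t);

// First thread draining the second queue: a task that must wait is
// exited and re-added to the third queue, to be run by a second
// (idle) thread once its awaited result is available.
void run_queue_with_migration(std::deque<Task>& second_q,
                              std::deque<Task>& third_q) {
    while (!second_q.empty()) {
        Task t = std::move(second_q.front());
        second_q.pop_front();
        if (!step(t))
            third_q.push_back(std::move(t));  // Sc: re-queue the waiting task
    }
    // The first thread is now idle and may take another queue from the group.
}
```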
  • The idle thread in the foregoing embodiment may be a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue.
  • An idle thread may also execute the first task from the second queue; that is, the second thread is whichever thread is idle when the first task obtains the task execution result.
  • In other words, the second thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue.
  • The task scheduling apparatus can implement "executing the tasks in one task queue by one thread a" through the following algorithm program, whose lines are referenced by number in the walkthrough below.
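  • The algorithm listing itself did not survive in this text; the following C++-style reconstruction is inferred from the line-by-line walkthrough below, with the cited line numbers marked in trailing comments. The numbering is approximate, and every helper name is an assumption.

```cpp
// Assumed runtime primitives, for illustration only.
struct TaskQueue;
enum class RunResult { kCompleted, kWaiting };
TaskQueue* waitq_front();             // oldest queue in Wait Q, or nullptr
void       waitq_remove(TaskQueue*);
Task*      first_task(TaskQueue*);
Task*      next_task(TaskQueue*);
RunResult  switch_to(Task*);          // switch to the task's context and run
void       add_to(TaskQueue* t, Task*);
void       suspend_current_thread();

void scheduler_thread_a(TaskQueue* queue_t) {        //  1
  for (;;) {                                         //  2
    TaskQueue* q;                                    //  3  lines 3-18: loop while
    while ((q = waitq_front()) != nullptr) {         //  4  Wait Q is not empty;
      waitq_remove(q);                               //  5  take task queue 1 and
      Task* task = first_task(q);                    //  6  read its first task
      while (task != nullptr) {                      //  7
        RunResult r = switch_to(task);               //  8  run in the task's context
        if (r == RunResult::kCompleted) {            //  9  task finished normally
          task = next_task(q);                       // 10  read the next task;
          continue;                                  // 11  enter it at line 8
        } else {                                     // 12  task waits on queue t:
          /* thread a exits the waiting task */      // 13
          add_to(queue_t, task);                     // 14  re-add it to queue t
          break;                                     // 15  take next queue in Wait Q
        }                                            // 16
      }                                              // 17
    }                                                // 18
    suspend_current_thread();                        // 19  Wait Q empty: suspend
  }
}
```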
  • Lines 3-18 of the algorithm program cyclically determine whether the Wait Q (that is, the waiting queue group in the embodiment of the present invention) is empty.
  • If the Wait Q is empty, there is no task queue to be processed in the current system, and the thread (that is, the scheduler thread) is suspended (see line 19).
  • Thread a is used to read the first task in task queue 1 in the Wait Q (see lines 4-6) and to switch to the context of the first task of task queue 1 to start execution (see line 8); task queue 1 is the task queue currently included in the Wait Q that was added to the Wait Q first.
  • If thread a exits the first task in task queue 1 because thread a has finished executing it (see line 9), thread a is used to read the second task in task queue 1 and to switch to the context of that second task to start execution (see lines 10-11).
  • If thread a exits the first task in task queue 1 because that task waits for the task execution result of task queue t (see line 13), the first task in task queue 1 is added to task queue t (see line 14), and an idle thread later executes that first task.
  • Task queue t is a task queue in the Wait Q different from task queue 1.
  • Because the first task in task queue 1 waits for the task execution result of task queue t, after thread a exits the first task, thread a becomes an idle thread and can be used to execute the tasks in the next task queue in the Wait Q (see line 15).
  • The task scheduling method provided by the embodiment of the present invention is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue; even if multiple threads run in parallel, each thread executes the tasks of a different data block's task queue, so there is no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • In addition, when no idle thread is available to execute the tasks in a task queue, the task queue can be added to the waiting queue group, and task queues whose tasks have been executed can be deleted from the waiting queue group in time, so that the waiting queue group does not include any task queue whose tasks have already been executed by a thread.
  • An embodiment of the present invention provides a task scheduling apparatus.
  • The task scheduling apparatus includes a task adding module 21 and a task execution module 22.
  • The task adding module 21 is configured to add each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks.
  • The task execution module 22 is configured to execute, by N threads in parallel, the tasks added by the task adding module 21 to N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M.
  • The task scheduling apparatus may further include a queue adding module 23.
  • The queue adding module 23 is configured to add at least one of the M task queues to a waiting queue group, where each task queue in the at least one task queue contains at least one task, and the at least one task has not been executed by any of the N threads.
  • The waiting queue group stores task queues of the M task queues on a first-in-first-out basis.
  • The task execution module 22 is specifically configured to execute, by the N threads in parallel, the tasks in the first N task queues in the waiting queue group.
  • The first N task queues are the N task queues that were added to the waiting queue group earliest, and each of the N threads executes, on a first-in-first-out basis, the tasks in a corresponding one of those first N task queues.
  • The task execution module 22 is further configured to execute, by an idle thread, the tasks in a first queue in the waiting queue group, where the first queue is the first task queue added to the waiting queue group after the first N task queues.
  • The task scheduling apparatus may further include a queue deletion module 24.
  • The queue deletion module 24 is configured to delete the first queue executed by the task execution module 22 from the waiting queue group.
  • The idle thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue.
  • The foregoing multiple tasks include a first task belonging to a second queue; the first task waits, while being executed by a first thread, for a task execution result from a third queue, where the first thread is the one of the N threads used to execute the tasks in the second queue, the second queue is any task queue in the waiting queue group, and the third queue is a task queue in the waiting queue group different from the second queue.
  • The task scheduling apparatus may further include a task control module.
  • The task control module is configured to exit the first thread from the first task that is waiting for the task execution result.
  • The task adding module 21 is further configured to add the exited first task to the third queue.
  • The task execution module 22 is further configured to execute, by a second thread, the first task in the second queue after the first task obtains the task execution result.
  • The second thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue.
  • The queue deletion module 24 is further configured to delete from the waiting queue group the task queues whose tasks are being executed by threads of the N threads.
  • The task scheduling apparatus is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue; even if multiple threads run in parallel, each thread executes the tasks of a different data block's task queue, so there is no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • Moreover, a thread executes the tasks in one data block's task queue one by one within a given period of time, i.e., within that period one thread executes, for the same data block, the tasks corresponding to that data block, so the data the thread processes before and after its many task switches is the same. This avoids the problem in traditional task-parallel systems of massive cache-line eviction and refill caused by processing different data before and after task switches, thereby improving memory access efficiency and program performance.
  • An embodiment of the present invention provides a task scheduling apparatus. As shown in FIG. 13, the task scheduling apparatus includes:
  • one or more processors 31, a memory 32, a bus system 33, and one or more applications, where the one or more processors 31 and the memory 32 are connected by the bus system 33, and the one or more applications are stored in the memory 32 and include instructions.
  • The processor 31 is configured to execute the instructions, and specifically to take the place of the task adding module 21, the task execution module 22, the queue adding module 23, the queue deletion module 24, and so on, in performing the task scheduling method shown in any of FIG. 2, FIG. 6, FIG. 8, and FIG. 9. That is, the processor 31 can integrate the foregoing functional units or functional modules into one processor 31.
  • The processor 31 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 13, but this does not mean there is only one bus or one type of bus.
  • An embodiment of the present invention further provides a computer-readable storage medium storing one or more program codes, where the one or more program codes include instructions; when a processor of the task scheduling apparatus executes the instructions, the task scheduling apparatus performs the task scheduling method shown in any of FIG. 2, FIG. 6, FIG. 8, and FIG. 9.
  • The computer-readable storage medium may include a high-speed RAM memory, and may further include a non-volatile memory, such as at least one magnetic disk storage.
  • The program code may serve as a component of the embedded operating system running on the task scheduling apparatus, or as a component of the various applications running on the task scheduling apparatus; comparatively, when the task scheduling method provided in the foregoing embodiments serves as a component of the embedded operating system, applications may need no modification, so the implementation difficulty and modification workload are small.
  • Each step in the method flow shown in any of FIG. 2, FIG. 6, FIG. 8, and FIG. 9 may be performed by the task scheduling apparatus, in hardware form, executing the computer-executable instructions stored in the foregoing non-volatile storage medium.
  • The task scheduling apparatus is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue; even if multiple threads run in parallel, each thread executes the tasks of a different data block's task queue, so there is no problem of multiple threads simultaneously competing for one data block or for the lock of one data block.
  • Moreover, a thread executes the tasks in one data block's task queue one by one within a given period of time, so the data it processes before and after its task switches is the same, which avoids the massive cache-line eviction and refill of traditional task-parallel systems and improves memory access efficiency and program performance.
  • The disclosed system, apparatus, and method may be implemented in other manners.
  • The device embodiments described above are merely illustrative.
  • The division into modules or units is only a division by logical function; in actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • The mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • The functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • The software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Abstract

A task scheduling method and apparatus, relating to the field of computer technologies, which can not only avoid the data races caused by multiple threads simultaneously accessing one data block when executing tasks in parallel, but also avoid the extra performance overhead introduced by locks and reduce the difficulty of detecting and debugging concurrency errors. The specific scheme is: according to a correspondence between multiple tasks to be executed and M data blocks to be accessed by the multiple tasks, adding each of the multiple tasks to the task queue of the data block corresponding to the task (S101); and executing, by N threads in parallel, the tasks in N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M (S102). The method and apparatus are used in the task scheduling process of a multi-core system.

Description

Task scheduling method and apparatus
This application claims priority to Chinese Patent Application No. 201610188139.2, filed with the Chinese Patent Office on March 29, 2016 and entitled "Task scheduling method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a task scheduling method and apparatus.
Background
With the development of processor technology, multi-core processors are used ever more widely. Most processors on the market adopt a multi-core, multi-threaded architecture, and the processors configured in many terminals have reached 4 or even 8 cores.
A streaming application is a common application type. When a multi-core processor processes the application data of a streaming application, to make full use of its parallel processing capability, the application data can be divided into multiple data blocks that the multi-core processor processes in parallel. During this parallel processing, when tasks are executed in parallel by at least two threads, the two threads may access the same data block simultaneously, causing a data race.
In the prior art, to avoid such data races, a "lock" is introduced into the parallel processing of the multi-core processor. Before accessing a data block, a thread must first acquire the lock of that data block, and after the access completes it must release the lock. Thus, while one thread is accessing the data block, other threads cannot access it at the same time, which avoids the data races caused by multiple threads simultaneously accessing one data block when tasks are executed in parallel.
The problem, however, is that although introducing locks into the parallel processing of the multi-core processor prevents multiple threads from accessing one data block simultaneously, multiple threads may still compete for the lock of a data block at the same time while executing tasks in parallel, and lock contention brings extra performance overhead.
Summary
Embodiments of the present invention provide a task scheduling method and apparatus, which can reduce the data races caused by multiple threads simultaneously accessing one data block when executing tasks in parallel, thereby improving task scheduling efficiency.
To achieve the foregoing objective, the embodiments of the present invention adopt the following technical solutions:
A first aspect of the embodiments of the present invention provides a task scheduling method, including: adding each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks, where the M data blocks are in one-to-one correspondence with M task queues; and executing, by N threads in parallel, the tasks in N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M.
The task scheduling method provided in the embodiments of the present invention is centered on data blocks: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue. Thus, even when multiple threads run in parallel, each thread executes, for a different data block, the tasks in that block's task queue; therefore, multiple threads never compete for one data block, or for the lock of one data block, at the same time.
With this solution, the data races caused by multiple threads simultaneously accessing one data block when executing tasks in parallel can be reduced, thereby improving task scheduling efficiency. Moreover, because no lock mechanism needs to be introduced, the extra performance overhead of locks is avoided, and concurrency errors become easier to detect and debug.
Conceivably, the number of threads created in a system is limited by system performance or system configuration; when no idle thread is available to execute the tasks in the task queue of a newly generated data block, that task queue must wait until a thread finishes the tasks in other data blocks' task queues before its own tasks are executed. For example, when N threads have been created in the system and the tasks in the task queues of M (2 ≤ N ≤ M) data blocks currently need to be executed, the N threads can execute, in parallel, the tasks in only N of the M task queues.
Accordingly, the method of the embodiments of the present invention may further include: adding at least one of the M task queues to a waiting queue group, where each task queue in the at least one task queue contains at least one task, and the at least one task has not been executed by any of the N threads. The waiting queue group stores task queues of the M task queues on a first-in-first-out basis.
With this solution, at least one of the M task queues can be added to the waiting queue group, so that once any of the N threads becomes idle it can execute the tasks in the stored task queues according to the order in which the queues were placed in the waiting queue group.
Optionally, in the embodiments of the present invention, executing the tasks in N of the M task queues by N threads in parallel may specifically include: executing, by the N threads in parallel, the tasks in the first N task queues in the waiting queue group.
The first N task queues are the N task queues that were added to the waiting queue group earliest, and each of the N threads executes, on a first-in-first-out basis, the tasks in a corresponding one of those first N task queues.
Further, any thread in the system may become idle because it has executed all tasks in a data block's task queue, or because it has exited a task in a data block's task queue; such a thread is an idle thread. The idle thread can then execute the tasks in the first task queue added to the waiting queue group after the first N task queues.
Specifically, the method of the embodiments of the present invention may further include: executing, by an idle thread, the tasks in a first queue in the waiting queue group, where the first queue is the first task queue added to the waiting queue group after the first N task queues.
Moreover, to update the waiting queue group in real time, the method of the embodiments of the present invention may further include: deleting the executed first queue from the waiting queue group.
Exemplarily, in the embodiments of the present invention, the manner in which any one of the N threads executes the tasks in a task queue may be that the thread executes the tasks in the task queue one by one.
Specifically, the method by which the task scheduling apparatus uses one thread (a first thread) to execute the tasks in one task queue (a second queue) includes: reading, by the first thread, the k-th task in the second queue and switching to the context of the k-th task in the second queue to start execution, where 1 ≤ k < K and K is the total number of tasks in the second queue; and if the first thread finishes executing the k-th task in the second queue, exiting the first thread from the k-th task, reading the (k+1)-th task in the second queue with the first thread, and switching to the context of the (k+1)-th task in the second queue to start execution, until all K tasks in the second queue have been executed.
Conceivably, the thread may exit the k-th task because the k-th task in the second queue is waiting for a task execution result from another task queue (a third queue). In that case, the k-th task that exited execution can be added to the third queue, and after the k-th task obtains the task execution result, another thread (a second thread) executes that k-th task in the third queue.
Specifically, the multiple tasks include a first task belonging to the second queue; the first task waits, while being executed by the first thread, for a task execution result from the third queue, where the first thread is the one of the N threads used to execute the tasks in the second queue, the second queue is any task queue in the waiting queue group, and the third queue is a task queue in the waiting queue group different from the second queue.
The method of the embodiments of the present invention may further include: exiting the first thread from the first task that is waiting for the task execution result; adding the exited first task to the third queue; and, after the first task obtains the task execution result, executing the first task in the second queue by a second thread.
The second thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited a task in its corresponding task queue. The first task is the k-th task in the second queue.
With the foregoing solution, even if a thread exits a task because that task, in the task queue the thread is executing, is waiting for the task execution result of another task queue, the task can be added to that other task queue. Then, after the task obtains the task execution result of the other task queue, an idle thread can be used to execute the task in that task queue.
Preferably, once a thread begins executing a task queue in the waiting queue group, that task queue is no longer waiting to be executed. To update the waiting queue group in time so that it contains only task queues whose tasks have not been executed by a thread, the method of the embodiments of the present invention may further include: deleting from the waiting queue group the task queues whose tasks are being executed by threads of the N threads.
With the foregoing solution, the task queues whose tasks have been executed are deleted from the waiting queue group in time, so that the waiting queue group does not contain any task queue whose tasks have already been executed by a thread.
Further, in the embodiments of the present invention, a thread executes the tasks in one data block's task queue one by one within a given period of time; that is, within that period one thread executes, for the same data block, the tasks corresponding to that data block, so the data the thread processes before and after its many task switches within that period is the same. This avoids the problem in traditional task-parallel systems of massive cache-line eviction and refill caused by processing different data before and after task switches, thereby improving memory access efficiency and program performance.
A second aspect of the embodiments of the present invention provides a task scheduling apparatus, including a task adding module and a task execution module.
The task adding module is configured to add each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks, where the M data blocks are in one-to-one correspondence with M task queues.
The task execution module is configured to execute, by N threads in parallel, the tasks added by the task adding module to N of the M task queues, where each of the N threads executes the tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2 ≤ N ≤ M.
It should be noted that the task scheduling apparatus provided in the embodiments of the present invention includes, but is not limited to, the task adding module and the task execution module of the second aspect, and the functions of those modules include, but are not limited to, the functions described above. The task scheduling apparatus includes modules for performing the task scheduling method of the first aspect and its various optional manners; these modules are a logical division of the task scheduling apparatus made in order to perform that method.
A third aspect of the embodiments of the present invention provides a task scheduling apparatus, including one or more processors, a memory, a bus, and a communication interface.
The memory is configured to store computer-executable instructions, and the processor is connected to the memory through the bus. When the task scheduling apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the task scheduling apparatus performs the task scheduling method described in the first aspect and its various optional manners.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing one or more program codes, where the program code includes computer-executable instructions; when a processor of a task scheduling apparatus executes the computer-executable instructions, the task scheduling apparatus performs the task scheduling method described in the first aspect and its various optional manners.
It should be noted that, for the specific technical effects of the task scheduling apparatus and of its execution of the program stored in the computer-readable storage medium, and for the related analysis, reference may be made to the description of technical effects in the first aspect or any implementation of the first aspect; details are not repeated here.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings required in the description of the embodiments or the prior art. Apparently, the accompanying drawings below show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a DBOS scenario according to an embodiment of the present invention;
FIG. 2 is a flowchart of a task scheduling method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an example of task queues of data blocks according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another example of task queues of data blocks according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an example of threads scheduling the task queues of data blocks according to an embodiment of the present invention;
FIG. 6 is a flowchart of another task scheduling method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another example of threads scheduling the task queues of data blocks according to an embodiment of the present invention;
FIG. 8 is a flowchart of another task scheduling method according to an embodiment of the present invention;
FIG. 9 is a flowchart of another task scheduling method according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a task scheduling apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of another task scheduling apparatus according to an embodiment of the present invention.
Detailed Description
The terms "first" and "second" in the specification and drawings of the present invention are used to distinguish different objects, not to describe a specific order of the objects. For example, the first queue and the second queue are used to distinguish different task queues, not to describe a characteristic order of the task queues.
In the description of the present invention, unless otherwise stated, "multiple" means two or more. For example, multiple processors or a multi-core processor means two or more processors.
In addition, the terms "include" and "have" and any variants thereof mentioned in the description of the present invention are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.
The technical solutions in the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention.
The technical solutions of the embodiments of the present invention apply to the processing of "streaming applications" in the fields of media data processing, telecommunication data processing, and big data analysis, in scenarios where the parallel processing capability of a multi-core processor can be used to improve processing efficiency. Media data processing includes image processing, audio processing, video processing, and the like.
The solutions may specifically be applied to Data Block Oriented Schedule (DBOS) scenarios. FIG. 1 is a schematic diagram of a DBOS scenario in an embodiment of the present invention. The software and hardware framework 10 of the DBOS scenario includes: a data block space definition interface, task-related interfaces, an application layer, and an operating system (OS) running on a multi-core processor.
Taking the processing of a streaming application as an example, the DBOS scenario shown in FIG. 1 is described as follows:
The task-related interfaces receive the application layer's streaming-application service processing requests for the M data blocks; according to the processing request corresponding to each data block, the operating system 12 creates at least one task for the data block and adds each created task to the task queue of the data block specified by the task.
The data block space definition interface receives user-written program code for allocating memory space for the M data blocks, and the operating system 12 allocates the memory space for the M data blocks. This memory space is used to hold the task queue of each of the M data blocks. The data block space definition interface and the task-related interfaces may be provided by the foregoing operating system.
Through the N threads configured in the operating system, the multi-core processor can call the runtime library (Runtime Library) and, centered on the data blocks, execute in parallel the tasks in the task queues of N of the M data blocks. Each of the N threads executes the tasks in one of the N task queues, and different threads of the N threads execute tasks in different task queues.
Conceivably, the M data blocks may be data stored on a hard disk or data stored in memory; of course, they may also be the to-be-processed data carried in the service processing requests of the streaming application. The embodiments of the present invention impose no limitation on this.
It should be noted that the DBOS scenario shown in FIG. 1 is used here only as an example to illustrate the DBOS scenarios to which the embodiments of the present invention apply. The applicable DBOS scenarios are not limited to the one shown in FIG. 1, which therefore does not limit the application scenarios of the technical solutions of the embodiments of the present invention.
The execution body of the task scheduling method provided in the embodiments of the present invention may be a computer device equipped with a multi-core processor (a multi-core device for short), or may be the component of the multi-core device that performs the task scheduling method, such as a task scheduling apparatus. The task scheduling apparatus may be the central processing unit (CPU) of the multi-core device.
Compared with the prior art, which is task-centric, where threads executing different tasks in parallel access data blocks and multiple threads may simultaneously compete for one data block or for the lock of one data block, the embodiments of the present invention are data-block-centric: each of the multiple tasks is added to the task queue of the data block corresponding to the task, and each of the N threads then executes the tasks in one task queue. Thus, even when multiple threads run in parallel, each thread executes the tasks in the task queue of a different data block, so multiple threads never compete for one data block, or for the lock of one data block, at the same time.
The multi-core device for performing the task scheduling method in the embodiments of the present invention may be a multi-core computer device, such as a personal computer (PC) or a server, capable of processing "streaming applications" for images, audio, video, and the like.
With reference to the accompanying drawings, the task scheduling method provided by the embodiments of the present invention is described in detail below through specific embodiments and application scenarios, taking the task scheduling apparatus as the execution body of the method.
Embodiment 1
An embodiment of the present invention provides a task scheduling method. As shown in FIG. 2, the task scheduling method includes:
S101. The task scheduling apparatus adds each of multiple tasks to be executed to the task queue of the data block corresponding to the task, according to a correspondence between the multiple tasks and M data blocks to be accessed by the multiple tasks.
The M data blocks are in one-to-one correspondence with M task queues.
Exemplarily, as shown in FIG. 3, assume there are currently three data blocks to be accessed (DB0, DB1, and DB2). Among them, DB0 corresponds to task queue 0, DB1 corresponds to task queue 1, and DB2 corresponds to task queue 2.
Assume that among the current tasks (task a, task b, task c, task d, task e, task f, and task g), task a corresponds to DB0 and DB2, task b corresponds to DB0, task c corresponds to DB0, task d corresponds to DB0 and DB1, task e corresponds to DB1, task f corresponds to DB1, and task g corresponds to DB2. Then, as shown in FIG. 4, the task scheduling apparatus can add task a, task b, task c, and task d to DB0's task queue (task queue 0), add task d, task e, and task f to DB1's task queue (task queue 1), and add task g and task a to DB2's task queue (task queue 2).
It should be noted that, since the multiple tasks to be executed are created by the task scheduling apparatus for the M data blocks, the data block corresponding to each of the multiple tasks is specified when the tasks are created.
The task scheduling apparatus can create tasks for different data blocks as required; the tasks it creates differ from one data block to another, as does their number.
Conceivably, when creating a task for a data block, the task scheduling apparatus specifies which data block the task is created for; it can therefore add each created task to the task queue of the data block specified for the task when it was created.
As shown in FIG. 4, the task scheduling apparatus created four tasks for DB0 (task a, task b, task c, and task d), three tasks for DB1 (task d, task e, and task f), and two tasks for DB2 (task g and task a).
It should be noted that, for the method by which the task scheduling apparatus creates tasks for data blocks, reference may be made to related prior-art methods; details are not repeated here.
S102. The task scheduling apparatus executes, in parallel by using N threads, tasks in N task queues of the M task queues, where each of the N threads executes tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2≤N≤M.
The task scheduling apparatus may use the N threads as N scheduler threads to schedule the N task queues in parallel, with each of the N threads executing the tasks in one of the N task queues.
In this embodiment of the present invention, the method by which any one of the N threads executes the tasks in a task queue may be: the thread executes the tasks in the task queue one by one.
For example, as shown in FIG. 5, the task scheduling apparatus may use Thread0, Thread1, and Thread2 to execute the tasks in task queue 0, task queue 1, and task queue 2 in parallel. Specifically, the task scheduling apparatus may simultaneously: use Thread0 to execute task a, task b, task c, and task d in task queue 0; use Thread1 to execute task d, task e, and task f in task queue 1; and use Thread2 to execute task g and task a in task queue 2.
Take FIG. 5, in which Thread0, Thread1, and Thread2 execute the tasks in task queue 0, task queue 1, and task queue 2 in parallel, as an example. Although in this process task a, task b, task c, and task d need to be executed for DB0, task d, task e, and task f for DB1, and task g and task a for DB2, only one thread is used to execute the tasks in the task queue of each data block: Thread0 executes task a, task b, task c, and task d for DB0; Thread1 executes task d, task e, and task f for DB1; and Thread2 executes task g and task a for DB2. It can thus be seen that there is no problem that multiple threads simultaneously contend for one data block or for the lock of one data block.
It is conceivable that the N threads in this embodiment of the present invention may be created by the task scheduling apparatus according to the system configuration. In a multi-core processor system, to make full use of system hardware resources, the number of threads created is usually the same as the current number of processor cores (or hardware threads); that is, N may be the number of processor cores in the task scheduling apparatus. The threads in this embodiment of the present invention are operating system threads (Operating System Thread, OS Thread).
For example, if the number of processor cores in the task scheduling apparatus is 3, the task scheduling apparatus may create three threads: Thread0, Thread1, and Thread2; that is, one thread runs on each processor core.
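A minimal C++ sketch of the one-thread-per-queue arrangement in S102 follows; tasks are modeled here as callables, and the three queues of two tasks each are placeholders for the task queues built in S101, so none of these names come from the described implementation:

    #include <cstdio>
    #include <deque>
    #include <functional>
    #include <thread>
    #include <vector>

    int main() {
        // One queue of runnable tasks per data block (three blocks here).
        std::vector<std::deque<std::function<void()>>> queues(3);
        for (int db = 0; db < 3; ++db)
            for (int k = 0; k < 2; ++k)
                queues[db].push_back([db, k] { std::printf("DB%d task %d\n", db, k); });

        // S102: one thread per queue; in practice N would typically be the core
        // count, e.g. std::thread::hardware_concurrency().
        std::vector<std::thread> threads;
        for (auto& q : queues)
            threads.emplace_back([&q] {
                // Each thread executes, one by one, only the tasks of its own
                // queue, so no two threads ever touch the same data block.
                while (!q.empty()) {
                    q.front()();
                    q.pop_front();
                }
            });
        for (auto& t : threads) t.join();
        return 0;
    }

Because each queue is drained by exactly one thread, the loop body needs no synchronization on the data block; this is precisely the property the data-block-centric scheduling relies on.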
In the task scheduling method provided in this embodiment of the present invention, scheduling may be data-block-centric: each of multiple tasks is added to the task queue of the data block corresponding to the task, and then each of N threads executes the tasks in one task queue. In this way, even if multiple threads run in parallel, because each of the multiple threads executes, for a different data block, the tasks in the task queue of that data block, there is no problem that multiple threads simultaneously contend for one data block or for the lock of one data block.
That is, this solution can reduce the data contention caused by multiple threads simultaneously accessing one data block when multiple threads execute tasks in parallel, thereby improving task scheduling efficiency. Moreover, because no lock mechanism needs to be introduced in this solution, the extra performance overhead brought by locks can be avoided, and the difficulty of detecting and debugging concurrency errors is reduced.
In addition, in this embodiment of the present invention, within a given period of time a thread executes, one by one, the tasks in the task queue of one data block; that is, within that period the thread executes, for the same data block, the tasks corresponding to that data block. The data processed by the thread before and after its multiple task switches within that period is therefore the same. This avoids the problem, in a conventional task-parallel system, of a large number of cache (cache) lines being swapped out and in because different data is processed before and after a task switch, thereby improving memory access efficiency and program performance.
Further, restricted by system performance or system configuration, the number of threads created in a system is limited, whereas the number of data blocks is generally not restricted by system performance or system configuration. Therefore, when the number of data blocks is greater than the number of threads, the task queues of some data blocks have to wait until an idle thread exists among the N threads, and the idle thread is then used to execute the tasks in the task queues of those data blocks.
For example, when N threads are created in the system and the tasks in the task queues of M (2≤N≤M) data blocks currently need to be executed, the N threads can execute in parallel only the tasks in N task queues of the M task queues. The remaining M-N task queues have to wait until an idle thread exists among the N threads, and the idle thread is then used to execute the tasks in these M-N task queues.
On this basis, as shown in FIG. 6, the method in this embodiment of the present invention may further include S101':
S101'. The task scheduling apparatus adds at least one task queue of the M task queues to a wait queue group.
Each task queue of the at least one task queue contains at least one task, and the at least one task has not been executed by a thread of the N threads. The wait queue group is used to store task queues of the M task queues on a first-in-first-out basis.
For example, when the number of threads created in the system is limited and no idle thread is available to execute the task queue of a newly generated data block, the task scheduling apparatus may add the task queue of that data block to the wait queue group. As shown in FIG. 7, the task scheduling apparatus may add the task queue of data block DB3 (task queue 3) and the task queue of DB4 (task queue 4) to the wait queue (Wait Queue, Wait Q) group.
The Wait Q group stores, on a first-in-first-out basis, task queues to which tasks have been added but which have not been executed. In FIG. 7, because the task queue of DB3 is added to the Wait Q group before the task queue of DB4, when any one of the N threads becomes an idle thread, the task scheduling apparatus may preferentially use the idle thread to schedule the task queue of DB3 and execute the tasks in the task queue of DB3.
It can be understood that, after the at least one task queue of the M task queues is added to the wait queue group, any one of the N threads can, once idle, execute tasks in the task queues stored in the wait queue group according to the principle by which the wait queue group stores task queues.
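The wait queue group itself is shared by the scheduler threads, so a sketch of it does need internal protection even though the data blocks themselves do not. The following non-normative C++ sketch (the class and method names are assumptions of this illustration) stores pending task queues in FIFO order; handing the front queue to an idle thread also removes it from the group, which corresponds to S103 and S105 described below:

    #include <deque>
    #include <mutex>

    struct TaskQueue;  // per-data-block task queue, as sketched above

    // FIFO group of task queues that already hold tasks but have no thread yet.
    class WaitQueueGroup {
    public:
        // S101': called when a queue gains tasks while no idle thread exists.
        void add(TaskQueue* q) {
            std::lock_guard<std::mutex> lock(mutex_);
            fifo_.push_back(q);
        }
        // Called by an idle thread; returns the earliest-added queue, or nullptr.
        // Handing the queue out also removes it from the group (cf. S103/S105).
        TaskQueue* take() {
            std::lock_guard<std::mutex> lock(mutex_);
            if (fifo_.empty()) return nullptr;
            TaskQueue* q = fifo_.front();
            fifo_.pop_front();
            return q;
        }
    private:
        std::mutex mutex_;
        std::deque<TaskQueue*> fifo_;
    };

The lock here guards only the scheduler's own bookkeeping structure, not the application data blocks, so the lock-free property of the data accesses is preserved.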
Optionally, as shown in FIG. 6, S102 in FIG. 2 may be replaced with S102a:
S102a. The task scheduling apparatus executes, in parallel by using the N threads, the tasks in the first N task queues in the wait queue group.
The first N task queues are the N task queues added earliest to the wait queue group, and each of the N threads executes, on the first-in-first-out basis, the tasks in a corresponding one of the first N task queues.
Further, after a thread starts to execute a task queue in the wait queue group, that task queue is no longer in the waiting-to-be-executed state. To update the wait queue group in time, so that the wait queue group includes only task queues whose tasks have not been executed by a thread, as shown in FIG. 8, the method in this embodiment of the present invention may further include S103:
S103. The task scheduling apparatus deletes from the wait queue group the task queue in which a task being executed by a thread of the N threads is located.
By deleting in time from the wait queue group the task queue in which a task already executed by a thread of the N threads is located, the wait queue group will not include a task queue in which a task already executed by a thread is located.
It is conceivable that any thread in the system may become idle because it has finished executing all the tasks in the task queue of a data block, or because it has exited from executing a task in the task queue of a data block; that is, the thread is an idle thread. In this case, the idle thread may execute the tasks in the first task queue added to the wait queue group after the foregoing first N task queues.
Specifically, as shown in FIG. 9, the method in this embodiment of the present invention may further include S104:
S104. The task scheduling apparatus uses an idle thread to execute the tasks in a first queue in the wait queue group, where the first queue is the first task queue added to the wait queue group after the foregoing first N task queues.
The idle thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue.
Moreover, to update the task queues in the wait queue group in real time, as shown in FIG. 9, the method in this embodiment of the present invention may further include S105:
S105. The task scheduling apparatus deletes the executed first queue from the wait queue group.
Through the foregoing solution, the task queue in which an already executed task is located is deleted from the wait queue group in time, which ensures that the wait queue group includes only task queues whose tasks have not been executed by a thread.
Further, before S101, the method in this embodiment of the present invention further includes: the task scheduling apparatus creates a task queue for each of the M data blocks.
Each task queue that the task scheduling apparatus creates for a data block is used to store the tasks corresponding to that data block. In this embodiment of the present invention, the task queue initially created by the task scheduling apparatus for each data block is empty.
For example, as shown in FIG. 3, the task scheduling apparatus may, in a data-block-centric manner, create for DB0 a task queue 0 for adding the tasks corresponding to DB0, create for DB1 a task queue 1 for adding the tasks corresponding to DB1, and create for DB2 a task queue 2 for adding the tasks corresponding to DB2.
Further, before the task scheduling apparatus creates a task queue for each of the M data blocks, the task scheduling apparatus may allocate to the M data blocks memory space for storing the task queues of the M data blocks. Specifically, the method in this embodiment of the present invention may further include: the task scheduling apparatus allocates memory space to the M data blocks, where the memory space is used to store the task queue of each of the M data blocks.
For example, the task scheduling apparatus may allocate memory space to the M data blocks according to the data block types, the sizes of the data blocks, and the number of data blocks among the M data blocks.
It should be noted that, for other methods for the task scheduling apparatus to allocate memory space to the M data blocks, reference may be made to related prior-art methods for allocating memory space to data blocks during data processing; details are not described herein again.
Further, in the foregoing embodiments, in S102, S102a, or S104, the method by which the task scheduling apparatus uses one thread (such as a first thread) to execute the tasks in one task queue (such as a second queue) may specifically include Sa-Sb:
Sa. The task scheduling apparatus uses the first thread to read the k-th task in the second queue and switches to the context of the k-th task in the second queue to start execution, where 1≤k<K, and K is the total number of tasks in the second queue.
Sb. If the task scheduling apparatus finishes executing the k-th task in the second queue by using the first thread, the first thread exits the k-th task in the second queue, and the first thread is used to read the (k+1)-th task in the second queue and switch to the context of the (k+1)-th task in the second queue to start execution, until the K tasks in the second queue have all been executed.
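Sa and Sb amount to a per-queue execution loop. In the following C++ sketch (a simplification: a real implementation would save and switch task contexts, for example with user-level fibers, rather than make a plain function call), a task reports whether it ran to completion or exited early, which connects to case Sc described next:

    #include <deque>
    #include <functional>

    // One task of a queue; run() returns true if the task ran to completion and
    // false if it exited early, e.g. to wait for another queue's result (case Sc).
    struct Task {
        std::function<bool()> run;
    };

    // Sa/Sb: one thread (the "first thread") executes the K tasks of one queue
    // (the "second queue") strictly in order; the call stands in for the switch
    // into the task's context.
    void drainQueue(std::deque<Task>& queue) {
        while (!queue.empty()) {
            Task task = queue.front();   // Sa: read the k-th task
            queue.pop_front();
            if (!task.run()) {
                // The task exited without completing; per Sc it is parked in the
                // producing queue, and this thread leaves the queue and is idle.
                break;
            }
            // Sb: task k finished; the loop continues with task k+1 until K.
        }
    }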
Further, the first thread may exit from executing the k-th task in the second queue (that is, a first task) because that task is waiting for a task execution result from a third queue. For this case, the method by which the task scheduling apparatus uses the first thread to execute the tasks in the second queue may further include Sc:
Sc. The task scheduling apparatus makes the first thread exit from executing the first task (the k-th task in the second queue), which is waiting for the task execution result from the third queue; adds the exited first task to the third queue; and after the first task obtains the task execution result, uses a second thread to execute the first task in the second queue.
The second queue is any task queue in the wait queue group, and the third queue is a task queue in the wait queue group different from the second queue.
It can be understood that, after the first thread exits from executing the first task that is waiting for the task execution result, the first thread cannot be used to execute the other tasks in the task queue (the second queue) in which the first task is located. At this point, the first thread becomes an idle thread and can be used to execute tasks in other task queues in the wait queue group.
It can thus be learned that an idle thread in the foregoing embodiments (such as the idle thread described in S104) may be not only a thread of the N threads that has finished executing the tasks in its corresponding task queue, but also a thread of the N threads that has exited from executing a task in its corresponding task queue.
Moreover, after the first task obtains the task execution result, an idle thread may also be used to execute the first task in the second queue; that is, the second thread is an idle thread at the time the first task obtains the task execution result.
Similarly, the second thread may be a thread of the N threads that has finished executing the tasks in its corresponding task queue, or a thread of the N threads that has exited from executing a task in its corresponding task queue.
For example, the task scheduling apparatus may implement "using one thread a to execute the tasks in one task queue" by using the following algorithm program. The specific algorithm program is as follows:
[The algorithm program is reproduced in the original publication as two images (Figure PCTCN2016102055-appb-000001 and Figure PCTCN2016102055-appb-000002); its behavior is described line by line below.]
A detailed description of the foregoing algorithm program is as follows:
Lines 3-18 of the algorithm program loop to determine whether Wait Q (that is, the wait queue group in this embodiment of the present invention) is empty;
if Wait Q is empty, no task queue in the current system is waiting to be processed, and the thread (that is, the scheduler thread) is suspended (see line 19 of the algorithm program);
if Wait Q is not empty, thread a is used to read the first task in task queue 1 in Wait Q (see lines 4-6 of the algorithm program) and switch to the context of the first task of task queue 1 to start execution (see line 8 of the algorithm program), where task queue 1 is, among the task queues currently contained in Wait Q, the task queue added to Wait Q earliest;
after thread a exits from executing the first task in task queue 1, the reason why thread a exited from executing the first task in task queue 1 is determined (see lines 9-16 of the algorithm program);
if thread a exited from executing the first task in task queue 1 because it finished executing that task (see line 9 of the algorithm program), thread a is used to read the second task in task queue 1 and switch to the context of the second task of task queue 1 to start execution (see lines 10-11 of the algorithm program);
if thread a exited from executing the first task in task queue 1 because that task was waiting for a task execution result of task queue t (see line 13 of the algorithm program), the first task in task queue 1 is added to task queue t, and after the first task in task queue 1 obtains the task execution result of task queue t, an idle thread is used to execute the first task in task queue 1 (see line 14 of the algorithm program), where task queue t is a task queue in Wait Q different from task queue 1.
After thread a exits from executing the first task in task queue 1 because that task is waiting for the task execution result of task queue t, thread a becomes an idle thread and can be used to execute the tasks in the next task queue in Wait Q (see line 15 of the algorithm program).
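Because the listing itself survives only as images, the following C++ sketch is a reconstruction inferred from the line-by-line walkthrough above; the names (schedulerThread, ExitReason, producer) are ours, synchronization of the shared Wait Q is omitted (it would be guarded as in the earlier wait-queue-group sketch), and an actual implementation would switch user-level contexts rather than call a function:

    #include <deque>
    #include <functional>

    enum class ExitReason { Completed, WaitingForResult };

    struct TaskQueue;

    // A task runs in its own context; resume() reports why it returned control.
    struct Task {
        std::function<ExitReason()> resume;
        TaskQueue* producer = nullptr;  // "task queue t" whose result this task awaits
    };

    struct TaskQueue {
        std::deque<Task*> tasks;
    };

    std::deque<TaskQueue*> waitQ;  // the Wait Q (wait queue group), FIFO order

    // Control flow of one scheduler thread ("thread a" in the walkthrough).
    void schedulerThread() {
        while (!waitQ.empty()) {                  // lines 3-18: loop while Wait Q non-empty
            TaskQueue* q = waitQ.front();         // the queue added to Wait Q earliest
            waitQ.pop_front();
            while (!q->tasks.empty()) {
                Task* task = q->tasks.front();    // lines 4-6: read the first task
                ExitReason why = task->resume();  // line 8: switch to the task's context
                if (why == ExitReason::Completed) {
                    q->tasks.pop_front();         // lines 9-11: move on to the next task
                } else {
                    q->tasks.pop_front();         // line 13: task blocked on queue t
                    task->producer->tasks.push_back(task);  // line 14: park it there
                    break;                        // line 15: thread a is now idle and
                }                                 // takes the next queue from Wait Q
            }
        }
        // Line 19: Wait Q is empty, nothing to schedule, the scheduler thread suspends.
    }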
In the task scheduling method provided in this embodiment of the present invention, scheduling may be data-block-centric: each of multiple tasks is added to the task queue of the data block corresponding to the task, and then each of N threads executes the tasks in one task queue. In this way, even if multiple threads run in parallel, because each of the multiple threads executes, for a different data block, the tasks in the task queue of that data block, there is no problem that multiple threads simultaneously contend for one data block or for the lock of one data block.
That is, this solution can reduce the data contention caused by multiple threads simultaneously accessing one data block when multiple threads execute tasks in parallel, thereby improving task scheduling efficiency. Moreover, because no lock mechanism needs to be introduced in this solution, the extra performance overhead brought by locks can be avoided, and the difficulty of detecting and debugging concurrency errors is reduced.
In addition, when no idle thread is available to execute the tasks in a task queue, the task queue can be added to the wait queue group; and the task queue in which an already executed task is located can be deleted from the wait queue group in time, so that the wait queue group does not include a task queue in which a task already executed by a thread is located.
Embodiment 2
An embodiment of the present invention provides a task scheduling apparatus. As shown in FIG. 10, the task scheduling apparatus includes: a task adding module 21 and a task execution module 22.
The task adding module 21 is configured to add, according to a correspondence between multiple to-be-executed tasks and M data blocks to be accessed by the multiple tasks, each of the multiple tasks to the task queue of the data block corresponding to the task, where the M data blocks are in one-to-one correspondence with M task queues.
The task execution module 22 is configured to execute, in parallel by using N threads, tasks that the task adding module 21 has added to N task queues of the M task queues, where each of the N threads executes tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2≤N≤M.
Further, as shown in FIG. 11, the task scheduling apparatus may further include: a queue adding module 23.
The queue adding module 23 is configured to add at least one task queue of the M task queues to a wait queue group, where each task queue of the at least one task queue contains at least one task, and the at least one task has not been executed by a thread of the N threads.
The wait queue group is used to store task queues of the M task queues on a first-in-first-out basis.
It is conceivable that the tasks in each of the at least one task queue are added to the corresponding task queue by the task adding module 21.
Further, the task execution module 22 is specifically configured to:
execute, in parallel by using the N threads, the tasks in the first N task queues in the wait queue group.
The first N task queues are the N task queues added earliest to the wait queue group, and each of the N threads executes, on the first-in-first-out basis, the tasks in a corresponding one of the first N task queues.
Further, the task execution module 22 is further configured to use an idle thread to execute the tasks in a first queue in the wait queue group, where the first queue is the first task queue added to the wait queue group after the foregoing first N task queues.
As shown in FIG. 12, the task scheduling apparatus may further include: a queue deletion module 24.
The queue deletion module 24 is configured to delete, from the wait queue group, the first queue executed by the task execution module 22.
The idle thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the idle thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
Further, the multiple tasks include a first task belonging to a second queue, where the first task waits, in the process of being executed by a first thread, for a task execution result from a third queue; the first thread is a thread of the N threads configured to execute the tasks in the second queue; the second queue is any task queue in the wait queue group; and the third queue is a task queue in the wait queue group different from the second queue.
The task scheduling apparatus may further include: a task control module.
The task control module is configured to make the first thread exit from executing the first task that is waiting for the task execution result.
Correspondingly, the task adding module 21 is further configured to add the exited first task to the third queue.
The task execution module 22 is further configured to: after the first task obtains the task execution result, use a second thread to execute the first task in the second queue.
The second thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the second thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
Further, the queue deletion module 24 is further configured to delete from the wait queue group the task queue in which a task executed by a thread of the N threads is located.
It should be noted that, for detailed descriptions of the functional modules in the task scheduling apparatus provided in this embodiment of the present invention, reference may be made to the related content in the method embodiments of the present invention; details are not described herein again in this embodiment.
The task scheduling apparatus provided in this embodiment of the present invention can work in a data-block-centric manner: each of multiple tasks is added to the task queue of the data block corresponding to the task, and then each of N threads executes the tasks in one task queue. In this way, even if multiple threads run in parallel, because each of the multiple threads executes, for a different data block, the tasks in the task queue of that data block, there is no problem that multiple threads simultaneously contend for one data block or for the lock of one data block.
That is, this solution can reduce the data contention caused by multiple threads simultaneously accessing one data block when multiple threads execute tasks in parallel, thereby improving task scheduling efficiency. Moreover, because no lock mechanism needs to be introduced in this solution, the extra performance overhead brought by locks can be avoided, and the difficulty of detecting and debugging concurrency errors is reduced.
In addition, in this embodiment of the present invention, within a given period of time a thread executes, one by one, the tasks in the task queue of one data block; that is, within that period the thread executes, for the same data block, the tasks corresponding to that data block. The data processed by the thread before and after its multiple task switches within that period is therefore the same. This avoids the problem, in a conventional task-parallel system, of a large number of cache (cache) lines being swapped out and in because different data is processed before and after a task switch, thereby improving memory access efficiency and program performance.
Embodiment 3
An embodiment of the present invention provides a task scheduling apparatus. As shown in FIG. 13, the task scheduling apparatus includes:
one or more processors 31, a memory 32, a bus system 33, and one or more application programs, where the one or more processors 31 and the memory 32 are connected by using the bus system 33; the one or more application programs are stored in the memory 32, and the one or more application programs include instructions.
The processor 31 is configured to execute the instructions and is specifically configured to perform, in place of the foregoing task adding module 21, task execution module 22, queue adding module 23, queue deletion module 24, and the like, the task scheduling method shown in any one of FIG. 2, FIG. 6, FIG. 8, and FIG. 9. That is, the processor 31 may be an integration of functional units or functional modules such as the task adding module 21, the task execution module 22, the queue adding module 23, and the queue deletion module 24; in other words, the foregoing functional modules may be integrated into one processor 31 for implementation.
The processor 31 may be a central processing unit (Central Processing Unit, CPU), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The bus may be an Industry Standard Architecture (Industry Standard Architecture, ISA) bus, a Peripheral Component Interconnect (Peripheral Component Interconnect, PCI) bus, an Extended Industry Standard Architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of representation, the bus is represented by only one bold line in FIG. 13, but this does not mean that there is only one bus or one type of bus.
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores one or more pieces of program code, the one or more pieces of program code include instructions, and when the processor 31 of the task scheduling apparatus executes the instructions, the task scheduling apparatus performs the task scheduling method shown in any one of FIG. 2, FIG. 6, FIG. 8, and FIG. 9.
The computer-readable storage medium may include a high-speed RAM memory, and may further include a non-volatile memory (non-volatile memory), for example, at least one magnetic disk memory.
It should be noted that the foregoing program code may serve as a component of an embedded operating system running on the task scheduling apparatus, or as a component of various application programs running on the task scheduling apparatus. Comparatively, when the task scheduling method provided in the foregoing embodiments serves as a component of the embedded operating system, no modification to the application programs is required, and the implementation difficulty and modification workload are relatively small.
In a specific implementation process, each step in the method procedures shown in any one of FIG. 2, FIG. 6, FIG. 8, and FIG. 9 may be implemented by a task scheduling apparatus in hardware form executing program code in software form stored in the foregoing non-volatile storage medium.
It should be noted that, for specific descriptions of each functional module in the task scheduling apparatus provided in this embodiment of the present invention, reference may be made to the related descriptions of the corresponding parts in the method embodiments of the present invention; details are not described herein again in this embodiment.
The task scheduling apparatus provided in this embodiment of the present invention can work in a data-block-centric manner: each of multiple tasks is added to the task queue of the data block corresponding to the task, and then each of N threads executes the tasks in one task queue. In this way, even if multiple threads run in parallel, because each of the multiple threads executes, for a different data block, the tasks in the task queue of that data block, there is no problem that multiple threads simultaneously contend for one data block or for the lock of one data block.
That is, this solution can reduce the data contention caused by multiple threads simultaneously accessing one data block when multiple threads execute tasks in parallel, thereby improving task scheduling efficiency. Moreover, because no lock mechanism needs to be introduced in this solution, the extra performance overhead brought by locks can be avoided, and the difficulty of detecting and debugging concurrency errors is reduced.
In addition, in this embodiment of the present invention, within a given period of time a thread executes, one by one, the tasks in the task queue of one data block; that is, within that period the thread executes, for the same data block, the tasks corresponding to that data block. The data processed by the thread before and after its multiple task switches within that period is therefore the same. This avoids the problem, in a conventional task-parallel system, of a large number of cache (cache) lines being swapped out and in because different data is processed before and after a task switch, thereby improving memory access efficiency and program performance.
From the foregoing descriptions of the implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the foregoing functional modules is used as an example for illustration. In practical application, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiments are merely examples. For example, the division of the modules or units is merely a logical function division, and there may be other division manners in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or some of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to perform all or some of the steps of the methods in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (13)

  1. A task scheduling method, wherein the method comprises:
    adding, according to a correspondence between multiple to-be-executed tasks and M data blocks to be accessed by the multiple tasks, each of the multiple tasks to the task queue of the data block corresponding to the task, wherein the M data blocks are in one-to-one correspondence with M task queues; and
    executing, in parallel by using N threads, tasks in N task queues of the M task queues, wherein each of the N threads executes tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2≤N≤M.
  2. The method according to claim 1, wherein the method further comprises:
    adding at least one task queue of the M task queues to a wait queue group, wherein each task queue of the at least one task queue contains at least one task, and the at least one task has not been executed by a thread of the N threads;
    wherein the wait queue group is used to store task queues of the M task queues on a first-in-first-out basis.
  3. The method according to claim 2, wherein the executing, in parallel by using N threads, tasks in N task queues of the M task queues comprises:
    executing, in parallel by using the N threads, the tasks in the first N task queues in the wait queue group;
    wherein the first N task queues are the N task queues added earliest to the wait queue group, and each of the N threads executes, on the first-in-first-out basis, the tasks in a corresponding one of the first N task queues.
  4. The method according to claim 3, wherein the method further comprises:
    using an idle thread to execute the tasks in a first queue in the wait queue group, wherein the first queue is the first task queue added to the wait queue group after the first N task queues; and
    deleting the executed first queue from the wait queue group;
    wherein the idle thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the idle thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
  5. The method according to any one of claims 1-4, wherein the multiple tasks comprise a first task belonging to a second queue, the first task waits, in the process of being executed by a first thread, for a task execution result from a third queue, the first thread is a thread of the N threads configured to execute the tasks in the second queue, the second queue is any task queue in the wait queue group, and the third queue is a task queue in the wait queue group different from the second queue; and the method further comprises:
    making the first thread exit from executing the first task that is waiting for the task execution result;
    adding the exited first task to the third queue; and
    after the first task obtains the task execution result, using a second thread to execute the first task in the second queue;
    wherein the second thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the second thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
  6. The method according to any one of claims 2-4, wherein the method further comprises:
    deleting from the wait queue group the task queue in which a task executed by a thread of the N threads is located.
  7. A task scheduling apparatus, wherein the apparatus comprises:
    a task adding module, configured to add, according to a correspondence between multiple to-be-executed tasks and M data blocks to be accessed by the multiple tasks, each of the multiple tasks to the task queue of the data block corresponding to the task, wherein the M data blocks are in one-to-one correspondence with M task queues; and
    a task execution module, configured to execute, in parallel by using N threads, tasks that the task adding module has added to N task queues of the M task queues, wherein each of the N threads executes tasks in one of the N task queues, different threads of the N threads execute tasks in different task queues, and 2≤N≤M.
  8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a queue adding module, configured to add at least one task queue of the M task queues to a wait queue group, wherein each task queue of the at least one task queue contains at least one task, and the at least one task has not been executed by a thread of the N threads;
    wherein the wait queue group is used to store task queues of the M task queues on a first-in-first-out basis.
  9. The apparatus according to claim 8, wherein the task execution module is specifically configured to:
    execute, in parallel by using the N threads, the tasks in the first N task queues in the wait queue group;
    wherein the first N task queues are the N task queues added earliest to the wait queue group, and each of the N threads executes, on the first-in-first-out basis, the tasks in a corresponding one of the first N task queues.
  10. The apparatus according to claim 9, wherein the task execution module is further configured to use an idle thread to execute the tasks in a first queue in the wait queue group, wherein the first queue is the first task queue added to the wait queue group after the first N task queues;
    the apparatus further comprises:
    a queue deletion module, configured to delete, from the wait queue group, the first queue executed by the task execution module;
    wherein the idle thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the idle thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
  11. The apparatus according to any one of claims 7-10, wherein the multiple tasks comprise a first task belonging to a second queue, the first task waits, in the process of being executed by a first thread, for a task execution result from a third queue, the first thread is a thread of the N threads configured to execute the tasks in the second queue, the second queue is any task queue in the wait queue group, and the third queue is a task queue in the wait queue group different from the second queue; and the apparatus further comprises:
    a task control module, configured to make the first thread exit from executing the first task that is waiting for the task execution result;
    correspondingly, the task adding module is further configured to add the exited first task to the third queue; and
    the task execution module is further configured to: after the first task obtains the task execution result, use a second thread to execute the first task in the second queue;
    wherein the second thread is a thread of the N threads that has finished executing the tasks in its corresponding task queue; or the second thread is a thread of the N threads that has exited from executing a task in its corresponding task queue.
  12. The apparatus according to claim 10, wherein the queue deletion module is further configured to delete from the wait queue group the task queue in which a task executed by a thread of the N threads is located.
  13. A task scheduling apparatus, wherein the apparatus comprises: one or more processors, a memory, a bus, and a communications interface;
    wherein the memory is configured to store computer-executable instructions, the processor is connected to the memory by using the bus, and when the task scheduling apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the task scheduling apparatus performs the task scheduling method according to any one of claims 1-6.
PCT/CN2016/102055 2016-03-29 2016-10-13 Task scheduling method and apparatus WO2017166777A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16896547.3A EP3425502B1 (en) 2016-03-29 2016-10-13 Task scheduling method and device
US16/145,607 US10891158B2 (en) 2016-03-29 2018-09-28 Task scheduling method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610188139.2 2016-03-29
CN201610188139.2A CN105893126B (zh) 2016-03-29 2016-03-29 一种任务调度方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/145,607 Continuation US10891158B2 (en) 2016-03-29 2018-09-28 Task scheduling method and apparatus

Publications (1)

Publication Number Publication Date
WO2017166777A1 true WO2017166777A1 (zh) 2017-10-05

Family

ID=57014916

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/102055 WO2017166777A1 (zh) Task scheduling method and apparatus

Country Status (4)

Country Link
US (1) US10891158B2 (zh)
EP (1) EP3425502B1 (zh)
CN (1) CN105893126B (zh)
WO (1) WO2017166777A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120550A (zh) * Lock-free processing method and apparatus
WO2023077436A1 (en) * 2021-11-05 2023-05-11 Nvidia Corporation Thread specialization for collaborative data transfer and computation

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893126B (zh) Task scheduling method and apparatus
CN108228240B (zh) Method and apparatus for processing tasks in multi-task queues
CN106991071B (zh) Kernel scheduling method and system
CN107301087A (zh) Performance improvement method and apparatus for a multi-threaded system
CN107608773B (zh) Task concurrent processing method and apparatus, and computing device
CN107526551B (zh) CPU multi-core IO request processing method, apparatus, and device
CN108182158A (zh) Task scheduling optimization method applied in a storage system
CN108509277A (zh) Electronic lock serial port asynchronous receiving processing system and method
CN110543148B (zh) Task scheduling method and apparatus
CN109725995B (zh) Data extraction task execution method, apparatus, and device, and readable storage medium
CN110673944B (zh) Method and apparatus for executing tasks
CN108958944A (zh) Multi-core processing system and task allocation method thereof
CN109117260B (zh) Task scheduling method, apparatus, device, and medium
CN109840149B (zh) Task scheduling method, apparatus, device, and storage medium
CN109933426B (zh) Service invocation processing method and apparatus, electronic device, and readable storage medium
CN111045797A (zh) Task scheduling execution method, related apparatus, and medium
CN111145410B (zh) Access control permission calculation method
CN111290846B (zh) Distributed task scheduling method and system
CN111475300B (zh) Multi-thread multi-task management method and terminal
CN111488220A (zh) Startup request processing method and apparatus, and electronic device
CN111736976B (zh) Task processing method and apparatus, computing device, and medium
CN112860401B (zh) Task scheduling method and apparatus, electronic device, and storage medium
CN113342898B (zh) Data synchronization method and apparatus
CN113703941B (zh) Task scheduling method and system, and electronic device
CN114528113B (zh) Thread lock management system, method, and device, and readable medium
CN115114359B (zh) User data processing method and apparatus
CN115080247B (zh) High-availability thread pool switching method and apparatus
CN115292025A (zh) Task scheduling method and apparatus, computer device, and computer-readable storage medium
CN116450324A (zh) Task processing method, apparatus, device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101005486A (zh) * Resource access control method and system
CN102902573A (zh) * Shared-resource-based task processing method and apparatus
WO2015070789A1 (en) * 2013-11-14 2015-05-21 Mediatek Inc. Task scheduling method and related non-transitory computer readable medium for dispatching task in multi-core processor system based at least partly on distribution of tasks sharing same data and/or accessing same memory address (es)
CN105893126A (zh) Task scheduling method and apparatus

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL137085A (en) * 2000-06-29 2004-08-31 Eci Telecom Ltd Method for effective utilizing of shared resources in computerized systems
US7650601B2 (en) * 2003-12-04 2010-01-19 International Business Machines Corporation Operating system kernel-assisted, self-balanced, access-protected library framework in a run-to-completion multi-processor environment
US8549524B2 (en) * 2009-12-23 2013-10-01 Sap Ag Task scheduler for cooperative tasks and threads for multiprocessors and multicore systems
US8381004B2 (en) * 2010-05-26 2013-02-19 International Business Machines Corporation Optimizing energy consumption and application performance in a multi-core multi-threaded processor system
US9928105B2 (en) * 2010-06-28 2018-03-27 Microsoft Technology Licensing, Llc Stack overflow prevention in parallel execution runtime
US8954986B2 (en) * 2010-12-17 2015-02-10 Intel Corporation Systems and methods for data-parallel processing
CN102541653B (zh) Multi-task thread pool scheduling method and system
US9250953B2 (en) * 2013-11-12 2016-02-02 Oxide Interactive Llc Organizing tasks by a hierarchical task scheduler for execution in a multi-threaded processing system
CN103955491B (zh) Method for scheduled incremental data synchronization
CN104375882B (zh) Multilevel nested data-driven computation method matched to high-performance computer architecture
CN104899099A (zh) Thread-pool-based task allocation method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101005486A (zh) * Resource access control method and system
CN102902573A (zh) * Shared-resource-based task processing method and apparatus
WO2015070789A1 (en) * 2013-11-14 2015-05-21 Mediatek Inc. Task scheduling method and related non-transitory computer readable medium for dispatching task in multi-core processor system based at least partly on distribution of tasks sharing same data and/or accessing same memory address (es)
CN105893126A (zh) Task scheduling method and apparatus

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120550A (zh) * Lock-free processing method and apparatus
CN109120550B (zh) * Lock-free processing method and apparatus
WO2023077436A1 (en) * 2021-11-05 2023-05-11 Nvidia Corporation Thread specialization for collaborative data transfer and computation

Also Published As

Publication number Publication date
EP3425502A4 (en) 2019-03-27
EP3425502B1 (en) 2022-04-20
CN105893126B (zh) 2019-06-11
EP3425502A1 (en) 2019-01-09
US20190034230A1 (en) 2019-01-31
CN105893126A (zh) 2016-08-24
US10891158B2 (en) 2021-01-12

Similar Documents

Publication Publication Date Title
WO2017166777A1 (zh) Task scheduling method and apparatus
US10037222B2 (en) Virtualization of hardware accelerator allowing simultaneous reading and writing
WO2016112701A9 (zh) 异构多核可重构计算平台上任务调度的方法和装置
US8739171B2 (en) High-throughput-computing in a hybrid computing environment
US8914805B2 (en) Rescheduling workload in a hybrid computing environment
CN106371894B (zh) 一种配置方法、装置和数据处理服务器
US8413158B2 (en) Processor thread load balancing manager
WO2017070900A1 (zh) 多核数字信号处理系统中处理任务的方法和装置
US8516487B2 (en) Dynamic job relocation in a high performance computing system
WO2011103825A2 (zh) 多处理器系统负载均衡的方法和装置
WO2011123991A1 (zh) 并行计算的内存访问方法
CN103744716A (zh) 一种基于当前vcpu调度状态的动态中断均衡映射方法
WO2013185571A1 (zh) 多线程虚拟流水线处理器的线程控制和调用方法及其处理器
TW202246977A (zh) 一種任務調度方法、任務調度裝置、電腦設備、電腦可讀儲存媒介和電腦程式產品
JP2015504226A (ja) マルチスレッドコンピューティング
CN114168271B (zh) 一种任务调度方法、电子设备及存储介质
US10289306B1 (en) Data storage system with core-affined thread processing of data movement requests
CN112925616A (zh) 任务分配方法、装置、存储介质及电子设备
CN112789593A (zh) 一种基于多线程的指令处理方法及装置
US9437299B2 (en) Systems and methods for order scope transitions using cam
CN108255572A (zh) 一种vcpu切换方法和物理主机
US9619277B2 (en) Computer with plurality of processors sharing process queue, and process dispatch processing method
CN115964164A (zh) 计算机实现的方法、硬件加速器以及存储介质
US9176910B2 (en) Sending a next request to a resource before a completion interrupt for a previous request
CN111459620A (zh) 安全容器操作系统到虚拟机监控器的信息调度方法

Legal Events

NENP: Non-entry into the national phase (Ref country code: DE)
WWE: Wipo information: entry into national phase (Ref document number: 2016896547; Country of ref document: EP)
ENP: Entry into the national phase (Ref document number: 2016896547; Country of ref document: EP; Effective date: 20181004)
121: Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16896547; Country of ref document: EP; Kind code of ref document: A1)