TW201415409A - System and method of dynamic task allocation

Info

Publication number: TW201415409A
Application number: TW101138022A
Authority: TW
Inventors: Guang-Jian Wang; Wen-Wu Wu; Xiao-Jun Fu
Original assignee: Hon Hai Prec Ind Co Ltd
Priority date: 2012-10-09
Filing date: 2012-10-16
Publication date: 2014-04-16
Also published as: CN103713949A

Abstract

The present invention provides a system and method for dynamic task allocation. The method includes: evaluating the actual computing capability of a GPU and a CPU; decomposing a task into N subtasks; determining one of the subtasks as the subtask to be allocated; computing a first estimated completion time according to the actual computing capability of the GPU, the first estimated completion time being the time needed for the GPU to complete the subtask to be allocated; computing a second estimated completion time according to the actual computing capability of the CPU, the second estimated completion time being the time needed for the CPU to complete the subtask to be allocated; sorting the first and second estimated completion times by length; and allocating the subtask to a task queue according to the sorting result. The invention can be used to allocate tasks between a GPU and a CPU.

Description

Dynamic task allocation system and method

The present invention relates to a system and method for dynamic task allocation in a computer.

At present, a computer based on the CUDA (Compute Unified Device Architecture) architecture allocates tasks according to a fixed allocation mode. However, computing capability differs greatly between different CPUs and different GPUs; the number of CPUs and the number of graphics cards each CPU can control vary; and the video memory capacity and speed of each graphics card vary as well. The current CUDA architecture does not take this heterogeneity into account, and the task allocation method it uses does not make good use of the computing capability of the CPU and the GPU.

In view of the above, it is necessary to provide a dynamic task allocation system and method that can effectively allocate tasks to the GPU and the CPU for processing.

The dynamic task allocation system includes: an evaluation module, configured to evaluate the actual computing capability of the GPU and the CPU; a decomposition module, configured to decompose a new task into N subtasks, where N is an integer greater than or equal to 1; a determining module, configured to determine one of the decomposed subtasks as the subtask to be allocated; a computing module, configured to compute, when the subtask to be allocated can be executed by the GPU and according to the actual computing capability of the GPU, a first estimated completion time required for the subtask to be executed by the GPU, the first estimated completion time being equal to the sum of the time the GPU needs to execute the subtask and the time the GPU needs to execute the pending tasks in its current task queue; the computing module being further configured to compute, according to the actual computing capability of the CPU, a second estimated completion time required for the subtask to be executed by the CPU, the second estimated completion time being equal to the sum of the time the CPU needs to execute the subtask and the time the CPU needs to execute the pending tasks in its current task queue; a sorting module, configured to sort the first and second estimated completion times of the subtask to be allocated by length; and an allocation module, configured to allocate the subtask, according to the sorting result, to the task queue with the shortest estimated completion time for that subtask.

The dynamic task allocation method includes: an evaluation step of evaluating the actual computing capability of the GPU and the CPU; a decomposition step of decomposing a new task into N subtasks, where N is an integer greater than or equal to 1; a determining step of determining one of the decomposed subtasks as the subtask to be allocated; a first computing step of computing, when the subtask to be allocated can be executed by the GPU and according to the actual computing capability of the GPU, a first estimated completion time required for the subtask to be executed by the GPU, the first estimated completion time being equal to the sum of the time the GPU needs to execute the subtask and the time the GPU needs to execute the pending tasks in its current task queue; a second computing step of computing, according to the actual computing capability of the CPU, a second estimated completion time required for the subtask to be executed by the CPU, the second estimated completion time being equal to the sum of the time the CPU needs to execute the subtask and the time the CPU needs to execute the pending tasks in its current task queue; a sorting step of sorting the first and second estimated completion times of the subtask to be allocated by length; and an allocation step of allocating the subtask, according to the sorting result, to the task queue with the shortest estimated completion time for that subtask.

Compared with the prior art, the dynamic task allocation system and method can effectively allocate tasks to the GPU and the CPU for processing and make full use of the computing capability of both, thereby improving the task execution efficiency of the computer.

FIG. 1 is a diagram of the operating environment of the dynamic task allocation system of the present invention. In this embodiment, the dynamic task allocation system 10 runs in a computer 100 having a CUDA (Compute Unified Device Architecture) architecture. The computer 100 includes at least one GPU (Graphics Processing Unit) 20, at least one CPU (Central Processing Unit) 30, and a memory 40. The dynamic task allocation system 10 is stored in the memory 40 and, when the computer 100 has a new task to process, decomposes the new task and effectively allocates it to the GPU 20 and the CPU 30 for processing.

In this embodiment, the dynamic task allocation system 10 includes an evaluation module 101, a decomposition module 102, a determining module 103, a computing module 104, a sorting module 105, an allocation module 106, an identification module 107, and a judging module 108 (see FIG. 2). A module, as the term is used in the present invention, is a program segment that performs a specific function; the function of each module is described in detail with reference to the flowchart of FIG. 3.

FIG. 3 is a flowchart of a preferred embodiment of the dynamic task allocation system of the present invention. For clarity, this embodiment is described with the computer 100 including one GPU 20 and one CPU 30; in other embodiments, the computer 100 may include multiple GPUs 20 and multiple CPUs 30.

Step S1: the evaluation module 101 evaluates the actual computing capability of the GPU 20 and the CPU 30. The capability can be computed from the peak value of the industry-standard GFLOPS metric (giga floating-point operations per second, that is, billions of floating-point operations executed per second), or obtained with the capability estimation method provided by the manufacturers of the GPU 20 and CPU 30 chips. For example, the evaluation may find that the computing capability of the GPU 20 is 200 GFLOPS and that of the CPU 30 is 20 GFLOPS.
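The text does not say how the peak GFLOPS value is computed. One commonly used approximation multiplies the number of cores, the clock rate, and the floating-point operations each core completes per cycle. The sketch below assumes that formula; the function name `peak_gflops` and all hardware figures are hypothetical and were chosen only so that the results match the 200 GFLOPS and 20 GFLOPS of the example.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per core per cycle.

    This is an assumed approximation, not a formula taken from the patent.
    """
    return cores * clock_ghz * flops_per_cycle


# Hypothetical hardware figures, picked to reproduce the example capabilities.
gpu_gflops = peak_gflops(cores=100, clock_ghz=1.0, flops_per_cycle=2)
cpu_gflops = peak_gflops(cores=4, clock_ghz=2.5, flops_per_cycle=2)

print(gpu_gflops, cpu_gflops)
```

A measured benchmark or a vendor-supplied estimate could replace this calculation without changing the later steps, since they only consume the resulting capability figures.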

The actual computing capability of the GPU 20 and the CPU 30 needs to be evaluated only once each time the computer 100 boots into the operating system.

Step S2: the decomposition module 102 decomposes the new task into N subtasks, where N is an integer greater than or equal to 1. For example, following the known decomposition principle that data parallelism is preferred over task parallelism, a new task M can be decomposed into two parallel subtasks M1 and M2 and a subtask M3 that can be executed only after M1 and M2 have finished.

Step S3: the determining module 103 determines one of the decomposed subtasks as the subtask to be allocated. For example, if the new task M is decomposed into the parallel subtasks M1 and M2 and a subtask M3 that can be executed only after M1 and M2 have finished, the determining module 103 determines that M1 and M2 must be executed first. Because M1 and M2 can be processed in parallel, the determining module 103 can randomly select one of them, for example M1, to be allocated first; that is, it determines subtask M1 as the subtask to be allocated.
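The decomposition and selection in steps S2 and S3 can be sketched as a small dependency check. This is a minimal illustration, not the patented implementation: the `Subtask` class and the `ready_subtasks` function are hypothetical names, and the sketch returns all eligible subtasks in list order, whereas the embodiment picks one of them at random.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    depends_on: frozenset = frozenset()  # names that must finish first


def ready_subtasks(subtasks, finished, assigned):
    """Return subtasks not yet assigned whose prerequisites have all finished."""
    return [t for t in subtasks
            if t.name not in assigned and t.depends_on <= finished]


# Task M from the example: M1 and M2 are parallel, M3 waits for both.
subtasks = [
    Subtask("M1"),
    Subtask("M2"),
    Subtask("M3", frozenset({"M1", "M2"})),
]

first_round = [t.name for t in ready_subtasks(subtasks, set(), set())]
second_round = [t.name for t in ready_subtasks(subtasks, set(), {"M1"})]
final_round = [t.name for t in ready_subtasks(subtasks, {"M1", "M2"}, {"M1", "M2"})]

print(first_round, second_round, final_round)
```

Replacing the list order with `random.choice` over the returned list would match the random selection described in the embodiment.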

Step S4: the computing module 104 computes, according to the evaluated actual computing capability of the GPU 20, the first estimated completion time required for the subtask M1 to be executed by the GPU 20, and computes, according to the evaluated actual computing capability of the CPU 30, the second estimated completion time required for the subtask M1 to be executed by the CPU 30. The first estimated completion time equals the sum of the time the GPU 20 needs to execute the subtask to be allocated and the time the GPU 20 needs to process the pending tasks in its current task queue; the pending tasks in the current task queue of the GPU 20 are the tasks that have been allocated to that queue but not yet executed. The second estimated completion time equals the sum of the time the CPU 30 needs to execute the subtask to be allocated and the time the CPU 30 needs to process the pending tasks in its current task queue; the pending tasks in the current task queue of the CPU 30 are the tasks that have been allocated to that queue but not yet executed.

For example, based on the actual computing capability of the GPU 20, the time the GPU 20 needs to execute subtask M1 is estimated at 0.01 seconds, and the GPU 20 needs 11 seconds to execute the pending tasks in its current task queue (these times can be calculated by running data prepared in advance when the program starts). The first estimated completion time of subtask M1 on the GPU 20 is therefore 11.01 seconds.

Likewise, based on the actual computing capability of the CPU 30, the time the CPU 30 needs to execute subtask M1 is calculated to be 0.1 seconds, and the CPU 30 needs 10 seconds to execute the pending tasks in its current task queue. The second estimated completion time of subtask M1 on the CPU 30 is therefore 10.1 seconds.
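The arithmetic of the two examples can be reproduced with a short sketch. It assumes, purely for illustration, that execution time is the subtask's workload divided by the device's evaluated capability; the patent only says the times can be derived by running prepared data. The 2 GFLOP workload for M1 is a hypothetical value chosen because it yields 0.01 seconds at 200 GFLOPS and 0.1 seconds at 20 GFLOPS.

```python
def execution_seconds(work_gflop: float, capability_gflops: float) -> float:
    """Assumed model: run time = workload / evaluated capability."""
    return work_gflop / capability_gflops


def estimated_completion(exec_seconds: float, pending_seconds: float) -> float:
    """Estimated completion time = own run time + time for queued pending tasks."""
    return exec_seconds + pending_seconds


M1_WORK_GFLOP = 2.0  # hypothetical workload, not stated in the patent

# GPU 20: 200 GFLOPS, 11 s of pending work; CPU 30: 20 GFLOPS, 10 s pending.
gpu_eta = estimated_completion(execution_seconds(M1_WORK_GFLOP, 200.0), 11.0)
cpu_eta = estimated_completion(execution_seconds(M1_WORK_GFLOP, 20.0), 10.0)

print(f"GPU: {gpu_eta:.2f} s, CPU: {cpu_eta:.2f} s")
```

Although the GPU runs M1 ten times faster, its longer queue makes its estimated completion time the later of the two.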

Note that if, while computing the first estimated completion time of a subtask (that is, its estimated completion time on the GPU 20), the computing module 104 finds that the subtask cannot be executed by the GPU 20, for example because it is a logical judgment, the computing module 104 can directly set the first estimated completion time of the subtask to infinity and then compute, according to the actual computing capability of the CPU 30, the second estimated completion time of the subtask on the CPU 30.
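Setting the estimate to infinity means no special case is needed later: any finite CPU estimate compares as shorter. A minimal sketch, with the function name `first_estimated_completion` and the `gpu_executable` flag as hypothetical names:

```python
import math


def first_estimated_completion(exec_seconds: float,
                               pending_seconds: float,
                               gpu_executable: bool) -> float:
    """GPU-side estimate; infinity marks a subtask the GPU cannot run."""
    if not gpu_executable:
        return math.inf
    return exec_seconds + pending_seconds


# A subtask that is, for example, a logical judgment and so not GPU-executable.
gpu_eta = first_estimated_completion(0.01, 11.0, gpu_executable=False)
cpu_eta = 0.1 + 10.0  # second estimated completion time on the CPU

cpu_wins = cpu_eta < gpu_eta
print(gpu_eta, cpu_wins)
```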

Step S5: the sorting module 105 sorts the first and second estimated completion times of the subtask to be allocated.

For example, the first estimated completion time of subtask M1 on the GPU 20 and the second estimated completion time on the CPU 30, both computed in step S4, are sorted from shortest to longest. The second estimated completion time of subtask M1 on the CPU 30 (10.1 seconds) is shorter than its first estimated completion time on the GPU 20 (11.01 seconds).

Step S6: the allocation module 106 allocates the subtask to be allocated to the task queue with the shortest estimated completion time for that subtask.

For example, according to the sorting result of step S5, subtask M1 is allocated to the task queue of the CPU 30.

Note that if the first estimated completion time of a subtask on the GPU 20 equals its second estimated completion time on the CPU 30, the allocation module 106 allocates the subtask to the task queue of the GPU 20. The reason is that the GPU 20 cannot handle the operating system's scheduling tasks, so as many computing tasks as possible are handed to the GPU 20 in order to preserve the general-purpose processing capacity of the CPU 30.
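Steps S5 and S6, including the tie rule, can be sketched by sorting on a pair: the estimated completion time first, then a priority value that places the GPU ahead of the CPU when the times are equal. The function name `choose_queue` and the priority encoding are illustrative assumptions.

```python
def choose_queue(gpu_eta: float, cpu_eta: float) -> str:
    """Sort the two estimates from shortest to longest; a tie goes to the GPU."""
    ranked = sorted([
        (gpu_eta, 0, "GPU"),  # priority 0 sorts first on equal times
        (cpu_eta, 1, "CPU"),
    ])
    return ranked[0][2]


print(choose_queue(11.01, 10.1))  # the M1 example from step S4
print(choose_queue(5.0, 5.0))     # equal estimates
```

With the example values, M1 goes to the CPU queue, while equal estimates resolve to the GPU queue.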

Step S7: if the subtask is allocated to the task queue of the GPU 20, the identification module 107 marks the time the GPU 20 needs to execute the subtask; if the subtask is allocated to the task queue of the CPU 30, the identification module 107 marks the time the CPU 30 needs to execute the subtask. For example, after subtask M1 is allocated to the CPU 30, the task named M1 is marked as requiring 0.1 seconds of processing on the CPU 30. The purpose of this step is to make it easy for step S4, when computing the first and second estimated completion times of a subtask, to calculate the time the GPU 20 or the CPU 30 needs to execute the pending tasks in its current task queue.

Step S8: the judging module 108 judges whether any subtasks remain unallocated. If so, the flow returns to step S3; otherwise the flow ends. For example, if subtasks M2 and M3 have not yet been allocated, the flow returns to step S3 and the determining module 103 determines the next subtask to be allocated. Because M2 can run in parallel with M1, the determining module 103 can determine subtask M2 as the next subtask to be allocated. Note also that because subtask M3 can be executed only after M1 and M2 have finished, the determining module 103 must first confirm that M1 and M2 have finished executing before it can determine M3 as the subtask to be allocated.
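The loop formed by steps S3 to S8 can be sketched end to end for one GPU and one CPU. This is a simplified illustration under stated assumptions: the `Device` class and `allocate` function are hypothetical names, subtasks are passed in an order that already respects their dependencies (the wait for M1 and M2 to finish before M3 is selected is omitted), and the run times for M2 and M3 are invented for the example. Only the M1 times and the initial queue loads come from the text.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    kind: str  # "GPU" or "CPU"
    queue: list = field(default_factory=list)  # (task name, marked seconds)

    def pending_seconds(self) -> float:
        """Time needed for tasks already queued, using the times marked in S7."""
        return sum(seconds for _, seconds in self.queue)


def allocate(subtask_names, devices, exec_seconds):
    """Place each subtask on the queue with the shortest estimated completion."""
    placement = {}
    for name in subtask_names:                        # S3, repeated via S8
        candidates = []
        for dev in devices:                           # S4
            run = exec_seconds[name][dev.name]
            eta = run + dev.pending_seconds()
            tie_break = 0 if dev.kind == "GPU" else 1
            candidates.append((eta, tie_break, dev.name, run, dev))
        candidates.sort(key=lambda c: c[:3])          # S5
        _, _, _, run, best = candidates[0]            # S6
        best.queue.append((name, run))                # S7: mark the run time
        placement[name] = best.name
    return placement


gpu = Device("GPU 20", "GPU", [("earlier work", 11.0)])
cpu = Device("CPU 30", "CPU", [("earlier work", 10.0)])
exec_seconds = {
    "M1": {"GPU 20": 0.01, "CPU 30": 0.1},
    "M2": {"GPU 20": 0.5, "CPU 30": 5.0},        # hypothetical times
    "M3": {"GPU 20": math.inf, "CPU 30": 0.05},  # hypothetical, not GPU-executable
}

placement = allocate(["M1", "M2", "M3"], [gpu, cpu], exec_seconds)
print(placement)
```

Because each placement appends the marked time to the chosen queue, the next subtask's estimates automatically account for the work just allocated, which is the stated purpose of step S7.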

Note that if the computer 100 includes multiple GPUs 20 and/or multiple CPUs 30, then when the first and second estimated completion times of a subtask are computed in step S4, the estimated completion time of the subtask is computed separately for each GPU 20 and each CPU 30.

For example, suppose the computer 100 includes two GPUs 20 with serial numbers I and II and three CPUs 30 with serial numbers A, B, and C. The first and second estimated completion times of a subtask such as M1 are then computed separately from the computing capabilities, as evaluated in step S1, of GPUs I and II and CPUs A, B, and C.

Suppose the computation gives a first estimated completion time for subtask M1 of 8.5 seconds on GPU I and 9.3 seconds on GPU II, and a second estimated completion time of 9.1 seconds on CPU A, 9.2 seconds on CPU B, and 8.6 seconds on CPU C. When the estimated completion times are sorted in step S5, subtask M1 has the shortest estimated completion time on GPU I, so in step S6 subtask M1 is allocated to the task queue of GPU I.
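The five-device example reduces to ranking one estimate per device. In the sketch below the device labels are illustrative, and extending the GPU-first tie rule to multiple devices is an assumption, since the text states that rule only for the single-GPU, single-CPU case.

```python
# Estimated completion times for subtask M1 from the example, in seconds.
etas = {
    "GPU-I": 8.5,
    "GPU-II": 9.3,
    "CPU-A": 9.1,
    "CPU-B": 9.2,
    "CPU-C": 8.6,
}

# Sort by time; on equal times, GPUs (False sorts before True) come first.
ranking = sorted(
    etas.items(),
    key=lambda item: (item[1], not item[0].startswith("GPU")),
)
best_device = ranking[0][0]

print([name for name, _ in ranking])
print(best_device)
```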

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or replaced with equivalents without departing from the spirit and scope of those technical solutions.

100 ... Computer
10 ... Dynamic task allocation system
20 ... GPU
30 ... CPU
40 ... Memory
101 ... Evaluation module
102 ... Decomposition module
103 ... Determining module
104 ... Computing module
105 ... Sorting module
106 ... Allocation module
107 ... Identification module
108 ... Judging module
S1 ... Evaluate the actual computing capability of the GPU and the CPU
S2 ... Decompose the new task into N subtasks
S3 ... Determine the subtask to be allocated
S4 ... Compute the estimated completion times of the subtask to be allocated on the GPU and on the CPU
S5 ... Sort the estimated completion times of the subtask to be allocated
S6 ... Allocate the subtask according to the sorting result
S7 ... Mark the time the GPU or the CPU needs to execute the subtask
S8 ... Are there subtasks that have not yet been allocated?

FIG. 1 is a diagram of the operating environment of the dynamic task allocation system of the present invention.

FIG. 2 is a functional module diagram of the dynamic task allocation system of the present invention.

FIG. 3 is a flowchart of a preferred embodiment of the dynamic task allocation method of the present invention.


Claims

A dynamic task allocation system, the system comprising:
an evaluation module, configured to evaluate the actual computing capability of a GPU and a CPU;
a decomposition module, configured to decompose a new task into N subtasks, where N is an integer greater than or equal to 1;
a determining module, configured to determine one of the decomposed subtasks as the subtask to be allocated;
a computing module, configured to compute, when the subtask to be allocated can be executed by the GPU and according to the actual computing capability of the GPU, a first estimated completion time required for the subtask to be executed by the GPU, the first estimated completion time being equal to the sum of the time required by the GPU to execute the subtask and the time required by the GPU to execute the pending tasks in its current task queue;
the computing module being further configured to compute, according to the actual computing capability of the CPU, a second estimated completion time required for the subtask to be allocated to be executed by the CPU, the second estimated completion time being equal to the sum of the time required by the CPU to execute the subtask and the time required by the CPU to execute the pending tasks in its current task queue;
a sorting module, configured to sort the first and second estimated completion times of the subtask to be allocated by length; and
an allocation module, configured to allocate the subtask to be allocated, according to the sorting result, to the task queue with the shortest estimated completion time for that subtask.

The dynamic task allocation system of claim 1, wherein if the first estimated completion time and the second estimated completion time of the subtask to be allocated are equal, the allocation module allocates the subtask to the task queue of the GPU.

The dynamic task allocation system of claim 1 or 2, further comprising an identification module configured to mark, after the subtask to be allocated has been allocated to the task queue of the GPU or the CPU, the time required by the GPU or the CPU to execute the subtask.

The dynamic task allocation system of claim 1 or 2, wherein when the subtask to be allocated cannot be executed by the GPU, the computing module sets the first estimated completion time of the subtask to infinity.

The dynamic task allocation system of claim 1, wherein the decomposition module decomposes the new task according to the decomposition principle that data parallelism is preferred over task parallelism.

A dynamic task allocation method, the method comprising:
an evaluation step of evaluating the actual computing capability of a GPU and a CPU;
a decomposition step of decomposing a new task into N subtasks, where N is an integer greater than or equal to 1;
a determining step of determining one of the decomposed subtasks as the subtask to be allocated;
a first computing step of computing, when the subtask to be allocated can be executed by the GPU and according to the actual computing capability of the GPU, a first estimated completion time required for the subtask to be executed by the GPU, the first estimated completion time being equal to the sum of the time required by the GPU to execute the subtask and the time required by the GPU to execute the pending tasks in its current task queue;
a second computing step of computing, according to the actual computing capability of the CPU, a second estimated completion time required for the subtask to be allocated to be executed by the CPU, the second estimated completion time being equal to the sum of the time required by the CPU to execute the subtask and the time required by the CPU to execute the pending tasks in its current task queue;
a sorting step of sorting the first and second estimated completion times of the subtask to be allocated by length; and
an allocation step of allocating the subtask to be allocated, according to the sorting result, to the task queue with the shortest estimated completion time for that subtask.

The dynamic task allocation method of claim 6, wherein if the first estimated completion time and the second estimated completion time of the subtask to be allocated are equal, the subtask is allocated to the task queue of the GPU.

The dynamic task allocation method of claim 6 or 7, further comprising an identification step of:
marking, after the subtask to be allocated has been allocated to the task queue of the GPU, the time required by the GPU to execute the subtask; and
marking, after the subtask to be allocated has been allocated to the task queue of the CPU, the time required by the CPU to execute the subtask.

The dynamic task allocation method of claim 6 or 7, wherein when the subtask to be allocated cannot be executed by the GPU, the first estimated completion time of the subtask is set to infinity in the first computing step.

The dynamic task allocation method of claim 6, wherein the decomposition step decomposes the new task according to the decomposition principle that data parallelism is preferred over task parallelism.