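WO2022174442A1 - Multi-core processor, processing method for multi-core processor, and related device - Google Patents
Multi-core processor, processing method for multi-core processor, and related device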
- Publication number
- WO2022174442A1 PCT/CN2021/077230 CN2021077230W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- chain
- core
- chains
- scheduler
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
- G06F9/522—Barrier synchronisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5017—Task decomposition
Definitions
- the present application relates to the technical field of processors, and in particular, to a multi-core processor, a processing method for a multi-core processor, and related devices.
- the task scheduler (Job Manager, JM) is used to implement multi-core GPU task scheduling (Kick-Off, KO).
- the device development kit and driver (DDK) parses calls made by upper-layer applications (APPs) to the graphics/computing application programming interface (API) and encapsulates them into tasks that the GPU can recognize and execute.
- APPs: upper-layer applications; API: Application Programming Interface
- JC: task chain (job chain)
- Command stream
- the multiple cores of the GPU concurrently execute the tasks they receive.
- the task scheduler is responsible for multi-core scheduling and is responsible for, or participates in, multi-process management, which affects multi-core utilization efficiency.
- the prior art solution does not solve the no-load problem of GPU multi-core scheduling.
- Embodiments of the present application provide a multi-core processor, a processing method for a multi-core processor, and related equipment, so as to solve the multi-core no-load problem and improve the multi-core scheduling performance.
- an embodiment of the present application provides a multi-core processor, including a task scheduler and multiple processing cores coupled to the task scheduler; wherein the task scheduler is used to store multiple task chains and the dependency relationships between the multiple task chains, where a dependency relationship is either dependent or non-dependent; the task scheduler is also used to: determine, according to the dependency relationships between the multiple task chains, a first task chain and a second task chain from the multiple task chains, where there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks; schedule some or all of the multiple processing cores to execute the one or more first tasks; and, when at least one first processing core among the multiple processing cores is in an idle state, schedule at least one second task in the second task chain to be executed on the at least one first processing core.
- the multi-core processor may be a multi-core coprocessor such as a GPU or a Neural Network Processing Unit (NPU), which includes a task scheduler and multiple processing cores coupled to the task scheduler;
- the task scheduler can maintain the dependencies between task chains, that is, it stores the dependency relationships between multiple task chains, and it also stores the multiple task chains themselves, so that the task scheduler can determine, from the multiple task chains, a first task chain and a second task chain that have no dependency between them; the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks.
- NPU: Neural Network Processing Unit
- the task scheduler can schedule some or all of the multiple processing cores to execute the one or more first tasks in the first task chain; since the first task chain and the second task chain have no dependency, the two task chains can be executed in parallel, that is, a first task in the first task chain and a second task in the second task chain can be executed in parallel; when at least one first processing core among the multiple processing cores is in an idle state, the task scheduler schedules at least one second task in the second task chain to be executed on the at least one first processing core; here, the idle (no-load) state means that the processing core is not executing a task, and an idle processing core may be a processing core that was never scheduled to execute a first task in the first task chain, or a processing core that has become idle after executing a first task in the first task chain; thus, in this embodiment of the present application, once a processing core becomes idle, it is immediately scheduled by the task scheduler to execute tasks, thereby improving multi-core scheduling performance.
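The backfilling behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the claimed hardware: the function name `simulate`, the chain labels, and the task durations are hypothetical, and the greedy policy (always give the next pending task to the core that becomes idle first) is an assumption made for illustration.

```python
import heapq


def simulate(first_tasks, second_tasks, num_cores):
    """Greedy idle-core backfilling for two independent task chains.

    first_tasks / second_tasks are lists of task durations in arbitrary
    (hypothetical) time units. Returns (makespan, schedule), where each
    schedule entry is (chain, core, start, end).
    """
    # All first-chain tasks are queued ahead of the second-chain tasks.
    pending = [("chain1", d) for d in first_tasks]
    pending += [("chain2", d) for d in second_tasks]

    # Min-heap of (time at which the core becomes idle, core id).
    cores = [(0, core) for core in range(num_cores)]
    heapq.heapify(cores)

    schedule = []
    for chain, duration in pending:
        free_at, core = heapq.heappop(cores)  # the earliest idle core
        schedule.append((chain, core, free_at, free_at + duration))
        heapq.heappush(cores, (free_at + duration, core))

    makespan = max(free_at for free_at, _ in cores)
    return makespan, schedule


makespan, schedule = simulate([4, 1, 1, 1], [2, 2, 2, 2], num_cores=4)
```

With these example durations, the second chain starts on cores 1 to 3 while core 0 is still busy with the long first task, so the makespan is 5 time units; waiting for the whole first chain to finish before starting the second would take 4 + 2 = 6.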
- the task scheduler includes a dependency management unit and a task queue unit; wherein the dependency management unit is used to store the dependency relationships between the multiple task chains and, if it determines that there is no dependency between the first task chain and the second task chain, to send a first instruction to the task queue unit, where the first instruction indicates that there is no dependency between the first task chain and the second task chain.
- the task scheduler includes a dependency management unit and a task queue unit, and dependency management between task chains is implemented in hardware; that is, the dependency management unit stores the dependency relationships between task chains, so the software (i.e., the DDK) does not need to participate in managing and controlling these dependencies, which saves software-hardware interaction time and software-side calls.
- once the dependency between task chains is released, that is, once there is no dependency between the task chains (or there never was one), the hardware responds quickly and can immediately dispatch the non-dependent task chains to the processing cores, which is better than software-side management; for example, after the dependency management unit determines that there is no dependency between the first task chain and the second task chain, it immediately sends the first instruction to the task queue unit, and the task queue unit immediately sends the first task chain and the second task chain to the processing cores for execution.
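The bookkeeping performed by the dependency management unit can be modeled in software as follows. This is only a sketch under stated assumptions: the class `DependencyManager`, its method names, and the chain identifiers are hypothetical, only direct dependencies are tracked (transitive dependencies are ignored), and a "first instruction" is modeled simply as a pair of chain identifiers with no dependency between them.

```python
class DependencyManager:
    """Toy software model of a dependency management unit (hypothetical API)."""

    def __init__(self):
        # chain id -> set of chain ids that it must wait for
        self.deps = {}

    def add_chain(self, chain_id, depends_on=()):
        self.deps[chain_id] = set(depends_on)

    def independent(self, a, b):
        """True if neither chain directly depends on the other."""
        return b not in self.deps[a] and a not in self.deps[b]

    def release(self, finished):
        """Remove a finished chain and clear every dependency on it."""
        self.deps.pop(finished, None)
        for waiting in self.deps.values():
            waiting.discard(finished)

    def first_instructions(self):
        """Return every pair of stored chains with no dependency between them."""
        ids = sorted(self.deps)
        return [
            (a, b)
            for i, a in enumerate(ids)
            for b in ids[i + 1:]
            if self.independent(a, b)
        ]


dm = DependencyManager()
dm.add_chain("jc0")
dm.add_chain("jc1")
dm.add_chain("jc2", depends_on=["jc0"])
pairs_before = dm.first_instructions()  # jc2 still waits for jc0
dm.release("jc0")
pairs_after = dm.first_instructions()   # jc1 and jc2 are now independent
```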
- the task scheduler further includes a task splitting unit and a multi-core management unit; wherein the task queue unit is configured to store the multiple task chains and, after receiving the first instruction sent by the dependency management unit, to send the first task chain and the second task chain to the task splitting unit and send a second instruction to the multi-core management unit, where the second instruction instructs the multi-core management unit to preempt processing cores for the first task chain and the second task chain.
- the task scheduler further includes a task splitting unit and a multi-core management unit, and the task queue unit can store multiple task chains.
- after receiving the first instruction sent by the dependency management unit, the task queue unit knows that the first task chain and the second task chain have no dependency, sends the first task chain and the second task chain to the task splitting unit, and sends the second instruction to the multi-core management unit, where the second instruction instructs the multi-core management unit to preempt processing cores for the first task chain and the second task chain; since the task splitting unit can split the first task chain into one or more first tasks and the second task chain into one or more second tasks, and the multi-core management unit can preempt processing cores for the first task chain and the second task chain, this facilitates the execution of the first task chain and the second task chain.
- the task splitting unit is configured to split the first task chain into the one or more first tasks; the multi-core management unit is configured to preempt, according to the second instruction, one or more second processing cores from the multiple processing cores, and to send the result of preempting the one or more second processing cores to the task splitting unit; the task splitting unit is also used to schedule the one or more second processing cores to execute the one or more first tasks.
- the task splitting unit may split the first task chain into one or more first tasks; the second instruction may include the number of processing cores required to execute the first task chain, or the identifiers of the processing cores specifically used to execute the first task chain, and so on; after receiving the second instruction from the task queue unit, the multi-core management unit can preempt one or more second processing cores from the multiple processing cores according to the second instruction.
- the task splitting unit splits the first task chain into one or more first tasks and, after receiving the result that the multi-core management unit has preempted one or more second processing cores for the first task chain, schedules the one or more second processing cores to execute the one or more first tasks of the first task chain; this helps secure computing resources for the execution of the first task chain.
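- the task splitting unit is further configured to split the second task chain into the one or more second tasks;
- the multi-core management unit is further configured to, when at least one first processing core among the multiple processing cores is in an idle state, preempt the at least one first processing core according to the second instruction, and send the result of preempting the at least one first processing core to the task splitting unit; the task splitting unit is further configured to schedule at least one second task among the one or more second tasks to be executed on the at least one first processing core.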
- the task splitting unit may split the second task chain into one or more second tasks; after the task splitting unit has scheduled the last first task of the first task chain to one of the one or more second processing cores, the multi-core management unit may preempt processing cores for the execution of the second tasks in the second task chain; the second instruction may include the number of processing cores required to execute the second task chain, or the identifiers of the processing cores specifically used to execute the second task chain, and so on; after that, as long as at least one first processing core among the multiple processing cores is in an idle state, the multi-core management unit preempts the at least one first processing core according to the second instruction and sends the result of preempting the at least one first processing core to the task splitting unit; the task splitting unit can then schedule at least one second task among the one or more second tasks to be executed on the at least one first processing core; in this way, the hardware (the multi-core management unit) implements the release and application of processing cores.
- this management method greatly reduces or even eliminates the no-load problem of some processing cores, and improves the utilization efficiency of processing cores.
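The release and application of processing cores can be sketched as a per-core ownership table. This is an illustrative model only: the class `MultiCoreManager`, its methods, and the chain identifiers are hypothetical, and treating each core as an independently claimed and released resource is an assumption of the sketch.

```python
class MultiCoreManager:
    """Toy model of a multi-core management unit (hypothetical API)."""

    def __init__(self, num_cores):
        # owner[core] is the chain currently holding that core, or None if idle
        self.owner = [None] * num_cores

    def idle_cores(self):
        return [core for core, owner in enumerate(self.owner) if owner is None]

    def preempt(self, chain_id, wanted):
        """Claim up to `wanted` idle cores for chain_id and return their ids."""
        granted = self.idle_cores()[:wanted]
        for core in granted:
            self.owner[core] = chain_id
        return granted

    def release(self, core):
        """Mark a core idle as soon as it finishes its current task."""
        self.owner[core] = None


manager = MultiCoreManager(num_cores=4)
first = manager.preempt("jc0", wanted=4)   # the first chain takes all four cores
none_yet = manager.preempt("jc1", wanted=2)  # no core is idle yet
manager.release(2)                           # core 2 finishes its first task
backfill = manager.preempt("jc1", wanted=2)  # only core 2 can be granted
```

In this model the second chain obtains core 2 as soon as it is released, without waiting for cores 0, 1, and 3 to finish the first chain.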
- the task scheduler further includes a task assembling unit; the task assembling unit is configured to obtain a command stream and the dependency relationships between some or all of the multiple task chains, and to generate some or all of the multiple task chains according to the command stream; and to send some or all of the multiple task chains to the task queue unit, and send the dependency relationships between some or all of the multiple task chains to the dependency management unit.
- the software (DDK) may issue tasks to the multi-core processor in the form of a command stream; the task assembling unit in the multi-core processor may receive the command stream and the dependency relationships between some or all of the multiple task chains, generate some or all of the multiple task chains according to the command stream, send some or all of the multiple task chains to the task queue unit, and send the dependency relationships between some or all of the multiple task chains to the dependency management unit; in this way, multi-core scheduling can also be realized when the software (DDK) issues tasks in the form of a command stream.
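The role of the task assembling unit can be sketched as a function that groups commands into task chains and collects their dependencies. The text does not specify the command stream format, so the dictionary layout used here (keys `chain`, `task`, and an optional `after` list) is purely hypothetical, as is the function name `assemble`.

```python
def assemble(command_stream):
    """Group a (hypothetical) command stream into task chains.

    Returns (chains, dependencies):
      chains        maps chain id -> ordered list of tasks (for the task queue unit)
      dependencies  maps chain id -> set of chain ids it depends on
                    (for the dependency management unit)
    """
    chains = {}
    dependencies = {}
    for command in command_stream:
        chain_id = command["chain"]
        chains.setdefault(chain_id, []).append(command["task"])
        dependencies.setdefault(chain_id, set()).update(command.get("after", ()))
    return chains, dependencies


stream = [
    {"chain": "jc0", "task": "t0"},
    {"chain": "jc0", "task": "t1"},
    {"chain": "jc1", "task": "t0", "after": ["jc0"]},
]
chains, dependencies = assemble(stream)
```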
- an embodiment of the present application provides a method for processing a multi-core processor, which is applied to a multi-core processor, where the multi-core processor includes a task scheduler and multiple processing cores coupled to the task scheduler;
- the method includes: storing, by the task scheduler, multiple task chains and the dependency relationships between the multiple task chains, where a dependency relationship is either dependent or non-dependent; determining, by the task scheduler according to the dependency relationships between the multiple task chains, a first task chain and a second task chain from the multiple task chains, where there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks; scheduling, by the task scheduler, some or all of the multiple processing cores to execute the one or more first tasks; and, when at least one first processing core among the multiple processing cores is in an idle state, scheduling, by the task scheduler, at least one second task in the second task chain to be executed on the at least one first processing core.
- the task scheduler includes a dependency management unit and a task queue unit; wherein storing the dependency relationships between the multiple task chains by the task scheduler includes: storing, by the dependency management unit in the task scheduler, the dependency relationships between the multiple task chains; and determining the first task chain and the second task chain from the multiple task chains by the task scheduler according to the dependency relationships includes: if the dependency management unit in the task scheduler determines that there is no dependency between the first task chain and the second task chain, sending, by the dependency management unit, a first instruction to the task queue unit, where the first instruction indicates that there is no dependency between the first task chain and the second task chain.
- the task scheduler further includes a task splitting unit and a multi-core management unit; wherein storing the multiple task chains by the task scheduler includes: storing, by the task queue unit in the task scheduler, the multiple task chains; and determining the first task chain and the second task chain from the multiple task chains by the task scheduler according to the dependency relationships further includes: after the task queue unit in the task scheduler receives the first instruction sent by the dependency management unit, sending, by the task queue unit, the first task chain and the second task chain to the task splitting unit, and sending a second instruction to the multi-core management unit, where the second instruction instructs the multi-core management unit to preempt processing cores for the first task chain and the second task chain.
- scheduling some or all of the multiple processing cores to execute the one or more first tasks by the task scheduler includes: splitting, by the task splitting unit in the task scheduler, the first task chain into the one or more first tasks; preempting, by the multi-core management unit in the task scheduler according to the second instruction, one or more second processing cores from the multiple processing cores; sending, by the multi-core management unit, the result of preempting the one or more second processing cores to the task splitting unit; and scheduling, by the task splitting unit, the one or more second processing cores to execute the one or more first tasks.
- scheduling, by the task scheduler, at least one second task in the second task chain to be executed on the at least one first processing core includes: splitting, by the task splitting unit in the task scheduler, the second task chain into the one or more second tasks; when at least one first processing core among the multiple processing cores is in an idle state, preempting, by the multi-core management unit in the task scheduler, the at least one first processing core according to the second instruction; sending, by the multi-core management unit, the result of preempting the at least one first processing core to the task splitting unit; and scheduling, by the task splitting unit, at least one second task among the one or more second tasks to be executed on the at least one first processing core.
- the task scheduler further includes a task assembling unit; the method further includes: obtaining, by the task assembling unit in the task scheduler, a command stream and the dependency relationships between some or all of the multiple task chains, and generating some or all of the multiple task chains according to the command stream; and sending, by the task assembling unit, some or all of the multiple task chains to the task queue unit, and sending the dependency relationships between some or all of the multiple task chains to the dependency management unit.
- the present application provides a semiconductor chip, which may include the multi-core processor provided by any one of the implementation manners of the foregoing first aspect.
- the present application provides a semiconductor chip, which may include: a multi-core processor provided by any one of the implementation manners of the first aspect, an internal memory coupled to the multi-core processor, and an external memory.
- the present application provides a system-on-chip (SoC) chip, where the SoC chip includes the multi-core processor provided by any one of the implementation manners of the first aspect, an internal memory coupled to the multi-core processor, and an external memory.
- the SoC chip may be composed of chips, or may include chips and other discrete devices.
- the present application provides a chip system, where the chip system includes the multi-core processor provided by any one of the implementation manners of the first aspect above.
- the chip system further includes a memory, and the memory is used for saving necessary or related program instructions and data during the operation of the multi-core processor.
- the chip system may be composed of chips, or may include chips and other discrete devices.
- the present application provides a processing device, the processing device having the function of implementing any one of the processing methods for a multi-core processor in the second aspect above.
- This function can be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more modules corresponding to the above functions.
- the present application provides a terminal, where the terminal includes a multi-core processor, and the multi-core processor is the multi-core processor provided by any one of the implementation manners of the first aspect above.
- the terminal may also include a memory for coupling with the multi-core processor, which holds program instructions and data necessary for the terminal.
- the terminal may also include a communication interface for the terminal to communicate with other devices or a communication network.
- the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a multi-core processor, the processing method flow of the multi-core processor described in any one of the implementations of the second aspect above is implemented.
- an embodiment of the present application provides a computer program, where the computer program includes instructions, and when the computer program is executed by a multi-core processor, the multi-core processor can execute the processing method flow of the multi-core processor described in any one of the implementations of the second aspect above.
- FIG. 1 is a schematic diagram of the architecture of a multi-core scheduling system provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of a scheduling execution process of a task chain provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of the architecture of another multi-core scheduling system provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of another scheduling execution process of a task chain provided by an embodiment of the present application.
- FIG. 5 is a schematic flowchart of a multi-core scheduling provided by an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a processing method for a multi-core processor provided by an embodiment of the present application.
- a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
- an application running on a computing device and the computing device may be components.
- One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers.
- these components can execute from various computer readable media having various data structures stored thereon.
- a component may, for example, communicate through local and/or remote processes based on a signal having one or more data packets (e.g., data from two components interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet that interacts with other systems via signals).
- FIG. 1 is a schematic structural diagram of a multi-core scheduling system provided by an embodiment of the present application.
- the task scheduler implements task and process scheduling management in units of task chains, wherein the task chain is a singly linked list structure, which is a collection of a series of tasks.
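A task chain as a singly linked list of tasks can be sketched as follows. The class and field names (`Task`, `TaskChain`, `name`, `next`) are hypothetical and chosen only for illustration; the text states only that the task chain is a singly linked list holding a series of tasks.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Task:
    """One node of the chain; `next` points to the following task, if any."""
    name: str
    next: Optional["Task"] = None


class TaskChain:
    """Singly linked list of tasks (illustrative model only)."""

    def __init__(self, chain_id: str) -> None:
        self.chain_id = chain_id
        self.head: Optional[Task] = None
        self.tail: Optional[Task] = None

    def append(self, name: str) -> None:
        node = Task(name)
        if self.tail is None:
            self.head = node       # first task in the chain
        else:
            self.tail.next = node  # link after the current last task
        self.tail = node

    def __iter__(self) -> Iterator[str]:
        node = self.head
        while node is not None:
            yield node.name
            node = node.next


chain = TaskChain("job_chain0")
for i in range(4):
    chain.append(f"task{i}")
names = list(chain)  # tasks in the order they were linked
```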
- the task chain can be assembled by the device development kit and driver, and then sent to the task scheduler; or the task chain can be assembled in the task scheduler.
- the software, i.e., the DDK
- the task scheduler does not perceive the dependencies between the task chains.
- the task scheduler executes the task chains issued by the software in a fixed order.
- APPs: upper-layer applications
- DDK: the layer below the APPs.
- A dependency can be understood as follows: the execution of a task chain must be based on the execution or completion of other task chains.
- the tasks specified by upper-layer applications (APPs) are divided into different types of task chains, such as binning, render, compute, raytracing, and transfer. Task chains of different types can be scheduled and executed in parallel on the hardware. If several task chains belong to the same type under this division, they are called task chains of the same type. For task chains of the same type, the task scheduler ensures that the next task chain is scheduled for execution only after the previous task chain ends. Therefore, the scheduled execution of task chains of the same type has the following shortcomings:
- when the load of a task chain is small, some processing cores become idle (no-load) during the execution of that task chain but cannot be used in advance for the execution of the next task chain.
- specifically, according to the execution order, assume the task chains are divided into a task chain executed earlier and a task chain executed later.
- the execution time of a task chain is determined by the task with the longest execution time in that task chain. Since the tasks in a task chain have different execution times, the processing cores that execute the tasks of the earlier task chain finish at different times: some processing cores finish quickly, and others take a long time.
- after a processing core with a short execution time finishes its task in the earlier task chain, it must wait for the processing cores with long execution times to finish their tasks, that is, until the execution of the earlier task chain is complete.
- during this wait the processing core stays idle but cannot be used to execute the later task chain, so some processing cores are idle (IDLE) for a long time before the execution of the later task chain starts, and hardware performance is wasted.
- the existing technical solutions do not solve the no-load problem of multi-core scheduling, especially for a task chain with a light load, since some processing cores have a long no-load time, the performance loss is serious.
- FIG. 2 is a schematic diagram of a scheduling execution process of a task chain provided by an embodiment of the present application.
- Figure 2 is briefly described as follows:
- task chain 0 (Job chain0) and task chain 1 (Job chain1) can each be divided into tasks 0 to 3, with a total of 4 tasks; task chain 0 and task chain 1 are task chains of the same type, and there is no dependency between task chain 0 and task chain 1.
- the multi-core processor has a 4-core structure, that is, the multi-core processor includes processing cores 0 to 3.
- the task scheduler first issues the four tasks in task chain 0 to processing cores 0 to 3 for execution; for example, task 0 in task chain 0 is issued to processing core 0, task 1 in task chain 0 is issued to processing core 1, task 2 in task chain 0 is issued to processing core 2, and task 3 in task chain 0 is issued to processing core 3.
- the task scheduler then issues the four tasks in task chain 1 to processing cores 0 to 3 for execution; for example, task 0 in task chain 1 is issued to processing core 0, task 1 in task chain 1 is issued to processing core 1, task 2 in task chain 1 is issued to processing core 2, and task 3 in task chain 1 is issued to processing core 3.
- while processing core 1 executes task 1 of task chain 0 and of task chain 1, processing core 2 executes task 2 of task chain 0 and of task chain 1, and processing core 3 executes task 3 of task chain 0 and of task chain 1, idle (no-load) periods occur.
- idle time on a processing core is wasted hardware capacity, and results in a performance loss for the processing cores.
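- the idle time in the serialized schedule of FIG. 2 can be illustrated with a minimal Python sketch. The task durations are hypothetical values chosen only for illustration (the application gives none), and the function name `serialized_idle_time` is likewise an assumption.

```python
def serialized_idle_time(chains):
    """Idle time per core when each chain starts only after the previous one has ended.

    chains[c][k] is the (hypothetical) duration of the task that
    task chain c runs on processing core k.
    """
    num_cores = len(chains[0])
    idle = [0] * num_cores
    for durations in chains:
        chain_time = max(durations)       # a chain ends when its longest task ends
        for core, d in enumerate(durations):
            idle[core] += chain_time - d  # the core waits for the slowest core
    makespan = sum(max(durations) for durations in chains)
    return idle, makespan


# Hypothetical durations: task k of each chain runs on core k.
chain0 = [4, 3, 2, 1]
chain1 = [4, 3, 2, 1]
idle, makespan = serialized_idle_time([chain0, chain1])
print(idle, makespan)
```

- with these assumed durations, processing core 3 is idle for 6 of the 8 time units, which is the kind of loss described above.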
- the present application needs to solve the no-load problem between multi-cores in the task chain scheduling process, and improve the performance of multi-core scheduling.
- FIG. 3 is a schematic structural diagram of a multi-core scheduling system 30 provided by an embodiment of the present application.
- the multi-core scheduling system 30 includes a multi-core processor 31 and a device development kit and driver (DDK) 32 .
- the multi-core processor 31 may be a multi-core coprocessor such as a GPU, a Neural Network Processing Unit (NPU), etc.
- the multi-core processor 31 may specifically include a task scheduler 311 and multiple processing cores 312 coupled to the task scheduler 311. The task scheduler 311 is used to store multiple task chains and the dependencies between the multiple task chains, where a dependency is either "dependent" or "not dependent". The task scheduler 311 is further used to: determine a first task chain and a second task chain from the multiple task chains according to the dependencies between the multiple task chains, where there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks; schedule some or all of the multiple processing cores 312 to execute the one or more first tasks; and, when at least one first processing core among the multiple processing cores 312 is in an idle state, schedule at least one second task in the second task chain to the at least one first processing core for execution.
- the task scheduler 311 is applied to the task distribution of the multiple processing cores 312 of the multi-core processor 31 and the scheduling management of the multiple processing cores 312 , and is a management unit of the multi-core processor 31 .
- the device development kit and driver 32 include a user mode driver (User Mode Driver, UMD) and a kernel mode driver (Kernel Mode Driver, KMD).
- the multiple task chains stored in the task scheduler 311 are obtained by the device development kit and driver 32 parsing the API calls of upper-layer applications (APPs) and passing the resulting tasks to the task scheduler 311 on the multi-core processor 31.
- the device development kit and the driver 32 can directly complete the task assembly and deliver it to the task scheduler 311 in the form of a task chain.
- the device development kit and driver 32 can also hand the task assembly work over to the task scheduler 311: they deliver the tasks to the task scheduler 311 in the form of a command stream, and the task scheduler 311 assembles the task chains according to the command stream.
- the device development kit and driver 32 also deliver the dependencies between the task chains to the task scheduler 311; a dependency between task chains is either "dependent" or "not dependent".
- in existing schemes, the dependencies between the task chains are maintained in software (the device development kit and driver 32), and the multi-core processor 31 has no knowledge of the dependencies between the task chains.
- the task scheduler therefore ensures that a later-executed task chain is scheduled only after the earlier-executed task chain has finished executing, so that some processing cores in the multi-core processor have idle periods.
- the present application proposes a novel multi-core scheduling scheme in consideration of the deficiencies of the existing multi-core scheduling schemes.
- the technical solution provided by the present application maintains the dependency relationship between task chains on hardware, that is, maintains the dependency relationship between task chains on the multi-core processor 31 , specifically on the task scheduler 311 .
- since the task scheduler 311 knows the dependencies between the task chains, it can deliver tasks of task chains that have no dependency to the processing cores 312 for execution in advance, which prevents the processing cores from running idle.
- when two task chains are delivered to the task scheduler 311 or assembled in the task scheduler 311, they may be independent, that is, the two task chains have no dependency from the beginning and can be scheduled for execution directly. Alternatively, the two task chains may have a dependency when they are delivered to or assembled in the task scheduler 311, and the dependency is released later, that is, the two task chains are dependent at the beginning and later become not dependent. After the dependency is released, they can be scheduled for execution.
- when the task scheduler 311 schedules task chains for execution, if there is no dependency between the task chains, then once all tasks of the currently executing task chain have been issued to the processing cores, the task scheduler 311 does not wait for that earlier task chain to finish executing. Instead, it immediately issues tasks of the later-executed task chain to the processing cores, so that idle processing cores are put to use by the later-executed task chain.
- the first task chain starts executing before the second task chain.
- the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks.
- once the one or more first tasks have all been issued to some or all of the multiple processing cores 312 for execution, as long as at least one first processing core among the multiple processing cores 312 is in the idle state, at least one second task among the one or more second tasks is issued to the at least one first processing core in the idle state for execution.
- the idle state (also called the no-load state) means that the processing core 312 is not executing any task.
- the processing cores 312 in the idle state may be processing cores that are not scheduled to execute the first task in the first task chain.
- for example, if the processing cores used to execute the first task chain are only some of the multiple processing cores 312, a processing core 312 that is not used to execute the first tasks in the first task chain is in the idle state and can be used to execute a second task in the second task chain.
- the processing core 312 in the idle state may also be a processing core that is idle after executing the first task in the first task chain.
- for example, once a processing core 312 used to execute a first task in the first task chain has finished that first task and becomes idle, it can be used immediately to execute a second task in the second task chain, without waiting for the first task chain to complete.
- completion of the first task chain means that all first tasks in the first task chain have finished executing; one processing core 312 can execute at least one first task or at least one second task. It should be understood that the multi-core scheduling process of the present application is a dynamic process.
- for example, suppose a third task chain is executed after the second task chain, and the third task chain includes one or more third tasks. Once the one or more second tasks have all been issued to the processing cores 312 for execution, as long as there is a processing core 312 in the idle state among the multiple processing cores 312, at least one third task among the one or more third tasks is issued to that idle processing core 312 for execution. The processing core 312 used to execute the third task may be a processing core 312 not used for executing the first tasks or the second tasks, a processing core 312 that is idle after executing a first task, or a processing core 312 that is idle after executing a first task and a second task.
- each processing core 312 of the plurality of processing cores 312 is immediately scheduled to execute the task of the next task chain as long as it is in an idle state, so that the present application can effectively solve the problem of no-load processing cores and improve multi-core scheduling performance.
- the above-mentioned first task chain and second task chain may be task chains of the same type, provided that they have no dependency when they are delivered to the processing cores for execution.
- the above-mentioned first task chain and second task chain may also be different types of task chains, which can be regarded as independent, because different types of task chains can be executed in parallel.
- the device development kit and the driver 32 actively issue tasks to the multi-core processor 31 .
- after the multi-core processor 31 completes a task, it informs the device development kit and driver 32 either by raising an interrupt or by having them query a register; an interrupt is generally used, because it is efficient and friendly to the device development kit and driver 32.
- the multi-core processor 31 includes a task scheduler 311 and multiple processing cores 312 coupled to the task scheduler 311. The task scheduler 311 can maintain the dependencies between task chains, that is, store the dependencies between multiple task chains, and it also stores the multiple task chains themselves. The task scheduler 311 can therefore determine, from the multiple task chains, a first task chain and a second task chain that have no dependency; the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks.
- the task scheduler 311 may schedule some or all of the multiple processing cores 312 to execute the first task chain.
- when at least one first processing core among the multiple processing cores 312 is in an idle state, the task scheduler 311 schedules at least one second task in the second task chain to the at least one first processing core for execution. Thus, in this embodiment of the present application, as soon as a processing core becomes idle, the task scheduler 311 immediately schedules it to execute tasks, which improves multi-core scheduling performance.
- the task scheduler 311 includes a dependency management unit 3111 and a task queue unit 3112. The dependency management unit 3111 is used to store the dependencies between the multiple task chains and, if it determines that there is no dependency between the first task chain and the second task chain, to send a first instruction to the task queue unit 3112, where the first instruction indicates that there is no dependency between the first task chain and the second task chain.
- the task scheduler 311 includes a dependency management unit 3111 and a task queue unit 3112 .
- the device development kit and driver 32, or the task scheduler 311, deliver the task chains to the task queue unit 3112, and at the same time deliver the dependencies between the task chains to the dependency management unit 3111.
- when the device development kit and driver 32 deliver the dependencies between the task chains to the task scheduler 311, they deliver them to the dependency management unit 3111 of the task scheduler 311.
- the dependency management unit 3111 can store the dependencies between task chains. When the device development kit and driver 32 deliver a task chain to the task scheduler 311, they deliver it to the task queue unit 3112 of the task scheduler 311, which can be used to store the task chain; task chains assembled by the task scheduler 311 are also stored in the task queue unit 3112.
- the task chains that the device development kit and driver 32 deliver to the task queue unit 3112, or that the task scheduler 311 assembles and stores in the task queue unit 3112, may or may not have dependencies between them; when there is a dependency between task chains delivered to the task queue unit 3112, the dependency can be released as the task chains are executed.
- the dependency management unit 3111 can maintain the dependency between task chains, and specifically records the change of the dependency between the task chains.
- the task chains that are delivered to the task queue unit 3112 and have no dependencies at the beginning can be executed immediately, that is, the dependency management unit 3111 can inform the task queue unit 3112 that these task chains that have no dependencies at the beginning can be executed.
- the dependency management unit 3111 records the dependencies of the task chains that are dependent at the beginning. After the dependencies among these task chains are released, the dependency management unit 3111 informs the task queue unit 3112 that these task chains can be executed. For example, after the dependency management unit 3111 determines that there is no dependency between the first task chain and the second task chain, it informs the task queue unit 3112 of this through the first instruction. It should be understood that releasing a dependency means that the relationship changes from dependent to not dependent. Each first instruction is directed to a single task chain and informs the task queue unit 3112 whether that task chain can start to be executed.
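- the bookkeeping described for the dependency management unit 3111 can be sketched in software as follows. This is only an illustrative model under stated assumptions: the class `DependencyManager`, its method names, and the callback that stands in for the first instruction are all hypothetical, and the sketch assumes a dependency is registered before the task chain it names finishes.

```python
class DependencyManager:
    """Hypothetical software model of the dependency management unit 3111."""

    def __init__(self, notify):
        self.pending = {}     # chain id -> set of chain ids it still depends on
        self.notify = notify  # callback standing in for the "first instruction"

    def add_chain(self, chain, depends_on=()):
        deps = set(depends_on)
        if deps:
            self.pending[chain] = deps
        else:
            self.notify(chain)  # no dependency from the start: executable at once

    def chain_finished(self, finished):
        # Iterate over a copy because entries are deleted while scanning.
        for chain, deps in list(self.pending.items()):
            deps.discard(finished)
            if not deps:        # dependency released
                del self.pending[chain]
                self.notify(chain)


ready = []  # chains reported to the (modelled) task queue unit as executable
dm = DependencyManager(ready.append)
dm.add_chain("chain0")
dm.add_chain("chain1")  # independent of chain0
dm.add_chain("chain2", depends_on=["chain0", "chain1"])
dm.chain_finished("chain0")
dm.chain_finished("chain1")
print(ready)
```

- in this model, `chain0` and `chain1` are reported as executable immediately, while `chain2` is reported only after both of the chains it depends on have finished.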
- a task chain may depend on the completion of execution of one or more other task chains.
- a task chain may depend on the end of the processing of an event in the DDK.
- for example, if task chain 1 depends on task chain 0, then when task chain 0 ends, a characteristic value can be written (signal semaphore) to a semaphore buffer.
- the dependency management unit 3111 can poll the semaphore and, at some point in time, reads the expected value, that is, it observes the signal triggered by the end of task chain 0.
- the dependency management unit 3111 then confirms that task chain 1 can be executed, and informs the task queue unit 3112 that task chain 1 can be issued for execution.
- the above task chain 0 may be the first task chain, and the above task chain 1 may be the second task chain.
- after determining that the dependencies between task chains have been released, the dependency management unit 3111 notifies the task queue unit 3112 to issue for execution the task chains whose dependencies have been released.
- after the execution of a task chain is completed, the task queue unit 3112 notifies the dependency management unit 3111 by signaling a certain semaphore.
- the dependencies may take forms such as barrier, fence, semaphore, event, etc.
- for a semaphore, there are two classes of events: polling (wait/polling) and setting (signal/write).
- the dependency management unit 3111 needs to be notified.
- "signal" here means "set"; it is not limited to switching between the two values 0 and 1, and any value can be written.
- the action of the signal is to write the buffer, and the written value can be any value according to the maintenance rules.
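- the signal and polling operations on a semaphore buffer can be modelled with a short Python sketch. The class `SemaphoreBuffer`, the slot layout, and the value 7 are assumptions made for illustration; the application only states that a signal writes a value to the buffer and that the dependency management unit 3111 polls for an expected value.

```python
class SemaphoreBuffer:
    """Hypothetical model of a semaphore buffer: one writable value per slot."""

    def __init__(self, slots):
        self.values = [0] * slots

    def signal(self, slot, value):
        # "signal" is a plain write; any value may be written, not only 0 or 1.
        self.values[slot] = value

    def poll(self, slot, expected):
        # "wait/polling": true once the slot holds the expected value.
        return self.values[slot] == expected


buf = SemaphoreBuffer(slots=2)
EXPECTED = 7                    # illustrative value meaning "task chain 0 has ended"
before = buf.poll(0, EXPECTED)  # task chain 0 is still running
buf.signal(0, EXPECTED)         # written when task chain 0 ends
after = buf.poll(0, EXPECTED)   # task chain 1 may now be issued
print(before, after)
```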
- the task scheduler 311 includes a dependency management unit 3111 and a task queue unit 3112, and the dependency management between task chains is implemented in hardware: the dependency management unit 3111 can acquire and store the dependencies between the task chains without software (that is, the DDK) taking part in managing and controlling them, which saves software-hardware interaction time and software-side calls. Moreover, once task chains have no dependency, either because they never had one or because the relationship has changed from dependent to not dependent, the hardware responds quickly and can immediately schedule the task chains to the processing cores, which is better than software-side management. For example, after the dependency management unit 3111 determines that there is no dependency between the first task chain and the second task chain, it immediately sends the first instruction to the task queue unit 3112, and the task queue unit 3112 immediately issues the first task chain and the second task chain to the processing cores for execution.
- the task scheduler 311 further includes a task splitting unit 3113 and a multi-core management unit 3114. The task queue unit 3112 is used to store the multiple task chains and, after receiving the first instruction sent by the dependency management unit 3111, to send the first task chain and the second task chain to the task splitting unit 3113 and send a second instruction to the multi-core management unit 3114. The second instruction is used to instruct the multi-core management unit 3114 to preempt processing cores for the first task chain and the second task chain.
- the task scheduler 311 further includes a task splitting unit 3113 and a multi-core management unit 3114 .
- the task queue unit 3112 stores the multiple task chains, that is, the task queue unit 3112 manages multiple task chains of multiple processes; for example, the task queue unit 3112 can issue the first task chain and the second task chain, which have no dependency, for execution.
- the task queue unit 3112 can, according to a certain strategy, issue a task chain that has no dependency or whose dependency has been released to the task splitting unit 3113; at the same time, it informs the multi-core management unit 3114 to apply for the corresponding processing cores for executing that task chain.
- the dependency management unit 3111 informs the task queue unit 3112 through the first instruction that there is no dependency between the first task chain and the second task chain. After receiving the first instruction, the task queue unit 3112 sends the first task chain and the second task chain to the task splitting unit 3113, and notifies the multi-core management unit 3114 through the second instruction to preempt processing cores 312 for the first task chain and the second task chain, so that the first task chain and the second task chain can be executed.
- through the second instruction, the task queue unit 3112 needs to tell the multi-core management unit 3114 which processing cores may be preempted for the first task chain and the second task chain, but it does not need to say how to preempt them, because the multi-core management unit 3114 implements preemption with a fixed strategy.
- the second instruction for preempting processing cores for the first task chain and the second task chain is sent in two parts: the first sending informs the multi-core management unit 3114 to preempt processing cores for the first task chain, and the second sending informs the multi-core management unit 3114 to preempt processing cores for the second task chain.
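- the message flow around the task queue unit 3112 can be sketched as follows. This is a hypothetical software model, not the hardware design: the class `TaskQueueUnit`, the two lists that stand in for the links to the task splitting unit 3113 and the multi-core management unit 3114, and the tuple used as a second instruction are all assumptions.

```python
class TaskQueueUnit:
    """Hypothetical model of the task queue unit 3112."""

    def __init__(self):
        self.stored = {}           # chain id -> list of task descriptions
        self.to_splitter = []      # stands in for the link to task splitting unit 3113
        self.to_core_manager = []  # stands in for the link to multi-core management unit 3114

    def store(self, chain_id, tasks):
        self.stored[chain_id] = tasks

    def on_first_instruction(self, chain_id):
        # The chain has no outstanding dependency: hand it on for splitting ...
        self.to_splitter.append((chain_id, self.stored.pop(chain_id)))
        # ... and send a "second instruction" asking for cores to be preempted for it.
        self.to_core_manager.append(("preempt", chain_id))


tq = TaskQueueUnit()
tq.store("first", ["t0", "t1"])
tq.store("second", ["t0"])
tq.on_first_instruction("first")   # one first instruction per task chain
tq.on_first_instruction("second")
print(tq.to_core_manager)
```

- the two entries in `to_core_manager` correspond to the second instruction being sent once for the first task chain and once for the second task chain.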
- the above certain strategies include but are not limited to:
- when software enables the time slice rotation function (software can choose whether or not to enable this function), the task queue unit 3112 may issue the task chain of a given process only once that process's time slice comes up.
- the task queue unit manages the distribution of the binning/compute task chain through a predetermined strategy, such as interleaving.
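- one possible reading of the interleaving strategy is a round-robin over per-type queues, sketched below. The application does not define the interleaving pattern, so the function `interleave` and the queue contents are assumptions for illustration only.

```python
from itertools import chain, zip_longest


def interleave(*queues):
    """Round-robin over the queues, skipping queues that have run out."""
    missing = object()  # sentinel marking exhausted queues
    rounds = zip_longest(*queues, fillvalue=missing)
    return [item for item in chain.from_iterable(rounds) if item is not missing]


binning_chains = ["binning0", "binning1", "binning2"]
compute_chains = ["compute0"]
order = interleave(binning_chains, compute_chains)
print(order)
```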
- the multi-core management unit 3114 can dynamically preempt (or dynamically occupy) and dynamically release the multiple processing cores 312. When a processing core completes its task in the earlier-executed task chain, the multi-core management unit 3114 immediately releases it and re-applies to preempt it for executing tasks in the later-executed task chain. For example, after a processing core finishes a first task in the first task chain, the multi-core management unit 3114 can release it from the first task chain and re-apply to preempt it for executing a second task in the second task chain. It should be understood that, under dynamic preemption, a preempted processing core is not necessarily used.
- for example, a processing core 312 preempted by the multi-core management unit 3114 for a task chain may end up not being used to execute any task in that task chain.
- in that case, the multi-core management unit 3114 directly releases the processing core 312, and the release is very fast.
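- dynamic preemption and release can be modelled with a small ownership table, as in the following sketch. The class `MultiCoreManager` and its methods are hypothetical; the fixed strategy actually used by the multi-core management unit 3114 is not specified in the application.

```python
class MultiCoreManager:
    """Hypothetical model of dynamic preemption and release of processing cores."""

    def __init__(self, num_cores):
        self.owner = [None] * num_cores  # task chain currently holding each core

    def preempt_idle(self, chain, allowed=None):
        """Preempt every unowned core (optionally limited to `allowed`) for `chain`."""
        got = []
        for core, owner in enumerate(self.owner):
            if owner is None and (allowed is None or core in allowed):
                self.owner[core] = chain
                got.append(core)
        return got

    def release(self, core):
        self.owner[core] = None

    def task_done(self, core, next_chain):
        """A core finished its task: release it, then preempt it for the next chain."""
        self.release(core)
        self.owner[core] = next_chain


mgr = MultiCoreManager(4)
first = mgr.preempt_idle("chain0")  # all four cores go to task chain 0
mgr.task_done(3, "chain1")          # core 3 finishes early and moves to task chain 1
owners = list(mgr.owner)
mgr.release(3)                      # a preempted core that ends up unused is released directly
print(first, owners, mgr.owner)
```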
- the task queue unit 3112 issues the task chain to the task splitting unit 3113 .
- the task scheduler 311 further includes a task splitting unit 3113 and a multi-core management unit 3114.
- the task queue unit 3112 can store multiple task chains.
- after the task queue unit 3112 receives the first instruction sent by the dependency management unit 3111, it sends the first task chain and the second task chain to the task splitting unit 3113 and sends the second instruction to the multi-core management unit 3114.
- because the task splitting unit 3113 can split the first task chain into one or more first tasks and the second task chain into one or more second tasks, and the multi-core management unit 3114 can preempt processing cores for the first task chain and the second task chain, the first task chain and the second task chain can be executed efficiently.
- the task splitting unit 3113 is configured to split the first task chain into the one or more first tasks; the multi-core management unit 3114 is configured to preempt, according to the second instruction, one or more second processing cores from the multiple processing cores 312, and to send the result of preempting the one or more second processing cores to the task splitting unit 3113; the task splitting unit 3113 is further configured to schedule the one or more second processing cores to execute the one or more first tasks.
- the task splitting unit 3113 splits the tasks in the task chain.
- the task splitting unit 3113 splits the first task chain into one or more first tasks; the rule for splitting a task chain can be raster order, Z order, U order, 3D cube, and so on.
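- two of the named splitting rules, raster order and Z order, can be illustrated on a grid of tiles. The sketch assumes that a task chain covers a 2D grid and that each tile becomes one task; the application does not state how the task splitting unit 3113 implements these orders, and the function names are hypothetical. Z order is computed here with the standard Morton bit interleaving.

```python
def raster_order(width, height):
    """Tile coordinates (x, y) row by row."""
    return [(x, y) for y in range(height) for x in range(width)]


def morton_key(x, y, bits=16):
    """Interleave the bits of x and y (x in the even bit positions)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def z_order(width, height):
    """Tile coordinates sorted along the Z-order (Morton) curve."""
    return sorted(raster_order(width, height), key=lambda tile: morton_key(*tile))


raster_tiles = raster_order(4, 2)
z_tiles = z_order(4, 2)
print(raster_tiles)
print(z_tiles)
```

- compared with raster order, Z order keeps consecutive tasks on neighbouring tiles, which is one common reason for choosing it.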
- the task splitting unit 3113 issues the split tasks to the processing cores 312 that the multi-core management unit 3114 has preempted for the task chain, and the processing cores 312 compute and execute the tasks.
- for example, the multi-core management unit 3114 preempts one or more second processing cores from the multiple processing cores 312 to execute the first task chain; the one or more second processing cores may be some or all of the multiple processing cores 312.
- the task splitting unit 3113 issues the one or more first tasks obtained by splitting the first task chain to the one or more second processing cores. It should be understood that there is no fixed relationship between the tasks split from a task chain and the processing cores 312.
- a task split from a task chain can be issued to any of the processing cores 312 that the device development kit and driver 32 have specified for executing that task chain. For example, the one or more first tasks obtained by splitting the first task chain are distributed randomly to the above-mentioned one or more second processing cores.
- the rules for the multi-core management unit 3114 to preempt processing cores for the task chain are as follows:
- the device development kit and driver 32 issue the designation of which processing cores 312 a task chain may be executed on to the task queue unit 3112;
- in general, the device development kit and driver 32 specify that a task chain can be executed on all processing cores 312; in special scenarios, however, when some task chains can be executed slowly in an asynchronous (async) manner, the device development kit and driver 32 can specify that the task chain is only allowed to execute on certain processing cores 312.
- for example, the device development kit and driver 32 specify in advance that the first task chain can be executed on all or some of the multiple processing cores.
- taking the case where the multi-core processor 31 is a GPU as an example, two scenarios are described:
- the GPU can do device virtualization, so that the DDK can "see" multiple GPU instances (although the hardware is still essentially one GPU).
- each GPU instance can see different GPU cores. For example, GPU0 instance can only see GPU cores 0 to 1; GPU1 instance can only see GPU cores 2 to 5.
- when the DDK schedules a task chain on the GPU0 instance, it needs to specify that the task chain can only be executed on GPU cores 0 to 1; when it schedules a task chain on the GPU1 instance, it needs to specify GPU cores 2 to 5.
- APPs can specify that certain tasks belong to asynchronous computing scenarios (async compute). These computations do not have high real-time requirements.
- the DDK estimates the computational load of an async compute task chain through certain indicators, and allocates a corresponding number of GPU cores so that the task chain does not execute at full speed.
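- one way to picture the core designation in these two scenarios is a per-instance bitmask of allowed cores. This encoding is an assumption (the application does not say how the designation is represented), as are the helper names and the figure of two cores for the async compute case.

```python
def core_mask(first, last):
    """Bitmask with bits first..last (inclusive) set; bit k stands for GPU core k."""
    mask = 0
    for core in range(first, last + 1):
        mask |= 1 << core
    return mask


def allowed_cores(mask):
    """List the core indices whose bit is set in the mask."""
    return [core for core in range(mask.bit_length()) if (mask >> core) & 1]


# Scenario 1: each virtual GPU instance sees only its own cores.
INSTANCE_MASKS = {"GPU0": core_mask(0, 1), "GPU1": core_mask(2, 5)}
gpu0_cores = allowed_cores(INSTANCE_MASKS["GPU0"])
gpu1_cores = allowed_cores(INSTANCE_MASKS["GPU1"])

# Scenario 2: an async compute chain on GPU1 is limited to a hypothetical
# estimate of two cores so that it does not run at full speed.
async_cores = gpu1_cores[:2]
print(gpu0_cores, gpu1_cores, async_cores)
```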
- the multi-core management unit 3114 and the task splitting unit 3113 can share the preemption status of the processing cores 312 in real time, that is, the multi-core management unit 3114 sends the preemption status of the processing cores 312 to the task splitting unit 3113 in real time. After any one of the processing cores 312 completes a task, it notifies the multi-core management unit 3114, and the multi-core management unit 3114 decides on its own initiative to release and preempt processing cores according to the scoreboard it maintains and the task completion status.
- the scoreboard is located in the multi-core management unit 3114. The dependency management unit 3111 needs to know the end event of each task chain in order to handle the dependencies between task chains, and it obtains this information indirectly through the scoreboard.
- the task splitting unit 3113 is responsible for dispatching tasks to the processing cores 312, but to do so it needs to query the scoreboard in the multi-core management unit 3114 to find out which processing cores the multi-core management unit has preempted, whether these processing cores can currently still receive or execute tasks, and whether the processing cores used to execute a given task chain have all been released (which marks the end of the execution of that task chain).
- the scoreboard in the multi-core management unit 3114 needs to be written to record the assignment of tasks on the processing cores preempted by the multi-core management unit 3114 .
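- the role of the scoreboard can be sketched with a small data structure. The fields and method names below are hypothetical; the application only states that the scoreboard records which cores are preempted, how tasks are assigned to them, and that the release of all cores of a task chain marks the end of that task chain.

```python
class Scoreboard:
    """Hypothetical model of the scoreboard kept by the multi-core management unit."""

    def __init__(self):
        self.cores_of = {}  # chain -> set of cores currently preempted for it
        self.running = {}   # core -> number of assigned, unfinished tasks

    def preempt(self, chain, core):
        self.cores_of.setdefault(chain, set()).add(core)
        self.running.setdefault(core, 0)

    def assign(self, core):
        self.running[core] += 1  # written when a task is dispatched to the core

    def complete(self, core):
        self.running[core] -= 1  # written when the core reports task completion

    def release(self, chain, core):
        self.cores_of[chain].discard(core)

    def chain_finished(self, chain):
        # A known chain has ended once every core preempted for it is released.
        return chain in self.cores_of and not self.cores_of[chain]


sb = Scoreboard()
for core in (0, 1):
    sb.preempt("chain0", core)
    sb.assign(core)
sb.complete(0)
sb.release("chain0", 0)
partly_done = sb.chain_finished("chain0")  # core 1 is still held
sb.complete(1)
sb.release("chain0", 1)
all_done = sb.chain_finished("chain0")
print(partly_done, all_done)
```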
- the task splitting unit 3113 may split the first task chain into one or more first tasks; wherein the second instruction may include executing the first task chain.
- after receiving the second instruction from the task queue unit 3112, the multi-core management unit 3114 preempts one or more second processing cores from the multiple processing cores 312 and sends the result of the preemption to the task splitting unit 3113. The task splitting unit 3113 splits the first task chain into one or more first tasks and, after receiving the result that the multi-core management unit 3114 has preempted one or more second processing cores for the first task chain, schedules the one or more second processing cores to execute the one or more first tasks. This helps secure computing resources for the execution of the first task chain.
- the task splitting unit 3113 is further configured to split the second task chain into the one or more second tasks;
- the multi-core management unit 3114 is further configured to: when at least one first processing core among the multiple processing cores 312 is in an idle state, preempt the at least one first processing core according to the second instruction; and send the result of preempting the at least one first processing core to the task splitting unit 3113;
- the task splitting unit 3113 is further configured to schedule at least one second task of the one or more second tasks to the at least one first processing core for execution.
- the task splitting unit 3113 may further split the second task chain into one or more second tasks. After the task splitting unit 3113 schedules one or more second processing cores to execute the one or more first tasks obtained by splitting the first task chain, the multi-core management unit 3114 can immediately preempt processing cores for the execution of the second task chain. When doing so, any processing core 312 in the idle state can be preempted for executing the second task chain; a processing core preempted for executing the second task chain is a first processing core. It should be understood that whether the second task chain may be executed on all or only some of the multiple processing cores is also specified in advance by the device development kit and driver 32.
- the processing cores 312 in the idle state may be processing cores that are not scheduled to execute the first task in the first task chain.
- for example, if the processing cores used to execute the first task chain are only some of the multiple processing cores 312, a processing core 312 that is not used to execute the first tasks in the first task chain is in the idle state and can be preempted by the multi-core management unit 3114 for executing a second task in the second task chain.
- the processing core 312 in the idle state may also be a processing core that is idle after executing the first task in the first task chain.
- for example, once a processing core 312 used to execute a first task in the first task chain has finished that first task and becomes idle, the multi-core management unit 3114 can immediately preempt it for executing a second task in the second task chain, without waiting for the execution of the first task chain to complete.
- FIG. 4 is a schematic diagram of another task chain scheduling and execution process provided by an embodiment of the present application.
- a brief description of FIG. 4 is as follows:
- task chain 0 and task chain 1 can each be split into four tasks, task 0 to task 3; task chain 0 and task chain 1 are task chains of the same type, and there is no dependency between them.
- the multi-core processor has a 4-core structure, that is, the multi-core processor includes processing cores 0 to 3.
- the task scheduler first issues the four tasks of task chain 0 to processing cores 0 to 3 for execution; for example, task 0 in task chain 0 is issued to processing core 0, task 1 to processing core 1, task 2 to processing core 2, and task 3 to processing core 3.
- as soon as any one of processing cores 0 to 3 finishes its task in task chain 0, the task scheduler immediately issues a task in task chain 1 to that processing core for execution.
- for example, when processing core 3 finishes task 3 in task chain 0, task 0 in task chain 1 is immediately issued to processing core 3; when processing core 2 finishes task 2 in task chain 0, task 1 in task chain 1 is immediately issued to processing core 2; when processing core 1 finishes task 1 in task chain 0, task 2 in task chain 1 is immediately issued to processing core 1; and when processing core 0 finishes task 0 in task chain 0, task 3 in task chain 1 is immediately issued to processing core 0.
- the above task chain 0 may be the first task chain
- the above task chain 1 may be the second task chain.
- the scheduling feature in Figure 4 enables concurrent execution of independent task chains, timely scheduling and full use of the computing power of the processing core, reducing performance degradation caused by no-load phenomena.
- a processing core 312 that is released from executing the first task chain is preempted in time to execute the second task chain;
- the tasks in each task chain are issued by a balanced strategy to ensure that the number of unfinished tasks on each processing core is basically equal.
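The scheduling behaviour described for FIG. 4 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the hardware implementation: the task durations, the function name `simulate`, and the use of an event heap are all introduced here for illustration. The durations are chosen so that processing core 3 becomes idle first, then cores 2, 1, and 0, as in the FIG. 4 example.

```python
import heapq

def simulate(chains, num_cores):
    """Issue tasks of independent chains to cores (hypothetical model).

    chains: list of lists of task durations, in issue order.
    A core that finishes a task is immediately given the next pending task.
    Returns a list of (core, chain, task, start, end) tuples.
    """
    # Flatten in issue order: all tasks of chain 0, then chain 1, ...
    pending = [(c, t, d) for c, tasks in enumerate(chains)
               for t, d in enumerate(tasks)]
    pending.reverse()  # pop() from the end now yields issue order
    # Heap of (time at which the core becomes idle, core id).
    idle_at = [(0, core) for core in range(num_cores)]
    heapq.heapify(idle_at)
    schedule = []
    while pending:
        now, core = heapq.heappop(idle_at)      # earliest idle core
        chain, task, duration = pending.pop()   # next task to issue
        schedule.append((core, chain, task, now, now + duration))
        heapq.heappush(idle_at, (now + duration, core))
    return schedule

# Assumed durations: core 3 finishes its chain-0 task first, core 0 last.
chain0 = [4, 3, 2, 1]
chain1 = [4, 4, 4, 4]
schedule = simulate([chain0, chain1], num_cores=4)
for core, chain, task, start, end in schedule:
    print(f"core {core}: chain {chain} task {task} [{start}, {end})")
```

In this run no core waits for task chain 0 to finish as a whole: each core starts a task of task chain 1 at the moment its own task of task chain 0 ends.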
- the multi-core management unit 3114 will take over all computing resources being used to execute low-priority task chains; a processing core 312 therefore sees either only tasks in high-priority task chains or only tasks in low-priority task chains, and never both at the same time.
- the remaining processing cores can be dynamically scheduled to execute other types of task chains, for example binning-type task chains.
- the task splitting unit 3113 can split the second task chain into one or more second tasks. After a first task has been executed by one of the one or more second processing cores, the multi-core management unit 3114 can preempt a processing core for executing a second task in the second task chain. The second instruction may include the number of processing cores required to execute the second task chain, or the identifiers of the processing cores specifically used to execute the second task chain. Thereafter, as long as at least one first processing core among the plurality of processing cores 312 is in an idle state, the multi-core management unit 3114 preempts the at least one first processing core according to the second instruction and sends the result of the preemption to the task splitting unit 3113; the task splitting unit 3113 then schedules at least one second task among the one or more second tasks to be executed in the at least one first processing core. In this way, the hardware (multi-core management unit 3114) implements the release and application of processing cores with the granularity of multiple processing cores 3
- this management method greatly reduces or even eliminates the no-load problem of some processing cores, and improves the utilization efficiency of the processing cores.
- the task scheduler 311 further includes a task assembly unit 3115; the task assembly unit 3115 is configured to obtain a command stream and the dependencies between some or all of the multiple task chains, and to generate some or all of the multiple task chains according to the command stream; it sends the generated task chains to the task queue unit 3112, and sends the dependencies between them to the dependency management unit 3111.
- DDK inserts the dependencies specified in the API and the dependencies that are not specified by the API but inferred by the DDK itself into the command stream in the order of instructions.
- the hardware executes the command stream, assembles the commands in the command stream into a job, matches the dependency of the instruction form to the corresponding task chain, and sends it to the next-level module after completion.
- the device development kit and the driver 32 can directly complete the task assembly, and deliver it to the task scheduler 311 in the form of a task chain.
- the device development kit and driver 32 can also hand the task assembly work over to the task assembly unit 3115 in the task scheduler 311 by issuing tasks to the task assembly unit 3115 in the form of a command stream; the task assembly unit 3115 then assembles the task chains according to the command stream.
- in addition, the device development kit and driver 32 also issue the dependencies between the task chains to the task assembly unit 3115; after the task assembly unit 3115 assembles a task chain, it sends the assembled task chain to the task queue unit 3112 and sends the dependencies of the assembled task chain to the dependency management unit 3111.
- whether the task assembly unit 3115 is present is optional and depends on how the work is divided between the device development kit and driver 32 and the multi-core processor 31.
- the software (DDK) may issue tasks to the multi-core processor 31 in the form of a command stream; the task assembly unit 3115 in the multi-core processor 31 receives the command stream together with the dependencies between some or all of the multiple task chains, generates some or all of the multiple task chains according to the command stream, sends the generated task chains to the task queue unit 3112, and sends the dependencies between them to the dependency management unit 3111. In this way, multi-core scheduling can also be implemented when the software (DDK) issues tasks in the form of command streams.
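The role of the task assembly unit can be sketched in a few lines. The source does not specify the command stream format, so the tuple layout used here (`("cmd", chain_id, payload)` and `("dep", chain_id, other_id)`) and the function name `assemble` are purely hypothetical; the sketch only shows the idea of grouping commands into task chains and separating out the dependencies that would go to the dependency management unit.

```python
def assemble(command_stream):
    """Group commands into task chains and collect inter-chain dependencies.

    Hypothetical stream entries:
      ("cmd", chain_id, payload)  -> append payload to that chain
      ("dep", chain_id, other_id) -> chain_id must wait for other_id
    Returns (chains, deps): chains would go to the task queue unit,
    deps to the dependency management unit.
    """
    chains = {}  # chain_id -> ordered list of commands
    deps = {}    # chain_id -> set of chain ids it depends on
    for kind, chain_id, arg in command_stream:
        chains.setdefault(chain_id, [])
        deps.setdefault(chain_id, set())
        if kind == "cmd":
            chains[chain_id].append(arg)
        elif kind == "dep":
            deps[chain_id].add(arg)
        else:
            raise ValueError(f"unknown entry kind: {kind!r}")
    return chains, deps

stream = [
    ("cmd", 0, "draw A"),
    ("cmd", 0, "draw B"),
    ("cmd", 1, "draw C"),
    ("dep", 2, 0),          # chain 2 depends on chain 0
    ("cmd", 2, "blend"),
]
chains, deps = assemble(stream)
print(chains)
print(deps)
```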
- FIG. 5 is a schematic flowchart of a multi-core scheduling provided by an embodiment of the present application, which can be applied to the multi-core scheduling system 30 shown in FIG. 3, including but not limited to the following steps:
- Step 501 Device development kit, driver (DDK) task analysis.
- DDK parses the tasks that need to be executed by the multi-core processor by calling the parsing API, and sets the dependencies between the tasks.
- Step 502 is entered after the parsing of a segment of tasks is completed.
- the DDK task parsing process can be specifically executed by the device development kit and the driver 32 .
- Step 502 Task assembly.
- tasks are assembled into task chains identifiable by multi-core processors, and corresponding data sequences (desc or descriptors) are constructed, and dependencies are recorded.
- descriptors are data structures stored in double data rate synchronous dynamic random access memory (Double Data Rate, DDR) that characterize each task chain, for example which input data it uses, which program segment executes it, how the data is processed, where the output is written, and in what format.
- the task assembly process may be executed by the device development kit, the driver 32 or the task assembly unit 3115 .
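As a rough illustration of what a descriptor might carry, the following sketch models one as a plain record. The source only lists the kinds of information a descriptor holds; every field name, type, and example value below is an assumption made for illustration and does not reflect the actual in-memory layout.

```python
from dataclasses import dataclass, field

@dataclass
class ChainDescriptor:
    """Hypothetical descriptor for one task chain (field names assumed)."""
    chain_id: int
    input_addr: int       # which input data is used
    program_addr: int     # which program segment executes the chain
    output_addr: int      # where the output is written
    output_format: str    # in what format the output is written
    depends_on: list = field(default_factory=list)  # recorded dependencies

desc = ChainDescriptor(
    chain_id=1,
    input_addr=0x1000,
    program_addr=0x2000,
    output_addr=0x3000,
    output_format="rgba8",   # example value only
    depends_on=[0],
)
print(desc)
```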
- Step 503 Dependency management.
- the dependency management process maintains the dependencies between task chains according to the information recorded in the scoreboard.
- when all the task chains on which a waiting task chain depends are recorded in the scoreboard as completed, the dependency of the waiting task chain is released.
- the dependency management process can be specifically executed by the dependency management unit 3111 , and the scoreboard is located in the multi-core management unit 3114 .
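The release rule in step 503 can be sketched as follows. This is a software model under assumptions, not the hardware design: the class name `DependencyManager`, its methods, and the use of sets are hypothetical, and the "completed" set stands in for the completion records kept in the scoreboard.

```python
class DependencyManager:
    """Hypothetical model of dependency release between task chains."""

    def __init__(self, deps):
        # deps: chain_id -> set of chain ids that must complete first
        self.waiting_on = {c: set(d) for c, d in deps.items()}
        self.completed = set()   # stands in for scoreboard completion records
        self.released = set()    # chains already handed on for execution

    def mark_completed(self, chain_id):
        """Record that a task chain has finished."""
        self.completed.add(chain_id)

    def ready(self):
        """Return chains whose dependencies are all completed, once each."""
        out = sorted(c for c, d in self.waiting_on.items()
                     if c not in self.released and d <= self.completed)
        self.released.update(out)
        return out

dm = DependencyManager({0: set(), 1: set(), 2: {0, 1}})
first = dm.ready()       # chains 0 and 1 have no dependencies
dm.mark_completed(0)
second = dm.ready()      # chain 2 still waits for chain 1
dm.mark_completed(1)
third = dm.ready()       # chain 2 is now released
print(first, second, third)
```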
- Step 504 Task queue.
- the task queuing process can be specifically executed by the task queuing unit 3112 .
- Step 505 Multi-core management.
- the number of tasks obtained by splitting a task chain may be the same as or different from the number of processing cores. When the number of tasks obtained by splitting a task chain is greater than the number of processing cores, at least one processing core needs to execute two or more tasks of the task chain. A processing core that executes two or more tasks of the task chain is released after it has executed the last of its tasks in that task chain; for a processing core that executes only one task of the task chain, that task is also its last task of the task chain.
- the multi-core management process may be specifically executed by the multi-core management unit 3114 .
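The release rule in step 505 can be made concrete with a small sketch. The source does not say how tasks are mapped to cores beyond a balanced strategy, so the round-robin mapping and the function names below are assumptions chosen for illustration.

```python
def assign_round_robin(num_tasks, core_ids):
    """Map each core to the ordered task indices it runs (assumed policy)."""
    plan = {core: [] for core in core_ids}
    for task in range(num_tasks):
        plan[core_ids[task % len(core_ids)]].append(task)
    return plan

def release_points(plan):
    """For each core that received work, the task after which it is released,
    i.e. the last task of the chain assigned to that core."""
    return {core: tasks[-1] for core, tasks in plan.items() if tasks}

# Six tasks on four cores: cores 0 and 1 each run two tasks of the chain.
plan = assign_round_robin(6, [0, 1, 2, 3])
release = release_points(plan)
print(plan)
print(release)
```

Here cores 2 and 3 can be released as soon as their single task finishes, while cores 0 and 1 are released only after tasks 4 and 5 respectively.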
- Step 506 Task splitting.
- the task chain waiting to be executed is split into one or more tasks, which are issued to the processing cores preempted for that task chain in step 505 to carry out the task computation. After the one or more split tasks are issued, step 507 and step 508 are entered simultaneously.
- the task splitting process can be specifically executed by the task splitting unit 3113 .
- Step 507 Scoreboard.
- the scoreboard records the tasks issued to each processing core and the task chain to which each task belongs, and uses the information returned by the processing cores to confirm whether all tasks of a task chain on a given processing core have completed; if so, the flow goes to step 505 to perform dynamic release and dynamic preemption of processing cores.
- the scoreboard is located in the multi-core management unit 3114 , and the scoreboard process can be specifically executed by the multi-core management unit 3114 .
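A minimal software model of the scoreboard bookkeeping in step 507 is sketched below. The class and method names are hypothetical and the counter-based representation is an assumption; the source describes only what is recorded and what is decided, not how it is stored.

```python
from collections import Counter

class Scoreboard:
    """Hypothetical scoreboard: counts unfinished tasks per (core, chain)."""

    def __init__(self):
        self.outstanding = Counter()  # (core, chain) -> tasks not yet done

    def issued(self, core, chain):
        """Record that one task of `chain` was issued to `core`."""
        self.outstanding[(core, chain)] += 1

    def response(self, core, chain):
        """Record a completion response from a core.

        Returns True when the core has no unfinished task of this chain
        left, i.e. the core can be released and preempted again.
        """
        key = (core, chain)
        if self.outstanding[key] <= 0:
            raise ValueError(f"no outstanding task for {key}")
        self.outstanding[key] -= 1
        return self.outstanding[key] == 0

sb = Scoreboard()
sb.issued(0, "A")
sb.issued(0, "A")   # core 0 runs two tasks of chain A
sb.issued(1, "A")   # core 1 runs one task of chain A
r1 = sb.response(0, "A")  # core 0 still has one task left
r2 = sb.response(1, "A")  # core 1 is done with chain A
r3 = sb.response(0, "A")  # core 0 is now done with chain A
print(r1, r2, r3)
```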
- Step 508 Multi-core execution.
- each processing core executes independently, and each processing core returns a response to the scoreboard after completing each task.
- the multi-core execution process may be executed by the processing core 312 .
- the task scheduler manages the dependencies between task chains of the same type; these dependencies are managed on the hardware side rather than on the software (DDK) side. That is, the hardware implements dependency management for task chains without DDK participating in the control, which saves the time spent on software-hardware interaction and software-side calls, and the hardware responds quickly: once a dependency is released, a new task chain can be dispatched immediately, which is better than software-side management.
- the task scheduler implements fine-grained dynamic release and dynamic preemption of processing cores.
- as soon as a processing core completes its last task of a given task chain, it is released and preempted again to execute a task chain that is waiting to be executed, which mitigates or eliminates processing core idling through fine-grained management. That is, the hardware implements fine-grained release and preemption of the cores of the multi-core processor, and each processing core is managed independently.
- once a processing core completes the tasks assigned to it in a task chain, it is immediately released and re-applied for as a computing resource for the remaining task chains.
- compared with releasing and applying for multiple cores together at task chain boundaries, this management method greatly reduces or even eliminates the no-load problem of some processing cores and improves the utilization efficiency of the processing cores.
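The benefit of per-core release over release at task chain boundaries can be seen in a small worked comparison. The numbers and the fixed mapping of task i to core i are assumptions chosen for illustration, and the two task chains are assumed to be independent of each other.

```python
def makespan_chain_boundary(chains):
    """All cores are released together when the slowest task of a chain
    ends, so each chain costs the duration of its longest task."""
    return sum(max(tasks) for tasks in chains)

def makespan_per_core(chains):
    """Each core starts its next-chain task as soon as it is free.
    Assumes task i of every chain runs on core i (hypothetical mapping)
    and that the chains have no dependency on each other."""
    num_cores = max(len(tasks) for tasks in chains)
    busy_until = [0] * num_cores
    for tasks in chains:
        for core, duration in enumerate(tasks):
            busy_until[core] += duration
    return max(busy_until)

chains = [[4, 3, 2, 1], [1, 2, 3, 4]]   # assumed task durations
barrier = makespan_chain_boundary(chains)
fine = makespan_per_core(chains)
print(barrier, fine)
```

With these durations the chain-boundary policy needs 8 time units because three cores idle while the slowest task of each chain finishes, whereas per-core release needs 5 because every core is busy the whole time.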
- the task scheduler implements dynamic scheduling of processing cores across task chains and across processes to prevent processing cores from sitting idle. After the tasks of one task chain have been issued, if the next task chain has no dependency on it, the next task chain can start executing immediately without waiting for the current task chain to finish. That is, the hardware implements dynamic scheduling across task chains and processes, which effectively reduces processing core idling both within a process and between different processes, and is better than software-side management.
- FIG. 6 is a method for processing a multi-core processor provided by an embodiment of the present application, which is applied to a multi-core processor.
- the multi-core processor includes a task scheduler and a plurality of processing cores coupled to the task scheduler, and the processing method of the multi-core processor is applicable to any of the multi-core processors in FIG. 3 to FIG. 5 above and to devices (such as mobile phones, computers, and servers) that include the multi-core processor.
- the method may include but is not limited to steps 601-604, wherein,
- Step 601 Store multiple task chains and the dependencies between the multiple task chains through the task scheduler, where the dependencies include both "dependent" and "not dependent" relationships;
- Step 602 Determine, by the task scheduler, a first task chain and a second task chain from the multiple task chains according to the dependencies between the multiple task chains; there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks;
- Step 603 Schedule, through the task scheduler, some or all of the plurality of processing cores to execute the one or more first tasks;
- Step 604 When at least one first processing core among the plurality of processing cores is in an idle state, schedule, through the task scheduler, at least one second task in the second task chain to be executed in the at least one first processing core.
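Steps 601 to 604 can be sketched for a single scheduling instant as follows. This is an illustrative model only: timing is ignored, the function names are hypothetical, and picking the first independent pair in sorted order is an assumption, since the source does not say how the pair is chosen when several qualify.

```python
from collections import deque

def pick_independent_pair(chain_ids, deps):
    """Step 602: first pair of chains with no dependency either way."""
    for i, a in enumerate(chain_ids):
        for b in chain_ids[i + 1:]:
            if b not in deps.get(a, set()) and a not in deps.get(b, set()):
                return a, b
    return None

def run_method(chains, deps, num_cores):
    """Return (core, chain, task) assignments for one scheduling instant."""
    # Step 601: store the task chains and their dependencies.
    stored_chains = {cid: list(tasks) for cid, tasks in chains.items()}
    stored_deps = {cid: set(deps.get(cid, ())) for cid in stored_chains}
    # Step 602: determine a first and a second task chain.
    pair = pick_independent_pair(sorted(stored_chains), stored_deps)
    if pair is None:
        return []
    first, second = pair
    first_tasks = deque(stored_chains[first])
    second_tasks = deque(stored_chains[second])
    idle = deque(range(num_cores))
    log = []
    # Step 603: schedule cores to execute the first tasks.
    while first_tasks and idle:
        log.append((idle.popleft(), first, first_tasks.popleft()))
    # Step 604: any core still idle receives a second task.
    while second_tasks and idle:
        log.append((idle.popleft(), second, second_tasks.popleft()))
    return log

chains = {"A": ["a0", "a1"], "B": ["b0", "b1", "b2"], "C": ["c0"]}
deps = {"C": {"A"}}   # chain C depends on chain A
log = run_method(chains, deps, num_cores=4)
print(log)
```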
- the task scheduler includes a dependency management unit and a task queue unit; storing the dependencies between the multiple task chains through the task scheduler includes: storing the dependencies between the multiple task chains through the dependency management unit in the task scheduler; and determining, through the task scheduler, the first task chain and the second task chain from the multiple task chains according to the dependencies between the multiple task chains includes: if the dependency management unit in the task scheduler determines that there is no dependency between the first task chain and the second task chain, sending, through the dependency management unit, a first instruction to the task queue unit, where the first instruction indicates that there is no dependency between the first task chain and the second task chain.
- the task scheduler further includes a task splitting unit and a multi-core management unit; storing the multiple task chains through the task scheduler includes: storing the multiple task chains through the task queue unit in the task scheduler; and determining, through the task scheduler, the first task chain and the second task chain from the multiple task chains according to the dependencies between the multiple task chains further includes: after the task queue unit in the task scheduler receives the first instruction sent by the dependency management unit in the task scheduler, sending, through the task queue unit, the first task chain and the second task chain to the task splitting unit, and sending a second instruction to the multi-core management unit, where the second instruction instructs the multi-core management unit to preempt processing cores for the first task chain and the second task chain.
- scheduling, through the task scheduler, some or all of the multiple processing cores to execute the one or more first tasks includes: splitting the first task chain into the one or more first tasks through the task splitting unit in the task scheduler; preempting, through the multi-core management unit in the task scheduler and according to the second instruction, one or more second processing cores from the multiple processing cores; sending, through the multi-core management unit in the task scheduler, the result of preempting the one or more second processing cores to the task splitting unit; and scheduling, through the task splitting unit in the task scheduler, the one or more second processing cores to execute the one or more first tasks.
- scheduling, through the task scheduler, at least one second task in the second task chain to be executed in the at least one first processing core includes: splitting the second task chain into the one or more second tasks through the task splitting unit in the task scheduler; when at least one first processing core among the multiple processing cores is in an idle state, preempting the at least one first processing core according to the second instruction through the multi-core management unit in the task scheduler; sending, through the multi-core management unit in the task scheduler, the result of preempting the at least one first processing core to the task splitting unit; and scheduling, through the task splitting unit in the task scheduler, at least one second task among the one or more second tasks to be executed in the at least one first processing core.
- the task scheduler further includes a task assembly unit; the method further includes: obtaining, through the task assembly unit in the task scheduler, a command stream and the dependencies between some or all of the multiple task chains, and generating some or all of the multiple task chains according to the command stream; and sending, through the task assembly unit in the task scheduler, some or all of the multiple task chains to the task queue unit, and sending the dependencies between some or all of the multiple task chains to the dependency management unit.
- the multi-core processor includes a task scheduler and multiple processing cores coupled to the task scheduler. The task scheduler can maintain the dependencies between task chains, that is, store the dependencies between multiple task chains, and it also stores the task chains themselves, so that a first task chain and a second task chain with no dependency between them can be determined from the multiple task chains. The first task chain includes one or more first tasks and the second task chain includes one or more second tasks; some or all of the multiple processing cores can be scheduled through the task scheduler to execute the one or more first tasks. Because there is no dependency between the two task chains, the first tasks and the second tasks can be executed in parallel: when at least one first processing core among the multiple processing cores is in an idle state, at least one second task in the second task chain can be scheduled to the at least one first processing core through the task scheduler.
- in this way, an idle processing core is immediately scheduled by the task scheduler to execute tasks, thereby improving multi-core scheduling performance.
- the present application further provides a semiconductor chip, which may include the multi-core processor provided by any one of the implementation manners of the foregoing embodiments.
- the present application further provides a semiconductor chip, which may include the multi-core processor provided by any one of the above embodiments, an internal memory coupled to the multi-core processor, and an external memory.
- the present application further provides a system-on-a-chip SoC chip, where the SoC chip includes the multi-core processor provided by any one of the foregoing embodiments, an internal memory coupled to the multi-core processor, and an external memory.
- the SoC chip may be composed of chips, or may include chips and other discrete devices.
- the present application further provides a chip system, where the chip system includes the multi-core processor provided by any one of the implementation manners of the foregoing embodiments.
- the chip system further includes a memory, and the memory is used for saving necessary or related program instructions and data during the operation of the multi-core processor.
- the chip system may be composed of chips, or may include chips and other discrete devices.
- the present application further provides a processing apparatus, which has the function of implementing any one of the processing methods for a multi-core processor in the foregoing method embodiments.
- This function can be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more modules corresponding to the above functions.
- the present application further provides a terminal, where the terminal includes a multi-core processor, and the multi-core processor is the multi-core processor provided by any one of the implementation manners of the foregoing embodiments.
- the terminal may also include memory for coupling with the multi-core processor, which holds program instructions and data necessary for the terminal.
- the terminal may also include a communication interface for the terminal to communicate with other devices or a communication network.
- Embodiments of the present application further provide a computer-readable storage medium that may store a program; when the program is executed by a multi-core processor, some or all of the steps of any one of the foregoing method embodiments are performed.
- Embodiments of the present application further provide a computer program that includes instructions; when the computer program is executed by a multi-core processor, the multi-core processor can perform some or all of the steps of any one of the processing methods for a multi-core processor described in the foregoing method embodiments.
- the disclosed apparatus may be implemented in other manners.
- the device embodiments described above are only illustrative.
- the division of the above-mentioned units is only a logical function division.
- multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
- the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical or other forms.
- the units described above as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
- if the integrated units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
- the technical solutions of the present application, in essence, or the parts that contribute to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, and specifically may be a processor in the computer device) to execute all or part of the steps of the methods in the various embodiments of the present application.
- the aforementioned storage medium may include any medium that can store program code, such as a U disk, a mobile hard disk, a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM).
Claims (14)
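- A multi-core processor, comprising a task scheduler and a plurality of processing cores coupled to the task scheduler; wherein the task scheduler is configured to store a plurality of task chains and dependencies between the plurality of task chains, the dependencies including dependent and not dependent; the task scheduler is further configured to: determine a first task chain and a second task chain from the plurality of task chains according to the dependencies between the plurality of task chains, wherein there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks; schedule some or all of the plurality of processing cores to execute the one or more first tasks; and when at least one first processing core among the plurality of processing cores is in an idle state, schedule at least one second task in the second task chain to be executed in the at least one first processing core.
- The multi-core processor according to claim 1, wherein the task scheduler includes a dependency management unit and a task queue unit; wherein the dependency management unit is configured to store the dependencies between the plurality of task chains, and, if it determines that there is no dependency between the first task chain and the second task chain, to send a first instruction to the task queue unit, the first instruction indicating that there is no dependency between the first task chain and the second task chain.
- The multi-core processor according to claim 2, wherein the task scheduler further includes a task splitting unit and a multi-core management unit; wherein the task queue unit is configured to store the plurality of task chains, and, after receiving the first instruction sent by the dependency management unit, to send the first task chain and the second task chain to the task splitting unit and to send a second instruction to the multi-core management unit, the second instruction instructing the multi-core management unit to preempt processing cores for the first task chain and the second task chain.
- The multi-core processor according to claim 3, wherein the task splitting unit is configured to split the first task chain into the one or more first tasks; the multi-core management unit is configured to preempt one or more second processing cores from the plurality of processing cores according to the second instruction, and to send the result of preempting the one or more second processing cores to the task splitting unit; and the task splitting unit is further configured to schedule the one or more second processing cores to execute the one or more first tasks.
- The multi-core processor according to claim 4, wherein the task splitting unit is further configured to split the second task chain into the one or more second tasks; the multi-core management unit is further configured to, when at least one first processing core among the plurality of processing cores is in an idle state, preempt the at least one first processing core according to the second instruction, and to send the result of preempting the at least one first processing core to the task splitting unit; and the task splitting unit is further configured to schedule at least one second task among the one or more second tasks to be executed in the at least one first processing core.
- The multi-core processor according to any one of claims 2 to 5, wherein the task scheduler further includes a task assembly unit; the task assembly unit is configured to obtain a command stream and the dependencies between some or all of the plurality of task chains, and to generate some or all of the plurality of task chains according to the command stream; and to send some or all of the plurality of task chains to the task queue unit, and to send the dependencies between some or all of the plurality of task chains to the dependency management unit.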
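- A processing method for a multi-core processor, applied to a multi-core processor that includes a task scheduler and a plurality of processing cores coupled to the task scheduler; the method comprising: storing, through the task scheduler, a plurality of task chains and dependencies between the plurality of task chains, the dependencies including dependent and not dependent; determining, through the task scheduler, a first task chain and a second task chain from the plurality of task chains according to the dependencies between the plurality of task chains, wherein there is no dependency between the first task chain and the second task chain, the first task chain includes one or more first tasks, and the second task chain includes one or more second tasks; scheduling, through the task scheduler, some or all of the plurality of processing cores to execute the one or more first tasks; and when at least one first processing core among the plurality of processing cores is in an idle state, scheduling, through the task scheduler, at least one second task in the second task chain to be executed in the at least one first processing core.
- The method according to claim 7, wherein the task scheduler includes a dependency management unit and a task queue unit; wherein storing, through the task scheduler, the dependencies between the plurality of task chains comprises: storing the dependencies between the plurality of task chains through the dependency management unit in the task scheduler; and determining, through the task scheduler, the first task chain and the second task chain from the plurality of task chains according to the dependencies between the plurality of task chains comprises: if the dependency management unit in the task scheduler determines that there is no dependency between the first task chain and the second task chain, sending, through the dependency management unit in the task scheduler, a first instruction to the task queue unit, the first instruction indicating that there is no dependency between the first task chain and the second task chain.
- The method according to claim 8, wherein the task scheduler further includes a task splitting unit and a multi-core management unit; wherein storing, through the task scheduler, the plurality of task chains comprises: storing the plurality of task chains through the task queue unit in the task scheduler; and determining, through the task scheduler, the first task chain and the second task chain from the plurality of task chains according to the dependencies between the plurality of task chains further comprises: after the task queue unit in the task scheduler receives the first instruction sent by the dependency management unit in the task scheduler, sending, through the task queue unit in the task scheduler, the first task chain and the second task chain to the task splitting unit, and sending a second instruction to the multi-core management unit, the second instruction instructing the multi-core management unit to preempt processing cores for the first task chain and the second task chain.
- The method according to claim 9, wherein scheduling, through the task scheduler, some or all of the plurality of processing cores to execute the one or more first tasks comprises: splitting the first task chain into the one or more first tasks through the task splitting unit in the task scheduler; preempting, through the multi-core management unit in the task scheduler and according to the second instruction, one or more second processing cores from the plurality of processing cores; sending, through the multi-core management unit in the task scheduler, the result of preempting the one or more second processing cores to the task splitting unit; and scheduling, through the task splitting unit in the task scheduler, the one or more second processing cores to execute the one or more first tasks.
- The method according to claim 10, wherein, when at least one first processing core among the plurality of processing cores is in an idle state, scheduling, through the task scheduler, at least one second task in the second task chain to be executed in the at least one first processing core comprises: splitting the second task chain into the one or more second tasks through the task splitting unit in the task scheduler; when at least one first processing core among the plurality of processing cores is in an idle state, preempting the at least one first processing core according to the second instruction through the multi-core management unit in the task scheduler; sending, through the multi-core management unit in the task scheduler, the result of preempting the at least one first processing core to the task splitting unit; and scheduling, through the task splitting unit in the task scheduler, at least one second task among the one or more second tasks to be executed in the at least one first processing core.
- The method according to any one of claims 8 to 11, wherein the task scheduler further includes a task assembly unit; the method further comprising: obtaining, through the task assembly unit in the task scheduler, a command stream and the dependencies between some or all of the plurality of task chains, and generating some or all of the plurality of task chains according to the command stream; and sending, through the task assembly unit in the task scheduler, some or all of the plurality of task chains to the task queue unit, and sending the dependencies between some or all of the plurality of task chains to the dependency management unit.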
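- A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program, when executed by a multi-core processor, implements the method according to any one of claims 7 to 12.
- A computer program, wherein the computer-readable program includes instructions that, when the computer program is executed by a multi-core processor, cause the multi-core processor to perform the method according to any one of claims 7 to 12.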
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/077230 WO2022174442A1 (zh) | 2021-02-22 | 2021-02-22 | 多核处理器、多核处理器的处理方法及相关设备 |
CN202180093759.7A CN116868169A (zh) | 2021-02-22 | 2021-02-22 | 多核处理器、多核处理器的处理方法及相关设备 |
EP21926159.1A EP4287024A4 (en) | 2021-02-22 | 2021-02-22 | MULTI-CORE PROCESSOR, MULTI-CORE PROCESSOR PROCESSING METHOD AND RELATED DEVICE |
US18/452,046 US20230393889A1 (en) | 2021-02-22 | 2023-08-18 | Multi-core processor, multi-core processor processing method, and related device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/077230 WO2022174442A1 (zh) | 2021-02-22 | 2021-02-22 | 多核处理器、多核处理器的处理方法及相关设备 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/452,046 Continuation US20230393889A1 (en) | 2021-02-22 | 2023-08-18 | Multi-core processor, multi-core processor processing method, and related device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022174442A1 (zh) | 2022-08-25 |
Family
ID=82931921
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/077230 WO2022174442A1 (zh) | 2021-02-22 | 2021-02-22 | 多核处理器、多核处理器的处理方法及相关设备 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230393889A1 (zh) |
EP (1) | EP4287024A4 (zh) |
CN (1) | CN116868169A (zh) |
WO (1) | WO2022174442A1 (zh) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102098503A (zh) * | 2009-12-14 | 2011-06-15 | 中兴通讯股份有限公司 | 一种多核处理器并行解码图像的方法和装置 |
CN103235742A (zh) * | 2013-04-07 | 2013-08-07 | 山东大学 | 多核集群服务器上基于依赖度的并行任务分组调度方法 |
CN103885826A (zh) * | 2014-03-11 | 2014-06-25 | 武汉科技大学 | 一种多核嵌入式系统实时任务调度实现方法 |
US20170060640A1 (en) * | 2015-08-31 | 2017-03-02 | Mstar Semiconductor, Inc. | Routine task allocating method and multicore computer using the same |
US20180032376A1 (en) * | 2016-07-27 | 2018-02-01 | Samsung Electronics Co .. Ltd. | Apparatus and method for group-based scheduling in multi-core processor system |
CN109697122A (zh) * | 2017-10-20 | 2019-04-30 | 华为技术有限公司 | 任务处理方法、设备及计算机存储介质 |
2021
- 2021-02-22 EP EP21926159.1A patent/EP4287024A4/en active Pending
- 2021-02-22 WO PCT/CN2021/077230 patent/WO2022174442A1/zh active Application Filing
- 2021-02-22 CN CN202180093759.7A patent/CN116868169A/zh active Pending
2023
- 2023-08-18 US US18/452,046 patent/US20230393889A1/en active Pending
Non-Patent Citations (2)
Title |
---|
PENG, MANMAN ET AL.: "Task Allocation and Load Balance on Multi-core Processors", MICROELECTRONICS & COMPUTER, vol. 28, no. 11, 30 November 2011 (2011-11-30), pages 35-39, XP055960537 * |
See also references of EP4287024A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN116868169A (zh) | 2023-10-10 |
EP4287024A1 (en) | 2023-12-06 |
US20230393889A1 (en) | 2023-12-07 |
EP4287024A4 (en) | 2024-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7051330B1 (en) | | Generic application server and method of operation therefor |
US9727372B2 (en) | | Scheduling computer jobs for execution |
US10671458B2 (en) | | Epoll optimisations |
US9003410B2 (en) | | Abstracting a multithreaded processor core to a single threaded processor core |
US6732138B1 (en) | | Method and system for accessing system resources of a data processing system utilizing a kernel-only thread within a user process |
US9411636B1 (en) | | Multi-tasking real-time kernel threads used in multi-threaded network processing |
US8402470B2 (en) | | Processor thread load balancing manager |
CN108595282A (zh) | | Method for implementing a high-concurrency message queue |
WO2023103296A1 (zh) | | Write data caching method, system, device, and storage medium |
US9973512B2 (en) | | Determining variable wait time in an asynchronous call-back system based on calculated average sub-queue wait time |
WO2021022964A1 (zh) | | Task processing method and apparatus based on a multi-core system, and computer-readable storage medium |
WO2023246042A1 (zh) | | Scheduling method and apparatus, chip, electronic device, and storage medium |
CN110795254A (zh) | | Method for handling high-concurrency I/O based on PHP |
CN107562685B (zh) | | Method for data interaction between cores of a multi-core processor based on delay compensation |
CN114721818A (zh) | | GPU time-sharing method and system based on a Kubernetes cluster |
CN113946445A (zh) | | ASIC-based multithreading module and multithreading control method |
WO2022174442A1 (zh) | | Multi-core processor, multi-core processor processing method, and related device |
WO2023241307A1 (zh) | | Method and apparatus for managing threads |
CN111309494A (zh) | | Multithreaded event processing component |
CN115658278A (zh) | | Micro-task scheduler supporting high-concurrency protocol interaction |
CN106997304B (zh) | | Method and device for processing input/output events |
CN112114967B (zh) | | GPU resource reservation method based on service priority |
US20230161620A1 (en) | | Pull mode and push mode combined resource management and job scheduling method and system, and medium |
CN114564420A (zh) | | Method for sharing a parallel bus among multi-core processors |
TWI823655B (zh) | | Task processing system and task processing method for an intelligent processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21926159; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | WIPO information: entry into national phase | Ref document number: 202180093759.7; Country of ref document: CN |
 | WWE | WIPO information: entry into national phase | Ref document number: 2021926159; Country of ref document: EP |
 | ENP | Entry into the national phase | Ref document number: 2021926159; Country of ref document: EP; Effective date: 20230830 |
 | NENP | Non-entry into the national phase | Ref country code: DE |