WO2022257898A1 - Task scheduling method, system and hardware task scheduler - Google Patents

Task scheduling method, system and hardware task scheduler

Info

Publication number
WO2022257898A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
candidate
cpu core
storage area
hardware
Prior art date
Application number
PCT/CN2022/097255
Other languages
English (en)
French (fr)
Inventor
史济源
张明礼
王睿
周代金
熊婕
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP22819496.5A (published as EP4339776A1)
Publication of WO2022257898A1
Priority to US18/533,561 (published as US20240103913A1)

Classifications

    • G06F ELECTRIC DIGITAL DATA PROCESSING (GPHYSICS; G06 COMPUTING; CALCULATING OR COUNTING)
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4856 Task life-cycle, e.g. stopping, restarting, resuming execution; resumption being on a different machine, e.g. task migration, virtual machine migration
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F12/0877 Cache access modes (in hierarchically structured memory systems, e.g. virtual memory systems)

Definitions

  • the present application relates to the field of computers, in particular to a task scheduling method, system and hardware task scheduler.
  • a computer includes a central processing unit (CPU).
  • a CPU core schedules and switches between multiple tasks to execute different tasks in a time-sharing manner.
  • the task switching delay affects the real-time performance of task execution and, in turn, the performance of the entire system.
  • in existing systems, task scheduling and task switching are performed serially and synchronously, which consumes a long time. After task scheduling returns a candidate task, task switching may also need to access memory to obtain information about the candidate task, and accessing memory takes a long time.
  • the present application provides a task scheduling method, system and hardware task scheduler that achieve low-latency task switching, thereby improving the real-time performance of task execution and overall system performance.
  • a task scheduling system includes a CPU core and a hardware task scheduler.
  • the hardware task scheduler is configured to perform task scheduling to select candidate tasks, and actively send metadata of the candidate tasks to the first storage area.
  • the CPU core is configured to read metadata of the candidate task from the first storage area to execute the candidate task.
  • the first storage area is located in a storage area whose access speed is faster than that of the memory.
  • the first storage area includes an internal storage space of the CPU core and/or a cache of the CPU core.
  • Before executing the selected candidate task, the CPU core needs to send a command to obtain the metadata of the candidate task. If the metadata is stored in memory, the CPU core consumes more time to read the metadata.
  • the hardware task scheduler does not wait for the CPU core to send an instruction to obtain the metadata of the selected candidate task; instead, it actively sends the metadata of the selected candidate task to a storage area that can be accessed faster than memory. The CPU core can therefore read this metadata more quickly. The task scheduling system provided by this application thus reduces the delay for the CPU core to obtain metadata and achieves low-latency task switching.
  • the metadata of the candidate task includes the storage location information of the context of the candidate task, the identifier of the candidate task, and the state of the candidate task.
  • the identification of the candidate tasks is used to distinguish the candidate tasks.
  • the context of the candidate task refers to the minimum data set required by the CPU core to run the task.
  • the CPU core saves the context of the task.
  • the CPU core reads the context of the task to restore the running environment of the task.
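As an illustrative sketch of the save/restore just described, the context can be pictured as a register snapshot that is copied out for the outgoing task and copied in for the candidate. All type and field names below are assumptions, not the patent's actual layout:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical task context: the minimal register state the CPU core
 * must save when a task is switched out and restore when it resumes. */
struct task_context {
    uint64_t gpr[31];   /* general-purpose registers */
    uint64_t sp;        /* stack pointer */
    uint64_t pc;        /* program counter to resume at */
    uint64_t pstate;    /* processor status flags */
};

/* Save the running task's state, then load the candidate's state. */
static void switch_context(struct task_context *saved,
                           const struct task_context *candidate,
                           struct task_context *cpu_state)
{
    memcpy(saved, cpu_state, sizeof *saved);      /* save switched task  */
    memcpy(cpu_state, candidate, sizeof *candidate); /* restore candidate */
}
```

The two copies are the "save the context" and "read the context to restore the running environment" steps named above.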
  • the hardware task scheduler is further configured to actively send the context of the candidate task to the cache of the CPU core.
  • the context of candidate tasks may be stored in memory, and obtaining it from memory consumes more time. Because the hardware task scheduler actively sends the context of the candidate task to the cache of the CPU core after executing task scheduling, the CPU core can read the context from the cache in a timely manner. Reading the context from the cache consumes less time than reading it from memory, which further reduces the task switching delay.
  • the hardware task scheduler is further configured to actively send the context of the candidate task stored in the hardware task scheduler to the cache of the CPU core.
  • the CPU core is further configured to send the metadata of the switched task to the second storage area.
  • the hardware task scheduler is further configured to read metadata of the switched task from the second storage area, and store the metadata of the switched task in the hardware task scheduler.
  • the second storage area is located in the internal memory, or located in a storage area whose access speed is faster than that of the internal memory, and the second storage area is different from the first storage area.
  • a task scheduling method includes: a hardware task scheduler executes task scheduling to select candidate tasks, and actively sends metadata of the candidate tasks to the first storage area.
  • the first storage area is located in a storage area whose access speed is faster than that of the memory.
  • the first storage area includes an internal storage space of the CPU core and/or a cache of the CPU core.
  • the CPU core is used to execute the candidate task.
  • the metadata of the candidate task includes the storage location information of the context of the candidate task, the identifier of the candidate task, and the state of the candidate task.
  • the hardware task scheduler actively sends the context of the candidate task to the cache of the CPU core.
  • the hardware task scheduler actively sends the context of the candidate task stored in the hardware task scheduler to the cache of the CPU core.
  • the hardware task scheduler reads the metadata of the switched task from the second storage area, and stores the metadata of the switched task in the hardware task scheduler.
  • the second storage area is located in the internal memory, or located in a storage area whose access speed is faster than that of the internal memory.
  • the second storage area is different from the first storage area.
  • a task scheduling method includes: the CPU core reads the metadata of the candidate task from the first storage area to execute the candidate task.
  • the first storage area is located in a storage area whose access speed is faster than that of the memory.
  • the first storage area includes an internal storage space of the CPU core and/or a cache of the CPU core.
  • the metadata of the candidate task includes the storage location information of the context of the candidate task, the identifier of the candidate task, and the state of the candidate task.
  • the CPU core sends the metadata of the switched task to the second storage area.
  • the second storage area is located in the internal memory, or located in a storage area whose access speed is faster than that of the internal memory.
  • the second storage area is different from the first storage area.
  • In a fourth aspect, a hardware task scheduler includes a memory and a processor.
  • the memory is used to store metadata of one or more candidate tasks.
  • the processor is configured to perform task scheduling to select candidate tasks, and actively send the metadata of the selected candidate tasks stored in the memory to the first storage area.
  • the first storage area is located in a storage area whose access speed is faster than that of the memory.
  • the first storage area includes an internal storage space of the CPU core and/or a cache of the CPU core.
  • the CPU core is used to execute the selected candidate task.
  • the metadata of the selected candidate task includes the storage location information of the context of the selected candidate task, the identifier of the selected candidate task and the state of the selected candidate task.
  • the processor is further configured to send the context of the selected candidate task to the cache of the CPU core.
  • the memory is further used to store the selected candidate task context.
  • the processor is further configured to send the context of the selected candidate task stored in the memory to the cache of the CPU core.
  • the processor is further configured to read the metadata of the switched task from the second storage area, and store the metadata of the switched task in the memory.
  • the second storage area is located in the internal memory, or located in a storage area whose access speed is faster than that of the internal memory.
  • the second storage area is different from the first storage area.
  • a hardware task scheduler includes a storage module and a task management module.
  • the storage module is used to store metadata of one or more candidate tasks.
  • the task management module is configured to perform task scheduling to select candidate tasks, and actively send the metadata of the selected candidate tasks stored in the storage module to the first storage area.
  • the first storage area is located in a storage area whose access speed is faster than that of the memory.
  • the first storage area includes an internal storage space of the CPU core and/or a cache of the CPU core.
  • the CPU core is used to execute the selected candidate task.
  • the metadata of the selected candidate task includes the storage location information of the context of the selected candidate task, the identifier of the selected candidate task and the state of the selected candidate task.
  • the task management module is further configured to send the context of the selected candidate task to the cache of the CPU core.
  • the storage module is further configured to store the context of the selected candidate task.
  • the task management module is further configured to send the context of the selected candidate task stored in the storage module to the cache of the CPU core.
  • the task management module is further configured to read the metadata of the switched task from the second storage area, and store the metadata of the switched task in the storage module.
  • the second storage area is located in the internal memory, or located in a storage area whose access speed is faster than that of the internal memory.
  • the second storage area is different from the first storage area.
  • FIG. 1 is a schematic diagram of an implementation environment involved in an embodiment of the present application
  • FIG. 2 is a flow chart of a task scheduling method provided in an embodiment of the present application
  • FIG. 3 is a sequence diagram corresponding to the task scheduling method shown in FIG. 2;
  • FIG. 4 is a flow chart of another task scheduling method provided by an embodiment of the present application;
  • FIG. 5 is a sequence diagram corresponding to the task scheduling method shown in FIG. 4;
  • FIG. 6 is a flow chart of another task scheduling method provided by an embodiment of the present application;
  • FIG. 7 is a sequence diagram corresponding to the task scheduling method shown in FIG. 6.
  • FIG. 8 is a schematic diagram of a logic structure of a hardware task scheduler provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a hardware structure of a hardware task scheduler provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a task scheduling system provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of another task scheduling system provided by an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of an implementation environment involved in this embodiment of the present application.
  • the implementation environment includes a computer device 100 .
  • the computer device 100 is a device that runs a multitasking system, and may be any of various types of devices.
  • the computer device 100 may be a mobile phone, a personal computer, or a server.
  • the computer device 100 includes a CPU 110 and a memory 120 .
  • CPU 110 and memory 120 are interconnected via a bus.
  • the CPU 110 includes one or more CPU cores (core).
  • CPU 110 includes two CPU cores: CPU core 111 and CPU core 112.
  • the CPU core includes a control unit (CU), an arithmetic logic unit (ALU), registers, a level 1 cache (L1 cache) and a level 2 cache (L2 cache).
  • CPU core 111 includes CU 111-1, ALU 111-2, registers 111-3, L1 cache 111-4, and L2 cache 111-5.
  • CPU core 112 includes CU 112-1, ALU 112-2, registers 112-3, L1 cache 112-4, and L2 cache 112-5.
  • the CPU may also include a level 3 cache (L3 cache).
  • CPU 110 includes L3 cache 113.
  • Some computer devices also include a level 4 cache (L4 cache).
  • CPU 110 includes L4 cache 114.
  • L1 cache, L2 cache, L3 cache, and L4 cache are collectively referred to as cache (cache).
  • the highest level cache is known as the last level cache (last level cache, LLC).
  • when the L3 cache is the highest level cache of the CPU, the L3 cache is called the LLC.
  • when the L4 cache is the highest level cache of the CPU, the L4 cache is called the LLC.
  • CU, ALU and cache are interconnected through on-chip bus.
  • the computer device 100 may also include other components.
  • network card for example, network card, monitor, mouse, keyboard, etc.
  • Registers are accessed faster than memory.
  • the time consumed by the CU/ALU to read data from the register is less than the time consumed by the CU/ALU to read data from the memory.
  • the time consumed by the CU/ALU to write data to the register is less than the time consumed by the CU/ALU to write data to the memory.
  • Cache access is slower than register access but faster than memory access. The lower the cache level, the faster the access speed; for example, the L1 cache is accessed faster than the L2 cache.
  • the access speed of L2 cache is faster than that of L3 cache.
  • the access speed of L3 Cache is faster than that of L4 cache.
  • the CPU core executes instructions through the CU to perform tasks; if an instruction involves data operations, the ALU performs the related operations. During execution, the CU reads instructions from registers, and the ALU reads data from registers. Registers have very little storage space, so the relevant instruction or data may not be in a register. If not, the CU fetches the instruction or data from memory. Because the cache is faster than memory, when a memory access request is issued, the CU first checks whether the relevant instruction or data is in the cache, and reads it from the cache if it is there.
  • if the cache also misses, the relevant instruction or data is sent from memory to the cache, and then from the cache to the register. Therefore, when the data required by the CPU core exists only in memory, the CPU reads it with a large delay; if the data exists in the cache, the delay is lower. Moreover, the lower the cache level, the lower the latency for the CPU to read the data.
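The latency ordering described above can be sketched numerically. The cycle counts below are illustrative assumptions, not measurements, and the "probe every faster level first" model is a simplification of real lookup hardware:

```c
#include <assert.h>

enum level { REG = 0, L1, L2, L3, MEM };

/* Illustrative per-level probe costs in cycles; real values vary by part. */
static const int probe_cost[] = { 1, 4, 12, 40, 200 };

/* Cost of a read that hits at `hit`: every faster level is probed (and
 * misses) first, mirroring the CU's lookup order described in the text. */
static int read_cost(enum level hit)
{
    int cycles = 0;
    for (int l = REG; l <= (int)hit; l++)
        cycles += probe_cost[l];
    return cycles;
}
```

With these assumed numbers, a hit in the L1 cache costs a few cycles while a read that only hits in memory costs hundreds, which is the gap the scheduler in this application tries to hide.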
  • a CPU core may also include internal storage space.
  • the CPU core 111 includes an internal storage space 111-6.
  • the CPU core 112 includes an internal storage space 112-6.
  • the internal storage space is only used to store data or instructions required by the CPU core.
  • the content stored in this internal storage space will not be modified by other CPU cores.
  • the content stored in the internal storage space 111-6 will not be modified by the CPU core 112.
  • the content stored in the internal storage space 112-6 will not be modified by the CPU core 111.
  • the content stored in the internal storage space can be pushed by L1 cache or by LLC.
  • the access speed of the internal storage space is faster than the cache. If the data required by the CPU core exists in the internal storage space, the latency for the CPU core to read the data is also low.
  • the CPU core executes task scheduling instructions to select candidate tasks. After the candidate task is determined, the CPU core obtains the metadata and context of the candidate task. Then, the CPU core performs a task switch to execute the candidate task.
  • the metadata or context of the candidate task may exist in the memory, so it takes a long time for the CPU core to obtain the metadata or context of the candidate task.
  • a hardware task scheduler is added to the computer device, and the hardware task scheduler is responsible for executing task scheduling to select candidate tasks. The hardware task scheduler completes task scheduling in parallel: for example, while the CPU core is still executing instructions of the current task, the hardware task scheduler executes task scheduling in parallel.
  • the hardware task scheduler can also complete task scheduling in advance. For example, when the CPU core is still executing instructions of the current task, the hardware task scheduler has completed task scheduling. The hardware task scheduler schedules tasks in advance or in parallel, so that the time consumed by task scheduling can be ignored. Moreover, the hardware task scheduler does not wait for the CPU core to send an instruction to obtain the metadata of the candidate task after completing task scheduling, but actively sends the metadata of the candidate task to a designated storage area. The hardware task scheduler actively sends the metadata of the candidate task, so that the metadata of the candidate task can be sent to the designated storage area earlier. The designated storage area is located in storage space that can be accessed faster than memory.
  • the specified storage area is located in the internal storage space of the CPU core, or the specified storage area is located in the cache.
  • the designated storage area may be located in a certain level of cache in the multi-level cache.
  • the designated storage area is located in the first level cache.
  • the designated storage area is located in the second level cache. Because the hardware task scheduler actively sends the metadata of the candidate task to a designated storage area with a faster access speed, the CPU core can obtain the metadata of the candidate task from that area in a timely manner, which reduces the delay for the CPU core to obtain the metadata of candidate tasks.
  • if the designated storage area is located in the internal storage space of the CPU core, the delay for the CPU core to obtain the metadata of the candidate task is the lowest. If the designated storage area is located in the cache, the lower the cache level, the lower the latency for the CPU core to obtain the candidate task metadata.
  • after completing task scheduling, the hardware task scheduler actively sends the context of the candidate task to the cache, so that the CPU core can obtain the context of the candidate task from the cache in a timely manner, reducing the delay for the CPU core to obtain the context of the candidate task.
  • by removing task scheduling from the critical path of task switching, the technical solution provided by the embodiments of the present application reduces the delay for the CPU core to acquire candidate task metadata and context, and reduces the total delay of task switching. Please refer to the following description for the detailed solutions of the embodiments of the present application.
  • FIG. 2 shows a flowchart of a task scheduling method provided by an embodiment of the present application.
  • the hardware task scheduler first receives the task scheduling instructions of the CPU core, and then executes task scheduling to select candidate tasks. After selecting a candidate task, the hardware task scheduler actively sends the metadata of the candidate task to the designated storage area, and actively sends the context of the candidate task to the cache.
  • the designated storage area is located in a storage area that can be accessed faster than memory. For example, the designated storage area is in the cache. Alternatively, the specified storage area is located in the internal storage space of the CPU core.
  • the CPU core obtains the metadata of the candidate task from a designated storage area with faster memory access speed, and obtains the context of the candidate task from the cache.
  • the task scheduling method flow includes the following steps:
  • Step 201 the CPU core sends a message to the hardware task scheduler, notifying the hardware task scheduler to execute task scheduling.
  • the CPU core sends a message to the hardware task scheduler during the execution of the current task, so as to notify the hardware task scheduler to start executing task scheduling earlier to select candidate tasks.
  • the CPU core sends a task scheduling instruction to the hardware task scheduler to notify the hardware task scheduler to start executing task scheduling.
  • the CPU core can determine the timing of sending task scheduling instructions through various mechanisms. For example, a CPU core can detect the running time of the current task. If the running time exceeds the threshold, the CPU core sends a task scheduling instruction. Alternatively, the CPU core can detect special instructions in the code of the current task. When a special instruction is detected, the CPU core sends a task scheduling instruction.
  • the CPU core may also send a task scheduling instruction when the current task finishes executing, for example, when the execution time of the current task expires.
  • the CPU core sends the task scheduling instruction to the hardware task scheduler. For example, the CPU core writes the CPU core identifier into a certain storage area in the hardware task scheduler, so as to trigger the hardware task scheduler to perform task scheduling and select a candidate task for the CPU core represented by the CPU core identifier.
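The trigger in step 201 — the CPU core writing its own identifier into a storage area of the hardware task scheduler — might look like the sketch below. The doorbell register is simulated with a plain variable, since the real address and register layout are not given in the text; on real hardware it would be a `volatile` pointer to a memory-mapped address:

```c
#include <assert.h>
#include <stdint.h>

/* Simulated doorbell register of the hardware task scheduler.
 * Hypothetical: stands in for a memory-mapped register whose
 * address and width the text does not specify. */
static uint32_t sched_doorbell;

/* Notify the scheduler to select a candidate task for this CPU core.
 * The write of the core identifier is itself the trigger. */
static void request_schedule(uint32_t cpu_core_id)
{
    sched_doorbell = cpu_core_id;
}
```

After issuing this write, the core keeps executing the current task, matching the "continue and pre-process" behavior described next.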
  • After sending the task scheduling instruction, the CPU core continues to execute the current task and performs pre-scheduling processing, for example, checking the status of the current task. If the state of the current task does not allow the task to be switched, the CPU core continues to execute the current task. If switching is allowed, the CPU core executes steps 205-208 to obtain the candidate task and perform task switching when it detects the switching opportunity.
  • the CPU core can detect the switching opportunity in various ways, for example, by detecting whether the execution time of the current task has expired, or whether a special instruction has been executed.
  • the hardware task scheduler executes steps 202-204 in parallel to complete task scheduling to select a candidate task and send relevant information of the candidate task.
  • the CPU core performs task switching to execute the candidate task selected by the hardware task scheduler.
  • The task that the CPU core was running before the switch is called the switched task.
  • Step 202 the hardware task scheduler executes task scheduling to select candidate tasks.
  • Each candidate task queue includes metadata for one or more candidate tasks.
  • the metadata includes the identification of the candidate task, the state of the candidate task, and the storage location information of the context of the candidate task.
  • the identifier of the candidate task is used to distinguish different candidate tasks.
  • the states of the candidate tasks include ready, executing, newly created, blocked, terminated, and so on.
  • the ready state indicates that the task has obtained all necessary resources except the CPU core, and is waiting for the execution of the CPU core.
  • the hardware task scheduler selects candidate tasks from the candidate tasks in the ready state for execution by the CPU core.
  • the context of the candidate task refers to the minimum data set required by the CPU core to run the task.
  • the CPU saves the context of the task.
  • the CPU reads the context of the task to restore the execution environment of the task.
  • the context of candidate tasks is stored in memory.
  • the storage location information of the context of the candidate task includes the storage address of the context of the candidate task in memory.
  • the storage location information may also include the length of the candidate task context. Various ways can be used to express the length of this context. For example, a byte offset relative to where this context is stored in memory. Alternatively, the number of cache lines occupied by this context.
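One of the length encodings mentioned above — the number of cache lines occupied by the context — is a rounded-up division. The 64-byte line size here is an assumption; the text does not state one:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64  /* assumed line size, not given in the text */

/* Number of cache lines needed to hold a context of `len` bytes. */
static size_t context_cache_lines(size_t len)
{
    return (len + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Expressing the length in cache lines rather than bytes matches how the context is later pushed to the cache line by line.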
  • the metadata of the candidate task may also include the CPU core associated with the candidate task.
  • the CPU core associated with the candidate task indicates the affinity between the task and the CPU core. That is, which CPU core or cores are more expected to execute the task.
  • the hardware task scheduler preferentially selects a candidate task from candidate tasks associated with the CPU core for execution by the CPU core.
  • the hardware task scheduler can also express the affinity of candidate tasks and CPU cores based on other means. For example, the hardware task scheduler maintains a queue for each CPU core, and stores candidate tasks that are affinity with the CPU core into the queue corresponding to the CPU core.
  • the metadata may also include information such as priority of candidate tasks, time slice threshold, and the like.
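Collecting the fields listed above, the per-task metadata record kept by the hardware task scheduler could be sketched as follows. Field names, widths, and the enum values are assumptions for illustration, not the patent's actual layout:

```c
#include <assert.h>
#include <stdint.h>

/* Task states named in the text: ready, executing, newly created,
 * blocked, terminated. */
enum task_state { TASK_NEW, TASK_READY, TASK_RUNNING,
                  TASK_BLOCKED, TASK_TERMINATED };

/* Hypothetical metadata record for one candidate task. */
struct task_metadata {
    uint32_t        task_id;        /* distinguishes candidate tasks     */
    enum task_state state;          /* READY: waiting only for a CPU core */
    uint64_t        ctx_addr;       /* memory address of the context     */
    uint32_t        ctx_lines;      /* context length, in cache lines    */
    int32_t         affine_cpu;     /* preferred CPU core; -1 if none    */
    uint8_t         priority;
    uint32_t        time_slice_us;  /* time slice threshold              */
};
```

The scheduler would hold one such record per queued candidate and hand the chosen one to the designated storage area.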
  • the hardware task scheduler performs task scheduling to select candidate tasks.
  • the hardware task scheduler may preferentially select candidate tasks among candidate tasks that are affinity with the CPU core.
  • the hardware task scheduler can also select candidate tasks among candidate tasks that do not specify any CPU affinity.
  • the hardware task scheduler executes the task scheduling algorithm to complete the task scheduling.
  • the task scheduling algorithm can be various types of scheduling algorithms. For example, shortest time first scheduling, round robin, weight scheduling, priority scheduling, etc.
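As a sketch of the dispatch decision in step 202 — not the patent's actual algorithm — the following picks a READY task for a given core, preferring tasks whose affinity matches that core and breaking ties by a priority field (higher wins is an assumption):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum cand_state { S_READY, S_BLOCKED };

struct cand {
    uint32_t        id;
    enum cand_state state;
    uint8_t         priority;    /* higher value wins (an assumption) */
    int             affine_cpu;  /* -1 means no affinity */
};

/* Pick a candidate for `cpu`: only READY tasks qualify; affinity with
 * `cpu` is preferred ahead of raw priority. Returns an index, or -1. */
static int pick_candidate(const struct cand *q, size_t n, int cpu)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (q[i].state != S_READY)
            continue;
        if (best < 0) { best = (int)i; continue; }
        int best_aff = (q[best].affine_cpu == cpu);
        int cur_aff  = (q[i].affine_cpu == cpu);
        if (cur_aff > best_aff ||
            (cur_aff == best_aff && q[i].priority > q[best].priority))
            best = (int)i;
    }
    return best;
}
```

Round robin, weight scheduling, or shortest-time-first would replace the comparison inside the loop; the READY-state filter and affinity preference stay the same.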
  • Step 203 the hardware task scheduler actively sends the metadata of the candidate task to a designated storage area.
  • After selecting a candidate task, the hardware task scheduler actively sends the metadata of the candidate task to the designated storage area.
  • the designated storage area is located in a storage area that can be accessed faster than memory.
  • the specified storage area includes the internal storage area of the CPU core.
  • the designated storage area includes the cache memory of the CPU core.
  • the CPU core's cache can be inside or outside the CPU core. For example, L1 cache or L2 cache located inside the CPU core, or L3 cache or L4 cache located outside the CPU core.
  • the hardware task scheduler actively sends the metadata of the candidate task to the designated storage area, so that the CPU core can obtain the metadata of the candidate task faster.
  • the hardware task scheduler can send the metadata of the candidate task to the designated storage area in various ways. For example, the hardware task scheduler directly sends the metadata of the candidate task to the designated storage area over a dedicated channel. Alternatively, the hardware task scheduler sends an instruction to the bus, and the bus executes the instruction based on the bus protocol to send the metadata of the candidate task to the specified storage area. For example, when the bus is an Advanced RISC Machine (ARM) Advanced Microcontroller Bus Architecture (AMBA) bus, the hardware task scheduler issues cache-stashing related instructions to send the metadata of candidate tasks to the cache. The cache stashing command sends the metadata of candidate tasks to the L3 cache by default. The parameters carried by the instruction include the metadata of the candidate task and the memory address allocated for the candidate task.
  • the hardware task scheduler directly sends the metadata of the candidate task to the designated storage area based on a dedicated channel.
  • the hardware task scheduler can also send instructions to the bus.
  • the bus executes the instruction based on the bus protocol to send the metadata of the candidate task to the designated storage area.
  • the bus executes the instruction and stores the metadata of the candidate task into the cache memory matching the memory address.
  • the hardware task scheduler can also set the level of the cache in the parameter of the instruction, so as to send the metadata of the candidate task to the cache of the specified level. For example, L2 cache.
  • the hardware task scheduler can also set the identifier of the CPU core in the parameter of the instruction, so as to send the metadata of the candidate task to the cache that can be accessed by the CPU core specified by the identifier.
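The stashing parameters named above (memory address, cache level, CPU core identifier) can be pictured as a request descriptor that the scheduler hands to the bus. The following sketch uses hypothetical names and field widths; the actual AMBA cache-stashing request format is implementation-specific.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical cache-stashing request; real AMBA parameters differ. */
struct stash_request {
    uint64_t mem_addr;      /* memory address allocated for the candidate task */
    int      cache_level;   /* target cache level, e.g. 2 for L2 (3 = default) */
    int      core_id;       /* CPU core whose cache should receive the data    */
    uint8_t  payload[64];   /* candidate-task metadata, at most one cache line */
    size_t   payload_len;
};

/* Fill in a stash request carrying the candidate task's metadata. */
void build_stash_request(struct stash_request *req, uint64_t mem_addr,
                         int cache_level, int core_id,
                         const void *meta, size_t len)
{
    req->mem_addr    = mem_addr;
    req->cache_level = cache_level;
    req->core_id     = core_id;
    req->payload_len = len < sizeof(req->payload) ? len : sizeof(req->payload);
    memcpy(req->payload, meta, req->payload_len);
}
```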
  • the hardware task scheduler sets this memory address for the CPU core in its driver.
  • the CPU core reads the memory address.
  • the CPU core reads the metadata of the candidate task based on the memory address, it will directly read the metadata of the candidate task from the cache.
  • Step 204 the hardware task scheduler actively sends the context of the candidate task to the cache.
  • After selecting a candidate task, the hardware task scheduler actively sends the context of the candidate task to the cache, so that the CPU core can obtain the context of the candidate task from the cache in time.
  • the hardware task scheduler may send an instruction to send the context of the candidate task to the cache.
  • the parameters carried by the instruction include the memory address of the context of the candidate task.
  • the hardware task scheduler can also set the cache level in the parameter of the instruction, so as to send the context of the candidate task to the specified level of cache.
  • When the instruction is executed, if the context of the candidate task has already been stored in the specified level of cache, the bus no longer sends the copy of the context stored at the memory address.
  • Instead, the context of the candidate task is pushed from the LLC to the higher-level cache.
  • the hardware task scheduler can obtain the memory address of the context of the candidate task.
  • the metadata of the candidate task is sent to the cache by the hardware task scheduler. Therefore, after the CPU core obtains the metadata of the candidate task, the CPU core can also obtain the memory address of the candidate task. After obtaining the memory address of the candidate task, the CPU core reads the context of the candidate task based on the memory address. Because the context of the candidate task has been sent to the cache, the CPU core will be able to read the context of the candidate task directly from the cache.
  • Step 205 the CPU core reads the metadata of the candidate task.
  • While the CPU core is executing the current task and the task scheduling preprocessing, the hardware task scheduler has already selected a candidate task through task scheduling and sent the metadata of the candidate task to the designated storage area. Therefore, the CPU core can read the metadata of the candidate task from the designated storage area.
  • the specified storage area is located in the internal storage space of the CPU core or in the cache. Therefore, the CPU core can quickly obtain the metadata of the candidate task.
  • Step 206 the CPU reads the context of the candidate task.
  • After acquiring the metadata of the candidate task, the CPU reads the storage location information of the candidate task's context from the metadata and reads the context based on this storage location information. Because the hardware task scheduler has sent the candidate task's context to the cache, the CPU core can read the context from the cache.
  • Step 207 the CPU core stores the metadata of the switched task into the hardware task scheduler.
  • the CPU stores the metadata of the switched task into the hardware task scheduler, so that the hardware task scheduler stores the metadata of the switched task into the candidate task queue.
  • the hardware task scheduler may select the switched task as a candidate task again, so that the switched task can be executed by the CPU core again.
  • the CPU core can store the metadata of the switched task into the hardware task scheduler through various methods. For example, the CPU core may store the metadata of the switched task into the hardware task scheduler through any one of the following two implementation manners (207A and 207B).
  • the CPU core sends the metadata of the switched task to the hardware task scheduler, and the hardware task scheduler stores the metadata of the switched task into the candidate task queue.
  • The CPU core sends the metadata of the switched task to the hardware task scheduler. For example, the CPU core writes the metadata of the switched task into a certain storage area in the hardware task scheduler, prompting the hardware task scheduler to read the metadata of the switched task from that storage area and store it in the candidate task queue.
  • the hardware task scheduler may select the switched task again, so that the switched task is executed by the CPU core again.
  • The CPU core sends the metadata of the switched task to the specified storage area, and the hardware task scheduler reads the metadata of the switched task from the specified storage area and stores it in the candidate task queue.
  • the CPU core sends the metadata of the switched task to the designated storage area.
  • the designated storage area may be located in a storage area whose access speed is faster than that of the memory.
  • the specified storage area includes the internal storage space of the CPU core.
  • the specified storage area includes the cache memory of the CPU core.
  • the CPU core's cache can be inside or outside the CPU core.
  • the CPU core may send metadata of the switched task to the designated storage area through a dedicated channel.
  • the CPU core can also send an instruction, so that the metadata of the switched task is sent to a designated storage area. Since saving the metadata of the switched task does not affect the execution of the candidate task, the designated storage area can also be located in the memory.
  • the CPU core directly writes the metadata of the switched task into a storage area located in the memory.
  • the CPU core writes the metadata of the switched task to the designated storage area to trigger the hardware task scheduler to read the metadata of the switched task.
  • The hardware task scheduler monitors the designated storage area. Once it finds that task metadata has been stored in the specified storage area, the hardware task scheduler reads the metadata from that area and stores it in its candidate task queue. When performing subsequent task scheduling, the hardware task scheduler may select the task again so that the task is executed by the CPU core once more.
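The monitoring behavior can be sketched as a polling loop over a one-slot mailbox in the designated storage area. The `mailbox` and `candidate_queue` types are hypothetical; real hardware would use its own queue structures and signaling.

```c
#include <stdbool.h>

/* Hypothetical mailbox in the designated storage area: a valid flag
 * plus one metadata slot. */
struct mailbox {
    volatile bool valid;
    int task_id;
};

#define QUEUE_CAP 16
struct candidate_queue { int ids[QUEUE_CAP]; int len; };

/* One poll iteration: if the CPU core has written the metadata of a
 * switched-out task, move it into the candidate queue and clear the slot. */
bool poll_mailbox(struct mailbox *mb, struct candidate_queue *q)
{
    if (!mb->valid || q->len >= QUEUE_CAP)
        return false;
    q->ids[q->len++] = mb->task_id;
    mb->valid = false;        /* slot free for the next write */
    return true;
}
```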
  • the designated storage area and the designated storage area in step 203 may be located in the same storage space.
  • the specified storage area and the specified storage area in step 203 are all located in L2 cache.
  • both the specified storage area and the specified storage area in step 203 are located in the internal storage space of the CPU core.
  • The specified storage area and the specified storage area in step 203 may also be located in different storage spaces.
  • the specified storage area is located in L3 cache, and the specified storage area in step 203 is located in L2 cache.
  • the specified storage area is located in memory, and the storage area specified in step 203 is located in L2 cache.
  • the specified storage area and the specified storage area in step 203 are different storage areas.
  • the CPU core writes the metadata of the switched task into the specified storage area
  • the hardware task scheduler reads the switched task metadata from the specified storage area.
  • the hardware task scheduler writes the metadata of the selected candidate task into the specified storage area in step 203
  • the CPU core reads the metadata of the selected candidate task from the specified storage area in step 203 .
  • the specified storage area in step 203 is called the first storage area
  • the specified storage area in step 207B is called the second storage area.
  • Step 208 the CPU core executes task switching to run the candidate task.
  • After obtaining the context of the candidate task, the CPU core switches the context to execute the candidate task.
  • Before performing task switching, the CPU core performs pre-switch processing, for example saving the running state of the switched task and the task's memory base address.
  • Modern operating systems usually employ virtual memory management mechanisms. For example, if a CPU core has 32-bit address lines, the CPU core can access a 4-gigabyte (GB) storage space. The addresses of this 4 GB storage space are called physical addresses.
  • Each task (for example, a process) has an independent 4 GB address space. The addresses of this independent address space are called virtual addresses.
  • The memory management unit is responsible for converting a task's virtual address into a physical address. This address translation depends on the task's memory base address.
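A heavily simplified model of this translation is shown below. A real MMU walks per-task page tables; the single `base`/`limit` pair here merely stands in for the per-task translation state that must be saved and restored across a task switch.

```c
#include <stdint.h>

/* Greatly simplified single-segment translation; 'base' stands in for
 * the task's memory base address mentioned in the text. */
struct task_mm {
    uint32_t base;    /* physical base address of the task's memory */
    uint32_t limit;   /* size of the valid virtual range            */
};

/* Returns the physical address, or 0 for an out-of-range access. */
uint32_t virt_to_phys(const struct task_mm *mm, uint32_t vaddr)
{
    if (vaddr >= mm->limit)
        return 0;
    return mm->base + vaddr;
}
```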
  • the CPU core sends task scheduling instructions during the execution of the current task. Therefore, when the CPU core continues to execute the current task, the hardware task scheduler can perform task scheduling in parallel to select candidate tasks.
  • the hardware task scheduler executes task scheduling in parallel, making the time consumed by task scheduling negligible.
  • The hardware task scheduler does not wait for the CPU core to send an instruction to trigger it to send the metadata of the candidate task; instead, it actively sends the metadata of the candidate task to the first storage area, whose access speed is faster than memory, so that the CPU core can read the metadata of the candidate task from the first storage area in time.
  • the task scheduling method reduces the time delay of task scheduling switching.
  • FIG. 4 and FIG. 6 are flowcharts of two other task scheduling methods provided by the embodiments of the present application. For details not disclosed in the method embodiments shown in FIG. 4 and FIG. 6 , please refer to the method embodiment shown in FIG. 3 .
  • FIG. 4 shows a flowchart of another task scheduling method provided by an embodiment of the present application.
  • the hardware task scheduler first executes task scheduling to select a candidate task, reads the context of the candidate task in advance, and stores the context of the candidate task inside the hardware task scheduler.
  • the hardware task scheduler sends the metadata of the candidate task to the first storage area, and sends the context of the candidate task to the cache.
  • the CPU core acquires the metadata of the candidate task from the first storage area with faster access speed, and acquires the context of the candidate task from the cache.
  • the task scheduling method flow includes the following steps:
  • Step 401 the hardware task scheduler executes task scheduling to select candidate tasks.
  • Step 402 the hardware task scheduler acquires the context of the candidate task, and stores the context of the candidate task in the hardware task scheduler.
  • After selecting a candidate task, the hardware task scheduler acquires the context of the candidate task and stores it inside the hardware task scheduler. For example, the hardware task scheduler sends a read command to the bus to read the context of the candidate task and store it in the hardware task scheduler.
  • the parameters carried by the read instruction include storage location information of the candidate task context.
  • Step 403 the CPU core sends a message to notify the hardware task scheduler to execute task scheduling.
  • the CPU core sends a message to notify the hardware task scheduler to perform task scheduling. For example, the CPU core sends a task scheduling instruction to the hardware task scheduler, so as to notify the hardware task scheduler to perform task scheduling. After sending the instruction, the CPU core continues to execute the current task or pre-processing before scheduling.
  • the hardware task scheduler has completed task scheduling in step 401 . Therefore, when receiving this instruction, the hardware task scheduler can directly select a candidate task for the CPU core from the selected candidate tasks. For example, a candidate task that is compatible with the CPU core is selected from the selected candidate tasks. Then, the hardware task scheduler executes step 404 and step 405 to send the metadata and context of the candidate task.
  • Step 404 the hardware task scheduler sends the metadata of the candidate task to the first storage area.
  • Step 405 the hardware task scheduler sends the context of the candidate task to the cache.
  • the hardware task scheduler has stored the context in the hardware task scheduler in step 402 .
  • the context is also stored in memory.
  • the context may also be stored in the CPU core's cache. If the context stored elsewhere is modified after the context is stored in the hardware task scheduler, the context stored in the hardware task scheduler is invalid. If the context stored in the hardware task scheduler is invalid, the hardware task scheduler cannot directly send the context stored in the hardware task scheduler to the cache of the CPU core. Therefore, before sending the context, the hardware task scheduler needs to judge whether the context is valid. There are several ways to determine whether this context is valid.
  • The hardware task scheduler sets a flag for each context stored in the hardware task scheduler; the external storage (memory or cache) modifies the flag after modifying the context, and the hardware task scheduler checks the flag to determine the validity of the context.
  • the hardware task scheduler checks the validity of the context by detecting bus consistency.
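The flag-based check might look like the following sketch. The `ctx_entry` type is hypothetical, and how external agents set the flag (or how bus coherency is observed) is left abstract.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context entry inside the scheduler; 'modified' is the
 * flag set when the copy held elsewhere (memory or cache) changes. */
struct ctx_entry {
    volatile bool modified;
    unsigned char data[128];
};

/* Push the locally stored context toward the CPU cache: succeed only if
 * the local copy is still valid; otherwise the caller must re-fetch it
 * over the bus first. */
bool push_context(const struct ctx_entry *ctx, unsigned char *dst, size_t n)
{
    if (ctx->modified)
        return false;                       /* stale local copy */
    memcpy(dst, ctx->data, n < sizeof(ctx->data) ? n : sizeof(ctx->data));
    return true;
}
```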
  • When the context of the candidate task is valid, the hardware task scheduler sends the context stored inside it to the cache, in a manner similar to the way the metadata stored in the hardware task scheduler is sent to the cache in step 203.
  • the hardware task scheduler executes instructions similar to step 204 to send the context of the candidate task to the cache.
  • the context being sent to the cache is not the context stored in the hardware task scheduler.
  • the context sent to the cache may be stored in memory.
  • the context sent to the cache may already be stored in the cache.
  • Step 406 the CPU core reads the metadata of the candidate task.
  • While the CPU core is executing the current task and the task scheduling preprocessing, the hardware task scheduler has already sent the candidate task's metadata to the first storage area. Therefore, the CPU core can read the metadata of the candidate task from the first storage area in order to switch to and execute the candidate task.
  • Step 407 the CPU core reads the context of the candidate task.
  • After obtaining the metadata of the candidate task, the CPU core reads the storage location information of the candidate task's context from the metadata and reads the context based on the storage location information. Since the context has already been sent to the cache, the CPU core reads the context from the cache.
  • Step 408 the CPU core stores the metadata of the switched task into the hardware task scheduler.
  • the CPU core can store the metadata of the switched task into the hardware task scheduler through various methods.
  • The CPU core may store the metadata of the switched task into the hardware task scheduler through method 408A or method 408B.
  • Method 408A is similar to method 207A.
  • Method 408B is similar to method 207B. Details are not repeated here.
  • After storing the switched task in the candidate task queue, the hardware task scheduler can execute task scheduling again to select a new candidate task.
  • the hardware task scheduler obtains the context of the candidate task.
  • When it receives a scheduling instruction from the CPU core again, the hardware task scheduler directly sends the metadata and context of the selected candidate task to the CPU core.
  • Step 409 the CPU core executes task switching to run the candidate task.
  • After obtaining the context of the candidate task, the CPU core switches the context to execute the candidate task.
  • the sequence diagram of the task scheduling method provided by the embodiment of the present application is shown in FIG. 5 .
  • Before the CPU core sends the task scheduling instruction, the hardware task scheduler has already completed task scheduling. Therefore, after receiving the task scheduling instruction from the CPU core, the hardware task scheduler can immediately send the metadata of the candidate task to the first storage area and the context of the candidate task to the CPU core's cache. Because task scheduling is completed before the CPU core sends the instruction, the time consumed by task scheduling is eliminated entirely.
  • This method can send the metadata and context of the candidate task to the storage area with faster access speed more quickly, and can better ensure that the CPU core can immediately obtain the metadata and context of the candidate task from the storage area with faster memory access speed.
  • the task scheduling method provided in this embodiment reduces the time delay of task scheduling switching.
  • FIG. 6 shows a flowchart of another task scheduling method provided by an embodiment of the present application.
  • the hardware task scheduler monitors the first storage area of the CPU core. If there is no candidate task metadata to be executed in the storage area, the hardware task scheduler executes task scheduling to select a candidate task for the CPU core, and actively sends the metadata of the candidate task to the first storage area.
  • the CPU core no longer sends task scheduling instructions, but directly reads the metadata of the candidate task from the first storage area, and prefetches the context of the candidate task to the cache of the CPU core.
  • the CPU core reads the context of the candidate task from the cache to perform task switching.
  • the task scheduling method flow includes the following steps:
  • Step 601 the hardware task scheduler monitors the first storage area of the CPU core. If there is no candidate task to be executed in the storage area, the hardware task scheduler executes task scheduling to select a candidate task, and actively sends the metadata of the candidate task to the storage area.
  • Step 602 the CPU core reads metadata of candidate tasks from the first storage area.
  • When detecting that a task needs to be switched, the CPU core no longer sends a task scheduling instruction but directly reads the metadata of the candidate task from the first storage area.
  • Then step 603 is executed. After that, the CPU core continues to execute the current task or performs the operations preceding the task switch. After performing those operations, the CPU core executes step 604.
  • Step 603 the CPU core prefetches the context of the candidate task, causing the context of the candidate task to be sent to the cache.
  • After obtaining the metadata of the candidate task, the CPU core reads the storage location information of the candidate task's context from the metadata and prefetches the context based on that storage location information. For example, the CPU core sends a prefetch instruction to the bus, causing the candidate task's context to be sent to the cache. The parameters carried by the prefetch instruction include the storage location information of the candidate task's context.
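On GCC or Clang, such a prefetch can be approximated in software with `__builtin_prefetch`, which emits the architecture's prefetch hint instruction. This sketch assumes 64-byte cache lines; the hint never changes program results, only timing.

```c
#include <stddef.h>

/* Touch each cache line of the context ahead of use. The third
 * argument (3) asks for high temporal locality; the second (0) marks
 * the access as a read. Both must be compile-time constants. */
void prefetch_context(const void *ctx, size_t size)
{
    const char *p = (const char *)ctx;
    for (size_t off = 0; off < size; off += 64)   /* assume 64-byte lines */
        __builtin_prefetch(p + off, 0, 3);
}
```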
  • Step 604 the CPU core reads the context of the candidate task.
  • After performing the operations preceding the task switch, the CPU core reads the context based on the above storage location information. Since the context has already been sent to the cache, the CPU core reads the context from the cache.
  • Step 605 the CPU core sends the metadata of the switched task to the second storage area.
  • Step 606 the hardware task scheduler monitors the second storage area, reads the switched task metadata from the storage area, and stores the switched task metadata into the candidate task queue of the hardware task scheduler.
  • The hardware task scheduler monitors the second storage area. Once it finds that task metadata has been stored in the second storage area, the hardware task scheduler reads the metadata from the second storage area and writes it into its candidate task queue. When performing subsequent task scheduling, the hardware task scheduler may select the task again so that the task is executed by the CPU core once more.
  • Step 607 the CPU core executes task switching to run the candidate task.
  • After obtaining the context of the candidate task, the CPU core switches the context to execute the candidate task.
  • the hardware task scheduler executes task scheduling in advance, and sends the metadata of the selected candidate tasks to the first storage area in advance.
  • the CPU core detects that tasks need to be switched, the CPU core no longer sends task scheduling instructions, but directly reads metadata of candidate tasks from the first storage area.
  • the CPU core prefetches the context of the candidate task, causing the context of the candidate task to be sent to the cache. That is, when a task needs to be switched, the CPU core may directly read the metadata of the candidate task from the first storage area, and read the context of the candidate task from the cache.
  • the CPU core directly reads the metadata of the candidate task from the first storage area, and reads the context of the candidate task from the cache, thereby reducing the delay of task switching.
  • the CPU core does not need to directly interact with the hardware task scheduler. Therefore, the task scheduling method provided in this embodiment not only reduces the delay of task scheduling switching, but also avoids direct interaction between the CPU core and the hardware task scheduler.
  • the hardware task scheduler 800 includes: a task management module 810 and a storage module 820 .
  • the task management module 810 is configured to execute step 202 and step 203 in the embodiment shown in FIG. 2 , or step 401 and step 404 in the embodiment shown in FIG. 4 , or step 601 in the embodiment shown in FIG. 6 .
  • the storage module 820 is used for storing the metadata of the candidate tasks involved in the execution process of the task management module 810. Specifically:
  • the task management module 810 is configured to perform task scheduling to select a candidate task, and actively send the metadata of the candidate task to the first storage area.
  • the storage module 820 is configured to store metadata of one or more candidate tasks for the task management module 810 to select candidate tasks.
  • the task management module 810 is also configured to execute step 204 in the embodiment shown in FIG. 2 .
  • the task management module 810 is also used to actively send the context of candidate tasks to the cache.
  • the storage module 820 is also used to store the context of the candidate tasks.
  • the task management module 810 is also configured to execute step 402 and step 405 in the embodiment shown in FIG. 4 .
  • the task management module 810 is also used to acquire the context of the candidate task and store the context of the candidate task in the hardware task scheduler.
  • the task management module 810 is also used to actively send the context of the candidate tasks stored in the hardware task scheduler to the cache.
  • the task management module 810 is also configured to execute step 207 in the embodiment shown in FIG. 2 , or step 408 in the embodiment shown in FIG. 4 , or step 606 in the embodiment shown in FIG. 6 .
  • the task management module 810 is configured to receive the metadata of the switched task sent by the CPU core, and store the metadata of the switched task in the hardware task scheduler.
  • the task management module 810 is configured to read the metadata of the switched task from the second storage area, and store the metadata of the switched task in the hardware task scheduler.
  • the hardware task scheduler 800 further includes an interface module 830 .
  • the interface module 830 is used to send the instructions sent by the task management module 810 when executing the above steps.
  • When the hardware task scheduler provided in the above embodiments executes task scheduling, the division into the above functional modules is used only as an example for illustration.
  • the above functions can be assigned to different functional modules according to requirements. That is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the hardware task scheduler provided by the above-mentioned embodiments belongs to the same idea as the task scheduling method embodiment, and its specific implementation process is detailed in the method embodiment, and will not be repeated here.
  • FIG. 9 shows a schematic diagram of a hardware structure of a hardware task scheduler 900 provided by an embodiment of the present application.
  • the hardware task scheduler 900 includes a task management processor 902 , a memory 904 and a connection line 906 .
  • the processor 902 and the memory 904 are connected to each other by a connection line 906 .
  • the memory 904 may be various types of storage media, such as static random access memory (SRAM).
  • the processor 902 may be a general-purpose processor.
  • a general-purpose processor can be a central processing unit (CPU).
  • the processor 902 may also be a dedicated processor.
  • a special purpose processor may be a processor specially designed to perform specific steps and/or operations.
  • the dedicated processor may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), etc.
  • the processor 902 may also be a combination of multiple processors.
  • the processor 902 may include at least one circuit, to execute the steps that the hardware task scheduler is responsible for executing in the task scheduling method provided in the foregoing embodiments.
  • the connecting line 906 may take various forms, for example on-chip interconnect wiring, silicon interposer (Si interposer) wiring, or printed circuit board (PCB) wiring.
  • the above-mentioned devices may be respectively arranged on independent chips, or at least partly or all of them may be arranged on the same chip. Whether each device is independently arranged on different chips or integrated and arranged on one or more chips often depends on the needs of product design.
  • the embodiments of the present application do not limit the specific implementation forms of the foregoing devices.
  • the hardware task scheduler 900 shown in FIG. 9 is merely exemplary. During implementation, the hardware task scheduler 900 may also include other components, which will not be listed here.
  • FIG. 10 shows a schematic diagram of a task scheduling system provided by an embodiment of the present application.
  • the task scheduling system includes a CPU 1030 and a hardware task scheduler 1010 .
  • the CPU 1030 and the hardware task scheduler 1010 are located on a bare chip (Die) 1000 .
  • the CPU 1030 includes one or more CPU cores.
  • CPU 1030 includes two CPU cores: CPU core 1031 and CPU core 1032.
  • the hardware task scheduler 1010 performs task scheduling for the CPU core 1031 to select a candidate task, and actively sends the metadata of the candidate task to the first storage area of the CPU core 1031.
  • the hardware task scheduler 1010 performs task scheduling for the CPU core 1032 to select a candidate task, and sends the metadata of the candidate task to the first storage area of the CPU core 1032.
  • hardware task scheduler 1010 resides within CPU 1030.
  • the hardware task scheduler is located outside the CPU 1030 and connected to the CPU 1030 through the on-chip bus 1020.
  • CPU 1030 may include only one CPU core.
  • For example, CPU 1030 includes only CPU core 1031.
  • The hardware task scheduler provides the task scheduling service for CPU core 1031. In this case, the hardware task scheduler provides the task scheduling service for CPU 1030.
  • FIG. 11 shows a schematic diagram of another task scheduling system provided by the embodiment of the present application.
  • the hardware task scheduler can provide task scheduling services for CPU cores in other Dies.
  • the task scheduling system includes multiple Dies.
  • the task scheduling system includes two Dies: Die 1100 and Die 1200.
  • Die 1100 includes CPU 1130.
  • Die 1200 includes CPU 1230.
  • CPU 1130 includes one or more CPU cores.
  • CPU 1230 includes one or more CPU cores.
  • CPU 1130 includes two CPU cores: CPU core 1131 and CPU core 1132.
  • CPU 1230 includes two CPU cores: CPU core 1231 and CPU core 1232.
  • the task scheduling system includes one or more hardware task schedulers.
  • the task scheduling system includes a hardware task scheduler 1110 .
  • the hardware task scheduler 1110 provides task scheduling for the CPU core 1131, CPU core 1132, CPU core 1231 and CPU core 1232 to select candidate tasks, and actively sends the metadata of the candidate tasks to the first storage area of the corresponding CPU core.
  • There are many ways to deploy the hardware task scheduler 1110 .
  • hardware task scheduler 1110 resides within CPU 1130 .
  • the hardware task scheduler 1110 is located outside the CPU 1130 and connected to the CPU 1130 through an on-chip bus 1120 .
  • the task scheduling system may also include a hardware task scheduler 1210 .
  • the hardware task scheduler 1210 resides within the CPU 1230 .
  • the hardware task scheduler 1210 is located outside the CPU 1230 and connected to the CPU 1230 through an on-chip bus 1220 .
  • the hardware task scheduler 1210 provides task scheduling for the CPU core 1231 and the CPU core 1232 to select candidate tasks, and actively sends the metadata of the candidate tasks to the first storage area of the CPU core 1231 or CPU core 1232.
  • the hardware task scheduler can also be deployed in a variety of ways. For example, multiple hardware task schedulers are deployed in Die, and each hardware task scheduler provides task scheduling services for a CPU core. This embodiment of the application does not enumerate all deployment methods one by one.

Abstract

Disclosed are a task scheduling method, a task scheduling system, and a hardware task scheduler, which belong to the field of computer technology. The hardware task scheduler performs task scheduling for a central processing unit (CPU) core to select a candidate task. After the candidate task is selected, the hardware task scheduler proactively sends the metadata of the candidate task to a first storage area and proactively sends the context of the candidate task to a cache. The first storage area is located in storage space whose access speed is faster than that of memory. When performing a task switch, the CPU core reads the metadata of the candidate task from the first storage area and reads the context of the candidate task from the cache, which reduces the latency for the CPU core to obtain the metadata and context of the candidate task and achieves low-latency task switching.

Description

Task scheduling method and system, and hardware task scheduler
This application claims priority to Chinese Patent Application No. 202110648382.9, entitled "Task scheduling method and system, and hardware task scheduler" and filed on June 10, 2021, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of computers, and in particular to a task scheduling method and system and a hardware task scheduler.
Background
In a multi-task system, a central processing unit (CPU) or CPU core schedules and switches among multiple tasks in order to execute different tasks in a time-shared manner. The latency of task scheduling and switching affects the real-time performance of task execution and, in turn, the performance of the entire system.
With a software-based task scheduling method, task scheduling and task switching are executed serially and synchronously, which consumes a long time. Moreover, after task scheduling returns a candidate task, the task switch may require access to memory to obtain information related to the candidate task, and memory access also consumes a long time.
Summary
This application provides a task scheduling method and system and a hardware task scheduler, which can achieve low-latency task switching to improve the real-time performance of task execution and the performance of the system.
According to a first aspect, a task scheduling system is provided. The task scheduling system includes a CPU core and a hardware task scheduler.
The hardware task scheduler is configured to perform task scheduling to select a candidate task, and to proactively send metadata of the candidate task to a first storage area.
The CPU core is configured to read the metadata of the candidate task from the first storage area to execute the candidate task.
The first storage area is located in a storage area whose access speed is faster than that of memory. The first storage area includes internal storage space of the CPU core and/or a cache of the CPU core.
Before executing a selected candidate task, a CPU core ordinarily needs to send a command to obtain the metadata of the candidate task. If the metadata is stored in memory, the CPU core spends considerable time reading it. In the task scheduling system provided in this application, after completing task scheduling, the hardware task scheduler does not wait for the CPU core to send an instruction requesting the metadata of the selected candidate task; instead, it proactively sends the metadata to a storage area whose access speed is faster than that of memory. The CPU core can therefore read the metadata faster. The task scheduling system provided in this application thus reduces the latency of metadata acquisition by the CPU core and achieves low-latency task switching.
According to the first aspect, in a first possible implementation of the first aspect, the metadata of the candidate task includes storage location information of the context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
The identifier of a candidate task is used to distinguish candidate tasks.
The context of a candidate task is the minimum data set required by a CPU core to run the task. When the task is interrupted or switched out, the CPU core saves the context of the task. When the task needs to run again, the CPU core reads the context of the task to restore its running environment.
According to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the hardware task scheduler is further configured to proactively send the context of the candidate task to the cache of the CPU core.
The context of a candidate task may be stored in memory, and fetching it from memory would consume considerable time. Because the hardware task scheduler proactively sends the context of the candidate task to the cache of the CPU core after completing task scheduling, the CPU core can read the context from the cache in time. Reading the context from the cache takes less time than reading it from memory, which further reduces the task switching latency.
According to the first aspect or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, the hardware task scheduler is further configured to proactively send the context of the candidate task that is stored in the hardware task scheduler to the cache of the CPU core.
According to the first aspect or any one of the foregoing possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the CPU core is further configured to send metadata of a switched-out task to a second storage area. The hardware task scheduler is further configured to read the metadata of the switched-out task from the second storage area and store it in the hardware task scheduler.
The second storage area is located in memory, or in a storage area whose access speed is faster than that of memory. The second storage area is different from the first storage area.
According to a second aspect, a task scheduling method is provided. The method includes: a hardware task scheduler performs task scheduling to select a candidate task, and proactively sends metadata of the candidate task to a first storage area.
The first storage area is located in a storage area whose access speed is faster than that of memory. The first storage area includes internal storage space of a CPU core and/or a cache of the CPU core. The CPU core is configured to execute the candidate task.
According to the second aspect, in a first possible implementation of the second aspect, the metadata of the candidate task includes storage location information of the context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
According to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the hardware task scheduler proactively sends the context of the candidate task to the cache of the CPU core.
According to the second aspect or the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the hardware task scheduler proactively sends the context of the candidate task that is stored in the hardware task scheduler to the cache of the CPU core.
According to the second aspect or any one of the foregoing possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the hardware task scheduler reads metadata of a switched-out task from a second storage area and stores the metadata of the switched-out task in the hardware task scheduler.
The second storage area is located in memory, or in a storage area whose access speed is faster than that of memory. The second storage area is different from the first storage area.
According to a third aspect, a task scheduling method is provided. The method includes: a CPU core reads metadata of a candidate task from a first storage area to execute the candidate task.
The first storage area is located in a storage area whose access speed is faster than that of memory. The first storage area includes internal storage space of the CPU core and/or a cache of the CPU core.
According to the third aspect, in a first possible implementation of the third aspect, the metadata of the candidate task includes storage location information of the context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
According to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the CPU core sends metadata of a switched-out task to a second storage area.
The second storage area is located in memory, or in a storage area whose access speed is faster than that of memory. The second storage area is different from the first storage area.
According to a fourth aspect, a hardware task scheduler is provided. The hardware task scheduler includes a memory and a processor.
The memory is configured to store metadata of one or more candidate tasks.
The processor is configured to perform task scheduling to select a candidate task, and to proactively send the metadata of the selected candidate task that is stored in the memory to a first storage area.
The first storage area is located in a storage area whose access speed is faster than that of memory. The first storage area includes internal storage space of a CPU core and/or a cache of the CPU core. The CPU core is configured to execute the selected candidate task.
According to the fourth aspect, in a first possible implementation of the fourth aspect, the metadata of the selected candidate task includes storage location information of the context of the selected candidate task, an identifier of the selected candidate task, and a state of the selected candidate task.
According to the fourth aspect or the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the processor is further configured to send the context of the selected candidate task to the cache of the CPU core.
According to the fourth aspect or the first or second possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, the memory is further configured to store the context of the selected candidate task.
According to the third possible implementation of the fourth aspect, in a fourth possible implementation of the fourth aspect, the processor is further configured to send the context of the selected candidate task that is stored in the memory to the cache of the CPU core.
According to the fourth aspect or any one of the foregoing possible implementations of the fourth aspect, in a fifth possible implementation of the fourth aspect, the processor is further configured to read metadata of a switched-out task from a second storage area and store the metadata of the switched-out task in the memory.
The second storage area is located in memory, or in a storage area whose access speed is faster than that of memory. The second storage area is different from the first storage area.
According to a fifth aspect, a hardware task scheduler is provided. The hardware task scheduler includes a storage module and a task management module.
The storage module is configured to store metadata of one or more candidate tasks.
The task management module is configured to perform task scheduling to select a candidate task, and to proactively send the metadata of the selected candidate task that is stored in the storage module to a first storage area.
The first storage area is located in a storage area whose access speed is faster than that of memory. The first storage area includes internal storage space of a CPU core and/or a cache of the CPU core. The CPU core is configured to execute the selected candidate task.
According to the fifth aspect, in a first possible implementation of the fifth aspect, the metadata of the selected candidate task includes storage location information of the context of the selected candidate task, an identifier of the selected candidate task, and a state of the selected candidate task.
According to the fifth aspect or the first possible implementation of the fifth aspect, in a second possible implementation of the fifth aspect, the task management module is further configured to send the context of the selected candidate task to the cache of the CPU core.
According to the fifth aspect or the first or second possible implementation of the fifth aspect, in a third possible implementation of the fifth aspect, the storage module is further configured to store the context of the selected candidate task.
According to the third possible implementation of the fifth aspect, in a fourth possible implementation of the fifth aspect, the task management module is further configured to send the context of the selected candidate task that is stored in the storage module to the cache of the CPU core.
According to the fifth aspect or any one of the foregoing possible implementations of the fifth aspect, in a fifth possible implementation of the fifth aspect, the task management module is further configured to read metadata of a switched-out task from a second storage area and store the metadata of the switched-out task in the storage module.
The second storage area is located in memory, or in a storage area whose access speed is faster than that of memory. The second storage area is different from the first storage area.
Brief Description of Drawings
FIG. 1 is a schematic diagram of an implementation environment involved in an embodiment of this application;
FIG. 2 is a flowchart of a task scheduling method according to an embodiment of this application;
FIG. 3 is a sequence diagram corresponding to the task scheduling method shown in FIG. 2;
FIG. 4 is a flowchart of another task scheduling method according to an embodiment of this application;
FIG. 5 is a sequence diagram corresponding to the task scheduling method shown in FIG. 4;
FIG. 6 is a flowchart of another task scheduling method according to an embodiment of this application;
FIG. 7 is a sequence diagram corresponding to the task scheduling method shown in FIG. 6;
FIG. 8 is a schematic diagram of the logical structure of a hardware task scheduler according to an embodiment of this application;
FIG. 9 is a schematic diagram of the hardware structure of a hardware task scheduler according to an embodiment of this application;
FIG. 10 is a schematic diagram of a task scheduling system according to an embodiment of this application;
FIG. 11 is a schematic diagram of another task scheduling system according to an embodiment of this application.
Detailed Description
To make the principles, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
An embodiment of this application provides a hardware-based task scheduling method. Refer to FIG. 1, which shows a schematic diagram of an implementation environment involved in an embodiment of this application. As shown in FIG. 1, the implementation environment includes a computer device 100. The computer device 100 is a device that runs a multi-task system and may be of various types; for example, it may be a mobile phone, a personal computer, or a server. The computer device 100 includes a CPU 110 and a memory 120, which are interconnected by a bus. The CPU 110 includes one or more CPU cores; for example, CPU 110 includes two CPU cores: CPU core 111 and CPU core 112. A CPU core includes a control unit (CU), an arithmetic logic unit (ALU), registers, a level 1 cache (L1 cache), and a level 2 cache (L2 cache). For example, CPU core 111 includes CU 111-1, ALU 111-2, registers 111-3, L1 cache 111-4, and L2 cache 111-5; CPU core 112 includes CU 112-1, ALU 112-2, registers 112-3, L1 cache 112-4, and L2 cache 112-5. The CPU further includes a level 3 cache (L3 cache); for example, CPU 110 includes L3 cache 113. Some computer devices further include a level 4 cache (L4 cache); for example, CPU 110 includes L4 cache 114. The L1, L2, L3, and L4 caches are collectively referred to as caches. The highest-level cache is called the last level cache (LLC). For example, when the CPU does not include an L4 cache, the L3 cache is the LLC; when the L4 cache is the highest-level cache of the CPU, the L4 cache is the LLC. The CU, ALU, and caches are interconnected by an on-chip bus.
The computer device 100 may further include other components, for example a network interface card, a display, a mouse, and a keyboard.
Registers are accessed faster than memory: the CU/ALU takes less time to read data from a register than from memory, and less time to write data to a register than to memory. A cache is slower to access than registers but faster than memory, and the lower the cache level, the faster its access speed. For example, the L1 cache is faster than the L2 cache, the L2 cache is faster than the L3 cache, and the L3 cache is faster than the L4 cache.
A CPU core executes a task by having the CU execute instructions; if an instruction involves data computation, the ALU performs the computation. During instruction execution, the CU reads instructions from registers, and during computation the ALU reads data from registers. Register storage space is very small, so the required instruction or data may not be present in a register. If it is not, the CU fetches the instruction or data from memory. Because the cache is faster than memory, when issuing a memory access request the CU first checks whether the cache holds the required instruction or data. If it does, the CU reads it from the cache; if it does not, the instruction or data is sent from memory to the cache and then from the cache to a register. Therefore, when the data that a CPU core needs exists only in memory, the read latency is high; if the data exists in a cache, the latency is lower, and the lower the cache level, the lower the latency.
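The read path described above can be sketched as a small model. This is an illustrative simulation with made-up relative costs, not measured hardware latencies; it only shows why a hit in a lower-level cache is cheaper than a miss, and why a miss fills the line into the cache so the next read of the same address is fast.

```c
#include <stdbool.h>

/* Illustrative model: relative access costs for the storage hierarchy
 * described above (invented numbers, not hardware measurements). */
enum { COST_L1 = 4, COST_L2 = 12, COST_L3 = 40, COST_MEM = 200 };

/* Returns the modeled cost of one read: the CU checks each cache level
 * in order and falls back to memory; any hit below L1 (or a memory
 * access) fills the line into L1 for subsequent reads. */
int read_cost(bool in_l1[], bool in_l2[], bool in_l3[], int addr) {
    if (in_l1[addr]) return COST_L1;
    if (in_l2[addr]) { in_l1[addr] = true; return COST_L2; }
    if (in_l3[addr]) { in_l1[addr] = true; return COST_L3; }
    in_l1[addr] = true;           /* fill from memory into the cache */
    return COST_MEM;
}
```

Running the model twice on the same address shows the effect the embodiments exploit: data that has already been pushed into a cache is read at cache cost rather than memory cost.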
A CPU core may further include internal storage space. For example, CPU core 111 includes internal storage space 111-6, and CPU core 112 includes internal storage space 112-6. When a CPU core includes internal storage space, that space is used only to store data or instructions needed by that CPU core, and its contents are not modified by other CPU cores. For example, the contents of internal storage space 111-6 are not modified by CPU core 112, and the contents of internal storage space 112-6 are not modified by CPU core 111. Depending on the implementation, the contents stored in the internal storage space may be pushed by the L1 cache or by the LLC. The internal storage space is accessed faster than a cache, so if the data a CPU core needs exists in its internal storage space, the read latency is also low.
When a task switch is needed, a CPU core conventionally executes task scheduling instructions to select a candidate task. After determining the candidate task, the CPU core obtains the metadata and context of the candidate task and then performs a task switch to execute it. The metadata or context of the candidate task may reside in memory, so acquiring them incurs high latency. In the technical solution provided in the embodiments of this application, a hardware task scheduler is added to the computer device, and this hardware task scheduler is responsible for performing task scheduling to select candidate tasks. The hardware task scheduler can complete task scheduling in parallel: for example, while the CPU core is still executing instructions of the current task, the hardware task scheduler performs task scheduling concurrently. The hardware task scheduler can also complete task scheduling in advance: for example, the scheduling is already finished while the CPU core is still executing the current task. Because scheduling is completed in advance or in parallel, the time consumed by task scheduling becomes negligible. Moreover, after completing task scheduling, the hardware task scheduler does not wait for the CPU core to send an instruction requesting the metadata of the candidate task; instead, it proactively sends the metadata of the candidate task to a designated storage area, so that the metadata reaches the designated storage area earlier. The designated storage area is located in storage space whose access speed is faster than that of memory: for example, in the internal storage space of the CPU core, or in a cache. When located in a cache, the designated storage area may be in a particular level of a multi-level cache, for example the L1 cache or the L2 cache. Because the hardware task scheduler proactively sends the metadata of the candidate task to this fast-access designated storage area, the CPU core can obtain the metadata from it in time, which reduces the metadata acquisition latency. If the designated storage area is in the internal storage space of the CPU core, the latency of obtaining the metadata is lowest; if the designated storage area is in a cache, the lower the cache level, the lower the latency. In addition, after completing task scheduling, the hardware task scheduler proactively sends the context of the candidate task to the cache, so the CPU core can obtain the context from the cache in time, which reduces the context acquisition latency. By eliminating the task scheduling time from the switch and reducing the latency of obtaining the candidate task's metadata and context, the technical solution provided in the embodiments of this application reduces the total latency of task scheduling and switching. For details of the solutions in the embodiments of this application, refer to the following description.
Refer to FIG. 2, which shows a flowchart of a task scheduling method according to an embodiment of this application. In this procedure, the hardware task scheduler first receives a task scheduling instruction from the CPU core and then performs task scheduling to select a candidate task. After selecting the candidate task, the hardware task scheduler proactively sends the metadata of the candidate task to a designated storage area and proactively sends the context of the candidate task to the cache. The designated storage area is located in a storage area whose access speed is faster than that of memory; for example, in the cache, or in the internal storage space of the CPU core. Before performing the task switch, the CPU core obtains the metadata of the candidate task from the fast-access designated storage area and obtains the context of the candidate task from the cache. The procedure of this task scheduling method includes the following steps:
Step 201: The CPU core sends a message to the hardware task scheduler to notify it to perform task scheduling.
While executing the current task, the CPU core sends a message to the hardware task scheduler so that the scheduler can start performing task scheduling to select a candidate task earlier. For example, the CPU core sends a task scheduling instruction to the hardware task scheduler to notify it to start task scheduling. The CPU core may decide when to send the task scheduling instruction through various mechanisms. For example, the CPU core may monitor the running time of the current task and send the instruction when that time exceeds a threshold; it may detect a special instruction in the code of the current task and send the task scheduling instruction upon detection; or it may send the instruction when the current task finishes executing, for example when the execution time of the current task expires.
The CPU core sends the task scheduling instruction to the hardware task scheduler. For example, the CPU core writes its CPU core identifier into a storage area inside the hardware task scheduler, which triggers the scheduler to perform task scheduling and select a candidate task for the CPU core represented by that identifier.
After sending the task scheduling instruction, the CPU core continues executing the current task and performs pre-scheduling processing, for example checking the state of the current task. If the state of the current task does not allow the task to be switched out, the CPU core continues executing it. If the task may be switched out, the CPU core performs steps 205-208 to obtain the candidate task and perform the task switch when it detects a switch opportunity. The CPU core can detect the switch opportunity in various ways, for example by checking whether the execution time of the current task has expired or whether a particular special instruction has been reached.
While the CPU core continues executing the current task or performs pre-scheduling processing, the hardware task scheduler performs steps 202-204 in parallel to complete task scheduling, select a candidate task, and send the relevant information of the candidate task.
The CPU core performs the task switch in order to execute the candidate task selected by the hardware task scheduler. The task that the CPU core is running before the switch is referred to as the switched-out task.
Step 202: The hardware task scheduler performs task scheduling to select a candidate task.
The hardware task scheduler holds one or more candidate task queues. Each candidate task queue contains the metadata of one or more candidate tasks. The metadata includes the identifier of the candidate task, the state of the candidate task, and the storage location information of the context of the candidate task.
The identifier of a candidate task is used to distinguish different candidate tasks.
The state of a candidate task includes ready, running, new, blocked, terminated, and so on. The ready state indicates that the task has obtained all necessary resources except a CPU core and is waiting to be executed by a CPU core. The hardware task scheduler selects candidate tasks for the CPU core from among the candidate tasks that are in the ready state.
The context of a candidate task is the minimum data set required by a CPU core to run the task. When the task is interrupted or switched out, the CPU saves the context of the task; when the task needs to run again, the CPU reads the context to restore the task's running environment. The context of a candidate task is stored in memory. The storage location information of the context includes the memory address at which the context is stored, and may further include the length of the context. The length can be expressed in various ways, for example as a byte offset relative to the storage location of the context in memory, or as the number of cache lines the context occupies.
The metadata of a candidate task may further include the CPU core associated with the task. The associated CPU core expresses the affinity between the task and that CPU core, that is, which CPU core or cores the task prefers to be executed on. After receiving a scheduling instruction from a CPU core, the hardware task scheduler preferentially selects a candidate task from among the candidate tasks associated with that CPU core. The hardware task scheduler may also express the affinity between candidate tasks and CPU cores in other ways; for example, it may maintain one queue per CPU core and store the candidate tasks with affinity for a CPU core in the queue corresponding to that core.
The metadata may further include information such as the priority and time-slice threshold of the candidate task.
The hardware task scheduler performs task scheduling to select a candidate task. It may preferentially choose among the candidate tasks with affinity for the CPU core, or it may choose among the candidate tasks that specify no CPU affinity. The hardware task scheduler completes task scheduling by executing a task scheduling algorithm, which may be of various types, for example shortest-time-first scheduling, round-robin, weighted scheduling, or priority scheduling.
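The metadata fields and one of the selection policies described in step 202 can be sketched as follows. The struct layout, field widths, and the `pick_candidate` policy (priority scheduling with affinity preference) are assumptions made for illustration; the patent names the fields but fixes neither their encoding nor a particular scheduling algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the candidate-task metadata of step 202. */
enum task_state { TASK_NEW, TASK_READY, TASK_RUNNING, TASK_BLOCKED, TASK_TERMINATED };

struct task_metadata {
    uint32_t task_id;        /* identifier distinguishing candidate tasks */
    enum task_state state;   /* only TASK_READY tasks are schedulable     */
    uint64_t ctx_addr;       /* memory address of the task context        */
    uint32_t ctx_cachelines; /* context length, in cache lines            */
    uint32_t affinity_core;  /* preferred CPU core, or UINT32_MAX = any   */
    uint32_t priority;       /* larger value = higher priority            */
};

/* One possible policy: among READY tasks, prefer those bound to `core`,
 * otherwise fall back to tasks with no affinity; within each group pick
 * the highest priority. Returns an index into q, or -1 if none runnable. */
int pick_candidate(const struct task_metadata *q, size_t n, uint32_t core) {
    int best = -1, best_affine = 0;
    for (size_t i = 0; i < n; i++) {
        if (q[i].state != TASK_READY) continue;
        int affine = (q[i].affinity_core == core);
        if (!affine && q[i].affinity_core != UINT32_MAX) continue;
        if (best < 0 || affine > best_affine ||
            (affine == best_affine && q[i].priority > q[best].priority)) {
            best = (int)i;
            best_affine = affine;
        }
    }
    return best;
}
```

In this sketch an affine ready task wins over a higher-priority unbound one, matching the text's "preferentially select among tasks associated with the CPU core".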
Step 203: The hardware task scheduler proactively sends the metadata of the candidate task to the designated storage area.
After selecting the candidate task, the hardware task scheduler proactively sends the metadata of the candidate task to the designated storage area. The designated storage area is located in a storage area whose access speed is faster than that of memory. For example, the designated storage area may be the internal storage area of the CPU core, or the cache of the CPU core. The cache of a CPU core may be inside or outside the core: for example, the L1 or L2 cache inside the core, or the L3 or L4 cache outside it. By proactively sending the metadata of the candidate task to the designated storage area, the hardware task scheduler enables the CPU core to obtain the metadata faster.
The hardware task scheduler can send the metadata of the candidate task to the designated storage area in various ways. For example, it may send the metadata directly to the designated storage area over a dedicated channel. Alternatively, it may issue an instruction on the bus, and the bus executes the instruction according to the bus protocol to deliver the metadata to the designated storage area. For example, when the bus is an Advanced RISC Machine (ARM) Advanced Microcontroller Bus Architecture (AMBA) bus, the hardware task scheduler issues a cache stashing instruction to place the metadata of the candidate task into the cache. By default, the cache stashing instruction sends the metadata to the L3 cache. The parameters carried by the instruction include the metadata of the candidate task and the memory address allocated for the candidate task. Following the bus protocol, the bus executes the instruction and stores the metadata into the cache matching that memory address. The hardware task scheduler may also set the cache level in the instruction parameters to send the metadata to a cache of a specified level, for example the L2 cache. When the computer device includes multiple CPU cores, the hardware task scheduler may also set a CPU core identifier in the instruction parameters to send the metadata to a cache accessible by the CPU core specified by that identifier. The hardware task scheduler sets the memory address for the CPU core in its driver, and the CPU core reads the address when the computer device starts. When the CPU core reads the metadata of the candidate task via that memory address, it reads the metadata directly from the cache.
Step 204: The hardware task scheduler proactively sends the context of the candidate task to the cache.
After selecting the candidate task, the hardware task scheduler proactively sends the context of the candidate task to the cache, so that the CPU core can obtain the context from the cache in time.
Similarly to step 203, the hardware task scheduler can issue an instruction to send the context of the candidate task to the cache. The parameters carried by the instruction include the memory address of the context. When the bus executes the instruction, it sends the context stored at that memory address to the cache. Likewise, the hardware task scheduler may set the cache level in the instruction parameters to send the context to a cache of a specified level. When the instruction is executed, if the context is already stored in the cache of the specified level, the bus does not resend the context stored at the memory address. If the context is stored in the LLC and the instruction requires the context to be sent to a higher-level cache (for example, the L2 cache), the context is pushed from the LLC to the higher-level cache. Because the candidate task's metadata stored in the hardware task scheduler includes the storage address of the context, the scheduler can obtain the memory address of the context. The metadata of the candidate task is sent to the cache by the hardware task scheduler, so once the CPU core obtains the metadata, it also obtains the memory address of the context, and it reads the context based on that address. Because the context has already been sent to the cache, the CPU core can read it directly from the cache.
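The push behaviour of steps 203-204 can be modeled in miniature. This toy model stands in for a bus operation such as AMBA cache stashing; the `toy_cache` structure and function names are invented for illustration. It captures one detail from the text: if the line is already present in the cache, nothing is re-sent.

```c
#include <stdbool.h>

/* Toy model of "push data to cache": a cache is a tagged copy of memory
 * lines; stashing copies a line in only if it is absent. Returns true
 * when data was actually moved. */
#define LINES 8

struct toy_cache {
    bool present[LINES];
    int  data[LINES];
};

bool stash_line(struct toy_cache *c, const int *memory, int line) {
    if (c->present[line])
        return false;              /* already cached: nothing re-sent */
    c->data[line] = memory[line];  /* push the line from memory        */
    c->present[line] = true;
    return true;
}
```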
Step 205: The CPU core reads the metadata of the candidate task.
While the CPU core was executing the current task and the pre-scheduling processing, the hardware task scheduler already selected a candidate task by performing task scheduling and sent the candidate task's metadata to the designated storage area. The CPU core can therefore read the metadata of the candidate task from the designated storage area. Because the designated storage area is located in the internal storage space of the CPU core or in the cache, the CPU core can obtain the metadata of the candidate task quickly.
Step 206: The CPU core reads the context of the candidate task.
After obtaining the metadata of the candidate task, the CPU core reads the storage location information of the context from the metadata and reads the context based on that information. Because the hardware task scheduler has sent the context of the candidate task to the cache, the CPU core can read the context from the cache.
Step 207: The CPU core stores the metadata of the switched-out task into the hardware task scheduler.
The CPU core stores the metadata of the switched-out task into the hardware task scheduler so that the scheduler can place the metadata into a candidate task queue. In subsequent task scheduling, the hardware task scheduler can select the switched-out task again as a candidate task, so that the switched-out task can be executed by a CPU core again. The CPU core can store the metadata of the switched-out task into the hardware task scheduler in various ways, for example by either of the following two implementations (207A and 207B).
207A: The CPU core sends the metadata of the switched-out task to the hardware task scheduler, and the hardware task scheduler stores the metadata into a candidate task queue.
The CPU core sends the metadata of the switched-out task to the hardware task scheduler. For example, the CPU core writes the metadata of the switched-out task into a storage area inside the hardware task scheduler, which prompts the scheduler to read the metadata from that storage area and store it into a candidate task queue. In subsequent task scheduling, the hardware task scheduler can select the switched-out task again, so that it is executed by a CPU core again.
207B: The CPU core sends the metadata of the switched-out task to a designated storage area; the hardware task scheduler reads the metadata from that storage area and stores it into a candidate task queue.
The CPU core sends the metadata of the switched-out task to a designated storage area. This designated storage area may be located in a storage area whose access speed is faster than that of memory: for example, the internal storage space of the CPU core, or the cache of the CPU core. The cache of a CPU core may be inside or outside the core: for example, the L1 or L2 cache inside the core, or the L3 or L4 cache outside it. Similarly to step 203, the CPU core may send the metadata of the switched-out task to the designated storage area over a dedicated channel, or it may issue an instruction that causes the metadata to be sent there. Because saving the metadata of the switched-out task does not affect execution of the candidate task, this designated storage area may also be located in memory, in which case the CPU core writes the metadata directly into the memory-resident storage area.
The CPU core's write of the switched-out task's metadata into the designated storage area triggers the hardware task scheduler to read the metadata. Alternatively, the hardware task scheduler monitors the designated storage area; once it finds task metadata stored there, it reads the metadata from the designated storage area. After reading the metadata, the hardware task scheduler stores it into its candidate task queue. In subsequent task scheduling, the scheduler can select the task again so that it is executed by a CPU core again.
This designated storage area and the designated storage area of step 203 may be located in the same storage space: for example, both in the L2 cache, or both in the internal storage space of the CPU core. They may also be located in different storage spaces: for example, this area in the L3 cache and the area of step 203 in the L2 cache, or this area in memory and the area of step 203 in the L2 cache.
This designated storage area and the designated storage area of step 203 are different storage areas. The CPU core writes the metadata of the switched-out task into this area and the hardware task scheduler reads it from this area, whereas the hardware task scheduler writes the metadata of the selected candidate task into the area of step 203 and the CPU core reads it from there. To distinguish the two, the embodiments of this application refer to the designated storage area of step 203 as the first storage area and the designated storage area of step 207B as the second storage area.
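The division of labour between the two designated areas can be sketched as a pair of one-slot mailboxes. The mailbox representation is an assumption made for illustration (the patent does not specify the layout of either area); it shows only the direction of reads and writes: the scheduler produces into the first area and consumes from the second, while the CPU core does the opposite.

```c
#include <stdbool.h>
#include <stdint.h>

/* One-slot mailbox standing in for a designated storage area. */
struct mailbox {
    bool full;
    uint32_t task_id;   /* stand-in for the full metadata record */
};

bool mailbox_put(struct mailbox *m, uint32_t task_id) {
    if (m->full) return false;   /* previous record not yet consumed */
    m->task_id = task_id;
    m->full = true;
    return true;
}

bool mailbox_take(struct mailbox *m, uint32_t *task_id) {
    if (!m->full) return false;
    *task_id = m->task_id;
    m->full = false;
    return true;
}
```

With two such mailboxes, the scheduler puts the candidate's metadata into the first and takes the switched-out task's metadata from the second; the CPU core takes from the first and puts into the second.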
Step 208: The CPU core performs the task switch to run the candidate task.
After obtaining the context of the candidate task, the CPU core switches contexts to execute the candidate task.
Before performing the task switch, the CPU core performs pre-switch processing, for example saving the running state of the switched-out task and the memory base address of that task. Modern operating systems usually adopt a virtual memory management mechanism. For example, if a CPU core has 32 address lines, it can access 4 gigabytes (GB) of storage space, and the addresses of this 4 GB storage space are called physical addresses. Under virtual memory management, each task (for example, a process) has its own independent 4 GB address space, whose addresses are called virtual addresses. When executing a task, the CPU core needs to translate the task's virtual addresses into physical addresses in order to access the storage units corresponding to those physical addresses. The memory management unit (MMU) is responsible for translating a task's virtual addresses into physical addresses, and this address translation depends on the memory base address of the task.
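A minimal sketch of the translation dependency mentioned above, assuming the simplest base-plus-offset scheme. Real MMUs walk per-task page tables rather than adding a single base address; the point here is only that translation is per-task state that must be saved and restored along with the task at every switch.

```c
#include <stdint.h>

/* Per-task addressing state (illustrative names, not from the patent). */
struct task_addressing {
    uint64_t mem_base;   /* physical base address of this task's space */
    uint64_t limit;      /* size of the task's address space           */
};

/* Base-plus-offset translation: returns the physical address, or 0 for
 * an out-of-range virtual address. */
uint64_t translate(const struct task_addressing *t, uint64_t vaddr) {
    if (vaddr >= t->limit)
        return 0;
    return t->mem_base + vaddr;
}
```

Two tasks with different `mem_base` values map the same virtual address to different physical addresses, which is why the base address belongs to the switched state.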
In summary, the sequence diagram of the task scheduling method provided in this embodiment is shown in FIG. 3. The CPU core sends the task scheduling instruction while executing the current task, so while the CPU core continues executing the current task, the hardware task scheduler can perform task scheduling in parallel to select a candidate task; the time consumed by task scheduling thus becomes negligible. After completing task scheduling, the hardware task scheduler does not wait for the CPU core to send an instruction to trigger the sending of the candidate task's metadata; instead, it proactively sends the metadata to the faster-access first storage area, so that the CPU core can read the metadata from the first storage area in time, which reduces the metadata acquisition latency. After completing task scheduling, the hardware task scheduler also proactively sends the candidate task's context to the cache, so that the CPU core can read the context from the cache in time, which reduces the context acquisition latency. The task scheduling method provided in this embodiment therefore reduces the latency of task scheduling and switching.
FIG. 4 and FIG. 6 are flowcharts of two other task scheduling methods provided by embodiments of this application. For details not disclosed in the method embodiments shown in FIG. 4 and FIG. 6, refer to the method embodiment shown in FIG. 2.
Refer to FIG. 4, which shows a flowchart of another task scheduling method according to an embodiment of this application. In this procedure, the hardware task scheduler first performs task scheduling to select a candidate task, reads the context of the candidate task in advance, and stores the context inside the hardware task scheduler. When it receives a task scheduling instruction from the CPU core, the hardware task scheduler sends the metadata of the candidate task to the first storage area and sends the context of the candidate task to the cache. Before performing the task switch, the CPU core obtains the metadata of the candidate task from the faster-access first storage area and obtains the context from the cache. The procedure of this task scheduling method includes the following steps:
Step 401: The hardware task scheduler performs task scheduling to select a candidate task.
Step 402: The hardware task scheduler obtains the context of the candidate task and stores the context in the hardware task scheduler.
After selecting the candidate task, the hardware task scheduler obtains the context of the candidate task and stores it internally. For example, the scheduler issues a read instruction on the bus to read out the context of the candidate task and store it in the scheduler. The parameters carried by the read instruction include the storage location information of the context.
Step 403: The CPU core sends a message notifying the hardware task scheduler to perform task scheduling.
The CPU core sends a message notifying the hardware task scheduler to perform task scheduling. For example, the CPU core sends a task scheduling instruction to the hardware task scheduler. After sending the instruction, the CPU core continues executing the current task or the pre-scheduling processing.
The hardware task scheduler already completed task scheduling in step 401. Upon receiving this instruction, it can therefore directly choose a candidate task for this CPU core from among the pre-selected candidate tasks, for example one with affinity for this CPU core. The hardware task scheduler then performs steps 404 and 405 to send the metadata and context of the candidate task.
Step 404: The hardware task scheduler sends the metadata of the candidate task to the first storage area.
Step 405: The hardware task scheduler sends the context of the candidate task to the cache.
The hardware task scheduler already stored the context in the scheduler in step 402. The context is also stored in memory, and it may also be stored in the cache of the CPU core. If, after the context was stored in the scheduler, the copy stored elsewhere was modified, the copy stored in the scheduler is invalid. If the copy stored in the scheduler is invalid, the scheduler cannot directly send it to the cache of the CPU core. Before sending the context, the hardware task scheduler therefore needs to determine whether the context is valid. There are various ways to do so. For example, the scheduler sets a flag for each context it stores; the external storage (memory or cache) updates the flag after modifying the context, and the scheduler checks the flag to determine the context's validity. Alternatively, the scheduler determines the context's validity by observing bus coherence.
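The validity check of step 405 can be sketched with the flag-based variant. The structure and callback names here are hypothetical; the patent equally allows detecting invalidation through bus coherence instead of an explicit flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scheduler-internal copy of a context plus its validity flag. */
struct cached_context {
    uint64_t ctx_addr;   /* memory address of the authoritative copy */
    bool     valid;      /* cleared when that copy is modified       */
};

/* Invoked when memory or a cache modifies the context at `addr`. */
void on_external_write(struct cached_context *c, uint64_t addr) {
    if (c->ctx_addr == addr)
        c->valid = false;
}

/* true: send the scheduler's internal copy (step 405, valid case);
 * false: fall back to the step-204 path and push the copy held in
 * memory or cache instead. */
bool can_send_internal_copy(const struct cached_context *c) {
    return c->valid;
}
```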
When the context of the candidate task is valid, the hardware task scheduler sends the context stored inside the scheduler to the cache. In step 203, the hardware task scheduler sends metadata stored inside the scheduler to the cache; sending the internally stored context to the cache is done similarly.
When the context is invalid, the hardware task scheduler executes an instruction similar to that of step 204 to send the context of the candidate task to the cache. In this case, the context sent to the cache is not the copy stored in the scheduler; the context sent to the cache may be stored in memory, or it may already be stored in the cache.
Step 406: The CPU core reads the metadata of the candidate task.
While the CPU core was executing the current task and the pre-scheduling processing, the hardware task scheduler already sent the metadata of the candidate task to the first storage area. The CPU core can therefore read the metadata of the candidate task from the first storage area in order to switch to and execute the candidate task.
Step 407: The CPU core reads the context of the candidate task.
After obtaining the metadata of the candidate task, the CPU core reads the storage location information of the context from the metadata and reads the context based on that information. Because the context has already been sent to the cache, the CPU core reads the context from the cache.
Step 408: The CPU core stores the metadata of the switched-out task into the hardware task scheduler.
The CPU core can store the metadata of the switched-out task into the hardware task scheduler in various ways, for example by implementation 408A or implementation 408B. Implementation 408A is similar to implementation 207A, and implementation 408B is similar to implementation 207B; details are not repeated here.
After storing the switched-out task into a candidate task queue, the hardware task scheduler can perform task scheduling again to pre-select a candidate task, and it obtains the context of that candidate task. When it next receives a scheduling instruction from a CPU core, the hardware task scheduler directly sends the metadata and context of the pre-selected candidate task to the CPU core.
Step 409: The CPU core performs the task switch to run the candidate task.
After obtaining the context of the candidate task, the CPU core switches contexts to execute the candidate task.
In summary, the sequence diagram of the task scheduling method provided in this embodiment is shown in FIG. 5. Before the CPU core sends the task scheduling instruction, the hardware task scheduler has already completed task scheduling. Upon receiving the CPU core's task scheduling instruction, the scheduler can therefore immediately send the metadata of the candidate task to the first storage area and the context of the candidate task to the cache of the CPU core. Completing task scheduling before the CPU core sends the instruction eliminates the time consumed by task scheduling entirely. Because scheduling is completed in advance, the scheduler can send the candidate task's metadata and context immediately upon receiving the instruction. This method delivers the metadata and context to the faster-access storage areas sooner, and it better ensures that the CPU core can immediately obtain the candidate task's metadata and context from those faster-access storage areas. The task scheduling method provided in this embodiment reduces the latency of task scheduling and switching.
Refer to FIG. 6, which shows a flowchart of another task scheduling method according to an embodiment of this application. In this procedure, the hardware task scheduler monitors the first storage area of the CPU core. If the area contains no metadata of a pending candidate task, the hardware task scheduler performs task scheduling to select a candidate task for the CPU core and proactively sends the metadata of the candidate task to the first storage area. When a task switch is needed, the CPU core no longer sends a task scheduling instruction; instead, it reads the metadata of the candidate task directly from the first storage area and prefetches the context of the candidate task into the cache of the CPU core. The CPU core reads the context of the candidate task from the cache to perform the task switch. The procedure of this task scheduling method includes the following steps:
Step 601: The hardware task scheduler monitors the first storage area of the CPU core. If the area contains no pending candidate task, the hardware task scheduler performs task scheduling to select a candidate task and proactively sends the metadata of the candidate task to the area.
Step 602: The CPU core reads the metadata of the candidate task from the first storage area.
When detecting that a task switch is needed, the CPU core no longer sends a task scheduling instruction; instead, it reads the metadata of the candidate task directly from the first storage area.
After obtaining the metadata of the candidate task, the CPU core performs step 603. The CPU core then continues executing the current task or performs the pre-switch operations. When the pre-switch operations are complete, the CPU core performs step 604.
Step 603: The CPU core prefetches the context of the candidate task, causing the context to be sent to the cache.
After obtaining the metadata of the candidate task, the CPU core reads the storage location information of the context from the metadata and prefetches the context based on that information. For example, the CPU core issues a prefetch instruction on the bus, causing the context of the candidate task to be sent to the cache. The parameters carried by the prefetch instruction include the storage location information of the context.
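A software analogue of the prefetch in step 603, using the GCC/Clang builtin `__builtin_prefetch`. This is an illustration on ordinary loads rather than the bus-level prefetch of a task context; the builtin is only a hint, so the function's result is identical whether or not the prefetch takes effect.

```c
#include <stddef.h>

/* Issue prefetch hints for a context buffer, then read it, mirroring
 * the prefetch-early/read-later split between steps 603 and 604.
 * __builtin_prefetch(addr, 0 = read, 3 = high temporal locality). */
long prefetch_then_sum(const long *ctx, size_t n) {
    for (size_t i = 0; i < n; i += 8)     /* roughly one hint per line */
        __builtin_prefetch(&ctx[i], 0, 3);
    long sum = 0;                         /* the later "step 604" reads */
    for (size_t i = 0; i < n; i++)
        sum += ctx[i];
    return sum;
}
```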
Step 604: The CPU core reads the context of the candidate task.
After completing the pre-switch operations, the CPU core reads the context based on the above storage location information. Because the context has already been sent to the cache, the CPU core reads the context from the cache.
Step 605: The CPU core sends the metadata of the switched-out task to the second storage area.
Step 606: The hardware task scheduler monitors the second storage area, reads the metadata of the switched-out task from it, and stores the metadata into the candidate task queue of the hardware task scheduler.
The hardware task scheduler monitors the second storage area. Once it finds task metadata stored in the second storage area, the scheduler reads the metadata from the second storage area and writes it into its candidate task queue. In subsequent task scheduling, the scheduler can select the task again so that it is executed by a CPU core again.
Step 607: The CPU core performs the task switch to run the candidate task.
After obtaining the context of the candidate task, the CPU core switches contexts to execute the candidate task.
In summary, the sequence diagram of the task scheduling method provided in this embodiment is shown in FIG. 7. The hardware task scheduler performs task scheduling in advance and sends the metadata of the selected candidate task to the first storage area in advance. When the CPU core detects that a task switch is needed, it no longer sends a task scheduling instruction; it reads the metadata of the candidate task directly from the first storage area. After reading the metadata, the CPU core prefetches the context of the candidate task, causing the context to be sent to the cache. That is, when a task switch is needed, the CPU core can read the metadata of the candidate task directly from the first storage area and read the context of the candidate task from the cache, which reduces the task switching latency. In addition, in this embodiment the CPU core does not need to interact directly with the hardware task scheduler. The task scheduling method provided in this embodiment therefore not only reduces the latency of task scheduling and switching but also avoids direct interaction between the CPU core and the hardware task scheduler.
The following are apparatus embodiments of this application, which can be used to perform the method embodiments of this application. For details not disclosed in the apparatus embodiments of this application, refer to the method embodiments of this application.
Refer to FIG. 8, which shows a schematic diagram of the logical structure of a hardware task scheduler 800 according to an embodiment of this application. As shown in FIG. 8, the hardware task scheduler 800 includes a task management module 810 and a storage module 820. The task management module 810 is configured to perform steps 202 and 203 in the embodiment shown in FIG. 2, or steps 401 and 404 in the embodiment shown in FIG. 4, or step 601 in the embodiment shown in FIG. 6. The storage module 820 is configured to store the metadata of the candidate tasks involved in the execution of the task management module 810. Specifically:
The task management module 810 is configured to perform task scheduling to select a candidate task, and to proactively send the metadata of the candidate task to the first storage area.
The storage module 820 is configured to store the metadata of one or more candidate tasks, from which the task management module 810 selects a candidate task.
Optionally, the task management module 810 is further configured to perform step 204 in the embodiment shown in FIG. 2: the task management module 810 is further configured to proactively send the context of the candidate task to the cache.
Optionally, the storage module 820 is further configured to store the context of the candidate task.
Optionally, the task management module 810 is further configured to perform steps 402 and 405 in the embodiment shown in FIG. 4: the task management module 810 is further configured to obtain the context of the candidate task and store it in the hardware task scheduler, and to proactively send the context stored in the hardware task scheduler to the cache.
Optionally, the task management module 810 is further configured to perform step 207 in the embodiment shown in FIG. 2, or step 408 in the embodiment shown in FIG. 4, or step 606 in the embodiment shown in FIG. 6: the task management module 810 is configured to receive the metadata of the switched-out task sent by the CPU core and store it in the hardware task scheduler; alternatively, the task management module 810 is configured to read the metadata of the switched-out task from the second storage area and store it in the hardware task scheduler.
Optionally, the hardware task scheduler 800 further includes an interface module 830. The interface module 830 is configured to send the instructions issued by the task management module 810 when performing the foregoing steps.
It should be noted that, when the hardware task scheduler provided in the foregoing embodiment performs task scheduling, the division into the foregoing functional modules is merely used as an example. In practical applications, the foregoing functions may be assigned to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or some of the functions described above. In addition, the hardware task scheduler provided in the foregoing embodiment belongs to the same concept as the task scheduling method embodiments; for its specific implementation process, refer to the method embodiments, and details are not repeated here.
Refer to FIG. 9, which shows a schematic diagram of the hardware structure of a hardware task scheduler 900 according to an embodiment of this application. As shown in FIG. 9, the hardware task scheduler 900 includes a processor 902, a memory 904, and interconnect lines 906. The processor 902 and the memory 904 are connected to each other through the interconnect lines 906.
The memory 904 may be any of various types of storage media, for example static random access memory (SRAM).
The processor 902 may be a general-purpose processor; for example, the general-purpose processor may be a central processing unit (CPU). The processor 902 may also be a special-purpose processor, that is, a processor specifically designed to perform particular steps and/or operations; for example, the special-purpose processor may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). The processor 902 may also be a combination of multiple processors. The processor 902 may include at least one circuit to perform the steps, in the task scheduling methods provided in the foregoing embodiments, that the hardware task scheduler is responsible for performing.
The interconnect lines 906 may take various forms, for example on-die interconnect traces, silicon interposer traces, or printed circuit board (PCB) traces.
The foregoing components may be placed on chips that are independent of each other, or at least partially or entirely on the same chip. Whether the components are placed separately on different chips or integrated on one or more chips often depends on the needs of product design. The embodiments of this application do not limit the specific implementation form of the foregoing components.
The hardware task scheduler 900 shown in FIG. 9 is merely an example. In implementation, the hardware task scheduler 900 may further include other components, which are not enumerated here one by one.
FIG. 10 shows a schematic diagram of a task scheduling system according to an embodiment of this application. Referring to FIG. 10, the task scheduling system includes a CPU 1030 and a hardware task scheduler 1010, both located on a die 1000. The CPU 1030 includes one or more CPU cores; for example, CPU 1030 includes two CPU cores: CPU core 1031 and CPU core 1032. The hardware task scheduler 1010 performs task scheduling for CPU core 1031 to select a candidate task and proactively sends the metadata of the candidate task to the first storage area of CPU core 1031. Alternatively, the hardware task scheduler 1010 performs task scheduling for CPU core 1032 to select a candidate task and sends the metadata of the candidate task to the first storage area of CPU core 1032. The hardware task scheduler 1010 can be deployed in various ways: for example, the hardware task scheduler 1010 is located inside CPU 1030; alternatively, it is located outside CPU 1030 and connected to CPU 1030 through an on-chip bus 1020.
When CPU 1030 includes only one CPU core, for example only CPU core 1031, the hardware task scheduler provides the task scheduling service for CPU core 1031; in that case, providing the task scheduling service for CPU core 1031 amounts to providing it for CPU 1030.
FIG. 11 shows a schematic diagram of another task scheduling system according to an embodiment of this application. In this task scheduling system, a hardware task scheduler can provide task scheduling services for CPU cores in other dies. Referring to FIG. 11, the task scheduling system includes multiple dies; for example, the task scheduling system includes two dies: Die 1100 and Die 1200. Die 1100 includes CPU 1130, and Die 1200 includes CPU 1230. CPU 1130 includes one or more CPU cores, and CPU 1230 includes one or more CPU cores. For example, CPU 1130 includes two CPU cores, CPU core 1131 and CPU core 1132, and CPU 1230 includes two CPU cores, CPU core 1231 and CPU core 1232. The task scheduling system includes one or more hardware task schedulers; for example, the task scheduling system includes a hardware task scheduler 1110. The hardware task scheduler 1110 performs task scheduling for CPU core 1131, CPU core 1132, CPU core 1231, and CPU core 1232 to select candidate tasks, and proactively sends the metadata of each candidate task to the first storage area of the corresponding CPU core. The hardware task scheduler 1110 can be deployed in various ways: for example, it is located inside CPU 1130, or it is located outside CPU 1130 and connected to CPU 1130 through an on-chip bus 1120. The task scheduling system may further include a hardware task scheduler 1210, which is located inside CPU 1230, or outside CPU 1230 and connected to CPU 1230 through an on-chip bus 1220. The hardware task scheduler 1210 performs task scheduling for CPU core 1231 and CPU core 1232 to select candidate tasks and proactively sends the metadata of each candidate task to the first storage area of CPU core 1231 or CPU core 1232.
There are further deployment options for hardware task schedulers. For example, multiple hardware task schedulers may be deployed within a die, with each hardware task scheduler providing the task scheduling service for one CPU core. The embodiments of this application do not enumerate every deployment option one by one.
It should be understood that, in the embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and shall not constitute any limitation on the implementation process of the embodiments of this application.
The foregoing specific implementations further describe the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely specific implementations of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.

Claims (20)

  1. A task scheduling system, wherein the task scheduling system comprises a central processing unit (CPU) core and a hardware task scheduler, wherein
    the hardware task scheduler is configured to perform task scheduling to select a candidate task, and proactively send metadata of the candidate task to a first storage area;
    the CPU core is configured to read the metadata of the candidate task from the first storage area to execute the candidate task; and
    the first storage area is located in a storage area whose access speed is faster than that of memory, and the first storage area comprises internal storage space of the CPU core and/or a cache of the CPU core.
  2. The system according to claim 1, wherein the metadata of the candidate task comprises storage location information of a context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
  3. The system according to claim 1 or 2, wherein
    the hardware task scheduler is further configured to proactively send the context of the candidate task to the cache of the CPU core.
  4. The system according to any one of claims 1 to 3, wherein
    the hardware task scheduler is further configured to proactively send the context of the candidate task that is stored in the hardware task scheduler to the cache of the CPU core.
  5. A task scheduling method, applied to a hardware task scheduler, wherein the method comprises:
    performing task scheduling to select a candidate task; and
    proactively sending metadata of the candidate task to a first storage area,
    wherein the first storage area is located in a storage area whose access speed is faster than that of memory, the first storage area comprises internal storage space of a CPU core and/or a cache of the CPU core, and the CPU core is configured to execute the candidate task.
  6. The method according to claim 5, wherein the metadata of the candidate task comprises storage location information of a context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
  7. The method according to claim 5 or 6, wherein the method comprises: proactively sending the context of the candidate task to the cache of the CPU core.
  8. The method according to any one of claims 5 to 7, wherein the method comprises: proactively sending the context of the candidate task that is stored in the hardware task scheduler to the cache of the CPU core.
  9. A task scheduling method, applied to a central processing unit (CPU) core, wherein the method comprises:
    reading metadata of a candidate task from a first storage area to execute the candidate task,
    wherein the first storage area is located in a storage area whose access speed is faster than that of memory, and the first storage area comprises internal storage space of the CPU core and/or a cache of the CPU core.
  10. The method according to claim 9, wherein the metadata of the candidate task comprises storage location information of a context of the candidate task, an identifier of the candidate task, and a state of the candidate task.
  11. A hardware task scheduler, comprising:
    a memory, configured to store metadata of one or more candidate tasks; and
    a processor, configured to perform task scheduling to select a candidate task, and proactively send the metadata of the selected candidate task that is stored in the memory to a first storage area,
    wherein the first storage area is located in a storage area whose access speed is faster than that of memory, the first storage area comprises internal storage space of a CPU core and/or a cache of the CPU core, and the CPU core is configured to execute the selected candidate task.
  12. The hardware task scheduler according to claim 11, wherein the metadata of the selected candidate task comprises storage location information of a context of the selected candidate task, an identifier of the selected candidate task, and a state of the selected candidate task.
  13. The hardware task scheduler according to claim 11 or 12, wherein the processor is further configured to proactively send the context of the selected candidate task to the cache of the CPU core.
  14. The hardware task scheduler according to any one of claims 11 to 13, wherein the memory is further configured to store the context of the selected candidate task.
  15. The hardware task scheduler according to claim 14, wherein the processor is further configured to proactively send the context of the selected candidate task that is stored in the memory to the cache of the CPU core.
  16. A hardware task scheduler, wherein the hardware task scheduler comprises:
    a storage module, configured to store metadata of one or more candidate tasks; and
    a task management module, configured to perform task scheduling to select a candidate task, and proactively send the metadata of the selected candidate task that is stored in the storage module to a first storage area,
    wherein the first storage area is located in a storage area whose access speed is faster than that of memory, the first storage area comprises internal storage space of a CPU core and/or a cache of the CPU core, and the CPU core is configured to execute the selected candidate task.
  17. The hardware task scheduler according to claim 16, wherein the metadata of the selected candidate task comprises storage location information of a context of the selected candidate task, an identifier of the selected candidate task, and a state of the selected candidate task.
  18. The hardware task scheduler according to claim 16 or 17, wherein the task management module is further configured to proactively send the context of the selected candidate task to the cache of the CPU core.
  19. The hardware task scheduler according to any one of claims 16 to 18, wherein the storage module is further configured to store the context of the selected candidate task.
  20. The hardware task scheduler according to claim 19, wherein the task management module is further configured to proactively send the context of the selected candidate task that is stored in the storage module to the cache of the CPU core.
PCT/CN2022/097255 2021-06-10 2022-06-07 Task scheduling method and system, and hardware task scheduler WO2022257898A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22819496.5A EP4339776A1 (en) 2021-06-10 2022-06-07 Task scheduling method, system, and hardware task scheduler
US18/533,561 US20240103913A1 (en) 2021-06-10 2023-12-08 Task scheduling method and system, and hardware task scheduler

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110648382.9 2021-06-10
CN202110648382.9A CN115469976A (zh) 2021-06-10 Task scheduling method and system, and hardware task scheduler

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/533,561 Continuation US20240103913A1 (en) 2021-06-10 2023-12-08 Task scheduling method and system, and hardware task scheduler

Publications (1)

Publication Number Publication Date
WO2022257898A1 2022-12-15

Family

ID=84363334

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/097255 WO2022257898A1 (zh) 2021-06-10 2022-06-07 一种任务调度的方法、系统和硬件任务调度器

Country Status (4)

Country Link
US (1) US20240103913A1 (zh)
EP (1) EP4339776A1 (zh)
CN (1) CN115469976A (zh)
WO (1) WO2022257898A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030033486A1 (en) * 2001-07-02 2003-02-13 Shay Mizrachi Cache system for network and multi-tasking applications
CN102077172A (zh) * 2008-07-02 2011-05-25 Nxp B.V. Multiprocessor circuit using run-time task scheduling
US20150301854A1 (en) * 2014-04-21 2015-10-22 Samsung Electronics Co., Ltd. Apparatus and method for hardware-based task scheduling
US20180081813A1 (en) * 2016-09-22 2018-03-22 International Business Machines Corporation Quality of cache management in a computer

Also Published As

Publication number Publication date
CN115469976A (zh) 2022-12-13
US20240103913A1 (en) 2024-03-28
EP4339776A1 (en) 2024-03-20

