WO2015042904A1 - Method, apparatus, and system for scheduling a resource pool in a multi-core system - Google Patents

Method, apparatus, and system for scheduling a resource pool in a multi-core system

Info

Publication number
WO2015042904A1
Authority
WO
WIPO (PCT)
Prior art keywords
level
antenna
tasks
user
subtasks
Prior art date
Application number
PCT/CN2013/084575
Other languages
English (en)
French (fr)
Inventor
吴素文
王吉滨
李琼
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2013/084575
Priority to CN201380003199.7A
Publication of WO2015042904A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 - Allocation of resources to service a request
    • G06F 9/5027 - Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 2209/00 - Indexing scheme relating to G06F 9/00
    • G06F 2209/50 - Indexing scheme relating to G06F 9/50
    • G06F 2209/5011 - Pool
    • G06F 2209/5017 - Task decomposition

Definitions

  • The present invention relates to the field of communications technologies, and in particular to a method, apparatus, and system for scheduling a resource pool in a multi-core system.
  • In existing solutions there is generally a centralized scheduling module, also called the centralized scheduling core. The centralized scheduling module can allocate the tasks of a functional module to different cores for processing according to a certain granularity, each core processing the tasks corresponding to part of the functional module's functions. For example, after core A and core B finish the tasks corresponding to some functions, they trigger the centralized scheduling module; the centralized scheduling module performs a load pre-computation according to the functions that core C and core D need to process, also taking into account information such as the antenna conditions and the load occupied when processing different users, and then assigns tasks to core C and core D according to the result of the pre-computation. For example, the tasks assigned to core C might be antennas 0-3 of the antenna-level tasks and users 0-9 of the user-level tasks, and the tasks assigned to core D antennas 4-7 of the antenna-level tasks and users 10-15 of the user-level tasks; thereafter, core C and core D each process the tasks assigned to them.
  • In researching the prior art, the inventors of the present invention found that when the centralized scheduling module assigns tasks it must estimate the load required by each service, and since the service load is related to multiple dimensions of the service it is difficult to estimate accurately. This can cause uneven task assignment among the cores and ultimately an excessive load gap between them; for example, some cores become overloaded while others still have load margin.
  • Embodiments of the present invention provide a method, apparatus, and system for scheduling a resource pool in a multi-core system, which can schedule the tasks in the resource pool more evenly.
  • In a first aspect, an embodiment of the present invention provides a method for scheduling a resource pool in a multi-core system, including: dividing the tasks in the resource pool into multiple subtasks according to a preset granularity; adding the multiple subtasks to a shared queue; and triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • Where the tasks in the resource pool include user-level tasks and antenna-level tasks, dividing the tasks in the resource pool into multiple subtasks according to the preset granularity includes: dividing the user-level tasks into multiple user-level subtasks according to the preset granularity, and dividing the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity; and adding the multiple subtasks to the shared queue is specifically: adding the user-level subtasks to a user-level shared queue, and adding the antenna-level subtasks to an antenna-level shared queue.
  • Triggering the multiple cores to sequentially acquire subtasks from the shared queue and process them until all subtasks in the shared queue have been processed includes: triggering the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, having the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.
  • Where the tasks in the resource pool include user-level tasks and antenna-level tasks, before the tasks in the resource pool are divided into multiple subtasks according to the preset granularity, the method may further include: determining whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, directly allocating the task in the current resource pool; and if it is a user-level task, performing the step of dividing the tasks in the current resource pool into multiple subtasks according to the preset granularity.
  • Directly allocating the tasks in the current resource pool includes: obtaining processing capability information of each core; estimating the load required for processing each antenna to obtain an estimated load; and allocating the tasks in the current resource pool to the cores according to the processing capability information of each core and the estimated load.
  • The shared queue is specifically a software shared queue or a hardware shared queue.
  • An embodiment of the present invention further provides an apparatus for scheduling a resource pool in a multi-core system, including a dividing unit, an adding unit, and a triggering unit:
  • the dividing unit is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity; the adding unit is configured to add the multiple subtasks obtained by the dividing unit to a shared queue; and the triggering unit is configured to trigger multiple cores to perform task processing, so that the cores processing the tasks sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • Where the tasks in the resource pool include user-level tasks and antenna-level tasks:
  • the dividing unit is specifically configured to divide the user-level tasks into multiple user-level subtasks according to a preset granularity, and to divide the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity;
  • the adding unit is specifically configured to add the user-level subtasks to a user-level shared queue, and to add the antenna-level subtasks to an antenna-level shared queue;
  • the triggering unit is specifically configured to trigger the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, to have the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.
  • Where the tasks in the resource pool include user-level tasks and antenna-level tasks, the apparatus for scheduling the resource pool in the multi-core system may further include a judging unit and an allocating unit:
  • the judging unit is configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; the allocating unit is configured to directly allocate the task in the current resource pool when the judging unit determines that the task in the current resource pool is an antenna-level task;
  • the dividing unit is specifically configured to divide the tasks in the current resource pool into multiple subtasks according to the preset granularity when the judging unit determines that the task in the current resource pool is a user-level task.
  • The allocating unit may include an obtaining subunit, an estimating subunit, and an allocating subunit:
  • the obtaining subunit is configured to obtain processing capability information of each core; the estimating subunit is configured to estimate the load required for processing each antenna to obtain an estimated load; and the allocating subunit is configured to allocate the tasks in the current resource pool to the cores according to the processing capability information obtained by the obtaining subunit and the estimated load obtained by the estimating subunit.
  • An embodiment of the present invention further provides a communication system, including the apparatus for scheduling a resource pool in a multi-core system provided by any embodiment of the present invention.
  • An embodiment of the present invention further provides a communication device of a multi-core system, including a processor, a memory for storing data and programs, and a transceiver module for transmitting and receiving data:
  • the processor is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • Where the tasks in the resource pool include user-level tasks and antenna-level tasks, the processor is specifically configured to: divide the user-level tasks into multiple user-level subtasks according to a preset granularity and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity; add the user-level subtasks to a user-level shared queue and the antenna-level subtasks to an antenna-level shared queue; and trigger the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, have the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.
  • The processor may be further configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, to directly allocate the task in the current resource pool; and if it is a user-level task, to perform the operation of dividing the tasks in the current resource pool into multiple subtasks according to the preset granularity.
  • In the embodiments of the present invention, the tasks in the resource pool are divided into multiple subtasks according to a preset granularity, the subtasks are added to a shared queue, and multiple cores are triggered to sequentially acquire subtasks from the shared queue and process them. Because no load estimation is required in this solution, and the cores processing the tasks instead acquire subtasks from the shared queue according to their own load conditions and process them, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • FIG. 1 is a flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;
  • FIG. 2a is a schematic diagram of a distributed scheduling scenario according to an embodiment of the present invention;
  • FIG. 2b is another flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;
  • FIG. 3a is a schematic diagram of a scenario combining distributed and centralized scheduling according to an embodiment of the present invention;
  • FIG. 3b is another flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of an apparatus for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;
  • FIG. 5 is another schematic structural diagram of an apparatus for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;
  • FIG. 6 is a schematic structural diagram of a communication device of a multi-core system according to an embodiment of the present invention.
  • Embodiments of the present invention provide a method, an apparatus, and a system for scheduling a resource pool in a multi-core system, each of which is described in detail below.
  • Embodiment 1: in this embodiment, the description is given from the perspective of the apparatus for scheduling the resource pool in the multi-core system; the apparatus may specifically be the centralized scheduling module in the multi-core system.
  • A method for scheduling a resource pool in a multi-core system includes: dividing the tasks in the resource pool into multiple subtasks according to a preset granularity, adding the multiple subtasks to a shared queue, and triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • The tasks in the resource pool may be of various types; for example, they may include user-level tasks and antenna-level tasks. When dividing subtasks, the tasks may be divided separately according to their types: the user-level tasks may be divided into multiple user-level subtasks according to the preset granularity, and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity.
  • Before the dividing step, the method for scheduling the resource pool in the multi-core system may further include: determining whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, directly allocating the task; and if it is a user-level task, performing the dividing step.
  • The tasks in the resource pool may be directly allocated as follows: obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and allocate the tasks in the current resource pool to the cores accordingly.
  • The method may also further include: accepting the trigger of the pre-processing module after determining that the pre-processing module has finished processing its tasks.
  • The pre-processing module may specifically be another core, a hardware accelerator, or a hardware module (an Intellectual Property core, i.e., an IP module, also called an IP core).
  • 102. Add the multiple subtasks divided in step 101 to the shared queue.
  • For example, if the user-level tasks have been divided into multiple user-level subtasks according to the preset granularity and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity, the user-level subtasks can be added to the user-level shared queue and the antenna-level subtasks to the antenna-level shared queue.
  • The shared queue (such as the user-level shared queue or the antenna-level shared queue) may be a software shared queue or a hardware shared queue.
  • 103. Trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed. For example, the multiple cores may be triggered to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed; the multiple cores (which may be triggered again, or may simply continue on their own without a further trigger) then sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. For example, taking multiple cores including core C and core D, the process may be as follows:
  • When core C and core D begin processing, each acquires an antenna-level subtask from the antenna-level shared queue; for example, core C acquires the first antenna-level subtask, say the antenna 0 task, and core D acquires the second antenna-level subtask, say the antenna 1 task.
  • Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next antenna-level subtask (the third one) from the antenna-level shared queue, say the antenna 2 task. Likewise, if core D is still processing the antenna 2 task when core C finishes the antenna 0 task, core C acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task; and if core C has not yet finished the antenna 0 task when core D finishes the antenna 2 task, core D acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task, and so on.
  • In other words, core C and core D sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and each core acquires the next antenna-level subtask only after completing the one it has already acquired.
  • Once the subtasks in the shared queue have been processed, core C and core D can no longer obtain subtasks from it; this indicates that those tasks in the resource pool have been processed, and core C and core D then begin acquiring user-level subtasks from the user-level shared queue for processing. The process of acquiring user-level subtasks is similar to that of acquiring antenna-level subtasks and is not described again here.
  • If the shared queue is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
  • If the distributed scheduling manner combined with the centralized scheduling manner is used to schedule the tasks in the resource pool, i.e., only the user-level tasks are divided beforehand and the user-level subtasks added to the user-level shared queue, then each core acquires and processes only user-level subtasks from the user-level shared queue, while the antenna-level tasks are allocated directly; this is not described again here.
  • This embodiment divides the tasks in the resource pool into multiple subtasks according to the preset granularity, adds these subtasks to the shared queue, and triggers multiple cores to sequentially acquire subtasks from the shared queue and process them.
  • Because the cores processing the tasks acquire subtasks from the shared queue according to their own load conditions instead of relying on load estimation, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time. Based on the method described in Embodiment 1, Embodiments 2 and 3 below give further detailed examples.
  • Embodiment 2: in this embodiment, the apparatus for scheduling the resource pool in the multi-core system is specifically the centralized scheduling module in the multi-core system, the tasks in the resource pool include user-level tasks and antenna-level tasks, and the distributed scheduling manner is used, as shown in FIG. 2a, a schematic diagram of the distributed scheduling scenario.
  • As can be seen from FIG. 2a, both user-level tasks and antenna-level tasks are divided, yielding user-level subtasks and antenna-level subtasks respectively, which are added to their respective shared queues: the user-level subtasks are written to (i.e., added to) the user-level shared queue, and the antenna-level subtasks are written to (i.e., added to) the antenna-level shared queue. Multiple cores are then triggered to sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and to sequentially acquire and process user-level subtasks from the user-level shared queue. This is described in detail below.
  • 201. After the pre-processing module has finished processing its tasks, it triggers the centralized scheduling module. The pre-processing module may specifically be another core, a hardware accelerator, hardware IP, or the like; for example, it may be core A and core B, i.e., after core A and core B have finished processing their tasks, they trigger the centralized scheduling module.
  • 202. The centralized scheduling module divides the user-level tasks into multiple user-level subtasks according to a preset granularity, and divides the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity. The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • 203. The centralized scheduling module adds the user-level subtasks to the user-level shared queue, and adds the antenna-level subtasks to the antenna-level shared queue. The shared queue, such as the user-level shared queue or the antenna-level shared queue, may be a software shared queue or a hardware shared queue.
  • 204. The centralized scheduling module triggers multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed, after which step 205 is performed. The details may be as follows:
  • When core C and core D begin processing, each acquires an antenna-level subtask from the antenna-level shared queue; for example, core C acquires the first antenna-level subtask, say the antenna 0 task, and core D acquires the second antenna-level subtask, say the antenna 1 task. Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next antenna-level subtask (the third one) from the antenna-level shared queue, say the antenna 2 task. Likewise, if core D is still processing the antenna 2 task when core C finishes the antenna 0 task, core C acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task; and if core C has not yet finished the antenna 0 task when core D finishes the antenna 2 task, core D acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task, and so on.
  • In other words, core C and core D sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and each core acquires the next antenna-level subtask only after completing the one it has already acquired.
  • Once the tasks in the antenna-level shared queue have been processed, core C and core D can no longer obtain subtasks from it; this indicates that those tasks in the resource pool have been processed, so core C and core D begin acquiring user-level subtasks from the user-level shared queue for processing, i.e., step 205 is performed.
  • 205. The multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. For example, taking multiple cores including core C and core D, the details may be as follows:
  • When core C and core D begin processing, each acquires a user-level subtask from the user-level shared queue; for example, core C acquires the first user-level subtask, say the user 0 task, and core D acquires the second user-level subtask, say the user 1 task. Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next user-level subtask (the third one) from the user-level shared queue, say the user 2 task. Likewise, if core D is still processing the user 2 task when core C finishes the user 0 task, core C acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task; and if core C has not yet finished the user 0 task when core D finishes the user 2 task, core D acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task, and so on. In other words, core C and core D sequentially acquire and process user-level subtasks from the user-level shared queue, and each core acquires the next user-level subtask only after completing the one it has already acquired.
  • It should be noted that in steps 204 and 205, if the shared queue (including the antenna-level shared queue and the user-level shared queue) is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
  • This embodiment divides the antenna-level tasks and the user-level tasks in the resource pool into multiple subtasks according to the preset granularity, adds these subtasks to the antenna-level shared queue and the user-level shared queue respectively, and triggers multiple cores to sequentially acquire subtasks from these shared queues and process them.
  • Because the cores processing the tasks acquire subtasks from the shared queues according to their own load conditions instead of relying on load estimation, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • Embodiment 3: as in Embodiment 2, the apparatus for scheduling the resource pool in the multi-core system is specifically the centralized scheduling module in the multi-core system, and the tasks in the resource pool include user-level tasks and antenna-level tasks. Unlike Embodiment 2, this embodiment uses the distributed scheduling manner combined with the centralized scheduling manner as an example.
  • See FIG. 3a, a schematic diagram of the scenario combining distributed and centralized scheduling. As can be seen from FIG. 3a, the antenna-level tasks are scheduled by direct allocation, while the user-level tasks are divided to obtain user-level subtasks, which are added to the user-level shared queue; multiple cores are then triggered to sequentially acquire user-level subtasks from the user-level shared queue and process them. This is described in detail below.
  • 301. After the pre-processing module has finished processing its tasks, it triggers the centralized scheduling module. The pre-processing module may specifically be another core, a hardware accelerator, a hardware IP device, or the like; for example, it may be core A and core B, i.e., after core A and core B have finished processing their tasks, they trigger the centralized scheduling module.
  • 302. The centralized scheduling module determines whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, step 303 is performed; otherwise, if it is a user-level task, step 304 is performed.
  • 303. The centralized scheduling module directly allocates the antenna-level tasks. The details may be as follows: obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and then allocate the tasks in the current resource pool (i.e., the antenna-level tasks) to the cores according to the processing capability information of each core and the estimated load.
  • For example, taking multiple cores including core C and core D, the centralized scheduling module can obtain the processing capability information of core C and of core D and pre-compute the load required for processing each antenna, and then assign tasks to core C and core D according to the processing capability information of core C, the processing capability information of core D, and the result of the load pre-computation; for example, the tasks assigned to core C may be antennas 0-3 of the antenna-level tasks, and the tasks assigned to core D antennas 4-7 of the antenna-level tasks. Thereafter, core C and core D can each process the tasks assigned to them.
  • 304. The centralized scheduling module divides the user-level tasks into multiple user-level subtasks according to a preset granularity. The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • 305. The centralized scheduling module adds the user-level subtasks to the user-level shared queue. The user-level shared queue may be a software shared queue or a hardware shared queue.
  • 306. After the cores have finished processing the antenna-level tasks, the centralized scheduling module triggers multiple cores (or the cores may proceed on their own, without being triggered by the centralized scheduling module) to sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. For example, taking multiple cores including core C and core D, the details may be as follows:
  • The centralized scheduling module triggers core C and core D to each acquire a user-level subtask from the user-level shared queue; for example, core C acquires the first user-level subtask, say the user 0 task, and core D acquires the second user-level subtask, say the user 1 task.
  • Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next user-level subtask (the third one) from the user-level shared queue, say the user 2 task. Likewise, if core D is still processing the user 2 task when core C finishes the user 0 task, core C acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task; and if core C has not yet finished the user 0 task when core D finishes the user 2 task, core D acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task, and so on.
  • In other words, core C and core D sequentially acquire and process user-level subtasks from the user-level shared queue, and each core acquires the next user-level subtask only after completing the one it has already acquired.
  • It should be noted that in step 306, if the user-level shared queue is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
  • This embodiment directly allocates the antenna-level tasks in the resource pool, divides the user-level tasks into multiple subtasks according to the preset granularity, adds these subtasks to the user-level shared queue, and then triggers multiple cores to acquire user-level subtasks from the user-level shared queue and process them.
  • Because no load estimation is needed when processing the user-level tasks, and the cores processing the tasks instead acquire user-level subtasks from the user-level shared queue according to their own load conditions, this can compensate for the uneven task assignment among the cores caused by inaccurate load estimation when processing the antenna-level tasks; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • To better implement the above methods, an embodiment of the present invention further provides an apparatus for scheduling a resource pool in a multi-core system. As shown in FIG. 4, the apparatus includes a dividing unit 401, an adding unit 402, and a triggering unit 403.
  • The dividing unit 401 is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity. The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • The adding unit 402 is configured to add the multiple subtasks obtained by the dividing unit 401 to a shared queue.
  • The triggering unit 403 is configured to trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until all subtasks in the shared queue have been processed.
  • The tasks in the resource pool may be of various types; for example, they may include user-level tasks and antenna-level tasks, and when dividing subtasks the tasks may be divided separately according to their types: the user-level tasks into multiple user-level subtasks according to the preset granularity, and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity. That is:
  • the dividing unit 401 may be specifically configured to divide the user-level tasks into multiple user-level subtasks according to a preset granularity, and to divide the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity;
  • the adding unit 402 may be specifically configured to add the user-level subtasks to the user-level shared queue, and to add the antenna-level subtasks to the antenna-level shared queue;
  • the triggering unit 403 may be specifically configured to trigger the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, to have the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.
  • The shared queue (such as the user-level shared queue or the antenna-level shared queue) may be a software shared queue or a hardware shared queue.
  • Optionally, to improve processing efficiency, tasks may also be handled differently according to their priority. For example, because antenna processing requires a relatively large load, high-priority tasks such as antenna-level tasks may be handled in a centralized scheduling manner; and because user processing requires a relatively small load, low-priority tasks such as user-level tasks may be handled in a distributed scheduling manner. That is, as shown in FIG. 5, the apparatus for scheduling the resource pool in the multi-core system may further include a judging unit 404 and an allocating unit 405:
  • the judging unit 404 may be configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task;
  • the allocating unit 405 may be configured to directly allocate the tasks in the current resource pool (i.e., the antenna-level tasks) when the judging unit 404 determines that the task in the current resource pool is an antenna-level task;
  • the dividing unit 401 is specifically configured to divide the tasks in the current resource pool (i.e., the user-level tasks) into multiple subtasks according to the preset granularity when the judging unit determines that the task in the current resource pool is a user-level task.
  • For example, the allocating unit 405 may include an obtaining subunit, an estimating subunit, and an allocating subunit:
  • the obtaining subunit is configured to obtain processing capability information of each core; the estimating subunit is configured to estimate the load required for processing each antenna to obtain an estimated load; and the allocating subunit is configured to allocate the tasks in the current resource pool to the cores according to the processing capability information obtained by the obtaining subunit and the estimated load obtained by the estimating subunit.
  • In addition, before the dividing unit 401 performs the operation of dividing the tasks in the resource pool into multiple subtasks according to the preset granularity, a pre-processing module may first process its tasks; that is, the apparatus for scheduling the resource pool in the multi-core system may further include an accepting unit:
  • the accepting unit is configured to accept the trigger of the pre-processing module after determining that the pre-processing module has finished processing its tasks.
  • The pre-processing module may specifically be another core, a hardware accelerator, hardware IP, or the like.
  • In specific implementations, the above units may be implemented as independent entities, or combined arbitrarily and implemented as one or several entities; for example, the apparatus for scheduling the resource pool in the multi-core system may specifically be the centralized scheduling module in the multi-core system, as in Embodiments 2 and 3. For the specific implementation of each unit, refer to the foregoing method embodiments; details are not described again here.
  • The dividing unit 401 of the apparatus for scheduling the resource pool in the multi-core system of this embodiment divides the tasks in the resource pool into multiple subtasks according to the preset granularity, the adding unit 402 then adds the subtasks to the shared queue, and the triggering unit 403 triggers multiple cores to sequentially acquire subtasks from the shared queue and process them.
  • Because the cores processing the tasks acquire subtasks from the shared queue according to their own load conditions instead of relying on load estimation, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • An embodiment of the present invention further provides a communication system, including the apparatus for scheduling a resource pool in a multi-core system provided by any embodiment of the present invention, where the apparatus may specifically be the centralized scheduling module in the multi-core system. For example, the system may include:
  • a centralized scheduling module, configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • The tasks in the resource pool may be of various types; for example, they may include user-level tasks and antenna-level tasks. When dividing subtasks, the tasks may be divided separately according to their types: the user-level tasks into multiple user-level subtasks according to the preset granularity, and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity.
  • In this case, the centralized scheduling module may add the user-level subtasks to the user-level shared queue and the antenna-level subtasks to the antenna-level shared queue; multiple cores may then be triggered to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed, after which the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.
  • Optionally, to improve processing efficiency, tasks may also be handled differently according to their priority. Because antenna processing requires a relatively large load, high-priority tasks such as antenna-level tasks may be handled in a centralized scheduling manner; and because user processing requires a relatively small load, low-priority tasks such as user-level tasks may be handled in a distributed scheduling manner. That is:
  • the centralized scheduling module is further configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, to directly allocate the task in the current resource pool (i.e., the antenna-level task); and if it is a user-level task, to perform the operation of dividing the tasks in the current resource pool (i.e., the user-level tasks) into multiple subtasks according to the preset granularity.
  • The tasks in the resource pool may be directly allocated as follows: obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and allocate the tasks in the current resource pool to the cores according to the processing capability information and the estimated load.
  • The communication system may further include multiple cores, which may be used to sequentially acquire subtasks from the shared queue and process them under the trigger of the centralized scheduling module until the subtasks in the shared queue have been processed.
  • The communication system may further include a pre-processing module, which, after finishing processing its tasks, triggers the centralized scheduling module to perform the foregoing scheduling operations; details are not described again here.
  • The centralized scheduling module in the communication system of this embodiment divides the tasks in the resource pool into multiple subtasks according to the preset granularity, adds the subtasks to the shared queue, and triggers multiple cores to sequentially acquire subtasks from the shared queue and process them.
  • Because the cores processing the tasks acquire subtasks from the shared queue according to their own load conditions instead of relying on load estimation, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • A communication device of a multi-core system includes a processor 601, a memory 602 for storing data and programs, and a transceiver module 603 for transmitting and receiving data, where:
  • the processor 601 is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
  • The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.
  • The tasks in the resource pool may be of various types; for example, they may include user-level tasks and antenna-level tasks. When dividing subtasks, the tasks may be divided separately according to their types: the user-level tasks into multiple user-level subtasks according to the preset granularity, and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity. That is:
  • the processor 601 is specifically configured to divide the user-level tasks into multiple user-level subtasks according to a preset granularity and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity; to add the user-level subtasks to the user-level shared queue and the antenna-level subtasks to the antenna-level shared queue; and to trigger multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed, after which the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. It should be noted that if the shared queue is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
  • Optionally, to improve processing efficiency, tasks may also be handled differently according to their priority. Because antenna processing requires a relatively large load, high-priority tasks such as antenna-level tasks may be handled in a centralized scheduling manner; and because user processing requires a relatively small load, low-priority tasks such as user-level tasks may be handled in a distributed scheduling manner. That is, before performing the operation of dividing the tasks in the resource pool into multiple subtasks according to the preset granularity:
  • the processor 601 is further configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, to directly allocate the tasks in the current resource pool; and if it is a user-level task, to perform the operation of dividing the tasks in the current resource pool into multiple subtasks according to the preset granularity.
  • The tasks in the resource pool may be directly allocated as follows: obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and allocate the tasks in the current resource pool to the cores according to the processing capability information and the estimated load.
  • The processor 601 in the communication device of the multi-core system of this embodiment divides the tasks in the resource pool into multiple subtasks according to the preset granularity, adds the subtasks to the shared queue, and triggers multiple cores to sequentially acquire subtasks from the shared queue and process them.
  • Because the cores processing the tasks acquire subtasks from the shared queue according to their own load conditions instead of relying on load estimation, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.
  • The program may be stored in a computer-readable storage medium, and the storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A method for scheduling a resource pool in a multi-core system, comprising: dividing the tasks in the resource pool into multiple subtasks according to a preset granularity; adding the multiple subtasks to a shared queue; and triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed. A corresponding apparatus and system are also provided.

Description

Method, apparatus, and system for scheduling a resource pool in a multi-core system

Technical Field

The present invention relates to the field of communications technologies, and in particular to a method, apparatus, and system for scheduling a resource pool in a multi-core system.

Background

In a multi-core system, because the processing capability of each core is limited, the tasks of one functional module can hardly be processed entirely within a single core; multiple cores must act as a resource pool and process the tasks cooperatively. Tasks therefore need to be assigned across the cores so that they are completed in time while the load of the cores stays relatively balanced. The process of assigning tasks to the multiple cores is called scheduling of the resource pool.

In existing solutions there is generally a centralized scheduling module, also called the centralized scheduling core, which can allocate the tasks of a functional module to different cores for processing according to a certain granularity, each core processing the tasks corresponding to part of the functional module's functions. For example, after core A and core B finish the tasks corresponding to some functions, they trigger the centralized scheduling module; the centralized scheduling module performs a load pre-computation according to the functions that core C and core D need to process, also taking into account information such as the antenna conditions and the load occupied when processing different users, and then assigns tasks to core C and core D according to the result of the pre-computation. For example, the tasks assigned to core C might be antennas 0-3 of the antenna-level tasks and users 0-9 of the user-level tasks, and the tasks assigned to core D antennas 4-7 of the antenna-level tasks and users 10-15 of the user-level tasks; thereafter, core C and core D each process the tasks assigned to them.

In the course of researching and practicing the prior art, the inventors of the present invention found that when the centralized scheduling module assigns tasks it must estimate the load required by each service, and since the service load is related to multiple dimensions of the service it is difficult to estimate accurately. This can cause uneven task assignment among the cores and ultimately an excessive load gap between them; for example, some cores become overloaded while others still have load margin.

Summary of the Invention

Embodiments of the present invention provide a method, apparatus, and system for scheduling a resource pool in a multi-core system, which can schedule the tasks in the resource pool more evenly.

In a first aspect, an embodiment of the present invention provides a method for scheduling a resource pool in a multi-core system, including:

dividing the tasks in the resource pool into multiple subtasks according to a preset granularity;

adding the multiple subtasks to a shared queue; and

triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.
In a first possible implementation, with reference to the first aspect, the tasks in the resource pool include user-level tasks and antenna-level tasks, and dividing the tasks in the resource pool into multiple subtasks according to the preset granularity includes:

dividing the user-level tasks into multiple user-level subtasks according to the preset granularity, and dividing the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity;

and adding the multiple subtasks to the shared queue is specifically: adding the user-level subtasks to a user-level shared queue, and adding the antenna-level subtasks to an antenna-level shared queue.

In a second possible implementation, with reference to the first possible implementation of the first aspect, triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until all subtasks in the shared queue have been processed includes:

triggering the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, having the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.

In a third possible implementation, with reference to the first aspect, the tasks in the resource pool include user-level tasks and antenna-level tasks, and before dividing the tasks in the resource pool into multiple subtasks according to the preset granularity, the method further includes:

determining whether the task in the current resource pool is an antenna-level task or a user-level task;

if it is an antenna-level task, directly allocating the task in the current resource pool; and if it is a user-level task, performing the step of dividing the tasks in the current resource pool into multiple subtasks according to the preset granularity.

In a fourth possible implementation, with reference to the third possible implementation of the first aspect, directly allocating the tasks in the current resource pool includes:

obtaining processing capability information of each core;

estimating the load required for processing each antenna to obtain an estimated load; and allocating the tasks in the current resource pool to the cores according to the processing capability information of each core and the estimated load.

In a fifth possible implementation, with reference to the first aspect or the first, second, third, or fourth possible implementation of the first aspect, the shared queue is specifically a software shared queue or a hardware shared queue.
In a second aspect, an embodiment of the present invention further provides an apparatus for scheduling a resource pool in a multi-core system, including a dividing unit, an adding unit, and a triggering unit:

the dividing unit is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity; the adding unit is configured to add the multiple subtasks obtained by the dividing unit to a shared queue; and the triggering unit is configured to trigger multiple cores to perform task processing, so that the cores processing the tasks sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.

In a first possible implementation, with reference to the second aspect, the tasks in the resource pool include user-level tasks and antenna-level tasks; in this case:

the dividing unit is specifically configured to divide the user-level tasks into multiple user-level subtasks according to a preset granularity, and to divide the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity;

the adding unit is specifically configured to add the user-level subtasks to a user-level shared queue, and to add the antenna-level subtasks to an antenna-level shared queue.

In a second possible implementation, with reference to the first possible implementation of the second aspect, the triggering unit is specifically configured to trigger the multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, to have the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.

In a third possible implementation, with reference to the second aspect, the tasks in the resource pool include user-level tasks and antenna-level tasks, and the apparatus for scheduling the resource pool in the multi-core system further includes a judging unit and an allocating unit:

the judging unit is configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; the allocating unit is configured to directly allocate the task in the current resource pool when the judging unit determines that the task in the current resource pool is an antenna-level task;

the dividing unit is specifically configured to divide the tasks in the current resource pool into multiple subtasks according to the preset granularity when the judging unit determines that the task in the current resource pool is a user-level task. In a fourth possible implementation, with reference to the third possible implementation of the second aspect, the allocating unit includes an obtaining subunit, an estimating subunit, and an allocating subunit:

the obtaining subunit is configured to obtain processing capability information of each core;

the estimating subunit is configured to estimate the load required for processing each antenna to obtain an estimated load; and the allocating subunit is configured to allocate the tasks in the current resource pool to the cores according to the processing capability information of each core obtained by the obtaining subunit and the estimated load obtained by the estimating subunit.

In a third aspect, an embodiment of the present invention further provides a communication system, including the apparatus for scheduling a resource pool in a multi-core system provided by any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a communication device of a multi-core system, including a processor, a memory for storing data and programs, and a transceiver module for transmitting and receiving data;

the processor is configured to divide the tasks in the resource pool into multiple subtasks according to a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.

In a first possible implementation, with reference to the fourth aspect, the tasks in the resource pool include user-level tasks and antenna-level tasks; in this case:

the processor is specifically configured to divide the user-level tasks into multiple user-level subtasks according to a preset granularity and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity; to add the user-level subtasks to a user-level shared queue and the antenna-level subtasks to an antenna-level shared queue; and to trigger multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them and, once all antenna-level subtasks in the antenna-level shared queue have been processed, to have the multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed.

In a second possible implementation, with reference to the fourth aspect, the processor is further configured to determine whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, to directly allocate the task in the current resource pool; and if it is a user-level task, to perform the operation of dividing the tasks in the current resource pool into multiple subtasks according to the preset granularity.

In the embodiments of the present invention, the tasks in the resource pool are divided into multiple subtasks according to a preset granularity, the subtasks are added to a shared queue, and multiple cores are triggered to sequentially acquire subtasks from the shared queue and process them. Because no load estimation is required in this solution, and the cores processing the tasks instead acquire subtasks from the shared queue according to their own load conditions and process them, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.

Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are introduced briefly below. The drawings in the following description are obviously only some embodiments of the present invention, and those skilled in the art may derive other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;

FIG. 2a is a schematic diagram of a distributed scheduling scenario according to an embodiment of the present invention;

FIG. 2b is another flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;

FIG. 3a is a schematic diagram of a scenario combining distributed and centralized scheduling according to an embodiment of the present invention;

FIG. 3b is another flowchart of a method for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an apparatus for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;

FIG. 5 is another schematic structural diagram of an apparatus for scheduling a resource pool in a multi-core system according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a communication device of a multi-core system according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are obviously only some rather than all of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Embodiments of the present invention provide a method, an apparatus, and a system for scheduling a resource pool in a multi-core system, each of which is described in detail below.

Embodiment 1
This embodiment is described from the perspective of the apparatus for scheduling the resource pool in the multi-core system; the apparatus may specifically be the centralized scheduling module in the multi-core system. A method for scheduling a resource pool in a multi-core system includes: dividing the tasks in the resource pool into multiple subtasks according to a preset granularity, adding the multiple subtasks to a shared queue, and triggering multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.

As shown in FIG. 1, the specific flow may be as follows:

101. Divide the tasks in the resource pool into multiple subtasks according to a preset granularity.

The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.

The tasks in the resource pool may be of various types; for example, they may include user-level tasks and antenna-level tasks. When dividing subtasks, the tasks may be divided separately according to their types: the user-level tasks may be divided into multiple user-level subtasks according to the preset granularity, and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity, and so on.
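As an illustration of this division step, here is a minimal C++ sketch (with hypothetical type and field names; the patent does not prescribe any data structures) that splits pooled work into per-user and per-antenna subtasks, i.e., a granularity of one user or one antenna, matching the antenna 0-7 and user 0-15 examples used in this description:

```cpp
#include <vector>

enum class Level { User, Antenna };
struct Subtask { Level level; int index; };  // e.g. {Level::Antenna, 3} = "antenna 3 task"

// A task covering `count` users or antennas becomes `count` subtasks.
std::vector<Subtask> divide(Level level, int count) {
    std::vector<Subtask> subtasks;
    subtasks.reserve(count);
    for (int i = 0; i < count; ++i) subtasks.push_back({level, i});
    return subtasks;
}

// divide(Level::Antenna, 8) -> antenna 0..7 subtasks for the antenna-level queue;
// divide(Level::User, 16)   -> user 0..15 subtasks for the user-level queue.
```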
Optionally, to improve processing efficiency, tasks may also be handled differently according to their priority. For example, because antenna processing requires a relatively large load, high-priority tasks such as antenna-level tasks may be handled in a centralized scheduling manner; and because user processing requires a relatively small load, low-priority tasks such as user-level tasks may be handled in a distributed scheduling manner. That is, before the step of dividing the tasks in the resource pool into multiple subtasks according to the preset granularity (step 101), the method for scheduling the resource pool in the multi-core system may further include:

determining whether the task in the current resource pool is an antenna-level task or a user-level task; if it is an antenna-level task, directly allocating the task in the current resource pool (i.e., the antenna-level task); and if it is a user-level task, performing the step of dividing the tasks in the current resource pool (i.e., the user-level tasks) into multiple subtasks according to the preset granularity.

The tasks in the resource pool may be directly allocated as follows:

obtain the processing capability information of each core, estimate the load required for processing each antenna according to that information to obtain an estimated load, and allocate the tasks in the current resource pool (i.e., the antenna-level tasks) to the cores according to the estimated load.

In addition, before the step of dividing the tasks in the resource pool into multiple subtasks according to the preset granularity, other processing may also take place; for example, a preceding processing module may first process its tasks and only then trigger the scheduling apparatus, e.g. the centralized scheduling module, to perform step 101. That is, before step 101 the method may further include: accepting the trigger of the pre-processing module after determining that the pre-processing module has finished processing its tasks.

The pre-processing module may specifically be another core, a hardware accelerator, or a hardware module (such as an Intellectual Property core, i.e., an IP module, also called an IP core).
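A hedged sketch of this hand-off (the mechanism is an assumption for illustration; the patent only says the pre-processing module triggers the centralized scheduling module): the last pre-processing core to finish, e.g. core A or core B, can notify the scheduler through an atomic countdown. The class and callback names are hypothetical.

```cpp
#include <atomic>
#include <functional>
#include <utility>

class PreStageTrigger {
    std::atomic<int> remaining_;         // pre-processing cores still running
    std::function<void()> on_all_done_;  // e.g. start the divide-and-enqueue flow of step 101
public:
    PreStageTrigger(int producers, std::function<void()> on_all_done)
        : remaining_(producers), on_all_done_(std::move(on_all_done)) {}

    // Called by each pre-processing core (core A, core B, ...) when it finishes;
    // the last one to finish triggers the centralized scheduling module.
    void finished() {
        if (remaining_.fetch_sub(1) == 1) on_all_done_();
    }
};
```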
It should be noted that the embodiments of the present invention use only user-level tasks and antenna-level tasks as examples. It should be understood that the tasks in the resource pool may also include other types, implemented in the same way, which is not described again here.

102. Add the multiple subtasks divided in step 101 to the shared queue.

For example, if in step 101 the user-level tasks have been divided into multiple user-level subtasks according to the preset granularity and the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity, the user-level subtasks may be added to the user-level shared queue and the antenna-level subtasks to the antenna-level shared queue.

The shared queue (such as the user-level shared queue or the antenna-level shared queue) may be a software shared queue or a hardware shared queue.

103. Trigger multiple cores to sequentially acquire subtasks from the shared queue and process them until the subtasks in the shared queue have been processed.

For example, the multiple cores may be triggered to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed; the multiple cores (which may be triggered again, or may simply continue on their own without a further trigger) then sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. For example, taking multiple cores including core C and core D, the process may specifically be as follows:

When core C and core D begin processing, each acquires an antenna-level subtask from the antenna-level shared queue; for example, core C acquires the first antenna-level subtask, say the antenna 0 task, and core D acquires the second antenna-level subtask, say the antenna 1 task. Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next antenna-level subtask (the third one) from the antenna-level shared queue, say the antenna 2 task. Likewise, if core D is still processing the antenna 2 task when core C finishes the antenna 0 task, core C acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task; and if core C has not yet finished the antenna 0 task when core D finishes the antenna 2 task, core D acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task, and so on. In other words, core C and core D sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and each core acquires the next antenna-level subtask only after completing the one it has already acquired. Once the subtasks in the shared queue have been processed, core C and core D can no longer obtain subtasks from it; this indicates that those tasks in the resource pool have been processed, so core C and core D begin acquiring user-level subtasks from the user-level shared queue for processing. The process of acquiring user-level subtasks is similar to that of acquiring antenna-level subtasks and is not described again here.
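The walkthrough above is the classic self-scheduling worker-pool pattern. Below is a minimal C++ sketch of that pattern (illustrative code, not from the patent; SharedQueue, run_pool, and Subtask are assumed names): each "core" is modeled as a thread that keeps popping subtasks from one mutex-protected software shared queue until the queue is drained, so a faster core naturally takes more subtasks, exactly as core C and core D do above.

```cpp
#include <cstdio>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <vector>

struct Subtask { int id; };  // one antenna or one user, per the chosen granularity

class SharedQueue {
    std::queue<Subtask> q_;
    std::mutex m_;  // software guarantee that each subtask has exactly one taker
public:
    void push(Subtask t) { std::lock_guard<std::mutex> lk(m_); q_.push(t); }
    std::optional<Subtask> pop() {
        std::lock_guard<std::mutex> lk(m_);
        if (q_.empty()) return std::nullopt;
        Subtask t = q_.front();
        q_.pop();
        return t;
    }
};

// Each core pulls until the queue is drained; no load pre-estimation is needed,
// because a core only asks for the next subtask after finishing its current one.
void run_pool(SharedQueue& q, int cores) {
    std::vector<std::thread> workers;
    for (int c = 0; c < cores; ++c)
        workers.emplace_back([&q, c] {
            while (auto t = q.pop())
                std::printf("core %d processed subtask %d\n", c, t->id);
        });
    for (auto& w : workers) w.join();  // returns only when the queue is empty
}
```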
It should be noted that if the shared queue is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
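For the software case, one common way to provide that guarantee (an illustrative assumption; the patent does not mandate a particular mechanism) is a single atomic cursor over a pre-built subtask array: std::atomic fetch_add hands every caller a distinct index, so two cores can never receive the same subtask. A hardware shared queue would enforce the same one-consumer-per-entry property in silicon.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

template <typename T>
class IndexedTaskArray {
    std::vector<T> items_;               // subtasks, filled before the cores start
    std::atomic<std::size_t> next_{0};   // shared cursor
public:
    explicit IndexedTaskArray(std::vector<T> items) : items_(std::move(items)) {}

    // Lock-free "pop": fetch_add returns a distinct index to every caller;
    // nullptr signals that the queue has been drained.
    T* take() {
        std::size_t i = next_.fetch_add(1, std::memory_order_relaxed);
        return i < items_.size() ? &items_[i] : nullptr;
    }
};
```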
It should also be noted that if the distributed scheduling manner combined with the centralized scheduling manner is used to schedule the tasks in the resource pool, that is, only the user-level tasks have been divided beforehand and the user-level subtasks added to the user-level shared queue, then each core acquires and processes only user-level subtasks from the user-level shared queue, while the antenna-level tasks are allocated directly; this is not described again here.

As can be seen from the above, this embodiment divides the tasks in the resource pool into multiple subtasks according to the preset granularity, adds these subtasks to the shared queue, and triggers multiple cores to sequentially acquire subtasks from the shared queue and process them. Because no load estimation is required in this solution, and the cores processing the tasks instead acquire subtasks from the shared queue according to their own load conditions and process them, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.

Based on the method described in Embodiment 1, further detailed examples are given in Embodiments 2 and 3 below.

Embodiment 2
In this embodiment, the apparatus for scheduling the resource pool in the multi-core system is specifically the centralized scheduling module in the multi-core system, the tasks in the resource pool include user-level tasks and antenna-level tasks, and the distributed scheduling manner is used as an example.

See FIG. 2a, a schematic diagram of the distributed scheduling scenario. As FIG. 2a shows, both user-level tasks and antenna-level tasks are divided, yielding user-level subtasks and antenna-level subtasks respectively, which are added to their respective shared queues: the user-level subtasks are written to (i.e., added to) the user-level shared queue, and the antenna-level subtasks are written to (i.e., added to) the antenna-level shared queue. Multiple cores are then triggered to sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and to sequentially acquire and process user-level subtasks from the user-level shared queue. This is described in detail below.

As shown in FIG. 2b, the specific flow may be as follows:

201. After the pre-processing module has finished processing its tasks, it triggers the centralized scheduling module.

The pre-processing module may specifically be another core, a hardware accelerator, hardware IP, or the like; for example, it may be core A and core B, i.e., after core A and core B have finished processing their tasks, they trigger the centralized scheduling module.

202. The centralized scheduling module divides the user-level tasks into multiple user-level subtasks according to a preset granularity, and divides the antenna-level tasks into multiple antenna-level subtasks according to the preset granularity.

The granularity of the division may be set according to the requirements of the actual application; for example, the tasks may be divided by function.

203. The centralized scheduling module adds the user-level subtasks to the user-level shared queue, and adds the antenna-level subtasks to the antenna-level shared queue.

The shared queue, such as the user-level shared queue or the antenna-level shared queue, may be a software shared queue or a hardware shared queue.
204. The centralized scheduling module triggers multiple cores to sequentially acquire antenna-level subtasks from the antenna-level shared queue and process them until all antenna-level subtasks in the antenna-level shared queue have been processed, after which step 205 is performed. For example, taking multiple cores including core C and core D, the details may be as follows:

When core C and core D begin processing, each acquires an antenna-level subtask from the antenna-level shared queue; for example, core C acquires the first antenna-level subtask, say the antenna 0 task, and core D acquires the second antenna-level subtask, say the antenna 1 task. Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next antenna-level subtask (the third one) from the antenna-level shared queue, say the antenna 2 task. Likewise, if core D is still processing the antenna 2 task when core C finishes the antenna 0 task, core C acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task; and if core C has not yet finished the antenna 0 task when core D finishes the antenna 2 task, core D acquires the next antenna-level subtask (the fourth one) from the antenna-level shared queue, i.e., the antenna 3 task, and so on. In other words, core C and core D sequentially acquire and process antenna-level subtasks from the antenna-level shared queue, and each core acquires the next antenna-level subtask only after completing the one it has already acquired.

Once the tasks in the antenna-level shared queue have been processed, core C and core D can no longer obtain subtasks from it; this indicates that those tasks in the resource pool have been processed, so core C and core D begin acquiring user-level subtasks from the user-level shared queue for processing, i.e., step 205 is performed.

205. The multiple cores sequentially acquire user-level subtasks from the user-level shared queue and process them until all user-level subtasks in the user-level shared queue have been processed. For example, taking multiple cores including core C and core D, the details may be as follows:

When core C and core D begin processing, each acquires a user-level subtask from the user-level shared queue; for example, core C acquires the first user-level subtask, say the user 0 task, and core D acquires the second user-level subtask, say the user 1 task. Because the software processing loads differ, core D may finish first; core D then continues by acquiring the next user-level subtask (the third one) from the user-level shared queue, say the user 2 task. Likewise, if core D is still processing the user 2 task when core C finishes the user 0 task, core C acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task; and if core C has not yet finished the user 0 task when core D finishes the user 2 task, core D acquires the next user-level subtask (the fourth one) from the user-level shared queue, i.e., the user 3 task, and so on. In other words, core C and core D sequentially acquire and process user-level subtasks from the user-level shared queue, and each core acquires the next user-level subtask only after completing the one it has already acquired.
It should be noted that in steps 204 and 205, if the shared queue (including the antenna-level shared queue and the user-level shared queue) is a software shared queue, the software must ensure that no two cores take the same subtask; if it is a hardware shared queue, the hardware must ensure that no two cores take the same subtask.
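Steps 204 and 205 impose a strict phase order: the user-level shared queue is only touched once the antenna-level shared queue is empty. Reusing the SharedQueue and run_pool sketch from Embodiment 1 (again illustrative, not the patent's code), that order can be expressed by draining the two queues one after the other; the thread join inside the first run acts as the barrier for "all antenna-level subtasks processed".

```cpp
// Two-phase drain for Embodiment 2, built on the SharedQueue/run_pool sketch
// in Embodiment 1. run_pool joins its worker threads before returning, so
// phase 2 cannot start until every antenna-level subtask has been processed.
void schedule_two_phase(SharedQueue& antenna_q, SharedQueue& user_q, int cores) {
    run_pool(antenna_q, cores);  // step 204: drain the antenna-level shared queue
    run_pool(user_q, cores);     // step 205: drain the user-level shared queue
}
```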
It should also be noted that only core C and core D are used as examples above; it should be understood that more cores may be included, such as core E, core F, and core G, implemented in the same way as above, which is not described again here.

As can be seen from the above, this embodiment divides the antenna-level tasks and the user-level tasks in the resource pool into multiple subtasks according to the preset granularity, adds these subtasks to the antenna-level shared queue and the user-level shared queue respectively, and triggers multiple cores to sequentially acquire subtasks from these shared queues and process them. Because no load estimation is required in this solution, and the cores processing the tasks instead acquire subtasks from the shared queues according to their own load conditions and process them, the problem of uneven task assignment among the cores caused by inaccurate load estimation can be avoided; the tasks in the resource pool can be scheduled more evenly, and the load of each core can be balanced effectively in real time.

Embodiment 3
与实施二相同的是,在本实施例中,依然以多核系统中资源池的调度装置 具体为多核系统中的集中调度模块,且资源池中的任务包括用户级任务和天线 级任务为例进行说明, 与实施例二不同的是, 在本实施例中, 将采用分布式调 结合的调度方式) 为例进行说明。
Referring to FIG. 3a, which is a schematic diagram of a scenario of the combined distributed and centralized scheduling manner, it can be seen from FIG. 3a that in this embodiment the antenna-level tasks are scheduled by direct allocation, whereas the user-level tasks are divided to obtain user-level subtasks, which are added to the user-level shared queue; multiple cores are then triggered to successively obtain user-level subtasks from the user-level shared queue and process them. Detailed description follows.
As shown in FIG. 3b, the specific procedure may be as follows:
301. After finishing its tasks, the preceding-stage processing module triggers the centralized scheduling module.
The preceding-stage processing module may specifically be another core, a hardware accelerator, a hardware IP device, or the like. For example, the preceding-stage processing module may be core A and core B; that is, after core A and core B finish their tasks, they trigger the centralized scheduling module.
302. The centralized scheduling module determines whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks; if they are antenna-level tasks, step 303 is performed; otherwise, if they are user-level tasks, step 304 is performed.
303. The centralized scheduling module allocates the antenna-level tasks directly, for example, as follows: obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and then allocate the tasks currently in the resource pool (that is, the antenna-level tasks) to the cores according to the processing capability information of each core and the obtained estimated load.
For example, taking multiple cores that include core C and core D as an example, the centralized scheduling module may obtain the processing capability information of core C and of core D, pre-calculate the load required for processing each antenna, and then allocate tasks to core C and core D according to the processing capability information of core C, the processing capability information of core D, and the result obtained from the load pre-calculation. For instance, the tasks allocated to core C may be antennas 0-3 of the antenna-level tasks, and the tasks allocated to core D may be antennas 4-7 of the antenna-level tasks, and so on; thereafter, core C and core D each process the tasks allocated to them.
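As one possible reading of this pre-calculation, the sketch below sorts the antenna tasks by estimated load and greedily gives the heaviest remaining task to the core whose capability-normalized accumulated load is lowest; the capability figures and per-antenna load estimates are invented placeholders, and the embodiment does not prescribe this particular heuristic.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    // Invented placeholders: relative capability of cores C and D, and the
    // estimated (pre-calculated) load of each antenna task.
    std::vector<double> capability = {1.0, 1.5};
    std::vector<double> antenna_load = {4, 3, 3, 2, 2, 2, 1, 1};

    // Visit antennas from heaviest to lightest estimated load.
    std::vector<int> order(antenna_load.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = (int)i;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return antenna_load[a] > antenna_load[b];
    });

    // Greedy direct allocation: each antenna task goes to the core whose
    // capability-normalized accumulated load is currently lowest.
    std::vector<double> assigned(capability.size(), 0.0);
    for (int antenna : order) {
        size_t best = 0;
        for (size_t c = 1; c < capability.size(); ++c)
            if (assigned[c] / capability[c] < assigned[best] / capability[best])
                best = c;
        assigned[best] += antenna_load[antenna];
        std::printf("antenna %d -> core %zu\n", antenna, best);
    }
}
```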
304. The centralized scheduling module divides the user-level tasks into multiple user-level subtasks at a preset granularity.
The granularity of the division may be set according to the requirements of the actual application; for example, the division may be performed according to the functions of the tasks, and so on.
305. The centralized scheduling module adds the user-level subtasks to a user-level shared queue.
The user-level shared queue may be a software shared queue or a hardware shared queue.
306. After the cores finish processing the antenna-level tasks, the centralized scheduling module triggers the multiple cores (alternatively, the centralized scheduling module need not trigger the cores, and the cores may proceed by themselves) to successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed. Taking multiple cores that include core C and core D as an example, this may specifically be as follows:
After core C and core D have finished the antenna-level tasks (for example, the antenna-level tasks allocated in step 303), the centralized scheduling module triggers core C and core D to each obtain a user-level subtask from the user-level shared queue. For example, core C obtains the first user-level subtask, such as the user 0 task, and core D obtains the second user-level subtask, such as the user 1 task. Because the software processing loads differ, core D finishes first, so core D continues by obtaining the next user-level subtask (that is, the third user-level subtask) from the user-level shared queue, such as the user 2 task. Similarly, if core D is still processing the user 2 task when core C finishes the user 0 task, then core C obtains the next user-level subtask (that is, the fourth user-level subtask), namely the user 3 task, from the user-level shared queue; whereas if core C has not yet finished the user 0 task when core D finishes the user 2 task, then core D obtains the next user-level subtask (that is, the fourth user-level subtask), namely the user 3 task, and so on. In other words, core C and core D successively obtain user-level subtasks from the user-level shared queue and process them, and each core obtains its next user-level subtask only after completing the one it has already obtained.
It should be noted that, in step 306, if the user-level shared queue is a software shared queue, the software needs to guarantee that no two cores obtain the same subtask; if it is a hardware shared queue, the hardware needs to guarantee that no two cores obtain the same subtask.
In addition, it should also be noted that the above description uses only core C and core D as examples; it should be understood that more cores may be included, such as core E, core F, and core G, whose specific implementation is the same as described above and is not repeated here.
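Combining steps 303 and 306, the following sketch lets each core first work through its directly allocated antenna tasks and then pull user-level subtasks from a shared counter-based queue. For simplicity, each core here enters the user-level phase as soon as its own allocation finishes; the embodiment's variant in which the centralized scheduling module triggers the cores would add an explicit synchronization point at that boundary. The allocations and counts are assumptions for illustration.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// User-level subtasks sit behind a shared atomic counter (step 306);
// 16 users is an arbitrary assumption for the sketch.
std::atomic<int> next_user{0};
const int kUsers = 16;

void core_worker(int core_id, std::vector<int> assigned_antennas) {
    // Centralized phase (step 303): a fixed per-core antenna allocation,
    // e.g. antennas 0-3 for core C and antennas 4-7 for core D.
    for (int a : assigned_antennas)
        std::printf("core %d: allocated antenna %d task\n", core_id, a);
    // Distributed phase (step 306): self-balancing pull from the shared
    // user-level queue at each core's own pace.
    for (int u; (u = next_user.fetch_add(1)) < kUsers; )
        std::printf("core %d: user %d task\n", core_id, u);
}

int main() {
    std::thread c(core_worker, 0, std::vector<int>{0, 1, 2, 3});  // core C
    std::thread d(core_worker, 1, std::vector<int>{4, 5, 6, 7});  // core D
    c.join();
    d.join();
}
```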
As can be seen from the above, in this embodiment the antenna-level tasks in the resource pool are allocated directly, while the user-level tasks are divided into multiple subtasks at a preset granularity; these subtasks are added to the user-level shared queue, and after the multiple cores have finished processing the antenna-level tasks, the multiple cores are triggered to successively obtain user-level subtasks from the user-level shared queue and process them. Because no load estimation is needed in this solution when processing the user-level tasks, and the multiple cores that process the tasks instead obtain user-level subtasks from the user-level shared queue according to their own load conditions, the problem of uneven task allocation among cores caused by insufficiently accurate load estimation when processing the antenna-level tasks can be compensated for, the tasks in the resource pool can be scheduled in a more balanced manner, and the load of each core can be balanced effectively in real time.

Embodiment Four
To better implement the above method, an embodiment of the present invention further provides a scheduling apparatus for a resource pool in a multi-core system. As shown in FIG. 4, the scheduling apparatus for a resource pool in a multi-core system includes a dividing unit 401, an adding unit 402, and a triggering unit 403.
The dividing unit 401 is configured to divide the tasks in the resource pool into multiple subtasks at a preset granularity.
The granularity of the division may be set according to the requirements of the actual application; for example, the division may be performed according to the functions of the tasks, and so on.
The adding unit 402 is configured to add the multiple subtasks obtained by the dividing unit 401 to a shared queue. The triggering unit 403 is configured to trigger multiple cores to successively obtain subtasks from the shared queue and process them, until all subtasks in the shared queue have been processed.
The tasks in the resource pool may be of multiple types; for example, they may include user-level tasks and antenna-level tasks. When dividing into subtasks, the division may be performed separately according to the type of the task; for example, the user-level tasks may be divided into multiple user-level subtasks at a preset granularity, and the antenna-level tasks may be divided into multiple antenna-level subtasks at a preset granularity, and so on. That is:
The dividing unit 401 may specifically be configured to divide the user-level tasks into multiple user-level subtasks at a preset granularity, and divide the antenna-level tasks into multiple antenna-level subtasks at a preset granularity.
In this case, the adding unit 402 may specifically be configured to add the user-level subtasks to a user-level shared queue, and add the antenna-level subtasks to an antenna-level shared queue.
The triggering unit 403 may specifically be configured to trigger multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, after which the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed.
The shared queue (such as the user-level shared queue or the antenna-level shared queue) may be a software shared queue or a hardware shared queue.
Optionally, to improve processing efficiency, different processing may also be performed according to the priority of the tasks. For example, because antenna processing requires a relatively large load, a centralized scheduling manner may be adopted for high-priority tasks such as antenna-level tasks; and because user processing requires a relatively small load, a distributed scheduling manner may be adopted for low-priority tasks such as user-level tasks. That is, as shown in FIG. 5, the scheduling apparatus for a resource pool in a multi-core system may further include a judging unit 404 and an allocating unit 405.
The judging unit 404 may be configured to determine whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks.
The allocating unit 405 may be configured to directly allocate the tasks currently in the resource pool (that is, the antenna-level tasks) when the judging unit 404 determines that the tasks currently in the resource pool are antenna-level tasks.
The dividing unit 401 is specifically configured to divide the tasks currently in the resource pool (that is, the user-level tasks) into multiple subtasks at a preset granularity when the judging unit determines that the tasks currently in the resource pool are user-level tasks.
For example, the allocating unit 405 may include an obtaining subunit, an estimating subunit, and an allocating subunit.
The obtaining subunit is configured to obtain the processing capability information of each core.
The estimating subunit is configured to estimate the load required for processing each antenna, to obtain an estimated load. The allocating subunit is configured to allocate the tasks currently in the resource pool to the cores according to the processing capability information of each core obtained by the obtaining subunit and the estimated load obtained by the estimating subunit.
In addition, before the dividing unit 401 "divides the tasks in the resource pool into multiple subtasks at a preset granularity", other processing may also be performed; for example, the tasks may first be processed by another preceding-stage processing module, which then triggers the dividing unit 401 to perform the operation of "dividing the tasks in the resource pool into multiple subtasks at a preset granularity". That is, the scheduling apparatus for a resource pool in a multi-core system may further include an accepting unit.
The accepting unit is configured to accept the trigger of the preceding-stage processing module after determining that the preceding-stage processing module has finished its tasks.
The preceding-stage processing module may specifically be another core, a hardware accelerator, a hardware IP, or the like. In specific implementation, each of the above units may be implemented as an independent entity, or the units may be combined arbitrarily and implemented as one or several entities; for example, the scheduling apparatus for a resource pool in a multi-core system may specifically be a centralized scheduling module in the multi-core system, as in Embodiments Two and Three. For the specific implementation of the above units, reference may be made to the preceding method embodiments; details are not repeated here.
As can be seen from the above, the dividing unit 401 of the scheduling apparatus for a resource pool in a multi-core system of this embodiment can divide the tasks in the resource pool into multiple subtasks at a preset granularity; the adding unit 402 then adds these subtasks to a shared queue, and the triggering unit 403 triggers multiple cores to successively obtain subtasks from the shared queue and process them. Because no load estimation is needed in this solution, and the multiple cores that process the tasks instead obtain subtasks from the shared queue according to their own load conditions, the problem of uneven task allocation among cores caused by insufficiently accurate load estimation can be avoided, the tasks in the resource pool can be scheduled in a more balanced manner, and the load of each core can be balanced effectively in real time.

Embodiment Five
Correspondingly, an embodiment of the present invention further provides a communication system, which includes any of the scheduling apparatuses for a resource pool in a multi-core system provided by the embodiments of the present invention, where the scheduling apparatus for a resource pool in a multi-core system may specifically be a centralized scheduling module in the multi-core system, for example, as follows: the centralized scheduling module is configured to divide the tasks in the resource pool into multiple subtasks at a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to successively obtain subtasks from the shared queue and process them, until the subtasks in the shared queue have been processed.
The granularity of the division may be set according to the requirements of the actual application; for example, the division may be performed according to the functions of the tasks, and so on.
The tasks in the resource pool may be of multiple types; for example, they may include user-level tasks and antenna-level tasks. When dividing into subtasks, the division may be performed separately according to the type of the task; for example, the user-level tasks may be divided into multiple user-level subtasks at a preset granularity, and the antenna-level tasks may be divided into multiple antenna-level subtasks at a preset granularity, and so on. In this case, the centralized scheduling module may specifically add the user-level subtasks to a user-level shared queue and the antenna-level subtasks to an antenna-level shared queue, and may thereafter trigger multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, after which the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed.
Optionally, to improve processing efficiency, different processing may also be performed according to the priority of the tasks. For example, because antenna processing requires a relatively large load, a centralized scheduling manner may be adopted for high-priority tasks such as antenna-level tasks; and because user processing requires a relatively small load, a distributed scheduling manner may be adopted for low-priority tasks such as user-level tasks. That is:
The centralized scheduling module is further configured to determine whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks; if they are antenna-level tasks, directly allocate the tasks currently in the resource pool (that is, the antenna-level tasks); if they are user-level tasks, perform the operation of dividing the tasks currently in the resource pool (that is, the user-level tasks) into multiple subtasks at a preset granularity.
The operation of directly allocating the tasks in the resource pool may specifically be as follows:
Obtain the processing capability information of each core, estimate the load required for processing each antenna to obtain an estimated load, and then allocate the tasks currently in the resource pool (that is, the antenna-level tasks) to the cores according to the processing capability information of each core and the obtained estimated load.
In addition, the communication system may further include multiple cores, which may be configured to, under the trigger of the centralized scheduling module, successively obtain subtasks from the shared queue and process them, until the subtasks in the shared queue have been processed; for details, reference may be made to the preceding embodiments, which are not repeated here. Further, the communication system may also include a preceding-stage processing module, configured to trigger the centralized scheduling module to perform the above scheduling operations after finishing its tasks; details are not repeated here.
For the specific implementation of the above devices, reference may be made to the preceding method embodiments; details are not repeated here. As can be seen from the above, the centralized scheduling module in the communication system of this embodiment divides the tasks in the resource pool into multiple subtasks at a preset granularity, adds these subtasks to a shared queue, and triggers multiple cores to successively obtain subtasks from the shared queue and process them. Because no load estimation is needed in this solution, and the multiple cores that process the tasks instead obtain subtasks from the shared queue according to their own load conditions, the problem of uneven task allocation among cores caused by insufficiently accurate load estimation can be avoided, the tasks in the resource pool can be scheduled in a more balanced manner, and the load of each core can be balanced effectively in real time.

Embodiment Six
A communication device of a multi-core system, as shown in FIG. 6, includes a processor 601, a memory 602 for storing data and programs, and a transceiver module 603 for sending and receiving data, where:
The processor 601 is configured to divide the tasks in the resource pool into multiple subtasks at a preset granularity, add the multiple subtasks to a shared queue, and trigger multiple cores to successively obtain subtasks from the shared queue and process them, until the subtasks in the shared queue have been processed.
The granularity of the division may be set according to the requirements of the actual application; for example, the division may be performed according to the functions of the tasks, and so on.
The tasks in the resource pool may be of multiple types; for example, they may include user-level tasks and antenna-level tasks. When dividing into subtasks, the division may be performed separately according to the type of the task; for example, the user-level tasks may be divided into multiple user-level subtasks at a preset granularity, and the antenna-level tasks may be divided into multiple antenna-level subtasks at a preset granularity, and so on. That is:
The processor 601 may specifically be configured to divide the user-level tasks into multiple user-level subtasks at a preset granularity and divide the antenna-level tasks into multiple antenna-level subtasks at a preset granularity; add the user-level subtasks to a user-level shared queue and the antenna-level subtasks to an antenna-level shared queue; and trigger multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, after which the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed. It should be noted that if the shared queue is a software shared queue, the software needs to guarantee that no two cores obtain the same subtask; if it is a hardware shared queue, the hardware needs to guarantee that no two cores obtain the same subtask.
Optionally, to improve processing efficiency, different processing may also be performed according to the priority of the tasks. For example, because antenna processing requires a relatively large load, a centralized scheduling manner may be adopted for high-priority tasks such as antenna-level tasks; and because user processing requires a relatively small load, a distributed scheduling manner may be adopted for low-priority tasks such as user-level tasks. That is, before the processor 601 performs the operation of "dividing the tasks in the resource pool into multiple subtasks at a preset granularity":
The processor 601 may further be configured to determine whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks; if they are antenna-level tasks, directly allocate the tasks currently in the resource pool; if they are user-level tasks, perform the operation of dividing the tasks currently in the resource pool into multiple subtasks at a preset granularity.
The operation of directly allocating the tasks in the resource pool may specifically be as follows:
Obtain the processing capability information of each core and the load information of the antennas, estimate the load of each task according to the processing capability information of each core and the load information of the antennas to obtain an estimated load, and allocate the tasks currently in the resource pool (that is, the antenna-level tasks) to the cores according to the obtained estimated load.
For the specific implementation of the above parts, reference may be made to the preceding embodiments; details are not repeated here.
As can be seen from the above, the processor 601 in the communication device of the multi-core system of this embodiment can divide the tasks in the resource pool into multiple subtasks at a preset granularity, add these subtasks to a shared queue, and trigger multiple cores to successively obtain subtasks from the shared queue and process them. Because no load estimation is needed in this solution, and the multiple cores that process the tasks instead obtain subtasks from the shared queue according to their own load conditions, the problem of uneven task allocation among cores caused by insufficiently accurate load estimation can be avoided, the tasks in the resource pool can be scheduled in a more balanced manner, and the load of each core can be balanced effectively in real time.

A person of ordinary skill in the art may understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.

The scheduling method, apparatus, and system for a resource pool in a multi-core system provided by the embodiments of the present invention have been described above. The description of the above embodiments is merely intended to help understand the method of the present invention and its core idea. Meanwhile, a person skilled in the art may, according to the idea of the present invention, make changes to the specific implementation and the application scope. In conclusion, the content of this specification shall not be construed as a limitation on the present invention.

Claims

1. A scheduling method for a resource pool in a multi-core system, comprising:
dividing tasks in the resource pool into multiple subtasks at a preset granularity;
adding the multiple subtasks to a shared queue; and
triggering multiple cores to successively obtain subtasks from the shared queue and process them, until all subtasks in the shared queue have been processed.
2. The method according to claim 1, wherein the tasks in the resource pool comprise user-level tasks and antenna-level tasks, and the dividing tasks in the resource pool into multiple subtasks at a preset granularity comprises:
dividing the user-level tasks into multiple user-level subtasks at a preset granularity, and dividing the antenna-level tasks into multiple antenna-level subtasks at a preset granularity;
wherein the adding the multiple subtasks to a shared queue is specifically: adding the user-level subtasks to a user-level shared queue, and adding the antenna-level subtasks to an antenna-level shared queue.
3. The method according to claim 2, wherein the triggering multiple cores to successively obtain subtasks from the shared queue and process them, until all subtasks in the shared queue have been processed, comprises:
triggering multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, and then having the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed.
4. The method according to claim 1, wherein the tasks in the resource pool comprise user-level tasks and antenna-level tasks, and before the dividing tasks in the resource pool into multiple subtasks at a preset granularity, the method further comprises:
determining whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks;
if they are antenna-level tasks, directly allocating the tasks currently in the resource pool; and if they are user-level tasks, performing the step of dividing the tasks currently in the resource pool into multiple subtasks at a preset granularity.
5. The method according to claim 4, wherein the directly allocating the tasks currently in the resource pool comprises: obtaining processing capability information of each core;
estimating a load required for processing each antenna, to obtain an estimated load; and
allocating the tasks currently in the resource pool to the cores according to the processing capability information of each core and the estimated load.
6. The method according to any one of claims 1 to 5, wherein
the shared queue is specifically a software shared queue or a hardware shared queue.
7. A scheduling apparatus for a resource pool in a multi-core system, comprising:
a dividing unit, configured to divide tasks in the resource pool into multiple subtasks at a preset granularity; an adding unit, configured to add the multiple subtasks obtained by the dividing unit to a shared queue; and a triggering unit, configured to trigger multiple cores to successively obtain subtasks from the shared queue and process them, until all subtasks in the shared queue have been processed.
8. The scheduling apparatus for a resource pool in a multi-core system according to claim 7, wherein the tasks in the resource pool comprise user-level tasks and antenna-level tasks, and wherein:
the dividing unit is specifically configured to divide the user-level tasks into multiple user-level subtasks at a preset granularity, and divide the antenna-level tasks into multiple antenna-level subtasks at a preset granularity; and
the adding unit is specifically configured to add the user-level subtasks to a user-level shared queue, and add the antenna-level subtasks to an antenna-level shared queue.
9. The scheduling apparatus for a resource pool in a multi-core system according to claim 8, wherein the triggering unit is specifically configured to trigger multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, and then have the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed.
10. The scheduling apparatus for a resource pool in a multi-core system according to claim 7, wherein the tasks in the resource pool comprise user-level tasks and antenna-level tasks, and the scheduling apparatus for a resource pool in a multi-core system further comprises a judging unit and an allocating unit;
the judging unit is configured to determine whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks; the allocating unit is configured to directly allocate the tasks currently in the resource pool when the judging unit determines that the tasks currently in the resource pool are antenna-level tasks; and
the dividing unit is specifically configured to divide the tasks currently in the resource pool into multiple subtasks at a preset granularity when the judging unit determines that the tasks currently in the resource pool are user-level tasks.
11. The scheduling apparatus for a resource pool in a multi-core system according to claim 10, wherein the allocating unit comprises an obtaining subunit, an estimating subunit, and an allocating subunit;
the obtaining subunit is configured to obtain processing capability information of each core;
the estimating subunit is configured to estimate a load required for processing each antenna, to obtain an estimated load; and the allocating subunit is configured to allocate the tasks currently in the resource pool to the cores according to the processing capability information of each core obtained by the obtaining subunit and the estimated load obtained by the estimating subunit.
12. A communication system, comprising the scheduling apparatus for a resource pool in a multi-core system according to any one of claims 7 to 11.
13. A communication device of a multi-core system, comprising a processor, a memory for storing data and programs, and a transceiver module for sending and receiving data;
wherein the processor is configured to: divide tasks in a resource pool into multiple subtasks at a preset granularity; add the multiple subtasks to a shared queue; and trigger multiple cores to successively obtain subtasks from the shared queue and process them, until the subtasks in the shared queue have been processed.
14. The communication device of a multi-core system according to claim 13, wherein the tasks in the resource pool comprise user-level tasks and antenna-level tasks, and wherein:
the processor is specifically configured to divide the user-level tasks into multiple user-level subtasks at a preset granularity and divide the antenna-level tasks into multiple antenna-level subtasks at a preset granularity; add the user-level subtasks to a user-level shared queue and the antenna-level subtasks to an antenna-level shared queue; and trigger multiple cores to successively obtain antenna-level subtasks from the antenna-level shared queue and process them, until all antenna-level subtasks in the antenna-level shared queue have been processed, and then have the multiple cores successively obtain user-level subtasks from the user-level shared queue and process them, until all user-level subtasks in the user-level shared queue have been processed.
15. The communication device of a multi-core system according to claim 13, wherein the processor is further configured to determine whether the tasks currently in the resource pool are antenna-level tasks or user-level tasks; if they are antenna-level tasks, directly allocate the tasks currently in the resource pool; and if they are user-level tasks, perform the operation of dividing the tasks currently in the resource pool into multiple subtasks at a preset granularity.