CN116680044A - Task scheduling method, device, electronic equipment and computer readable storage medium

Task scheduling method, device, electronic equipment and computer readable storage medium

Info

Publication number
CN116680044A
CN116680044A (application CN202210167735.8A)
Authority
CN
China
Prior art keywords
task
processing units
processing unit
processing
tasks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210167735.8A
Other languages
Chinese (zh)
Inventor
周臣
高军
李兵兵
李声融
刘志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202210167735.8A priority Critical patent/CN116680044A/en
Priority to PCT/CN2023/074709 priority patent/WO2023160371A1/en
Publication of CN116680044A publication Critical patent/CN116680044A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5027 Allocation of resources, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5038 Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/505 Allocation of resources considering the load
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/483 Multiproc
    • G06F2209/484 Precedence
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5019 Workload prediction
    • G06F2209/5021 Priority

Abstract

The disclosure provides a task scheduling method, a task scheduling device, electronic equipment, and a computer readable storage medium. The task scheduling method includes selecting a first set of processing units from a plurality of sets of processing units based on the traffic type and execution time of a first task. The method also includes assigning the first task to a processing unit in the first set of processing units for execution. With embodiments of the present disclosure, tasks can be scheduled to appropriate processing resources for execution based on their traffic type and execution time, achieving load balancing in complex scenarios.

Description

Task scheduling method, device, electronic equipment and computer readable storage medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and more particularly, embodiments of the present disclosure relate to a task scheduling method, apparatus, electronic device, computer-readable storage medium, and computer program product.
Background
In recent years, to improve the security, reliability, and performance of storage systems and to reduce cost, various types of tasks, such as data verification, encryption and decryption, data compression, and artificial intelligence (AI) inference, have been introduced into storage systems. The frequency and overhead of these tasks are increasing, which challenges the computing power of the storage system.
Because it is difficult to increase processor power by substantially improving single-core performance, existing storage systems rely more on the number of processors and processor cores to increase computing power. Some storage systems provide a pool of processing resources consisting of processing cores or processors, and schedule associated or similar tasks to the same processing resources so as to increase cache hit rates and improve performance. However, customer scenarios are increasingly complex, and tasks in the storage system vary greatly in number and in the processing resources they require, which results in uneven load across processing resources.
Disclosure of Invention
The embodiments of the present disclosure provide a technical solution for task scheduling. The solution can schedule a task to appropriate processing resources for execution based on the traffic type and execution time of the task, so as to achieve load balancing in complex scenarios.
According to a first aspect of the present disclosure, a task scheduling method is provided. The method comprises: selecting a first set of processing units from a plurality of sets of processing units based on the traffic type and execution time of a first task, and assigning the first task to a processing unit in the first set of processing units for execution. In this way, tasks with the same or similar execution times can be assigned to the same processing resources for execution, so that tasks are distributed more evenly across processing resources, improving task scheduling efficiency and system performance.
In some embodiments, the multiple sets of processing units are associated with different traffic types and different execution time levels. In this way, tasks can be processed separately by grouped processing resources. For example, compute-intensive tasks and tasks related to storage business logic are assigned to different processing resources for execution, which preserves business affinity while maintaining scheduling efficiency.
In some embodiments, assigning the first task to a processing unit in the first set of processing units may include: assigning the first task to the processing unit in the first set with the fewest tasks to be executed. In this way, load balancing within the set of processing units is achieved.
In some embodiments, the processing unit has a plurality of execution queues associated with priorities, and assigning the first task to a processing unit in the first set of processing units may include: adding the first task to one of the plurality of execution queues based on the priority of the first task. In this way, tasks can be executed according to their priorities, meeting the real-time requirements of different tasks.
In some embodiments, the processing unit acquires tasks from the plurality of queues for execution in a round-robin fashion, with the number of tasks acquired each time determined based on the respective priorities of the queues. In this way, more tasks can be acquired at a time from a high-priority queue, for example, so that the real-time requirements of different tasks can be satisfied.
In some embodiments, the processing unit is one of a controller, a processor, or a processor core. In this way, processing resources may be managed at different granularities.
According to a second aspect of the present disclosure, a task scheduling method is provided. The method comprises: selecting a second set of processing units from the plurality of sets of processing units based on the traffic type and execution time of a second task; and, when the second set of processing units is determined to be in a high load state, adding the second task to a task queue shared by the second set of processing units and a third set of processing units from the plurality of sets, wherein the second and third sets of processing units have the same execution time level but execute tasks of different traffic types. In this way, a global queue is provided that spans sets of processing units of the same execution time level. When changing conditions cause load imbalance, lightly loaded processing units can fetch and execute tasks from the global queue, so that tasks are distributed more evenly across processing resources, improving task scheduling efficiency and system performance.
In some embodiments, if it is determined that there are idle processing units in the second set of processing units or the third set of processing units, tasks in the shared task queue are assigned to the idle processing units for execution. In this way, tasks in the shared task queue may be executed by processing units with lower loads, ensuring that the tasks are executed in time.
In some embodiments, tasks in the shared task queue may be periodically assigned to the second or third set of processing units for execution. In this way, in some abnormal situations, for example when the sets of processing units remain highly loaded or busy, tasks in the shared task queue can still be executed rather than remaining suspended in the queue indefinitely.
In some embodiments, the processing unit may be one of a controller, a processor, or a processor core. In this way, processing resources may be managed at different granularities.
According to a third aspect of the present disclosure, there is provided a task scheduling device including a selection unit and an allocation unit. The selection unit is configured to select a first set of processing units from the plurality of sets of processing units based on the traffic type and execution time of the first task. The allocation unit is configured to allocate the first task to a processing unit of the first set of processing units for execution.
According to a fourth aspect of the present disclosure, there is provided a task scheduling device including a selection unit, a load determination unit, and an equalization unit. The selection unit is configured to select a second set of processing units from the plurality of sets of processing units based on the traffic type and execution time of the second task. The load determination unit is configured to determine that the second set of processing units is in a high load state. The balancing unit is configured to add the second task to a task queue shared by the second set of processing units and a third set of processing units of the plurality of sets of processing units when the second set of processing units is in a high load state. The second set of processing units and the third set of processing units have the same execution time level and the second set of processing units and the third set of processing units are for executing tasks of different traffic types.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: a processing unit and a memory, the processing unit executing instructions in the memory, causing the electronic device to perform a method according to the first or second aspect of the disclosure.
According to a sixth aspect of the present disclosure there is provided a computer readable storage medium having stored thereon one or more computer instructions, wherein execution of the one or more computer instructions by a processor causes the processor to perform the method according to the first or second aspect of the present disclosure.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising machine executable instructions which, when executed by an apparatus, cause the apparatus to perform the method according to the first or second aspect of the present disclosure.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which like or similar reference numerals designate like or similar elements, and in which:
FIG. 1 illustrates a schematic diagram of an example environment in which various embodiments of the present disclosure may be implemented;
FIG. 2 illustrates a schematic diagram of another example environment in which various embodiments of the present disclosure can be implemented;
FIG. 3 illustrates a schematic flow diagram of a task scheduling method according to some embodiments of the present disclosure;
FIG. 4 illustrates a schematic diagram of a layout of processing resources according to some embodiments of the present disclosure;
FIG. 5 illustrates a schematic diagram of distributing tasks among a set of processing units according to some embodiments of the present disclosure;
FIG. 6 illustrates a schematic flow diagram of another task scheduling method according to some embodiments of the present disclosure;
FIG. 7 illustrates a schematic diagram of a shared task queue, according to some embodiments of the present disclosure;
FIG. 8 illustrates a schematic block diagram of a task scheduling device according to some embodiments of the present disclosure;
FIG. 9 illustrates a schematic block diagram of another task scheduling device according to some embodiments of the present disclosure; and
FIG. 10 shows a schematic block diagram of an example device that may be used to implement embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
In describing embodiments of the present disclosure, the term "comprising" and its variants should be taken as open-ended, i.e., "including, but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", and the like may refer to different or to the same objects. Other explicit and implicit definitions may also be included below. It is noted that all specific values herein are examples provided for ease of understanding and are not limiting.
In order to improve safety, reliability, and performance and to reduce cost, tasks such as data verification, encryption and decryption, and data compression have been introduced into storage systems. Some schemes schedule tasks according to task affinity, scheduling tasks with associated or similar business logic to the same processing resources for execution. For example, the processing units (e.g., controllers, processors, processor cores) of a storage system are divided into sets according to traffic type, including I/O read-write, data exchange channels, protocol parsing, and so on. Depending on its traffic type, a task is scheduled to the corresponding set of processing units and then assigned to a processing unit within the group for execution. This is unproblematic when the execution times of all tasks do not differ greatly. However, as customer scenarios grow more complex, even tasks of the same traffic type can differ significantly in execution time. For example, within a particular traffic type, the execution times of data compression and AI inference tasks may be on the millisecond level, those of data flushing and data aggregation tasks on the hundred-microsecond level, and those of latency-sensitive front-end, mirror network, and cache hit query tasks on the ten-microsecond level. As a result, the original scheduling policy is no longer accurate, scheduling efficiency degrades, and some processing resources may sit idle.
In view of this, the present disclosure provides a scheme for scheduling tasks according to their traffic types and execution times. According to embodiments of the present disclosure, a set of processing units may be selected from a plurality of sets according to the execution time and traffic type of a task, and the task may then be assigned to a processing unit in the selected set for execution. In this way, tasks with the same or similar execution times can be assigned to the same processing resources for execution, so that load is distributed more evenly across processing resources, improving task scheduling efficiency and system performance.
Embodiments of the present disclosure are described below with reference to fig. 1 to 10.
Example Environment
FIG. 1 illustrates a schematic diagram of an example environment in which various embodiments of the present disclosure may be implemented. As shown in FIG. 1, host 110 may communicate with storage control system 130 via front-end network card 120. For example, host 110 may generate tasks and send them to storage control system 130 for execution. The storage control system 130 includes a plurality of storage nodes (or simply "nodes") 132. A node 132 may communicate with host 110 through front-end network card 120 and with back-end storage 150 through back-end network card 140.
Node 132 may include a plurality of controllers 133. A controller 133 may be a computing device such as a server, desktop computer, or cluster. The controller 133 includes at least a processor, a memory, and a bus. The processor processes requests from the host 110 or requests generated internally by the storage control system 130. The controller 133 may include a plurality of processors, each of which includes a plurality of processor cores. The memory temporarily stores data received from the host 110 or data read from the hard disk 151 of the storage device 150. The bus is used for communication among the internal components of the controller 133.
The front-end network card 120 may be connected to the plurality of controllers 133 included in each node 132 through an internal network channel, and the back-end network card 140 may likewise be connected to the plurality of controllers 133 included in each node through an internal network channel. Thus, the controller 133 in each node 132 can transmit and receive traffic through the front-end network card 120 or the back-end network card 140. In some embodiments, each node 132 may have its own front-end network card 120 and back-end network card 140. Alternatively, the front-end network card 120 and the back-end network card 140 may be shared among the plurality of nodes 132 of the storage control system 130.
The storage control system 130 includes a task scheduling unit 101. The task scheduling unit 101 may be implemented as a computing device running program code to schedule tasks from the host 110 for execution by the nodes 132. In some embodiments, when the storage control system 130 receives a task from the host 110, the task scheduling unit 101 schedules the task to an appropriate processing resource (e.g., a controller 133, a processor in the controller, or a processor core) for execution. In some embodiments, the task scheduling unit 101 may schedule tasks according to attributes of the tasks (e.g., traffic type, execution time, priority), the busyness of processing resources, and the like. The task scheduling process according to embodiments of the present disclosure is further described below with reference to FIGS. 3 to 9. The storage control system 130 may access the back-end storage 150 via the back-end network card 140.
The storage 150 may include a plurality of hard disks or hard disk clusters 151. The hard disk 151 is used for storing data and may be a magnetic disk or another type of storage medium, such as a solid state disk or a shingled magnetic recording hard disk. In addition, the nodes 132 may be coupled to the storage 150 through the back-end network card 140 to enable data sharing among the nodes. In some application scenarios, the storage control system 130 (including the front-end network card 120 and the back-end network card 140) and the storage 150 may be collectively referred to as a storage system or storage device.
FIG. 2 illustrates a schematic diagram of another example environment 200 in which various embodiments of the present disclosure can be implemented. In contrast to the hardware environment depicted in FIG. 1, FIG. 2 illustrates an environment resulting from hardware virtualization.
As shown in FIG. 2, nodes 132 in storage control system 130 may be virtualized to create a virtual pool 220 of processing resources. In particular, the processing units of nodes 132 in storage control system 130 may be combined and partitioned into multiple sets 221 of processing units. The processing units 222 in a set of processing units 221 may be controllers of nodes, processors in a controller, or processor cores in a processor, depending on the particular implementation, which the present disclosure does not limit. In this way, processing resources may be managed at different granularities.
Take the processing unit being a processor core as an example. Suppose the storage control system has 4 nodes, each node has 4 controllers 133, each controller has 4 CPUs, and each CPU has 48 CPU cores, so one node 132 has 768 CPU cores in total. With 4 nodes, the total number of cores reaches 3072. If each set of processing units is configured with 16 cores, 192 sets of processing units can be obtained; if each set is configured with 32 cores, 96 sets can be obtained. It should be understood that the present disclosure does not limit the number of processing units 222 in a processing unit set 221, and different sets 221 may have the same or different numbers of processing units 222.
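A quick sketch of this sizing arithmetic, using the example figures above (which are illustrative, not values fixed by the disclosure):

```python
# Example configuration: 4 controllers/node x 4 CPUs/controller x 48 cores/CPU.
nodes, controllers_per_node, cpus_per_controller, cores_per_cpu = 4, 4, 4, 48
cores_per_node = controllers_per_node * cpus_per_controller * cores_per_cpu
total_cores = nodes * cores_per_node
print(cores_per_node)     # 768 cores per node
print(total_cores)        # 3072 cores in the 4-node system
print(total_cores // 16)  # 192 sets of 16 cores each
print(total_cores // 32)  # 96 sets of 32 cores each
```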
The processing resource pool 220 may be presented to the task scheduling unit 101 so that the task scheduling unit 101 need not consider the underlying hardware, and instead schedules tasks against the virtualized resource pool 220 comprising the sets of processing units 221. In some embodiments, in view of task affinity, a set of processing units 221 may be configured to perform tasks associated with a particular traffic type in order to increase cache hit rates and thereby improve performance. In addition to grouping by traffic type, a set of processing units 221 may also be configured to perform tasks at a specific execution time level. For example, one set of processing units is used to perform tasks with relatively long execution times, such as data compression, encryption and decryption, and AI inference, while another set is used to perform tasks with shorter execution times, such as I/O reading and writing, metadata processing, and protocol parsing. In this manner, when the storage control system 130 receives a task, the task can be scheduled to an appropriate set of processing units 221 and executed by one or more processing units within the set.
It should be understood that the environments illustrated in fig. 1 and 2 are merely exemplary, and that embodiments of the present disclosure may also be implemented in different environments. For example, embodiments of the present disclosure may also be implemented in other computer systems other than storage systems.
Task scheduling
Fig. 3 illustrates a schematic flow diagram of a task scheduling method 300 according to some embodiments of the present disclosure. The method 300 may be performed by the task scheduling unit 101 as shown in fig. 1 or fig. 2, or by a computing device implementing the task scheduling unit 101.
At block 310, the task scheduling unit 101 selects a first set of processing units from the plurality of sets of processing units based on the traffic type and execution time of the first task. In some embodiments, the storage control system 130 receives the first task from the host 110 via the front-end network card 120. The first task may have associated attribute information, such as traffic type, execution time, and priority information.
Traffic types may include, but are not limited to, protocol parsing, I/O read-write (at the front end of the cache), cache flushing (at the back end of the cache), data exchange, and so on. The execution time may be expressed as the order of magnitude of the time required to execute a task, e.g., on the order of milliseconds, hundreds of microseconds, or tens of microseconds. Alternatively, the execution time of the task may be a determined or estimated amount of time. According to embodiments of the present disclosure, the task scheduling unit 101 may select the set of processing units to execute a task along two dimensions, namely the traffic type and the execution time of the task.
The priority information of a task may be determined based on the process or thread from which the task originates; for example, a task spawned by an operating system process has a higher priority than one from an ordinary user process. The priority of a task may also be determined based on other factors, such as the real-time requirements of the task or the amount of resources it requires, which the present disclosure does not limit.
In some embodiments, the multiple sets of processing units may be associated with different traffic types and different execution time levels, respectively. That is, one set of processing units may be configured to execute tasks of a corresponding traffic type at a corresponding execution time level. Thus, the multiple sets of processing units may be organized along two dimensions, traffic type and execution time level, and tasks may be scheduled based on the match between a task and the sets of processing units.
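As a rough illustration of this two-dimensional organization, the following sketch models the sets of processing units as a table keyed on (traffic type, execution time level). The names, traffic-type strings, and classification thresholds are illustrative assumptions that the disclosure does not fix:

```python
from enum import Enum

class TimeLevel(Enum):
    LONG = "long"      # e.g., millisecond-scale tasks
    MEDIUM = "medium"  # e.g., hundred-microsecond-scale tasks
    SHORT = "short"    # e.g., ten-microsecond-scale tasks

def classify(execution_time_us: float) -> TimeLevel:
    # Illustrative thresholds; the disclosure leaves the exact rule open.
    if execution_time_us >= 1000:
        return TimeLevel.LONG
    if execution_time_us >= 100:
        return TimeLevel.MEDIUM
    return TimeLevel.SHORT

# Each cell holds a group of processing units (strings stand in here);
# some cells may be absent, mirroring the blank cells noted later.
pool_table = {
    ("protocol_parsing", TimeLevel.LONG): ["unit-A1", "unit-A2"],
    ("protocol_parsing", TimeLevel.MEDIUM): ["unit-B1", "unit-B2"],
    ("cache_flush", TimeLevel.LONG): ["unit-C1", "unit-C2"],
}

def select_set(traffic_type: str, execution_time_us: float):
    """Block 310: pick the set matching both scheduling dimensions."""
    return pool_table.get((traffic_type, classify(execution_time_us)))
```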
Fig. 4 illustrates a schematic diagram of a layout 400 of processing resources according to some embodiments of the present disclosure. The processing resources include a plurality of processing unit sets 221. Each processing unit set 221 may include a plurality of processing units. In some embodiments, the processing unit may be a controller in a storage control system, a processor in a controller, or a processor core in a processor.
For ease of understanding, the plurality of processing unit sets 221 are shown in FIG. 4 as being arranged in rows and columns, where columns correspond to traffic types and rows correspond to execution time levels. It should be appreciated that the sets of processing units 221 in the processing resource pool 220 are not necessarily arranged in rows and columns.
As shown in FIG. 4, the sets of processing units in the column of traffic type 1 may be configured to execute tasks of traffic type 1, the sets in the column of traffic type 2 may be configured to execute tasks of traffic type 2, and so on. Similarly, the sets of processing units in the long-task row may be configured to execute tasks with longer execution times, those in the medium-task row tasks with medium execution times, and those in the short-task row tasks with shorter execution times. It will be appreciated that the execution time of a task depends on the performance of the processing unit, so "longer", "medium", and "shorter" execution times are relative concepts here. For example, a task with an execution time on the order of milliseconds may be considered a long task, one on the order of hundreds of microseconds a medium task, and one on the order of tens of microseconds a short task. The present disclosure does not limit the specific rules for classifying tasks by execution time.
Further, while fig. 4 shows the division of tasks into long, medium, and short tasks, it should be understood that the execution time level of the tasks is not limited thereto, and the tasks may be more simply divided into two levels, or more.
As described above, as application scenarios become more complex, the execution times of tasks of the same traffic type may vary considerably. The execution time of a task is associated with the operations the task involves. For example, among tasks whose traffic type is protocol parsing, tasks associated with Data Integrity Field (DIF) operations typically require a longer time (e.g., on the order of milliseconds) and are therefore classified as long tasks, while tasks associated with protocol processing execute in a shorter time (e.g., on the order of hundreds of microseconds) and are therefore classified as medium tasks. As another example, among tasks whose traffic type is cache flushing, tasks associated with data compression and erasure coding (EC) typically require a longer time (e.g., on the order of milliseconds) and are therefore classified as long tasks, while tasks associated with I/O processing and metadata processing execute in a shorter time (e.g., on the order of hundreds of microseconds) and are therefore classified as medium tasks. As yet another example, latency-sensitive front-end tasks, cache hit queries, network message processing, and the like execute in a shorter time (e.g., on the order of tens of microseconds or less) and may therefore be classified as short tasks.
For ease of understanding, and without limiting the present disclosure, suppose the first task belongs to traffic type 1 and its execution time is on the order of milliseconds (i.e., it is a long task); the task scheduling unit 101 may then select the top-left set of processing units in FIG. 4 to execute the first task.
In addition, while the schematic layout of FIG. 4 shows each traffic type as having sets of processing units for long, medium, and short tasks, this is not required. Some traffic types may have sets of processing units for only some execution time levels, i.e., some cells in layout 400 may be blank.
Next, at block 320, the task scheduling unit 101 assigns a first task to a processing unit of the first set of processing units for execution.
FIG. 5 illustrates a schematic diagram of distributing tasks among a set of processing units according to some embodiments of the present disclosure. Take as an example that the first task is a long task belonging to traffic type 1. As shown, the processing unit set 501 includes a plurality of processing units: processing unit 510-1, processing unit 510-2, ..., processing unit 510-N, where N is a positive integer. The task scheduling unit 101 may assign the first task 520 to one of them based on the minimum task principle.
Each processing unit has a plurality of execution queues associated with task priorities. For example, processing unit 510-1 may have queues 511 and 512. Queue 511 holds tasks of a first priority (e.g., high priority), and queue 512 holds tasks of a second priority (e.g., low priority). It should be appreciated that any of processing units 510-1 through 510-N may include more or fewer queues.
According to the minimum task principle, the task scheduling unit 101 determines which processing unit receives the first task 520. For example, processing unit 510-1 may be determined to have the fewest tasks to be executed. Accordingly, the task scheduling unit 101 may assign the first task 520 to processing unit 510-1. In some embodiments, the first task 520 is added to the corresponding execution queue among the plurality of execution queues based on its priority information. In this case, if the first task 520 has the second priority, it is added to queue 512.
In addition, if the processing unit with the fewest tasks to be executed in the processing unit set 501 is not unique, i.e., two or more processing units are tied for the fewest tasks, the task scheduling unit 101 may assign the task 520 to the one whose queue corresponding to the task's priority holds fewer tasks. Alternatively, the task scheduling unit 101 may assign the task 520 to any one of the processing units with the fewest tasks to be executed.
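A minimal sketch of the minimum task principle with the tie-break described above; the class and function names are illustrative assumptions, not the disclosure's API:

```python
from collections import deque

class ProcessingUnit:
    def __init__(self, name: str, priorities=("high", "low")):
        self.name = name
        # One execution queue per priority level (cf. queues 511 and 512).
        self.queues = {p: deque() for p in priorities}

    def pending(self) -> int:
        return sum(len(q) for q in self.queues.values())

def assign(task, priority: str, units: list) -> ProcessingUnit:
    # Minimum task principle: fewest pending tasks overall; ties are
    # broken by the shorter queue at the task's own priority.
    target = min(units, key=lambda u: (u.pending(), len(u.queues[priority])))
    target.queues[priority].append(task)
    return target
```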
In some embodiments, processing unit 510-1 may acquire tasks from its queues 511 and 512 for execution in a round-robin fashion, with the number of tasks acquired each time determined based on the priority of each queue. In some embodiments, weights may be configured for queues of different priorities; for example, a higher weight may be configured for the high-priority queue, so that more tasks are acquired from it at a time.
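A sketch of this weighted round-robin consumption, operating on the ProcessingUnit class from the previous sketch; the weights (e.g., four high-priority tasks per low-priority task per round) are illustrative assumptions:

```python
DEFAULT_WEIGHTS = {"high": 4, "low": 1}  # illustrative per-queue weights

def poll_once(unit, weights=DEFAULT_WEIGHTS):
    """Take up to weights[p] tasks from each priority queue per round."""
    taken = []
    for priority, queue in unit.queues.items():
        for _ in range(weights.get(priority, 1)):
            if not queue:
                break
            taken.append(queue.popleft())
    return taken
```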
In this way, tasks are allocated to appropriate processing resources based on their traffic type and execution time, so that tasks with different execution times are processed by different processing resources. For example, long compute-intensive tasks (e.g., data compression, AI inference) and medium or short tasks related to storage business logic (e.g., protocol processing, message processing) are executed by different processing resources. Under different scenarios, tasks can thus be scheduled more reasonably while preserving task affinity, achieving better load balancing and fuller utilization of processing resources.
Fig. 6 illustrates a schematic flow diagram of another task scheduling method 600 according to some embodiments of the present disclosure. The method 600 may be performed by the task scheduling unit 101 as shown in fig. 1 or fig. 2, or by a computing device implementing the task scheduling unit 101.
At block 610, the task scheduling unit 101 selects a second set of processing units from the plurality of sets of processing units based on the traffic type and execution time of the second task.
Here, the plurality of processing unit sets may be, for example, the plurality of processing unit sets described with reference to fig. 4, wherein each processing unit set is configured to execute tasks belonging to a specific traffic type and having a specific execution time level. In an embodiment, the processing unit may be a controller, a processor, or a processor core.
In some embodiments, the storage control system 130 receives the second task from the host 110 via the front-end network card 120. The second task may have associated attribute information, such as traffic type, execution time, and priority information. For ease of understanding, and without limiting the present disclosure, assume the second task is a medium task with a medium execution time, belongs to traffic type 1, and has the first priority. The process by which the task scheduling unit 101 selects the second set of processing units is similar to that described with reference to block 310 of FIG. 3 and is not repeated here.
At block 620, the task scheduling unit 101 determines whether the selected second set of processing units is in a high load state. In some embodiments, the task scheduling unit 101 may determine the average or total number of tasks to be executed by the processing units in the second set, and consider the second set to be in a high load state if that average or total exceeds a corresponding preset threshold. Alternatively, the task scheduling unit 101 may determine the number of tasks to be executed by the least-loaded processing unit in the second set, and consider the second set to be in a high load state if that minimum exceeds a corresponding threshold.
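Both load tests just described reduce to simple threshold checks. The sketch below operates on the ProcessingUnit class from the earlier sketch; the threshold values are illustrative assumptions:

```python
def is_high_load(units, avg_threshold=64, min_threshold=32):
    """Variant 1: the average pending count across the set exceeds a
    threshold. Variant 2: even the least-loaded unit exceeds a threshold."""
    pending = [u.pending() for u in units]
    if sum(pending) / len(pending) > avg_threshold:
        return True
    return min(pending) > min_threshold
```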
If the second set of processing units is determined not to be in a high load state, the method 600 proceeds to block 630, where the task scheduling unit 101 assigns the second task to a processing unit in the second set for execution. The process at block 630, in which the task scheduling unit 101 assigns the second task to the second set of processing units for execution, is similar to that described with reference to block 320 of FIG. 3 and is not described in detail here.
It will be appreciated that if the second task were assigned to the second set of processing units while that set is in a high load state, execution of the second task could be delayed, which is undesirable. Embodiments of the present disclosure provide a shared task queue to address this problem.
Thus, if the second set of processing units is determined to be in a high load state, the method proceeds to block 640, where the task scheduling unit 101 adds the second task to a task queue shared by the second set of processing units and a third set of processing units of the plurality of sets of processing units. Here, the second set of processing units and the third set of processing units may have the same execution time level and have different traffic types.
FIG. 7 illustrates a schematic diagram of a shared task queue according to some embodiments of the present disclosure. As shown in FIG. 7, a shared task queue 703 is provided across the sets of processing units 701 and 702, which are associated with different traffic types.
As described above, assume the second task is a medium task of traffic type 1, so the second set of processing units is the set 701 configured to execute medium tasks of traffic type 1. In response to determining that the second set of processing units 701 is in a high load state (e.g., each processing unit 710 has many tasks to be executed), the second task 720 is added to the shared task queue 703. In this way, the second task 720 can be retrieved from the shared queue and executed by a processing unit in the third set of processing units 702, instead of waiting too long in a queue of the highly loaded second set.
It is noted that the task queue 703 is shared between sets of processing units associated with different traffic types but with the same execution time level. That is, long tasks are always executed by sets of processing units associated with long tasks, and medium or short tasks are always processed by sets associated with medium or short tasks. This is advantageous because assigning tasks within a set of processing units by the minimum task principle yields good scheduling efficiency when the execution times of the tasks are approximately the same or similar.
Additionally, while shared task queue 703 is shown in FIG. 7 as being shared between two sets of processing units 701 and 702, it should be appreciated that shared task queue 703 may be shared between more sets of processing units associated with the same execution time level and associated with different traffic types. That is, in the layout 400 described with reference to FIG. 4, there may be shared task queues between sets of processing units located in the same row.
In some embodiments, if there is an idle processing unit in the second set of processing units 701 or the third set of processing units 702, the task scheduling unit 101 may cause the idle processing unit to fetch a task from the shared queue 703 for execution. For example, referring to FIG. 7, the third set of processing units 702 has a lower load than the second set 701; once a processing unit in the third set 702 has processed the tasks in a given priority queue, it may select a task of the corresponding priority from the shared task queue 703 for execution. The number of tasks the processing unit obtains from the shared queue 703 may be determined based on priority weights, similar to the process described with reference to block 320 of FIG. 3. Alternatively or additionally, tasks may be retrieved from the shared task queue 703 after the processing unit has executed all of its queued tasks. In this way, tasks in the shared queue can be consumed in time by lightly loaded processing units, achieving load balancing.
In some embodiments, to cope with abnormal scenarios in which the second set of processing units 701 and the third set of processing units 702 remain in a high load or busy state, tasks in the shared task queue 703 may be periodically assigned to the second set 701 or the third set 702 for execution. This ensures that tasks in the shared task queue are eventually executed.
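Tying blocks 610 through 640 to the consumption side, the following sketch shows the shared-queue path. It reuses pool_table, assign, and is_high_load from the earlier sketches and assumes the pool_table cells hold ProcessingUnit objects; the one-shared-queue-per-time-level layout is likewise an illustrative assumption:

```python
from collections import deque

shared_queues = {}  # one task queue shared per execution time level

def dispatch(task, priority, traffic_type, time_level):
    units = pool_table[(traffic_type, time_level)]  # ProcessingUnit objects
    if is_high_load(units):
        # Block 640: park the task in the queue shared by all sets
        # of processing units at this execution time level.
        shared_queues.setdefault(time_level, deque()).append(task)
    else:
        # Block 630: normal minimum-task assignment within the set.
        assign(task, priority, units)

def pull_shared(unit, time_level):
    # A lightly loaded unit drains its own queues first, then the shared
    # queue; a periodic timer could also force-drain stragglers.
    if unit.pending() == 0:
        queue = shared_queues.get(time_level)
        if queue:
            return queue.popleft()
    return None
```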
Task scheduling methods according to various embodiments of the present disclosure have been described in detail above. In some embodiments, tasks with the same or similar execution times can be assigned to the same processing resources for execution, so that load is distributed more evenly across processing resources, improving task scheduling efficiency and system performance. In other embodiments, a global shared queue spanning sets of processing units at the same execution time level is provided. When scenario changes cause load imbalance, lightly loaded processing units can fetch and execute tasks from the global shared queue, so that tasks are distributed more evenly across processing resources, improving task scheduling efficiency and system performance.
For example, with conventional scheduling techniques, the processing resource (e.g., CPU) occupancy gap between sets of processing units may reach 50% or more in complex scenarios. With the technical solution of the present disclosure, that occupancy gap can be reduced to about 10%.
Example apparatus and apparatus
Fig. 8 illustrates a schematic block diagram of a task scheduling device 800 according to some embodiments of the present disclosure. The task scheduling device 800 may be implemented in the task scheduling unit 101 as shown in fig. 1 or fig. 2, or in a computing device implementing the task scheduling unit 101.
The task scheduling device 800 includes a selection unit 810 and an allocation unit 820. The selection unit 810 is configured to select a first set of processing units from the plurality of sets of processing units based on the traffic type and execution time of the first task. The allocation unit 820 is configured to allocate a first task to a processing unit of the first set of processing units for execution.
In some embodiments, multiple sets of processing units may be associated with different traffic types and different execution time levels. That is, one of the plurality of sets of processing units may be configured to perform tasks belonging to a particular traffic type and at a particular execution time level.
In some embodiments, the allocation unit 820 may allocate tasks within a set of processing units according to the minimum task principle. Thus, the allocation unit 820 may allocate the first task to the processing unit in the first set of processing units with the fewest tasks to be executed.
In some embodiments, a processing unit in a set of processing units may have multiple execution queues associated with priorities. The plurality of execution queues may include a first queue associated with a first priority and a second queue associated with a second priority; the first queue holds tasks of the first priority and the second queue holds tasks of the second priority. In some embodiments, the first priority is higher than the second priority, that is, tasks of the first priority have stricter real-time requirements.
In some embodiments, allocation unit 820 may be further configured to add the first task to one of the plurality of execution queues based on the priority of the first task.
In some embodiments, the allocation unit 820 may be further configured to cause the processing unit to acquire tasks from the plurality of queues for execution in a round-robin fashion, with the number of tasks acquired each time determined based on the respective priorities of the queues. In this way, more tasks can be acquired at a time from a high-priority queue, for example, so that the real-time requirements of different tasks can be satisfied.
In some embodiments, the processing unit may be one of a controller, a processor, or a processor core. In this way, processing resources may be managed at different granularities.
Fig. 9 illustrates a schematic block diagram of another task scheduling device 900 according to some embodiments of the present disclosure. The task scheduling device 900 may be implemented in the task scheduling unit 101 as shown in fig. 1 or fig. 2, or in a computing device implementing the task scheduling unit 101.
The task scheduling device 900 includes a selection unit 910, a load determination unit 920, and an equalization unit 930. The selection unit 910 is configured to select a second set of processing units from the plurality of sets of processing units based on the traffic type and the execution time of the second task. The load determination unit 920 is configured to determine that the second set of processing units is in a high load state. The balancing unit 930 is configured to add the second task to a task queue shared by the second set of processing units and a third set of processing units of the plurality of sets of processing units in response to the determination. The second set of processing units and the third set of processing units are associated with the same execution time level and are associated with different traffic types.
In some embodiments, the task scheduling device 900 may further include an allocation unit 940. The allocation unit 940 may be configured to: if it is determined that there are idle processing units in the second set of processing units or the third set of processing units, assign tasks in the shared task queue to the idle processing units for execution. In this way, tasks in the shared task queue can be executed by processing units with lower loads, ensuring that the tasks are executed in time.
In some embodiments, the allocation unit 940 may be further configured to periodically assign tasks in the shared task queue to the second set of processing units or the third set of processing units for execution. In this way, in some abnormal situations, for example when the sets of processing units remain highly loaded or busy, tasks in the shared task queue can still be executed rather than remaining suspended in the queue.
In some embodiments, the processing unit may be one of a controller, a processor, or a processor core. In this way, processing resources may be managed at different granularities.
Fig. 10 shows a schematic block diagram of an example device 1000 that may be used to implement embodiments of the present disclosure. The device 1000 may be used to implement the task scheduling unit 101 as shown in fig. 1 and 2 or a computing device running program code to implement the task scheduling unit 101. As shown, the device 1000 includes a Central Processing Unit (CPU) 1001 that can perform various suitable actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 1002 or loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data required for the operation of the device 1000 can also be stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
Various components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and communication unit 1009 such as a network card, modem, wireless communication transceiver, etc. Communication unit 1009 allows device 1000 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The various processes and methods described above, such as methods 300 and/or 600, may be performed by the processing unit 1001. For example, in some embodiments, the methods 300 and/or 600 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 1000 via ROM 1002 and/or communication unit 1009. When the computer program is loaded into RAM 1003 and executed by CPU 1001, one or more actions of methods 300 and/or 600 described above may be performed.
The present disclosure may be methods, apparatus, systems, and/or computer program products. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., light pulses through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information of the computer readable program instructions, such that the electronic circuitry can execute the computer readable program instructions.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is illustrative rather than exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (21)

1. A method for task scheduling, comprising:
selecting a first set of processing units from a plurality of sets of processing units based on a traffic type and an execution time of a first task; and
assigning the first task to a processing unit in the first set of processing units for execution.
2. The method of claim 1, wherein the plurality of sets of processing units are associated with different traffic types and different execution time levels.
3. The method of claim 1, wherein assigning the first task to a processing unit in the first set of processing units comprises:
distributing the first task to a processing unit in the first set of processing units that has the fewest tasks to be executed.
4. The method of claim 1, wherein the processing unit has a plurality of execution queues associated with priorities, and wherein assigning the first task to a processing unit in the first set of processing units comprises:
adding the first task to one of the plurality of execution queues based on a priority of the first task.
5. The method of any one of claims 1 to 4, wherein the processing unit is one of a controller, a processor, or a processor core.
6. A method for task scheduling, comprising:
selecting a second set of processing units from a plurality of sets of processing units based on a traffic type and an execution time of a second task;
determining that the second set of processing units is in a high load state; and
in response to the determination, adding the second task to a task queue shared by the second set of processing units and a third set of processing units of the plurality of sets of processing units, wherein the second set of processing units and the third set of processing units have the same execution time level, and the second set of processing units and the third set of processing units are configured to execute tasks of different traffic types.
7. The method of claim 6, further comprising:
and if the idle processing units exist in the second processing unit set and the third processing unit set, distributing the tasks in the shared task queue to the idle processing units for execution.
8. The method of claim 7, further comprising:
assigning the tasks in the shared task queue to the second set of processing units or the third set of processing units for execution.
9. The method according to any one of claims 6 to 8, wherein the processing unit is one of a controller, a processor, or a processor core.
10. A task scheduling device, comprising:
a selection unit configured to select a first set of processing units from a plurality of sets of processing units based on a traffic type and an execution time of a first task; and
an allocation unit configured to allocate the first task to a processing unit of the first set of processing units for execution.
11. The apparatus of claim 10, wherein the plurality of sets of processing units are associated with different traffic types and different execution time levels.
12. The apparatus of claim 10, wherein the allocation unit is further configured to:
and distributing the first task to the processing unit with the least tasks to be executed in the first processing unit set.
13. The apparatus of claim 10, wherein the processing unit has a plurality of execution queues associated with priorities, and the allocation unit is further configured to:
add the first task to one of the plurality of execution queues based on a priority of the first task.
14. The apparatus of any one of claims 10 to 13, wherein the processing unit is one of a controller, a processor, or a processor core.
15. A task scheduling device, comprising:
a selection unit configured to select a second set of processing units from a plurality of sets of processing units based on a traffic type and an execution time of a second task;
a state determining unit configured to determine that the second set of processing units is in a high load state; and
and an equalizing unit configured to add the second task to a task queue shared by the second processing unit set and a third processing unit set of the plurality of processing unit sets in response to the determination, wherein the second processing unit set and the third processing unit set have the same execution time level, and the second processing unit set and the third processing unit set are for executing tasks of different traffic types.
16. The apparatus of claim 15, further comprising:
an allocation unit configured to allocate a task in the shared task queue to an idle processing unit for execution if it is determined that an idle processing unit exists in the second and third sets of processing units.
17. The apparatus of claim 16, wherein the allocation unit is further configured to:
assign the tasks in the shared task queue to the second set of processing units or the third set of processing units for execution.
18. The apparatus of any one of claims 15 to 17, wherein the processing unit is one of a controller, a processor, or a processor core.
19. An electronic device, comprising:
a processing unit and a memory,
the processing unit executing instructions stored in the memory, causing the electronic device to perform the method according to any one of claims 1 to 9.
20. A computer-readable storage medium storing one or more computer instructions, wherein execution of the one or more computer instructions by a processor causes the processor to perform the method of any one of claims 1 to 9.
21. A computer program product comprising machine executable instructions which, when executed by an apparatus, cause the apparatus to perform the method according to any one of claims 1 to 9.
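
To make the first claimed method concrete, here is a minimal Python sketch of the scheduling flow of claims 1 to 5: a task is routed to the set of processing units keyed by its traffic type and an execution time level, handed to the least-loaded unit in that set, and queued by priority. This is an illustration only, not the claimed implementation; every identifier (TrafficType, Task, ProcessingUnit, Scheduler) and the time_level bucketing are hypothetical assumptions.

import heapq
from dataclasses import dataclass, field
from enum import Enum

class TrafficType(Enum):
    # Hypothetical traffic types; the claims only require that sets of
    # processing units be associated with different traffic types.
    IO = 1
    COMPUTE = 2

@dataclass(order=True)
class Task:
    priority: int                                  # lower value = higher priority
    name: str = field(compare=False)
    traffic_type: TrafficType = field(compare=False)
    exec_time_ms: float = field(compare=False)

class ProcessingUnit:
    # A controller, a processor, or a processor core (claim 5).
    def __init__(self, uid: int):
        self.uid = uid
        self.queue: list[Task] = []                # priority-ordered execution queue (claim 4)

    def pending(self) -> int:
        return len(self.queue)

    def enqueue(self, task: Task) -> None:
        heapq.heappush(self.queue, task)

class Scheduler:
    def __init__(self, sets: dict):
        # Each set of processing units is keyed by (traffic type, time level),
        # i.e. associated with a traffic type and an execution time level (claim 2).
        self.sets: dict[tuple[TrafficType, int], list[ProcessingUnit]] = sets

    @staticmethod
    def time_level(exec_time_ms: float) -> int:
        # Assumed placeholder: bucket the execution time into coarse levels.
        return 0 if exec_time_ms < 1.0 else 1

    def schedule(self, task: Task) -> None:
        # Select the first set based on traffic type and execution time (claim 1).
        first_set = self.sets[(task.traffic_type, self.time_level(task.exec_time_ms))]
        # Assign the task to the unit with the fewest pending tasks (claim 3) ...
        unit = min(first_set, key=ProcessingUnit.pending)
        # ... and insert it into that unit's execution queue by priority (claim 4).
        unit.enqueue(task)

For example, Scheduler({(TrafficType.IO, 0): [ProcessingUnit(0), ProcessingUnit(1)]}).schedule(Task(1, "read", TrafficType.IO, 0.5)) lands the task on whichever of the two units has the shorter backlog.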
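A companion sketch, reusing Task, ProcessingUnit, and Scheduler from above, illustrates the balancing method of claims 6 to 9: when the selected set is in a high load state, the task is parked in a queue shared by the sets of the same execution time level across traffic types, and any idle unit of those sets may pick it up. The HIGH_LOAD_THRESHOLD value and the all-units-busy test are assumptions, not taken from the claims.

from collections import deque

HIGH_LOAD_THRESHOLD = 8  # assumed: this many pending tasks per unit counts as high load

class BalancingScheduler(Scheduler):
    def __init__(self, sets: dict):
        super().__init__(sets)
        # One task queue per execution time level, shared by the sets of
        # that level across traffic types (claim 6).
        self.shared: dict[int, deque] = {level: deque() for _, level in sets}

    @staticmethod
    def is_high_load(units: list) -> bool:
        return all(u.pending() >= HIGH_LOAD_THRESHOLD for u in units)

    def schedule(self, task: Task) -> None:
        level = self.time_level(task.exec_time_ms)
        second_set = self.sets[(task.traffic_type, level)]
        if self.is_high_load(second_set):
            # High load: add the task to the queue shared by the sets of
            # this execution time level (claim 6).
            self.shared[level].append(task)
        else:
            super().schedule(task)

    def drain_shared(self, level: int) -> None:
        # Hand tasks from the shared queue to idle units of the sets that
        # share it, regardless of traffic type (claims 7 and 8).
        queue = self.shared[level]
        for (_, lvl), units in self.sets.items():
            for unit in units:
                if lvl == level and queue and unit.pending() == 0:
                    unit.enqueue(queue.popleft())

In a real scheduler, drain_shared would be triggered whenever a processing unit empties its execution queue; it is exposed here as an explicit call only to keep the sketch short.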
CN202210167735.8A 2022-02-23 2022-02-23 Task scheduling method, device, electronic equipment and computer readable storage medium Pending CN116680044A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210167735.8A CN116680044A (en) 2022-02-23 2022-02-23 Task scheduling method, device, electronic equipment and computer readable storage medium
PCT/CN2023/074709 WO2023160371A1 (en) 2022-02-23 2023-02-07 Task scheduling method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210167735.8A CN116680044A (en) 2022-02-23 2022-02-23 Task scheduling method, device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116680044A 2023-09-01

Family

ID=87764810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210167735.8A Pending CN116680044A (en) 2022-02-23 2022-02-23 Task scheduling method, device, electronic equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN116680044A (en)
WO (1) WO2023160371A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2316072A1 (en) * 2008-08-18 2011-05-04 Telefonaktiebolaget L M Ericsson (publ) Data sharing in chip multi-processor systems
US9594595B2 (en) * 2013-05-17 2017-03-14 Advanced Micro Devices, Inc. Efficient processor load balancing using predication flags
CN104717517B (en) * 2015-03-31 2018-04-13 北京爱奇艺科技有限公司 A kind of video code conversion method for scheduling task and device
CN109308212A (en) * 2017-07-26 2019-02-05 上海华为技术有限公司 A kind of task processing method, task processor and task processing equipment
CN113360256A (en) * 2020-03-06 2021-09-07 烽火通信科技股份有限公司 Thread scheduling method and system based on control plane massive concurrent messages
CN111506434B (en) * 2020-06-30 2020-10-13 腾讯科技(深圳)有限公司 Task processing method and device and computer readable storage medium

Also Published As

Publication number Publication date
WO2023160371A1 (en) 2023-08-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination