WO2024041401A1 - Method and apparatus for processing task, and device and storage medium - Google Patents

Method and apparatus for processing task, and device and storage medium Download PDF

Info

Publication number
WO2024041401A1
WO2024041401A1 (PCT/CN2023/112749)
Authority
WO
WIPO (PCT)
Prior art keywords
task
priority
determined
core
tasks
Prior art date
Application number
PCT/CN2023/112749
Other languages
French (fr)
Chinese (zh)
Inventor
孙东旭 (Sun Dongxu)
朱科潜 (Zhu Keqian)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2024041401A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt

Definitions

  • Embodiments of this application mainly relate to the computer field. More specifically, embodiments of the present application relate to methods, apparatus, devices and storage media for processing tasks.
  • Embodiments of the present application provide a solution for processing tasks.
  • a method of processing a task includes: determining a first task to be executed by a first logical core among physical cores in a processing resource; determining whether the first task is of a predetermined priority; if it is determined that the first task is of the predetermined priority, determining whether a second logical core in the physical core executes a second task of the predetermined priority; and if it is determined that the second logical core does not execute a second task of the predetermined priority, allocating a dedicated task including a null instruction to the second logical core.
  • this method can speed up the execution of high-priority tasks in the processor, realize the suppression of lower-priority tasks by high-priority tasks on a single physical core, and eliminate the interference of different priority tasks on the logical core, improving the efficiency of the processor.
  • the execution of high-priority tasks within a single physical core improves the processing efficiency of high-priority tasks and increases the resource utilization of the processor.
  • determining the first task includes: obtaining a task ready queue for the first logical core, where the task ready queue is an ordered queue based on the priority of the task; and obtaining the first task from the task ready queue.
  • obtaining the first task includes: selecting a ready task from the head of the task ready queue according to priority; determining whether the first logical core is executing a current task; if it is determined that the first logical core is executing the current task, comparing the priority of the ready task with the priority of the current task; and if it is determined that the priority of the ready task is higher than the priority of the current task, determining the ready task as the first task to replace execution of the current task. In this way, it can be quickly determined whether tasks in the logical core need to be replaced, which increases the probability that high-priority tasks will be processed quickly.
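The selection logic described above can be sketched as follows. This is an illustrative model only, not the claimed implementation; the task and priority names are assumptions, with a smaller number standing for a higher priority:

```python
from dataclasses import dataclass
from typing import Optional

HIGH, LOW = 0, 1  # assumed encoding: smaller value = higher priority

@dataclass
class Task:
    name: str
    priority: int

def pick_next(ready_queue: list, current: Optional[Task]) -> Task:
    """Select the task the first logical core should run next.

    The ready queue is assumed to be ordered so that the highest-priority
    ready task sits at the head; `current` is the task the core is
    currently executing, if any.
    """
    ready = ready_queue[0]                 # head of the priority-ordered queue
    if current is None:
        return ready                       # idle core: run the ready task
    if ready.priority < current.priority:
        return ready                       # higher-priority ready task preempts
    return current                         # otherwise keep executing the current task
```

Equal priorities would fall through to the existing time-slice scheduling, which the sketch leaves out.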
  • the method further includes: obtaining an allocation task assigned to the first logical core and a corresponding priority for the allocation task; and adding the allocation task to the task ready queue based on the corresponding priority.
  • the predetermined priority is a first predetermined priority
  • the method further includes: if it is determined that the first task is of a second predetermined priority, causing the first logical core to execute the first task, where the second predetermined priority is lower than the first predetermined priority. In this way, high-priority tasks can be quickly executed and processing efficiency is improved.
  • determining whether the second logical core in the physical core executes a second task of a predetermined priority includes: if it is determined that the first task is of the predetermined priority, determining whether the second logical core is executing a second task; and if it is determined that the second logical core is executing the second task, determining whether the priority of the second task is the predetermined priority. In this way, the efficiency of determining the task priority of the other logical core in the physical core can be improved.
  • the physical core is a first physical core
  • the processing resource further includes a second physical core
  • the predetermined priority is a first predetermined priority
  • the method further includes: obtaining a baseline value of a performance parameter related to the first task; and adjusting, based on the baseline value, the shared resources allocable to a third task on the second physical core, the third task having a second predetermined priority lower than the first predetermined priority.
  • obtaining the baseline value includes: suppressing the execution of the third task by the second physical core; and determining the baseline value based on the suppression of the third task. In this way, baseline performance indicators can be determined quickly and accurately.
  • suppressing the execution of the third task includes: limiting the upper limit of shared resources allocated to the third task; or suspending the execution of the third task. In this way, the baseline performance of high-priority tasks can be obtained quickly and accurately.
  • suppressing the execution of the third task includes: suppressing the execution of the third task by the second physical core multiple times at predetermined time intervals; and determining the baseline value includes: determining multiple baseline values based on the multiple suppressions of the third task.
  • adjusting the shared resources that can be allocated to the third task on the second physical core includes: determining the sensitivity of the first task to the shared resources based on the baseline value of the performance parameter; and adjusting, based on the sensitivity, the shared resources allocated to the third task. In this way, it can be accurately determined whether high-priority tasks are resource-sensitive, so that while high-priority tasks are ensured, the execution of low-priority tasks is also ensured.
  • the performance parameters include at least one of the following: cache accesses per thousand instructions, cache miss rate, and memory bandwidth, wherein determining the sensitivity includes at least one of the following: if it is determined that the cache accesses per thousand instructions are lower than a threshold cache access amount or the cache miss rate is greater than a threshold cache miss rate, determining that the first task is not cache-sensitive; if it is determined that the cache accesses per thousand instructions are higher than or equal to the threshold cache access amount and the cache miss rate is less than or equal to the threshold cache miss rate, determining that the first task is cache-sensitive; if it is determined that the memory bandwidth is less than a threshold bandwidth, determining that the first task is not sensitive to memory bandwidth; and if it is determined that the memory bandwidth is greater than or equal to the threshold bandwidth, determining that the first task is sensitive to memory bandwidth. In this way, it is possible to quickly determine which resource a high-priority task is sensitive to, thereby providing accurate information for resource configuration and improving resource allocation efficiency and accuracy.
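The threshold rules above can be condensed into a small sketch. The function name and threshold parameters are hypothetical; only the comparison logic comes from the text:

```python
def classify_sensitivity(apki: float, cmr: float, mbw: float,
                         apki_thresh: float, cmr_thresh: float,
                         mbw_thresh: float) -> dict:
    """Classify a high-priority task's sensitivity to shared resources.

    A task is cache-sensitive only if it both accesses the cache often
    enough (APKI >= threshold) and hits often enough (CMR <= threshold);
    it is bandwidth-sensitive if its memory bandwidth meets the threshold.
    """
    cache_sensitive = apki >= apki_thresh and cmr <= cmr_thresh
    bandwidth_sensitive = mbw >= mbw_thresh
    return {"cache": cache_sensitive, "memory_bandwidth": bandwidth_sensitive}
```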
  • adjusting the shared resources includes: if it is determined that the first task is not sensitive to the shared resources, increasing the upper limit of the shared resources used for the third task; and if it is determined that the first task is sensitive to the shared resources, dynamically adjusting the upper limit of the shared resources used for the third task. In this way, the configuration of shared resources can be accurately adjusted.
  • dynamically adjusting the upper limit of shared resources includes: obtaining an actual value of a performance parameter related to the first task; and dynamically adjusting the upper limit of shared resources based on the baseline value and the actual value. In this way, the configuration of shared resources can be accurately adjusted.
  • dynamically adjusting the upper limit of shared resources based on the baseline value and the actual value includes: if it is determined that the difference between the baseline value and the actual value exceeds the first threshold, reducing the shared resources allocated to the third task; and if it is determined that the difference between the baseline value and the actual value is lower than a second threshold, increasing the shared resources allocated to the third task, wherein the second threshold is lower than the first threshold.
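The two-threshold rule above gives the adjustment loop a hysteresis band so the limit does not oscillate. A minimal sketch of that rule follows; the function name, step size, and bounds are illustrative assumptions:

```python
def adjust_limit(baseline: float, actual: float, limit: float,
                 hi_thresh: float, lo_thresh: float,
                 step: float, floor: float, cap: float) -> float:
    """Adjust the shared-resource upper limit for a low-priority task
    based on how far the high-priority task's actual performance has
    fallen below its baseline (lo_thresh < hi_thresh).
    """
    degradation = baseline - actual
    if degradation > hi_thresh:
        return max(floor, limit - step)   # HP task suffering: squeeze the LP task
    if degradation < lo_thresh:
        return min(cap, limit + step)     # headroom available: give the LP task more
    return limit                          # inside the hysteresis band: hold steady
```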
  • appropriate resources can be accurately configured for high-priority tasks and low-priority tasks. While ensuring the processing efficiency of high-priority tasks, it also ensures the execution of low-priority tasks and improves resource utilization.
  • the shared resources include at least one of last-level cache LLC and memory bandwidth. In this way, it can be accurately determined which shared resources to adjust.
  • an apparatus for processing a task includes: a task determination unit configured to determine a first task to be executed by a first logical core among physical cores in the processing resource; a priority determination unit configured to determine whether the first task is of a predetermined priority; an execution determination unit configured to determine, if it is determined that the first task is of the predetermined priority, whether the second logical core in the physical core executes a second task of the predetermined priority; and an allocation unit configured to allocate a dedicated task including a null instruction to the second logical core if it is determined that the second logical core does not execute a second task of the predetermined priority.
  • an electronic device comprising: at least one computing unit; and at least one memory coupled to the at least one computing unit and storing instructions for execution by the at least one computing unit, which, when executed by the at least one computing unit, cause the device to perform the method according to the first aspect of the present application.
  • a computer-readable storage medium is also provided, on which a computer program is stored.
  • the program is executed by a processor, the method according to the first aspect of the present application is implemented.
  • a computer program product including computer-executable instructions, wherein when the computer-executable instructions are executed by a processor, the method according to the first aspect of the present application is implemented.
  • the device of the second aspect, the electronic device of the third aspect, the computer storage medium of the fourth aspect, or the computer program product of the fifth aspect provided above are used to execute the method provided by the first aspect. Therefore, the explanations or descriptions regarding the first aspect also apply to the second, third, fourth and fifth aspects.
  • for the beneficial effects that can be achieved in the second, third, fourth and fifth aspects, reference may be made to the beneficial effects of the corresponding method, which will not be described again here.
  • Figure 1 illustrates a schematic diagram of an example environment in which various embodiments of the present application can be implemented
  • Figure 2 shows a schematic flow diagram for processing tasks according to some embodiments of the present application
  • Figure 3 shows a schematic diagram of a system for controlling inter-core shared resources according to some embodiments of the present application
  • FIG. 4 illustrates a schematic flow diagram of a process for determining resource sensitivity according to some embodiments of the present application
  • Figure 5 shows a schematic flowchart of a process for dynamically allocating resources according to some embodiments of the present application
  • Figure 6 shows a schematic diagram of an implementation example of a computing device according to some embodiments of the present application.
  • Figure 7 shows a block diagram of an apparatus according to some embodiments of the present application.
  • FIG. 8 illustrates a block diagram of a computing device capable of implementing various embodiments of the present application.
  • a traditional solution is to use a container co-location solution, such as a full-scenario offline co-location solution, which designs a new offline scheduler class based on the Linux system.
  • This scheduler class has a lower priority in the scheduling queue than the default scheduler class and a higher priority than the IDLE scheduler class.
  • designing a new scheduling class increases system complexity and operation and maintenance costs, and cannot reuse features such as load balancing in the current system.
  • Another traditional solution is to classify tasks into delay-sensitive and best-effort types.
  • This solution needs to read data related to user-side processing efficiency in real time, and provide as many resources as possible for best-effort tasks on the premise of meeting the processing efficiency of delay-sensitive applications.
  • these solutions require certain prior knowledge of the task, such as application throughput and cache hit rate under different cache capacities (which depend on specific hardware performance and architecture).
  • since these solutions require prior knowledge of the task, they are limited to specific tasks and cannot be adjusted for the various tasks run by the server.
  • the computing device first determines the first task to be performed by the first logical core among the physical cores in the processing resource, and then determines whether the first task is high priority. If the computing device determines that the first task is a high-priority task, it determines whether a second logical core in the physical core executes a high-priority second task. If it is determined that the second logical core does not execute a high-priority second task, a dedicated task including a null instruction is allocated to the second logical core, thereby allowing high-priority tasks to exclusively occupy the resources within the physical core.
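The decision flow just described can be sketched as follows, assuming two-way SMT and a hypothetical priority encoding. This is an illustration of the described behavior, not the patented scheduler code itself:

```python
from typing import Optional

# assumed encoding: lower number = higher priority; the null-instruction
# task sits between high and low priority, as described later in the text
HIGH, NULL_TASK, LOW = 0, 1, 2

def on_schedule(first_task_prio: int, sibling_prio: Optional[int]) -> str:
    """Decide what to do with the sibling logical core when a task is
    scheduled on the first logical core of the same physical core.
    sibling_prio is None when the sibling core is idle.
    """
    if first_task_prio != HIGH:
        return "run_normally"            # low-priority task: no isolation needed
    if sibling_prio == HIGH:
        return "leave_sibling_alone"     # both cores run high-priority work
    # sibling is idle or running a low-priority task: pin a dedicated
    # null-instruction task there so low-priority work cannot share the core
    return "allocate_null_task_to_sibling"
```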
  • the shared resources occupied by low-priority tasks running in other physical cores within the same on-chip resource can also be adjusted to further improve the execution of high-priority tasks.
  • embodiments of the present application can speed up the execution of high-priority tasks in the processor, improve the processing efficiency of high-priority tasks, and improve the resource utilization of the processor.
  • Figure 1 shows a schematic diagram of an example environment 100 in which various embodiments of the present application can be implemented. As shown in FIG. 1 , environment 100 includes computing device 101 .
  • Computing devices 101 include, but are not limited to, personal computers, servers, handheld or laptop devices, mobile devices (such as mobile phones, personal digital assistants (PDAs), media players, etc.), multi-processor systems, consumer electronics, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, etc.
  • Computing device 101 is used to handle various tasks from users.
  • the tasks described herein are tasks processed by computing devices, which can be virtual machines, containers, or a set of processes or threads.
  • users can assign a priority to the service, also called a hard priority.
  • tasks are divided into high-priority tasks and low-priority tasks, such as the high-priority task 102 and the low-priority task 103 shown in FIG. 1 .
  • the priorities of high priority tasks 102 and low priority tasks 103 are pre-specified.
  • a task assigned a task priority has a task label indicating its priority level. For example, each task is assigned a field to store the task label.
  • computing device 101 prioritizes tasks based on their type.
  • the priority of a task is specified by the user.
  • a resource-exclusive virtual machine and a resource-sharing virtual machine are mixedly deployed.
  • a high-priority label is added to the resource-exclusive virtual machine
  • a low-priority label is added to the resource-sharing virtual machine.
  • users can classify tasks into high-priority tasks and low-priority tasks based on whether they are delay-sensitive or not.
  • Computing device 101 also has on-chip resources 107 .
  • On-chip resources 107 refer to the resources set on the chip, including at least physical cores 108 and 109, as well as the last level cache LLC and memory bandwidth MBW, that is, LLC/MBW 114.
  • on-chip resource 107 is a CPU.
  • Figure 1 shows that the on-chip resources include two physical cores 108 and 109, which is only an example and not a specific limitation of the present disclosure.
  • Those skilled in the art can set the number of physical cores included in the on-chip resource 107 as needed.
  • on-chip resources 107 may include one physical core or more than two physical cores.
  • the physical core 108 includes a logical core 110 and a logical core 111
  • the physical core 109 includes a logical core 112 and a logical core 113 .
  • the logical core in Figure 1 is obtained by performing hyper-threading operations on the physical core.
  • the physical core shown in FIG. 1 including two logical cores is only an example and is not a specific limitation of the present disclosure.
  • the number of logical cores in a physical core can be set by those skilled in the art as needed. For example, one physical core includes one or more than two logical cores, and the number of logical cores in two physical cores can be the same or different.
  • the computing device 101 configures a task ready queue for each logical core for storing tasks to be executed by each logical core.
  • when the computing device 101 receives a task, it allocates the newly received task to the task ready queue of a logical core according to the load balancing of the logical cores and/or the user's configuration of the task.
  • Computing device 101 also includes a CPU scheduling optimization module 104 for scheduling and optimizing execution of high priority tasks.
  • a logical core processes the tasks assigned to its task ready queue in order, based on the time slice assigned to each task and the processor time slice, for example using the CPU time-slice round-robin scheduling method or the completely fair scheduling method.
  • the CPU scheduling optimization module 104 is used to further adjust the ordering of the tasks, including their priorities, in the ready queue and the execution of the tasks.
  • the CPU scheduling optimization module 104 includes a single core suppression module 105 and a logical core isolation module 106 .
  • the single-core suppression module 105 is used to sort the tasks in the ready queue by priority. High-priority tasks are arranged at the front of the task ready queue, while low-priority tasks are arranged at the rear. If two tasks have the same priority, they are queued according to the existing time-slice scheduling method. Then, through the single-core suppression module 105, tasks with high priority are sent to the logical core for processing first.
  • the logical core isolation module 106 in the CPU scheduling optimization module will further determine whether other logical cores on the same physical core are executing high-priority tasks. For example, if the logical core 110 in the physical core 108 is running a high-priority task, the logical core isolation module 106 will determine whether the logical core 111 on the same physical core is running a high-priority task. Shown in FIG. 1 is an example in which the physical core includes two logical cores. If the physical core includes more than two logical cores, the logical core isolation module 106 determines whether one or more other logical cores are running high-priority tasks.
  • the logical core isolation module 106 will launch an empty instruction task on the logical core 111.
  • the priority of this empty instruction task is higher than low priority and lower than high priority. For example, if another logical core is running a low-priority task, the low-priority task is replaced with the empty instruction task. If the other logical core is not processing any task, the empty instruction task is launched directly. After an empty instruction task is launched on another logical core, low-priority tasks cannot occupy that logical core, thereby ensuring the execution of high-priority tasks. If the other logical cores are also running high-priority tasks, no adjustments are made to them.
  • the logical core only executes the low-priority task and does not perform task control operations on other logical cores.
  • if a logical core is running a high-priority task, low-priority tasks running on the other logical core will be replaced by empty instruction tasks.
  • computing device 101 optionally also includes an on-chip shared resource manager 115 .
  • the on-chip shared resource manager 115 is used to obtain the performance index of high-priority tasks executed on one physical core, and then adjust, based on the performance index, the occupation of inter-core shared resources by low-priority tasks on other physical cores.
  • the on-chip shared resource manager 115 includes a collector 116 for obtaining performance indicators on high-priority tasks.
  • the on-chip shared resource manager 115 also includes a classifier 117 for determining whether high-priority tasks are sensitive to shared resources. The resource controller 118 then adjusts the occupation of shared resources by low-priority tasks on other physical cores based on whether the high-priority tasks on one physical core are sensitive to the shared resources. For example, if the logical core 110 of the physical core 108 runs a high-priority task, in addition to controlling the logical core 111 in the physical core 108 not to execute low-priority tasks, the on-chip shared resource manager 115 also controls the shared resources allocated to low-priority tasks on the physical core 109.
  • the execution of high-priority tasks in the processor can be accelerated, the processing efficiency of high-priority tasks is improved, and the resource utilization of the processor is improved.
  • A schematic diagram of an example environment 100 in which embodiments of the present application can be implemented is described above in conjunction with FIG. 1.
  • A flowchart of a method 200 for processing tasks according to an embodiment of the present disclosure is described below with reference to FIG. 2.
  • Method 200 may be performed at the computing device 101 in FIG. 1 or any other suitable computing device.
  • a first task to be performed by a first logical core among the physical cores in the processing resource is determined. For example, the computing device 101 determines the task to be processed by the logical core 110 within the physical core 108 and assigns it to the logical core for execution.
  • the computing device 101 may obtain a task ready queue corresponding to the first logical core, which is an ordered queue based on the priority of the task. Then, the computing device 101 obtains the first task to be executed by the first logical core from the task ready queue.
  • when acquiring the first task for execution on the first logical core, the computing device 101 selects the ready task from the head of the task ready queue according to priority. Since the task ready queue is a sorted queue, the head of the queue stores the ready tasks with higher priority. The computing device 101 then also determines whether the first logical core is currently executing a task. If it is determined that the first logical core is not executing any task, the first logical core executes the ready task. If the first logical core is executing a current task, the priority of the ready task needs to be compared with the priority of the current task. If it is determined that the priority of the ready task is higher than the priority of the current task, the ready task is determined as the first task to replace execution of the current task.
  • the current task will continue to be executed without executing the low-priority ready task. If it is determined that the priority of the ready task is equal to the priority of the current task, the CPU time slice round-robin scheduling method or the completely fair scheduling method is used for scheduling.
  • the above process is generally performed after a new task is allocated to the task ready queue of the first logical core and a sorting is performed. Through the above method, it can be quickly determined whether tasks in the logical core need to be replaced, which increases the probability that high-priority tasks will be processed quickly.
  • the task to be run will be obtained from the head of the task ready queue when the logical core has processed the CPU time slice of the current task.
  • when receiving a new task, the computing device 101 obtains the assigned task allocated to the first logical core and the corresponding priority for the assigned task. The computing device then adds the assigned task to the task ready queue based on the corresponding priority. In one example, when the computing device 101 receives a task assigned to the logical core 110, it reorders the tasks in the task ready queue from high to low priority. In another example, the newly assigned task is inserted into the queue based on its priority. If the priorities of two tasks are the same, their order is determined according to the commonly used time-slice ordering method. Through the above method, the position of a newly assigned task in the priority queue can be quickly determined, and high-priority tasks can be processed in a timely manner, thereby improving the processing efficiency of high-priority tasks.
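The priority-ordered insertion described above can be sketched with the standard-library bisect module. The tuple layout and names are illustrative assumptions; a monotonically increasing sequence number stands in for time-slice ordering among equal-priority tasks:

```python
import bisect

def enqueue(ready_queue: list, priority: int, seq: int, name: str) -> None:
    """Insert a newly assigned task into a logical core's ready queue,
    keeping the queue ordered by priority (lower number = higher
    priority, an assumed encoding). Sorting on (priority, seq) places
    equal-priority tasks after existing ones, approximating the
    time-slice ordering among peers.
    """
    bisect.insort(ready_queue, (priority, seq, name))
```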
  • it is then determined whether the first task is of a predetermined priority. Because task priorities differ, resources within the same physical core are managed differently. Accordingly, the computing device 101 determines the priority of each task as it is scheduled for processing on a logical core.
  • the predetermined priority is a first predetermined priority, i.e., a high priority. If it is determined that the first task is not of the predetermined priority, that is, not of the first predetermined (high) priority, the computing device 101 determines that the first task is of the second predetermined priority, and then causes the first logical core to execute the first task, where the second predetermined priority is lower than the first predetermined priority, that is, it is a low priority. In this case, the tasks of other logical cores are not adjusted. In this way, the first task can be quickly executed and processing efficiency is improved.
  • the computing device 101 determines whether the second logical core in the physical core executes a second task of the predetermined priority. If it is determined at block 203 that the second logical core is executing a second task of a predetermined priority, it indicates that the second logical core is also processing a high-priority task. Therefore, the tasks performed by the second logical core may not be adjusted.
  • the computing device 101 determines whether the second logical core is executing the second task. If the second task is not executed, the second logical core can be directly caused to execute a task including a null instruction.
  • the null instruction task can also be called a dedicated task dedicated to scheduling control. If it is determined that the second logical core is executing the second task, it is determined whether the priority of the second task is a predetermined priority. If it is determined to be a predetermined priority, no further operation is performed. In this way, the efficiency of determining the priority level of the second task can be improved.
  • the second logical core is assigned a dedicated task including a null instruction. At this time, an empty instruction task is launched in the second logical core. Therefore, the high-priority tasks running in the first logical core occupy more core resources for execution.
  • embodiments of the present application can speed up the execution of high-priority tasks in the processor, realize the suppression of lower-priority tasks by high-priority tasks on a single physical core, and eliminate the interference between tasks of different priorities on the logical cores, thereby improving the execution of high-priority tasks within a single physical core and the processing efficiency of high-priority tasks.
  • A schematic flowchart for processing tasks according to some embodiments of the present application is described above in conjunction with FIG. 2.
  • An example process for handling inter-core shared resources among multiple physical cores to further speed up the execution of high-priority tasks is described below.
  • the processing resources may include multiple physical cores.
  • the computing device 101 first obtains the baseline value of the performance parameter related to the first task.
  • Shared resources allocable to a third task on the second physical core are then adjusted based on the baseline value, the third task having a second predetermined priority lower than the first predetermined priority.
  • This process is further described below in conjunction with Figure 3.
  • FIG. 3 depicts a schematic diagram of a system for controlling inter-core shared resources according to some embodiments of the present application.
  • the system 300 is used to control the allocation of shared resources between cores, which may be the on-chip shared resource manager 115 in Figure 1 .
  • the system 300 includes a data collector 301, a resource sensitivity classifier 302 and a resource controller 303.
  • the data collector 301 is used to collect microarchitecture performance indicators of high-priority tasks, including but not limited to basic indicators such as the number of instructions executed per unit time, instruction cycles, cache misses, cache accesses, memory bandwidth, and pipeline back-end memory constraints. Based on these indicators, complex indicators can be further calculated, such as the number of cache misses per thousand instructions (Cache Misses Per Kilo Instructions, MPKI), the cache accesses per thousand instructions, for example last level cache accesses per thousand instructions (LLC Accesses Per Kilo Instructions, APKI), the cache miss rate (CMR), and instructions per cycle (IPC). The data collector also obtains the memory bandwidth MBW allocated to high-priority tasks. For example, if a high-priority task runs on the first physical core among multiple physical cores, the data collector 301 collects the performance indicators corresponding to that high-priority task.
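As a sketch, the derived indicators named above can be computed from the raw counters as follows (the function and parameter names are illustrative, not from the application):

```python
def derived_metrics(instructions, cycles, cache_misses, cache_accesses):
    """Compute the complex indicators from raw microarchitecture counters.

    MPKI/APKI are per-thousand-instruction rates, CMR is the miss-to-access
    ratio, and IPC is instructions per cycle.
    """
    return {
        "mpki": cache_misses / instructions * 1000,    # cache misses per kilo instructions
        "apki": cache_accesses / instructions * 1000,  # cache accesses per kilo instructions
        "cmr": cache_misses / cache_accesses,          # cache miss rate
        "ipc": instructions / cycles,                  # instructions per cycle
    }
```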
  • the baseline performance of a high-priority task refers to the performance index value of executing the high-priority task after suppressing low-priority tasks on other physical cores.
  • the computing device suppresses the execution of one or more low-priority tasks by one or more other physical cores.
  • the computing device determines the baseline value based on the suppression of low-priority tasks. If low-priority tasks on other physical cores are not suppressed, the normal operating performance index value of the high-priority task will be obtained. In this way, baseline performance indicators can be determined quickly and accurately.
  • the first physical core 108 in Figure 1 runs a high-priority task; the execution of the low-priority third task on the second physical core 109 is suppressed, and then the baseline performance of the high-priority task is obtained. Additionally, when suppressing the execution of the low-priority third task on the second physical core 109, the execution of the third task by the second physical core 109 may be suppressed multiple times at predetermined time intervals. Multiple baseline values are then determined based on the multiple suppressions of the third task. The baseline value of an indicator can then be determined by averaging the multiple values of the same indicator over the multiple suppressions, or by performing other suitable processing.
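As an illustration of the averaging option mentioned above, baselines collected over multiple suppression windows could be combined as follows (the names are hypothetical, and averaging is only one of the suitable processing choices):

```python
from statistics import mean

def baseline_values(windows):
    """Combine per-window indicator samples into one baseline per indicator.

    `windows` is a list of dicts, one per suppression of the third task,
    mapping an indicator name to the value measured during that window.
    """
    indicators = windows[0].keys()
    return {name: mean(w[name] for w in windows) for name in indicators}
```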
  • Low-priority tasks on other cores are periodically suppressed as shown in block 304 in Figure 3 .
  • the suppression process includes two methods. One, as shown in block 305, is to limit the upper limit of shared resources allocated to low-priority tasks; with this kind of suppression, the trigger frequency can be set to hundreds of milliseconds. The other, as shown in block 306, is to suspend the execution of low-priority tasks. This method can accurately obtain the baseline performance of high-priority tasks without interference from low-priority tasks, but it is less friendly to low-priority tasks, so the trigger frequency can be set to the level of seconds.
  • System 300 also includes a resource sensitivity classifier 302 that determines whether a task is sensitive or insensitive to shared resources based on the obtained baseline values of the performance parameters. For example, it determines whether the high-priority task running on the first physical core 108 in FIG. 1 is sensitive to shared resources.
  • computing device 101 determines the sensitivity of the first task to the shared resources based on the baseline values of the performance parameters. As shown in block 307, whether the first task is sensitive to the shared resources is determined based on the baseline values of the performance parameters.
  • the resource controller 303 then adjusts the shared resources that can be allocated to low-priority tasks on other cores based on the sensitivity.
  • if it is determined that the first task is not sensitive to shared resources, the upper limit of shared resources for low-priority tasks on other cores is statically allocated, for example, by adding resources for the low-priority tasks on other cores. Since the first task is not sensitive to shared resources, more shared resources can be allocated to low-priority tasks; at this time, the amount of resources allocated to low-priority tasks on other cores is increased according to a predetermined strategy. If it is determined that the first task is sensitive to shared resources, at block 309, the upper limit of shared resources for low-priority tasks on other cores is dynamically allocated.
  • the shared resources include at least one of the last level cache LLC and memory bandwidth. In this way, it can be determined exactly which shared resources to adjust. The process of dynamically configuring shared resources will be described in conjunction with Figure 5.
  • the computing device obtains performance parameters for a high priority first task.
  • the performance parameters include at least one of the following: cache accesses per thousand instructions APKI, cache miss rate CMR, and memory bandwidth MBW.
  • the APKI and CMR parameters are determined. If it is determined that the cache accesses per thousand instructions APKI are lower than the threshold cache access amount A, or the cache miss rate CMR is greater than the threshold cache miss rate B, then it is determined at block 404 that the high-priority first task is not sensitive to cache resources such as the LLC. If it is determined that the cache accesses per thousand instructions APKI are higher than or equal to the threshold cache access amount A and the cache miss rate CMR is less than or equal to the threshold cache miss rate B, it is determined at block 405 that the high-priority first task is sensitive to the cache resource LLC.
  • the memory bandwidth MBW is compared with the threshold C. If it is determined that the memory bandwidth MBW is less than the threshold C, then at block 406 it is determined that the high-priority first task is not sensitive to the memory bandwidth. If it is determined that the memory bandwidth MBW is greater than or equal to the threshold C, then at block 407 it is determined that the high priority first task is sensitive to the memory bandwidth MBW.
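The two classification rules in blocks 404–407 can be sketched as one function; the thresholds A, B and C are platform-specific tuning parameters, and all names here are illustrative:

```python
def classify_sensitivity(apki, cmr, mbw, thr_a, thr_b, thr_c):
    """Decide whether a high-priority task is sensitive to the LLC and to
    memory bandwidth, following the threshold rules described above."""
    llc_sensitive = apki >= thr_a and cmr <= thr_b   # blocks 405 / 404
    mbw_sensitive = mbw >= thr_c                     # blocks 407 / 406
    return {"llc": llc_sensitive, "mbw": mbw_sensitive}
```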
  • in some embodiments, a setting of insensitivity to a type of resource takes effect only after the condition is established multiple times in succession, in order to prevent the ping-pong phenomenon, in which the classification changes back and forth because each test yields a different result, causing the resource allocation to change repeatedly.
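One way to realize the "multiple times in succession" rule is a small gate that commits a classification change only after it has been observed n consecutive times (a sketch under assumed semantics; the application does not prescribe this exact mechanism):

```python
class ConsecutiveGate:
    """Commit a sensitivity-flag change only after n consecutive observations.

    This prevents the ping-pong effect: a single divergent measurement does
    not flip the classification (and thus the resource allocation).
    """

    def __init__(self, n: int, initial: bool):
        self.n = n
        self.state = initial       # currently effective classification
        self._candidate = initial  # value observed in the current streak
        self._streak = 0

    def update(self, observed: bool) -> bool:
        if observed == self.state:
            self._streak = 0       # agreement: drop any pending change
            self._candidate = self.state
        else:
            if observed == self._candidate:
                self._streak += 1
            else:
                self._candidate = observed
                self._streak = 1
            if self._streak >= self.n:
                self.state = observed
                self._streak = 0
        return self.state
```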
  • the computing device obtains the baseline performance of the high-priority first task running on the first core and the actual value of the performance parameter of the first task when low-priority tasks are not suppressed. The upper limit of shared resources is then dynamically adjusted based on the baseline value and the actual value. In this way, the configuration of shared resources can be accurately adjusted. In some embodiments, if it is determined that the difference between the baseline value and the actual value exceeds a first threshold, the computing device reduces the shared resources allocated to the low-priority third task running on other cores.
  • if it is determined that the difference between the baseline value and the actual value is lower than a second threshold, the shared resources allocated to the third task are increased, wherein the second threshold is lower than the first threshold.
  • Figure 5 shows a schematic flowchart of a process for dynamically allocating resources according to some embodiments of the present application.
  • the initial state is set and certain shared resources are allocated to low-priority tasks, for example, the LLC is set to 2-way (way) and the memory bandwidth to 10%.
  • the computing device compares the actual value of the collected indicator for the high-priority first task with the baseline performance, for example, compares the actual value of the performance indicator APKI or MBW with the baseline value.
  • if the collected indicator is worse than the baseline performance, for example, the difference between the two exceeds the threshold E, then at block 508 the resources allocated to low-priority tasks are reduced, for example halved, or reduced by a certain percentage.
  • block 505 is optionally included. The above comparison may be performed multiple times in succession, and operation 508 is performed only when the difference exceeds the threshold E a threshold number of times among the multiple comparisons.
  • at block 503, if the actual value of the collected indicator is close to the baseline performance, for example, the difference between the two values is less than the threshold F, that is, the performance degradation is less than the threshold F, then the resources of the low-priority task are increased at block 509.
  • block 506 is optionally included.
  • the above comparison may be performed multiple times in succession, and operation 509 is performed only if the difference is less than the threshold F a threshold number of times among the multiple comparisons.
  • situations that do not satisfy the above two conditions are determined as other situations.
  • the original configuration remains unchanged.
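The decision at blocks 503/508/509 can be sketched as one adjustment step. The halving policy and the threshold comparisons follow the text, while the +1 increment, the way-count units, and the bounds are assumptions for illustration:

```python
def adjust_limit(baseline, actual, limit, thr_e, thr_f, min_limit=1, max_limit=16):
    """One iteration of the dynamic adjustment of low-priority shared resources.

    `limit` is the current upper limit (e.g. LLC ways) for low-priority tasks;
    degradation is measured as baseline minus actual performance of the
    high-priority task.
    """
    degradation = baseline - actual
    if degradation > thr_e:                 # clearly worse than baseline
        return max(min_limit, limit // 2)   # e.g. halve low-priority resources
    if degradation < thr_f:                 # close to baseline performance
        return min(max_limit, limit + 1)    # give resources back (assumed +1 step)
    return limit                            # other situations: keep configuration
```

In practice, this step would be wrapped with the consecutive-comparison guards of blocks 505 and 506 before an adjustment actually takes effect.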
  • the CPU bandwidth of the low-priority task can be limited (that is, an upper limit is placed on its CPU resources). If the performance degradation is still serious after the limit is applied, the cluster center can be requested to migrate the low-priority tasks.
  • the computing device includes a task layer 601, a software layer 605, and a hardware layer 613.
  • the task layer 601 is used to obtain tasks and configure the tasks into high-priority tasks 602 and low-priority tasks 604 .
  • the priority of the task is implemented through priority tag 603.
  • the software layer 605 includes a CPU scheduling optimization module 606, which is used to manage resources within a physical core, such as prioritizing high-priority tasks and isolating logical cores.
  • the CPU scheduling optimization module 606 includes a single core suppression module 608 and a logical core isolation module 609 to implement the above functions.
  • the software layer 605 also includes an on-chip shared resource management module 607, which is used to manage inter-core shared resources on the chip.
  • the on-chip shared resource management module 607 includes a data collection module 610, a resource sensitivity classifier 611 and a resource controller 612, for adjusting shared resources between cores.
  • the hardware layer includes on-chip resources 614 and Resource Director Technology (RDT)/Memory System Resource Partitioning and Monitoring (MPAM) to implement task execution.
  • On-chip resource 614 may be a CPU.
  • the on-chip resources 614 also include a performance monitoring unit 616.
  • the performance monitoring unit 616 and RDT/MPAM are used to provide performance parameters to the data acquisition module 610.
  • FIG. 7 further shows a block diagram of an apparatus 700 for processing a task according to an embodiment of the present application.
  • the apparatus 700 may include a plurality of modules for performing corresponding steps in the process 200 discussed in FIG. 2 .
  • the apparatus 700 includes: a task determining unit 701 configured to determine a first task to be executed by a first logical core among physical cores in the processing resource; a priority determining unit 702 configured to determine whether the first task is of a predetermined priority; an execution determining unit 703 configured to determine, if it is determined that the first task is of the predetermined priority, whether a second logical core in the physical core executes a second task of the predetermined priority; and an allocating unit 704 configured to allocate a dedicated task including a null instruction to the second logical core if it is determined that the second logical core does not execute a second task of the predetermined priority.
  • the task determining unit 701 includes: a ready queue determining unit configured to obtain a task ready queue for the first logical core, where the task ready queue is an ordered queue based on the priority of the tasks; and a first task acquisition unit configured to obtain the first task from the task ready queue.
  • the first task acquisition unit includes: a selection unit configured to select a ready task from the head of the task ready queue according to priority; and an execution determination unit configured to determine whether the first logical core is executing a current task.
  • the priority comparison unit is configured to compare the priority of the ready task with the priority of the current task if it is determined that the first logical core is executing the current task.
  • the replacement unit is configured to determine the ready task as the first task for replacing execution of the current task if it is determined that the priority of the ready task is higher than the priority of the current task.
  • the apparatus 700 further includes: a task acquisition unit configured to acquire an allocated task assigned to the first logical core and a corresponding priority for the allocated task; and an adding unit configured to add the allocated task to the task ready queue based on the corresponding priority.
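A priority-ordered task ready queue with preemption, as described for the task determining unit, might look like the following; the heap implementation and the FIFO tie-breaking among equal priorities are illustrative choices, not from the application:

```python
import heapq
from itertools import count

class ReadyQueue:
    """Priority-ordered task ready queue for one logical core.

    Lower number = higher priority; the monotonically increasing counter
    keeps FIFO order among tasks of equal priority.
    """

    def __init__(self):
        self._heap = []
        self._seq = count()

    def add(self, task, priority):
        """Add an allocated task at the position given by its priority."""
        heapq.heappush(self._heap, (priority, next(self._seq), task))

    def pick(self, current_priority=None):
        """Return the head task if the core is idle, or if the head's priority
        is strictly higher than the currently running task's; otherwise None."""
        if not self._heap:
            return None
        priority, _, task = self._heap[0]
        if current_priority is None or priority < current_priority:
            heapq.heappop(self._heap)
            return task
        return None
```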
  • the apparatus 700 further includes: a first task execution unit configured to cause the first logical core to execute the first task if it is determined that the first task is of a second predetermined priority, the second predetermined priority being lower than the first predetermined priority.
  • the execution determination unit includes: a second task execution determining unit configured to determine whether the second logical core is executing the second task if it is determined that the first task is of the predetermined priority; and a second task priority determining unit configured to determine whether the priority of the second task is the predetermined priority if it is determined that the second logical core is executing the second task.
  • the apparatus 700 further includes: a baseline value acquisition unit configured to obtain a baseline value of a performance parameter related to the first task; and a shared resource adjustment unit configured to adjust, based on the baseline value, the shared resources that can be allocated to a third task on the second physical core, the third task having a second predetermined priority lower than the first predetermined priority.
  • the baseline value acquisition unit includes: a suppression unit configured to suppress the execution of the third task by the second physical core; and a baseline value determining unit configured to determine the baseline value based on the suppression of the third task.
  • the suppressing unit includes a limiting unit configured to limit an upper limit of shared resources allocated to the third task; or a suspending unit configured to suspend execution of the third task.
  • the suppressing unit includes suppressing the execution of the third task by the second physical core multiple times at predetermined time intervals; and the baseline value determining unit includes determining multiple baseline values based on the multiple suppressions of the third task.
  • the shared resource adjustment unit includes: a sensitivity determining unit configured to determine the sensitivity of the first task to the shared resource based on a baseline value of the performance parameter.
  • the allocable resource adjustment unit is configured to adjust shared resources that can be allocated to the third task based on sensitivity.
  • the performance parameter includes at least one of the following: cache accesses per thousand instructions, cache miss rate, and memory bandwidth.
  • the sensitivity determination unit is configured to perform at least one of the following: if it is determined that the cache accesses per thousand instructions are lower than the threshold cache access amount or the cache miss rate is greater than the threshold cache miss rate, determine that the first task is not cache-sensitive; if it is determined that the cache accesses per thousand instructions are higher than or equal to the threshold cache access amount and the cache miss rate is less than or equal to the threshold cache miss rate, determine that the first task is cache-sensitive; if it is determined that the memory bandwidth is less than the threshold bandwidth, determine that the first task is not sensitive to memory bandwidth; and if it is determined that the memory bandwidth is greater than or equal to the threshold bandwidth, determine that the first task is sensitive to memory bandwidth.
  • the allocable resource adjustment unit includes: a first increasing unit configured to increase the upper limit of shared resources for the third task if it is determined that the first task is not sensitive to the shared resources; and a dynamic adjustment unit configured to dynamically adjust the upper limit of shared resources for the third task if it is determined that the first task is sensitive to the shared resources.
  • the dynamic adjustment unit includes: an actual value acquisition unit configured to acquire an actual value of the performance parameter related to the first task; and an upper limit adjustment unit configured to dynamically adjust the upper limit of shared resources based on the baseline value and the actual value.
  • the upper limit adjustment unit includes: a reduction unit configured to reduce the shared resources allocated to the third task if it is determined that the difference between the baseline value and the actual value exceeds a first threshold; and a second increasing unit configured to increase the shared resources allocated to the third task if it is determined that the difference between the baseline value and the actual value is lower than a second threshold, wherein the second threshold is lower than the first threshold.
  • the shared resource includes at least one of last level cache LLC and memory bandwidth.
  • FIG. 8 illustrates a schematic block diagram of an example device 800 that may be used to implement embodiments of the present disclosure.
  • computing devices 101 and 601 may be implemented by example device 800.
  • the device 800 includes a central processing unit (CPU) 801 that can perform various appropriate actions and processes in accordance with computer program instructions stored in a read-only memory (ROM) 802 or computer program instructions loaded from a storage unit 808 into a random access memory (RAM) 803.
  • in the RAM 803, various programs and data required for the operation of the device 800 can also be stored.
  • CPU 801, ROM 802 and RAM 803 are connected to each other via bus 804.
  • An input/output (I/O) interface 805 is also connected to bus 804.
  • the I/O interface 805 includes: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, etc.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • processes 200, 400, and 500 may be performed by processing unit 801.
  • processes 200, 400, and 500 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 808.
  • part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809.
  • when a computer program is loaded into RAM 803 and executed by CPU 801, one or more actions of processes 200, 400, and 500 described above may be performed.
  • the application may be a method, device, system, chip and/or computer program product.
  • the chip may include a processing unit and a communication interface, and the processing unit may process program instructions received from the communication interface.
  • a computer program product may include a computer-readable storage medium having thereon computer-readable program instructions for performing various aspects of the present application.
  • Computer-readable storage media may be tangible devices that can retain and store instructions for use by an instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above.
  • a non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanical encoding devices such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through electrical wires.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to various computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
  • Computer program instructions for performing the operations of this application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or instructions in one or more programming languages.
  • the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • in some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can execute the computer-readable program instructions to implement various aspects of the present application.
  • These computer-readable program instructions may be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium. These instructions cause a computer, programmable data processing apparatus, and/or other equipment to work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other equipment, causing a series of operating steps to be performed on the computer, other programmable data processing apparatus, or other equipment to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus, or other equipment implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions that comprises one or more executable instructions for implementing the specified logical function(s).
  • In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.


Abstract

The present application relates to the technical field of task management. Provided are a method and apparatus for processing a task, and a device, a storage medium and a program product. The method comprises: determining a first task to be executed by a first logical core in a physical core in a processing resource, and determining whether the first task has a predetermined priority; if it is determined that the first task has the predetermined priority, determining whether a second logical core in the physical core executes a second task of the predetermined priority; and if it is determined that the second logical core does not execute the second task of the predetermined priority, assigning, to the second logical core, a dedicated task which comprises a no-operation instruction. The embodiments of the present application can accelerate the execution of a high-priority task in a processor, thereby improving the processing efficiency of the high-priority task and increasing the resource utilization rate of the processor.

Description

用于处理任务的方法、装置、设备和存储介质Methods, devices, equipment and storage media for processing tasks
相关申请的交叉引用Cross-references to related applications
本申请要求申请号为202211021692.9,题为“用于处理任务的方法、装置、设备和存储介质”、申请日为2022年8月24日的中国发明专利申请的优先权,通过引用的方式将该申请整体并入本文。This application claims priority to the Chinese invention patent application with application number 202211021692.9, entitled "Methods, devices, equipment and storage media for processing tasks" and the filing date is August 24, 2022, which is incorporated by reference. The application is incorporated herein in its entirety.
技术领域Technical field
本申请的实施例主要涉及计算机领域。更具体地,本申请的实施例涉及用于处理任务的方法、装置、设备和存储介质。Embodiments of this application mainly relate to the computer field. More specifically, embodiments of the present application relate to methods, apparatus, devices and storage media for processing tasks.
背景技术Background technique
随着计算机技术和通信技术的快速进步,人们越来越多的依靠网络和计算机处理各种任务。因此,数据量有了爆发性的增长。为了对这些数据进行管理,出现了越来越多的数据中心。这些数据中心使用配置的服务器,结合网络基础设施来传递、加速、展示、计算、存储用户或客户的各种数据。With the rapid advancement of computer technology and communication technology, people increasingly rely on networks and computers to handle various tasks. Therefore, the amount of data has grown explosively. In order to manage this data, more and more data centers have appeared. These data centers use configured servers combined with network infrastructure to transmit, accelerate, display, calculate, and store various data of users or customers.
数据中心的发展经历了多个阶段,从开始的实现数据存储阶段发展到数据处理阶段。现在随着云技术的发展,又进入了云数据中心发展阶段。随着数据中心的快速发展,数据中心中布置的服务器的数量越来越多。然而,在使用这些服务器服务于客户的过程中,还存在许多需要解决的问题。The development of data centers has gone through multiple stages, from the initial stage of data storage to the stage of data processing. Now with the development of cloud technology, it has entered the development stage of cloud data center. With the rapid development of data centers, the number of servers deployed in data centers is increasing. However, there are still many problems that need to be solved in the process of using these servers to serve customers.
发明内容Contents of the invention
本申请的实施例提供了一种用于处理任务的方案。Embodiments of the present application provide a solution for processing tasks.
根据本申请的第一方面,提供了一种处理任务的方法。该方法包括:确定要由处理资源中的物理核中的第一逻辑核执行的第一任务;确定第一任务是否是预定优先级;如果确定第一任务是预定优先级,确定物理核中的第二逻辑核是否执行预定优先级的第二任务;以及如果确定第二逻辑核未执行预定优先级的第二任务,向第二逻辑核分配包括空指令的专用任务。According to a first aspect of the application, a method of processing a task is provided. The method includes: determining a first task to be executed by a first logical core among physical cores in a processing resource; determining whether the first task is a predetermined priority; if it is determined that the first task is a predetermined priority, determining whether the second logical core executes the second task of the predetermined priority; and if it is determined that the second logical core does not execute the second task of the predetermined priority, allocate a dedicated task including a null instruction to the second logical core.
通过该方式,能够加快高优先级任务在处理器内的执行,实现高优先级任务对较低优先级任务在单物理核上的压制以及消除不同优先级任务在逻辑核上的干扰,提高了高优先级任务在单个物理核内的执行,改进了高优先级任务的处理效率,增加了处理器的资源利用率。Through this method, it can speed up the execution of high-priority tasks in the processor, realize the suppression of lower-priority tasks by high-priority tasks on a single physical core, and eliminate the interference of different priority tasks on the logical core, improving the efficiency of the processor. The execution of high-priority tasks within a single physical core improves the processing efficiency of high-priority tasks and increases the resource utilization of the processor.
In some embodiments, determining the first task includes: obtaining a task ready queue for the first logical core, the task ready queue being an ordered queue based on task priority; and obtaining the first task from the task ready queue. In this way, the tasks to be processed by the logical core can be obtained quickly and accurately, which reduces the time to obtain high-priority tasks and improves processing efficiency.
In some embodiments, obtaining the first task includes: selecting a ready task from the head of the task ready queue according to priority; determining whether the first logical core is executing a current task; if it is determined that the first logical core is executing a current task, comparing the priority of the ready task with the priority of the current task; and if it is determined that the priority of the ready task is higher than the priority of the current task, determining the ready task as the first task to replace execution of the current task. In this way, whether a task on the logical core needs to be replaced can be determined quickly, which increases the probability that high-priority tasks are processed promptly.
In some embodiments, the method further includes: obtaining an allocated task assigned to the first logical core and a corresponding priority for the allocated task; and adding the allocated task to the task ready queue based on the corresponding priority. In this way, the position of a newly allocated task in the priority queue can be determined quickly, high-priority tasks can be processed in a timely manner, and the processing efficiency of high-priority tasks is improved.
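As an illustrative, non-limiting sketch, the priority-ordered task ready queue described above may be modeled as follows. The class and method names are assumptions for illustration and do not appear in the application; a smaller number denotes a higher priority.

```python
import bisect

class ReadyQueue:
    """Per-logical-core task ready queue, ordered by priority.

    Illustrative model only: a smaller number denotes a higher
    priority, and tasks of equal priority keep their arrival order.
    """

    def __init__(self):
        self._keys = []    # priority keys, ascending (head = highest priority)
        self._tasks = []   # (priority, task) pairs, parallel to _keys

    def add(self, task, priority):
        # bisect_right keeps equal-priority tasks in FIFO order,
        # matching the time-slice scheduling used for ties
        idx = bisect.bisect_right(self._keys, priority)
        self._keys.insert(idx, priority)
        self._tasks.insert(idx, (priority, task))

    def peek(self):
        # The head of the queue is the highest-priority ready task
        return self._tasks[0] if self._tasks else None
```

For example, after adding a low-priority task and then a high-priority task, `peek()` returns the high-priority task, so it is selected first when the logical core next picks a task.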
In some embodiments, the predetermined priority is a first predetermined priority, and the method further includes: if it is determined that the first task has a second predetermined priority, causing the first logical core to execute the first task, the second predetermined priority being lower than the first predetermined priority. In this way, high-priority tasks can be executed quickly and processing efficiency is improved.
In some embodiments, determining whether the second logical core of the physical core is executing a second task of the predetermined priority includes: if it is determined that the first task has the predetermined priority, determining whether the second logical core is executing a second task; and if it is determined that the second logical core is executing a second task, determining whether the priority of the second task is the predetermined priority. In this way, the efficiency of determining the task priority of the other logical core in the physical core can be improved.
In some embodiments, the physical core is a first physical core, the processing resource further includes a second physical core, and the predetermined priority is a first predetermined priority. The method further includes: obtaining a baseline value of a performance parameter related to the first task; and adjusting, based on the baseline value, the shared resources allocable to a third task on the second physical core, the third task having a second predetermined priority lower than the first predetermined priority. In this way, more inter-core shared resources can be allocated to high-priority tasks first, which speeds up the execution of high-priority tasks and improves their processing efficiency.
In some embodiments, obtaining the baseline value includes: suppressing execution of the third task by the second physical core; and determining the baseline value based on the suppression of the third task. In this way, the baseline performance indicator can be determined quickly and accurately.
In some embodiments, suppressing execution of the third task includes: limiting the upper bound of the shared resources allocated to the third task; or pausing execution of the third task. In this way, the baseline performance of a high-priority task can be obtained quickly and accurately.
In some embodiments, suppressing execution of the third task includes: suppressing execution of the third task by the second physical core multiple times at predetermined time intervals; and determining the baseline value includes: determining multiple baseline values based on the multiple suppressions of the third task. Performing these operations multiple times avoids task oscillation, or ping-pong effects.
In some embodiments, adjusting the shared resources allocable to the third task on the second physical core includes: determining the sensitivity of the first task to the shared resources based on the baseline value of the performance parameter; and adjusting, based on the sensitivity, the shared resources allocable to the third task. In this way, whether the high-priority task is resource-sensitive can be determined accurately, so that the execution of low-priority tasks is also ensured while the high-priority task is guaranteed.
In some embodiments, the performance parameter includes at least one of: cache accesses per thousand instructions, cache miss rate, and memory bandwidth, and determining the sensitivity includes at least one of the following: if it is determined that the cache accesses per thousand instructions are below a threshold cache access amount or the cache miss rate is greater than a threshold cache miss rate, determining that the first task is not cache-sensitive; if it is determined that the cache accesses per thousand instructions are greater than or equal to the threshold cache access amount and the cache miss rate is less than or equal to the threshold cache miss rate, determining that the first task is cache-sensitive; if it is determined that the memory bandwidth is less than a threshold bandwidth, determining that the first task is not memory-bandwidth-sensitive; and if it is determined that the memory bandwidth is greater than or equal to the threshold bandwidth, determining that the first task is memory-bandwidth-sensitive. In this way, which resources a high-priority task is sensitive to can be determined quickly, providing accurate information for resource configuration and improving the efficiency and accuracy of resource allocation.
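The threshold rules above may be sketched as follows. This is an illustrative model only; the function name, parameter names, and the threshold values used in the test are assumptions rather than values taken from the application.

```python
def classify_sensitivity(capki, miss_rate, mem_bw,
                         capki_threshold, miss_threshold, bw_threshold):
    """Classify a high-priority task's sensitivity to shared resources.

    Illustrative sketch of the threshold rules described above; the
    names and thresholds are assumptions. `capki` is the task's cache
    accesses per thousand instructions.
    """
    # Cache-sensitive only when the task accesses the cache often
    # enough AND most of those accesses actually hit
    cache_sensitive = capki >= capki_threshold and miss_rate <= miss_threshold
    # Bandwidth-sensitive when the measured memory bandwidth reaches
    # the threshold bandwidth
    bandwidth_sensitive = mem_bw >= bw_threshold
    return {"cache": cache_sensitive, "memory_bandwidth": bandwidth_sensitive}
```

A task that touches the cache rarely, or mostly misses, gains little from extra cache capacity, which is why both conditions must hold for cache sensitivity.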
In some embodiments, adjusting the shared resources includes: if it is determined that the first task is not sensitive to the shared resources, increasing the upper bound of the shared resources for the third task; and if it is determined that the first task is sensitive to the shared resources, dynamically adjusting the upper bound of the shared resources for the third task. In this way, the configuration of the shared resources can be adjusted accurately.
In some embodiments, dynamically adjusting the upper bound of the shared resources includes: obtaining an actual value of the performance parameter related to the first task; and dynamically adjusting the upper bound of the shared resources based on the baseline value and the actual value. In this way, the configuration of the shared resources can be adjusted accurately.
In some embodiments, dynamically adjusting the upper bound of the shared resources based on the baseline value and the actual value includes: if it is determined that the difference between the baseline value and the actual value exceeds a first threshold, reducing the shared resources allocated to the third task; and if it is determined that the difference between the baseline value and the actual value is below a second threshold, increasing the shared resources allocated to the third task, the second threshold being lower than the first threshold. In this way, appropriate resources can be configured accurately for both high-priority and low-priority tasks; the processing efficiency of high-priority tasks is guaranteed while the execution of low-priority tasks is also ensured, improving resource utilization.
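A minimal sketch of this two-threshold adjustment follows, assuming a performance parameter where a higher value is better (so that baseline minus actual measures degradation); all names and the fixed adjustment step are illustrative assumptions. Because the second threshold is lower than the first, the two thresholds form a hysteresis band that prevents the limit from oscillating.

```python
def adjust_limit(current_limit, baseline, actual,
                 first_threshold, second_threshold, step=1):
    """Adjust the shared-resource upper limit of a low-priority task.

    Illustrative sketch only; the names and the fixed step size are
    assumptions. Assumes second_threshold < first_threshold, so the
    two thresholds form a hysteresis band that avoids oscillation.
    """
    degradation = baseline - actual
    if degradation > first_threshold:
        # High-priority task suffers: squeeze the low-priority task
        return max(0, current_limit - step)
    if degradation < second_threshold:
        # High-priority task is close to baseline: give resources back
        return current_limit + step
    return current_limit  # within the band: leave the limit unchanged
```

With the band between the two thresholds, small fluctuations around the baseline leave the allocation untouched instead of triggering a reduce/increase ping-pong.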
In some embodiments, the shared resources include at least one of a last level cache (LLC) and memory bandwidth. In this way, which shared resources are to be adjusted can be determined accurately.
According to a second aspect of the present application, an apparatus for processing a task is provided. The apparatus includes: a task determination unit configured to determine a first task to be executed by a first logical core of a physical core in a processing resource; a priority determination unit configured to determine whether the first task has a predetermined priority; an execution determination unit configured to, if it is determined that the first task has the predetermined priority, determine whether a second logical core of the physical core is executing a second task of the predetermined priority; and an allocation unit configured to, if it is determined that the second logical core is not executing a second task of the predetermined priority, allocate a dedicated task comprising null instructions to the second logical core.
According to a third aspect of the present application, an electronic device is provided, including: at least one computing unit; and at least one memory coupled to the at least one computing unit and storing instructions for execution by the at least one computing unit, the instructions, when executed by the at least one computing unit, causing the device to perform the method according to the first aspect of the present application.
According to a fourth aspect of the present application, a computer-readable storage medium is provided, on which a computer program is stored, the program, when executed by a processor, implementing the method according to the first aspect of the present application.
According to a fifth aspect of the present application, a computer program product is provided, including computer-executable instructions, the computer-executable instructions, when executed by a processor, implementing the method according to the first aspect of the present application.
It can be understood that the apparatus of the second aspect, the electronic device of the third aspect, the computer storage medium of the fourth aspect, and the computer program product of the fifth aspect provided above are all used to perform the method provided by the first aspect. Therefore, the explanations regarding the first aspect also apply to the second, third, fourth, and fifth aspects. In addition, for the beneficial effects achievable by the second, third, fourth, and fifth aspects, reference may be made to the beneficial effects of the corresponding method, which are not repeated here.
Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present application will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, in which:
Figure 1 shows a schematic diagram of an example environment in which various embodiments of the present application can be implemented;

Figure 2 shows a schematic flowchart for processing tasks according to some embodiments of the present application;

Figure 3 shows a schematic diagram of a system for controlling inter-core shared resources according to some embodiments of the present application;

Figure 4 shows a schematic flowchart of a process for determining resource sensitivity according to some embodiments of the present application;

Figure 5 shows a schematic flowchart of a process for dynamically allocating resources according to some embodiments of the present application;

Figure 6 shows a schematic diagram of an implementation example of a computing device according to some embodiments of the present application;

Figure 7 shows a block diagram of an apparatus according to some embodiments of the present application; and

Figure 8 shows a block diagram of a computing device capable of implementing various embodiments of the present application.
Detailed Description of Embodiments
Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present application are shown in the drawings, it should be understood that the present application may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present application will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present application are for illustrative purposes only and are not intended to limit the scope of protection of the present application.
In the description of the embodiments of the present application, the term "including" and similar expressions should be understood as open-ended inclusion, that is, "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", and so on may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
As mentioned above, as data centers continue to expand, the number of servers also grows rapidly. However, server resource utilization, such as central processing unit (CPU) utilization, remains low, which means a huge waste of resources. To improve server resource utilization, a common practice is to colocate tasks of different priorities (virtual machines, containers, etc.). For example, colocation can increase server CPU utilization from 15% to 30%, thereby cutting CPU costs by 50%.
However, when tasks of different priorities are colocated, the CPU among the on-chip resources is time-shared, and lower-priority tasks inevitably interfere with high-priority tasks. Existing host operating system (or hypervisor) task scheduling algorithms cannot guarantee that high-priority tasks fully preempt lower-priority tasks. In addition, when tasks of different priorities run on different logical cores of the same physical core, there is interference at resources such as the arithmetic logic units and the first-level and second-level caches, which hinders the rapid execution of high-priority tasks. Furthermore, the last level cache (LLC) and memory bandwidth (MBW) are inter-core shared resources; when high-priority and lower-priority tasks run on different physical cores, the lower-priority tasks may seize too many inter-core shared resources, impairing the processing efficiency of high-priority tasks.
To solve the above problems, one traditional solution is container colocation, for example a full-scenario online/offline colocation solution, which designs a brand-new offline scheduler class based on the Linux system; in the scheduling queue, this scheduler class has a priority lower than the default scheduler class and higher than the IDLE scheduler class. However, designing a new scheduling class increases system complexity and operation and maintenance costs, and cannot reuse features such as load balancing in the current system.
Another traditional solution is to classify tasks into latency-sensitive and best-effort types. This solution needs to read user-side performance data in real time and, on the premise of meeting the performance requirements of latency-sensitive applications, provide as many resources as possible to best-effort tasks. However, such solutions require certain prior knowledge of the tasks, for example the application's throughput and cache hit rate under different cache capacities (which depend on specific hardware performance and architecture). Because prior knowledge of the tasks is needed, these solutions are limited to specific tasks and cannot adapt to the various tasks run by a server.
To solve at least some of the above and other potential problems, in embodiments of the present application, a computing device first determines a first task to be executed by a first logical core of a physical core in a processing resource, and then determines whether the first task is high-priority. If the computing device determines that the first task is a high-priority task, it determines whether a second logical core of the physical core is executing a high-priority second task. If it determines that the second logical core is not executing a high-priority second task, it allocates a dedicated task comprising null instructions to the second logical core, so that the high-priority task exclusively occupies the resources within the physical core. Further, when a high-priority task runs on one physical core, the shared resources occupied by low-priority tasks running on other physical cores within the same on-chip resource can also be adjusted to further improve the execution of the high-priority task. In this way, embodiments of the present application can speed up the execution of high-priority tasks in the processor, improve the processing efficiency of high-priority tasks, and improve the resource utilization of the processor.
Figure 1 shows a schematic diagram of an example environment 100 in which various embodiments of the present application can be implemented. As shown in Figure 1, the environment 100 includes a computing device 101.
The computing device 101 includes, but is not limited to, a personal computer, a server, a handheld or laptop device, a mobile device (such as a mobile phone, a personal digital assistant (PDA), or a media player), a multiprocessor system, consumer electronics, a minicomputer, a mainframe computer, a distributed computing environment including any of the above systems or devices, and so on.
The computing device 101 is used to process various tasks from users. A task described herein is a task processed by the computing device; it may be a virtual machine, a container, a group of processes, or a group of threads, among others. To provide high-quality service for some important tasks, a user can assign a priority, also called a hard priority, to a service. For example, tasks are divided into high-priority tasks and low-priority tasks, such as the high-priority task 102 and the low-priority task 103 shown in Figure 1. The priorities of the high-priority task 102 and the low-priority task 103 are pre-specified. A task assigned a priority carries a task label indicating its priority level; for example, each task is allocated a field for storing the task label.
In one example, the computing device 101 determines a task's priority according to its type. In another example, the priority of a task is specified by the user. For example, in a public cloud scenario, resource-exclusive virtual machines and resource-shared virtual machines are colocated; a high-priority label is added to the resource-exclusive virtual machines, and a low-priority label is added to the resource-shared virtual machines. In a private cloud scenario, users can classify tasks into high-priority and low-priority tasks according to requirements such as latency sensitivity. The above examples are only for describing the present disclosure and are not specific limitations on the present disclosure.
The computing device 101 also has on-chip resources 107. The on-chip resources 107 are resources provided on a chip, including at least the physical cores 108 and 109, as well as the last level cache LLC and the memory bandwidth MBW, that is, LLC/MBW 114. For example, the on-chip resource 107 is a CPU. Figure 1 shows the on-chip resources including two physical cores 108 and 109, which is only an example and not a specific limitation of the present disclosure. Those skilled in the art can set the number of physical cores included in the on-chip resources 107 as needed; for example, the on-chip resources 107 may include one physical core or more than two physical cores.
As shown in Figure 1, the physical core 108 includes logical cores 110 and 111, and the physical core 109 includes logical cores 112 and 113. The logical cores in Figure 1 are obtained by hyper-threading the physical cores. That each physical core shown in Figure 1 includes two logical cores is only an example and not a specific limitation of the present disclosure. The number of logical cores in a physical core can be set by those skilled in the art as needed; for example, a physical core may include one or more than two logical cores, and the numbers of logical cores in two physical cores may be the same or different.
The computing device 101 configures a task ready queue for each logical core, to store the tasks to be executed by that logical core. When receiving a task, the computing device 101 allocates the newly received task to the task ready queue of one logical core according to load balancing across the logical cores and/or the user's configuration of the task.
The computing device 101 also includes a CPU scheduling optimization module 104 for scheduling and optimizing the execution of high-priority tasks. Normally, the tasks allocated to a logical core's task ready queue are processed in order based on the time slices assigned to the tasks and the processor time slices, for example using CPU time-slice round-robin scheduling or completely fair scheduling. Since the tasks received by the computing device 101 also have priorities, the CPU scheduling optimization module 104 further adjusts the ordering of the prioritized tasks within the ready queue and the execution of the tasks.
The CPU scheduling optimization module 104 includes a single-core suppression module 105 and a logical-core isolation module 106. The single-core suppression module 105 sorts the prioritized tasks within the ready queue: high-priority tasks are placed at the front of the task ready queue, while low-priority tasks are placed at the rear. If two tasks have the same priority, they are queued according to the existing time-slice scheduling method. The single-core suppression module 105 then sends high-priority tasks to the logical core for processing first.
If a logical core is processing a high-priority task, the logical-core isolation module 106 in the CPU scheduling optimization module further determines whether the other logical cores on the same physical core are executing high-priority tasks. For example, if the logical core 110 in the physical core 108 is running a high-priority task, the logical-core isolation module 106 determines whether the logical core 111 on the same physical core is running a high-priority task. Figure 1 shows an example in which a physical core includes two logical cores; if a physical core includes more than two logical cores, the logical-core isolation module 106 determines whether one or more of the other logical cores are running high-priority tasks.
If another logical core, such as the logical core 111, is not running a high-priority task, the logical-core isolation module 106 launches a null-instruction task on the logical core 111. The priority of this null-instruction task is higher than the low priority and lower than the high priority. For example, if the other logical core is running a low-priority task, the low-priority task is replaced with the null-instruction task; if the other logical core is not processing any task, the null-instruction task is launched directly. Once the null-instruction task is running on the other logical core, low-priority tasks cannot occupy that logical core, thereby guaranteeing the execution of the high-priority task. If the other logical core is also running a high-priority task, no adjustment is made to it.
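The decision made for a sibling logical core may be sketched as follows; the function name and the numeric priority encoding are assumptions for illustration (a lower number means a higher priority, with the null-instruction task sitting between the two levels).

```python
# Numeric priority encoding is an assumption for illustration:
# a lower number means a higher priority.
HIGH_PRIORITY = 0
NOP_PRIORITY = 1    # the null-instruction task sits between the two levels
LOW_PRIORITY = 2

def sibling_action(sibling_priority):
    """Choose the action for the sibling logical core when this
    logical core starts a high-priority task.

    Illustrative sketch of the isolation decision: keep the sibling's
    task only when it is also high-priority; otherwise occupy the
    sibling with the null-instruction task (sibling_priority is None
    when the sibling is idle).
    """
    if sibling_priority == HIGH_PRIORITY:
        return "keep"
    # Sibling is idle or runs a low-priority task: the nop task's
    # priority (between high and low) lets it displace low-priority work
    return "run_nop_task"
```

Because the null-instruction task outranks low-priority tasks but never outranks high-priority ones, it occupies the sibling core without ever displacing high-priority work.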
In addition, if the task to be processed by a logical core is a low-priority task, the logical core simply executes that low-priority task and performs no task-control operation on the other logical cores. Likewise, if another logical core is running a high-priority task, a low-priority task running on this logical core is replaced by the null-instruction task.
If the on-chip resources include multiple physical cores, there is also competition for the resources shared among them, for example the last level cache LLC and memory bandwidth MBW resources shared by the physical cores. Alternatively or additionally, the computing device 101 optionally also includes an on-chip shared resource manager 115. The on-chip shared resource manager 115 obtains the performance indicators of a high-priority task executed on one physical core, and then, based on those indicators, adjusts the occupation of the inter-core shared resources by the low-priority tasks on the other physical cores.
The on-chip shared resource manager 115 includes a collector 116 for obtaining performance indicators about high-priority tasks. The on-chip shared resource manager 115 also includes a classifier 117 for determining whether a high-priority task is sensitive to the shared resources. A resource controller 118 then adjusts the occupation of the shared resources by low-priority tasks on the other physical cores, based on whether the high-priority task on one physical core is sensitive to the shared resources. For example, if the logical core 110 of the physical core 108 runs a high-priority task, in addition to keeping the logical core 111 within the physical core 108 from executing low-priority tasks, the on-chip shared resource manager 115 also controls the shared resources allocated to low-priority tasks in the physical core 109.
In this way, execution of high-priority tasks within the processor is accelerated, the processing efficiency of high-priority tasks is improved, and the resource utilization of the processor is improved.
A schematic diagram of an example environment 100 in which embodiments of the present application can be implemented has been described above in conjunction with FIG. 1. A flowchart of a method 200 for processing tasks according to embodiments of the present disclosure is described below with reference to FIG. 2. Method 200 may be performed at computing device 101 in FIG. 1 or at any other suitable computing device.
At block 201, a first task to be executed by a first logical core among the physical cores of a processing resource is determined. For example, computing device 101 determines the task to be processed by logical core 110 within physical core 108 and assigns it to that logical core for execution.
In some embodiments, computing device 101 may obtain a task ready queue corresponding to the first logical core, the task ready queue being an ordered queue sorted by task priority. Computing device 101 then obtains, from this task ready queue, the first task to be executed by the first logical core. In this way, the task to be processed by the logical core can be obtained quickly and accurately, reducing the time needed to fetch high-priority tasks and improving processing efficiency.
In some embodiments, when obtaining the first task for execution on the first logical core, computing device 101 selects a ready task from the head of the task ready queue according to priority. Since the task ready queue is sorted, its head stores the highest-priority ready task. Computing device 101 then determines whether the first logical core is currently executing a task. If the first logical core is not executing any task, the first logical core executes the ready task. If the first logical core is executing a current task, the priority of the ready task is compared with that of the current task. If the ready task's priority is higher than the current task's priority, the ready task is determined as the first task and replaces the execution of the current task. If the ready task's priority is lower than the current task's priority, the current task continues to execute and the lower-priority ready task is not run. If the two priorities are equal, scheduling is performed using CPU time-slice round-robin scheduling or completely fair scheduling. This process is generally performed after a new task has been assigned to the first logical core's task ready queue and the queue has been re-sorted. In this way, whether the task on the logical core needs to be replaced can be determined quickly, increasing the probability that high-priority tasks are processed promptly. In some embodiments, if no new task has been assigned to the logical core, then when the CPU time slice of the current task expires, the task to run next is fetched from the head of the task ready queue. The above examples merely describe the present disclosure and do not specifically limit it.
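The selection and preemption logic described above can be sketched as follows. This is a simplified illustration only, not part of the claimed embodiment: the `Task` structure, `pick_next` helper, and task names are hypothetical, the queue is assumed pre-sorted by descending priority, and the equal-priority case (time-slice round-robin or completely fair scheduling in the text) is reduced to keeping the current task.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    priority: int  # higher value = higher priority

def pick_next(ready_queue, current):
    """Return the task the logical core should run next.

    ready_queue is assumed sorted by descending priority, so its head
    holds the highest-priority ready task.
    """
    if not ready_queue:
        return current
    if current is None:
        return ready_queue.pop(0)   # idle core: run the head task
    if ready_queue[0].priority > current.priority:
        return ready_queue.pop(0)   # preempt: replace the current task
    # Equal or lower priority: keep running the current task (the text
    # uses time-slice round-robin / CFS for the equal-priority case).
    return current

# Example: a higher-priority ready task preempts the current task.
queue = [Task("hi", 2), Task("lo", 0)]
running = Task("mid", 1)
nxt = pick_next(queue, running)
```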
In some embodiments, upon receiving a new task, computing device 101 obtains the assigned task allocated to the first logical core and the corresponding priority of that assigned task, and then adds the assigned task to the task ready queue based on that priority. In one example, upon receiving a task assigned to logical core 110, computing device 101 re-sorts the tasks in the task ready queue from highest to lowest priority. In another example, the newly assigned task is inserted into the queue according to its priority. If two tasks have the same priority, their order is determined by the usual time-slice ordering method. In this way, the position of a newly assigned task in the priority queue can be determined quickly, allowing high-priority tasks to be processed in a timely manner and improving their processing efficiency.
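The priority-ordered insertion described above can be sketched minimally. This is an illustration under assumptions, not the claimed implementation: the `ReadyQueue` class and its sequence-number tie-break (standing in for the time-slice ordering mentioned in the text) are hypothetical.

```python
import bisect

class ReadyQueue:
    """Priority-ordered ready queue; ties keep arrival order."""
    def __init__(self):
        self._items = []  # sorted list of ((-priority, seq), task_name)
        self._seq = 0

    def add(self, task_name, priority):
        # Negating the priority makes higher-priority tasks sort toward
        # the head; the sequence number keeps FIFO order among equals.
        bisect.insort(self._items, ((-priority, self._seq), task_name))
        self._seq += 1

    def head(self):
        return self._items[0][1] if self._items else None

q = ReadyQueue()
q.add("batch-job", 0)    # low priority
q.add("rt-service", 9)   # high priority arrives later...
```

After both insertions, `q.head()` yields `"rt-service"`: the later-arriving high-priority task sorts to the head without a full re-sort of the queue.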
At block 202, it is determined whether the first task has a predetermined priority. Because resources within the same physical core are managed differently depending on task priority, computing device 101 determines the priority of a task once the task to be processed on the logical core has been determined.
In some embodiments, the predetermined priority is a first predetermined priority, i.e., a high priority. If it is determined that the first task does not have the predetermined priority, i.e., is not of the first predetermined (high) priority, computing device 101 determines that the first task has a second predetermined priority, and then causes the first logical core to execute the first task, the second predetermined priority being lower than the first predetermined priority, i.e., a low priority. In this case, the tasks on other logical cores are not adjusted. In this way, the first task can be executed quickly, improving processing efficiency.
If it is determined at block 202 that the first task has the predetermined priority, then at block 203 computing device 101 further determines whether a second logical core in the physical core is executing a second task of the predetermined priority. If it is determined at block 203 that the second logical core is executing a second task of the predetermined priority, this indicates that the second logical core is also processing a high-priority task, so the task executed by the second logical core need not be adjusted.
In some embodiments, if it is determined that the first task has the predetermined priority, i.e., the first task is a high-priority task, computing device 101 determines whether the second logical core is executing a second task. If no second task is being executed, the second logical core can directly be made to execute a task consisting of null instructions; this null-instruction task may also be called a dedicated task used for scheduling control. If it is determined that the second logical core is executing a second task, it is determined whether the priority of the second task is the predetermined priority. If so, no further operation is performed. In this way, the efficiency of determining the second task's priority level is improved.
If it is determined that the second logical core is not executing a second task of the predetermined priority, then at block 204 a dedicated task consisting of null instructions is assigned to the second logical core. At this point, the null-instruction task is launched on the second logical core, allowing the high-priority task running on the first logical core to occupy more in-core resources during execution.
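The decision flow of blocks 201-204 can be sketched as follows. This is a schematic illustration only: the `HIGH`/`LOW` markers, the `NOP_TASK` placeholder, and the `schedule_first_task` helper are hypothetical names introduced here, with `None` standing for an idle sibling logical core.

```python
HIGH, LOW = 1, 0
NOP_TASK = "nop-dedicated-task"  # dedicated task made of null instructions

def schedule_first_task(first_task_priority, sibling_task_priority):
    """Sketch of blocks 201-204: decide what the sibling logical core runs.

    Returns the task to place on the second logical core, or None if the
    second logical core is left untouched.
    """
    if first_task_priority != HIGH:
        return None                 # block 202: low priority, no control op
    if sibling_task_priority == HIGH:
        return None                 # block 203: sibling already high priority
    return NOP_TASK                 # block 204: pin a null-instruction task
```

Used this way, the sibling core is disturbed only in the single case that matters: a high-priority first task sharing the physical core with an idle or low-priority sibling.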
Through this method, embodiments of the present application can accelerate the execution of high-priority tasks within the processor, enable high-priority tasks to suppress lower-priority tasks on a single physical core, and eliminate interference between tasks of different priorities on the logical cores, improving the execution of high-priority tasks within a single physical core and their processing efficiency.
A schematic flowchart for processing tasks according to some embodiments of the present application has been described above in conjunction with FIG. 2. An example process for managing inter-core shared resources among multiple physical cores to further accelerate high-priority task execution is described below. The processing resource may include multiple physical cores; after a high-priority task runs on a first physical core, the low-priority applications on one or more other physical cores also need to be adjusted based on the performance of the high-priority task. In this process, computing device 101 first obtains a baseline value of a performance parameter related to the first task, and then, based on the baseline value, adjusts the shared resources that can be allocated to a third task on a second physical core, the third task having a second predetermined priority lower than the first predetermined priority. In this way, more inter-core shared resources can be allocated to high-priority tasks first, accelerating their execution and improving processing efficiency. This process is further described below in conjunction with FIG. 3.
FIG. 3 depicts a schematic diagram of a system for controlling inter-core shared resources according to some embodiments of the present application. System 300 is used to control the allocation of inter-core shared resources and may be the on-chip shared resource manager 115 in FIG. 1. System 300 includes a data collector 301, a resource sensitivity classifier 302, and a resource controller 303.
Data collector 301 collects microarchitectural performance metrics of high-priority tasks, including but not limited to basic metrics such as the number of instructions executed per unit time, instruction cycles, cache misses, cache accesses, memory bandwidth, and pipeline back-end memory stalls. From these, more complex metrics can be computed, such as cache misses per kilo instructions (MPKI), cache accesses per kilo instructions, e.g., last-level cache accesses per kilo instructions (APKI), cache miss rate (CMR), and instructions per cycle (IPC). The data collector also obtains the memory bandwidth (MBW) allocated to the high-priority task. For example, if a high-priority task runs on a first physical core among multiple physical cores, data collector 301 collects the performance metrics corresponding to that high-priority task.
To further understand the performance of a high-priority task, its baseline performance needs to be obtained. The baseline performance of a high-priority task refers to its performance metric values when it executes while low-priority tasks on other physical cores are suppressed. In one example, the computing device suppresses the execution of one or more low-priority tasks on one or more other physical cores, and then determines the baseline values based on that suppression. If low-priority tasks on other physical cores are not suppressed, the normal-operation performance metric values of the high-priority task are obtained instead. In this way, the baseline performance metrics can be determined quickly and accurately.
For example, the first physical core 108 in FIG. 1 runs a high-priority task; execution of the low-priority third task on the second physical core 109 is suppressed, and the baseline performance of the high-priority workload is then obtained. Additionally, when suppressing execution of the low-priority third task on the second physical core 109, execution of the third task may be suppressed multiple times at predetermined time intervals. Multiple baseline samples are then determined from these multiple suppressions, and the baseline value for a given metric can be determined by averaging the multiple values collected for that metric across the suppressions, or by other suitable processing.
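The averaging step above can be sketched minimally. This is illustrative only: `baseline_from_samples` is a hypothetical helper name, and a plain arithmetic mean is merely one of the "suitable processing" options the text leaves open.

```python
from statistics import mean

def baseline_from_samples(samples):
    """Combine the metric values collected across repeated suppression
    windows (one value per window) into a single baseline value."""
    if not samples:
        raise ValueError("need at least one suppressed sample")
    return mean(samples)
```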
As shown in block 304 of FIG. 3, low-priority tasks on other cores are suppressed periodically. This suppression can be done in two ways. One, shown in block 305, is to cap the shared resources allocated to low-priority tasks; with this form of suppression, the trigger frequency can be set on the order of hundreds of milliseconds. The other, shown in block 306, is to pause the execution of low-priority tasks. This approach accurately captures the baseline performance of high-priority tasks free of interference from low-priority tasks, but is unfriendly to the low-priority tasks, so its trigger frequency may be set on the order of seconds.
System 300 also includes resource sensitivity classifier 302, which determines whether, and to what degree, a task is sensitive to shared resources based on the obtained baseline values of the performance parameters, for example whether the high-priority task running on the first physical core 108 in FIG. 1 is sensitive to shared resources. In one example, computing device 101 determines the first task's sensitivity to shared resources based on the baseline values of the performance parameters, as shown in block 307. Resource controller 303 then adjusts, based on this sensitivity, the shared resources that can be allocated to low-priority tasks on other cores. In this way, whether a high-priority task is resource-sensitive can be determined accurately, ensuring the processing efficiency of high-priority tasks while still guaranteeing the execution of low-priority tasks. The process of determining resource sensitivity is further described below in conjunction with FIG. 4.
If, after the determination at block 307, the first task is found not to be sensitive to shared resources, then at block 308 the shared resources for low-priority tasks on other cores are allocated statically, for example by raising the cap on shared resources for low-priority tasks on other cores. Since the first task is insensitive to shared resources, more shared resources can be allocated to low-priority tasks; the amount allocated to low-priority tasks on other cores is increased according to a predetermined policy. If the first task is determined to be sensitive to shared resources, then at block 309 the cap on shared resources for low-priority tasks on other cores is allocated dynamically. Dynamic allocation uses both the baseline performance and the normal performance of the high-priority first task to configure the shared resources. The shared resources include at least one of the last-level cache (LLC) and memory bandwidth. In this way, exactly which shared resources to adjust can be determined. The process of dynamically configuring shared resources will be described in conjunction with FIG. 5.
A schematic diagram of the system for controlling inter-core shared resources has been described above in conjunction with FIG. 3. The process of determining resource sensitivity is further described below in conjunction with FIG. 4. FIG. 4 shows a schematic flowchart of a process for determining resource sensitivity according to some embodiments of the present application.
At block 401, the computing device obtains performance parameters for the high-priority first task. The performance parameters include at least one of: cache accesses per kilo instructions (APKI), cache miss rate (CMR), and memory bandwidth (MBW).
At block 402, the APKI and CMR parameters are evaluated. If the cache accesses per kilo instructions APKI is below a threshold cache access amount A, or the cache miss rate CMR is above a threshold cache miss rate B, then at block 404 the high-priority first task is determined to be insensitive to cache resources, e.g., insensitive to the LLC. If APKI is greater than or equal to threshold A and CMR is less than or equal to threshold B, then at block 405 the high-priority first task is determined to be sensitive to the LLC cache resource.
At block 403, the memory bandwidth MBW is compared with a threshold C. If MBW is less than C, then at block 406 the high-priority first task is determined to be insensitive to memory bandwidth. If MBW is greater than or equal to C, then at block 407 the high-priority first task is determined to be sensitive to memory bandwidth. Alternatively or additionally, marking a task as insensitive to a class of resources takes effect only after the condition holds for several consecutive evaluations, which prevents ping-pong behavior, i.e., resource allocation flapping back and forth because successive evaluations produce different results. In this way, which resources a high-priority task is sensitive to can be determined quickly, providing accurate information for resource configuration and improving the efficiency and accuracy of resource allocation.
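The classification rules of blocks 402-407 can be sketched as follows. This is a simplified illustration: `classify_sensitivity` is a hypothetical helper, the thresholds A, B, C are tuning parameters whose values the text does not specify, and the consecutive-evaluation debouncing described above is omitted for brevity.

```python
def classify_sensitivity(apki, cmr, mbw, thr_a, thr_b, thr_c):
    """Blocks 402-407: classify a high-priority task's resource sensitivity.

    LLC-sensitive only if the task accesses the cache often enough
    (APKI >= A) AND most of those accesses hit (CMR <= B).
    MBW-sensitive if its memory-bandwidth use is at least C.
    Returns (llc_sensitive, mbw_sensitive).
    """
    llc_sensitive = apki >= thr_a and cmr <= thr_b
    mbw_sensitive = mbw >= thr_c
    return llc_sensitive, mbw_sensitive

# A task with many cache accesses, a low miss rate, and modest bandwidth
# use comes out LLC-sensitive but MBW-insensitive:
llc, mbw = classify_sensitivity(apki=30.0, cmr=0.05, mbw=2.0,
                                thr_a=10.0, thr_b=0.2, thr_c=5.0)
```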
A schematic flowchart of the process for determining resource sensitivity according to some embodiments of the present application has been described above in conjunction with FIG. 4. The computing device obtains the baseline performance of the high-priority first task running on the first core and the actual values of its performance parameters when no suppression is applied. Based on the baseline and actual values, it then dynamically adjusts the cap on shared resources, allowing the shared-resource configuration to be adjusted accurately. In some embodiments, if the difference between the baseline value and the actual value exceeds a first threshold, the computing device reduces the shared resources allocated to the low-priority third task running on other cores. If the difference is below a second threshold, where the second threshold is lower than the first threshold, the shared resources allocated to the third task are increased. In this way, appropriate resources can be configured accurately for both high-priority and low-priority tasks, guaranteeing the processing efficiency of high-priority tasks while also ensuring the execution of low-priority tasks and improving resource utilization. The process for dynamically allocating resources is further described below in conjunction with FIG. 5.
FIG. 5 shows a schematic flowchart of a process for dynamically allocating resources according to some embodiments of the present application. When a low-priority task starts running, an initial state is set and certain shared resources are allocated to it, for example 2 ways of the LLC and 10% of the memory bandwidth. These examples merely describe the present disclosure and do not specifically limit it.
At block 501, the computing device compares the collected actual metric values for the high-priority first task with its baseline performance, for example comparing the actual and baseline values of APKI or MBW. At block 502, if a collected metric is worse than the baseline, for example the difference between the two exceeds a threshold E, then at block 508 the resources allocated to low-priority tasks are reduced, for example halved or reduced by some proportion. Alternatively or additionally, block 505 may be included: the comparison can be repeated several times in a row, and operation 508 is performed only if the difference exceeds threshold E in a threshold number of those comparisons.
At block 503, if the collected actual metric value is close to the baseline performance, for example the difference between the two values is less than a threshold F, i.e., the performance degradation is less than F, then at block 509 the resources of the low-priority tasks are increased. Alternatively or additionally, block 506 may be included: the comparison can be repeated several times in a row, and operation 509 is performed only if the difference is less than threshold F in a threshold number of those comparisons. At block 504, cases that satisfy neither of the above conditions are classified as other cases; for those, at block 507 the original configuration is kept unchanged.
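The periodic adjustment loop of blocks 501-509 can be sketched as follows. This is an illustrative sketch under assumptions: the helper name, the halve-on-shrink and plus-one-on-grow step sizes, and the vote-window mechanism (standing in for blocks 505/506) are all hypothetical choices, and the metric is assumed to be one where larger values are better (e.g., IPC).

```python
def adjust_low_prio_resources(actual, baseline, limit, thr_e, thr_f,
                              history, votes=3):
    """One control period of blocks 501-509.

    `limit` is the current cap (e.g., LLC ways) for low-priority tasks.
    Degradation is baseline - actual. A change is applied only after
    `votes` consecutive agreeing samples; the vote window is reset once
    a change is applied. Returns (new_limit, history).
    """
    degradation = baseline - actual
    if degradation > thr_e:
        verdict = "shrink"   # block 502/508: worse than baseline
    elif degradation < thr_f:
        verdict = "grow"     # block 503/509: close to baseline
    else:
        verdict = "keep"     # block 504/507: other cases
    history.append(verdict)
    if len(history) >= votes and all(v == verdict for v in history[-votes:]):
        history.clear()      # reset the vote window after acting
        if verdict == "shrink":
            return max(1, limit // 2), history  # e.g., halve the allocation
        if verdict == "grow":
            return limit + 1, history
    return limit, history

# Three consecutive periods of severe degradation halve the cap once:
limit, hist = 8, []
for _ in range(3):
    limit, hist = adjust_low_prio_resources(actual=1.0, baseline=2.0,
                                            limit=limit, thr_e=0.5,
                                            thr_f=0.1, history=hist)
```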
In addition, during dynamic adjustment of resource allocation, if the resources of low-priority tasks have already been restricted to their initial values yet the performance degradation of the high-priority task still exceeds the threshold (an unacceptable degree of degradation) over several consecutive evaluations, the CPU bandwidth of the low-priority tasks can be limited (i.e., an upper bound placed on their CPU resources). If performance degradation remains severe after this limit, migration of the low-priority tasks can be requested from the cluster center.
In this way, appropriate resources can be configured accurately for both high-priority and low-priority tasks, guaranteeing the processing efficiency of high-priority tasks while also ensuring the execution of low-priority tasks and improving resource utilization.
A schematic diagram of an implementation example of the computing device of the present invention is further described below with reference to FIG. 6. Computing device 601 includes a task layer 601, a software layer 605 and a hardware layer 613. The task layer 601 is used to obtain tasks and configure them as high-priority tasks 602 and low-priority tasks 604. Task priority is implemented through priority tags 603.
High-priority or low-priority tasks are executed on the hardware layer 613 via the software layer 605. The software layer 605 includes a CPU scheduling optimization module 606, which manages resources within a physical core, for example prioritizing high-priority tasks and isolating logical cores. The CPU scheduling optimization module 606 includes a single-core suppression module 608 and a logical-core isolation module 609 to implement these functions. The software layer 605 also includes an on-chip shared resource management module 607 for managing inter-core shared resources on the chip. The on-chip shared resource management module 607 includes a data collection module 610, a resource sensitivity classifier 611 and a resource controller 612 for regulating shared resources between cores.
The hardware layer includes on-chip resources 614 and Resource Director Technology (RDT)/Memory System Resource Partitioning and Monitoring (MPAM) for implementing task execution. The on-chip resources 614 may be a CPU. In addition, the on-chip resources 614 include a performance monitoring unit 616. The performance monitoring unit 616 and RDT/MPAM provide performance parameters to the data collection module 610.
FIG. 7 further shows a block diagram of an apparatus 700 for processing tasks according to an embodiment of the present application. The apparatus 700 may include multiple modules for performing the corresponding steps of the process 200 discussed in FIG. 2. As shown in FIG. 7, the apparatus 700 includes: a task determining unit 701 configured to determine a first task to be executed by a first logical core among the physical cores of a processing resource; a priority determining unit 702 configured to determine whether the first task has a predetermined priority; an execution determining unit 703 configured to, if the first task is determined to have the predetermined priority, determine whether a second logical core in the physical core is executing a second task of the predetermined priority; and an allocating unit 704 configured to, if the second logical core is determined not to be executing a second task of the predetermined priority, allocate a dedicated task consisting of null instructions to the second logical core.
In some embodiments, the task determining unit 701 includes: a ready queue determining unit configured to obtain a task ready queue for the first logical core, the task ready queue being an ordered queue based on task priority; and a first task obtaining unit configured to obtain the first task from the task ready queue.
In some embodiments, the first task obtaining unit includes: a selecting unit configured to select a ready task from the head of the task ready queue according to priority; an execution determining unit configured to determine whether the first logical core is executing a current task; a priority comparing unit configured to, if the first logical core is determined to be executing a current task, compare the priority of the ready task with the priority of the current task; and a replacing unit configured to, if the priority of the ready task is determined to be higher than that of the current task, determine the ready task as the first task to replace execution of the current task.
In some embodiments, the apparatus 700 further includes: a task obtaining unit configured to obtain an allocated task assigned to the first logical core and a corresponding priority for the allocated task; and an adding unit configured to add the allocated task to the task ready queue based on the corresponding priority.
In some embodiments, where the predetermined priority is a first predetermined priority, the apparatus 700 further includes: a first-task executing unit configured to, if it is determined that the first task is of a second predetermined priority, cause the first logical core to execute the first task, the second predetermined priority being lower than the first predetermined priority.
In some embodiments, the execution determining unit includes: a second-task execution determining unit configured to, if it is determined that the first task is of the predetermined priority, determine whether the second logical core is executing a second task; and a second-task priority determining unit configured to, if it is determined that the second logical core is executing the second task, determine whether the priority of the second task is the predetermined priority.
In some embodiments, where the physical core is a first physical core, the processing resource further includes a second physical core, and the predetermined priority is a first predetermined priority, the apparatus 700 further includes: a baseline-value obtaining unit configured to obtain a baseline value of a performance parameter related to the first task; and a shared-resource adjusting unit configured to adjust, based on the baseline value, shared resources allocable to a third task on the second physical core, the third task having a second predetermined priority lower than the first predetermined priority.
In some embodiments, the baseline-value obtaining unit includes: a suppressing unit configured to suppress execution of the third task by the second physical core; and a baseline-value determining unit configured to determine the baseline value based on the suppression of the third task.
In some embodiments, the suppressing unit includes a limiting unit configured to limit an upper bound of the shared resources allocated to the third task, or a suspending unit configured to suspend execution of the third task.
In some embodiments, the suppressing unit is configured to suppress execution of the third task by the second physical core multiple times at predetermined time intervals, and the baseline-value determining unit is configured to determine multiple baseline values based on the multiple suppressions of the third task.
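The repeated-suppression idea above can be sketched as a small measurement loop. All three callbacks (`measure`, `suppress`, `resume`) are hypothetical hooks into the platform (for example, performance counters and an RDT-style resource controller), not APIs defined by the application:

```python
import time

def collect_baselines(measure, suppress, resume, rounds=3, interval_s=0.01):
    """Periodically throttle (or pause) the lower-priority third task on the
    second physical core, sample the first task's performance parameter while
    interference is suppressed, then resume the third task."""
    baselines = []
    for _ in range(rounds):
        suppress()                    # cap shared resources or pause the third task
        baselines.append(measure())   # value observed with interference suppressed
        resume()                      # let the third task run again
        time.sleep(interval_s)        # predetermined interval between suppressions
    return baselines
```

Collecting several baseline values rather than one lets later adjustments compare against a value that reflects normal variation in the workload.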
In some embodiments, the shared-resource adjusting unit includes: a sensitivity determining unit configured to determine, based on the baseline value of the performance parameter, the sensitivity of the first task to shared resources; and an allocable-resource adjusting unit configured to adjust, based on the sensitivity, the shared resources allocable to the third task.
In some embodiments, the performance parameter includes at least one of: cache accesses per thousand instructions, cache miss rate, and memory bandwidth, and the sensitivity determining unit is configured to perform at least one of the following: if it is determined that the cache accesses per thousand instructions are below a threshold cache access amount, or that the cache miss rate is above a threshold cache miss rate, determine that the first task is not cache-sensitive; if it is determined that the cache accesses per thousand instructions are at or above the threshold cache access amount and that the cache miss rate is at or below the threshold cache miss rate, determine that the first task is cache-sensitive; if it is determined that the memory bandwidth is below a threshold bandwidth, determine that the first task is not sensitive to memory bandwidth; and if it is determined that the memory bandwidth is at or above the threshold bandwidth, determine that the first task is sensitive to memory bandwidth.
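The threshold rules above amount to a small classifier. The sketch below follows those rules; the numeric default thresholds are placeholders chosen for illustration, not values taken from the application:

```python
def classify_sensitivity(accesses_per_ki, miss_rate, mem_bw,
                         thr_accesses=50.0, thr_miss=0.4, thr_bw=1e9):
    """Apply the embodiment's rules: cache-sensitive only when cache accesses
    per thousand instructions are high enough AND the miss rate is low enough;
    bandwidth-sensitive when memory bandwidth meets the threshold."""
    cache_sensitive = (accesses_per_ki >= thr_accesses and miss_rate <= thr_miss)
    bw_sensitive = mem_bw >= thr_bw
    return {"cache": cache_sensitive, "memory_bandwidth": bw_sensitive}
```

A task with many cache accesses but a high miss rate is classified as not cache-sensitive: its working set does not fit the cache, so shielding the cache would not help it.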
In some embodiments, the allocable-resource adjusting unit includes: a first increasing unit configured to, if it is determined that the first task is not sensitive to the shared resources, increase the upper bound of the shared resources for the third task; and a dynamic adjusting unit configured to, if it is determined that the first task is sensitive to the shared resources, dynamically adjust the upper bound of the shared resources for the third task.
In some embodiments, the dynamic adjusting unit includes: an actual-value obtaining unit configured to obtain an actual value of the performance parameter related to the first task; and an upper-bound adjusting unit configured to dynamically adjust the upper bound of the shared resources based on the baseline value and the actual value.
In some embodiments, the upper-bound adjusting unit includes: a decreasing unit configured to, if it is determined that the difference between the baseline value and the actual value exceeds a first threshold, decrease the shared resources allocated to the third task; and a second increasing unit configured to, if it is determined that the difference between the baseline value and the actual value is below a second threshold, increase the shared resources allocated to the third task, the second threshold being lower than the first threshold.
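The two-threshold controller above can be sketched in a few lines. This assumes a larger value of the performance parameter is better (so baseline minus actual measures degradation) and uses a fixed adjustment `step`; both are assumptions of the sketch:

```python
def adjust_cap(cap, baseline, actual, thr_high, thr_low, step):
    """Two-threshold adjustment of the third task's shared-resource cap:
    shrink it when the first task has degraded too far from its baseline,
    grow it when the first task is comfortably close to baseline, and leave
    it alone in between (avoiding oscillation)."""
    diff = baseline - actual       # degradation of the first task
    if diff > thr_high:            # first task suffering: take resources back
        cap = max(0, cap - step)
    elif diff < thr_low:           # headroom available: give resources to the third task
        cap = cap + step
    return cap                     # diff in [thr_low, thr_high]: no change
```

Keeping the second threshold below the first creates a dead band, so the cap is not adjusted on every small fluctuation.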
In some embodiments, the shared resources include at least one of a last-level cache (LLC) and memory bandwidth.
FIG. 8 shows a schematic block diagram of an example device 800 that may be used to implement embodiments of the present application. For example, the computing devices 101 and 601 according to embodiments of the present application may be implemented by the example device 800. As shown, the device 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 may also store various programs and data required for the operation of the device 800. The CPU 801, the ROM 802 and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Multiple components of the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disc; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver. The communication unit 809 allows the device 800 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The processes described above, such as the processes 200, 400 and 500, may be performed by the processing unit 801. For example, in some embodiments, the processes 200, 400 and 500 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the CPU 801, one or more actions of the processes 200, 400 and 500 described above may be performed.
The present application may be a method, an apparatus, a system, a chip and/or a computer program product. The chip may include a processing unit and a communication interface, and the processing unit may process program instructions received from the communication interface. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for performing various aspects of the present application.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction-executing device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being a transitory signal per se, such as a radio wave or another freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or another transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from the computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembly instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the scenario involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), may be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions so as to implement various aspects of the present application.
Aspects of the present application are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processing unit of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or acts, or by a combination of special-purpose hardware and computer instructions.
The embodiments of the present application have been described above. The foregoing description is illustrative rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application or improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (20)

  1. A method for processing a task, wherein the method comprises:
    determining a first task to be executed by a first logical core among physical cores in a processing resource;
    determining whether the first task is of a predetermined priority;
    if it is determined that the first task is of the predetermined priority, determining whether a second logical core in the physical core is executing a second task of the predetermined priority; and
    if it is determined that the second logical core is not executing a second task of the predetermined priority, allocating to the second logical core a dedicated task comprising null instructions.
  2. The method according to claim 1, wherein determining the first task comprises:
    obtaining a task ready queue for the first logical core, the task ready queue being an ordered queue based on task priority; and
    obtaining the first task from the task ready queue.
  3. The method according to claim 2, wherein obtaining the first task comprises:
    selecting a ready task from the head of the task ready queue according to priority;
    determining whether the first logical core is executing a current task;
    if it is determined that the first logical core is executing the current task, comparing the priority of the ready task with the priority of the current task; and
    if it is determined that the priority of the ready task is higher than the priority of the current task, determining the ready task as the first task to replace execution of the current task.
  4. The method according to claim 2, wherein the method further comprises:
    obtaining an allocated task assigned to the first logical core and a corresponding priority for the allocated task; and
    adding the allocated task to the task ready queue based on the corresponding priority.
  5. The method according to claim 1, wherein the predetermined priority is a first predetermined priority, and the method further comprises:
    if it is determined that the first task is of a second predetermined priority, causing the first logical core to execute the first task, the second predetermined priority being lower than the first predetermined priority.
  6. The method according to claim 1, wherein determining whether the second logical core in the physical core is executing a second task of the predetermined priority comprises:
    if it is determined that the first task is of the predetermined priority, determining whether the second logical core is executing the second task; and
    if it is determined that the second logical core is executing the second task, determining whether the priority of the second task is the predetermined priority.
  7. The method according to claim 1, wherein the physical core is a first physical core, the processing resource further comprises a second physical core, the predetermined priority is a first predetermined priority, and the method further comprises:
    obtaining a baseline value of a performance parameter related to the first task; and
    adjusting, based on the baseline value, shared resources allocable to a third task on the second physical core, the third task having a second predetermined priority lower than the first predetermined priority.
  8. The method according to claim 7, wherein obtaining the baseline value comprises:
    suppressing execution of the third task by the second physical core; and determining the baseline value based on the suppression of the third task.
  9. The method according to claim 8, wherein suppressing execution of the third task comprises:
    limiting an upper bound of the shared resources allocated to the third task; or
    suspending execution of the third task.
  10. The method according to claim 8, wherein:
    suppressing execution of the third task comprises suppressing execution of the third task by the second physical core multiple times at predetermined time intervals; and
    determining the baseline value comprises determining multiple baseline values based on the multiple suppressions of the third task.
  11. The method according to claim 7, wherein adjusting the shared resources allocable to the third task on the second physical core comprises:
    determining, based on the baseline value of the performance parameter, the sensitivity of the first task to shared resources; and
    adjusting, based on the sensitivity, the shared resources allocable to the third task.
  12. The method according to claim 11, wherein the performance parameter comprises at least one of: cache accesses per thousand instructions, cache miss rate, and memory bandwidth, and wherein determining the sensitivity comprises at least one of the following:
    if it is determined that the cache accesses per thousand instructions are below a threshold cache access amount or that the cache miss rate is above a threshold cache miss rate, determining that the first task is not cache-sensitive;
    if it is determined that the cache accesses per thousand instructions are at or above the threshold cache access amount and that the cache miss rate is at or below the threshold cache miss rate, determining that the first task is cache-sensitive;
    if it is determined that the memory bandwidth is below a threshold bandwidth, determining that the first task is not sensitive to memory bandwidth; and
    if it is determined that the memory bandwidth is at or above the threshold bandwidth, determining that the first task is sensitive to memory bandwidth.
  13. The method according to claim 11, wherein adjusting the shared resources comprises:
    if it is determined that the first task is not sensitive to the shared resources, increasing the upper bound of the shared resources for the third task; and
    if it is determined that the first task is sensitive to the shared resources, dynamically adjusting the upper bound of the shared resources for the third task.
  14. The method according to claim 13, wherein dynamically adjusting the upper bound of the shared resources comprises:
    obtaining an actual value of the performance parameter related to the first task; and
    dynamically adjusting the upper bound of the shared resources based on the baseline value and the actual value.
  15. The method according to claim 14, wherein dynamically adjusting the upper bound of the shared resources based on the baseline value and the actual value comprises:
    if it is determined that the difference between the baseline value and the actual value exceeds a first threshold, decreasing the shared resources allocated to the third task; and
    if it is determined that the difference between the baseline value and the actual value is below a second threshold, increasing the shared resources allocated to the third task, wherein the second threshold is lower than the first threshold.
  16. The method according to claim 7, wherein the shared resources comprise at least one of a last-level cache (LLC) and memory bandwidth.
  17. An apparatus for processing a task, wherein the apparatus comprises:
    a task determining unit configured to determine a first task to be executed by a first logical core among physical cores in a processing resource;
    a priority determining unit configured to determine whether the first task is of a predetermined priority;
    an execution determining unit configured to, if it is determined that the first task is of the predetermined priority, determine whether a second logical core in the physical core is executing a second task of the predetermined priority; and
    an allocating unit configured to, if it is determined that the second logical core is not executing a second task of the predetermined priority, allocate to the second logical core a dedicated task comprising null instructions.
  18. An electronic device, comprising:
    at least one computing unit; and
    at least one memory coupled to the at least one computing unit and storing instructions for execution by the at least one computing unit, the instructions, when executed by the at least one computing unit, causing the device to perform the method according to any one of claims 1-16.
  19. A computer-readable storage medium having a computer program stored thereon, the program, when executed by a processor, implementing the method according to any one of claims 1-16.
  20. A computer program product comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the method according to any one of claims 1-16.
PCT/CN2023/112749 2022-08-24 2023-08-11 Method and apparatus for processing task, and device and storage medium WO2024041401A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211021692.9 2022-08-24
CN202211021692.9A CN117667324A (en) 2022-08-24 2022-08-24 Method, apparatus, device and storage medium for processing tasks

Publications (1)

Publication Number Publication Date
WO2024041401A1 true WO2024041401A1 (en) 2024-02-29

Family

ID=90012524

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/112749 WO2024041401A1 (en) 2022-08-24 2023-08-11 Method and apparatus for processing task, and device and storage medium

Country Status (2)

Country Link
CN (1) CN117667324A (en)
WO (1) WO2024041401A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210055958A1 (en) * 2019-08-22 2021-02-25 Intel Corporation Technology For Dynamically Grouping Threads For Energy Efficiency
CN112698920A (en) * 2021-01-08 2021-04-23 北京三快在线科技有限公司 Container task scheduling method and device, electronic equipment and computer readable medium
CN112749002A (en) * 2019-10-29 2021-05-04 北京京东尚科信息技术有限公司 Method and device for dynamically managing cluster resources
CN112783659A (en) * 2021-02-01 2021-05-11 北京百度网讯科技有限公司 Resource allocation method and device, computer equipment and storage medium
US20210334133A1 (en) * 2020-04-27 2021-10-28 International Business Machines Corporation Adjusting a dispatch ratio for multiple queues
CN114138428A (en) * 2021-10-18 2022-03-04 阿里巴巴(中国)有限公司 SLO (service level objective) guarantee method, apparatus, node and storage medium for multi-priority tasks


Also Published As

Publication number Publication date
CN117667324A (en) 2024-03-08

Similar Documents

Publication Publication Date Title
US10530846B2 (en) Scheduling packets to destination virtual machines based on identified deep flow
US6986137B1 (en) Method, system and program products for managing logical processors of a computing environment
US6587938B1 (en) Method, system and program products for managing central processing unit resources of a computing environment
US6651125B2 (en) Processing channel subsystem pending I/O work queues based on priorities
US7051188B1 (en) Dynamically redistributing shareable resources of a computing environment to manage the workload of that environment
US8510747B2 (en) Method and device for implementing load balance of data center resources
US6519660B1 (en) Method, system and program products for determining I/O configuration entropy
CA2382017C (en) Workload management in a computing environment
US7007276B1 (en) Method, system and program products for managing groups of partitions of a computing environment
AU2013206117B2 (en) Hierarchical allocation of network bandwidth for quality of service
WO2016078178A1 (en) Virtual cpu scheduling method
JP5370946B2 (en) Resource management method and computer system
US20200174844A1 (en) System and method for resource partitioning in distributed computing
WO2017010922A1 (en) Allocation of cloud computing resources
CN111247515A (en) Apparatus and method for providing a performance-based packet scheduler
US20230055813A1 (en) Performing resynchronization jobs in a distributed storage system based on a parallelism policy
US7568052B1 (en) Method, system and program products for managing I/O configurations of a computing environment
US9547576B2 (en) Multi-core processor system and control method
US20190044832A1 (en) Technologies for optimized quality of service acceleration
WO2024041401A1 (en) Method and apparatus for processing task, and device and storage medium
Ru et al. Providing fairer resource allocation for multi-tenant cloud-based systems
Komarasamy et al. Deadline constrained adaptive multilevel scheduling system in cloud environment
CN104899098B (en) A kind of vCPU dispatching method based on shared I/O virtualized environment
CN116893893B (en) Virtual machine scheduling method and device, electronic equipment and storage medium
US11729119B2 (en) Dynamic queue management of network traffic

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856491

Country of ref document: EP

Kind code of ref document: A1