CN114816033A

CN114816033A - Frequency modulation method and device of processor and computing equipment

Info

Publication number: CN114816033A
Application number: CN202210474304.6A
Authority: CN
Inventors: 胡耀国; 黄靖淞; 赵辉昌
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2019-10-17
Filing date: 2019-10-17
Publication date: 2022-07-29
Also published as: CN110941325A; WO2021073130A1; CN110941325B

Abstract

The application discloses a frequency modulation method and apparatus for a processor, and a computing device, and belongs to the technical field of computers. The frequency modulation method of the processor comprises the following steps: in a target process in which a plurality of cores of a processor process a target task in parallel, acquiring the invalid utilization rate of each core of the plurality of cores; reducing a frequency of a first core of the plurality of cores in the target process; and increasing a frequency of a second core of the plurality of cores in the target process; wherein the invalid utilization rate of the first core is higher than an invalid utilization rate threshold, and the invalid utilization rate of the second core is lower than the invalid utilization rate threshold. The method and the apparatus can improve the effective utilization rate of processor resources and can be used for adjusting the frequencies of a plurality of cores of a processor.

Description

Frequency modulation method and device of processor and computing equipment

The present application is a divisional application of the original Chinese patent application No. 201910990340.6, filed on October 17, 2019, the entire content of which is incorporated by reference in the present application.

Technical Field

The present application relates to the field of computer technologies, and in particular, to a frequency modulation method and apparatus for a processor, and a computing device.

Background

With the development of computer technology, current processors typically include multiple cores. The multiple cores are capable of parallel processing of tasks.

When a plurality of cores process a task in parallel, each of the cores processes a portion of the task. After each core finishes processing its portion, one of the cores aggregates the processing results of all the cores to obtain the processing result of the task.

Generally, when a plurality of cores process a task in parallel, the processing amounts of the cores differ greatly, so the processing durations of the cores differ as well. Until the task as a whole is completed, a core that finishes its portion sooner may sit idle, resulting in a lower effective utilization rate of processor resources.

Disclosure of Invention

The application provides a frequency modulation method and device of a processor and computing equipment, which can improve the effective utilization rate of processor resources, and the technical scheme is as follows:

In a first aspect, a frequency modulation method for a processor is provided, the method comprising: in a target process in which a plurality of cores of a processor process a target task in parallel, acquiring an invalid utilization rate of each core of the plurality of cores; reducing a frequency of a first core of the plurality of cores in the target process; and increasing a frequency of a second core of the plurality of cores in the target process; wherein the invalid utilization rate of the first core is above an invalid utilization rate threshold, and the invalid utilization rate of the second core is below the invalid utilization rate threshold. The invalid utilization rate characterizes how the resources of a core are used, and the invalid utilization rate of any core is negatively correlated with the amount of calculation performed by that core while processing the target task. The higher the invalid utilization rate of a core, the higher its current frequency is relative to its workload; the lower the invalid utilization rate of a core, the lower its current frequency is relative to its workload.

It should be noted that, in the target process in which the plurality of cores of the processor process the target task in parallel, the computing device obtains the invalid utilization rate of each core and, according to these rates, reduces the frequency of the first core and increases the frequency of the second core in the target process, where the invalid utilization rate of the first core is higher than the invalid utilization rate threshold and that of the second core is lower than the threshold. In other words, the frequency of a core with a higher invalid utilization rate is reduced, and the frequency of a core with a lower invalid utilization rate is increased. This shortens the time that cores with a higher invalid utilization rate spend waiting and thereby shortens the time the processor takes to process the target task, which improves both the efficiency with which the plurality of cores process the target task and the effective utilization rate of processor resources.

Optionally, the obtaining an invalid utilization rate of each core of the plurality of cores includes: acquiring the number of calculations performed by any core of the plurality of cores in each of at least one unit time period in the target process; and determining the ratio of the idle duration corresponding to the core to the total duration of the at least one unit time period as the invalid utilization rate of the core, wherein the idle duration corresponding to the core is the total duration of the idle unit time periods corresponding to the core among the at least one unit time period, and the number of calculations performed by the core in each of its idle unit time periods is less than a calculation number threshold.

For example, if a core performs floating-point calculations while processing the target task, the number of calculations may be the total number of calculations the core performs on at least one of floating-point parameters and vector parameters. If a core performs integer calculations while processing the target task, the number of calculations may be the total number of calculations the core performs on at least one of integer parameters and vector parameters, which is not limited in the embodiment of the present application. Each calculation performed by a core corresponds to an event, and the processor comprises a register for storing the event corresponding to each calculation. When obtaining the invalid utilization rate of any core of the plurality of cores, the computing device may read the register in each unit time period to obtain the number of events stored in the register in that unit time period. This number is the number of calculations performed by the core in that unit time period.

A calculation number threshold may be preset in the computing device. Since a core usually performs a large number of calculations when processing the target task, when the number of calculations of a core in any one of the at least one unit time period is less than the calculation number threshold, the core is not executing the target task in that unit time period, that is, the core is idling. In this case, the computing device may determine that unit time period to be an idle unit time period. For any core, after all idle unit time periods of the core among the at least one unit time period are determined, the idle duration of the core can be determined, and the invalid utilization rate of the core can therefore be obtained.
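The calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `invalid_utilization` and the sample counts are hypothetical, and the sketch assumes that all unit time periods have the same length, so that the ratio of durations equals the ratio of period counts.

```python
def invalid_utilization(counts_per_period, count_threshold):
    """Return the share of unit time periods in which the core was idle.

    A period counts as idle when its calculation count is below the
    calculation number threshold. Assumes equal-length periods.
    """
    if not counts_per_period:
        raise ValueError("need at least one unit time period")
    idle_periods = sum(1 for count in counts_per_period if count < count_threshold)
    return idle_periods / len(counts_per_period)


# Hypothetical samples: ten unit time periods, threshold of 1000 calculations.
samples = [5000, 4800, 120, 0, 5100, 4900, 30, 5200, 5050, 4950]
ratio = invalid_utilization(samples, 1000)
print(ratio)  # three of the ten periods fall below the threshold
```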

Optionally, the reducing the frequency of the first core in the target process includes: reducing the frequency of the first core in the target process based on the reference frequency of the first core, wherein the frequency of the first core is less than or equal to the reference frequency of the first core after reducing the frequency of the first core in the target process; wherein the reference frequency of the first core is positively correlated to: a difference between the invalid utilization threshold and an invalid utilization of the first core.

Optionally, the increasing the frequency of a second core of the plurality of cores in the target process includes: increasing the frequency of the second core in the target process based on the reference frequency of the second core, wherein the frequency of the second core is less than or equal to the reference frequency of the second core after increasing the frequency of the second core in the target process; wherein the reference frequency of the second kernel is positively correlated to: a difference between the invalid utilization threshold and an invalid utilization of the second core.

For example, the computing device may first determine a first core and a second core of the plurality of cores based on the invalid utilization rate of each core of the plurality of cores. The computing device may then reduce the frequency of the first core in the target process based on the reference frequency of the first core, and increase the frequency of the second core in the target process based on the reference frequency of the second core.

Optionally, the method further comprises: determining an average of the invalid utilization rates of the plurality of cores as the invalid utilization rate threshold. Illustratively, the average may be an arithmetic average, a geometric average, a square (root-mean-square) average, a harmonic average, a weighted average, or the like. The computing device may determine the reference frequency of each core from the average and a target formula. The target formula may be: s' = s × (1 + a − b), where s' denotes the reference frequency of any core of the plurality of cores, s denotes the current frequency of that core, a denotes the invalid utilization rate threshold, and b denotes the invalid utilization rate of that core. The computing device may reduce the frequency of the first core during processing of the target task to the reference frequency of the first core, and increase the frequency of the second core during processing of the target task to the reference frequency of the second core.
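As a minimal sketch of the target formula s' = s × (1 + a − b), the following Python snippet computes reference frequencies for two cores. The function name `reference_frequency`, the 2000 MHz starting frequency, and the two invalid utilization rates are hypothetical values chosen for illustration; the arithmetic average is used as the threshold.

```python
def reference_frequency(current_freq, threshold, invalid_util):
    """Target formula: s' = s * (1 + a - b)."""
    return current_freq * (1 + threshold - invalid_util)


# Hypothetical two-core example, both cores currently at 2000 MHz.
invalid_utils = [0.2, 0.4]
threshold = sum(invalid_utils) / len(invalid_utils)  # arithmetic average
refs = [reference_frequency(2000.0, threshold, b) for b in invalid_utils]
print(refs)  # the core below the threshold is raised, the other lowered
```

A core whose invalid utilization rate equals the threshold keeps its current frequency under this formula.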

Optionally, each core has a frequency threshold, and the frequency of a core in the process of processing the target task cannot be greater than its frequency threshold. Therefore, before increasing the frequency of the second core in the process of processing the target task, the method further includes: determining the minimum of two parameters, namely the reference frequency of the second core and the frequency threshold of the second core; and the increasing the frequency of a second core of the plurality of cores in the target process comprises: increasing the frequency of the second core in the target process to the minimum. This prevents the frequency of the second core from exceeding the frequency threshold after the frequency of the second core in the target process is increased.
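The capping rule amounts to taking the smaller of the two values. A short sketch, in which the function name `raised_frequency` and the 115 MHz frequency threshold are hypothetical:

```python
def raised_frequency(reference_freq, freq_threshold):
    """Frequency to which the second core is raised: never above its threshold."""
    return min(reference_freq, freq_threshold)


# Hypothetical frequency threshold of 115 MHz for both cores.
capped = raised_frequency(118.75, 115.0)    # reference exceeds the threshold
uncapped = raised_frequency(113.75, 115.0)  # reference is already within it
print(capped, uncapped)
```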

Optionally, the processor in the computing device further includes a plurality of power supply interfaces in one-to-one correspondence with the plurality of cores, and each power supply interface is configured to provide a voltage to the corresponding core to drive that core to process the target task. The computing device may reduce the frequency of the first core in processing the target task by reducing the voltage of the power supply interface corresponding to the first core, and may increase the frequency of the second core in processing the target task by increasing the voltage of the power supply interface corresponding to the second core.

It should be noted that the processing stability of a core is affected when its frequency in the target process is too high. For this reason, after the frequency of the first core in the target process is reduced, the frequency of the first core is less than or equal to the reference frequency of the first core, and after the frequency of the second core in the target process is increased, the frequency of the second core is less than or equal to the reference frequency of the second core. This prevents the first core and the second core from running at an excessively high frequency in the target process and reduces the impact on their processing stability.

Optionally, the method further comprises: after reducing the frequency of the first kernel in the target process and increasing the frequency of the second kernel in the target process, detecting whether the target process is finished; and when the target process is not finished, repeatedly executing the processes of obtaining the invalid utilization rate, reducing the frequency of the first kernel in the target process and increasing the frequency of the second kernel in the target process.

It should be noted that the target task executed by the plurality of cores of the processor may be periodic or aperiodic. When the target task is periodic, the operating scenarios of the plurality of cores are almost the same in each period, so the process of obtaining the invalid utilization rate, reducing the frequency of the first core in the target process, and increasing the frequency of the second core in the target process may be performed only once to obtain the final frequency modulation scheme. When the target task is aperiodic, the operating scenarios of the plurality of cores differ from one time period to the next, so as long as the computing device detects that the plurality of cores have not finished the target task, it may repeatedly execute the process of obtaining the invalid utilization rate, reducing the frequency of the first core in the target process, and increasing the frequency of the second core in the target process, so as to adjust the frequencies of the first core and the second core in real time. In this way, the difference between the processing durations of the plurality of cores in the process of processing the target task can be effectively reduced, and the effective utilization rate of processor resources is further improved.
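The repeat-until-finished behaviour can be sketched as a small control loop. Everything here is an illustrative assumption rather than the claimed implementation: the function `tune_until_done`, its callback parameters (`measure`, `apply_frequencies`, `is_finished`), and the simulated readings are hypothetical, and the loop uses the arithmetic average as the threshold together with the formula s' = s × (1 + a − b).

```python
def tune_until_done(measure, apply_frequencies, is_finished, freqs, periodic=False):
    """Measure, adjust, then check for completion; repeat while unfinished.

    measure()            -> list of invalid utilization rates, one per core
    apply_frequencies(f) -> pushes the new per-core frequencies to hardware
    is_finished()        -> True once the target process has ended
    For a periodic task a single pass is performed.
    """
    rounds = 0
    while True:
        utils = measure()
        threshold = sum(utils) / len(utils)
        # Cores above the threshold are slowed down, cores below it sped up.
        freqs = [f * (1 + threshold - u) for f, u in zip(freqs, utils)]
        apply_frequencies(freqs)
        rounds += 1
        if periodic or is_finished():
            return freqs, rounds


# Simulated run: three measurement rounds before the task finishes.
readings = iter([[0.1, 0.5], [0.2, 0.4], [0.3, 0.3]])
finished_flags = iter([False, False, True])
applied = []
final_freqs, rounds = tune_until_done(
    measure=lambda: next(readings),
    apply_frequencies=applied.append,
    is_finished=lambda: next(finished_flags),
    freqs=[100.0, 100.0],
)
print(rounds, final_freqs)
```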

Optionally, the sum of the frequency decreases of all first cores of the plurality of cores is greater than or equal to the sum of the frequency increases of all second cores of the plurality of cores. In this way, the total voltage required by the cores after their frequencies are adjusted is less than or equal to the total voltage required before the adjustment, which avoids the situation in which the electric energy available from the power supply system of the processor is insufficient to support the adjusted frequencies, and reduces the impact on the stability of the cores.
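This constraint can be checked before a new set of frequencies is applied. The sketch below is illustrative only; the function name `within_budget` and the sample frequencies (in MHz) are hypothetical.

```python
def within_budget(old_freqs, new_freqs):
    """True when the total frequency decrease covers the total increase."""
    pairs = list(zip(old_freqs, new_freqs))
    decrease = sum(old - new for old, new in pairs if new < old)
    increase = sum(new - old for old, new in pairs if new > old)
    return decrease >= increase


# Illustrative values: four cores at 100 MHz before the adjustment.
old = [100.0, 100.0, 100.0, 100.0]
balanced = within_budget(old, [113.75, 73.75, 118.75, 93.75])
overdrawn = within_budget(old, [120.0, 95.0, 120.0, 95.0])
print(balanced, overdrawn)
```

In the first case the decreases (26.25 + 6.25) exactly match the increases (13.75 + 18.75); in the second the increases exceed the decreases, so the adjustment would not satisfy the constraint.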

In a second aspect, a frequency modulation apparatus for a processor is provided, the frequency modulation apparatus for a processor includes various modules for performing the frequency modulation method of the processor according to any one of the first aspect.

In a third aspect, a computer-readable storage medium is provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the frequency modulation method of the processor according to any one of the first aspect.

In a fourth aspect, there is provided a chip comprising programmable logic and/or program instructions for implementing the frequency modulation method of the processor according to any one of the first aspect when the chip is in operation.

In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the frequency modulation method of the processor according to any one of the first aspect.

In a sixth aspect, a computing device is provided, the computing device comprising: a memory and a processor, wherein the processor is configured to execute a program stored in the memory to implement the frequency modulation method of the processor according to any one of the first aspect.

Drawings

Fig. 1 is a flowchart of a frequency modulation method of a processor according to an embodiment of the present application;

Fig. 2 is a schematic view of a scenario in which multiple cores in a computing device process a target task according to an embodiment of the present application;

Fig. 3 is a flowchart of a method for obtaining an invalid utilization rate of each core of a plurality of cores according to an embodiment of the present application;

Fig. 4 is a schematic diagram of threads running in multiple cores according to an embodiment of the present application;

Fig. 5 is another schematic diagram of threads running in multiple cores according to an embodiment of the present application;

Fig. 6 is a block diagram of a frequency modulation apparatus of a processor according to an embodiment of the present application;

Fig. 7 is a block diagram of another frequency modulation apparatus of a processor according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a computing device according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.

The computing device includes a processor, which typically includes multiple cores. The multiple cores are capable of processing tasks in parallel. When multiple cores process a task in parallel, each core processes a portion of the task. After each core finishes processing its portion, one of the cores aggregates the processing results of all the cores to obtain the processing result of the task.

Before multiple cores can process a task in parallel, the computing device must distribute the task to the cores, and the task is generally not distributed evenly. For example, in the High Performance Computing (HPC) domain, before a task is processed by a plurality of cores, the computing device generates a computing model based on the task and distributes the computing model to the plurality of cores. Because the computing model is generally complex, the portions of the task allocated to the cores are not uniform, and the amounts of calculation in those portions differ greatly, so the processing durations of the cores differ. Until the task as a whole is completed, a core that finishes its portion sooner may sit idle, resulting in a lower effective utilization rate of processor resources. The effective utilization rate of a core is positively correlated with the amount of calculation the core performs in the process of processing the task.

The related art provides two techniques for adjusting the frequencies of a plurality of cores of a processor. One is the processor acceleration (turbo) technique, which increases the processing speed of the cores; the other is the hardware performance state (HWP) technique, which reduces the processing power consumption of the cores. When either technique is used to adjust the frequencies of the cores, the processor detects whether each core is in a working state. In the turbo technique, the processor then controls the power supply system, within the Thermal Design Power (TDP) range of the processor, to increase the supply voltage of the cores in the working state and decrease the supply voltage of the cores in the non-working state. The operating frequency of the cores in the working state thus rises and that of the cores in the non-working state falls, which increases the processing speed of the cores in the working state. In the HWP technique, the processor reduces the operating frequency of the cores in the non-working state to reduce the processing power consumption of the cores. However, when a plurality of cores process a task in parallel, in some application scenarios (e.g., HPC application scenarios) a core that finishes its portion sooner and is idling is still judged by the processor to be in the working state. As a result, under the turbo technique the processor increases the operating frequency of every core, and under the HWP technique the processor does not reduce the operating frequency of any core. The cores that finish sooner therefore still idle, and the effective utilization rate of processor resources remains low. Neither of the two techniques provided by the related art can thus improve the effective utilization rate of processor resources.

The embodiment of the application provides a frequency modulation method for a processor. The method can be applied to a module of the processor other than the plurality of cores, for example a non-core (uncore) module or a management module of the processor. Alternatively, the method may be applied to a module of the computing device other than the processor, for example a Baseboard Management Controller (BMC). The method may also be applied to an external device different from the computing device where the processor is located, which is not limited in this embodiment of the application. Fig. 1 is a flowchart of a frequency modulation method of a processor according to an embodiment of the present application. Fig. 1 takes the case in which the method is applied to a computing device as an example, and the method may include:

Step 101, in a target process in which a plurality of cores of a processor process a target task in parallel, obtaining an invalid utilization rate of each core of the plurality of cores.

The plurality of cores are used for processing the target task. The plurality of cores may be all of the cores included in the processor or some of the cores included in the processor. The processor may be a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU).

In the target process in which the plurality of cores process the target task, when a multithreading technique is used so that multiple threads can run in parallel, one thread runs in each core; when multithreading is not used, one process runs in each core. For example, referring to fig. 2, fig. 2 is a schematic view of a scenario in which multiple cores in a computing device process a target task according to an embodiment of the present application. The scenario includes a computing device 20, the computing device 20 including a processor 201, the processor 201 including a plurality of cores (4 shown in fig. 2), the plurality of cores including a core a, a core b, a core c, and a core d. Fig. 2 illustrates an example in which each core runs one thread, where the core a runs thread a1, the core b runs thread b1, the core c runs thread c1, and the core d runs thread d1.

The invalid utilization rate characterizes how the resources of a core are used, and the invalid utilization rate of any core is negatively correlated with the amount of calculation performed by that core in the target process. The higher the invalid utilization rate of a core, the higher its current frequency is relative to its workload; the lower the invalid utilization rate of a core, the lower its current frequency is relative to its workload. Optionally, the plurality of cores may first start executing the target process, and step 101 may be executed after the target process has run for a period of time, so that the invalid utilization rate of each core can be obtained reliably rather than being missed because the target process has run for too short a time. Illustratively, the period of time may be three minutes, four minutes, five minutes, or greater than ten minutes.

For example, referring to fig. 3, fig. 3 is a flowchart of a method for obtaining an invalid utilization rate of each core of a plurality of cores according to an embodiment of the present application, where the method may include:

Step 1011, obtaining the number of calculations performed by any core of the plurality of cores in each of at least one unit time period in the target process.

Optionally, the unit time period may be 1 millisecond (ms), in which case the at least one unit time period may be three hundred thousand unit time periods (i.e., the total duration of the at least one unit time period is five minutes).

Optionally, if a core performs floating-point calculations in the target process, the number of calculations may be the total number of calculations the core performs on at least one of floating-point parameters and vector parameters. If a core performs integer calculations in the target process, the number of calculations may be the total number of calculations the core performs on at least one of integer parameters and vector parameters, which is not limited in the embodiment of the present application.

Each calculation performed by a core corresponds to an event, and the processor comprises a register for storing the event corresponding to each calculation. The computing device may read the register in each unit time period to obtain the number of events stored by the register in that unit time period. This number is the number of calculations performed by the core in that unit time period. Illustratively, the processor includes a Performance Monitor Unit (PMU) register in which a plurality of PMU data items are stored, each PMU data item recording one event. For any core of the plurality of cores, the computing device may obtain the number of calculations of the core in each unit time period by reading the PMU data stored in the PMU register in that unit time period.
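One common way to turn periodic register reads into per-period counts is to difference successive readings. The sketch below assumes, purely for illustration, that the event counter is cumulative and is sampled once at every unit time period boundary; the source does not state how the register accumulates events, and the function name `counts_per_period` and the sample readings are hypothetical. Counter wraparound is not handled.

```python
def counts_per_period(snapshots):
    """Convert cumulative counter readings into per-period event counts.

    snapshots[i] is the (assumed cumulative) counter value read at the
    i-th period boundary, so N + 1 readings yield N per-period counts.
    """
    return [later - earlier for earlier, later in zip(snapshots, snapshots[1:])]


# Hypothetical readings taken at four consecutive 1 ms boundaries.
per_period = counts_per_period([0, 5000, 9800, 9920])
print(per_period)
```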

Step 1012, determining the ratio of the idle duration corresponding to any core to the total duration of the at least one unit time period as the invalid utilization rate of the core, where the idle duration corresponding to the core is the total duration of the idle unit time periods corresponding to the core among the at least one unit time period, and the number of calculations performed by the core in each of its idle unit time periods is less than a calculation number threshold.

The calculation number threshold may be preset in the computing device. Since a core usually performs a large amount of calculation in the target process, when the number of calculations of a core in any one of the at least one unit time period is less than the calculation number threshold, the core is not executing the target task in that unit time period, i.e., the core is idling. In this case, the computing device may determine that unit time period to be an idle unit time period corresponding to the core. For any core, after all idle unit time periods corresponding to the core among the at least one unit time period are determined, the idle duration corresponding to the core can be determined, and the invalid utilization rate of the core is thereby obtained.

Illustratively, the calculation number threshold may be 800, 900, 1000, or 1100. Assume that the calculation number threshold is 1000, the unit time period is 1 ms, and the at least one unit time period is three hundred thousand unit time periods (i.e., the total duration is 5 minutes). The plurality of cores includes a core a, a core b, a core c, and a core d. Core a has 30,000 idle unit time periods among the three hundred thousand unit time periods (that is, the idle duration of core a is 0.5 minute), so the invalid utilization rate of core a is 0.5/5 = 0.1. Core b has 150,000 idle unit time periods (that is, the idle duration of core b is 2.5 minutes), so the invalid utilization rate of core b is 2.5/5 = 0.5. Core c has 15,000 idle unit time periods (that is, the idle duration of core c is 0.25 minute), so the invalid utilization rate of core c is 0.25/5 = 0.05. Core d has 90,000 idle unit time periods (that is, the idle duration of core d is 1.5 minutes), so the invalid utilization rate of core d is 1.5/5 = 0.3.
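The arithmetic of this example can be checked with a short script. Five minutes of 1 ms unit time periods is 5 × 60 × 1000 = 300,000 periods; the variable names below are hypothetical.

```python
UNIT_MS = 1                             # length of one unit time period
TOTAL_PERIODS = 5 * 60 * 1000 // UNIT_MS  # five minutes of 1 ms periods

# Idle unit time periods per core, taken from the example above.
idle_periods = {"a": 30_000, "b": 150_000, "c": 15_000, "d": 90_000}

idle_minutes = {core: n * UNIT_MS / 60_000 for core, n in idle_periods.items()}
utilization = {core: n / TOTAL_PERIODS for core, n in idle_periods.items()}

for core in idle_periods:
    print(core, idle_minutes[core], utilization[core])
```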

Step 102, determining a first core and a second core in the plurality of cores based on the invalid utilization rate of each core in the plurality of cores.

Since the invalid utilization rate of the first core is above the invalid utilization rate threshold and the invalid utilization rate of the second core is below the invalid utilization rate threshold, the computing device may determine each of the plurality of cores as the first core or the second core according to a magnitude relationship of the invalid utilization rate of each of the plurality of cores to the invalid utilization rate threshold. When the invalid utilization rate of any core is higher than the invalid utilization rate threshold value, determining any core as a first core. And when the invalid utilization rate of any core is lower than the invalid utilization rate threshold value, determining any core as a second core.

The invalid utilization rate threshold may be an average of the invalid utilization rates of the plurality of cores. Optionally, the average may be an arithmetic average, a geometric average, a square (root-mean-square) average, a harmonic average, a weighted average, or the like, which is not limited in this application. For example, referring to step 101, assuming that the average is an arithmetic average, the average of the invalid utilization rates of core a, core b, core c, and core d is (0.1 + 0.5 + 0.05 + 0.3)/4 = 0.2375. Assuming that the average is a square average, the average of the invalid utilization rates of core a, core b, core c, and core d is sqrt((0.1² + 0.5² + 0.05² + 0.3²)/4) ≈ 0.297.

When the invalid utilization rate threshold is 0.2375, referring to step 101, the invalid utilization rates of core a and core c are both lower than the invalid utilization rate threshold, and the invalid utilization rates of core b and core d are both higher than the invalid utilization rate threshold. The computing device may therefore determine core a and core c as second cores, and core b and core d as first cores.
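Step 102 can be sketched as follows. The function name `classify` and its `mean` parameter are hypothetical; the parameter defaults to the arithmetic average and could be swapped for another average, such as `statistics.geometric_mean` or `statistics.harmonic_mean` from the Python standard library. How a core whose rate exactly equals the threshold is handled is not specified in the text, so this sketch leaves such a core in neither group.

```python
import statistics


def classify(invalid_utils, mean=statistics.fmean):
    """Split cores into first cores (above threshold) and second cores (below)."""
    threshold = mean(list(invalid_utils.values()))
    first = [core for core, u in invalid_utils.items() if u > threshold]
    second = [core for core, u in invalid_utils.items() if u < threshold]
    return threshold, first, second


threshold, first, second = classify({"a": 0.1, "b": 0.5, "c": 0.05, "d": 0.3})
print(threshold, first, second)
```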

For example, the computing device may determine a magnitude relationship of the invalid utilization of each core to an invalid utilization threshold by a magnitude relationship of the current frequency of each core to its reference frequency. A first core and a second core of the plurality of cores are further determined. The reference frequency of either core is positively correlated to: the difference between the invalid utilization threshold and the invalid utilization of the any core. When the current frequency of any core is greater than the reference frequency of the core, the invalid utilization rate of any core is higher than an invalid utilization rate threshold value, and further any core can be determined as a first core; when the current frequency of any core is less than the reference frequency, the invalid utilization rate of any core is lower than the invalid utilization rate threshold value, and further, the any core can be determined as a second core.

Wherein the computing device may determine a reference frequency for each of the plurality of cores based on the invalid utilization of the plurality of cores and an invalid utilization threshold. Or the computing device may preset a one-to-one correspondence between different invalid utilization rates of the cores and the reference frequency, and the computing device may directly search the one-to-one correspondence between the different invalid utilization rates of the cores and the reference frequency according to the invalid utilization rate of any one of the cores, so as to determine the reference frequency corresponding to the invalid utilization rate of the any one core. And then the computing device determines the magnitude relation between the invalid utilization rate of each core and the invalid utilization rate threshold value based on the magnitude relation between the current frequency of each core and the reference frequency thereof, and further determines a first core and a second core in the plurality of cores.

For example, when determining the reference frequency for each of the plurality of cores based on the invalid utilization of the plurality of cores and an invalid utilization threshold, the computing device may determine the reference frequency for each core by the invalid utilization threshold and a target formula. The target formula may be: $s' = s \times (1 + a - b)$, where $s'$ represents the reference frequency of any one of the cores, $s$ represents the current frequency of that core, $a$ represents the invalid utilization threshold, and $b$ represents the invalid utilization of that core.

For example, referring to the foregoing steps 101 and 102, assume that the current frequencies of core a, core b, core c, and core d are all 100 megahertz (MHz), and that the invalid utilization threshold is the average of the invalid utilizations of the cores, namely 0.2375. The reference frequency of core a is 100 × (1 + 0.2375 − 0.1) = 113.75 MHz, the reference frequency of core b is 100 × (1 + 0.2375 − 0.5) = 73.75 MHz, the reference frequency of core c is 100 × (1 + 0.2375 − 0.05) = 118.75 MHz, and the reference frequency of core d is 100 × (1 + 0.2375 − 0.3) = 93.75 MHz. It can be seen that the larger the difference between the invalid utilization threshold and the invalid utilization of a core, the larger the reference frequency of that core; the smaller the difference, the smaller the reference frequency. The current frequencies of core a and core c are both less than their reference frequencies, and the current frequencies of core b and core d are both greater than their reference frequencies. The computing device may therefore determine core a and core c as second cores and core b and core d as first cores.
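The worked example above can be reproduced with a short sketch. The function and variable names below are illustrative assumptions and do not appear in the application; the invalid utilizations and the 100 MHz starting frequency are the values used in the example.

```python
def reference_frequency(current_mhz, threshold, invalid_util):
    """Target formula from the text: s' = s * (1 + a - b)."""
    return current_mhz * (1 + threshold - invalid_util)


def classify_cores(current_mhz, invalid_util):
    """Return (threshold, reference frequencies, first cores, second cores)."""
    # The threshold is taken as the average invalid utilization, as in the example.
    threshold = sum(invalid_util.values()) / len(invalid_util)
    refs, first, second = {}, [], []
    for core, b in invalid_util.items():
        refs[core] = reference_frequency(current_mhz[core], threshold, b)
        if current_mhz[core] > refs[core]:
            first.append(core)    # invalid utilization above the threshold
        elif current_mhz[core] < refs[core]:
            second.append(core)   # invalid utilization below the threshold
    return threshold, refs, first, second


invalid_util = {"a": 0.1, "b": 0.5, "c": 0.05, "d": 0.3}
current_mhz = {core: 100.0 for core in invalid_util}
threshold, refs, first_cores, second_cores = classify_cores(current_mhz, invalid_util)
print(threshold, refs, first_cores, second_cores)
```

Under these inputs the sketch classifies core b and core d as first cores and core a and core c as second cores, matching the text.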

Step 103, reducing the frequency of a first core of the plurality of cores in the target process.

Optionally, referring to step 102, the computing device may decrease the frequency of the first core in the target process based on the reference frequency of the first core. After reducing the frequency of the first core in the target process, the frequency of the first core is less than or equal to the reference frequency of the first core. Because the processing stability of the core is affected when the frequency of the core in the target process is too high, after the frequency of the first core in the target process is reduced, the frequency of the first core is less than or equal to the reference frequency of the first core, so that the frequency of the first core in the target process can be prevented from being too high, and the influence on the processing stability of the first core is reduced.

Illustratively, referring to the foregoing steps 101 and 102, among core a, core b, core c, and core d, core b and core d are the first cores. Before step 103 is performed, the current frequencies of both core b and core d are 100 MHz. The reference frequency of core b is 73.75 MHz, and the reference frequency of core d is 93.75 MHz. The computing device may reduce the frequency of core b in the target process to 73.75 MHz or 73 MHz, and reduce the frequency of core d in the target process to 93.75 MHz or 93 MHz.

Optionally, the processor in the computing device further includes a plurality of power supply interfaces in one-to-one correspondence with the plurality of cores, and the plurality of power supply interfaces are configured to provide voltages to the corresponding cores to drive the corresponding cores to process the target tasks. The computing device may reduce the frequency of the first core in the target process by reducing the voltage of the power supply interface corresponding to the first core.

Step 104, increasing the frequency of a second core of the plurality of cores in the target process.

The computing device may increase the frequency of the second core in the target process by increasing the voltage of the power supply interface corresponding to the second core. Optionally, referring to step 102, the computing device may increase the frequency of the second core in the target process based on the reference frequency of the second core. After the frequency of the second core in the target process is increased, the frequency of the second core is less than or equal to the reference frequency of the second core. Because the processing stability of a core is affected when its frequency in the target process is too high, keeping the increased frequency of the second core at or below its reference frequency prevents the frequency from becoming too high and reduces the influence on the processing stability of the second core.

Illustratively, referring to the foregoing steps 101 and 102, among core a, core b, core c, and core d, core a and core c are the second cores. Before step 104 is performed, the current frequencies of both core a and core c are 100 MHz. The reference frequency of core a is 113.75 MHz, and the reference frequency of core c is 118.75 MHz. The computing device may increase the frequency of core a in the target process to 113.75 MHz or 113 MHz, and increase the frequency of core c in the target process to 118.75 MHz or 118 MHz.

It is noted that each of the plurality of cores has a frequency threshold, and the frequency of each core in the target process cannot be greater than its frequency threshold. Thus, before performing step 104, the computing device may first determine the minimum of two parameters: the reference frequency of the second core and the frequency threshold of the second core. The computing device then increases the frequency of the second core in the target process to that minimum, so that the frequency of the second core does not exceed its frequency threshold after being increased.
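A minimal sketch of this clamping rule follows. The function name and the frequency threshold values (110 MHz and 120 MHz) are hypothetical and chosen only for illustration; the reference frequencies are those of core c and core a from the earlier example.

```python
def raised_frequency_mhz(reference_mhz, frequency_threshold_mhz):
    """Frequency to which a second core is raised: the smaller of its
    reference frequency and its frequency threshold."""
    return min(reference_mhz, frequency_threshold_mhz)


# Core c: reference 118.75 MHz, hypothetical frequency threshold 110 MHz.
capped = raised_frequency_mhz(118.75, 110.0)
# Core a: reference 113.75 MHz, hypothetical frequency threshold 120 MHz.
uncapped = raised_frequency_mhz(113.75, 120.0)
print(capped, uncapped)
```

In the first call the frequency threshold is the binding limit; in the second the reference frequency is.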

On the other hand, when the frequency of the cores of the processor is continuously increased, the available power of the power supply system of the processor is continuously reduced, so that there may be a situation that the voltage of the cores cannot be increased, thereby affecting the stability of the cores. In the embodiment of the present application, if the frequency of the first core in the target process is reduced first, and then the frequency of the second core in the target process is increased, the available electric energy in the power supply system of the processor can be increased after the frequency of the first core in the target process is reduced. In this way, the frequency of the second core in the target process can be increased as much as possible without affecting the stability of the plurality of cores.

Optionally, a sum of the reduced values of the frequencies of all first cores of the plurality of cores may be greater than or equal to a sum of the increased values of the frequencies of all second cores of the plurality of cores. Therefore, the total voltage required by the cores after the frequencies of the cores are adjusted is less than or equal to the total voltage required by the cores before the frequencies of the cores are adjusted, the situation that the frequencies of the cores cannot be adjusted by the available electric energy of the power supply system of the processor can be avoided, and the influence on the stability of the cores is reduced. Optionally, the sum of the reduced values of the frequencies of all the first cores in the plurality of cores may also be smaller than the sum of the increased values of the frequencies of all the second cores in the plurality of cores, which is not limited in this embodiment of the present application.
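The optional condition above can be checked with a sketch like the following. The helper name is an assumption; the frequencies are taken from the four-core example, in which every core is moved exactly to its reference frequency.

```python
def frequency_deltas(current_mhz, target_mhz):
    """Return (sum of frequency reductions, sum of frequency increases)."""
    decrease = sum(max(current_mhz[c] - target_mhz[c], 0.0) for c in current_mhz)
    increase = sum(max(target_mhz[c] - current_mhz[c], 0.0) for c in current_mhz)
    return decrease, increase


current_mhz = {"a": 100.0, "b": 100.0, "c": 100.0, "d": 100.0}
target_mhz = {"a": 113.75, "b": 73.75, "c": 118.75, "d": 93.75}
decrease, increase = frequency_deltas(current_mhz, target_mhz)
# The optional condition: total reduction is at least the total increase.
condition_met = decrease >= increase - 1e-9
print(decrease, increase, condition_met)
```

In this example the two sums are equal, because the threshold is the average invalid utilization and all cores start at the same frequency, so the deviations from the threshold cancel out.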

For example, please refer to fig. 4 and fig. 5. Fig. 4 is a schematic diagram of threads running in a plurality of cores according to an embodiment of the present application, and fig. 5 is another schematic diagram of threads running in a plurality of cores according to an embodiment of the present application. Fig. 4 shows the threads run in the plurality of cores when the frequencies of the plurality of cores in the target process are not changed, and fig. 5 shows the threads run in the plurality of cores after the frequencies of the plurality of cores in the target process are changed. Fig. 4 and fig. 5 each take as an example a plurality of cores including core a, core b, core c, and core d. As shown in fig. 4 and fig. 5, when core a, core b, core c, and core d process the target task, core a runs thread a1, core b runs thread b1, core c runs thread c1, and core d runs thread d1. As shown in fig. 4, both thread b1 and thread d1 have idle thread segments because they need to wait for thread a1 and thread c1. As shown in fig. 5, after the frequencies of core b and core d in the target process are reduced and the frequencies of core a and core c in the target process are increased, the running speeds of thread b1 and thread d1 decrease and the running speeds of thread a1 and thread c1 increase, so that the proportion of thread b1 occupied by idle thread segments and the proportion of thread d1 occupied by idle thread segments are both reduced. Therefore, the time that thread b1 and thread d1 spend waiting for thread a1 and thread c1 is reduced, the time taken by the plurality of cores to process the target task is further reduced, and the effective utilization rate of the plurality of cores is improved.
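The effect illustrated by fig. 4 and fig. 5 can be approximated with a deliberately simplified model. The model is an assumption and is not part of the application: each core's work is taken to be proportional to its busy fraction times its original frequency, and each thread's run time is its work divided by its frequency. The numbers are those of the four-core example.

```python
def finish_times(work, freq_mhz):
    """Run time of each core's thread under the assumed model: work / frequency."""
    return {core: work[core] / freq_mhz[core] for core in work}


invalid_util = {"a": 0.1, "b": 0.5, "c": 0.05, "d": 0.3}
before_mhz = {core: 100.0 for core in invalid_util}
after_mhz = {"a": 113.75, "b": 73.75, "c": 118.75, "d": 93.75}

# Assumed work model: busy fraction (1 - invalid utilization) times frequency.
work = {core: (1 - b) * before_mhz[core] for core, b in invalid_util.items()}

before = finish_times(work, before_mhz)
after = finish_times(work, after_mhz)
makespan_before = max(before.values())  # all threads wait for the slowest one
makespan_after = max(after.values())

# Share of the overall processing time that thread b1 spends waiting.
idle_b_before = 1 - before["b"] / makespan_before
idle_b_after = 1 - after["b"] / makespan_after
print(makespan_before, makespan_after, idle_b_before, idle_b_after)
```

Under this model both the overall processing time and the idle share of thread b1 shrink after frequency modulation, which is the qualitative behavior the figures describe.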

Step 105, detecting whether the target process is finished.

Step 106, when the target process is not finished, repeatedly executing the processes of obtaining the invalid utilization rate, reducing the frequency of the first core in the target process, and increasing the frequency of the second core in the target process.

The target task may be periodic or aperiodic. When the target task is periodic, the operating scenarios of the plurality of cores in each period are almost the same, so the final frequency modulation scheme can be obtained by performing steps 101 to 103 only once. When the target task is aperiodic, the operating scenarios of the plurality of cores differ from one time period to the next, so the computing device can repeatedly execute steps 101 to 103 when detecting that the target process is not finished, so as to adjust the frequencies of the first core and the second core in real time. In this way, the difference between the processing durations of the cores in the target process can be effectively reduced, and the effective utilization rate of processor resources is further improved.
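The repetition in steps 105 and 106 can be sketched as a simple loop. All names here are illustrative assumptions: the sampling and completion checks are passed in as callables, the second utilization sample is invented for the demonstration, and every core is moved directly to its reference frequency, which is only one of the options the text allows.

```python
def tune_once(freqs, invalid_util):
    """One pass of steps 101 to 104; returns the new per-core frequencies."""
    threshold = sum(invalid_util.values()) / len(invalid_util)
    refs = {c: freqs[c] * (1 + threshold - invalid_util[c]) for c in freqs}
    new_freqs = dict(freqs)
    # Step 103 first: lower the cores whose frequency exceeds the reference.
    for c in freqs:
        if freqs[c] > refs[c]:
            new_freqs[c] = refs[c]
    # Step 104 next: raise the cores whose frequency is below the reference.
    for c in freqs:
        if freqs[c] < refs[c]:
            new_freqs[c] = refs[c]
    return new_freqs


def tune_until_finished(freqs, sample_invalid_util, is_finished, max_rounds=1000):
    """Repeat the tuning pass until the target process ends (steps 105 and 106)."""
    rounds = 0
    while rounds < max_rounds:
        freqs = tune_once(freqs, sample_invalid_util())
        rounds += 1
        if is_finished():
            break
    return freqs, rounds


# Demonstration with canned data; the second sample is hypothetical.
samples = iter([
    {"a": 0.1, "b": 0.5, "c": 0.05, "d": 0.3},
    {"a": 0.2, "b": 0.2, "c": 0.2, "d": 0.2},
])
finished = iter([False, True])
final_freqs, rounds = tune_until_finished(
    {c: 100.0 for c in "abcd"}, lambda: next(samples), lambda: next(finished)
)
print(rounds, final_freqs)
```

In the second round all cores report the same invalid utilization, so every reference frequency equals the current frequency (up to rounding) and the frequencies are left effectively unchanged.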

In summary, according to the frequency modulation method of the processor provided in the embodiment of the present application, after obtaining the invalid utilization rate of each of the plurality of cores in the target process in which the plurality of cores of the processor process the target task in parallel, the computing device reduces the frequency of the first core in the target process and increases the frequency of the second core in the target process according to the invalid utilization rates, where the invalid utilization rate of the first core is higher than the invalid utilization rate threshold and the invalid utilization rate of the second core is lower than the invalid utilization rate threshold. Because the frequency of a core with a higher invalid utilization rate is reduced and the frequency of a core with a lower invalid utilization rate is increased, the waiting time of the core with the higher invalid utilization rate is reduced and the time taken by the processor to process the target task is further reduced, so that the efficiency with which the plurality of cores process the target task and the effective utilization rate of processor resources are improved.

It should be noted that, the foregoing steps are described by taking the example that the computing device performs frequency modulation on multiple cores included in one processor, in practical applications, the computing device may also perform frequency modulation on multiple cores included in multiple processors, and reference may be made to the foregoing steps 101 to 106 for a process of performing frequency modulation on each core, which is not described herein again in this embodiment of the present application.

The sequence of the method provided by the embodiment of the application can be properly adjusted, and the steps can be correspondingly increased or decreased according to the situation. Any method that can be easily modified by those skilled in the art within the technical scope of the present disclosure is also intended to be covered by the present disclosure. The aforementioned steps 103 and 104 may be performed simultaneously. In addition, as the operating frequency of the processor increases, the voltage requirement of the processor becomes more and more strict, so step 104 may be executed first and then step 103 may be executed in order to maintain the stability of the processor, which is not limited in the embodiment of the present application.

The frequency modulation method of the processor provided by the embodiment of the present application is described in detail above with reference to fig. 1 to 5, and the frequency modulation apparatus of the processor provided by the embodiment of the present application is described below with reference to fig. 6 and 7.

Referring to fig. 6, fig. 6 is a block diagram of a frequency modulation apparatus for a processor according to an embodiment of the present application, where the frequency modulation apparatus 300 for a processor includes:

an obtaining module 301, configured to obtain an invalid utilization rate of each core in a plurality of cores in a target process of parallel processing of a target task by the plurality of cores of the processor, where the invalid utilization rate of any core is inversely related to a computation amount of the any core in the target process.

A frequency modulation module 302, configured to reduce a frequency of a first core of the plurality of cores in the target process.

The frequency modulation module 302 is further configured to increase a frequency of a second core of the plurality of cores in the target process.

The invalid utilization rate of the first core is higher than an invalid utilization rate threshold, and the invalid utilization rate of the second core is lower than the invalid utilization rate threshold.

Optionally, the obtaining module 301 is configured to:

acquiring the number of times of calculation of any one of a plurality of kernels in at least one unit time period in a target process;

determining the ratio of the idle time corresponding to any core to the total time of at least one unit time period as the invalid utilization rate of any core, wherein the idle time corresponding to any core is the total time of the idle unit time period corresponding to any core in at least one unit time period, and the calculation times of any core in the corresponding idle unit time period are less than the calculation time threshold.
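As an illustration of this calculation, a minimal sketch follows. The function name, the sample calculation counts, and the count threshold of 10 are assumptions made for the example; unit time periods are assumed to be of equal length.

```python
def invalid_utilization(calc_counts, count_threshold, unit_seconds=1.0):
    """Ratio of idle time to total time over the sampled unit time periods.

    A unit time period counts as idle when the number of calculations the
    core performed in it is below count_threshold.
    """
    if not calc_counts:
        raise ValueError("at least one unit time period is required")
    idle_time = sum(unit_seconds for n in calc_counts if n < count_threshold)
    total_time = unit_seconds * len(calc_counts)
    return idle_time / total_time


# Hypothetical per-period calculation counts for one core over ten periods.
counts = [120, 3, 0, 95, 80, 2, 110, 90, 70, 60]
ratio = invalid_utilization(counts, count_threshold=10)
print(ratio)
```

Three of the ten periods fall below the threshold here, so the sketch reports an invalid utilization rate of 0.3.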

Optionally, the frequency modulation module 302 is configured to:

reducing the frequency of the first core in the target process based on the reference frequency of the first core, wherein after reducing the frequency of the first core in the target process, the frequency of the first core is less than or equal to the reference frequency of the first core;

wherein the reference frequency of the first kernel is positively correlated to: a difference between the invalid utilization threshold and the invalid utilization of the first core.

Optionally, the frequency modulation module 302 is configured to:

increasing the frequency of the second core in the target process based on the reference frequency of the second core, wherein after the frequency of the second core in the target process is increased, the frequency of the second core is less than or equal to the reference frequency of the second core;

wherein the reference frequency of the second core is positively correlated to: a difference between the invalid utilization threshold and the invalid utilization of the second core.

Fig. 7 shows a block diagram of a frequency modulation apparatus of another processor provided in an embodiment of the present application, and referring to fig. 7, based on fig. 6, the frequency modulation apparatus 300 of the processor further includes:

a determining module 303, configured to determine an average of the invalid utilization rates of the multiple cores as an invalid utilization rate threshold.

Optionally, as shown in fig. 7, the frequency modulation apparatus 300 of the processor further includes:

the detecting module 304 is configured to detect whether the target process is finished after reducing the frequency of the first kernel in the target process and increasing the frequency of the second kernel in the target process.

A repeating module 305, configured to repeatedly execute a process of obtaining an invalid utilization rate, reducing the frequency of the first core in the target process, and increasing the frequency of the second core in the target process when the target process is not finished.

Optionally, a sum of the decreasing values of the frequencies of all first cores of the plurality of cores is greater than or equal to a sum of the increasing values of the frequencies of all second cores of the plurality of cores.

To sum up, in the frequency modulation apparatus of a processor provided in the embodiment of the present application, after the obtaining module obtains the invalid utilization rate of each of the plurality of cores in the target process in which the plurality of cores of the processor process the target task in parallel, the frequency modulation module reduces the frequency of the first core in the target process and increases the frequency of the second core in the target process according to the invalid utilization rates, where the invalid utilization rate of the first core is higher than an invalid utilization rate threshold and the invalid utilization rate of the second core is lower than the invalid utilization rate threshold. Because the frequency of a core with a higher invalid utilization rate is reduced and the frequency of a core with a lower invalid utilization rate is increased, the waiting time of the core with the higher invalid utilization rate is reduced and the time taken by the processor to process the target task is further reduced, so that the efficiency with which the plurality of cores process the target task and the effective utilization rate of processor resources are improved.

The embodiment of the application provides a computer-readable storage medium, and a computer program is stored in the storage medium, and when being executed by a processor, the computer program realizes the frequency modulation method of any processor provided by the embodiment of the application.

The embodiment of the present application provides a chip, where the chip includes a programmable logic circuit and/or a program instruction, and when the chip runs, the chip is configured to implement the frequency modulation method of any one of the processors provided in the embodiment of the present application.

The embodiment of the present application provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the frequency modulation method of any one of the processors provided in the embodiment of the present application.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product comprising one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium, or a semiconductor medium (e.g., solid state disk), among others.

An embodiment of the present application provides a computing device, including a memory and a processor, where the processor is configured to execute a program stored in the memory so as to implement the frequency modulation method of any processor provided by the embodiment of the application. Optionally, the processor may include a plurality of cores and the frequency modulation apparatus of any one of the processors provided in the embodiments of the present application. Alternatively, the processor may include a plurality of cores and a chip provided by the embodiment of the present application. Alternatively, the computing device may further include the frequency modulation apparatus of any processor provided by the embodiment of the application. Alternatively, the computing device may further include a chip provided in the embodiments of the present application.

For example, please refer to fig. 8, fig. 8 is a schematic structural diagram of a computing device according to an embodiment of the present application, and the embodiment of the present application takes fig. 8 as an example to describe a case where a processor includes a plurality of cores and a frequency modulation device of any one of the processors according to the foregoing embodiments. As shown in fig. 8, the computing device 40 includes: a memory 401 and a processor 402. The memory 401 is configured to store a program, and the processor 402 is configured to execute the program stored in the memory 401, so as to implement the frequency tuning method of any processor provided in the embodiments of the present application.

Optionally, as shown in fig. 8, the computing device 40 may also include at least one communication interface 403 and at least one communication bus 404. The memory 401, processor 402, and communication interface 403 are communicatively coupled via a communication bus 404. Wherein the communication interface 403 is used for communicating with other devices under the control of the processor 402, the processor 402 may call programs stored in the memory 401 through the communication bus 404.

In this application, the terms "first" and "second," etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The term "plurality" means two or more unless expressly limited otherwise.

It should be noted that, the method embodiments and the apparatus embodiments provided in the embodiments of the present application can all be mutually referred to, and the embodiments of the present application do not limit this. The sequence of the steps of the method embodiments provided in the embodiments of the present application can be appropriately adjusted, and the steps can be correspondingly increased or decreased according to the situation, and any method that can be easily conceived by those skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application, and therefore, the details are not repeated.

While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method for frequency tuning a processor, the method comprising:

respectively acquiring an invalid utilization rate of a first processor core and an invalid utilization rate of a second processor core of the processor, wherein the first processor core and the second processor core are used for processing target tasks in parallel, the invalid utilization rate of the first processor core and the invalid utilization rate of the second processor core are respectively used for indicating the resource use condition of the first processor core or the second processor core, the invalid utilization rate of the first processor core is negatively related to the calculated amount of the first processor core, and the invalid utilization rate of the second processor core is negatively related to the calculated amount of the second processor core;

when the invalid utilization rate of the first processor core is higher than that of the second processor core, reducing the frequency of the first processor core in the process of processing the target task;

and increasing the frequency of the second processor core in the process of processing the target task.

2. The method of claim 1, wherein prior to the reducing the frequency of the first processor core in processing the target task when the invalid utilization of the first processor core is higher than the invalid utilization of the second processor core, the method further comprises:

determining that the invalid utilization of the first processor core is higher than the invalid utilization of the second processor core according to an invalid utilization threshold.

3. The method of claim 1 or 2, wherein reducing the frequency of the first processor core in processing the target task when the invalid utilization of the first processor core is higher than the invalid utilization of the second processor core comprises:

respectively acquiring the counting times of a first period of the first processor core and the second processor core in the process of processing the target task;

calculating the invalid utilization rate of the first processor core according to the idle time of the first processor core and the time of the first period;

and calculating the invalid utilization rate of the second processor core according to the idle time of the second processor core and the time of the first period.

4. The method of any of claims 1-3, wherein the reducing the frequency of the first processor core in processing the target task comprises:

and reducing the frequency of the first processor core in the target process according to the reference frequency of the first processor core, wherein the reduced frequency of the first core is less than or equal to the reference frequency of the first core.

5. The method of claim 4, wherein a difference between the invalid utilization threshold and the invalid utilization of the first processor core positively correlates to a reference frequency of the first processor core.

6. The method as claimed in any one of claims 1 to 3, wherein the increasing the frequency of the second processor core in processing the target task comprises:

and increasing the frequency of the second processor core in the target process according to the reference frequency of the second processor core, wherein the increased frequency of the second processor core in the target process is less than or equal to the reference frequency of the second processor core.

7. The method of claim 4, wherein a difference between the invalid utilization threshold and the invalid utilization of the second processor core positively correlates to a reference frequency of the second processor core.

8. The method of any one of claims 1 to 7, further comprising:

determining an average of the invalid utilization of the first processor core and the invalid utilization of the second processor core as the invalid utilization threshold.

9. A frequency tuning apparatus for a processor, the frequency tuning apparatus comprising:

an obtaining module, configured to obtain an invalid utilization rate of a first processor core and an invalid utilization rate of a second processor core of the processor, where the first processor core and the second processor core are configured to process a target task in parallel, the invalid utilization rate of the first processor core and the invalid utilization rate of the second processor core are respectively configured to indicate resource usage of the first processor core or the second processor core, the invalid utilization rate of the first processor core is inversely related to a computation amount of the first processor core, and the invalid utilization rate of the second processor core is inversely related to a computation amount of the second processor core;

the frequency modulation module is used for reducing the frequency of the first processor core in the target task processing process when the invalid utilization rate of the first processor core is higher than that of the second processor core; and increasing the frequency of the second processor core in the process of processing the target task.

10. Frequency modulation apparatus according to claim 9, characterized in that the frequency modulation apparatus further comprises:

the determining module is configured to determine that the invalid utilization rate of the first processor core is higher than the invalid utilization rate of the second processor core according to an invalid utilization rate threshold.

11. Frequency-modulation device according to claim 9 or 10,

the obtaining module is specifically configured to obtain the count times of a first period in the process of processing the target task by the first processor core and the second processor core, respectively; calculate the invalid utilization rate of the first processor core according to the idle time of the first processor core and the time of the first period; and calculate the invalid utilization rate of the second processor core according to the idle time of the second processor core and the time of the first period.

12. Frequency modulation device according to one of claims 9 to 11,

the frequency modulation module is specifically configured to reduce the frequency of the first processor core in the target process according to the reference frequency of the first processor core, where the reduced frequency of the first core is less than or equal to the reference frequency of the first core.

13. The frequency modulation apparatus of claim 12, wherein a difference between the invalid utilization threshold and the invalid utilization of the first processor core is positively correlated to a reference frequency of the first processor core.

14. Frequency modulation device according to one of claims 9 to 11,

the frequency modulation module is specifically configured to increase the frequency of the second processor core in the target process according to the reference frequency of the second processor core, where the increased frequency of the second processor core in the target process is less than or equal to the reference frequency of the second processor core.

15. The frequency modulation apparatus of claim 14, wherein a difference between the invalid utilization threshold and the invalid utilization of the second processor core is positively correlated to a reference frequency of the second processor core.

16. Frequency modulation device according to one of claims 1 to 15,

the determining module is further configured to determine an average of the invalid utilization rate of the first processor core and the invalid utilization rate of the second processor core as the invalid utilization rate threshold.

17. A processor, characterized by comprising a processor and a power supply interface for supplying power to the processor, the processor being configured to perform the operating steps of the frequency modulation method of the processor according to any one of claims 1 to 8.

18. A computing device comprising a memory and a processor, wherein the processor is configured to execute a program stored in the memory to implement the frequency tuning method of the processor of any one of claims 1 to 8.