CN116541162A - Computing power control method and device, storage medium and electronic equipment - Google Patents

Computing power control method and device, storage medium and electronic equipment

Info

Publication number
CN116541162A
CN116541162A (application number CN202310269889.2A)
Authority
CN
China
Prior art keywords
gpu
utilization rate
task
power quota
actual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310269889.2A
Other languages
Chinese (zh)
Inventor
张伟韬
陈飞
邹懋
王鲲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vita Technology Beijing Co ltd
Original Assignee
Vita Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vita Technology Beijing Co ltd filed Critical Vita Technology Beijing Co ltd
Priority to CN202310269889.2A
Publication of CN116541162A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 11/00 Automatic controllers
    • G05B 11/01 Automatic controllers electric
    • G05B 11/36 Automatic controllers electric with provision for obtaining particular characteristics, e.g. proportional, integral, differential
    • G05B 11/42 Automatic controllers electric with provision for obtaining particular characteristics, e.g. proportional, integral, differential for obtaining a characteristic which is both proportional and time-dependent, e.g. P.I., P.I.D.
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/50 Indexing scheme relating to G06F9/50
    • G06F 2209/508 Monitor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The disclosure relates to a computing power control method and device, a storage medium, and an electronic device. The method includes: in response to a GPU task executed on a virtual GPU with a set GPU utilization, acquiring the actual GPU utilization of the GPU task, where the set GPU utilization represents the computing power the virtual GPU may occupy of the whole GPU; determining, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization; and issuing the target computing power quota to continue executing the GPU task. In this computing power control method, the running condition of the GPU is reflected by the actual GPU utilization, and the target computing power quota issued to the GPU task is flexibly adjusted by the PID algorithm on that basis, improving the accuracy of computing power control.

Description

Computing power control method and device, storage medium and electronic equipment
Technical Field
The disclosure relates to the field of computer technology, and in particular to a computing power control method and device, a storage medium, and an electronic device.
Background
With the rapid development of artificial intelligence (AI) applications, demand for GPUs (Graphics Processing Units) keeps increasing. In some service scenarios, an AI service does not need the computing power of an entire GPU but only part of it. In this case, GPU virtualization techniques can be used to create multiple virtual GPUs, so that multiple AI tasks run on one physical GPU at the same time, reducing hardware cost. However, precisely and conveniently controlling the computing power quota of each virtual GPU remains a challenge.
Disclosure of Invention
The disclosure aims to provide a computing power control method and device, a storage medium, and an electronic device, so as to improve the accuracy of computing power control.
To achieve the above object, a first aspect of the embodiments of the present disclosure provides a computing power control method, the method including:
in response to a GPU task executed on a virtual GPU with a set GPU utilization, acquiring the actual GPU utilization of the GPU task, where the set GPU utilization represents the computing power the virtual GPU may occupy of the whole GPU;
determining, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization;
and issuing the target computing power quota to continue executing the GPU task.
Optionally, the determining, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization includes:
determining a scaling factor according to the actual GPU utilization and the set GPU utilization;
adjusting parameters of the PID algorithm according to the scaling factor when the scaling factor is within a preset coefficient range;
and determining, according to the PID algorithm with the adjusted parameters, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization.
Optionally, the scaling factor is determined according to the following formula:
where K_cons is the computing power quota actually consumed by the GPU task in the last allocation period, K_allo is the computing power quota allocated to the GPU task in the last allocation period, U_set is the set GPU utilization, and U_actu is the actual GPU utilization.
Optionally, the GPU task includes a plurality of GPU subtasks, and the method further includes:
in response to the next GPU subtask submitted by a GPU task submission thread, determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
Optionally, the determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask includes:
determining a remaining computing power quota of the virtual GPU according to the target computing power quota;
and putting the next GPU subtask to sleep when the remaining computing power quota is smaller than the standby computing power quota corresponding to the next GPU subtask.
Optionally, the determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask includes:
before submitting the next GPU subtask to the GPU, determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
Optionally, the acquiring the actual GPU utilization of the GPU task includes:
acquiring the actual GPU utilization of the GPU task according to a preset allocation period.
A second aspect of the embodiments of the present disclosure provides a computing power control device, the device including:
an acquisition module configured to, in response to a GPU task executed on a virtual GPU with a set GPU utilization, acquire the actual GPU utilization of the GPU task, where the set GPU utilization represents the computing power the virtual GPU may occupy of the whole GPU;
a determination module configured to determine, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization;
and an issuing module configured to issue the target computing power quota to continue executing the GPU task.
A third aspect of the embodiments of the present disclosure provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the computing power control method provided in any one of the first aspects of the disclosure.
A fourth aspect of the embodiments of the present disclosure provides an electronic device, including:
a memory having a computer program stored thereon;
a processor configured to execute the computer program in the memory to implement the steps of the computing power control method provided in any one of the first aspects.
With the above technical scheme, the actual GPU utilization of a GPU task executed on a virtual GPU with a set GPU utilization is acquired, so that the deviation between the actual GPU utilization and the set GPU utilization can be determined. On that basis, a target computing power quota for adjusting this deviation is determined according to a PID algorithm, and the target computing power quota is issued to continue executing the GPU task. In this process, the running condition of the GPU is reflected by the actual GPU utilization, and the target computing power quota issued to the GPU task is flexibly adjusted by the PID algorithm, improving the accuracy of computing power control.
Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:
FIG. 1 is a flow chart of a computing power control method according to an exemplary embodiment.
FIG. 2 is a flow chart of a computing power control method according to another exemplary embodiment.
FIG. 3 is a block diagram of a computing power control device according to an exemplary embodiment.
FIG. 4 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the disclosure, are not intended to limit the disclosure.
The related art generally performs computing power control in the following ways to run multiple GPU tasks (i.e., AI tasks executed on a GPU) simultaneously on one physical GPU:
Mode one: GPU computing power control based on CPU time-slice scheduling with a fixed quota. This approach allocates corresponding GPU computing power quotas to different GPU tasks based on CPU time-slice scheduling. However, CPU time slices cannot map exactly onto GPU time slices, so the GPU computing power quota allocated this way may not match the actual demand of the GPU task. Moreover, after the CPU issues a computing task to the GPU, the CPU cannot control its execution: even if the CPU process is put to sleep, the GPU task continues to execute on the GPU. Therefore, CPU time-slice scheduling cannot control the GPU computing power quota accurately.
Mode two: GPU computing power control based on hardware support. This approach partitions at the hardware level, physically dividing one graphics card into several independent regions, each with its own compute units and cache, so that different GPU tasks are assigned to different regions and multiple GPU tasks run simultaneously on one physical GPU. However, this approach requires hardware changes, is costly, and the amount of computing power each hardware-divided region can carry is fixed (e.g., 25% or 50%), giving low flexibility.
In view of this, the embodiments of the present disclosure provide a computing power control method and device, a storage medium, and an electronic device. By acquiring the actual GPU utilization of a GPU task executed on a virtual GPU with a set GPU utilization, the deviation between the actual GPU utilization and the set GPU utilization can be determined. On that basis, a target computing power quota for adjusting this deviation is determined according to a PID algorithm and issued to continue executing the GPU task. In this process, the running condition of the GPU is reflected by the actual GPU utilization, and the target computing power quota issued to the GPU task is flexibly adjusted by the PID algorithm, so that computing power is controlled flexibly without hardware modification and the accuracy of computing power control is improved.
Referring to FIG. 1, FIG. 1 is a flow chart of a computing power control method according to an exemplary embodiment. The computing power control method can be applied to a computing device, such as a mobile terminal or a cloud server, which the disclosure does not specifically limit. As shown in FIG. 1, the computing power control method includes:
S101: in response to a GPU task executed on a virtual GPU with a set GPU utilization, acquiring the actual GPU utilization of the GPU task.
Here, the set GPU utilization represents the computing power the virtual GPU may occupy of the whole GPU, and may be chosen according to actual conditions, which the disclosure does not specifically limit. Because the set GPU utilization in the embodiments of the present disclosure is variable, the computing power ratio between different virtual GPUs can be set on demand, so that the computing power of the whole GPU is partitioned as required. Compared with mode two, which divides several independent computing power regions at the hardware level, this manner of controlling computing power is more flexible.
It should be noted that, after the whole GPU is divided into multiple virtual GPUs each with a set GPU utilization, GPU tasks may be executed on each virtual GPU based on its corresponding set GPU utilization. However, the actual GPU utilization of a GPU task executed on a virtual GPU typically deviates from the set GPU utilization of that virtual GPU, so the actual GPU utilization of the GPU task may be acquired to determine the deviation between the actual and set GPU utilization.
It should also be noted that the way to obtain the actual GPU utilization of a GPU task differs across graphics card types. For example, for an NVIDIA graphics card, the actual GPU utilization may be obtained by calling an interface provided by NVIDIA (e.g., the nvmlDeviceGetUtilizationRates interface).
S102: determining, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization.
The PID algorithm is a closed-loop control algorithm. It determines a target computing power quota for correcting the deviation from the proportional, integral, and derivative terms of the deviation between the actual GPU utilization and the set GPU utilization, so that the deviation is effectively corrected and the actual GPU utilization approaches the set GPU utilization. The calculation formula of the PID algorithm is available in the related art and is not repeated here. The computing power quota may refer to the amount of computation the GPU may use. For example, when a GPU task is executed by invoking a related interface, the computing power quota may be characterized by the kernel parameter of that interface. The related interface submits kernel functions to the GPU, and the specific interface to call depends on the actual situation. For an NVIDIA graphics card, for example, the related interface may be cudaLaunchKernel, whose kernel parameter characterizes the computing power quota; by calling this interface to submit kernels, the GPU task is executed on the GPU.
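As a concrete illustration, the closed-loop adjustment described above can be sketched in Python. This is a minimal sketch, not the patent's implementation: the gain values, the clamping range, and the mapping from PID output to quota are all assumptions.

```python
class PIDController:
    """Minimal discrete PID controller driving actual utilization toward the set point."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, set_util, actual_util):
        error = set_util - actual_util        # deviation between set and actual utilization
        self.integral += error                # accumulated (integral) term
        derivative = error - self.prev_error  # change of error (derivative) term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def next_quota(current_quota, pid, set_util, actual_util):
    """Issue the next target computing power quota, clamped to [0, 100] percent."""
    adjustment = pid.update(set_util, actual_util)
    return max(0.0, min(100.0, current_quota + adjustment))
```

For example, with a set utilization of 50% and an actual utilization of 40%, the controller raises the issued quota; with an actual utilization above the set point, it lowers it.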
It will be appreciated that, as the actual GPU utilization changes, the target computing power quota determined by the PID algorithm for adjusting the deviation between the actual and set GPU utilization changes accordingly. Compared with mode one, GPU computing power control based on fixed-quota CPU time-slice scheduling, the quota allocated here is not fixed; the issued target computing power quota is flexibly adjusted according to the current actual GPU utilization of the GPU task, so computing power control is more accurate.
In addition, the PID algorithm is highly general and can be applied directly to GPUs of different models, greatly reducing the workload and cost of adapting to different GPUs.
S103: issuing the target computing power quota to continue executing the GPU task.
It will be appreciated that once a target computing power quota for adjusting the deviation between the actual and set GPU utilization is determined, the target computing power quota may be issued to continue executing the GPU task.
According to the technical scheme provided by the embodiments of the disclosure, the running condition of the GPU is reflected by the actual GPU utilization, and the target computing power quota issued to the GPU task is flexibly adjusted by the PID algorithm on that basis, so that computing power is controlled flexibly without hardware modification and the accuracy of computing power control is improved.
Optionally, step S101 may include:
acquiring the actual GPU utilization of the GPU task according to a preset allocation period.
The preset allocation period may be determined according to the actual situation, for example 100 ms, which the disclosure does not specifically limit.
In the embodiments of the disclosure, a monitoring thread may acquire the actual GPU utilization of the GPU task according to the preset allocation period, so that a computing power quota is issued periodically to continue executing the GPU task.
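The periodic acquisition by the monitoring thread can be sketched as a loop over allocation periods. The 100 ms period comes from the text; the names `read_actual_util` and `issue_quota` are hypothetical stand-ins for a real driver query and a PID-based quota computation.

```python
import time

ALLOCATION_PERIOD_S = 0.1  # 100 ms preset allocation period (example value from the text)

def monitor_loop(read_actual_util, issue_quota, periods, sleep=time.sleep):
    """Each allocation period: read the actual GPU utilization, then issue a quota.

    `read_actual_util` would wrap a driver interface in practice;
    `issue_quota` would be a PID-based computation.
    """
    issued = []
    for _ in range(periods):
        actual = read_actual_util()
        issued.append(issue_quota(actual))
        sleep(ALLOCATION_PERIOD_S)  # wait for the next allocation period
    return issued
```

In tests or simulations the `sleep` parameter can be replaced with a no-op so the loop runs instantly.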
Optionally, step S102 may include:
determining a scaling factor according to the actual GPU utilization and the set GPU utilization;
adjusting parameters of the PID algorithm according to the scaling factor when the scaling factor is within the preset coefficient range;
and determining, according to the PID algorithm with the adjusted parameters, a target computing power quota for adjusting the deviation between the actual GPU utilization and the set GPU utilization.
Here, the scaling factor can be determined according to the following formula:
where K_cons is the computing power quota actually consumed by the GPU task in the last allocation period, K_allo is the computing power quota allocated to the GPU task in the last allocation period, U_set is the set GPU utilization, and U_actu is the actual GPU utilization.
In one possible implementation, the computing power quota actually consumed by the GPU task may be determined from the number of times the related interface was invoked in the last allocation period and the value of the quota-characterizing parameter set at each invocation, and the computing power quota allocated to the GPU task may be taken from the target computing power quota determined by the PID algorithm in the last allocation period. The scaling factor, which characterizes the fluctuation of the virtual GPU's computing power, can then be determined from the quota actually consumed in the last allocation period, the quota allocated in the last allocation period, the set GPU utilization, and the actual GPU utilization.
It should be noted that a scaling factor outside the preset coefficient range indicates that the computing power of the virtual GPU fluctuates only within a small range, whereas a scaling factor within the preset coefficient range indicates large fluctuation, that is, a large gap between the actual GPU utilization and the set GPU utilization. The preset coefficient range may be determined according to actual conditions; for example, it may cover the values smaller than 1/2 and/or larger than 2, which the disclosure does not specifically limit.
It should also be noted that when the computing power of the virtual GPU fluctuates only slightly, fine adjustment between the actual and set GPU utilization can be achieved through the negative feedback of the PID algorithm. When the gap between the actual and set GPU utilization is large, the PID algorithm often converges to the set GPU utilization slowly. In this case, the parameters of the PID algorithm can be adjusted according to the scaling factor, realizing adaptive parameter tuning, accelerating convergence, and improving task execution efficiency.
For example, when the scaling factor is within the preset coefficient range, the proportional (P), integral (I), and derivative (D) results of the PID algorithm can each be multiplied by the scaling factor, accelerating the convergence of the PID algorithm so that the computing power of the virtual GPU is controlled quickly and accurately, improving user experience.
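The adaptive step can be sketched as follows. The formula for the scaling factor is not reproduced in this text, so the coefficient is taken as an input here; the range bounds 1/2 and 2 follow the example above, and multiplying each of the P, I, and D results by the factor follows the paragraph above.

```python
def adaptive_pid_output(p_term, i_term, d_term, coef, low=0.5, high=2.0):
    """Scale the P, I, D results by the coefficient when it signals large fluctuation.

    A coefficient below `low` or above `high` (i.e. inside the preset coefficient
    range described in the text) indicates that the virtual GPU's computing power
    fluctuates strongly, so each term is multiplied by the coefficient to speed
    up convergence; otherwise the plain PID sum is returned.
    """
    if coef < low or coef > high:
        return coef * p_term + coef * i_term + coef * d_term
    return p_term + i_term + d_term
```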
With this technical scheme, a scaling factor that reflects the fluctuation of the virtual GPU's computing power is determined from the actual and set GPU utilization, and the PID algorithm adapts its parameters according to the scaling factor when the fluctuation is large. This speeds up the convergence of the PID algorithm and, on top of the flexible computing power control the PID algorithm already provides, further improves execution efficiency, so that the computing power of the virtual GPU is controlled more quickly and accurately.
Optionally, the GPU task may include a plurality of GPU subtasks, which are typically submitted to the GPU by the CPU; for example, the GPU subtasks are submitted to the GPU by a GPU task submission thread in the CPU. On this basis, the technical scheme provided by the embodiments of the disclosure may include:
in response to the next GPU subtask submitted by the GPU task submission thread, determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
Here, the standby computing power quota is related to the next GPU subtask submitted by the GPU task submission thread, which the disclosure does not specifically limit.
In one possible implementation, a remaining computing power quota of the virtual GPU may be determined from the target computing power quota, and the next GPU subtask is put to sleep when the remaining quota is smaller than the standby quota corresponding to the next GPU subtask.
It can be understood that when the remaining computing power quota is smaller than the standby computing power quota corresponding to the next GPU subtask, the current virtual GPU may not have enough computation available to execute the next GPU subtask, so the next GPU subtask can be put to sleep by putting the GPU task submission thread to sleep, thereby controlling the execution of the GPU task. On this basis, the method can return to the step of acquiring the actual GPU utilization of the GPU task, so that computing power quotas continue to be issued to the GPU task until the remaining quota is no smaller than the standby quota corresponding to the next GPU subtask, at which point the next GPU subtask is woken up. Alternatively, the method can wait for other GPU subtasks occupying computing power to release it, and wake up the next GPU subtask once the remaining quota of the current virtual GPU is sufficient to execute it. In this way, continuous execution of the GPU task is ensured while the accuracy of computing power control is improved. The remaining computing power quota can be calculated as the sum of the remaining quota of the GPU task's last allocation period and the target computing power quota, where the remaining quota of the last allocation period is the difference between the quota allocated to the GPU task in that period and the quota it actually consumed.
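The bookkeeping just described can be sketched as follows; treating the leftover of the last period as allocated minus consumed is an assumed reading of the text, and the function names are illustrative.

```python
def remaining_quota(allocated_last, consumed_last, target_quota):
    """Remaining quota = leftover from the last allocation period + newly issued target quota."""
    leftover = allocated_last - consumed_last
    return leftover + target_quota


def should_sleep(remaining, standby_quota):
    """Sleep the next GPU subtask when the remaining quota cannot cover its standby quota."""
    return remaining < standby_quota
```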
It should be noted that before each GPU subtask is executed on the GPU, it must first be submitted to the GPU, and once submitted, the CPU can hardly control its execution. Therefore, whether to put the next GPU subtask to sleep is determined, according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask, before the subtask is submitted to the GPU. In this way, the sleep of the GPU subtask is controlled by putting the GPU task submission thread to sleep, achieving control over the execution of the GPU task.
Illustratively, in response to the next GPU subtask submitted by the GPU task submission thread, a kernel may be submitted to the GPU by invoking the related interface, so that the GPU subtask executes on the GPU. In this process, if the remaining computing power quota of the current virtual GPU is insufficient to execute the GPU subtask, the submission thread is put to sleep before the subtask is submitted to the GPU; once the remaining quota is sufficient, the submission thread is woken up and the subsequent submission of the GPU subtask to the GPU continues.
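A toy simulation of the interplay between the quota-issuing monitor and the submission gate can make this concrete. All numbers are illustrative, and a real implementation would park the submission thread on a condition variable rather than record string labels.

```python
def simulate(periods, quota_per_period, subtask_costs):
    """Per period the monitor grants quota; the submit thread launches subtasks
    while the remaining quota covers the next subtask's cost, otherwise it sleeps."""
    remaining = 0.0
    events = []
    pending = list(subtask_costs)
    for _ in range(periods):
        remaining += quota_per_period            # monitoring thread issues quota
        while pending and remaining >= pending[0]:
            remaining -= pending.pop(0)          # submit the subtask to the GPU
            events.append("submit")
        if pending:
            events.append("sleep")               # quota too low: park the submit thread
    return events, remaining
```

With a per-period grant of 10 and subtask costs [15, 12], the first subtask sleeps one period before enough quota accumulates, and the second likewise.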
The technical scheme provided by the embodiments of the disclosure acquires the actual GPU utilization of a GPU task executed on a virtual GPU with a set GPU utilization, so that the deviation between the actual and set GPU utilization is determined. On that basis, a target computing power quota for adjusting this deviation is determined according to a PID algorithm and issued to continue executing the GPU task. In this process, the running condition of the GPU is reflected by the actual GPU utilization, and the target computing power quota issued to the GPU task is flexibly adjusted by the PID algorithm, so that computing power is controlled flexibly without hardware modification and the accuracy of computing power control is improved.
Referring to FIG. 2, FIG. 2 is a flow chart of a computing power control method according to another exemplary embodiment. As shown in FIG. 2, the computing power control method may include steps S2011 to S2015 executed by a monitoring thread and steps S2021 to S2023 executed by a GPU task submission thread; both threads may run on the CPU.
The steps executed by the monitoring thread may include:
in step S2011, in response to the GPU task executed on the virtual GPU having the set GPU utilization, the actual GPU utilization of the GPU task is obtained.
In step S2012, a scaling factor is determined according to the actual GPU utilization and the set GPU utilization, and whether to adjust the parameters of the PID algorithm is determined according to the scaling factor.
Wherein, in the case that the scaling factor determined according to the actual GPU utilization and the set GPU utilization is within the preset factor range, the parameter of the adjustment PID algorithm is determined, so as to execute step S2013. In the case where the scaling factor determined according to the actual GPU utilization and the set GPU utilization is not within the preset factor range, it is determined that the parameters of the PID algorithm are not adjusted, thereby executing step S2014.
In step S2013, parameters of the PID algorithm are adjusted according to the scaling factor.
In step S2014, a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate is determined according to the PID algorithm.
The parameters of the PID algorithm may be the original parameters or the parameters adjusted according to the scaling factor.
In step S2015, the remaining computing power quota of the virtual GPU is updated according to the target computing power quota.
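The monitor-thread loop (S2011 to S2015) can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation: the names `PidController` and `monitor_step` are hypothetical, and scaling only the proportional gain `kp` is one assumed way of "adjusting the parameters of the PID algorithm according to the scaling factor", since the text does not specify which parameters are scaled.

```python
class PidController:
    """Minimal discrete PID controller acting on the utilization deviation."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        # Standard discrete PID terms: proportional, integral, derivative.
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def monitor_step(pid, set_util, actual_util, base_quota, scale,
                 scale_range=(0.5, 2.0)):
    """One allocation period of the monitor thread (hypothetical sketch).

    Steps S2012/S2013: adjust the PID parameters only when the scaling
    factor falls within the preset factor range (assumed here to mean
    multiplying the proportional gain). Step S2014: compute the PID
    output from the deviation. Step S2015: derive the new target quota.
    """
    lo, hi = scale_range
    if lo <= scale <= hi:
        pid.kp *= scale           # assumed form of parameter adjustment
    error = set_util - actual_util  # deviation to be driven to zero
    adjustment = pid.update(error)
    return max(0.0, base_quota + adjustment)
```

With `kp = 1.0` and a deviation of 0.1, a base quota of 10.0 would be raised to 10.1; the exact gains and quota units are deployment-specific.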
The steps executed by the GPU task submission thread may include:
in step S2021, in response to the next GPU subtask, it is determined whether the difference between the current remaining computing power quota of the virtual GPU and the standby computing power quota corresponding to the next GPU subtask is less than 0.
The remaining computing power quota may be calculated from the target computing power quota and the computing power quota left over by the GPU task in the last allocation period. If the difference between the current remaining computing power quota of the virtual GPU and the standby computing power quota corresponding to the next GPU subtask is less than 0, step S2022 is executed; otherwise, step S2023 is executed.
In step S2022, the next GPU subtask is put to sleep.
In step S2023, the remaining computing power quota of the virtual GPU is updated according to the standby computing power quota.
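The submission-thread decision (S2021 to S2023) reduces to a sign check on the quota difference. The sketch below is illustrative only; `submit_or_sleep` is a hypothetical helper, and returning `None` stands in for putting the GPU task submission thread to sleep:

```python
def submit_or_sleep(remaining_quota, standby_quota):
    """One submission-thread check (hypothetical sketch).

    Step S2021: compare the virtual GPU's current remaining quota with
    the standby quota of the next GPU subtask. Step S2022: if the
    difference is less than 0, signal sleep (returned as None). Step
    S2023: otherwise deduct the standby quota and return the updated
    remaining quota.
    """
    difference = remaining_quota - standby_quota
    if difference < 0:
        return None        # caller puts the submission thread to sleep
    return difference      # updated remaining computing power quota
```

For instance, a remaining quota of 5 cannot cover a subtask needing 10, so the thread sleeps; a remaining quota of 10 covering a subtask needing 4 leaves 6.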
Furthermore, it should be noted that the execution frequencies of the monitor thread and the GPU task submission thread may be the same or different. That is, the process of issuing the target computing power quota does not block the GPU task submission thread from submitting GPU subtasks, and vice versa.
According to the technical scheme provided by the embodiments of the present disclosure, the operating condition of the GPU is reflected by the actual GPU utilization rate, and the target computing power quota issued to the GPU task is flexibly adjusted by combining the PID algorithm with the actual GPU utilization rate, so that computing power control is performed flexibly without requiring hardware modifications, and the accuracy of computing power control is improved. In addition, a scaling factor that reflects the computing power fluctuation of the virtual GPU is determined from the actual GPU utilization rate and the set GPU utilization rate, and the PID algorithm is adaptively adjusted according to this scaling factor when the fluctuation is large, so that the PID algorithm converges quickly. On top of the flexible computing power control achieved by the PID algorithm, this further improves execution efficiency and controls the computing power of the virtual GPU more rapidly and accurately.
Based on the same conception, the present disclosure also provides a computing power control device 300. Referring to fig. 3, fig. 3 is a block diagram of a computing power control device 300 according to an exemplary embodiment. As shown in fig. 3, the computing power control device 300 includes:
an obtaining module 301, configured to, in response to a GPU task executed on a virtual GPU with a set GPU utilization rate, obtain an actual GPU utilization rate of the GPU task, where the set GPU utilization rate characterizes the computing power the virtual GPU occupies on the whole GPU;
a determining module 302, configured to determine, according to a PID algorithm, a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate;
and an issuing module 303, configured to issue the target computing power quota so that the GPU task continues to execute.
In the technical scheme provided by the embodiments of the present disclosure, the actual GPU utilization rate of a GPU task executed on a virtual GPU with a set GPU utilization rate is obtained, so that the deviation between the actual GPU utilization rate and the set GPU utilization rate can be determined. On this basis, a target computing power quota for correcting this deviation is determined according to a PID algorithm, and the target computing power quota is issued so that the GPU task continues to execute. In this process, the operating condition of the GPU is reflected by the actual GPU utilization rate, and the target computing power quota issued to the GPU task is flexibly adjusted by combining the PID algorithm with the actual GPU utilization rate, so that computing power control is performed flexibly without requiring hardware modifications, and the accuracy of computing power control is improved.
Optionally, the determining module 302 is configured to:
determining a scaling factor according to the actual GPU utilization rate and the set GPU utilization rate;
adjusting the parameters of the PID algorithm according to the scaling factor when the scaling factor is within the preset factor range;
and determining, according to the PID algorithm with the adjusted parameters, a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate.
Optionally, the scaling factor is determined according to the following formula:
where K_cons is the computing power quota actually consumed by the GPU task in the last allocation period, K_allo is the computing power quota allocated to the GPU task in the last allocation period, U_set is the set GPU utilization rate, and U_actu is the actual GPU utilization rate.
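The formula itself does not survive in this text. Purely for illustration, one plausible form consistent with the variable definitions above would compare the consumed-to-allocated quota ratio against the actual-to-set utilization ratio; the function below is an assumption, not the patent's formula:

```python
def scaling_factor(k_cons, k_allo, u_set, u_actu):
    """Hypothetical scaling factor (assumed form, formula absent from text).

    Relates how much of the allocated quota was actually consumed
    (K_cons / K_allo) to how far the achieved utilization sits from the
    target (U_set / U_actu). A value near 1 would indicate the quota and
    utilization moved in proportion; larger deviations would indicate
    computing power fluctuation.
    """
    return (k_cons / k_allo) * (u_set / u_actu)
```

For example, consuming 8 of 10 quota units while reaching 0.4 utilization against a target of 0.5 yields a factor of 1.0 under this assumed form.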
Optionally, the GPU task includes a plurality of GPU subtasks, and the computing power control device 300 further includes a response module configured to:
in response to the next GPU subtask submitted by the GPU task submission thread, determine whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
Optionally, the response module is configured to:
determining the remaining computing power quota of the virtual GPU according to the target computing power quota;
and putting the next GPU subtask to sleep when the remaining computing power quota is smaller than the standby computing power quota corresponding to the next GPU subtask.
Optionally, the response module is configured to:
before the next GPU subtask is submitted to the GPU, determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
Optionally, the obtaining module 301 is configured to:
and acquiring the actual GPU utilization rate of the GPU task according to a preset allocation period.
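As a sketch of this periodic acquisition, the loop below samples the actual GPU utilization once per preset allocation period. Both `collect_utilization` and `sample_util` are hypothetical names; `sample_util` stands in for the real driver-level utilization query, which the patent text does not specify:

```python
import time


def collect_utilization(sample_util, period_s, periods):
    """Sample the actual GPU utilization once per allocation period.

    sample_util: callable returning the current actual GPU utilization
                 (a stand-in for a driver query; assumption).
    period_s:    length of the preset allocation period, in seconds.
    periods:     number of allocation periods to observe.
    """
    readings = []
    for _ in range(periods):
        readings.append(sample_util())  # one reading per period
        time.sleep(period_s)            # wait for the next period
    return readings
```

In practice each reading would feed one iteration of the monitor thread's PID adjustment.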
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be repeated here.
Fig. 4 is a block diagram of an electronic device 400, shown in accordance with an exemplary embodiment. As shown in fig. 4, the electronic device 400 may include: a processor 401, a memory 402. The electronic device 400 may also include one or more of a multimedia component 403, an input/output (I/O) interface 404, and a communication component 405.
The processor 401 is configured to control the overall operation of the electronic device 400 to perform all or part of the steps in the computing power control method described above. The memory 402 is used to store various types of data to support the operation of the electronic device 400; such data may include, for example, instructions for any application or method operating on the electronic device 400, as well as application-related data such as contact data, messages sent and received, pictures, audio, video, and the like. The memory 402 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The multimedia component 403 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may be further stored in the memory 402 or transmitted through the communication component 405. The audio component further comprises at least one speaker for outputting audio signals. The I/O interface 404 provides an interface between the processor 401 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 405 is used for wired or wireless communication between the electronic device 400 and other devices.
Wireless communication may be, for example, one or a combination of Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, 4G, 5G, NB-IoT, eMTC, etc., which is not limited herein. The corresponding communication component 405 may thus include a Wi-Fi module, a Bluetooth module, an NFC module, and so on.
In an exemplary embodiment, the electronic device 400 may be implemented by one or more application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), digital signal processors (Digital Signal Processor, abbreviated as DSP), digital signal processing devices (Digital Signal Processing Device, abbreviated as DSPD), programmable logic devices (Programmable Logic Device, abbreviated as PLD), field programmable gate arrays (Field Programmable Gate Array, abbreviated as FPGA), controllers, microcontrollers, microprocessors, or other electronic components for performing the above-described method of computing power control.
In another exemplary embodiment, a computer readable storage medium is also provided, comprising program instructions which, when executed by a processor, implement the steps of the computing power control method described above. For example, the computer readable storage medium may be the memory 402 including the program instructions described above, which are executable by the processor 401 of the electronic device 400 to perform the computing power control method described above.
In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-mentioned computing power control method when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the present disclosure within the scope of the technical concept of the present disclosure, and all the simple modifications belong to the protection scope of the present disclosure.
In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present disclosure does not further describe various possible combinations.
Moreover, any combination between the various embodiments of the present disclosure is possible as long as it does not depart from the spirit of the present disclosure, which should also be construed as the disclosure of the present disclosure.

Claims (10)

1. A computing power control method, the method comprising:
in response to a GPU task executed on a virtual GPU with a set GPU utilization rate, obtaining an actual GPU utilization rate of the GPU task, wherein the set GPU utilization rate characterizes the computing power occupied by the virtual GPU on the whole GPU;
determining a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate according to a PID algorithm;
and issuing the target computing power quota to continue to execute the GPU task.
2. The method of claim 1, wherein the determining a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate according to a PID algorithm comprises:
determining a scaling factor according to the actual GPU utilization rate and the set GPU utilization rate;
adjusting the parameters of the PID algorithm according to the scaling factor when the scaling factor is within a preset factor range;
and determining, according to the PID algorithm with the adjusted parameters, a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate.
3. The method of claim 2, wherein the scaling factor is determined according to the formula:
wherein K_cons is the computing power quota actually consumed by the GPU task in the last allocation period, K_allo is the computing power quota allocated to the GPU task in the last allocation period, U_set is the set GPU utilization rate, and U_actu is the actual GPU utilization rate.
4. The method of claim 1, wherein the GPU task comprises a plurality of GPU subtasks, the method further comprising:
in response to the next GPU subtask submitted by the GPU task submission thread, determining whether to put the next GPU subtask to sleep according to the target computing power quota and a standby computing power quota corresponding to the next GPU subtask.
5. The method of claim 4, wherein the determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask comprises:
determining a remaining computing power quota of the virtual GPU according to the target computing power quota;
and putting the next GPU subtask to sleep when the remaining computing power quota is smaller than the standby computing power quota corresponding to the next GPU subtask.
6. The method according to claim 4 or 5, wherein the determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask comprises:
before the next GPU subtask is submitted to the GPU, determining whether to put the next GPU subtask to sleep according to the target computing power quota and the standby computing power quota corresponding to the next GPU subtask.
7. The method according to any one of claims 1-5, wherein the obtaining an actual GPU utilization of the GPU task comprises:
and acquiring the actual GPU utilization rate of the GPU task according to a preset allocation period.
8. A computing power control device, the device comprising:
an obtaining module, configured to, in response to a GPU task executed on a virtual GPU with a set GPU utilization rate, obtain an actual GPU utilization rate of the GPU task, wherein the set GPU utilization rate characterizes the computing power occupied by the virtual GPU on the whole GPU;
the determining module is used for determining a target computing power quota for adjusting the deviation between the actual GPU utilization rate and the set GPU utilization rate according to a PID algorithm;
and the issuing module is used for issuing the target computing power quota to continue executing the GPU task.
9. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of the method according to any of claims 1-7.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of any one of claims 1-7.
CN202310269889.2A 2023-03-15 2023-03-15 Calculation force control method and device, storage medium and electronic equipment Pending CN116541162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310269889.2A CN116541162A (en) 2023-03-15 2023-03-15 Calculation force control method and device, storage medium and electronic equipment


Publications (1)

Publication Number: CN116541162A; Publication Date: 2023-08-04

Family ID: 87449474

Family Applications (1): CN202310269889.2A (pending)

Country: CN

Cited By (2)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN117149440A * | 2023-10-26 | 2023-12-01 | 北京趋动智能科技有限公司 | Task scheduling method and device, electronic equipment and storage medium
CN117149440B * | 2023-10-26 | 2024-03-01 | 北京趋动智能科技有限公司 | Task scheduling method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination