CN117130769A - Frequency modulation method, training method of frequency adjustment neural network and electronic equipment - Google Patents

Frequency modulation method, training method of frequency adjustment neural network and electronic equipment

Info

Publication number
CN117130769A
Authority
CN
China
Prior art keywords
thread
cpu
load
target
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310210010.7A
Other languages
Chinese (zh)
Inventor
肖俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202310210010.7A priority Critical patent/CN117130769A/en
Publication of CN117130769A publication Critical patent/CN117130769A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206Monitoring of events, devices or parameters that trigger a change in power modality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • G06F1/325Power saving in peripheral device
    • G06F1/3265Power saving in display device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a frequency modulation method, a training method for a frequency adjustment neural network, and electronic equipment, and relates to the field of artificial intelligence. The method is used to balance CPU frequency and energy consumption while guaranteeing the performance of a preset application. The method comprises the following steps: for any target CPU cluster, acquiring system state information, where the system state information comprises a first load, a second load, a third load and the frame rate of the preset application, the first load being the sum of the loads of all threads, the second load being the sum of the loads of the threads having a binding relationship with the target CPU cluster, and the third load being the load of the target CPU cluster; inputting the system state information into the frequency adjustment neural network corresponding to the target CPU cluster, and obtaining the target operating frequency output by that network; and setting the operating frequency of the target CPU cluster to the target operating frequency.

Description

Frequency modulation method, training method of frequency adjustment neural network and electronic equipment
Technical Field
The application relates to the field of artificial intelligence, in particular to a frequency modulation method, a training method of a frequency regulation neural network and electronic equipment.
Background
As electronic devices become more powerful, the number of applications installed on them keeps increasing. Taking game applications as an example, improvements in image quality and richer gameplay and game scenes place ever higher requirements on the performance of electronic devices.
At present, when an electronic device runs a game application, problems such as high power consumption, severe heating and stuttering often occur. These problems stem from limitations of the system resources (e.g., hardware resources) of the electronic device. For example, to keep a game running smoothly, the electronic device may run the central processing unit (CPU) at its highest frequency; but if the CPU runs at the highest frequency for a long time, the temperature of the device becomes too high, so the device must lower the CPU frequency to cool down, and the lowered CPU frequency in turn reduces game smoothness, which affects the user's gaming experience.
Therefore, how to reasonably adjust the CPU frequency and improve game performance has become a problem that needs to be considered.
Disclosure of Invention
The embodiments of the application provide a frequency modulation method, a training method for a frequency adjustment neural network and electronic equipment, which are used to balance the frequency and energy consumption of a CPU (central processing unit) on the premise of ensuring the performance of a preset application.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical scheme:
in a first aspect, the present application provides a frequency modulation method, where an electronic device runs a preset application, where the preset application includes a plurality of threads, where the plurality of threads include threads having a binding relationship with a CPU cluster of the electronic device, the CPU cluster of the electronic device includes a target CPU cluster corresponding to a frequency adjustment neural network, and the frequency adjustment neural network is configured to adjust an operating frequency of the target CPU cluster corresponding to the frequency adjustment neural network, where the method includes: for any target CPU cluster, acquiring system state information, wherein the system state information comprises a first load, a second load, a third load and a frame rate of a preset application, the first load is the load sum of all threads, the second load is the load sum of threads with binding relation with the target CPU cluster, and the third load is the load of the target CPU cluster; inputting the system state information into a frequency adjustment neural network corresponding to a target CPU cluster, and obtaining a target operating frequency output by the frequency adjustment neural network corresponding to the target CPU cluster; the operating frequency of the target CPU cluster is set as the target operating frequency.
It can be understood that different CPU clusters have different frequency adjustment neural networks, and the operating frequency of each CPU cluster is adjusted by combining the load of that cluster with the load of the threads bound to it, so CPU power consumption can be balanced while the performance of the CPU cluster is ensured.
In one possibility provided in the first aspect, acquiring system state information includes: at a first moment, acquiring system state information of a previous n frames at the first moment, wherein the system state information of the previous n frames at the first moment comprises a first load corresponding to each frame in the previous n frames, a second load corresponding to each frame in the previous n frames, a third load corresponding to each frame in the previous n frames and a frame rate of each frame in the previous n frames, and n is more than or equal to 1; setting the operation frequency of the target CPU cluster as the target operation frequency comprises the following steps: the operating frequency of the target CPU cluster during a frame subsequent to the first time is set as the target operating frequency.
That is, the application can adjust the operating frequency of the target CPU cluster in the current frame according to the operating conditions (including load and FPS) of the target CPU cluster in the previous frame or frames, thereby realizing frequency adjustment at frame granularity or even finer.
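As a purely illustrative sketch of this per-frame adjustment (the frame-record structure and the policy/apply_freq callables below are hypothetical placeholders, not interfaces defined in the application), the adjustment step for one target CPU cluster could look like:

```python
from typing import Callable, Dict, List

# A "frame record" here is assumed to be a dict like:
# {"total_load": ..., "bound_load": {cluster: ...}, "cluster_load": {cluster: ...}, "fps": ...}

def build_state(frames: List[Dict], cluster: str) -> List[float]:
    """Flatten the previous n frames into one state vector:
    per frame -> [first load, second load (threads bound to the cluster),
                  third load (the cluster itself), frame rate]."""
    state: List[float] = []
    for f in frames:
        state += [f["total_load"], f["bound_load"][cluster],
                  f["cluster_load"][cluster], f["fps"]]
    return state

def adjust_cluster(frames: List[Dict], cluster: str,
                   policy: Callable[[List[float]], float],
                   apply_freq: Callable[[str, float], None]) -> float:
    """One adjustment at a frame boundary: query the cluster's frequency-adjustment
    network and apply its output, which takes effect during the following frame."""
    target_freq = policy(build_state(frames, cluster))
    apply_freq(cluster, target_freq)
    return target_freq
```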
In one possibility provided in the first aspect, the first moment is a moment when any one frame ends.
In one possibility provided by the first aspect, the preset application is a gaming application.
In one possibility provided by the first aspect, the load is CPU utilization.
In a second aspect, the present application provides a training method for a frequency adjustment neural network, configured to obtain, through training, a frequency adjustment neural network corresponding to a target CPU cluster of an electronic device. The electronic device runs a preset application, the preset application includes a plurality of threads, the plurality of threads include threads having a binding relationship with a CPU cluster of the electronic device, the CPU clusters of the electronic device include the target CPU cluster corresponding to the frequency adjustment neural network, and the frequency adjustment neural network is configured to adjust the operating frequency of the target CPU cluster corresponding to it. The method includes: for any target CPU cluster, obtaining training data of the reinforcement learning network model corresponding to the target CPU cluster according to the data obtained by the interaction between that reinforcement learning network model and the target CPU cluster; and performing reinforcement learning training on the reinforcement learning network model corresponding to the target CPU cluster by using the training data, to obtain the frequency adjustment neural network corresponding to the target CPU cluster. The system state information includes a first load, a second load, a third load and the frame rate of the preset application, where the first load is the sum of the loads of all threads, the second load is the sum of the loads of the threads having a binding relationship with the target CPU cluster, and the third load is the load of the target CPU cluster.
It can be understood that the frequency adjustment neural network corresponding to each CPU cluster is trained by combining the load of that CPU cluster with the load of the threads bound to it, so a frequency adjustment neural network matching the actual situation of each CPU cluster can be obtained. Moreover, the frequency adjustment neural network is trained for the application, so the CPU frequency obtained by adjustment better matches the requirements of the application.
In one possibility provided in the second aspect, the training data includes a plurality of samples, each sample includes a state, an action and a reward, and obtaining the training data of the reinforcement learning network model corresponding to the target CPU cluster according to the data obtained by the interaction between the reinforcement learning network model and the target CPU cluster includes: acquiring system state information at a first moment as the state, where the system state information includes a first load corresponding to each frame in the previous n frames at the first moment, a second load corresponding to each frame in the previous n frames, a third load corresponding to each frame in the previous n frames, and a frame rate corresponding to each frame in the previous n frames, where n is greater than or equal to 1; inputting the state into the reinforcement learning network model corresponding to the target CPU cluster, obtaining a target operating frequency, and taking the target operating frequency as the action; setting the operating frequency of the target CPU cluster during the frame following the first moment to the target operating frequency; and acquiring the frame rate of the preset application during the frame following the first moment and the power consumption of the target CPU cluster, and calculating the reward from them.
In one possibility provided by the second aspect, the frame rate, the power consumption and the reward satisfy a piecewise relationship of the following form:

$$ r_t = \begin{cases} fps_t - FPS, & fps_t < FPS - \Delta \\ -P(f_t), & \text{otherwise} \end{cases} $$

where r_t is the reward, fps_t is the frame rate, FPS is a preset target frame rate, Δ is a preset difference, and P(f_t) is the power consumption, a parameter related to the frequency f_t of the target CPU cluster.
Thus, when the actually detected frame rate fps_t is lower than the target frame rate FPS, the frame-rate difference (i.e., the difference between fps_t and the target frame rate FPS) dominates the reward, the aim being to raise the CPU frequency so as to increase the frame rate. Otherwise, the reward is the overhead of power consumption, the aim being to lower the CPU frequency so as to reduce CPU power consumption, thereby balancing application performance and CPU power consumption.
In one possibility provided in the second aspect, the method further comprises: acquiring a tracking file, wherein the tracking file comprises a frame length of a plurality of frames, running time of a plurality of threads in the plurality of frames, a thread awakened by each thread and a thread waiting by each thread; determining the dependency relationship among threads according to the thread awakened by each thread in the trace file and the thread waited by each thread, wherein the dependency relationship is used for indicating that the threads are executed in series or in parallel; determining the type of the thread according to the CPU utilization rate of the thread, wherein the type comprises a first type and a second type, and the CPU utilization rate of the thread of the first type is higher than that of the thread of the second type; and respectively establishing a binding relation between each thread and one CPU cluster according to the types and the dependency relations of the threads, wherein the CPU clusters bound by the threads are the same or different.
In one possibility provided in the second aspect, according to the type and the dependency relationship of the threads, a binding relationship between each thread and one CPU cluster is respectively established, the CPU clusters bound by the plurality of threads are the same or different, and the one or more CPU clusters include a first CPU cluster and a second CPU cluster, including: dividing two threads which are executed in parallel in a first type of thread into two different thread groups, and dividing two threads which are executed in series in the first type of thread into the same thread group to obtain two thread groups; establishing a binding relationship between one of the two thread groups and the first CPU cluster, establishing a binding relationship between the other of the two thread groups and the second CPU cluster, and establishing a binding relationship between the second type of thread and the second CPU cluster.
Therefore, the heavy-load threads can be executed in parallel, and the task processing efficiency is improved; the second CPU cluster can also share the tasks of the first CPU cluster, so that load balancing can be realized.
In one possibility provided by the second aspect, the operating frequency supported by the first CPU cluster is higher than the operating frequency supported by the second CPU cluster, so as to meet the requirement of the heavy load thread on the CPU frequency.
In one possibility provided in the second aspect, if the CPU utilization of a thread is greater than or equal to the first threshold in at least k of the multiple frames, the thread is a thread of the first type, where k is greater than or equal to 1; and if the CPU utilization of a thread is greater than or equal to a second threshold in fewer than k of the multiple frames, the thread is a thread of the second type.
In one possibility provided by the second aspect, the preset application is a gaming application.
In a third aspect, the present application provides an electronic device, including: a memory and a processor; the processor is coupled with the memory; wherein the memory is for storing computer program code, the computer program code comprising computer instructions; the computer instructions, when executed by a processor, cause an electronic device to perform any of the possible methods of the first and second aspects.
In a fourth aspect, the present application provides a computer-readable storage medium comprising computer instructions; when executed on an electronic device, the computer instructions cause the electronic device to perform any of the possible methods as in the first and second aspects.
In a fifth aspect, the present application provides a computer program product for causing a terminal device to carry out the method as in the first aspect, the second aspect and any one of its possible designs when the computer program product is run on the terminal device.
In a sixth aspect, the present application provides a chip system comprising one or more interface circuits and one or more processors. The interface circuit and the processor are interconnected by a wire. The chip system described above may be applied to a terminal device comprising a communication module and a memory. The interface circuit is for receiving signals from the memory of the terminal device and transmitting the received signals to the processor, the signals including computer instructions stored in the memory. When the processor executes the computer instructions, the terminal device may perform the method as in the first aspect, the second aspect and any of its possible designs.
The technical effects of any one of the design manners of the third aspect to the sixth aspect may refer to the technical effects of the different design manners of the first aspect, and are not repeated here.
Drawings
FIG. 1 is a schematic diagram of a game scenario provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of interaction between an agent and an environment according to the related art;
FIG. 3 is a schematic diagram illustrating interaction between an agent and an environment according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a system architecture 100 according to an embodiment of the present application;
fig. 5 is a schematic flow chart of a training method of a frequency-adjusting neural network according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a thread grouping process according to an embodiment of the present application;
fig. 7 is a schematic diagram of a training method of a frequency-adjusting neural network according to an embodiment of the present application;
fig. 8 is a second flow chart of a training method of a frequency-adjusting neural network according to an embodiment of the present application;
fig. 9 is a schematic flow chart of a frequency modulation method according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an execution device 1000 according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a training device 1100 according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. In the description of the embodiments of the application, the terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of the present application and the appended claims, the singular forms "a", "an", "the" and "said" are intended to include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of the present application, "at least one" and "one or more" mean one, two or more than two (including two). The term "and/or" is used to describe an association relationship of associated objects and means that three relationships may exist; for example, A and/or B may represent: A alone, both A and B, and B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise. The term "coupled" includes both direct and indirect connections, unless stated otherwise. The terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The frequency modulation method provided by the embodiments of the present application can be applied in scenarios where the CPU frequency needs to be adjusted, such as gaming scenarios, video call scenarios, video watching scenarios, and the like. It can be understood that the higher the CPU frequency of an electronic device, the stronger the computing capability the CPU can provide, which improves the performance of the device; the lower the CPU frequency, the more limited the computing capability the CPU can provide, which reduces the power consumption of the device and avoids excessive device temperature.
Specifically, as shown in fig. 1, the frequency modulation method of the embodiments of the present application can be applied in a gaming scenario: the electronic device can meet the requirements of smooth game running and a high frame rate (frames per second, FPS) by increasing the CPU frequency, and can reduce device power consumption and alleviate heating by decreasing the CPU frequency. That is, in a gaming scenario, the electronic device can balance game performance and power consumption by adjusting the CPU frequency.
In some related art, in scenarios that require CPU frequency adjustment, the scheduler of the electronic device, such as an energy-aware scheduler (energy aware scheduling, EAS), a completely fair scheduler (CFS), a power scheduler (power gate), and the like, selects a corresponding CPU frequency based on the CPU load measured in a history window: the higher the CPU load, the higher the selected CPU frequency, thereby meeting the demand for computing power.
However, the frequency modulation scheme provided by the related art adjusts the CPU frequency only according to the CPU load, and the CPU load reflects the resource utilization of the whole device. Such a scheme can meet the computing-power demand, complete tasks quickly and improve system throughput, but it can guarantee neither the performance of a specific application nor an optimal energy-efficiency ratio.
To address the above problems, the present application provides a training method for a frequency adjustment neural network and a frequency modulation method, which can minimize CPU power consumption and improve the user experience of a specific application while ensuring the running performance of a preset application (such as a game application or a video application).
The training method and the frequency modulation method for the frequency adjustment neural network provided by the present application relate to the field of artificial intelligence, and can be applied in particular to optimizing the CPU scheduling and frequency adjustment mechanism. Training data (such as the CPU utilization, FPS, operating frequency and power consumption in the present application) are subjected to symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like, finally yielding a trained frequency adjustment neural network (i.e., a neural network for adjusting the frequency). In addition, the frequency modulation method provided by the embodiments of the present application can use the trained frequency adjustment neural network: input data (such as the CPU utilization and the FPS of the game application in the present application) are fed into the trained frequency adjustment neural network to obtain output data (such as the operating frequency in the present application). It should be noted that the training method and the frequency modulation method of the frequency adjustment neural network provided by the embodiments of the present application are based on the same concept, and may be understood as two parts of one system or two stages of an overall process, such as a network training stage and a network application stage.
For a better understanding of the embodiments of the present application, reinforcement learning related to the embodiments of the present application is briefly described below:
Reinforcement learning (RL) is used to describe and solve the problem of an agent learning a strategy, while interacting with an environment, so as to maximize return or achieve a particular goal. A common model for reinforcement learning is the Markov decision process (MDP), a mathematical model for analyzing decision-making problems. In reinforcement learning, the agent learns in a "trial and error" manner, and the reward obtained by interacting its actions with the environment guides its behavior, the goal being to maximize the agent's reward. Instead of telling the reinforcement learning system how to produce the correct action, the reinforcement signal (i.e., the reward) provided by the environment evaluates how good the produced action is. Since the external environment provides little information, the agent must learn from its own experience. In this way, the agent gains knowledge in the action-evaluation (i.e., reward) environment and improves its action scheme to suit the environment. Common reinforcement learning algorithms include DQN (Deep Q Network), Q-learning, policy gradient, actor-critic, and the like.
As shown in fig. 2, reinforcement learning mainly involves five elements: agent, environment, state, action and reward. The state is a description of the environment and changes after the agent acts. The action is a description of the agent's behavior and is the result of the agent's decision. That is, the input of the agent is a state and its output is an action. The reward (positive or negative) is calculated from the environment's feedback after the action is applied. The training process of reinforcement learning is as follows: the agent interacts with the environment multiple times to obtain the action, state and reward of each interaction; the resulting groups of (action, state, reward) are used as training data to train the agent once. This process is repeated for the next round of training until the convergence condition is met.
As an example, the process of obtaining the action, state and reward of one interaction is shown in fig. 2. The current state s_t of the environment is input to the agent, and the agent outputs an action a_t. The environment calculates the reward r_t of this interaction according to the relevant performance indicators under the effect of action a_t. At this point, the action a_t, state s_t and reward r_t of the current interaction are obtained and recorded for later use in training the agent. At the same time, the next state s_{t+1} of the environment under action a_t is recorded, so that the agent can perform the next interaction with the environment.
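A minimal, framework-agnostic sketch of this interaction loop (env_reset, env_step and agent_act are placeholder callables, not components described in the application) might be:

```python
def rollout(env_reset, env_step, agent_act, steps):
    """Collect (s_t, a_t, r_t, s_{t+1}) tuples from repeated agent-environment interactions."""
    samples = []
    s_t = env_reset()
    for _ in range(steps):
        a_t = agent_act(s_t)             # the agent maps the current state to an action
        s_next, r_t = env_step(a_t)      # the environment applies the action and returns a reward
        samples.append((s_t, a_t, r_t, s_next))
        s_t = s_next                     # the next interaction starts from the new state
    return samples
```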
Corresponding to this embodiment, as shown in fig. 3, the agent may be a frequency regulator, and the environment may be a CPU running a preset application. Wherein the CPU includes a plurality of CPU clusters (CPU clusters). The frequency adjustor can comprise a plurality of frequency-adjusting neural networks. Each frequency adjustment neural network may correspond to a CPU cluster for adjusting the operating frequency of the corresponding CPU cluster. The number of the CPU clusters is equal to or greater than the number of the frequency-adjusting neural networks.
The processing power of the plurality of CPU clusters may be the same or different. Processors in which there are multiple clusters of CPUs differing in processing power are referred to as heterogeneous multi-core processors. The frequencies supported by each CPU cluster in the heterogeneous multi-core processor may be different.
The preset application may be a game application, a video application, an office application, or the like, and is not particularly limited herein. The embodiment mainly uses a preset application as an example of a game application for explanation.
The state includes at least the following three items: the load of each CPU cluster, the load of each thread in the game application, and the frame rate of the game application. The load of a CPU cluster includes the CPU utilization of the CPU cluster, and the load of a thread includes the CPU utilization of the thread.
In this embodiment, the action includes the operating frequency of a CPU cluster. It should be appreciated that a CPU cluster may support a variety of frequencies, such as 800 MHz, 1 GHz, 1.2 GHz, and so on. Correspondingly, the action set also includes a plurality of corresponding actions. After receiving the current state s_t, the agent can select, from the frequencies supported by the CPU clusters (i.e., the action set), the frequency (i.e., the action) of the corresponding CPU cluster that matches the current state s_t and increases the probability of the agent obtaining a positive reward.
Further, the reward is a function of an application performance parameter and a power consumption parameter of the CPU cluster. That is, the reward parameters used to calculate the reward include an application performance parameter and a power consumption parameter of the CPU cluster. In this way, application performance (e.g., game frame rate, picture definition, etc.) and the power consumption of the CPU cluster can be balanced in the reward. It should be appreciated that, in essence, what the environment feeds back to the agent is the reward parameters; the agent then calculates the reward based on these parameters and updates its decision strategy accordingly.
It should be noted that the frequency adjustor and the CPU may be integrated in one electronic device, or may be separately provided in different electronic devices, and data (for example, the load of the CPU, the load of each thread in the game application, the frame rate of the game application, the CPU frequency, etc.) may be transmitted through an input/output (I/O) interface between the two electronic devices.
In the following, the case where the frequency adjustor and the CPU are integrated in the same electronic device, with a game application installed on that electronic device, is taken as an example.
The method provided by the present embodiment may include two phases, a model training phase and a model application phase, respectively.
In a model training stage, the embodiment of the application provides a training method of a frequency adjustment neural network, which can divide threads of a game application into a plurality of thread groups according to thread loads, and bind the thread groups to different CPU clusters, so that the different thread groups operate on the different CPU clusters. Then, for different CPU clusters, taking the CPU utilization rate of the CPU cluster, the total load of threads bound to the CPU cluster, the total load of all threads and the game frame rate as states, taking the frequency of the CPU cluster as an action, calculating rewards, and training the reinforcement learning model to obtain a plurality of converged reinforcement learning models (namely frequency adjustment neural networks), wherein the plurality of frequency adjustment neural networks correspond to the different CPU clusters.
In a model application stage, the embodiment of the application provides a frequency modulation method, which inputs the load of a corresponding CPU cluster in the previous n frames, the total load of a thread group bound to the CPU cluster in the previous n frames, the total load of all thread groups in the previous n frames and the game frame rate in the previous n frames into each frequency adjustment neural network. The frequency adjustment neural network can output CPU frequency for adjusting the operation frequency of the next frame of the corresponding CPU cluster.
The method provided by the application is described below from a model training phase and a model application phase:
referring to fig. 4, a system architecture 100 according to an embodiment of the present application is provided. As shown in fig. 4, the data acquisition device 160 is configured to acquire training data, where the training data includes: CPU utilization of each thread, CPU utilization of each CPU cluster, FPS, CPU frequency and other parameters; and stores the training data in database 130, training device 120 trains to obtain target model/rule 101 based on the training data maintained in database 130. How the training device 120 obtains the target model/rule 101 based on the training data will be described in more detail, where the target model/rule 101 can be used to implement the frequency modulation method provided by the embodiment of the present application, that is, the CPU utilization of each thread, the CPU utilization of each CPU cluster, and the FPS input the target model/rule 101, that is, the CPU frequency can be obtained. The target model/rule 101 in embodiments of the present application may specifically be a frequency-tuned neural network (i.e., a reinforcement learning model for tuning the CPU frequency).
The target model/rule 101 obtained by training according to the training device 120 may be applied to different systems or devices, such as the execution device 110 shown in fig. 4, where the execution device 110 may be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an AR/VR, a vehicle-mounted terminal, etc., and may also be a server or cloud terminal, etc. In fig. 4, the execution device 110 is configured with an I/O interface 112 for data interaction with external devices, and a user may input data to the I/O interface 112 through the client device 140, where the input data may include, in an embodiment of the present application: CPU utilization of each thread, CPU utilization of each CPU cluster, FPS, etc.
The preprocessing module 113 is configured to perform preprocessing according to the input data received by the I/O interface 112, and in this embodiment, the preprocessing module 113 may be configured to perform smoothing operation on the input data.
In the process related to the execution of the computation by the computation module 111 of the execution device 110, the execution device 110 may call the data, the code, etc. in the data storage system 150 for the corresponding process, or may store the data, the instruction, etc. obtained by the corresponding process in the data storage system 150.
Finally, the I/O interface 112 returns the processing results, such as the CPU frequencies obtained above, to the client device 140, thereby providing the user with the processing results.
It is noted that the training device 120 may generate corresponding target models/rules based on different training data for different targets or different tasks, which may be used to achieve the targets or to perform the tasks, thereby providing the user with the desired results. In addition, fig. 4 is only a schematic diagram of a system architecture provided by an embodiment of the present application, where the positional relationship between devices, apparatuses, modules, etc. shown in the drawing is not limited in any way, for example, in fig. 4, the data storage system 150 is an external memory with respect to the execution device 110, and in other cases, the data storage system 150 may be disposed in the execution device 110. For another example, the client device 140 and the execution device 110 may be the same device.
The model training phase described above is described in connection with the accompanying drawings.
Fig. 5 shows a first flowchart of a training method of a frequency-adjusting neural network according to an embodiment of the present application, which can be applied to the training device 120 in fig. 4, and specifically illustrates a process in which the training device 120 divides threads of a game application into a plurality of thread groups and establishes a binding relationship between the threads and CPU clusters. As shown in fig. 5, the training method of the frequency-regulated neural network includes:
s510, obtaining trace files of the game application.
The trace file mainly includes information recorded during game application running, such as: the frame length of each frame, the start time and end time of each frame, the CPU utilization of each CPU cluster in each frame, the frequency of each CPU cluster in each frame, the names of the threads related to the running of the game application, the CPU cluster on which each thread executes, the running time of each thread in each frame, the start time and end time of each thread, the object (i.e., another thread) awakened by each thread, the waiting time of each thread, and the object (i.e., another thread) waited for by each thread. The threads related to the running of the game application include the plurality of threads of the game application itself, as well as other threads involved in the running of the game, i.e., threads not belonging to the game application.
S520, analyzing the trace file to determine the dependency relationship among the plurality of target application threads.
The target application thread refers to a plurality of threads included in the game application itself.
In this embodiment, the dependency relationship may be determined according to the wake-up objects and waiting objects of the threads, and is used to indicate whether two threads are executed serially or in parallel. Taking thread A and thread B as an example: if thread A can wake thread B but thread B cannot wake thread A, thread B depends on thread A; if thread A can wake thread B and thread B can also wake thread A, thread A and thread B depend on each other; if thread A and thread B are executed independently and neither wakes the other, thread A and thread B are independent.
In the case that thread B depends on thread A, the electronic device must execute thread A first and then let thread A wake thread B; that is, the electronic device cannot execute thread A and thread B at the same time, and thread A and thread B are executed serially. In the case that thread A and thread B depend on each other or are independent, it does not matter which of them the electronic device executes first, and the two can be executed in parallel.
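A minimal sketch of this classification from the wake-up relations recorded in the trace (the wakes mapping below is a hypothetical example consistent with FIG. 6, not data from the application):

```python
def dependency(wakes: dict, a: str, b: str) -> str:
    """Classify two threads as 'serial' or 'parallel' from wake-up relations.
    wakes[x] is the set of threads that thread x can wake up."""
    a_wakes_b = b in wakes.get(a, set())
    b_wakes_a = a in wakes.get(b, set())
    if a_wakes_b != b_wakes_a:      # one-way wake-up: one thread depends on the other
        return "serial"
    return "parallel"               # mutual wake-up, or no wake-up relation at all

# Example consistent with FIG. 6: unityGfx and unityMain wake each other (parallel),
# while worker2 can only be woken by unityGfx (serial).
wakes = {"unityGfx": {"unityMain", "worker2"}, "unityMain": {"unityGfx"}}
print(dependency(wakes, "unityGfx", "unityMain"))  # parallel
print(dependency(wakes, "unityGfx", "worker2"))    # serial
```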
Taking a game as an example, the training device 120 analyzes the trace file of the game and may obtain a partial inter-thread dependency graph as shown in FIG. 6. As shown in fig. 6, the threads related to the running of the game include a plurality of threads such as unityGfx (rendering thread), unityMain (logic thread), worker, worker2, surfaceFlinger (inter-process rendering), Audio, Network and input. The unityGfx thread, unityMain thread, worker thread and worker2 thread are threads of the game itself, while the surfaceFlinger thread, Audio thread, Network thread and input thread are other threads involved in running the game. The unityGfx thread and the unityMain thread can wake each other and can therefore run in parallel; the worker2 thread can only be woken by the unityGfx thread, so the two can only run serially.
S530, determining the type of the corresponding target application thread according to the CPU utilization rate of each target application thread, and dividing the target application threads of the same type into the same group.
The CPU utilization of a target application thread is the load of the target application thread, namely the proportion of the thread's running time within a frame to the frame length. For example, if the frame length of a frame is 12 ms and the running time of the unityMain thread within that frame is 6 ms, the CPU utilization of the unityMain thread in the frame is 50%.
In this embodiment, the types of target application threads may include two types: light-load threads and heavy-load threads. The training device 120 may determine whether each target application thread is a light-load thread or a heavy-load thread based on its CPU utilization over multiple frames. The CPU utilization of a heavy-load thread is higher than that of a light-load thread. In addition, the training device 120 may divide all light-load threads into a light-load thread group and all heavy-load threads into a heavy-load thread group.
In an alternative embodiment, take the case where the training device 120 obtains the CPU utilization of a target application thread in k consecutive frames as an example. If the CPU utilization of the target application thread is greater than or equal to a first threshold in m or more of the k consecutive frames, the thread is determined to be a heavy-load thread and is divided into the heavy-load thread group; otherwise, the thread is determined to be a light-load thread and is divided into the light-load thread group. Here, k and m are positive integers, and m is smaller than k.
For example, suppose k = 10, m = 6, the first threshold is 50%, and the CPU utilization of a target application thread obtained by the training device 120 in 10 frames is, respectively: 30%, 55%, 45%, 65%, 60%, 61%, 58%, 70%, 20%, 25%. The training device 120 then determines that it is a heavy-load thread, because the CPU utilization of the thread is at least 50% in 6 of the 10 frames.
In another alternative embodiment, the training device 120 may calculate the average CPU utilization of each target application thread over k frames; if the average is greater than or equal to the first threshold, the thread is determined to be a heavy-load thread and divided into the heavy-load thread group, and if the average is smaller than the first threshold, the thread is determined to be a light-load thread and divided into the light-load thread group.
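A minimal sketch of the first classification rule above (m and the 50% threshold are the example values from the text; the function name is illustrative):

```python
def classify_thread(cpu_util_per_frame, m=6, threshold=0.5):
    """Label a target application thread from its per-frame CPU utilization over k frames:
    heavy-load if the utilization reaches the threshold in at least m frames, light-load otherwise."""
    heavy_frames = sum(1 for u in cpu_util_per_frame if u >= threshold)
    return "heavy" if heavy_frames >= m else "light"

# The example from the text: 6 of the 10 frames are at or above 50%, so the thread is heavy-load.
utils = [0.30, 0.55, 0.45, 0.65, 0.60, 0.61, 0.58, 0.70, 0.20, 0.25]
print(classify_thread(utils))  # heavy
```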
Optionally, the training device 120 may also divide other threads involved in the game play process into one group, such as other thread groups. Alternatively, the training device 120 may divide threads having CPU utilization greater than or equal to the second threshold among other threads involved in the game play process into lightly loaded thread groups.
Illustratively, as shown in FIG. 6, the training device 120 divides unityGfx and unityMain into the heavy-load thread group, worker, worker2 and surfaceFlinger into the light-load thread group, and Audio, Network and input into the other-thread group.
It should be noted that the above grouping is merely an example, and the training device 120 may further divide the target application threads into more thread groups, for example, 4, 5, 6, etc., which is not limited herein.
S540, according to the type and the dependency relationship of the target application threads, establishing the binding relationship between each target application thread and the CPU cluster.
After the binding relation between the target application thread and the CPU cluster is established, the target application thread can be executed by the corresponding CPU cluster.
In this embodiment, for heavy-load threads, the training device 120 may divide all heavy-load threads into two groups of threads according to the dependency relationship. Specifically, the training device 120 divides two heavy-load threads capable of parallel execution into two thread groups, and divides two heavy-load threads capable of serial execution into the same group, thus obtaining two groups of threads. Then, the training device 120 may establish a binding relationship between one of the two sets of threads and the first CPU cluster, and establish a binding relationship between the other of the two sets of threads and the second CPU cluster. Therefore, the heavy-load threads can be executed in parallel, and the task processing efficiency is improved; the second CPU cluster can also share the tasks of the first CPU cluster, so that load balancing can be realized.
Further, for light-load threads, the training device 120 may establish a binding of all light-load threads to the second CPU cluster.
In an alternative implementation, the first CPU cluster and the second CPU cluster are the big-core cluster and the middle-core cluster respectively; that is, the frequencies supported by the first CPU cluster are higher than those supported by the second CPU cluster, so as to meet the demand of heavy-load threads for CPU frequency.
Illustratively, as shown in FIG. 6, considering that unityGfx and unityMain may be executed in parallel, the training device 120 binds unityGfx to the first CPU cluster and binds unityMain, worker, worker2 and surfaceFlinger to the second CPU cluster.
In an alternative embodiment, the training device 120 may skip determining the dependency relationships among the multiple target application threads, i.e., S520 need not be performed. In this case, the training device 120 may establish a binding relationship between all heavy-load threads and the first CPU cluster, and a binding relationship between all light-load threads and the second CPU cluster; that is, the light-load and heavy-load threads run in isolation from each other.
Alternatively, for threads in other thread groups, the electronic device may establish a binding relationship between all threads in the group and the third CPU cluster or the second CPU cluster.
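On Linux, one possible way to realize such a binding is CPU affinity. The sketch below is an assumption for illustration only: the core-to-cluster mapping is invented, and os.sched_setaffinity is simply one mechanism that could implement the binding described above.

```python
import os

# Assumed example layout: cores 4-6 form the second (middle) cluster, core 7 the first (big) cluster.
CLUSTER_CPUS = {"first_cluster": {7}, "second_cluster": {4, 5, 6}}

def bind_thread(tid: int, cluster: str) -> None:
    """Pin a thread (identified by its Linux thread id) to the cores of one CPU cluster."""
    os.sched_setaffinity(tid, CLUSTER_CPUS[cluster])  # Linux-only affinity call

# e.g. bind the rendering thread to the first cluster and the logic thread to the second cluster:
# bind_thread(unity_gfx_tid, "first_cluster")
# bind_thread(unity_main_tid, "second_cluster")
```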
After establishing the binding relationship of the target application thread and the CPU cluster, the training device 120 may train the reinforcement learning model for different CPU clusters. In this embodiment, the first frequency adjustment neural network and the second frequency adjustment neural network may be trained for the first CPU cluster and the second CPU cluster, respectively. The first frequency adjusting neural network is used for adjusting the operation frequency of the first CPU cluster, and the second frequency adjusting neural network is used for adjusting the operation frequency of the second CPU cluster.
The training of the first frequency adjustment neural network and the training of the second frequency adjustment neural network are the same, and the difference is that the state value input and the action output by the first frequency adjustment neural network and the second frequency adjustment neural network are related to the corresponding CPU cluster.
Specifically, as shown in fig. 7, for the first CPU cluster, information that characterizes the game running load and the current execution environment may be input as state values to the first frequency adjustment neural network, and the first frequency is output through the decision of the first frequency adjustment neural network. The information characterizing the game running load includes CPU utilization 1 and CPU utilization 2, where CPU utilization 1 is the sum of the CPU utilizations of all target application threads bound to the first CPU cluster, and CPU utilization 2 is the sum of the CPU utilizations of all target application threads. The information characterizing the current execution environment includes the FPS and CPU utilization 3, where CPU utilization 3 is the CPU utilization of the first CPU cluster. After the first frequency is applied to the first CPU cluster, the power consumption 1 of the first CPU cluster is observed and, together with the game frame rate, used as the return. This loop is repeated, and the first frequency adjustment neural network is trained with the data samples consisting of states, actions and rewards until it converges.
For the second CPU cluster, CPU utilization 4, CPU utilization 2, CPU utilization 5 and the FPS may be used as state values, the frequency of the second CPU cluster is used as the action, and the reward calculated from the power consumption 2 of the second CPU cluster and the game frame rate is used together with them as a data sample, so as to train the second frequency adjustment neural network. CPU utilization 5 is the CPU utilization of the second CPU cluster, and CPU utilization 4 is the sum of the CPU utilizations of all target application threads bound to the second CPU cluster.
Fig. 8 shows a second flowchart of a training method of a frequency-tuned neural network according to an embodiment of the present application, which can be applied to the training device 120 in fig. 4, and specifically illustrates a process of training the training device 120 to obtain the frequency-tuned neural network. As shown in fig. 8, the training method of the frequency-regulated neural network includes:
s810, at time t, acquiring a system state value S t
The time t is the time when the execution of the game thread queue buffer is completed, that is, the time when the rendered image of one frame is put back into the data buffer (buffer queue), which can be simply understood as the end time of one frame, and can also be called as the first time. The system state s t The method comprises the steps of frame rate of game application in the previous n frames, CPU utilization rate of target CPU cluster in the previous n frames, total CPU utilization rate of all target application threads bound to the target CPU cluster in the previous n frames, and total CPU utilization rate of all target application threads in the previous n frames.
The previous n frames are the n frames before time t. For example, if time t is the end time of the i-th frame, the previous n frames are the (i-n+1)-th frame through the i-th frame. n is an integer greater than or equal to 1, for example 1, 2, 3, etc., and is not specifically limited here.
Illustratively, n is 3, the training device 120 may obtain, at a time when one frame ends, the CPU utilization of the target CPU cluster in each frame (may be referred to as a third load), the sum of the CPU utilizations of all the target application threads bound to the target CPU cluster in each frame (may be referred to as a second load), the sum of the CPU utilizations of all the target application threads in each frame (may be referred to as a first load), and the game frame rate of each frame in 3 frames before the time.
In an alternative embodiment, the training device 120 may perform a smoothing operation on the frame rate and the CPU utilization to obtain the system state value s_t.
The target CPU cluster may be the first CPU cluster or the second CPU cluster. Taking fig. 6 as an example, the sum of the CPU utilizations of all target application threads bound to the first CPU cluster is the CPU utilization of unityGfx. The sum of the CPU utilizations of all target application threads bound to the second CPU cluster is the sum of the CPU utilization of unityMain, the CPU utilization of worker and the CPU utilization of worker2. The sum of the CPU utilizations of all target application threads is the sum of the CPU utilization of unityGfx, the CPU utilization of unityMain, the CPU utilization of worker2, and so on.
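Complementing the earlier state sketch, the following illustrates the optional smoothing step; the exponential smoothing used here is only one possible choice, since the application does not specify the smoothing operation, and the example values are invented.

```python
def smooth(values, alpha=0.5):
    """Simple exponential smoothing over a per-frame sequence (one possible smoothing choice)."""
    out, prev = [], values[0]
    for v in values:
        prev = alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out

def state_value(fps, cluster_load, bound_load, total_load):
    """Assemble s_t from the per-frame frame rate, third load, second load and first load
    of the previous n frames, each optionally smoothed."""
    return smooth(fps) + smooth(cluster_load) + smooth(bound_load) + smooth(total_load)

# e.g. n = 3 frames before time t:
s_t = state_value([60, 58, 61], [0.70, 0.72, 0.69], [0.50, 0.52, 0.49], [0.90, 0.88, 0.91])
```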
S820, inputting the system state value s_t into the DQN network to obtain the target operating frequency a_t.
Specifically, the DQN network may select, from the adjustable CPU frequency range of the target CPU cluster, the operating frequency of the target CPU cluster for the next frame as its output, thereby obtaining the target operating frequency a_t.
S830, setting the operation frequency of the target CPU cluster as the target operation frequency a t
For example, if the target CPU cluster is the first CPU cluster, the operating frequency of the first CPU cluster is set to the target operating frequency a t The method comprises the steps of carrying out a first treatment on the surface of the If the target CPU cluster is the second CPU cluster, setting the operating frequency of the second CPU cluster as the target operating frequency a t
Specifically, the training device 120 may set the operating frequency of the target CPU cluster during the current frame to the target operating frequency a_t. The current frame may be understood as the frame following time t. For example, if time t is the moment the 3rd frame ends and the training device 120 obtains the 1st, 2nd, 3rd, 4th, ... frames in sequence, then the previous 3 frames at time t are the 1st, 2nd, and 3rd frames, and the frame following time t is the 4th frame. In other words, the training device 120 may predict the operating frequency of the target CPU cluster during the 4th frame from the loads and FPS during the 1st, 2nd, and 3rd frames.
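A minimal sketch of how the selected frequency might be applied on a Linux-based device is shown below. It assumes a cpufreq sysfs policy directory per CPU cluster and root privileges; the original text does not specify the mechanism actually used to set the cluster frequency, so the paths are placeholders.

```python
# Sketch: pinning a CPU cluster to the target operating frequency through the
# Linux cpufreq sysfs interface. The per-cluster policy paths are assumptions.
CLUSTER_POLICY = {
    "first":  "/sys/devices/system/cpu/cpufreq/policy4",  # e.g. big cluster
    "second": "/sys/devices/system/cpu/cpufreq/policy0",  # e.g. little cluster
}

def set_cluster_frequency(cluster, freq_khz):
    policy = CLUSTER_POLICY[cluster]
    # Clamping both bounds to the target frequency effectively fixes the cluster
    # at that frequency for the duration of the next frame (requires root).
    for node in ("scaling_min_freq", "scaling_max_freq"):
        with open(f"{policy}/{node}", "w") as f:
            f.write(str(freq_khz))
```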
S840, at time t+1, obtain the game frame rate and the power consumption of the target CPU cluster.
The time t+1 can be understood as the end time of the current frame.
S850, calculate the reward r_{t+1} based on the game frame rate at time t+1 and the power consumption of the target CPU cluster, and update the DQN network according to the reward r_{t+1}.
As one example, the reward r is obtained according to the following piecewise function:
where C_f is a constant, fps_t is the game frame rate at time t, FPS is a preset target frame rate, and P(f_t) is the power consumption, a parameter related to the frequency f_t of the target CPU cluster.
From the above piecewise function it can be seen that when the actually detected frame rate fps_t is below the target frame rate FPS, the reward is the frame-rate difference (that is, the difference between fps_t and the target frame rate FPS), which aims to raise the CPU frequency so as to increase the frame rate. Otherwise, the reward is the power-consumption overhead, which aims to lower the CPU frequency so as to reduce CPU power consumption.
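Because the piecewise formula itself appears only as an image in the original document, the sketch below encodes just the behaviour described in the preceding paragraph; the value of C_f and the power model P(f) are placeholder assumptions.

```python
# Sketch of the reward behaviour described above (not the patented formula).
C_F = 1.0            # positive constant weighting the frame-rate term (assumed)
TARGET_FPS = 60.0    # preset target frame rate (assumed)

def power_model(freq_khz):
    # P(f_t): power consumption as a function of cluster frequency; a real
    # implementation would use measured or table-driven values.
    return (freq_khz / 1_000_000) ** 2

def reward(fps_t, freq_khz):
    if fps_t < TARGET_FPS:
        # frame rate below target: negative reward proportional to the shortfall,
        # pushing the policy toward a higher CPU frequency
        return C_F * (fps_t - TARGET_FPS)
    # frame rate met: reward is the (negative) power overhead, pushing the
    # policy toward a lower CPU frequency
    return -power_model(freq_khz)
```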
S860, determine whether the DQN network has converged.
In an alternative embodiment, a loss function of the DQN network can be calculated. If the loss function is less than a threshold, the DQN network has converged; if the loss function is not less than the threshold, the DQN network has not converged.
The loss function of the DQN network satisfies the following formula:
where γ is a predetermined constant, Q(s_t, a_t; θ^-) denotes the Q function, a function with θ as its variable that takes the state s_t and the action a_t as inputs and outputs the corresponding Q value, and the maximum term refers to the maximum value of the Q function, which can be determined from all actions and states currently collected.
Updating the DQN network based on the reward is, in effect, updating the value of θ in the loss function; the maximum Q value is obtained by continuously updating θ.
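A minimal sketch of this update is shown below. It uses the standard DQN temporal-difference loss with a target network parameterized by θ^-, consistent with the symbols defined above; the patent's exact formula appears only as an image and is not reproduced here, so this is an assumption-labelled illustration rather than the claimed method.

```python
# Sketch of a standard DQN update: the loss compares Q(s_t, a_t; theta) against
# the bootstrapped target r_{t+1} + gamma * max_a Q(s_{t+1}, a; theta^-).
import torch
import torch.nn.functional as F

GAMMA = 0.9  # gamma, a predetermined constant (value assumed)

def dqn_update(q_net, target_net, optimizer, s_t, a_t, r_t1, s_t1):
    """s_t and s_t1 are float tensors; a_t is an action index; r_t1 is a float."""
    q_value = q_net(s_t)[a_t]                       # Q(s_t, a_t; theta)
    with torch.no_grad():
        max_next_q = target_net(s_t1).max()         # max_a Q(s_{t+1}, a; theta^-)
    td_target = r_t1 + GAMMA * max_next_q
    loss = F.mse_loss(q_value, td_target)           # minimized by adjusting theta
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def sync_target(q_net, target_net):
    # When the Q network has converged, its parameters are copied to the target
    # network, which then serves as the frequency adjustment neural network.
    target_net.load_state_dict(q_net.state_dict())
```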
S870, when the DQN network converges, output the frequency adjustment neural network.
Note that the DQN includes two neural networks: a Q network and a target neural network (i.e., the frequency adjustment neural network of the present application). The target neural network has the same structure as the Q network and the same initial weights. When the Q network converges, the parameters of the Q network are synchronized to the target neural network to obtain the frequency adjustment neural network.
In S840, the training device 120 may also obtain the state value s_{t+1} at time t+1. When the DQN network has not converged, the training device 120 may input the state value s_{t+1} into the DQN network again and continue training according to the flow shown in fig. 8 until the DQN network converges.
Further, during the training process, the training device 120 may collect (s_t, a_t, r_t) as a training sample, which makes it convenient to accumulate training data and to train other models.
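For illustration, such interaction tuples could be accumulated in a simple replay buffer; storing the next state s_{t+1} alongside (s_t, a_t, r_t) is an assumption commonly made in DQN training, not a statement of the original text.

```python
# Sketch: accumulating interaction tuples so they can be replayed for this
# model or reused to train other models, as noted above.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, s_t, a_t, r_t1, s_t1):
        self.buffer.append((s_t, a_t, r_t1, s_t1))

    def sample(self, batch_size):
        # uniform random minibatch for the DQN update
        return random.sample(self.buffer, batch_size)
```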
The model application phase of the present application will be described further below.
Fig. 9 is a schematic flow chart of a frequency modulation method according to an embodiment of the present application, which can be applied to the execution device 110 in fig. 4. As shown in fig. 9, the frequency modulation method includes:
S910, at the first moment, obtain, as the system state value, the game frame rate of each of the previous n frames at the first moment, the CPU utilization of the target CPU cluster in each of those frames, the sum of the CPU utilizations of all target application threads bound to the target CPU cluster in each of those frames, and the sum of the CPU utilizations of all target application threads in each of those frames.
S920, input the system state value into the frequency adjustment neural network corresponding to the target CPU cluster to obtain the target operating frequency.
S930, set the operating frequency of the target CPU cluster during the frame following the first moment to the target operating frequency.
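Putting S910-S930 together, the per-frame loop on the execution device 110 might look like the following sketch. It reuses the helper functions from the earlier sketches (build_state, select_frequency, set_cluster_frequency), and collect_frame_sample is a hypothetical callback returning the loads and frame rate of the frame that just ended; none of these names are from the original disclosure.

```python
from collections import deque

def frequency_adjustment_loop(q_net, cluster, collect_frame_sample, n=3):
    """Per-frame frequency adjustment on the execution device (sketch only)."""
    history = deque(maxlen=n)
    while True:
        history.append(collect_frame_sample())       # loads + frame rate of the finished frame
        if len(history) < n:
            continue                                  # S910 needs n completed frames
        state = build_state(history)                  # S910: system state value
        freq = select_frequency(q_net, state)         # S920: target operating frequency
        set_cluster_frequency(cluster, freq)          # S930: applied during the next frame
```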
It should be noted that, before the frequency modulation method provided in this embodiment is executed, the frequency adjustment neural network needs to be migrated from the training device 120 to the execution device 110, and the execution device 110 also needs to establish, in advance, a binding relationship between each thread in the game application and the corresponding CPU cluster in the same manner as the training device 120.
In this embodiment, on one hand, the frequency adjustment neural network is trained for a specific application and is used to determine the CPU frequency when that application is running. By taking the application as the granularity, the adjusted CPU frequency better matches the requirements of the application. On the other hand, the reward in reinforcement learning balances application performance and CPU power consumption, rather than considering CPU power consumption alone. In this way, the CPU frequency determined by the frequency adjustment neural network balances application performance against CPU power consumption and is therefore more reasonable.
In addition, the present application takes the system state values of the previous frames as input to obtain the CPU operating frequency of the current frame, thereby adjusting the frequency on a per-frame basis with higher accuracy.
Referring to fig. 10, fig. 10 is a schematic structural diagram of an execution device provided by an embodiment of the present application, and the execution device 1000 may be embodied as a mobile phone, a tablet, a notebook computer, an intelligent wearable device, a server, etc., which is not limited herein. Wherein the executing device 1000 may be used to implement the frequency adjustment function in the corresponding embodiment of fig. 9. Specifically, the execution apparatus 1000 includes: receiver 1001, transmitter 1002, processor 1003, and memory 1004 (where the number of processors 1003 in execution device 1000 may be one or more, one processor is exemplified in fig. 10), where processor 1003 may include application processor 10031 and communication processor 10032. In some embodiments of the application, the receiver 1001, transmitter 1002, processor 1003, and memory 1004 may be connected by a bus or other means.
The memory 1004 may include read-only memory and random access memory, and provides instructions and data to the processor 1003. A portion of the memory 1004 may also include non-volatile random access memory (non-volatile random access memory, NVRAM). The memory 1004 stores operating instructions executable by the processor, executable modules or data structures, or a subset or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
The processor 1003 controls the operation of the execution device. In a specific application, the individual components of the execution device are coupled together by a bus system, which may include, in addition to a data bus, a power bus, a control bus, a status signal bus, etc. For clarity of illustration, however, the various buses are referred to in the figures as bus systems.
The frequency modulation method disclosed in the above embodiment of the present application may be applied to the processor 1003 or implemented by the processor 1003. The processor 1003 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry of hardware in the processor 1003 or instructions in the form of software. The processor 1003 may be a general purpose processor, digital signal processor (digital signal processing, DSP), microprocessor or microcontroller, and may further include an application specific integrated circuit (application specific integrated circuit, ASIC), field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The processor 1003 may implement or execute the methods, steps and logical blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory 1004, and the processor 1003 reads information in the memory 1004 and performs the steps of the method in combination with its hardware.
The receiver 1001 may be used to receive input numeric or character information and to generate signal inputs related to performing relevant settings and function control of the device. The transmitter 1002 may be configured to output numeric or character information via a first interface; the transmitter 1002 may also be configured to send instructions to the disk stack via the first interface to modify data in the disk stack; the transmitter 1002 may also include a display device such as a display screen.
In one example of an embodiment of the present application, the processor 1003 is configured to execute the frequency modulation method performed by the execution device in the corresponding embodiment of fig. 4.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a training device 1100 according to an embodiment of the present application, specifically, the training device 1100 is implemented by one or more servers, where the training device 1100 may have relatively large differences due to different configurations or performances, and may include one or more central processing units (central processing units, CPU) 1111 (e.g., one or more processors) and a memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) storing application programs 1142 or data 1144. Wherein the memory 1132 and the storage medium 1130 may be transitory or persistent. The program stored on the storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations on the training device. Still further, the central processor 1111 may be configured to communicate with a storage medium 1130 and execute a series of instruction operations in the storage medium 1130 on the training device 1100.
The training device 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input-output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
In the embodiment of the present application, the central processor 1111 is configured to perform the steps related to the frequency adjustment neural network training method in the above embodiment.
Embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to perform the steps as performed by the aforementioned performing device, or causes the computer to perform the steps as performed by the aforementioned training device.
The embodiment of the present application also provides a computer-readable storage medium having stored therein a program for performing signal processing, which when run on a computer, causes the computer to perform the steps performed by the aforementioned performing device or causes the computer to perform the steps performed by the aforementioned training device.
The execution device, training device or terminal device provided in the embodiment of the present application may be a chip, where the chip includes: a processing unit, which may be, for example, a processor, and a communication unit, which may be, for example, an input/output interface, pins or circuitry, etc. The processing unit may execute the computer-executable instructions stored in the storage unit to cause the chip in the execution device to perform the data processing method described in the above embodiment, or to cause the chip in the training device to perform the data processing method described in the above embodiment. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, etc., and the storage unit may also be a storage unit in the wireless access device side located outside the chip, such as a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a random access memory (random access memory, RAM), etc.
It should be further noted that the above-described apparatus embodiments are merely illustrative, and that the units described as separate units may or may not be physically separate, and that units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the application, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus the necessary general-purpose hardware, or of course by means of special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, etc. Generally, functions performed by computer programs can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function can vary, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, a software program implementation is in most cases the preferred embodiment. Based on such understanding, the technical solution of the present application, or the part contributing to the prior art, may be embodied essentially in the form of a software product stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a training device, a network device, etc.) to perform the methods according to the embodiments of the present application.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present application, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center via a wired (e.g., coaxial cable, optical fiber, digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a training device, a data center, or the like that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., a floppy Disk, a hard Disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.

Claims (15)

1. The frequency modulation method is characterized by being applied to electronic equipment, wherein the electronic equipment runs a preset application, the preset application comprises a plurality of threads, the threads comprise threads which have binding relation with CPU clusters of the electronic equipment, the CPU clusters of the electronic equipment comprise target CPU clusters corresponding to a frequency adjustment neural network, and the frequency adjustment neural network is used for adjusting the operation frequency of the target CPU clusters corresponding to the frequency adjustment neural network, and the method comprises the following steps:
for any target CPU cluster, acquiring system state information, wherein the system state information comprises a first load, a second load, a third load and a frame rate of the preset application, the first load is the sum of loads of all threads, the second load is the sum of loads of threads which have binding relation with the target CPU cluster, and the third load is the load of the target CPU cluster;
inputting the system state information into a frequency adjustment neural network corresponding to the target CPU cluster, and obtaining a target operating frequency output by the frequency adjustment neural network corresponding to the target CPU cluster;
and setting the operating frequency of the target CPU cluster as the target operating frequency.
2. The method of claim 1, wherein the acquiring system state information comprises:
acquiring system state information of a previous n frame at a first moment, wherein the system state information of the previous n frame at the first moment comprises a first load corresponding to each frame in the previous n frame, a second load corresponding to each frame in the previous n frame, a third load corresponding to each frame in the previous n frame and a frame rate of each frame in the previous n frame, and n is more than or equal to 1;
the setting the operation frequency of the target CPU cluster to the target operation frequency includes:
setting an operating frequency of the target CPU cluster during a frame subsequent to the first time as the target operating frequency.
3. A method according to claim 1 or 2, wherein the first moment is the moment at which any frame ends.
4. A method according to any one of claims 1-3, wherein the pre-set application is a gaming application.
5. A method according to any one of claims 1-3, wherein the load is CPU utilization.
6. The training method of the frequency adjustment neural network is characterized by comprising the steps of training to obtain the frequency adjustment neural network corresponding to a target CPU cluster of electronic equipment, wherein the electronic equipment runs a preset application, the preset application comprises a plurality of threads, the threads with binding relation with the CPU cluster of the electronic equipment are included in the threads, the CPU cluster of the electronic equipment comprises the target CPU cluster corresponding to the frequency adjustment neural network, and the frequency adjustment neural network is used for adjusting the running frequency of the target CPU cluster corresponding to the frequency adjustment neural network, and the method comprises the following steps:
For any target CPU cluster, training data of the reinforcement learning network model corresponding to the target CPU cluster is obtained according to the data obtained by interaction between the reinforcement learning network model corresponding to the target CPU cluster and the target CPU cluster;
training reinforcement learning is carried out on the reinforcement learning network model corresponding to the target CPU cluster by utilizing the training data, and a frequency adjustment neural network corresponding to the target CPU cluster is obtained;
the system state information comprises a first load, a second load, a third load and a frame rate of the preset application, wherein the first load is the load sum of all threads, the second load is the load sum of threads with binding relation with the target CPU cluster, and the third load is the load of the target CPU cluster.
7. The method of claim 6, wherein the training data comprises a plurality of samples, each sample comprising a state, an action, and a reward, the obtaining training data of the reinforcement learning network model corresponding to the target CPU cluster based on data obtained by interaction of the reinforcement learning network model with the target CPU cluster, comprising:
Acquiring system state information at a first moment, wherein the system state information is used as the state, and comprises a first load corresponding to each frame in the previous n frames at the first moment, a second load corresponding to each frame in the previous n frames, a third load corresponding to each frame in the previous n frames, a frame rate in the previous n frames and a frame rate corresponding to each frame in the previous n frames, wherein n is more than or equal to 1;
inputting the state into a reinforcement learning network model corresponding to the target CPU cluster, obtaining a target operating frequency, and taking the target operating frequency as the action;
setting an operating frequency of the target CPU cluster during a frame subsequent to the first time as the target operating frequency;
and acquiring the frame rate of the preset application in the period of the next frame of the first moment and the power consumption of the target CPU cluster, and calculating to obtain the rewards.
8. The method of claim 7, wherein the frame rate, the power consumption, and the reward satisfy:
wherein r_t is the reward, fps_t is the frame rate, FPS is a preset target frame rate, δ is a preset difference, and P(f_t) is the power consumption, a parameter related to the frequency f_t of the target CPU cluster.
9. The method according to any one of claims 6-8, further comprising:
acquiring a trace file, wherein the trace file comprises a frame length of a multi-frame, running times of the plurality of threads in the multi-frame, the thread awakened by each thread, and the thread each thread waits for;
determining the dependency relationship among threads according to the thread awakened by each thread and the thread each thread waits for in the trace file, wherein the dependency relationship is used for indicating that the threads are executed in series or in parallel;
determining the type of the thread according to the CPU utilization rate of the thread, wherein the type comprises a first type and a second type, and the CPU utilization rate of the thread of the first type is higher than that of the thread of the second type;
and respectively establishing a binding relation between each thread and one CPU cluster according to the types of the threads and the dependency relations, wherein the CPU clusters bound by the threads are the same or different.
10. The method according to claim 9, wherein the binding relationship between each thread and one CPU cluster is respectively established according to the type of the thread and the dependency relationship, the CPU clusters bound by the plurality of threads are the same or different, the one or more CPU clusters include a first CPU cluster and a second CPU cluster, and the method includes:
Dividing two threads which are executed in parallel in the first type of threads into two different thread groups, and dividing two threads which are executed in series in the first type of threads into the same thread group to obtain two thread groups;
establishing a binding relationship between one of the two thread groups and the first CPU cluster, establishing a binding relationship between the other of the two thread groups and the second CPU cluster, and establishing a binding relationship between the second type of thread and the second CPU cluster.
11. The method of claim 10, wherein the operating frequency supported by the first CPU cluster is higher than the operating frequency supported by the second CPU cluster.
12. The method of any of claims 9-11, wherein if there are at least k frames in the multi-frame in which the CPU utilization of the thread is greater than or equal to a first threshold, the thread is of the first type, k being greater than or equal to 1;
and if the CPU utilization of the thread is greater than or equal to a second threshold in fewer than k frames of the multi-frame, the thread is of the second type.
13. The method according to any one of claims 6-12, wherein the preset application is a gaming application.
14. An electronic device, the electronic device comprising: a memory and a processor; the processor is coupled with the memory; wherein the memory is for storing computer program code, the computer program code comprising computer instructions; the computer instructions, when executed by the processor, cause the electronic device to perform the method of any of claims 1-13.
15. A computer-readable storage medium comprising computer instructions; the computer instructions, when run on an electronic device, cause the electronic device to perform the method of any one of claims 1-13.
CN202310210010.7A 2023-02-25 2023-02-25 Frequency modulation method, training method of frequency adjustment neural network and electronic equipment Pending CN117130769A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310210010.7A CN117130769A (en) 2023-02-25 2023-02-25 Frequency modulation method, training method of frequency adjustment neural network and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310210010.7A CN117130769A (en) 2023-02-25 2023-02-25 Frequency modulation method, training method of frequency adjustment neural network and electronic equipment

Publications (1)

Publication Number Publication Date
CN117130769A true CN117130769A (en) 2023-11-28

Family

ID=88853366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310210010.7A Pending CN117130769A (en) 2023-02-25 2023-02-25 Frequency modulation method, training method of frequency adjustment neural network and electronic equipment

Country Status (1)

Country Link
CN (1) CN117130769A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117687495A (en) * 2024-02-04 2024-03-12 荣耀终端有限公司 Data acquisition method, training method and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475790A (en) * 2013-09-06 2013-12-25 中国科学院计算技术研究所 Intelligent mobile terminal power consumption management method
CN108958450A (en) * 2018-06-11 2018-12-07 江苏食品药品职业技术学院 Intelligence adjusts the computer of running frequency and power consumption
CN109491494A (en) * 2018-11-26 2019-03-19 北京地平线机器人技术研发有限公司 Method of adjustment, device and the intensified learning model training method of power parameter
CN109918141A (en) * 2019-03-15 2019-06-21 Oppo广东移动通信有限公司 Thread execution method, device, terminal and storage medium
CN110333911A (en) * 2019-07-04 2019-10-15 北京迈格威科技有限公司 A kind of file packet read method and device
CN110413400A (en) * 2018-04-28 2019-11-05 珠海全志科技股份有限公司 A kind of cpu frequency adjusting method and system
CN110795383A (en) * 2018-08-01 2020-02-14 Oppo广东移动通信有限公司 SoC frequency control method, device, terminal and storage medium
US20210072814A1 (en) * 2017-12-12 2021-03-11 Samsung Electronics Co., Ltd. Method and apparatus for operating a processor in an electronic device
CN113115451A (en) * 2021-02-23 2021-07-13 北京邮电大学 Interference management and resource allocation scheme based on multi-agent deep reinforcement learning
CN113448425A (en) * 2021-07-19 2021-09-28 哈尔滨工业大学 Dynamic parallel application program energy consumption runtime optimization method and system based on reinforcement learning
CN113742082A (en) * 2021-09-13 2021-12-03 Oppo广东移动通信有限公司 Application resource allocation method and device, computer readable medium and terminal
CN114510140A (en) * 2020-11-16 2022-05-17 深圳市万普拉斯科技有限公司 Frequency modulation method and device and electronic equipment
CN115705275A (en) * 2021-08-03 2023-02-17 Oppo广东移动通信有限公司 Parameter acquisition method and device and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination