WO2021103618A1 - Configuration of operating frequency of chip - Google Patents


Info

Publication number
WO2021103618A1
Authority
WO
WIPO (PCT)
Prior art keywords
chip
subtask
frequency
task
data
Prior art date
Application number
PCT/CN2020/105195
Other languages
French (fr)
Chinese (zh)
Inventor
李天健
戴彦
王迎瑞
侯宇乐
杨修齐
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Priority to KR1020217020535A (published as KR20210098508A)
Priority to JP2021538698A (published as JP2022516549A)
Publication of WO2021103618A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/324 Power saving characterised by the action undertaken by lowering clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure relates to smart device technology, in particular to a method and device for setting the operating frequency of a chip.
  • Frequency reduction is one of the main methods of preventing excessive power consumption of the chip.
  • Frequency reduction technology mainly reduces the power consumption of the chip by temporarily lowering the chip's operating frequency.
  • an initial operating frequency can be set before the application program is executed in the terminal device, and the chip runs at the initial operating frequency to perform calculation processing on the application program.
  • A fixed frequency reduction strategy can be adopted for the chip, for example, the operating frequency of the chip is reduced by a fixed ratio, or reduced to a fixed value.
  • the embodiments of the present disclosure provide at least a method and device for setting the operating frequency of a chip.
  • A method for setting the operating frequency of a chip includes: acquiring a plurality of subtasks of a target task and task parameters of each subtask, the task parameters including parameters used to indicate the operation scale of the subtask; determining the target chip frequency of each subtask based on the task parameters of each subtask in the plurality of subtasks; and setting, according to the determined target chip frequency of each subtask, the operating frequency of the chip when executing each subtask.
  • The method further includes: performing task analysis processing on the target task to obtain the plurality of subtasks and the task parameters of each subtask; and storing a correspondence between each subtask in the plurality of subtasks and the task parameters of that subtask. The acquiring of the plurality of subtasks of the target task and the task parameters of each subtask includes: looking up the task parameters of each subtask in the stored correspondence.
  • the task parameter includes at least one of the following: a calculation amount of the subtask, and a memory access amount of the subtask.
  • Determining the target chip frequency of each subtask based on the task parameters of each subtask in the plurality of subtasks includes: acquiring device information of the device where the chip is located, the device information including device resource information; and determining the target chip frequency of each subtask based on the device information and the task parameters of each subtask in the plurality of subtasks.
  • the device resource information includes any one or more of the following: the number of computing units, bandwidth, and memory capacity.
  • The device information further includes the chip temperature of the chip. Determining the target chip frequency of each subtask based on the device information and the task parameters of each subtask in the plurality of subtasks includes: determining the target chip frequency of a first subtask among the plurality of subtasks based on the task parameters of the first subtask, the device resource information, and the chip temperature during execution of the first subtask.
  • Determining the target chip frequency of each subtask based on the device information and the task parameters of each subtask in the plurality of subtasks includes: acquiring a preset mapping relationship between a plurality of first data and a plurality of second data, where the first data includes preset task parameters and preset device information, and the second data includes a preset chip frequency; and determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
  • Determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask includes: determining the distance between third data and each first data in the plurality of first data in the preset mapping relationship, where the third data includes the device information and the task parameters of a first subtask among the plurality of subtasks; and using the preset chip frequency corresponding to the target first data closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
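  • The nearest-distance lookup described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the table contents, the choice of Euclidean distance, and the names `PRESET_MAPPING` and `lookup_target_frequency` are all assumptions. In practice the dimensions would likely need normalizing so that no single one dominates the distance.

```python
import math

# Hypothetical preset mapping: first data -> preset chip frequency (MHz).
# Each key is (calculation amount in GFLOPs, memory access in MB, computing units).
PRESET_MAPPING = {
    (1.0, 10.0, 4.0): 800,
    (5.0, 50.0, 4.0): 1200,
    (20.0, 200.0, 8.0): 1600,
}


def lookup_target_frequency(third_data, mapping=PRESET_MAPPING):
    """Return the preset frequency of the first data closest to third_data.

    Euclidean distance is an assumption; the text only says "distance".
    """
    nearest = min(mapping, key=lambda first: math.dist(first, third_data))
    return mapping[nearest]


# A subtask with 4 GFLOPs and 45 MB of memory access on a 4-unit device
# falls closest to the second table entry.
frequency = lookup_target_frequency((4.0, 45.0, 4.0))
```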
  • Before acquiring the preset mapping relationship between the plurality of first data and the plurality of second data, the method further includes: obtaining a plurality of sets of optional chip frequencies, and obtaining a plurality of discrete first data by sampling; and, for each first data in the plurality of first data, selecting a set of chip frequencies from the plurality of sets of optional chip frequencies as the second data corresponding to that first data, and establishing a mapping relationship between each first data and the selected second data.
  • The preset chip frequency corresponding to the first data in the preset mapping relationship is selected from the multiple sets of optional chip frequencies based on the performance evaluation parameter corresponding to the first data under the condition of each set of optional chip frequencies.
  • The performance evaluation parameters include task processing performance parameters and chip operating power consumption. The preset chip frequency corresponding to the first data in the preset mapping relationship is, among the multiple sets of optional chip frequencies, the optional chip frequency whose chip operating power consumption is lower than the preset power consumption and whose task processing performance parameter is the best.
  • the method further includes: receiving configuration information input by a user for the preset mapping relationship.
  • Determining the target chip frequency of each subtask based on the task parameters of each subtask in the plurality of subtasks includes: based on the task parameters, selecting from multiple sets of optional chip frequencies the chip frequency that enables the chip to achieve the lowest task running time under the restriction of chip power consumption, as the target chip frequency.
  • The method further includes: receiving frequency setting strategy information input by a user. Determining the target chip frequency of each subtask based on the task parameters of each subtask in the plurality of subtasks includes: determining the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each subtask in the plurality of subtasks.
  • the frequency setting strategy information includes enabling or disabling the chip frequency dynamic setting function for subtasks.
  • the operating frequency includes at least one of the following: a core frequency of the chip or a memory frequency.
  • A device for setting the operating frequency of a chip includes: an acquisition module for acquiring a plurality of subtasks of a target task and task parameters of each subtask, the task parameters including parameters used to indicate the operation scale of the subtask; a frequency control module for determining the target chip frequency of each subtask based on the task parameters of each subtask in the plurality of subtasks; and a frequency setting module for setting, according to the determined target chip frequency of each subtask, the operating frequency of the chip when executing each subtask.
  • An electronic device includes a memory and a processor, where the memory is configured to store machine-readable instructions, and the processor is configured to invoke the machine-readable instructions to implement the method of the first aspect of the present disclosure.
  • the device further includes: a chip configured to process each subtask in the target task based on the operating frequency set by the processor.
  • A computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the processor is caused to implement the method described in the first aspect of the present disclosure.
  • The method and device for setting the operating frequency of a chip provided by the embodiments of the present disclosure determine the target chip frequency of each subtask in the target task based on the task parameters of the subtask, so that the chip can run at the target chip frequency of the subtask when executing each subtask.
  • Fig. 1 shows a flowchart of a method for setting a chip operating frequency provided by some embodiments of the present disclosure.
  • Fig. 2 shows another flowchart of a method for setting a chip operating frequency provided by some embodiments of the present disclosure.
  • FIG. 3 shows another flowchart of the process of establishing a preset mapping relationship provided by some embodiments of the present disclosure.
  • FIG. 4 shows another flowchart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
  • Fig. 5 shows a block diagram of a device for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
  • FIG. 6 shows another block diagram of the device for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
  • Fig. 7 shows a block diagram of an electronic device with a chip operating frequency provided by some embodiments of the present disclosure.
  • the embodiments of the present disclosure provide a method for setting the operating frequency of a chip in a terminal device.
  • The method aims to improve the operating performance of the device when the chip runs at the set operating frequency, for example, so that tasks running on the device can be processed at a faster speed.
  • The chip can be an artificial intelligence (AI) chip, for example, an AI chip used in a smartphone, an AI chip used in an autonomous vehicle, an AI chip used in edge devices of the Internet of Things, and so on. It may also be another type of chip, such as a CPU (Central Processing Unit) chip, a DSP (Digital Signal Processor) chip, a memory chip, etc., which is not limited in the embodiments of the present disclosure.
  • Fig. 1 shows a flow chart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure. As shown in Fig. 1, the method may include the following processing.
  • In step 100, a plurality of subtasks of the target task and task parameters of each subtask are acquired, the task parameters including parameters used to indicate the operation scale of the subtask.
  • the target task may be a task to be executed and processed on the device.
  • the task may be a training task of a deep learning model, or an inference task of a neural network model, or running an application program, etc.
  • the execution of the target task requires the use of a chip on the device, and the chip is responsible for part or all of the processing work such as calculations during the execution of the task.
  • the subtask may be a relatively independent execution unit included in the target task.
  • the target task can include multiple functions, which can be divided according to functions, and each function is used as an independent subtask in the target task.
  • The task code can also be divided into multiple code blocks, with each relatively complete or independent code block used as a subtask.
  • For a neural network model, each layer or each group of several layers can be regarded as a subtask; alternatively, the code segment that implements each layer can be used as a subtask.
  • one or at least two of the multiple functional modules can also be used as a subtask, and so on.
  • The embodiments of the present disclosure do not limit the way of dividing subtasks, and there may be multiple ways of doing so, such as the above-mentioned division by function, by code block, by network layer or operator, or by functional module. It should be noted that although dividing the target task into multiple subtasks allows fine-grained frequency settings and improves device operating performance, too many subtasks lead to more frequent frequency changes, which in turn affects the operating performance of the device. The number of subtasks can therefore be balanced. For example, avoid dividing the inside of a loop body in the target task into subtasks, so as to reduce the number of frequency settings. The specific division can be chosen according to actual needs and is not described in detail here.
  • the task parameter of the subtask may include a parameter used to indicate the calculation scale of the subtask, for example, may include a parameter used to indicate one or more resources that the subtask needs to occupy.
  • The task parameters may include, but are not limited to, at least one of the following: the calculation amount of the subtask, and the memory access amount of the subtask (the number of times the chip accesses the memory and/or the total amount of data accessed when the subtask is executed). Other types of task parameters may also be included, which is not limited in the embodiments of the present disclosure.
  • The task parameters of the subtasks can be obtained in various ways: for example, from other devices or from other functional modules in the system, from local storage, or by performing task analysis on the target task, and so on.
  • the correspondence between each subtask and the respective task parameter may be stored in advance.
  • task analysis processing may be performed on the target task in advance to obtain multiple subtasks included in the target task and task parameters of each subtask, and each subtask and its task parameters are stored. In this way, when the target task needs to be run, the task parameters of each subtask can be quickly found based on the stored information.
  • the correspondence between the subtasks and task parameters can be stored in the form of key-value.
  • Corresponding task IDs can be set for the multiple subtasks; the task ID of a subtask is stored as the key, and the task parameters of that subtask are stored as the value.
  • the key-value storage method is suitable for querying through the primary key, with fast query speed and large amount of stored data. But the present disclosure is not limited to this.
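  • As an illustration of task analysis followed by key-value storage, the sketch below treats each fully connected layer of a small network as one subtask, estimates its calculation amount and memory access amount, and stores the result under a task ID. The layer shapes, the ID format, and the names `TaskParams`, `analyze_dense_layer`, and `analyze_task` are hypothetical; the cost formulas are the usual rough estimates for a dense layer, not values taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskParams:
    flops: int           # calculation amount of the subtask
    bytes_accessed: int  # memory access amount of the subtask


def analyze_dense_layer(batch, in_features, out_features, dtype_bytes=4):
    """Estimate task parameters for one fully connected layer (illustrative)."""
    # One multiply and one add per weight per sample.
    flops = 2 * batch * in_features * out_features
    # Weights, input activations, and output activations are each touched once.
    elements = in_features * out_features + batch * in_features + batch * out_features
    return TaskParams(flops=flops, bytes_accessed=elements * dtype_bytes)


def analyze_task(layers, batch):
    """Build the key-value store: task ID (key) -> task parameters (value)."""
    return {
        f"subtask_{i}": analyze_dense_layer(batch, n_in, n_out)
        for i, (n_in, n_out) in enumerate(layers)
    }


# Analysis is done once in advance; at run time the lookup is a single key access.
store = analyze_task(layers=[(784, 256), (256, 10)], batch=32)
params = store["subtask_1"]
```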
  • device information of the device where the chip is located can be further obtained, and the target chip frequency of each subtask can be jointly determined based on the device information and task parameters.
  • the aforementioned device information may include device resource information of the device where the chip is located.
  • the device resource information is used to indicate information about the available resources of the device.
  • The device resource information may include, but is not limited to, any one or more of the following: the number of computing units, bandwidth, memory capacity, etc.
  • The target chip frequency of each subtask may also depend on the dependencies between the subtasks or on the task execution mode, such as serial execution, parallel execution, or serial execution for some subtasks and parallel execution for others.
  • The chip can occupy all available device resources on the device when executing each subtask, such as all computing units and all bandwidth on the device, but the embodiments of the present disclosure are not limited thereto.
  • In step 102, the target chip frequency of each subtask is determined based on the task parameters of each subtask in the plurality of subtasks.
  • the target chip frequency of each subtask can be determined according to the task parameters of the subtask.
  • the target chip frequency of each subtask may be jointly determined based on the device information of the device where the chip is located and the task parameters of each subtask.
  • multiple sets of optional chip frequencies may be determined, and a corresponding optional chip frequency may be selected for each subtask from the multiple sets of optional chip frequencies based on a certain strategy.
  • the multiple sets of optional chip frequencies are a specific number of discrete frequencies that can be set by the chips in the device.
  • the core frequency may include m frequency values such as x1, x2...xm.
  • the determination of the target chip frequency can be transformed into an optimization problem.
  • The task running time can be used as the main reference factor for selection. As an example, based on the task parameters of each subtask, the chip frequency that enables the chip to achieve the lowest task running time under the restriction of chip power consumption can be selected from the multiple sets of optional chip frequencies as the target chip frequency of that subtask.
  • Multiple subtasks can be analyzed as a whole to determine the overall task running time of the target task. In this case, the target chip frequency of each of the multiple subtasks can be obtained at the same time under the condition that the overall task running time is the shortest.
  • the task running time of each subtask can be analyzed separately, and the target chip frequency of the subtask can be determined based on the shortest task running time of each subtask as a condition, and so on.
  • The system resources consumed by the task, the chip operating power consumption, the task running speed, and the accuracy of the task execution result can also be selected as reference factors, which is not limited here.
  • The following takes determining the target chip frequency according to the task parameters of the subtasks and the device information as an example to illustrate the principle of solving the optimization problem.
  • Assume that the task parameter of the subtask is $k^n$ ($n$ indicates an n-dimensional space, with dimensions such as the memory access amount and the calculation amount), the device information corresponding to the subtask is $d^m$ ($m$ indicates an m-dimensional space, with dimensions such as memory capacity), and the chip frequency is $x$, where the frequency can be a core frequency, a memory frequency, or a combination of a core frequency and a memory frequency.
  • $P(k^n, d^m, x)$ represents the operating power consumption of the chip.
  • The operating power consumption of the chip is related to the task parameters, the device information, and the chip frequency. Even if the task parameters and device information are fixed, $P$ changes when the chip frequency changes. $T(k^n, d^m, x)$ represents the task running time. Similarly, even if the task parameters and device information are fixed, $T$ changes when the chip frequency changes. The task parameters and device information of a subtask can therefore be combined with each of multiple chip frequencies to obtain the corresponding chip operating power consumption and task running time, and the optimal chip frequency for each subtask can be selected from the multiple chip frequencies based on these two quantities.
  • the optimal chip frequency can be selected according to the following formula (1):
  • The above formula (1) expresses the optimization goal of the optimization problem: achieving the lowest task running time under the chip power consumption limit (not exceeding $\mathrm{Power}_{\mathrm{Limited}}$). The optimal chip frequency is found with this optimization goal.
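  • Consistent with the description of formula (1), the optimization problem can be written as below. This is a reconstruction from the surrounding definitions of $P$, $T$, and the power limit; the symbols $x^{*}$ (the selected frequency) and $X$ (the set of optional chip frequencies) are assumed notation.

```latex
x^{*} = \arg\min_{x \in X} \; T(k^{n}, d^{m}, x)
\quad \text{subject to} \quad
P(k^{n}, d^{m}, x) \le \mathrm{Power}_{\mathrm{Limited}}
```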
  • Among the multiple optional chip frequencies that the chip can be set to, if a certain chip frequency allows the subtask to be processed with the shortest task running time while staying within the chip power consumption limit, that chip frequency is the target chip frequency of the subtask.
  • the target chip frequency of each subtask may also be different. In this step, the target chip frequency that best matches each subtask can be found, so that the subtask is completed at the fastest speed and the chip does not exceed the power consumption.
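  • The constrained selection described above can be sketched as a filter followed by a minimum. The models `toy_power` and `toy_runtime` below are made-up stand-ins for P and T, and the dictionary keys and numbers are assumptions chosen only so the example runs; the disclosure does not give concrete models.

```python
def select_target_frequency(task_params, device_info, candidates,
                            power_model, runtime_model, power_limit):
    """Pick the candidate with the lowest predicted run time within the power limit."""
    feasible = [x for x in candidates
                if power_model(task_params, device_info, x) <= power_limit]
    if not feasible:
        raise ValueError("no candidate frequency satisfies the power limit")
    # min() keeps the first of several equally fast candidates.
    return min(feasible, key=lambda x: runtime_model(task_params, device_info, x))


# Toy stand-ins for P and T (assumptions for illustration only).
def toy_power(task_params, device_info, freq_mhz):
    return 5.0 + 0.01 * freq_mhz * device_info["compute_units"]  # watts


def toy_runtime(task_params, device_info, freq_mhz):
    compute_s = task_params["flops"] / (freq_mhz * 1e6 * device_info["compute_units"])
    memory_s = task_params["bytes"] / device_info["bandwidth_bytes_per_s"]
    return max(compute_s, memory_s)  # the slower of the two dominates


device = {"compute_units": 4, "bandwidth_bytes_per_s": 1e9}
candidates = [600, 900, 1200, 1500]  # MHz, ascending

compute_bound = {"flops": 1e9, "bytes": 1e6}
memory_bound = {"flops": 1e9, "bytes": 1e9}

f_compute = select_target_frequency(compute_bound, device, candidates,
                                    toy_power, toy_runtime, power_limit=60.0)
f_memory = select_target_frequency(memory_bound, device, candidates,
                                   toy_power, toy_runtime, power_limit=60.0)
```

  • With these toy models, the compute-bound subtask is assigned the highest frequency that stays under the power limit, while the memory-bound subtask gains nothing from a faster core. Because the candidates are listed in ascending order and ties go to the first candidate, the memory-bound subtask ends up with the lowest frequency.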
  • In step 104, according to the determined target chip frequency of each subtask, the operating frequency of the chip when each subtask is executed is set.
  • The chip frequency can be set to the target chip frequency of the subtask. Alternatively, the operating frequency of the chip when the subtask is executed can be set based on both the target chip frequency and the current status information of the device or chip executing the subtask, for example, status information such as the chip temperature at the time the chip starts to execute the subtask.
  • the above process of determining the target chip frequency may be performed uniformly before executing the target task, and the target chip frequency of each of the multiple subtasks is stored, and then, in the process of running each subtask, Set the chip frequency to the target chip frequency of the subtask.
  • the target chip frequency of each subtask may also be determined before the execution of each subtask.
  • For example, the target chip frequency of the subtask may be determined based on the task parameters of the subtask and the current state information of the device or chip.
  • the determination of the target chip frequency of each subtask may be performed at different times, which is not limited in the embodiment of the present disclosure.
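  • The first variant above, in which all target frequencies are determined and stored before the target task runs, can be sketched as a simple execution loop. The callback `set_chip_frequency` is a hypothetical hook standing in for whatever driver or firmware interface the device provides; skipping a switch when the frequency is unchanged is a design choice of this sketch, motivated by the cost of frequent frequency settings.

```python
def run_target_task(subtasks, target_frequencies, set_chip_frequency):
    """Run subtasks in order, setting the chip frequency before each one.

    subtasks: list of (task_id, callable) pairs.
    target_frequencies: task_id -> target chip frequency determined in advance.
    set_chip_frequency: hypothetical callback that applies a frequency.
    """
    results = []
    current = None
    for task_id, run in subtasks:
        target = target_frequencies[task_id]
        if target != current:  # skip redundant frequency switches
            set_chip_frequency(target)
            current = target
        results.append(run())
    return results


# Usage with a recording stub in place of a real frequency interface.
applied = []
outputs = run_target_task(
    subtasks=[("conv", lambda: "a"), ("relu", lambda: "b"), ("fc", lambda: "c")],
    target_frequencies={"conv": 1200, "relu": 1200, "fc": 800},
    set_chip_frequency=applied.append,
)
```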
  • the operating frequency of the chip may include at least one of the following: the core frequency of the chip or the memory frequency.
  • The method of the embodiments of the present disclosure can be used to set only the core frequency of the chip, in which case the memory frequency can be a fixed value or set per target task. It can be used to set only the memory frequency, in which case the core frequency can be a fixed value or set per target task. It can also be used to set both the core frequency and the memory frequency of the chip, for example, by treating the core frequency and the memory frequency as a frequency combination, so that the target chip frequency of each subtask is such a combination.
  • The frequency of the memory may be the frequency of a volatile memory, such as the frequency of SDRAM (Synchronous Dynamic Random Access Memory), for example DDR SDRAM (Double Data Rate SDRAM), DDR2 SDRAM, DDR3 SDRAM, DDR4 SDRAM, or DDR5 SDRAM.
  • the frequency of the memory may also be the frequency of a non-volatile memory, such as the frequency of a flash memory.
  • The method for setting the operating frequency of the chip in the embodiments of the present disclosure determines the target chip frequency of each subtask based on the task parameters of each subtask in the target task, so that the chip can run at the target chip frequency of the subtask when executing each subtask.
  • This refined frequency setting method can achieve the optimal operating performance as much as possible when executing each subtask, thereby increasing the operating speed of the entire task.
  • Fig. 2 provides another flowchart of a method for setting the chip operating frequency in some embodiments of the present disclosure. Taking the case where the target chip frequency is jointly determined according to the task parameters of the subtasks and the device information as an example, the method exemplarily shows one way to determine the target chip frequency.
  • the method may include the following processing, wherein the same steps as those in FIG. 1 will not be described in detail.
  • In step 200, task analysis is performed on the target task to be run on the device to obtain multiple subtasks and task parameters of each of the subtasks.
  • The task parameter of each subtask may include at least one of the following: the calculation amount and the memory access amount of the subtask.
  • In step 202, the device information of the device where the chip is located is obtained.
  • The device information of the device where the chip is located may include device resource information, and the device resource information may include at least one of the following: bandwidth, number of computing units, memory capacity, and so on.
  • In step 204, a preset mapping relationship between a plurality of first data and a plurality of second data is acquired.
  • The device may pre-store mapping relationships between a plurality of first data and a plurality of second data, where the first data includes preset task parameters of the subtask and preset device information, and the second data includes a preset chip frequency.
  • The preset chip frequency may be the core frequency of the chip.
  • the preset chip frequency may be a combination of the core frequency of the chip and the memory frequency.
  • Table 1 illustrates the preset mapping relationship between multiple first data and multiple second data:
  • the following example illustrates how to establish the above-mentioned mapping relationship.
  • The values of the task parameters and device information will be within a certain range. For example, the value of the task parameter is within the range F1, and the value of the device information is within the range F2. Both F1 and F2 can be referred to as the preset first data value range.
  • k1, k2, and k3 represent task parameters, and d1, d2, and d3 represent device information.
  • the chip frequency to be set is the core frequency, it may include multiple sets of optional chip frequencies that the chip may set, for example, x1, x2, x3, and so on.
  • For each first data, a group of chip frequencies can be selected from the above-mentioned multiple groups of optional chip frequencies as the second data corresponding to that first data, and the mapping relationship between the first data and the second data can be established. The selection may be based on performance evaluation parameters.
  • The performance evaluation parameters of the first data under the condition of each group of optional chip frequencies can be determined separately. For example, one performance evaluation parameter can be obtained from the first data and one group of optional chip frequencies, and another performance evaluation parameter can be obtained from the first data and another group of optional chip frequencies. The performance evaluation parameter can be obtained through simulation, inference, or mathematical formula operation, which is not limited in the embodiments of the present disclosure.
  • A group of chip frequencies can then be selected from the multiple groups of optional chip frequencies as the second data corresponding to the first data according to the performance evaluation parameters. The performance evaluation parameters may include task processing performance parameters and chip operating power consumption; among the multiple groups of optional chip frequencies, the one whose chip operating power consumption is lower than the preset power consumption and whose task processing performance parameter is optimal may be used as the second data corresponding to the first data.
  • the aforementioned task processing performance parameters include but are not limited to task processing time.
  • the task processing performance parameter is the task running time
  • the chip frequency is the core frequency as an example.
  • In step 2041, the multiple sets of optional chip frequencies are traversed, and for each set of optional chip frequencies combined with a given set of first data, the chip operating power consumption and the task running time under those conditions are obtained.
  • For example, if there are ten sets of optional chip frequencies, a given set of first data can be combined with each of the ten sets to obtain ten values of chip operating power consumption P and ten values of task running time T.
  • the chip frequency is x1
  • the corresponding chip operating power consumption of the subtask is P1
  • the task running time is T1
  • the chip frequency is x2
  • the corresponding chip operating power consumption of the subtask is P2
  • the task running time is T2
  • P may be determined according to the power consumption model
  • T may be determined according to the runtime model.
  • the model input of the power consumption model is task parameters, device information and chip frequency
  • the model output is P.
  • the model inputs of the runtime model are task parameters, device information and chip frequency, and the model outputs T.
  • The power consumption model and the runtime model can take a variety of forms, for example, a support vector machine, a feedback neural network, a K-means clustering algorithm, and so on.
  • Taking as an example the case where both the power consumption model and the runtime model are neural network models, the models can be obtained through model training.
  • these sets are used as the training set of the neural network model.
  • The task parameters of the task and the device information of the device can be obtained, and a frequency x1 is then selected from the above device frequency set; the three are used as the input of the power consumption model.
  • The model outputs a predicted value of the operating power consumption of the chip.
  • There is a difference between this predicted value and the real value of the chip's operating power consumption under the same input task parameters, device information, and frequency x1 (the result of running on the real device).
  • Based on this difference, backpropagation is performed to train the power consumption model.
  • In the same way, the runtime model can also be trained in the above manner, with the output changed to the task running time, which will not be described in detail here.
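The training loop described above (predict, compare with the measured value, propagate the error back into the parameters) can be sketched as follows. This is a minimal illustration and not the disclosed model: it assumes a single linear layer, for which backpropagation reduces to one gradient step per sample, and it uses synthetic "measurements" from a made-up linear law. The names `train_power_model` and `predict_power` are hypothetical.

```python
from typing import List, Sequence, Tuple

# One training sample: ([task parameter, device information, chip frequency], measured power).
Sample = Tuple[Sequence[float], float]


def train_power_model(
    samples: Sequence[Sample], lr: float = 0.05, epochs: int = 2000
) -> Tuple[List[float], float]:
    """Fit power ~ w . features + b by stochastic gradient descent on the squared error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, measured in samples:
            predicted = sum(w * x for w, x in zip(weights, features)) + bias
            error = predicted - measured  # predicted value minus real value
            # The gradient of 0.5 * error**2 with respect to each weight is error * x.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict_power(model: Tuple[List[float], float], features: Sequence[float]) -> float:
    """Evaluate the fitted linear model on one (k, d, x) input."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, features)) + bias


# Synthetic data from an invented linear law, for illustration only.
samples = [
    ((k, d, x), 0.5 * k + 0.2 * d + 1.0 * x + 0.1)
    for k in (1.0, 2.0)
    for d in (1.0, 2.0)
    for x in (1.0, 2.0, 3.0)
]
model = train_power_model(samples)
print(round(predict_power(model, (2.0, 1.0, 3.0)), 3))
```

A runtime model could be fitted with the same loop by replacing the measured power with the measured task running time.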
  • Input the first data and the chip frequency to the power consumption model to obtain the operating power consumption of the chip. For example, by inputting the first data (k1, d1) and the chip frequency xi into the power consumption model, the operating power consumption Pi of the chip can be obtained. Input the first data and the chip frequency to the runtime model to obtain the task running time. For example, by inputting the first data (k1, d1) and the chip frequency xi into the runtime model, the task running time Ti can be obtained.
  • In step 2042, the chip frequency that yields the optimal chip operating power consumption and task running time is selected as the second data corresponding to the first data.
  • the chip frequency with the lowest task running time within the limit of chip running power consumption can be selected as the second data.
  • The chip frequency with the smallest Ti, for example x1, is selected as the second data corresponding to the first data (k1, d1).
  • step 2043 a mapping relationship between the first data and the second data is established.
  • each set of first data can obtain the second data corresponding to the first data, that is, the optimal chip frequency, according to the process shown in FIG. 3.
  • the second data corresponding to each first data in the multiple first data can be obtained, and each mapping relationship is established accordingly, thereby obtaining a set of mapping relationships.
  • The set may include multiple mapping relationships, each of which includes one first data and its corresponding second data.
  • the set of mapping relationships may be the preset mapping relationships described in step 204, as shown in Table 1 as an example.
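Steps 2041 to 2043 can be sketched as a small table-building routine. This is a hedged illustration under simplifying assumptions: the task parameter k and device information d are scalars, the chip frequency is a single core frequency, and `toy_power` and `toy_runtime` are invented stand-ins for the trained power consumption and runtime models. All function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence, Tuple

FirstData = Tuple[float, float]  # (task parameter k, device information d), scalars for brevity
Model = Callable[[FirstData, float], float]


def select_frequency(
    first_data: FirstData,
    candidates: Sequence[float],
    power_model: Model,
    runtime_model: Model,
    power_limit: float,
) -> Optional[float]:
    """Steps 2041 and 2042: traverse candidates, keep the fastest one under the power limit."""
    best_freq, best_time = None, float("inf")
    for freq in candidates:
        power = power_model(first_data, freq)
        runtime = runtime_model(first_data, freq)
        if power < power_limit and runtime < best_time:
            best_freq, best_time = freq, runtime
    return best_freq


def build_mapping(
    samples: Iterable[FirstData],
    candidates: Sequence[float],
    power_model: Model,
    runtime_model: Model,
    power_limit: float,
) -> Dict[FirstData, float]:
    """Step 2043: map every sampled first data to its selected frequency (second data)."""
    mapping: Dict[FirstData, float] = {}
    for first_data in samples:
        freq = select_frequency(first_data, candidates, power_model, runtime_model, power_limit)
        if freq is not None:  # skip first data for which no candidate meets the limit
            mapping[first_data] = freq
    return mapping


def toy_power(first_data: FirstData, freq: float) -> float:
    k, _ = first_data
    return 0.1 * k * freq  # invented: larger tasks at higher frequency draw more power


def toy_runtime(first_data: FirstData, freq: float) -> float:
    k, d = first_data
    return k / (d * freq)  # invented: running time falls as frequency rises


table = build_mapping(
    [(1.0, 1.0), (2.0, 1.0), (4.0, 2.0)],
    [1.0, 2.0, 3.0, 4.0],
    toy_power,
    toy_runtime,
    power_limit=0.7,
)
print(table)
```

Building the table offline keeps the run-time cost of choosing a frequency down to a lookup, which matches the motivation given for the preset mapping relationship.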
  • step 206 the chip frequency corresponding to the subtask in the preset mapping relationship is determined as the target chip frequency according to the preset mapping relationship, the device information, and the task parameters of the subtask.
  • the task parameter of a certain subtask is k1
  • the device information corresponding to the subtask is d1.
  • the target chip frequency can be obtained as x1.
  • This subtask can be called the first subtask.
  • Because the first data in the mapping relationship table is discrete, the task parameters and device information of the first subtask are sometimes not completely consistent with any first data in the mapping relationship. In this case, as an optional implementation, the first data closest to the device information and task parameters of the first subtask can be found in the mapping relationship table, and the chip frequency corresponding to that closest first data is used as the target chip frequency of the first subtask.
  • the distance between the third data and each first data in the mapping relationship can be calculated.
  • The distance can be calculated by treating the third data and each first data as vectors and calculating the distance between the vectors; for example, it can be the Euclidean distance between the vector of the third data and the vector of each first data, or another distance.
  • The chip frequency corresponding to the first data closest to the third data can be used as the target chip frequency of the first subtask.
  • the above-mentioned first data closest to the third data may be the closest downward, and the closest downward refers to taking the first data corresponding to the lower chip frequency as much as possible.
  • For example, if the first data Y1 and the first data Y2 are at equal distances from the third data and the chip frequency corresponding to the first data Y1 is lower than the chip frequency corresponding to the first data Y2, the chip frequency corresponding to Y1 can be used. If the distances between the first data Y1 and the first data Y2 and the third data are not equal, the first data at the closer distance can still be selected as the target first data.
  • the first data corresponding to the lower chip frequency is preferentially selected.
  • The preset range may be set according to actual requirements, which is not limited in the embodiments of the present disclosure.
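The nearest-entry lookup with a preference for the lower frequency can be sketched as below. This is an illustrative reading rather than the disclosed procedure: it assumes Euclidean distance, and it interprets the "preset range" as a tolerance on the distance gap within which candidates count as tied. The function name `lookup_target_frequency` and the parameter `tie_tolerance` are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, ...]


def lookup_target_frequency(
    mapping: Dict[Vector, float],
    third_data: Sequence[float],
    tie_tolerance: float = 0.0,
) -> float:
    """Return the frequency of the first data nearest to third_data.

    Entries whose distance exceeds the minimum by at most tie_tolerance are
    treated as ties, and the lowest frequency among them is chosen.
    """
    distances = {first: math.dist(first, third_data) for first in mapping}
    nearest = min(distances.values())
    tied = [first for first, dist in distances.items() if dist - nearest <= tie_tolerance]
    return min(mapping[first] for first in tied)


# Invented table: first data (k, d) mapped to a chip frequency in MHz.
mapping = {(1.0, 1.0): 800.0, (3.0, 1.0): 1200.0, (5.0, 1.0): 1000.0}

print(lookup_target_frequency(mapping, (2.0, 1.0)))       # equidistant from two entries
print(lookup_target_frequency(mapping, (3.4, 1.0)))       # strict nearest entry
print(lookup_target_frequency(mapping, (3.4, 1.0), 1.5))  # lower frequency within the tolerance
```

With a zero tolerance the rule only breaks exact ties, as in the Y1 and Y2 example; a positive tolerance trades a slightly worse match for a lower frequency.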
  • step 208 according to the determined target chip frequency of each subtask, the operating frequency of the chip to execute each subtask is set.
  • the frequency of the chip can be set as the target chip frequency of the subtask.
  • the task to be executed is decomposed to obtain multiple subtasks, and the target chip frequency of each subtask is obtained respectively.
  • This refined frequency setting method allows each subtask of the task to achieve the best possible operating performance, thereby improving the running speed of the entire task.
  • by pre-establishing the mapping relationship between task parameters, device information and chip frequency it is possible to speed up the determination of the chip frequency and improve the performance of the device.
  • FIG. 4 provides another flow chart of the method for setting the working frequency of the chip according to some embodiments of the present disclosure.
  • Setting the combined frequency of the chip is taken as an example for description, and the obtained device information may also include the chip temperature of the chip, which can be collected by a temperature sensor in the device where the chip is located.
  • the target task to be performed is a three-layer neural network model inference task.
  • the power consumption model and runtime model obtained by training may be completely different.
  • the model can be determined by offline training for the specific end device where the target task is running. After the power consumption model and the runtime model are determined, the determined model and the optimization problem solving engine can be stored in the end device.
  • the processing performed by the optimization problem solving engine can be referred to the process of establishing the preset mapping relationship in the process description shown in Figure 2.
  • The optimization problem solving engine can obtain, from the sampled multiple sets of first data and the determined models, the chip operating power consumption and task running time corresponding to each chip frequency, and then select the optimal chip frequency according to formula (1), thereby establishing the mapping relationship between the first data and the second data.
  • the device may execute the establishment of the preset mapping relationship, or another device may execute the process of establishing the preset mapping relationship, and store the pre-established mapping relationship in the device where the target task is running. The device can directly look up the table when running the target task.
  • step 400 task analysis is performed on the target task to be run on the device, and multiple subtasks and task parameters of each of the subtasks are obtained.
  • For example, if the target task is an inference task of a three-layer neural network model, each layer can be a subtask.
  • a subtask list can be obtained.
  • the subtask list includes three subtasks: subtask 1, subtask 2, and subtask 3.
  • this step also analyzes and obtains the task parameters of each subtask, for example, the amount of memory access and the amount of calculation of the subtask.
  • The subtask list may include: <subtask 1, task parameter 1>, <subtask 2, task parameter 2>, <subtask 3, task parameter 3>.
  • the task parameters can be expressed as k n (n represents n-dimensional space, such as the amount of memory access and the amount of calculation).
  • Subtask 1, subtask 2, and subtask 3 can be regarded as the first subtask, respectively.
  • step 402 the device information of the device where the chip is located is obtained.
  • the device information may include device resource information.
  • the subtask 1, the subtask 2 and the subtask 3 are in a serial relationship, and the device resource information occupied by these subtasks during operation may be the same.
  • the bandwidth, the number of computing units, etc. can be the same.
  • Device information can be expressed as d m (m represents m-dimensional space, such as bandwidth, memory capacity, etc.).
  • In step 404, when it is determined that subtask 1 is about to start executing, the chip temperature C1 of the chip is collected, and the target chip frequency to be set for the chip on the device when subtask 1 runs is determined according to the task parameters of subtask 1, the device resource information, and the chip temperature C1.
  • the device information may also include the chip temperature of the chip, because an excessively high temperature during the execution of the task will also cause the chip to take measures to reduce the frequency.
  • the device information in the embodiments of the present disclosure not only includes device resource information such as the number of computing units, bandwidth, and memory capacity, but also includes chip temperature.
  • the chip temperature may be dynamically acquired, that is, before each subtask in the running process of the target task starts to execute, the chip temperature corresponding to the subtask is acquired, and the target chip frequency of the subtask is determined accordingly.
  • step 406 the working frequency of the chip is set to the target chip frequency of subtask 1, and the execution of subtask 1 is started.
  • In step 408, when it is determined that subtask 2 is about to start executing, the chip temperature C2 of the chip is collected, and the target chip frequency to be set for the chip on the device when subtask 2 runs is determined according to the task parameters of subtask 2, the device resource information, and the chip temperature C2.
  • Because the temperature of the chip changes while the target task is running, the latest chip temperature can be collected before each subtask is executed, and the target chip frequency of that subtask can be determined in combination with this chip temperature.
  • The chip frequency in the embodiments of the present disclosure may be a combined frequency or one of the frequencies in the combination; with a combined frequency, the core frequency and the memory frequency of the chip are set at the same time.
  • the core frequency is x and the memory frequency is y.
  • When the task parameters and device information of the subtask are (k1, d1), the corresponding chip frequency can be (x1, y1); when they are (k2, d2), the corresponding chip frequency can be (x2, y2).
  • The input of the power consumption model can include k1, d1, x1, and y1, and the output of the model can be the chip's operating power consumption P. In the same way, the input of the runtime model can include k1, d1, x1, and y1, and the output of the model can be the task running time T.
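When the chip frequency is a combined frequency, the candidate set is every pairing of a core frequency x with a memory frequency y, and both models take (k, d, x, y) as input. The sketch below assumes scalar k and d and uses invented toy formulas in place of the trained models; `combined_candidates` and `select_combined` are hypothetical names.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple

Combined = Tuple[float, float]  # (core frequency x, memory frequency y)
Model4 = Callable[[float, float, float, float], float]  # (k, d, x, y) -> value


def combined_candidates(core: Sequence[float], memory: Sequence[float]) -> List[Combined]:
    """Enumerate every (core, memory) frequency pair."""
    return list(product(core, memory))


def select_combined(
    k: float,
    d: float,
    candidates: Sequence[Combined],
    power_model: Model4,
    runtime_model: Model4,
    power_limit: float,
) -> Optional[Combined]:
    """Pick the pair with the lowest predicted running time under the power limit."""
    feasible = [
        (runtime_model(k, d, x, y), (x, y))
        for x, y in candidates
        if power_model(k, d, x, y) < power_limit
    ]
    return min(feasible)[1] if feasible else None


def toy_power(k: float, d: float, x: float, y: float) -> float:
    return 0.4 * x + 0.2 * y  # invented: core frequency dominates power draw


def toy_runtime(k: float, d: float, x: float, y: float) -> float:
    return k / x + k / (d * y)  # invented: compute time plus memory access time


pairs = combined_candidates([1.0, 2.0], [1.0, 2.0])
best = select_combined(8.0, 2.0, pairs, toy_power, toy_runtime, power_limit=1.1)
print(best)
```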
  • the device information may include the chip temperature.
  • step 410 the working frequency of the chip is set to the target chip frequency of subtask 2, and the execution of subtask 2 is started.
  • In step 412, when it is determined that subtask 3 is about to be executed, the chip temperature C3 of the chip on the device is collected, and the target chip frequency to be set for the chip on the device when subtask 3 runs is determined according to the task parameters of subtask 3, the device resource information, and the chip temperature C3.
  • step 414 the operating frequency of the chip is set to the target chip frequency of subtask 3, and the execution of subtask 3 is started.
  • the operating frequency of the device chip can be reset to the default value, and the execution of the target task ends.
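The flow of steps 404 to 414 can be summarized as a loop over the subtask list. The sketch below is an assumption-laden illustration: the temperature sensor, the frequency policy, and the frequency setter are passed in as plain callables, and the threshold policy in the demo (lower the frequency above 80 degrees) is invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

# (name, task parameters, callable that executes the subtask)
Subtask = Tuple[str, float, Callable[[], None]]


def run_target_task(
    subtasks: Sequence[Subtask],
    device_resources: float,
    read_temperature: Callable[[], float],
    determine_frequency: Callable[[float, float, float], float],
    set_frequency: Callable[[float], None],
    default_frequency: float,
) -> None:
    """Before each subtask: sample temperature, pick a frequency, apply it, then run."""
    for _name, task_params, execute in subtasks:
        temperature = read_temperature()  # latest chip temperature
        target = determine_frequency(task_params, device_resources, temperature)
        set_frequency(target)
        execute()
    set_frequency(default_frequency)  # reset once every subtask has finished


# Demo with fake hardware hooks that record what happened.
log: List[str] = []
temperatures = iter([50.0, 70.0, 90.0])  # invented readings C1, C2, C3

run_target_task(
    subtasks=[
        (f"subtask {i}", float(i), lambda i=i: log.append(f"run subtask {i}"))
        for i in (1, 2, 3)
    ],
    device_resources=4.0,
    read_temperature=lambda: next(temperatures),
    determine_frequency=lambda k, d, c: 1000.0 if c < 80.0 else 600.0,
    set_frequency=lambda f: log.append(f"set {f:.0f}"),
    default_frequency=800.0,
)
print(log)
```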
  • the task to be executed is decomposed to obtain multiple subtasks, and the target chip frequency of each subtask is obtained respectively.
  • This refined frequency setting method allows each subtask of the task to achieve the best possible operating performance, thereby improving the running speed of the entire task.
  • Dynamically collecting the chip temperature during the execution of the first subtask and taking it into account when determining the target chip frequency of the first subtask makes the determination of the chip operating frequency more comprehensive, so the frequency setting is more reasonable and the operating performance of the device is further improved.
  • the end device may also provide a user interface through which the input configuration information for the preset mapping relationship can be received.
  • The user can calculate or estimate offline the target chip frequency that each subtask should use, and store the preset mapping relationship between each subtask and its target chip frequency in the device, so that at run time the device only needs to set the working frequency of the chip according to the configured preset mapping relationship when each subtask is executed.
  • the aforementioned user interface can also be used to receive frequency setting strategy information.
  • The frequency setting strategy information may include, for example, a model for determining chip operating power consumption and task running time according to task parameters and device information, or a rule for selecting the target chip frequency of a given subtask according to the power consumption model and the task runtime model. Based on the frequency setting strategy information and the task parameters of each of the multiple subtasks, the target chip frequency corresponding to the chip on the device when each subtask is running can be determined.
  • the frequency setting strategy information may also include: turning on or turning off the chip frequency dynamic setting function for subtasks.
  • When the frequency setting strategy information includes turning on the chip frequency dynamic setting function for subtasks, the device can determine the target chip frequency for each subtask of the target task according to the method for setting the chip operating frequency described in the foregoing embodiments of the present disclosure. For example, the device can collect the task parameters of the subtasks, the chip temperature, and the device resource information, and select the target chip frequency based on this information, the power consumption model, and the task runtime model.
  • When the frequency setting strategy information includes turning off the chip frequency dynamic setting function for subtasks, the device can directly obtain the target chip frequency that each subtask should use from the preset mapping relationship that was calculated offline and configured through the user interface.
  • a more flexible chip frequency setting method can be provided, and the user can update the chip frequency determination method more conveniently, making the chip frequency determination more reasonable and faster.
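The on/off switch in the frequency setting strategy information can be pictured as a simple dispatch between dynamic determination and a user-configured table. The sketch below is illustrative only; the `FrequencyStrategy` class, its field names, and the per-subtask table keyed by subtask name are all assumptions rather than structures defined in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class FrequencyStrategy:
    """Hypothetical container for strategy information received via the user interface."""

    dynamic_setting_enabled: bool
    # Subtask name -> target frequency calculated offline by the user.
    preset_mapping: Dict[str, float] = field(default_factory=dict)


def target_frequency(
    strategy: FrequencyStrategy,
    subtask_name: str,
    determine_dynamically: Callable[[str], float],
) -> float:
    """Use the dynamic method when enabled, otherwise the offline-configured table."""
    if strategy.dynamic_setting_enabled:
        return determine_dynamically(subtask_name)
    return strategy.preset_mapping[subtask_name]


def dynamic(_name: str) -> float:
    return 900.0  # stand-in for the model-based determination


offline = FrequencyStrategy(False, {"subtask 1": 1000.0, "subtask 2": 600.0})
online = FrequencyStrategy(True)

print(target_frequency(offline, "subtask 2", dynamic))
print(target_frequency(online, "subtask 2", dynamic))
```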
  • FIG. 5 is an exemplary structure diagram of a device for setting the operating frequency of a chip provided by an embodiment of the present disclosure. As shown in FIG. 5, the device may include: an acquisition module 51, a frequency control module 52, and a frequency setting module 53.
  • The acquisition module 51 is configured to obtain multiple subtasks of the target task and task parameters of each subtask, where the task parameters include parameters for representing the operation scale of the subtasks.
  • the task parameter includes at least one of the following: a calculation amount of the subtask, and a memory access amount of the subtask.
  • the frequency control module 52 is configured to determine the target chip frequency of each subtask based on the task parameter of each subtask in the multiple subtasks.
  • the frequency setting module 53 is configured to set the operating frequency of the chip to execute each subtask according to the determined target chip frequency of each subtask.
  • The device may further include a task analysis module 54, configured to perform task analysis processing on the target task to obtain the multiple subtasks and the task parameters of each subtask, and to store the correspondence between each of the multiple subtasks and its task parameters.
  • When used to obtain multiple subtasks of the target task and the task parameters of each subtask, the acquisition module 51 is configured to look up the task parameters of each subtask in the stored correspondence.
  • The task analysis module 54 can first perform task analysis processing on the target task, for example parsing it to obtain multiple subtasks, and then analyze each of the multiple subtasks separately to obtain the task parameters of each subtask.
  • the task analysis module 54 can store the corresponding relationship between each subtask obtained by the analysis and its task parameters.
  • The acquisition module 51 may be responsible for controlling the running process of the target task. For example, the acquisition module 51 may obtain the correspondence parsed and stored by the task analysis module 54 described above, and control the execution of each subtask in it one by one. For example, assuming there are three subtasks, when the first subtask is to be executed, the acquisition module 51 can obtain the task parameters of the first subtask from the correspondence and send the task parameters, together with the device information of the device where the chip is located, to the frequency control module 52, which determines the target chip frequency of the first subtask according to the task parameters and the device information.
  • the frequency control module 52 may send the determined target chip frequency to the frequency setting module 53, and the frequency setting module 53 may set the operating frequency of the chip to the target chip frequency.
  • After setting the working frequency of the chip, the frequency setting module 53 may send a feedback signal to the acquisition module 51 to notify it that the frequency setting has been completed. The acquisition module 51 can then start to execute the first subtask according to the feedback signal.
  • The acquisition module 51 can then prepare to execute the second subtask. Similarly, before the second subtask is executed, the acquisition module 51 can obtain the task parameters of the second subtask from the correspondence obtained by the task analysis module 54 and send them to the frequency control module 52 to determine the target chip frequency of the second subtask. After the frequency setting module 53 feeds back that the chip frequency setting is completed, the acquisition module 51 starts to execute the second subtask.
  • the implementation of the third subtask is similar, and will not be detailed again.
  • After the acquisition module 51 confirms that all the subtasks of the target task have been executed, it can send a frequency reset signal to the frequency control module 52, which in turn sends a frequency reset signal to the frequency setting module 53, and the frequency setting module 53 can set the operating frequency of the chip to the default value. At this point, the execution of the target task is over.
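The interaction between modules 51, 52, and 53 (send parameters, set frequency, wait for the feedback signal, execute, and finally reset) can be sketched with three small classes. This is a hypothetical arrangement for illustration: the class and method names are invented, the feedback signal is modeled as a boolean return value, and the stored correspondence from the task analysis module is a plain dictionary.

```python
from typing import Callable, Dict, List, Tuple


class FrequencySettingModule:
    """Stand-in for module 53: applies a frequency and reports completion."""

    def __init__(self, default_frequency: float) -> None:
        self.default_frequency = default_frequency
        self.current_frequency = default_frequency

    def set(self, frequency: float) -> bool:
        self.current_frequency = frequency
        return True  # feedback signal: frequency setting completed

    def reset(self) -> bool:
        return self.set(self.default_frequency)


class FrequencyControlModule:
    """Stand-in for module 52: decides the target frequency and forwards it."""

    def __init__(self, setting: FrequencySettingModule,
                 determine: Callable[[float, float], float]) -> None:
        self.setting = setting
        self.determine = determine  # hypothetical policy: (task params, device info) -> frequency

    def configure(self, task_params: float, device_info: float) -> bool:
        return self.setting.set(self.determine(task_params, device_info))

    def reset(self) -> bool:
        return self.setting.reset()


class AcquisitionModule:
    """Stand-in for module 51: drives the subtasks one by one."""

    def __init__(self, correspondence: Dict[str, float],
                 control: FrequencyControlModule, device_info: float) -> None:
        self.correspondence = correspondence  # subtask -> task parameters, from task analysis
        self.control = control
        self.device_info = device_info

    def run(self, execute: Callable[[str], None]) -> None:
        for name, task_params in self.correspondence.items():
            if self.control.configure(task_params, self.device_info):
                execute(name)  # start only after the feedback signal arrives
        self.control.reset()  # frequency reset after all subtasks finish


setting = FrequencySettingModule(default_frequency=800.0)
control = FrequencyControlModule(setting, lambda k, d: 1000.0 if k > 5 else 600.0)
acquisition = AcquisitionModule({"subtask 1": 10.0, "subtask 2": 2.0, "subtask 3": 8.0}, control, 4.0)

trace: List[Tuple[str, float]] = []
acquisition.run(lambda name: trace.append((name, setting.current_frequency)))
print(trace, setting.current_frequency)
```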
  • When used to determine the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks, the frequency control module 52 is configured to: acquire device information of the device where the chip is located, the device information including device resource information; and determine the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks.
  • the device resource information includes any one or more of the following: the number of computing units, bandwidth, and memory capacity.
  • The device information further includes the chip temperature of the chip. The frequency control module 52 is further configured to determine the target chip frequency of the first subtask based on the task parameters of the first subtask among the plurality of subtasks, the device resource information, and the chip temperature during execution of the first subtask.
  • The frequency control module 52 is configured to: obtain a preset mapping relationship between a plurality of first data and a plurality of second data, where the first data includes preset task parameters and preset device information and the second data includes a preset chip frequency; and determine the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
  • When used to determine the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask, the frequency control module 52 is configured to: determine the distance between third data and each first data in the plurality of first data in the preset mapping relationship, where the third data includes the task parameters and device information of the first subtask among the plurality of subtasks; and use the preset chip frequency corresponding to the target first data closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
  • The frequency control module 52 is further configured to, before obtaining the preset mapping relationship between the multiple first data and the multiple second data: obtain multiple sets of selectable chip frequencies and a plurality of discrete first data obtained by sampling; for each first data in the plurality of first data, select a set of chip frequencies from the multiple sets of selectable chip frequencies as the second data corresponding to that first data; and establish a mapping relationship between each first data and the selected second data.
  • The preset chip frequency corresponding to the first data in the preset mapping relationship is selected from multiple sets of selectable chip frequencies based on the performance evaluation parameters corresponding to the first data under each set of selectable chip frequencies.
  • the performance evaluation parameters include task processing performance parameters and chip operating power consumption;
  • The preset chip frequency corresponding to the first data in the preset mapping relationship is the selectable chip frequency, among the multiple sets of selectable chip frequencies, whose chip operating power consumption is lower than the preset power consumption and whose task processing performance parameter is the best.
  • the device may further include: an interface module 55, configured to receive configuration information of the preset mapping relationship input by the user.
  • When used to determine the target chip frequency of each subtask based on the task parameters of each of the plurality of subtasks, the frequency control module 52 is configured to: based on the task parameters of each subtask, select from multiple sets of selectable chip frequencies, as the target chip frequency, the chip frequency that enables the chip to achieve the lowest task running time within the chip power consumption limit.
  • The interface module 55 is further configured to receive frequency setting strategy information input by the user; the frequency control module 52 is further configured to determine the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each of the multiple subtasks.
  • the frequency setting strategy information includes enabling or disabling the chip frequency dynamic setting function for subtasks.
  • the operating frequency includes at least one of the following: a core frequency of the chip or a memory frequency.
  • the above-mentioned apparatus may be used to execute any corresponding method described above, and for the sake of brevity, it will not be repeated here.
  • the electronic device 700 includes a memory 710 and a processor 720.
  • The memory 710 is used to store machine-readable instructions 711, and the processor 720 is used to call the machine-readable instructions 711 to implement the method for setting the operating frequency of the chip in any embodiment of this specification.
  • the target chip for chip operating frequency setting may be the memory 710 and/or the processor 720.
  • the electronic device 700 may further include a communication interface 730 and a bus 740.
  • the memory 710, the processor 720, and the communication interface 730 are connected to each other through a bus 740.
  • the electronic device 700 may further include a chip 750 configured to process each subtask in the target task based on the operating frequency set by the processor 720.
  • the embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the processor is caused to implement the method for setting the operating frequency of a chip according to any embodiment of this specification.
  • the computer-readable storage medium may be the memory 710 in FIG. 7.
  • One or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • “and/or” in the embodiments of the present disclosure means having at least one of the two; for example, “A and/or B” includes three schemes: A, B, and "A and B".
  • The embodiments of the subject matter and functional operations described in the present disclosure can be implemented in digital electronic circuits, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in the present disclosure and their structural equivalents, or a combination of one or more of them.
  • The embodiments of the subject matter described in the present disclosure may be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible non-transitory program carrier to be executed by a data processing device or to control the operation of the data processing device.
  • The program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information and transmit it to a suitable receiver device for execution by the data processing device.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • the processing and logic flow described in the present disclosure can be executed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating according to input data and generating output.
  • the processing and logic flow can also be executed by a dedicated logic circuit, such as FPGA (Field Programmable Gate Array) or ASIC (Application Specific Integrated Circuit), and the device can also be implemented as a dedicated logic circuit.
  • Computers suitable for executing computer programs include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from a read-only memory and/or a random access memory.
  • the basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • A computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or the computer will be operatively coupled to such a mass storage device to receive data from it, transmit data to it, or both.
  • the computer does not have to have such equipment.
  • The computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • The processor and the memory can be supplemented by or incorporated into a dedicated logic circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Power Sources (AREA)

Abstract

Embodiments of the present disclosure provide a method and device for configuring the operating frequency of a chip. The method comprises: acquiring multiple subtasks of a target task and task parameters of each subtask, wherein the task parameters comprise a parameter indicating the scale of computation of the subtask; determining, on the basis of the task parameters of each of the multiple subtasks, a target chip frequency of each subtask; and configuring, according to the determined target chip frequencies corresponding to the respective subtasks, operating frequencies of a chip when executing the respective subtasks.

Description

Setting of chip operating frequency
Technical field
The present disclosure relates to smart device technology, and in particular to a method and an apparatus for setting the operating frequency of a chip.
Background art
With the development of 5G and artificial intelligence technology, end devices (for example, smartphones and smart cameras) face growing computing demands and need to perform more tasks. However, most chips used for computation in end devices are subject to power consumption limits and cannot fully exploit the peak computing capability of the device. Frequency reduction is currently one of the main ways to keep a chip from exceeding its power limit: it reduces the chip's power consumption by temporarily lowering the chip's operating frequency.
In some technologies, an initial operating frequency can be set before an application is run on the end device, and the chip runs at this initial operating frequency to perform the computation for the application. While the application is running, when the chip's power consumption exceeds the limit or a similar situation occurs, a fixed frequency reduction strategy can be applied to the chip, for example, lowering the chip's operating frequency by a fixed ratio or lowering it to a fixed value.
Summary of the invention
Embodiments of the present disclosure provide at least a method and an apparatus for setting the operating frequency of a chip.
In a first aspect, a method for setting the operating frequency of a chip is provided. The method includes: acquiring multiple subtasks of a target task and task parameters of each subtask, the task parameters including a parameter indicating the computation scale of the subtask; determining a target chip frequency of each subtask based on the task parameters of each of the multiple subtasks; and setting, according to the determined target chip frequency of each subtask, the operating frequency at which the chip executes that subtask.
According to any embodiment of the present disclosure, the method further includes: performing task parsing on the target task to obtain the multiple subtasks and the task parameters of each subtask; and storing a correspondence between each of the multiple subtasks and the task parameters of that subtask. Acquiring the multiple subtasks of the target task and the task parameters of each subtask includes: looking up the task parameters of each subtask in the stored correspondence.
According to any embodiment of the present disclosure, the task parameters include at least one of the following: the computation amount of the subtask, and the memory access amount of the subtask.
According to any embodiment of the present disclosure, determining the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks includes: acquiring device information of the device where the chip is located, the device information including device resource information; and determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks.
According to any embodiment of the present disclosure, the device resource information includes any one or more of the following: the number of computing units, bandwidth, and memory capacity.
According to any embodiment of the present disclosure, the device information further includes the chip temperature of the chip. Determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks includes: determining the target chip frequency of a first subtask among the multiple subtasks based on the task parameters of the first subtask, the device resource information, and the chip temperature when the first subtask is executed.
According to any embodiment of the present disclosure, determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks includes: acquiring a preset mapping relationship between multiple pieces of first data and multiple pieces of second data, the first data including preset task parameters and preset device information, and the second data including a preset chip frequency; and determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
According to any embodiment of the present disclosure, determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask includes: determining the distance between third data and each of the multiple pieces of first data in the preset mapping relationship, the third data including the task parameters of a first subtask among the multiple subtasks and the device information; and taking the preset chip frequency corresponding to the target first data that is closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
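The nearest-distance lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which distance is used, so Euclidean distance is assumed here, and the feature layout, the name `nearest_preset_frequency`, and all numeric values are hypothetical.

```python
import math

# Hypothetical preset mapping: first data (task parameters + device information)
# -> preset chip frequency in MHz. All values are made up for illustration.
# Key layout: (compute_amount_gflop, memory_access_mb, compute_units, bandwidth_gbps)
PRESET_MAPPING = {
    (1.0, 50.0, 4, 25.0): 800,
    (10.0, 50.0, 4, 25.0): 1200,
    (10.0, 500.0, 4, 25.0): 1000,
}


def nearest_preset_frequency(third_data, mapping):
    """Return the preset frequency whose first data is closest to third_data.

    Euclidean distance is an assumption; the source only speaks of a "distance".
    """
    closest = min(mapping, key=lambda first_data: math.dist(first_data, third_data))
    return mapping[closest]


# Third data of a subtask: its task parameters followed by the device information.
target = nearest_preset_frequency((9.0, 60.0, 4, 25.0), PRESET_MAPPING)
print(target)
```

In practice the features would probably need to be normalized before distances are computed, since raw values on very different scales would let one dimension dominate.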
According to any embodiment of the present disclosure, before acquiring the preset mapping relationship between the multiple pieces of first data and the multiple pieces of second data, the method further includes: acquiring multiple sets of selectable chip frequencies, and acquiring multiple discrete pieces of first data obtained by sampling; and, for each piece of first data, selecting one set of chip frequencies from the multiple sets of selectable chip frequencies as the second data corresponding to that first data, and establishing a mapping relationship between that first data and the selected second data.
According to any embodiment of the present disclosure, the preset chip frequency corresponding to the first data in the preset mapping relationship is selected from multiple sets of selectable chip frequencies based on the performance evaluation parameters corresponding to the first data under each of the sets of selectable chip frequencies.
According to any embodiment of the present disclosure, the performance evaluation parameters include a task processing performance parameter and the chip operating power consumption. The preset chip frequency corresponding to the first data in the preset mapping relationship is the selectable chip frequency, among the multiple sets of selectable chip frequencies, for which the chip operating power consumption is lower than a preset power consumption and the task processing performance parameter is optimal.
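One way to picture how such a preset mapping might be built offline is the sketch below. It assumes that the task processing performance parameter is the run time (shorter is better) and that an `evaluate` callback returns run time and power for a sample at a given frequency. The callback, the toy model behind it, and all numbers are hypothetical and not taken from the source.

```python
def build_preset_mapping(samples, candidate_freqs, evaluate, power_limit):
    """Map each sampled first data to the best candidate frequency.

    evaluate(sample, freq) is a hypothetical callback returning
    (run_time, power); on a real device it might be a measurement.
    """
    mapping = {}
    for sample in samples:
        best_freq, best_time = None, float("inf")
        for freq in candidate_freqs:
            run_time, power = evaluate(sample, freq)
            # Keep only frequencies below the preset power, then prefer the shortest time.
            if power < power_limit and run_time < best_time:
                best_freq, best_time = freq, run_time
        if best_freq is not None:
            mapping[sample] = best_freq
    return mapping


# Toy evaluation model (an assumption): run time falls with frequency,
# power grows with frequency and with the amount of computation.
def toy_evaluate(sample, freq_mhz):
    compute_gflop, _memory_mb = sample
    run_time = compute_gflop / freq_mhz
    power = 0.002 * freq_mhz + 0.1 * compute_gflop
    return run_time, power


samples = [(1.0, 50.0), (10.0, 50.0)]
mapping = build_preset_mapping(samples, [600, 800, 1000, 1200], toy_evaluate, power_limit=3.1)
print(mapping)
```

With this toy model the light sample can run at the highest frequency, while the heavy sample is held to a lower one by the power limit.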
According to any embodiment of the present disclosure, the method further includes: receiving configuration information for the preset mapping relationship input by a user.
According to any embodiment of the present disclosure, determining the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks includes: selecting, based on the task parameters corresponding to each of the multiple subtasks, from multiple sets of selectable chip frequencies, the chip frequency that enables the chip to achieve the shortest task running time under the chip power consumption limit, as the target chip frequency.
According to any embodiment of the present disclosure, the method further includes: receiving frequency setting strategy information input by a user. Determining the target chip frequency of each subtask based on the task parameters corresponding to each of the multiple subtasks includes: determining the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each of the multiple subtasks.
According to any embodiment of the present disclosure, the frequency setting strategy information includes enabling or disabling the dynamic chip frequency setting function for subtasks.
According to any embodiment of the present disclosure, the operating frequency includes at least one of the following: the core frequency of the chip, or the memory frequency.
In a second aspect, an apparatus for setting the operating frequency of a chip is provided. The apparatus includes: an acquisition module configured to acquire multiple subtasks of a target task and task parameters of each subtask, the task parameters including a parameter indicating the computation scale of the subtask; a frequency control module configured to determine the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks; and a frequency setting module configured to set, according to the determined target chip frequency of each subtask, the operating frequency at which the chip executes that subtask.
In a third aspect, an electronic device is provided, including a memory and a processor. The memory is configured to store machine-readable instructions, and the processor is configured to invoke the machine-readable instructions to implement the method of the first aspect of the present disclosure.
According to any embodiment of the present disclosure, the device further includes: a chip configured to process each subtask of the target task at the operating frequency set by the processor.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the program causes the processor to implement the method of the first aspect of the present disclosure.
The method and apparatus for setting the operating frequency of a chip provided by the embodiments of the present disclosure determine, based on the task parameters of each subtask of the target task, the target chip frequency of that subtask, so that the chip runs at the subtask's target chip frequency when executing it. This fine-grained way of setting the frequency allows each subtask to be executed with operating performance as close to optimal as possible, thereby increasing the running speed of the whole task.
Description of the drawings
Fig. 1 shows a flowchart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Fig. 2 shows another flowchart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Fig. 3 shows another flowchart of the process of establishing a preset mapping relationship provided by some embodiments of the present disclosure.
Fig. 4 shows another flowchart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Fig. 5 shows a block diagram of an apparatus for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Fig. 6 shows another block diagram of an apparatus for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Fig. 7 shows a block diagram of an electronic device for setting the operating frequency of a chip provided by some embodiments of the present disclosure.
Detailed description
To help those skilled in the art better understand the technical solutions in one or more embodiments of the present disclosure, these technical solutions are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.
Embodiments of the present disclosure provide a method for setting the operating frequency of a chip in an end device. The method aims to give the device good operating performance when the chip runs at the set operating frequency, for example, so that tasks running on the device complete faster. The chip may be an artificial intelligence (AI) chip, for example, an AI chip used in a smartphone, in an autonomous vehicle, or in an edge device of the Internet of Things. It may also be another type of chip, such as a CPU (Central Processing Unit) chip, a DSP (Digital Signal Processing) chip, or a memory chip, which is not limited in the embodiments of the present disclosure.
Fig. 1 shows a flowchart of a method for setting the operating frequency of a chip provided by some embodiments of the present disclosure. As shown in Fig. 1, the method may include the following processing.
In step 100, multiple subtasks of the target task and the task parameters of each subtask are acquired, the task parameters including a parameter indicating the computation scale of the subtask.
In this step, the target task may be a task to be executed and processed on the device. For example, the task may be a training task of a deep learning model, an inference task of a neural network model, or the running of an application. Executing the target task requires the chip on the device, which is responsible for part or all of the processing, such as computation, during task execution.
A subtask may be a relatively independent execution unit within the target task. For example, the target task may include multiple functions and be divided by function, with each function serving as an independent subtask. As another example, the task code may be divided into multiple code blocks, each relatively complete, and subtasks divided by code block, for instance with one relatively complete or independent code block as one subtask. As a further example, when the target task is the training or inference of a neural network model, each layer or group of layers of the model may be regarded as a subtask; alternatively, the code segment implementing each layer may serve as a subtask. As yet another example, one or at least two of multiple functional modules may serve as a subtask, and so on.
The embodiments of the present disclosure do not limit how subtasks are divided. Many divisions are possible, such as by function, by code block, by network layer or operator, or by functional module, as mentioned above. Note that although dividing the target task into multiple subtasks enables fine-grained frequency setting and improves device performance, too many subtasks may lead to frequent frequency changes, which in turn affects device performance to some extent. The number of subtasks should therefore be balanced; for example, the inside of a loop body in the target task should preferably not be further divided into subtasks, so as to reduce the number of frequency settings. The specific division can be chosen according to actual needs and is not elaborated here.
In the embodiments of the present disclosure, the task parameters of a subtask may include a parameter indicating the computation scale of the subtask, for example, a parameter indicating one or more resources the subtask needs to occupy. As an example, the task parameters may include, but are not limited to, at least one of the following: the computation amount of the subtask and its memory access amount (that is, the number of times the chip accesses memory and/or the total amount of data accessed while the subtask executes). Other types of task parameters may also be included, which is not limited in the embodiments of the present disclosure.
In the embodiments of the present disclosure, the task parameters of a subtask can be obtained in various ways: for example, from another device or another functional module in the system, from local storage, or by parsing the target task to obtain the task parameters of each subtask.
In some embodiments, to obtain the task parameters more quickly, the correspondence between each subtask and its task parameters can be stored in advance. For example, the target task may be parsed in advance to obtain its multiple subtasks and the task parameters of each subtask, and each subtask and its task parameters are then stored. Later, when the target task needs to run, the task parameters of each subtask can be found quickly from the stored information. In some examples, the correspondence between subtasks and task parameters can be stored in key-value form: for example, a task identifier can be assigned to each subtask, with the task identifier stored as the key and the subtask's task parameters stored as the value. Key-value storage is well suited to lookups by primary key, offering fast queries and large storage capacity. However, the present disclosure is not limited to this.
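The key-value storage of task parameters can be pictured with a small sketch. An in-memory dictionary stands in for the store; the `TaskParams` fields, the `parse_task` helper, and the layer names and numbers are hypothetical and only illustrate the idea of parsing once and looking up by task identifier later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskParams:
    """Task parameters of one subtask; field names and units are illustrative."""
    compute_amount: float  # e.g. number of arithmetic operations
    memory_access: float   # e.g. bytes read from and written to memory


def parse_task(layers):
    """Hypothetical task parsing: treat each (name, ops, bytes) layer as a subtask."""
    return {name: TaskParams(ops, nbytes) for name, ops, nbytes in layers}


# Parse once in advance and keep the key-value store (task ID -> task parameters).
param_store = parse_task([
    ("conv1", 2.0e8, 1.5e6),
    ("fc1", 4.0e6, 4.0e6),
])

# Later, when the target task runs, a lookup by task ID is a single dict access.
print(param_store["conv1"])
```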
In some embodiments, in addition to the task parameters of each subtask, device information of the device where the chip is located can also be acquired, and the target chip frequency of each subtask is determined jointly from the device information and the task parameters.
The device information may include device resource information of the device where the chip is located, which indicates the resources available on the device. As an example, the device resource information may include, but is not limited to, any one or more of the following: the number of computing units, bandwidth, memory capacity, and so on. In some embodiments, the target chip frequency of each subtask may also depend on the dependencies among the subtasks or on how they are executed, for example serially, in parallel, or with some subtasks serial and others parallel. For example, if the subtasks obtained by dividing the target task execute serially, the chip can use all available device resources, such as all computing units and all bandwidth on the device, when executing each subtask. However, the embodiments of the present disclosure are not limited to this.
In step 102, the target chip frequency of each subtask is determined based on the task parameters of each of the multiple subtasks.
In this step, the target chip frequency of each subtask can be determined from the subtask's task parameters. Alternatively, it can be determined jointly from the device information of the device where the chip is located and the task parameters of each subtask.
In some embodiments, multiple sets of selectable chip frequencies can be determined, and a selectable chip frequency is chosen for each subtask from these sets based on a certain strategy. The multiple sets of selectable chip frequencies are a specific number of discrete frequencies to which the chip in the device can be set. For example, in the case of setting the core frequency of the chip, the core frequency may take m frequency values x1, x2, ..., xm. The strategy can be set according to actual needs. Several exemplary strategies are described below, but the embodiments of the present disclosure are not limited to them.
In some embodiments, determining the target chip frequency can be formulated as an optimization problem. In some optional implementations, the task running time is the main factor in the selection. As an example, based on the task parameters of each subtask, the chip frequency that gives the shortest task running time under the chip power consumption limit can be selected from the multiple sets of selectable chip frequencies as the target chip frequency of that subtask. In some examples, the multiple subtasks can be analyzed as a whole to determine the overall running time of the target task; in that case, the target chip frequency of every subtask is obtained at once under the condition of the shortest overall task running time. In other examples, the running time of each subtask can be analyzed separately, and the target chip frequency of a subtask determined on the condition that the subtask's own running time is shortest, and so on. In other optional implementations, one or more of the system resources consumed by the task, the chip operating power consumption, the task running speed, the accuracy of the task execution result, and the like can also serve as factors in the selection, which is not limited in the embodiments of the present disclosure.
The following takes determining the target chip frequency from a subtask's task parameters and the device information as an example to illustrate how such an optimization problem can be solved. Consider one subtask whose task parameters are $k^n$ (a vector in an n-dimensional space, with components such as memory access amount and computation amount) and whose device information is $d^m$ (a vector in an m-dimensional space, with components such as memory capacity). Let the chip frequency be $x$, which may be a core frequency, a memory frequency, or a combination of a core frequency and a memory frequency. $P(k^n, d^m, x)$ denotes the chip operating power consumption, which depends on the task parameters, the device information, and the chip frequency; even with the task parameters and device information fixed, $P$ changes when the chip frequency changes. $T(k^n, d^m, x)$ denotes the task running time; likewise, even with the task parameters and device information fixed, $T$ changes when the chip frequency changes. The task parameters and device information of the subtask can therefore be combined with each of the chip frequencies to obtain the corresponding chip operating power consumption and task running time, and the optimal chip frequency for each subtask is selected from the multiple chip frequencies based on these two quantities.
For example, the optimal chip frequency can be selected according to the following formula (1):
$$x^{*} = \arg\min_{x} T(k^n, d^m, x) \quad \text{s.t.} \quad P(k^n, d^m, x) \le Power_{Limited} \tag{1}$$
Formula (1) states that the shortest task running time is to be achieved under the chip power consumption limit (not exceeding $Power_{Limited}$), which is the objective of the optimization problem. The optimal chip frequency is sought with this objective: among the selectable chip frequencies to which the chip can be set, the chip frequency that lets the subtask be completed in the shortest running time under the chip power consumption limit is the target chip frequency of the subtask.
Since at least one of the task parameters and the device information may differ between subtasks, the target chip frequencies of the subtasks may differ as well. This step finds the target chip frequency that best matches each subtask, so that the subtask is completed as fast as possible without the chip exceeding its power limit.
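Because the selectable frequencies form a small discrete set, the constrained minimization can be sketched as an exhaustive search. The sketch below is only an illustration: the functions `toy_run_time` and `toy_power` are made-up stand-ins for $T$ and $P$ (the source does not give their form), and all numbers are hypothetical.

```python
def select_target_frequency(task_params, device_info, candidate_freqs,
                            run_time, power, power_limit):
    """Pick the frequency x minimizing T(k, d, x) subject to P(k, d, x) <= power_limit.

    run_time and power are caller-supplied models or measurements (hypothetical here).
    Returns None if no candidate satisfies the power limit.
    """
    feasible = [x for x in candidate_freqs
                if power(task_params, device_info, x) <= power_limit]
    if not feasible:
        return None
    return min(feasible, key=lambda x: run_time(task_params, device_info, x))


# Toy models (assumptions): run time is the larger of compute time and memory time,
# and power grows linearly with the number of units and the frequency.
def toy_run_time(k, d, x_mhz):
    compute_ops, memory_bytes = k
    units, bandwidth_bytes_per_s = d
    compute_time = compute_ops / (units * x_mhz * 1e6)
    memory_time = memory_bytes / bandwidth_bytes_per_s
    return max(compute_time, memory_time)


def toy_power(k, d, x_mhz):
    units, _bandwidth = d
    return 0.001 * units * x_mhz  # watts, illustrative only


device = (4, 1e9)                 # 4 computing units, 1 GB/s bandwidth
freqs = [400, 800, 1200]          # selectable frequencies in MHz, ascending
compute_bound = (8e9, 1e6)        # many operations, little memory traffic
memory_bound = (1e6, 2e9)         # few operations, heavy memory traffic

f_compute = select_target_frequency(compute_bound, device, freqs,
                                    toy_run_time, toy_power, power_limit=4.0)
f_memory = select_target_frequency(memory_bound, device, freqs,
                                   toy_run_time, toy_power, power_limit=4.0)
print(f_compute, f_memory)
```

Under this toy model the two subtasks receive different target frequencies, which matches the point that subtasks with different task parameters may have different target chip frequencies. When several frequencies give the same run time, `min` keeps the first one in the candidate list, so an ascending list favors the lower frequency.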
In step 104, the operating frequency of the chip during execution of each subtask is set according to the target chip frequency determined for that subtask.
In this step, when execution of the target task reaches one of the subtasks, the chip frequency can be set to that subtask's target chip frequency. Alternatively, the operating frequency for the subtask can be set based on both the target chip frequency and the current state of the device or chip, for example state information captured when the chip starts executing the subtask, such as the chip temperature. In some embodiments, the target chip frequencies are determined for all subtasks in advance, before the target task is executed, and stored; then, as each subtask runs, the chip frequency is set to the stored target chip frequency of that subtask. In other embodiments, the target chip frequency of a subtask is determined just before that subtask is executed, for example based on the subtask's task parameters and the current state information of the device or chip. In that case the target chip frequencies of the subtasks are determined at different times. The embodiments of the present disclosure do not limit this.
In the embodiments of the present disclosure, the operating frequency of the chip may include at least one of the following: the core frequency of the chip and the memory frequency. For example, the method may be used to set only the core frequency, with the memory frequency fixed or set once per target task; or it may be used to set only the memory frequency, with the core frequency fixed or set once per target task; or it may be used to set both, for example by treating the core frequency and the memory frequency as one frequency combination, in which case the target chip frequency of each subtask is such a combination.
The memory frequency may be the frequency of a volatile memory, such as SDRAM (Synchronous Dynamic Random Access Memory), for example DDR SDRAM (Double Data Rate SDRAM), DDR2 SDRAM, DDR3 SDRAM, DDR4 SDRAM, or DDR5 SDRAM. It may also be the frequency of a non-volatile memory, such as flash memory.
The method for setting the chip operating frequency in the embodiments of the present disclosure determines a target chip frequency for each subtask of the target task based on that subtask's task parameters, so that the chip runs at the subtask's target chip frequency while executing it. This fine-grained frequency setting lets each subtask run as close to optimal performance as possible, which increases the running speed of the task as a whole.
FIG. 2 is another flowchart of the method for setting the chip operating frequency according to some embodiments of the present disclosure. It shows one way of determining the target chip frequency, taking as an example a target chip frequency determined jointly from the subtask's task parameters and the device information. As shown in FIG. 2, the method may include the following processing; steps identical to those in FIG. 1 are not described again in detail.
In step 200, task analysis is performed on the target task to be run on the device to obtain multiple subtasks and the task parameters of each subtask. For example, the task parameters of each subtask may include at least one of the following: the computation amount and the memory access amount of the subtask.
In step 202, the device information of the device where the chip is located is obtained.
For example, the device information may include device resource information, which may include at least one of the following: bandwidth, number of computing units, and memory capacity.
In step 204, a preset mapping relationship between multiple first data and multiple second data is acquired.
In this step, the device may store in advance the mapping relationships between multiple first data and multiple second data. Each first data includes preset task parameters of a subtask and preset device information; each second data includes a preset chip frequency. For example, the preset chip frequency may be the core frequency of the chip, or it may be a combination of the core frequency and the memory frequency of the chip.
For example, Table 1 below illustrates a preset mapping relationship between multiple first data and multiple second data:
Table 1. Example of the mapping relationship
[Table 1 is an image in the original publication: Figure PCTCN2020105195-appb-000002]
The following describes, by way of example, how the above mapping relationship is established.
Assume that, whatever task is executed on the device, the task parameters and the device information take values within certain ranges: the task parameters within a range F1 and the device information within a range F2. F1 and F2 can both be called preset first-data value ranges. Sampling within these ranges yields a set of discrete sample points, that is, multiple sets of first data such as (k1, d1), (k2, d2), (k3, d3), where k1, k2, and k3 are task parameters and d1, d2, and d3 are device information. In addition, assuming that the chip frequency to be set is the core frequency, there are multiple candidate chip frequencies that the chip can be set to, for example x1, x2, x3.
Specifically, for each set of first data, one of the candidate chip frequencies is selected as the second data corresponding to that first data, and the mapping relationship between the first data and the second data is established. In some embodiments, the second data is selected according to performance evaluation parameters. For example, a performance evaluation parameter is determined for the first data under each candidate chip frequency: one from the first data and one candidate, another from the first data and another candidate, and so on. The performance evaluation parameter can be obtained by simulation, inference, or evaluation of a mathematical formula; the embodiments of the present disclosure do not limit this.
After the performance evaluation parameters of the first data under all candidate chip frequencies have been obtained, one candidate is selected according to them as the second data corresponding to the first data. For example, the performance evaluation parameters may include a task processing performance parameter and the chip operating power consumption; the candidate whose chip operating power consumption is below a preset power consumption and whose task processing performance parameter is optimal is then used as the second data. The task processing performance parameter includes, but is not limited to, the task processing time.
The process of establishing the preset mapping relationship is illustrated below with reference to FIG. 3. In this example, the task processing performance parameter is the task running time and the chip frequency is the core frequency.
In step 2041, the candidate chip frequencies are traversed, and for each candidate the chip operating power consumption and the task running time are obtained for operation under that candidate and a given set of first data.
For example, if the chip has ten candidate chip frequencies, combining one set of first data with each of them yields ten chip operating power consumptions P and ten task running times T. When the chip frequency is x1, the chip operating power consumption for the subtask is P1 and the task running time is T1; when the chip frequency is x2, they are P2 and T2.
In some embodiments, P is determined by a power consumption model and T by a running time model. The inputs of the power consumption model are the task parameters, the device information, and the chip frequency, and its output is P. The inputs of the running time model are the task parameters, the device information, and the chip frequency, and its output is T.
The power consumption model and the running time model can be built in a variety of ways, for example with a support vector machine, a feedback neural network, or the K-means clustering algorithm.
The following takes the case where both the power consumption model and the running time model are neural network models, which can be obtained by training. The training set can be built from a fixed task list X = {r_1, r_2, ..., r_p} and the set of device frequencies X = {x_1, x_2, ..., x_q}. For a task r_1, for example, the task parameters of the task and the device information of the device are obtained, and a frequency x_1 is taken from the device frequency set; the three are fed to the power consumption model, which outputs a predicted chip operating power consumption. The difference between this prediction and the true chip operating power consumption measured on the real device under the same task parameters, device information, and frequency x_1 is backpropagated to train the power consumption model. The running time model can be trained in the same way, with the task running time as the output; the details are not repeated here.
Once the power consumption model and the running time model have been trained, feeding the first data and a chip frequency to the power consumption model yields the chip operating power consumption; for example, the first data (k1, d1) and the chip frequency xi yield the chip operating power consumption Pi. Feeding the first data and a chip frequency to the running time model yields the task running time; for example, the first data (k1, d1) and the chip frequency xi yield the task running time Ti.
In step 2042, the chip frequency that is optimal with respect to chip operating power consumption and task running time is selected as the second data corresponding to the first data.
In this step, the selection condition of formula (1) can be applied, for example: among the candidate chip frequencies, the one with the shortest task running time within the chip operating power consumption limit is selected as the second data. That is, among all chip frequencies xi satisfying Pi ≤ Power_Limited, the one with the smallest Ti, say x1, is selected as the second data corresponding to the first data (k1, d1).
In step 2043, the mapping relationship between the first data and the second data is established.
As described above, for each of the sets of first data at the discrete sample points, the corresponding second data, that is, the optimal chip frequency, can be obtained by the process shown in FIG. 3. This gives the second data for every first data, from which the individual mapping relationships are established. The result is a set of mapping relationships, each consisting of one first data and its corresponding second data. This set can serve as the preset mapping relationship of step 204, as in the example of Table 1.
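Steps 2041 to 2043 can be sketched as a loop over the sampled first data. This is a minimal sketch under stated assumptions: `build_mapping` is a hypothetical name, the task parameters and device information are reduced to single scalars, and the two toy models stand in for the trained power consumption and running time models.

```python
from typing import Callable, Dict, Iterable, List, Tuple

FirstData = Tuple[float, float]  # (task parameter k, device information d)
Model = Callable[[float, float, float], float]  # (k, d, x) -> predicted value


def build_mapping(
    samples: Iterable[FirstData],
    frequencies: List[float],
    power_model: Model,
    runtime_model: Model,
    power_limit: float,
) -> Dict[FirstData, float]:
    """Map each sampled first data to its second data (the chosen frequency)."""
    mapping: Dict[FirstData, float] = {}
    for k, d in samples:
        # Step 2041: evaluate every candidate and keep those within the limit.
        feasible = [x for x in frequencies if power_model(k, d, x) <= power_limit]
        if not feasible:
            continue  # assumption of this sketch: skip samples with no feasible frequency
        # Step 2042: pick the feasible candidate with the shortest running time.
        best = min(feasible, key=lambda x: runtime_model(k, d, x))
        # Step 2043: record the first data -> second data relationship.
        mapping[(k, d)] = best
    return mapping


# Toy models (assumptions for illustration only).
def toy_power(k: float, d: float, x: float) -> float:
    return 5.0 * k * x ** 2 + d


def toy_runtime(k: float, d: float, x: float) -> float:
    return k / (d * x)


table = build_mapping(
    samples=[(1.0, 2.0), (4.0, 2.0)],
    frequencies=[1.0, 1.5, 2.0],
    power_model=toy_power,
    runtime_model=toy_runtime,
    power_limit=25.0,
)
print(table)
```

In this toy setup the lighter subtask can run at the highest candidate, while the heavier one is held to the lowest candidate by the power limit.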
In step 206, the chip frequency that corresponds to the subtask in the preset mapping relationship is determined as the target chip frequency, based on the preset mapping relationship, the device information, and the task parameters of the subtask.
For example, if the task parameters of a subtask are k1 and the corresponding device information is d1, looking up Table 1 gives the target chip frequency x1. This subtask may be referred to as the first subtask.
In addition, because the first data in the mapping table are discrete, the task parameters and device information of the first subtask sometimes do not exactly match any first data in the mapping relationship. In that case, as an optional implementation, the first data closest to the device information and task parameters of the first subtask is located in the mapping table, and the chip frequency corresponding to that closest first data is used as the target chip frequency of the first subtask.
For example, let the task parameters and device information of the first subtask be called the third data, such as (k3, d3). The distance between the third data and each first data in the mapping relationship can be calculated by treating the third data and each first data as vectors and computing the distance between them, for example the Euclidean distance or another distance. The chip frequency corresponding to the first data closest to the third data is then used as the target chip frequency of the first subtask.
In some embodiments, the closest first data is chosen with a downward preference, meaning that the first data with the lower corresponding chip frequency is preferred. For example, suppose the two first data adjacent to the third data are first data Y1 and first data Y2, that they are equidistant from the third data, and that the chip frequency corresponding to Y1 is lower than that corresponding to Y2; then the chip frequency corresponding to Y1 is used. If the two distances are not equal, the nearer first data can still be selected as the target first data. As another example, whenever the difference between the two distances is within a preset range, the first data with the lower chip frequency is preferred; the preset range can be set according to actual requirements, and the embodiments of the present disclosure do not limit it.
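The nearest-entry lookup with a preference for the lower frequency can be sketched as follows. The function name `lookup_frequency` and the `tolerance` parameter (standing in for the "preset range") are assumptions of this sketch, and Euclidean distance is only one of the distances the text allows.

```python
import math
from typing import Dict, Tuple

Vector = Tuple[float, ...]


def lookup_frequency(
    mapping: Dict[Vector, float],
    third_data: Vector,
    tolerance: float = 0.0,
) -> float:
    """Return the frequency of the first data nearest to third_data.

    Entries whose distance exceeds the minimum by at most `tolerance`
    count as ties, and the lowest frequency among the ties is returned.
    """
    distances = {key: math.dist(key, third_data) for key in mapping}
    nearest = min(distances.values())
    tied = [key for key, dist in distances.items() if dist - nearest <= tolerance]
    return min(mapping[key] for key in tied)


# Hypothetical table: first data (k, d) -> chip frequency.
table = {(1.0, 1.0): 1.2, (3.0, 1.0): 1.6}

equidistant = lookup_frequency(table, (2.0, 1.0))          # tie, lower frequency wins
closer_to_y2 = lookup_frequency(table, (2.8, 1.0))         # nearer entry wins
within_range = lookup_frequency(table, (2.8, 1.0), 2.0)    # both within range, lower wins
print(equidistant, closer_to_y2, within_range)
```

In practice the dimensions of the vectors would likely need normalizing before distances are compared, since task parameters and device information use different units; the source does not address this.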
In step 208, the operating frequency at which the chip executes each subtask is set according to the target chip frequency determined for that subtask.
In this step, when execution of the target task reaches one of the subtasks, the chip frequency can be set to that subtask's target chip frequency.
The method for setting the operating frequency of the device chip in the embodiments of the present disclosure decomposes the task to be executed into multiple subtasks and obtains a target chip frequency for each. This fine-grained frequency setting lets each subtask of the task run as close to optimal performance as possible, which increases the running speed of the task as a whole. Moreover, establishing the mapping relationship between task parameters, device information, and chip frequency in advance speeds up the determination of the chip frequency and improves device performance.
FIG. 4 is another flowchart of the method for setting the chip operating frequency according to some embodiments of the present disclosure. This example sets a combined frequency of the chip, and the acquired device information may also include the chip temperature, which can be collected by a temperature sensor in the device where the chip is located. The target task to be executed is assumed to be the inference task of a three-layer neural network model.
For different end devices, such as those built around a GPU (Graphics Processing Unit), a CPU, or a DSP chip, the trained power consumption model and running time model may differ completely because the device information differs. The models can be trained offline for the specific end device on which the target task runs. Once the power consumption model and the running time model have been determined, they can be stored on the end device together with an optimization problem solving engine. The processing performed by this engine corresponds to the process of establishing the preset mapping relationship described for FIG. 2: from the sampled sets of first data and the determined models, the engine obtains the chip operating power consumption and task running time for each chip frequency, selects the optimal chip frequency according to formula (1), and thereby establishes the mapping relationship between the first data and the second data. Optionally, the preset mapping relationship is established by the device itself, or it is established by another device and stored in advance on the device where the target task runs, so that the device can look it up directly when running the target task.
In step 400, task analysis is performed on the target task to be run on the device to obtain multiple subtasks and the task parameters of each subtask.
For example, the target task is the inference task of a three-layer neural network model, and each layer can be one subtask. Task analysis in this step then yields a subtask list containing three subtasks: subtask 1, subtask 2, and subtask 3. This step also determines the task parameters of each subtask, for example its memory access amount and computation amount. The subtask list may thus contain <subtask 1, task parameters 1>, <subtask 2, task parameters 2>, and <subtask 3, task parameters 3>. The task parameters can be expressed as k_n (a point in an n-dimensional space whose dimensions are, for example, the memory access amount and the computation amount). Subtask 1, subtask 2, and subtask 3 can each be treated as the first subtask in turn.
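The task analysis of step 400 can be pictured with a small sketch that turns a three-layer network into a subtask list. Everything specific here is an assumption for illustration: the layers are taken to be fully connected, the computation amount is estimated as 2 × inputs × outputs operations, and the memory access amount counts weights plus input and output activations. The source does not prescribe how these quantities are computed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subtask:
    name: str
    compute_ops: int   # computation amount (assumed: multiply-add counted as 2 ops)
    memory_bytes: int  # memory access amount (assumed: weights + activations)


def parse_task(layer_shapes: List[Tuple[int, int]], bytes_per_element: int = 4) -> List[Subtask]:
    """Treat each (inputs, outputs) layer as one subtask and estimate its parameters."""
    subtasks = []
    for index, (n_in, n_out) in enumerate(layer_shapes, start=1):
        ops = 2 * n_in * n_out
        mem = (n_in * n_out + n_in + n_out) * bytes_per_element
        subtasks.append(Subtask(f"subtask {index}", ops, mem))
    return subtasks


# Hypothetical three-layer model: 784 -> 256 -> 64 -> 10.
subtasks = parse_task([(784, 256), (256, 64), (64, 10)])
for s in subtasks:
    print(s.name, s.compute_ops, s.memory_bytes)
```

Each `Subtask` here plays the role of one <subtask, task parameters> entry in the subtask list, with k_n being the pair (compute_ops, memory_bytes).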
In step 402, the device information of the device where the chip is located is obtained.
The device information may include device resource information. In one example, subtask 1, subtask 2, and subtask 3 run serially, and the device resources they occupy at run time, such as bandwidth and number of computing units, may be the same.
The device information can be expressed as d_m (a point in an m-dimensional space whose dimensions are, for example, bandwidth and memory capacity).
In step 404, before execution of subtask 1 begins, the chip temperature C1 is collected, and the target chip frequency to be set on the chip of the device while subtask 1 runs is determined from the task parameters of subtask 1, the device resource information, and the chip temperature C1.
In some embodiments of the present disclosure, the device information also includes the chip temperature, because an excessively high temperature during task execution also causes the chip to reduce its frequency. In other words, the device information in these embodiments includes not only device resource information such as the number of computing units, bandwidth, and memory capacity, but also the chip temperature. The chip temperature can be acquired dynamically: before each subtask of the running target task starts, the chip temperature for that subtask is acquired, and the target chip frequency of the subtask is determined accordingly.
In step 406, the operating frequency of the chip is set to the target chip frequency of subtask 1, and execution of subtask 1 begins.
In step 408, before execution of subtask 2 begins, the chip temperature C2 is collected, and the target chip frequency to be set on the chip of the device while subtask 2 runs is determined from the task parameters of subtask 2, the device resource information, and the chip temperature C2.
In the embodiments of the present disclosure, the chip temperature changes while the target task runs. The latest chip temperature can therefore be collected each time a subtask is about to be executed and used in determining the target chip frequency of that subtask.
It should also be noted that the chip frequency in the embodiments of the present disclosure may be a combined frequency, in which the core frequency and the memory frequency of the chip are set together, or it may be a single frequency of that combination. For example, let the core frequency be x and the memory frequency be y. When the task parameters and device information of a subtask are (k1, d1), where the device information may include device resource information and the chip temperature, the chip frequency adopted may be (x1, y1); when they are (k2, d2), the chip frequency adopted may be (x2, y2).
When the power consumption model and the running time model are determined for this case, the inputs of the power consumption model may be, for example, k1, d1, x1, and y1, with the chip operating power consumption P as output; likewise, the inputs of the running time model may be k1, d1, x1, and y1, with the task running time T as output. The device information used to train the two models may include the chip temperature.
In step 410, the operating frequency of the chip is set to the target chip frequency of subtask 2, and execution of subtask 2 begins.
In step 412, before execution of subtask 3 begins, the chip temperature C3 of the chip on the device is collected, and the target chip frequency to be set on the chip of the device while subtask 3 runs is determined from the task parameters of subtask 3, the device resource information, and the chip temperature C3.
In step 414, the operating frequency of the chip is set to the target chip frequency of subtask 3, and execution of subtask 3 begins.
After all subtasks of the target task have finished executing, the operating frequency of the device chip can be reset to its default value, and execution of the target task ends.
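Steps 404 to 414 and the final reset follow one repeated pattern, which can be sketched as a loop. This is an illustrative sketch only: `run_target_task` and all of its callback parameters are hypothetical names, and the callbacks stand in for the temperature sensor, the frequency determination, the hardware frequency control, and the subtask execution, none of which the source specifies as an API.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_target_task(
    subtasks: List[Any],
    device_resources: Dict[str, Any],
    read_temperature: Callable[[], float],
    determine_frequency: Callable[[Any, Dict[str, Any], float], float],
    set_frequency: Callable[[float], None],
    execute: Callable[[Any], None],
    default_frequency: float,
) -> None:
    """Set a per-subtask frequency before each subtask, then restore the default."""
    try:
        for subtask in subtasks:
            temperature = read_temperature()  # latest chip temperature (C1, C2, ...)
            frequency = determine_frequency(subtask, device_resources, temperature)
            set_frequency(frequency)
            execute(subtask)
    finally:
        # Reset also on failure; this is a choice of the sketch, the source
        # only describes the reset after all subtasks have finished.
        set_frequency(default_frequency)


# Stub demonstration with made-up temperatures and a made-up rule:
# drop to a lower frequency when the chip is hotter than 80 degrees.
log: List[Tuple[str, Any]] = []
temperatures = iter([50.0, 70.0, 90.0])
run_target_task(
    subtasks=["subtask 1", "subtask 2", "subtask 3"],
    device_resources={"compute_units": 8},
    read_temperature=lambda: next(temperatures),
    determine_frequency=lambda s, d, c: 1.0 if c > 80.0 else 1.5,
    set_frequency=lambda f: log.append(("set", f)),
    execute=lambda s: log.append(("run", s)),
    default_frequency=1.2,
)
print(log)
```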
The method for setting the chip operating frequency in the embodiments of the present disclosure decomposes the task to be executed into multiple subtasks and obtains a target chip frequency for each. This fine-grained frequency setting lets each subtask of the task run as close to optimal performance as possible, which increases the running speed of the task as a whole. In addition, the chip temperature at the time the first subtask is executed is collected dynamically while the task runs and is taken into account in determining the target chip operating frequency of the first subtask. The determination thus considers more factors, the frequency setting is more reasonable, and the running performance of the device improves further.
In addition, the end device may also provide a user interface through which configuration information for the preset mapping relationship can be received. For example, a user can calculate or estimate offline the target chip frequency that each subtask should use, and store the preset mapping relationship between each subtask and its target chip frequency in the device. At run time, the device then simply sets the operating frequency of the chip for each subtask according to the configured preset mapping relationship.
The user interface described above can also be used to receive frequency setting strategy information. The frequency setting strategy information may include, for example, models that determine chip operating power consumption and task running time from the task parameters and device information, and may further include how to select the target chip frequency of a given subtask according to the power consumption model and the task running time model. The target chip frequency to be set for the chip on the device while each subtask runs can then be determined based on the frequency setting strategy information and the task parameters of each of the multiple subtasks.
In addition, the frequency setting strategy information may also include turning on or off a dynamic chip-frequency setting function for subtasks. When the frequency setting strategy information turns this function on, the device can determine the target chip frequency of each subtask of the target task according to the method for setting the chip operating frequency described in the foregoing embodiments of the present disclosure. For example, it can collect the task parameters of the subtask, the chip temperature, and the device resource information, and select the target chip frequency to use based on this information together with the power consumption model and the task running time model. When the frequency setting strategy information turns this function off, the device can directly obtain the target chip frequency that each subtask should use from the preset mapping relationship that was calculated offline and configured through the user interface.
The user interface described above provides a more flexible way of setting the chip frequency: the user can more conveniently update how the chip frequency is determined, making the determination more reasonable and faster.
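The switch between dynamic determination and the offline-configured mapping could look like the following sketch. The `FrequencyPolicy` type, its field names, and the `dynamic_choice` callback are hypothetical names introduced only for illustration; the disclosure does not prescribe a data layout.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class FrequencyPolicy:
    """Hypothetical container for the strategy received through the user interface."""
    dynamic_enabled: bool
    # Offline-computed mapping: subtask name -> target chip frequency (MHz).
    offline_mapping: Dict[str, int] = field(default_factory=dict)


def target_frequency(
    policy: FrequencyPolicy,
    subtask: str,
    dynamic_choice: Callable[[str], int],
) -> int:
    """Pick the frequency source according to the policy switch."""
    if policy.dynamic_enabled:
        # Dynamic path: decide at run time (task parameters, temperature, ...).
        return dynamic_choice(subtask)
    # Static path: use the mapping configured offline by the user.
    return policy.offline_mapping[subtask]


static_policy = FrequencyPolicy(dynamic_enabled=False, offline_mapping={"conv": 1400})
dynamic_policy = FrequencyPolicy(dynamic_enabled=True)

static_result = target_frequency(static_policy, "conv", lambda name: 900)
dynamic_result = target_frequency(dynamic_policy, "conv", lambda name: 900)
```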
FIG. 5 is an exemplary structural diagram of an apparatus for setting the operating frequency of a chip provided by an embodiment of the present disclosure. As shown in FIG. 5, the apparatus may include an acquisition module 51, a frequency control module 52, and a frequency setting module 53.
The acquisition module 51 is configured to acquire multiple subtasks of a target task and the task parameters of each subtask, where the task parameters include a parameter representing the operation scale of the subtask. For example, the task parameters include at least one of the following: the computation amount of the subtask and the memory access amount of the subtask.
The frequency control module 52 is configured to determine the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks.
The frequency setting module 53 is configured to set, according to the determined target chip frequency of each subtask, the operating frequency at which the chip executes that subtask.
In one example, as shown in FIG. 6, the apparatus may further include a task analysis module 54, configured to parse the target task to obtain the multiple subtasks and the task parameters of each subtask, and to store the correspondence between each of the multiple subtasks and its task parameters. When acquiring the multiple subtasks of the target task and the task parameters of each subtask, the acquisition module 51 looks up the task parameters of each subtask in the stored correspondence.
For example, in an actual implementation, when a target task is about to run, the task analysis module 54 can first parse the target task into multiple subtasks and then parse each of the subtasks to obtain its task parameters. The task analysis module 54 can store the resulting correspondence between each subtask and its task parameters.
The acquisition module 51 can be responsible for controlling the running of the target task. For example, the acquisition module 51 can obtain the correspondence parsed and stored by the task analysis module 54 and control the execution of the subtasks one by one. As an illustration, suppose there are three subtasks. When the first subtask is about to be executed, the acquisition module 51 can obtain the task parameters of the first subtask from the correspondence and send them, together with the device information of the device where the chip is located, to the frequency control module 52, which determines the target chip frequency of the first subtask from the task parameters and the device information.
Next, the frequency control module 52 can send the determined target chip frequency to the frequency setting module 53, which can set the operating frequency of the chip to that target chip frequency. After the operating frequency of the chip has been set, the frequency setting module 53 can send a feedback signal to the acquisition module 51 to notify it that the frequency setting is complete. On receiving the feedback signal, the acquisition module 51 can start executing the first subtask.
After the first subtask has finished, the acquisition module 51 can prepare to execute the second subtask. Likewise, before the second subtask starts, the acquisition module 51 can obtain the task parameters of the second subtask from the correspondence produced by the task analysis module 54 and send them to the frequency control module 52, which determines the target chip frequency of the second subtask. After the frequency setting module 53 reports that the chip frequency has been set, the acquisition module 51 starts executing the second subtask. The third subtask is handled in the same way and is not described again. After the acquisition module 51 confirms that all subtasks of the target task have finished, it can send a frequency reset signal to the frequency control module 52, which in turn sends a frequency reset signal to the frequency setting module 53, and the frequency setting module 53 can set the operating frequency of the chip to the default value. Execution of the target task then ends.
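The interaction among modules 51, 52, and 53 described above can be sketched with three small classes. All class and method names here are hypothetical, the decision rule is a placeholder, and the feedback signal is simplified to a boolean returned through the frequency control module rather than sent directly from module 53 to module 51.

```python
from typing import Callable, Dict, List

Params = Dict[str, float]


class FrequencySettingModule:
    """Stand-in for module 53: applies a frequency and confirms completion."""

    def __init__(self, default_mhz: int) -> None:
        self.default_mhz = default_mhz
        self.current_mhz = default_mhz

    def apply(self, mhz: int) -> bool:
        self.current_mhz = mhz
        return True  # feedback signal: frequency setting complete

    def reset(self) -> None:
        self.current_mhz = self.default_mhz


class FrequencyControlModule:
    """Stand-in for module 52: decides the target frequency for a subtask."""

    def __init__(
        self,
        setter: FrequencySettingModule,
        decide: Callable[[Params, Params], int],
    ) -> None:
        self.setter = setter
        self.decide = decide

    def configure(self, task_params: Params, device_info: Params) -> bool:
        return self.setter.apply(self.decide(task_params, device_info))

    def reset(self) -> None:
        self.setter.reset()  # forward the frequency reset signal


class AcquisitionModule:
    """Stand-in for module 51: drives the subtasks one by one."""

    def __init__(
        self,
        controller: FrequencyControlModule,
        correspondence: Dict[str, Params],
    ) -> None:
        self.controller = controller
        self.correspondence = correspondence  # subtask name -> task parameters

    def run(self, device_info: Params, execute: Callable[[str], None]) -> None:
        for name, task_params in self.correspondence.items():
            if self.controller.configure(task_params, device_info):
                execute(name)  # start only after the feedback signal
        self.controller.reset()  # all subtasks done: restore the default


setter = FrequencySettingModule(default_mhz=1000)
controller = FrequencyControlModule(
    setter,
    # Hypothetical rule: compute-heavy subtasks get the higher frequency.
    decide=lambda p, d: 1500 if p["compute"] > p["memory"] else 800,
)
trace: List[str] = []
AcquisitionModule(
    controller,
    {
        "conv": {"compute": 9.0, "memory": 2.0},
        "copy": {"compute": 1.0, "memory": 6.0},
    },
).run({"units": 64.0}, lambda name: trace.append(f"{name}@{setter.current_mhz}"))
```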
In one example, when determining the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks, the frequency control module 52 is configured to: acquire device information of the device where the chip is located, the device information including device resource information; and determine the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks. For example, the device resource information includes any one or more of the following: the number of computing units, bandwidth, and memory capacity.
In one example, the device information further includes the chip temperature of the chip, and the frequency control module 52 is further configured to determine the target chip frequency of a first subtask among the multiple subtasks based on the task parameters of the first subtask, the device resource information, and the chip temperature when the first subtask is executed.
In one example, the frequency control module 52 is configured to: acquire a preset mapping relationship between multiple pieces of first data and multiple pieces of second data, where the first data includes preset task parameters and preset device information and the second data includes a preset chip frequency; and determine the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
In one example, when determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask, the frequency control module 52 is configured to: determine the distance between third data and each of the multiple pieces of first data in the preset mapping relationship, where the third data includes the task parameters of a first subtask among the multiple subtasks and the device information; and use the preset chip frequency corresponding to the target first data that is closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
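The nearest-entry lookup described above can be sketched as follows. The text does not specify a distance metric, so Euclidean distance is an assumption here, as are the three-component feature layout and the sample values; the features are also assumed to be normalized to comparable scales.

```python
import math
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, ...]


def nearest_preset_frequency(
    mapping: Dict[Vector, int],
    third_data: Sequence[float],
) -> int:
    """Return the preset frequency whose first data is closest to third_data.

    Assumption: Euclidean distance over equally scaled features.
    """
    closest = min(mapping, key=lambda first: math.dist(first, third_data))
    return mapping[closest]


# Hypothetical preset mapping: (compute amount, memory access amount, bandwidth)
# -> preset chip frequency in MHz.
preset = {
    (1.0, 8.0, 0.5): 800,
    (8.0, 1.0, 0.5): 1500,
    (4.0, 4.0, 1.0): 1100,
}

chosen = nearest_preset_frequency(preset, (7.0, 2.0, 0.6))
```

In this sample the query lies closest to the second entry, so `chosen` is 1500. Without normalization, a feature with a large numeric range would dominate the distance.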
In one example, the frequency control module 52 is further configured to, before acquiring the preset mapping relationship between the multiple pieces of first data and the multiple pieces of second data: acquire multiple sets of selectable chip frequencies and acquire multiple discrete pieces of first data obtained by sampling; and, for each piece of first data, select one set of chip frequencies from the multiple sets of selectable chip frequencies as the second data corresponding to that first data, and establish a mapping relationship between that first data and the selected second data.
In one example, the preset chip frequency corresponding to the first data in the preset mapping relationship is selected from the multiple sets of selectable chip frequencies based on the performance evaluation parameters corresponding to the first data under each of the multiple sets of selectable chip frequencies.
In one example, the performance evaluation parameters include a task processing performance parameter and chip operating power consumption. The preset chip frequency corresponding to the first data in the preset mapping relationship is the selectable chip frequency, among the multiple sets of selectable chip frequencies, for which the chip operating power consumption is lower than a preset power consumption and the task processing performance parameter is optimal.
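The selection rule above (power below a preset limit, best task processing performance) can be sketched as a filter followed by a minimum. This assumes that a "set" of chip frequencies is a (core, memory) pair and that performance is measured as running time, where lower is better; the measurement values are made up for illustration.

```python
from typing import Dict, Optional, Tuple

FreqSet = Tuple[int, int]  # assumed layout: (core MHz, memory MHz)


def pick_preset_frequency(
    measurements: Dict[FreqSet, Tuple[float, float]],
    power_limit_w: float,
) -> Optional[FreqSet]:
    """Pick the fastest frequency set whose power stays below the limit.

    measurements maps each selectable frequency set to (runtime_s, power_w)
    for one piece of first data. Returns None if no set meets the limit.
    """
    feasible = {
        freq: (runtime, power)
        for freq, (runtime, power) in measurements.items()
        if power < power_limit_w
    }
    if not feasible:
        return None
    return min(feasible, key=lambda freq: feasible[freq][0])


# Hypothetical measurements for one sampled piece of first data.
measured = {
    (1500, 900): (1.0, 260.0),  # fastest, but over the limit
    (1300, 900): (1.2, 240.0),
    (1100, 700): (1.6, 200.0),
}

best = pick_preset_frequency(measured, power_limit_w=250.0)
```

Repeating this for every sampled piece of first data would yield the preset mapping relationship used for the lookup at run time.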
In one example, as shown in FIG. 6, the apparatus may further include an interface module 55, configured to receive configuration information for the preset mapping relationship input by a user.
In one example, when determining the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks, the frequency control module 52 is configured to: select, based on the task parameters of each of the multiple subtasks and from multiple sets of selectable chip frequencies, the chip frequency that enables the chip to achieve the shortest task running time under the chip power consumption limit, as the target chip frequency.
In one example, the interface module 55 is further configured to receive frequency setting strategy information input by a user, and the frequency control module 52 is further configured to determine the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each of the multiple subtasks.
In one example, the frequency setting strategy information includes turning on or off the dynamic chip-frequency setting function for subtasks.
In one example, the operating frequency includes at least one of the following: the core frequency of the chip, or the memory frequency.
In some embodiments, the above apparatus can be used to perform any of the corresponding methods described above; for brevity, details are not repeated here.
An embodiment of the present disclosure further provides an electronic device. As shown in FIG. 7, the electronic device 700 includes a memory 710 and a processor 720. The memory 710 is configured to store machine-readable instructions 711, and the processor 720 is configured to invoke the machine-readable instructions 711 to implement the method for setting the operating frequency of a chip according to any embodiment of this specification. The target chip whose operating frequency is set may be the memory 710 and/or the processor 720. In one example, the electronic device 700 may further include a communication interface 730 and a bus 740, with the memory 710, the processor 720, and the communication interface 730 connected to one another through the bus 740. In one example, the electronic device 700 may further include a chip 750, configured to process each subtask of the target task at the operating frequency set by the processor 720.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program causes the processor to implement the method for setting the operating frequency of a chip according to any embodiment of this specification. In one example, the computer-readable storage medium may be the memory 710 in FIG. 7.
Those skilled in the art should understand that one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
In the embodiments of the present disclosure, "and/or" means at least one of the two. For example, "A and/or B" covers three options: A, B, and "A and B".
The embodiments in the present disclosure are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the data processing device embodiment is basically similar to the method embodiment, it is described relatively briefly; for related details, refer to the corresponding description of the method embodiment.
Specific embodiments of the present disclosure have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
Embodiments of the subject matter and functional operations described in the present disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in the present disclosure and their structural equivalents, or in a combination of one or more of them. Embodiments of the subject matter described in the present disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions can be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in the present disclosure can be performed by one or more programmable computers executing one or more computer programs to perform the corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by special-purpose logic circuitry, for example an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit), and the apparatus can also be implemented as special-purpose logic circuitry.
Computers suitable for executing a computer program include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit receives instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also includes one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or is operatively coupled to such mass storage devices to receive data from them, transfer data to them, or both. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name just a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.
Although the present disclosure contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what is claimed; rather, they mainly describe features of specific embodiments of a particular disclosure. Certain features described in the present disclosure in the context of multiple embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be removed from the combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In some cases, multitasking and parallel processing may be advantageous. In addition, the separation of various system modules and components in the above embodiments should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Specific embodiments of the subject matter have thus been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing may be advantageous.
The foregoing describes only preferred embodiments of one or more embodiments of the present disclosure and is not intended to limit them. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of the present disclosure shall fall within the scope of protection of one or more embodiments of the present disclosure.

Claims (35)

  1. A method for setting the operating frequency of a chip, characterized in that the method comprises:
    acquiring multiple subtasks of a target task and task parameters of each subtask, the task parameters including a parameter representing the operation scale of the subtask;
    determining a target chip frequency of each subtask based on the task parameters of each of the multiple subtasks; and
    setting, according to the determined target chip frequency of each subtask, the operating frequency at which the chip executes that subtask.
  2. The method according to claim 1, characterized in that the method further comprises:
    parsing the target task to obtain the multiple subtasks and the task parameters of each subtask; and
    storing the correspondence between each of the multiple subtasks and the task parameters of that subtask;
    wherein acquiring the multiple subtasks of the target task and the task parameters of each subtask comprises: looking up the task parameters of each subtask in the stored correspondence.
  3. The method according to claim 1 or 2, characterized in that the task parameters include at least one of the following: the computation amount of the subtask, and the memory access amount of the subtask.
  4. The method according to any one of claims 1 to 3, characterized in that determining the target chip frequency of each subtask based on the task parameters of each of the multiple subtasks comprises:
    acquiring device information of the device where the chip is located, the device information including device resource information; and
    determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks.
  5. The method according to claim 4, characterized in that the device resource information includes any one or more of the following: the number of computing units, bandwidth, and memory capacity.
  6. The method according to claim 4 or 5, characterized in that the device information further includes the chip temperature of the chip;
    determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks comprises:
    determining the target chip frequency of a first subtask among the multiple subtasks based on the task parameters of the first subtask, the device resource information, and the chip temperature when the first subtask is executed.
  7. The method according to any one of claims 4 to 6, characterized in that determining the target chip frequency of each subtask based on the device information and the task parameters of each of the multiple subtasks comprises:
    acquiring a preset mapping relationship between multiple pieces of first data and multiple pieces of second data, the first data including preset task parameters and preset device information, and the second data including a preset chip frequency; and
    determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
  8. The method according to claim 7, characterized in that determining the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask comprises:
    determining the distance between third data and each of the multiple pieces of first data in the preset mapping relationship, the third data including the task parameters of a first subtask among the multiple subtasks and the device information; and
    using the preset chip frequency corresponding to the first data that is closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
  9. The method according to claim 7 or 8, wherein before the acquiring the preset mapping relationship between the plurality of first data and the plurality of second data, the method further comprises:
    acquiring multiple sets of selectable chip frequencies, and acquiring a plurality of discrete first data obtained by sampling; and
    for each first data of the plurality of first data, selecting one set of chip frequencies from the multiple sets of selectable chip frequencies as the second data corresponding to that first data, and establishing a mapping relationship between that first data and the selected second data.
  10. The method according to any one of claims 7 to 9, wherein the preset chip frequency corresponding to the first data in the preset mapping relationship is selected from multiple sets of selectable chip frequencies based on performance evaluation parameters corresponding to the first data under each set of the selectable chip frequencies.
  11. The method according to claim 10, wherein the performance evaluation parameters comprise a task processing performance parameter and chip operating power consumption; and
    the preset chip frequency corresponding to the first data in the preset mapping relationship is the selectable chip frequency, among the multiple sets of selectable chip frequencies, for which the chip operating power consumption is lower than a preset power consumption and the task processing performance parameter is optimal.
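How the preset mapping of claims 9 to 11 might be built offline can be sketched as follows. The sketch assumes that "optimal task processing performance" means the lowest running time, and it uses a toy cost model (`toy_evaluate`) in place of real profiling; both are illustrative assumptions, and all function names and coefficients are hypothetical.

```python
def build_preset_mapping(samples, candidate_freqs, evaluate, power_limit):
    """Map each sampled first data to the best candidate frequency set.

    evaluate(first_data, freq) must return (runtime, power). A candidate is
    eligible only if its power is below power_limit; among eligible
    candidates, the one with the lowest runtime is kept.
    """
    mapping = {}
    for first_data in samples:
        best_freq, best_runtime = None, float("inf")
        for freq in candidate_freqs:
            runtime, power = evaluate(first_data, freq)
            if power < power_limit and runtime < best_runtime:
                best_freq, best_runtime = freq, runtime
        if best_freq is not None:  # skip samples with no eligible candidate
            mapping[first_data] = best_freq
    return mapping


def toy_evaluate(first_data, freq):
    """Toy stand-in for profiling on real hardware (assumed model)."""
    compute_ops, memory_bytes = first_data
    core_mhz, memory_mhz = freq
    # Runtime is bounded by the slower of the compute and memory phases.
    runtime = max(compute_ops / core_mhz, memory_bytes / memory_mhz)
    # Power grows linearly with both frequencies (made-up coefficients).
    power = 0.1 * core_mhz + 0.05 * memory_mhz
    return runtime, power
```

With this toy model, a compute-heavy sample ends up mapped to the candidate with the higher core frequency and a memory-heavy sample to the candidate with the higher memory frequency, provided both stay under the power limit.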
  12. The method according to any one of claims 7 to 11, wherein the method further comprises:
    receiving configuration information for the preset mapping relationship input by a user.
  13. The method according to any one of claims 1 to 12, wherein the determining the target chip frequency of each subtask based on the task parameters of each of the plurality of subtasks comprises:
    selecting, from multiple sets of selectable chip frequencies and based on the task parameters of each of the plurality of subtasks, the chip frequency that enables the chip to achieve the lowest task running time under a chip power consumption limit, as the target chip frequency.
  14. The method according to any one of claims 1 to 13, wherein the method further comprises:
    receiving frequency setting strategy information input by a user;
    wherein the determining the target chip frequency of each subtask based on the task parameters of each of the plurality of subtasks comprises: determining the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each of the plurality of subtasks.
  15. The method according to claim 14, wherein the frequency setting strategy information comprises enabling or disabling a function of dynamically setting the chip frequency for subtasks.
  16. The method according to any one of claims 1 to 15, wherein the operating frequency comprises at least one of the following: a core frequency of the chip, or a memory frequency.
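The user-controlled switch of claims 14 and 15 could look roughly like the sketch below. The `FrequencyPolicy` class, its field names, and the fallback to a single default frequency when the feature is disabled are all assumptions made for illustration; the claims only state that the strategy information can enable or disable per-subtask dynamic frequency setting.

```python
from dataclasses import dataclass


@dataclass
class FrequencyPolicy:
    """Hypothetical container for user-supplied frequency setting strategy."""

    dynamic_per_subtask: bool = True  # on/off switch for per-subtask setting
    default_freq_mhz: int = 1000      # assumed fallback when the switch is off


def target_frequencies(subtask_params, policy, choose):
    """Return one target frequency per subtask, honouring the policy switch.

    choose(params) is any per-subtask selection rule, for example a lookup
    in a preset mapping.
    """
    if not policy.dynamic_per_subtask:
        # Feature disabled: every subtask runs at the same default frequency.
        return [policy.default_freq_mhz for _ in subtask_params]
    return [choose(params) for params in subtask_params]
```

Keeping the selection rule as a separate `choose` callable means the policy switch stays independent of how the per-subtask frequency is actually derived.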
  17. A device for setting the operating frequency of a chip, wherein the device comprises:
    an acquisition module, configured to acquire a plurality of subtasks of a target task and task parameters of each subtask, wherein the task parameters comprise a parameter indicating the operation scale of the subtask;
    a frequency control module, configured to determine a target chip frequency of each subtask based on the task parameters of each of the plurality of subtasks; and
    a frequency setting module, configured to set, according to the determined target chip frequency of each subtask, the operating frequency at which the chip executes each subtask.
  18. The device according to claim 17, wherein the device further comprises:
    a task parsing module, configured to parse the target task to obtain the plurality of subtasks and the task parameters of each subtask, and to store a correspondence between each of the plurality of subtasks and the task parameters of that subtask;
    wherein the acquisition module is further configured to look up the task parameters of each subtask in the stored correspondence.
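The module structure of claims 17 and 18 can be sketched as a small pipeline: parse the target task, store the subtask-to-parameter correspondence, derive a target frequency per subtask, and apply it before the subtask runs. All class names, the arithmetic-intensity threshold, and the frequency values are hypothetical, and `FrequencySetter` merely records calls instead of talking to a real driver.

```python
class TaskParser:
    """Parses a target task and stores each subtask's task parameters."""

    def __init__(self):
        self._params = {}

    def parse(self, target_task):
        # target_task: iterable of (name, compute_ops, memory_bytes) tuples
        for name, compute_ops, memory_bytes in target_task:
            self._params[name] = {
                "compute_ops": compute_ops,
                "memory_bytes": memory_bytes,
            }
        return list(self._params)  # subtask names in insertion order

    def lookup(self, name):
        return self._params[name]


class FrequencyController:
    """Chooses a target frequency from task parameters (assumed rule)."""

    def target_frequency(self, params):
        # Assumed heuristic: many operations per byte means compute-bound,
        # so favour core frequency; otherwise favour memory frequency.
        intensity = params["compute_ops"] / params["memory_bytes"]
        if intensity > 1.0:
            return {"core_mhz": 1200, "memory_mhz": 800}
        return {"core_mhz": 800, "memory_mhz": 1200}


class FrequencySetter:
    """Stand-in for the driver call that would reprogram the chip clocks."""

    def __init__(self):
        self.applied = []

    def set_frequency(self, subtask, freq):
        self.applied.append((subtask, freq))


def run(target_task):
    parser, controller, setter = TaskParser(), FrequencyController(), FrequencySetter()
    for name in parser.parse(target_task):
        freq = controller.target_frequency(parser.lookup(name))
        setter.set_frequency(name, freq)  # applied before the subtask executes
    return setter.applied
```

Calling `run([("conv", 1e9, 1e6), ("copy", 1e3, 1e9)])` would assign the compute-leaning setting to `conv` and the memory-leaning setting to `copy` under the assumed heuristic.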
  19. The device according to claim 17 or 18, wherein the task parameters comprise at least one of the following: a computation amount of the subtask, or a memory access amount of the subtask.
  20. The device according to any one of claims 17 to 19, wherein the frequency control module is further configured to:
    acquire device information of the device in which the chip is located, wherein the device information comprises device resource information; and
    determine the target chip frequency of each subtask based on the device information and the task parameters of each of the plurality of subtasks.
  21. The device according to claim 20, wherein the device resource information comprises any one or more of the following: the number of computing units, bandwidth, and memory capacity.
  22. The device according to claim 20 or 21, wherein the device information further comprises the chip temperature of the chip; and
    the frequency control module is further configured to determine the target chip frequency of a first subtask among the plurality of subtasks based on the task parameters of the first subtask, the device resource information, and the chip temperature when the first subtask is executed.
  23. The device according to any one of claims 20 to 22, wherein the frequency control module is further configured to:
    acquire a preset mapping relationship between a plurality of first data and a plurality of second data, wherein the first data comprises preset task parameters and preset device information, and the second data comprises a preset chip frequency; and
    determine the target chip frequency of each subtask according to the preset mapping relationship, the device information, and the task parameters of each subtask.
  24. The device according to claim 22, wherein the frequency control module is further configured to:
    determine a distance between third data and each of the plurality of first data in the preset mapping relationship, wherein the third data comprises the device information and the task parameters of a first subtask among the plurality of subtasks; and
    use the preset chip frequency corresponding to the first data closest to the third data in the preset mapping relationship as the target chip frequency of the first subtask.
  25. The device according to claim 23 or 24, wherein the frequency control module is further configured to:
    before acquiring the preset mapping relationship between the plurality of first data and the plurality of second data, acquire multiple sets of selectable chip frequencies, and acquire a plurality of discrete first data obtained by sampling; and
    for each first data of the plurality of first data, select one set of chip frequencies from the multiple sets of selectable chip frequencies as the second data corresponding to that first data, and establish a mapping relationship between that first data and the selected second data.
  26. The device according to any one of claims 23 to 25, wherein the preset chip frequency corresponding to the first data in the preset mapping relationship is selected from multiple sets of selectable chip frequencies based on performance evaluation parameters corresponding to the first data under each set of the selectable chip frequencies.
  27. The device according to claim 26, wherein
    the performance evaluation parameters comprise a task processing performance parameter and chip operating power consumption; and
    the preset chip frequency corresponding to the first data in the preset mapping relationship is the selectable chip frequency, among the multiple sets of selectable chip frequencies, for which the chip operating power consumption is lower than a preset power consumption and the task processing performance parameter is optimal.
  28. The device according to any one of claims 23 to 27, wherein the device further comprises:
    an interface module, configured to receive configuration information for the preset mapping relationship input by a user.
  29. The device according to any one of claims 17 to 28, wherein the frequency control module is further configured to:
    select, from multiple sets of selectable chip frequencies and based on the task parameters of each of the plurality of subtasks, the chip frequency that enables the chip to achieve the lowest task running time under a chip power consumption limit, as the target chip frequency.
  30. The device according to claim 28, wherein
    the interface module is further configured to receive frequency setting strategy information input by a user; and
    the frequency control module is further configured to determine the target chip frequency of each subtask based on the frequency setting strategy information and the task parameters of each of the plurality of subtasks.
  31. The device according to claim 30, wherein the frequency setting strategy information comprises enabling or disabling a function of dynamically setting the chip frequency for subtasks.
  32. The device according to any one of claims 17 to 31, wherein the operating frequency comprises at least one of the following: a core frequency of the chip, or a memory frequency.
  33. An electronic device, comprising a memory and a processor, wherein the memory is configured to store machine-readable instructions, and the processor is configured to invoke the machine-readable instructions to implement the method according to any one of claims 1 to 16.
  34. The device according to claim 33, further comprising:
    a chip, configured to process each subtask of a target task at the operating frequency set by the processor.
  35. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, causes the processor to implement the method according to any one of claims 1 to 16.
PCT/CN2020/105195 2019-11-29 2020-07-28 Configuration of operating frequency of chip WO2021103618A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217020535A KR20210098508A (en) 2019-11-29 2020-07-28 Setting the chip operating frequency
JP2021538698A JP2022516549A (en) 2019-11-29 2020-07-28 Chip operating frequency setting

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911201455.9 2019-11-29
CN201911201455.9A CN112882819B (en) 2019-11-29 2019-11-29 Method and device for setting chip working frequency

Publications (1)

Publication Number Publication Date
WO2021103618A1 (en) 2021-06-03

Family

ID=76038734

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105195 WO2021103618A1 (en) 2019-11-29 2020-07-28 Configuration of operating frequency of chip

Country Status (5)

Country Link
JP (1) JP2022516549A (en)
KR (1) KR20210098508A (en)
CN (1) CN112882819B (en)
TW (1) TWI743934B (en)
WO (1) WO2021103618A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785376B (en) * 2022-05-06 2023-07-21 Oppo广东移动通信有限公司 Frequency-voltage pre-configuration method and related device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2362297A1 (en) * 2010-02-25 2011-08-31 Telefonaktiebolaget L M Ericsson (publ) Technique for selecting a frequency of operation in a processor system
CN102169357A (en) * 2011-02-23 2011-08-31 北京大学深圳研究生院 DSP (Digital Signal Processor) capable of regulating working voltage and clock frequency and regulating method thereof
CN103955264A (en) * 2014-05-15 2014-07-30 乐视致新电子科技(天津)有限公司 Method and system for dynamically regulating working frequency of processor
CN104407690A (en) * 2014-12-19 2015-03-11 中科创达软件股份有限公司 Method, device and mobile terminal for regulating operating frequency of CPU (Central Processing Unit)
US9400518B2 (en) * 2013-06-05 2016-07-26 Qualcomm Innovation Center, Inc. Temporary frequency adjustment of mobile device processors based on task migration
CN107357658A (en) * 2017-06-29 2017-11-17 上海斐讯数据通信技术有限公司 A kind of method and system for adjusting mobile terminal cpu frequency

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2003083693A1 (en) * 2002-04-03 2005-08-04 富士通株式会社 Task scheduling device in distributed processing system
US7206950B2 (en) * 2004-06-16 2007-04-17 Matsushita Electric Industrial Co., Ltd. Processor system, instruction sequence optimization device, and instruction sequence optimization program
WO2006011189A1 (en) * 2004-07-26 2006-02-02 Mitsubishi Denki Kabushiki Kaisha Parallel computer
US20100064124A1 (en) * 2006-11-16 2010-03-11 Karl Rinne Digital power controller
TWI336453B (en) * 2006-12-18 2011-01-21 Asustek Comp Inc Method for adjusting working frequency of chip
WO2009047853A1 (en) * 2007-10-11 2009-04-16 Fujitsu Limited Information processor, operation control method, and operation control program
US8381004B2 (en) * 2010-05-26 2013-02-19 International Business Machines Corporation Optimizing energy consumption and application performance in a multi-core multi-threaded processor system
KR101885857B1 (en) * 2012-01-04 2018-08-06 삼성전자주식회사 Temperature management unit, system on chip including the same and method of managing temperature in a system on chip
JP5898342B2 (en) * 2012-12-21 2016-04-06 ルネサスエレクトロニクス株式会社 Semiconductor device and control method thereof
US9494996B2 (en) * 2013-03-15 2016-11-15 Intel Corporation Processor having frequency of operation information for guaranteed operation under high temperature events
CN105045702B (en) * 2015-07-22 2018-03-20 Tcl移动通信科技(宁波)有限公司 It is a kind of that the method for chip, system and chip are protected according to chip temperature
CN106527653A (en) * 2016-10-12 2017-03-22 东软集团股份有限公司 CPU frequency adjusting method and apparatus
JP2018106591A (en) * 2016-12-28 2018-07-05 ルネサスエレクトロニクス株式会社 Semiconductor device, operation control method, and program
CN108334405A (en) * 2017-01-20 2018-07-27 阿里巴巴集团控股有限公司 Frequency isomery CPU, frequency isomery implementation method, device and method for scheduling task
US10921874B2 (en) * 2017-03-06 2021-02-16 Facebook Technologies, Llc Hardware-based operating point controller for circuit regions in an integrated circuit
CN109212306B (en) * 2017-07-06 2021-02-26 龙芯中科技术股份有限公司 Method, circuit and device for adjusting chip power consumption
CN107562682A (en) * 2017-08-01 2018-01-09 华南理工大学 Communication means based on many-core chip coker heat energy in the heart
CN107678855B (en) * 2017-09-19 2020-06-12 中国电子产品可靠性与环境试验研究所 Processor dynamic adjustment method and device and processor chip
CN108268265A (en) * 2018-01-24 2018-07-10 深圳市道通科技股份有限公司 Link mapping method, software code method for burn-recording and the burning host in common code library

Also Published As

Publication number Publication date
CN112882819B (en) 2022-03-08
KR20210098508A (en) 2021-08-10
TW202121116A (en) 2021-06-01
CN112882819A (en) 2021-06-01
TWI743934B (en) 2021-10-21
JP2022516549A (en) 2022-02-28

Legal Events

Date Code Title Description
ENP Entry into the national phase (Ref document number: 2021538698; Country of ref document: JP; Kind code of ref document: A) (Ref document number: 20217020535; Country of ref document: KR; Kind code of ref document: A)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20892245; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20892245; Country of ref document: EP; Kind code of ref document: A1)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.01.2023))
122 Ep: pct application non-entry in european phase (Ref document number: 20892245; Country of ref document: EP; Kind code of ref document: A1)