WO2021056033A2 - Apparatus and method of intelligent power and performance management - Google Patents

Publication number
WO2021056033A2
WO2021056033A2 PCT/US2021/014235 US2021014235W WO2021056033A2 WO 2021056033 A2 WO2021056033 A2 WO 2021056033A2 US 2021014235 W US2021014235 W US 2021014235W WO 2021056033 A2 WO2021056033 A2 WO 2021056033A2
Authority
WO
WIPO (PCT)
Prior art keywords
information
power
functional blocks
training system
configuration
Prior art date
Application number
PCT/US2021/014235
Other languages
French (fr)
Other versions
WO2021056033A3 (en
Inventor
Jing Wu
Jian Huang
Yu Zhang
Original Assignee
Zeku, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zeku, Inc. filed Critical Zeku, Inc.
Priority to PCT/US2021/014235 priority Critical patent/WO2021056033A2/en
Publication of WO2021056033A2 publication Critical patent/WO2021056033A2/en
Publication of WO2021056033A3 publication Critical patent/WO2021056033A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/324 Power saving characterised by the action undertaken by lowering clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/3296 Power saving characterised by the action undertaken by lowering the supply or operating voltage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • Embodiments of the present disclosure relate to an apparatus and method for configurable power management within a subsystem.
  • SoC system on a chip
  • a subsystem of an SoC may be comprised of a plurality of functional blocks that together process instructions that enable the subsystem to perform its dedicated function.
  • the functional blocks may be organized into pipelines configured to concurrently perform fixed functions, process different portions of an instruction, or process different instructions.
  • Embodiments of the disclosure provide a configuration training system for configurable power management.
  • the configuration training system may include a memory and at least one processor coupled to the memory.
  • the at least one processor may be configured to receive application information associated with a plurality of functional blocks.
  • the at least one processor may be configured to receive first status feedback information from the plurality of functional blocks.
  • the at least one processor may be configured to generate first power control information based at least in part on one or more of the application information or the first status feedback information.
  • the at least one processor may generate first function execution control state information based at least in part on the application information and the first status feedback information.
  • the at least one processor may send the first power control information to a power controller. In certain other aspects, the at least one processor may send the first function execution control state information to the plurality of functional blocks.
  • Embodiments of the disclosure provide a method for configurable power management of a configuration training system. The method may include receiving, at a configuration training system, application information associated with a plurality of functional blocks. In certain aspects, the method may further include receiving, at the configuration training system, first status feedback information from the plurality of functional blocks. In certain other aspects, the method may further include generating, at the configuration training system, first power control information based at least in part on one or more of the application information or the first status feedback information.
  • the method may further include generating, at the configuration training system, first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the method may further include sending the first power control information to a power controller. In certain other aspects, the method may further include sending the first function execution control state information to the plurality of functional blocks.
  • Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to perform configurable power management of a configuration training system. The method may further include generating second power control information based at least in part on the second status feedback information.
  • the method may further include generating second function execution control state information based at least in part on the second status feedback information. In certain other aspects, the method may further include sending the second power control information to the power controller. In certain other aspects, the method may further include sending the second function execution control state information to the plurality of functional blocks.
  • Embodiments of the disclosure provide a power controller for configurable power management.
  • the power controller may include a calculation block and a power control block. In certain aspects, the calculation block may be configured to receive first power control information associated with a plurality of functional blocks. In certain other aspects, the calculation block may be configured to identify a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information or the first function execution control state information.
  • the power control block may be configured to apply the first power management technique to the plurality of functional blocks.
  • Embodiments of the disclosure provide a method for configurable power management of a power controller. The method may include receiving, at a power controller, first power control information associated with a plurality of functional blocks. In certain other aspects, the method may include receiving first function execution control state information indicating an allocation of resources at the plurality of functional blocks. In certain aspects, the method may further include identifying, at the power controller, a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information or the first function execution control state information. In certain other aspects, the method may include applying the first power management technique to the plurality of functional blocks.
  • Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by a power controller, cause the power controller to perform configurable power management of a power controller.
  • the method may include receiving first power control information associated with a plurality of functional blocks.
  • the method may further include receiving first function execution control state information indicating an allocation of resources at the plurality of functional blocks.
  • the method may further include identifying a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information or the first function execution control state information.
  • the method may include applying the first power management technique to the plurality of functional blocks.
  • FIG. 1 illustrates a block diagram of an embedded SoC apparatus, in accordance with certain aspects of the disclosure.
  • FIG. 2 illustrates a block diagram of a subsystem that is part of an SoC apparatus, in accordance with certain aspects of the disclosure.
  • FIG. 3A illustrates a block diagram of a power controller configured to perform configurable power management, in accordance with certain aspects of the disclosure.
  • FIG. 3B illustrates a detailed view of a power meter that may be configured to perform configurable power management, in accordance with certain aspects of the disclosure.
  • FIG. 3C illustrates another block diagram of a subsystem that is part of an SoC, in accordance with certain aspects of the disclosure.
  • FIG. 3D illustrates a data flow performed by a power controller, in accordance with certain aspects of the disclosure.
  • FIG. 4 illustrates a block diagram of an exemplary system for configurable power management, according to embodiments of the disclosure.
  • FIG. 5 illustrates a flow chart of an exemplary method for configurable power management, according to embodiments of the disclosure.
  • FIG. 6 illustrates a data flow diagram of an exemplary system for configurable power management, according to embodiments of the disclosure.
  • FIG. 7 illustrates a block diagram of a conventional thermal and current limits management system.
  • Embodiments of the present disclosure will be described with reference to the accompanying drawings.

DETAILED DESCRIPTION

  • Although specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present disclosure. It will be apparent to a person skilled in the pertinent art that the present disclosure can also be employed in a variety of other applications.
  • references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” “some embodiments,” “certain embodiments,” etc. indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of a person skilled in the pertinent art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • In general, terminology may be understood at least in part from usage in context.
  • the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures, or characteristics in a plural sense.
  • terms such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context.
  • the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
  • an SoC is an integrated circuit that integrates subsystems, each having a different function, in a computing system or other electronic device.
  • the subsystems integrated by an SoC may include, without limitation, one or more of the following: central processing units (CPUs), graphical processing units (GPUs), microcontrollers, microprocessors, multiprocessors, digital signal processor (DSP) cores, other types of cores, a memory unit, read-only memory (ROM), random-access memory (RAM), clock signal generators, input/output (I/O) interfaces, analog interfaces, voltage regulators and power management circuits, an advanced peripheral unit(s), wireless communication unit(s) (e.g., Wi-Fi module, cellular module, 5G new radio (NR) module, Bluetooth® module, etc.), or coprocessors, just to name a few.
  • CPUs central processing units
  • GPUs graphical processing units
  • DSP digital signal processor
  • I/O input/output
  • a subsystem may perform its dedicated function by running an application comprised of one or more instructions.
  • the subsystem may perform instruction pipelining to implement instruction-level parallelism within a single processor, computing device, and/or circuit. Pipelining attempts to keep every part of the subsystem (e.g., the functional blocks) busy with some instruction by dividing incoming instructions into different portions that each include a series of sequential steps (e.g., “pipeline”).
  • a pipeline may include a plurality of functional blocks configured to perform the series of sequential steps.
  • Certain pipelines may increase the power consumption and thermal output of the subsystem more than others.
  • subsystems have been designed with an increased number of functional blocks to increase pipelining and multi-thread performance within the subsystem.
  • the increased number of functional blocks may also increase, among other things, the amount of power and, hence, the thermal output associated with a subsystem.
  • Power management techniques may be used to optimize power consumption and mitigate the temperature within the subsystem.
  • One such power management technique is frequency reduction. Frequency reduction may be used to reduce the power consumed by the subsystem.
  • Another such power management technique is dynamic voltage and frequency scaling (DVFS). DVFS may be used to reduce the power consumption of a subsystem on the fly by scaling down the voltage and frequency based on the targeted performance requirements of the application being run by the subsystem.
  • DVFS dynamic voltage and frequency scaling
  • FIG. 7 illustrates a block diagram of a conventional thermal and current limits management system 700 (e.g., “conventional system 700,” hereinafter).
  • conventional system 700 may include a subsystem 702, a current sensor 704 and a temperature sensor 706 located at the subsystem 702, a voltage source 708, a temperature / current monitor and mitigation unit 710 (e.g., “mitigation unit 710,” hereinafter), and a phase-locked loop (PLL) frequency source 712.
  • PLL phase-locked loop
  • the mitigation unit 710 may receive current information 703 and temperature information 705 from the current sensor 704 and temperature sensor 706, respectively. Normally, the mitigation unit 710 may include programmable thresholds and an algorithm to perform a mitigation scheme for the entire subsystem 702 (e.g., across all pipelines) using the clock frequency 701 and DVFS.
  • An increase in subsystem temperature may result from the dissipation of switching power during periods of high switching activity in certain pipelines according to the dynamic power equation set forth below as Equation (1):
    $$P_{switching} = a \cdot f \cdot c_{eff} \cdot V_{dd}^{2} \qquad (1)$$
    where P_switching is the switching power, a is the switching activity, f is the switching frequency, c_eff is the effective capacitance, and V_dd is the supply voltage.
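By way of a rough numeric illustration, the sketch below assumes the standard form of the dynamic power relation, P_switching = a · f · c_eff · V_dd², and compares a frequency-only reduction with a combined voltage and frequency step-down. The function name and every numeric value are hypothetical and are not taken from the disclosure.

```python
def switching_power(a: float, f_hz: float, c_eff: float, v_dd: float) -> float:
    """Dynamic switching power in watts, assuming P = a * f * c_eff * v_dd**2."""
    return a * f_hz * c_eff * v_dd ** 2


# Hypothetical operating points (illustrative values only).
baseline = switching_power(a=0.2, f_hz=1.0e9, c_eff=1.0e-9, v_dd=1.0)
freq_only = switching_power(a=0.2, f_hz=0.5e9, c_eff=1.0e-9, v_dd=1.0)
dvfs = switching_power(a=0.2, f_hz=0.5e9, c_eff=1.0e-9, v_dd=0.8)

print(f"baseline:       {baseline:.3f} W")
print(f"frequency only: {freq_only:.3f} W ({freq_only / baseline:.0%} of baseline)")
print(f"DVFS step down: {dvfs:.3f} W ({dvfs / baseline:.0%} of baseline)")
```

Because the voltage term is squared, lowering V_dd together with f may save more power than halving f alone, which is consistent with DVFS being described as the stronger mitigation.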
  • the mitigation unit 710 may use a predictive mechanism to reduce the frequency of the subsystem by sending a signal 711 to the PLL frequency source 712.
  • the PLL frequency source 712 may reduce the clock frequency 701 of the subsystem 702 and, therefore, power consumption when the temperature reaches a threshold.
  • the mitigation unit 710 may use the thermal mitigation algorithm to lower the DVFS set point, which may further reduce the power consumption.
  • a fast and steep reduction in the clock frequency 701 may be used depending on the algorithm.
  • the mitigation unit 710 may send a signal 707 instructing the voltage source 708 to lower the voltage (V_dd) 709 and a signal 711 instructing the PLL frequency source 712 to lower the clock frequency 701 using DVFS step down.
  • the mitigation unit 710 may decide between a frequency only reduction or DVFS step down depending on the algorithm.
  • the conventional system 700 may involve two mechanisms: 1) an inner loop to reduce frequency only and 2) an outer loop to reduce both voltage and frequency using DVFS step down.
  • the advantage of the inner loop is that PLL frequency reduction is usually fast (e.g., 50 ns, 100 ns, 200 ns, etc.).
  • the mitigation for temperature may be slow compared to current mitigation because heat transfer is relatively slow.
  • the outer loop may provide much faster and more effective mitigation for temperature but at the cost of more performance degradation.
  • the voltage 709 and clock frequency 701 of the entire subsystem may be reduced when the temperature reaches a threshold.
  • the overall performance of the subsystem may be reduced, e.g., due to increased processing time at all pipelines.
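The two mechanisms of conventional system 700 can be sketched as a single decision function. This is a minimal sketch under stated assumptions: the disclosure says only that the choice between frequency-only reduction and DVFS step down depends on the algorithm, so mapping over-current to the inner loop and over-temperature to the outer loop is an assumption, and `OperatingPoint`, `mitigate`, the step sizes, and the limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OperatingPoint:
    freq_mhz: float
    v_dd: float


def mitigate(point, current_a, temp_c, current_limit_a, temp_limit_c,
             freq_step_mhz=100.0, volt_step=0.05):
    """Return the next operating point for the entire subsystem."""
    if temp_c >= temp_limit_c:
        # Outer loop (assumed trigger): DVFS step down lowers voltage and frequency.
        return OperatingPoint(point.freq_mhz - freq_step_mhz, point.v_dd - volt_step)
    if current_a >= current_limit_a:
        # Inner loop (assumed trigger): fast PLL frequency-only reduction.
        return OperatingPoint(point.freq_mhz - freq_step_mhz, point.v_dd)
    return point  # No threshold reached, so nothing changes.


start = OperatingPoint(freq_mhz=1000.0, v_dd=0.9)
too_hot = mitigate(start, current_a=1.0, temp_c=95.0,
                   current_limit_a=2.0, temp_limit_c=90.0)
over_current = mitigate(start, current_a=2.5, temp_c=80.0,
                        current_limit_a=2.0, temp_limit_c=90.0)
nominal = mitigate(start, current_a=1.0, temp_c=80.0,
                   current_limit_a=2.0, temp_limit_c=90.0)
```

Note that either branch changes the operating point of the whole subsystem, which is the drawback described above: every pipeline slows down, not only the one responsible for the power event.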
  • FIG. 1 illustrates a block diagram of an embedded system-on-chip (SoC) 100, in accordance with certain aspects of the disclosure.
  • SoC system-on-chip
  • the SoC 100 may include, e.g., a main memory 102, a CPU 104, a system bus 106, an input-output (IO) processor 108, and a subsystem 110, just to name a few.
  • the subsystem 110 may be configured as, e.g., one or more of a microcontroller unit (MCU), a CPU, a GPU, microcontroller, a processor, microprocessor, a multiprocessor, DSP core, a circuit, a memory unit, ROM, RAM, clock signal generators, I/O interfaces, analog interfaces, voltage regulators and power management circuits, an advanced peripheral unit(s), wireless communication unit(s) (e.g., Wi-Fi module, cellular module, 5G NR module, Bluetooth® module, etc.), or coprocessors, just to name a few.
  • the subsystem 110 may comprise a plurality of functional blocks (e.g., seen in FIG. 2) configured to process one or more instructions. To enable parallel processing, the functional blocks may be organized into a plurality of pipelines. Each pipeline may be configured to process a portion of an instruction such that each of the pipelines process different portions of the instruction in parallel. A portion of an instruction may include a series of steps that are processed sequentially.
  • An example subsystem architecture including a plurality of functional blocks is illustrated in FIG.2.
  • the subsystem 110 may include an application scenario unit 130 that is configured to send application information to a configuration training system 140. The application information may be associated with an application performed and/or run by the functional blocks.
  • the application information may indicate, among others, pipeline architecture, instruction(s), character variables, etc.
  • the configuration training system 140 may be configured to generate (125) configuration information (e.g., power threshold, sliding window size, throttle duration, etc.) and/or function execution control state information (e.g., allocation of resources) based at least in part on the application information and/or feedback information (e.g., thermal feedback, power feedback, performance feedback, frequency feedback, etc.) from the plurality of functional blocks.
  • subsystem 110 may also include a configuration training system 140 that may configure the threshold conditions used by the power controller 150 for configurable power management.
  • one or more of the configuration training system 140 and/or the power controller 150 may be located external to the subsystem 110. When located externally, the configuration training system 140 and/or the power controller 150 may be in communication with the subsystem 110. Additional details associated with the configuration training system 140 are set forth below in connection with FIG. 3B.
  • FIG. 2 illustrates a more detailed view of subsystem 110 from FIG. 1, in accordance with certain aspects of the disclosure.
  • FIG. 3A illustrates a block diagram 300 of a power controller configured to perform configurable power management, in accordance with certain aspects of the disclosure.
  • the power controller 150 may be in communication with a plurality of pipelines (e.g., pipeline1 218, pipeline2 219, ... pipelineN 220).
  • Although the power controller 150 in FIG. 3A is illustrated as being in communication with three pipelines, the power controller 150 may be in communication with more or fewer than three pipelines without departing from the scope of the present disclosure.
  • FIG. 3B illustrates a detailed view of a power meter 312 that may be included in the power controller 150, in accordance with certain aspects of the disclosure.
  • FIG. 3C illustrates another block diagram of subsystem 110, in accordance with certain aspects of the disclosure.
  • FIG. 3D illustrates another block diagram of a power controller 150, in accordance with certain aspects of the present disclosure.
  • FIGs. 2, 3A, 3B, 3C, and 3D will now be described together.
  • the subsystem 110 illustrated in FIG. 2 is configured as a processor, and, hence, the functional blocks described below are those associated with a processor.
  • subsystem 110 is not limited to a processor and may include one or more of the other non-limiting examples of subsystem 110 described above in connection with FIG. 1.
  • a different combination of functional blocks than those described below may be included in subsystem 110 without departing from the scope of the present disclosure.
  • the number of functional blocks illustrated in FIG. 2 is for illustrative purposes only and not limited thereto.
  • subsystem 110 may include any different number or type of functional blocks without departing from the scope of the present disclosure.
  • subsystem 110 may comprise a plurality of functional blocks configured to process at least one instruction (e.g., the instruction, hereinafter). The plurality of functional blocks illustrated in FIG.
  • pipeline1 218 may include instruction cache 202, instruction buffer 204, ALU 206, and common register file 216.
  • the plurality of pipelines may be configured to process an instruction, logic, and/or dedicated function.
  • the instruction may include a plurality of portions that may be processed concurrently using different pipelines that are each comprised of a different set of functional blocks.
  • each portion of the instruction may include a set of stages. Each of the stages may be sequentially processed by one or more of the functional blocks in the pipeline.
  • a pipeline may include a register after each stage. The registers may be configured to store information from the instruction and/or calculation(s) from one stage so that the logic gates of the next stage in the pipeline may perform the subsequent step using the information in the register of the previous stage. After each stage, a handshake signal may be exchanged with the next downstream functional block indicating that the previous stage is complete, and the next stage should begin. Each stage may consume a certain amount of power.
  • the power consumed by a stage may be referred to as a power event.
  • pipelines may be included in a same device, subsystem, or functional block. Additionally and/or alternatively, pipelines may be located in separate devices, subsystems, or functional blocks.
  • By way of example and not limitation, assume that the portion of the instruction processed by pipeline1 218 includes five stages and that each stage is associated with a power event.
  • the first stage may include fetching (e.g., first power event) the portion of the instruction into the instruction buffer 204.
  • the second stage may include fetching (e.g., second power event) the decoded portion of the instruction into the ALU 206.
  • the third stage may include executing (e.g., third power event) a calculation at the ALU 206.
  • the fourth stage may include accessing (e.g., fourth power event) the common register file 216.
  • the fifth stage may include writing (e.g., fifth power event) information associated with the calculation to the common register file 216.
  • more or fewer than five stages may be associated with each portion of the instruction without departing from the scope of the present disclosure.
  • the operations described above in connection with the five stages are not limited to the operations described herein. Instead, the stages may include any number of different stages each performing any operation without departing from the scope of the present disclosure.
  • event information 301 (e.g., a power event signal) may be sent to the power controller 150.
  • Event information 301 may include power information indicating an amount or percentage of the power consumed during a stage.
  • the event information may include a plurality of event information.
  • the event information 301 may include first event information associated with a first power event, second event information associated with a second power event, and so on.
  • first event information may be sent at a first time (e.g., t 0 )
  • second event information may be sent at a second time (e.g., t 1 ), and so on.
  • the power controller 150 may receive a plurality of event information 301 for each pipeline.
  • the event counter unit 314 may calculate the total power for, e.g., pipeline1 218 during the clock cycle by summing the power bits indicated in each of the event information 301.
  • the event counter unit 314 may send the event information that indicates the total power consumed by the pipeline during that clock cycle.
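The per-cycle summation performed by the event counter unit can be sketched as follows. This is only an illustration: the tuple layout `(pipeline_id, cycle, power_bits)` is a hypothetical encoding of event information 301, and the function name and sample values are assumptions rather than anything specified in the disclosure.

```python
from collections import defaultdict


def total_power_per_cycle(events):
    """Sum power bits for each (pipeline, clock cycle) pair.

    `events` is an iterable of (pipeline_id, cycle, power_bits) tuples,
    a hypothetical encoding of event information 301.
    """
    totals = defaultdict(int)
    for pipeline_id, cycle, power_bits in events:
        totals[(pipeline_id, cycle)] += power_bits
    return dict(totals)


# Hypothetical power events: two in cycle 0 and one in cycle 1 for pipeline1.
events = [
    ("pipeline1", 0, 3),
    ("pipeline1", 0, 5),
    ("pipeline1", 1, 2),
    ("pipeline2", 0, 4),
]
totals = total_power_per_cycle(events)
```

Keying the totals by both pipeline and cycle keeps each pipeline's budget separate, which is what allows throttling to be applied per pipeline rather than to the entire subsystem.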
  • the power controller 150 may be a central power controller configured to probe the power events for each pipeline (e.g., pipeline1 218, pipeline2 219, pipelineN 220) in a subsystem 110.
  • the power controller 150 may include a plurality of power meters 312, each associated with one or more pipelines.
  • the power meter 312 may include an event counter unit 314, a sensitivity level unit 320, a sliding window unit 316, a throttle generator 318, a sensitivity level configuration unit 332, a sliding window configuration unit 334, and a throttle configuration unit 336, among others.
  • the sensitivity level unit 320 may be part of the event counter unit 314.
  • the sensitivity level unit 320 may be separate from but in communication with the event counter unit 314.
  • power meter 312 may be configured to probe power events for one or more pipelines.
  • the event counter unit 314 may monitor the power events for its respective pipeline(s) every clock cycle.
  • the event counter unit 314 may be configured to receive event information 301 each time a power event occurs in pipeline1 218.
  • Each of the event information 301 may include, e.g., power bit information indicating an amount of power associated with that event.
  • the sensitivity level unit 320 may be configured to determine whether the event information 301 meets a power threshold (e.g., a power percentage for a clock cycle). By way of example and not limitation, assume that the total power consumed by pipeline1 218 reaches 90% of the maximum allowable power for that clock cycle. The sensitivity level unit 320 may then send the event information 301 to the sliding window unit 316 when the power threshold of 90% is reached. The sensitivity level unit 320 may include a regulator trigger in the event information sent to the sliding window unit 316.
  • When the power threshold is not reached, the event information 301 may not be sent to the sliding window unit 316.
  • the sensitivity level may be set to 100%. In this case, the sensitivity level unit 320 may not exist, and the event information may be sent directly to the sliding window unit 316.
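The power threshold comparison made by the sensitivity level unit may be sketched as a single predicate. The function and parameter names are hypothetical, and integer arithmetic is used here only to avoid floating-point rounding at the boundary; the disclosure does not specify how the comparison is implemented.

```python
def regulator_trigger(total_power: int, max_power: int, threshold_pct: int = 90) -> bool:
    """Return True when the cycle's total power reaches the power threshold.

    Compares total_power / max_power against threshold_pct / 100 without
    division, so a total exactly at the threshold counts as reaching it.
    """
    return total_power * 100 >= threshold_pct * max_power


# Hypothetical cycle totals against a maximum allowable power of 200 units.
at_threshold = regulator_trigger(total_power=180, max_power=200)
below_threshold = regulator_trigger(total_power=179, max_power=200)
```

Here 180 of 200 units is exactly 90%, so a regulator trigger would accompany the event information, while 179 units would not forward anything to the sliding window unit.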
  • the sliding window unit 316 may be configured to determine whether the event information 301 meets a cycle threshold.
  • the cycle threshold may be the sliding window size.
  • the sliding window size may include a predetermined number of clock cycles (e.g., 1 clock cycle, 2 clock cycles, 10 clock cycles, 50 clock cycles, 100 clock cycles, etc.).
  • the sliding window unit 316 may determine that the cycle threshold is met when event information 301 associated with pipeline1 218 is received from the sensitivity level unit 320 for 100 clock cycles, e.g., indicating that the power threshold was reached for 100 cycles.
  • the sliding window unit 316 may include a state machine.
  • the state machine may include a wait trigger state, a threshold set state, and/or throttle enable state.
  • the sliding window unit 316 may remain in the wait trigger state until event information including a regulator trigger is received from the sensitivity level unit 320.
  • the regulator trigger may cause the state machine to transition to the threshold set state.
  • the threshold set state may increment a counter upon receipt of the regulator trigger.
  • When a threshold number of regulator triggers is received (e.g., when the power threshold at the sensitivity level unit 320 is reached for a predetermined number of clock cycles), the sliding window unit 316 may transition to the throttle enable state.
  • a throttle enable signal may be sent to the throttle generator 318 upon transitioning to the throttle enable state.
  • a regulator deassert signal may be sent to the sliding window unit 316.
  • the sliding window unit 316 may transition from the throttle enable state to the wait trigger state.
  • the counter associated with the regulator triggers may be reset upon transitioning to the wait trigger state.
  • the sliding window unit 316 may send a signal that instructs the throttle generator 318 to assert a power control signal 303 (e.g., throttle enable signal) for pipeline1 218.
  • the power control signal 303 may be asserted until a deassert trigger is received from the sliding window unit 316.
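The state machine of the sliding window unit, with its wait trigger, threshold set, and throttle enable states, may be sketched as below. This is a simplified illustration, not the disclosed circuit: the class and method names are hypothetical, and the sketch counts regulator triggers without modeling what happens on a clock cycle that carries no trigger, which the text above does not specify.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_TRIGGER = auto()
    THRESHOLD_SET = auto()
    THROTTLE_ENABLE = auto()


class SlidingWindow:
    def __init__(self, window_size: int):
        self.window_size = window_size  # Cycle threshold, in regulator triggers.
        self.state = State.WAIT_TRIGGER
        self.count = 0

    def on_regulator_trigger(self) -> bool:
        """Count one trigger; return True when throttle enable is asserted."""
        if self.state is State.THROTTLE_ENABLE:
            return False  # Already throttling; nothing new to assert.
        self.state = State.THRESHOLD_SET
        self.count += 1
        if self.count >= self.window_size:
            self.state = State.THROTTLE_ENABLE
            return True
        return False

    def on_regulator_deassert(self) -> None:
        """Return to the wait trigger state and reset the trigger counter."""
        self.state = State.WAIT_TRIGGER
        self.count = 0


window = SlidingWindow(window_size=3)
asserted = [window.on_regulator_trigger() for _ in range(3)]
state_after_triggers = window.state
window.on_regulator_deassert()
```

With a window size of three, the throttle enable is asserted only on the third trigger, and the deassert returns the unit to the wait trigger state with its counter cleared.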
  • the sliding window unit 316 may be configured to initiate DVFS power management 305 for the subsystem 110 when certain conditions are met. For example, if the sliding window unit 316 determines that the power or temperature of a pipeline and/or subsystem reaches a DVFS set point, then a signal to initiate DVFS power management 305 may be asserted to adjust the frequency, or the voltage, or both.
  • the sliding window unit 316 may be configured to initiate clock or power gating for one or more functional blocks or pipelines.
  • the power controller 150 may be in communication with a configuration training system, such as the configuration training system 140 in FIGs. 1 and 2.
  • the configuration training system 140 may be configured to receive second status feedback information 360 (e.g., thermal feedback, current feedback, power feedback, performance feedback, voltage feedback, frequency feedback, etc.) from the plurality of functional blocks 302 and/or one or more pipelines.
  • the second status feedback information 360 may include pipeline-level feedback, functional block-level feedback, and/or subsystem-level feedback.
  • the second status feedback information 360 may include thermal feedback, current feedback, power feedback, performance feedback, voltage feedback, frequency feedback associated with each pipeline.
  • the feedback need not be received a fixed number of times; it may be received multiple times, as determined by the training system, until the best configuration is found at a certain point.
  • the configuration training system 140 may be configured to generate configuration information 330 based at least in part on the second status feedback information 360.
  • the configuration information 330 may be used to configure the power threshold (e.g., sensitivity level information), the cycle threshold (e.g., sliding window size), throttle enable / disable information, etc.
  • the configuration training system 140 may select configuration information 330 that provides a desirable thermal, current, and performance tradeoff. Additional details regarding the generation of the configuration information 330 are described below in connection with FIG.3C.
  • the sensitivity level configuration unit 332 may be configured to receive configuration information 330 associated with the power threshold (e.g., 20%, 30%, 50%, 85%, 90%, etc.).
  • the sliding window configuration unit 334 may be configured to receive configuration information 330 associated with a sliding window size (e.g., 5 clock cycles, 10 clock cycles, 50 clock cycles, 100 clock cycles) that may be used as the cycle threshold.
  • the throttle configuration unit 336 may be configured to receive throttle amount information (e.g., a percentage to throttle the power of the pipeline or functional block(s)) associated with a throttle enable signal and/or throttle disable signal.
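
As an illustration of how configuration information 330 might be divided among the three configuration units, the sketch below uses a small record type. The names `ConfigurationInfo` and `route_configuration`, the field names, and the range checks are assumptions for this example and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigurationInfo:
    power_threshold_pct: float  # sensitivity level, e.g. 85.0
    cycle_threshold: int        # sliding window size in clock cycles
    throttle_amount_pct: float  # percentage by which to throttle


def route_configuration(cfg: ConfigurationInfo) -> dict:
    """Validate one record and split it among the three configuration units."""
    if not 0.0 <= cfg.power_threshold_pct <= 100.0:
        raise ValueError("power threshold must be a percentage")
    if cfg.cycle_threshold < 1:
        raise ValueError("sliding window must span at least one clock cycle")
    if not 0.0 <= cfg.throttle_amount_pct <= 100.0:
        raise ValueError("throttle amount must be a percentage")
    return {
        "sensitivity_level_configuration_unit": cfg.power_threshold_pct,
        "sliding_window_configuration_unit": cfg.cycle_threshold,
        "throttle_configuration_unit": cfg.throttle_amount_pct,
    }
```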
  • feedback information 340 may be sent to the event counter unit 314, sensitivity level unit 320, sliding window unit 316, and throttle generator 318.
  • Feedback information 340 may be used to adjust the power model to achieve more accurate power detection.
  • the configuration training system 140 may send feedback information 340 to one or more of the event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and/or throttle generator 318.
  • the event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and/or throttle generator 318 may use the feedback information 340 to reconfigure, e.g., an offset and/or coefficient used in summing the power bits, a power threshold (e.g., sensitivity level), a sliding window size, and an amount to throttle, respectively.
  • the sensitivity level unit 320 may lower the power threshold.
  • the sliding window unit 316 may reduce the cycle threshold.
  • the overall temperature of the subsystem 110 may be decreased.
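
One way to picture this kind of feedback-driven adjustment is the sketch below. The triggering condition (a reported temperature above a limit), the step sizes, and the function name `tighten_on_overheat` are all assumptions chosen for illustration; the text above does not specify them.

```python
def tighten_on_overheat(power_threshold_pct, cycle_threshold, temperature_c,
                        temperature_limit_c, threshold_step_pct=5.0,
                        cycle_step=10):
    """Lower the power threshold and shrink the window when running too hot.

    Returns the (possibly unchanged) pair of power threshold and cycle
    threshold. Both values are clamped so they stay meaningful.
    """
    if temperature_c <= temperature_limit_c:
        return power_threshold_pct, cycle_threshold
    new_threshold = max(0.0, power_threshold_pct - threshold_step_pct)
    new_cycles = max(1, cycle_threshold - cycle_step)
    return new_threshold, new_cycles
```

A lower power threshold and a smaller window both make throttling start sooner, which is why the overall temperature may then decrease.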
  • the subsystem 110 may include an application scenario unit 130, a configuration training system 140, a power controller 150, and a plurality of functional blocks 302.
  • the plurality of functional blocks 302 may correspond to, e.g., the plurality of functional blocks 202, 204, 206, 208, 210, 212, 214, 216 described above in connection with FIG. 2.
  • the plurality of functional blocks 302 may be organized into a plurality of pipelines, e.g., such as the plurality of pipelines 218, 219, 220, etc.
  • the application scenario unit 130 may be configured to maintain or access application information 350 related to at least one application that may be run by the subsystem 110.
  • the application information 350 may include information associated with a pipeline architecture within the plurality of functional blocks 302. Information associated with the pipeline architecture may indicate which pipelines process which instruction(s), portions of an instruction, fixed functions, and/or logic associated with the application.
  • the application information 350 may include information that relates particular clock cycles to a particular instruction(s), portions of an instruction, fixed functions, and/or logic.
  • the application information 350 may include a character variable (e.g., type of application, workload, control state, etc.) associated with an application, thread information, or information related to the allocation of resources within a pipeline for concurrent processing of threads, etc.
  • the application information 350 may indicate that pipeline 1 performs instruction 1 at a first clock cycle, pipeline 2 performs instruction 2 at the first clock cycle, pipeline 1 performs instruction 3 at the second clock cycle, and so on.
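
The per-cycle mapping in this example can be written down as a small table. The tuple layout and the helper `by_cycle` are hypothetical and serve only to make the example concrete.

```python
from collections import defaultdict

# Each entry is (clock cycle, pipeline, instruction), taken from the example.
schedule = [
    (1, "pipeline 1", "instruction 1"),
    (1, "pipeline 2", "instruction 2"),
    (2, "pipeline 1", "instruction 3"),
]


def by_cycle(entries):
    """Group entries so each clock cycle maps pipelines to instructions."""
    table = defaultdict(dict)
    for cycle, pipeline, instruction in entries:
        table[cycle][pipeline] = instruction
    return dict(table)
```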
  • the application scenario unit 130 may send the application information 350 to the configuration training system 140.
  • the configuration training system 140 may generate configuration information 330 based at least in part on the application information 350.
  • the configuration training system 140 may be configured to access correlation information that correlates application information 350 to configuration information 330.
  • the correlation information may include, e.g., one or more of a lookup table, a neural network, a database, an artificial intelligence engine, a machine learning engine, etc.
  • the correlation information may be maintained locally at the configuration training system 140 and/or subsystem 110.
  • the correlation information may be located remotely from the configuration training system 140. When located remotely, the configuration training system 140 may access the correlation information using wired or wireless communication.
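
When the correlation information takes the form of a lookup table, it could look like the sketch below. The scenario names, the parameter values, and the fallback to a default entry are invented for illustration; a neural network or other engine could fill the same role.

```python
# Hypothetical default used when a scenario has no entry in the table.
DEFAULT_CONFIG = {
    "power_threshold_pct": 90.0,
    "cycle_threshold": 100,
    "throttle_amount_pct": 10.0,
}

# Hypothetical correlation table: application scenario -> threshold parameters.
CORRELATION_TABLE = {
    "video_playback": {
        "power_threshold_pct": 85.0,
        "cycle_threshold": 50,
        "throttle_amount_pct": 20.0,
    },
    "gaming": {
        "power_threshold_pct": 50.0,
        "cycle_threshold": 10,
        "throttle_amount_pct": 30.0,
    },
}


def lookup_configuration(scenario, table=CORRELATION_TABLE):
    """Return a copy of the parameters correlated with an application scenario."""
    return dict(table.get(scenario, DEFAULT_CONFIG))
```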
  • the configuration training system 140 may generate configuration information 330.
  • the configuration information 330 may configure power controller 150 to use certain threshold parameters (e.g., coefficient and/or offset used to calculate a total power at the event counter unit 314, power threshold, cycle threshold, throttle amount, etc.).
  • the configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times.
  • the configuration information 330 may include first configuration information that is sent at a first time (e.g., t 0 ), second configuration information that is sent at a second time (e.g., t 1 ), and so on.
  • application information 350 indicates a first application scenario.
  • the configuration training system 140 may generate first configuration information that includes first threshold parameters.
  • the first threshold parameters may include one or more of, e.g., a coefficient and/or offset used by the event counter unit 314 to sum the power associated with power events, a first power threshold that may be used by the sensitivity level unit 320, a first sliding window size (e.g., first cycle threshold) that may be used by the sliding window unit 316, and/or a first throttle amount used by the throttle generator 318 to assert a throttle enable signal and/or throttle disable signal.
  • the first threshold parameters may provide a target power, thermal, and performance tradeoff at the subsystem 110 while the first application is being run.
  • the first configuration information may be associated with a first set of performance characteristics (e.g., speed, thermal, etc.). Once generated, the first configuration information may be sent to the power controller 150.
  • the power controller 150 may use the first threshold parameters to perform operations associated with configurable power management, e.g., described above in connection with FIGs.3A and 3B.
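
To show where a coefficient and an offset could enter the total power calculation, the sketch below assumes a simple linear model: each event count is weighted by a coefficient and a constant offset is added. The linear form and the name `total_power` are assumptions, since this passage does not give the exact formula.

```python
def total_power(event_counts, coefficients, offset=0.0):
    """Weighted sum of power event counts plus a constant offset."""
    if len(event_counts) != len(coefficients):
        raise ValueError("one coefficient is needed per event count")
    return offset + sum(c * n for c, n in zip(coefficients, event_counts))
```

The result can then be compared against the power threshold used by the sensitivity level unit.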
  • the plurality of functional blocks 302 may send second status feedback information 360 to the configuration training system 140 that is related to subsystem performance (e.g., thermal information, performance information, frequency information, and the first threshold parameters, etc.).
  • the second status feedback information 360 may include different feedback information that may be used by the configuration training system 140 at different times.
  • the second status feedback information 360 may include first information that is received in response to the first threshold parameters (e.g., indicated by the first configuration information), second status feedback information 360 received in response to the second threshold parameters (e.g., indicated by the second configuration information), and so on.
  • the second status feedback information 360 may include one or more of power information, performance information, current information, or thermal information associated with each pipeline and/or the plurality of functional blocks 302.
  • the second status feedback information 360 may be sent at the end of each clock cycle, at the end of a predetermined number of clock cycles (e.g., 2 clock cycles, 3 clock cycles, 10 clock cycles, 100 clock cycles, etc.), or upon request from the configuration training system 140.
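
The three reporting options (every clock cycle, every predetermined number of clock cycles, or on request) can be captured in one predicate. The function name and parameters are hypothetical.

```python
def should_send_feedback(cycle, period, requested=False):
    """Decide whether status feedback is sent at the end of a clock cycle.

    period=1 reports every cycle; a larger period reports every `period`
    cycles; `requested` models a request from the training system.
    """
    if period < 1:
        raise ValueError("period must be at least one clock cycle")
    return requested or cycle % period == 0
```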
  • the configuration training system 140 may update the correlation information to associate the feedback information with the application. If the feedback information indicates that certain targets (e.g., power, thermal, etc.) have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and/or performance target of the subsystem 110.
  • the second configuration information may include a second set of threshold parameters (e.g., coefficient and/or offsets associated with calculating total power, power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150.
  • the second configuration information may be generated using, e.g., the threshold parameters associated with a similar but different application scenario that may improve certain power and/or performance characteristics.
  • the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150.
  • the power controller 150 may perform configurable power management of the plurality of functional blocks 302 using the second configuration information. Second status feedback information 360 may be received multiple times while an application is running.
  • each time second status feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new configuration information 330, and may eventually find the best configuration at a certain point.
  • the new configuration information 330 may be sent to the power controller 150 to improve the power and/or performance of the subsystem 110.
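
The repeated cycle of sending configuration information, receiving feedback, and generating new configuration information can be sketched as a simple loop. This is a deliberately naive stand-in for the machine learning mentioned above: it only lowers the power threshold by a fixed step until every target is met. The names `tune` and `measure`, the dictionary keys, and the step size are assumptions.

```python
def tune(config, measure, targets, max_rounds=10, step_pct=5.0):
    """Iterate until the feedback meets every target or rounds run out.

    measure(config) stands in for running the functional blocks and
    returning status feedback as a dict. A target is met when the
    feedback value is at or below the target value.
    """
    history = []
    for _ in range(max_rounds):
        feedback = measure(config)
        history.append((dict(config), feedback))
        if all(feedback[name] <= limit for name, limit in targets.items()):
            return config, history
        # Targets missed: generate the next configuration information.
        config = dict(config)
        config["power_threshold_pct"] = max(
            0.0, config["power_threshold_pct"] - step_pct
        )
    return config, history
```

The returned history plays the role of the data points that the training system may add to its correlation information.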
  • the configuration training system 140 may generate function execution control state information 390 that allocates resources to the plurality of functional blocks 302.
  • the resources may be used by the plurality of functional blocks 302 to organize and/or process the threads.
  • the allocation of resources may be at one or more of the pipeline level and/or the functional block level.
  • the allocation of resources may affect subsystem performance because processing threads with a first allocation of resources may draw more power from the voltage source and, hence, generate greater thermal output than processing the threads using a second allocation of resources.
  • the function execution control state information 390 may indicate an allocation of resources that provides target power and/or performance characteristics (e.g., speed, thermal, etc.) for a particular application scenario.
  • the configuration training system 140 may select the allocation of resources and/or other control states based at least in part on one or more of the application information 350 and/or the correlation information.
  • the application information 350 may indicate the allocation of resources and/or other control states for the application scenario.
  • the configuration training system 140 may access the correlation information described above in connection with the configuration information 330 to select an allocation of resources and/or other control states.
  • the correlation information may indicate, for a particular application scenario, a predetermined allocation of resources and/or other control states.
  • the predetermined allocation of resources and/or other control states may provide target subsystem performance for a particular application scenario.
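
The two sources for an allocation of resources described above can be combined as in the sketch below. The precedence order (application information first, correlation information second), the key name `allocation`, and the function name are assumptions for illustration.

```python
def select_allocation(scenario, application_info, correlation_info):
    """Pick an allocation of resources for an application scenario.

    Uses the allocation indicated by the application information when one
    is present; otherwise falls back to the predetermined allocation in
    the correlation information. Returns None when neither source has one.
    """
    if "allocation" in application_info:
        return application_info["allocation"]
    return correlation_info.get(scenario)
```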
  • the configuration training system 140 may receive, from the plurality of functional blocks 302, status feedback information 360 indicating power and/or performance characteristics associated with running an application using the allocation of resources. Based at least in part on the status feedback information 360, the configuration training system 140 may determine whether a different allocation of resources and/or other control states may provide an improved subsystem power and/or performance.
  • the configuration training system 140 may determine a second function execution control state that may improve the power and/or performance of the subsystem 110. Additionally and/or alternatively, the configuration training system 140 may generate the second function execution control state using, e.g., machine learning. Once generated, function execution control state information 390 may be sent to the plurality of functional blocks 302.
  • Additionally and/or alternatively, the configuration information 330 may include information associated with the function execution control state.
  • the power controller 150 may perform configurable power management based at least in part on the allocation of resources and/or other control states indicated in the function execution control state information.
  • Status feedback information 360 may be received multiple times while an application is being run. Each time feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a different allocation of resources and/or other control states.
  • As mentioned above, the configuration training system 140 may send feedback information 340 (e.g., indicated in the second status feedback information 360) to the power controller 150.
  • event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and throttle generator 318 may respectively use the feedback information 340 to reconfigure one or more of the coefficients and/or offset used to sum the total power, a power threshold (e.g., sensitivity level), sliding window size, and/or a throttling amount on the fly. Additional details associated with configurable power management by the power controller 150 will now be described in connection with FIG.3D.
  • a neural network 380 of the configuration training system 140 may receive, e.g., one or more of configuration information 330, status information 360, event information 301, function execution control state information 390, or miscellaneous information 309.
  • the miscellaneous information 309 may include frequency information, voltage information, etc.
  • the neural network 380 may use one or more of the configuration information 330, status information 360, event information 301, and/or miscellaneous information 309 to identify a power management technique 307 to perform.
  • the power management technique 307 may indicate sending a power control signal 303 (e.g., throttle enable, throttle disable, etc.) to one or more pipelines in the plurality of functional blocks 302, initiating DVFS power management 305 for the subsystem 110, sending function execution control state information 390 that controls the execution of the plurality of functional blocks 302 (e.g., thread, pipeline, etc.), or miscellaneous power management 311 (e.g., frequency reduction, block(s) power gating, etc.).
  • a signal indicating the selected power management technique 307 may be sent to the throttle generator 318.
  • the throttle generator block (e.g., sliding window unit 316 and/or throttle generator 318) may apply the power management technique 307 to the plurality of functional blocks 302.
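
The selection step can be illustrated with a fixed-rule stand-in. This is not a neural network: it only shows the shape of the decision, mapping inputs to one named power management technique. The inputs, the rule order, and the returned labels are assumptions.

```python
def select_technique(temperature_c, power_w, *, dvfs_set_point_c,
                     power_threshold_w):
    """Rule-based stand-in for the technique selection performed by a model.

    Reaching the DVFS set point takes priority; otherwise the power
    threshold decides between enabling and disabling throttling.
    """
    if temperature_c >= dvfs_set_point_c:
        return "dvfs_power_management"
    if power_w >= power_threshold_w:
        return "throttle_enable"
    return "throttle_disable"
```

A trained model would replace the fixed rules, while the output would still name the technique for the throttle generator block to apply.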
  • FIG. 4 illustrates a block diagram of an exemplary system 400 for configurable power management, according to embodiments of the disclosure.
  • system 400 may include a processor 404, a memory 406, and a storage 408.
  • system 400 may have different modules in a single device, such as an integrated circuit (IC) chip (e.g., implemented as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA)), or separate devices with dedicated functions.
  • one or more components of system 400 may be located in a cloud or may be alternatively in a single location (such as inside a mobile device) or distributed locations.
  • Components of system 400 may be in an integrated device or distributed at different locations but communicate with each other through a network (not shown). Consistent with the present disclosure, system 400 may be configured to perform configurable power management.
  • Processor 404 may include any appropriate type of general-purpose or special-purpose circuit, microprocessor, digital signal processor, or microcontroller. Processor 404 may be configured as a separate processor module dedicated to performing configurable power management. Alternatively, processor 404 may be configured as a shared processor module for performing other functions in addition to performing configurable power management.
  • Memory 406 and storage 408 may include any appropriate type of mass storage provided to store any type of information that processor 404 may need to operate.
  • Memory 406 and storage 408 may be a volatile or non-volatile, magnetic, semiconductor, tape, optical, removable, non-removable, or other types of storage device or tangible (i.e., non-transitory) computer-readable medium including, but not limited to, a ROM, a flash memory, a dynamic RAM, and a static RAM.
  • Memory 406 and/or storage 408 may be configured to store one or more computer programs that may be executed by processor 404 to perform functions disclosed herein.
  • memory 406 and/or storage 408 may be configured to store program(s) that may be executed by processor 404 to perform configurable power management.
  • memory 406 and/or storage 408 may also store various parameters including, e.g., coefficient and/or offset information, power threshold, cycle threshold, throttle amount(s), and/or a lookup table that correlates configuration information 330 and one or more of these parameters.
  • processor 404 may include multiple modules, such as an application scenario unit 442, a configuration training system 444, a power controller 446, a plurality of functional blocks 448, and the like. These modules (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 404 designed for use with other components or software units implemented by processor 404 through executing at least part of a program.
  • FIG.5 illustrates a flowchart of an exemplary method 500 for generating configuration information, according to embodiments of the disclosure. Method 500 may be performed by system 400 and particularly processor 404 or a separate processor not shown in FIG. 4. Method 500 may include operations 502-514, as described below.
  • FIG.6 illustrates a data flow diagram 600 of an exemplary system for configurable power management, according to embodiments of the disclosure. FIGs.4-6 will be described together below.
  • the configuration training system 444 may receive application information associated with the plurality of functional blocks 448.
  • the application information may be received from the application scenario unit 442.
  • the application scenario unit 130 may be configured to maintain or access application information 350 related to at least one application that may be run by the subsystem 110.
  • the application information 350 may include one or more of information related to a pipeline architecture within the plurality of functional blocks, which pipelines process which instruction(s) and/or portions of an instruction associated with the application, information that relates particular clock cycles to particular instruction(s) and/or portions of an instruction, or a character variable (e.g., type of application, workload, control state, etc.) associated with an application.
  • the application scenario unit 130 may send the application information 350 to the configuration training system 140.
  • the configuration training system 140 may generate configuration information 330 based at least in part on the application information 350.
  • the configuration training system 444 may receive status information from the plurality of functional blocks 448. For example, referring to FIG.
  • the plurality of functional blocks 302 may send first status feedback information to the configuration training system 140 that is related to the power and performance of the subsystem 110 using the first threshold parameters.
  • the first status feedback information may include one or more of power information, current information, or thermal information associated with each pipeline and/or the plurality of functional blocks 302.
  • the second status feedback information 360 may be sent at the end of each clock cycle, at the end of a predetermined number of clock cycles (e.g., 2 clock cycles, 3 clock cycles, 10 clock cycles, 100 clock cycles, etc.), or upon request from the configuration training system 140.
  • the configuration training system 444 may generate first power control information (e.g., first configuration information and/or feedback information 340) based at least in part on one or more of the application information or the status feedback information. For example, referring to FIG.3C, based on a comparison of the application information 350 and the correlation information, the configuration training system 140 may generate configuration information 330. As mentioned above in connection with FIG.3B, the configuration information 330 may configure the power controller 150 to use certain threshold parameters (e.g., coefficient and/or offset used in calculating total power at event counter unit 314, power threshold, cycle threshold, throttle amount, etc.). The configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times.
  • the configuration information 330 may include first configuration information that is sent at a first time (e.g., t 0 ), second configuration information that is sent at a second time (e.g., t 1 ), and so on.
  • the configuration training system 140 may update the correlation information to include the status feedback information as an additional data point associated with the application scenario. If the first status feedback information indicates that power and/or performance targets have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and/or performance target of the subsystem 110.
  • the second configuration information may include a second set of threshold parameters (e.g., power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150.
  • the second configuration information may be generated using, e.g., the threshold parameters associated with a similar but different application scenario that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150. The power controller 150 may perform configurable power management of the plurality of functional blocks using the second configuration information.
  • At operation 508, the configuration training system 444 may generate first function execution control state information based at least in part on the application information and the first status feedback information. For example, referring to FIG. 3C, the configuration training system 140 may generate function execution control state information 390 that allocates resources to the plurality of functional blocks 302.
  • the resources may be used by the plurality of functional blocks 302 to process the threads.
  • the allocation of resources may be at one or more of the pipeline level and/or the functional block level.
  • the allocation of resources may affect subsystem performance because processing threads with a first allocation of resources may draw more power from the voltage source and, hence, generate greater thermal output than processing the threads using a second allocation of resources.
  • the function execution control state information 390 may indicate an allocation of resources that provides a target power and performance for a particular application scenario.
  • the configuration training system 140 may select the allocation of resources based at least in part on one or more of the application information 350 and/or the correlation information.
  • the application information 350 may indicate the allocation of resources to include in the function execution control state information 390.
  • the configuration training system 140 may access the correlation information described above in connection with the configuration information 330.
  • the correlation information may indicate, for a particular application scenario, a predetermined allocation of resources and/or other control states.
  • the predetermined allocation of resources indicated by the correlation information may provide a target subsystem power and performance for a particular application scenario.
  • the configuration training system 140 may determine whether a different allocation of resources may bring subsystem performance closer to the target while consuming less power.
  • the configuration training system 140 may determine a second allocation of resources and/or other control states that may improve the performance of the subsystem 110.
  • the second allocation of resources may indicate a different allocation of resources across the pipelines as compared to the first allocation of resources.
  • the second allocation of resources may be generated using, e.g., the allocation of resources associated with a similar but different application scenario that may improve certain power and/or performance characteristics.
  • the configuration training system 140 may generate the second allocation of resources and/or other control states using, e.g., machine learning.
  • function execution control state information 390 that indicates the second allocation of resources and/or other control states may be sent to the plurality of functional blocks 302.
  • the configuration information 330 may include information associated with the allocation of resources and/or other control states.
  • the power controller 150 may perform configurable power management based at least in part on the allocation of resources and/or other control states.
  • Second status feedback information 360 may be received multiple times while an application is being run. Each time status feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a different allocation of resources or other states. Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and performance of the subsystem 110.
  • the configuration training system 444 may send the first power control information to a power controller. For example, referring to FIG. 3C, once generated, the first configuration information may be sent to the power controller 150.
  • At operation 512, the configuration training system 444 may send the first function execution control state information to the plurality of functional blocks. For example, referring to FIG. 3C, function execution control state information 390 indicating an allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302.
  • At operation 514, the configuration training system 444 may receive second status feedback information from the plurality of functional blocks.
  • the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a new allocation of resources and/or other control states. Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and performance of the subsystem 110.
  • At operation 516, the configuration training system 444 may generate second power control information based at least in part on the second status feedback information. For example, referring to FIG. 3C, the configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times.
  • the configuration information 330 may include first configuration information that is sent at a first time (e.g., t 0 ), second configuration information that is sent at a second time (e.g., t 1 ), and so on.
  • the configuration training system 140 may update the correlation information to include the first status feedback information as an additional data point associated with the first application scenario. If the first status feedback information indicates that certain power and performance thresholds (e.g., thermal, current, frequency, etc.) have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and performance of the subsystem 110.
  • the second configuration information may include a second set of threshold parameters (e.g., coefficient and/or offset used to calculate the total power at the event counter unit, power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150.
  • the second configuration information may be generated using, e.g., the threshold parameters associated with a similar but different application scenario that may improve certain power and/or performance characteristics.
  • the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150.
  • the power controller 150 may perform configurable power management of the plurality of functional blocks using the second configuration information.
  • the configuration training system 444 may generate second function execution control state information based at least in part on the second status feedback information. For example, referring to FIG.3C, based at least in part on the second status feedback information 360, the configuration training system 140 may determine whether and how a different allocation of resources and/or other control states may bring the subsystem power and performance closer to target. If the second status feedback information 360 indicates that certain power and performance thresholds (e.g., thermal, current, frequency, etc.) have not been met using a first allocation of resources and/or other control states, the configuration training system 140 may determine a second allocation of resources and/or other control states that may improve the power and performance of the subsystem 110.
  • the second allocation of resources may indicate a different allocation of resources across the pipelines as compared to the first allocation of resources.
  • the second allocation of resources and/or other control states may be generated using, e.g., the allocation of resources and/or other control states associated with a similar but different application scenario that may improve certain power and/or performance characteristics.
  • the configuration training system 140 may generate the second allocation of resources using, e.g., machine learning and/or artificial intelligence.
  • function execution control state information 390 that indicates the second allocation of resources may be sent to the plurality of functional blocks 302.
  • the configuration information 330 may include information associated with the allocation of resources.
  • the power controller 150 may perform configurable power management based at least in part on the allocation of resources.
  • Second status feedback information 360 may be received multiple times while an application is being run. Each time status feedback information 360 is received, the configuration training system 140 may determine whether to generate new function execution control state information 390 indicating a different allocation of resources and/or other control states. Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and/or performance of the subsystem 110. [0097] At operation 520, the configuration training system 444 may send the second power control information to the power controller. For example, referring to FIG. 3C, once generated, the second configuration information may be sent to the power controller 150. [0098] At operation 522, the configuration training system 444 may send the second function execution control state information to the plurality of functional blocks.
  • Embodiments of the disclosure provide a configuration training system for configurable power management.
  • the configuration training system may include a memory and at least one processor coupled to the memory.
  • the at least one processor may be configured to receive application information associated with a plurality of functional blocks.
  • the at least one processor may be configured to receive first status feedback information from the plurality of functional blocks.
  • the at least one processor may be configured to generate first power control information based at least in part on one or more of the application information or the first status feedback information.
  • the at least one processor may generate first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the at least one processor may send the first power control information to a power controller. In certain other aspects, the at least one processor may send the first function execution control state information to the plurality of functional blocks.
  • the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks.
  • the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
  • the first power control information may include one or more of a first power controller configuration or first thermal feedback information.
  • the first function execution control state information may include instructions associated with a set of data flow processes performed by the plurality of functional blocks.
  • the set of data flow processes may include one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information.
  • the at least one processor may be further configured to receive second status feedback information from the plurality of functional blocks.
  • the at least one processor may be further configured to generate second power control information based at least in part on the second status feedback information.
  • the at least one processor may be further configured to generate second function execution control state information based at least in part on the second status feedback information.
  • the at least one processor may be further configured to send the second power control information to the power controller.
  • the at least one processor may be further configured to send the second function execution control state information to the plurality of functional blocks.
  • the first power control information and the second power control information may be different.
  • the first function execution control state information and the second function execution control state information may be different.
  • the first power control information and the first function execution control state information may be associated with a first set of at least one of power or performance characteristics associated with the plurality of functional blocks.
  • the second power control information and the second function execution control state information may be associated with a second set of at least one of power or performance characteristics associated with the plurality of functional blocks.
  • the first set of at least one of power or performance characteristics and the second set of at least one of power or performance characteristics may be different.
  • Embodiments of the disclosure provide a method for configurable power management of a configuration training system. The method may include receiving, at a configuration training system, application information associated with a plurality of functional blocks. In certain aspects, the method may further include receiving, at the configuration training system, first status feedback information from the plurality of functional blocks.
  • the method may further include generating, at the configuration training system, first power control information based at least in part on one or more of the application information or the first status feedback information. In certain other aspects, the method may further include generating, at the configuration training system, first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the method may further include sending the first power control information to a power controller. In certain other aspects, the method may further include sending the first function execution control state information to the plurality of functional blocks. [0116] In certain aspects, the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks.
  • the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
  • the first power control information may include one or more of a first power controller configuration or first thermal feedback information.
  • the first function execution control state information may include instructions associated with a set of data flow processes performed by the plurality of functional blocks.
  • the set of data flow processes may include one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information.
  • the method may further include receiving, at the configuration training system, second status feedback information from the plurality of functional blocks.
  • the method may further include generating, at the configuration training system, second power control information based at least in part on the second status feedback information. In certain other aspects, the method may further include generating, at the configuration training system, second function execution control state information based at least in part on the second status feedback information. In certain other aspects, the method may further include sending the second power control information to the power controller. In certain other aspects, the method may further include sending the second function execution control state information to the plurality of functional blocks. [0122] In certain other aspects, the first power control information and the second power control information may be different. [0123] In certain other aspects, the first function execution control state information and the second function execution control state information may be different.
  • the first power control information and the first function execution control state information may be associated with a first set of at least one of power or performance characteristics associated with the plurality of functional blocks.
  • the second power control information and the second function execution control state information may be associated with a second set of at least one of power or performance characteristics associated with the plurality of functional blocks.
  • the first set of at least one of power or performance characteristics and the second set of at least one of power or performance characteristics may be different.
  • Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to perform configurable power management of a configuration training system.
  • the method may further include generating second power control information based at least in part on the second status feedback information.
  • the method may further include generating second function execution control state information based at least in part on the second status feedback information.
  • the method may further include sending the second power control information to the power controller.
  • the method may further include sending the second function execution control state information to the plurality of functional blocks.
  • the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks.
  • the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
  • the first power control information may include one or more of a first power controller configuration or first thermal feedback information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)
  • Feedback Control In General (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

Embodiments include a configuration training system for configurable power management. The configuration training system may receive application information associated with a plurality of functional blocks. The configuration training system may receive first status feedback information from the plurality of functional blocks. The configuration training system may generate first power control information based on the application information and the first status feedback information. The configuration training system may generate first function execution control state information based on the application information and the first status feedback information. The configuration training system may send the first power control information to a power controller. The configuration training system may send the first function execution control state information to the plurality of functional blocks. The configuration training system may perform these operations in a loop with one or more iterations until an optimal power and performance control configuration is found at certain conditions.
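The iterative loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than the disclosed implementation: the `Feedback` and `PowerConfig` types, the toy `run_blocks` model (temperature rising linearly with power), and the fixed-step threshold update are assumptions chosen only to make the loop runnable.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Status feedback from the functional blocks (hypothetical fields)."""
    power_mw: float
    temperature_c: float


@dataclass
class PowerConfig:
    """Power control information sent to the power controller (hypothetical)."""
    power_threshold_mw: float


def run_blocks(config: PowerConfig) -> Feedback:
    """Toy stand-in for the functional blocks: power tracks the threshold and
    temperature rises linearly with power. This is not a real thermal model."""
    power = config.power_threshold_mw
    return Feedback(power_mw=power, temperature_c=25.0 + power / 20.0)


def train_configuration(initial: PowerConfig, temp_limit_c: float,
                        step_mw: float = 100.0, max_iters: int = 20):
    """Apply a configuration, read feedback, and lower the power threshold
    until the thermal limit is met or the iteration budget runs out."""
    config = initial
    history = []
    for _ in range(max_iters):
        feedback = run_blocks(config)
        history.append((config.power_threshold_mw, feedback.temperature_c))
        if feedback.temperature_c <= temp_limit_c:
            break  # limit met: keep this configuration
        config = PowerConfig(config.power_threshold_mw - step_mw)
    return config, history


final_config, history = train_configuration(PowerConfig(1500.0), temp_limit_c=85.0)
```

In this toy model the loop stops at the first threshold whose simulated temperature meets the limit. A real training system would, per the disclosure, also draw on application information and may use machine learning instead of a fixed step.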

Description

APPARATUS AND METHOD OF INTELLIGENT POWER AND PERFORMANCE MANAGEMENT
BACKGROUND
[0001] Embodiments of the present disclosure relate to an apparatus and method for configurable power management within a subsystem.
[0002] A system on a chip (SoC) is an integrated circuit that integrates different subsystems, each having a different function, in a computing system or other electronic device. A subsystem of an SoC may be comprised of a plurality of functional blocks that together process instructions that enable the subsystem to perform its dedicated function. The functional blocks may be organized into pipelines configured to concurrently perform fixed functions, process different portions of an instruction, or process different instructions.
SUMMARY
[0003] Embodiments of apparatus and method for configurable power management are disclosed herein.
[0004] Embodiments of the disclosure provide a configuration training system for configurable power management. The configuration training system may include a memory and at least one processor coupled to the memory. In certain aspects, the at least one processor may be configured to receive application information associated with a plurality of functional blocks. In certain other aspects, the at least one processor may be configured to receive first status feedback information from the plurality of functional blocks. In certain other aspects, the at least one processor may be configured to generate first power control information based at least in part on one or more of the application information or the first status feedback information. In certain other aspects, the at least one processor may generate first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the at least one processor may send the first power control information to a power controller. In certain other aspects, the at least one processor may send the first function execution control state information to the plurality of functional blocks.
[0005] Embodiments of the disclosure provide a method for configurable power management of a configuration training system. The method may include receiving, at a configuration training system, application information associated with a plurality of functional blocks. In certain aspects, the method may further include receiving, at the configuration training system, first status feedback information from the plurality of functional blocks. In certain other aspects, the method may further include generating, at the configuration training system, first power control information based at least in part on one or more of the application information or the first status feedback information. In certain other aspects, the method may further include generating, at the configuration training system, first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the method may further include sending the first power control information to a power controller. In certain other aspects, the method may further include sending the first function execution control state information to the plurality of functional blocks.
[0006] Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to perform configurable power management of a configuration training system. The method may further include generating second power control information based at least in part on the second status feedback information. In certain other aspects, the method may further include generating second function execution control state information based at least in part on the second status feedback information. In certain other aspects, the method may further include sending the second power control information to the power controller. In certain other aspects, the method may further include sending the second function execution control state information to the plurality of functional blocks.
[0007] Embodiments of the disclosure provide a power controller for configurable power management. The power controller may include a calculation block and a power control block. In certain aspects, the calculation block may be configured to receive first power control information associated with a plurality of functional blocks. In certain other aspects, the calculation block may be configured to identify a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information. In certain other aspects, the power control block may be configured to apply the first power management technique to the plurality of functional blocks.
[0008] Embodiments of the disclosure provide a method for configurable power management of a power controller. The method may include receiving, at a power controller, first power control information associated with a plurality of functional blocks. In certain other aspects, the method may include receiving first function execution control state information indicating an allocation of resources at the plurality of functional blocks. In certain aspects, the method may further include identifying, at the power controller, a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information or the first function execution control state information. In certain other aspects, the method may include applying the first power management technique to the plurality of functional blocks.
[0009] Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by a power controller, cause the power controller to perform configurable power management of a power controller. The method may include receiving first power control information associated with a plurality of functional blocks. In certain other aspects, the method may further include receiving first function execution control state information indicating an allocation of resources at the plurality of functional blocks. In certain aspects, the method may further include identifying a first power management technique associated with the plurality of functional blocks based at least in part on one or more of the first power control information or the first function execution control state information. In certain other aspects, the method may include applying the first power management technique to the plurality of functional blocks.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, further serve to explain the principles of the present disclosure and to enable a person skilled in the pertinent art to make and use the present disclosure.
[0011] FIG. 1 illustrates a block diagram of an embedded SoC apparatus, in accordance with certain aspects of the disclosure.
[0012] FIG. 2 illustrates a block diagram of a subsystem that is part of an SoC apparatus, in accordance with certain aspects of the disclosure.
[0013] FIG. 3A illustrates a block diagram of a power controller configured to perform configurable power management, in accordance with certain aspects of the disclosure.
[0014] FIG. 3B illustrates a detailed view of a power meter that may be configured to perform configurable power management, in accordance with certain aspects of the disclosure.
[0015] FIG. 3C illustrates another block diagram of a subsystem that is part of an SoC, in accordance with certain aspects of the disclosure.
[0016] FIG. 3D illustrates a data flow performed by a power controller, in accordance with certain aspects of the disclosure.
[0017] FIG. 4 illustrates a block diagram of an exemplary system for configurable power management, according to embodiments of the disclosure.
[0018] FIG. 5 illustrates a flow chart of an exemplary method for configurable power management, according to embodiments of the disclosure.
[0019] FIG. 6 illustrates a data flow diagram of an exemplary system for configurable power management, according to embodiments of the disclosure.
[0020] FIG. 7 illustrates a block diagram of a conventional thermal and current limits management system.
[0021] Embodiments of the present disclosure will be described with reference to the accompanying drawings.
DETAILED DESCRIPTION
[0022] Although specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present disclosure. It will be apparent to a person skilled in the pertinent art that the present disclosure can also be employed in a variety of other applications.
[0023] It is noted that references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” “some embodiments,” “certain embodiments,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of a person skilled in the pertinent art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0024] In general, terminology may be understood at least in part from usage in context. For example, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures, or characteristics in a plural sense. Similarly, terms such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
[0025] Various aspects of configurable power management will now be described with reference to various apparatus and methods. These apparatus and methods will be described in the following detailed description and illustrated in the accompanying drawings by various blocks, modules, units, components, circuits, steps, operations, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on the overall system.
[0026] In addition, various aspects of the configurable power management techniques of the present disclosure may be described in connection with an SoC, one or more subsystems of the SoC, one or more pipelines of the subsystem, or a plurality of functional blocks associated with the pipeline. The configurable power management techniques described herein may be applied at the functional block level, pipeline level, or subsystem level to other types of circuits, computing devices and/or electronic devices, other than an SoC, without departing from the scope of the present disclosure.
[0027] As mentioned above, an SoC is an integrated circuit that integrates subsystems, each having a different function, in a computing system or other electronic device. The subsystems integrated by an SoC may include, without limitation, one or more of the following: central processing units (CPUs), graphical processing units (GPUs), microcontrollers, microprocessors, multiprocessors, digital signal processor (DSP) cores, other types of cores, a memory unit, read-only memory (ROM), random-access memory (RAM), clock signal generators, input/output (I/O) interfaces, analog interfaces, voltage regulators and power management circuits, an advanced peripheral unit(s), wireless communication unit(s) (e.g., Wi-Fi module, cellular module, 5G new radio (NR) module, Bluetooth® module, etc.), or coprocessors, just to name a few.
[0028] A subsystem may perform its dedicated function by running an application comprised of one or more instructions. To decrease the processing time, the subsystem may perform instruction pipelining to implement instruction-level parallelism within a single processor, computing device, and/or circuit. Pipelining attempts to keep every part of the subsystem (e.g., the functional blocks) busy with some instruction by dividing incoming instructions into different portions that each include a series of sequential steps (e.g., “pipeline”). A pipeline may include a plurality of functional blocks configured to perform the series of sequential steps.
[0029] While pipelining may decrease processing time and, hence, increase the efficiency of a subsystem, pipelining may lead to an undesirable increase in the power consumption and thermal output at the subsystem. Certain pipelines may increase the power consumption and thermal output of the subsystem more than others. Recently, subsystems have been designed with an increased number of functional blocks to increase pipelining and multi-thread performance within the subsystem. The increased number of functional blocks may also increase, among other things, the amount of power and, hence, the thermal output associated with a subsystem.
[0030] Power management techniques may be used to optimize power consumption and mitigate the temperature within the subsystem. One such power management technique is frequency reduction. Frequency reduction may be used to reduce the power consumed by the subsystem. Another such power management technique is dynamic voltage and frequency scaling (DVFS). DVFS may be used to reduce the power consumption of a subsystem on the fly by scaling down the voltage and frequency based on the targeted performance requirements of the application being run by the subsystem. However, for DVFS to be effective, a large reduction in performance may be incurred due to the nature of the voltage and frequency control of a subsystem, e.g., as described below in connection with FIG. 7.
[0031] FIG. 7 illustrates a block diagram of a conventional thermal and current limits management system 700 (e.g., “conventional system 700,” hereinafter). As seen in FIG. 7, conventional system 700 may include a subsystem 702, a current sensor 704 and a temperature sensor 706 located at the subsystem 702, a voltage source 708, a temperature/current monitor and mitigation unit 710 (e.g., “mitigation unit 710,” hereinafter), and a phase-locked loop (PLL) frequency source 712.
[0032] The mitigation unit 710 may receive current information 703 and temperature information 705 from the current sensor 704 and temperature sensor 706, respectively. Normally, the mitigation unit 710 may include programmable thresholds and an algorithm to perform a mitigation scheme for the entire subsystem 702 (e.g., across all pipelines) using the clock frequency 701 and DVFS. An increase in subsystem temperature may result from the dissipation of switching power during periods of high switching activity in certain pipelines according to the dynamic power equation set forth below as Equation (1):
$$P_{switching} = a \cdot f \cdot c_{eff} \cdot V_{dd}^{2} \qquad (1)$$
where $P_{switching}$ is the switching power, $a$ is the switching activity, $f$ is the switching frequency, $c_{eff}$ is the effective capacitance, and $V_{dd}$ is the supply voltage.
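As a worked illustration of Equation (1), the short Python sketch below assumes the usual form of the dynamic power equation, P_switching = a · f · c_eff · Vdd², and compares a frequency-only reduction with a combined voltage and frequency reduction. The function names and all numeric values are illustrative assumptions, not figures from the disclosure.

```python
def switching_power(a: float, f_hz: float, c_eff_farads: float,
                    v_dd_volts: float) -> float:
    """Dynamic switching power in watts, assuming P = a * f * c_eff * Vdd^2."""
    return a * f_hz * c_eff_farads * v_dd_volts ** 2


def relative_power(freq_scale: float, voltage_scale: float = 1.0) -> float:
    """Power after scaling f and Vdd, as a fraction of the unscaled power."""
    return freq_scale * voltage_scale ** 2


# Assumed operating point: a = 0.5, f = 1 GHz, c_eff = 1 nF, Vdd = 1.0 V.
baseline_w = switching_power(0.5, 1e9, 1e-9, 1.0)

# Frequency-only reduction to 80% of the clock: power falls linearly with f.
freq_only = relative_power(0.8)

# DVFS step down to 80% clock and 90% voltage: power falls with f and Vdd squared.
dvfs = relative_power(0.8, 0.9)
```

Under these assumptions the frequency-only case leaves about 80% of the baseline switching power, while the combined step leaves about 65% (0.8 × 0.9²), which is why lowering the voltage as well gives the larger saving for the same clock reduction.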
[0033] The mitigation unit 710 may use a predictive mechanism to reduce the frequency of the subsystem by sending a signal 711 to the PLL frequency source 712. The PLL frequency source 712 may reduce the clock frequency 701 of the subsystem 702 and, therefore, power consumption when the temperature reaches a threshold. For faster temperature mitigation in extreme cases where subsystem temperature is too high, the mitigation unit 710 may use the thermal mitigation algorithm to lower the DVFS set point, which may further reduce the power consumption.
[0034] An increase in current may also be a result of high switching activity in the subsystem 702. For current mitigation, a fast and steep reduction in the clock frequency 701 may be used depending on the algorithm. For faster mitigation, the mitigation unit 710 may send a signal 707 instructing the voltage source 708 to lower the voltage (Vdd) 709 and a signal 711 instructing the PLL frequency source 712 to lower the clock frequency 701 using DVFS step down. The mitigation unit 710 may decide between a frequency only reduction or DVFS step down depending on the algorithm.
[0035] Current and temperature limit management performed by the conventional system 700 may involve two mechanisms: 1) an inner loop to reduce frequency only and 2) an outer loop to reduce both voltage and frequency using DVFS step down. The advantage of the inner loop is that PLL frequency reduction is usually fast (e.g., 50 ns, 100 ns, 200 ns, etc.). The mitigation for temperature may be slow compared to current mitigation because heat transfer is relatively slow. The outer loop may provide much faster and more effective mitigation for temperature but at the cost of more performance degradation.
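The two mechanisms can be sketched as a single decision step. This is a simplified illustration, not the algorithm of the mitigation unit 710: the DVFS table, the soft and hard temperature limits, the 200 MHz frequency step, and all names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical DVFS table ordered from highest to lowest operating point,
# as (clock frequency in MHz, supply voltage in V).
DVFS_TABLE = [(2000, 1.00), (1600, 0.90), (1200, 0.80), (800, 0.70)]


@dataclass
class State:
    dvfs_index: int  # current row of DVFS_TABLE (changed by the outer loop)
    freq_mhz: int    # current PLL frequency (lowered by the inner loop)


def mitigation_step(state: State, temp_c: float, soft_limit_c: float,
                    hard_limit_c: float, freq_step_mhz: int = 200):
    """Choose one mitigation action for the whole subsystem."""
    # Outer loop: in the extreme case, step voltage and frequency down together.
    if temp_c >= hard_limit_c and state.dvfs_index < len(DVFS_TABLE) - 1:
        index = state.dvfs_index + 1
        return State(index, DVFS_TABLE[index][0]), "dvfs_step_down"
    # Inner loop: fast frequency-only reduction, never below the lowest table entry.
    if temp_c >= soft_limit_c:
        floor_mhz = DVFS_TABLE[-1][0]
        new_freq = max(floor_mhz, state.freq_mhz - freq_step_mhz)
        return State(state.dvfs_index, new_freq), "frequency_only"
    return state, "none"


start = State(dvfs_index=0, freq_mhz=2000)
mild = mitigation_step(start, 90.0, soft_limit_c=85.0, hard_limit_c=100.0)
extreme = mitigation_step(start, 105.0, soft_limit_c=85.0, hard_limit_c=100.0)
```

Either action applies to the entire subsystem in this sketch, which mirrors the limitation of the conventional system: every pipeline slows down, including those that did not contribute to the excess heat or current.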
[0036] Furthermore, for DVFS to be effective, a large reduction in subsystem performance may be incurred due to the nature of the voltage and frequency control at the subsystem level. For example, using DVFS, the voltage 709 and clock frequency 701 of the entire subsystem (e.g., across all pipelines) may be reduced when the temperature reaches a threshold. By reducing the voltage 709 and clock frequency 701 of the entire subsystem 702, the overall performance of the subsystem may be reduced, e.g., due to increased processing time at all pipelines.
[0037] Thus, there is an unmet need for a power management technique that optimizes power consumption and mitigates temperature without reducing the voltage and/or frequency of the entire subsystem.
[0038] The present disclosure provides a solution by enabling a central power controller unit to probe the power events of individual pipelines and assert a power control signal for only those pipelines for which a threshold condition is met. By applying power control for individual pipelines, the overall performance of the subsystem may be increased by reducing the power for only those pipelines that meet the threshold condition, e.g., as described below in connection with FIGs.1-8.
[0039] FIG. 1 illustrates a block diagram of an embedded system-on-chip (SoC) 100, in accordance with certain aspects of the disclosure. The SoC 100 may include, e.g., a main memory 102, a CPU 104, a system bus 106, an input-output (IO) processor 108, and a subsystem 110, just to name a few.
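The per-pipeline power control described in paragraph [0038] can be contrasted with subsystem-wide throttling in a small sketch. The pipeline names, the per-pipeline power readings, the thresholds, and the choice of "greater than or equal to" as the threshold condition are assumptions made for illustration.

```python
def pipelines_to_throttle(power_mw: dict[str, float],
                          threshold_mw: dict[str, float]) -> list[str]:
    """Return only the pipelines whose measured power meets their threshold."""
    return [name for name, power in power_mw.items()
            if power >= threshold_mw[name]]


# Hypothetical power readings for three pipelines, in milliwatts.
readings = {"pipeline1": 120.0, "pipeline2": 40.0, "pipelineN": 95.0}
limits = {"pipeline1": 100.0, "pipeline2": 100.0, "pipelineN": 100.0}

# Only pipelines meeting the threshold condition receive a power control signal.
throttled = pipelines_to_throttle(readings, limits)

# The remaining pipelines keep running at full speed.
untouched = [name for name in readings if name not in throttled]
```

With these assumed readings only one pipeline is throttled and the other two are left alone, whereas a subsystem-wide DVFS step would have slowed all three.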
[0040] The subsystem 110 may be configured as, e.g., one or more of a microcontroller unit (MCU), a CPU, a GPU, a microcontroller, a processor, a microprocessor, a multiprocessor, a DSP core, a circuit, a memory unit, ROM, RAM, clock signal generators, I/O interfaces, analog interfaces, voltage regulators and power management circuits, an advanced peripheral unit(s), wireless communication unit(s) (e.g., Wi-Fi module, cellular module, 5G NR module, Bluetooth® module, etc.), or coprocessors, just to name a few. [0041] The subsystem 110 may comprise a plurality of functional blocks (e.g., seen in FIG. 2) configured to process one or more instructions. To enable parallel processing, the functional blocks may be organized into a plurality of pipelines. Each pipeline may be configured to process a portion of an instruction such that the pipelines process different portions of the instruction in parallel. A portion of an instruction may include a series of steps that are processed sequentially. An example subsystem architecture including a plurality of functional blocks is illustrated in FIG. 2. [0042] In certain implementations, the subsystem 110 may include an application scenario unit 130 that is configured to send application information to a configuration training system 140. The application information may be associated with an application performed and/or run by the functional blocks. By way of example and not limitation, the application information may indicate, among others, pipeline architecture, instruction(s), character variables, etc. In certain implementations, the configuration training system 140 may be configured to generate (125) configuration information (e.g., power threshold, sliding window size, throttle duration, etc.) 
and/or function execution control state information (e.g., allocation of resources) based at least in part on the application information and/or feedback information (e.g., thermal feedback, power feedback, performance feedback, frequency feedback, etc.) from the plurality of functional blocks. [0043] In some implementations, such as the one illustrated in FIG. 1, the power controller 150 may be located at the subsystem 110. In certain other implementations, the power controller 150 may be located externally to subsystem 110. When externally located, the power controller 150 may be in communication with the subsystem 110 using wired or wireless communication. [0044] In certain other implementations, subsystem 110 may also include a configuration training system 140 that may configure the threshold conditions used by the power controller 150 for configurable power management. In certain configurations, one or more of the configuration training system 140 and/or the power controller 150 may be located external to the subsystem 110. When located externally, the configuration training system 140 and/or the power controller 150 may be in communication with the subsystem 110. Additional details associated with the configuration training system 140 are set forth below in connection with FIG. 3B. [0045] FIG. 2 illustrates a more detailed view of subsystem 110 from FIG. 1, in accordance with certain aspects of the disclosure. FIG. 3A illustrates a block diagram 300 of a power controller configured to perform configurable power management, in accordance with certain aspects of the disclosure. As seen in FIG. 3A, the power controller 150 may be in communication with a plurality of pipelines (e.g., pipeline1 218, pipeline2 219 ... pipelineN 220). 
Although the power controller 150 in FIG.3A is illustrated as being in communication with three pipelines, the power controller 150 may be in communication with more or fewer than three pipelines without departing from the scope of the present disclosure. FIG.3B illustrates a detailed view of a power meter 312 that may be included in the power controller 150, in accordance with certain aspects of the disclosure. FIG. 3C illustrates another block diagram of subsystem 110, in accordance with certain aspects of the disclosure. FIG. 3D illustrates another block diagram of a power controller 150, in accordance with certain aspects of the present disclosure. FIGs.2, 3A, 3B, 3C, and 3D will now be described together. [0046] The subsystem 110 illustrated in FIG. 2 is configured as a processor, and, hence, the functional blocks described below are those associated with a processor. However, subsystem 110 is not limited to a processor and may include one or more of the other non-limiting examples of subsystem 110 described above in connection with FIG.1. When subsystem 110 is configured as something other than a processor, a different combination of functional blocks than those described below may be included in subsystem 110 without departing from the scope of the present disclosure. Also, the number of functional blocks illustrated in FIG.2 is for illustrative purposes only and not limited thereto. In other words, subsystem 110 may include any different number or type of functional blocks without departing from the scope of the present disclosure. [0047] Referring to FIG.2, subsystem 110 may comprise a plurality of functional blocks configured to process at least one instruction (e.g., the instruction, hereinafter). The plurality of functional blocks illustrated in FIG. 
2 may include, e.g., an instruction cache 202, one or more instruction buffers 204, a plurality of arithmetic and logic units (ALUs) 206, a plurality of load / store units (LD/STs) 208, a plurality of special function units (SFUs) 210, a texture / L1 cache 212, a plurality of texture units (TEXs) 214, and/or a common register file 216, just to name a few. In an example embodiment, pipeline1 218 may include instruction cache 202, instruction buffer 204, ALU 206, and common register file 216. [0048] As mentioned above, the plurality of pipelines may be configured to process an instruction, logic, and/or dedicated function. In certain implementations, the instruction may include a plurality of portions that may be processed concurrently using different pipelines that are each comprised of a different set of functional blocks. [0049] Furthermore, each portion of the instruction may include a set of stages. Each of the stages may be sequentially processed by one or more of the functional blocks in the pipeline. In certain implementations, a pipeline may include a register after each stage. The registers may be configured to store information from the instruction and/or calculation(s) from one stage so that the logic gates of the next stage in the pipeline may perform the subsequent step using the information in the register of the previous stage. After each stage, a handshake signal may be exchanged with the next downstream functional block indicating that the previous stage is complete and that the next stage should begin. Each stage may consume a certain amount of power. The power consumed by a stage may be referred to as a power event. Furthermore, pipelines may be included in a same device, subsystem, or functional block. Additionally and/or alternatively, pipelines may be located in separate devices, subsystems, or functional blocks. 
[0050] By way of example and not limitation, assume that the portion of the instruction processed by pipeline1 218 includes five stages and that each stage is associated with a power event. Using this example, the first stage may include fetching (e.g., first power event) the portion of the instruction into the instruction buffer 204. The second stage may include fetching (e.g., second power event) the decoded portion of the instruction into the ALU 206. The third stage may include executing (e.g., third power event) a calculation at the ALU 206. The fourth stage may include accessing (e.g., fourth power event) the common register file 216. The fifth stage may include writing (e.g., fifth power event) information associated with the calculation to the common register file 216. One of ordinary skill in the art understands that more or fewer than five stages may be associated with each portion of the instruction without departing from the scope of the present disclosure. The five stages are likewise not limited to the operations described above. Instead, the stages may include any number of different stages each performing any operation without departing from the scope of the present disclosure. [0051] Each time information is stored or fetched into one of the registers, or when a handshake signal is exchanged between two functional blocks in a pipeline, event information 301 (e.g., a power event signal) may be sent to the power controller 150. Event information 301 may include power information indicating an amount or percentage of the power consumed during a stage. The event information 301 may include a plurality of items of event information. For example, the event information 301 may include first event information associated with a first power event, second event information associated with a second power event, and so on. 
Furthermore, first event information may be sent at a first time (e.g., t0), second event information may be sent at a second time (e.g., t1), and so on. [0052] Each clock cycle, the power controller 150 may receive a plurality of event information 301 for each pipeline. The event counter unit 314 may calculate the total power for, e.g., pipeline1 218 during the clock cycle by summing the power bits indicated in each of the event information 301. Each clock cycle, the event counter unit 314 may send the event information that indicates the total power consumed by the pipeline during that clock cycle. [0053] Referring to FIG. 3A, the power controller 150 may be a central power controller configured to probe the power events for each pipeline (e.g., pipeline1 218, pipeline2 219, pipelineN 220) in a subsystem 110. As seen in FIG. 3A, the power controller 150 may include a plurality of power meters 312, each associated with one or more pipelines. Additional details of power meter 312 will now be described in connection with FIG. 3B. [0054] As seen in FIG. 3B, the power meter 312 may include an event counter unit 314, a sensitivity level unit 320, a sliding window unit 316, a throttle generator 318, a sensitivity level configuration unit 332, a sliding window configuration unit 334, and a throttle configuration unit 336, among others. In certain implementations, the sensitivity level unit 320 may be part of the event counter unit 314. In certain other implementations, the sensitivity level unit 320 may be separate from but in communication with the event counter unit 314. As mentioned above, power meter 312 may be configured to probe power events for one or more pipelines. For illustrative purposes, power meter 312 will be described below as probing power events for a single pipeline, e.g., pipeline1 218. [0055] As seen in FIG. 3B, the event counter unit 314 may monitor the power events for its respective pipeline(s) every clock cycle. 
The event counter unit 314 may be configured to receive event information 301 each time a power event occurs in pipeline1 218. Each of the event information 301 may include, e.g., power bit information indicating an amount of power associated with that event. The event counter unit 314 may calculate the total power for, e.g., pipeline1 218 during the clock cycle by summing the power bits indicated in each of the event information 301. Each clock cycle, the event counter unit 314 may send the event information that indicates the total power consumed by the pipeline during that clock cycle. [0056] The sensitivity level unit 320 may be configured to determine whether the event information 301 meets a power threshold (e.g., a power percentage for a clock cycle). By way of example and not limitation, assume that the total power consumed by pipeline1 218 reaches 90% of the maximum allowable power for that clock cycle. The sensitivity level unit 320 may then send the event information 301 to the sliding window unit 316 when the power threshold of 90% is reached. The sensitivity level unit 320 may include a regulator trigger in the event information sent to the sliding window unit 316. Otherwise, when the total power consumed during that clock cycle is less than 90%, the event information 301 may not be sent to the sliding window unit 316. In certain implementations, the sensitivity level may be set to 100%. In this case, the sensitivity level unit 320 may be omitted, and the event information may be sent directly to the sliding window unit 316. [0057] The sliding window unit 316 may be configured to determine whether the event information 301 meets a cycle threshold. The cycle threshold may be the sliding window size. In certain implementations, the sliding window size may include a predetermined number of clock cycles (e.g., 1 clock cycle, 2 clock cycles, 10 clock cycles, 50 clock cycles, 100 clock cycles, etc.). 
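By way of illustration only, the per-cycle behavior of the event counter unit 314 and the sensitivity level unit 320 described above may be sketched in Python as follows. The function names, the linear coefficient/offset form of the power model, and the integer percentage comparison are assumptions made for this sketch and are not taken from the disclosure.

```python
from typing import Iterable


def total_cycle_power(power_bits: Iterable[int],
                      coefficient: int = 1,
                      offset: int = 0) -> int:
    """Event counter (hypothetical model): sum the power bits carried by all
    event information received for one pipeline during one clock cycle.
    The linear coefficient/offset form is an assumption of this sketch."""
    return coefficient * sum(power_bits) + offset


def regulator_trigger(total_power: int,
                      max_power: int,
                      power_threshold_pct: int = 90) -> bool:
    """Sensitivity level (hypothetical model): return True when the cycle's
    total power reaches the configured percentage of the maximum allowable
    power, i.e., when event information would be forwarded to the sliding
    window unit together with a regulator trigger."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return total_power * 100 >= power_threshold_pct * max_power


# Example: five power events in one clock cycle, maximum allowable power 10.
cycle_power = total_cycle_power([2, 2, 3, 1, 1])
forwarded = regulator_trigger(cycle_power, max_power=10)
```

In this example the five events sum to 9, which meets the 90% threshold of a maximum of 10, so the event information would be forwarded.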
[0058] By way of example and not limitation, assume the sliding window size (e.g., the cycle threshold) is 100 cycles and is associated with pipeline1 218. Here, the sliding window unit 316 may determine that the cycle threshold is met when event information 301 associated with pipeline1 218 is received from the sensitivity level unit 320 for 100 clock cycles, e.g., indicating that the power threshold was reached for 100 cycles. [0059] In certain implementations, the sliding window unit 316 may include a state machine. The state machine may include a wait trigger state, a threshold set state, and/or a throttle enable state. The sliding window unit 316 may remain in the wait trigger state until event information including a regulator trigger is received from the sensitivity level unit 320. The regulator trigger may cause the state machine to transition to the threshold set state. The threshold set state may increment a counter upon receipt of the regulator trigger. When a threshold number of regulator triggers is received (e.g., when the power threshold at the sensitivity level unit 320 is reached for a predetermined number of clock cycles), the sliding window unit 316 may transition to the throttle enable state. A throttle enable signal may be sent to the throttle generator 318 upon transitioning to the throttle enable state. When the power threshold is not reached at the sensitivity level unit 320, a regulator deassert signal may be sent to the sliding window unit 316. In response, the sliding window unit 316 may transition from the throttle enable state to the wait trigger state. The counter associated with the regulator triggers may be reset upon transitioning to the wait trigger state. [0060] As mentioned above, the sliding window unit 316 may send a signal that instructs the throttle generator 318 to assert a power control signal 303 (e.g., throttle enable signal) for pipeline1 218. 
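By way of illustration only, the state machine of paragraph [0059] may be sketched in Python as follows. The class and member names are hypothetical. The sketch also assumes that a regulator deassert received in the threshold set state resets the counter in the same way as a deassert received in the throttle enable state; the disclosure describes only the latter transition.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_TRIGGER = auto()
    THRESHOLD_SET = auto()
    THROTTLE_ENABLE = auto()


class SlidingWindow:
    """Hypothetical model of the sliding window unit for one pipeline."""

    def __init__(self, cycle_threshold: int) -> None:
        self.cycle_threshold = cycle_threshold  # sliding window size
        self.state = State.WAIT_TRIGGER
        self.count = 0  # regulator triggers counted so far

    def step(self, trigger: bool) -> bool:
        """Advance one clock cycle. `trigger` is True for a regulator
        trigger and False for a regulator deassert. Returns True while
        the throttle enable signal would be asserted."""
        if not trigger:
            # Deassert: return to the wait trigger state and reset the counter.
            self.state = State.WAIT_TRIGGER
            self.count = 0
            return False
        if self.state is State.WAIT_TRIGGER:
            self.state = State.THRESHOLD_SET
        if self.state is State.THRESHOLD_SET:
            self.count += 1
            if self.count >= self.cycle_threshold:
                self.state = State.THROTTLE_ENABLE
        return self.state is State.THROTTLE_ENABLE


# Example with a window of 3 cycles: throttling starts on the third
# consecutive trigger and stops on the first deassert.
window = SlidingWindow(cycle_threshold=3)
trace = [window.step(t) for t in (True, True, True, True, False, True)]
```

A window of three cycles is used here only to keep the trace short; the disclosure gives window sizes such as 50 or 100 clock cycles.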
The power control signal 303 may be asserted until a deassert trigger is received from the sliding window unit 316. [0061] In certain implementations, the sliding window unit 316 may be configured to initiate DVFS power management 305 for the subsystem 110 when certain conditions are met. For example, if the sliding window unit 316 determines that the power or temperature of a pipeline and/or subsystem reaches a DVFS set point, then a signal to initiate DVFS power management 305 may be asserted to adjust the frequency, or the voltage, or both. In certain implementations, the sliding window unit 316 may be configured to initiate clock or power gating for one or more functional blocks or pipelines. [0062] In certain implementations, the power controller 150 may be in communication with a configuration training system, e.g., such as the configuration training system 140 in FIGs. 1 and 2. In certain implementations, the configuration training system 140 may be configured to receive second status feedback information 360 (e.g., thermal feedback, current feedback, power feedback, performance feedback, voltage feedback, frequency feedback, etc.) from the plurality of functional blocks 302 and/or one or more pipelines. The second status feedback information 360 may include pipeline-level feedback, functional block-level feedback, and/or subsystem-level feedback. For example, the second status feedback information 360 may include thermal feedback, current feedback, power feedback, performance feedback, voltage feedback, and/or frequency feedback associated with each pipeline. The feedback need not be provided at fixed times; it may be provided multiple times so that the training system can find the best configuration at a given point. The configuration training system 140 may be configured to generate configuration information 330 based at least in part on the second status feedback information 360. 
The configuration information 330 may be used to configure the power threshold (e.g., sensitivity level information), the cycle threshold (e.g., sliding window size), throttle enable / disable information, etc. The configuration training system 140 may select configuration information 330 that provides a desirable thermal, current, and performance tradeoff. Additional details regarding the generation of the configuration information 330 are described below in connection with FIG. 3C. [0063] The sensitivity level configuration unit 332 may be configured to receive configuration information 330 associated with the power threshold (e.g., 20%, 30%, 50%, 85%, 90%, etc.). The sliding window configuration unit 334 may be configured to receive configuration information 330 associated with a sliding window size (e.g., 5 clock cycles, 10 clock cycles, 50 clock cycles, 100 clock cycles) that may be used as the cycle threshold. The throttle configuration unit 336 may be configured to receive throttle amount information (e.g., a percentage to throttle the power of the pipeline or functional block(s)) associated with a throttle enable signal and/or throttle disable signal. [0064] In certain implementations, feedback information 340 (e.g., thermal feedback, current feedback, power feedback, voltage feedback, frequency feedback, technology feedback, etc.) may be sent to the event counter unit 314, sensitivity level unit 320, sliding window unit 316, and throttle generator 318. Feedback information 340 may be used to adjust the power model to achieve more accurate power detection. Additionally and/or alternatively, the configuration training system 140 may send feedback information 340 to one or more of the event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and/or throttle generator 318. 
The event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and/or throttle generator 318 may use the feedback information 340 to reconfigure, e.g., an offset and/or coefficient used in summing the power bits, a power threshold (e.g., sensitivity level), a sliding window size, and an amount to throttle, respectively. For example, when the feedback information 340 indicates that the temperature of the subsystem 110 is too high, the sensitivity level unit 320 may lower the power threshold, and the sliding window unit 316 may reduce the cycle threshold. By increasing the sensitivity level and/or the number of cycles the power control signal 303 is asserted, the overall temperature of the subsystem 110 may be decreased. [0065] Additional details associated with the use of configuration information 330 and feedback information 340 in performing configurable power management by the power controller 150 are described below in connection with FIGs.3C and 3D. [0066] Referring to FIG.3C, the subsystem 110 may include an application scenario unit 130, a configuration training system 140, a power controller 150, and a plurality of functional blocks 302. The plurality of functional blocks 302 may correspond to, e.g., the plurality of functional blocks 202, 204, 206, 208, 210, 212, 214, 216 described above in connection with FIG. 2. The plurality of functional blocks 302 may be organized into a plurality of pipelines, e.g., such as the plurality of pipelines 218, 219, 220, etc. described above in connection with FIGs.2 and 3A. [0067] The application scenario unit 130 may be configured to maintain or access application information 350 related to at least one application that may be run by the subsystem 110. In certain aspects, the application information 350 may include information associated with a pipeline architecture within the plurality of functional blocks 302. 
Information associated with the pipeline architecture may indicate which pipelines process which instruction(s), portions of an instruction, fixed functions, and/or logic associated with the application. In certain other aspects, the application information 350 may include information that relates particular clock cycles to a particular instruction(s), portions of an instruction, fixed functions, and/or logic. In certain other aspects, the application information 350 may include a character variable (e.g., type of application, workload, control state, etc.) associated with an application, thread information, or information related to the allocation of resources within a pipeline for concurrent processing of threads, etc. For example, the application information 350 may indicate that pipeline 1 performs instruction 1 at a first clock cycle, pipeline 2 performs instruction 2 at the first clock cycle, pipeline 1 performs instruction 3 at the second clock cycle, and so on. [0068] Prior to running the application at the plurality of functional blocks 302, the application scenario unit 130 may send the application information 350 to the configuration training system 140. The configuration training system 140 may generate configuration information 330 based at least in part on the application information 350. [0069] In certain implementations, the configuration training system 140 may be configured to access correlation information that correlates application information 350 to configuration information 330. By way of example and not limitation, the correlation information may include, e.g., one or more of a lookup table, a neural network, a database, an artificial intelligence engine, a machine learning engine, etc. In certain implementations, the correlation information may be maintained locally at the configuration training system 140 and/or subsystem 110. 
However, in certain other implementations, the correlation information may be located remotely from the configuration training system 140. When located remotely, the configuration training system 140 may access the correlation information using wired or wireless communication. [0070] Based on a comparison of the application information 350 and the correlation information, the configuration training system 140 may generate configuration information 330. As mentioned above in connection with FIG.3B, the configuration information 330 may configure power controller 150 to use certain threshold parameters (e.g., coefficient and/or offset used to calculate a total power at the event counter unit 314, power threshold, cycle threshold, throttle amount, etc.). The configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times. For example, the configuration information 330 may include first configuration information that is sent at a first time (e.g., t0), second configuration information that is sent at a second time (e.g., t1), and so on. [0071] By way of example and not limitation, assume that application information 350 indicates a first application scenario. Here, the configuration training system 140 may generate first configuration information that includes first threshold parameters. The first threshold parameters may include one or more of, e.g., a coefficient and/or offset used by the event counter unit 314 to sum the power associated with power events, a first power threshold that may be used by the sensitivity level unit 320, a first sliding window size (e.g., first cycle threshold) that may be used by the sliding window unit 316, and/or a first throttle amount used by the throttle generator 318 to assert a throttle enable signal and/or throttle disable signal. 
The first threshold parameters may provide a target power, thermal, and performance tradeoff at the subsystem 110 while the first application is being run. The first configuration information may be associated with a first set of performance characteristics (e.g., speed, thermal, etc.). Once generated, the first configuration information may be sent to the power controller 150. [0072] While the application is running, the power controller 150 may use the first threshold parameters to perform operations associated with configurable power management, e.g., described above in connection with FIGs. 3A and 3B. The plurality of functional blocks 302 may send second status feedback information 360 to the configuration training system 140 that is related to subsystem performance (e.g., thermal information, performance information, frequency information, the first threshold parameters, etc.). The second status feedback information 360 may include different feedback information that may be used by the configuration training system 140 at different times. For example, the second status feedback information 360 may include first information that is received in response to the first threshold parameters (e.g., indicated by the first configuration information), second status feedback information 360 received in response to the second threshold parameters (e.g., indicated by the second configuration information), and so on. [0073] The second status feedback information 360 may include one or more of power information, performance information, current information, or thermal information associated with each pipeline and/or the plurality of functional blocks 302. The second status feedback information 360 may be sent at the end of each clock cycle, at the end of a predetermined number of clock cycles (e.g., 2 clock cycles, 3 clock cycles, 10 clock cycles, 100 clock cycles, etc.), or upon request from the configuration training system 140. 
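By way of illustration only, the case in which the correlation information is a lookup table may be sketched in Python as follows. The scenario names, the throttle amounts, and the default entry are placeholders invented for this sketch; the power thresholds and sliding window sizes reuse example values given above. None of these pairings is stated in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdParams:
    """Hypothetical container for the threshold parameters carried by
    the configuration information."""
    power_threshold_pct: int  # used by the sensitivity level unit
    cycle_threshold: int      # sliding window size, in clock cycles
    throttle_pct: int         # throttle amount for the throttle generator


# Placeholder correlation information: application scenario -> parameters.
CORRELATION_TABLE = {
    "scenario_a": ThresholdParams(90, 100, 10),
    "scenario_b": ThresholdParams(85, 50, 25),
}

# Fallback for a scenario with no entry (an assumption of this sketch).
DEFAULT_PARAMS = ThresholdParams(90, 100, 10)


def generate_configuration(scenario: str) -> ThresholdParams:
    """Compare the application information (reduced here to a scenario
    name) with the correlation information and return the matching
    threshold parameters."""
    return CORRELATION_TABLE.get(scenario, DEFAULT_PARAMS)


first_config = generate_configuration("scenario_b")
```

A table keeps the lookup deterministic and easy to inspect; the disclosure also contemplates a neural network, database, or machine learning engine in the same role.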
[0074] The configuration training system 140 may update the correlation information to associate the feedback information with the application. If the feedback information indicates that certain targets (e.g., power, thermal, etc.) have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and/or performance target of the subsystem 110. For example, the second configuration information may include a second set of threshold parameters (e.g., coefficient and/or offsets associated with calculating total power, power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150. The second configuration information may be generated using, e.g., threshold parameters that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150. The power controller 150 may perform configurable power management of the plurality of functional blocks 302 using the second configuration information. Second status feedback information 360 may be received multiple times while an application is running. Each time second status feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new configuration information 330, and may eventually find the best configuration at a certain point. The new configuration information 330 may be sent to the power controller 150 to improve the power and/or performance of the subsystem 110. [0075] In certain implementations, the configuration training system 140 may generate function execution control state information 390 that allocates resources to the plurality of functional blocks 302. 
The resources may be used by the plurality of functional blocks 302 to organize and/or process the threads. In certain implementations, the allocation of resources may be at one or more of the pipeline level and/or the functional block level. The allocation of resources may affect subsystem performance because processing threads with a first allocation of resources may draw more power from the voltage source and, hence, generate greater thermal output than processing the threads using a second allocation of resources. Hence, the function execution control state information 390 may indicate an allocation of resources that provides target power and/or performance characteristics (e.g., speed, thermal, etc.) for a particular application scenario. [0076] In certain implementations, the configuration training system 140 may select the allocation of resources and/or other control states based at least in part on one or more of the application information 350 and/or the correlation information. In certain example embodiments, the application information 350 may indicate the allocation of resources and/or other control states for the application scenario. In certain other example embodiments, the configuration training system 140 may access the correlation information described above in connection with the configuration information 330 to select an allocation of resources and/or other control states. The correlation information may indicate, for a particular application scenario, a predetermined allocation of resources and/or other control states. For example, the predetermined allocation of resources and/or other control states may provide target subsystem performance for a particular application scenario. [0077] The configuration training system 140 may receive, from the plurality of functional blocks 302, status feedback information 360 indicating power and/or performance characteristics associated with running an application using the allocation of resources. 
Based at least in part on the status feedback information 360, the configuration training system 140 may determine whether a different allocation of resources and/or other control states may provide an improved subsystem power and/or performance. If the status feedback information 360 indicates that certain power and performance thresholds (e.g., thermal, current, frequency, voltage etc.) have not been met using a first function execution control state, the configuration training system 140 may determine a second function execution control state that may improve the power and/or performance of the subsystem 110. Additionally and/or alternatively, the configuration training system 140 may generate the second function execution control state using, e.g., machine learning. Once generated, function execution control state information 390 may be sent to the plurality of functional blocks 302. [0078] Additionally and/or alternatively, the configuration information 330 may include information associated with the function execution control state. The power controller 150 may perform configurable power management based at least in part on the allocation of resources and/or other control states indicated in the function execution control state information. Status feedback information 360 may be received multiple times while an application is being run. Each time feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a different allocation of resources and/or other control states. [0079] As mentioned above, the configuration training system 140 may send feedback information 340 (e.g., indicated in the second status feedback information 360) to the power controller 150. 
As mentioned in connection with FIG. 3B, event counter unit 314, the sensitivity level unit 320, the sliding window unit 316, and throttle generator 318 may respectively use the feedback information 340 to reconfigure one or more of the coefficients and/or offset used to sum the total power, a power threshold (e.g., sensitivity level), sliding window size, and/or a throttling amount on the fly. Additional details associated with configurable power management by the power controller 150 will now be described in connection with FIG. 3D. [0080] As seen in FIG. 3D, a neural network 380 of the configuration training system 140 may receive, e.g., one or more of configuration information 330, status information 360, event information 301, function execution control state information 390, or miscellaneous information 309. In certain implementations, the miscellaneous information 309 may include frequency information, voltage information, etc. The neural network 380 may use one or more of the configuration information 330, status information 360, event information 301, and/or miscellaneous information 309 to identify a power management technique 307 to perform. For example, the power management technique 307 may indicate sending a power control signal 303 (e.g., throttle enable, throttle disable, etc.) to one or more pipelines in the plurality of functional blocks 302, initiating DVFS power management 305 for the subsystem 110, sending function execution control state information 390 that controls the execution of the plurality of functional blocks 302 (e.g., thread, pipeline, etc.), or miscellaneous power management 311 (e.g., frequency reduction, block(s) power gating, etc.). A signal indicating the selected power management technique 307 may be sent to the throttle generator 318. The throttle generator block (e.g., sliding window unit 316 and/or throttle generator 318) may apply the power management technique 307 to the plurality of functional blocks 302. 
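By way of illustration only, the choice among power management techniques may be sketched in Python as follows. The sketch is not the neural network 380; it substitutes a hand-written rule that considers only two of the techniques named above (per-pipeline throttling and DVFS) and uses a temperature-based DVFS set point. All names, the rule itself, and the temperature values are assumptions of this sketch.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Technique(Enum):
    NONE = "none"
    THROTTLE_PIPELINES = "power control signal 303"
    DVFS = "DVFS power management 305"


def select_technique(throttle_requests: Dict[str, bool],
                     temperature_c: float,
                     dvfs_set_point_c: float) -> Tuple[Technique, List[str]]:
    """Hypothetical rule-based stand-in for technique selection.

    throttle_requests maps a pipeline name to whether its sliding window
    has reached the throttle enable state. Returns the chosen technique
    and the pipelines it applies to."""
    if temperature_c >= dvfs_set_point_c:
        # Extreme thermal condition: act on the whole subsystem.
        return Technique.DVFS, sorted(throttle_requests)
    hot = sorted(name for name, req in throttle_requests.items() if req)
    if hot:
        # Throttle only the pipelines that met the threshold condition.
        return Technique.THROTTLE_PIPELINES, hot
    return Technique.NONE, []


requests = {"pipeline1": True, "pipeline2": False}
normal = select_technique(requests, temperature_c=70.0, dvfs_set_point_c=95.0)
extreme = select_technique(requests, temperature_c=100.0, dvfs_set_point_c=95.0)
```

Under the lower temperature only pipeline1 is throttled, while at or above the set point the rule falls back to subsystem-wide DVFS.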
[0081] Thus, by performing configurable power management for only those pipelines that meet a threshold condition, the performance degradation of the subsystem 110 due to power limits management may be minimized. In extreme thermal conditions, however, the power controller 150 of the present disclosure may be enabled to perform DVFS power management, subsystem-level frequency reduction, etc.
[0082] FIG. 4 illustrates a block diagram of an exemplary system 400 for configurable power management, according to embodiments of the disclosure. In some embodiments, as shown in FIG. 4A, system 400 may include a processor 404, a memory 406, and a storage 408. In some embodiments, system 400 may have different modules in a single device, such as an integrated circuit (IC) chip (e.g., implemented as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA)), or separate devices with dedicated functions. In some embodiments, one or more components of system 400 may be located in a cloud or may be alternatively in a single location (such as inside a mobile device) or distributed locations. Components of system 400 may be in an integrated device or distributed at different locations but communicate with each other through a network (not shown). Consistent with the present disclosure, system 400 may be configured to perform configurable power management.
[0083] Processor 404 may include any appropriate type of general-purpose or special-purpose circuit, microprocessor, digital signal processor, or microcontroller. Processor 404 may be configured as a separate processor module dedicated to performing configurable power management. Alternatively, processor 404 may be configured as a shared processor module for performing other functions in addition to performing configurable power management.
[0084] Memory 406 and storage 408 may include any appropriate type of mass storage provided to store any type of information that processor 404 may need to operate. Memory 406 and storage 408 may be a volatile or non-volatile, magnetic, semiconductor, tape, optical, removable, non-removable, or other type of storage device or tangible (i.e., non-transitory) computer-readable medium including, but not limited to, a ROM, a flash memory, a dynamic RAM, and a static RAM. Memory 406 and/or storage 408 may be configured to store one or more computer programs that may be executed by processor 404 to perform functions disclosed herein. For example, memory 406 and/or storage 408 may be configured to store program(s) that may be executed by processor 404 to perform configurable power management.
[0085] In some embodiments, memory 406 and/or storage 408 may also store various parameters including, e.g., coefficient and/or offset information, power threshold, cycle threshold, throttle amount(s), and/or a lookup table that correlates configuration information 330 and one or more of these parameters.
[0086] As shown in FIG. 4, processor 404 may include multiple modules, such as an application scenario unit 442, a configuration training system 444, a power controller 446, a plurality of functional blocks 448, and the like. These modules (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 404 designed for use with other components or software units implemented by processor 404 through executing at least part of a program. The program may be stored on a computer-readable medium, and when executed by processor 404, it may perform one or more functions. Although FIG. 4 shows units 442-448 all within one processor 404, it is contemplated that these units may be distributed among different processors located near to or remotely from each other.
[0087] In some embodiments, one or more of units 442-448 of FIG.
4 may execute computer instructions to perform configurable power management. FIG.5 illustrates a flowchart of an exemplary method 500 for generating configuration information, according to embodiments of the disclosure. Method 500 may be performed by system 400 and particularly processor 404 or a separate processor not shown in FIG. 4. Method 500 may include operations 502-514, as described below. It is to be appreciated that some of the steps may be optional, and some of the steps may be performed simultaneously, or in a different order than shown in FIG.5. In FIG.5, optional operations may be indicated with dashed lines. FIG.6 illustrates a data flow diagram 600 of an exemplary system for configurable power management, according to embodiments of the disclosure. FIGs.4-6 will be described together below. [0088] At operation 502, the configuration training system 444 may receive application information associated with the plurality of functional blocks 448. The application information may be received from the application scenario unit 442. For example, referring to FIG. 4C, the application scenario unit 130 may be configured to maintain or access application information 350 related to at least one application that may be run by the subsystem 110. In certain aspects, the application information 350 may include one or more of information related to a pipeline architecture within the plurality of functional blocks, which pipelines process which instruction(s) and/or portions of an instruction associated with the application, information that relates particular clock cycles to particular instruction(s) and/or portions of an instruction, or a character variable (e.g., type of application, workload, control state, etc.) associated with an application. Prior to running the application at the plurality of functional blocks, the application scenario unit 130 may send the application information 350 to the configuration training system 140. 
The configuration training system 140 may generate configuration information 330 based at least in part on the application information 350. [0089] At operation 504, the configuration training system 444 may receive status information from the plurality of functional blocks 448. For example, referring to FIG. 3C, the plurality of functional blocks 302 may send first status feedback information to the configuration training system 140 that is related to the power and performance of the subsystem 110 using the first threshold parameters. The first status feedback information may include one or more of power information, current information, or thermal information associated with each pipeline and/or the plurality of functional blocks 302. The second status feedback information 360 may be sent at the end of each clock cycle, at the end of a predetermined number of clock cycles (e.g., 2 clock cycles, 3 clock cycles, 10 clock cycles, 100 clock cycles, etc.), or upon request from the configuration training system 140. [0090] At operation 506, the configuration training system 444 may generate first power control information (e.g., first configuration information and/or feedback information 340) based at least in part on one or more of the application information or the status feedback information. For example, referring to FIG.3C, based on a comparison of the application information 350 and the correlation information, the configuration training system 140 may generate configuration information 330. As mentioned above in connection with FIG.3B, the configuration information 330 may configure the power controller 150 to use certain threshold parameters (e.g., coefficient and/or offset used in calculating total power at event counter unit 314, power threshold, cycle threshold, throttle amount, etc.). The configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times. 
For example, the configuration information 330 may include first configuration information that is sent at a first time (e.g., t0), second configuration information that is sent at a second time (e.g., t1), and so on. The configuration training system 140 may update the correlation information to include the status feedback information as an additional data point associated with the application scenario. If the first status feedback information indicates that power and/or performance targets have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and/or performance of the subsystem 110. For example, the second configuration information may include a second set of threshold parameters (e.g., power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150. The second configuration information may be generated using, e.g., the threshold parameters associated with a similar but different application scenario that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150. The power controller 150 may perform configurable power management of the plurality of functional blocks using the second configuration information.
[0091] At operation 508, the configuration training system 444 may generate first function execution control state information based at least in part on the application information and the first status feedback information. For example, referring to FIG. 3C, the configuration training system 140 may generate function execution control state information 390 that allocates resources to the plurality of functional blocks 302. The resources may be used by the plurality of functional blocks 302 to process the threads.
In certain implementations, the allocation of resources may be at the pipeline level and/or the functional block level. The allocation of resources may affect subsystem performance because processing threads with a first allocation of resources may draw more power from the voltage source and, hence, generate greater thermal output than processing the threads using a second allocation of resources. Hence, the function execution control state information 390 may indicate an allocation of resources that provides a target power and performance for a particular application scenario. In certain implementations, the configuration training system 140 may select the allocation of resources based at least in part on one or more of the application information 350 and/or the correlation information. In certain example embodiments, the application information 350 may indicate the allocation of resources to include in the function execution control state information 390. In certain other example embodiments, the configuration training system 140 may access the correlation information described above in connection with the configuration information 330. The correlation information may indicate, for a particular application scenario, a predetermined allocation of resources and/or other control states. For example, the predetermined allocation of resources indicated by the correlation information may provide a target subsystem power and performance for a particular application scenario. Based at least in part on the second status feedback information 360, the configuration training system 140 may determine whether a different allocation of resources may bring subsystem performance closer to its target while consuming less power. If the second status feedback information 360 indicates that certain performance thresholds (e.g., thermal, current, frequency, power, etc.)
have not been met using a first allocation of resources and/or other control states, the configuration training system 140 may determine a second allocation of resources and/or other control states that may improve the performance of the subsystem 110. For example, the second allocation of resources may indicate a different allocation of resources across the pipelines as compared to the first allocation of resources. The second allocation of resources may be generated using, e.g., the allocation of resources associated with a similar but different application scenario that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second allocation of resources and/or other control states using, e.g., machine learning. Once generated, function execution control state information 390 that indicates the second allocation of resources and/or other control states may be sent to the plurality of functional blocks 302. Additionally and/or alternatively, the configuration information 330 may include information associated with the allocation of resources and/or other control states. The power controller 150 may perform configurable power management based at least in part on the allocation of resources and/or other control states. Second status feedback information 360 may be received multiple times while an application is being run. Each time status feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a different allocation of resources or other states. Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and performance of the subsystem 110. 
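The feedback-driven selection of a second allocation of resources described above can be sketched as follows. This minimal Python sketch covers only the non-machine-learning path, in which the allocation associated with a similar but different application scenario is reused. All names (`AllocationTrainer`, `StatusFeedback`, `Limits`, the `similar_scenario` mapping) and the representation of an allocation as thread slots per pipeline are assumptions for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass, field

# Hypothetical representation: pipeline name -> number of thread slots.
Allocation = dict[str, int]


@dataclass(frozen=True)
class StatusFeedback:
    power: float    # e.g., watts
    thermal: float  # e.g., degrees Celsius
    current: float  # e.g., amperes


@dataclass(frozen=True)
class Limits:
    max_power: float
    max_thermal: float
    max_current: float


def thresholds_met(status: StatusFeedback, limits: Limits) -> bool:
    """True when every reported quantity is within its limit."""
    return (
        status.power <= limits.max_power
        and status.thermal <= limits.max_thermal
        and status.current <= limits.max_current
    )


@dataclass
class AllocationTrainer:
    # Correlation information: a predetermined allocation per application scenario.
    correlation: dict[str, Allocation]
    # Assumed mapping from a scenario to a "similar but different" scenario.
    similar_scenario: dict[str, str]
    # Status feedback kept as additional data points per scenario.
    history: dict[str, list[StatusFeedback]] = field(default_factory=dict)

    def first_allocation(self, scenario: str) -> Allocation:
        return dict(self.correlation[scenario])

    def on_feedback(self, scenario: str, current: Allocation,
                    status: StatusFeedback, limits: Limits) -> Allocation:
        """Return the allocation to use after one round of status feedback."""
        self.history.setdefault(scenario, []).append(status)
        if thresholds_met(status, limits):
            return current  # keep the first allocation
        neighbour = self.similar_scenario.get(scenario)
        if neighbour in self.correlation:
            return dict(self.correlation[neighbour])  # second allocation
        return current  # no alternative known in this sketch
```

Calling `on_feedback` each time status feedback arrives mirrors the repeated evaluation described in the text; a learned model could replace the lookup in `similar_scenario` without changing the calling pattern.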
[0092] At operation 510, the configuration training system 444 may send the first power control information to a power controller. For example, referring to FIG. 3C, once generated, the first configuration information may be sent to the power controller 150.
[0093] At operation 512, the configuration training system 444 may send the first function execution control state information to the plurality of functional blocks. For example, referring to FIG. 3C, function execution control state information 390 indicating an allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302.
[0094] At operation 514, the configuration training system 444 may receive second status feedback information from the plurality of functional blocks. For example, referring to FIG. 3C, each time status feedback information 360 is received, the configuration training system 140 may determine whether and how to generate new function execution control state information 390 indicating a new allocation of resources and/or other control states. Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and performance of the subsystem 110.
[0095] At operation 516, the configuration training system 444 may generate second power control information based at least in part on the second status feedback information. For example, referring to FIG. 3C, the configuration information 330 may include different threshold parameters that may be used by the power controller 150 at different times. For example, the configuration information 330 may include first configuration information that is sent at a first time (e.g., t0), second configuration information that is sent at a second time (e.g., t1), and so on.
The configuration training system 140 may update the correlation information to include the first status feedback information as an additional data point associated with the first application scenario. If the first status feedback information indicates that certain power and performance thresholds (e.g., thermal, current, frequency, etc.) have not been met using the first configuration information, the configuration training system 140 may generate second configuration information that may improve the power and performance of the subsystem 110. For example, the second configuration information may include a second set of threshold parameters (e.g., coefficient and/or offset used to calculate the total power at the event counter unit, power threshold, cycle threshold, throttle amount, etc.) for use by the power controller 150. The second configuration information may be generated using, e.g., the threshold parameters associated with a similar but different application scenario that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second configuration information using, e.g., machine learning. Once generated, the second configuration information may be sent to the power controller 150. The power controller 150 may perform configurable power management of the plurality of functional blocks using the second configuration information.
[0096] At operation 518, the configuration training system 444 may generate second function execution control state information based at least in part on the second status feedback information. For example, referring to FIG. 3C, based at least in part on the second status feedback information 360, the configuration training system 140 may determine whether and how a different allocation of resources and/or other control states may bring subsystem power and performance closer to their targets.
If the second status feedback information 360 indicates that certain power and performance thresholds (e.g., thermal, current, frequency, etc.) have not been met using a first allocation of resources and/or other control states, the configuration training system 140 may determine a second allocation of resources and/or other control states that may improve the power and performance of the subsystem 110. For example, the second allocation of resources may indicate a different allocation of resources across the pipelines as compared to the first allocation of resources. The second allocation of resources and/or other control states may be generated using, e.g., the allocation of resources and/or other control states associated with a similar but different application scenario that may improve certain power and/or performance characteristics. Additionally and/or alternatively, the configuration training system 140 may generate the second allocation of resources using, e.g., machine learning and/or artificial intelligence. Once generated, function execution control state information 390 that indicates the second allocation of resources may be sent to the plurality of functional blocks 302. Additionally and/or alternatively, the configuration information 330 may include information associated with the allocation of resources. The power controller 150 may perform configurable power management based at least in part on the allocation of resources. Second status feedback information 360 may be received multiple times while an application is being run. Each time status feedback information 360 is received, the configuration training system 140 may determine whether to generate new function execution control state information 390 indicating a different allocation of resources and/or other control states. 
Information associated with the new allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302 to improve the power and/or performance of the subsystem 110.
[0097] At operation 520, the configuration training system 444 may send the second power control information to the power controller. For example, referring to FIG. 3C, once generated, the second configuration information may be sent to the power controller 150.
[0098] At operation 522, the configuration training system 444 may send the second function execution control state information to the plurality of functional blocks. For example, referring to FIG. 3C, function execution control state information 390 indicating an allocation of resources and/or other control states may be sent to one or more of the power controller 150 or functional blocks 302.
[0099] Embodiments of the disclosure provide a configuration training system for configurable power management. The configuration training system may include a memory and at least one processor coupled to the memory. In certain aspects, the at least one processor may be configured to receive application information associated with a plurality of functional blocks. In certain other aspects, the at least one processor may be configured to receive first status feedback information from the plurality of functional blocks. In certain other aspects, the at least one processor may be configured to generate first power control information based at least in part on one or more of the application information or the first status feedback information. In certain other aspects, the at least one processor may generate first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the at least one processor may send the first power control information to a power controller.
In certain other aspects, the at least one processor may send the first function execution control state information to the plurality of functional blocks. [0100] In certain aspects, the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks. [0101] In certain other aspects, the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks. [0102] In certain other aspects, the first power control information may include one or more of a first power controller configuration or first thermal feedback information. [0103] In certain other aspects, the first function execution control state information may include instructions associated with a set of data flow processes performed by the plurality of functional blocks. [0104] In certain other aspects, the set of data flow processes may include one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information, etc. [0105] In certain other aspects, the at least one processor may be further configured to receive second status feedback information from the plurality of functional blocks. [0106] In certain other aspects, the at least one processor may be further configured to generate second power control information based at least in part on the second status feedback information. [0107] In certain other aspects, the at least one processor may be further configured to generate second function execution control state information based at least in part on the second status feedback information. [0108] In certain other aspects, the at least one processor may be further configured to send the second power control information to the power controller. 
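The information items enumerated above, and the way they are routed, can be summarized in a short Python sketch. The field names follow the listed contents (power, thermal, and performance information; a power controller configuration and thermal feedback; block enable and disable instructions, thread capacity, storage region, and task distribution), but the concrete types, the `run_round` helper, and the `toy_policy` decision rule are hypothetical and stand in for whatever logic or trained model an implementation would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StatusFeedback:
    power: float        # power information
    thermal: float      # thermal information
    performance: float  # performance information (units assumed)


@dataclass
class PowerControlInfo:
    controller_config: dict[str, float] = field(default_factory=dict)
    thermal_feedback: Optional[float] = None


@dataclass
class FunctionExecutionControlState:
    enabled_blocks: set[str] = field(default_factory=set)
    disabled_blocks: set[str] = field(default_factory=set)
    thread_capacity: dict[str, int] = field(default_factory=dict)
    storage_region: dict[str, str] = field(default_factory=dict)
    task_distribution: dict[str, list[str]] = field(default_factory=dict)


# A policy maps (application information, status feedback) to the two outputs.
Policy = Callable[
    [str, StatusFeedback],
    tuple[PowerControlInfo, FunctionExecutionControlState],
]


def run_round(app_info: str, status: StatusFeedback, policy: Policy,
              controller_inbox: list, blocks_inbox: list) -> None:
    """Generate both outputs, then send each to its own destination."""
    power_info, control_state = policy(app_info, status)
    controller_inbox.append(power_info)  # to the power controller
    blocks_inbox.append(control_state)   # to the functional blocks


def toy_policy(app_info: str, status: StatusFeedback):
    """Hypothetical rule: throttle and disable one block above 80 degrees."""
    hot = status.thermal > 80.0
    power_info = PowerControlInfo(
        controller_config={"throttle_amount": 0.5 if hot else 0.0},
        thermal_feedback=status.thermal,
    )
    control_state = FunctionExecutionControlState(
        enabled_blocks={"block0"} if hot else {"block0", "block1"},
        disabled_blocks={"block1"} if hot else set(),
        thread_capacity={"block0": 2 if hot else 4},
    )
    return power_info, control_state
```

Calling `run_round` once with the first status feedback and again with the second corresponds to the first and second rounds of power control information and function execution control state information; with `toy_policy`, the two rounds differ whenever the thermal reading crosses the assumed 80-degree limit.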
[0109] In certain other aspects, the at least one processor may be further configured to send the second function execution control state information to the plurality of functional blocks. [0110] In certain other aspects, the first power control information and the second power control information may be different. [0111] In certain other aspects, the first function execution control state information and the second function execution control state information may be different. [0112] In certain other aspects, the first power control information and the first function execution control state information may be associated with a first set of at least one of power or performance characteristics associated with the plurality of functional blocks. [0113] In certain other aspects, the second power control information and the second function execution control state information may be associated with a second set of at least one of power or performance characteristics associated with the plurality of functional blocks. [0114] In certain other aspects, the first set of at least one of power or performance characteristics and the second set of at least one of power or performance characteristics may be different. [0115] Embodiments of the disclosure provide a method for configurable power management of a configuration training system. The method may include receiving, at a configuration training system, application information associated with a plurality of functional blocks. In certain aspects, the method may further include receiving, at the configuration training system, first status feedback information from the plurality of functional blocks. In certain other aspects, the method may further include generating, at the configuration training system, first power control information based at least in part on one or more of the application information or the first status feedback information. 
In certain other aspects, the method may further include generating, at the configuration training system, first function execution control state information based at least in part on the application information and the first status feedback information. In certain other aspects, the method may further include sending the first power control information to a power controller. In certain other aspects, the method may further include sending the first function execution control state information to the plurality of functional blocks. [0116] In certain aspects, the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks. [0117] In certain other aspects, the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks. [0118] In certain other aspects, the first power control information may include one or more of a first power controller configuration or first thermal feedback information. [0119] In certain other aspects, the first function execution control state information may include instructions associated with a set of data flow processes performed by the plurality of functional blocks. [0120] In certain other aspects, the set of data flow processes may include one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information. [0121] In certain other aspects, the method may further include receiving, at the configuration training system, second status feedback information from the plurality of functional blocks. In certain other aspects, the method may further include generating, at the configuration training system, second power control information based at least in part on the second status feedback information. 
In certain other aspects, the method may further include generating, at the configuration training system, second function execution control state information based at least in part on the second status feedback information. In certain other aspects, the method may further include sending the second power control information to the power controller. In certain other aspects, the method may further include sending the second function execution control state information to the plurality of functional blocks. [0122] In certain other aspects, the first power control information and the second power control information may be different. [0123] In certain other aspects, the first function execution control state information and the second function execution control state information may be different. [0124] In certain other aspects, the first power control information and the first function execution control state information may be associated with a first set of at least one of power or performance characteristics associated with the plurality of functional blocks. [0125] In certain other aspects, the second power control information and the second function execution control state information may be associated with a second set of at least one of power or performance characteristics associated with the plurality of functional blocks. [0126] In certain other aspects, the first set of at least one of power or performance characteristics and the second set of at least one of power or performance characteristics may be different. [0127] Embodiments of the disclosure further provide a non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, causes the one or more processors to perform configurable power management of a configuration training system. The method may further include generating second power control information based at least in part on the second status feedback information. 
In certain other aspects, the method may further include generating second function execution control state information based at least in part on the second status feedback information. In certain other aspects, the method may further include sending the second power control information to the power controller. In certain other aspects, the method may further include sending the second function execution control state information to the plurality of functional blocks. [0128] In certain other aspects, the application information may include at least one character variable associated with a set of operations performed by the plurality of functional blocks. [0129] In certain other aspects, the first status feedback information may include one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks. [0130] In certain aspects, the first power control information may include one or more of a first power controller configuration or first thermal feedback information. [0131] The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present disclosure as contemplated by the inventor(s), and thus, are not intended to limit the present disclosure and the appended claims in any way. [0132] Various functional blocks, modules, and steps are disclosed above. The particular arrangements provided are illustrative and without limitation. Accordingly, the functional blocks, modules, and steps may be re-ordered or combined in different ways than in the examples provided above. Likewise, certain embodiments include only a subset of the functional blocks, modules, and steps, and any such subset is permitted. [0133] The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

WHAT IS CLAIMED IS: 1. A configuration training system, comprising: a memory; and at least one processor coupled to the memory, configured to: receive application information associated with a plurality of functional blocks; receive first status feedback information from the plurality of functional blocks; generate first power control information based at least in part on one or more of the application information or the first status feedback information; generate first function execution control state information based at least in part on the application information and the first status feedback information; send the first power control information to a power controller; and send the first function execution control state information to the plurality of functional blocks.
2. The configuration training system of claim 1, wherein the application information includes at least one character variable associated with a set of operations performed by the plurality of functional blocks.
3. The configuration training system of claim 1, wherein the first status feedback information includes one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
4. The configuration training system of claim 1, wherein the first power control information includes one or more of a first power controller configuration or first thermal feedback information.
5. The configuration training system of claim 1, wherein the first function execution control state information includes instructions associated with a set of data flow processes performed by the plurality of functional blocks.
6. The configuration training system of claim 5, wherein the set of data flow processes includes one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information, etc.
7. The configuration training system of claim 1, wherein the at least one processor is further configured to: receive second status feedback information from the plurality of functional blocks; generate second power control information based at least in part on the second status feedback information; generate second function execution control state information based at least in part on the second status feedback information; send the second power control information to the power controller; and send the second function execution control state information to the plurality of functional blocks, wherein the first power control information and the second power control information are different, and wherein the first function execution control state information and the second function execution control state information may be different.
8. The configuration training system of claim 7, wherein: the first power control information and the first function execution control state information are associated with a first set of at least one of power or performance characteristics associated with the plurality of functional blocks, the second power control information and the second function execution control state information are associated with a second set of at least one of power or performance characteristics associated with the plurality of functional blocks, and the first set of at least one of power or performance characteristics and the second set of at least one of power or performance characteristics are different.
9. A method of power management, comprising: receiving, at a configuration training system, application information associated with a plurality of functional blocks; receiving, at the configuration training system, first status feedback information from the plurality of functional blocks; generating, at the configuration training system, first power control information based at least in part on one or more of the application information or the first status feedback information; generating, at the configuration training system, first function execution control state information based at least in part on the application information and the first status feedback information; sending the first power control information to a power controller; and sending the first function execution control state information to the plurality of functional blocks.
10. The method of claim 9, wherein the application information includes at least one character variable associated with a set of operations performed by the plurality of functional blocks.
11. The method of claim 9, wherein the first status feedback information includes one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
12. The method of claim 9, wherein the first power control information includes one or more of a first power controller configuration or first thermal feedback information.
13. The method of claim 9, wherein the first function execution control state information includes instructions associated with a set of data flow processes performed by the plurality of functional blocks.
14. The method of claim 13, wherein the set of data flow processes includes one or more of functional block enable instructions, functional block disable instructions, thread capacity information, storage region information, or task distribution information.
15. The method of claim 9, further comprising: receiving, at the configuration training system, second status feedback information from the plurality of functional blocks; generating, at the configuration training system, second power control information based at least in part on the second status feedback information; generating, at the configuration training system, second function execution control state information based at least in part on the second status feedback information; sending the second power control information to the power controller; and sending the second function execution control state information to the plurality of functional blocks, wherein the first power control information and the second power control information may be different, and wherein the first function execution control state information and the second function execution control state information may be different.
16. The method of claim 15, wherein: the first power control information and the first function execution control state information are associated with a first set of performance characteristics associated with the plurality of functional blocks, the second power control information and the second function execution control state information are associated with a second set of performance characteristics associated with the plurality of functional blocks, and the first set of performance characteristics and the second set of performance characteristics may be different.
17. A non-transitory computer-readable medium having stored thereon computer instructions that, when executed by at least one processor, perform a method of power management for a subsystem, the method comprising: receiving application information associated with a plurality of functional blocks; receiving first status feedback information from the plurality of functional blocks; generating first power control information based at least in part on one or more of the application information or the first status feedback information; generating first function execution control state information based at least in part on the application information and the first status feedback information; sending the first power control information to a power controller; and sending the first function execution control state information to the plurality of functional blocks.
18. The non-transitory computer-readable medium of claim 17, wherein the application information includes at least one character variable associated with a set of operations performed by the plurality of functional blocks.
19. The non-transitory computer-readable medium of claim 17, wherein the first status feedback information includes one or more of first power information, first thermal information, or first performance information associated with the plurality of functional blocks.
20. The non-transitory computer-readable medium of claim 17, wherein the first power control information includes one or more of a first power controller configuration or first thermal feedback information.
PCT/US2021/014235 2021-01-20 2021-01-20 Apparatus and method of intelligent power and performance management WO2021056033A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2021/014235 WO2021056033A2 (en) 2021-01-20 2021-01-20 Apparatus and method of intelligent power and performance management

Publications (2)

Publication Number Publication Date
WO2021056033A2 true WO2021056033A2 (en) 2021-03-25
WO2021056033A3 WO2021056033A3 (en) 2021-06-03

Family

ID=74884240

Country Status (1)

Country Link
WO (1) WO2021056033A2 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8639862B2 (en) * 2009-07-21 2014-01-28 Applied Micro Circuits Corporation System-on-chip queue status power management
US10409353B2 (en) * 2013-04-17 2019-09-10 Qualcomm Incorporated Dynamic clock voltage scaling (DCVS) based on application performance in a system-on-a-chip (SOC), and related methods and processor-based systems
US20160091957A1 (en) * 2014-09-26 2016-03-31 Suketu R. Partiwala Power management for memory accesses in a system-on-chip
US9910481B2 (en) * 2015-02-13 2018-03-06 Intel Corporation Performing power management in a multicore processor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21712932

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21712932

Country of ref document: EP

Kind code of ref document: A2