US20100057404A1 - Optimal Performance and Power Management With Two Dependent Actuators - Google Patents
- Publication number
- US20100057404A1 (application US12/201,877)
- Authority
- US
- United States
- Prior art keywords
- processor chip
- performance
- voltage level
- power consumption
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
- G06F1/3296—Power saving characterised by the action undertaken by lowering the supply or operating voltage
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present invention relates to processor chips, and more particularly, to techniques for processor chip power management and performance optimization.
- Power management features are common in today's high-power computing devices to conserve power and are especially useful in devices, such as laptop computers, that run on batteries.
- One way to conserve power is to modulate processor activity, which is typically enabled through the use of power management actuators, such as dynamic frequency scaling (DFS) or combined frequency and voltage scaling (DVFS) actuators, that scale-down processor frequency and/or voltage at certain times or in certain modes.
- DVFS actuators are typically used to vary the voltage and frequency at which the processor is run to accommodate for changes in computing workload and so as to maintain a particular power consumption budget.
- Such voltage and frequency changes can only be instituted at a certain frequency to ensure proper operation of the processor. Namely, a proper amount of time must be allotted between voltage changes, for example, to allow for voltage step-down and regulation.
- the workload on the processor likely will have already changed, and as such, the processor will be operating at a sub-optimal level.
- the present invention provides techniques for processor chip power management and performance optimization.
- a method for maximizing performance of a processor chip within a given power consumption budget comprises the following steps.
- a power consumption and performance of the processor chip at all possible voltage level and frequency combinations is predicted.
- the processor chip is adjusted to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget.
- the frequency of the processor chip is varied to accommodate for any shift in workload to maintain the highest performance within the power budget.
- the adjust and vary steps are repeated, wherein time interval t2 is greater than time interval t1.
- FIG. 1 is a diagram illustrating an exemplary methodology for maximizing performance of a processor chip within a given power consumption budget according to an embodiment of the present invention.
- FIG. 2 is a graph illustrating voltage level/maximum frequency pairs for a particular set of workloads according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating an exemplary apparatus for maximizing performance of a processor chip within a given power consumption budget according to an embodiment of the present invention.
- FIG. 1 is a diagram illustrating exemplary methodology 100 for maximizing performance of a processor chip within a given power consumption budget.
- the processor chip can be a single core processor chip or a multi-core processor chip.
- Methodology 100 can be implemented using standard frequency and voltage scaling (DVFS) actuators which, as will be described in detail below, are configured to change voltage levels and/or frequencies on a per-core or chip-wide basis.
- in step 102, power consumption and performance of the processor chip are predicted for each possible voltage level in combination with each possible frequency.
- the voltage level and frequency can be equated with power consumption using a power management tool, such as MaxBIPS. See, for example, C. Isci et al., “An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget,” Proceedings of the 39th annual International Symposium on Microarchitecture (MICRO' 06), IEEE, pp. 347-358 (Dec. 9-13, 2006) (hereinafter “Isci”), the disclosure of which is incorporated by reference herein.
- MaxBIPS predicts power and billion instructions per second (BIPS) values for different combinations of power (voltage (Vdd)/frequency (f)) modes, i.e., full-throttle execution (Vdd, f), medium power savings (95 percent (%) Vdd, 95% f) and high power savings (85% Vdd, 85% f), and chooses the combination with the highest throughput that meets a power budget.
- the voltage level is varied on a chip-wide basis, while the frequency is varied on a per-core basis (in the case of a multi-core processor chip). Therefore, when the processor chip is a multi-core processor chip, in step 102 the power consumption and performance of each of the cores can be predicted for all possible chip-wide voltages in combination with all possible frequencies for each individual core.
- step 102 can be carried out by first selecting a particular voltage level and then varying the frequencies available (for the single core or for each core in a multi-core configuration) for that particular voltage level. This process can be systematically repeated to obtain all possible voltage level/frequency combinations.
- Core performance is a measure of throughput. According to an exemplary embodiment, performance is measured as the number of instructions executed per second. As will be described in detail below, performance can vary as a function of workload distribution.
- Each core reports its actual power consumption and performance at regular measurement intervals.
- the predicted power consumption and performance can be obtained by extrapolating from the actual power consumption and performance data.
- the power consumption and performance for each core can be predicted by extrapolating from data collected at the last measurement interval. See, for example, R. Bergamaschi et al., “Exploring Power Management in Multi-Core Systems,” Proceedings of the 13th Asia and South Pacific Design Automation Conference (ASP-DAC 2008), Seoul, Korea (January 2008).
- a total predicted power consumption is determined for each of the voltage level/frequency combinations.
- the total predicted power consumption is the sum of the predicted power consumption values for each of the cores.
- the total predicted power consumption is simply the predicted power consumption value for the single core.
- in step 108, from the voltage level/frequency combinations that remain (i.e., those voltage level/frequency combinations with a total predicted power consumption that meets (is less than or equal to) the power budget), the voltage level/frequency combination that provides the highest predicted performance for the processor chip is selected.
- the total predicted performance is the sum of the predicted performance values for each of the cores.
- the total predicted performance is simply the predicted performance value for the single core. This selection process is shown graphically in FIG. 2, below. As highlighted above, the performance of the core(s) can vary as a function of workload distribution during operation of the processor chip. In this step, processor chip performance is maximized by selecting the voltage level/frequency combination that provides the highest performance.
- the voltage level selected in this step will determine the maximum frequency for the core(s), both in this step and in steps 110-112, described below. Namely, for a given voltage there is only a certain range of frequencies that can be implemented as each frequency requires a certain minimum voltage.
- in step 110, the processor chip is adjusted to the voltage level/frequency combination selected in step 108, above.
- This voltage level/frequency combination will, within the confines of the given power budget, maximize performance of the processor chip (i.e., across all of the cores in the case of a multi-core configuration), for at least the current operating conditions.
- the current operating conditions may change before the next step of methodology 100, step 112, is carried out.
- the frequency of the core (in a single core configuration) or one or more of the cores (in a multi-core configuration) is varied to accommodate for any shift in the workload. This is done to again optimize the total performance of the processor chip given the workload change.
- the workload can shift among the cores. For example, one or more of the cores that were actively performing computations might now be stalled due to memory accesses, while one or more of the other cores might now be more active.
- the frequency now chosen for each core can again be based on the core power consumption and performance predictions made in step 102, above. As highlighted above, the frequencies chosen in this step are limited to the frequencies that can be implemented for the voltage level selected in step 108 (described above).
- the voltage level and frequency of the processor chip can be adjusted using standard DVFS actuators.
- two DVFS actuators are employed, one to adjust the voltage level and another to adjust the frequency.
- the DVFS actuators can be configured to adjust the voltage level and/or frequency on a per-core basis or on a chip-wide basis.
- the DVFS actuators can be configured to adjust the voltage level and the frequency on a per-core basis (e.g., in the case of a multi-core processor chip).
- the DVFS actuators can be configured to adjust the voltage level on a chip-wide basis and the frequency on a per-core basis (e.g., in the case of a multi-core processor chip).
- the DVFS actuators can be configured to adjust both the voltage level and the frequency on a chip-wide basis (for both single core and multi-core processor chips).
- methodology 100 has two invocation intervals, a shorter interval (i.e., time interval t1) for frequency changes and a longer interval (i.e., time interval t2, see below) for combined voltage level and frequency changes.
- time interval t2 is longer than time interval t1, due to the processor chip being able to accommodate more frequent changes in frequency than in voltage level.
- Time intervals t1 and t2 can be predetermined and set by a system administrator.
- time interval t1 can have a duration of about 50 microseconds (μs) and time interval t2 can have a duration of about two milliseconds (ms).
- FIG. 2 is graph 200 illustrating voltage level/maximum frequency pairs for a particular set of workloads.
- core performance is plotted as a function of power budget (measured in Watts (W)).
- the legend in graph 200 gives the maximum frequency for the associated voltage level.
- the particular voltage level/maximum frequency combination that provides the highest performance depends on the power budget. Namely, to meet the power budget the frequency is reduced along a curve, reducing power consumption, while the voltage is fixed for each curve.
- a chip voltage level of one volt (V) is selected enabling a maximum core frequency of 3.7 gigahertz (GHz)
- a chip voltage level of 0.9 V is selected enabling a maximum core frequency of 2.9 GHz
- a chip voltage level of 0.8 V is selected enabling a maximum core frequency of 2.3 GHz.
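The voltage selection described for graph 200 can be sketched as a simple lookup. For that example set of workloads, the description gives budget thresholds of about 47 W and about 33 W. The function name `select_voltage` and the treatment of budgets exactly at a threshold are assumptions made for illustration; the source only says "about", and the numbers apply to that one example.

```python
def select_voltage(budget_w):
    """Map a power budget (W) to (chip voltage in V, max core frequency in GHz).

    Thresholds follow the example values given for graph 200; how a budget
    exactly equal to a threshold is handled is an assumption.
    """
    if budget_w > 47:
        return (1.0, 3.7)   # above about 47 W: 1 V, up to 3.7 GHz
    if budget_w >= 33:
        return (0.9, 2.9)   # about 33 W to about 47 W: 0.9 V, up to 2.9 GHz
    return (0.8, 2.3)       # below about 33 W: 0.8 V, up to 2.3 GHz


if __name__ == "__main__":
    for budget in (50, 40, 30):
        print(budget, select_voltage(budget))
```

Within each voltage band, the core frequency would then be lowered below the listed maximum as far as needed to stay inside the budget.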
- in FIG. 3, a block diagram is shown of an apparatus 300 for maximizing performance of a processor chip within a given power consumption budget, in accordance with one embodiment of the present invention.
- the processor chip can be local or remote to apparatus 300. It should be understood that apparatus 300 represents one embodiment for implementing methodology 100 of FIG. 1.
- Apparatus 300 comprises a computer system 310 and removable media 350.
- Computer system 310 comprises a local processor 320, a network interface 325, a memory 330, a media interface 335 and an optional display 340.
- Network interface 325 allows computer system 310 to connect to a network
- media interface 335 allows computer system 310 to interact with media, such as a hard drive or removable media 350.
- the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a machine-readable medium containing one or more programs which when executed implement embodiments of the present invention.
- the machine-readable medium may contain a program configured to predict a power consumption and performance of the processor chip at all possible voltage level and frequency combinations; adjust the processor chip to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget; after a time interval t1, vary the frequency of the processor chip to accommodate for any shift in workload to maintain the highest performance within the power budget; and after a time interval t2, repeat the adjust and vary steps, wherein time interval t2 is greater than time interval t1.
- apparatus 300 can control one or more DVFS actuators (not shown) and by way thereof implement one or more of the steps of methodology 100.
- the machine-readable medium may be a recordable medium (e.g., floppy disks, hard drive, optical disks such as removable media 350, or memory cards) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used.
- Local processor 320 can be configured to implement the methods, steps, and functions disclosed herein.
- the memory 330 could be distributed or local and the local processor 320 could be distributed or singular.
- the memory 330 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices.
- the term “memory” should be construed broadly enough to encompass any information able to be read from, or written to, an address in the addressable space accessed by local processor 320. With this definition, information on a network, accessible through network interface 325, is still within memory 330 because the local processor 320 can retrieve the information from the network. It should be noted that each distributed processor that makes up local processor 320 generally contains its own addressable memory space. It should also be noted that some or all of computer system 310 can be incorporated into an application-specific or general-use integrated circuit.
- Optional video display 340 is any type of video display suitable for interacting with a human user of apparatus 300.
- video display 340 is a computer monitor or other similar video display.
Abstract
Techniques for processor chip power management and performance optimization are provided. In one aspect, a method for maximizing performance of a processor chip within a given power consumption budget is provided. The method comprises the following steps. A power consumption and performance of the processor chip at all possible voltage level and frequency combinations is predicted. The processor chip is adjusted to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget. After a time interval t1, the frequency of the processor chip is varied to accommodate for any shift in workload to maintain the highest performance within the power budget. After a time interval t2, the adjust and vary steps are repeated, wherein time interval t2 is greater than time interval t1.
Description
- This invention was made with Government support under Contract number HR00110790002 awarded by (DARPA) Defense Advanced Research Projects Agency. The Government has certain rights in this invention.
- The present invention relates to processor chips, and more particularly, to techniques for processor chip power management and performance optimization.
- Power management features are common in today's high-power computing devices to conserve power and are especially useful in devices, such as laptop computers, that run on batteries. One way to conserve power is to modulate processor activity, which is typically enabled through the use of power management actuators, such as dynamic frequency scaling (DFS) or combined frequency and voltage scaling (DVFS) actuators, that scale-down processor frequency and/or voltage at certain times or in certain modes. By temporarily reducing processor activity, heat produced by the device is also reduced, thereby further conserving power needed for cooling.
- In conventional systems, power management actuators, such as DVFS actuators, are typically used to vary the voltage and frequency at which the processor is run to accommodate for changes in computing workload and so as to maintain a particular power consumption budget. Such voltage and frequency changes can only be instituted at a certain frequency to ensure proper operation of the processor. Namely, a proper amount of time must be allotted between voltage changes, for example, to allow for voltage step-down and regulation. However, during this time period, the workload on the processor likely will have already changed, and as such, the processor will be operating at a sub-optimal level.
- Therefore, techniques that maximize processor performance within the confines of a given power budget would be desirable.
- The present invention provides techniques for processor chip power management and performance optimization. In one aspect of the invention, a method for maximizing performance of a processor chip within a given power consumption budget is provided. The method comprises the following steps. A power consumption and performance of the processor chip at all possible voltage level and frequency combinations is predicted. The processor chip is adjusted to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget. After a time interval t1, the frequency of the processor chip is varied to accommodate for any shift in workload to maintain the highest performance within the power budget. After a time interval t2, the adjust and vary steps are repeated, wherein time interval t2 is greater than time interval t1.
- A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
- FIG. 1 is a diagram illustrating an exemplary methodology for maximizing performance of a processor chip within a given power consumption budget according to an embodiment of the present invention;
- FIG. 2 is a graph illustrating voltage level/maximum frequency pairs for a particular set of workloads according to an embodiment of the present invention; and
- FIG. 3 is a diagram illustrating an exemplary apparatus for maximizing performance of a processor chip within a given power consumption budget according to an embodiment of the present invention.
- FIG. 1 is a diagram illustrating exemplary methodology 100 for maximizing performance of a processor chip within a given power consumption budget. The processor chip can be a single core processor chip or a multi-core processor chip. Methodology 100 can be implemented using standard frequency and voltage scaling (DVFS) actuators which, as will be described in detail below, are configured to change voltage levels and/or frequencies on a per-core or chip-wide basis.
- In step 102, power consumption and performance of the processor chip are predicted for each possible voltage level in combination with each possible frequency. The voltage level and frequency can be equated with power consumption using a power management tool, such as MaxBIPS. See, for example, C. Isci et al., “An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget,” Proceedings of the 39th annual International Symposium on Microarchitecture (MICRO' 06), IEEE, pp. 347-358 (Dec. 9-13, 2006) (hereinafter “Isci”), the disclosure of which is incorporated by reference herein. For example, as described in Isci, MaxBIPS predicts power and billion instructions per second (BIPS) values for different combinations of power (voltage (Vdd)/frequency (f)) modes, i.e., full-throttle execution (Vdd, f), medium power savings (95 percent (%) Vdd, 95% f) and high power savings (85% Vdd, 85% f), and chooses the combination with the highest throughput that meets a power budget. As further described in Isci, with combined frequency and voltage scaling, power has a cubic relation to frequency and voltage scaling, and performance has a relatively linear dependence on frequency. As highlighted above, the voltage level and/or frequency can be varied on a per-core or a chip-wide basis. According to an exemplary embodiment, the voltage level is varied on a chip-wide basis, while the frequency is varied on a per-core basis (in the case of a multi-core processor chip). Therefore, when the processor chip is a multi-core processor chip, in step 102 the power consumption and performance of each of the cores can be predicted for all possible chip-wide voltages in combination with all possible frequencies for each individual core.
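As a rough worked illustration of the three modes named above, assume that power scales exactly with the cube of a common voltage/frequency scaling factor and that performance scales exactly linearly with it. This is an idealization (the text only says "relatively linear"), and the symbols s, P_0 and BIPS_0 (scaling factor, full-throttle power, full-throttle throughput) are introduced here for illustration only.

```latex
\begin{align*}
P(s) &\approx s^{3}\,P_{0}, & \mathrm{BIPS}(s) &\approx s\,\mathrm{BIPS}_{0} \\
s = 0.95:\quad P &\approx 0.857\,P_{0}, & \mathrm{BIPS} &\approx 0.95\,\mathrm{BIPS}_{0} \\
s = 0.85:\quad P &\approx 0.614\,P_{0}, & \mathrm{BIPS} &\approx 0.85\,\mathrm{BIPS}_{0}
\end{align*}
```

Under these assumptions, the high power savings mode gives up about 15% of throughput in exchange for roughly a 39% reduction in power.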
- By way of example only, step 102 can be carried out by first selecting a particular voltage level and then varying the frequencies available (for the single core or for each core in a multi-core configuration) for that particular voltage level. This process can be systematically repeated to obtain all possible voltage level/frequency combinations.
- Core performance is a measure of throughput. According to an exemplary embodiment, performance is measured as the number of instructions executed per second. As will be described in detail below, performance can vary as a function of workload distribution.
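The enumeration described for step 102 (fix a chip-wide voltage, then vary each core's frequency) can be sketched as follows. This is a minimal sketch, not the patented implementation: the function name, the `FREQS` table and the two-core setting are hypothetical. Only the maximum frequencies per voltage (3.7, 2.9 and 2.3 GHz) come from the FIG. 2 example; the lower frequency steps are assumed.

```python
from itertools import product


def enumerate_combinations(voltages, freqs_for_voltage, num_cores):
    """Yield (chip-wide voltage, per-core frequency tuple) for every combination."""
    for v in voltages:                        # first select a chip-wide voltage level
        allowed = freqs_for_voltage[v]        # frequencies implementable at that voltage
        for freqs in product(allowed, repeat=num_cores):  # then vary each core's frequency
            yield v, freqs


# Hypothetical table: voltage (V) -> frequencies (GHz) that the voltage can sustain.
FREQS = {1.0: [3.7, 2.9, 2.3], 0.9: [2.9, 2.3], 0.8: [2.3]}

combos = list(enumerate_combinations(sorted(FREQS), FREQS, num_cores=2))

if __name__ == "__main__":
    for voltage, freqs in combos:
        print(voltage, freqs)
```

The number of combinations grows exponentially with the number of cores when frequency is set per core, which is one reason a real controller might prune or sample this space rather than enumerate it exhaustively.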
- Each core reports its actual power consumption and performance at regular measurement intervals. The predicted power consumption and performance can be obtained by extrapolating from the actual power consumption and performance data. For example, at any given point in time, the power consumption and performance for each core can be predicted by extrapolating from data collected at the last measurement interval. See, for example, R. Bergamaschi et al., “Exploring Power Management in Multi-Core Systems,” Proceedings of the 13th Asia and South Pacific Design Automation Conference (ASP-DAC 2008), Seoul, Korea (January 2008) (wherein when voltage (v) and frequency (f) mode (v, f) is set as (v′, f′), performance (I) is predicted as [equation not reproduced], dynamic power (P) is predicted as [equation not reproduced], and static power (L) is predicted as [equation not reproduced], and wherein the total power is the sum of static and dynamic power), the disclosure of which is incorporated by reference herein.
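The prediction equations themselves are not reproduced in this text, so the sketch below substitutes common first-order scaling rules rather than the cited reference's formulas. It assumes performance scales linearly with frequency, dynamic power scales with v² · f (the usual CMOS switching-power approximation), and static power scales linearly with voltage (a simplifying assumption). The function name `predict` and its parameters are hypothetical.

```python
def predict(perf, p_dyn, p_static, v, f, v_new, f_new):
    """Extrapolate the last measurement at mode (v, f) to a new mode (v_new, f_new).

    Returns (predicted performance, predicted total power).
    The scaling rules are assumed first-order approximations, not the
    formulas of the cited reference.
    """
    f_ratio = f_new / f
    v_ratio = v_new / v
    perf_new = perf * f_ratio                    # assumed: performance ~ f
    p_dyn_new = p_dyn * v_ratio ** 2 * f_ratio   # assumed: dynamic power ~ v^2 * f
    p_static_new = p_static * v_ratio            # assumed: static power ~ v
    return perf_new, p_dyn_new + p_static_new    # total = static + dynamic


if __name__ == "__main__":
    # Last measurement: 2.0 BIPS, 10 W dynamic, 2 W static at (1.0 V, 3.0 GHz).
    print(predict(2.0, 10.0, 2.0, 1.0, 3.0, 1.0, 1.5))
```

Halving the frequency at constant voltage halves both the predicted performance and the predicted dynamic power here, while the static term is unchanged.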
- In step 104, a total predicted power consumption is determined for each of the voltage level/frequency combinations. With a multi-core processor chip, the total predicted power consumption is the sum of the predicted power consumption values for each of the cores. With a single core processor chip, the total predicted power consumption is simply the predicted power consumption value for the single core. Once the total predicted power consumption is determined for each voltage level/frequency combination, in step 106, any voltage level/frequency combination that results in a total predicted power consumption that is greater than the given power budget is eliminated. A power budget is generally established, e.g., by a system administrator, and might not be a physical limit, but more of a power usage guideline, that if adhered to, can help control operating costs.
- In step 108, from the voltage level/frequency combinations that remain (i.e., those voltage level/frequency combinations with a total predicted power consumption that meets (is less than or equal to) the power budget), the voltage level/frequency combination that provides the highest predicted performance for the processor chip is selected. With a multi-core processor chip, the total predicted performance is the sum of the predicted performance values for each of the cores. With a single core processor chip, the total predicted performance is simply the predicted performance value for the single core. This selection process is shown graphically in FIG. 2, below. As highlighted above, the performance of the core(s) can vary as a function of workload distribution during operation of the processor chip. In this step, processor chip performance is maximized by selecting the voltage level/frequency combination that provides the highest performance. The voltage level selected in this step will determine the maximum frequency for the core(s), both in this step and in steps 110-112, described below. Namely, for a given voltage there is only a certain range of frequencies that can be implemented as each frequency requires a certain minimum voltage.
- In step 110, the processor chip is adjusted to the voltage level/frequency combination selected in step 108, above. This voltage level/frequency combination will, within the confines of the given power budget, maximize performance of the processor chip (i.e., across all of the cores in the case of a multi-core configuration), for at least the current operating conditions.
- The current operating conditions may change before the next step of methodology 100, step 112, is carried out. Thus, after a time interval t1, in step 112, the frequency of the core (in a single core configuration) or one or more of the cores (in a multi-core configuration) is varied to accommodate for any shift in the workload. This is done to again optimize the total performance of the processor chip given the workload change. In a multi-core configuration, the workload can shift among the cores. For example, one or more of the cores that were actively performing computations might now be stalled due to memory accesses, while one or more of the other cores might now be more active.
- The frequency now chosen for each core can again be based on the core power consumption and performance predictions made in step 102, above. As highlighted above, the frequencies chosen in this step are limited to the frequencies that can be implemented for the voltage level selected in step 108 (described above).
- As highlighted above, the voltage level and frequency of the processor chip can be adjusted using standard DVFS actuators. According to an exemplary embodiment, two DVFS actuators are employed, one to adjust the voltage level and another to adjust the frequency. The DVFS actuators can be configured to adjust the voltage level and/or frequency on a per-core basis or on a chip-wide basis. For example, the DVFS actuators can be configured to adjust the voltage level and the frequency on a per-core basis (e.g., in the case of a multi-core processor chip). Alternatively, the DVFS actuators can be configured to adjust the voltage level on a chip-wide basis and the frequency on a per-core basis (e.g., in the case of a multi-core processor chip). Further, the DVFS actuators can be configured to adjust both the voltage level and the frequency on a chip-wide basis (for both single core and multi-core processor chips).
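Steps 104 through 112 can be sketched as one filter-and-select routine, reused at a fixed voltage for the frequency-only adjustment. This is a minimal sketch under stated assumptions: the `Candidate` record, both function names and all numeric predictions are hypothetical, and a chip-wide voltage with per-core frequencies is assumed, as in the exemplary embodiment.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    voltage: float   # chip-wide voltage level (V)
    freqs: tuple     # per-core frequencies (GHz)
    power: tuple     # predicted per-core power (W)
    perf: tuple      # predicted per-core performance (e.g. BIPS)


def select_combination(candidates: Sequence[Candidate],
                       budget_w: float) -> Optional[Candidate]:
    """Steps 104-108: total the per-core predictions, drop over-budget
    combinations, and keep the one with the highest total performance."""
    feasible = [c for c in candidates if sum(c.power) <= budget_w]
    if not feasible:
        return None  # no combination meets the budget
    return max(feasible, key=lambda c: sum(c.perf))


def retune_frequencies(candidates: Sequence[Candidate], budget_w: float,
                       voltage: float) -> Optional[Candidate]:
    """Step 112: same selection, restricted to the voltage already in use.
    In practice the candidates would carry refreshed predictions."""
    same_voltage = [c for c in candidates if c.voltage == voltage]
    return select_combination(same_voltage, budget_w)


# Hypothetical predictions for a two-core chip.
CANDIDATES = [
    Candidate(1.0, (3.7, 3.7), (30.0, 30.0), (3.7, 3.7)),  # 60 W total
    Candidate(1.0, (3.7, 2.9), (30.0, 18.0), (3.7, 2.9)),  # 48 W total
    Candidate(0.9, (2.9, 2.9), (20.0, 20.0), (2.9, 2.9)),  # 40 W total
]

best = select_combination(CANDIDATES, budget_w=50.0)

if __name__ == "__main__":
    print(best)
```

With a 50 W budget, the first candidate is eliminated in the step 106 filter, and the second is chosen because its total predicted performance is highest among those that remain.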
- The present techniques take advantage of the notion that the processor chip can cope with more frequent changes in frequency than in voltage. Therefore, methodology 100 has two invocation intervals, a shorter interval (i.e., time interval t1) for frequency changes and a longer interval (i.e., time interval t2, see below) for combined voltage level and frequency changes. This approach enables a more frequent performance optimization than would be achieved if the voltage level and frequency were only changed at the same time, resulting in higher performance.
- After a time interval t2, the steps of methodology 100 are repeated. As highlighted above, time interval t2 is longer than time interval t1, due to the processor chip being able to accommodate more frequent changes in frequency than in voltage level. Time intervals t1 and t2 can be predetermined and set by a system administrator. By way of example only, time interval t1 can have a duration of about 50 microseconds (μs) and time interval t2 can have a duration of about two milliseconds (ms). It is to be understood that these time interval values are merely exemplary and other time interval values may be employed, as long as the time interval for frequency changes, i.e., time interval t1, is shorter than the time interval for voltage level changes, i.e., time interval t2.
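The two invocation intervals can be illustrated with a small scheduling sketch using the example values of 50 μs and 2 ms. It assumes the frequency-only step is re-invoked every t1 between the combined voltage and frequency adjustments; the function name `schedule` and the event labels are hypothetical.

```python
def schedule(duration_us, t1_us=50, t2_us=2000):
    """Return a list of (time in microseconds, action) pairs for the two-interval scheme."""
    if t1_us <= 0 or t2_us <= t1_us:
        raise ValueError("t1 must be positive and t2 must be longer than t1")
    events = []
    next_full = 0                             # time of the next combined adjustment
    for now in range(0, duration_us, t1_us):  # one controller tick every t1
        if now >= next_full:
            events.append((now, "voltage+frequency"))  # steps 102-110
            next_full += t2_us
        else:
            events.append((now, "frequency"))          # step 112 only
    return events


if __name__ == "__main__":
    events = schedule(4000)
    full = sum(1 for _, action in events if action == "voltage+frequency")
    print(len(events), "invocations, of which", full, "change the voltage")
```

With these example intervals there are 40 invocations per t2 period, only one of which touches the voltage level.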
FIG. 2 is a graph 200 illustrating voltage level/maximum frequency pairs for a particular set of workloads. Namely, in graph 200, core performance is plotted as a function of power budget (measured in Watts (W)). The legend in graph 200 gives the maximum frequency for the associated voltage level. As shown in graph 200, the particular voltage level/maximum frequency combination that provides the highest performance depends on the power budget. Namely, to meet the power budget the frequency is reduced along a curve, reducing power consumption, while the voltage is fixed for each curve. By way of example only, for a power budget greater than about 47 W, a chip voltage level of one volt (V) is selected, enabling a maximum core frequency of 3.7 gigahertz (GHz); for a power budget of from about 47 W to about 33 W, a chip voltage level of 0.9 V is selected, enabling a maximum core frequency of 2.9 GHz; and for a power budget of less than about 33 W, a chip voltage level of 0.8 V is selected, enabling a maximum core frequency of 2.3 GHz. Using this selection process, a core performance at the top of the set of the curves shown in graph 200 can be achieved.
Turning now to FIG. 3, a block diagram is shown of an apparatus 300 for maximizing performance of a processor chip within a given power consumption budget, in accordance with one embodiment of the present invention. The processor chip can be local or remote to apparatus 300. It should be understood that apparatus 300 represents one embodiment for implementing methodology 100 of FIG. 1.
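The exemplary selection described for FIG. 2 amounts to a threshold lookup on the power budget. The sketch below encodes only the three example operating points given above. Because the text gives the thresholds as approximate ("about 47 W", "about 33 W"), treating exactly 47 W and exactly 33 W as part of the middle range is an assumption of this sketch, and the function name `select_voltage_level` is hypothetical.

```python
def select_voltage_level(power_budget_w):
    """Map a power budget in watts to (voltage in V, maximum core frequency in GHz)."""
    if power_budget_w <= 0:
        raise ValueError("power budget must be positive")
    if power_budget_w > 47.0:
        return (1.0, 3.7)
    if power_budget_w >= 33.0:
        # Boundary handling at 47 W and 33 W is an assumption of this sketch.
        return (0.9, 2.9)
    return (0.8, 2.3)
```

For example, a 40 W budget maps to 0.9 V with a 2.9 GHz maximum core frequency. Within the selected voltage level, the frequency would then be lowered as needed to stay within the budget.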
Apparatus 300 comprises a computer system 310 and removable media 350. Computer system 310 comprises a local processor 320, a network interface 325, a memory 330, a media interface 335 and an optional display 340. Network interface 325 allows computer system 310 to connect to a network, while media interface 335 allows computer system 310 to interact with media, such as a hard drive or removable media 350.
As is known in the art, the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a machine-readable medium containing one or more programs which when executed implement embodiments of the present invention. For instance, the machine-readable medium may contain a program configured to predict a power consumption and performance of the processor chip at all possible voltage level and frequency combinations; adjust the processor chip to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget; after a time interval t1, vary the frequency of the processor chip to accommodate for any shift in workload to maintain the highest performance within the power budget; and after a time interval t2, repeat the adjust and vary steps, wherein time interval t2 is greater than time interval t1.
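The adjust and vary steps of such a program can be sketched as two small selection functions over a table of predictions. This is an illustrative sketch only: the prediction values, the table layout `{(volts, ghz): (predicted_watts, predicted_performance)}` and the function names are hypothetical, and the source does not specify how predictions are stored or how ties are broken.

```python
# Hypothetical predictions: (volts, GHz) -> (predicted power in W, predicted performance).
PREDICTIONS = {
    (1.0, 3.7): (55.0, 1.00),
    (1.0, 3.3): (48.0, 0.92),
    (0.9, 2.9): (40.0, 0.85),
    (0.9, 2.5): (34.0, 0.76),
    (0.8, 2.3): (28.0, 0.70),
}


def select_combination(predictions, power_budget_w):
    """Adjust step: best-performing (volts, GHz) pair within the budget, or None."""
    feasible = {
        combo: perf
        for combo, (watts, perf) in predictions.items()
        if watts <= power_budget_w
    }
    if not feasible:
        return None
    return max(feasible, key=feasible.get)


def select_frequency(predictions, volts, power_budget_w):
    """Vary step: best frequency at a fixed voltage level within the budget, or None."""
    at_level = {
        combo: values for combo, values in predictions.items() if combo[0] == volts
    }
    best = select_combination(at_level, power_budget_w)
    return None if best is None else best[1]
```

With the hypothetical table above, a 45 W budget selects the (0.9 V, 2.9 GHz) combination. If the budget then drops to 36 W while the voltage level stays at 0.9 V, the frequency-only step selects 2.5 GHz.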
As highlighted above, the voltage level and frequency of the processor chip can be adjusted using one or more standard DVFS actuators. Thus, by way of example only, apparatus 300 can control one or more DVFS actuators (not shown) and by way thereof implement one or more of the steps of methodology 100.
The machine-readable medium may be a recordable medium (e.g., floppy disks, hard drive, optical disks such as removable media 350, or memory cards) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used.
Local processor 320 can be configured to implement the methods, steps, and functions disclosed herein. The memory 330 could be distributed or local and the local processor 320 could be distributed or singular. The memory 330 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term “memory” should be construed broadly enough to encompass any information able to be read from, or written to, an address in the addressable space accessed by local processor 320. With this definition, information on a network, accessible through network interface 325, is still within memory 330 because the local processor 320 can retrieve the information from the network. It should be noted that each distributed processor that makes up local processor 320 generally contains its own addressable memory space. It should also be noted that some or all of computer system 310 can be incorporated into an application-specific or general-use integrated circuit.
Optional video display 340 is any type of video display suitable for interacting with a human user of apparatus 300. Generally, video display 340 is a computer monitor or other similar video display.
Although illustrative embodiments of the present invention have been described herein, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope of the invention.
Claims (20)
1. A method for maximizing performance of a processor chip within a given power consumption budget, comprising the steps of:
predicting a power consumption and performance of the processor chip at all possible voltage level and frequency combinations;
adjusting the processor chip to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget;
after a time interval t1, varying the frequency of the processor chip to accommodate for any shift in workload to maintain the highest performance within the power budget; and
after a time interval t2, repeating the adjusting and varying steps, wherein time interval t2 is greater than time interval t1.
2. The method of claim 1, further comprising the step of:
at a given measurement interval, collecting power consumption and performance data from the processor chip.
3. The method of claim 2, further comprising the step of:
extrapolating the power consumption and performance data collected from the processor chip to predict the power consumption and performance of the processor chip at all possible voltage level and frequency combinations.
4. The method of claim 1, wherein the predicting step further comprises the steps of:
selecting a particular voltage level;
varying the available frequencies for the selected voltage level; and
repeating the steps of selecting the particular voltage level and varying the available frequencies to obtain all possible voltage level and frequency combinations.
5. The method of claim 1, wherein the processor chip is a multi-core processor chip and wherein the step of predicting the power consumption and performance of the processor chip further comprises the step of:
predicting a power consumption and performance of each core at all possible voltage level and frequency combinations.
6. The method of claim 5, further comprising the steps of:
calculating a total predicted power consumption for each of the voltage level and frequency combinations;
eliminating any of the voltage level and frequency combinations with a total predicted power consumption that exceeds the given power budget; and
selecting, from the remaining voltage level and frequency combinations, the voltage level and frequency combination with a highest total predicted performance for the processor chip.
7. The method of claim 5, wherein the processor chip is a multi-core processor chip and wherein the step of varying the frequency of the processor chip further comprises the step of:
at the time interval t1, varying the frequency of one or more of the cores to accommodate for any shift in workload among the cores to maintain the highest predicted performance for the processor chip within the given power budget.
8. The method of claim 1, wherein the processor chip is a multi-core processor chip and wherein the step of predicting the power consumption and performance of the processor chip further comprises the step of:
predicting a power consumption and performance of each core at all possible voltage level and frequency combinations, wherein the voltage level is determined on a chip-wide basis and the frequency is determined on a per-core basis.
9. An apparatus for maximizing performance of a remote processor chip within a given power consumption budget, the apparatus comprising:
a memory; and
at least one local processor, coupled to the memory, operative to:
predict a power consumption and performance of the remote processor chip at all possible voltage level and frequency combinations;
adjust the remote processor chip to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget;
after a time interval t1, vary the frequency of the remote processor chip to accommodate for any shift in workload to maintain the highest performance within the power budget; and
after a time interval t2, repeat the adjust and vary steps, wherein time interval t2 is greater than time interval t1.
10. The apparatus of claim 9, wherein the at least one local processor is further operative to:
at a given measurement interval, collect power consumption and performance data from the remote processor chip.
11. The apparatus of claim 10, wherein the at least one local processor is further operative to:
extrapolate the power consumption and performance data collected from the remote processor chip to predict the power consumption and performance of the remote processor chip at all possible voltage level and frequency combinations.
12. The apparatus of claim 9, wherein the remote processor chip is a multi-core processor chip and wherein the at least one local processor, operative to predict the power consumption and performance of the remote processor chip, is further operative to:
predict a power consumption and performance of each core at all possible voltage level and frequency combinations.
13. The apparatus of claim 12, wherein the at least one local processor is further operative to:
calculate a total predicted power consumption for each of the voltage level and frequency combinations;
eliminate any of the voltage level and frequency combinations with a total predicted power consumption that exceeds the given power budget; and
select, from the remaining voltage level and frequency combinations, the voltage level and frequency combination with a highest total predicted performance for the remote processor chip.
14. The apparatus of claim 12, wherein the remote processor chip is a multi-core processor chip and wherein the at least one local processor, operative to vary the frequency of the remote processor chip, is further operative to:
at the time interval t1, vary the frequency of one or more of the cores to accommodate for any shift in workload among the cores to maintain the highest predicted performance for the processor chip within the given power budget.
15. An article of manufacture for maximizing performance of a processor chip within a given power consumption budget, comprising a machine-readable medium containing one or more programs which when executed implement the steps of:
predicting a power consumption and performance of the processor chip at all possible voltage level and frequency combinations;
adjusting the processor chip to the voltage level and frequency combination that provides the highest performance while having a power consumption that does not exceed the power budget;
after a time interval t1, varying the frequency of the processor chip to accommodate for any shift in workload to maintain the highest performance within the power budget; and
after a time interval t2, repeating the adjusting and varying steps, wherein time interval t2 is greater than time interval t1.
16. The article of manufacture of claim 15, wherein the one or more programs which when executed further implement the step of:
at a given measurement interval, collecting power consumption and performance data from the processor chip.
17. The article of manufacture of claim 16, wherein the one or more programs which when executed further implement the step of:
extrapolating the power consumption and performance data collected from the processor chip to predict the power consumption and performance of the processor chip at all possible voltage level and frequency combinations.
18. The article of manufacture of claim 16, wherein the processor chip is a multi-core processor chip and wherein the step of predicting the power consumption and performance of the processor chip further comprises the step of:
predicting a power consumption and performance of each core at all possible voltage level and frequency combinations.
19. The article of manufacture of claim 18, wherein the one or more programs which when executed further implement the step of:
calculating a total predicted power consumption for each of the voltage level and frequency combinations;
eliminating any of the voltage level and frequency combinations with a total predicted power consumption that exceeds the given power budget; and
selecting, from the remaining voltage level and frequency combinations, the voltage level and frequency combination with a highest total predicted performance for the processor chip.
20. The article of manufacture of claim 18, wherein the processor chip is a multi-core processor chip and wherein the step of varying the frequency of the processor chip further comprises the step of:
at the time interval t1, varying the frequency of one or more of the cores to accommodate for any shift in workload among the cores to maintain the highest predicted performance for the processor chip within the given power budget.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/201,877 US20100057404A1 (en) | 2008-08-29 | 2008-08-29 | Optimal Performance and Power Management With Two Dependent Actuators |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100057404A1 true US20100057404A1 (en) | 2010-03-04 |
Family
ID=41726625
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/201,877 Abandoned US20100057404A1 (en) | 2008-08-29 | 2008-08-29 | Optimal Performance and Power Management With Two Dependent Actuators |
Country Status (1)
Country | Link |
---|---|
US (1) | US20100057404A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110154089A1 (en) * | 2009-12-21 | 2011-06-23 | Andrew Wolfe | Processor core clock rate selection |
US8689021B1 (en) * | 2010-09-10 | 2014-04-01 | Marvell International Ltd. | System and method for selecting a power management configuration in a multi-core environment according to various operating conditions such as voltage, frequency, power mode, and utilization factor varied on a per-core basis |
WO2014105199A1 (en) * | 2012-12-27 | 2014-07-03 | Intel Corporation | Managing performance policies based on workload scalability |
US20140281609A1 (en) * | 2013-03-14 | 2014-09-18 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Determining parameters that affect processor energy efficiency |
US20150074435A1 (en) * | 2013-09-09 | 2015-03-12 | Apple Inc. | Processor Power and Performance Manager |
WO2015126728A1 (en) * | 2014-02-21 | 2015-08-27 | Qualcomm Incorporated | Systems and methods for power optimization using throughput feedback |
US10133323B2 (en) | 2013-03-14 | 2018-11-20 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Processor control system |
US10761586B2 (en) * | 2018-01-11 | 2020-09-01 | Intel Corporation | Computer performance and power consumption optimization |
US11068018B2 (en) * | 2016-10-25 | 2021-07-20 | Dolphin Design | System and method for power management of a computing system with a plurality of islands |
US20210224119A1 (en) * | 2018-10-26 | 2021-07-22 | Huawei Technologies Co., Ltd. | Energy efficiency adjustments for a cpu governor |
US11791233B1 (en) | 2021-08-06 | 2023-10-17 | Kepler Computing Inc. | Ferroelectric or paraelectric memory and logic chiplet with thermal management in a multi-dimensional packaging |
US11836102B1 (en) | 2019-03-20 | 2023-12-05 | Kepler Computing Inc. | Low latency and high bandwidth artificial intelligence processor |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4051306A (en) * | 1972-03-27 | 1977-09-27 | Owens-Illinois, Inc. | Controlled environmental deterioration of plastics |
US20030065960A1 (en) * | 2001-09-28 | 2003-04-03 | Stefan Rusu | Method and apparatus for adjusting the voltage and frequency to minimize power dissipation in a multiprocessor system |
US20050278664A1 (en) * | 2004-05-27 | 2005-12-15 | International Business Machines Corporation | Predicting power consumption for a chip |
US20060026447A1 (en) * | 2004-07-27 | 2006-02-02 | Intel Corporation | Power management coordination in multi-core processors |
US7051306B2 (en) * | 2003-05-07 | 2006-05-23 | Mosaid Technologies Corporation | Managing power on integrated circuits using power islands |
US7093147B2 (en) * | 2003-04-25 | 2006-08-15 | Hewlett-Packard Development Company, L.P. | Dynamically selecting processor cores for overall power efficiency |
US20060288243A1 (en) * | 2005-06-16 | 2006-12-21 | Lg Electronics Inc. | Automatically controlling processor mode of multi-core processor |
US7155617B2 (en) * | 2002-08-01 | 2006-12-26 | Texas Instruments Incorporated | Methods and systems for performing dynamic power management via frequency and voltage scaling |
US7174467B1 (en) * | 2001-07-18 | 2007-02-06 | Advanced Micro Devices, Inc. | Message based power management in a multi-processor system |
US20070033425A1 (en) * | 2005-08-02 | 2007-02-08 | Advanced Micro Devices, Inc. | Increasing workload performance of one or more cores on multiple core processors |
US7337335B2 (en) * | 2004-12-21 | 2008-02-26 | Packet Digital | Method and apparatus for on-demand power management |
US20080288796A1 (en) * | 2007-05-18 | 2008-11-20 | Semiconductor Technology Academic Research Center | Multi-processor control device and method |
US20090125293A1 (en) * | 2007-11-13 | 2009-05-14 | Lefurgy Charles R | Method and System for Real-Time Prediction of Power Usage for a Change to Another Performance State |
US20100025483A1 (en) * | 2008-07-31 | 2010-02-04 | Michael Hoeynck | Sensor-Based Occupancy and Behavior Prediction Method for Intelligently Controlling Energy Consumption Within a Building |
US20100058084A1 (en) * | 2008-08-29 | 2010-03-04 | International Business Machines Corporation | Self-Tuning Power Management Techniques |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9519305B2 (en) | 2009-12-21 | 2016-12-13 | Empire Technology Development Llc | Processor core clock rate selection |
US8751854B2 (en) * | 2009-12-21 | 2014-06-10 | Empire Technology Development Llc | Processor core clock rate selection |
US20110154089A1 (en) * | 2009-12-21 | 2011-06-23 | Andrew Wolfe | Processor core clock rate selection |
US8689021B1 (en) * | 2010-09-10 | 2014-04-01 | Marvell International Ltd. | System and method for selecting a power management configuration in a multi-core environment according to various operating conditions such as voltage, frequency, power mode, and utilization factor varied on a per-core basis |
US8930728B1 (en) * | 2010-09-10 | 2015-01-06 | Marvell International Ltd. | System and method for selecting a power management configuration in a multi-core environment to balance current load demand and required power consumption |
WO2014105199A1 (en) * | 2012-12-27 | 2014-07-03 | Intel Corporation | Managing performance policies based on workload scalability |
GB2522584B (en) * | 2012-12-27 | 2021-06-02 | Intel Corp | Managing performance policies based on workload scalability |
GB2522584A (en) * | 2012-12-27 | 2015-07-29 | Intel Corp | Managing performance policies based on workload scalability |
US9110735B2 (en) | 2012-12-27 | 2015-08-18 | Intel Corporation | Managing performance policies based on workload scalability |
US20140281609A1 (en) * | 2013-03-14 | 2014-09-18 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Determining parameters that affect processor energy efficiency |
US10133323B2 (en) | 2013-03-14 | 2018-11-20 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Processor control system |
US9933825B2 (en) * | 2013-03-14 | 2018-04-03 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Determining parameters that affect processor energy efficiency |
US9329663B2 (en) * | 2013-09-09 | 2016-05-03 | Apple Inc. | Processor power and performance manager |
US20150074435A1 (en) * | 2013-09-09 | 2015-03-12 | Apple Inc. | Processor Power and Performance Manager |
US9436263B2 (en) | 2014-02-21 | 2016-09-06 | Qualcomm Incorporated | Systems and methods for power optimization using throughput feedback |
WO2015126728A1 (en) * | 2014-02-21 | 2015-08-27 | Qualcomm Incorporated | Systems and methods for power optimization using throughput feedback |
US11068018B2 (en) * | 2016-10-25 | 2021-07-20 | Dolphin Design | System and method for power management of a computing system with a plurality of islands |
US10761586B2 (en) * | 2018-01-11 | 2020-09-01 | Intel Corporation | Computer performance and power consumption optimization |
US20210224119A1 (en) * | 2018-10-26 | 2021-07-22 | Huawei Technologies Co., Ltd. | Energy efficiency adjustments for a cpu governor |
US11836102B1 (en) | 2019-03-20 | 2023-12-05 | Kepler Computing Inc. | Low latency and high bandwidth artificial intelligence processor |
US11791233B1 (en) | 2021-08-06 | 2023-10-17 | Kepler Computing Inc. | Ferroelectric or paraelectric memory and logic chiplet with thermal management in a multi-dimensional packaging |
US11899613B1 (en) | 2021-08-06 | 2024-02-13 | Kepler Computing Inc. | Method and apparatus to process an instruction for a distributed logic having tightly coupled accelerator core and processor core in a multi-dimensional packaging |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION,NEW YO Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DITTMANN, GERO;BUYUKTOSUNOGLU, ALPER;NAIR, INDIRA;AND OTHERS;SIGNING DATES FROM 20080822 TO 20080826;REEL/FRAME:021685/0640 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |