WO2017105610A1 - Integrated circuit thermal throttling with workload adapted thermal sensor maximum temperature - Google Patents

Info

Publication number
WO2017105610A1
WO2017105610A1 PCT/US2016/057025 US2016057025W WO2017105610A1 WO 2017105610 A1 WO2017105610 A1 WO 2017105610A1 US 2016057025 W US2016057025 W US 2016057025W WO 2017105610 A1 WO2017105610 A1 WO 2017105610A1
Authority
WO
WIPO (PCT)
Prior art keywords
temperature
thermal
area
areas
sensors
Prior art date
Application number
PCT/US2016/057025
Other languages
French (fr)
Inventor
Ameya Limaye
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Priority to EP16876222.7A (patent EP3391180B1)
Priority to CN201680069172.1A (patent CN108292161B)
Publication of WO2017105610A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/20 Cooling means
    • G06F1/206 Cooling means comprising thermal management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/20 Cooling means
    • G06F1/203 Cooling means for portable computers, e.g. for laptops
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/324 Power saving characterised by the action undertaken by lowering clock frequency
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This disclosure pertains to energy efficiency and energy conservation in integrated circuits and in particular but not exclusively, to the field of thermal control of computing device processors. More particularly, embodiments of the invention relate to energy efficient and energy conserving thermal throttling of electronic device processors.
  • processor power consumption can lead to overheating, which may negatively affect performance, damage components (e.g., the processor), cause discomfort or injury to the user, and can significantly reduce battery life.
  • Some processors and SoCs are enabled with Intel® Thermal Monitor, which is a feature that activates a Thermal Control Circuit (TCC) to reduce power and maintain die temperatures within product limits.
  • TCC Thermal Control Circuit
  • the TCC reduces power by either frequency/voltage reduction or clock modulation (both of which are considered throttling herein).
  • This feature relies on multiple on-die Digital Thermal Sensors (DTS) to monitor real-time die temperatures for TCC activation decisions. Since each DTS cannot be located exactly at a die hotspot, there is an offset between the temperature of the hot spot and the temperature measured by the DTS close to it. This hotspot-to-DTS temperature offset is obtained by thermal modeling and is used to determine the maximum DTS temperature limit that would trigger TCC activation.
  • DTS Digital Thermal Sensors
  • the die hotspot-to-DTS temperature offset is calculated by modeling the worst-case TDP workload power map and is hardcoded into the throttling algorithm. Consequently, the TCC will be accurately activated only for the workload scenario corresponding to the maximum TDP. For other workloads experienced during product operation by the end user, the offset between the die hotspot temperature and the DTS temperature will differ from the one hardcoded into the throttling algorithm. This can lead to undesirable early or late TCC activation depending on workload, which can impact customer performance (early activation) or possibly exceed maximum temperatures (late activation).
  • the throttling circuit is activated depending upon the temperature recorded by the sensors (DTS) present at various locations on the chip.
  • the maximum allowed temperature that any DTS can reach before a throttling controller (e.g., a throttling circuit) is activated is referred to herein as DTSmax and is calculated using equation (1).
  • DTSmax is the maximum temperature that any DTS can read before thermal throttling is activated.
  • DTSmax operates as the customer-visible throttling temperature.
  • Wjp is a measure of thermal resistance between the hot spot on the chip and the DTS.
  • this is determined by running well-known thermal simulations or experiments.
  • TDP is the maximum power that a chip would see.
  • the drawback of setting the DTSmax limits by this method is that it ignores the effect of the power map. Under actual use conditions, the temperature offset between the hot spot temperature and the DTS temperature is influenced heavily by the way the power is distributed on the chip by the specific workload.
  • Figure 1 illustrates locations of DTS relative to cores and input/output (IO) regions in one embodiment of an integrated circuit (IC).
  • IC integrated circuit
  • Figure 2 illustrates a flow diagram of one embodiment of a process for performing thermal management of an integrated circuit (IC).
  • Figure 3 illustrates a more detailed view of one embodiment of a power delivery system.
  • Figure 4 is one embodiment of a computing system.
  • DTSmax digital thermal sensor DTSmax limit
  • TCC thermal control circuit
  • the value of the maximum allowed temperature (DTSmax) is updated for every DTS in real time based on the power supplied to multiple functional blocks (e.g., processor cores, pipelines, IO registers, uncores, controllers, etc.).
  • these multiple functional blocks include all the cores and input/output (IO) blocks of the IC.
  • these multiple functional blocks include all cores and IO blocks having a DTS.
  • these multiple functional blocks are a subset of the cores and/or IO blocks. This may result in less accuracy.
  • the process for performing the update is based on the principle of thermal superposition, in that the temperature difference between a hot spot and a DTS caused by a workload is expressed as the superposition of the temperature differences caused by every power source individually.
  • One embodiment of the process presented herein sets a unique maximum allowed temperature limit on every DTS, depending upon its location on the chip relative to the heat producing sources, which is in contrast to imposing a single limit on all the DTSs.
  • the process for calculating the individual DTSmax values is based on the principle of thermal superposition in that the temperature difference between hot spot and a DTS is expressed as the superposition of the temperature difference caused by every power source individually.
  • the temperature difference is expressed as a linear function of the power supplied to every power producing region on a die (e.g., core or IO regions), multiplied by appropriate weighting factors.
  • the temperature difference between the hot spot in Die Region #1 and the DTS located near Die Region #1 can be expressed as:
  • $T_{DR1} - T_{DTS1} = w_1 P_{DR1} + w_2 P_{DR2} + \cdots + w_n P_{DRn}$ (2)
  • $T_{DTS1}$: temperature recorded by the DTS present in DR1.
  • $w_i$: weighting factor assigned to the power in DRi.
  • i is an integer index for each of the units (e.g., cores) receiving power.
  • the weighting factors reflect the relative impact of the power sources on the temperature difference between the core and DTS of interest.
  • the value of a weighting factor depends upon the distance of the power source from the core and DTS of interest.
  • these values are obtained by running thermal models, which are well-known in the art. These include, for example, the finite difference method or the finite element method. In another embodiment, the values are obtained by running experiments and recording
  • DTS max the maximum allowed temperature in DR1
  • $DTS_{max} = DR_{max} - (w_1 P_{DR1} + w_2 P_{DR2} + \cdots + w_n P_{DRn})$ (3)
  • the weighting factor matrix relates the impact of the power of every DR on the maximum allowed temperature for every DTS.
  • the weighting factor matrix is populated by running thermal simulations and can be hardcoded into the throttling process (e.g., hardcoded in memory or firmware on the IC with the TCC, or off-chip in a memory accessible by the IC).
  • DTSmax the maximum allowed DTS temperature
  • this update occurs every clock cycle; however, it may be performed more or less often depending on a desired level of accuracy.
  • the improvement in the accuracy of calculating DTSmax using the techniques disclosed herein has been quantified on a processor power map.
  • the techniques are not limited to processors and can be applied to any integrated circuit (e.g., SOC, controller, etc.).
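The per-sensor limit of equation (3) can be illustrated with a short Python sketch. This is a minimal example under stated assumptions, not the patented implementation: the function name `dts_max_limits`, the three-region die, and all temperatures, weights, and powers are hypothetical values chosen only to show how the limits shift with the power map.

```python
from typing import List, Sequence


def dts_max_limits(
    region_max_temps: Sequence[float],
    weights: Sequence[Sequence[float]],
    region_powers: Sequence[float],
) -> List[float]:
    """Per-sensor limit: DTSmax[k] = DRmax[k] - sum_i weights[k][i] * P[i]."""
    if len(region_max_temps) != len(weights):
        raise ValueError("need one weight row per sensor")
    limits = []
    for t_max, row in zip(region_max_temps, weights):
        if len(row) != len(region_powers):
            raise ValueError("each weight row needs one entry per powered region")
        # Superposition: sum the contribution of every power source.
        offset = sum(w * p for w, p in zip(row, region_powers))
        limits.append(t_max - offset)
    return limits


# Hypothetical three-region die; all numbers are illustrative assumptions.
REGION_MAX_C = [100.0, 100.0, 100.0]   # maximum allowed temperature per region
WEIGHTS_C_PER_W = [                    # row k: effect of each region's power on sensor k
    [0.5, 0.25, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.25, 0.5],
]

uniform_load = dts_max_limits(REGION_MAX_C, WEIGHTS_C_PER_W, [8.0, 8.0, 8.0])
center_heavy = dts_max_limits(REGION_MAX_C, WEIGHTS_C_PER_W, [2.0, 8.0, 2.0])
print(uniform_load)
print(center_heavy)
```

In this sketch the same die yields different limits for the two workloads, which is the behavior a single hardcoded offset cannot capture.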
  • FIG. 1 is a block diagram of an IC.
  • IC 100 has nine DTSs corresponding to nine power producing die regions (DRs), labeled 1-9.
  • the DRs may include one or more processor cores and one or more IO regions.
  • IC 100 may comprise a processor, system on a chip (SOC), embedded controller, etc.
  • SOC system on a chip
  • Each of DRs 1-9 includes a digital thermal sensor (DTS) that drives sensor output signal lines 110, which are routed to a power control unit (PCU) 101 with a thermal controller.
  • the thermal controller of PCU 101 includes a thermal control circuit.
  • Each of the thermal sensors is located in an area in the IC to record the die temperature at a location in its associated area.
  • the thermal controller of PCU 101 compares each of the temperature readings received via the sensor output signal lines 110 from the DTSs of each of the cores and IO regions to the maximum temperature, DTSmax, associated with its area and performs thermal throttling of the IC to reduce its temperature based on results of the comparison. This occurs in response to the die temperature recorded by any one of the temperature sensors exceeding the individual temperature threshold for that temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to all the areas or to multiple areas of the IC. In the case of Figure 1, it is the power supplied to all of DRs 1-9 (e.g., cores, I/O regions, etc.).
  • the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to the multiple areas of the IC. In another embodiment, the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and a sum of a product of each power supplied to each DR area (e.g., the DR1-DR9 areas, such as cores and I/O regions) and a weighting factor associated with each power supplied to each individual area.
  • the PCU 101 performs thermal throttling on IC 100.
  • the thermal throttling may comprise dynamic frequency and/or voltage scaling.
  • the thermal throttling may comprise clock modulation.
  • the thermal controller performs thermal throttling by sending control signals to each of the cores to notify the cores to change to a lower power state in order to reduce the temperature of the IC.
  • the thermal controller generates one or more core/IO control signals 120 that are sent to individual core or I/O regions, or to components supplying clock or voltage signals to those units, to implement the thermal throttling.
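The comparison performed by the thermal controller can be sketched in Python as follows. This is an illustrative sketch only: the helper names `sensors_over_limit` and `should_throttle` and the sample readings and limits are assumptions introduced here, not part of the disclosure.

```python
from typing import List, Sequence


def sensors_over_limit(readings: Sequence[float], limits: Sequence[float]) -> List[int]:
    """Return the indices of sensors whose reading exceeds its own limit."""
    if len(readings) != len(limits):
        raise ValueError("need exactly one limit per sensor reading")
    return [k for k, (temp, limit) in enumerate(zip(readings, limits)) if temp > limit]


def should_throttle(readings: Sequence[float], limits: Sequence[float]) -> bool:
    """Throttle when any single sensor is above its individual threshold."""
    return len(sensors_over_limit(readings, limits)) > 0


# Hypothetical readings and limits in degrees Celsius.
readings_c = [93.0, 92.5, 90.0]
per_sensor_limits_c = [94.0, 92.0, 94.0]   # individual limit for each sensor
single_limit_c = [94.0, 94.0, 94.0]        # one shared limit, for contrast

hot_sensors = sensors_over_limit(readings_c, per_sensor_limits_c)
print(hot_sensors)
print(should_throttle(readings_c, per_sensor_limits_c))
print(should_throttle(readings_c, single_limit_c))
```

With these assumed numbers, the sensor at index 1 exceeds its individual limit even though no reading exceeds the shared limit, so only the per-sensor comparison requests throttling.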
  • FIG. 2 illustrates a flow diagram of one embodiment of a process for performing thermal management of an integrated circuit (IC).
  • processing logic may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), firmware, or a combination of the three.
  • the process begins by setting the maximum temperature for each temperature sensor in each of the cores and/or IO regions of an integrated circuit (processing block 401). In one embodiment, this is performed by running thermal models on the IC. In one embodiment, this includes determining the weighting factor associated with the power of each individual core and IO area to set up a weighting factor matrix as discussed above. In one embodiment, the weighting factor matrix is populated by running thermal simulations in a manner well known in the art.
  • the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to multiple areas of the IC.
  • the multiple areas of the IC include the areas of the core regions.
  • the areas include a subset of the cores.
  • the areas include the cores and one or more IO regions.
  • processing logic stores the individual temperature thresholds for the plurality of sensors in memory (processing block 402).
  • processing logic may optionally update each stored threshold to its current updated value (processing block 403).
  • the updating may be performed based on workload that is currently being performed by all the core regions. In one embodiment, this update occurs every clock cycle.
  • processing logic in the various cores and the IO regions processes data (processing block 404).
  • processing logic monitors the plurality of sensors (processing block 405). This may occur in the thermal controller of a power control unit in the IC. Processing logic checks whether the temperature of any one of the sensors is above its maximum (processing block 406). If not, processing logic transitions to processing block 404, where it continues to process data.
  • if the temperature of any sensor is above its maximum, processing logic performs thermal throttling to reduce the recorded die temperature (processing block 407).
  • the thermal throttling comprises dynamic voltage and/or frequency scaling.
  • the thermal throttling includes performing clock modulation.
  • the thermal throttling comprises sending control signals to each core and/or IO unit to signal the unit to change its power state to a lower power state in an attempt to reduce the core temperature of the die.
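The flow of FIG. 2 can be sketched as a loop over sampled power and temperature values. The following Python is a simplified illustration under stated assumptions: the function names, the two-region die, the weights, and the sample data are all hypothetical, and a real implementation would run in hardware or firmware rather than over a pre-recorded list.

```python
from typing import List, Sequence, Tuple


def compute_limits(region_max, weights, powers) -> List[float]:
    """Blocks 401-403: limit[k] = region_max[k] - sum_i weights[k][i] * powers[i]."""
    return [
        t_max - sum(w * p for w, p in zip(row, powers))
        for t_max, row in zip(region_max, weights)
    ]


def run_thermal_management(
    samples: Sequence[Tuple[Sequence[float], Sequence[float]]],
    region_max: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> List[str]:
    """Return 'run' or 'throttle' for each (powers, readings) sample."""
    actions = []
    for powers, readings in samples:
        # Blocks 402/403: store or update thresholds for the current power map.
        limits = compute_limits(region_max, weights, powers)
        # Blocks 405/406: monitor sensors and check each against its own maximum.
        over = any(temp > limit for temp, limit in zip(readings, limits))
        # Block 407 when over a limit, otherwise back to block 404.
        actions.append("throttle" if over else "run")
    return actions


# Hypothetical two-region die and three time samples (powers in W, readings in C).
REGION_MAX_C = [100.0, 100.0]
WEIGHTS_C_PER_W = [[0.5, 0.25], [0.25, 0.5]]
SAMPLES = [
    ([4.0, 4.0], [90.0, 95.0]),   # limits 97/97: below the limits
    ([8.0, 8.0], [95.0, 93.0]),   # limits 94/94: first sensor is too hot
    ([4.0, 4.0], [95.0, 93.0]),   # limits 97/97: same readings, now acceptable
]

actions = run_thermal_management(SAMPLES, REGION_MAX_C, WEIGHTS_C_PER_W)
print(actions)
```

The last two samples carry identical sensor readings but lead to different actions, because the thresholds follow the power supplied to the regions.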
  • FIG. 3 illustrates a more detailed view of the power delivery system.
  • memory 501 stores the maximum temperatures for each of the monitored areas (e.g., an area having a DTS) of the IC.
  • the monitored areas include the cores and IO regions.
  • the monitored regions include a subset of all the cores and IO regions.
  • the digital die temperature sensor (e.g., DTS) of each of the cores/IO regions provides a temperature that is compared against its maximum using comparators 502.
  • thermal control circuit 504 receives the results of each of the comparisons and, if any of the comparisons indicates that a die temperature for one of the areas is greater than its associated maximum temperature threshold, thermal control circuit 504 performs thermal throttling.
  • the thermal throttling may include sending control signals 510 to a voltage regulator (VR) 505 that is providing voltage to each of the core and IO regions.
  • the thermal throttling may include sending voltage control signals 511 to power gates (e.g., 506_1-506_n) associated with each of the core regions.
  • thermal control circuit 504 sends signals 512 to a clock generator (e.g., phase lock loop (PLL)) in each of the core and IO regions to control the clock signals that are generated in order to perform frequency scaling in the respective core/IO region.
  • thermal control circuit 504 sends control signals 523 (e.g., power state change control signals) to the cores/IO regions to cause these units to reduce power consumption (e.g., change to a lower power state) to achieve thermal throttling.
  • An update module 510 updates the individual maximum temperature thresholds for each of the temperature sensors if the power supplied to those areas changes. The power supplied may change based on changes in the workload associated with each individual core or I/O region.
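One way to picture the update module is as a small store that recomputes the thresholds only when the supplied power changes. The Python below is a sketch under assumptions: the class name `ThresholdStore`, its method names, and the numeric values are hypothetical and are not taken from the disclosure.

```python
from typing import List, Sequence


class ThresholdStore:
    """Holds per-sensor thresholds and refreshes them when region power changes."""

    def __init__(
        self,
        region_max_temps: Sequence[float],
        weights: Sequence[Sequence[float]],
        initial_powers: Sequence[float],
    ) -> None:
        self._region_max = list(region_max_temps)
        self._weights = [list(row) for row in weights]
        self._powers = list(initial_powers)
        self.update_count = 0
        self.limits: List[float] = self._recompute()

    def _recompute(self) -> List[float]:
        # limit[k] = region_max[k] - sum_i weights[k][i] * power[i]
        return [
            t_max - sum(w * p for w, p in zip(row, self._powers))
            for t_max, row in zip(self._region_max, self._weights)
        ]

    def on_power_sample(self, powers: Sequence[float]) -> bool:
        """Update stored thresholds if the power changed; return True if updated."""
        new_powers = list(powers)
        if new_powers == self._powers:
            return False  # no change in supplied power, keep stored thresholds
        self._powers = new_powers
        self.limits = self._recompute()
        self.update_count += 1
        return True


# Hypothetical two-region example (temperatures in C, powers in W).
store = ThresholdStore([100.0, 100.0], [[0.5, 0.25], [0.25, 0.5]], [4.0, 4.0])
initial_limits = list(store.limits)
unchanged = store.on_power_sample([4.0, 4.0])
changed = store.on_power_sample([8.0, 4.0])
print(initial_limits, unchanged, changed, store.limits)
```

Recomputing only on a power change is a design choice made for this sketch; the text also describes updating as often as every clock cycle.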
  • Figure 4 is one embodiment of a system level diagram 600 that may incorporate the techniques described above. For example, the techniques described above may be incorporated into a processor in system 600.
  • system 600 includes, but is not limited to, a desktop computer, a laptop computer, a netbook, a tablet, a notebook computer, a personal digital assistant (PDA), a server, a workstation, a cellular telephone, a mobile computing device, a smart phone, an Internet appliance or any other type of computing device.
  • system 600 implements the methods disclosed herein and may be a system on a chip (SOC) system.
  • processor 610 has one or more processor cores 612 to 612N, where 612N represents the Nth processor core inside the processor 610 where N is a positive integer.
  • system 600 includes multiple processors including processors 610 and 605, where processor 605 has logic similar or identical to logic of processor 610.
  • system 600 includes multiple processors including processors 610 and 605 such that processor 605 has logic that is completely independent from the logic of processor 610.
  • a multi-package system 600 is a heterogeneous multi-package system because the processors 605 and 610 have different logic units.
  • processing core 612 includes, but is not limited to, pre-fetch logic to fetch instructions, decode logic to decode the instructions, execution logic to execute instructions and the like.
  • processor 610 has a cache memory 616 to cache instructions and/or data of the system 600.
  • cache memory 616 includes level one, level two, and level three cache memory, or any other configuration of the cache memory within processor 610.
  • processor 610 includes a memory control hub (MCH) 614, which is operable to perform functions that enable processor 610 to access and communicate with a memory 630 that includes a volatile memory 632 and/or a non-volatile memory 634.
  • MCH memory control hub
  • memory control hub (MCH) 614 is positioned outside of processor 610 as an independent integrated circuit.
  • processor 610 is operable to communicate with memory 630 and a chipset 620.
  • SSD 680 executes the computer-executable instructions when SSD 680 is powered up.
  • processor 610 is also coupled to a wireless antenna 678 to
  • wireless antenna interface 678 operates in accordance with, but is not limited to, the IEEE 802.11 standard and its related family, HomePlug AV (HPAV), Ultra Wide Band (UWB), Bluetooth, WiMAX, or any form of wireless communication protocol.
  • the volatile memory 632 includes, but is not limited to, Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device.
  • Non-volatile memory 634 includes, but is not limited to, flash memory (e.g., NAND, NOR), phase change memory (PCM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), or any other type of non-volatile memory device.
  • Memory 630 stores information and instructions to be executed by processor 610.
  • chipset 620 connects with processor 610 via Point-to-Point (PtP or P-P) interfaces 617 and 622.
  • chipset 620 enables processor 610 to connect to other modules in the system 600.
  • interfaces 617 and 622 operate in accordance with a PtP communication protocol such as the Intel QuickPath Interconnect (QPI) or the like.
  • QPI QuickPath Interconnect
  • chipset 620 is operable to communicate with processor 610, 605, display device 640, and other devices 672, 676, 674, 660, 662, 664, 666, 677, etc. In one embodiment, chipset 620 is also coupled to a wireless antenna 678 to communicate with any device configured to transmit and/or receive wireless signals.
  • chipset 620 connects to a display device 640 via an interface 626.
  • display device 640 includes, but is not limited to, liquid crystal display (LCD), plasma, cathode ray tube (CRT) display, or any other form of visual display device.
  • chipset 620 connects to one or more buses 650 and 655 that interconnect various modules 674, 660, 662, 664, and 666.
  • buses 650 and 655 may be interconnected together via a bus bridge 672 if there is a mismatch in bus speed or communication protocol.
  • chipset 620 couples with, but is not limited to, a non-volatile memory 660, a mass storage device(s) 662, a keyboard/mouse 664, and a network interface 666 via interface 624, smart TV 676, consumer electronics 677, etc.
  • mass storage device 662 includes, but is not limited to, a solid state drive, a hard disk drive, a universal serial bus flash memory drive, or any other form of computer data storage medium.
  • network interface 666 is implemented by any type of well-known network interface standard including, but not limited to, an Ethernet interface, a universal serial bus (USB) interface, a Peripheral Component Interconnect (PCI) Express interface, a wireless interface and/or any other suitable type of interface.
  • While the modules shown in Figure 4 are depicted as separate blocks within the system 600, the functions performed by some of these blocks may be integrated within a single semiconductor circuit or may be implemented using two or more separate integrated circuits.
  • an apparatus to control temperature of an integrated circuit comprises a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
  • the subject matter of the first example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
  • the subject matter of the first example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
  • the subject matter of the first example embodiment can optionally include: a memory to store the individual temperature thresholds for the plurality of sensors; and an update module coupled to the memory to update the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
  • the subject matter of this example embodiment can optionally include that the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
  • the subject matter of the first example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
  • the subject matter of the first example embodiment can optionally include that the thermal throttling comprises clock modulation.
  • a method for controlling temperature of an integrated circuit comprises: monitoring a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and performing thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
  • the subject matter of the second example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
  • the subject matter of the second example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
  • the subject matter of the second example embodiment can optionally include: storing the individual temperature thresholds for the plurality of sensors in memory; and updating the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
  • the subject matter of this example embodiment can optionally include that updating the individual temperature threshold for each of the plurality of temperature sensors includes monitoring changes to the power supplied to one or more areas of the plurality of areas.
  • the subject matter of the second example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
  • the subject matter of the second example embodiment can optionally include that the thermal throttling comprises clock modulation.
  • a computing system comprises: a memory to store the individual temperature thresholds for the plurality of sensors; a display coupled to the memory; a processor coupled to the memory and the display, the processor comprising a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
  • the subject matter of the third example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
  • the subject matter of the third example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
  • the subject matter of the third example embodiment can optionally include an update module coupled to the processor to update the individual temperature thresholds for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
  • the subject matter of this example embodiment can optionally include that the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
  • the subject matter of the third example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
  • the subject matter of the third example embodiment can optionally include that the thermal throttling comprises clock modulation.
  • a machine-readable medium has instructions that when operated on by the machine cause the machine to perform operations comprising: receiving recorded die temperatures from a plurality of thermal sensors being monitored, each of the plurality of thermal sensors being located in an area in an IC to record a die temperature at a location in its associated area; and sending commands to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
  • the subject matter of the fourth example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
  • the subject matter of the fourth example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
  • the present invention also relates to apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
  • the algorithms and displays presented herein are not inherently
  • a machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer).
  • a machine-readable medium includes read only memory ("ROM"); random access memory ("RAM"); magnetic disk storage media; optical storage media; flash memory devices; etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Hardware Design (AREA)
  • Power Sources (AREA)
  • Semiconductor Integrated Circuits (AREA)

Abstract

A method and apparatus for integrated circuit (IC) thermal throttling is described. In one embodiment, the apparatus comprises a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.

Description

INTEGRATED CIRCUIT THERMAL THROTTLING WITH WORKLOAD ADAPTED THERMAL SENSOR MAXIMUM TEMPERATURE
FIELD OF THE INVENTION
This disclosure pertains to energy efficiency and energy conservation in integrated circuits and in particular but not exclusively, to the field of thermal control of computing device processors. More particularly, embodiments of the invention relate to energy efficient and energy conserving thermal throttling of electronic device processors.
BACKGROUND OF THE INVENTION
Advances in semi-conductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit (IC) devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple hardware threads, multiple cores, multiple devices, and/or complete systems on individual integrated circuits. Additionally, as the density of integrated circuits has grown, the power requirements for computing systems (from embedded systems to servers) have also escalated. Furthermore, software inefficiencies, and its requirements of hardware, have also caused an increase in computing device energy consumption. As a result, there is a vital need for energy efficiency and conservation associated with integrated circuits.
As the trend toward advanced microprocessors with more transistors and higher frequencies continues to grow, computer designers and manufacturers are often faced with corresponding increases in power and thermal consumption. Particularly in computing devices, processor power consumption can lead to overheating, which may negatively affect performance, damage components (e.g., the processor), cause discomfort or injury to the user, and can significantly reduce battery life.
Some processors and SoCs (System-on-a-Chip) are enabled with Intel® Thermal Monitor, which is a feature that activates a Thermal Control Circuit (TCC) to reduce power and maintain die temperatures within product limits. The TCC reduces power by either frequency/voltage reduction or clock modulation (both of which are considered throttling herein). This feature relies on multiple on-die Digital Thermal Sensors (DTS) to monitor real-time die temperatures for TCC activation decisions. Since each DTS cannot be located exactly at a die hotspot, there is an offset between the temperature of the hotspot and the temperature measured by the DTS close to it. This hotspot-to-DTS temperature offset is obtained by thermal modeling and is used to determine the maximum DTS temperature limit that would trigger TCC activation. Currently, the die hotspot-to-DTS temperature offset is calculated by modeling the worst case TDP workload power map and is hardcoded into the throttling algorithm. Consequently, the TCC will be accurately activated only for the workload scenario corresponding to the maximum TDP. For other workloads experienced during product operation by the end user, the offset between the die hotspot temperature and DTS temperature will be different from the one hardcoded into the throttling algorithm. This can lead to undesirable early or late TCC activation depending on workload, and can impact customer performance (early activation) or possibly exceed maximum temperatures (late activation).
To guarantee the long term reliability of a processor, under fan failure or other anomalous thermal excursion, thermal throttling will kick in to cool down the processor. In one embodiment, the throttling circuit is activated depending upon the temperature recorded by the sensors (DTS) present at various locations on the chip. The maximum allowed temperature that any DTS can reach before a throttling controller (e.g., a throttling circuit) is activated is referred to herein as DTSmax and is calculated using equation (1).
$$DTS_{max} = T_{jmax} - W_{jp} \times TDP \qquad (1)$$
where DTSmax is the maximum temperature that any DTS can read before thermal throttling is activated. In other words, DTSmax operates as the customer-visible throttling temperature. Wjp is a measure of thermal resistance between the hot spot on the chip and the DTS. In one embodiment, this is determined by running well-known thermal simulations or experiments. TDP is the maximum power that a chip would see. The drawback of setting the DTSmax limits by this method is that it ignores the effect of the power map. Under actual use conditions, the temperature offset between the hot spot temperature and the DTS temperature is influenced heavily by the way the power is distributed on the chip by the specific workload.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
Figure 1 illustrates locations of DTS relative to cores and input/output (IO) regions in one embodiment of an integrated circuit (IC).
Figure 2 illustrates a flow diagram of one embodiment of a process for performing thermal management of an integrated circuit (IC).
Figure 3 illustrates a more detailed view of one embodiment of a power delivery system.
Figure 4 is one embodiment of a computing system.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
In the following description, numerous details are set forth to provide a more thorough explanation of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
Techniques are disclosed to adaptively change the thermal sensor maximum limit (e.g., digital thermal sensor DTSmax limit) used to control integrated circuit (IC) throttling by a thermal control circuit (TCC). In one embodiment, the DTSmax is changed adaptively based on changing workload scenarios. The adaptive changing of the DTSmax increases the accuracy of TCC triggering, thereby benefiting the customers as well as maintaining die temperature limits. Note that while digital thermal sensors (DTSs) are mentioned throughout the disclosure, the techniques herein are not limited to digital implementations of thermal sensors, and analog thermal sensors may be used.
A method and apparatus for adapting the value of DTSmax according to the workload is described below. In one embodiment, the value of the maximum allowed temperature (DTSmax) is updated for every DTS in real-time based on the power supplied to multiple functional blocks (e.g., processor cores, pipelines, IO registers, uncores, controllers, etc.). In one embodiment, these multiple functional blocks include all the cores and input/output (IO) blocks of the IC. In one embodiment, these multiple functional blocks include all cores and IO blocks having a DTS. In another embodiment, these multiple functional blocks are a subset of the cores and/or IO blocks. This may result in less accuracy.
In one embodiment, the process for performing the update is based on the principle of thermal superposition in that the temperature difference between a hot spot and a DTS caused by a workload is expressed as the superposition of the temperature differences caused by every power source individually. One embodiment of the process presented herein sets a unique maximum allowed temperature limit on every DTS, depending upon its location on the chip relative to the heat producing sources, which is in contrast to imposing a single limit on all the DTSs.
An Example Process for Calculating DTSmax
In one embodiment, the process for calculating the individual DTSmax values is based on the principle of thermal superposition in that the temperature difference between a hot spot and a DTS is expressed as the superposition of the temperature differences caused by every power source individually. In one embodiment, the temperature difference is expressed as a linear function of the power supplied to every power producing region on a die (e.g., core or IO regions), multiplied by appropriate weighting factors. For example, the temperature difference between the hot spot in Die Region #1 and the DTS located near Die Region #1 can be expressed as:
$$T_{DR1} - T_{DTS1} = w_1 P_{DR1} + w_2 P_{DR2} + \cdots + w_n P_{DRn} \qquad (2)$$
where
$T_{DR1}$ = Maximum temperature in DR1
$T_{DTS1}$ = Temperature recorded by the DTS present in DR1
$P_{DRi}$ = Power supplied to Die Region #i (DRi)
$w_i$ = Weighting factor assigned to power in DRi
and i is an integer index for each of the units (e.g., cores) receiving power.
In one embodiment, the weighting factors reflect the relative impact of the power sources on the temperature difference between the core and DTS of interest. The value of a weighting factor depends upon the distance of the power source from the core and DTS of interest. In one embodiment, these values are obtained by running thermal models, which are well-known in the art. These include, for example, the finite difference method or the finite element method. In another embodiment, the values are obtained by running experiments and recording temperatures.
If the maximum allowed temperature in DR1 is known (referred to herein as DRmax), it is plugged into equation (2) in place of the maximum temperature in DR1 to obtain the maximum allowed temperature that DTS #1 can reach (referred to herein as DTSmax). This can be expressed as:
$$DTS_{max} = DR_{max} - (w_1 P_{DR1} + w_2 P_{DR2} + \cdots + w_n P_{DRn}) \qquad (3)$$
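Equation (3) can be illustrated with a minimal Python sketch. The function name `dts_max`, the three-region layout, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the disclosure.

```python
def dts_max(dr_max_c, weights, powers_w):
    """Per-sensor threshold per equation (3): DRmax - sum(w_i * P_DRi).

    dr_max_c : maximum allowed temperature of the die region (deg C)
    weights  : weighting factors w_i (deg C per watt), one per die region
    powers_w : power supplied to each die region (watts)
    """
    if len(weights) != len(powers_w):
        raise ValueError("need one weighting factor per die region")
    # Hot-spot-to-sensor offset as a superposition of per-region contributions.
    offset_c = sum(w * p for w, p in zip(weights, powers_w))
    return dr_max_c - offset_c


# Hypothetical values for a three-region die (assumptions, not measured data).
weights = [0.5, 0.25, 0.125]   # deg C per watt
powers_w = [8.0, 4.0, 0.0]     # watts; the third region is idle
threshold = dts_max(100.0, weights, powers_w)
print(threshold)  # 95.0
```

With these assumed numbers the offset is 0.5 × 8 + 0.25 × 4 = 5 degrees, so the sensor threshold sits 5 degrees below the region limit. A lighter workload would shrink the offset and raise the threshold.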
Thus, there is a relationship between the maximum allowed DTS temperature in a DR, the maximum allowed temperature in that DR, and the power supplied to all the other DRs, and this relationship is used to adapt the DTSmax. For a chip having n DTSs monitoring n cores, n such equations are used. These equations can be expressed compactly in matrix form as shown below.
[Equation image imgf000005_0001: matrix form of equation (3); image not reproduced in this text]
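Because the equation image is not available in this text, the following is only a plausible reconstruction of the matrix form, obtained by stacking n copies of equation (3). The double-indexed weights $w_{ij}$ (impact of the power in DRj on the threshold of DTS i) and the per-sensor subscripts on $DTS_{max}$ and $DR_{max}$ are notational assumptions, not symbols confirmed by the original figure.

```latex
\[
\begin{bmatrix} DTS_{max,1} \\ DTS_{max,2} \\ \vdots \\ DTS_{max,n} \end{bmatrix}
=
\begin{bmatrix} DR_{max,1} \\ DR_{max,2} \\ \vdots \\ DR_{max,n} \end{bmatrix}
-
\begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1n} \\
w_{21} & w_{22} & \cdots & w_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{nn}
\end{bmatrix}
\begin{bmatrix} P_{DR1} \\ P_{DR2} \\ \vdots \\ P_{DRn} \end{bmatrix}
\]
```

Row i of this system reduces to equation (3) for DTS i.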
In the above equation, the weighting factor matrix relates the impact of the power of every DR on the maximum allowed temperature for every DTS. In one embodiment, the weighting factor matrix is populated by running thermal simulations and can be hardcoded into the throttling process (e.g., hardcoded in memory or firmware on the IC with the TCC or off-chip in a memory accessible by the IC). By monitoring the power to every DR, the maximum allowed DTS temperature (DTSmax) can be updated in real-time depending upon the workload. In one embodiment, this update occurs every clock cycle; however, it may be performed more or less often depending on a desired level of accuracy.
Case Study and Simulations to Quantify Impact of the Disclosed Techniques
The improvement in the accuracy of calculating DTSmax using the techniques disclosed herein has been quantified on a processor power-map. However, the techniques are not limited to processors and can be used in any integrated circuit (e.g., SOC, controller, etc.).
Figure 1 is a block diagram of an IC. Referring to Figure 1, IC 100 has eight DTS corresponding to nine power producing regions, labeled 1-9. In one embodiment, the DRs may include one or more processor cores and one or more IO regions. In one embodiment, IC 100 may comprise a processor, system on a chip (SOC), embedded controller, etc. Each of DRs 1-9 includes a digital thermal sensor (DTS) which drives sensor output signal lines 110, which are routed to a power control unit (PCU) 101 with a thermal controller. In one embodiment, the thermal controller of PCU 101 includes a thermal control circuit.
Each of the thermal sensors is located in an area in the IC to record the die temperature at a location in its associated area. In one embodiment, the thermal controller of PCU 101 compares each of the temperature readings received via the sensor output signal lines 110 from the DTSs of each of the cores and IO regions to the maximum temperature, DTSmax, associated with those areas and performs thermal throttling of the IC to reduce its temperature based on results of the comparison. This occurs in response to the die temperature recorded by any one of the temperature sensors exceeding the individual temperature threshold for that temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to all the areas or to multiple areas of the IC. In the case of Figure 1, it is the power supplied to all of the DR regions 1-9 (e.g., cores, I/O regions, etc.).
In one embodiment, the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to the multiple areas of the IC. In another embodiment, the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and a sum of a product of each power supplied to each DR area (e.g., the DR1-DR9 areas, such as cores and I/O regions) and a weighting factor associated with each power supplied to each individual area.
In response to one of the recorded die temperatures from one of the DTSs being greater than its individual temperature threshold, the PCU 101 performs thermal throttling on IC 100. In one embodiment, the thermal throttling may comprise dynamic frequency and/or voltage scaling. In another embodiment, the thermal throttling may comprise clock modulation. In another embodiment, the thermal controller performs thermal throttling by sending control signals to each of the cores to notify the cores to change to a lower power state in order to reduce the temperature of the IC. In one embodiment, the thermal controller generates one or more core/IO control signals 120 that are sent to individual core or I/O regions or to components supplying clock or voltage signals to those units to implement the thermal throttling.
The impact of power in every core and IO on each of these ten DTSs may be quantified by running thermal models, from which the weighting factor matrix is populated. Note that the maximum allowed DTS temperatures calculated using the algorithm disclosed herein are much more accurate than those of a prior art algorithm in use today.
Figure 2 illustrates a flow diagram of one embodiment of a process for performing thermal management of an integrated circuit (IC). The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), firmware, or a combination of the three.
Referring to Figure 2, the process begins by setting the maximum temperature for each temperature sensor in each of the cores and/or IO regions of an integrated circuit (processing block 401). In one embodiment, this is performed by running thermal models on the IC. In one embodiment, this includes determining weighting factors associated with the power for each of the individual core and IO areas to set up a weighting factor matrix as discussed above. In one embodiment, the weighting factor matrix is populated by running thermal simulations in a manner well known in the art.
In one embodiment, the individual temperature threshold for each temperature sensor is based on the difference between the maximum temperature allowed for the area in the IC in which the thermal sensor resides and the power supplied to multiple areas of the IC. In one embodiment, the multiple areas of the IC include the areas of the core regions. In another embodiment, the areas include a subset of the cores. In yet another embodiment, the areas include the cores and one or more IO regions.
In one embodiment, processing logic stores the individual temperature thresholds for the plurality of sensors in memory (processing block 402).
After initial values have been set, processing logic may optionally update the value based on the current updated value (processing block 403). In one embodiment, the updating may be performed based on workload that is currently being performed by all the core regions. In one embodiment, this update occurs every clock cycle.
Then processing logic in the various cores and the IO regions processes data (processing block 404).
While processing data, processing logic monitors the plurality of sensors (processing block 405). This may occur in the thermal controller of a power control unit in the IC. Processing logic checks whether the temperature of any one of the sensors is above its maximum (processing block 406). If not, processing logic transitions to processing block 404, where the processing logic continues to process data.
If the recorded temperature of one of the thermal sensors is greater than its associated maximum temperature, processing logic performs thermal throttling to reduce the temperature of the die (processing block 407). In one embodiment, the thermal throttling comprises dynamic voltage and/or frequency scaling. In another embodiment, the thermal throttling includes performing clock modulation. In another embodiment, the thermal throttling comprises sending control signals to each core and/or IO unit to signal the unit to change its power state to a lower power state in an attempt to reduce the core temperature of the die.
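The check in processing blocks 405 through 407 can be sketched in Python as follows. This is a simplified software illustration under stated assumptions, not the disclosed circuit: the names `sensors_over_limit` and `thermal_step`, the `throttle` callback, and the sample temperatures are all hypothetical.

```python
def sensors_over_limit(readings_c, thresholds_c):
    """Return indices of sensors whose reading exceeds its own threshold."""
    if len(readings_c) != len(thresholds_c):
        raise ValueError("need one threshold per sensor")
    return [
        i
        for i, (temp, limit) in enumerate(zip(readings_c, thresholds_c))
        if temp > limit
    ]


def thermal_step(readings_c, thresholds_c, throttle):
    """One pass of blocks 405-407: monitor, compare, throttle if needed.

    `throttle` is a hypothetical callback standing in for any throttling
    action (frequency/voltage scaling, clock modulation, power states).
    Returns True if throttling was requested.
    """
    hot = sensors_over_limit(readings_c, thresholds_c)
    if hot:
        throttle(hot)   # block 407: any single sensor over its limit suffices
        return True
    return False        # otherwise return to block 404 and keep processing


# Hypothetical readings and per-sensor thresholds (deg C).
events = []
throttled = thermal_step([90.0, 97.5, 88.0], [95.0, 96.0, 94.0], events.append)
print(throttled, events)  # True [[1]]
```

Only the second sensor exceeds its individual threshold here, which is enough to request throttling even though the first sensor reads below a limit that is lower than the second sensor's reading would suggest under a single shared limit.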
Figure 3 illustrates a more detailed view of the power delivery system. Referring to Figure 3, memory 501 stores the maximum temperatures for each of the monitored areas (e.g., an area having a DTS) of the IC. In one embodiment, the monitored areas include the cores and IO regions. In another embodiment, the monitored areas include a subset of all the cores and IO regions. The digital die temperature sensor (e.g., DTS) for each of the cores/IO regions provides a temperature that is compared against its maximum using comparators 502. Logic 503 (e.g., gate logic) receives the results of each of the comparisons and, if any of the comparisons indicates that a die temperature for one of the areas is greater than its associated maximum temperature threshold, thermal control circuit 504 performs thermal throttling. The thermal throttling may include sending control signals 510 to a voltage regulator (VR) 505 that is providing voltage to each of the core and IO regions. In another embodiment, the thermal throttling may include sending voltage control signals 511 to power gates (e.g., 506_1-506_n) associated with each of the core regions. In yet another embodiment, thermal control circuit 504 sends signals 512 to a clock generator (e.g., phase lock loop (PLL)) in each of the core and IO regions to control the clock signal that is generated in order to perform frequency scaling in the respective core/IO region. In still another embodiment, thermal control circuit 504 sends control signals 523 (e.g., power state change control signals) to the cores/IO regions to cause these units to reduce power consumption (e.g., change to a lower power state) to achieve thermal throttling.
An update module 510 updates the maximum individual temperature thresholds for each of the temperature sensors if the power supplied to those areas changes. The power supplied may change based on changes in the workload associated with an individual core or I/O region.
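The behavior of such an update module can be sketched in Python as a store of per-sensor thresholds that is recomputed whenever the supplied power changes. The class name `ThresholdStore`, the 2x2 weighting matrix, and the power values are hypothetical; the recomputation applies equation (3) to every sensor.

```python
class ThresholdStore:
    """Holds per-sensor thresholds and refreshes them when power changes."""

    def __init__(self, dr_max_c, weight_matrix, powers_w):
        self.dr_max_c = list(dr_max_c)                         # deg C per region
        self.weight_matrix = [list(r) for r in weight_matrix]  # deg C per watt
        self.powers_w = list(powers_w)                         # watts per region
        self.thresholds_c = self._compute()

    def _compute(self):
        # Row i of the weighting matrix gives the offset for sensor i.
        return [
            limit - sum(w * p for w, p in zip(row, self.powers_w))
            for limit, row in zip(self.dr_max_c, self.weight_matrix)
        ]

    def update(self, new_powers_w):
        """Recompute thresholds only if a supplied power changed."""
        new_powers_w = list(new_powers_w)
        if new_powers_w == self.powers_w:
            return False
        self.powers_w = new_powers_w
        self.thresholds_c = self._compute()
        return True


# Hypothetical two-region die; values are assumptions for illustration.
store = ThresholdStore(
    dr_max_c=[100.0, 100.0],
    weight_matrix=[[0.5, 0.25], [0.25, 0.5]],
    powers_w=[8.0, 4.0],
)
print(store.thresholds_c)          # [95.0, 96.0]
changed = store.update([4.0, 8.0])  # workload shifts to the second region
print(changed, store.thresholds_c)  # True [96.0, 95.0]
```

In this assumed example the tighter threshold follows the workload: when most of the power moves to the second region, the second sensor's threshold drops and the first sensor's threshold rises.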
The techniques described herein allow for more effective throttling. This improves the product performance (by avoiding over-throttling) and also its reliability (by avoiding under-throttling).
Figure 4 is one embodiment of a system level diagram 600 that may incorporate the techniques described above. For example, the techniques described above may be incorporated into a processor in system 600.
Referring to Figure 4, system 600 includes, but is not limited to, a desktop computer, a laptop computer, a netbook, a tablet, a notebook computer, a personal digital assistant (PDA), a server, a workstation, a cellular telephone, a mobile computing device, a smart phone, an Internet appliance or any other type of computing device. In another embodiment, system 600 implements the methods disclosed herein and may be a system on a chip (SOC) system.
In one embodiment, processor 610 has one or more processor cores 612 to 612N, where 612N represents the Nth processor core inside the processor 610 where N is a positive integer. In one embodiment, system 600 includes multiple processors including processors 610 and 605, where processor 605 has logic similar or identical to logic of processor 610. In one embodiment, system 600 includes multiple processors including processors 610 and 605 such that processor 605 has logic that is completely independent from the logic of processor 610. In such an embodiment, a multi-package system 600 is a heterogeneous multi-package system because the processors 605 and 610 have different logic units. In one embodiment, processing core 612 includes, but is not limited to, pre-fetch logic to fetch instructions, decode logic to decode the instructions, execution logic to execute instructions and the like. In one embodiment, processor 610 has a cache memory 616 to cache instructions and/or data of the system 600. In another embodiment of the invention, cache memory 616 includes level one, level two and level three, cache memory, or any other configuration of the cache memory within processor 610.
In one embodiment, processor 610 includes a memory control hub (MCH) 614, which is operable to perform functions that enable processor 610 to access and communicate with a memory 630 that includes a volatile memory 632 and/or a non-volatile memory 634. In one embodiment, memory control hub (MCH) 614 is positioned outside of processor 610 as an independent integrated circuit.
In one embodiment, processor 610 is operable to communicate with memory 630 and a chipset 620. In such an embodiment, SSD 680 executes the computer-executable instructions when SSD 680 is powered up.
In one embodiment, processor 610 is also coupled to a wireless antenna 678 to communicate with any device configured to transmit and/or receive wireless signals. In one embodiment, wireless antenna interface 678 operates in accordance with, but is not limited to, the IEEE 802.11 standard and its related family, HomePlug AV (HPAV), Ultra Wide Band (UWB), Bluetooth, WiMAX, or any form of wireless communication protocol.
In one embodiment, the volatile memory 632 includes, but is not limited to, Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device. Non-volatile memory 634 includes, but is not limited to, flash memory (e.g., NAND, NOR), phase change memory (PCM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), or any other type of non-volatile memory device.
Memory 630 stores information and instructions to be executed by processor 610. In one embodiment, chipset 620 connects with processor 610 via Point-to-Point (PtP or P-P) interfaces 617 and 622. In one embodiment, chipset 620 enables processor 610 to connect to other modules in the system 600. In one embodiment, interfaces 617 and 622 operate in accordance with a PtP communication protocol such as the Intel QuickPath Interconnect (QPI) or the like.
In one embodiment, chipset 620 is operable to communicate with processor 610, 605, display device 640, and other devices 672, 676, 674, 660, 662, 664, 666, 677, etc. In one embodiment, chipset 620 is also coupled to a wireless antenna 678 to communicate with any device configured to transmit and/or receive wireless signals.
In one embodiment, chipset 620 connects to a display device 640 via an interface 626. In one embodiment, display device 640 includes, but is not limited to, liquid crystal display (LCD), plasma, cathode ray tube (CRT) display, or any other form of visual display device. In addition, chipset 620 connects to one or more buses 650 and 655 that interconnect various modules 674, 660, 662, 664, and 666. In one embodiment, buses 650 and 655 may be interconnected together via a bus bridge 672 if there is a mismatch in bus speed or communication protocol. In one embodiment, chipset 620 couples with, but is not limited to, a non-volatile memory 660, a mass storage device(s) 662, a keyboard/mouse 664, and a network interface 666 via interface 624, smart TV 676, consumer electronics 677, etc.
In one embodiment, mass storage device 662 includes, but is not limited to, a solid state drive, a hard disk drive, a universal serial bus flash memory drive, or any other form of computer data storage medium. In one embodiment, network interface 666 is implemented by any type of well-known network interface standard including, but not limited to, an Ethernet interface, a universal serial bus (USB) interface, a Peripheral Component Interconnect (PCI) Express interface, a wireless interface and/or any other suitable type of interface.
While the modules shown in Figure 4 are depicted as separate blocks within the system 600, the functions performed by some of these blocks may be integrated within a single semiconductor circuit or may be implemented using two or more separate integrated circuits.
In a first example embodiment, an apparatus to control temperature of an integrated circuit (IC) comprises a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
In another example embodiment, the subject matter of the first example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
In another example embodiment, the subject matter of the first example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
In another example embodiment, the subject matter of the first example embodiment can optionally include: a memory to store the individual temperature thresholds for the plurality of sensors; and an update module coupled to the memory to update the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes. In another example embodiment, the subject matter of this example embodiment can optionally include that the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
In another example embodiment, the subject matter of the first example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
In another example embodiment, the subject matter of the first example embodiment can optionally include that the thermal throttling comprises clock modulation.
In a second example embodiment, a method for controlling temperature of an integrated circuit (IC) comprises: monitoring a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and performing thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
In another example embodiment, the subject matter of the second example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
In another example embodiment, the subject matter of the second example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
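Written out, the relationship in this embodiment may take the following form. The symbols are assumptions introduced here for illustration and are not taken from the source: T_thr,i is the individual temperature threshold for sensor i, T_max,i is the maximum temperature allowed for the area in which sensor i is located, P_j is the power supplied to area j of N areas, and w_ij is the weighting factor associated with P_j. It is further assumed that the weighting factors may differ from sensor to sensor.

```latex
\[
  T_{\mathrm{thr},i} \;=\; T_{\max,i} \;-\; \sum_{j=1}^{N} w_{ij}\, P_j
\]
```

Under this reading, raising the power supplied to any area with a positive weighting factor lowers the threshold of sensor i, so throttling begins at a lower recorded die temperature.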
In another example embodiment, the subject matter of the second example embodiment can optionally include: storing the individual temperature thresholds for the plurality of sensors in memory; and updating the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes. In another example embodiment, the subject matter of this example embodiment can optionally include that updating the individual temperature threshold for each of the plurality of temperature sensors includes monitoring changes to the power supplied to one or more areas of the plurality of areas.
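The storing and updating steps can be sketched as a small store that recomputes thresholds only when the per-area power changes. This is a hedged illustration, not the update module of the application: the `ThresholdStore` class, its field names, and the per-sensor weight lists are hypothetical, and each threshold is assumed to be the area's maximum temperature minus a weighted sum of the per-area powers.

```python
from typing import Dict, List, Optional


class ThresholdStore:
    """Hypothetical in-memory store of per-sensor temperature thresholds."""

    def __init__(self, t_max_c: Dict[str, float],
                 weights: Dict[str, List[float]]) -> None:
        self.t_max_c = t_max_c    # max allowed temperature for each sensor's area
        self.weights = weights    # per-sensor weighting factors, one per area
        self.thresholds_c: Dict[str, float] = {}
        self._last_power_w: Optional[List[float]] = None

    def update_if_changed(self, power_w: List[float]) -> bool:
        """Recompute all thresholds only when the per-area power has changed."""
        if power_w == self._last_power_w:
            return False
        for sensor, t_max in self.t_max_c.items():
            offset = sum(w * p for w, p in zip(self.weights[sensor], power_w))
            self.thresholds_c[sensor] = t_max - offset
        self._last_power_w = list(power_w)
        return True


# Hypothetical example: one sensor, two powered areas.
store = ThresholdStore({"s0": 100.0}, {"s0": [0.5, 0.25]})
store.update_if_changed([4.0, 8.0])
print(store.thresholds_c["s0"])  # 96.0
```

Comparing against the last observed power vector is one simple way to detect a change; a hardware implementation might instead react to a power-state event.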
In another example embodiment, the subject matter of the second example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
In another example embodiment, the subject matter of the second example embodiment can optionally include that the thermal throttling comprises clock modulation.
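The two throttling mechanisms named above can be contrasted with a short sketch. The helper names, the frequency table, and the duty-cycle model are assumptions for illustration only; the source does not specify how either mechanism is configured. Dynamic frequency scaling is modeled as stepping down to the next lower allowed frequency, and clock modulation as running the clock for only a fraction of the time.

```python
from typing import Sequence


def next_lower_frequency(current_mhz: int, allowed_mhz: Sequence[int]) -> int:
    """Dynamic frequency scaling step: pick the next lower allowed frequency.

    Returns the current frequency unchanged if no lower one is available.
    """
    lower = [f for f in allowed_mhz if f < current_mhz]
    return max(lower) if lower else current_mhz


def effective_frequency_mhz(clock_mhz: float, duty_cycle: float) -> float:
    """Clock modulation: the clock runs only for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return clock_mhz * duty_cycle


# Hypothetical frequency table in MHz.
table = [800, 1600, 2400, 3200]
print(next_lower_frequency(2400, table))     # 1600
print(effective_frequency_mhz(2000.0, 0.5))  # 1000.0
```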
In a third example embodiment, a computing system comprises: a memory to store the individual temperature thresholds for the plurality of sensors; a display coupled to the memory; a processor coupled to the memory and the display, the processor comprising a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
In another example embodiment, the subject matter of the third example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
In another example embodiment, the subject matter of the third example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
In another example embodiment, the subject matter of the third example embodiment can optionally include an update module coupled to the processor to update the individual temperature thresholds for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes. In another example embodiment, the subject matter of this example embodiment can optionally include that the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
In another example embodiment, the subject matter of the third example embodiment can optionally include that the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
In another example embodiment, the subject matter of the third example embodiment can optionally include that the thermal throttling comprises clock modulation.
In a fourth example embodiment, a machine-readable medium has instructions that when operated on by the machine cause the machine to perform operations comprising: receiving recorded die temperatures from a plurality of thermal sensors being monitored, each of the plurality of thermal sensors being located in an area in an IC to record a die temperature at a location in its associated area; and sending commands to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, where the individual temperature threshold for each temperature sensor is based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
In another example embodiment, the subject matter of the fourth example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
In another example embodiment, the subject matter of the fourth example embodiment can optionally include that the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
Some portions of the detailed descriptions above are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory ("ROM"); random access memory ("RAM"); magnetic disk storage media; optical storage media; flash memory devices; etc.
Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of various embodiments are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.

Claims

We claim:
1. An apparatus to control temperature of an integrated circuit (IC), the apparatus comprising:
a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and
a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
2. The apparatus defined in Claim 1 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
3. The apparatus defined in Claim 1 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
4. The apparatus defined in Claim 1 further comprising:
a memory to store the individual temperature thresholds for the plurality of sensors; and an update module coupled to the memory to update the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
5. The apparatus defined in Claim 4 wherein the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
6. The apparatus defined in Claim 1 wherein the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
7. The apparatus defined in Claim 1 wherein the thermal throttling comprises clock modulation.
8. A method for controlling temperature of an integrated circuit (IC), the method comprising: monitoring a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and performing thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
9. The method defined in Claim 8 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
10. The method defined in Claim 8 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
11. The method defined in Claim 8 further comprising:
storing the individual temperature thresholds for the plurality of sensors in memory; and updating the individual temperature threshold for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
12. The method defined in Claim 11 wherein updating the individual temperature threshold for each of the plurality of temperature sensors includes monitoring changes to the power supplied to one or more areas of the plurality of areas.
13. The method defined in Claim 8 wherein the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
14. The method defined in Claim 8 wherein the thermal throttling comprises clock modulation.
15. A computing system comprising:
a memory to store the individual temperature thresholds for the plurality of sensors;
a display coupled to the memory;
a processor coupled to the memory and the display, the processor comprising
a plurality of thermal sensors, each of the plurality of thermal sensors being located in an area in the IC to record a die temperature at a location in its associated area; and a thermal controller coupled to the plurality of thermal sensors to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
16. The computing system defined in Claim 15 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
17. The computing system defined in Claim 15 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
18. The computing system defined in Claim 15 further comprising:
an update module coupled to the processor to update the individual temperature thresholds for each of the plurality of temperature sensors if the power supplied to one or more areas of the plurality of areas changes.
19. The computing system defined in Claim 18 wherein the update module monitors changes to the power supplied to one or more areas of the plurality of areas.
20. The computing system defined in Claim 15 wherein the thermal throttling is operable to cause dynamic frequency scaling to reduce the temperature of the IC.
21. The computing system defined in Claim 15 wherein the thermal throttling comprises clock modulation.
22. A machine-readable medium having instructions that when operated on by the machine cause the machine to perform operations comprising:
receiving recorded die temperatures from a plurality of thermal sensors being monitored, each of the plurality of thermal sensors being located in an area in an IC to record a die temperature at a location in its associated area; and
sending commands to perform thermal throttling of the IC to reduce the temperature of the IC in response to the die temperature recorded by any one temperature sensor of the plurality of thermal sensors being greater than an individual temperature threshold for said any one temperature sensor, the individual temperature threshold for each temperature sensor being based on a maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to a plurality of areas of the IC.
23. The machine-readable medium defined in Claim 22 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and power supplied to the plurality of areas of the IC.
24. The machine-readable medium defined in Claim 22 wherein the individual temperature threshold for each temperature sensor is based on a difference between the maximum temperature allowed for the area in the IC in which said each thermal sensor is located and a sum of a product of each power supplied to each area of the plurality of areas and a weighting factor associated with said each power supplied to each area.
PCT/US2016/057025 2015-12-18 2016-10-14 Integrated circuit thermal throttling with workload adapted thermal sensor maximum temperature WO2017105610A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16876222.7A EP3391180B1 (en) 2015-12-18 2016-10-14 Integrated circuit thermal throttling with workload adapted thermal sensor maximum temperature
CN201680069172.1A CN108292161B (en) 2015-12-18 2016-10-14 Integrated circuit thermal throttling of thermal sensor maximum temperature with workload adaptation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/975,356 US10712789B2 (en) 2015-12-18 2015-12-18 Integrated circuit thermal throttling with workload adapted thermal sensor maximum temperature
US14/975,356 2015-12-18

Publications (1)

Publication Number Publication Date
WO2017105610A1 (en) 2017-06-22

Family

ID=59057230

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/057025 WO2017105610A1 (en) 2015-12-18 2016-10-14 Integrated circuit thermal throttling with workload adapted thermal sensor maximum temperature

Country Status (4)

Country Link
US (2) US10712789B2 (en)
EP (1) EP3391180B1 (en)
CN (1) CN108292161B (en)
WO (1) WO2017105610A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912442B (en) * 2016-04-12 2019-01-04 英业达科技有限公司 Electronic device with temperature management function
US10884811B2 (en) * 2017-06-04 2021-01-05 Apple Inc. Scheduler for AMP architecture with closed loop performance controller using static and dynamic thread grouping
CN113157208A (en) 2017-07-24 2021-07-23 三星电子株式会社 Storage device and temperature control of electronic device including the same
US10901655B2 (en) * 2018-09-27 2021-01-26 Western Digital Technologies, Inc. Non-volatile storage system with command response piggybacking
US11656676B2 (en) * 2018-12-12 2023-05-23 Intel Corporation System, apparatus and method for dynamic thermal distribution of a system on chip
US11209879B2 (en) * 2019-07-01 2021-12-28 Dell Products L.P. Method and apparatus for adjusting a workload to perform power management to extend battery cycle life
US11422597B2 (en) * 2021-01-06 2022-08-23 International Business Machines Corporation Predetermining separate thermal control points for chips of a multi-chip module

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060161373A1 (en) * 2004-12-30 2006-07-20 Intel Corporation Method and apparatus for accurate on-die temperature measurement
WO2007024403A2 (en) 2005-08-25 2007-03-01 Apple Computer, Inc. Methods and apparatus for dynamic thermal control
US20070124622A1 (en) 2005-11-29 2007-05-31 Johns Charles R Implementation of thermal throttling logic
KR20070104711A (en) * 2006-04-24 2007-10-29 엘지전자 주식회사 Apparatus and method of controlling for cpu frequency
US20100213919A1 (en) * 2009-02-20 2010-08-26 Toshinari Takayanagi Temperature compensation in integrated circuit
US8037445B2 (en) * 2003-08-20 2011-10-11 Hewlett-Packard Development Company, L.P. System for and method of controlling a VLSI environment
US20120166839A1 (en) 2011-12-22 2012-06-28 Sodhi Inder M Method, apparatus, and system for energy efficiency and energy conservation including energy efficient processor thermal throttling using deep power down mode
US20130246820A1 (en) 2010-10-29 2013-09-19 Advanced Micro Devices, Inc. Method for adaptive performance optimization of the soc
US20140022003A1 (en) * 2011-04-11 2014-01-23 Sony Computer Entertainment Inc. Semiconductor integrated circuit

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1153563A (en) * 1994-04-28 1997-07-02 文鲁奈克斯技术公司 Temp. management for integrated circuits
US6393374B1 (en) * 1999-03-30 2002-05-21 Intel Corporation Programmable thermal management of an integrated circuit die
JP3781758B2 (en) * 2004-06-04 2006-05-31 株式会社ソニー・コンピュータエンタテインメント Processor, processor system, temperature estimation device, information processing device, and temperature estimation method
US7337339B1 (en) * 2005-09-15 2008-02-26 Azul Systems, Inc. Multi-level power monitoring, filtering and throttling at local blocks and globally
US7590473B2 (en) * 2006-02-16 2009-09-15 Intel Corporation Thermal management using an on-die thermal sensor
CN100517176C (en) * 2006-06-21 2009-07-22 国际商业机器公司 System and method for implementing heat conditioning logical
US8044697B2 (en) * 2006-06-29 2011-10-25 Intel Corporation Per die temperature programming for thermally efficient integrated circuit (IC) operation
US20130090888A1 (en) * 2011-10-07 2013-04-11 Qualcomm Incorporated System and method for proximity based thermal management of mobile device
US9285818B2 (en) * 2012-07-26 2016-03-15 Qualcomm Incorporated Autonomous thermal controller for power management IC
US20150277520A1 (en) 2012-09-28 2015-10-01 Hewlett-Packard Development Company, L.P. Temperature regulation of a cpu
TWI443496B (en) * 2012-11-01 2014-07-01 Asustek Comp Inc Electronic device and method for controlling working frequency of overclocking operation
ES2734205T3 (en) * 2013-08-06 2019-12-04 Gogoro Inc Adjustment of electric vehicle systems based on a thermal profile of an electric energy storage device
US20160109491A1 (en) * 2014-10-20 2016-04-21 Itron, Inc. Grid topology mapping with voltage data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EFRAIM ROTEM ET AL.: "Temperature measurement in the Intel Core TM Duo Processor", PROCEEDINGS OF 12TH INTERNATIONAL WORKSHOP ON THERMAL INVESTIGATIONS OF ICS, THERMINIC 2006, 12 September 2007 (2007-09-12), pages 23 - 27, XP055598825
See also references of EP3391180A4

Also Published As

Publication number Publication date
US11320877B2 (en) 2022-05-03
EP3391180A1 (en) 2018-10-24
CN108292161A (en) 2018-07-17
CN108292161B (en) 2022-07-19
US20170177044A1 (en) 2017-06-22
EP3391180A4 (en) 2019-08-07
US20200401199A1 (en) 2020-12-24
US10712789B2 (en) 2020-07-14
EP3391180B1 (en) 2023-04-19

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16876222; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
Ref country code: DE