CN100520680C - Method, system, and processor for hysteresis in thermal throttling

Publication number
CN100520680C
Authority
CN
China
Legal status
Active
Application number
CNB2007101090715A
Other versions
CN101093415A (en)
Inventor
C. R. Johns (C·R·约翰斯)
Wang Fan (王帆)
Current Assignee
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority claimed from US 11/425,499 (US7603576B2)
Application filed by International Business Machines Corp
Publication of CN101093415A
Application granted
Publication of CN100520680C

Abstract

A computer implemented method, data processing system, and processor for hysteresis in thermal throttling are provided. A digital thermal sensor senses the temperature of an integrated circuit, and a determination is made as to whether the sensed temperature is greater than or equal to a throttling temperature. In response to the sensed temperature being greater than or equal to the throttling temperature, a throttling mode is initiated. The digital thermal sensor then senses a new temperature, and a determination is made as to whether the new sensed temperature is less than an end throttling temperature. In response to the new sensed temperature being less than the end throttling temperature, the throttling mode is disabled.

Description

Method, system, and processor for hysteresis in thermal throttling
Technical field
The present application relates generally to the use of thermal management. More particularly, it relates to a computer implemented method, data processing system, and processor for implementing hysteresis in thermal throttling.
Background
The first-generation heterogeneous Cell Broadband Engine (BE) processor is a multi-core chip comprising one 64-bit Power PC processor core and eight single instruction multiple data (SIMD) synergistic processor cores. It is capable of massive floating-point processing and is optimized for compute-intensive workloads and broadband rich media applications. A high-speed memory controller and a high-bandwidth bus interface are also integrated on the chip. The Cell BE's multi-core architecture and ultra-high-speed communication capabilities deliver greatly improved real-time response, in many cases ten times the performance of the latest PC processors. The Cell BE is operating system neutral and supports multiple operating systems simultaneously. Applications for this type of processor range from next-generation game systems with dramatically enhanced realism, to systems that form the hub for digital media and streaming content in the home, to systems used to develop and distribute digital content, and to systems that accelerate visualization and supercomputing applications.
Today's multi-core processors are frequently limited by thermal considerations. Typical solutions include cooling and power management. Cooling may be expensive and/or difficult to package. Power management is generally a coarse action that "throttles" much, if not all, of the processor in reaction to a thermal limit being reached. Other techniques, such as thermal management, help address these coarse actions by throttling only the units that exceed a given temperature. However, most thermal management techniques impact the real-time guarantees of an application. It would therefore be beneficial to provide a thermal management solution that gives a processor a method to guarantee the real-time nature of an application even when a thermal condition occurs that requires throttling of the processor. In cases where the real-time guarantees cannot be met, the application is notified so that a corrective action can be implemented.
Summary of the invention
The different aspects of the illustrative embodiments provide a computer implemented method, data processing system, and processor for recording the maximum temperature of an integrated circuit. The illustrative embodiments sense a temperature in the integrated circuit using a digital thermal sensor and determine whether the sensed temperature is greater than or equal to a throttling temperature. In response to the sensed temperature being greater than or equal to the throttling temperature, the illustrative embodiments initiate a throttling mode. The illustrative embodiments then sense a new temperature using the digital thermal sensor and determine whether the new sensed temperature is less than an end throttling temperature. In response to the new sensed temperature being less than the end throttling temperature, the illustrative embodiments disable the throttling mode.
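The sequence described above can be sketched as a small two-threshold state machine. This is only an illustration under stated assumptions: the class name `HysteresisThrottle`, the field names, and the temperature values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class HysteresisThrottle:
    """Two-threshold throttle control (names and values are illustrative)."""
    throttle_temp: float       # enter throttling at or above this temperature
    end_throttle_temp: float   # leave throttling only below this temperature
    throttling: bool = False

    def update(self, sensed_temp: float) -> bool:
        """Feed one sensor reading and return the resulting throttling state."""
        if not self.throttling and sensed_temp >= self.throttle_temp:
            self.throttling = True
        elif self.throttling and sensed_temp < self.end_throttle_temp:
            self.throttling = False
        return self.throttling


# Hypothetical thresholds: start throttling at 85, stop only below 80.
ctl = HysteresisThrottle(throttle_temp=85.0, end_throttle_temp=80.0)
states = [ctl.update(t) for t in (70.0, 85.0, 83.0, 80.0, 79.0, 84.0)]
print(states)
```

Because the exit threshold is lower than the entry threshold, readings that hover between the two values do not toggle the throttling mode on every sample.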
Description of the drawings
The novel features believed characteristic of the illustrative embodiments are set forth in the appended claims. The illustrative embodiments themselves, however, as well as a preferred mode of use, further objectives, and advantages thereof, will best be understood by reference to the following detailed description of the illustrative embodiments when read in conjunction with the accompanying drawings, wherein:
Fig. 1 depicts a pictorial representation of a network of data processing systems in which aspects of the illustrative embodiments may be implemented;
Fig. 2 depicts a block diagram of a data processing system in which aspects of the illustrative embodiments may be implemented;
Fig. 3 depicts an exemplary diagram of a Cell BE chip in which aspects of the illustrative embodiments may be implemented;
Fig. 4 illustrates an exemplary thermal management system in accordance with an illustrative embodiment;
Fig. 5 depicts a graph of temperature and the various points at which interrupts and dynamic throttling may occur in accordance with an illustrative embodiment;
Fig. 6 depicts a flow diagram of the operation for recording the maximum temperature in accordance with an illustrative embodiment;
Fig. 7 depicts a flow diagram of the operation for tracing thermal data via performance monitoring in accordance with another illustrative embodiment;
Figs. 8A and 8B depict flow diagrams of the operation for advanced thermal interrupt generation in accordance with other illustrative embodiments;
Fig. 9 depicts a flow diagram of the operation for supporting deep power saving mode and partial good in a thermal management system in accordance with other illustrative embodiments;
Fig. 10 depicts a flow diagram of the operation for a thermal throttle control feature that enables real-time testing of thermal-aware software applications independent of temperature in accordance with other illustrative embodiments;
Fig. 11 depicts a flow diagram of the operation for implementing thermal throttle control with minimal impact on interrupt latency in accordance with other illustrative embodiments;
Fig. 12 depicts a flow diagram of the operation for hysteresis in thermal throttling in accordance with other illustrative embodiments; and
Fig. 13 depicts a flow diagram of the operation for implementing thermal throttling logic in accordance with other illustrative embodiments.
Detailed description
The illustrative embodiments relate to hysteresis in thermal throttling. Figs. 1-2 are provided as exemplary diagrams of data processing environments in which the illustrative embodiments may be implemented. It should be appreciated that Figs. 1-2 are only exemplary and are not intended to assert or imply any limitation with regard to the environments in which aspects of the embodiments may be implemented. Many modifications to the depicted environments may be made without departing from the spirit and scope of the illustrative embodiments.
With reference now to the figures, Fig. 1 depicts a pictorial representation of a network of data processing systems in which aspects of the illustrative embodiments may be implemented. Network data processing system 100 is a network of computers in which the illustrative embodiments may be implemented. Network data processing system 100 contains network 102, which is the medium used to provide communication links between the various devices and computers connected together within network data processing system 100. Network 102 may include connections such as wire, wireless communication links, or fiber optic cables.
In the depicted example, server 104 and server 106 connect to network 102 along with storage unit 108. In addition, clients 110, 112, and 114 connect to network 102. These clients 110, 112, and 114 may be, for example, personal computers or network computers. In the depicted example, server 104 provides data, such as boot files, operating system images, and applications, to clients 110, 112, and 114. Clients 110, 112, and 114 are clients of server 104 in this example. Network data processing system 100 may include additional servers, clients, and other devices not shown.
In the depicted example, network data processing system 100 is the Internet, with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational, and other computer systems that route data and messages. Of course, network data processing system 100 also may be implemented as a number of different types of networks, such as an intranet, a local area network (LAN), or a wide area network (WAN). Fig. 1 is intended as an example, and not as an architectural limitation for the different illustrative embodiments.
With reference now to Fig. 2, a block diagram of a data processing system is shown in which aspects of the illustrative embodiments may be implemented. Data processing system 200 is an example of a computer, such as server 104 or client 110 in Fig. 1, in which computer usable code or instructions implementing the processes of the illustrative embodiments may be located.
In the depicted example, data processing system 200 employs a hub architecture including north bridge and memory controller hub (MCH) 202 and south bridge and input/output (I/O) controller hub (ICH) 204. Processing unit 206, main memory 208, and graphics processor 210 are connected to north bridge and memory controller hub 202. Graphics processor 210 may be connected to north bridge and memory controller hub 202 through an accelerated graphics port (AGP).
In the depicted example, local area network (LAN) adapter 212 connects to south bridge and I/O controller hub 204. Audio adapter 216, keyboard and mouse adapter 220, modem 222, read only memory (ROM) 224, hard disk drive (HDD) 226, CD-ROM drive 230, universal serial bus (USB) ports and other communications ports 232, and PCI/PCIe devices 234 connect to south bridge and I/O controller hub 204 through bus 238 and bus 240. PCI/PCIe devices may include, for example, Ethernet adapters, add-in cards, and PC cards for notebook computers. PCI uses a card bus controller, while PCIe does not. ROM 224 may be, for example, a flash binary input/output system (BIOS).
Hard disk drive 226 and CD-ROM drive 230 connect to south bridge and I/O controller hub 204 through bus 240. Hard disk drive 226 and CD-ROM drive 230 may use, for example, an integrated drive electronics (IDE) or serial advanced technology attachment (SATA) interface. Super I/O (SIO) device 236 may be connected to south bridge and I/O controller hub 204.
An operating system runs on processing unit 206 and coordinates and provides control of the various components within data processing system 200 in Fig. 2. As a client, the operating system may be a commercially available operating system such as Microsoft Windows XP (Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both). An object-oriented programming system, such as the Java programming system, may run in conjunction with the operating system and provides calls to the operating system from Java programs or applications executing on data processing system 200 (Java is a trademark of Sun Microsystems, Inc. in the United States, other countries, or both).
As a server, data processing system 200 may be, for example, an IBM eServer pSeries computer system running the Advanced Interactive Executive (AIX) operating system or the LINUX operating system (eServer, pSeries, and AIX are trademarks of International Business Machines Corporation in the United States, other countries, or both, while Linux is a trademark of Linus Torvalds in the United States, other countries, or both). Data processing system 200 may be a symmetric multiprocessor (SMP) system including a plurality of processors in processing unit 206. Alternatively, a single processor system may be employed.
Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as hard disk drive 226, and may be loaded into main memory 208 for execution by processing unit 206. The processes of the illustrative embodiments are performed by processing unit 206 using computer usable program code, which may be located in a memory such as main memory 208 or read only memory 224, or in one or more peripheral devices 226 and 230.
Those of ordinary skill in the art will appreciate that the hardware in Figs. 1-2 may vary depending on the implementation. Other internal hardware or peripheral devices, such as flash memory, equivalent non-volatile memory, or optical disk drives, may be used in addition to or in place of the hardware depicted in Figs. 1-2. Also, the processes of the illustrative embodiments may be applied to a multiprocessor data processing system.
In some illustrative examples, data processing system 200 may be a personal digital assistant (PDA), which is configured with flash memory to provide non-volatile memory for storing operating system files and/or user-generated data.
A bus system may comprise one or more buses, such as bus 238 or bus 240 as shown in Fig. 2. Of course, the bus system may be implemented using any type of communication fabric or architecture that provides for a transfer of data between different components or devices attached to the fabric or architecture. A communication unit may include one or more devices used to transmit and receive data, such as modem 222 or network adapter 212 of Fig. 2. A memory may be, for example, main memory 208, read only memory 224, or a cache such as that found in north bridge and memory controller hub 202 in Fig. 2. The depicted examples in Figs. 1-2 and the above-described examples are not meant to imply architectural limitations. For example, data processing system 200 also may be a tablet computer, laptop computer, or telephone device in addition to taking the form of a PDA.
Fig. 3 depicts an exemplary diagram of a Cell BE chip in which aspects of the illustrative embodiments may be implemented. Cell BE chip 300 is a single-chip multiprocessor implementation directed toward distributed processing and targeted at media-rich applications such as game consoles, desktop systems, and servers.
Cell BE chip 300 may be logically separated into the following functional components: Power PC processor element (PPE) 301, synergistic processor units (SPUs) 310, 311, and 312, and memory flow controllers (MFCs) 305, 306, and 307. Although synergistic processor elements (SPEs) 302, 303, and 304 and PPE 301 are shown by way of example, any type of processor element may be supported. Although Fig. 3 shows only three SPEs 302, 303, and 304, an exemplary Cell BE chip 300 implementation includes one PPE 301 and eight SPEs. The SPE of a Cell processor is the first implementation of a new processor architecture designed to accelerate media and data streaming workloads.
Cell BE chip 300 may be a system-on-a-chip such that each of the elements depicted in Fig. 3 may be provided on a single microprocessor chip. Moreover, Cell BE chip 300 is a heterogeneous processing environment in which each of SPUs 310, 311, and 312 may receive different instructions from each of the other SPUs in the system. In addition, the instruction set for SPUs 310, 311, and 312 is different from that of Power PC processor unit (PPU) 308; for example, PPU 308 may execute Reduced Instruction Set Computer (RISC) based instructions in the Power architecture, while SPUs 310, 311, and 312 execute vectorized instructions.
Each SPE includes one SPU 310, 311, or 312 with its own local store (LS) area 313, 314, or 315 and a dedicated MFC 305, 306, or 307 that has an associated memory management unit (MMU) 316, 317, or 318 to hold and process memory protection and access permission information. Once again, although SPUs are shown by way of example, any type of processor unit may be supported. Additionally, Cell BE chip 300 implements element interconnect bus (EIB) 319 and other I/O structures to facilitate on-chip and external data flow.
EIB 319 serves as the primary on-chip bus for PPE 301 and SPEs 302, 303, and 304. In addition, EIB 319 interfaces to other on-chip interface controllers that are dedicated to off-chip accesses. The on-chip interface controllers include memory interface controller (MIC) 320, which provides two extreme data rate I/O (XIO) memory channels 321 and 322, and Cell BE interface unit (BEI) 323, which provides two high-speed external I/O channels and the internal interrupt control for Cell BE 300. BEI 323 is implemented as bus interface controllers (BICs, labeled BIC0 and BIC1) 324 and 325 and I/O interface controller (IOC) 326. The two high-speed external I/O channels are connected to Redwood Rambus ASIC Cell (RRAC) interfaces, which provide the flexible input and output (FlexIO_0 and FlexIO_1) 353 for Cell BE 300.
Each SPU 310, 311, or 312 has a corresponding LS area 313, 314, or 315 and synergistic execution unit (SXU) 354, 355, or 356. Each individual SPU 310, 311, or 312 can execute instructions (including data load and store operations) only from within its associated LS area 313, 314, or 315. For this reason, MFC direct memory access (DMA) operations through the dedicated MFC 305, 306, or 307 of each SPU perform all required data transfers to or from storage elsewhere in the system.
A program running on SPU 310, 311, or 312 references its own LS area 313, 314, or 315 using only an LS address. However, the LS area of each SPU is also assigned a real address (RA) within the overall system's memory map. The RA is the address to which a device will respond. In the Power PC architecture, an application refers to a memory location (or device) by an effective address (EA), which is then mapped to a virtual address (VA) for the memory location (or device), which is in turn mapped to the RA. The EA is the address used by an application to reference memory and/or a device. This mapping allows an operating system to allocate more memory than is physically present in the system (hence the term virtual memory, referenced by a VA). A memory map is a listing of all the devices (including memory) in the system and their corresponding RAs. The memory map is a map of the real address space that identifies the RA to which a device or memory will respond.
This allows privileged software to map an LS area to the EA of a process in order to facilitate direct memory access transfers between the LS of one SPU and the LS area of another SPU. PPE 301 may also directly access the LS area of any SPU using an EA. In the Power PC architecture there are three states (problem, privileged, and hypervisor). Privileged software is software that runs in either the privileged or the hypervisor state. These states have different access privileges. For example, privileged software may have access to the data structures and registers used for mapping real memory into the EA of an application. Problem state is the state the processor is usually in when running an application, and it usually prohibits access to system management resources (such as the data structures for mapping real memory).
MFC DMA data commands always include one LS address and one EA. DMA commands copy memory from one location to another. In this case, an MFC DMA command copies data between an EA and an LS address. The LS address directly addresses LS area 313, 314, or 315 of the associated SPU 310, 311, or 312 corresponding to the MFC command queues. Command queues are queues of MFC commands. There is one queue to hold commands from the SPU and one queue to hold commands from the PXU or other devices. However, the EA may be arranged or mapped to access any other memory storage area in the system, including LS areas 313, 314, and 315 of the other SPEs 302, 303, and 304.
Main storage (not shown) is shared by PPU 308, PPE 301, SPEs 302, 303, and 304, and I/O devices (not shown) in a system such as the one shown in Fig. 2. All information held in main memory is visible to all processors and devices in the system. Programs reference main memory using an EA. Since the MFC proxy command queue, control, and status facilities have RAs and the RA is mapped using an EA, it is possible for a Power PC processor element to initiate DMA operations, using an EA, between the main storage and the local storage of the associated SPEs 302, 303, and 304.
As an example, when a program running on SPU 310, 311, or 312 needs to access main memory, the SPU program generates a DMA command with an appropriate EA and LS address and places it into the command queue of its MFC 305, 306, or 307. After the SPU program places the command into the queue, MFC 305, 306, or 307 executes the command and transfers the required data between the LS area and main memory. MFC 305, 306, or 307 provides a second, proxy command queue for commands generated by other devices, such as PPE 301. The MFC proxy command queue is typically used to store a program in local storage prior to starting the SPU. MFC proxy commands can also be used for context store operations.
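The enqueue-then-execute flow in this example can be modeled with a toy simulation. This is a sketch under simplifying assumptions, not the hardware interface: `ToyMfc`, `DmaCommand`, and the `"get"`/`"put"` direction strings are hypothetical names, the EA is treated as a plain index into a flat byte array (no address translation), and only a single command queue is modeled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DmaCommand:
    direction: str   # "get": main memory -> LS, "put": LS -> main memory
    ls_addr: int     # offset into the SPU's local store
    ea: int          # effective address, used here as a plain index
    size: int        # number of bytes to move


class ToyMfc:
    """Toy model: one command queue, one local store, flat main memory."""

    def __init__(self, main_memory: bytearray, ls_size: int = 256):
        self.main = main_memory
        self.local_store = bytearray(ls_size)
        self.queue = deque()

    def enqueue(self, cmd: DmaCommand) -> None:
        """The SPU program places a command into the MFC command queue."""
        self.queue.append(cmd)

    def run(self) -> None:
        """Drain the queue in order, executing each transfer."""
        while self.queue:
            c = self.queue.popleft()
            if c.direction == "get":
                data = self.main[c.ea:c.ea + c.size]
                self.local_store[c.ls_addr:c.ls_addr + c.size] = data
            elif c.direction == "put":
                data = self.local_store[c.ls_addr:c.ls_addr + c.size]
                self.main[c.ea:c.ea + c.size] = data
            else:
                raise ValueError(f"unknown direction: {c.direction}")


main_memory = bytearray(b"hello world!" + bytes(20))
mfc = ToyMfc(main_memory)
mfc.enqueue(DmaCommand("get", ls_addr=0, ea=0, size=5))    # fetch "hello" into LS
mfc.enqueue(DmaCommand("put", ls_addr=0, ea=16, size=5))   # copy it back elsewhere
mfc.run()
print(bytes(main_memory[16:21]))
```

Each command carries exactly one LS address and one EA, mirroring the description above; a proxy queue for commands from other devices could be modeled as a second `deque` drained by the same `run` loop.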
The EA provides the MFC with an address that can be translated into an RA by the MMU. The translation process allows for virtualization of system memory and access protection of memory and devices in the real address space. Since LS areas are mapped into the real address space, the EA can also address all the SPU LS areas.
PPE 301 on Cell BE chip 300 consists of 64-bit PPU 308 and Power PC storage subsystem (PPSS) 309. PPU 308 contains processor execution unit (PXU) 329, level 1 (L1) cache 330, MMU 331, and replacement management table (RMT) 332. PPSS 309 consists of cacheable interface unit (CIU) 333, non-cacheable unit (NCU) 334, level 2 (L2) cache 328, RMT 335, and bus interface unit (BIU) 327. BIU 327 connects PPSS 309 to EIB 319.
SPU 310, 311, or 312 and MFCs 305, 306, and 307 communicate with each other through unidirectional channels that have capacity. Channels are essentially FIFOs that are accessed using one of 34 SPU instructions, including write channel (WRCH), read channel (RDCH), and read channel count (RDCHCNT). RDCHCNT returns the amount of information in the channel. The capacity is the depth of the FIFO. The channels transport data to and from MFCs 305, 306, and 307 and SPUs 310, 311, and 312. BIUs 339, 340, and 341 connect MFCs 305, 306, and 307 to EIB 319.
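A channel as described here behaves like a bounded FIFO. The following sketch is an assumption-laden model, not the SPU instruction semantics: the method names simply borrow the instruction mnemonics, and a full or empty channel raises an error here, whereas real hardware may behave differently (for example, by stalling).

```python
from collections import deque


class Channel:
    """Toy unidirectional channel modeled as a bounded FIFO."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # depth of the FIFO
        self._fifo = deque()

    def wrch(self, value: int) -> None:
        """Write one entry; a full channel is reported as an error in this model."""
        if len(self._fifo) >= self.capacity:
            raise BufferError("channel full")
        self._fifo.append(value)

    def rdch(self) -> int:
        """Read the oldest entry; an empty channel is an error in this model."""
        if not self._fifo:
            raise BufferError("channel empty")
        return self._fifo.popleft()

    def rdchcnt(self) -> int:
        """Return how many entries are currently held in the channel."""
        return len(self._fifo)


ch = Channel(capacity=2)
ch.wrch(7)
ch.wrch(9)
count_before = ch.rdchcnt()
first = ch.rdch()
count_after = ch.rdchcnt()
print(count_before, first, count_after)
```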
MFCs 305, 306, and 307 provide two main functions for SPUs 310, 311, and 312. MFCs 305, 306, and 307 move data between SPUs 310, 311, or 312, LS areas 313, 314, or 315, and main memory. Additionally, MFCs 305, 306, and 307 provide synchronization facilities between SPUs 310, 311, and 312 and other devices in the system.
The implementation of MFCs 305, 306, and 307 has four functional units: direct memory access controllers (DMACs) 336, 337, and 338, MMUs 316, 317, and 318, atomic units (ATOs) 342, 343, and 344, RMTs 345, 346, and 347, and BIUs 339, 340, and 341. DMACs 336, 337, and 338 maintain and process an MFC command queue (MFC CMDQ) (not shown), which consists of an MFC SPU command queue (MFC SPUQ) and an MFC proxy command queue (MFC PrxyQ). The sixteen-entry MFC SPUQ handles MFC commands received from the SPU channel interface. The eight-entry MFC PrxyQ processes MFC commands coming from other devices, such as PPE 301 or SPEs 302, 303, and 304, through memory mapped input and output (MMIO) load and store operations. A typical direct memory access command moves data between LS area 313, 314, or 315 and main memory. The EA parameter of the MFC DMA command is used to address the main storage, including main memory, local storage, and all devices having an RA. The local storage parameter of the MFC DMA command is used to address the associated local storage.
In virtual mode, MMUs 316, 317, and 318 provide address translation and memory protection facilities to handle the EA translation requests from DMACs 336, 337, and 338 and send back the translated address. The MMU of each SPE maintains a segment lookaside buffer (SLB) and a translation lookaside buffer (TLB). The SLB translates an EA to a VA, and the TLB translates the VA coming out of the SLB to an RA. The EA is used by an application and is usually a 32-bit or 64-bit address. Different applications, or multiple copies of one application, may use the same EA to reference different storage locations (for example, two copies of an application that each use the same EA need two different physical memory locations). To accomplish this, the EA is first translated into a much larger VA space, which is common to all applications running under the operating system. The EA to VA translation is performed by the SLB. The VA is then translated into an RA using the TLB, which is a cache of the page table or mapping table that contains the VA to RA mappings. This table is maintained by the operating system.
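The two-step EA to VA to RA translation can be sketched with two lookup tables. Everything specific here is an assumption made for illustration: the segment and page sizes, the dictionary-based SLB and TLB, and the function name `translate` are hypothetical, and miss handling and protection checks are omitted.

```python
PAGE_SIZE = 4096          # assumed page size for this sketch
SEGMENT_SIZE = 1 << 28    # assumed segment size for this sketch
PAGES_PER_SEGMENT = SEGMENT_SIZE // PAGE_SIZE


def translate(ea: int, slb: dict, tlb: dict) -> int:
    """Translate an effective address to a real address in two steps."""
    # Step 1 (SLB): the segment containing the EA selects a virtual segment.
    esid, seg_offset = divmod(ea, SEGMENT_SIZE)
    va = slb[esid] * SEGMENT_SIZE + seg_offset
    # Step 2 (TLB): the virtual page selects a real page.
    vpn, page_offset = divmod(va, PAGE_SIZE)
    return tlb[vpn] * PAGE_SIZE + page_offset


# Two copies of an application use the same EA but have different SLB contents,
# so they land in different virtual segments and different real pages.
slb_a = {0: 5}
slb_b = {0: 9}
tlb = {
    5 * PAGES_PER_SEGMENT + 1: 100,   # virtual page of copy A -> real page 100
    9 * PAGES_PER_SEGMENT + 1: 200,   # virtual page of copy B -> real page 200
}
ea = 0x1234
ra_a = translate(ea, slb_a, tlb)
ra_b = translate(ea, slb_b, tlb)
print(hex(ra_a), hex(ra_b))
```

The shared TLB with per-copy SLB contents reflects the point made above: the VA space is common to all applications, and it is the EA to VA step that separates two users of the same EA.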
ATOs 342, 343, and 344 provide the level of data caching necessary for maintaining synchronization with other processing units in the system. Atomic direct memory access commands provide the means for the synergistic processor elements to perform synchronization with other units.
The main function of BIUs 339, 340, and 341 is to provide SPEs 302, 303, and 304 with an interface to the EIB. EIB 319 provides a communication path between all of the processor cores on Cell BE chip 300 and the external interface controllers attached to EIB 319.
MIC 320 provides an interface between EIB 319 and one or two of XIOs 321 and 322. Extreme data rate (XDR) dynamic random access memory (DRAM) is a high-speed, highly serial memory provided by Rambus. The extreme data rate dynamic random access memory is accessed through a macro provided by Rambus, referred to in this document as XIOs 321 and 322.
MIC 320 is only a slave on EIB 319. MIC 320 acknowledges commands in its configured address range(s), which correspond to the memory in the supported hubs.
BICs 324 and 325 manage data transfer on and off the chip from EIB 319 to either of two external devices. BICs 324 and 325 may exchange non-coherent traffic with an I/O device, or they can extend EIB 319 to another device, which could even be another Cell BE chip. When used to extend EIB 319, the bus protocol maintains coherency between the caches in Cell BE chip 300 and the caches in the attached external device, which could be another Cell BE chip.
IOC 326 handles commands that originate in an I/O interface device and that are destined for the coherent EIB 319. An I/O interface device may be any device that attaches to an I/O interface, such as an I/O bridge chip that attaches multiple I/O devices or another Cell BE chip 300 that is accessed in a non-coherent manner. IOC 326 also intercepts accesses on EIB 319 that are destined for memory-mapped registers that reside in or behind an I/O bridge chip or non-coherent Cell BE chip 300, and routes them to the proper I/O interface. IOC 326 also includes internal interrupt controller (IIC) 349 and I/O address translation unit (I/O Trans) 350.
Pervasive logic 351 is a controller that provides the clock management, test features, and power-on sequence for Cell BE chip 300. Pervasive logic may provide the thermal management system for the processor. Pervasive logic contains a connection to other devices in the system through a Joint Test Action Group (JTAG) or Serial Peripheral Interface (SPI) interface, both of which are commonly known in the art.
Although specific examples of how the different components may be implemented have been provided, this is not meant to limit the architecture in which the aspects of the illustrative embodiments may be used. The aspects of the illustrative embodiments may be used with any multi-core processor system.
During the execution of an application or software, the temperature of areas within the Cell BE chip may rise. Left unchecked, the temperature could rise above the maximum specified junction temperature, leading to improper operation or physical damage. To avoid these conditions, the digital thermal management unit of the Cell BE chip monitors and attempts to control the temperature within the Cell BE chip during operation. The digital thermal management unit consists of a thermal management control unit (TMCU), described herein, and ten distributed digital thermal sensors (DTSs).
One sensor is located in each of the eight SPEs, one is located in the PPE, and one is adjacent to a linear thermal diode. The linear thermal diode is an on-chip diode that calculates temperature. The sensors are positioned adjacent to the areas within the associated unit that typically experience the greatest rise in temperature during the execution of most applications. The thermal control unit monitors feedback from each of these sensors. If the temperature of a sensor rises above a programmable point, the thermal control unit can be configured to cause an interrupt to the PPE or to one or more of the SPEs and to dynamically throttle the execution of the associated PPE or SPE.
Stopping and running the PPE or SPE for a programmable number of cycles provides the necessary throttling. The interrupt allows privileged software to take corrective action, while dynamic throttling attempts to keep the temperature within the Broadband Engine chip below a programmable level without software intervention. Privileged software sets the throttling level equal to or below the recommended setting provided by the application. Each application may be different.
If throttling the PPE or SPEs does not effectively manage the temperature and the temperature continues to rise, pervasive logic 351 stops the clocks of the Cell BE chip when the temperature reaches the thermal overload temperature (defined by programmable configuration data). The thermal overload feature protects the Cell BE chip from physical damage. Recovery from this condition requires a hard reset. The temperature of the region monitored by a DTS is not necessarily the hottest point within the associated PPE or SPE.
Fig. 4 illustrates an exemplary thermal management system in accordance with an illustrative embodiment. The thermal management system may be implemented as an integrated circuit, such as the one provided by pervasive logic unit 351 of Fig. 3. The thermal management system may be an application-specific integrated circuit, a processor, a multiprocessor, or a heterogeneous multi-core processor. The thermal management system is divided between ten distributed DTSs, of which only DTSs 404, 406, 408, and 410 are shown for simplicity, and thermal management control unit (TMCU) 402. Each of DTSs 404 and 406 in SPU sensors 440, DTS 408 in PPU sensor 442, and DTS 410 in sensor 444 adjacent to the linear thermal diode (not shown) provides a current temperature detection signal. This signal indicates that the temperature is equal to or below the current temperature detection range set by TMCU 402. TMCU 402 uses the state of the signals from DTSs 404, 406, 408, and 410 to continually track the temperature at the DTS of each PPE or SPE. As the temperature is tracked, TMCU 402 provides the current temperature as a numeric value that represents the temperature within the associated PPE or SPE. Internal calibration storage 428 is set during manufacturing, when the individual sensors are calibrated.
In addition to the elements of TMCU 402 described above, TMCU 402 also contains multiplexers 446 and 450, work register 448, comparators 452 and 454, serializer 456, thermal management control state machine 458, and data flow (DF) unit 460. Multiplexers 446 and 450 combine various outgoing and incoming signals for transmission over a single medium. Work register 448 holds the results of multiplications performed in TMCU 402. Comparators 452 and 454 each compare two inputs. Comparator 452 is a greater-than-or-equal-to comparator. Comparator 454 is a greater-than comparator. Serializer 456 converts low-speed parallel data from a source into high-speed serial data for transmission. Serializer 456 works in conjunction with deserializers 462 and 464 on SPU sensors 440. Deserializers 462 and 464 convert received high-speed serial data into low-speed parallel data. Thermal management control state machine 458 starts the internal initialization of TMCU 402. DF unit 460 controls the data going to and from thermal management control state machine 458.
TMCU 402 may be configured to cause an interrupt to the PPE, using interrupt logic 416, and to dynamically throttle the execution of the PPE or an SPE, using throttling logic 418.
TMCU 402 compares the numeric value representing the temperature to a programmable interrupt temperature and a programmable throttle point. Each DTS has an independent programmable interrupt temperature. If the temperature is within the programmed interrupt temperature range, TMCU 402 generates an interrupt to the PPE, if enabled. An interrupt is generated when the temperature is above or below the programmed level, depending on the direction bit described below. In addition, a second programmable interrupt temperature may cause an attention signal to be sent to a system controller. The system controller is on the system planar and is connected to the Cell BE through the SPI port.
If the temperature sensed by the DTS associated with the PPE or an SPE is equal to or above the throttle point, TMCU 402 throttles the execution of the PPE or of one or more SPEs by independently starting and stopping that PPE or SPE. Software can control the ratio and frequency of the throttling using thermal management registers, such as the thermal management stop time registers and the thermal management throttle scale register.
Fig. 5 depicts a graph of temperature and the various points at which interrupts and dynamic throttling may occur, in accordance with an illustrative embodiment. In Fig. 5, line 500 may represent the temperature of the PPE or an SPE. While the PPE or SPE is running normally, there is no throttling; these regions are marked with an "N". When the temperature of the PPE or SPE reaches the throttle point, the TMCU begins throttling the execution of the associated PPE or SPE. The regions in which throttling occurs are marked with a "T". When the temperature of the PPE or SPE drops below the end throttle point, execution returns to normal operation.
If, for any reason, the temperature continues to rise and reaches a temperature at or above the full throttle point, TMCU 402 stops the PPE or SPE until the temperature drops below the full throttle point. The regions in which the PPE or SPE is stopped are marked with an "S". Stopping the PPE or SPE when the temperature is at or above the full throttle point is referred to as the core stop safety.
In this exemplary illustration, the interrupt temperature is set above the throttle point. TMCU 402 therefore generates an interrupt that notifies software that the corresponding PPE or SPE has been stopped because the temperature was, or still is, above the core stop temperature. This assumes that the thermal interrupt mask register (TM_ISR), see 422 in Fig. 4, is set to active, which allows the PPE or SPE to resume while an interrupt is pending. If dynamic throttling is disabled, privileged software manages the thermal condition. Failing to manage the thermal condition can result in improper operation of the associated PPE or SPE or in a thermal shutdown caused by the thermal overload function.
Returning to Fig. 4, the thermal sensor status registers comprise thermal sensor current temperature status registers 412 and thermal sensor maximum temperature status registers 414. These registers allow software to read the current temperature of each DTS, to determine the highest temperature reached during a period of time, and to cause an interrupt when the temperature reaches a programmable temperature. The thermal sensor status registers have associated real address pages that can be marked as hypervisor privileged.
Thermal sensor current temperature status registers 412 contain the encoded, or digital, value of the current temperature of each DTS. Because of latencies in sensor temperature detection, latencies in reading these registers, and normal temperature fluctuations, the temperature reported in these registers is that of an earlier point in time and may not reflect the actual temperature at the moment software receives the data. Because each sensor has dedicated control logic, the control logic in DTSs 404, 406, 408, and 410 samples all sensors in parallel. TMCU 402 updates the contents of thermal sensor current temperature status registers 412 at the end of each sampling period, changing the value in the registers to the current temperature. TMCU 402 polls for a new current temperature every SenSampTime period. The SenSampTime configuration field controls the length of the sampling period.
Thermal sensor maximum temperature status registers 414 contain the digitally encoded maximum temperature reached by each sensor since thermal sensor maximum temperature status registers 414 were last read. A read of these registers, whether by software or by any off-chip device such as off-chip device 472 or off-chip I/O device 474, causes TMCU 402 to copy the current temperature of each sensor into the register. After the read, TMCU 402 continues to track the maximum temperature starting from that point. Each register is read independently. A read of one register does not affect the contents of the other.
Each sensor has dedicated control logic, so the control logic in DTSs 404, 406, 408, and 410 samples all sensors in parallel. TMCU 402 changes the value in thermal sensor maximum temperature status registers 414 to the current temperature. TMCU 402 polls for a new current temperature every SenSampTime period. The SenSampTime configuration field controls the length of the sampling period.
The thermal sensor interrupt registers in interrupt logic 416 control the generation of thermal management interrupts to the PPE. This set of registers consists of thermal sensor interrupt temperature registers 420 (TS_ITR1 and TS_ITR2), thermal sensor interrupt status register 422 (TS_ISR), thermal sensor interrupt mask register 424 (TS_IMR), and thermal sensor global interrupt temperature register 426 (TS_GITR). Thermal sensor interrupt temperature registers 420 and thermal sensor global interrupt temperature register 426 contain the encodings of the temperatures that cause a thermal management interrupt to the PPE.
When the digitally encoded temperature for a sensor in thermal sensor current temperature status registers 412 is greater than or equal to the interrupt temperature encoding for the corresponding sensor in thermal sensor interrupt temperature registers 420, TMCU 402 sets the corresponding status bit in thermal sensor interrupt status register 422 (TS_ISR[Sx]). When the temperature encoding for any sensor in thermal sensor current temperature status registers 412 is greater than or equal to the global interrupt temperature encoding in thermal sensor global interrupt temperature register 426, TMCU 402 sets the corresponding status bit in thermal sensor interrupt status register 422 (TS_ISR[Gx]).
If any thermal sensor interrupt status register 422 bit (TS_ISR[Sx]) is set and the corresponding mask bit in thermal sensor interrupt mask register 424 (TS_IMR[Mx]) is also set, TMCU 402 asserts a thermal management interrupt signal to the PPE. If any thermal sensor interrupt status register 422 bit (TS_ISR[Gx]) is set and the corresponding mask bit in thermal sensor interrupt mask register 424 (TS_IMR[Cx]) is also set, TMCU 402 asserts a thermal management interrupt signal to the PPE.
To clear an interrupt condition, privileged software should set any corresponding mask bits in the thermal sensor interrupt mask register to "0". To enable a thermal management interrupt, privileged software ensures that the temperature is below the interrupt temperature for the corresponding sensors and then performs the following sequence. Enabling an interrupt when the temperature is not below the interrupt temperature may result in an immediate thermal management interrupt.
1. Write a "1" to the corresponding status bit in thermal sensor interrupt status register 422.
2. Write a "1" to the corresponding mask bit in thermal sensor interrupt mask register 424.
Thermal sensor interrupt temperature registers 420 contain the interrupt temperature levels for the sensors located in the SPEs, in the PPE, and adjacent to the linear thermal diode. TMCU 402 compares the encoded interrupt temperature levels in these registers to the corresponding current temperature encodings in thermal sensor current temperature status registers 412. The results of these comparisons generate a thermal management interrupt. The interrupt temperature level of each sensor is independent.
In addition to the independent interrupt temperature levels set in thermal sensor interrupt temperature registers 420, thermal sensor global interrupt temperature register 426 contains a second interrupt temperature level. This level applies to all sensors in the Cell BE chip. TMCU 402 compares the encoded global interrupt temperature level in this register to the current temperature encoding of each sensor. The results of these comparisons generate a thermal management interrupt.
The purpose of the global interrupt temperature is to provide an early indication of a temperature rise in the Cell BE chip. Privileged software and the system controller may use this information to initiate actions that control the temperature, for example, increasing the fan speed or rebalancing the application software across units.
Thermal sensor interrupt status register 422 identifies which sensors meet the interrupt conditions. An interrupt condition is the particular condition associated with each thermal sensor interrupt status register 422 bit that, when met, makes it possible for an interrupt to occur. An actual interrupt is presented to the PPE only if the corresponding mask bit is set.
Thermal sensor interrupt status register 422 contains three sets of status bits: the digital sensor global threshold interrupt status bits (TS_ISR[Gx]), the digital sensor threshold interrupt status bits (TS_ISR[Sx]), and the digital sensor global below-threshold interrupt status bit (TS_ISR[Gb]).
TMCU 402 sets a status bit in thermal sensor interrupt status register 422 (TS_ISR[Sx]) when the temperature encoding for a sensor in thermal sensor current temperature status registers 412 is greater than or equal to the interrupt temperature encoding for the corresponding sensor in thermal sensor interrupt temperature registers 420 and the corresponding direction bit in thermal sensor interrupt mask register 424 is TM_IMR[Bx] = '0'. TMCU 402 also sets TS_ISR[Sx] in thermal sensor interrupt status register 422 when the temperature encoding for a sensor in thermal sensor current temperature status registers 412 is below the interrupt temperature encoding for the corresponding sensor in thermal sensor interrupt temperature registers 420 and the corresponding direction bit in thermal sensor interrupt mask register 424 is TM_IMR[Bx] = '1'.
TMCU 402 sets TS_ISR[Gx] in thermal sensor interrupt status register 422 when the current temperature of any participating sensor is greater than or equal to the temperature in thermal sensor global interrupt temperature register 426 and thermal sensor interrupt mask register 424 TM_IMR[BG] = '0'. The individual TS_ISR[Gx] bits of thermal sensor interrupt status register 422 indicate which individual sensors meet these conditions.
TMCU 402 sets TS_ISR[Gb] in thermal sensor interrupt status register 422 when the current temperatures of all participating sensors, as selected in thermal sensor interrupt mask register 424 TM_IMR[Cx], are below the temperature in thermal sensor global interrupt temperature register 426 and thermal sensor interrupt mask register 424 TM_IMR[BG] = '1'. Because this condition requires the current temperatures of all participating sensors to be below the global interrupt temperature, only one status bit (TS_ISR[Gb]) is present in thermal sensor interrupt status register 422 for the global below-threshold interrupt condition.
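The three status-bit conditions described above can be summarized in a short Python sketch. This is only an illustrative model, not the hardware logic: the function and field names are hypothetical, temperatures are treated as plain integers rather than register encodings, and the handling of the case in which no sensor participates is an assumption, since the text does not address it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InterruptStatus:
    s: List[bool]  # per-sensor threshold conditions, TS_ISR[Sx]
    g: List[bool]  # per-sensor global threshold conditions, TS_ISR[Gx]
    gb: bool       # global below-threshold condition, TS_ISR[Gb]


def evaluate_status(current, interrupt_temp, b, global_temp, c, bg):
    """Return the status conditions met in one sampling pass.

    current, interrupt_temp: per-sensor temperatures (plain integers here)
    b:  per-sensor direction bits (0 = at or above, 1 = below)
    c:  per-sensor participation bits for the global conditions
    bg: global direction bit (0 = any at or above, 1 = all below)
    """
    s = [(t < limit) if direction else (t >= limit)
         for t, limit, direction in zip(current, interrupt_temp, b)]
    participating = [t for t, take_part in zip(current, c) if take_part]
    if bg == 0:
        g = [bool(take_part) and t >= global_temp
             for t, take_part in zip(current, c)]
        gb = False
    else:
        g = [False] * len(current)
        # Assumption: with no participating sensor the condition is not met.
        gb = bool(participating) and all(t < global_temp for t in participating)
    return InterruptStatus(s=s, g=g, gb=gb)
```

The sketch only reports which conditions are met in a single pass; in the register itself a status bit, once set, stays set until software resets it.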
Once a status bit in thermal sensor interrupt status register 422 (TS_ISR[Sx], TS_ISR[Gx], or TS_ISR[Gb]) is set to '1', TMCU 402 maintains this state until privileged software resets the bit to '0'. Privileged software resets a status bit to '0' by writing a '1' to the corresponding bit in thermal sensor interrupt status register 422.
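The sticky, write-one-to-reset behavior of the status bits can be modeled as follows. This is a minimal sketch: the class name, method names, and register width are hypothetical and chosen only for illustration.

```python
class StatusRegister:
    """Sticky status bits that software resets by writing a 1."""

    def __init__(self, width: int) -> None:
        self.mask = (1 << width) - 1  # hypothetical register width
        self.value = 0

    def hardware_set(self, bits: int) -> None:
        # The control unit only sets bits; it never clears them itself.
        self.value |= bits & self.mask

    def software_write(self, bits: int) -> None:
        # A 1 written to a bit position resets that bit to 0;
        # a 0 leaves the bit unchanged.
        self.value &= ~bits & self.mask
```

With this scheme, software can reset exactly the status bits it has handled without disturbing bits that the control unit set in the meantime.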
Thermal sensor interrupt mask register 424 contains two fields for the individual sensors and multiple fields for the global interrupt conditions. An interrupt condition is the particular condition associated with each thermal sensor interrupt status register 422 bit that, when met, makes it possible for an interrupt to occur. An actual interrupt is presented to the PPE only if the corresponding mask bit is set.
The two digital thermal threshold interrupt fields of the thermal sensor interrupt mask register for the individual sensors are TS_IMR[Mx] and TS_IMR[Bx]. Mask bits TS_IMR[Mx] of thermal sensor interrupt mask register 424 prevent an interrupt status bit from generating a thermal management interrupt to the PPE. Direction bits TS_IMR[Bx] of thermal sensor interrupt mask register 424 set the temperature direction of the interrupt condition to be above or below the corresponding temperature in thermal sensor interrupt temperature registers 420. Setting TS_IMR[Bx] to '1' sets the temperature for the interrupt condition to be below the corresponding temperature in thermal sensor interrupt temperature registers 420. Setting TS_IMR[Bx] to '0' sets the temperature for the interrupt condition to be equal to or above the corresponding temperature in thermal sensor interrupt temperature registers 420.
The thermal sensor interrupt mask register 424 fields for the global interrupt conditions are TS_IMR[Cx], TS_IMR[BG], TS_IMR[Cgb], and TS_IMR[A]. Mask bits TS_IMR[Cx] prevent global threshold interrupts and select which sensors participate in the global below-threshold interrupt condition. Direction bit TS_IMR[BG] selects the temperature direction for the global interrupt conditions. Mask bit TS_IMR[Cgb] prevents global below-threshold interrupts. TS_IMR[A] causes an attention signal to be sent to the system controller. An attention is a signal to the system controller indicating that the pervasive logic needs attention or has status for the system controller. The attention can be mapped to an interrupt in the system controller. The system controller is on the system planar and is connected to the Cell Broadband Engine through the SPI port.
Setting TS_IMR[BG] of thermal sensor interrupt mask register 424 to '1' sets the global interrupt condition to occur when the temperatures of all participating sensors, as selected in TS_IMR[Cx], are below the global interrupt temperature level. Setting TS_IMR[BG] to '0' sets the global interrupt condition to occur when the temperature of any participating sensor is greater than or equal to the corresponding temperature in thermal sensor global interrupt temperature register 426. If TS_IMR[A] of thermal sensor interrupt mask register 424 is set to '1', TMCU 402 asserts an attention when any TS_IMR[Cx] bit and its corresponding thermal sensor interrupt status register 422 status bit (TS_ISR[Gx]) are both set to '1'. TMCU 402 also asserts an attention when TS_IMR[Cgb] of thermal sensor interrupt mask register 424 and TS_ISR[Gb] of thermal sensor interrupt status register 422 are both set to '1'.
TMCU 402 presents a thermal management interrupt to the PPE when any thermal sensor interrupt mask register 424 TS_IMR[Mx] bit and its corresponding thermal sensor interrupt status register 422 status bit (TS_ISR[Sx]) are both set to '1'. TMCU 402 generates a thermal management interrupt when any TS_IMR[Cx] bit and its corresponding status bit (TS_ISR[Gx]) are both set to '1'. TMCU 402 also presents a thermal management interrupt to the PPE when TS_IMR[Cgb] of thermal sensor interrupt mask register 424 and TS_ISR[Gb] of thermal sensor interrupt status register 422 are both set to '1'.
The dynamic thermal management registers in throttling logic 418 contain the parameters for controlling the execution throttling of the PPE or an SPE. The dynamic thermal management registers are a set of registers consisting of thermal management control registers 430 (TM_CR1 and TM_CR2), thermal management throttle point register 432 (TM_TPR), thermal management stop time registers 434 (TM_STR1 and TM_STR2), thermal management throttle scale register 436 (TM_TSR), and thermal management system interrupt mask register 438 (TM_SIMR).
Thermal management throttle point register 432 sets the throttle temperature points for the sensors. Two independent throttle temperature points, ThrottlePPE and ThrottleSPE, can be set in thermal management throttle point register 432, one for the PPE and one for the SPEs. This register also contains the temperature points for disabling throttling and for stopping the PPE or SPEs. Execution throttling of the PPE or an SPE starts when the temperature is equal to or above the throttle point. Throttling ceases when the temperature drops below the temperature for disabling throttling (TM_TPR[EndThrottlePPE/EndThrottleSPE]). If the temperature reaches the full throttle, or stop, temperature (TM_TPR[FullThrottlePPE/FullThrottleSPE]), TMCU 402 stops the execution of the PPE or SPE. Thermal management control registers 430 control the throttling behavior.
Thermal management stop time registers 434 and thermal management throttle scale register 436 control the frequency and the amount of throttling. When the temperature reaches the throttle point, TMCU 402 stops the corresponding PPE or SPE for the number of clocks specified by the corresponding stop time value in thermal management stop time registers 434 multiplied by the corresponding scale value in thermal management throttle scale register 436. TMCU 402 then allows the PPE or SPE to run for the number of clocks specified by the run time multiplied by the corresponding scale value, where the run time is the difference between a fixed, implementation-dependent amount and the stop time. The programmable scale value in thermal management throttle scale register 436 is a multiplier of both the stop time and the run time. An example is (Stop × Scale)/(Run × Scale). The percentage of time that the core is stopped remains the same, but the period increases, that is, the frequency decreases. This sequence continues until the temperature falls below the temperature for disabling throttling (TM_TPR[EndThrottlePPE/EndThrottleSPE]).
Thermal management system interrupt mask register 438 selects which PPE interrupts cause TMCU 402 to disable throttling. While such an interrupt is still pending and the mask still selects the pending interrupt, TMCU 402 continues to block throttling. If the mask is deselected or the interrupt is no longer pending, TMCU 402 no longer blocks throttling.
Thermal management control registers 430 set the throttling mode for each PPE or SPE independently. The control bits are split between two registers. The following five modes can be set independently for each PPE or SPE:
Dynamic throttling disabled (including the core stop safety);
Normal operation (dynamic throttling and the core stop safety enabled);
PPE or SPE always throttled (core stop safety enabled);
Core stop safety disabled (dynamic throttling enabled and the core stop safety disabled);
PPE or SPE always throttled and core stop safety disabled.
Privileged software should set the control bits to normal operation for a PPE or SPE that is running applications or an operating system. If the PPE or SPE is not running application code, privileged software should set the control bits to disabled. The "PPE or SPE always throttled" modes are intended for application development. These modes are useful for determining whether an application can run under extreme throttling conditions. The PPE or SPE can execute with dynamic throttling or the core stop safety disabled.
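The five modes and the recommendation for choosing between them can be written down as a small Python sketch. The names below are hypothetical, and the tuples only describe the behavior of each mode as listed above; the actual bit encodings in TM_CR1 and TM_CR2 are not given in this description and are not represented here.

```python
from enum import Enum


class ThrottleMode(Enum):
    # (dynamic throttling enabled, always throttled, core stop safety enabled)
    DISABLED = (False, False, False)
    NORMAL = (True, False, True)
    ALWAYS_THROTTLED = (False, True, True)
    CORE_STOP_SAFETY_DISABLED = (True, False, False)
    ALWAYS_THROTTLED_NO_SAFETY = (False, True, False)

    @property
    def core_stop_safety(self) -> bool:
        return self.value[2]


def recommended_mode(runs_application_code: bool) -> ThrottleMode:
    """Mode suggested by the text for a PPE or SPE."""
    if runs_application_code:
        return ThrottleMode.NORMAL
    return ThrottleMode.DISABLED
```

The two always-throttled modes are left out of `recommended_mode` because the text reserves them for application development.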
Thermal management system interrupt mask register 438 controls which PPE interrupts cause the thermal management logic to temporarily stop throttling the PPE. While an interrupt is pending, TMCU 402 temporarily suspends throttling of both threads, regardless of which thread the interrupt targets. When the interrupt is no longer pending, throttling can resume as long as the throttle conditions still exist. Throttling of the SPEs is never disabled based on a system interrupt condition. The PPE interrupt conditions that can override a throttling condition are as follows:
External
Decrementer
Hypervisor decrementer
System mistake
Heat management
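The interaction between pending PPE interrupts and the system interrupt mask can be sketched as a simple bitwise test. The flag names below mirror the list above but are hypothetical, as are the function names; the bit positions used by TM_SIMR are not given in this description.

```python
from enum import Flag, auto


class PpeInterrupt(Flag):
    NONE = 0
    EXTERNAL = auto()
    DECREMENTER = auto()
    HYPERVISOR_DECREMENTER = auto()
    SYSTEM_ERROR = auto()
    THERMAL_MANAGEMENT = auto()


def ppe_throttling_suspended(pending: PpeInterrupt, mask: PpeInterrupt) -> bool:
    # PPE throttling is suspended while any interrupt selected by the mask
    # is still pending, regardless of the thread it targets.
    return bool(pending & mask)


def spe_throttling_suspended(pending: PpeInterrupt, mask: PpeInterrupt) -> bool:
    # SPE throttling is never disabled by a system interrupt condition.
    return False
```

Because the result depends on both arguments, throttling can resume either when the interrupt is no longer pending or when software deselects it in the mask.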
Thermal management throttle point register 432 contains the encoded temperature points at which execution throttling of the PPE or an SPE begins and ends. This register also contains the encoded temperature points at which execution of the PPE or an SPE is fully throttled.
Software uses the values in the thermal management throttle point register to set three temperature points for changing between the three thermal management states: normal run (N), PPE or SPE throttled (T), and PPE or SPE stopped (S). TMCU 402 supports independent temperature points for the PPE and for the SPEs.
When the encoded current temperature of a sensor in thermal sensor current temperature status registers 412 is equal to or greater than the throttle temperature (ThrottlePPE/ThrottleSPE), execution throttling of the corresponding PPE or SPE begins, if enabled. Execution throttling continues until the encoded current temperature of the corresponding sensor is less than the encoded temperature to end throttling (EndThrottlePPE/EndThrottleSPE). As a safety measure, TMCU 402 stops the corresponding PPE or SPE if the encoded current temperature is equal to or greater than the full throttle point (FullThrottlePPE/FullThrottleSPE).
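The transitions between the three states can be illustrated with a small Python sketch. It assumes that the end throttle point is at or below the throttle point, which in turn is below the full throttle point, and that a core leaving the stopped state remains throttled until the temperature falls below the end throttle point; the text does not state the latter explicitly. The function name and the use of plain integers for temperatures are illustrative only.

```python
def next_state(state, temp, throttle, end_throttle, full_throttle):
    """Return the next thermal management state: "N", "T", or "S"."""
    if temp >= full_throttle:
        return "S"  # core stop safety: stop at or above the full throttle point
    if state == "N":
        # Throttling begins only at or above the throttle point.
        return "T" if temp >= throttle else "N"
    # From "T" or "S": stay throttled until below the end throttle point.
    return "N" if temp < end_throttle else "T"


def trace(temps, throttle, end_throttle, full_throttle):
    state, states = "N", []
    for temp in temps:
        state = next_state(state, temp, throttle, end_throttle, full_throttle)
        states.append(state)
    return states
```

For example, with a throttle point of 70, an end throttle point of 65, and a full throttle point of 78, a reading of 68 leaves a core in state N if it was running normally but keeps it in state T if it was already throttled. This gap between the two points is the hysteresis in the throttling.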
Thermal management stop time registers 434 control the amount of throttling applied to a particular PPE or SPE in the thermal management throttled state. The values in thermal management stop time registers 434 are set by software and express the amount of time the core is stopped relative to the amount of time the core is allowed to run (stop/run), that is, the percentage of time the core is stopped. Thermal management throttle scale register 436 controls the actual number of clocks (NClks) for which the PPE or SPE stops and runs.
Thermal management throttle scale register 436 controls the actual number of cycles for which the PPE or an SPE stops and runs during the thermal management throttled state. The values in this register are multiples of the configuration ring setting TM_Config[MinStopSPE]. The following equations give the actual number of stop and run cycles:
SPE run and stop times:
SPE_StopTime = (TM_STR1[StopCore(x)] * TM_Config[MinStopSPE]) * TM_TSR[ScaleSPE]
SPE_RunTime = ((32 - TM_STR1[StopCore(x)]) * TM_Config[MinStopSPE]) * TM_TSR[ScaleSPE]
PPE run and stop times:
PPE_StopTime = (TM_STR2[StopCore(8)] * TM_Config[MinStopPPE]) * TM_TSR[ScalePPE]
PPE_RunTime = ((32 - TM_STR2[StopCore(8)]) * TM_Config[MinStopPPE]) * TM_TSR[ScalePPE]
The run and stop times can be altered by interrupts and by privileged software writing to the various thermal management registers.
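The stop and run time equations can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the arguments stand for the StopCore, MinStop, and Scale field values, and the accepted range of 0 to 32 for the stop value is inferred from the constant 32 in the equations rather than stated in the text.

```python
def throttle_times(stop_core: int, min_stop: int, scale: int):
    """Return (stop_time, run_time) in cycles for one throttling period."""
    if not 0 <= stop_core <= 32:  # assumed range, inferred from the equations
        raise ValueError("stop_core must be between 0 and 32")
    stop_time = (stop_core * min_stop) * scale
    run_time = ((32 - stop_core) * min_stop) * scale
    return stop_time, run_time


def stopped_fraction(stop_time: int, run_time: int) -> float:
    """Fraction of each period during which the core is stopped."""
    return stop_time / (stop_time + run_time)
```

With illustrative values of 8 for the stop field and 4 for the minimum stop setting, a scale of 1 gives 32 stop cycles and 96 run cycles, and a scale of 3 gives 96 and 288. The stopped fraction is 0.25 in both cases, so only the length of the period changes.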
Performance monitor 466 may provide on-chip performance monitoring that is able to track the thermal data provided by temperature sensing devices such as DTSs 404, 406, 408, and 410. The thermal data may be stored in memory 470, written to off-chip device 472, such as main memory 208 of Fig. 2, or written to off-chip I/O device 474, such as south bridge and input/output (I/O) controller hub (ICH) 204 of Fig. 2. Controller 468, located in performance monitor 466, determines where the thermal data is sent.
Although the following description is directed to one instruction stream and one processor, the instruction stream may be a set of instruction streams and the processor may be a set of processors. That is, a set may be a single instruction stream and a single processor, or two or more instruction streams and processors.
With the architecture described above, many improvements and additional programmability have been made to the thermal management and thermal throttling of the Cell BE chip. Some of these improvements and added programmability enable key features, and others enhance usability.
Fig. 6 depicts a flowchart of the operation for recording a maximum temperature in accordance with an illustrative embodiment. As the operation begins, a computer system that includes a Cell BE chip, such as Cell BE chip 300 of Fig. 3, is started or reset (step 602). As described previously, the Cell BE chip includes a thermal management system provided by pervasive logic unit 351 of Fig. 3. For each DTS, such as DTSs 404, 406, 408, and 410 of Fig. 4, the thermal management system includes a set of maximum temperature status registers and a set of current temperature status registers, such as maximum temperature status registers 414 and current temperature status registers 412 of Fig. 4. A current temperature status register stores the current temperature of its target DTS as of the last time the thermal management control state machine, such as thermal management control state machine 458 of Fig. 4, sensed the DTS. A maximum temperature status register stores the maximum temperature of its target DTS since the computer system last read the maximum temperature status register or since the computer system was reset. The maximum temperature status registers may be read by any number of devices, such as a processor or an integrated circuit, or by a device using a Serial Peripheral Interface (SPI) port or a Joint Test Action Group (JTAG) port. However, reading the registers through the JTAG port does not cause a reset.
For purposes of illustration, the following discussion is limited to one DTS. After the computer system is started or reset (step 602), the maximum temperature is zero. Once the thermal management control state machine senses the temperature of the DTS, it sends the sensed temperature to a comparator, such as comparator 454 of Fig. 4 (step 604). The comparator compares the sensed temperature to the current maximum temperature stored in the maximum temperature status register for that DTS (step 606). If at step 606 the sensed temperature is greater than the current maximum temperature stored in the maximum temperature status register, the sensed temperature becomes the new maximum temperature and the thermal management control state machine records the new maximum temperature in the maximum temperature status register (step 608). That is, the thermal management control state machine overwrites the current maximum temperature stored in the maximum temperature status register. If at step 606 the sensed temperature is less than or equal to the current maximum temperature stored in the maximum temperature status register, the maximum temperature status register retains its existing maximum temperature (step 610).
The current maximum temperature remains in the maximum temperature status register until the computer system reads the register by issuing a read request (step 612) or until the computer system resets. If the current maximum temperature has not been read, operation returns to step 604. If at step 612 the computer system reads the current maximum temperature, the heat management control state machine resets the current maximum temperature to the current temperature held in the current temperature status register (step 614), and operation then returns to step 604.
As an example of this operation, if the DTS of a particular unit, such as a processor core or the processor itself, senses temperatures of 67 °C, 70 °C, 75 °C, 72 °C and 74 °C over a period of time, the maximum temperature in the maximum temperature status register will be 75 °C. If the computer system issues a read request after the fourth sensing of the DTS, the maximum temperature returned will be 75 °C. At that point, however, the heat management control state machine resets the maximum temperature to the current temperature (72 °C), so after the last sensing performed by the DTS the maximum temperature in the maximum temperature status register will be 74 °C.
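The read-and-reset behavior in the example above can be modeled in a few lines of software. The following is only a sketch: the class name `MaxTemperatureRegister` and its methods are hypothetical, introduced for illustration, and model one DTS's register pair rather than the hardware itself.

```python
class MaxTemperatureRegister:
    """Hypothetical software model of one DTS's current/maximum register pair."""

    def __init__(self):
        self.current = 0  # current temperature status register
        self.maximum = 0  # maximum temperature status register (zero after reset)

    def sense(self, temperature):
        """Record a newly sensed temperature (steps 604 to 610)."""
        self.current = temperature
        if temperature > self.maximum:
            self.maximum = temperature  # step 608: overwrite the stored maximum
        # otherwise step 610: the stored maximum is retained

    def read_maximum(self):
        """Return the maximum, then reset it to the current temperature (steps 612 to 614)."""
        value = self.maximum
        self.maximum = self.current
        return value


reg = MaxTemperatureRegister()
for t in (67, 70, 75, 72):
    reg.sense(t)
first_read = reg.read_maximum()  # read request after the fourth sensing
reg.sense(74)                    # last sensing in the example
```

Here `first_read` holds the 75 °C maximum, and after the final sensing `reg.maximum` holds 74 °C, matching the sequence described in the text.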
Thus, the purpose of the maximum temperature status register is to log the maximum temperature the DTS has reached since the register was last read. This information helps the operating system determine the maximum temperature the DTS reached during execution of an application or program without continuously polling the current temperature register. Continuous polling affects system performance and may therefore affect the maximum temperature itself. Moreover, polling the current temperature cannot guarantee that the maximum temperature is ever read, since the maximum may occur between two reads of the current temperature.
Fig. 7 depicts a flowchart of an operation for tracing thermal data through performance monitoring according to another illustrative embodiment. As previously mentioned, the Cell BE chip includes a heat management system provided by pervasive logic block 351 of Fig. 3. Performance monitoring can be provided by a performance monitor, such as performance monitor 466 of Fig. 4. The performance monitor can trace thermal data provided by temperature sensing devices, such as DTS 404, 406, 408 and 410 of Fig. 4, into its internal memory, such as memory 470 of Fig. 4; write the data to main memory, such as main memory 208 of Fig. 2 or off-chip device 472 of Fig. 4; or write the data to an I/O device, such as south bridge and input/output (I/O) controller hub (ICH) 204 of Fig. 2 or off-chip I/O device 474 of Fig. 4.
Performance monitoring supports two main trace modes: tracing for a fixed period of time and continuous tracing. A trace of thermal behavior can be a trace such as trace 500 of Fig. 5. The performance monitor can also specify a sampling frequency, which controls the time between two consecutive samples. In addition, the thermal information can be compressed to lengthen the sampling interval. One compression technique is to store thermal information only when it changes. A count of the number of identical thermal samples can also be stored with the thermal information. Because thermal information typically changes slowly, this is a useful technique.
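The compression technique described above, storing a sample only when it changes together with a count of identical samples, is a form of run-length encoding. A minimal sketch follows; the function name `compress_trace` and the `[value, count]` layout are assumptions made for illustration and are not taken from the patent.

```python
def compress_trace(samples):
    """Collapse consecutive identical thermal samples into [value, count] pairs."""
    runs = []
    for value in samples:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same reading as before: bump the count
        else:
            runs.append([value, 1])   # reading changed: start a new entry
    return runs


trace = [70, 70, 70, 71, 71, 70]
compressed = compress_trace(trace)    # three entries instead of six samples
```

Because the counts are kept, the original number of samples (and hence the timing of each change) can still be recovered from the compressed trace.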
As the operation for tracing thermal data through the performance monitor begins, the heat management control state machine, such as heat management control state machine 458 of Fig. 4, sets the performance monitor to trace mode (step 702). The following illustrative discussion is limited to one DTS. The heat management control state machine senses the temperature of the DTS (step 704) and sends the sensed temperature to the current temperature status register and/or other data structures for storage (step 706). The heat management control state machine then determines whether the performance monitor is still running (step 708). Once started at step 702, the performance monitor runs for a user-specified period of time or runs until the user stops it through user input. However, the performance monitor can also stop based on a specific thermal condition. Such a condition is called a trigger, analogous to a logic analyzer looking for a specified condition on a set of signals. Triggers are very useful in software debugging. For example, the user can set the performance monitor to stop, or to checkstop the system, when a thermal condition is reached. This lets the user determine exactly which piece of code or combination of code is causing the thermal condition. If at step 708 the performance monitor is still running, operation returns to step 704.
Returning to step 708, if the performance monitor is no longer running, the heat management control state machine reads the temperature information held in storage and displays the stored information to the user in graphical form (step 710), after which the operation ends. The sensed temperatures sent to the current temperature status register and/or other data structures at step 706 can also be displayed (step 710) while the operation is still in progress, as indicated by arrow 712, rather than waiting for the trace to end.
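The trace loop of steps 704 to 708, including the optional trigger, might be sketched as follows. This is an illustrative model only: `run_trace`, its parameters and the returned stop reasons are hypothetical names, and a fixed sample budget stands in for the user-specified time period.

```python
def run_trace(readings, max_samples=None, trigger=None):
    """Collect sensed temperatures until the period ends or a trigger fires.

    readings    -- iterable of sensed temperatures (stand-in for the DTS)
    max_samples -- assumed stand-in for a user-specified trace period
    trigger     -- optional predicate describing a thermal condition
    """
    trace = []
    for temperature in readings:
        if max_samples is not None and len(trace) >= max_samples:
            return trace, "period elapsed"
        trace.append(temperature)                 # steps 704 and 706
        if trigger is not None and trigger(temperature):
            return trace, "trigger"               # stop on the thermal condition
    return trace, "input exhausted"


# Stop as soon as a reading of 85 or more is seen.
trace, reason = run_trace([70, 72, 85, 90], trigger=lambda t: t >= 85)
```

Stopping at the trigger leaves the offending sample as the last entry in the trace, which is what lets a user correlate the thermal condition with the code that was running at that moment.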
Thus, the performance monitor traces the thermal data provided by the DTS. Tracing thermal data automatically eliminates the need for software to continuously poll the current temperature register. Performance monitoring is important for collecting the thermal data of a workload because it does not require inserting extra code to poll the thermal data, and such inserted code could change the behavior of the workload. In other words, performance monitoring provides a non-invasive way to trace the thermal characteristics of a software application in real time. A further benefit of sending thermal information to the performance monitor is that the logging of thermal information can be triggered or stopped on a pre-specified thermal condition. In addition, the performance monitor can be used to halt (or checkstop) the system when a thermal condition is met. This lets the user determine which code segment or combination of code segments is producing the thermal condition. The user can then rewrite the code segment or avoid the specific combination, thereby avoiding the thermal event.
Fig. 8A and Fig. 8B depict flowcharts of operations for advanced thermal interrupt generation according to other illustrative embodiments. As previously mentioned, the Cell BE chip includes a heat management system provided by pervasive logic block 351 of Fig. 3. Advanced thermal interrupt generation is another feature that helps the operating system handle thermal events. The advanced thermal interrupt logic is part of the heat management control unit, such as TMCU 402 of Fig. 4. When a thermal condition occurs (that is, the chip temperature rises above a certain threshold), a thermal interrupt alerts the operating system. In this case, the operating system should take corrective action to reduce the chip temperature. The corrective action can be handled by a software interrupt handler, which is code that handles the thermal condition and initiates the corrective action. The operating system then waits for the thermal condition to subside before continuing normal operation. This usually requires the operating system to wait for a specific amount of time and then poll the temperature of the processor to determine whether it is safe to continue normal operation. With advanced thermal interrupt generation, the operating system can set an interrupt to detect when the temperature drops below a certain threshold, eliminating the need to poll the current temperature register. The combination of thermal sensor interrupt mask register 424 (TS_IMR) and thermal sensor interrupt status register 422 (TS_ISR) of Fig. 4 makes it easier for the operating system to handle thermal events.
Advanced thermal interrupt generation can be performed at a local level or a global level. That is, it can be performed individually on a specific DTS (local) or on all DTSs (global), such as DTS 404, 406, 408 and 410 of Fig. 4. The direction bits of the thermal sensor interrupt mask register are B_G and B_X. The interrupt direction defines the condition under which an interrupt is generated. An interrupt can be generated when the temperature changes from below the interrupt temperature to equal to or above it, or when the temperature changes from greater than or equal to the interrupt temperature to below it. The heat management control state machine uses the direction bits B_G and B_X in the interrupt mask register to identify the condition. B_G is the global direction bit. When B_G is set to '0', the heat management control state machine generates an interrupt when the temperature of any DTS is greater than or equal to the global interrupt temperature. When B_G is set to '1', the heat management control state machine generates an interrupt when the temperatures of all DTSs are below the global interrupt temperature. B_X is the local direction bit, where X is the number of the individually associated DTS. When B_X is set to '0', the heat management control state machine generates an interrupt when the temperature of the DTS is greater than or equal to that DTS's interrupt temperature. When B_X is set to '1', the heat management control state machine generates an interrupt when the temperature of the DTS is below that DTS's interrupt temperature. The thermal sensor interrupt status register (TS_ISR) records which sensor caused the advanced thermal interrupt. Software reads this register to determine which condition occurred and which sensor or sensors caused the interrupt. Once the register has been read by software, the heat management control state machine resets the status bits in the thermal sensor interrupt status register.
The operation of advanced thermal interrupt generation can therefore be described from both a global and a local perspective. Fig. 8A depicts global advanced thermal interrupt generation, and Fig. 8B depicts local advanced thermal interrupt generation. As the global operation of Fig. 8A begins, the heat management control state machine sets the global interrupt temperature T to temperature T1 and sets the global interrupt direction B_G to '0' (step 802). The heat management control state machine senses the temperatures of the DTSs (step 804). The heat management control state machine determines whether any temperature sensed from the DTSs is greater than or equal to temperature T1 (step 806). If no sensed temperature is greater than or equal to temperature T1, operation returns to step 804. If at step 806 any sensed temperature is greater than or equal to temperature T1, the heat management control state machine generates an interrupt and sets the corresponding status bits in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt (step 808). The operating system then services the interrupt and can slow the workload on the processor or offload part of the processor's workload to another processor in the system.
After the interrupt is generated, the heat management control state machine sets the global interrupt temperature T to temperature T2 and sets the global interrupt direction B_G to '1' (step 810). Temperature T2 should be set less than or equal to temperature T1. The heat management control state machine senses the temperatures of the DTSs again (step 812). The heat management control state machine determines whether all temperatures sensed from the DTSs are below temperature T2 (step 814). If not all sensed temperatures are below temperature T2, operation returns to step 812. If at step 814 all sensed temperatures are below temperature T2, the heat management control state machine generates an interrupt and sets the corresponding status bits in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt (step 816). At this point, it is safe for the operating system to continue normal operation. The operating system then services the interrupt and restores the system to normal operation. Next, operation returns to step 802, where the global interrupt temperature T is set to temperature T1 and the global interrupt direction B_G is set to '0'.
As an example of this operation, suppose all DTSs have a global interrupt temperature of 80 °C and a global interrupt direction of '0'. As soon as any DTS of an associated unit, such as a processor core or the processor itself, senses a temperature greater than or equal to 80 °C, the heat management control state machine generates an interrupt and sets the corresponding status bits in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt. The operating system then services the interrupt and can slow the workload on the processor or offload part of the processor's workload to another processor in the system. At this point the heat management control state machine can also reset the global interrupt temperature to an exemplary 77 °C and set the global interrupt direction to '1'. The workload continues to run in the slowed mode, or remains off the processor, until all DTSs sense temperatures below 77 °C. Once the heat management control state machine determines that the sensed temperatures are below 77 °C, it generates another interrupt. The heat management control state machine sets the global interrupt temperature to 80 °C and the global interrupt direction to '0', and the operating system then resumes normal operation of the workload.
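The 80 °C and 77 °C example above amounts to a two-threshold hysteresis loop. The sketch below models it in software under stated assumptions: the function name `global_interrupts`, the event labels, and the sample data are all hypothetical, and each element of `samples` is taken to hold one sensing pass over every DTS.

```python
def global_interrupts(samples, t_high=80, t_low=77):
    """Return (sample_index, event) for every interrupt the model generates.

    direction 0: interrupt when ANY sensed temperature >= interrupt temperature
    direction 1: interrupt when ALL sensed temperatures <  interrupt temperature
    """
    events = []
    interrupt_temp, direction = t_high, 0               # step 802
    for i, temps in enumerate(samples):
        if direction == 0 and any(t >= interrupt_temp for t in temps):
            events.append((i, "hot"))                   # step 808
            interrupt_temp, direction = t_low, 1        # step 810
        elif direction == 1 and all(t < interrupt_temp for t in temps):
            events.append((i, "cool"))                  # step 816
            interrupt_temp, direction = t_high, 0       # back to step 802
    return events


# Two sensors, six sensing passes (illustrative data only).
samples = [(70, 75), (81, 76), (79, 76), (78, 70), (76, 70), (79, 75)]
events = global_interrupts(samples)
```

With this data the model raises one interrupt when a sensor reaches 81 °C and one when both sensors have fallen below 77 °C; the intermediate readings of 79 °C and 78 °C generate nothing, which is the point of keeping the second threshold below the first.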
Turning to Fig. 8B, the illustrative embodiment is limited to one DTS, but it is identical for each DTS. As the operation for local advanced thermal interrupt generation begins, the heat management control state machine sets the local interrupt temperature T to temperature T3 and sets the local interrupt direction B_X to '0' (step 852). The heat management control state machine senses the temperature of the DTS (step 854). The heat management control state machine determines whether the temperature sensed from the DTS is greater than or equal to temperature T3 (step 856). If the sensed temperature is not greater than or equal to temperature T3, operation returns to step 854. If the sensed temperature is greater than or equal to temperature T3, the heat management control state machine generates an interrupt and sets the corresponding status bit in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt (step 858). The operating system then services the interrupt and can slow the workload on the processor, or offload part of the processor's workload to other units in the processor or to another processor in the system.
After generating the interrupt, the heat management control state machine sets the local interrupt temperature T to temperature T4 and sets the local interrupt direction B_X to '1' (step 860). Temperature T4 should be set less than or equal to temperature T3. The heat management control state machine senses the temperature of the DTS again (step 862). The heat management control state machine determines whether the temperature sensed from the DTS is below temperature T4 (step 864). If the sensed temperature is not below temperature T4, operation returns to step 862. If the sensed temperature is below temperature T4, the heat management control state machine generates an interrupt and sets the corresponding status bit in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt (step 866). At this point, it is safe for the operating system to continue normal operation. The operating system then services the interrupt and restores the system to normal operation. Next, operation returns to step 852, where the heat management control state machine sets the local interrupt temperature T to temperature T3 and the local interrupt direction B_X to '0'.
As an example of this operation, suppose a given DTS has a local interrupt temperature of 80 °C and a local interrupt direction of '0'. As soon as the DTS of the associated unit senses a temperature greater than or equal to 80 °C, the heat management control state machine generates an interrupt and sets the corresponding status bit in the thermal sensor interrupt status register to record which sensor or sensors caused the interrupt. The operating system then services the interrupt and can slow the workload on the processor or offload part of the processor's workload to another processor in the system. At this point the heat management control state machine can also reset the local interrupt temperature to an exemplary 77 °C and set the local interrupt direction to '1'. The workload continues to run in the slowed mode, or remains off the processor unit, until the DTS senses a temperature below 77 °C. Once the heat management control state machine determines that the sensed temperature is below 77 °C, it generates another interrupt. The heat management control state machine sets the local interrupt temperature to 80 °C and the local interrupt direction to '0', and the operating system then resumes normal operation of the workload.
Thus, advanced thermal interrupt generation lets the operating system program interrupt generation to follow the direction of temperature change, and eliminates the need for the interrupt handler to continuously poll the current temperature when a thermal interrupt occurs.
Fig. 9 depicts a flowchart of an operation for supporting deep power-saving modes and partial good in a heat management system according to other illustrative embodiments. As previously mentioned, the Cell BE chip includes a heat management system provided by pervasive logic block 351 of Fig. 3. Several power-saving modes exist in Cell BE chip 300 of Fig. 3. Depending on how each power-saving mode is implemented, some of them can limit the accessibility of a DTS, such as DTS 404, 406, 408 and 410 of Fig. 4. For example, if an SPU, such as SPU 310, 311 and 312 of Fig. 3, is in a power-saving mode in which the clocks are turned off, that is, a deserializer such as deserializer 462 of Fig. 4 is disabled, then the path between a serializer such as serializer 456 of Fig. 4 and a DTS such as DTS 404 of Fig. 4 does not function. Another example of a power-saving mode is the case in which power is turned off. In this case, the DTS itself may be disabled. Another example is the case in which the heat management control state machine determines whether a sensor or unit in the processor failed during manufacturing test. If the sensor or unit is not needed, the manufacturer can mark it as defective, producing a partial-good processor in which only a limited number of units or sensors function. In either case, a heat management control state machine, such as heat management control state machine 458 of Fig. 4, needs to monitor the state of these power modes and mask any non-functioning DTS so that it cannot participate in heat management tasks (regulation, interrupts and so on).
Returning to Fig. 9, the flowchart depicts an operation for supporting deep power-saving modes and partial good in thermal sensing and heat management systems. As the operation begins, the heat management control state machine uses data from each DTS to track the state of that DTS (step 902). The heat management control state machine stores this data in internal calibration storage, such as internal calibration storage 428 of Fig. 4. As previously mentioned, a power-saving mode, a defective DTS, or an SPU that communicates with the heat management control state machine through a data stream, such as data stream 460 of Fig. 4, can disable the operation of a particular DTS. The effect of a partial-good condition reported by the manufacturing process is similar to that of a power-saving mode, except that partial good is a permanent condition and the DTS should be masked permanently. When an SPU is marked as defective, the heat management control state machine shuts off the entire SPU and disables the serializer. When a DTS is marked as defective, the heat management control state machine masks that DTS. The heat management control state machine determines whether the DTS or SPU is defective or functioning (step 904). If the DTS or SPU is defective, the heat management control state machine masks the DTS (step 906), after which the operation ends.
To mask a DTS that is in a power management state, the heat management control state machine resets the associated register among the current temperature status registers, such as current temperature status register 412 of Fig. 4, to 0x0, which is the lowest temperature setting. An alternative is to assign an encoding in the associated current temperature status register, by setting a status bit, that indicates the DTS is masked; this can be more precise than merely resetting the sensor reading. The heat management control state machine then stops communication to and from the DTS for the current temperature status register. Stopping communication is an optional step, intended mainly to save power by not performing useless overhead work. The heat management control state machine then generates a signal indicating that the DTS is now masked and should not participate in heat management tasks. Finally, the heat management control state machine resets the state of the DTS. When the unit associated with the DTS, such as a processor core or the processor itself, exits the power-saving mode, the heat management control state machine resumes communication with the DTS, resumes updating the current temperature status register, and signals that the DTS can participate in heat management tasks.
Returning to step 904, if the DTS and SPU are functioning, the heat management control state machine begins communicating with the DTS (step 908). The heat management control state machine monitors the power management state of the SPU to determine when the SPU enters a power-saving mode (step 910). Until the SPU enters a power-saving mode, operation returns to step 908. If the SPU enters a power-saving mode and the DTS is disabled, the heat management control state machine masks the DTS in the manner discussed above in connection with step 906 (step 912). Whether the DTS is indicated as disabled or functioning, the heat management control state machine continues to monitor the power management state of the SPU (step 914). Until the SPU exits the power-saving mode, operation returns to step 912. When the SPU exits the power-saving mode and the DTS is no longer disabled, the heat management control state machine begins communicating with the DTS, resumes updating the current temperature status register, and signals that the DTS can participate in heat management tasks (step 916); operation then returns to step 908.
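The masking behavior described above can be illustrated with a small software model. This is a sketch under assumptions, not the hardware design: `SensorSlot`, `participating_temperatures` and the boolean flags are hypothetical names, and only the 0x0 reset value comes from the text.

```python
MASKED_READING = 0x0  # lowest temperature setting, written when a DTS is masked


class SensorSlot:
    """Hypothetical model of one DTS and its current temperature register."""

    def __init__(self, defective=False):
        self.defective = defective    # permanent partial-good marking
        self.power_saving = False     # temporary: associated unit is powered down
        self.current = MASKED_READING

    @property
    def masked(self):
        return self.defective or self.power_saving

    def update(self, sensed):
        # A masked DTS keeps the reset value instead of a sensed temperature.
        self.current = MASKED_READING if self.masked else sensed


def participating_temperatures(slots):
    """Readings that may take part in regulation and interrupt decisions."""
    return [slot.current for slot in slots if not slot.masked]


slots = [SensorSlot(), SensorSlot(defective=True), SensorSlot()]
for slot, reading in zip(slots, (70, 90, 72)):
    slot.update(reading)
active = participating_temperatures(slots)  # the defective sensor is excluded
```

Filtering on an explicit masked flag, rather than relying only on the 0x0 reading, mirrors the alternative mentioned in the text of using a status bit to mark a masked DTS.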
Thus, masking the temperature readings of a DTS that is partial good, defective, or in a power-saving mode isolates the non-functioning or disabled DTS so that it cannot participate in heat management tasks.
Fig. 10 depicts a flowchart of an operation, according to other illustrative embodiments, for a thermal regulation control feature that is independent of temperature and that enables real-time testing of thermally aware software applications. As previously mentioned, the Cell BE chip includes a heat management system provided by pervasive logic block 351 of Fig. 3. A heat management control register, such as heat management control register 430 of Fig. 4, provides access to and configuration of the various thermal regulation control features. Thermal regulation is designed to lower the temperature by reducing performance when a thermal event occurs.
The heat management stand-by time register, such as heat management stand-by time register 434 of Fig. 4, and the heat management regulation ratio register, such as heat management regulation ratio register 436 of Fig. 4, together set the amount of regulation and the regulation behavior. In a real-time system, real-time deadlines must be guaranteed. It is important for software developers and quality assurance teams to know and test the maximum amount of regulation, that is, the maximum settings of the heat management stand-by time register and the heat management regulation ratio register that a program or code segment can tolerate while still guaranteeing the real-time deadlines of the real-time system. As an alternative to actually raising the temperature of the hardware to cause a thermal event and thereby trigger a regulation condition, the heat management control state machine provides a mode that always regulates regardless of temperature. The heat management control state machine sets this mode in the heat management control register, which places the chip in a constant regulation state. This feature helps software developers test and verify that their code meets real-time standards.
As the operation begins, the heat control settings for the heat management stand-by time register and the heat management regulation ratio register are received (step 1002). The heat management control state machine uses the settings of these two registers to determine how regulation is carried out. The heat management control state machine then sets a test mode and sets the heat management control register to the always-regulate setting (step 1004). The program is then run for real-time validation, that is, to check whether the software or program meets its real-time deadlines under the heat control settings of the heat management stand-by time register and the heat management regulation ratio register (step 1006). The test mode can be any type of regulation mode, such as always regulating or regulating at random. The heat management control state machine then determines whether the real-time deadlines are met (step 1008). If the real-time deadlines are not met, the heat management control state machine records the current heat control settings of the heat management stand-by time register and the heat management regulation ratio register as a failure (step 1010). The heat management control state machine then determines whether there are any new heat control settings for the two registers that would reduce the amount of regulation (step 1012). If there are new settings, operation returns to step 1002. If at step 1012 there are no new settings, the operation ends.
Returning to step 1008, if the real-time deadlines are met, the heat management control state machine records the current heat control settings of the heat management stand-by time register and the heat management regulation ratio register as a pass (step 1014). The heat management control state machine determines whether there are any new heat control settings for the two registers that would increase the amount of regulation (step 1016). If there are new settings, operation returns to step 1002. If at step 1016 there are no new settings, the operation ends.
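The search for the maximum tolerated amount of regulation can be pictured as a simple sweep. The sketch below is a simplification under explicit assumptions: a single number stands in for the combined stand-by time and ratio settings, candidate settings are tried from least to most regulation, and `meets_deadlines` is a hypothetical callback that runs the program in the always-regulate mode and reports whether its real-time deadlines were met.

```python
def find_max_tolerated_setting(settings, meets_deadlines):
    """Sweep regulation settings and record a pass or fail for each one tried.

    settings        -- candidates ordered from least to most regulation
    meets_deadlines -- hypothetical test run: True if real-time deadlines are met
    """
    log = {}
    best = None
    for setting in settings:
        passed = meets_deadlines(setting)             # steps 1004 to 1008
        log[setting] = "pass" if passed else "fail"   # step 1014 or step 1010
        if not passed:
            break   # assumes heavier regulation would also miss its deadlines
        best = setting
    return best, log


# Illustrative stand-in: the program tolerates being regulated up to 50% of the time.
best, log = find_max_tolerated_setting([10, 25, 50, 75], lambda pct: pct <= 50)
```

The flowchart itself chooses the next setting adaptively (less regulation after a failure, more after a pass); the ordered sweep here is a substitute that reaches the same answer when tolerance decreases monotonically with the amount of regulation.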
Thus, providing an always-regulate operating mode helps software developers test and verify that their code can still meet its real-time deadlines under worst-case thermal conditions. Software developers and quality assurance teams can use this feature to determine the maximum amount of regulation that a program or code segment can tolerate while still guaranteeing the real-time deadlines of the real-time system. Once the heat management control state machine has determined and verified the maximum amount of regulation, software can set an interrupt to occur if full regulation takes place. If the heat management control state machine generates this interrupt, it notifies the application that a condition exists in which the real-time guarantees are violated or not met.
In addition to the always-regulate control setting, an implementation can also provide a mode that injects random thermal events or directed random thermal events, to simulate a more realistic interaction between regulation and the executing software. This technique is similar to injecting random errors on a bus to test error recovery code.
Fig. 11 depicts a flowchart of an operation for implementing thermal regulation control that minimizes the impact on interrupt latency according to other illustrative embodiments. As previously mentioned, the Cell BE chip includes a heat management system provided by pervasive logic block 351 of Fig. 3. When any part of a computer system is placed under a regulation condition, that condition can reduce the performance of the whole system. The reduced performance increases interrupt latency, both in how long it takes before an interrupt is serviced and in how long servicing the interrupt takes. Because increased interrupt latency can seriously affect the system as a whole, it is both desirable and necessary to minimize the effect of thermal regulation on interrupt latency. Minimizing the effect of thermal regulation on interrupt latency is a feature of the regulation control for a PPU, such as PPU 308 of Fig. 3. SPUs, such as SPU 310, 311 and 312 of Fig. 3, do not receive interrupts and are therefore not affected by this feature.
As the operation begins, the heat management control state machine, such as heat management control state machine 458 of Fig. 4, monitors all PPU interrupt status bits and the heat management system interrupt mask register, such as heat management system interrupt mask register 438 of Fig. 4 (step 1102). The heat management system interrupt mask register controls the masking of interrupts. The heat management control state machine determines whether there are any unmasked pending interrupts (step 1104). If there are no pending interrupts, or if the pending interrupts are masked, operation returns to step 1102.
If in step 1104 unscreened unsettled interruption is arranged, then the heat management control state machine is temporary transient forbids any adjusting pattern, no matter be that part is regulated or comprehensive adjustment state (step 1106).Forbidding adjusting pattern makes PPU temporarily move and to handle any unsettled interruption with full performance under the situation of any delay that does not have the thermal conditioning effect to cause.Equally, all PPU interruption status and the heat management system interrupt mask register (step 1108) of heat management state of a control machine monitoring.The heat management control state machine has determined whether unscreened unsettled interruption (step 1110) arbitrarily.If do not have unsettled interruption or but unsettled interruption conductively-closed has been arranged, then operation turns back to step 1108.When step 1110 interruption status is removed, the heat management control state machine returns to initial adjustment pattern (step 1112) with PPU, and operation turns back to step 1102.
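The Figure 11 loop can be modelled as a small software state machine. The sketch below is a simplified illustration under stated assumptions: the class name `InterruptAwareThrottle` is hypothetical, the registers are plain integers, and a set bit in `enable_mask` is assumed to mean that the corresponding interrupt is allowed to suspend throttling (the actual mask polarity is not given here).

```python
def unmasked_pending(status_bits: int, enable_mask: int) -> bool:
    """True if any pending interrupt is not masked out.

    Assumption: a 1 bit in enable_mask lets that interrupt suspend throttling.
    """
    return (status_bits & enable_mask) != 0


class InterruptAwareThrottle:
    """Hypothetical model of steps 1102 to 1112 for one PPU."""

    def __init__(self, throttled: bool):
        self.throttled = throttled      # throttling currently applied?
        self.saved_mode = throttled     # mode to restore after the interrupt
        self.suspended = False          # True while an interrupt holds off throttling

    def step(self, status_bits: int, enable_mask: int) -> bool:
        """Evaluate one monitoring pass and return whether the PPU is throttled."""
        if unmasked_pending(status_bits, enable_mask):
            if not self.suspended:
                # Step 1106: remember the mode, then run at full performance.
                self.saved_mode = self.throttled
                self.throttled = False
                self.suspended = True
        elif self.suspended:
            # Step 1112: interrupt status cleared, restore the original mode.
            self.throttled = self.saved_mode
            self.suspended = False
        return self.throttled
```

A masked interrupt leaves the throttling state untouched, while an unmasked one lifts throttling only for as long as its status bit stays set.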
The interrupt handler may choose to clear the interrupt status bits at either the beginning or the end of the interrupt handler routine. The interrupt handler may reside in a Power processor unit, such as Power processor unit 301 of Fig. 3, or in software executed by the Power processor unit. If the interrupt handler clears the interrupt status bits at the beginning and wants to avoid any PPU performance degradation, it can disable thermal throttling before clearing the interrupt status bits. This is needed because the interrupt itself does not change the control registers: throttling remains enabled and is merely suspended by the thermal management control unit, such as TMCU 402 of Fig. 4, while an unmasked interrupt is present. If the interrupt handler is to reset the interrupt status before servicing the interrupt, the handler should set the control register to disable throttling (or to reduce the amount of throttling to an acceptable level), reset the interrupt, service the interrupt, and then re-enable throttling or set the amount of throttling back to its previous level. As an example, thermal throttling can be disabled by setting a thermal management control register, such as thermal management control register 430 of Fig. 4, to 0XX, where X is a don't-care bit. At the end of the interrupt routine, the interrupt handler should set the thermal management control register back to its original value. If the interrupt handler clears the interrupt status bits at the end of the interrupt routine, no extra work is needed: as long as an interrupt status bit is active, the thermal management control state machine keeps the PPU out of throttling mode.
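For a handler that clears the interrupt status at the beginning, the save, disable, service and restore sequence might look like the following sketch. Everything here is illustrative: `FakeRegisters` stands in for the real thermal management registers, and `THROTTLE_DISABLED = 0b000` is just one assumed value matching the 0XX pattern.

```python
from contextlib import contextmanager

# Assumed encoding: any value whose leading bit is 0 matches the "0XX" pattern.
THROTTLE_DISABLED = 0b000


class FakeRegisters:
    """Hypothetical stand-in for the control and interrupt status registers."""

    def __init__(self, control: int):
        self.control = control
        self.interrupt_status = 0


@contextmanager
def throttling_disabled(regs: FakeRegisters):
    """Disable throttling for the duration of the block, then restore it."""
    saved = regs.control
    regs.control = THROTTLE_DISABLED
    try:
        yield
    finally:
        # Restore the original value even if the service routine raises.
        regs.control = saved


def handle_interrupt(regs: FakeRegisters, service) -> None:
    """Handler variant that clears the status bits before servicing."""
    with throttling_disabled(regs):
        regs.interrupt_status = 0   # clear at the beginning of the routine
        service(regs)               # runs with throttling disabled
```

Restoring the register in a `finally` clause mirrors the requirement that the handler put the control register back to its original value at the end of the routine.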
Figure 12 depicts a flowchart of the operation of hysteresis for thermal throttling in accordance with another illustrative embodiment. As discussed previously, the Cell BE chip includes a thermal management system provided by pervasive logic unit 351 of Fig. 3. Hysteresis in thermal throttling is the lag between a change and the response to that change, such as starting or ending throttling. For example, if the throttle point is set to 75 ℃ and the end throttle point is set to 72 ℃, the hysteresis range is from 75 ℃ down to 72 ℃. Fig. 5 depicts thermal throttling hysteresis.
A thermal management throttle point register, such as thermal management throttle point register 432 of Fig. 4, provides two temperature settings: a throttle temperature and an end throttle temperature. The throttle temperature should be set higher than the end throttle temperature. The difference between the throttle temperature and the end throttle temperature defines the hysteresis, which makes the hysteresis programmable.
For illustration, the following discussion is limited to a single DTS. As the hysteresis operation begins, the thermal management control state machine sets the throttle temperature and the end throttle temperature in the thermal management throttle point register (step 1202). The thermal management control state machine senses the temperature of the DTS (step 1204) and determines whether the sensed temperature is greater than or equal to the throttle temperature (step 1206). If the sensed temperature is not greater than or equal to the throttle temperature, operation returns to step 1204. If at step 1206 the sensed temperature is greater than or equal to the throttle temperature, the thermal management control state machine initiates the throttling mode (step 1208).
The thermal management control state machine then senses the temperature of the DTS again (step 1210) and determines whether the sensed temperature is less than the end throttle temperature (step 1212). If the sensed temperature is not less than the end throttle temperature, operation returns to step 1210. If at step 1212 the sensed temperature is less than the end throttle temperature, the thermal management control state machine disables the throttling mode (step 1214), and operation returns to step 1204.
In this way, assuming the thermal management control register is correctly configured to allow throttling, the thermal management control state machine puts the unit into throttling mode when the temperature rises to or above the throttle temperature. The thermal management control state machine keeps the unit in throttling mode until the temperature drops below the end throttle temperature. If the end throttle temperature is lower than the throttle temperature, the resulting hysteresis lets the unit cool sufficiently before the throttling mode is disabled. Without hysteresis, the unit could enter and exit throttling mode very frequently, reducing both the overall efficiency of throttling and the efficiency of the processor.
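The two-threshold behaviour of Figure 12 can be checked with a short simulation. The sketch below is a software model only; the function name `throttle_trace` is hypothetical, and the default temperatures are the 75 ℃ and 72 ℃ values from the example above.

```python
def throttle_trace(temps, throttle_temp=75.0, end_throttle_temp=72.0):
    """Return, for each sensed temperature, whether throttling is active.

    Throttling starts when a reading is >= throttle_temp (steps 1206/1208)
    and ends only when a reading is < end_throttle_temp (steps 1212/1214).
    """
    throttled = False
    trace = []
    for temp in temps:
        if not throttled and temp >= throttle_temp:
            throttled = True
        elif throttled and temp < end_throttle_temp:
            throttled = False
        trace.append(throttled)
    return trace


if __name__ == "__main__":
    readings = [70, 75, 74, 72, 71, 73, 76]
    print(throttle_trace(readings))
```

Readings of 74 and 72 keep the unit throttled because they are not below the end throttle temperature, and the reading of 73 after cooling does not restart throttling because it is below the throttle temperature. That gap is what prevents rapid toggling around a single threshold.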
An exemplary processor control method can be accomplished by blocking the dispatch of instructions. If throttling is enabled and disabled frequently, the processor pipeline is flushed frequently, which reduces processing capability. Another exemplary processor control method can be accomplished by slowing down the clock frequency.
Figure 13 depicts a flowchart of the operation for implementing thermal throttling logic in accordance with another illustrative embodiment. Figure 13 represents a complete thermal management solution as described in the preceding figures. As discussed previously, the Cell BE chip includes a thermal management system provided by pervasive logic unit 351 of Fig. 3. A TMCU, such as TMCU 402 of Fig. 4, includes a number of dynamic thermal management registers. The dynamic thermal management registers are the thermal management control registers, the thermal management throttle point register, the thermal management stop time registers, the thermal management throttle scale register and the thermal management system interrupt mask register, such as thermal management control registers 430 (TM_CR1 and TM_CR2), thermal management throttle point register 432 (TM_TPR), thermal management stop time registers 434 (TM_STR1 and TM_STR2), thermal management throttle scale register 436 (TM_TSR) and thermal management system interrupt mask register 438 (TM_SIMR) of Fig. 4.
The thermal management throttle point register sets the throttle points for the DTSs. Two independent throttle points can be set in the thermal management throttle point register, one for the PPE and one for the SPEs. The register also contains the temperature points for enabling throttling, for disabling throttling, and for stopping the PPE or SPE. Throttling of PPE or SPE execution begins when the temperature is equal to or above the throttle point, and ceases when the temperature drops below the temperature at which throttling is disabled. If the temperature reaches the full throttle or stop temperature, execution of the PPE or SPE is stopped.
The thermal management control state machine uses the thermal management stop time registers and the thermal management throttle scale register to control the frequency and the amount of throttling. When the temperature reaches the throttle point, the thermal management control state machine stops the corresponding PPE or SPE for the number of clocks specified by the corresponding scale value in the thermal management throttle scale register. The thermal management control state machine then lets the PPE or SPE run for the number of clocks specified by the run value in the thermal management stop time register multiplied by the corresponding scale value. This sequence continues until the temperature falls below the temperature at which throttling is disabled.
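A small calculation shows how the stop, run and scale values combine. This is a hedged sketch: it assumes the stop interval is scaled in the same way as the run interval (as the later description of steps 1312 and 1314 suggests), and the function names and sample values are hypothetical rather than real register contents.

```python
def throttle_clocks(stop_value: int, run_value: int, scale: int):
    """Return (stop_clocks, run_clocks) for one stop/run cycle.

    Assumption: both intervals are multiplied by the same scale value.
    """
    return stop_value * scale, run_value * scale


def throttle_fraction(stop_value: int, run_value: int) -> float:
    """Fraction of each cycle during which the core is stopped.

    The scale value cancels out, so it changes how often the core is
    stopped but not the percentage of time it spends stopped.
    """
    return stop_value / (stop_value + run_value)
```

For example, with a stop value of 3, a run value of 1 and a scale of 1000, the core would be stopped for 3000 clocks and then run for 1000 clocks, which is a throttling fraction of 0.75.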
The thermal management control state machine uses the thermal management system interrupt mask register to select which interrupts, while pending, cause throttling of the PPE to be disabled.
The thermal management control registers set the throttling mode independently for each PPE or SPE. The following five modes can be set independently for each PPE or SPE:
Dynamic throttling disabled (including core stop safety);
Normal operation (dynamic throttling and core stop safety enabled);
PPE or SPE always throttled (core stop safety enabled);
Core stop safety disabled (dynamic throttling enabled, core stop safety disabled);
PPE or SPE always throttled and core stop safety disabled.
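The five modes listed above can be captured as an enumeration with two properties each. This is an illustrative data model only: the member names, the `"off"`, `"dynamic"` and `"always"` labels and the helper `should_throttle` are assumptions, and no actual register bit encodings are implied.

```python
from enum import Enum


class ThrottleMode(Enum):
    """Hypothetical model of the five per-unit throttling modes."""

    DISABLED = ("off", False)
    NORMAL = ("dynamic", True)
    ALWAYS_THROTTLED = ("always", True)
    CORE_STOP_SAFETY_DISABLED = ("dynamic", False)
    ALWAYS_THROTTLED_CORE_STOP_SAFETY_DISABLED = ("always", False)

    def __init__(self, throttling: str, core_stop_safety: bool):
        self.throttling = throttling              # "off", "dynamic" or "always"
        self.core_stop_safety = core_stop_safety  # is the core stop safety enabled?


def should_throttle(mode: ThrottleMode, over_throttle_point: bool) -> bool:
    """Decide whether a unit in `mode` is throttled right now."""
    if mode.throttling == "always":
        return True
    if mode.throttling == "dynamic":
        return over_throttle_point
    return False


# Each PPE or SPE gets its own mode, set independently of the others.
unit_modes = {"PPE": ThrottleMode.NORMAL, "SPE0": ThrottleMode.ALWAYS_THROTTLED}
```

Keeping one mode per unit reflects that the control registers configure each PPE or SPE independently.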
As the operation for implementing the thermal throttling logic begins, the thermal management control state machine sets the throttle temperature and the end throttle temperature in the thermal management throttle point register (step 1302). The thermal management control state machine senses the temperature of the DTS (step 1304) and determines whether the sensed temperature is greater than or equal to the throttle temperature (step 1306). If the sensed temperature is not greater than or equal to the throttle temperature, operation returns to step 1304. If the sensed temperature is greater than or equal to the throttle temperature, the thermal management control state machine initiates the throttling mode (step 1308).
Next, the thermal management control state machine controls throttling according to the type of throttling indicated by the value in the thermal management control register (step 1310). Once the throttling mode is indicated, the thermal management control state machine limits throttling by the amount indicated in the thermal management stop time register (step 1312). The stop time register sets the ratio between the time the processor is stopped and the time the processor is allowed to run, that is, the throttling percentage. Finally, the thermal management control state machine scales the stop duration and the run time by the value specified in the thermal management throttle scale register (step 1314). At this point operation splits into concurrent operations, namely step 1316 and step 1322. At step 1316, the thermal management control state machine senses the temperature of the DTS. The thermal management control state machine determines whether the sensed temperature is less than the end throttle temperature (step 1318). If the sensed temperature is not less than the end throttle temperature, operation returns to step 1316. If the sensed temperature is less than the end throttle temperature, the thermal management control state machine disables the throttling mode (step 1320), and operation returns to step 1304.
Returning to step 1314, once the final throttling limits are in effect, the thermal management control state machine concurrently monitors all PPU interrupt status bits for any pending interrupt (step 1322). If an interrupt is encountered while throttling is in effect, the thermal management control state machine temporarily disables any throttling mode, whether a partial or a full throttling state, until the interrupt has been handled; throttling is then re-enabled and operation returns to step 1308. Monitoring of the interrupt status is discussed in full with reference to Figure 11.
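The two concurrent branches of Figure 13 can be combined in one simplified loop: temperature decides whether the throttling mode is active, and a pending interrupt temporarily overrides it. The sketch below is a sequential approximation of that concurrency; the function name `run_throttle_logic` and the `(temperature, interrupt_pending)` sample format are assumptions made for the example.

```python
def run_throttle_logic(samples, throttle_temp, end_throttle_temp):
    """Return, per sample, whether throttling is actually applied.

    `samples` is an iterable of (temperature, interrupt_pending) pairs.
    The throttling mode follows the hysteresis rule, and is suspended
    (not cancelled) while an interrupt is pending.
    """
    in_throttle_mode = False
    applied = []
    for temp, interrupt_pending in samples:
        if not in_throttle_mode and temp >= throttle_temp:
            in_throttle_mode = True        # steps 1306 and 1308
        elif in_throttle_mode and temp < end_throttle_temp:
            in_throttle_mode = False       # steps 1318 and 1320
        # Step 1322: a pending interrupt lifts throttling only temporarily.
        applied.append(in_throttle_mode and not interrupt_pending)
    return applied
```

Because the interrupt only masks the output and never changes `in_throttle_mode`, throttling resumes as soon as the interrupt is handled, provided the temperature has not yet dropped below the end throttle temperature.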
In this way, the thermal throttling logic of the thermal management system included in the Cell BE chip provides a dynamic means of managing the thermal state of the Cell BE chip and of protecting the Cell BE chip and its components.
The illustrative embodiments can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. The illustrative embodiments are implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, the illustrative embodiments can take the form of a computer program product accessible from a computer-usable or computer-readable medium that provides program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any tangible apparatus that can contain, store, communicate, propagate or transport the program for use by or in connection with the instruction execution system, apparatus or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system (or apparatus or device) or a communication medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories, which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or to remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
The description of the illustrative embodiments has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the illustrative embodiments to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the illustrative embodiments and their practical application, and to enable others of ordinary skill in the art to understand the illustrative embodiments for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (20)

1. A computer implemented method for thermal throttling with hysteresis in an integrated circuit, comprising:
sensing a temperature in the integrated circuit with a digital thermal sensor;
determining whether the sensed temperature is greater than or equal to a throttle temperature;
in response to the sensed temperature meeting or exceeding the throttle temperature, initiating a throttling mode;
sensing a new temperature with the digital thermal sensor;
determining whether the new sensed temperature is less than an end throttle temperature; and
in response to the new sensed temperature being less than the end throttle temperature, disabling the throttling mode.
2. The method according to claim 1, wherein the following steps are performed by a thermal management control state machine residing in the integrated circuit:
determining whether the sensed temperature is greater than or equal to the throttle temperature;
in response to the sensed temperature meeting or exceeding the throttle temperature, initiating the throttling mode;
determining whether the new sensed temperature is less than the end throttle temperature;
in response to the new sensed temperature being less than the end throttle temperature, disabling the throttling mode.
3. The method according to claim 1, wherein the throttle temperature is greater than the end throttle temperature.
4. The method according to claim 1, wherein the throttling mode includes a type of throttling mode to be initiated.
5. The method according to claim 4, wherein the type of throttling mode is at least one of the following:
dynamic throttling disabled;
normal operation;
Power processor unit or synergistic processor unit always throttled;
core stop safety disabled;
Power processor unit or synergistic processor unit always throttled and core stop safety disabled.
6. The method according to claim 1, wherein the integrated circuit is a heterogeneous multi-core processor.
7. The method according to claim 6, wherein the digital thermal sensor resides in a core of the heterogeneous multi-core processor.
8. The method according to claim 6, wherein the digital thermal sensor resides in the heterogeneous multi-core processor but not in a core.
9. The method according to claim 1, wherein the throttling mode blocks the dispatch of instructions.
10. The method according to claim 1, wherein the throttling mode slows down the clock frequency.
11. The method according to claim 1, wherein the throttle temperature is a programmable temperature.
12. The method according to claim 1, wherein the end throttle temperature is a programmable temperature.
13. A data processing system, comprising:
a bus;
a memory connected to the bus, wherein the memory contains a set of instructions; and
an integrated circuit connected to the bus, wherein the integrated circuit executes the set of instructions to:
sense a temperature in the integrated circuit with a digital thermal sensor;
determine whether the sensed temperature is greater than or equal to a throttle temperature;
in response to the sensed temperature meeting or exceeding the throttle temperature, initiate a throttling mode;
sense a new temperature with the digital thermal sensor;
determine whether the new sensed temperature is less than an end throttle temperature; and
in response to the new sensed temperature being less than the end throttle temperature, disable the throttling mode.
14. The system according to claim 13, wherein the throttle temperature is greater than the end throttle temperature.
15. The system according to claim 13, wherein the throttling mode includes a type of throttling mode to be initiated.
16. The system according to claim 13, wherein the throttling mode performs at least one of blocking the dispatch of instructions and slowing down the clock frequency.
17. A processor, comprising:
at least one processing core;
a thermal management control state machine; and
a digital thermal sensor, wherein the processor executes a set of instructions to:
sense a temperature in the integrated circuit with the digital thermal sensor;
determine, using the state machine, whether the sensed temperature is greater than or equal to a throttle temperature;
in response to the sensed temperature meeting or exceeding the throttle temperature, initiate a throttling mode using the state machine;
sense a new temperature with the digital thermal sensor;
determine, using the state machine, whether the new sensed temperature is less than an end throttle temperature; and
in response to the new sensed temperature being less than the end throttle temperature, disable the throttling mode using the state machine.
18. The processor according to claim 13, wherein the throttle temperature is greater than the end throttle temperature.
19. The processor according to claim 13, wherein the throttling mode includes a type of throttling mode to be initiated.
20. The processor according to claim 13, wherein the throttling mode performs at least one of blocking the dispatch of instructions and slowing down the clock frequency.
CNB2007101090715A 2006-06-21 2007-06-15 Method, system and processor used for lagging of heat conditioning Active CN100520680C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/425,499 2006-06-21
US11/425,499 US7603576B2 (en) 2005-11-29 2006-06-21 Hysteresis in thermal throttling

Publications (2)

Publication Number Publication Date
CN101093415A CN101093415A (en) 2007-12-26
CN100520680C true CN100520680C (en) 2009-07-29

Family

ID=38991700

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007101090715A Active CN100520680C (en) 2006-06-21 2007-06-15 Method, system and processor used for lagging of heat conditioning

Country Status (1)

Country Link
CN (1) CN100520680C (en)


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant