EP2840508A2 - Apparatus and method for extended cache correction - Google Patents
- Publication number
- EP2840508A2 (application number EP13193571.0A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- data
- fuses
- cache memory
- sub
- fuse array
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1008—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0632—Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C29/00—Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
- G11C29/70—Masking faults in memories by using spares or by reconfiguring
- G11C29/78—Masking faults in memories by using spares or by reconfiguring using programmable devices
- G11C29/785—Masking faults in memories by using spares or by reconfiguring using programmable devices with redundancy programming schemes
- G11C29/787—Masking faults in memories by using spares or by reconfiguring using programmable devices with redundancy programming schemes using a fuse hierarchy
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C29/00—Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
- G11C29/70—Masking faults in memories by using spares or by reconfiguring
- G11C29/78—Masking faults in memories by using spares or by reconfiguring using programmable devices
- G11C29/80—Masking faults in memories by using spares or by reconfiguring using programmable devices with improved layout
- G11C29/802—Masking faults in memories by using spares or by reconfiguring using programmable devices with improved layout by encoding redundancy signals
Definitions
- This invention relates in general to the field of microelectronics, and more particularly to apparatus and methods for providing compressed configuration data in a fuse array associated with a multi-core device.
- Integrated device technologies have exponentially advanced over the past 40 years. More specifically directed to the microprocessor fields, starting with 4-bit, single instruction, 10-micrometer devices, the advances in semiconductor fabrication technologies have enabled designers to provide increasingly more complex devices in terms of architecture and density. In the 80's and 90's so-called pipeline microprocessors and superscalar microprocessors were developed comprising millions of transistors on a single die. And now 20 years later, 64-bit, 32-nanometer devices are being produced that have billions of transistors on a single die and which comprise multiple microprocessor cores for the processing of data.
- One requirement that has persisted since these early devices were produced is the need to initialize these devices with configuration data when they are turned on or when they are reset.
- many architectures enable devices to be configured to execute at one of many selectable frequencies and/or voltages.
- Other architectures require that each device have a serial number and other information that can be read via execution of an instruction.
- Yet other devices require initialization data for internal registers and control circuits.
- Still other devices utilize configuration data to implement redundant circuits when primary circuits are fabricated in error or outside of marginal constraints.
- fuse array mechanism that can store and provide significantly more configuration data than current techniques while requiring the same or less real estate on a multi-core die.
- an apparatus for providing configuration data to an integrated circuit.
- the apparatus includes a semiconductor fuse array, a cache memory, and a plurality of cores.
- the semiconductor fuse array is disposed on a die, into which is programmed the configuration data.
- the semiconductor fuse array has a first plurality of semiconductor fuses that is configured to store compressed cache correction data.
- the cache memory is disposed on the die.
- the plurality of cores is disposed on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory, and is configured to access the semiconductor fuse array upon power-up/reset, to decompress the compressed cache correction data, and to distribute decompressed cache correction data to initialize the cache memory.
- the apparatus includes a multi-core microprocessor.
- the multi-core microprocessor has a semiconductor fuse array, disposed on a die, into which is programmed the configuration data.
- the semiconductor fuse array has a first plurality of semiconductor fuses configured to store compressed cache correction data.
- the multi-core microprocessor also has a cache memory, disposed on the die, and a plurality of cores, disposed on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory, and is configured to access the semiconductor fuse array upon power-up/reset, to decompress the compressed cache correction data, and to distribute decompressed cache correction data to initialize the cache memory.
- the present invention contemplates a method for providing configuration data to an integrated circuit.
- the method includes first disposing a semiconductor fuse array on a die.
- the first disposing includes storing compressed cache correction data in a first plurality of semiconductor fuses.
- the method also includes second disposing a cache memory on the die, and third disposing a plurality of cores on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory.
- the method further includes, via each of the plurality of cores, accessing the semiconductor fuse array upon power-up/reset, decompressing the compressed cache correction data, and distributing decompressed cache correction data to initialize the cache memory.
- the present invention is implemented within a MICROPROCESSOR which may be used in a general purpose or special purpose computing device.
- Integrated Circuit A set of electronic circuits fabricated on a small piece of semiconductor material, typically silicon.
- An IC is also referred to as a chip, a microchip, or a die.
- CPU Central Processing Unit
- the electronic circuits (i.e., "hardware") that execute the instructions of a computer program (also known as a "computer application" or "application") by performing operations on data that include arithmetic operations, logical operations, and input/output operations.
- Microprocessor An electronic device that functions as a CPU on a single integrated circuit.
- a microprocessor receives digital data as input, processes the data according to instructions fetched from a memory (either on-die or off-die), and generates results of operations prescribed by the instructions as output.
- a general purpose microprocessor may be employed in a desktop, mobile, or tablet computer, and is employed for uses such as computation, text editing, multimedia display, and Internet browsing.
- a microprocessor may also be disposed in an embedded system to control a wide variety of devices including appliances, mobile telephones, smart phones, and industrial control devices.
- Multi-Core Processor Also known as a multi-core microprocessor, a multi-core processor is a microprocessor having multiple CPUs ("cores") fabricated on a single integrated circuit.
- Instruction Set Architecture or Instruction Set: A part of a computer architecture related to programming that includes data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and input/output.
- An ISA includes a specification of the set of opcodes (i.e., machine language instructions), and the native commands implemented by a particular CPU.
- x86-Compatible Microprocessor A microprocessor capable of executing computer applications that are programmed according to the x86 ISA.
- Microcode A term employed to refer to a plurality of micro instructions.
- a micro instruction (also referred to as a "native instruction") is an instruction at the level that a microprocessor sub-unit executes. Exemplary sub-units include integer units, floating point units, MMX units, and load/store units.
- micro instructions are directly executed by a reduced instruction set computer (RISC) microprocessor.
- RISC reduced instruction set computer
- CISC complex instruction set computer
- x86 instructions are translated into associated micro instructions, and the associated micro instructions are directly executed by a sub-unit or sub-units within the CISC microprocessor.
- Fuse A conductive structure typically arranged as a filament which can be broken at select locations by applying a voltage across the filament and/or current through the filament. Fuses may be deposited at specified areas across a die topography using well known fabrication techniques to produce filaments at all potential programmable areas. A fuse structure is blown (or unblown) subsequent to fabrication to provide for desired programmability of a corresponding device disposed on the die.
- FIGURE 1 a block diagram 100 is presented illustrating a present day microprocessor core 101 that includes a fuse array 102 for providing configuration data to the microprocessor core 101.
- the fuse array 102 comprises a plurality of semiconductor fuses (not shown) typically arranged in groups known as banks.
- the fuse array 102 is coupled to reset logic 103 that includes both reset circuits 104 and reset microcode 105.
- the reset logic 103 is coupled to control circuits 107, microcode registers 108, microcode patch elements 109, and cache correction elements 110.
- An external reset signal RESET is coupled to the microprocessor core 101 and is routed to the reset logic 103.
- fuses (also called "links" or "fuse structures") are employed in a vast number of present day integrated circuit devices to provide for configuration of the devices after the devices have been fabricated.
- the microprocessor core 101 of FIGURE 1 is fabricated to provide functionality selectively either as a desktop device or a mobile device. Accordingly, following fabrication, prescribed fuses within the fuse array 102 may be blown to configure the device as, say, a mobile device.
- the reset logic 103 reads the state of the prescribed fuses in the fuse array 102 and the reset circuits 104 (rather than reset microcode 105, in this example) enable corresponding control circuits 107 that deactivate elements of the core 101 exclusively associated with desktop operations and activate elements of the core 101 exclusively associated with mobile operations. Consequently, the core 101 is configured upon power-up reset as a mobile device.
- the reset logic 103 reads the state of the other fuses in the fuse array 102 and the reset circuits 104 (rather than reset microcode 105, in this example) enable corresponding cache correction elements 110 to provide corrective mechanisms for one or more cache memories (not shown) associated with the core 101. Consequently, the core 101 is configured upon power-up reset as a mobile device and corrective mechanisms for its cache memories are in place.
- configuration fuses in an integrated circuit device such as a microprocessor core 101 of FIGURE 1 .
- uses for configuration fuses include, but are not limited to, configuration of device specific data (e.g., serial numbers, unique cryptographic keys, architecture mandated data that can be accessed by users, speed settings, voltage settings), initialization data, and patch data.
- many present day devices execute microcode and often require initialization of registers 108 that are read by the microcode.
- Such initialization data may be provided by microcode register fuses (not shown) within the fuse array 102, which are read upon reset and provided to the microcode registers 108 by the reset logic 103 (using either the reset circuits 104, the reset microcode 105, or both elements 104-105).
- the reset circuits 104 comprise hardware elements that provide certain types of configuration data, which cannot be provided via the execution of the reset microcode 105.
- the reset microcode 105 comprises a plurality of micro instructions disposed within an internal microcode memory (not shown) that is executed upon reset of the core 101 to perform functions corresponding to initialization of the core 101, those functions including provision of configuration data that is read from the fuse array 102 to elements such as microcode registers 108 and microcode patch mechanisms 109.
- a 64-bit control circuit 107 may include ASCII data that prescribes a serial number for the core 101.
- Another 64-bit control register may have 64 different speed settings, only one of which is asserted to specify an operating speed for the core 101.
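The two 64-bit register layouts mentioned above can be illustrated with a minimal Python sketch. It assumes an 8-character ASCII serial number packed big-endian and a one-hot speed field; the byte order and the function names `pack_serial`, `unpack_serial`, and `decode_one_hot` are hypothetical and do not come from the patent.

```python
def pack_serial(serial: str) -> int:
    """Pack an 8-character ASCII serial number into a 64-bit integer.
    Big-endian byte order is an assumption made for this sketch."""
    data = serial.encode("ascii")
    if len(data) != 8:
        raise ValueError("a 64-bit register holds exactly 8 ASCII characters")
    return int.from_bytes(data, "big")


def unpack_serial(register: int) -> str:
    """Recover the ASCII serial number from the 64-bit register value."""
    return register.to_bytes(8, "big").decode("ascii")


def decode_one_hot(register: int, width: int = 64) -> int:
    """Return the index of the single asserted bit in a one-hot register."""
    if register <= 0 or register >= (1 << width):
        raise ValueError("register is empty or wider than expected")
    if register & (register - 1):
        raise ValueError("more than one speed setting is asserted")
    return register.bit_length() - 1


serial_register = pack_serial("SN001234")
speed_register = 1 << 17          # hypothetical: speed setting 17 asserted
speed_index = decode_one_hot(speed_register)
```

A one-hot field of this kind sets exactly one of 64 fuses, which is part of why the state distributions of the different data types differ so much.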
- Microcode registers 108 may typically be initialized to all zeros (i.e. logic low states) or to all ones (i.e., logic high states).
- Microcode patch mechanisms 109 may include an approximately uniform distribution of ones and zeros to indicate addresses in a microcode ROM (not shown) along with replacement microcode values for those addresses.
- cache correction mechanisms may comprise very sparse settings of ones to indicate substitution control signals to replace a certain cache sub-bank element (i.e., a row or a column) with a particular replacement sub-bank element.
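As an illustration of the substitution idea for cache correction, the following Python sketch models a cache sub-bank whose faulty rows are redirected to spare rows. The class `CacheSubBank`, its row counts, and the numbering of spare rows after the regular rows are assumptions made for this example, not details from the patent.

```python
class CacheSubBank:
    """Toy model of a cache sub-bank with spare rows (hypothetical layout)."""

    def __init__(self, rows: int, spare_rows: int) -> None:
        self.rows = rows
        self.spare_rows = spare_rows
        self.remap = {}  # logical row -> physical spare row

    def apply_correction(self, faulty_row: int, spare_index: int) -> None:
        """Record a substitution of one faulty row by one spare row."""
        if not 0 <= faulty_row < self.rows:
            raise ValueError("faulty row out of range")
        if not 0 <= spare_index < self.spare_rows:
            raise ValueError("spare row out of range")
        # Spare rows are assumed to be numbered after the regular rows.
        self.remap[faulty_row] = self.rows + spare_index

    def physical_row(self, logical_row: int) -> int:
        """Return the row actually accessed for a logical row."""
        return self.remap.get(logical_row, logical_row)


sub_bank = CacheSubBank(rows=256, spare_rows=4)
sub_bank.apply_correction(faulty_row=37, spare_index=0)
```

Because only a handful of rows are typically substituted, the corresponding fuse data contains very few ones, which matches the sparse distribution described above.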
- Fuse arrays 102 provide an excellent means for configuring a device such as a microprocessor core 101 subsequent to fabrication of the device. By blowing selected fuses in the fuse array 102, the core can be configured for operation in its intended environment. Yet, as one skilled in the art will appreciate, operating environments may change following programming of the fuse array 102. Business requirements may dictate that a device 101 originally configured as, say, a desktop device 101, be reconfigured as a mobile device 101. Accordingly, designers have provided techniques that utilize redundant banks of fuses within the fuse array 102 to provide for "unblowing" selected fuses therein, thus enabling the device 101 to be reconfigured, fabrication errors to be corrected, and so on. These redundant array techniques will now be discussed with reference to FIGURE 2 .
- FIGURE 2 a block diagram 200 is presented depicting a fuse array 201 within the microprocessor core 101 of FIGURE 1 including redundant fuse banks 202 RFB1-RFBN that may be blown subsequent to blowing first fuse banks 202 PFB1-PFBN within the fuse array 201.
- Each of the fuse banks 202 PFB1-PFBN, RFB1-RFBN comprises a prescribed number of individual fuses 203 corresponding to specific design of the core 101.
- the number of fuses 203 in a given fuse bank 202 may be 64 fuses 203 in a 64-bit microprocessor core 101 to facilitate provision of configuration data in a format that is easily implemented in the core 101.
- the fuse array 201 is coupled to a set of registers 210-211 that are typically disposed within reset logic in the core 101.
- a primary register PR1 is employed to read one of the first fuse banks PFB1-PFBN (say, PFB3 as is shown in the diagram 200) and a redundant register RR1 is employed to read a corresponding one of the redundant fuse banks RFB1-RFBN.
- the registers 210-211 are coupled to exclusive-OR logic 212 that generates an output FB3.
- the first fuse banks PFB1-PFBN are programmed by known techniques with configuration data for the core 101.
- the redundant fuse banks RFB1-RFBN are not blown and remain at a logic low state for all fuses therein.
- both the first fuse banks PFB1-PFBN and the redundant fuse banks RFB1-RFBN are read as required for configuration into the primary and redundant registers 210-211, respectively.
- the exclusive-OR logic 212 generates the output FB3 that is a logical exclusive-OR result of the contents of the registers 210-211. Since all of the redundant fuse banks are unblown (i.e., logic low states), the output FB3 value is simply that which was programmed into the first fuse banks PFB1-PFBN subsequent to fabrication.
- FIGURE 2 may be employed to provide for "reblow" of fuses 203 within a device 101, but as one skilled in the art will appreciate, a given fuse 203 may only be reblown one time as there is only one set of redundant fuse banks RFB1-RFBN. To provide for additional reblows, a corresponding number of additional fuse banks 202 and registers 210-211 must be added to the part 101.
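The exclusive-OR arrangement of FIGURE 2 can be sketched in a few lines of Python, assuming 64-fuse banks represented as integers. The helper names `read_bank` and `reblow_pattern` are hypothetical.

```python
BANK_WIDTH = 64
MASK = (1 << BANK_WIDTH) - 1


def read_bank(primary: int, redundant: int) -> int:
    """Output of the exclusive-OR logic for one primary/redundant bank pair."""
    return (primary ^ redundant) & MASK


def reblow_pattern(primary: int, desired: int) -> int:
    """Fuses to blow in the redundant bank so the pair reads as `desired`."""
    return (primary ^ desired) & MASK


primary_bank = 0b1011            # programmed after fabrication
redundant_bank = 0               # unblown: all fuses at logic low
original_output = read_bank(primary_bank, redundant_bank)

# "Unblow" bit 3 by blowing the matching fuse in the redundant bank.
desired_output = 0b0011
redundant_bank = reblow_pattern(primary_bank, desired_output)
corrected_output = read_bank(primary_bank, redundant_bank)
```

Since a blown fuse cannot be restored, the redundant bank can take on a correction pattern only once, which is why each further reblow needs another bank and register.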
- the fuse array mechanisms as discussed above with reference to FIGURES 1-2 have provided enough flexibility to sufficiently configure microprocessor cores and other related devices, while also allowing for a limited number of reblows. This is primarily due to the fact that former fabrication technologies, say 65 nanometer and 45 nanometer processes, allow ample real estate on a die for the implementation of enough fuses to provide for configuration of a core 101 disposed on the die.
- present day techniques are limited going forward due to two significant factors.
- the trend in the art is to dispose multiple device cores 101 on a single die to increase processing performance.
- These so-called multi-core devices may include, say, 2-16 individual cores 101, each of which must be configured with fuse data upon power-up/reset.
- fuse arrays 201 are required in that some of the data associated with individual cores may vary (e.g., cache correction data, redundant fuse data, etc.).
- fuse size increases, thus requiring more die real estate to implement the same size fuse array on a 32-nanometer die as opposed to that on a 45-nanometer die.
- FIGURE 3 a block diagram is presented featuring a system 300 according to the present invention that provides for compression and decompression of configuration data for a multi-core device.
- the multi-core device comprises a plurality of cores 332 disposed on a die 330.
- cores 332 CORE 1-CORE 4 are depicted on the die 330, although the present invention contemplates various numbers of cores 332 disposed on the die 330.
- all the cores 332 share a single cache memory 334 that is also disposed on the die 330.
- a single programmable fuse array 336 is also disposed on the die 330 and each of the cores 332 is configured to access the fuse array 336 to retrieve and decompress configuration data as described above during power-up/reset.
- the cores 332 comprise microprocessor cores configured as a multi-core microprocessor 330.
- the multi-core microprocessor 330 is configured as an x86-compatible multi-core microprocessor.
- the cache 334 comprises a level 2 (L2) cache 334 associated with the microprocessor cores 332.
- the fuse array 336 comprises 8192 (8K) individual fuses (not shown), although other numbers of fuses are contemplated.
- only one core 332 is disposed on the die 330 and the core 332 is coupled to the cache 334 and physical fuse array 336.
- the system 300 also includes a device programmer 310 that includes a compressor 320 that is coupled to a virtual fuse array 303.
- the device programmer 310 may comprise a CPU (not shown) that is configured to process configuration data and to program the fuse array 336 following fabrication of the die 330 according to well known programming techniques. The CPU may be integrated into a wafer test apparatus that is employed to test the device die 330 following fabrication.
- the compressor 320 may comprise an application program that executes on the device programmer 310 and the virtual fuse array 303 may comprise locations within a memory that is accessed by the compressor 320.
- the virtual fuse array 303 includes a plurality of virtual fuse banks 301 that each comprise a plurality of virtual fuses 302. In one embodiment the virtual fuse array 303 comprises 128 virtual fuse banks 301 that each comprise 64 virtual fuses 302, resulting in a virtual array 303 that is 8 Kb in size.
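A minimal Python sketch of such a virtual fuse array follows, assuming each bank is held as one integer in the programmer's memory. The class `VirtualFuseArray` and its methods are hypothetical; only the 128 x 64 dimensions come from the text.

```python
BANKS = 128
FUSES_PER_BANK = 64


class VirtualFuseArray:
    """In-memory stand-in for a fuse array: one integer per 64-fuse bank."""

    def __init__(self, banks: int = BANKS, fuses_per_bank: int = FUSES_PER_BANK):
        self.fuses_per_bank = fuses_per_bank
        self.banks = [0] * banks  # every virtual fuse starts unblown

    @property
    def size_bits(self) -> int:
        return len(self.banks) * self.fuses_per_bank

    def set_fuse(self, bank: int, fuse: int, value: int = 1) -> None:
        """Set or clear one virtual fuse (clearing is possible only virtually)."""
        if not 0 <= fuse < self.fuses_per_bank:
            raise ValueError("fuse index out of range")
        mask = 1 << fuse
        if value:
            self.banks[bank] |= mask
        else:
            self.banks[bank] &= ~mask

    def get_fuse(self, bank: int, fuse: int) -> int:
        return (self.banks[bank] >> fuse) & 1


vfa = VirtualFuseArray()
vfa.set_fuse(bank=3, fuse=10)
```

With the default dimensions, `vfa.size_bits` is 128 x 64 = 8192, matching the 8 Kb figure.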
- configuration information for the device 330 is entered into the virtual fuse array 303 as part of the fabrication process, and as is described above with reference to FIGURE 1 .
- the configuration information comprises control circuits configuration data, initialization data for microcode registers, microcode patch data, and cache correction data. Further, as described above, the distributions of values associated with each of the data types are substantially different from type to type.
- the virtual fuse array 303 is a logical representation of a fuse array (not shown) that comprises configuration information for each of the microprocessor cores 332 on the die 330 and correction data for each of the caches 334 on the die 330.
- After the information is entered into the virtual fuse array 303, the compressor 320 reads the state of the virtual fuses 302 in each of the virtual fuse banks 301 and compresses the information using distinct compression algorithms corresponding to each of the data types to render compressed fuse array data 303.
- system data for control circuits is not compressed, but rather is transferred without compression.
- a microcode register data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the microcode register data.
- To compress microcode patch data, a microcode patch data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the microcode patch data.
- a cache correction data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the cache correction data.
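This portion of the text does not name the individual algorithms, so the Python sketch below only illustrates the idea of matching a codec to a state distribution. It assumes run-length coding for register data (mostly all zeros or all ones) and a list of set-bit positions for sparse cache correction data; system data passes through unchanged. Microcode patch data is left out because an approximately uniform mix of ones and zeros does not shrink under either simple scheme. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Bits = List[int]


def rle_compress(bits: Bits) -> List[int]:
    """Run-length code: [first_bit, run1, run2, ...] with alternating runs."""
    if not bits:
        return []
    encoded = [bits[0]]
    run = 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            run += 1
        else:
            encoded.append(run)
            run = 1
    encoded.append(run)
    return encoded


def rle_decompress(encoded: List[int]) -> Bits:
    if not encoded:
        return []
    bit, out = encoded[0], []
    for run in encoded[1:]:
        out.extend([bit] * run)
        bit ^= 1  # runs alternate between 0 and 1
    return out


def sparse_compress(bits: Bits) -> List[int]:
    """Sparse code: [length, pos1, pos2, ...] listing only the set bits."""
    return [len(bits)] + [i for i, b in enumerate(bits) if b]


def sparse_decompress(encoded: List[int]) -> Bits:
    out = [0] * encoded[0]
    for pos in encoded[1:]:
        out[pos] = 1
    return out


# One (compress, decompress) pair per data type; the mapping is illustrative.
CODECS: Dict[str, Tuple[Callable, Callable]] = {
    "system": (list, list),                       # transferred uncompressed
    "register": (rle_compress, rle_decompress),
    "cache": (sparse_compress, sparse_decompress),
}

register_bits = [0] * 64
cache_bits = [0] * 64
cache_bits[5] = cache_bits[40] = 1

compressed = {
    "register": CODECS["register"][0](register_bits),
    "cache": CODECS["cache"][0](cache_bits),
}
```

In this sketch a 64-fuse all-zero register bank collapses to two numbers and a 64-fuse cache correction bank with two set fuses collapses to three; a real encoding would also have to count the bits needed to store each number.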
- the device programmer 310 then programs the uncompressed and compressed fuse array data into the physical fuse array 336 on the die 330.
- each of the cores 332 may access the physical fuse array 336 to retrieve the uncompressed and compressed fuse array data, and reset circuits/microcode (not shown) disposed within each of the cores 332 distributes the uncompressed fuse array data, and decompresses the compressed fuse array data according to distinct decompression algorithms corresponding to each of the data types noted above to render values originally entered into the virtual fuse array 303.
- the reset circuits/microcode then enter the configuration information into control circuits (not shown), microcode registers (not shown), patch elements (not shown), and cache correction elements (not shown).
- the fuse array compression system 300 enables device designers to employ substantially fewer fuses in a physical fuse array 336 than have heretofore been provided, and to utilize the compressed information programmed therein to configure a multi-core device 330 during power-up/reset.
- FIGURE 4 a block diagram 400 is presented showing a fuse decompression mechanism according to the present invention.
- the decompression mechanism may be disposed within each of the microprocessor cores 332 of FIGURE 3 .
- a physical fuse array 401 disposed on the die as described above is coupled to the core 420.
- the physical fuse array 401 comprises compressed microcode patch fuses 403, compressed register fuses 404, compressed cache correction fuses 405, and compressed fuse correction fuses 406.
- the physical fuse array 401 may also comprise uncompressed configuration data (not shown) such as system configuration data as discussed above and/or block error checking and correction (ECC) codes (not shown).
- the microprocessor core 420 comprises a reset controller 417 that receives a reset signal RESET which is asserted upon power-up of the core 420 and in response to events that cause the core 420 to initiate a reset sequence of steps.
- the reset controller 417 includes a decompressor 421.
- the decompressor 421 has a patch fuses element 408, a register fuses element 409, and a cache fuses element 410.
- the decompressor also comprises a fuse correction element 411 that is coupled to the patch fuses element 408, the register fuses element 409, and the cache fuses element 410 via bus 412.
- the patch fuses element 408 is coupled to microcode patch elements 414 in the core 420.
- the register fuses element 409 is coupled to microcode registers 415 in the core 420.
- the cache fuses element 410 is coupled to cache correction elements 416 in the core 420.
- the cache correction elements 416 are disposed within an on-die L2 cache (not shown) that is shared by all the cores 420, such as the cache 334 of FIGURE 3 .
- Another embodiment contemplates cache correction elements 416 disposed within an L1 cache (not shown) within the core 420.
- a further embodiment considers cache correction elements 416 disposed to correct both the L2 and L1 caches described above.
- the reset controller 417 reads the states of the fuses 402-406 in the physical fuse array 401 and distributes the states of the compressed system fuses 402 to the decompressor 421.
- the fuse correction element 411 of the decompressor 421 decompresses the compressed fuse correction fuses states to render data that indicates one or more fuse addresses in the physical fuse array 401 whose states are to be changed from that which was previously programmed.
- the data may also include a value for each of the one or more fuse addresses.
- the one or more fuse addresses (and optional values) are routed via bus 412 to the elements 408-410 so that the states of corresponding fuses processed therein are changed prior to decompression of their corresponding compressed data.
- the patch fuses element 408 comprises microcode that operates to decompress the states of the compressed microcode patch fuses 403 according to a microcode patch decompression algorithm that corresponds to the microcode patch compression algorithm described above with reference to FIGURE 3.
- the register fuses element 409 comprises microcode that operates to decompress the states of the compressed register fuses 404 according to a register fuses decompression algorithm that corresponds to the register fuses compression algorithm described above with reference to FIGURE 3 .
- the cache fuses element 410 comprises microcode that operates to decompress the states of the compressed cache correction fuses 405 according to a cache correction fuses decompression algorithm that corresponds to the cache correction fuses compression algorithm described above with reference to FIGURE 3.
- bus 412 may comprise conventional microcode programming mechanisms that are employed to transfer data between respective routines therein.
- the present invention further contemplates a comprehensive decompressor 421 having capabilities to recognize and decompress configuration data based upon its specific type.
- the recited elements 408-411 within the decompressor 421 are presented in order to teach relevant aspects of the present invention, however, contemplated implementations of the present invention may not necessarily include distinct elements 408-411, but rather a comprehensive decompressor 421 that provides functionality corresponding to each of the elements 408-411 discussed above.
- the reset controller 417 initiates execution of microcode within the patch fuses element 408 to decompress the states of the compressed microcode patch fuses 403.
- the reset controller 417 also initiates execution of microcode within the register fuses element 409 to decompress the states of the compressed register fuses 404.
- the reset controller 417 further initiates execution of microcode within the cache fuses element 410 to decompress the states of the compressed cache correction fuses 405.
- the microcode within the decompressor 421 also operates to change the states of any fuses addressed by fuse correction data provided by the compressed fuse correction fuses 406 prior to decompression of the compressed data.
- the reset controller 417, decompressor 421, and elements 408-411 therein according to the present invention are configured to perform the functions and operations as discussed above.
- the reset controller 417, decompressor 421, and elements 408-411 therein may comprise logic, circuits, devices, or microcode, or a combination of logic, circuits, devices, or microcode, or equivalent elements that are employed to execute the functions and operations according to the present invention as noted.
- the elements employed to accomplish these operations and functions within the reset controller 417, decompressor 421, and elements 408-411 therein may be shared with other circuits, microcode, etc., that are employed to perform other functions and/or operations within the reset controller 417, decompressor 421, and elements 408-411 therein or with other elements within the core 420.
- the core 420 is configured for operation following completion of a reset sequence.
- microcode patches may be decompressed following decompression of microcode registers initialization data.
- the decompression functions may be performed in parallel or in an order suitable to satisfy design constraints.
- the present inventors note that the implementations of the elements 408-411 need not necessarily be implemented in microcode versus hardware circuits, since in a typical microprocessor core 420 there exist elements of the core 420 which can more easily be initialized via hardware (such as a scan chain associated with a cache) as opposed to direct writes by microcode. Such implementation details are left up to designer judgment.
- the present inventors submit that the prior art teaches that cache correction fuses are conventionally read and entered into a cache correction scan chain by hardware circuits during reset prior to initiating the execution of microcode, and it is a feature of the present invention to implement the cache fuses decompressor 410 in microcode as opposed to hardware control circuits since a core's caches are generally not turned on until microcode runs. By utilizing microcode to implement the cache fuses element 410, a more flexible and advantageous mechanism is provided for entering cache correction data into a scan chain, and significant hardware is saved.
- In FIGURE 5, a block diagram is presented illustrating an exemplary format for compressed configuration data 500 according to the present invention.
- the compressed configuration data 500 is compressed by the compressor 320 of FIGURE 3 from data residing in the virtual fuse array 303 and is programmed (i.e., "blown") into the physical fuse array 336 of the multi-core device 330.
- the compressed configuration data 500 is retrieved from the physical fuse array 336 by each of the cores 332 and is decompressed and corrected by the elements 408-411 of the decompressor 421 within each of the cores 420.
- the decompressed and corrected configuration data is then provided to the various elements 413-416 within the core 420 to initialize the core 420 for operation.
- the compressed configuration data 500 comprises one or more compressed data fields 502 for each of the configuration data types discussed above, which are demarcated by end-of-type fields 503. Programming events (i.e., "blows") are demarcated by an end-of-blow field 504.
- the compressed data fields 502 associated with each of the data types are encoded according to a compression algorithm that is optimized to minimize the number of bits (i.e., fuses) that are required to store the particular bit patterns associated with each of the data types.
- the number of fuses in the physical fuse array 336 that make up each of compressed data fields 502 is a function of the compression algorithm that is employed for a particular data type.
- each of the compressed data fields 502 comprises initialization data for a particular microcode register where the compressed data fields 502 are prescribed in register number order (i.e., 1-64).
- each of the compressed data fields 502 comprises a single fuse which is blown if a corresponding microcode register is initialized to all ones, and which is not blown if the corresponding microcode register is initialized to all zeros.
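The single-fuse register encoding described above can be illustrated with a minimal sketch. The function name and the 64-bit register width are assumptions made for illustration only; the text states just that there are 64 registers in register-number order and that a blown fuse means all ones while an unblown fuse means all zeros.

```python
REG_COUNT = 64   # registers 1-64, per the text
REG_WIDTH = 64   # assumed register width in bits (not stated in the source)


def decompress_register_fuses(fuses):
    """Expand one fuse state per register into a full-width initial value.

    fuses: sequence of booleans, True meaning the fuse is blown.
    Returns a dict mapping register number (1-based) to its initial value.
    """
    if len(fuses) != REG_COUNT:
        raise ValueError("expected one fuse state per register")
    all_ones = (1 << REG_WIDTH) - 1
    # Fields are in register-number order, so index i maps to register i + 1.
    return {i + 1: (all_ones if blown else 0) for i, blown in enumerate(fuses)}


fuses = [False] * REG_COUNT
fuses[0] = True    # register 1 is initialized to all ones
fuses[63] = True   # register 64 is initialized to all ones
values = decompress_register_fuses(fuses)
```

Under these assumptions, 64 fuses stand in for 64 x 64 = 4096 bits of initialization data, which is the kind of saving the type-specific compression is meant to provide.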
- the elements 408-410 of the decompressor 421 in the core 420 are configured to utilize the end-of-type fields 503 to determine where their respective compressed data is located within the physical fuse array 336 and the fuse correction decompressor 411 is configured to utilize the end-of-blow fields 504 to locate compressed fuse correction data that has been programmed (i.e., blown) subsequent to an initial programming event. It is a feature of the present invention to provide a substantial amount of spare fuses in the physical fuse array 336 to allow for a significant number of subsequent programming events, as will be discussed in more detail below.
- the exemplary compressed type format discussed above is presented to clearly teach aspects of the present invention that are associated with compression and decompression of configuration data.
- the manner in which specific type data is compressed, demarcated, and the number and types of data to be compressed within the fuse array 401 is not intended to be restricted to the example of FIGURE 5 .
- Other numbers, types, and formats are contemplated that allow for tailoring of the present invention to various devices and architectures extant in the art.
- In FIGURE 6, a block diagram is presented illustrating an exemplary format for decompressed microcode patch configuration data 600 according to the present invention.
- compressed microcode patch configuration data is read by each core 420 from the physical fuse array 401.
- the compressed microcode patch configuration data is then corrected according to fuse correction data provided via bus 412.
- the corrected compressed microcode patch configuration data is decompressed by the patch fuses decompressor 408.
- the result of the decompression process is the decompressed microcode patch configuration data 600.
- the data 600 comprises a plurality of decompressed data blocks 604 corresponding to the number of microcode patch elements 414 within the core 420 that require initialization data.
- Each decompressed data block 604 comprises a core address field 601, a microcode ROM address field 602, and a microcode patch data field 603.
- the sizes of the fields 601-603 are a function of the core architecture.
- the patch fuses decompressor 408 creates a complete image of the target data required to initialize the microcode patch elements 414.
- conventional distribution mechanisms may be employed to distribute the data 603 to respectively addressed core and microcode ROM substitution circuits/registers in the microcode patch elements 414.
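As a rough illustration of the block format of FIGURE 6, the sketch below models a decompressed data block 604 and groups patch data by core address and microcode ROM address. The class name, function name, field widths, and sample values are hypothetical; the source leaves field sizes to the core architecture and does not specify the distribution mechanism.

```python
from collections import defaultdict
from typing import NamedTuple


class PatchBlock(NamedTuple):
    core_address: int   # field 601
    rom_address: int    # field 602
    patch_data: int     # field 603


def distribute_patches(blocks):
    """Group patch data by target core, keyed by microcode ROM address."""
    tables = defaultdict(dict)
    for block in blocks:
        tables[block.core_address][block.rom_address] = block.patch_data
    return {core: dict(table) for core, table in tables.items()}


# Illustrative values only.
blocks = [
    PatchBlock(0, 0x1A0, 0xDEAD),
    PatchBlock(0, 0x1A4, 0xBEEF),
    PatchBlock(2, 0x040, 0x1234),
]
tables = distribute_patches(blocks)
```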
- In FIGURE 7, a block diagram is presented depicting an exemplary format for decompressed microcode register configuration data 700 according to the present invention.
- compressed microcode register configuration data is read by each core 420 from the physical fuse array 401.
- the compressed microcode register configuration data is then corrected according to fuse correction data provided via bus 412.
- the corrected compressed microcode register configuration data is decompressed by the register fuses decompressor 409.
- the result of the decompression process is the decompressed microcode register configuration data 700.
- the data 700 comprises a plurality of decompressed data blocks 704 corresponding to the number of microcode registers 415 within the core 420 that require initialization data.
- Each decompressed data block 704 comprises a core address field 701, a microcode register address field 702, and a microcode register data field 703.
- the sizes of the fields 701-703 are a function of the core architecture.
- the register fuses decompressor 409 creates a complete image of the target data required to initialize the microcode registers 415.
- conventional distribution mechanisms may be employed to distribute the data 703 to respectively addressed core and microcode registers 415.
- In FIGURE 8, a block diagram is presented featuring an exemplary format for decompressed cache correction data 800 according to the present invention.
- compressed cache correction data is read by each core 420 from the physical fuse array 401.
- the compressed cache correction data is then corrected according to fuse correction data provided via bus 412.
- the corrected compressed cache correction data is decompressed by the cache fuses decompressor 410.
- the result of the decompression process is the decompressed cache correction data 800.
- Various cache mechanisms may be employed in the multi-core processor 330 and the decompressed cache correction data 800 is presented in the context of a shared L2 cache 334, where all of the cores 332 may access a single cache 334, utilizing shared areas.
- the data 800 comprises a plurality of decompressed data blocks 804 corresponding to the number of cache correction elements 416 within the core 420 that require corrective data.
- Each decompressed data block 804 comprises a sub-unit column address field 802 and a replacement column address field 803.
- memory caches are fabricated with redundant columns (or rows) in sub-units of the caches to allow for a functional redundant column (or row) in a particular sub-unit to be substituted for a non-functional column (or row).
- the decompressed cache correction data 800 allows for substitution of functional columns (as shown in FIGURE 8 ) for non-functional columns.
- conventional fuse array mechanisms associated with cache correction include fuses associated with each sub-unit column that are blown when substitution is required by redundant sub-unit columns. Accordingly, because such a large number of fuses are required (to address all sub-units and columns therein), only a portion of the sub-units are typically covered, and then the resulting conventional cache correction fuses are very sparsely blown. And the present inventors note that it is a feature of the present invention to address and compress sub-unit column addresses and replacement column addresses only for those sub-unit columns that require replacement, thus minimizing the number of fuses that are required to implement cache correction data.
- the present invention provides the potential for expanding the number of sub-unit columns (or rows) in a cache 334 that can be corrected over that which has heretofore been provided.
- the associated cores 332 are configured such that only one of the cores 332 sharing the L2 cache 334 would access and provide the corrective data 802-803 to its respective cache correction elements 416.
- the sizes of the fields 801-803 are a function of the core architecture.
- the cache correction fuses decompressor 410 creates a complete image of the target data required to initialize the cache correction elements 416.
- conventional distribution mechanisms in the responsible core 420 may be employed to distribute the data 802-803 to respectively addressed cache correction elements 416.
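The sparse representation discussed above, in which only columns that need replacement are recorded, can be sketched as follows. The names, the flat integer addressing, and the sample addresses are assumptions for illustration; the source does not define how the addresses in fields 802 and 803 are encoded.

```python
from typing import NamedTuple


class CorrectionBlock(NamedTuple):
    sub_unit_column: int      # field 802: address of the non-functional column
    replacement_column: int   # field 803: address of the redundant column


def build_column_map(blocks):
    """Build a remap table holding entries only for columns that need replacement."""
    remap = {}
    for block in blocks:
        if block.sub_unit_column in remap:
            raise ValueError("column corrected twice")
        remap[block.sub_unit_column] = block.replacement_column
    return remap


def resolve(remap, column):
    """Return the column actually used for an access to the given column."""
    return remap.get(column, column)


# Illustrative values only: two faulty columns mapped to redundant ones.
remap = build_column_map([
    CorrectionBlock(0x012, 0x100),
    CorrectionBlock(0x0A7, 0x101),
])
```

Because the table has one entry per faulty column rather than one fuse per column, its size tracks the number of defects instead of the size of the cache.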
- In FIGURE 9, a block diagram is presented showing an exemplary format for decompressed fuse correction data 900 according to the present invention.
- the fuse correction decompressor 411 accesses compressed fuse correction data 406 within the physical fuse array 401, decompresses the compressed fuse correction data, and supplies the resulting decompressed fuse correction data 900 to the other decompressors 408-410 within the core 420.
- the decompressed fuse correction data comprises one or more end-of-blow fields 901 that indicate the end of successive programming events in the physical fuse array 401.
- a reblow field 902 is programmed to indicate that a following one or more fuse correction fields 903 indicate fuses within the physical fuse array 401 that are to be reblown.
- Each of the fuse correction fields comprises an address of a specific fuse within the physical fuse array 401 that is to be reblown along with a state (i.e., blown or unblown) for the specific fuse. Only those fuses that are to be reblown are provided in the fuse correction fields 903, and each group of fields 903 within a given reblow event is demarcated by an end-of-blow field 901.
- if a reblow field 902, properly encoded, is present after a given end-of-blow field 901, then one or more subsequent fuses may be configured to be reblown as indicated by the corresponding fuse correction fields 903.
- the present invention provides the capability for a substantial number of reblows for the same fuse, as limited by array size and other data provided therein.
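A minimal sketch of how successive reblow events might be applied to the fuse states before decompression is shown below. It assumes the fuse correction data has already been decompressed into one list of (address, state) pairs per reblow event, in programming order; that representation and the function name are hypothetical.

```python
def apply_fuse_corrections(fuse_states, reblow_events):
    """Return the logical fuse states after applying every reblow event.

    fuse_states: list of booleans read from the array (True means blown).
    reblow_events: list of events in programming order, each a list of
    (address, state) pairs taken from the fuse correction fields.
    """
    corrected = list(fuse_states)
    for event in reblow_events:
        for address, blown in event:
            # A later event overrides an earlier one for the same address.
            corrected[address] = blown
    return corrected


raw = [True, False, True, False]
events = [
    [(1, True)],               # first reblow: fuse 1 treated as blown
    [(1, False), (2, False)],  # second reblow: fuse 1 reverted, fuse 2 cleared
]
result = apply_fuse_corrections(raw, events)
```

Applying events in order is what lets the same fuse be corrected more than once, with the most recent event taking effect.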
- the present inventors have also observed that the real estate and power gains associated with utilization of a shared physical fuse array within which compressed configuration data is stored presents opportunities for additional features disposed on a multi-core die.
- present day semiconductor fuse structures often suffer from several shortcomings, one of which is referred to as "growback." Growback is the reversal of the programming process such that a fuse will, after some time, reconnect after it has been blown, that is, it goes from a programmed (i.e., blown) state back to an unprogrammed (i.e., unblown) state.
- the present invention provides several advantages, one of which is provision of redundant, yet configurable, physical fuse arrays. Accordingly, a configurable, redundant fuse bank mechanism will now be presented with reference to FIGURE 10.
- In FIGURE 10, a block diagram is presented illustrating configurable, redundant fuse arrays 1001 in a multi-core device 1000 according to the present invention.
- the multi-core device 1000 includes a plurality of cores 1002 that are configured substantially as described above with reference to FIGURES 3-9.
- each of the cores 1002 includes array control 1003 that is programmed with configuration data within a configuration data register 1004.
- Each of the cores 1002 is coupled to the redundant fuse arrays 1001.
- For purposes of illustration, only four cores 1002 and two physical fuse arrays 1001 are shown; however, the present inventors note that the novel and inventive concepts according to the present invention can be extended to a plurality of cores 1002 of any number and to more than two physical fuse arrays 1001.
- each of the cores 1002 receives configuration data within the configuration data register 1004 that indicates a specified configuration for the physical fuse arrays 1001.
- the arrays may be configured according to the value of the configuration data as an aggregate physical fuse array. That is, the size of the aggregate physical fuse array is equal to the sum of the sizes of the individual physical fuse arrays 1001, and the aggregate physical fuse array may be employed to store substantially more configuration data than is provided for by a single one of the individual fuse arrays 1001. Accordingly, the array control 1003 directs its corresponding core 1002 to read the physical fuse arrays 1001 as an aggregate physical fuse array.
- the physical fuse arrays 1001 are configured as redundant fuse arrays that are programmed with the same configuration data, and the array control 1003 within each of the cores 1002 comprises elements that enable the contents of the two (or more) arrays to be logically OR-ed together so that if one or more of the blown fuses within a given array 1001 exhibits growback, at least one of its corresponding fuses in the remaining arrays 1001 will still be blown.
- one or more of the physical fuse arrays 1001 may be selectively disabled, and the remaining arrays 1001 enabled for use in either an aggregate configuration or a logically OR-ed configuration. Accordingly, the array control 1003 in each of the cores 1002 will not access contents of the selectively disabled arrays 1001, and will access the remaining arrays according to the configuration specified by the configuration data in the configuration data register 1004.
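The three configurations described above (aggregate, logically OR-ed, and selective disable) can be sketched as a single read function. The mode strings, the parameter names, and the list-of-booleans model of a fuse array are assumptions for illustration; the encoding of the configuration data register 1004 is not given in the source.

```python
def read_fuse_arrays(arrays, mode, disabled=frozenset()):
    """Combine the enabled physical fuse arrays according to the configured mode.

    arrays: list of fuse arrays, each a list of booleans (True means blown).
    mode: "aggregate" concatenates the arrays into one larger array;
          "redundant" ORs corresponding fuses so a blown state survives growback.
    disabled: indices of arrays that are not to be accessed.
    """
    active = [a for i, a in enumerate(arrays) if i not in disabled]
    if not active:
        raise ValueError("no fuse array is enabled")
    if mode == "aggregate":
        return [bit for array in active for bit in array]
    if mode == "redundant":
        if len({len(array) for array in active}) != 1:
            raise ValueError("redundant arrays must be the same size")
        return [any(bits) for bits in zip(*active)]
    raise ValueError("unknown mode: " + repr(mode))


array_a = [True, False, True, True]
array_b = [True, False, False, True]   # fuse 2 has grown back in this copy
merged = read_fuse_arrays([array_a, array_b], "redundant")
combined = read_fuse_arrays([array_a, array_b], "aggregate")
only_a = read_fuse_arrays([array_a, array_b], "aggregate", disabled={1})
```

In the redundant case a fuse reads as blown if any copy is still blown, so a single growback in one array does not change the recovered data.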
- the configuration data registers 1004 may be programmed by any of a number of well known means to include programmable fuses, external pin settings, JTAG programming, and the like.
- the present inventors have noted that there may exist challenges when one or more physical fuse arrays are disposed on a single die that comprises multiple cores which access the arrays. More specifically, upon power-up/reset each core in a multi-core processor must read the physical fuse array in a serial fashion. That is, a first core reads the array, then a second core, then a third core, and so on. As one skilled in the art will appreciate, compared to other operations performed by the core, the reading of a fuse array is exceedingly time consuming and, thus, when multiple cores must read the same array, the time required to do so is roughly the time required for one core to read the array multiplied by the number of cores on the die.
- semiconductor fuses degrade as they are read and there are lifetime limitations, according to fabrication process, for the reading of those fuses to obtain reliable results. Accordingly, another embodiment of the present invention is provided to 1) decrease the amount of time required for all cores to read a physical fuse array and 2) increase the overall lifetime of the fuse array by decreasing the number of accesses by the cores in a multi-core processor upon power-up/reset.
- In FIGURE 11, a block diagram is presented detailing a mechanism according to the present invention for rapidly loading configuration data into a multi-core device 1100.
- the device 1100 includes a plurality of cores 1102 that are configured substantially as described above with reference to FIGURES 3-10 .
- each of the cores 1102 includes array control 1103 that is programmed with load data within a load data register 1104.
- Each of the cores 1102 is coupled to a physical fuse array 1101 that is configured as described above with reference to FIGURES 3-10, and to a random access memory (RAM) 1105 that is disposed on the same die as the cores 1102, but which is not disposed within any of the cores 1102.
- the RAM 1105 is referred to as "uncore" RAM 1105.
- each of the cores 1102 receives load data within the load data register 1104 that indicates a specified load order for data corresponding to the physical fuse array 1101.
- the value of contents of the load data register 1104 designates one of the cores 1102 as a "master" core 1102, and the remaining cores as "slave" cores 1102 having a load order associated therewith.
- the array control 1103 directs the master core 1102 to read the contents of the physical fuse array 1101 and then to write the contents of the physical fuse array 1101 to the uncore RAM 1105. If a plurality of physical fuse arrays 1101 are disposed on the die, then the uncore RAM 1105 is sized appropriately to store the contents of the plurality of arrays 1101.
- the array control 1103 in each of the slave cores 1102 directs its corresponding slave core 1102 to read the fuse array contents from the uncore RAM 1105 in the order specified by contents of the load data register 1104.
- the load data registers 1104 may be programmed by any of a number of well known means to include programmable fuses, external pin settings, JTAG programming, and the like. It is also noted that the embodiment of FIGURE 11 may be employed in conjunction with any of the embodiments of the configurable, redundant fuse array mechanism discussed above with reference to FIGURE 10 .
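The master/slave loading scheme of FIGURE 11 can be sketched as below. The `FuseArray` class, its read counter, and the representation of the load order as a list with the master core first are assumptions for illustration; the source says only that the load data designates one master core and an order for the slave cores.

```python
class FuseArray:
    """Toy model of a physical fuse array that counts how often it is read."""

    def __init__(self, bits):
        self._bits = list(bits)
        self.read_count = 0

    def read(self):
        self.read_count += 1
        return list(self._bits)


def load_configuration(fuse_array, load_order):
    """Load fuse data into every core using a single read of the fuse array.

    load_order: core identifiers, master first, then slaves in load order.
    Returns a dict mapping each core identifier to its copy of the data.
    """
    master, *slaves = load_order
    uncore_ram = fuse_array.read()          # master reads the array once
    per_core = {master: list(uncore_ram)}   # and keeps the contents
    for core in slaves:
        per_core[core] = list(uncore_ram)   # slaves read from uncore RAM only
    return per_core


array = FuseArray([True, False, True])
loaded = load_configuration(array, ["core2", "core0", "core1", "core3"])
```

In this model the fuse array is read once regardless of the number of cores, which corresponds to the two stated goals of shortening the load time and reducing read wear on the fuses.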
- In FIGURE 12, a block diagram 1200 is presented illustrating an error checking and correction (ECC) mechanism according to the present invention.
- the ECC mechanism may be employed in conjunction with any of the embodiments of the present invention described above with reference to FIGURES 3-11 and provides for another layer of robustness for the compression and decompression of configuration data.
- the diagram depicts a microprocessor core 1220 disposed on a die that is coupled to a physical fuse array 1201 comprising compressed configuration data blocks 1203 as is described above.
- the array 1201 includes ECC code blocks 1202 that each are associated with a corresponding one of the data blocks 1203.
- the data blocks 1203 are 64 bits (i.e., fuses) in size and the ECC code blocks 1202 are 8 bits in size.
- the core 1220 includes a reset controller 1222 that receives a reset signal RESET.
- the reset controller 1222 has an ECC element 1224 that is coupled to a decompressor 1226 via bus CDATA.
- the ECC element 1224 is coupled to the fuse array 1201 via an address bus ADDR, a data bus DATA, and a code bus CODE.
- the fuse array 1201 is programmed with configuration data in the data blocks 1203 as is described above with reference to FIGURES 3-11 .
- the configuration data corresponding to a particular data type (e.g., microcode patch data, microcode register data)
- configuration data corresponding to two or more types of configuration data may be programmed into the same data block 1203.
- the array 1201 is programmed with ECC codes in the ECC code blocks 1202 that each result from ECC generation for the data programmed into a corresponding data block 1203 according to one of a number of well known ECC mechanisms including, but not limited to, SECDED Hamming ECC, Chipkill ECC, and variations of forward error correction (FEC) codes.
- the addresses associated with a given data block 1203 and its corresponding ECC code block 1202 are known. Thus, it is not required that the corresponding ECC code block 1202 be located adjacent to the given data block 1203, as is depicted in FIGURE 12 .
- the decompressor 1226 is configured and functions substantially similar to the decompressor 421 described above with reference to FIGURE 4, and as alluded to with reference to FIGURES 5-11.
- the ECC element 1224 within the reset controller 1222 accesses the fuse array 1201 to obtain its contents. Addresses associated with given data blocks 1203 and code blocks 1202 may be obtained via bus ADDR. Compressed configuration data within each of the data blocks 1203 may be obtained via bus DATA. And ECC codes for each of the ECC code blocks 1202 may be obtained via bus CODE.
- the ECC element 1224 operates to generate ECC checks for the data retrieved for each data block 1203 according to the ECC mechanism that was employed to generate the ECC code stored in the corresponding ECC code block 1202.
- the ECC element 1224 also compares the ECC checks with corresponding ECC codes obtained from the array 1201 to produce ECC syndromes.
- the ECC element 1224 further decodes the ECC syndromes to determine if no error occurred, a correctable error occurred, or a non-correctable error occurred.
- the ECC element 1224 moreover operates to correct correctable errors. Correct and corrected data is then routed to the decompressor 1226 via bus CDATA for decompression as described above.
- Non-correctable data is also passed to the decompressor 1226 via bus CDATA along with an indication of such. If an operationally critical portion of the configuration data is determined to be non-correctable, the decompressor 1226 may cause the core 1220 to shut down or otherwise flag the error.
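The check, syndrome, and decode steps described above can be illustrated with a generic SECDED Hamming code over a 64-bit data block and an 8-bit code block. This is a textbook (72,64) construction chosen for illustration, not necessarily the code used in the described device; the bit layout, function names, and status strings are assumptions.

```python
# Codeword positions 1..71; powers of two hold the seven Hamming check bits.
PARITY_POS = (1, 2, 4, 8, 16, 32, 64)
DATA_POS = [p for p in range(1, 72) if p not in PARITY_POS]  # 64 positions


def _parity(x):
    return bin(x).count("1") & 1


def _hamming_check(data):
    """XOR the positions of all set data bits; yields the 7 check bits."""
    check = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            check ^= pos
    return check


def encode(data):
    """Return the 8-bit ECC code for a 64-bit data block."""
    check = _hamming_check(data)
    overall = _parity(data) ^ _parity(check)   # parity over data and check bits
    return check | (overall << 7)


def decode(data, code):
    """Return (data, status): 'ok', 'corrected', or 'uncorrectable'."""
    stored_check = code & 0x7F
    stored_overall = (code >> 7) & 1
    syndrome = _hamming_check(data) ^ stored_check
    odd = _parity(data) ^ _parity(stored_check) ^ stored_overall
    if syndrome == 0 and not odd:
        return data, "ok"
    if odd:  # an odd number of flipped bits, treated as a single-bit error
        if syndrome in DATA_POS:
            return data ^ (1 << DATA_POS.index(syndrome)), "corrected"
        if syndrome == 0 or syndrome in PARITY_POS:
            return data, "corrected"   # the flipped bit was in the code block
    return data, "uncorrectable"       # double-bit (or worse) error


block = 0x0123456789ABCDEF
code = encode(block)
```

A single flipped fuse in either the data block or the code block is corrected, while two flipped fuses are reported as uncorrectable so that the decompressor can flag the error.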
- the ECC element 1224 comprises one or more microcode routines that are executed to perform the ECC functions noted above.
- the software implemented aspects of the invention are typically encoded on some form of program storage medium or implemented over some type of transmission medium.
- the program storage medium may be electronic (e.g., read only memory, flash read only memory, electrically programmable read only memory, random access memory), magnetic (e.g., a floppy disk or a hard drive), or optical (e.g., a compact disk read only memory, or "CD ROM"), and may be read only or random access.
- the transmission medium may be metal traces, twisted wire pairs, coaxial cable, optical fiber, or some other suitable transmission medium known to the art. The invention is not limited by these aspects of any given implementation.
Abstract
Description
- This application is related to the following co-pending U.S. Patent Applications, each of which has a common assignee and common inventors.
SERIAL NUMBER FILING DATE TITLE
(CNTR.2617) ________________ APPARATUS AND METHOD FOR STORAGE AND DECOMPRESSION OF CONFIGURATION DATA
(CNTR.2672) ________________ MULTI-CORE FUSE DECOMPRESSION MECHANISM
(CNTR.2673) ________________ EXTENDED FUSE REPROGRAMMABILITY MECHANISM
(CNTR.2675) ________________ CORE-SPECIFIC FUSE MECHANISM FOR A MULTI-CORE DIE
(CNTR.2686) ________________ APPARATUS AND METHOD FOR CONFIGURABLE REDUNDANT FUSE BANKS
(CNTR.2687) ________________ APPARATUS AND METHOD FOR RAPID FUSE BANK ACCESS IN A MULTI-CORE PROCESSOR
(CNTR.2697) ________________ MULTI-CORE MICROPROCESSOR CONFIGURATION DATA COMPRESSION AND DECOMPRESSION SYSTEM
(CNTR.2698) ________________ APPARATUS AND METHOD FOR COMPRESSION OF CONFIGURATION DATA
(CNTR.2699) ________________ MICROPROCESSOR MECHANISM FOR DECOMPRESSION OF FUSE CORRECTION DATA
(CNTR.2700) ________________ MICROPROCESSOR MECHANISM FOR DECOMPRESSION OF CACHE CORRECTION DATA
(CNTR.2705) ________________ APPARATUS AND METHOD FOR COMPRESSION AND DECOMPRESSION OF MICROPROCESSOR CONFIGURATION DATA
(CNTR.2706) ________________ CORRECTABLE CONFIGURATION DATA COMPRESSION AND DECOMPRESSION SYSTEM
- This invention relates in general to the field of microelectronics, and more particularly to apparatus and methods for providing compressed configuration data in a fuse array associated with a multi-core device.
- Integrated device technologies have exponentially advanced over the past 40 years. More specifically directed to the microprocessor fields, starting with 4-bit, single instruction, 10-micrometer devices, the advances in semiconductor fabrication technologies have enabled designers to provide increasingly more complex devices in terms of architecture and density. In the 80's and 90's so-called pipeline microprocessors and superscalar microprocessors were developed comprising millions of transistors on a single die. And now 20 years later, 64-bit, 32-nanometer devices are being produced that have billions of transistors on a single die and which comprise multiple microprocessor cores for the processing of data.
- One requirement that has persisted since these early devices were produced is the need to initialize these devices with configuration data when they are turned on or when they are reset. For example, many architectures enable devices to be configured to execute at one of many selectable frequencies and/or voltages. Other architectures require that each device have a serial number and other information that can be read via execution of an instruction. Yet other devices require initialization data for internal registers and control circuits. Still other devices utilize configuration data to implement redundant circuits when primary circuits are fabricated in error or outside of marginal constraints.
- As one skilled in the art will appreciate, designers have traditionally employed semiconductor fuse arrays on-die to store and provide initial configuration data. These fuse arrays are generally programmed by blowing selected fuses therein after a part has been fabricated and the arrays contain thousands of bits of information which is read by its corresponding device upon power-up/reset to initialize and configure the device for operation.
- As device complexity has increased over the past years, the amount of configuration data that is required for a typical device has proportionately increased. Yet, as one skilled in the art will appreciate, though transistor size shrinks in proportion to the semiconductor fabrication process employed, semiconductor fuse size increases due to the unique requirements for programming fuses on die. This phenomenon, in and of itself, is a problem for designers, who are prevalently constrained by real estate and power considerations. That is, there is just not enough real estate on a given die to fabricate a huge fuse array.
- In addition, the ability to fabricate multiple device cores on a single die has geometrically exacerbated the problem, because the configuration requirements for each of the cores result in a requirement for a number of fuses on die, in a single array or distinct arrays, that is multiplied by the number of cores disposed thereon.
- Therefore, what is needed is apparatus and methods that enable configuration data to be stored and provided to a multi-core device that require significantly less real estate and power on a single die than that which has heretofore been provided.
- In addition, what is needed is a fuse array mechanism that can store and provide significantly more configuration data than current techniques while requiring the same or less real estate on a multi-core die.
- The present invention, among other applications, is directed to solving the above-noted problems and addresses other problems, disadvantages, and limitations of the prior art by providing a superior technique for utilizing compressed configuration data in a fuse array associated with a multi-core device. In one embodiment, an apparatus is contemplated for providing configuration data to an integrated circuit. The apparatus includes a semiconductor fuse array, a cache memory, and a plurality of cores. The semiconductor fuse array is disposed on a die, into which is programmed the configuration data. The semiconductor fuse array has a first plurality of semiconductor fuses that is configured to store compressed cache correction data. The cache memory is disposed on the die. The plurality of cores is disposed on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory, and is configured to access the semiconductor fuse array upon power-up/reset, to decompress the compressed cache correction data, and to distribute decompressed cache correction data to initialize the cache memory.
- One aspect of the present invention contemplates an apparatus for providing configuration data to an integrated circuit. The apparatus includes a multi-core microprocessor. The multi-core microprocessor has a semiconductor fuse array, disposed on a die, into which is programmed the configuration data. The semiconductor fuse array has a first plurality of semiconductor fuses configured to store compressed cache correction data. The multi-core microprocessor also has a cache memory, disposed on the die, and a plurality of cores, disposed on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory, and is configured to access the semiconductor fuse array upon power-up/reset, to decompress the compressed cache correction data, and to distribute decompressed cache correction data to initialize the cache memory.
- Another aspect of the present invention contemplates a method for providing configuration data to an integrated circuit. The method includes first disposing a semiconductor fuse array on a die. The first disposing includes storing compressed cache correction data in a first plurality of semiconductor fuses. The method also includes second disposing a cache memory on the die, and third disposing a plurality of cores on the die, where each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory. The method further includes, via each of the plurality of cores, accessing the semiconductor fuse array upon power-up/reset, decompressing the compressed cache correction data, and distributing decompressed cache correction data to initialize the cache memory.
- Regarding industrial applicability, the present invention is implemented within a MICROPROCESSOR which may be used in a general purpose or special purpose computing device.
- These and other objects, features, and advantages of the present invention will become better understood with regard to the following description, and accompanying drawings where:
-
FIGURE 1 is a block diagram illustrating a present day microprocessor core that includes a fuse array for providing configuration data to the microprocessor core; -
FIGURE 2 is a block diagram depicting a fuse array within the microprocessor core of FIGURE 1 which includes redundant fuse banks that may be blown subsequent to blowing first fuse banks within the fuse array; -
FIGURE 3 is a block diagram featuring a system according to the present invention that provides for compression and decompression of configuration data for a multi-core device; -
FIGURE 4 is a block diagram showing a fuse decompression mechanism according to the present invention; -
FIGURE 5 is a block diagram illustrating an exemplary format for compressed configuration data according to the present invention; -
FIGURE 6 is a block diagram illustrating an exemplary format for decompressed microcode patch configuration data according to the present invention; -
FIGURE 7 is a block diagram depicting an exemplary format for decompressed microcode register configuration data according to the present invention; -
FIGURE 8 is a block diagram featuring an exemplary format for decompressed cache correction data according to the present invention; -
FIGURE 9 is a block diagram showing an exemplary format for decompressed fuse correction data according to the present invention; -
FIGURE 10 is a block diagram illustrating configurable redundant fuse arrays in a multi-core device according to the present invention; -
FIGURE 11 is a block diagram detailing a mechanism according to the present invention for rapidly loading configuration data into a multi-core device; and -
FIGURE 12 is a block diagram showing an error checking and correction mechanism according to the present invention. - Exemplary and illustrative embodiments of the invention are described below. In the interest of clarity, not all features of an actual implementation are described in this specification, for those skilled in the art will appreciate that in the development of any such actual embodiment, numerous implementation specific decisions are made to achieve specific goals, such as compliance with system-related and business related constraints, which vary from one implementation to the next. Furthermore, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure. Various modifications to the preferred embodiment will be apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described herein, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed.
- The present invention will now be described with reference to the attached figures. Various structures, systems, and devices are schematically depicted in the drawings for purposes of explanation only and so as to not obscure the present invention with details that are well known to those skilled in the art. Nevertheless, the attached drawings are included to describe and explain illustrative examples of the present invention. The words and phrases used herein should be understood and interpreted to have a meaning consistent with the understanding of those words and phrases by those skilled in the relevant art. No special definition of a term or phrase (i.e., a definition that is different from the ordinary and customary meaning as understood by those skilled in the art) is intended to be implied by consistent usage of the term or phrase herein. To the extent that a term or phrase is intended to have a special meaning (i.e., a meaning other than that understood by skilled artisans) such a special definition will be expressly set forth in the specification in a definitional manner that directly and unequivocally provides the special definition for the term or phrase.
- In view of the above background discussion on device fuse arrays and associated techniques employed within present day integrated circuits for providing configuration data during initial power-up, a discussion of the limitations and disadvantages of those techniques will be presented with reference to
FIGURES 1-2. Following this, a discussion of the present invention will be presented with reference to FIGURES 3-12. The present invention overcomes all of the limitations and disadvantages discussed below by providing apparatus and methods for employing compressed configuration data in a multi-core die which utilize less power and real estate on the multi-core die, and which are more reliable than that which has heretofore been provided. - Integrated Circuit (IC): A set of electronic circuits fabricated on a small piece of semiconductor material, typically silicon. An IC is also referred to as a chip, a microchip, or a die.
- Central Processing Unit (CPU): The electronic circuits (i.e., "hardware") that execute the instructions of a computer program (also known as a "computer application" or "application") by performing operations on data that include arithmetic operations, logical operations, and input/output operations.
- Microprocessor: An electronic device that functions as a CPU on a single integrated circuit. A microprocessor receives digital data as input, processes the data according to instructions fetched from a memory (either on-die or off-die), and generates results of operations prescribed by the instructions as output. A general purpose microprocessor may be employed in a desktop, mobile, or tablet computer, and is employed for uses such as computation, text editing, multimedia display, and Internet browsing. A microprocessor may also be disposed in an embedded system to control a wide variety of devices including appliances, mobile telephones, smart phones, and industrial control devices.
- Multi-Core Processor: Also known as a multi-core microprocessor, a multi-core processor is a microprocessor having multiple CPUs ("cores") fabricated on a single integrated circuit.
- Instruction Set Architecture (ISA) or Instruction Set: A part of a computer architecture related to programming that includes data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and input/output. An ISA includes a specification of the set of opcodes (i.e., machine language instructions), and the native commands implemented by a particular CPU.
- x86-Compatible Microprocessor: A microprocessor capable of executing computer applications that are programmed according to the x86 ISA.
- Microcode: A term employed to refer to a plurality of micro instructions. A micro instruction (also referred to as a "native instruction") is an instruction at the level that a microprocessor sub-unit executes. Exemplary sub-units include integer units, floating point units, MMX units, and load/store units. For example, micro instructions are directly executed by a reduced instruction set computer (RISC) microprocessor. For a complex instruction set computer (CISC) microprocessor such as an x86-compatible microprocessor, x86 instructions are translated into associated micro instructions, and the associated micro instructions are directly executed by a sub-unit or sub-units within the CISC microprocessor.
- Fuse: A conductive structure typically arranged as a filament which can be broken at select locations by applying a voltage across the filament and/or current through the filament. Fuses may be deposited at specified areas across a die topography using well known fabrication techniques to produce filaments at all potential programmable areas. A fuse structure is blown (or unblown) subsequent to fabrication to provide for desired programmability of a corresponding device disposed on the die.
- Turning to
FIGURE 1, a block diagram 100 is presented illustrating a present day microprocessor core 101 that includes a fuse array 102 for providing configuration data to the microprocessor core 101. The fuse array 102 comprises a plurality of semiconductor fuses (not shown) typically arranged in groups known as banks. The fuse array 102 is coupled to reset logic 103 that includes both reset circuits 104 and reset microcode 105. The reset logic 103 is coupled to control circuits 107, microcode registers 108, microcode patch elements 109, and cache correction elements 110. An external reset signal RESET is coupled to the microprocessor core 101 and is routed to the reset logic 103. - As one skilled in the art will appreciate, fuses (also called "links" or "fuse structures") are employed in a vast number of present day integrated circuit devices to provide for configuration of the devices after the devices have been fabricated. For example, consider that the
microprocessor core 101 of FIGURE 1 is fabricated to provide functionality selectively either as a desktop device or a mobile device. Accordingly, following fabrication, prescribed fuses within the fuse array 102 may be blown to configure the device as, say, a mobile device. Accordingly, upon assertion of RESET, the reset logic 103 reads the state of the prescribed fuses in the fuse array 102 and the reset circuits 104 (rather than reset microcode 105, in this example) enable corresponding control circuits 107 that deactivate elements of the core 101 exclusively associated with desktop operations and activate elements of the core 101 exclusively associated with mobile operations. Consequently, the core 101 is configured upon power-up reset as a mobile device. In addition, the reset logic 103 reads the state of the other fuses in the fuse array 102 and the reset circuits 104 (rather than reset microcode 105, in this example) enable corresponding cache correction elements 110 to provide corrective mechanisms for one or more cache memories (not shown) associated with the core 101. Consequently, the core 101 is configured upon power-up reset as a mobile device and corrective mechanisms for its cache memories are in place. - The above example is merely one of many different uses for configuration fuses in an integrated circuit device such as a
microprocessor core 101 of FIGURE 1. One skilled in the art will appreciate that other uses for configuration fuses include, but are not limited to, configuration of device specific data (e.g., serial numbers, unique cryptographic keys, architecture mandated data that can be accessed by users, speed settings, voltage settings), initialization data, and patch data. For example, many present day devices execute microcode and often require initialization of registers 108 that are read by the microcode. Such initialization data may be provided by microcode register fuses (not shown) within the fuse array 102, which are read upon reset and provided to the microcode registers 108 by the reset logic 103 (using either the reset circuits 104, the reset microcode 105, or both elements 104-105). For purposes of the present application, the reset circuits 104 comprise hardware elements that provide certain types of configuration data, which cannot be provided via the execution of the reset microcode 105. The reset microcode 105 comprises a plurality of micro instructions disposed within an internal microcode memory (not shown) that is executed upon reset of the core 101 to perform functions corresponding to initialization of the core 101, those functions including provision of configuration data that is read from the fuse array 102 to elements such as microcode registers 108 and microcode patch mechanisms 109. The criteria for whether certain types of configuration data provided via fuses can be distributed to the various elements 107-110 in the core 101 via reset microcode 105 or not is a function primarily of the specific design of the core 101. 
It is not the intent of the present application to provide a comprehensive tutorial on specific configuration techniques that are employed to initialize integrated circuit devices, for one skilled in the art will appreciate that for a present day microprocessor core 101 the types of configurable elements 107-110 generally fall into four categories as are exemplified in FIGURE 1: control circuits, microcode registers, microcode patch mechanisms, and cache correction mechanisms. Furthermore, one skilled in the art will appreciate that the specific values of the configuration data vary significantly based upon the specific type of data. For instance, a 64-bit control circuit 107 may include ASCII data that prescribes a serial number for the core 101. Another 64-bit control register may have 64 different speed settings, only one of which is asserted to specify an operating speed for the core 101. Microcode registers 108 may typically be initialized to all zeros (i.e., logic low states) or to all ones (i.e., logic high states). Microcode patch mechanisms 109 may include an approximately uniform distribution of ones and zeros to indicate addresses in a microcode ROM (not shown) along with replacement microcode values for those addresses. Finally, cache correction mechanisms may comprise very sparse settings of ones to indicate substitution control signals to replace a certain cache sub-bank element (i.e., a row or a column) with a particular replacement sub-bank element. - Fuse arrays 102 provide an excellent means for configuring a device such as a
microprocessor core 101 subsequent to fabrication of the device. By blowing selected fuses in the fuse array 102, the core can be configured for operation in its intended environment. Yet, as one skilled in the art will appreciate, operating environments may change following programming of the fuse array 102. Business requirements may dictate that a device 101 originally configured as, say, a desktop device 101, be reconfigured as a mobile device 101. Accordingly, designers have provided techniques that utilize redundant banks of fuses within the fuse array 102 to provide for "unblowing" selected fuses therein, thus enabling the device 101 to be reconfigured, fabrication errors to be corrected, and so on. These redundant array techniques will now be discussed with reference to FIGURE 2. - Referring now to
FIGURE 2, a block diagram 200 is presented depicting a fuse array 201 within the microprocessor core 101 of FIGURE 1 including redundant fuse banks 202 RFB1-RFBN that may be blown subsequent to blowing first fuse banks 202 PFB1-PFBN within the fuse array 201. Each of the fuse banks 202 PFB1-PFBN, RFB1-RFBN comprises a prescribed number of individual fuses 203 corresponding to the specific design of the core 101. For example, the number of fuses 203 in a given fuse bank 202 may be 64 fuses 203 in a 64-bit microprocessor core 101 to facilitate provision of configuration data in a format that is easily implemented in the core 101. - The
fuse array 201 is coupled to a set of registers 210-211 that are typically disposed within reset logic in the core 101. A primary register PR1 is employed to read one of the first fuse banks PFB1-PFBN (say, PFB3 as is shown in the diagram 200) and a redundant register RR1 is employed to read a corresponding one of the redundant fuse banks RFB1-RFBN. The registers 210-211 are coupled to exclusive-OR logic 212 that generates an output FB3. - In operation, subsequent to fabrication of the
core 101, the first fuse banks PFB1-PFBN are programmed by known techniques with configuration data for the core 101. The redundant fuse banks RFB1-RFBN are not blown and remain at a logic low state for all fuses therein. Upon power-up/reset of the core 101, both the first fuse banks PFB1-PFBN and the redundant fuse banks RFB1-RFBN are read as required for configuration into the primary and redundant registers 210-211, respectively. The exclusive-OR logic 212 generates the output FB3 that is a logical exclusive-OR result of the contents of the registers 210-211. Since all of the redundant fuse banks are unblown (i.e., logic low states), the output FB3 value is simply that which was programmed into the first fuse banks PFB1-PFBN subsequent to fabrication. - Consider now, though, that design or business requirements dictate that some of the information that was programmed into the first fuse banks PFB1-PFBN needs to change. Accordingly, a programming operation is performed to blow
corresponding fuses 203 within the redundant fuse banks RFB1-RFBN in order to change the information that is read at power-up. By blowing a fuse 203 in a selected redundant bank RFB1-RFBN, the value of a corresponding fuse 203 in the primary fuse bank PFB1-PFBN is logically complemented. - The mechanism of
FIGURE 2 may be employed to provide for "reblow" of fuses 203 within a device 101, but as one skilled in the art will appreciate, a given fuse 203 may only be reblown one time as there is only one set of redundant fuse banks RFB1-RFBN. To provide for additional reblows, a corresponding number of additional fuse banks 202 and registers 210-211 must be added to the part 101. - Heretofore, the fuse array mechanisms as discussed above with reference to
FIGURES 1-2 have provided enough flexibility to sufficiently configure microprocessor cores and other related devices, while also allowing for a limited number of reblows. This is primarily due to the fact that former fabrication technologies, say 65 nanometer and 45 nanometer processes, allow ample real estate on a die for the implementation of enough fuses to provide for configuration of a core 101 disposed on the die. However, the present inventors have observed that present day techniques are limited going forward due to two significant factors. First, the trend in the art is to dispose multiple device cores 101 on a single die to increase processing performance. These so-called multi-core devices may include, say, 2-16 individual cores 101, each of which must be configured with fuse data upon power-up/reset. Accordingly, for a 4-core device, four fuse arrays 201 are required in that some of the data associated with individual cores may vary (e.g., cache correction data, redundant fuse data, etc.). Secondly, as one skilled in the art will appreciate, as fabrication process technologies shrink to, say, 32 nanometers, while transistor size shrinks accordingly, fuse size increases, thus requiring more die real estate to implement the same size fuse array on a 32-nanometer die as opposed to that on a 45-nanometer die. - Both of the above limitations, and others, pose significant challenges to device designers, and more specifically to multi-core device designers, and the present inventors note that significant improvements over conventional device configuration mechanisms can be implemented in accordance with the present invention, which allows for programming of individual cores in a multi-core device along with substantial increases in cache correction and fuse reprogramming ("reblow") elements. The present invention will now be discussed with reference to
FIGURES 3-12. - Turning to
FIGURE 3, a block diagram is presented featuring a system 300 according to the present invention that provides for compression and decompression of configuration data for a multi-core device. The multi-core device comprises a plurality of cores 332 disposed on a die 330. For illustrative purposes, four cores 332 CORE 1-CORE 4 are depicted on the die 330, although the present invention contemplates various numbers of cores 332 disposed on the die 330. In one embodiment, all the cores 332 share a single cache memory 334 that is also disposed on the die 330. A single programmable fuse array 336 is also disposed on the die 330 and each of the cores 332 is configured to access the fuse array 336 to retrieve and decompress configuration data as described above during power-up/reset. - In one embodiment, the
cores 332 comprise microprocessor cores configured as a multi-core microprocessor 330. In another embodiment, the multi-core microprocessor 330 is configured as an x86-compatible multi-core microprocessor. In yet another embodiment, the cache 334 comprises a level 2 (L2) cache 334 associated with the microprocessor cores 332. In one embodiment, the fuse array 336 comprises 8192 (8K) individual fuses (not shown), although other numbers of fuses are contemplated. In a single-core embodiment, only one core 332 is disposed on the die 330 and the core 332 is coupled to the cache 334 and physical fuse array 336. The present inventors note that although features and functions of the present invention will henceforth be discussed in the context of a multi-core device 330, these features and functions are equally applicable to a single-core embodiment as well. - The
system 300 also includes a device programmer 310 that includes a compressor 320 that is coupled to a virtual fuse array 303. In one embodiment, the device programmer 310 may comprise a CPU (not shown) that is configured to process configuration data and to program the fuse array 336 following fabrication of the die 330 according to well known programming techniques. The CPU may be integrated into a wafer test apparatus that is employed to test the device die 330 following fabrication. In one embodiment, the compressor 320 may comprise an application program that executes on the device programmer 310 and the virtual fuse array 303 may comprise locations within a memory that is accessed by the compressor 320. The virtual fuse array 303 includes a plurality of virtual fuse banks 301 that each comprise a plurality of virtual fuses 302. In one embodiment the virtual fuse array 303 comprises 128 virtual fuse banks 301 that each comprise 64 virtual fuses 302, resulting in a virtual array 303 that is 8 Kb in size. - Operationally, configuration information for the
device 330 is entered into the virtual fuse array 303 as part of the fabrication process, and as is described above with reference to FIGURE 1. Accordingly, the configuration information comprises control circuits configuration data, initialization data for microcode registers, microcode patch data, and cache correction data. Further, as described above, the distributions of values associated with each of the data types are substantially different from type to type. The virtual fuse array 303 is a logical representation of a fuse array (not shown) that comprises configuration information for each of the microprocessor cores 332 on the die 330 and correction data for each of the caches 334 on the die 330. - After the information is entered into the
virtual fuse array 303, the compressor 320 reads the state of the virtual fuses 302 in each of the virtual fuse banks 301 and compresses the information using distinct compression algorithms corresponding to each of the data types to render compressed fuse array data 303. In one embodiment, system data for control circuits is not compressed, but rather is transferred without compression. To compress microcode register data, a microcode register data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the microcode register data. To compress microcode patch data, a microcode patch data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the microcode patch data. To compress cache correction data, a cache correction data compression algorithm is employed that is effective for compressing data having a state distribution that corresponds to the cache correction data. - The
device programmer 310 then programs the uncompressed and compressed fuse array data into the physical fuse array 336 on the die 330. - Upon power-up/reset, each of the
cores 332 may access the physical fuse array 336 to retrieve the uncompressed and compressed fuse array data, and reset circuits/microcode (not shown) disposed within each of the cores 332 distribute the uncompressed fuse array data and decompress the compressed fuse array data according to distinct decompression algorithms corresponding to each of the data types noted above to render the values originally entered into the virtual fuse array 303. The reset circuits/microcode then enter the configuration information into control circuits (not shown), microcode registers (not shown), patch elements (not shown), and cache correction elements (not shown). - Advantageously, the fuse
array compression system 300 according to the present invention enables device designers to employ substantially fewer fuses in a physical fuse array 336 than that which has heretofore been provided, and to utilize the compressed information programmed therein to configure a multi-core device 330 during power-up/reset. - Turning now to
FIGURE 4, a block diagram 400 is presented showing a fuse decompression mechanism according to the present invention. The decompression mechanism may be disposed within each of the microprocessor cores 332 of FIGURE 3. For purposes of clearly teaching the present invention, only one core 420 is depicted in FIGURE 4 and each of the cores 332 disposed on the die comprises substantially equivalent elements as the core 420 shown. A physical fuse array 401 disposed on the die as described above is coupled to the core 420. The physical fuse array 401 comprises compressed microcode patch fuses 403, compressed register fuses 404, compressed cache correction fuses 405, and compressed fuse correction fuses 406. The physical fuse array 401 may also comprise uncompressed configuration data (not shown) such as system configuration data as discussed above and/or block error checking and correction (ECC) codes (not shown). The inclusion of ECC features according to the present invention will be discussed in further detail below. - The
microprocessor core 420 comprises a reset controller 417 that receives a reset signal RESET which is asserted upon power-up of the core 420 and in response to events that cause the core 420 to initiate a reset sequence of steps. The reset controller 417 includes a decompressor 421. The decompressor 421 has a patch fuses element 408, a register fuses element 409, and a cache fuses element 410. The decompressor 421 also comprises a fuse correction element 411 that is coupled to the patch fuses element 408, the register fuses element 409, and the cache fuses element 410 via bus 412. The patch fuses element 408 is coupled to microcode patch elements 414 in the core 420. The register fuses element 409 is coupled to microcode registers 415 in the core 420. And the cache fuses element 410 is coupled to cache correction elements 416 in the core 420. In one embodiment, the cache correction elements 416 are disposed within an on-die L2 cache (not shown) that is shared by all the cores 420, such as the cache 334 of FIGURE 3. Another embodiment contemplates cache correction elements 416 disposed within an L1 cache (not shown) within the core 420. A further embodiment considers cache correction elements 416 disposed to correct both the L2 and L1 caches described above. - In operation, upon assertion of RESET the
reset controller 417 reads the states of the fuses 402-406 in the physical fuse array 401 and distributes the states of the compressed system fuses 402 to the decompressor 421. After the fuse data has been read and distributed, the fuse correction element 411 of the decompressor 421 decompresses the states of the compressed fuse correction fuses 406 to render data that indicates one or more fuse addresses in the physical fuse array 401 whose states are to be changed from that which was previously programmed. The data may also include a value for each of the one or more fuse addresses. The one or more fuse addresses (and optional values) are routed via bus 412 to the elements 408-410 so that the states of corresponding fuses processed therein are changed prior to decompression of their corresponding compressed data. - In one embodiment, the patch fuses
element 408 comprises microcode that operates to decompress the states of the compressed microcode patch fuses 403 according to a microcode patch decompression algorithm that corresponds to the microcode patch compression algorithm described above with reference to FIGURE 3. In one embodiment, the register fuses element 409 comprises microcode that operates to decompress the states of the compressed register fuses 404 according to a register fuses decompression algorithm that corresponds to the register fuses compression algorithm described above with reference to FIGURE 3. In one embodiment, the cache fuses element 410 comprises microcode that operates to decompress the states of the compressed cache correction fuses 405 according to a cache correction fuses decompression algorithm that corresponds to the cache correction fuses compression algorithm described above with reference to FIGURE 3. After each of the elements 408-410 changes the states of any fuses whose addresses (and optional values) are provided via bus 412 from the fuse correction element 411, its respective data is decompressed according to the corresponding algorithm employed. As will be described in further detail below, the present invention contemplates multiple "reblows" of any fuse address within the physical fuse array prior to the initiation of the decompression process executed by any of the elements 408-411. In one embodiment, bus 412 may comprise conventional microcode programming mechanisms that are employed to transfer data between respective routines therein. The present invention further contemplates a comprehensive decompressor 421 having capabilities to recognize and decompress configuration data based upon its specific type. 
Accordingly, the recited elements 408-411 within the decompressor 421 are presented in order to teach relevant aspects of the present invention; however, contemplated implementations of the present invention may not necessarily include distinct elements 408-411, but rather a comprehensive decompressor 421 that provides functionality corresponding to each of the elements 408-411 discussed above.
- In one embodiment, the
reset controller 417 initiates execution of microcode within the patch fuses element 408 to decompress the states of the compressed microcode patch fuses 403. The reset controller 417 also initiates execution of microcode within the register fuses element 409 to decompress the states of the compressed register fuses 404. And the reset controller 417 further initiates execution of microcode within the cache fuses element 410 to decompress the states of the compressed cache correction fuses 405. The microcode within the decompressor 421 also operates to change the states of any fuses addressed by fuse correction data provided by the compressed fuse correction fuses 406 prior to decompression of the compressed data.
- The
reset controller 417, decompressor 421, and elements 408-411 therein according to the present invention are configured to perform the functions and operations as discussed above. The reset controller 417, decompressor 421, and elements 408-411 therein may comprise logic, circuits, devices, or microcode, or a combination of logic, circuits, devices, or microcode, or equivalent elements that are employed to execute the functions and operations according to the present invention as noted. The elements employed to accomplish these operations and functions within the reset controller 417, decompressor 421, and elements 408-411 therein may be shared with other circuits, microcode, etc., that are employed to perform other functions and/or operations within the reset controller 417, decompressor 421, and elements 408-411 therein or with other elements within the core 420.
- After the states of the fuses 403-406 within the
physical fuse array 401 have been changed and decompressed, the states of the decompressed "virtual" fuses are then routed, as appropriate, to the microcode patch elements 414, the microcode registers 415, and the cache correction elements 416. Accordingly, the core 420 is configured for operation following completion of a reset sequence.
- The present inventors note that the decompression functions discussed above need not necessarily be performed in a particular order during a reset sequence. For example, microcode patches may be decompressed following decompression of microcode registers initialization data. Likewise, the decompression functions may be performed in parallel or in an order suitable to satisfy design constraints.
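The correct-then-decompress ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the fuse correction data is assumed to arrive already decompressed as (address, state) pairs, and the decompression step uses the simple one-fuse-per-register encoding given as an example in the discussion of FIGURE 5, not any particular algorithm employed by the elements 408-410.

```python
from typing import List, Tuple


def apply_fuse_corrections(fuses: List[int],
                           corrections: List[Tuple[int, int]]) -> List[int]:
    """Return a copy of the raw fuse states with each (address, state) applied.

    Entries are applied in order, so a later entry for the same address
    overrides an earlier one (modeling multiple "reblows" of one fuse).
    """
    corrected = list(fuses)
    for address, state in corrections:
        corrected[address] = state
    return corrected


def decompress_register_fuses(fuses: List[int], width: int = 64) -> List[int]:
    """Assumed encoding: one fuse per register.

    A blown fuse (1) expands to an all-ones register value and an unblown
    fuse (0) expands to an all-zeros register value.
    """
    all_ones = (1 << width) - 1
    return [all_ones if bit else 0 for bit in fuses]


def reset_sequence(raw_fuses: List[int],
                   corrections: List[Tuple[int, int]]) -> List[int]:
    """Correct the fuse states first, then decompress the corrected states."""
    corrected = apply_fuse_corrections(raw_fuses, corrections)
    return decompress_register_fuses(corrected)


# Four register fuses as originally programmed, then two corrections:
# fuse 1 is changed to blown and fuse 3 is changed back to unblown.
registers = reset_sequence([1, 0, 0, 1], [(1, 1), (3, 0)])
```

Because the corrections are applied to the compressed fuse states rather than to the decompressed output, a single corrected fuse in this sketch changes an entire 64-bit register value.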
- Furthermore, the present inventors note that the implementations of the elements 408-411 need not necessarily be implemented in microcode versus hardware circuits, since in a
typical microprocessor core 420 there exist elements of the core 420 which can more easily be initialized via hardware (such as a scan chain associated with a cache) as opposed to direct writes by microcode. Such implementation details are left up to designer judgment. However, the present inventors submit that the prior art teaches that cache correction fuses are conventionally read and entered into a cache correction scan chain by hardware circuits during reset prior to initiating the execution of microcode, and it is a feature of the present invention to implement the cache fuses decompressor 410 in microcode as opposed to hardware control circuits since a core's caches are generally not turned on until microcode runs. By utilizing microcode to implement the cache fuses element 410, a more flexible and advantageous mechanism is provided for entering cache correction data into a scan chain, and significant hardware is saved.
- Now referring to
FIGURE 5, a block diagram is presented illustrating an exemplary format 500 for compressed configuration data 500 according to the present invention. The compressed configuration data 500 is compressed by the compressor 320 of FIGURE 3 from data residing in the virtual fuse array 303 and is programmed (i.e., "blown") into the physical fuse array 336 of the multi-core device 330. During a reset sequence, as is described above, the compressed configuration data 500 is retrieved from the physical fuse array 336 by each of the cores 332 and is decompressed and corrected by the elements 408-411 of the decompressor 421 within each of the cores 420. The decompressed and corrected configuration data is then provided to the various elements 413-416 within the core 420 to initialize the core 420 for operation.
- The
compressed configuration data 500 comprises one or more compressed data fields 502 for each of the configuration data types discussed above, which are demarcated by end-of-type fields 503. Programming events (i.e., "blows") are demarcated by an end-of-blow field 504. The compressed data fields 502 associated with each of the data types are encoded according to a compression algorithm that is optimized to minimize the number of bits (i.e., fuses) that are required to store the particular bit patterns associated with each of the data types. The number of fuses in the physical fuse array 336 that make up each of the compressed data fields 502 is a function of the compression algorithm that is employed for a particular data type. For example, consider a core that comprises sixty-four 64-bit microcode registers which must be initialized to, say, all ones or all zeros. An optimum compression algorithm may be employed to yield 64 compressed data fields 502 for that data type, where each of the compressed data fields 502 comprises initialization data for a particular microcode register and where the compressed data fields 502 are prescribed in register number order (i.e., 1-64). Each of the compressed data fields 502 then comprises a single fuse which is blown if a corresponding microcode register is initialized to all ones, and which is not blown if the corresponding microcode register is initialized to all zeros.
- The elements 408-410 of the
decompressor 421 in the core 420 are configured to utilize the end-of-type fields 503 to determine where their respective compressed data is located within the physical fuse array 336, and the fuse correction decompressor 411 is configured to utilize the end-of-blow fields 504 to locate compressed fuse correction data that has been programmed (i.e., blown) subsequent to an initial programming event. It is a feature of the present invention to provide a substantial amount of spare fuses in the physical fuse array 336 to allow for a significant number of subsequent programming events, as will be discussed in more detail below.
- The exemplary compressed type format discussed above is presented to clearly teach aspects of the present invention that are associated with compression and decompression of configuration data. However, the manner in which specific type data is compressed, demarcated, and the number and types of data to be compressed within the
fuse array 401 is not intended to be restricted to the example of FIGURE 5. Other numbers, types, and formats are contemplated that allow for tailoring of the present invention to various devices and architectures extant in the art.
- Turning now to
FIGURE 6, a block diagram is presented illustrating an exemplary format for decompressed microcode patch configuration data 600 according to the present invention. During a reset sequence, compressed microcode patch configuration data is read by each core 420 from the physical fuse array 401. The compressed microcode patch configuration data is then corrected according to fuse correction data provided via bus 412. Then, the corrected compressed microcode patch configuration data is decompressed by the patch fuses decompressor 408. The result of the decompression process is the decompressed microcode patch configuration data 600. The data 600 comprises a plurality of decompressed data blocks 604 corresponding to the number of microcode patch elements 414 within the core 420 that require initialization data. Each decompressed data block 604 comprises a core address field 601, a microcode ROM address field 602, and a microcode patch data field 603. The sizes of the fields 601-603 are a function of the core architecture. As part of the decompression process, the patch fuses decompressor 408 creates a complete image of the target data required to initialize the microcode patch elements 414. Following decompression of the microcode patch configuration data 600, conventional distribution mechanisms may be employed to distribute the data 603 to respectively addressed core and microcode ROM substitution circuits/registers in the microcode patch elements 414.
- Now turning to
FIGURE 7, a block diagram is presented depicting an exemplary format for decompressed microcode register configuration data 700 according to the present invention. During a reset sequence, compressed microcode register configuration data is read by each core 420 from the physical fuse array 401. The compressed microcode register configuration data is then corrected according to fuse correction data provided via bus 412. Then, the corrected compressed microcode register configuration data is decompressed by the register fuses decompressor 409. The result of the decompression process is the decompressed microcode register configuration data 700. The data 700 comprises a plurality of decompressed data blocks 704 corresponding to the number of microcode registers 415 within the core 420 that require initialization data. Each decompressed data block 704 comprises a core address field 701, a microcode register address field 702, and a microcode register data field 703. The sizes of the fields 701-703 are a function of the core architecture. As part of the decompression process, the register fuses decompressor 409 creates a complete image of the target data required to initialize the microcode registers 415. Following decompression of the microcode register configuration data 700, conventional distribution mechanisms may be employed to distribute the data 703 to respectively addressed core and microcode registers 415.
- Referring now to
FIGURE 8, a block diagram is presented featuring an exemplary format for decompressed cache correction data 800 according to the present invention. During a reset sequence, compressed cache correction data is read by each core 420 from the physical fuse array 401. The compressed cache correction data is then corrected according to fuse correction data provided via bus 412. Then, the corrected compressed cache correction data is decompressed by the cache fuses decompressor 410. The result of the decompression process is the decompressed cache correction data 800. Various cache mechanisms may be employed in the multi-core processor 330, and the decompressed cache correction data 800 is presented in the context of a shared L2 cache 334, where all of the cores 332 may access a single cache 334, utilizing shared areas. Accordingly, the exemplary format is provided according to the noted architecture. The data 800 comprises a plurality of decompressed data blocks 804 corresponding to the number of cache correction elements 416 within the core 420 that require corrective data. Each decompressed data block 804 comprises a sub-unit column address field 802 and a replacement column address field 803. As one skilled in the art will appreciate, memory caches are fabricated with redundant columns (or rows) in sub-units of the caches to allow for a functional redundant column (or row) in a particular sub-unit to be substituted for a non-functional column (or row). Thus, the decompressed cache correction data 800 allows for substitution of functional columns (as shown in FIGURE 8) for non-functional columns. In addition, as one skilled in the art will concur, conventional fuse array mechanisms associated with cache correction include fuses associated with each sub-unit column that are blown when substitution is required by redundant sub-unit columns. 
Accordingly, because such a large number of fuses are required (to address all sub-units and columns therein), only a portion of the sub-units are typically covered, and the resulting conventional cache correction fuses are very sparsely blown. The present inventors note that it is a feature of the present invention to address and compress sub-unit column addresses and replacement column addresses only for those sub-unit columns that require replacement, thus minimizing the number of fuses that are required to implement cache correction data. Consequently, the present invention, as limited by physical fuse array size and the amount of additional configuration data that is programmed therein, provides the potential for expanding the number of sub-unit columns (or rows) in a cache 334 that can be corrected over that which has heretofore been provided. In the embodiment shown in FIGURE 8, it is noted that the associated cores 332 are configured such that only one of the cores 332 sharing the L2 cache 334 would access and provide the corrective data 802-803 to its respective cache correction elements 416. The sizes of the fields 801-803 are a function of the core architecture. As part of the decompression process, the cache correction fuses decompressor 410 creates a complete image of the target data required to initialize the cache correction elements 416. Following decompression of the cache correction data 800, conventional distribution mechanisms in the responsible core 420 may be employed to distribute the data 802-803 to respectively addressed cache correction elements 416.
- Turning now to
FIGURE 9, a block diagram is presented showing an exemplary format for decompressed fuse correction data 900 according to the present invention. As has been discussed above, during reset the fuse correction decompressor 411 accesses compressed fuse correction data 406 within the physical fuse array 401, decompresses the compressed fuse correction data, and supplies the resulting decompressed fuse correction data 900 to the other decompressors 408-410 within the core 420. The decompressed fuse correction data comprises one or more end-of-blow fields 901 that indicate the end of successive programming events in the physical fuse array 401. If a subsequent programming event has occurred, a reblow field 902 is programmed to indicate that a following one or more fuse correction fields 903 indicate fuses within the physical fuse array 401 that are to be reblown. Each of the fuse correction fields 903 comprises an address of a specific fuse within the physical fuse array 401 that is to be reblown along with a state (i.e., blown or unblown) for the specific fuse. Only those fuses that are to be reblown are provided in the fuse correction fields 903, and each group of fields 903 within a given reblow event is demarcated by an end-of-blow field 901. If a properly encoded reblow field 902 is present after a given end-of-blow field 901, then one or more subsequent fuses may be reblown as indicated by corresponding fuse correction fields 903. Thus, the present invention provides the capability for a substantial number of reblows for the same fuse, as limited by array size and other data provided therein.
- The present inventors have also observed that the real estate and power gains associated with utilization of a shared physical fuse array within which compressed configuration data is stored present opportunities for additional features disposed on a multi-core die. 
In addition, the present inventors have noted that, as one skilled in the art will appreciate, present day semiconductor fuse structures often suffer from several shortcomings, one of which is referred to as "growback." Growback is the reversal of the programming process such that a fuse will, after some time, reconnect after it has been blown, that is, it goes from a programmed (i.e., blown) state back to an unprogrammed (i.e., unblown) state.
- To address growback, and other challenges, the present invention provides several advantages, one of which is provision of redundant, yet configurable, physical fuse arrays. Accordingly, a configurable, redundant fuse bank mechanism will now be presented with reference to
FIGURE 10.
- Referring to
FIGURE 10, a block diagram is presented illustrating configurable, redundant fuse arrays 1001 in a multi-core device 1000 according to the present invention. The multi-core device 1000 includes a plurality of cores 1002 that are configured substantially as described above with reference to FIGURES 3-9. In addition, each of the cores 1002 includes array control 1003 that is programmed with configuration data within a configuration data register 1004. Each of the cores 1002 is coupled to the redundant fuse arrays 1001.
- For purposes of illustration, only four
cores 1002 and two physical fuse arrays 1001 are shown; however, the present inventors note that the novel and inventive concepts according to the present invention can be extended to a plurality of cores 1002 of any number and to more than two physical fuse arrays 1001.
- In operation, each of the
cores 1002 receives configuration data within the configuration data register 1004 that indicates a specified configuration for the physical fuse arrays 1001. In one embodiment, the arrays may be configured according to the value of the configuration data as an aggregate physical fuse array. That is, the size of the aggregate physical fuse array is equal to the sum of the sizes of the individual physical fuse arrays 1001, and the aggregate physical fuse array may be employed to store substantially more configuration data than is provided for by a single one of the individual fuse arrays 1001. Accordingly, the array control 1003 directs its corresponding core 1002 to read the physical fuse arrays 1001 as an aggregate physical fuse array. In another embodiment, to address growback, according to the value of the configuration data, the physical fuse arrays 1001 are configured as redundant fuse arrays that are programmed with the same configuration data, and the array control 1003 within each of the cores 1002 comprises elements that enable the contents of the two (or more) arrays to be logically OR-ed together so that if one or more of the blown fuses within a given array 1001 exhibits growback, at least one of its corresponding fuses in the remaining arrays 1001 will still be blown. In a fail-safe embodiment, according to the value of the configuration data, one or more of the physical fuse arrays 1001 may be selectively disabled, and the remaining arrays 1001 enabled for use in either an aggregate configuration or a logically OR-ed configuration. Accordingly, the array control 1003 in each of the cores 1002 will not access contents of the selectively disabled arrays 1001, and will access the remaining arrays according to the configuration specified by the configuration data in the configuration data register 1004. 
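The three configurations described above (aggregate, logically OR-ed, and selectively disabled) can be illustrated with a short Python sketch. The names `ArrayMode` and `read_fuse_arrays`, and the representation of each array as a list of 0/1 fuse states, are assumptions made for illustration; the specification does not prescribe how the array control 1003 encodes or applies the configuration data.

```python
from enum import Enum
from typing import List, Optional, Sequence


class ArrayMode(Enum):
    AGGREGATE = "aggregate"        # arrays concatenated into one larger array
    REDUNDANT_OR = "redundant_or"  # arrays hold the same data and are OR-ed


def read_fuse_arrays(arrays: Sequence[Sequence[int]],
                     mode: ArrayMode,
                     enabled: Optional[Sequence[bool]] = None) -> List[int]:
    """Return the fuse states a core would see for the given configuration.

    `enabled` models selective disabling: arrays flagged False are never
    accessed. By default every array is enabled.
    """
    if enabled is None:
        enabled = [True] * len(arrays)
    active = [list(a) for a, on in zip(arrays, enabled) if on]
    if not active:
        raise ValueError("at least one fuse array must be enabled")
    if mode is ArrayMode.AGGREGATE:
        # Total size is the sum of the sizes of the enabled arrays.
        return [bit for array in active for bit in array]
    if len({len(a) for a in active}) != 1:
        raise ValueError("redundant arrays must be the same size")
    # A fuse reads as blown if it is still blown in any redundant copy.
    return [int(any(bits)) for bits in zip(*active)]


array_a = [1, 0, 1, 1]
array_b = [1, 0, 0, 1]  # same data, but fuse 2 has grown back to unblown
recovered = read_fuse_arrays([array_a, array_b], ArrayMode.REDUNDANT_OR)
```

In the redundant case the OR masks growback as long as at least one copy of each blown fuse remains blown, which is the property the paragraph above relies on. Note that the OR can only recover a blown state; it is not a general error-correcting scheme.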
- The configuration data registers 1004 may be programmed by any of a number of well known means to include programmable fuses, external pin settings, JTAG programming, and the like.
- In another aspect, the present inventors have noted that there may exist challenges when one or more physical fuse arrays are disposed on a single die that comprises multiple cores which access the arrays. More specifically, upon power-up/reset each core in a multi-core processor must read the physical fuse array in a serial fashion. That is, a first core reads the array, then a second core, then a third core, and so on. As one skilled in the art will appreciate, compared to other operations performed by the core, the reading of a fuse array is exceedingly time consuming and, thus, when multiple cores must read the same array, the time required to do so is roughly the time required for one core to read the array multiplied by the number of cores on the die. And as one skilled in the art will appreciate, semiconductor fuses degrade as they are read and there are lifetime limitations, according to fabrication process, for the reading of those fuses to obtain reliable results. Accordingly, another embodiment of the present invention is provided to 1) decrease the amount of time required for all cores to read a physical fuse array and 2) increase the overall lifetime of the fuse array by decreasing the number of accesses by the cores in a multi-core processor upon power-up/reset.
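As a rough illustration of the two goals just stated, the sketch below compares the fuse array read count and approximate load time when every core reads the array serially against a scheme in which one core reads the array once and the remaining cores obtain a copy from a faster on-die memory, as is detailed with reference to FIGURE 11. The function names and all timing values are hypothetical and chosen only to show the arithmetic.

```python
from typing import Dict


def serial_load_cost(n_cores: int, t_fuse_read: float) -> Dict[str, float]:
    """Every core reads the physical fuse array itself, one after another."""
    return {"fuse_reads": n_cores, "time": n_cores * t_fuse_read}


def shared_copy_load_cost(n_cores: int, t_fuse_read: float,
                          t_ram_write: float,
                          t_ram_read: float) -> Dict[str, float]:
    """One core reads the fuse array once and writes a copy to RAM.

    The remaining n_cores - 1 cores then read the copy from RAM in turn,
    so the fuse array itself is accessed only once per power-up/reset.
    """
    time = t_fuse_read + t_ram_write + (n_cores - 1) * t_ram_read
    return {"fuse_reads": 1, "time": time}


# Arbitrary example units: a fuse array read is assumed to be far slower
# than a RAM access.
serial = serial_load_cost(n_cores=4, t_fuse_read=100.0)
shared = shared_copy_load_cost(n_cores=4, t_fuse_read=100.0,
                               t_ram_write=2.0, t_ram_read=1.0)
```

Under these assumed values the serial scheme costs four fuse array reads and 400 time units, while the shared-copy scheme costs one fuse array read and 105 time units. The actual ratio depends entirely on the relative speeds of the fuse array and the memory holding the copy.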
- Attention is now directed to
FIGURE 11, where a block diagram is presented detailing a mechanism according to the present invention for rapidly loading configuration data into a multi-core device 1100. The device 1100 includes a plurality of cores 1102 that are configured substantially as described above with reference to FIGURES 3-10. In addition, each of the cores 1102 includes array control 1103 that is programmed with load data within a load data register 1104. Each of the cores 1102 is coupled to a physical fuse array 1101 that is configured as described above with reference to FIGURES 3-10, and to a random access memory (RAM) 1105 that is disposed on the same die as the cores 1102, but which is not disposed within any of the cores 1102. Hence, the RAM 1105 is referred to as "uncore" RAM 1105.
- For purposes of illustration, only four
cores 1102 and a single physical fuse array 1101 are shown; however, the present inventors note that the novel and inventive concepts according to the present invention can be extended to a plurality of cores 1102 of any number and to a plurality of physical fuse arrays 1101.
- In operation, each of the
cores 1102 receives load data within the load data register 1104 that indicates a specified load order for data corresponding to the physical fuse array 1101. The value of the contents of the load data register 1104 designates one of the cores 1102 as a "master" core 1102, and the remaining cores as "slave" cores 1102 having a load order associated therewith. Accordingly, upon power-up/reset, the array control 1103 directs the master core 1102 to read the contents of the physical fuse array 1101 and then to write the contents of the physical fuse array 1101 to the uncore RAM 1105. If a plurality of physical fuse arrays 1101 are disposed on the die, then the uncore RAM 1105 is sized appropriately to store the contents of the plurality of arrays 1101. After the master core 1102 has written the contents of the physical fuse array 1101 to the uncore RAM 1105, the array controls 1103 direct their corresponding slave cores 1102 to read the fuse array contents from the uncore RAM 1105 in the order specified by the contents of the load data register 1104.
- The
load data registers 1104 may be programmed by any of a number of well known means to include programmable fuses, external pin settings, JTAG programming, and the like. It is also noted that the embodiment of FIGURE 11 may be employed in conjunction with any of the embodiments of the configurable, redundant fuse array mechanism discussed above with reference to FIGURE 10.
- Now referring to
FIGURE 12, a block diagram 1200 is presented illustrating an error checking and correction (ECC) mechanism according to the present invention. The ECC mechanism may be employed in conjunction with any of the embodiments of the present invention described above with reference to FIGURES 3-11 and provides for another layer of robustness for the compression and decompression of configuration data. The diagram depicts a microprocessor core 1220 disposed on a die that is coupled to a physical fuse array 1201 comprising compressed configuration data blocks 1203 as is described above. In addition to the compressed configuration data blocks 1203, the array 1201 includes ECC code blocks 1202 that each are associated with a corresponding one of the data blocks 1203. In one embodiment, the data blocks 1203 are 64 bits (i.e., fuses) in size and the ECC code blocks 1202 are 8 bits in size. The core 1220 includes a reset controller 1222 that receives a reset signal RESET. The reset controller 1222 has an ECC element 1224 that is coupled to a decompressor 1226 via bus CDATA. The ECC element 1224 is coupled to the fuse array 1201 via an address bus ADDR, a data bus DATA, and a code bus CODE.
- In operation, the
fuse array 1201 is programmed with configuration data in the data blocks 1203 as is described above with reference to FIGURES 3-11. The configuration data corresponding to a particular data type (e.g., microcode patch data, microcode register data) is not required to be programmed within the boundaries of a given data block 1203, but rather may span more than one data block 1203. Furthermore, configuration data corresponding to two or more types of configuration data may be programmed into the same data block 1203. In addition, the array 1201 is programmed with ECC codes in the ECC code blocks 1202 that each result from ECC generation for the data programmed into a corresponding data block 1203 according to one of a number of well known ECC mechanisms including, but not limited to, SECDED Hamming ECC, Chipkill ECC, and variations of forward error correction (FEC) codes. In one embodiment, the addresses associated with a given data block 1203 and its corresponding ECC code block 1202 are known. Thus, it is not required that the corresponding ECC code block 1202 be located adjacent to the given data block 1203, as is depicted in FIGURE 12.
- The
decompressor 1226 is configured and functions substantially similar to the decompressor 421 described above with reference to FIGURE 4, and as alluded to with reference to FIGURES 5-11. Upon reset of the core 1220, prior to execution of any of the decompression functions described above, the ECC element 1224 within the reset controller 1222 accesses the fuse array 1201 to obtain its contents. Addresses associated with given data blocks 1203 and code blocks 1202 may be obtained via bus ADDR. Compressed configuration data within each of the data blocks 1203 may be obtained via bus DATA. And ECC codes for each of the ECC code blocks 1202 may be obtained via bus CODE. As the noted data, addresses, and codes are obtained, the ECC element 1224 operates to generate ECC checks for the data retrieved for each data block 1203 according to the ECC mechanism that was employed to generate the ECC code stored in the corresponding ECC code block 1202. The ECC element 1224 also compares the ECC checks with corresponding ECC codes obtained from the array 1201 to produce ECC syndromes. The ECC element 1224 further decodes the ECC syndromes to determine if no error occurred, a correctable error occurred, or a non-correctable error occurred. The ECC element 1224 moreover operates to correct correctable errors. Correct and corrected data is then routed to the decompressor 1226 via bus CDATA for decompression as described above. Non-correctable data is also passed to the decompressor 1226 via bus CDATA along with an indication of such. If an operationally critical portion of the configuration data is determined to be non-correctable, the decompressor 1226 may cause the core 1220 to shut down or otherwise flag the error.
- One embodiment contemplates that the
ECC element 1224 comprises one or more microcode routines that are executed to perform the ECC functions noted above. - Portions of the present invention and corresponding detailed description are presented in terms of software, or algorithms and symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the ones by which those of ordinary skill in the art effectively convey the substance of their work to others of ordinary skill in the art. An algorithm, as the term is used here, and as it is used generally, is conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of optical, electrical, or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
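Returning to the ECC flow described with reference to FIGURE 12, the generate, compare, decode, and correct steps attributed to the ECC element 1224 can be sketched in Python as follows. The sketch assumes an extended Hamming (SECDED) code, which for a 64-bit data block yields exactly 8 check bits; the bit layout, function names, and status strings are illustrative assumptions and are not taken from the specification, which leaves the ECC mechanism open.

```python
from typing import List, Tuple


def _num_check_bits(n_data: int) -> int:
    """Smallest r with 2**r >= n_data + r + 1 (single-error correction)."""
    r = 0
    while (1 << r) < n_data + r + 1:
        r += 1
    return r


def ecc_generate(data: List[int]) -> List[int]:
    """Return r Hamming check bits followed by one overall parity bit.

    Data bits occupy codeword positions 3, 5, 6, 7, 9, ... (every position
    that is not a power of two); check bit i covers position 2**i.
    """
    r = _num_check_bits(len(data))
    acc = 0
    pos = 0
    for bit in data:
        pos += 1
        while pos & (pos - 1) == 0:  # skip positions reserved for check bits
            pos += 1
        if bit:
            acc ^= pos
    checks = [(acc >> i) & 1 for i in range(r)]
    overall = (sum(data) + sum(checks)) & 1
    return checks + [overall]


def ecc_check_and_correct(data: List[int],
                          code: List[int]) -> Tuple[str, List[int]]:
    """Classify a (data, code) pair and correct a single-bit data error."""
    r = len(code) - 1
    check = ecc_generate(data)  # ECC checks recomputed from the data as read
    syndrome = sum((check[i] ^ code[i]) << i for i in range(r))
    parity_odd = (sum(data) + sum(code)) & 1
    if syndrome == 0 and not parity_odd:
        return "ok", list(data)
    if not parity_odd:
        return "uncorrectable", list(data)  # even number of flipped bits
    fixed = list(data)
    if syndrome & (syndrome - 1):  # not a power of two: a data bit flipped
        index = syndrome - syndrome.bit_length() - 1
        if index >= len(fixed):
            return "uncorrectable", fixed
        fixed[index] ^= 1
    # Otherwise the flipped bit was a check or parity bit; data is intact.
    return "corrected", fixed


block = [(i * 37 >> 2) & 1 for i in range(64)]  # arbitrary 64-fuse data block
code = ecc_generate(block)                      # 8 ECC code bits
```

A caller in the position of the decompressor 1226 would proceed with the returned data on "ok" or "corrected" and would treat "uncorrectable" as the flagged case described above.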
- It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, or as is apparent from the discussion, terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, a microprocessor, a central processing unit, or similar electronic computing device, that manipulates and transforms data represented as physical, electronic quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
- Note also that the software implemented aspects of the invention are typically encoded on some form of program storage medium or implemented over some type of transmission medium. The program storage medium may be electronic (e.g., read only memory, flash read only memory, electrically programmable read only memory, random access memory), magnetic (e.g., a floppy disk or a hard drive), or optical (e.g., a compact disk read only memory, or "CD ROM"), and may be read only or random access. Similarly, the transmission medium may be metal traces, twisted wire pairs, coaxial cable, optical fiber, or some other suitable transmission medium known to the art. The invention is not limited by these aspects of any given implementation.
- The particular embodiments disclosed above are illustrative only, and those skilled in the art will appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention, and that various changes, substitutions and alterations can be made herein without departing from the scope of the invention as set forth by the appended claims.
Claims (21)
- An apparatus for providing configuration data to an integrated circuit, the apparatus comprising: a semiconductor fuse array, disposed on a die, into which is programmed the configuration data, said semiconductor fuse array comprising: a first plurality of semiconductor fuses, configured to store compressed cache correction data; a cache memory, disposed on said die; and a plurality of cores, disposed on said die, wherein each of said plurality of cores is coupled to said semiconductor fuse array and said cache memory, and is configured to access said semiconductor fuse array upon power-up/reset, to decompress said compressed cache correction data, and to distribute decompressed cached correction data to initialize said cache memory.
- The apparatus as recited in claim 1, wherein said each of said plurality of cores decompresses said compressed cache correction data by executing microcode during power-up/reset.
- The apparatus as recited in claim 1, wherein said first plurality of semiconductor fuses comprises a second plurality of semiconductor fuses that indicates one or more sub-unit locations within said cache memory that are not to be employed during normal operation.
- The apparatus as recited in claim 3, wherein said first plurality of semiconductor fuses further comprises a third plurality of semiconductor fuses that indicates one or more replacement sub-unit locations within said cache memory that are to be employed during normal operation in replacement of corresponding ones of said one or more sub-unit locations.
- The apparatus as recited in claim 4, wherein said sub-unit locations and said replacement sub-unit locations comprise columns and redundant columns, respectively, within said cache memory.
- The apparatus as recited in claim 4, wherein said sub-unit locations and said replacement sub-unit locations comprise rows and redundant rows, respectively, within said cache memory.
- The apparatus as recited in claim 1, wherein the integrated circuit comprises an x86-compatible multi-core microprocessor.
- An apparatus for providing configuration data to an integrated circuit, the apparatus comprising: a multi-core microprocessor, comprising: a semiconductor fuse array, disposed on a die, into which is programmed the configuration data, said semiconductor fuse array comprising: a first plurality of semiconductor fuses, configured to store compressed cache correction data; a cache memory, disposed on said die; and a plurality of cores, disposed on said die, wherein each of said plurality of cores is coupled to said semiconductor fuse array and said cache memory, and is configured to access said semiconductor fuse array upon power-up/reset, to decompress said compressed cache correction data, and to distribute decompressed cached correction data to initialize said cache memory.
- The apparatus as recited in claim 8, wherein said each of said plurality of cores decompresses said compressed cache correction data by executing microcode during power-up/reset.
- The apparatus as recited in claim 8, wherein said first plurality of semiconductor fuses comprises a second plurality of semiconductor fuses that indicates one or more sub-unit locations within said cache memory that are not to be employed during normal operation.
- The apparatus as recited in claim 10, wherein said first plurality of semiconductor fuses further comprises a third plurality of semiconductor fuses that indicates one or more replacement sub-unit locations within said cache memory that are to be employed during normal operation in replacement of corresponding ones of said one or more sub-unit locations.
- The apparatus as recited in claim 11, wherein said sub-unit locations and said replacement sub-unit locations comprise columns and redundant columns, respectively, within said cache memory.
- The apparatus as recited in claim 11, wherein said sub-unit locations and said replacement sub-unit locations comprise rows and redundant rows, respectively, within said cache memory.
- The apparatus as recited in claim 1, wherein said multi-core microprocessor comprises an x86-compatible multi-core microprocessor.
- A method for providing configuration data to an integrated circuit, the method comprising: first disposing a semiconductor fuse array on a die, said first disposing comprising: storing compressed cache correction data in a first plurality of semiconductor fuses; second disposing a cache memory on the die; and third disposing a plurality of cores on the die, wherein each of the plurality of cores is coupled to the semiconductor fuse array and the cache memory; via each of the plurality of cores, accessing the semiconductor fuse array upon power-up/reset, decompressing the compressed cache correction data, and distributing decompressed cached correction data to initialize the cache memory.
- The method as recited in claim 15, wherein the each of the plurality of cores decompresses the compressed cache correction data by executing microcode during power-up/reset.
- The method as recited in claim 15, wherein the first plurality of semiconductor fuses comprises a second plurality of semiconductor fuses that indicates one or more sub-unit locations within the cache memory that are not to be employed during normal operation.
- The method as recited in claim 17, wherein the first plurality of semiconductor fuses further comprises a third plurality of semiconductor fuses that indicates one or more replacement sub-unit locations within the cache memory that are to be employed during normal operation in replacement of corresponding ones of the one or more sub-unit locations.
- The method as recited in claim 18, wherein the sub-unit locations and the replacement sub-unit locations comprise columns and redundant columns, respectively, within the cache memory.
- The method as recited in claim 18, wherein the sub-unit locations and the replacement sub-unit locations comprise rows and redundant rows, respectively, within the cache memory.
- The apparatus as recited in claim 15, wherein the integrated circuit comprises an x86-compatible multi-core microprocessor.
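The flow recited in the claims (read the fuse array at power-up/reset, decompress the cache correction data, and use it to swap defective sub-units for redundant ones) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the claims: the 8-bit field width `FIELD_BITS`, the record layout (a count followed by pairs of defective column and redundant column), and the helper names `read_field`, `decompress_corrections`, `build_column_map` and `to_bits` are all hypothetical. The claims do not specify the compression format, and an actual core would perform this step in microcode rather than in software of this kind.

```python
from typing import List, Tuple

FIELD_BITS = 8  # hypothetical width of each field stored in the fuses


def read_field(fuses: List[int], pos: int) -> Tuple[int, int]:
    """Read one FIELD_BITS-wide unsigned value (MSB first) starting at pos."""
    value = 0
    for bit in fuses[pos:pos + FIELD_BITS]:
        value = (value << 1) | bit
    return value, pos + FIELD_BITS


def decompress_corrections(fuses: List[int]) -> List[Tuple[int, int]]:
    """Expand the assumed fuse layout into (defective, redundant) column pairs.

    Assumed layout: a count field, then that many pairs of fields.
    """
    count, pos = read_field(fuses, 0)
    pairs = []
    for _ in range(count):
        bad, pos = read_field(fuses, pos)
        spare, pos = read_field(fuses, pos)
        pairs.append((bad, spare))
    return pairs


def build_column_map(num_columns: int, pairs: List[Tuple[int, int]]) -> List[int]:
    """Map each logical column to a physical column.

    Redundant columns are assumed to sit after the regular ones, so
    redundant column k has physical index num_columns + k.
    """
    column_map = list(range(num_columns))
    for bad, spare in pairs:
        column_map[bad] = num_columns + spare
    return column_map


def to_bits(value: int) -> List[int]:
    """Helper for the example: encode a value as FIELD_BITS bits, MSB first."""
    return [(value >> i) & 1 for i in range(FIELD_BITS - 1, -1, -1)]


# Example fuse contents: two corrections, column 3 -> redundant 0, column 5 -> redundant 1.
fuses = to_bits(2) + to_bits(3) + to_bits(0) + to_bits(5) + to_bits(1)
pairs = decompress_corrections(fuses)
column_map = build_column_map(8, pairs)
print(pairs)       # [(3, 0), (5, 1)]
print(column_map)  # [0, 1, 2, 8, 4, 9, 6, 7]
```

Storing only the defective locations and their replacements, rather than a full per-column map, is one simple way such data could occupy fewer fuses. The same sketch applies to rows and redundant rows by reading the indices as row numbers.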
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/972,481 US20150058564A1 (en) | 2013-08-21 | 2013-08-21 | Apparatus and method for extended cache correction |
Publications (3)
Publication Number | Publication Date |
---|---|
EP2840508A2 (en) | 2015-02-25 |
EP2840508A3 (en) | 2015-09-02 |
EP2840508B1 (en) | 2021-07-07 |
Family
ID=49666971
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
EP13193571.0A | Active | EP2840508B1 (en) | 2013-08-21 | 2013-11-19 | Apparatus and method for extended cache correction |
Country Status (2)
Country | Link |
---|---|
US (1) | US20150058564A1 (en) |
EP (1) | EP2840508B1 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7350119B1 (en) * | 2004-06-02 | 2008-03-25 | Advanced Micro Devices, Inc. | Compressed encoding for repair |
US7529997B2 (en) * | 2005-03-14 | 2009-05-05 | International Business Machines Corporation | Method for self-correcting cache using line delete, data logging, and fuse repair correction |
US7898882B2 (en) * | 2006-06-23 | 2011-03-01 | Synopsys, Inc. | Architecture, system and method for compressing repair data in an integrated circuit (IC) design |
2013
- 2013-08-21: US application US13/972,481, published as US20150058564A1 (status: abandoned)
- 2013-11-19: EP application EP13193571.0A, granted as EP2840508B1 (status: active)
Non-Patent Citations (1)
Title |
---|
None |
Also Published As
Publication number | Publication date |
---|---|
EP2840508B1 (en) | 2021-07-07 |
EP2840508A3 (en) | 2015-09-02 |
US20150058564A1 (en) | 2015-02-26 |
Similar Documents
Publication | Title |
---|---|
US8879345B1 (en) | Microprocessor mechanism for decompression of fuse correction data |
US9710390B2 (en) | Apparatus and method for extended cache correction |
US8982655B1 (en) | Apparatus and method for compression and decompression of microprocessor configuration data |
EP2840509B1 (en) | Apparatus and method for storage and decompression of configuration data |
US9348690B2 (en) | Correctable configuration data compression and decompression system |
EP2840490B1 (en) | Core-specific fuse mechanism for a multi-core die |
US20150058563A1 (en) | Multi-core fuse decompression mechanism |
EP2840510B1 (en) | Apparatus and method for rapid fuse bank access in a multi-core processor |
EP2840507B9 (en) | Apparatus and method for configurable redundant fuse banks |
EP2840491B1 (en) | Extended fuse reprogrammability mechanism |
EP2840508B1 (en) | Apparatus and method for extended cache correction |
US20150055427A1 (en) | Multi-core microprocessor configuration data compression and decompression system |
US20150058565A1 (en) | Apparatus and method for compression of configuration data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20131119 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
PUAL | Search report despatched |
Free format text: ORIGINAL CODE: 0009013 |
|
AK | Designated contracting states |
Kind code of ref document: A3 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06F 11/08 20060101ALI20150729BHEP Ipc: G06F 15/76 20060101AFI20150729BHEP Ipc: G06F 9/445 20060101ALI20150729BHEP Ipc: G11C 29/00 20060101ALI20150729BHEP |
|
R17P | Request for examination filed (corrected) |
Effective date: 20160229 |
|
RBV | Designated contracting states (corrected) |
Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20180905 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: GRANT OF PATENT IS INTENDED |
|
INTG | Intention to grant announced |
Effective date: 20210408 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 1409262 Country of ref document: AT Kind code of ref document: T Effective date: 20210715 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602013078228 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG9D |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: MP Effective date: 20210707 |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 1409262 Country of ref document: AT Kind code of ref document: T Effective date: 20210707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211007 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211108 Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211007 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211008 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602013078228 Country of ref document: DE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
26N | No opposition filed |
Effective date: 20220408 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211119 Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 |
|
REG | Reference to a national code |
Ref country code: BE Ref legal event code: MM Effective date: 20211130 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20131119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20231024 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20231024 Year of fee payment: 11 Ref country code: DE Payment date: 20231129 Year of fee payment: 11 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20210707 |