Embodiment
An integrated circuit (IC) is a set of electronic circuits formed on a small piece of semiconductor material, typically silicon. An integrated circuit is also called a chip, a microchip, or a die.
A central processing unit (CPU) is the electronic circuitry (i.e., hardware) that executes the instructions of a computer program (also called a computer application or application) by performing operations on data, including arithmetic operations, logical operations, and input/output operations.
A microprocessor is an electronic device that functions as a central processing unit on a single integrated circuit. A microprocessor receives digital data as input, processes the data according to instructions fetched from a memory (which may or may not be on the die), and produces as output the results of the operations prescribed by the instructions. A general-purpose microprocessor may be used in a desktop, mobile, or tablet computer for tasks such as computation, word processing, multimedia display, and Internet browsing. A microprocessor may also be disposed in an embedded system to control a wide variety of devices, including appliances, mobile telephones, smartphones, and industrial control devices.
A multi-core processor, also called a multi-core microprocessor, is a microprocessor having multiple central processing units (cores) formed on a single integrated circuit.
An instruction set architecture (ISA), or instruction set, is the part of a computer architecture related to programming. It includes data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and input/output. An ISA includes a specification of the set of opcodes (i.e., machine language instructions) and the native commands implemented by a particular CPU.
An x86-compatible microprocessor is a microprocessor capable of executing computer applications that are programmed according to the x86 ISA.
Microcode refers to a plurality of micro instructions. A micro instruction (also called a native instruction) is an instruction at the level that a sub-unit of a microprocessor executes. In one embodiment, the sub-units include integer units, floating-point units, MMX units, and load/store units. For example, micro instructions are directly executed by a reduced instruction set computer (RISC) microprocessor. For a complex instruction set computer (CISC) microprocessor, such as an x86-compatible microprocessor, x86 instructions are translated into associated micro instructions, and the associated micro instructions are directly executed within the CISC microprocessor.
A fuse is a conductive structure, typically a thin filament, that can be broken (blown) by applying a voltage across the filament and/or passing a current through it. Fuses are deposited at specified locations in the die topology using well-known fabrication techniques to produce programmable filaments. After fabrication, the fuses are blown (or left unblown) to program a corresponding device on the die.
Referring to Fig. 1, block diagram 100 illustrates a present-day microprocessor core 101. The microprocessor core 101 has a fuse array 102 that provides configuration data to the core. The fuse array 102 comprises a plurality of semiconductor fuses (not shown), which are typically arranged in columns. The fuse array 102 is coupled to reset logic 103, which comprises reset circuits 104 and reset microcode 105. The reset logic 103 is coupled to control circuits 107, microcode registers 108, microcode insertion elements 109, and cache correction elements 110. An external reset signal RESET is coupled to the microprocessor core 101 and is received by the reset logic 103.
As those skilled in the art appreciate, fuses (also called links or fuse structures) are employed in a vast number of integrated circuit devices to configure the devices after they have been fabricated. For example, suppose the microprocessor core 101 of Fig. 1 provides a selectable function that allows it to be configured as either a desktop device or a mobile device. Accordingly, during manufacture, designated fuses in the fuse array 102 may be blown to configure the device as, say, a mobile device. When the reset signal RESET is asserted, the reset logic 103 reads the states of the designated fuses in the fuse array 102, and the reset circuits 104 (rather than the reset microcode 105, in this example) enable the corresponding control circuits 107. The control circuits 107 disable the elements of the core 101 associated with desktop operation and enable the elements associated with mobile operation. The core 101 is thus configured on reset as a mobile device. In addition, the reset logic 103 reads the states of other fuses in the fuse array 102, and the reset circuits 104 (again rather than the reset microcode 105, in this example) enable the corresponding cache correction elements 110 to provide a correction mechanism for one or more cache memories (not shown) associated with the core 101. The core 101 is thus configured on reset as a mobile device, with the correction mechanisms for its cache memories properly in place.
The example above describes only one of the many uses of fuses in the microprocessor core 101 of Fig. 1. Those skilled in the art appreciate other uses for fuses, including but not limited to configuration of device-specific data (e.g., serial numbers, unique cryptographic keys, architecture-mandated data that can be accessed by users, speed settings, and voltage settings), initialization data, and insertion data. For example, many present-day devices execute microcode and require initialization of the microcode registers 108. Microcode register fuses (not shown) in the fuse array 102 may provide the initialization data, which is read during reset by the reset logic 103 (by the reset circuits 104, the reset microcode 105, or both) and provided to the microcode registers 108. For purposes of this description, the reset circuits 104 comprise hardware elements that provide certain types of configuration data that cannot be provided by the reset microcode 105. The reset microcode 105 comprises a plurality of micro instructions held in an internal microcode memory (not shown). When the microprocessor core 101 is reset, the micro instructions in the internal microcode memory are executed to perform the initialization functions of the core 101, including reading the configuration data in the fuse array 102 and providing it to elements such as the microcode registers 108 and the microcode insertion mechanisms 109. Whether the configuration data of the fuse array is provided to the various elements 107~110 of the core 101 by the reset circuits 104 or by the reset microcode 105 is a function of the particular design of the core 101. It is not the purpose of this description to teach the specific initialization of integrated circuit devices. Those skilled in the art appreciate, however, that the configurable elements 107~110 of a present-day microprocessor core 101 generally fall into four types, exemplified in Fig. 1: control circuits, microcode registers, microcode insertion mechanisms, and cache correction mechanisms. Those skilled in the art also appreciate that the values of configuration data vary markedly according to the type of data. For example, a 64-bit control circuit 107 may contain ASCII data that specifies the serial number of the core 101. Another 64-bit control register may hold 64 different speed settings, only one of which is asserted at a time, to control the operating speed of the core 101. The microcode registers 108 are typically initialized to all 0s (i.e., the logic low state) or all 1s (i.e., the logic high state). The microcode insertion mechanisms 109 may contain an approximately uniform distribution of 1s and 0s, indicating addresses in a microcode ROM (not shown) together with the replacement microcode values for those addresses. Finally, the cache correction mechanisms may contain only sparse settings of 1s, indicating that a particular cache sub-bank element (i.e., a row or a column) is to be replaced by a particular replacement element.
The fuse array 102 provides an excellent means of configuring a device such as the microprocessor core 101 after the device has been fabricated. By blowing selected fuses in the fuse array 102, the core 101 can be configured to operate in its intended environment. As those skilled in the art appreciate, however, the operating environment may change after the fuse array 102 has been programmed. For business reasons, a core 101 may need to be reconfigured, for example from a desktop device to a mobile device. Designers therefore provide redundant fuses in the fuse array 102 that can be used to effectively "unblow" a fuse, so that the core 101 can be reconfigured, fabrication errors can be corrected, and so on. A fuse array with redundant fuses is described with reference to Fig. 2.
Referring to Fig. 2, block diagram 200 shows a fuse array 201 in the microprocessor core 101. The fuse array 201 comprises fuse banks 202: redundant fuse banks RFB1~RFBN and primary fuse banks PFB1~PFBN. The primary fuse banks PFB1~PFBN are blown first, and the redundant fuse banks RFB1~RFBN are blown afterward. Each of the fuse banks RFB1~RFBN and PFB1~PFBN comprises a predetermined number of individual fuses 203, and the number of fuses 203 depends on the particular design of the core 101. For example, in a 64-bit microprocessor core 101, each fuse bank 202 may have 64 fuses 203, so that the configuration data is in a form the core 101 can readily use.
The fuse array 201 is coupled to registers 210~211, which are typically disposed in the reset logic of the microprocessor core 101. The primary register PR1 (210) reads one of the primary fuse banks PFB1~PFBN (fuse bank PFB3 in block diagram 200). The redundant register RR1 (211) reads one of the redundant fuse banks RFB1~RFBN. Both registers 210 and 211 are coupled to an exclusive-OR (XOR) gate 212, which produces an output FB3.
In operation, after the microprocessor core 101 has been fabricated, the primary fuse banks PFB1~PFBN are programmed by known techniques with the configuration data used by the core 101. The redundant fuse banks RFB1~RFBN are left unblown and therefore remain at the logic low state. At power-up/reset of the core 101, the primary register 210 and the redundant register 211 read the states of the primary fuse banks PFB1~PFBN and the redundant fuse banks RFB1~RFBN, respectively. The XOR gate 212 performs an exclusive-OR on the contents of registers 210 and 211 to produce the output FB3. Because none of the redundant fuses has been blown (all are at the logic low state), the output FB3 is simply the value that was programmed into the primary fuse bank after fabrication.
Suppose now that design or business requirements dictate that the information programmed into the primary fuse banks PFB1~PFBN be changed. To change the information read at power-up, a programming operation is performed to blow the corresponding fuses 203 in the redundant fuse banks RFB1~RFBN. When a fuse 203 in a selected redundant fuse bank is blown, the corresponding fuse 203 in the primary fuse bank is logically complemented at the XOR output.
The mechanism of Fig. 2 thus allows fuses 203 in the microprocessor core 101 to be "re-blown". As those skilled in the art appreciate, however, because there is only one set of redundant fuse banks RFB1~RFBN, each fuse 203 can be re-blown only once. To allow further re-blows, additional fuse banks 202 and registers 210~211 must be added to the core 101.
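The XOR arrangement of Fig. 2 can be illustrated with a minimal sketch. The function names `read_bank` and `reblow` and the constant `WIDTH` are hypothetical and chosen only for illustration; the 64-bit width follows the 64-fuse example above. The sketch assumes a blown fuse reads as 1 and that a fuse, once blown, cannot be restored.

```python
WIDTH = 64                  # assumed bank width (64 fuses per bank)
MASK = (1 << WIDTH) - 1

def read_bank(primary: int, redundant: int = 0) -> int:
    """Model XOR gate 212: output = primary register XOR redundant register."""
    return (primary ^ redundant) & MASK

def reblow(primary: int, redundant: int, desired: int) -> int:
    """Return the redundant-bank pattern that makes the XOR output equal `desired`.

    Fuses only go from unblown (0) to blown (1), so the new pattern must keep
    every redundant fuse that is already blown. If it cannot, the bank has
    used up its single re-blow for that bit position.
    """
    needed = (primary ^ desired) & MASK
    if redundant & ~needed & MASK:
        raise ValueError("a redundant fuse that is already blown would have to be unblown")
    return needed

# Usage: change bit 3 of a programmed bank from 1 to 0.
primary = 0b1010
redundant = reblow(primary, 0, desired=0b0010)
value_after_reblow = read_bank(primary, redundant)
```

A second attempt to restore bit 3 in the same bank raises `ValueError`, which mirrors the one-re-blow limit described above.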
Until now, the fuse array mechanisms of Fig. 1 and Fig. 2 have provided enough flexibility to configure microprocessor cores and related devices while allowing a limited number of re-blows. Earlier fabrication technologies (e.g., 65 nm and 45 nm) allowed enough fuses to be formed on a die to configure the microprocessor core 101 on that die. Present-day techniques, however, are limited by two significant factors. First, the trend in the art is to form multiple microprocessor cores 101 on a single die to increase processing performance. Such multi-core devices may have, for example, 2 to 16 individual cores 101, each of which must be configured with fuse data at power-up/reset. A four-core device therefore uses four fuse arrays 201, one per core, because some of the data (e.g., cache correction data and redundant fuse data) may differ from core to core. Second, as those skilled in the art appreciate, when fabrication process technologies shrink (e.g., to 32 nm), transistor size shrinks as well, but fuse size increases. A fuse array sized for a 45 nm die must therefore be implemented on a 32 nm die.
In view of these limitations and the other challenges facing device designers, particularly designers of multi-core devices, the present invention provides a significant improvement over known device configuration mechanisms. It allows individual cores in a multi-core device to be programmed, and it increases the number of cache corrections and fuse re-programmings (re-blows) that can be performed. The present invention is described below with reference to Figs. 3-12.
Fig. 3 is a block diagram of a system 300 according to the present invention for compressing and decompressing the configuration data of a multi-core device. The multi-core device comprises a plurality of cores 332 disposed on a die 330. For clarity, Fig. 3 shows only four cores, CORE1~CORE4. In other embodiments, the die 330 may have a different number of cores 332. In the present embodiment, all of the cores 332 share a single cache memory 334, which is also disposed on the die 330. A single programmable fuse array 336 is likewise disposed on the die 330, and each core 332 accesses the fuse array 336 during power-up/reset to retrieve and decompress the configuration data.
In one embodiment, the cores 332 are microprocessor cores that form a multi-core microprocessor (die) 330. In another embodiment, the multi-core microprocessor 330 is an x86-compatible multi-core microprocessor. In a further embodiment, the cache memory 334 comprises a level 2 (L2) cache coupled to the microprocessor cores 332. In one embodiment, the fuse array 336 comprises 8192 (8K) individually programmable fuses (not shown), although other numbers of fuses may be used. In a single-core embodiment, only one core 332 is disposed on the die 330, and that core 332 is coupled to the cache memory 334 and the fuse array 336. Although the features and functions of the invention are described below in the context of a multi-core device (die) 330, they apply equally to the single-core embodiment.
The system 300 also comprises a device programmer 310. The device programmer 310 comprises a compressor 320, which is coupled to a virtual fuse array 303. In one embodiment, the device programmer 310 may comprise a central processing unit (not shown) that processes the configuration data and programs the fuse array 336 by known programming techniques after the die 330 has been fabricated. The central processing unit may be integrated into wafer test equipment used to test the fabricated device die 330. In one embodiment, the compressor 320 may comprise an application program that executes on the device programmer 310, and the virtual fuse array 303 may comprise locations in a memory accessed by the compressor 320. The virtual fuse array 303 comprises a plurality of virtual fuse banks 301, and each virtual fuse bank 301 comprises a plurality of virtual fuses 302. In one embodiment, the virtual fuse array 303 comprises 128 virtual fuse banks 301 of 64 virtual fuses 302 each, so that the virtual fuse array 303 is 8 Kb (8192 bits) in size.
In operation, the configuration information for the device 330 is entered into the virtual fuse array 303 as part of the fabrication process, as described above with reference to Fig. 1. The configuration information thus comprises configuration data for control circuits, initialization data for microcode registers, microcode insertion data, and cache correction data. As also noted above, the distribution of values differs from one type of configuration data to another. The virtual fuse array 303 is a logical representation of a fuse array (not shown) that holds the configuration information for each microprocessor core 332 on the die 330 and the correction data for each cache memory 334 on the die 330.
After the information has been entered into the virtual fuse array 303, the compressor 320 reads the states of the virtual fuses 302 in each virtual fuse bank 301 and compresses them using a distinct compression algorithm for each data type, producing compressed fuse array data. In one embodiment, the system data for control circuits is not compressed but is transferred without compression. To compress the microcode register data, a microcode register data compression algorithm is used that is effective for data having the state distribution characteristic of microcode register data. To compress the microcode insertion data, a microcode insertion data compression algorithm is used that is effective for data having the state distribution characteristic of microcode insertion data. To compress the cache correction data, a cache correction data compression algorithm is used that is effective for data having the state distribution characteristic of cache correction data.
The device programmer 310 then programs the uncompressed and compressed fuse array data into the physical fuse array 336 on the die 330.
At power-up/reset, each core 332 may access the physical fuse array 336 to retrieve the uncompressed and compressed fuse array data. Reset circuits/microcode (not shown) in each core 332 distribute the uncompressed fuse array data and decompress the compressed fuse array data according to a distinct decompression algorithm for each data type, thereby reproducing the values originally entered into the virtual fuse array 303. The reset circuits/microcode then provide the configuration information to the control circuits (not shown), microcode registers (not shown), insertion elements (not shown), and cache correction elements (not shown).
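The program-then-reset flow above can be sketched as a round trip through a table of per-type codecs. The patent text does not specify the algorithms, so the two codecs here are assumptions chosen only to match the stated data characteristics: control data passes through uncompressed, and sparse cache correction data is stored as the positions of its 1 bits. All names (`CODECS`, `program`, `reset_read`) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

BANK_BITS = 64  # assumed bank width

def compress_sparse(bank: int) -> List[int]:
    """Encode a mostly-zero bank as the list of bit positions that hold a 1."""
    return [i for i in range(BANK_BITS) if (bank >> i) & 1]

def decompress_sparse(positions: List[int]) -> int:
    """Rebuild the bank from the stored bit positions."""
    bank = 0
    for i in positions:
        bank |= 1 << i
    return bank

def passthrough(bank: int) -> int:
    """Control-circuit data is transferred without compression."""
    return bank

# One (compress, decompress) pair per configuration data type.
CODECS: Dict[str, Tuple[Callable, Callable]] = {
    "control": (passthrough, passthrough),
    "cache_correction": (compress_sparse, decompress_sparse),
}

def program(virtual: List[Tuple[str, int]]) -> list:
    """Device programmer side: compress each bank by its data type."""
    return [(kind, CODECS[kind][0](bank)) for kind, bank in virtual]

def reset_read(physical: list) -> List[Tuple[str, int]]:
    """Core side at reset: decompress each entry by its data type."""
    return [(kind, CODECS[kind][1](payload)) for kind, payload in physical]

virtual = [("control", 0x1234), ("cache_correction", (1 << 5) | (1 << 40))]
physical = program(virtual)
restored = reset_read(physical)
```

In this sketch `restored` equals `virtual`, which is the property the description requires: decompression at reset reproduces the values originally entered into the virtual fuse array.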
The fuse array compression system 300 of the present invention thus allows device designers to reduce the number of fuses in the physical fuse array 336 and to configure a multi-core device 330 at power-up/reset from the compressed information programmed into it.
Referring to Fig. 4, block diagram 400 shows a fuse decompression mechanism according to the present invention. The decompression mechanism may be disposed in each of the microprocessor cores 332 of Fig. 3. For clarity, Fig. 4 shows only a single core 420, but each core 332 on the die of Fig. 3 comprises the elements of the core 420 of Fig. 4. A physical fuse array 401 is disposed on the die and coupled to the core 420. The physical fuse array 401 comprises compressed microcode insertion fuses 403, compressed register fuses 404, compressed cache correction fuses 405, and compressed fuse correction fuses 406. The physical fuse array 401 may also hold uncompressed configuration data (not shown), such as the system configuration data described above, and/or error checking and correction (ECC) codes (not shown). The ECC features of the present invention are described later.
The microprocessor core 420 comprises a reset controller 417. The reset controller 417 receives a reset signal RESET, which initializes the core 420 and causes it to perform a reset sequence. The reset controller 417 has a decompressor 421. The decompressor 421 has an insertion fuse element 408, a register fuse element 409, and a cache fuse element 410. The decompressor 421 also comprises a fuse correction element 411, which is coupled to the insertion fuse element 408, the register fuse element 409, and the cache fuse element 410 by a bus 412. The insertion fuse element 408 is coupled to microcode insertion elements 414 in the core 420. The register fuse element 409 is coupled to microcode registers 415 in the core 420. The cache fuse element 410 is coupled to cache correction elements 416 in the core 420. In one embodiment, the cache correction elements 416 are disposed in an on-die level 2 (L2) cache (not shown) that is shared by all of the cores 420, such as the cache memory 334 of Fig. 3. In another embodiment, the cache correction elements 416 are disposed in an on-die level 1 (L1) cache (not shown). In a further embodiment, the cache correction elements 416 are disposed in both the on-die L1 and L2 caches (not shown).
In operation, when the reset signal RESET is asserted, the reset controller 417 reads the states of the fuses 403~406 in the physical fuse array 401 and provides the states of the compressed fuses to the decompressor 421. The fuse correction element 411 in the decompressor 421 then decompresses the states of the compressed fuse correction fuses 406 to obtain data that indicates one or more fuse addresses in the physical fuse array 401 whose previously programmed states are to be changed. The decompressed data may also include a value for each of these fuse addresses. The fuse addresses (and optional values) are routed over the bus 412 to the elements 408~410, so that the states of the corresponding fuses are changed before they are decompressed.
In one embodiment, the insertion fuse element 408 comprises microcode that decompresses the states of the compressed microcode insertion fuses 403 according to a microcode insertion decompression algorithm, which corresponds to the microcode insertion compression algorithm described with reference to Fig. 3. In one embodiment, the register fuse element 409 comprises microcode that decompresses the compressed register fuses 404 according to a register fuse decompression algorithm, which corresponds to the register fuse compression algorithm described with reference to Fig. 3. In one embodiment, the cache fuse element 410 comprises microcode that decompresses the compressed cache correction fuses 405 according to a cache correction fuse decompression algorithm, which corresponds to the cache correction fuse compression algorithm described with reference to Fig. 3. The fuse correction element 411 provides fuse addresses over the bus 412. Each of the elements 408~410 first changes the states of its corresponding fuses at those addresses and then decompresses its own fuse data according to the corresponding algorithm. The multiple re-blow capability of the present invention is described in detail later. The re-blow step is performed before the elements 408~411 begin decompression. In one embodiment, the bus 412 may comprise conventional microcode programming mechanisms for transferring data. The present invention also contemplates a comprehensive decompressor 421 that can recognize and decompress configuration data according to its type. The elements 408~411 are therefore shown within the decompressor 421 for purposes of explanation, and they are not required as long as the comprehensive decompressor 421 provides their functions.
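The ordering described above, in which corrections are applied to the raw fuse states before any decompression, can be sketched as follows. The representation is an assumption for illustration: fuse states are a list of 0/1 values, and each correction is a hypothetical `(address, value)` pair in which a missing value means "complement the programmed state".

```python
from typing import Iterable, List, Optional, Tuple

# (fuse address, optional replacement value); None means "toggle the state".
Correction = Tuple[int, Optional[int]]

def apply_corrections(fuses: List[int], corrections: Iterable[Correction]) -> List[int]:
    """Return a corrected copy of the fuse states; the input list is left unchanged."""
    corrected = list(fuses)
    for address, value in corrections:
        if value is None:
            corrected[address] ^= 1      # complement the previously programmed state
        else:
            corrected[address] = value   # force the supplied value
    return corrected

# Usage: states as read from the array, then corrected before decompression.
raw = [1, 0, 0, 1]
fixed = apply_corrections(raw, [(0, None), (2, 1)])
# `fixed` would now be handed to the type-specific decompression routine.
```

Keeping correction as a separate pass means each type-specific decompressor can stay unaware of how many programming events have occurred.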
In one embodiment, the reset controller 417 initiates execution of the microcode of the insertion fuse element 408 to decompress the compressed microcode insertion fuses 403. The reset controller 417 also initiates execution of the microcode of the register fuse element 409 to decompress the states of the compressed register fuses 404, and of the microcode of the cache fuse element 410 to decompress the compressed cache correction fuses 405. Before decompression, the microcode of the decompressor 421 first changes the states of the fuses designated by the fuse correction data in the compressed fuse correction fuses 406.
The reset controller 417, the decompressor 421, and the elements 408~411 are configured to perform the functions described above. They may comprise logic, circuits, devices, or microcode, or a combination thereof, or equivalent elements capable of performing the described functions and operations. The elements used to implement the reset controller 417, the decompressor 421, and the elements 408~411 may be shared with other circuits, microcode, and so on that perform other functions and/or operations within the reset controller 417, the decompressor 421, the elements 408~411, or other elements of the core 420.
After the states of the fuses 403~406 in the physical fuse array 401 have been corrected and decompressed, the decompressed virtual fuse states are provided to the microcode insertion elements 414, the microcode registers 415, and the cache correction elements 416. The core 420 then proceeds with the remainder of its reset operations.
In other embodiments, the decompression functions described above need not be performed in any particular order during reset. For example, the microcode insertion data may be decompressed after the microcode register initialization data. Likewise, in other embodiments the decompression functions may be performed in parallel to meet design requirements.
In addition, the elements 408~411 of the present invention need not be implemented in microcode rather than hardware circuits, because a typical microprocessor core 420 contains some elements that are more easily initialized by hardware (for example, through a scan chain associated with a cache) than by direct microcode writes. Such implementation details are left to the designer. In known techniques, however, the cache correction fuses are conventionally read into a cache correction scan chain by hardware circuits during reset, before microcode is initiated. Because the caches of a core are generally not turned on until microcode begins to run, a feature of the present invention is to implement the cache fuse decompressor 410 in microcode rather than in hardware control circuits. Implementing the cache fuse element 410 in microcode allows the cache correction data to be written into a scan chain with a significant saving in hardware, and thus provides a more flexible and advantageous mechanism.
Fig. 5 shows the format of compressed configuration data 500 according to the present invention. The compressor 320 of Fig. 3 compresses the data in the virtual fuse array 303, and the compressed configuration data 500 is programmed (i.e., blown) into the physical fuse array 336 of the multi-core device 330. During reset, as described above, each core 332 retrieves the compressed configuration data 500 from the physical fuse array 336, and the data is corrected and decompressed by the elements 408~411 of the decompressor 421 in each core 420. The decompressed and corrected configuration data is then provided to the various elements 414~416 of the core 420 to initialize the core 420.
The compressed configuration data 500 comprises one or more compressed data fields (D) 502 for each of the configuration data types described above, and the data types are demarcated by end-of-type fields (ET) 503. Programming events (i.e., blows) are demarcated by end-of-blow fields (EB) 504. The compressed data fields 502 associated with each data type are encoded according to a compression algorithm that minimizes the number of bits (i.e., fuses) needed to store the bit patterns characteristic of that data type. The number of fuses in the physical fuse array 336 that make up each compressed data field 502 is a function of the compression algorithm used for the particular data type. For example, consider a core that has 64 microcode registers, each of which must be initialized to all 0s or all 1s. An optimal compression algorithm for this data type may yield 64 compressed data fields 502, each carrying the initialization data for one particular microcode register, with the fields specified in register-number order (i.e., 1-64). Each compressed data field 502 then consists of a single fuse, which is blown if the corresponding microcode register is to be initialized to all 1s and left unblown if the register is to be initialized to all 0s.
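The microcode register example above can be written out as a minimal sketch. The 64-bit register width is an assumption made for illustration (the text specifies 64 registers but not their width), and the function names are hypothetical.

```python
from typing import List

REGISTER_BITS = 64                      # assumed register width
ALL_ONES = (1 << REGISTER_BITS) - 1

def compress_registers(values: List[int]) -> List[int]:
    """One fuse per register, in register-number order: 1 = all 1s, 0 = all 0s."""
    fuses = []
    for value in values:
        if value == 0:
            fuses.append(0)             # fuse left unblown
        elif value == ALL_ONES:
            fuses.append(1)             # fuse blown
        else:
            raise ValueError("register value is neither all 0s nor all 1s")
    return fuses

def decompress_registers(fuses: List[int]) -> List[int]:
    """Expand each single-fuse field back into a full register value."""
    return [ALL_ONES if fuse else 0 for fuse in fuses]

# Usage: 64 registers alternating between all 0s and all 1s.
values = [0, ALL_ONES] * 32
fuses = compress_registers(values)
uncompressed_fuse_count = len(values) * REGISTER_BITS
compressed_fuse_count = len(fuses)
```

Under the stated width assumption, storing the 64 registers directly would take 64 × 64 = 4096 fuses, while the single-fuse encoding takes 64.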
After the initial programming event, the elements 408~410 of the decompressor 421 in the core 420 use the end-of-type fields 503 to locate their respective compressed data in the physical fuse array 336, and the fuse correction decompressor 411 uses the end-of-blow fields 504 to locate the compressed fuse correction data that was programmed (i.e., blown) after the initial programming event. To accommodate a number of subsequent programming events, the present invention provides a substantial number of spare fuses in the physical fuse array 336, as described in detail below.
The compressed format above is presented to illustrate the compression and decompression of configuration data according to the present invention. The specific manner in which data types are compressed and demarcated in Fig. 5, and the number and types of data compressed into the fuse array 401, are not intended to limit the invention. Other embodiments may use other numbers, types, and formats to adapt the invention to different devices and algorithms.
Please refer to Fig. 6, which shows one possible format of decompressed microcode insertion configuration data 600 according to the present invention. During a reset operation, each kernel 420 reads the compressed microcode insertion configuration data from physical level fuse array 401. The compressed microcode insertion configuration data is then corrected according to the fuse correction data provided over bus 412. Next, insertion fuse element 408 decompresses the corrected compressed microcode insertion configuration data. The result of the decompression process is decompressed microcode insertion configuration data 600. Data 600 comprises multiple decompressed data blocks 604. The number of decompressed data blocks 604 corresponds to the number of microcode insertion elements 414 in kernel 420 that require initialization data. Each decompressed data block 604 comprises a kernel address field 601, a microcode memory (ROM) address field 602 and a microcode insertion data field 603. The lengths of fields 601~603 are a function of the kernel architecture. As part of the decompression process, insertion fuse element 408 provides a complete image of the target data needed to initialize microcode insertion elements 414. After the microcode insertion configuration data 600 has been decompressed, a known distribution mechanism may be used to distribute data 603 to the respectively addressed kernel and to the microcode ROM replacement circuits/registers in microcode insertion elements 414.
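As a rough sketch of the block format of Fig. 6, the Python below splits a decompressed bit string into blocks with fields 601~603 and selects the blocks addressed to one kernel. The field widths, the bit-string representation and all names are assumptions made for illustration; the text says only that the field lengths depend on the kernel architecture.

```python
from dataclasses import dataclass

# Hypothetical field widths in bits; the real widths depend on the kernel architecture.
KERNEL_ADDR_BITS = 2
ROM_ADDR_BITS = 16
INSERT_DATA_BITS = 32
BLOCK_BITS = KERNEL_ADDR_BITS + ROM_ADDR_BITS + INSERT_DATA_BITS


@dataclass(frozen=True)
class InsertBlock:
    kernel_addr: int   # kernel address field 601
    rom_addr: int      # microcode ROM address field 602
    insert_data: int   # microcode insertion data field 603


def parse_blocks(bits):
    """Split a string of '0'/'1' characters into decompressed data blocks 604."""
    if len(bits) % BLOCK_BITS:
        raise ValueError("bit string is not a whole number of blocks")
    blocks = []
    for start in range(0, len(bits), BLOCK_BITS):
        chunk = bits[start:start + BLOCK_BITS]
        rom_start = KERNEL_ADDR_BITS
        data_start = rom_start + ROM_ADDR_BITS
        blocks.append(InsertBlock(
            int(chunk[:rom_start], 2),
            int(chunk[rom_start:data_start], 2),
            int(chunk[data_start:], 2),
        ))
    return blocks


def distribute(blocks, kernel_id):
    """Return {rom_addr: insert_data} for the blocks addressed to one kernel."""
    return {b.rom_addr: b.insert_data for b in blocks if b.kernel_addr == kernel_id}
```

The same three-field layout applies to the microcode register data of Fig. 7, with a register address taking the place of the ROM address.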
Please refer to Fig. 7, which shows a format of decompressed microcode register configuration data 700 according to the present invention. During a reset operation, each kernel 420 reads the compressed microcode register configuration data from physical level fuse array 401. The compressed microcode register configuration data is then corrected according to the fuse correction data provided over bus 412. Next, register fuse element 409 decompresses the corrected compressed microcode register configuration data. The result of the decompression process is decompressed microcode register configuration data 700. Data 700 comprises multiple decompressed data blocks 704, and the number of decompressed data blocks 704 corresponds to the number of microcode registers 415 in kernel 420 that require initialization data. Each decompressed data block 704 has a kernel address field 701, a microcode register address field 702 and a microcode register data field 703. The lengths of fields 701~703 are a function of the kernel architecture. As part of the decompression process, register fuse element 409 provides a complete image of the target data needed to initialize microcode registers 415. After the microcode register configuration data 700 has been decompressed, a known distribution mechanism may be used to distribute data 703 to the respectively addressed kernel and microcode registers 415.
Please refer to Fig. 8, which shows one possible format of decompressed cache correction data 800 according to the present invention. During a reset operation, each kernel 420 reads the compressed cache correction data from physical level fuse array 401. The compressed cache correction data is then corrected according to the fuse correction data provided over bus 412. Next, cache fuse element 410 decompresses the corrected compressed cache correction data. The result of the decompression process is decompressed cache correction data 800. Multi-core processor 300 may use various cache mechanisms; decompressed cache correction data 800 is presented here in the context of a shared secondary cache 334, which all kernels 332 may access using the same storage space. The format shown in Fig. 8 follows from that architecture. Data 800 comprises multiple decompressed data blocks 804, and the number of decompressed data blocks 804 corresponds to the number of cache correcting elements 416 in kernel 420 that require correction data. Each decompressed data block 804 has a sub-cell row address field 802 and a replacement row address field 803.
Those skilled in the art know that when a cache is manufactured, redundant rows (or columns) are formed within each sub-cell of the cache so that a functional redundant row (or column) can replace a non-functional row (or column) in that sub-cell. Decompressed cache correction data 800 therefore allows functional rows to be substituted for non-functional rows (as shown in Fig. 8). Those skilled in the art also know that a conventional cache correction fuse array mechanism includes fuses associated with each sub-cell row, which are fused when that row must be replaced by a redundant row. Because so many fuses are needed to address all sub-cells and the rows within them, typically only some of the sub-cells are covered, and the conventional cache correction fuses are only sparsely fused. A feature of the present invention is that sub-cell row addresses and replacement row addresses are addressed and compressed only for those sub-cell rows that need to be replaced, which minimizes the number of fuses used for cache correction data. Consequently, within the limits set by the size of the physical level fuse array and the amount of other configuration data programmed into it, the present invention can extend the number of sub-cell rows (or columns) of cache 334 that can be corrected. In the embodiment shown in Fig. 8, each of the associated kernels 332 that share secondary cache 334 is addressed and provided with correction data 802~803 for its respective cache correcting elements 416. The lengths of fields 801~803 are a function of the kernel architecture. As part of the decompression process, cache fuse element 410 provides a complete image of the target data needed to initialize cache correcting elements 416. After cache correction data 800 has been decompressed, a known distribution mechanism in the responsible kernel 420 may distribute data 802~803 to the addressed cache correcting elements 416.
Please refer to Fig. 9, which shows one possible format of decompressed fuse correction data 900 according to the present invention. As mentioned above, during reset, fuse correcting element 411 accesses the compressed fuse correction data 406 in physical level fuse array 401, decompresses it, and provides decompressed fuse correction data 900 to the other elements 408~410 of kernel 420. The decompressed fuse correction data has at least one end fusing field (EB) 901, which indicates that a programming event in physical level fuse array 401 has been successfully completed. If a subsequent programming event occurs, a remelting field (R) 902 is programmed to indicate that the one or more fuse correction fields (FC) 903 that follow identify fuses in physical level fuse array 401 that are to be set again. Each fuse correction field contains the address of a specific fuse in physical level fuse array 401 together with the state (fused or not fused) to which that fuse is to be set. Only the fuses that are to be set again appear in fuse correction fields 903, and the fields 903 belonging to each re-setting event are delimited by an end fusing field 901. If a properly encoded remelting field 902 follows a given end fusing field 901, one or more fuses may then be set again as indicated by the corresponding fuse correction fields. Therefore, within the limits of the fuse array size and the other data stored in the array, the present invention allows the same fuse to be set repeatedly.
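The way a sequence of fields 901~903 overrides raw fuse states can be sketched as follows. The tuple encoding of the fields and the function name are hypothetical, chosen only to make the ordering rules concrete; the actual bit-level encoding is not described in the text.

```python
def apply_fuse_corrections(raw_fuses, fields):
    """Apply decompressed fuse correction fields to a copy of the raw fuse states.

    `fields` uses a hypothetical tuple encoding:
      ("EB",)                 end fusing field 901
      ("R",)                  remelting field 902
      ("FC", address, state)  fuse correction field 903
    """
    corrected = list(raw_fuses)
    reset_active = False
    for field in fields:
        kind = field[0]
        if kind == "EB":
            reset_active = False          # the current programming event is closed
        elif kind == "R":
            reset_active = True           # fuse correction fields follow
        elif kind == "FC":
            if not reset_active:
                raise ValueError("fuse correction field without a remelting field")
            _, address, state = field
            corrected[address] = state    # later events override earlier ones
        else:
            raise ValueError(f"unknown field kind: {kind!r}")
    return corrected
```

Because later events simply override earlier ones, the same fuse address can be corrected more than once, limited only by the spare space left in the array.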
Sharing a physical level fuse array that stores compressed configuration data yields die-area and power savings, which in turn make room for additional features on a multi-kernel die. In addition, those skilled in the art know that current semiconductor fuse structures have several shortcomings, one of which is known as "growback". Growback is the reversal of programming: some time after a fuse has been fused, it reconnects, that is, it returns from a programmed state (i.e., fused) to an unprogrammed state (i.e., not fused).
To deal with growback and other challenges, the present invention offers several advantages, one of which is the provision of redundant yet configurable physical level fuse arrays. Accordingly, Figure 10 presents a configurable redundant fuse array mechanism.
Please refer to Figure 10, which shows one possible embodiment of the physical level fuse arrays 1001 of a multi-kernel device 1000 according to the present invention. Multi-kernel device 1000 comprises multiple kernels 1002, whose features have been disclosed in Fig. 3-Figure 10 and the related description. In addition, each kernel 1002 comprises an array control 1003, which is programmed according to the configuration data in configuration data register 1004. Each array control 1003 is coupled to the redundant fuse arrays 1001.
For purposes of illustration, Figure 10 shows only four kernels 1002 and two physical level fuse arrays 1001, but this is not intended to limit the present invention. In other embodiments, other numbers of kernels 1002 and physical level fuse arrays 1001 may be used in accordance with this disclosure.
In operation, configuration data register 1004 supplies configuration data that specifies a particular configuration of the physical level fuse arrays 1001. In one embodiment, according to the value of the configuration data, the physical level fuse arrays 1001 are configured as one aggregate physical level fuse array. The size of the aggregate physical level fuse array equals the sum of the sizes of the individual physical level fuse arrays 1001, so the aggregate array can store more configuration data than a single physical level fuse array 1001 can. Accordingly, array control 1003 directs its corresponding kernel 1002 to read the physical level fuse arrays 1001 as one aggregate physical level fuse array. In another embodiment, to deal with growback, the physical level fuse arrays 1001 are configured, according to the value of the configuration data, as redundant fuse arrays that are programmed with identical configuration data, and the array control 1003 in each kernel 1002 includes elements that logically OR the contents of the two (or more) arrays. Therefore, if a fuse in one array 1001 grows back, the corresponding fuse in at least one other array 1001 still remains in the fused state. In a fail-safe embodiment, according to the value of the configuration data, at least one physical level fuse array 1001 is selectively disabled and the remaining arrays 1001 are enabled, in either the aggregate configuration or the logical OR configuration. The array control 1003 in each kernel 1002 then, according to the particular configuration data in configuration data register 1004, does not access the contents of the disabled arrays 1001 and accesses only the enabled arrays.
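The three configurations described above (aggregate, redundant with logical OR, and fail-safe with disabled arrays) can be modeled with a small Python sketch. The function name, the `mode` strings and the `enabled` flags are hypothetical stand-ins for whatever encoding configuration data register 1004 actually uses.

```python
def read_fuse_arrays(arrays, mode, enabled=None):
    """Model how an array control presents several fuse arrays to its kernel.

    arrays:  list of fuse arrays, each a list of 0/1 fuse states (1 = fused)
    mode:    "aggregate" concatenates the arrays; "redundant" ORs them bitwise
    enabled: optional per-array flags; disabled arrays are never accessed
    """
    if enabled is None:
        enabled = [True] * len(arrays)
    active = [array for array, on in zip(arrays, enabled) if on]
    if not active:
        raise ValueError("at least one fuse array must be enabled")
    if mode == "aggregate":
        # Total size is the sum of the individual array sizes.
        return [bit for array in active for bit in array]
    if mode == "redundant":
        if len({len(array) for array in active}) != 1:
            raise ValueError("redundant arrays must have equal size")
        # A fuse reads as fused if it is still fused in any copy.
        return [int(any(bits)) for bits in zip(*active)]
    raise ValueError(f"unknown mode: {mode!r}")
```

In the redundant mode a single fuse that grows back in one copy is masked by the other copy, whereas disabling the intact copy exposes the fault, which is why the fail-safe configuration is chosen per device.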
Configuration data register 1004 may be programmed by any known means, including programmable fuses, external pin settings, JTAG programming or other similar means.
In another aspect, it has been observed that when at least one physical level fuse array is disposed on a single die with multiple kernels, problems may arise when the kernels access the array. In particular, during a start-up/reset operation, each kernel in the multi-core processor must read the physical level fuse array in serial fashion: the first kernel reads the array, then the second kernel reads the array, then the third kernel, and so on. Those skilled in the art know that reading a fuse array takes a long time compared with the other operations a kernel performs. Therefore, when multiple kernels must read the same array, the time required is roughly the read time of one kernel multiplied by the number of kernels on the die. Those skilled in the art also know that the fuses must be read reliably, yet, depending on the manufacturing process, the lifetime of a semiconductor fuse is affected by the number of times it is read, which in turn affects its reliability. Therefore, in other embodiments, the present invention reduces the time the kernels spend reading the physical level fuse array, and extends the lifetime of the fuse array by reducing the number of accesses made by the kernels of the multi-core processor during start-up and reset operations.
Please refer to Figure 11, which is a schematic diagram of a mechanism for rapidly loading configuration data into a multi-kernel device 1100 according to the present invention. Device 1100 has multiple kernels 1102, whose characteristics are as described with reference to Fig. 3-Figure 10. In addition, each kernel 1102 has an array control 1103, which is programmed by the load data in load data register 1104. Each kernel 1102 is coupled to a physical level fuse array 1101, whose features are as described with reference to Fig. 3-Figure 10. Each kernel 1102 is also coupled to a random access memory (RAM) 1105, which is disposed on the same die as the kernels 1102 but not within any kernel 1102. RAM 1105 is therefore called non-kernel RAM 1105.
For convenience of description, Figure 11 shows only four kernels 1102 and one physical level fuse array 1101, but this is not intended to limit the present invention. In other embodiments, the mechanism may be extended to any number of kernels 1102 and to multiple physical level fuse arrays 1101.
In operation, each kernel receives load data from load data register 1104, which specifies how that kernel is to load the contents of physical level fuse array 1101. The value in load data register 1104 designates one kernel 1102 as the master kernel 1102; the remaining kernels are called slave kernels 1102 and are assigned a load order. During a start-up/reset operation, array control 1103 causes the master kernel 1102 to read the contents of physical level fuse array 1101 and then write those contents into non-kernel RAM 1105. If multiple physical level fuse arrays 1101 are disposed on the die, non-kernel RAM 1105 must be large enough to store the data of all physical level fuse arrays 1101. After the master kernel 1102 has stored the contents of physical level fuse array 1101 in non-kernel RAM 1105, array control 1103 causes each corresponding slave kernel 1102 to read the fuse array contents from non-kernel RAM 1105, in the load order specified by load data register 1104.
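A minimal Python model of this loading scheme is given below, mainly to show that the fuse array is read once regardless of the number of kernels. The `FuseArray` class, the `load_order` list (master first, then slaves in their load order) and the returned dictionary are hypothetical modeling choices, not structures named in the text.

```python
class FuseArray:
    """Toy physical fuse array that counts how often it is read."""

    def __init__(self, contents):
        self._contents = list(contents)
        self.read_count = 0

    def read(self):
        self.read_count += 1
        return list(self._contents)


def load_configuration(fuse_array, load_order):
    """Load fuse contents into every kernel with a single fuse array read.

    load_order[0] is the master kernel; the remaining entries are the slave
    kernels in the order in which they read the non-kernel RAM.
    """
    master, *slaves = load_order
    non_kernel_ram = fuse_array.read()       # only the master touches the fuses
    loaded = {master: list(non_kernel_ram)}
    for kernel in slaves:                    # slaves read the RAM copy, in order
        loaded[kernel] = list(non_kernel_ram)
    return loaded
```

With N kernels, the serial scheme described earlier costs roughly N fuse array reads per reset, while this scheme costs one fuse array read plus N-1 much faster RAM reads.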
Load data register 1104 may be programmed by any known means, including programmable fuses, external pin settings, JTAG programming or other similar means. The embodiment shown in Figure 11 may also be combined with the redundant fuse array mechanism described with reference to Figure 10.
Please refer to Figure 12, which shows one possible embodiment of an error checking and correction (ECC) mechanism according to the present invention. ECC mechanism 1200 may be combined with the embodiments of Fig. 3-Figure 11 to make the compression and decompression of configuration data more robust. Figure 12 depicts a microprocessor kernel 1220, which is disposed on a die and coupled to a physical level fuse array 1201. Physical level fuse array 1201 contains compressed configuration data blocks 1203, which are as described above. In addition to the compressed configuration data blocks 1203, physical level fuse array 1201 contains ECC code blocks 1202. Each ECC code block 1202 is associated with a corresponding data block 1203. In one possible embodiment, each data block 1203 has 64 bits (i.e., 64 fuses) and each ECC code block 1202 has 8 bits (i.e., 8 fuses). Kernel 1220 has a reset controller 1222, which receives a reset signal RESET. Reset controller 1222 has an ECC element 1224, which is coupled to a decompressor 1226 by bus CDATA. ECC element 1224 is coupled to fuse array 1201 by an address bus ADDR, a data bus DATA and a code bus CODE.
In operation, as described with reference to Fig. 3-Figure 11, fuse array 1201 is programmed with configuration data in data blocks 1203. Configuration data of a particular data type (e.g., microcode insertion data or microcode register data) need not be confined to one specific data block 1203 and may span multiple data blocks 1203. In addition, configuration data of two or more data types may be programmed into the same data block 1203. Array 1201 is also programmed with ECC codes in ECC code blocks 1202. The ECC code for each corresponding data block 1203 is generated according to a known ECC scheme, which is not intended to limit the present invention; for example, SECDED Hamming codes, Chipkill ECC or variations of forward error correction (FEC) codes may be used. In one possible embodiment, the addresses of the data blocks 1203 and of their corresponding ECC code blocks 1202 are known. Therefore, an ECC code block 1202 does not need to be adjacent to its corresponding data block 1203 as shown in Figure 12.
The structure and function of decompressor 1226 are substantially the same as those of decompressor 421 shown in Fig. 4 and described with reference to Fig. 5-11. When kernel 1220 is reset, and before the decompression functions described above are carried out, ECC element 1224 in reset controller 1222 accesses the fuse array to obtain its contents. The addresses of data blocks 1203 and ECC code blocks 1202 are supplied over bus ADDR. The configuration data in compressed data blocks 1203 is obtained over bus DATA, and the ECC code in each ECC code block 1202 is obtained over bus CODE. Having obtained the data, addresses and codes, ECC element 1224 generates an ECC check for the data retrieved from each data block 1203, using the same ECC scheme that was used to generate the ECC code stored in the corresponding ECC code block 1202. ECC element 1224 then compares the ECC check with the corresponding ECC code from array 1201 to produce an ECC syndrome. ECC element 1224 decodes the ECC syndrome to determine whether no error has occurred, a correctable error has occurred, or an uncorrectable error has occurred. ECC element 1224 also corrects correctable errors. Uncorrected and corrected data are provided to decompressor 1226 over bus CDATA so that the decompression described above can be carried out, and uncorrectable errors are likewise reported to decompressor 1226 over bus CDATA. If uncorrectable data is determined to be critical to the operation of kernel 1220, decompressor 1226 may cause kernel 1220 to shut down or may flag the error in some other way.
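The check, compare and decode sequence can be illustrated with a SECDED extended Hamming code over a 64-bit data block and an 8-bit ECC code, one of the schemes the text names as an option. The sketch below is an assumption-laden model, not the circuit of ECC element 1224: the bit layout (seven Hamming parity bits in code bits 0-6, overall parity in bit 7) and all function names are hypothetical.

```python
PARITY_POSITIONS = [1, 2, 4, 8, 16, 32, 64]   # Hamming parity positions
# The 64 data bits occupy the remaining codeword positions 3, 5, 6, 7, 9, ... 71.
DATA_POSITIONS = [p for p in range(1, 72) if p not in PARITY_POSITIONS]


def _ones(value):
    return bin(value).count("1")


def _hamming_check(data):
    """XOR together the codeword positions of all data bits that are 1."""
    check = 0
    for index, position in enumerate(DATA_POSITIONS):
        if (data >> index) & 1:
            check ^= position
    return check


def ecc_encode(data):
    """Return the 8-bit ECC code: bits 0-6 Hamming parity, bit 7 overall parity."""
    hamming = _hamming_check(data)
    overall = (_ones(data) + _ones(hamming)) & 1
    return hamming | (overall << 7)


def ecc_check(data, code):
    """Return (status, data), status being "ok", "corrected" or "uncorrectable"."""
    stored_hamming = code & 0x7F
    stored_overall = (code >> 7) & 1
    syndrome = _hamming_check(data) ^ stored_hamming   # recomputed check vs stored code
    parity_error = (_ones(data) + _ones(stored_hamming) + stored_overall) & 1
    if syndrome == 0 and not parity_error:
        return "ok", data
    if parity_error:
        if syndrome in DATA_POSITIONS:                 # single error in a data bit
            return "corrected", data ^ (1 << DATA_POSITIONS.index(syndrome))
        if syndrome == 0 or syndrome in PARITY_POSITIONS:
            return "corrected", data                   # single error in a check bit
        return "uncorrectable", data                   # inconsistent syndrome
    return "uncorrectable", data                       # even parity: double error
```

In this model a caller standing in for decompressor 1226 would pass the returned data on for decompression when the status is "ok" or "corrected", and would shut the kernel down or flag the error when the status is "uncorrectable" and the block is critical.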
In one possible embodiment, ECC element 1224 comprises at least one microcode program that carries out the ECC functions described above.
Portions of the present invention and the corresponding description are presented in terms of software, or of algorithms and symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means by which those skilled in the art effectively convey the substance of their work to others skilled in the art. An algorithm, as the term is used here, is a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulation of physical quantities. Generally, these quantities take the form of optical, electrical or magnetic signals that can be stored, transferred, combined, compared and otherwise manipulated. For convenience, these signals are at times referred to as bits, values, elements, symbols, characters, terms, numbers or the like.
It should be noted, however, that all of these and similar terms are associated with the appropriate physical quantities and are merely convenient labels applied to those quantities. Unless specifically stated otherwise, terms such as processing, computing, calculating, determining, displaying or the like refer to the actions and processes of a computer system, a microprocessor, a central processing unit or a similar electronic computing device, which manipulates data represented as physical quantities within the registers and memories of the computer system and transforms it into other data similarly represented as physical quantities within the memories or registers of the computer system or within other such information storage, transmission or display devices.
It should also be noted that the software-implemented aspects of the present invention are typically encoded on some form of program storage medium or implemented over some type of transmission medium. The program storage medium may be electronic (e.g., read-only memory, flash ROM, electrically erasable programmable ROM), random access memory, magnetic (e.g., a floppy disk or a hard disk) or optical (e.g., a compact disc read-only memory, CD-ROM), and may be read-only or random access. Similarly, the transmission medium may be metal wires, twisted pairs, coaxial cable, optical fiber or any other suitable known transmission medium. The present invention is not limited by these aspects of any given embodiment.
Although the present invention has been disclosed above by way of preferred embodiments, these embodiments are not intended to limit the present invention. Those skilled in the art may make minor changes and refinements without departing from the spirit and scope of the present invention; the scope of protection of the present invention is therefore defined by the appended claims.