CN1851639A - Method and apparatus for recoding instructions

Info

Publication number: CN1851639A
Application number: CN 200510067702
Authority: CN
Inventors: Soumya Banerjee (苏玛雅·班纳基); John L. Kelley (约翰·L·凯利); Ryan C. Kinter (瑞安·C·金特)
Original assignee: MIPS Technologies Inc
Current assignee: Arm Overseas Finance Co ltd; Overpass Bridge Co ltd
Priority date: 2005-04-22
Filing date: 2005-04-22
Publication date: 2006-10-25
Anticipated expiration: 2025-04-22
Also published as: CN100543669C

Abstract

A method and apparatus for recoding instructions of one or more instruction sets. An extend instruction and an extendable instruction are read from an instruction cache. A tag compare and way selection unit checks each instruction to verify that it is the required instruction. An instruction staging unit sends the extend instruction to a first recoder of a recoding unit and the extendable instruction to a second recoder. The first recoder generates at least one information bit from the extend instruction. The second recoder uses the at least one information bit generated by the first recoder to recode the extendable instruction, and places the recoded extendable instruction in an instruction buffer.

Description

Method and apparatus for recoding instructions

Technical field

The present invention relates generally to the field of computer architecture and, more particularly, to the recoding of instructions.

Background technology

It is well known that computer systems (for example, mainframes, personal computers, microprocessors and the like) can be designed to execute instructions from one or more instruction sets. In a computer system designed to execute instructions from more than one instruction set, a first instruction set may, for example, be optimized for fast execution on the target system. Instructions from this first set, however, may have a relatively wide format (for example, 32 or 64 bits wide) and therefore use a relatively large amount of storage space. A second instruction set may accordingly be optimized to use less storage space by adopting a narrower instruction format (for example, 8 or 16 bits wide). Such instructions may execute more slowly than instructions from the first instruction set, because more of them, and possibly different ones, are needed to implement the same function, but the narrower format helps reduce the total memory space required to run a program.

In addition, a third instruction set may be provided for backward compatibility with earlier-generation machines that may use an instruction width format of a different size (for example, older 16-bit machines). A fourth (or further) instruction set may provide forward compatibility with newly developed instruction sets that require yet another instruction width format (for example, 8-bit JAVA bytecode). These examples are, of course, not exhaustive.

For a single computer system to support different instruction sets as described above, the system must be able to accommodate instruction sets that may have different instruction width formats. One past approach to providing this capability is to map one instruction set onto another, which allows a single decoder to be used for the different instruction width formats. Such mapping is possible, for example, where one instruction set is a subset of another. Because most instruction sets are not related in this way, however, this is a highly limiting feature.

The problem becomes more complicated in computer systems that fetch multiple instructions for simultaneous processing. In such a system, mapping may be implemented by a series of operations carried out in one or more pipeline stages (of a pipelined processor). These operations include reading multiple instructions from a cache memory, processing those instructions by comparing the tag of each instruction, selecting the required instruction from the multiple instructions (based on the tag comparison), and then mapping the required instruction. With such a serial mapping method, however, processing these instructions leads to branch penalties and/or increased cycle time.

What is needed, therefore, is a more efficient way of processing instructions for execution by the processor of a computer system.

Summary of the invention

In one embodiment of the invention, a computer architecture for recoding is provided. In an embodiment, the architecture includes at least two interconnected recoders for recoding instructions. When recoding instructions, these recoders operate both independently and together. As described herein, the invention can be implemented in various architectures, systems, apparatuses, computer program code and methods.

In some embodiments of the invention, the architecture is responsible, for example, for fetching instructions from an instruction cache, recoding the instructions, and providing the instructions to other pipeline stages of a computer system. As described herein, in some embodiments of the architecture one or more instructions and cache tags are read from an instruction cache (for example, an on-chip memory block with multi-way associativity). The number of instructions and cache tags read from the instruction cache depends on the available bandwidth. After the instructions and cache tags are read, a tag compare and way selection unit checks each tag to verify that every required instruction is available (that is, present in the cache). An instruction staging unit stages the fetched instructions and sends them to an instruction recoding unit. Because multiple instructions can be read from the instruction cache in a single clock cycle, multiple instructions are staged and sent to the instruction recoding unit. The instruction recoding unit recodes the instructions received from the instruction staging unit to form recoded instructions that can subsequently be decoded and executed. According to embodiments of the invention, the instruction recoding unit includes at least two interconnected recoders for recoding instructions. The recoded instructions produced by the instruction recoding unit are stored in an instruction buffer. The instruction buffer isolates the instruction fetch pipeline stage operations of a computer system implementing the architecture from the other pipeline stage operations of the computer system. In some embodiments, an instruction bypass unit allows instructions to pass directly from the tag compare and way selection unit to the instruction buffer.

Further embodiments, features and advantages of the invention are described in detail below with reference to the accompanying drawings.

Description of drawings

The features and advantages of the invention will become more apparent from the detailed description set forth below when taken in conjunction with the accompanying drawings, in which like reference numbers indicate identical or functionally similar elements. In addition, the leftmost digit of a reference number identifies the drawing in which the reference number first appears.

Figure 1A is a block diagram illustrating an exemplary computer system.

Figure 1B is a block diagram illustrating a series of pipeline or pipeline stage operations performed by the computer system of Figure 1A.

Fig. 2 is a block diagram illustrating an exemplary implementation of the instruction fetch pipeline stage operation of Figure 1B.

Fig. 3 is a schematic diagram of an exemplary instruction fetch unit.

Figs. 4A-B are a flowchart illustrating a method for performing the instruction fetch pipeline stage operation of Figure 1B, which can be implemented by the instruction fetch unit of Fig. 3.

Figs. 5A-B are diagrams illustrating the recoding of an exemplary extend instruction and exemplary extendable instructions.

Figs. 6A-F are block diagrams further illustrating the recoding of exemplary instructions, such as the extend and extendable instructions of Figs. 5A and 5B.

Embodiment

Figure 1A is a block diagram of an exemplary computer system 100. Computer system 100 includes a pipelined processor 101, a memory 111 and a processor-memory bus 121. Processor 101 is connected to processor-memory bus 121 via a cache controller 103 and a cache memory 107. Memory 111 is connected to processor-memory bus 121 via a memory management unit (MMU) 113. A bus interface 133 connects an input/output (I/O) bus 131 to processor-memory bus 121. Three exemplary I/O controllers 135, 137 and 139 are shown connected to I/O bus 131.

Figure 1B is a block diagram representing a series of pipeline or pipeline stage operations performed by computer system 100. As shown in Figure 1B, the pipeline stage operations include an instruction fetch operation 102, an instruction decode and register fetch operation 104, an execute and address calculation operation 106, a memory access operation 108 and a write-back operation 110. The pipeline stage operations shown in Figure 1B are typical of those performed by, for example, a reduced instruction set computer (RISC) architecture. In a conventional RISC architecture, each pipeline stage operation is allotted a single, uniform clock cycle in which to complete. Because the stage operations are performed concurrently, the clock cycle is made long enough to accommodate the slowest stage. Thus, once the pipeline of computer system 100 is full (that is, every stage is processing one or more instructions), at least one instruction completes execution in each clock cycle. Alternative embodiments of system 100 may divide any single stage shown in Figure 1B into multiple stages. For example, instruction fetch stage 102 may be divided into three stages: an instruction cache access in the first stage, tag comparison and way selection in the second stage, and instruction recoding in the third stage. Such alternative embodiments represent design choices well known to those skilled in the art.
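The timing relationship just described (the slowest stage sets the clock, and a full pipeline completes one instruction per cycle) can be sketched as follows. This is a minimal illustration for an ideal, stall-free pipeline; the function name and the stage delays are assumptions chosen for the example and do not come from the patent.

```python
def pipeline_timing(stage_delays_ps, num_instructions):
    """Return (clock_period_ps, total_time_ps) for an ideal, stall-free pipeline."""
    if not stage_delays_ps or num_instructions < 1:
        raise ValueError("need at least one stage and one instruction")
    # The clock must be long enough for the slowest stage.
    clock_period = max(stage_delays_ps)
    # Filling the pipeline takes one cycle per stage; after that,
    # one instruction completes in every cycle.
    total_cycles = len(stage_delays_ps) + num_instructions - 1
    return clock_period, total_cycles * clock_period


# Hypothetical delays (in picoseconds) for the five stage operations 102-110.
delays = [1000, 800, 1200, 1000, 700]
print(pipeline_timing(delays, 100))
```

Splitting a slow stage into several shorter ones, as in the three-stage fetch alternative mentioned above, shortens the clock period at the cost of a deeper pipeline.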

A detailed description of the conventional features of computer system 100 and of its conventional pipeline stage operations, which are known to those skilled in the relevant computer art, can be found, for example, in "Computer Architecture: A Quantitative Approach", third edition (2003), by John L. Hennessy and David A. Patterson, available from Morgan Kaufmann Publishers of San Francisco, California, which is incorporated herein by reference in its entirety. These conventional features are therefore not described further here. The following description concentrates on the novel and unconventional features of computer system 100 and its pipeline stage operations, which are not yet known to those skilled in the relevant computer art.

Fig. 2 is a block diagram of an exemplary implementation of instruction fetch pipeline stage operation 102. As shown in Fig. 2, instruction fetch pipeline stage operation 102 can be implemented with an instruction cache 202, a tag compare and way selection unit 204, an instruction staging unit 206, an instruction recoding unit 208 and an instruction bypass unit 210. The instruction fetch pipeline stage operation 102 shown in Fig. 2 is responsible for fetching instructions and providing them to the other pipeline stages of computer system 100. It is also responsible for handling the results of all control transfer instructions (such as branch and jump instructions).

In one embodiment, the instruction fetch pipeline stage of computer system 100 operates as follows. First, one or more instructions and cache tags are read from instruction cache 202. Instruction cache 202 is part of cache memory 107 and is preferably an on-chip memory block with multi-way associativity. The number of instructions and cache tags read from instruction cache 202 depends on the available bandwidth. In one embodiment, for example, 64 instruction data bits and one cache tag are read from instruction cache 202 in one clock cycle of computer system 100. This is equal to eight 8-bit instructions, four 16-bit instructions, two 32-bit instructions, or one 64-bit instruction. A larger bandwidth would allow additional instructions and cache tags to be read in one clock cycle of computer system 100.
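The instruction counts quoted for a 64-bit read follow from a simple division, sketched below. The constant and function names are illustrative assumptions.

```python
FETCH_BITS = 64  # instruction data bits read per cycle in the embodiment above


def instructions_per_fetch(width_bits, fetch_bits=FETCH_BITS):
    """Number of whole instructions of the given width contained in one read."""
    if width_bits <= 0 or fetch_bits % width_bits != 0:
        raise ValueError("width must evenly divide the fetch bandwidth")
    return fetch_bits // width_bits


for width in (8, 16, 32, 64):
    print(f"{width:2d}-bit instructions per read: {instructions_per_fetch(width)}")
```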

After the instructions and cache tag(s) are read, tag compare and way selection unit 204 checks the tag(s) to verify, for example, that each instruction read is the correct (that is, the required) instruction. Other tag checks that can be performed include, for example, a lock check and a parity check.
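A tag comparison of this kind can be sketched, under simplifying assumptions, as a search across the ways of one cache set. The `Way` record, its fields and the example tag values are hypothetical; the lock and parity checks mentioned above are not modelled.

```python
from typing import NamedTuple, Optional, Sequence


class Way(NamedTuple):
    valid: bool
    tag: int
    data: int  # the instruction data bits held by this way


def select_way(ways: Sequence[Way], wanted_tag: int) -> Optional[int]:
    """Return the data of the way whose tag matches, or None on a miss."""
    for way in ways:
        if way.valid and way.tag == wanted_tag:
            return way.data
    return None


ways = [Way(True, 0x12, 0xAAAA), Way(True, 0x34, 0xBBBB), Way(False, 0x56, 0xCCCC)]
print(select_way(ways, 0x34))  # hit in the second way
print(select_way(ways, 0x56))  # tag matches but the way is invalid: miss
```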

Instruction staging unit 206 stages the instructions and sends them to instruction recoding unit 208. As noted above, in one embodiment multiple instructions can be read from instruction cache 202 in a single clock cycle of computer system 100. When this happens, and the number of instructions fetched exceeds the parallel processing capability of instruction recoding unit 208, the instructions must be staged before being sent to instruction recoding unit 208.

Instruction recoding unit 208 recodes the required instructions received from instruction staging unit 206. The recoding operation of unit 208 maps an instruction from one encoded state (for example, a 16-bit instruction) to another encoded state (for example, a 32-bit instruction). This differs from the decoding performed in pipeline stage operation 104 (instruction decode and register fetch), where an encoded instruction is decoded into one or more individual control signals that direct selected operations within computer system 100. Instruction recoding unit 208 includes at least two interconnected, parallel-processing recoders for recoding instructions fetched from instruction cache 202. In one embodiment, instruction recoding unit 208 can recode instructions belonging to multiple instruction set architectures and instructions having different bit widths. How this is accomplished is explained further below with reference to Figs. 3-6. As shown in Fig. 3, the recoded instructions produced by instruction recoding unit 208 are stored in an instruction buffer 316. Instruction buffer 316 isolates the instruction fetch pipeline stage operation of computer system 100 from the operation of the other pipeline stages of computer system 100.

Instruction bypass unit 210 allows instructions to be sent directly from tag compare and way selection unit 204 to instruction buffer 316. In one embodiment, instruction bypass unit 210 is a data communication path. In another embodiment, instruction bypass unit 210 may include devices for partial or early decoding of instructions. Instruction bypass unit 210 can be used to place instructions that do not need recoding into instruction buffer 316 quickly, or to send them to the instruction decode pipeline stage of computer system 100. In some embodiments, processor 101 is configured to decode and execute 32-bit instructions. In one such embodiment, a 32-bit instruction fetched from instruction cache 202 can be sent directly to instruction buffer 316 without recoding. A 16-bit instruction fetched from instruction cache 202, on the other hand, needs recoding; any 16-bit instruction fetched from instruction cache 202 is therefore processed by instruction recoding unit 208, and the recoded instruction it produces is placed in instruction buffer 316 for subsequent decoding and execution by processor 101. Based on the description provided here, other uses of instruction bypass unit 210 will be apparent to those skilled in the relevant computer art.
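The choice between the bypass path and the recoding path in the 16-bit/32-bit example can be sketched as below. The actual 16-to-32-bit mapping is not specified at this point in the description, so `recode_16_to_32` is only a placeholder that tags its input; all names and the example bit patterns are assumptions.

```python
from collections import deque


def recode_16_to_32(instr16):
    """Placeholder for recoding unit 208; the real mapping depends on the instruction sets."""
    return ("recoded", instr16)


def deliver(instr, width_bits, instruction_buffer):
    """Place an instruction in the buffer via the bypass path or the recoding path."""
    if width_bits == 32:
        # 32-bit instructions need no recoding and take the bypass path.
        instruction_buffer.append(("bypassed", instr))
    elif width_bits == 16:
        # 16-bit instructions are recoded before being buffered.
        instruction_buffer.append(recode_16_to_32(instr))
    else:
        raise ValueError("this sketch handles only 16-bit and 32-bit instructions")


buffer_316 = deque()
deliver(0x8FBF0010, 32, buffer_316)  # arbitrary 32-bit pattern
deliver(0x6500, 16, buffer_316)      # arbitrary 16-bit pattern
print(list(buffer_316))
```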

Fig. 3 is a schematic diagram of an exemplary instruction fetch unit 300 that implements instruction fetch pipeline stage operation 102. Instruction fetch unit 300 includes instruction cache 202, multiplexers 302, 304, 314a and 314b, instruction bypass paths 210a and 210b, data flip-flops 306a, 306b, 306c and 306d, recoders 310a and 310b, an information storage buffer 312 and instruction buffer 316.

Instruction cache 202 is connected to multiplexer 302. In one embodiment, this connection provides a bandwidth of 64 data bits plus the corresponding tag (that is, each 64 bits of data are associated with one tag). This bandwidth allows, for example, four 16-bit instructions or two 32-bit instructions to be read from instruction cache 202 in each read cycle. In one embodiment, instructions and a tag are read from instruction cache 202 every other clock cycle of computer system 100, unless instruction buffer 316 is full. If instruction buffer 316 is full, fetching of further instructions from instruction cache 202 can be suspended until instruction buffer 316 is again able to accept data.

Multiplexer 302 is used to implement the features of tag compare and way selection unit 204 described here. The output of multiplexer 302 is 64 data bits. These bits can be provided to instruction buffer 316 via instruction bypass paths 210a and 210b, or to multiplexer 304 via data flip-flops 306a-d for instruction staging.

Multiplexer 304 and data flip-flops 306a-d are used to implement the instruction staging features described above for instruction staging unit 206. Multiplexer 304 is connected to at least two recoders 310. In one embodiment, the data associated with data flip-flops 306a and 306b are passed through multiplexer 304 and sent to recoders 310a and 310b, respectively, in one clock cycle of computer system 100 (that is, the data associated with data flip-flop 306a are sent to recoder 310a, and the data associated with data flip-flop 306b are sent to recoder 310b). In the next clock cycle of computer system 100, the data associated with data flip-flops 306c and 306d are passed through multiplexer 304 and sent to recoders 310a and 310b, respectively (that is, the data associated with data flip-flop 306c are sent to recoder 310a, and the data associated with data flip-flop 306d are sent to recoder 310b). In embodiments where a read of the instruction cache fetches more instructions than the available recoders 310 can process in parallel, this processing allows the instructions to be staged appropriately. As noted above, in embodiments such as the one depicted in Fig. 3, computer system 100 reads instruction cache 202 every other clock cycle while the fetched instructions are being recoded.
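The staging of four 16-bit instructions from one 64-bit read onto two recoders over two clock cycles can be sketched as follows. The slot ordering (slot 0 in the least significant bits) and the function names are assumptions; the description does not state how the 64 bits map onto flip-flops 306a-d.

```python
def split_fetch(word64):
    """Split a 64-bit read into four 16-bit slots (one per flip-flop 306a-d).

    Assumption: slot 0 occupies the least significant 16 bits.
    """
    return [(word64 >> (16 * i)) & 0xFFFF for i in range(4)]


def stage(slots, num_recoders=2):
    """Yield, for each clock cycle, the instructions dispatched to the recoders."""
    for start in range(0, len(slots), num_recoders):
        yield slots[start:start + num_recoders]


for cycle, group in enumerate(stage(split_fetch(0x4444333322221111))):
    print(cycle, [hex(instr) for instr in group])
```

With two recoders, the first pair of slots is dispatched in one cycle and the second pair in the next, which is why the cache only needs to be read every other cycle in this configuration.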

As shown in Fig. 3, recoders 310a and 310b are interconnected and are connected to multiplexers 304, 314a and 314b. Recoders 310a and 310b can be configured and operated so as to recode any given instruction or instruction set into any particular desired instruction. For example, if computer system 100 is to execute two different instruction set architectures, one having X-bit-wide instructions and the other having Y-bit-wide instructions, with Y greater than X, then recoders 310a and 310b can be configured, for example, to recode the X-bit-wide instructions to form Y-bit-wide recoded instructions, or to recode both the X-bit-wide and the Y-bit-wide instructions to form Z-bit-wide instructions. Recoders 310a and 310b can also be configured, for example, to recode instructions belonging to one instruction set into instructions of another instruction set, thereby helping to achieve portability, backward compatibility and/or forward compatibility of program code. As those of ordinary skill in the relevant art will appreciate from the description given here, the possible configurations of recoders 310a and 310b, and their ability to recode various instructions, are potentially unlimited.

Each recoder 310 is connected to multiplexer 304 so that the instructions sent by multiplexer 304 to recoders 310a and 310b can be recoded in parallel. Parallel recoding, combined with the storage of recoded instructions in instruction buffer 316, decouples the instruction fetch pipeline stage operation of computer system 100 from the other pipeline stage operations of computer system 100 and allows instruction fetch unit 300 to get ahead of, for example, the instruction decode and execute operations. By getting ahead, instruction fetch unit 300 shields the other pipeline stage operations of computer system 100 from instruction fetch penalties (such as cache misses) and improves the overall operating performance of computer system 100.

As shown in Fig. 3, the output of recoder 310a is connected to an input of recoder 310b, and the output of recoder 310b is connected to an input of recoder 310a via information storage buffer 312. This lets recoders 310a and 310b work together to recode extend and extendable instructions (interrelated instructions that are shown in Figs. 5A and 5B and described in detail below). Having recoders 310a and 310b jointly recode extend and extendable instructions avoids the recoding delay that would otherwise be unavoidable when instructions of these types are recoded, and avoids the resulting insertion of instruction gaps or bubbles into instruction buffer 316.

Multiplexers 314a and 314b select which data bits are provided to instruction buffer 316. Each of multiplexers 314a and 314b is connected to the output of a recoder 310 and to an instruction bypass path 210. In one embodiment, the operating mode of computer system 100, represented by one or more mode bits, controls multiplexers 314a and 314b and thereby selects when recoders 310 are bypassed.

Instruction buffer 316 is a conventional first-in, first-out (FIFO) buffer. As noted above, buffer 316 helps decouple the instruction fetch pipeline stage operation of computer system 100 from the other pipeline stage operations of computer system 100 and allows instruction fetch unit 300 to get ahead of, for example, the instruction decode and execute operations. In one embodiment, cache reads are suspended when instruction buffer 316 is full.
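The decoupling role of the FIFO buffer, including the suspension of cache reads when the buffer cannot accept more data, can be sketched as follows. The class, its capacity and the `fetch_cycle` helper are hypothetical; the description does not give a buffer depth.

```python
from collections import deque


class InstructionBuffer:
    """A bounded FIFO standing in for instruction buffer 316."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._fifo = deque()

    def can_accept(self, count):
        return len(self._fifo) + count <= self.capacity

    def push(self, instructions):
        self._fifo.extend(instructions)

    def pop(self):
        """Hand the oldest instruction to the decode stage."""
        return self._fifo.popleft()


def fetch_cycle(buffer, fetched):
    """Return True if the fetch went ahead, False if it was suspended."""
    if not buffer.can_accept(len(fetched)):
        return False  # buffer full: suspend the cache read
    buffer.push(fetched)
    return True


buf = InstructionBuffer(capacity=4)
print(fetch_cycle(buf, ["i0", "i1", "i2"]))  # accepted
print(fetch_cycle(buf, ["i3", "i4"]))        # suspended, only one slot free
print(buf.pop())                             # decode consumes the oldest entry
print(fetch_cycle(buf, ["i3", "i4"]))        # accepted now that space is free
```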

In the embodiment shown in Fig. 3, instruction recoding unit 208 includes two recoders, namely 310a and 310b. Alternative embodiments of the invention, however, may include more than two recoders (operating in parallel, in series, or both). As those skilled in the art will appreciate, the structure and operation of such alternative embodiments are similar to, and a logical extension of, the two-recoder embodiment described here. In addition, the embodiment of Fig. 3 illustrates a recoding operation that receives 16-bit instructions and produces 32-bit instructions. In alternative embodiments, the recoding operation may receive and produce instructions of sizes different from those shown here. The operation may also accommodate instructions of several sizes at once. For example, 16-bit and 32-bit instructions may both be recoded into instructions of a different size (for example, 35 bits) to accommodate the unique characteristics of each instruction set.

Figs. 4A and 4B show a flowchart of a method 400 for performing instruction fetch pipeline stage operation 102. Method 400 can be implemented, for example, by instruction fetch unit 300.

Method 400 begins at step 402. In step 402, multiple instructions are fetched (read) from the instruction cache. Preferably, the number of instructions fetched in step 402 is equal to or greater than the number of recoders available to recode the fetched instructions.

In step 404, the instructions fetched in step 402 are sent to the recoders available for recoding instructions.

In step 406, it is determined whether an instruction to be recoded is a required instruction. If it is, control passes to step 408. If it is not, control passes to step 420.

In step 408, it is noted that steps 410 through 418 of method 400 are performed for each recoder available to recode the instructions fetched from the instruction cache in step 402.

In step 410, each available recoder determines whether the instruction to be recoded is an extend instruction. This determination can be made, for example, by examining the opcode of the instruction. If the instruction to be recoded is an extend instruction, control passes to step 412. If it is not, control passes to step 416.

Figs. 5A and 5B provide an example of an extend instruction. As used here, an extend instruction is an instruction having bits that are added to or concatenated with the bits of a second, extendable instruction, thereby extending the immediate value held in the immediate field of the extendable instruction. Figs. 5A and 5B also provide examples of extendable instructions. During recoding, an extend instruction and its associated extendable instruction must be paired; otherwise, the extended immediate value formed by combining the bits of the extend instruction with the bits of the immediate field of the extendable instruction will produce an incorrect recoded instruction. Those skilled in the relevant computer art will recognize that extend and extendable instructions, as used here, are similar to MIPS16e™ instructions having similar functionality (for example, the so-called "EXTEND" instruction). Additional information about the MIPS16e™ architecture can be found in the following publication, available from MIPS Technologies, Inc.: "MIPS32™ Architecture for Programmers, Volume IV-a: The MIPS16e™ Application-Specific Extension to the MIPS32® Architecture", Revision 2.00, MIPS Technologies, Inc. (2003), which is incorporated herein by reference in its entirety. The extend and extendable instructions described here, however, are not limited to the functionality of MIPS16e™ instructions.

In step 412, a recoder obtains information about the extend instruction that is needed to recode the associated extendable instruction. At a minimum, this information includes the one or more bits of the extend instruction that are to be added to or concatenated with one or more bits of the associated extendable instruction during recoding. According to the invention, the minimum amount of information actually required to recode a given pair of extend and extendable instructions depends on the configuration of the recoders used to recode them.

In step 414, the information obtained in step 412 is sent to the recoder that needs it in order to recode the associated extendable instruction. In one embodiment, this information is sent together with other information (such as the fact that an extend instruction has been detected).

In step 416, it is determined whether the instruction to be recoded is an extendable instruction. In one embodiment, this determination is made, for example, by examining the opcode of the instruction and/or from information sent by another recoder. If the instruction to be recoded is an extendable instruction, control passes to step 418. If it is not, control passes to step 419.

In step 418, the extendable instruction is recoded using the information sent by another recoder (for example, in step 414). As noted here, the recoding process used depends on the configuration and operation of the particular recoder used to recode the extendable instruction. Figs. 5A and 5B illustrate the recoding process for extend and extendable instructions.

In step 419, a normal instruction (for example, one that is neither an extend nor an extendable instruction) is recoded without any need for information sent by another recoder. Again, as noted here, the recoding process used in step 419 depends on the configuration and operation of the particular recoder used to recode the normal instruction.

In step 420, it is determined whether further instructions fetched in step 402 need to be recoded. If so, control passes to step 404. If no further instructions fetched in step 402 need to be recoded, control passes to step 422.

In step 422, it is determined whether further instructions are to be fetched from the instruction cache. If there are, control passes to step 402; otherwise, control passes to step 424.

In step 424, method 400 ends.
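The control flow of method 400 can be sketched in software as a single loop, with the step numbers noted in comments. This is a sequential model of what the recoders do in parallel, under stated assumptions: the `Instr` record, the `EXTEND` mnemonic and the tuple used to represent a recoded instruction are all hypothetical, and the sketch treats any instruction that follows an extend instruction as its extendable partner.

```python
from dataclasses import dataclass

EXTEND = "EXTEND"  # hypothetical opcode marking an extend instruction


@dataclass
class Instr:
    opcode: str          # hypothetical mnemonic
    bits: int = 0        # immediate bits, or extension bits for an extend instruction
    wanted: bool = True  # outcome of the required-instruction check (step 406)


def method_400(fetch_groups):
    """Return recoded instructions as (opcode, extension_bits, immediate_bits) tuples."""
    recoded = []
    pending = None                        # information passed between recoders
    for group in fetch_groups:            # steps 402 and 422: one group per cache read
        for instr in group:               # steps 404 and 420
            if not instr.wanted:          # step 406
                continue
            if instr.opcode == EXTEND:    # step 410
                pending = instr.bits      # steps 412 and 414
            elif pending is not None:     # step 416
                recoded.append((instr.opcode, pending, instr.bits))  # step 418
                pending = None
            else:
                recoded.append((instr.opcode, None, instr.bits))     # step 419
    return recoded                        # step 424


group = [Instr(EXTEND, 0x5), Instr("ADDI", 0x3), Instr("NOP"), Instr("LW", 1, wanted=False)]
print(method_400([group]))
```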

Fig. 5 A illustrates but exemplary extended instruction 500 and exemplary extended instruction 510 is encoded to form the processing procedure of coded order again 520 again.

Extended instruction 500 comprises opcode field 502 and extended field 504.Shown in Fig. 5 A, instruction 500 has X position (B _x) width.

But extended instruction 510 comprises opcode field 512 and immediate field 514.But

field

512 and 514 be not extended instruction 510 field only arranged.But extended instruction 510 can be any instruction (for example, jump instruction, branch instruction, memory read instruction fetch, storer write instruction or the like) with immediate field.Shown in Fig. 5 A, instruction 510 also has X position (B _x) width.

By add or and put extended field 504 and immediate field 514 everybody form expansion immediate field in the instruction 520, form coded order again 520.Opcode field 522 booting computer systems 100 of coded order 520 are carried out by opcode field 512 indicated (various) of instruction 510 and are operated again.Shown in Fig. 5 A, in one embodiment, coded order 520 has Y position (B again _y) width.
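The concatenation of an extension field and an immediate field can be sketched at the bit level as follows. All field widths and positions here are illustrative assumptions (X = 16, Y = 32, a 5-bit opcode in the top bits, an 11-bit extension field and a 5-bit immediate field); they are not the layout of Fig. 5A or of any real instruction set. Reusing the numeric opcode of the extendable instruction in the recoded instruction is likewise an assumption.

```python
OPCODE_BITS = 5  # assumed width of opcode fields 502 and 512
EXT_BITS = 11    # assumed width of extension field 504
IMM_BITS = 5     # assumed width of immediate field 514


def recode_pair(extend16, extendable16):
    """Form a 32-bit recoded instruction from a 16-bit extend/extendable pair."""
    ext = extend16 & ((1 << EXT_BITS) - 1)        # extension field 504
    opcode = extendable16 >> (16 - OPCODE_BITS)   # opcode field 512
    imm = extendable16 & ((1 << IMM_BITS) - 1)    # immediate field 514
    # Concatenate: extension bits become the upper part of the immediate.
    extended_imm = (ext << IMM_BITS) | imm        # 16-bit extended immediate
    # Assumed recoded layout: opcode in the top 5 bits, immediate in the low 16.
    return (opcode << (32 - OPCODE_BITS)) | extended_imm


extend = (0b11110 << 11) | 0x2A5        # hypothetical extend instruction
extendable = (0b01001 << 11) | 0b10011  # hypothetical extendable instruction
print(hex(recode_pair(extend, extendable)))
```

Bits of the extendable instruction that lie between the opcode and the immediate are ignored in this sketch; a real recoder would also carry over any other fields the instruction has.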

Fig. 5 B illustrates but the second exemplary extended instruction 511 is encoded to form the processing of coded order again 530 again.

But extended instruction 511 comprises opcode field 513 and immediate field 515.In this case, but field 513 and 515 be extended instruction 511 field only arranged.But extended instruction 511 is to have the instruction of the function that is similar to MIPS 16e (trade mark) redirect and the representative of link (JAL) instruction or redirect and link and switching manipulation mode (JALX) instruction.

By add or and everybody the expansion immediate field that forms in the instruction 530 of putting extended field 504 and immediate field 515 form coded order again 530.Opcode field 532 booting computer systems 100 of coded order 530 are carried out (various) operation by opcode field 513 indications of instruction 511 again.Shown in Fig. 5 B, in one embodiment, coded order 530 also has Y position (B again _y) width.

Figs. 6A-6F are block diagrams that further illustrate the operation of a two-recoder embodiment of computer system 100, that is, of fetch unit 300. The embodiment is described by way of the recoding of instructions such as the extended and extensible instructions of Figs. 5A and 5B.

Fig. 6A illustrates an exemplary recoding operation in which four normal (for example, non-extended and non-extensible) instructions I₀, I₁, I₂ and I₃ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is sent to recoder 310a, and instruction I₁ is sent to recoder 310b. Because both instructions are normal instructions, recoders 310a and 310b can operate independently and recode instructions I₀ and I₁ in one clock cycle of computer system 100. In the next clock cycle of computer system 100, instruction I₂ is sent to recoder 310a and instruction I₃ is sent to recoder 310b. Again, because both instructions are normal instructions, recoders 310a and 310b can operate independently and recode instructions I₂ and I₃ in a single clock cycle of computer system 100. Thus, by the end of two clock cycles of computer system 100, all four instructions I₀, I₁, I₂ and I₃ have been recoded by the two recoders 310a and 310b.

Fig. 6B illustrates an exemplary recoding operation in which an extended instruction I₀, an extensible instruction I₁ and two normal (for example, non-extended and non-extensible) instructions I₂ and I₃ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is sent to recoder 310a. Instruction I₁ is sent to recoder 310b. Because instruction I₀ is an extended instruction, recoder 310a obtains the information required to recode extensible instruction I₁ and passes this information to recoder 310b. Recoder 310b then uses the information passed from recoder 310a to recode extensible instruction I₁. As shown in Fig. 6B, recoders 310a and 310b operate together to recode instructions I₀ and I₁, forming a single recoded instruction in a single clock cycle of computer system 100. In the next clock cycle of computer system 100, instruction I₂ is sent to recoder 310a and instruction I₃ is sent to recoder 310b. This time, because both instructions are normal instructions, recoders 310a and 310b can operate independently and recode instructions I₂ and I₃ in one clock cycle of computer system 100. By the end of two clock cycles of computer system 100, all four instructions I₀, I₁, I₂ and I₃ have been recoded by the two recoders 310a and 310b to form three recoded instructions.

Fig. 6C illustrates an exemplary recoding operation in which an extended instruction I₁, an extensible instruction I₂ and two normal (for example, non-extended and non-extensible) instructions I₀ and I₃ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is sent to recoder 310a. Instruction I₁ is sent to recoder 310b. Because instruction I₀ is a normal instruction, recoder 310a can recode it without any input from the other recoder. Because instruction I₁ is an extended instruction, recoder 310b obtains the information required to recode extensible instruction I₂ and passes this information to recoder 310a via information buffer 312. Buffer 312 stores the information required to recode instruction I₂ until instruction I₂ can be sent to recoder 310a. Then, in a subsequent clock cycle (clock cycle 2) of computer system 100, recoder 310a uses the information passed from recoder 310b to recode extensible instruction I₂. Because instruction I₃ is a normal instruction, recoder 310b can recode it without any input from the other recoder. Again, by the end of two clock cycles of computer system 100, all four instructions I₀, I₁, I₂ and I₃ have been recoded by the two recoders 310a and 310b to form three recoded instructions.
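The difference between the direct hand-off of Fig. 6B and the buffered hand-off of Fig. 6C can be sketched as below. This is only an illustrative model under stated assumptions: the function name `handoff_paths` and the string labels are hypothetical, and the rule that an instruction at an even position in the fetched group goes to recoder 310a while one at an odd position goes to recoder 310b is inferred from the two examples rather than stated as a general rule in the description.

```python
from typing import List


def handoff_paths(kinds: List[str]) -> List[str]:
    """For each extended instruction, report how its bits reach the other recoder.

    Assumption: two instructions are dispatched per clock cycle, with even
    positions going to recoder 310a and odd positions to recoder 310b.
    """
    paths = []
    for position, kind in enumerate(kinds):
        if kind != "extended":
            continue
        if position % 2 == 0:
            # The extended instruction is in 310a, so its extensible partner
            # is in 310b during the same cycle: the bits pass directly.
            paths.append("direct")
        else:
            # The extended instruction is in 310b, so its partner reaches 310a
            # only in a later cycle: the bits wait in the information buffer.
            paths.append("buffered")
    return paths


fig_6b = handoff_paths(["extended", "extensible", "normal", "normal"])
fig_6c = handoff_paths(["normal", "extended", "extensible", "normal"])
```

In this model `fig_6b` is `["direct"]` and `fig_6c` is `["buffered"]`, mirroring the role that buffer 312 plays when the extended and extensible instructions fall into different clock cycles.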

Fig. 6D illustrates an exemplary recoding operation in which an extended instruction I₂, an extensible instruction I₃ and two normal (for example, non-extended and non-extensible) instructions I₀ and I₁ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is sent to recoder 310a. Instruction I₁ is sent to recoder 310b. Because both instructions are normal instructions, recoders 310a and 310b can operate independently and recode instructions I₀ and I₁ in a single clock cycle of computer system 100. Because instruction I₂ is an extended instruction, recoder 310a obtains the information required to recode extensible instruction I₃ and passes this information to recoder 310b. Recoder 310b then uses the information passed from recoder 310a to recode extensible instruction I₃. As shown in Fig. 6D, recoders 310a and 310b operate together to recode instructions I₂ and I₃, forming a single recoded instruction in one clock cycle of computer system 100. By the end of two clock cycles of computer system 100, all four instructions I₀, I₁, I₂ and I₃ have been recoded by the two recoders 310a and 310b to form three recoded instructions.

Fig. 6E illustrates an exemplary recoding operation in which an extended instruction I₃ and three normal (for example, non-extended and non-extensible) instructions I₀, I₁ and I₂ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is sent to recoder 310a. Instruction I₁ is sent to recoder 310b. Because both instructions are normal instructions, recoders 310a and 310b can operate independently and recode instructions I₀ and I₁ in a single clock cycle of computer system 100. Instruction I₂ is also sent to recoder 310a. Because instruction I₂ is a normal instruction, recoder 310a can recode it without input from the other recoder. Instruction I₃ is sent to recoder 310b. Because instruction I₃ is an extended instruction, recoder 310b obtains the information required to recode extensible instruction I₄ and passes this information to recoder 310a via information buffer 312. Recoder 310a then uses the information passed from recoder 310b to recode extensible instruction I₄.

Fig. 6F illustrates an exemplary recoding operation in which an incorrect (that is, unwanted) instruction I₀, an extended instruction I₁, an extensible instruction I₂ and a normal instruction I₃ are fetched from the instruction cache in clock cycle 0 of computer system 100. Instruction I₀ is not recoded because it is an incorrect instruction. Instruction I₁ is sent to recoder 310b. Because instruction I₁ is an extended instruction, recoder 310b obtains the information required to recode extensible instruction I₂ and passes this information to recoder 310a via information buffer 312. Recoder 310a then uses the information passed from recoder 310b to recode extensible instruction I₂. Because instruction I₃ is a normal instruction, recoder 310b can operate independently and recode instruction I₃ without any input from the other recoder. As shown in Fig. 6F, by the end of two clock cycles of computer system 100, the three instructions I₁, I₂ and I₃ have been recoded by the two recoders 310a and 310b to form two recoded instructions.
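The instruction counts given for Figs. 6A-6F can be checked with a small model such as the one below. It is a sketch under stated assumptions, not the logic of fetch unit 300: the function name `recode_stream` and the string labels are hypothetical, the model ignores which recoder and which clock cycle handle each instruction, and it assumes that an extended instruction is always followed by its extensible instruction with nothing in between.

```python
from typing import List, Tuple


def recode_stream(kinds: List[str]) -> Tuple[List[str], bool]:
    """Return the recoded instructions and whether extend bits are still pending.

    An extended instruction produces no recoded instruction of its own; its
    bits are merged into the extensible instruction that follows it.
    Incorrect (unwanted) instructions are dropped without being recoded.
    """
    recoded: List[str] = []
    pending_extend = False
    for kind in kinds:
        if kind == "incorrect":
            continue  # not recoded at all
        if kind == "extended":
            pending_extend = True  # bits held for the next instruction
        elif kind == "extensible" and pending_extend:
            recoded.append("extended+extensible")
            pending_extend = False
        else:
            recoded.append(kind)
    return recoded, pending_extend


fig_6a, _ = recode_stream(["normal"] * 4)
fig_6d, _ = recode_stream(["normal", "normal", "extended", "extensible"])
fig_6e, fig_6e_pending = recode_stream(["normal", "normal", "normal", "extended"])
fig_6f, _ = recode_stream(["incorrect", "extended", "extensible", "normal"])
```

Here `fig_6a` holds four recoded instructions, `fig_6d` three and `fig_6f` two, matching the totals stated for those figures. For Fig. 6E the pending flag stays set because the matching extensible instruction I₄ arrives with a later fetch.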

As already noted, alternative embodiments of the present invention can have more than two recoders. These embodiments operate in a manner similar to the two-recoder embodiment described above. Given the description of the invention provided here, how to implement these embodiments will be apparent to persons skilled in the relevant computer arts.

Conclusion

Various embodiments of the present invention have been described above. It should be understood that they have been presented by way of example only, and not as a limitation. It will be apparent to persons skilled in the relevant computer arts that various changes in form and detail can be made without departing from the spirit and scope of the invention.

For example, in addition to implementing the recoding system in hardware (for example, within or coupled to a central processing unit ("CPU"), microprocessor, microcontroller, digital signal processor, processor core, system on chip ("SOC"), or any other programmable device), it can also be implemented in software (for example, computer-readable code, program code, instructions and/or data arranged in any form, such as source, object or machine language) disposed, for example, in a computer-usable (for example, readable) medium configured to store the software. Such software enables the function, fabrication, modeling, simulation, description and/or testing of the apparatus and methods described here. For example, this can be accomplished through the use of general programming languages (for example, C, C++), GDSII databases, hardware description languages (HDL) including Verilog HDL, VHDL, AHDL (Altera HDL) and so on, or other available programs, databases and/or circuit (that is, schematic) capture tools. Such software can be disposed in any known computer-usable medium, including semiconductor, magnetic disk and optical disc (for example, CD-ROM, DVD-ROM and so on), and as a computer data signal embodied in a computer-usable (for example, readable) transmission medium (for example, a carrier wave or any other medium, including digital, optical or analog-based media). As such, the software can be transmitted over communication networks including the Internet and intranets.

It should be understood that the apparatus and methods described here can be included in a semiconductor intellectual property core, such as a microcontroller core (for example, embodied in a hardware description language), and transformed into hardware in the production of integrated circuits. In addition, the apparatus and methods described here may be implemented as a combination of hardware and software. Therefore, the present invention should not be limited by any of the exemplary embodiments described above, but should be defined only in accordance with the following claims and their equivalents.

Claims

1. A fetch unit for a processor, comprising:

a first recoder; and

a second recoder coupled to the first recoder,

wherein the first recoder passes information about a first instruction to the second recoder, and the second recoder recodes a second instruction based on the information passed by the first recoder.

2. The fetch unit according to claim 1, further comprising:

an instruction staging unit, coupled to the first recoder and the second recoder, for sending an instruction from an instruction cache to one of the first recoder and the second recoder.

3. The fetch unit according to claim 1, wherein the processor executes instructions that have X bits and belong to a first instruction set, and instructions that have Y bits and belong to a second instruction set, Y being greater than X, and wherein the first recoder and the second recoder recode instructions belonging to one of the first instruction set and the second instruction set to form recoded instructions having at least Y bits.

4. The fetch unit according to claim 3, wherein each instruction of the first instruction set has 16 bits and each instruction of the second instruction set has 32 bits.

5. The fetch unit according to claim 3, wherein the first instruction set includes an extended instruction used to enlarge the immediate field of an extensible instruction of the first instruction set, and wherein the first recoder passes at least one bit of the extended instruction to the second recoder, thereby allowing the second recoder to recode the extensible instruction.

6. The fetch unit according to claim 5, wherein the at least one bit of the extended instruction is concatenated to at least one bit of the extensible instruction.

7. The fetch unit according to claim 3, wherein the first instruction set includes a mode-switching instruction used to switch the operating mode of the processor, and wherein the first recoder passes one or more bits to the second recoder, thereby allowing the second recoder to recode the mode-switching instruction.

8. The fetch unit according to claim 7, wherein the one or more bits are concatenated to at least one bit of the mode-switching instruction.

9. A processor, comprising:

a first recoder; and

a second recoder coupled to the first recoder,

wherein the first recoder passes information about a first instruction to the second recoder, and the second recoder recodes a second instruction based on the information passed by the first recoder.

10. The processor according to claim 9, further comprising:

an instruction staging unit, coupled to the first recoder and the second recoder, that sends an instruction from an instruction cache to one of the first recoder and the second recoder.

11. The processor according to claim 10, wherein the processor executes instructions that have X bits and belong to a first instruction set, and instructions that have Y bits and belong to a second instruction set, Y being greater than X, and wherein the first recoder and the second recoder recode instructions belonging to one of the first instruction set and the second instruction set to form recoded instructions having at least Y bits.

12. The processor according to claim 11, wherein each instruction of the first instruction set has 16 bits and each instruction of the second instruction set has 32 bits.

13. The processor according to claim 11, wherein the first instruction set includes an extended instruction used to enlarge the immediate field of an extensible instruction of the first instruction set, and wherein the first recoder passes at least one bit of the extended instruction to the second recoder, thereby allowing the second recoder to recode the extensible instruction.

14. The processor according to claim 13, wherein the at least one bit of the extended instruction is concatenated to at least one bit of the extensible instruction.

15. The processor according to claim 11, wherein the first instruction set includes a mode-switching instruction used to switch the operating mode of the processor, and wherein the first recoder passes one or more bits to the second recoder, thereby allowing the second recoder to recode the mode-switching instruction.

16. The processor according to claim 15, wherein the one or more bits are concatenated to at least one bit of the mode-switching instruction.

17. A processing system, comprising:

a first recoder for generating at least one information bit based on an extended instruction; and

a second recoder, coupled to the first recoder, for recoding an extensible instruction based on the at least one information bit from the first recoder.

18. The processing system according to claim 17, further comprising:

19. The processing system according to claim 17, wherein the processing system executes instructions that have X bits and belong to a first instruction set, and instructions that have Y bits and belong to a second instruction set, Y being greater than X, and wherein the first recoder and the second recoder recode instructions belonging to one of the first instruction set and the second instruction set to form recoded instructions having at least Y bits.

20. The processing system according to claim 19, wherein each instruction of the first instruction set has 16 bits and each instruction of the second instruction set has 32 bits.

21. The system according to claim 19, wherein the extended instruction is used to enlarge the immediate field of the extensible instruction, and wherein the first recoder passes the bits of an extended field to the second recoder.

22. The system according to claim 21, wherein the bits of the extended field are concatenated to at least one bit of the extensible instruction.

23. A computer-readable medium comprising a microcontroller core embodied in software, the microcontroller core comprising:

a first recoder; and

a second recoder coupled to the first recoder,

24. The computer-readable medium according to claim 23, further comprising:

25. The computer-readable medium according to claim 23, wherein the microcontroller core executes instructions that have X bits and belong to a first instruction set, and instructions that have Y bits and belong to a second instruction set, Y being greater than X, and wherein the first recoder and the second recoder recode instructions belonging to one of the first instruction set and the second instruction set to form recoded instructions having at least Y bits.

26. The computer-readable medium according to claim 25, wherein each instruction of the first instruction set has 16 bits and each instruction of the second instruction set has 32 bits.

27. The computer-readable medium according to claim 25, wherein the first instruction set includes an extended instruction used to enlarge the immediate field of an extensible instruction of the first instruction set, and wherein the first recoder passes at least one bit of the extended instruction to the second recoder, thereby allowing the second recoder to recode the extensible instruction.

28. The computer-readable medium according to claim 27, wherein the at least one bit of the extended instruction is concatenated to at least one bit of the extensible instruction.

29. The computer-readable medium according to claim 25, wherein the first instruction set includes a mode-switching instruction used to switch the operating mode of a processor, and wherein the first recoder passes one or more bits to the second recoder, thereby allowing the second recoder to recode the mode-switching instruction.

30. The computer-readable medium according to claim 29, wherein the one or more bits are concatenated to at least one bit of the mode-switching instruction.

31. A method for recoding instructions for execution by a processor, comprising:

(a) fetching an extended instruction and an extensible instruction from an instruction cache;

(b) sending the extended instruction to a first recoder and sending the extensible instruction to a second recoder;

(c) generating, in the first recoder, at least one information bit based on the extended instruction; and

(d) recoding the extensible instruction in the second recoder using the at least one information bit generated in the first recoder.

32. The method according to claim 31, wherein step (a) comprises:

(i) fetching the extended instruction in a first clock cycle of the processor; and

(ii) fetching the extensible instruction in a subsequent clock cycle of the processor.

33. The method according to claim 31, wherein, in a first clock cycle of the processor, the at least one information bit is generated in the first recoder based on the extended instruction, and, in a second clock cycle of the processor, the extensible instruction is recoded in the second recoder.

34. The method according to claim 33, further comprising, between steps (c) and (d), the step of:

storing the at least one information bit generated in the first recoder in an information buffer.

35. A method for recoding instructions for execution by a processor, comprising:

fetching a plurality of instructions from an instruction cache, wherein the plurality of instructions includes a first instruction and a second instruction, the first instruction being different from the second instruction;

sending the first instruction to a first recoder and sending the second instruction to a second recoder; and

recoding the first and second instructions in a single clock cycle.

36. The method according to claim 35, wherein the second instruction is recoded using information from the first instruction.

37. The method according to claim 35, further comprising passing information from the first recoder to the second recoder, the information being used by the second recoder to perform a recoding operation.

38. A fetch unit for a processor, comprising:

a first recoder; and

a second recoder operating in parallel with the first recoder;

wherein, in a single clock cycle, the first recoder recodes a first instruction and the second recoder recodes a second instruction, the first instruction being different from the second instruction.

39. The fetch unit according to claim 38, wherein the second recoder recodes the second instruction using information from the first instruction.

40. The fetch unit according to claim 39, wherein the first recoder is coupled to the second recoder.

41. The fetch unit according to claim 1, wherein the first instruction is used to enlarge a field of the second instruction, and the information is at least one bit of the first instruction.

42. The fetch unit according to claim 41, wherein the first instruction is an extended instruction, the second instruction is an extensible instruction, and the field is an immediate field.

43. The processor according to claim 9, wherein the first instruction is used to enlarge a field of the second instruction, and the information is at least one bit of the first instruction.

44. The processor according to claim 43, wherein the first instruction is an extended instruction, the second instruction is an extensible instruction, and the field is an immediate field.

45. The computer-readable medium according to claim 23, wherein the first instruction is used to enlarge a field of the second instruction, and the information is at least one bit of the first instruction.

46. The computer-readable medium according to claim 45, wherein the first instruction is an extended instruction, the second instruction is an extensible instruction, and the field is an immediate field.