CN100343799C - Apparatus and method for generating early status flags of pipelined microprocessor - Google Patents
- Publication number
- CN100343799C, CNB2005100051447A, CN200510005144A
- Authority
- CN
- China
- Prior art keywords
- early
- instruction
- status
- pipeline microprocessor
- stage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Executing Machine-Instructions (AREA)
- Advance Control (AREA)
Abstract
An apparatus and method for generating early status flags in a pipeline microprocessor is disclosed. The apparatus includes early status flag generation logic that receives an instruction, an early result of the instruction, and a valid indicator of the early result and responsively generates the early flags. If the instruction is flag-modifying, then the early status flags are stored in an early flags register. The early flags in the register are invalidated if the early result from which they are generated is invalid. The early status flags and associated valid indicator may be employed by subsequent conditional instructions for early execution to avoid delay in waiting for the architected status flag values to be generated by execution units later in the pipeline. The early flags are revalidated if all flags-modifying instructions in pipeline stages below the early flag generation logic, if any, have already updated the architected status flags.
Description
Technical field
The present invention relates to pipeline microprocessors, and more particularly to early instruction execution in a pipeline microprocessor.
Background technology
Modern microprocessors are generally pipeline microprocessors. That is, they operate on multiple instructions simultaneously, in different blocks or pipeline stages of the microprocessor. Hennessy and Patterson define pipelining as "an implementation technique whereby multiple instructions are overlapped in execution." In Computer Architecture: A Quantitative Approach, 2nd edition, by John L. Hennessy and David A. Patterson (Morgan Kaufmann Publishers, San Francisco, CA, 1996), they give the following description of pipelining:
A pipeline is like an assembly line. In an automobile assembly line there are many steps, each contributing something to the construction of the car, and each step operates in parallel with the other steps, though on a different car. In a computer pipeline, each step in the pipeline completes a part of an instruction. Like the assembly line, different steps complete different parts of different instructions in parallel. Each of these steps is called a pipe stage or a pipe segment. The stages are connected one to the next to form a pipe: instructions enter at one end, progress through the stages, and exit at the other end, just as cars would on an assembly line.
Synchronous microprocessors operate according to clock cycles. Typically, an instruction passes from one stage of the microprocessor pipeline to the next each clock cycle. In an automobile assembly line, if the workers in one stage of the line sit idle because they have no car to work on, the output of the line is diminished. Similarly, if a microprocessor stage sits idle during a clock cycle because it has no instruction to operate on, a situation commonly referred to as a pipeline bubble, the performance of the processor is diminished.
A potential cause of pipeline bubbles is branch instructions. When a branch instruction is encountered, the processor must determine the target address of the branch and begin fetching instructions at the target address rather than at the next sequential address after the branch instruction. Furthermore, if the branch instruction is a conditional branch instruction (that is, a branch that may or may not be taken depending on whether a specified condition holds), the processor must decide whether the branch is taken, in addition to determining the target address. Because the pipeline stages that finally determine the target address and/or the branch outcome (whether the branch is taken) are typically many stages below the stage that fetches instructions, bubbles may be created.
To address this problem, modern microprocessors typically employ branch prediction mechanisms to predict the target address and branch outcome early in the pipeline, and microprocessor designers continually strive to design more accurate branch predictors. Nevertheless, branch mispredictions remain costly. As noted above, a misprediction must be detected and corrected in a pipeline stage below the branch prediction stage, and the penalty associated with a misprediction grows with the number of pipeline stages between the branch prediction stage and the stage that corrects the misprediction. Therefore, what is needed is an apparatus and method for correcting mispredicted conditional branch instructions earlier in the pipeline.
Furthermore, a conditional branch instruction specifies a branch condition which, if satisfied, instructs the microprocessor to branch to the branch target address; otherwise, the microprocessor continues fetching at the next sequential instruction. The microprocessor includes status flags that store the state of the microprocessor, and the status flags are examined to determine whether the condition specified by the conditional branch instruction is satisfied. Hence, to finally determine whether a conditional branch instruction was mispredicted, the microprocessor must examine the most current state of the status flags. Currently, however, the status flags are examined only late in the pipeline to determine whether the branch condition is satisfied and whether the branch prediction was correct. Therefore, what is needed is an apparatus and method for generating the status flags early in the pipeline.
Finally, the state of the status flags is typically affected by the results of instructions that precede the conditional branch instruction. For example, the condition may be specified in terms of a status flag such as the carry flag, whose state may be determined by the result of the most recent add instruction. However, the results of instructions that affect the status flags are generated by the execution units, which reside in the lower stages of the microprocessor pipeline. Therefore, what is needed is an apparatus and method for generating instruction results early in the pipeline.
Summary of the invention
In view of the above problems, the present invention provides an apparatus and method for generating early status flags in a pipeline microprocessor.
Accordingly, an apparatus for generating early status flags according to one embodiment of the invention includes early status flag generation logic, which receives an instruction, an early result of the instruction, and a valid indicator that indicates whether the early result is valid. The early flag generation logic responsively generates the early status flags. If the instruction is a flag-modifying instruction, the early status flags are stored in an early flags register. If the early result from which the early flags were generated is invalid, the early flags in the register are invalidated. The early status flags and the associated valid indicator may be used by subsequent conditional instructions for early execution, avoiding the delay of waiting for the architected status flag values to be generated later in the pipeline. The early flags are revalidated if all flag-modifying instructions, if any, in the pipeline stages below the early flag generation logic have already updated the architected status flags.
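The store, invalidate, and revalidate rules described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the patented circuit: the names EarlyFlags, update_early_flags, and revalidate are hypothetical, only ZF, SF, and PF are modeled, results are assumed to be 32 bits wide, and copying the architected flag values on revalidation is an assumption of the sketch.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # assumed 32-bit results


@dataclass
class EarlyFlags:
    """Hypothetical model of the early flags register plus its valid indicator."""
    zf: bool = False
    sf: bool = False
    pf: bool = False
    valid: bool = True


def flags_from_result(result):
    """Derive ZF, SF and PF from a 32-bit result using the x86 definitions."""
    result &= MASK32
    zf = result == 0
    sf = bool(result >> 31)
    pf = bin(result & 0xFF).count("1") % 2 == 0  # PF is set on even parity of the low byte
    return zf, sf, pf


def update_early_flags(reg, modifies_flags, early_result, result_valid):
    """Apply one instruction to the early flags register."""
    if not modifies_flags:
        return reg                # non-flag-modifying instructions leave the register alone
    if not result_valid:
        reg.valid = False         # flags derived from an invalid early result are invalid
        return reg
    reg.zf, reg.sf, reg.pf = flags_from_result(early_result)
    return reg                    # note: a valid result does not by itself revalidate


def revalidate(reg, architected, pending_flag_writers_below):
    """Revalidate once no flag-modifying instruction below is still pending.

    Copying the architected values (zf, sf, pf) is an assumption of this sketch.
    """
    if pending_flag_writers_below == 0:
        reg.zf, reg.sf, reg.pf = architected
        reg.valid = True
    return reg
```

In this model an invalidated register stays invalid until revalidate is called with no pending flag-modifying instructions, which mirrors the revalidation condition stated in the text.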
The early status flags may be used to perform early correction of conditional branch instructions in the pipeline microprocessor. A branch predictor predicts the outcome of a conditional branch instruction in the pipeline, and the prediction is passed down the pipeline along with the branch instruction. Early branch correction logic resides in a pipeline stage earlier than the stage in which late branch correction logic resides. When the branch instruction reaches the early branch correction logic, the logic examines the condition code specified by the branch instruction against the early status flags to determine whether the specified condition is satisfied, and hence whether the prediction was incorrect. The early status flags are generated in response to instructions preceding the branch instruction. If the early status flags are valid and indicate a misprediction, the early branch correction logic corrects the misprediction.
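The decision made by the early branch correction logic can be illustrated with a minimal Python sketch. The function name and the returned action strings are hypothetical, and leaving the branch to the late branch correction logic when the early flags are invalid is an assumption drawn from the text, which only says that correction happens when the flags are valid.

```python
def early_branch_correction(predicted_taken, condition_met, early_flags_valid):
    """Return a corrective action, or None if no early correction is made.

    predicted_taken   -- the branch predictor's prediction for this branch
    condition_met     -- the branch condition evaluated against the early flags
    early_flags_valid -- valid indicator of the early flags
    """
    if not early_flags_valid:
        return None                      # cannot trust the early flags; leave it to late correction
    if predicted_taken == condition_met:
        return None                      # prediction agrees with the early flags
    if condition_met:
        return "fetch_branch_target"     # predicted not taken, but the branch is taken
    return "fetch_next_sequential"       # predicted taken, but the branch is not taken
```

For example, early_branch_correction(True, False, True) returns "fetch_next_sequential", while the same misprediction with invalid early flags returns None.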
In another aspect, according to some embodiments of the invention, the early results of instructions preceding the conditional branch instruction are generated by early execution logic. The early execution logic resides in a pipeline stage earlier than the execution units, which generate the final instruction results and final status flag values. The early execution logic is configured to execute a subset of the instructions of the microprocessor instruction set. In particular, the early execution logic is located in an address stage of the pipeline and includes an address generator, enhanced with logic to perform frequently executed instructions quickly. In one embodiment, the early execution logic includes logic to perform common, fast arithmetic, Boolean, and shift operations. If the early execution logic is not configured to execute an instruction preceding the branch instruction, the early result it generates for that instruction is invalid. If that instruction is a flag-modifying instruction, the early status flags are in turn invalidated.
In another aspect, in some embodiments of the invention, the early results are stored in an early register file whose registers correspond to the registers of the architected register file of the microprocessor, and each early result is stored together with a valid indicator for its register. The early register file provides early results as operands to the early execution logic/address generator for use in generating further early results. If the early register file provides an invalid input operand to the early execution logic, the early result generated is invalid. If the instruction is a flag-modifying instruction, the early status flags are in turn invalidated.
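The way invalidity propagates through the early register file can be shown with a small Python model. It is a sketch under assumptions, not the disclosed hardware: the class and function names are hypothetical, the set of early operations in EARLY_OPS is an arbitrary illustrative subset, and registers are assumed to be 32 bits wide.

```python
import operator

MASK32 = 0xFFFFFFFF  # assumed 32-bit registers

# Hypothetical subset of operations the early execution logic supports.
EARLY_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


class EarlyRegisterFile:
    """Early register file: one value and one valid bit per register."""

    def __init__(self, num_regs=8):
        self.values = [0] * num_regs
        self.valid = [True] * num_regs

    def read(self, tag):
        return self.values[tag], self.valid[tag]

    def write(self, tag, value, valid):
        self.values[tag] = value & MASK32
        self.valid[tag] = valid


def early_execute(erf, op, dest, src_a, src_b):
    """Produce an early result and its valid indicator, then store both."""
    a, a_valid = erf.read(src_a)
    b, b_valid = erf.read(src_b)
    # The result is valid only if the operation is supported and both inputs are valid.
    result_valid = op in EARLY_OPS and a_valid and b_valid
    result = EARLY_OPS[op](a, b) & MASK32 if result_valid else 0
    erf.write(dest, result, result_valid)
    return result, result_valid
```

An unsupported operation marks its destination register invalid, and any later early operation that reads that register produces an invalid result as well, which is the chain of invalidation the text describes.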
Description of drawings
Fig. 1 is a block diagram of a pipeline microprocessor according to the present invention;
Fig. 2 is a block diagram illustrating in detail the R stage, A stage, and J stage of the pipeline microprocessor of Fig. 1 according to the present invention;
Fig. 3 is a block diagram illustrating in detail the architected register file, early register file, architected EFLAGS register, and early EFLAGS register of Fig. 1 and Fig. 2 according to the present invention;
Fig. 4 is a flowchart illustrating operation of the apparatus of Fig. 2 to generate early results and early EFLAGS according to the present invention;
Fig. 5 is a flowchart illustrating operation of the apparatus of Fig. 2 to restore and validate the early EFLAGS register according to the present invention;
Fig. 6 is a flowchart illustrating operation of the pipeline microprocessor to perform early branch correction according to the present invention; and
Fig. 7 is a flowchart illustrating operation of the microprocessor to perform late branch correction according to the present invention.
Description of reference numerals:
100 pipeline microprocessor
102 I stages
104 F stages
106 X stages
108 R stages
112 A stages
114 J stages
116 D stages
118 G stages
122 H stages
124 E stages
126 S stages
128 W stages
132 branch predictor
134 architected register file
136 early register file
138 early execution logic/address generator
142 early EFLAGS register
144 early branch correction logic
146 execution units
148 late branch correction logic
152 early result bus
154 first control signal
156 second control signal
158 result bus
162 architected EFLAGS register
202 flag-modifying instruction present signal
204 destination tags
206 instruction
208 branch predicted taken signal
212 early flag generation/control logic
214 source operand tags
216 destination operand tag
218 valid bits
222 immediate/constant operand
226 multiplexer
228 stall signal
232, 234 pipeline registers
236 valid bit pipeline register
238 source operand pipeline register
242 early result
244 early result valid signal
246 early EFLAGS valid register
248~252 pipeline registers
254 early result pipeline register
258 early branch correction signal
262 early EFLAGS register value
264 control signal
266 memory operand signal
268 late branch correction signal
272 arithmetic logic
274 Boolean logic
276 shift logic
278 destination operand tag
282 selector control signal
402~444 flow of early result and early flag generation
502~514 flow of early flag restoration
602~616 flow of early branch correction
702~716 flow of late branch correction
Embodiment
The apparatus and method for generating early status flags in a pipeline microprocessor according to preferred embodiments of the present invention are described below with reference to the accompanying drawings, in which like elements are denoted by like reference numerals.
Referring to Fig. 1, a block diagram of a pipeline microprocessor 100 according to a preferred embodiment of the present invention is shown. The pipeline microprocessor 100 includes a plurality of stages in its pipeline. Fig. 1 illustrates a pipeline of twelve stages.
The pipeline microprocessor 100 includes an I stage 102 (instruction fetch stage) for fetching instructions. The I stage 102 includes an instruction cache for caching program instructions. The I stage 102 fetches program instructions from the instruction cache or from a system memory coupled to the pipeline microprocessor 100, at the memory address held in an instruction pointer register. The instruction pointer is normally incremented as instructions are fetched, so that instructions are fetched sequentially. However, the instruction pointer value may be changed to a non-sequential memory address in response to a branch instruction, causing the pipeline microprocessor 100 to branch to a branch target address. The I stage 102 also includes a branch predictor 132, which predicts whether a branch instruction is present in the fetched instruction stream, whether the branch will be taken, and, if taken, the branch target address. That is, the branch predictor 132 predicts whether the branch instruction, when finally resolved later in the pipeline microprocessor 100, will direct the pipeline microprocessor 100 to branch to the branch target address specified by the branch instruction (taken) or to fetch and execute the next sequential instruction after the branch instruction (not taken). The I stage 102 receives first and second control signals 154 and 156, described below, which direct the I stage 102 to correct a prediction of a branch instruction made by the branch predictor 132. In one embodiment, the branch predictor 132 includes a branch target address cache (BTAC), which caches the addresses of previously executed branch instructions and their resolved branch target addresses. In addition, the BTAC stores prediction information, based on the history of the branch instruction, for predicting whether the branch will be taken. Other embodiments include dynamic branch history tables, static branch predictors, and hybrid static/dynamic branch predictors, all of which are well known in the art of branch prediction. In one embodiment, the I stage 102 comprises four pipeline stages.
The values stored in the registers of the architected register file 134 reflect the user-visible state of the pipeline microprocessor 100, whereas the values stored in the registers of the early register file 136 may reflect a speculative state of the pipeline microprocessor 100. Therefore, when a stage of the pipeline microprocessor 100 generates an instruction result and the instruction specifies a register of the architected register file 134 as its final destination, the result is not allowed to be written to the architected register file 134 until the instruction is no longer speculative, that is, until the instruction is guaranteed to complete, or retire. In contrast, as described below, an instruction result may be written to the early register file 136 before the instruction is guaranteed to complete. In particular, an instruction result generated by the early execution logic/address generator 138 (included in the A stage 112 and described below) may be written to the early register file 136 before the instruction is guaranteed to complete. An instruction is guaranteed to complete once the pipeline microprocessor 100 determines that neither the instruction itself nor any instruction preceding it is still capable of generating an exception, and that all branch instructions preceding the instruction have been finally resolved; in other words, the pipeline microprocessor 100 has finally determined, for each branch instruction preceding the instruction, whether the branch was correctly taken or not taken and, if taken, whether its branch target address was correct. In addition, the values in the architected register file 134 are always guaranteed to be valid, whereas the values in the early register file 136 may, in addition to being speculative, be either valid or invalid, as described below. Therefore, each register in the early register file 136 also has a corresponding valid bit 218 (shown in Fig. 3), which indicates whether the value stored in the corresponding register is valid. When the pipeline microprocessor 100 is reset, the early register file 136 is initialized to the same values to which the architected register file 134 is initialized.
The R stage 108 of the pipeline of the pipeline microprocessor 100 also includes an architected EFLAGS register 162, and the A stage 112 includes an early EFLAGS register 142. The architected EFLAGS register 162 and the early EFLAGS register 142 store status flags that indicate attributes of instruction results, such as whether the result is zero, generated a carry, or is negative. In one embodiment, each status flag is a single bit. In one embodiment, the architected EFLAGS register 162 comprises an x86 architecture EFLAGS register containing the following status flags: overflow flag (OF), sign flag (SF), zero flag (ZF), parity flag (PF), and carry flag (CF), as shown in Fig. 3. The instruction set of the pipeline microprocessor 100 includes conditional instructions that specify a condition code. A condition code specifies a state of one or more of the status flags. If the current state of the status flags equals the state specified in the condition code, the condition is true and the pipeline microprocessor 100 performs the operation specified by the conditional instruction; otherwise, the specified operation is not performed. One example of a conditional instruction is a conditional branch instruction, which in the x86 architecture is a jump-if-condition (Jcc) instruction that specifies a condition code and a displacement used to calculate a branch target address. An example of a Jcc instruction is the JNZ (jump if not zero) instruction, which specifies the "not zero" condition code. If the zero flag (ZF) is clear (that is, the "not zero" condition is true), the pipeline microprocessor 100 branches to the branch target address specified by the branch instruction (the conditional branch is taken); if the zero flag (ZF) is set (that is, the "not zero" condition is false), the pipeline microprocessor 100 fetches the next sequential instruction after the conditional branch instruction. Other examples of x86 conditional instructions are SETcc, LOOPcc, and CMOVcc.
The architected EFLAGS register 162 contains the status flags visible to programs executing on the pipeline microprocessor 100. In particular, a conditional program instruction may specify a condition code based on the status flags in the architected EFLAGS register 162. The early EFLAGS register 142 contains status flags corresponding to each of the status flags in the architected EFLAGS register 162, as shown in Fig. 3. Similar to the relationship between the architected register file 134 and the early register file 136, the values stored in the architected EFLAGS register 162 reflect the user-visible state of the pipeline microprocessor 100, whereas the values stored in the early EFLAGS register 142 may reflect a speculative state of the pipeline microprocessor 100. Therefore, when the pipeline microprocessor 100 executes an instruction that modifies one or more status flags, the status flags are not updated in the architected EFLAGS register 162 until the instruction is no longer speculative. In contrast, as described below, the status flags may be updated in the early EFLAGS register 142 before the instruction is guaranteed to complete. In particular, the pipeline microprocessor 100 may update the early EFLAGS register 142 in response to an instruction executed by the early execution logic/address generator 138, also referred to as the early execution unit, before the instruction is guaranteed to complete. In addition, the values in the architected EFLAGS register 162 are always valid, whereas the values in the early EFLAGS register 142 may be valid or invalid, as discussed in detail below. Therefore, as shown in Fig. 3, the early EFLAGS register 142 also includes a valid bit 246, which indicates whether the values stored in the early EFLAGS register 142 are valid. The early EFLAGS register 142 is initialized to the same value to which the architected EFLAGS register 162 is initialized when the pipeline microprocessor 100 is reset. The embodiment of Fig. 3 shows a single valid bit 246 for the early EFLAGS register 142. However, other embodiments are contemplated in which a separate valid bit is maintained for each status flag in the early EFLAGS register 142.
The early execution logic/address generator 138 of the A stage 112 generates an early result and a valid indicator, which are provided back to the R stage 108 via an early result bus 152, as described below. The early execution logic/address generator 138 also performs effective address calculations to generate memory addresses for memory accesses based on its input operands, such as operands provided by the architected register file 134, the early register file 136, and/or a constant operand supplied by the instruction, such as a displacement or offset. A memory address may be explicitly specified by the instruction. A memory address may also be implicitly specified by the instruction, such as the address of a location in stack memory that is implicitly specified by the stack pointer register (e.g., ESP) or base pointer register (e.g., EBP), as in a push or pop instruction.
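As a rough illustration of the two addressing cases, the Python sketch below forms an explicit effective address by summing its operands and derives the implicit stack addresses used by push and pop. The function names are hypothetical, addresses are assumed to wrap at 32 bits, and the 4-byte default operand size is an assumption; the push and pop behavior follows the usual x86 convention of decrementing ESP before a push and incrementing it after a pop.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address space


def effective_address(*operands):
    """Explicit address: sum of register operands and any displacement, wrapped to 32 bits."""
    return sum(operands) & MASK32


def push_address(esp, size=4):
    """Implicit address for a push: ESP is decremented, then the new ESP is the store address.

    Returns (store_address, new_esp).
    """
    new_esp = (esp - size) & MASK32
    return new_esp, new_esp


def pop_address(esp, size=4):
    """Implicit address for a pop: load from the current ESP, then ESP is incremented.

    Returns (load_address, new_esp).
    """
    return esp, (esp + size) & MASK32
```

For instance, effective_address(0x1000, 0x20, 8) combines two register operands and a displacement of 8, and push_address(0x2000) gives the address just below the current top of stack.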
In one embodiment, the pipeline microprocessor 100 is a single-issue, or scalar, or single-execution microprocessor. That is, at most one instruction is issued per clock cycle from the instruction issue, or instruction generation, stages (I stage 102 through X stage 106) of the pipeline microprocessor 100 to the execution stages (R stage 108 through W stage 128), in contrast to a superscalar microprocessor, which can issue two or more instructions for execution per clock cycle. However, the methods and apparatus described herein are not limited to scalar microprocessors. In one embodiment, the pipeline microprocessor 100 is an in-order-issue microprocessor. That is, instructions are issued for execution in the order specified by the program, unlike some microprocessors that have the ability to issue instructions out of order.
Referring to Fig. 2, a block diagram illustrating in detail the R stage 108, A stage 112, and J stage 114 of the pipeline microprocessor 100 of Fig. 1 according to the present invention is shown. As in Fig. 1, the R stage 108 includes the architected register file 134, the early register file 136, and the architected EFLAGS register 162. The architected EFLAGS register 162 is updated by the W stage 128 via the result bus 158. The R stage 108 receives an instruction 206 from the X stage 106. In addition to the instruction bytes themselves, the instruction 206 may include decoded information. The instruction 206 specifies an instruction type, such as an add or a branch. The instruction 206 may also specify a condition code. The instruction 206 may also specify a destination operand location via a tag. In particular, the destination operand tag may specify one of the registers of the architected register file 134 as the destination of the result of the instruction 206. As shown in Fig. 1, the architected register file 134 receives instruction results from the W stage 128 via the result bus 158. A destination operand tag 278 is provided over a bus as a selector input to the architected register file 134 to select which register is updated with the instruction result from the result bus 158. As shown in Fig. 1, the early register file 136 receives the early result 242 from the A stage 112 via the early result bus 152. The instruction 206 includes a destination operand tag 216, which is provided as a selector input to the early register file 136 to select which register is updated with the early result 242. The destination operand tag 216 provided to the early register file 136 is the tag of the instruction 206 as piped down to a pipeline register 232 of the A stage 112.
The instruction 206 may also specify one or more source operands via source operand tags 214. In one embodiment, the instruction 206 may specify up to three source operands. The instruction 206 specifies register source operands via the source operand tags 214. The source operand tags 214 are provided as selector inputs to the architected register file 134 and the early register file 136 to select which registers are provided as source operands to the instruction 206. The instruction 206 may also specify an immediate/constant operand 222 (e.g., a displacement or offset).
As also shown in Fig. 3, the R stage 108 includes a valid bit 218 for each register of the early register file 136. The early register file valid bits 218 receive an early result valid signal 244 from the A stage 112 via the early result bus 152. The early result valid signal 244 is used to update the valid bit 218 corresponding to the register of the early register file 136 selected by the destination operand tag 216.
The R stage 108 also includes a multiplexer 226, which selects the source operands for the instruction 206 entering the R stage 108. The multiplexer 226 receives source operand inputs from the architected register file 134 and the early register file 136. In one embodiment, the architected register file 134 has three read ports, two of whose outputs are provided as inputs to the multiplexer 226 to supply up to two source operands per clock cycle. In one embodiment, the early register file 136 has two read ports, whose outputs are provided as inputs to the multiplexer 226 to supply up to two source operands per clock cycle. The multiplexer 226 also receives the immediate/constant operand 222, which may be included in the instruction 206. The multiplexer 226 also receives the early result 242 from the A stage 112.
In addition, the multiplexer 226 receives a valid bit input associated with each operand input. The valid bits associated with operands received from the early register file 136 are provided by the early register file valid bits 218. The valid bit associated with the early result 242 operand received from the A stage 112 is the early result valid signal 244. The valid bits of operands provided by the architected register file 134 and of the immediate/constant operand 222 are always true; that is, architected register file 134 operands and immediate/constant operands 222 are always valid. The source operands selected by the multiplexer 226 and their associated valid bits are provided to a source operand pipeline register 238 and a valid bit pipeline register 236, respectively, for provision to the A stage 112. In one embodiment, the source operand pipeline register 238 is configured to store up to three source operands, and the valid bit pipeline register 236 is configured to store the three associated valid bits.
In one embodiment, the pipeline microprocessor 100 also includes forwarding buses (not shown) that forward results generated by the execution units 146 from the E stage 124 and S stage 126 to the R stage 108 for use as operands. The forwarding buses are provided as inputs to the multiplexer 226. If an instruction in the R stage 108 specifies a source operand whose tag does not match the destination tag of any of the A stage 112 through H stage 122 but does match the destination tag of the E stage 124 or S stage 126, the early flag generation/control logic 212 controls the multiplexer 226 to select the forwarding bus that provides the most recent result as the operand. An operand supplied on a forwarding bus is always valid.
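The tag comparison that decides whether a forwarding bus is selected can be sketched in Python as follows. The function name and the "E"/"S" return values are hypothetical. Preferring the E stage over the S stage when both match is an assumption of the sketch, based on the E stage being the earlier of the two stages and therefore holding the more recent instruction.

```python
def select_forwarding_bus(src_tag, a_through_h_dest_tags, e_dest_tag, s_dest_tag):
    """Return "E" or "S" to select that stage's forwarding bus, or None to select neither.

    src_tag               -- source operand tag of the instruction in the R stage
    a_through_h_dest_tags -- destination tags of instructions in stages A through H
    e_dest_tag, s_dest_tag -- destination tags in the E and S stages (None if empty)
    """
    if src_tag in a_through_h_dest_tags:
        return None        # a newer producer is still above the E stage
    if src_tag == e_dest_tag:
        return "E"         # the E stage holds the most recent matching result
    if src_tag == s_dest_tag:
        return "S"
    return None
```

When None is returned, the operand comes from one of the other multiplexer inputs; the choice among those inputs is outside the scope of this sketch.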
Based on its inputs, the early flag generation/control logic 212 generates various control signals. It generates a selector control signal 282, which controls multiplexer 226 to select the appropriate source operands for the instruction 206. It also generates the early result valid signal 244. It also generates an early status register value 262 for storage in the early status register 142, and a control signal 264 used to update the value of the early status valid register 246. The value in the early status valid register 246 is initialized to a valid value when the microprocessor is reset. The early status register 142 and the early status valid register 246 act as pipeline registers that provide the early status flags and their valid bit to the J stage 114. The early flag generation/control logic 212 also generates a stall signal 228, which is provided to pipeline registers 232 and 234, the valid bit pipeline register 236, and the source operand pipeline register 238 in order to stall the R stage 108. In one embodiment, pipeline registers 232, 234, 236, and 238 are muxed registers that retain their current state through the next clock cycle when the stall signal 228 is true. The operation of the stall signal 228 is described in detail below with reference to Fig. 4.
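The stall behavior of the muxed pipeline registers can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the patent; the sketch only models the stated behavior that a register keeps its current state while the stall signal 228 is true.

```python
class MuxedPipelineRegister:
    """Hypothetical model of a pipeline register that holds state while stalled."""

    def __init__(self, initial=None):
        self.state = initial

    def clock(self, next_value, stall):
        # When the stall signal is true, the register recirculates its
        # current state; otherwise it loads the value from the prior stage.
        if not stall:
            self.state = next_value
        return self.state


reg = MuxedPipelineRegister("insn0")
held = reg.clock("insn1", stall=True)     # R stage stalled: state is retained
loaded = reg.clock("insn1", stall=False)  # stall released: new value is loaded
```

In this model a stalled R stage simply re-presents the same instruction and operands on the following cycle.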
The A stage 112 includes the early execution logic/address generator 138, which receives the source operands from the source operand pipeline register 238 and generates the early result 242 from them. The early execution logic/address generator 138 includes arithmetic logic 272, Boolean logic 274, and shift logic 276. The early execution logic/address generator 138 generates addresses for memory operations. It also performs a subset of the operations required by the instructions of the instruction set of the pipeline microprocessor 100; that is, it performs a subset of the operations that the execution units 146 of Fig. 1 can perform. The subset is chosen to cover the most commonly executed operations. In one embodiment, the most commonly executed operations also largely coincide with the fastest operations (those that take a relatively short time and can therefore be performed in a single clock cycle) and require a relatively small amount of hardware, particularly beyond the hardware already required to generate memory addresses. In one embodiment, the arithmetic logic 272 performs addition, subtraction, increment, and decrement; however, it does not perform add with carry or subtract with borrow. In one embodiment, the Boolean logic 274 performs Boolean AND, OR, XOR, and NAND, as well as move with sign extend and move with zero extend; however, it does not perform byte swap. In one embodiment, the shift logic 276 performs shift left and shift right; however, it does not perform rotate or rotate through carry. Although specific embodiments are described in which the early execution logic/address generator 138 performs a particular subset of operations, the present invention is not limited to these embodiments. A person skilled in microprocessor design will readily appreciate that the particular subset can be chosen based on the instruction set of the pipeline microprocessor 100 and on the design goals for the circuit. The early result 242 generated by the early execution logic/address generator 138 is provided to an early result pipeline register 254 for storage and subsequent provision to the J stage 114.
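The operation subset of this embodiment can be sketched as a lookup table. This is an illustration only: the 32-bit operand width, the operation mnemonics, and the function name `early_execute` are assumptions, not part of the patent.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit operands

# Operations the early execution logic handles in this embodiment.
EARLY_OPS = {
    "add":  lambda a, b: (a + b) & MASK32,
    "sub":  lambda a, b: (a - b) & MASK32,
    "inc":  lambda a, b: (a + 1) & MASK32,
    "dec":  lambda a, b: (a - 1) & MASK32,
    "and":  lambda a, b: a & b & MASK32,
    "or":   lambda a, b: (a | b) & MASK32,
    "xor":  lambda a, b: (a ^ b) & MASK32,
    "nand": lambda a, b: ~(a & b) & MASK32,
    "shl":  lambda a, b: (a << (b & 31)) & MASK32,
    "shr":  lambda a, b: (a & MASK32) >> (b & 31),
}


def early_execute(op, a, b=0):
    """Return (result, supported). Unsupported operations (for example
    add with carry, byte swap, rotate) produce no correct early result
    and are left to the execution units later in the pipeline."""
    fn = EARLY_OPS.get(op)
    if fn is None:
        return 0, False
    return fn(a, b), True
```

For example, `early_execute("add", 2, 3)` yields a supported result, while `early_execute("adc", 2, 3)` reports that the operation is outside the early subset.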
The J stage 114 includes the early branch correction logic 144 of Fig. 1. The early branch correction logic 144 receives the outputs of the early status register 142, the early status valid register 246, and the early result pipeline register 254. It also receives the instruction in the J stage 114 from a pipeline register 248, which passes the instruction 206 down the pipeline from pipeline register 232. The pipeline microprocessor 100 also includes a branch predicted taken signal 208, which is passed down the pipeline from the I stage 102, F stage 104, and X stage 106 through pipeline register 234 of the R stage 108 and pipeline register 252 of the A stage 112, and is provided to the early branch correction logic 144. A true value on the branch predicted taken signal 208 indicates that the instruction in the corresponding stage is a branch instruction that the branch predictor 132 of Fig. 1 predicted taken; that is, the pipeline microprocessor 100 previously branched according to the prediction made by the branch predictor 132.
Based on its inputs, the early branch correction logic 144 generates the first control signal 154, which is provided to the I stage 102 of Fig. 1. The first control signal 154 includes an early branch corrected signal 258, which is provided to the late branch correction logic 148 of Fig. 1. The early branch corrected signal 258 is true when the early branch correction logic 144 corrects a branch prediction. The operation of the early branch correction logic 144 is described in detail below with reference to Fig. 6.
Fig. 4 is a flowchart illustrating the operation of the apparatus of Fig. 2 in generating early results and early status flags according to the present invention. The flowchart of Fig. 4 spans two drawing sheets, Fig. 4A and Fig. 4B. Flow begins at block 402.
In block 402, an instruction 206 that specifies a source operand from the architected register file 134 reaches the R stage 108. The instruction specifies one or more source register operands via the source operand tags 214. Flow proceeds to decision block 404.
In decision block 404, the early flag generation/control logic 212 examines the instruction type and determines whether the instruction is of a type that must be executed in the A stage 112. In one embodiment, an instruction must be executed in the A stage 112 if it requires generation of a memory address that must be provided to the data cache in the D stage 116. If the instruction must be executed in the A stage 112, flow proceeds to decision block 406; otherwise, flow proceeds to decision block 412.
In decision block 406, the early flag generation/control logic 212 determines whether all source operands specified by the instruction are present and valid. If the instruction specifies a memory operand, the early flag generation/control logic 212 examines the memory operand present signal 266 to determine whether the specified memory operand is present and valid. If the instruction specifies an immediate operand, the immediate operand is always present and valid. If the instruction specifies a register operand, the early flag generation/control logic 212 compares the destination tags 204 from the lower pipeline stages with the source operand tag 214 to determine whether an older instruction in the pipeline microprocessor 100 generates a result destined for the architected register file 134 register specified by the source operand tag 214. If so, the result is present in the early register file 136, and in that case the early flag generation/control logic 212 examines the valid bit 218 of the specified operand to determine whether the operand is valid. If no older instruction in the pipeline microprocessor 100 generates a result destined for the architected register file 134 register specified by the source operand tag 214, then the operand is present and valid, because it will be supplied by the architected register file 134. If all source operands specified by the instruction are present and valid, flow proceeds to decision block 412; otherwise, flow proceeds to block 408.
In block 408, the early flag generation/control logic 212 generates a true value on the stall signal 228 to stall the instruction in the R stage 108 during the current clock cycle while waiting for the source operand to arrive from memory, to be written back to the architected register file 134, or to become available on the forwarding buses. Flow returns from block 408 to decision block 406.
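Blocks 404 through 408 amount to a readiness check followed by a stall decision. The following is a minimal sketch under stated assumptions: the function names, the string operand kinds, and the keyword parameters are hypothetical stand-ins for the signals described above (memory operand present signal 266, the tag comparison, and valid bit 218).

```python
def operand_ready(kind, *, memory_present=False,
                  pending_in_pipeline=False, early_valid=False):
    """Decide whether one source operand is present and valid (block 406)."""
    if kind == "immediate":
        return True                      # immediates are always present and valid
    if kind == "memory":
        return memory_present            # stands in for signal 266
    if kind == "register":
        # If an older in-flight instruction targets this register, the value
        # comes from the early register file and its valid bit decides;
        # otherwise the architected register file supplies it, always valid.
        return early_valid if pending_in_pipeline else True
    raise ValueError(f"unknown operand kind: {kind}")


def must_stall(must_execute_in_a_stage, operands_ready):
    """Stall (block 408) only for instructions that must execute in the
    A stage and still lack a present, valid operand."""
    return must_execute_in_a_stage and not all(operands_ready)
```

An instruction that need not execute in the A stage never stalls here; it proceeds with a possibly invalid early result instead.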
In decision block 412, the early flag generation/control logic 212 compares the source operand tag 214 with the destination tags 204 from the lower pipeline stages to determine whether an older instruction in the pipeline microprocessor 100 generates a result destined for the architected register file 134 register specified by the source operand tag 214. If so, the result is present in the early register file 136, although it is not necessarily valid. If no older instruction in the pipeline microprocessor 100 generates a result destined for the architected register file 134 register specified by the source operand tag 214, flow proceeds to block 416; otherwise, flow proceeds to block 414.
In block 414, the early flag generation/control logic 212 generates a value on the selector control signal 282 that causes multiplexer 226 to select the register source operand specified by the source operand tag 214 as supplied by the early register file 136. If the early result 242 is being generated in the A stage 112 during the same clock cycle in which the instruction reaches the R stage 108 and needs that source operand, multiplexer 226 instead selects the early result 242 input to supply the source operand to the instruction. For an instruction that specifies multiple register operands, multiplexer 226 may select multiple register operands from the early register file 136. Flow proceeds to block 418.
In block 416, the early flag generation/control logic 212 generates a value on the selector control signal 282 that causes multiplexer 226 to select the register source operand specified by the source operand tag 214 as supplied by the architected register file 134. In addition, the early flag generation/control logic 212 controls multiplexer 226 to select any non-register operands specified by the instruction, such as the immediate/constant operand 222 or a forwarded operand. Flow proceeds to block 418.
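The source selection of blocks 412 through 416, combined with the forwarding-bus embodiment, can be sketched as a tag comparison. This is a hedged illustration: the function name, the parameter names, and the priority order (the newest producer wins) are assumptions made for the sketch, not statements from the patent.

```python
def select_source(src_tag, a_stage_tag, below_a_tags, e_s_tags):
    """Pick which multiplexer 226 input supplies a register source operand.

    a_stage_tag  : destination tag of the instruction now in the A stage
    below_a_tags : destination tags in the stages after A, through H
    e_s_tags     : destination tags in the E and S stages
    """
    if src_tag == a_stage_tag:
        return "early_result_242"           # produced in this same cycle
    if src_tag in below_a_tags:
        return "early_register_file_136"    # result may or may not be valid
    if src_tag in e_s_tags:
        return "forwarding_bus"             # always valid
    return "architected_register_file_134"  # no older producer in flight
```

For instance, `select_source(5, a_stage_tag=2, below_a_tags=[5], e_s_tags=[])` picks the early register file, and a tag with no in-flight producer falls through to the architected register file.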
In block 418, the instruction proceeds to the A stage 112, where the early execution logic/address generator 138 generates the early result 242 from the source operands selected by multiplexer 226. Specifically, the appropriate one of the arithmetic logic 272, the Boolean logic 274, or the shift logic 276 generates the early result 242 according to the instruction type. Flow proceeds to decision block 422.
In decision block 422, the early flag generation/control logic 212 examines the destination operand tag of the instruction to determine whether the early result 242 is destined for a register in the architected register file 134. If so, flow proceeds to decision block 424; otherwise, flow proceeds to decision block 434.
In decision block 424, the early flag generation/control logic 212 examines the instruction type to determine whether the instruction is of a type that the early execution logic/address generator 138 can execute. That is, the early flag generation/control logic 212 determines whether the instruction falls within the subset of instructions for which the early execution logic/address generator 138 generates a correct early result, assuming the source operands are valid. If so, flow proceeds to decision block 428; otherwise, flow proceeds to block 426.
In block 426, the early flag generation/control logic 212 generates a false value on the early result valid signal 244 and updates the valid bit 218 of the early register file 136 register specified by the destination operand tag 216 with the false value, because a valid early result 242 was not generated by the early execution logic/address generator 138. Flow proceeds to decision block 434.
In decision block 428, since the early execution logic/address generator 138 generates a correct early result 242 for this instruction type, the early flag generation/control logic 212 determines whether all the operands used to generate the early result 242 were valid. Note that if the early flag generation/control logic 212 determined at decision block 404 that the instruction need not be executed in the A stage 112, the instruction is not stalled in the R stage 108 for lack of an operand. Thus, even if a register operand from the early register file 136 is invalid, the instruction is not stalled in the R stage 108. Likewise, if a memory operand has not yet been loaded from memory, the instruction is not stalled in the R stage 108. Likewise, if the early execution logic/address generator 138 cannot generate a valid early result for the instruction type, the instruction is not stalled in the R stage 108. Instead, the early result 242 is marked invalid at block 426, and the correct result is computed later by the execution units 146 in the E stage 124. In contrast, an instruction that must be executed in the A stage 112 (for example, an instruction that computes an address used to access the data cache in the D stage 116) is stalled in the R stage 108 when an operand is not yet available or is invalid. If all operands used to generate the early result 242 were valid, flow proceeds to block 432; otherwise, flow proceeds to block 426.
In block 432, the early flag generation/control logic 212 generates a true value on the early result valid signal 244 and updates the valid bit 218 of the early register file 136 register specified by the destination operand tag 216 with the true value, indicating that a valid early result was generated at block 418. At the same time, the early register file 136 register specified by the destination operand tag 216 is updated with the valid early result 242. Flow proceeds to decision block 434.
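The validity decision of blocks 422 through 432 reduces to a small function. This is a sketch only; the function and parameter names are hypothetical, and `None` is used here merely to signal that no early register file entry is written.

```python
def early_result_valid_bit(dest_is_arch_register, early_executable,
                           operand_valid_bits):
    """Return the value written to valid bit 218 for the destination
    register, or None when the result is not destined for a register."""
    if not dest_is_arch_register:
        return None                      # block 422: skip to block 434
    if not early_executable:
        return False                     # block 424 -> block 426
    # Block 428: the early result is valid only if every operand was valid.
    return all(operand_valid_bits)       # True -> block 432, False -> block 426
```

Invalidity therefore propagates: an early result computed from an invalid early operand is itself marked invalid.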
In decision block 434, the early flag generation/control logic 212 examines the instruction to determine whether it is of a type that modifies the architected status register 162. In one embodiment, the instructions that modify the status flags are those specified by the x86 architecture instruction set. If the instruction modifies the architected status register 162, flow proceeds to block 436; otherwise, flow ends.
In block 436, the early flag generation/control logic 212 generates the early status register value 262 based on the early result 242 generated by the early execution logic/address generator 138 and on the instruction 206, and updates the early status register 142 with the early status register value 262. In one embodiment, the flags are generated as follows. The early flag generation/control logic 212 generates a true value for the overflow flag (OF) if the signed two's complement integer arithmetic performed by the early execution logic/address generator 138 to produce the early result 242 causes an overflow condition (that is, the early result is too large or too small to fit in the destination operand), and a false value otherwise. It sets the sign flag (SF) to the value of the most significant bit of the early result 242. It generates a true value for the zero flag (ZF) if the early result is zero, and a false value otherwise. It generates a true value for the parity flag (PF) if the least significant byte of the early result 242 contains an even number of 1 bits, and a false value otherwise. It generates a true value for the carry flag (CF) if the unsigned integer arithmetic performed by the early execution logic/address generator 138 causes an overflow condition (that is, the arithmetic operation produces a carry out of, or a borrow into, the most significant bit of the early result 242), and a false value otherwise. In one embodiment, rather than generating a complete set of status flags to write to the early status register 142, the early flag generation/control logic 212 updates only the particular status flags affected by the early result 242. In another embodiment, the early flag generation/control logic 212 accumulates the status flags of preceding instructions until they are copied from the architected status register 162, as described below with reference to Fig. 5. Flow proceeds to decision block 438.
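The flag definitions of block 436 can be written out for addition and subtraction as follows. This is a minimal sketch assuming 32-bit operands; the function name `early_flags` and its return format are hypothetical, while the OF, SF, ZF, PF, and CF rules follow the description above.

```python
def early_flags(op, a, b, width=32):
    """Compute an early result and its status flags for 'add' or 'sub'."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    if op == "add":
        full = a + b
        result = full & mask
        cf = full > mask                                  # carry out of the MSB
        # Signed overflow: operands share a sign that differs from the result.
        of = bool(~(a ^ b) & (a ^ result) & sign)
    elif op == "sub":
        result = (a - b) & mask
        cf = a < b                                        # borrow into the MSB
        # Signed overflow: operand signs differ and the result sign differs from a.
        of = bool((a ^ b) & (a ^ result) & sign)
    else:
        raise ValueError(f"unsupported early operation: {op}")
    return {
        "result": result,
        "OF": of,
        "SF": bool(result & sign),                        # most significant bit
        "ZF": result == 0,
        "PF": bin(result & 0xFF).count("1") % 2 == 0,     # even count in low byte
        "CF": cf,
    }
```

For example, adding 1 to 0x7FFFFFFF sets OF and SF but not CF, whereas adding 1 to 0xFFFFFFFF sets CF and ZF but not OF.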
In decision block 438, the early flag generation/control logic 212 determines whether the modification of the status flags depends on the result of the instruction. For example, in one embodiment, certain instructions modify the status register directly (such as the x86 STC (set carry flag), CLC (clear carry flag), and CMC (complement carry flag) instructions); their modification does not depend on an instruction result, because such instructions have no result other than the status flag modification itself. If the modification of the status register depends on the result of the instruction, flow proceeds to decision block 442; otherwise, flow ends.
In decision block 442, the early flag generation/control logic 212 examines the early result valid signal 244 to determine whether the early result 242 is valid. If so, flow ends; otherwise, flow proceeds to block 444.
In block 444, the early flag generation/control logic 212 generates a value on the control signal 264 to update the value in the early status valid register 246 so that it indicates that the status flags stored in the early status register 142 are invalid. Note that the operation of blocks 434 through 444 accumulates the invalidity of the early status register 142: once invalidated, the early status register 142 remains invalid until it is revalidated, as described below with reference to Fig. 5. In one embodiment, because instructions that directly modify the status register are executed relatively infrequently, the early flag generation/control logic 212 is simplified by invalidating the early status register 142 whenever an instruction that directly updates a status flag is encountered. Flow ends at block 444.
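The sticky invalidation of blocks 434 through 444 can be summarized in a short next-state function. The names below are hypothetical; the `invalidate_on_direct` parameter is an assumption used to model the simplified embodiment in which direct flag updates (such as STC, CLC, CMC) invalidate the early flags.

```python
def next_early_status_valid(valid, modifies_flags, depends_on_result,
                            early_result_valid, invalidate_on_direct=False):
    """Return the next value of the early status valid register 246."""
    if not modifies_flags:
        return valid                     # block 434: not a flag-modifying instruction
    if not depends_on_result:
        # Block 438: direct flag update; the simplified embodiment invalidates.
        return False if invalidate_on_direct else valid
    if not early_result_valid:
        return False                     # block 444: flags came from an invalid result
    # A valid result never revalidates by itself; invalidity is sticky
    # until the register is restored (Fig. 5).
    return valid
```

Note that the function never returns `True` for an input state of `False`; revalidation happens only through the restore paths.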
Fig. 5 is a flowchart illustrating the operation of the apparatus of Fig. 2 in restoring and revalidating the early status register 142 according to the present invention. Fig. 5 contains two separate flowcharts. Each flowchart describes the restoration and revalidation of the early status register 142 as triggered by a different event. For the first case, flow begins at block 502.
In block 502, a branch instruction reaches the S stage 126. If the branch instruction needs to be corrected (that is, if the branch predictor 132 mispredicted the branch instruction, whether by mispredicting the branch direction or by mispredicting the branch target address), the late branch correction logic 148 flushes the pipeline microprocessor 100 in the course of correcting the misprediction, as described below with reference to Fig. 7. Flushing the pipeline microprocessor 100 means that no instruction that modifies the architected status register remains in the pipeline stages below the R stage 108; any instruction in the pipeline that modifies the architected status register 162 has either already updated the architected status register 162 or been flushed. Therefore, the architected status register 162 contains the most current state. Note that the corrected branch instruction may be either a conditional or an unconditional branch instruction. In addition, the early status register 142 may be restored in response to events other than a branch correction that cause the pipeline to be flushed. Flow proceeds to decision block 504.
In decision block 504, the early flag generation/control logic 212 examines the late branch correction signal 268 to determine whether the branch instruction in the S stage 126 was corrected by the late branch correction logic 148, which implies that the pipeline microprocessor 100 was flushed in order to correct the misprediction. If so, flow proceeds to block 506; otherwise, flow ends.
In block 506, the early flag generation/control logic 212 copies the value of the architected status register 162 to the early status register 142 via the early status register value 262 and marks the early status register 142 valid via the control signal 264, thereby restoring the early status register 142 to a valid state. Flow ends at block 506.
For the second case in Fig. 5, flow begins at decision block 512.
In decision block 512, the early flag generation/control logic 212 examines the flag-modifying instruction present signal 202 to determine whether all instructions that modify the architected status register and are present below the A stage 112, if any, have already updated the architected status register 162. If so, flow proceeds to block 514; otherwise, flow ends.
In block 514, the early flag generation/control logic 212 copies the value of the architected status register 162 to the early status register 142 via the early status register value 262 and marks the early status register 142 valid via the control signal 264, thereby restoring the early status register 142 to a valid state. Flow ends at block 514.
Note that the flush of the pipeline microprocessor 100 caused by the correction of the S stage 126 branch determined in decision block 504 is one event that creates the condition tested in decision block 512.
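Both restore paths of Fig. 5 end in the same action, which can be sketched as follows. The class and function names are hypothetical, and representing the flags as a dictionary is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class EarlyStatus:
    """Hypothetical model of early status register 142 plus valid register 246."""
    flags: dict = field(default_factory=dict)
    valid: bool = True


def maybe_restore(early, architected_flags, late_branch_corrected,
                  lower_flag_writers_done):
    """Copy the architected flags into the early register and revalidate it
    when either trigger holds: a late branch correction flushed the pipeline
    (blocks 504-506), or every flag-modifying instruction below the A stage
    has already updated the architected register (blocks 512-514)."""
    if late_branch_corrected or lower_flag_writers_done:
        early.flags = dict(architected_flags)
        early.valid = True
    return early
```

If neither trigger holds, the early register is left untouched, so an invalid early register stays invalid.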
Fig. 6 is a flowchart illustrating the operation of the pipeline microprocessor 100 in performing early branch correction according to the present invention. Flow begins at block 602.
In block 602, a conditional branch instruction reaches the J stage 114. Flow proceeds to decision block 604.
In decision block 604, the early branch correction logic 144 examines the output of the early status valid register 246 to determine whether the early status register 142 is valid. If so, flow proceeds to block 606; otherwise, flow ends. Thus, when the early status register 142 is invalid, the apparatus does not perform early conditional branch correction.
In block 606, the early branch correction logic 144 examines the contents of the early status register 142 to determine whether the condition specified by the condition code of the conditional branch instruction is satisfied. Flow proceeds to decision block 608.
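For an x86 embodiment, the condition test of block 606 can be illustrated with the standard x86 condition codes. This sketch covers only a subset of codes; the table layout and the function name `condition_met` are assumptions, while the flag expressions are the usual x86 definitions.

```python
# Subset of x86 condition codes, expressed over the status flags.
CONDITIONS = {
    "E":  lambda f: f["ZF"],                           # equal / zero
    "NE": lambda f: not f["ZF"],
    "B":  lambda f: f["CF"],                           # below (unsigned)
    "A":  lambda f: not f["CF"] and not f["ZF"],       # above (unsigned)
    "L":  lambda f: f["SF"] != f["OF"],                # less (signed)
    "GE": lambda f: f["SF"] == f["OF"],
    "S":  lambda f: f["SF"],
    "O":  lambda f: f["OF"],
    "P":  lambda f: f["PF"],
}


def condition_met(cc, flags):
    """Evaluate condition code `cc` against a dict of status flags."""
    return bool(CONDITIONS[cc](flags))
```

The same evaluation applies whether the flags come from the early status register in the J stage or from the architected status register in the S stage; only the source of the flags differs.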
In decision block 608, the early branch correction logic 144 determines, based on the examination in block 606, whether the prediction for the conditional branch instruction needs to be corrected. The conditional branch instruction needs to be corrected if the condition is satisfied by the valid early status register 142, so that the conditional branch should be taken, but the branch predictor 132 predicted the branch not taken (as indicated by a false value on the J stage 114 version of the branch predicted taken signal 208 passed down the pipeline), thereby causing the pipeline microprocessor 100 to fetch the next sequential instruction. Conversely, the conditional branch instruction needs to be corrected if the condition is not satisfied by the valid early status register 142, so that the conditional branch should not be taken, but the branch predictor 132 predicted the branch taken (as indicated by a true value on the J stage 114 version of the branch predicted taken signal 208 passed down the pipeline), thereby causing the pipeline microprocessor 100 to branch to the predicted branch target address. If the prediction for the conditional branch instruction needs to be corrected, flow proceeds to decision block 612; otherwise, flow ends.
In decision block 612, having determined that the conditional branch instruction needs to be corrected, the early branch correction logic 144 examines the J stage 114 version of the branch predicted taken signal 208 passed down the pipeline to determine whether the conditional branch instruction was predicted taken. If so, flow proceeds to block 616; otherwise, flow proceeds to block 614.
In block 614, the early branch correction logic 144, via the first control signal 154, directs the I stage 102 to flush the pipeline microprocessor 100 above the J stage 114 and to branch the pipeline microprocessor 100 to the branch target address of the conditional branch instruction. In one embodiment, the branch target address of the conditional branch instruction is generated by the early execution logic/address generator 138 in the A stage 112. In addition, the early branch correction logic 144 generates a true value on the early branch corrected signal 258, which is passed down the pipeline to the late branch correction logic 148; its use is described below with reference to Fig. 7. Flow ends at block 614.
In block 616, the early branch correction logic 144, via the first control signal 154, directs the I stage 102 to flush the pipeline microprocessor 100 above the J stage 114 and to branch the pipeline microprocessor 100 to the next sequential instruction after the conditional branch instruction. In addition, the early branch correction logic 144 generates a true value on the early branch corrected signal 258, which is passed down the pipeline to the late branch correction logic 148; its use is described below with reference to Fig. 7. Flow ends at block 616.
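The decision flow of Fig. 6 (blocks 604 through 616) can be condensed into one function. This is a sketch under stated assumptions: the function name and its return convention (a redirect address plus the value of the early branch corrected signal 258) are hypothetical.

```python
def early_branch_correction(flags_valid, condition_met, predicted_taken,
                            target_address, next_sequential_address):
    """Return (redirect_address, corrected_early).

    redirect_address is None when the J stage takes no action."""
    if not flags_valid:
        return None, False               # block 604: no early correction possible
    if condition_met == predicted_taken:
        return None, False               # block 608: prediction was right
    if predicted_taken:
        # Block 616: predicted taken, but the branch should not be taken.
        return next_sequential_address, True
    # Block 614: predicted not taken, but the branch should be taken.
    return target_address, True
```

When the early flags are invalid the function declines to act, leaving any needed correction to the late branch correction logic.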
Fig. 7 is a flowchart illustrating the operation of the pipeline microprocessor 100 in performing late branch correction according to the present invention. Flow begins at block 702.
In block 702, a conditional branch instruction reaches the S stage 126. Flow proceeds to block 704.
In block 704, the late branch correction logic 148 examines the contents of the architected status register 162 to determine whether the condition specified by the condition code of the conditional branch instruction is satisfied. Flow proceeds to decision block 706.
In decision block 706, the late branch correction logic 148 determines, based on the examination in block 704, whether the prediction for the conditional branch instruction needs to be corrected. The conditional branch instruction needs to be corrected if the condition is satisfied by the architected status register 162, so that the conditional branch should be taken, but the branch predictor 132 predicted the branch not taken (as indicated by a false value on the S stage 126 version of the branch predicted taken signal 208 passed down the pipeline), thereby causing the pipeline microprocessor 100 to fetch the next sequential instruction. Conversely, the conditional branch instruction needs to be corrected if the condition is not satisfied by the architected status register 162, so that the conditional branch should not be taken, but the branch predictor 132 predicted the branch taken (as indicated by a true value on the S stage 126 version of the branch predicted taken signal 208 passed down the pipeline), thereby causing the pipeline microprocessor 100 to branch to the predicted branch target address. If the prediction for the conditional branch instruction needs to be corrected, flow proceeds to decision block 708; otherwise, flow ends.
In decision block 708, the late branch correction logic 148 examines the early branch corrected signal 258 to determine whether the misprediction of the conditional branch instruction has already been corrected by the early branch correction logic 144. If so, flow ends; otherwise, flow proceeds to decision block 712.
In decision block 712, having determined that the conditional branch instruction needs to be corrected, the late branch correction logic 148 examines the S stage 126 version of the branch predicted taken signal 208 passed down the pipeline to determine whether the conditional branch instruction was predicted taken. If so, flow proceeds to block 716; otherwise, flow proceeds to block 714.
In block 714, the late branch correction logic 148, via the first control signal 154, directs the I stage 102 to flush the pipeline microprocessor 100 above the S stage 126 and to branch the pipeline microprocessor 100 to the branch target address of the conditional branch instruction. Flow ends at block 714.
In block 716, the late branch correction logic 148, via the signal 154, directs the I stage 102 to flush the pipeline microprocessor 100 above the S stage 126 and to branch the pipeline microprocessor 100 to the next sequential instruction after the conditional branch instruction. Flow ends at block 716.
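The late correction flow of Fig. 7 differs from the early flow mainly in that it uses the architected flags and skips branches already corrected early. A minimal sketch follows; the function name and return convention are hypothetical.

```python
def late_branch_correction(condition_met, predicted_taken, corrected_early,
                           target_address, next_sequential_address):
    """Return the address to redirect fetch to, or None when the S stage
    takes no action. `condition_met` is evaluated against the architected
    status register 162."""
    if condition_met == predicted_taken:
        return None                      # block 706: prediction was right
    if corrected_early:
        return None                      # block 708: already fixed in the J stage
    if predicted_taken:
        return next_sequential_address   # block 716: should not have been taken
    return target_address                # block 714: should have been taken
```

The `corrected_early` check is what prevents a second, redundant flush for a misprediction that the J stage has already repaired.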
As described above, compared with a microprocessor that lacks the advantages provided by the early execution logic/address generator 138 and the early register file 136, the pipeline microprocessor 100 can make the result of a preceding instruction available as a register operand to a subsequent address-generating or non-address-generating instruction multiple clock cycles earlier, thereby reducing the number of pipeline bubbles incurred. Reducing pipeline bubbles reduces the average number of clock cycles per instruction, a major component of microprocessor performance. In addition, early results can be used to update the status flags sooner than was previously possible, which may allow conditional instructions to be executed sooner. Furthermore, compared with a microprocessor that lacks the advantages provided by the early branch correction logic 144, the pipeline microprocessor 100 can correct a mispredicted conditional branch instruction multiple clock cycles earlier. Finally, the demand for higher microprocessor clock frequencies leads microprocessor designers to increase the number of pipeline stages. As the number of pipeline stages increases, the number of pipeline bubbles incurred while waiting for instruction results and/or status flag updates may increase; likewise, the delay in correcting mispredicted branch instructions may also increase. These facts heighten the advantages of the microprocessor, apparatus, and method described herein.
Although the present invention and its objects, features, and advantages have been described in detail, other embodiments that do not depart from the spirit and scope of the invention are also encompassed. For example, although a microprocessor embodiment is described that substantially conforms to the x86 architecture, the apparatus and method described are not limited to the x86 architecture and may be employed in various microprocessor architectures. In addition, although an embodiment is described in which conditional branch instructions are corrected early, the advantages of generating the status flags early may be exploited to execute other instructions early. For example, the advantages described herein for the Jcc instruction apply equally to the x86 LOOPcc instruction; likewise, the x86 SETcc and CMOVcc instructions may be executed early so that their results are available to subsequent dependent instructions. Furthermore, in addition to being used to generate early status flag values, the early results may also be used to perform early branch correction of indirect branch instructions, commonly referred to as "jump through register" instructions, which specify the branch target address as a source register operand value.
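The early execution of a CMOVcc-style instruction mentioned above can be sketched as follows. This is a simplified software model, not the described hardware: the `EarlyFlags` class, the `early_cmov` function, and the choice of four condition codes are assumptions made for illustration. The condition encodings follow the x86 convention (E means ZF set, NE means ZF clear, B means CF set, AE means CF clear).

```python
from dataclasses import dataclass


@dataclass
class EarlyFlags:
    """Hypothetical early copy of two status flags plus a valid indicator."""
    zf: bool = False
    cf: bool = False
    valid: bool = False


def condition_met(cc: str, flags: EarlyFlags) -> bool:
    """Evaluate a small subset of x86 condition codes against the flags."""
    conditions = {
        "E": flags.zf,
        "NE": not flags.zf,
        "B": flags.cf,
        "AE": not flags.cf,
    }
    return conditions[cc]


def early_cmov(cc: str, dest: int, src: int, flags: EarlyFlags):
    """Return (value, resolved_early).

    If the early flags are invalid, the instruction is not resolved here
    and must wait for the architected flags later in the pipeline.
    """
    if not flags.valid:
        return dest, False
    value = src if condition_met(cc, flags) else dest
    return value, True


# Valid early flags with ZF set: the conditional move resolves early.
resolved = early_cmov("E", dest=1, src=2, flags=EarlyFlags(zf=True, valid=True))
# Invalid early flags: the result is deferred to the late execution units.
deferred = early_cmov("E", dest=1, src=2, flags=EarlyFlags(zf=True, valid=False))
print(resolved, deferred)
```

In this model a dependent instruction could consume the returned value as soon as `resolved_early` is true, instead of waiting for the late execution units.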
Moreover, in addition to implementations using hardware, the invention can be implemented in computer-readable code (e.g., computer-readable program code, data, etc.) stored in a computer-usable (e.g., readable) medium. The computer code causes the functions or the fabrication, or both, of the invention to be enabled. For example, this can be accomplished through the use of general programming languages (e.g., C, C++, JAVA, and the like); GDSII databases; hardware description languages (HDL), including Verilog HDL, VHDL, Altera HDL (AHDL), and so on; or other programming and/or circuit (i.e., schematic) capture tools available in the art. The computer code can be disposed in any known computer-usable (e.g., readable) medium, including semiconductor memory, magnetic disk, and optical disk (e.g., CD-ROM, DVD-ROM, and the like), and as a computer data signal embodied in a computer-usable (e.g., readable) transmission medium (e.g., a carrier wave or any other medium, including digital, optical, or analog-based media). As such, the computer code can be transmitted over communication networks, including the Internet and intranets. It is understood that the invention can be embodied in computer code (e.g., as part of an IP (intellectual property) core, such as a microcontroller core, or as a system-level design, such as a System on Chip (SOC)) and transformed to hardware as part of the production of integrated circuits. The invention may also be embodied as a combination of hardware and computer code.
Finally, those skilled in the art should appreciate that they can readily use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures that carry out the same purposes as the present invention without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (28)
1. An apparatus for generating early status flags that enables early execution of a conditional instruction in a pipeline microprocessor having at least one architected status flag, the apparatus comprising:
a storage unit for accumulating at least one early status flag corresponding to the at least one architected status flag; and
a logic circuit, coupled to the storage unit, that updates the early status flag in the storage unit according to an early result of at least one previous instruction preceding the conditional instruction, wherein the logic circuit invalidates the early status flag if the early result of any of the at least one previous instruction used to modify the early status flag is invalid, whereby, if the early status flag is valid, execution of the conditional instruction is enabled according to the early status flag before the pipeline microprocessor updates the architected status flag according to a final result of the previous instruction.
2. The apparatus for generating early status flags of claim 1, wherein the early result comprises results of instructions that form a subset of an instruction set supported by the pipeline microprocessor.
3. The apparatus for generating early status flags of claim 2, wherein the subset of instructions comprises an instruction that performs a simple arithmetic operation, an instruction that performs a simple shift operation, an instruction that performs a simple Boolean logic operation, or an instruction commonly used to update the architected status flag for use as a condition code specified by a conditional instruction.
4. The apparatus for generating early status flags of claim 2, wherein the early result is generated before an execution unit of the pipeline microprocessor executes the previous instruction to generate the final result of the previous instruction.
5. The apparatus for generating early status flags of claim 1, further comprising:
One early stage execution logic circuit, it is connected with this logical circuit, is used for producing before referring to
6. The apparatus for generating early status flags of claim 5, wherein the early result used to modify the early status flag is valid if the previous instruction specifies an operation that the early execution logic circuit can perform and all input operands used by the early execution logic circuit to generate the early result are valid.
7. The apparatus for generating early status flags of claim 5, wherein the early execution logic circuit operates in an address generation stage of the pipeline microprocessor, or in a stage of the pipeline microprocessor that immediately follows the stage containing an architected register file of the pipeline microprocessor.
8. The apparatus for generating early status flags of claim 5, further comprising:
an early register storage for storing an early register file, coupled to the early execution logic circuit, the early register file having a plurality of registers that correspond to at least one register of an architected register file of the pipeline microprocessor, wherein the registers of the early register file are selectively valid.
9. The apparatus for generating early status flags of claim 8, wherein the logic circuit invalidates the early status flag if one of the registers of the early register file provides to the early execution logic circuit an input operand for generating the early result of one of the at least one previous instruction used to modify the early status flag, and the input operand is invalid.
10. The apparatus for generating early status flags of claim 5, wherein the early execution logic circuit executes a subset of the instructions executable by the pipeline microprocessor, and wherein the logic circuit invalidates the early status flag if one of the at least one previous instruction used to modify the early status flag is not in the subset of instructions and the input operand is invalid.
11. A pipeline microprocessor having an architected status flag that is not selectively valid, and having an instruction set that includes a conditional instruction specifying a condition and an operation, wherein the pipeline microprocessor performs the operation if the condition is satisfied, the pipeline microprocessor comprising:
an early status flag register for storing at least one early status flag corresponding to the architected status flag stored in an architected register, wherein the early status flag is selectively valid; and
an early execution logic circuit, coupled to receive the early status flag, that performs the operation specified by the conditional instruction if the early status flag is valid and the early status flag satisfies the condition specified by the conditional instruction.
12. The pipeline microprocessor of claim 11, further comprising:
a final execution logic circuit, coupled to receive the architected status flag, that performs the operation if the architected status flag satisfies the condition and the early execution logic circuit did not perform the operation.
13. The pipeline microprocessor of claim 11, further comprising:
a logic circuit that generates the early status flag according to a previous instruction preceding the conditional instruction, wherein the previous instruction specifies a modification of the architected status flag.
14. The pipeline microprocessor of claim 13, further comprising:
an early register storage for storing an early register file, operatively coupled to the logic circuit, the early register file having a plurality of registers that correspond to registers of an architected register file of the pipeline microprocessor, wherein the registers of the early register file are selectively valid.
15. The pipeline microprocessor of claim 14, wherein at least one input operand of the previous instruction is to be provided from the architected register file, wherein the logic circuit generates the early status flag according to an early result of the previous instruction, and wherein the early result of the previous instruction is generated based on the input operand as provided by the early register file rather than by the architected register file.
16. The pipeline microprocessor of claim 15, wherein the early status flag is invalid if the input operand provided by the early register file is invalid.
17. The pipeline microprocessor of claim 15, wherein the early status flag is invalid if the input operand provided to the previous instruction is invalid.
18. The pipeline microprocessor of claim 13, wherein:
the early execution logic circuit is coupled to the logic circuit and executes a subset of the instructions executable by the pipeline microprocessor, and wherein the early status flag is invalid if the previous instruction is not in the subset of instructions.
19. A method for generating a status flag early in a pipeline microprocessor to enable early execution of a conditional instruction that conditionally performs an operation according to a condition of the status flag specified by the conditional instruction, the method comprising:
generating a first instance of the status flag according to a previous instruction preceding the conditional instruction, wherein the first instance may be invalid;
after generating the first instance, generating a second instance of the status flag according to the previous instruction, wherein the second instance is always valid; and
performing the operation before the second instance is generated if the first instance of the status flag satisfies the condition and the first instance is valid.
20. The method for generating a status flag early in a pipeline microprocessor of claim 19, further comprising:
updating an architected register to store the status flag after generating the second instance.
21. The method for generating a status flag early in a pipeline microprocessor of claim 19, further comprising:
generating an early result of the previous instruction before generating the first instance;
determining whether the early result of the previous instruction is valid; and
invalidating the first instance if the early result is invalid.
22. The method for generating a status flag early in a pipeline microprocessor of claim 21, wherein determining whether the early result of the previous instruction is valid comprises:
determining whether the early result of the previous instruction was generated using valid input operands; and
indicating that the early result is invalid if an input operand is invalid.
23. The method for generating a status flag early in a pipeline microprocessor of claim 21, wherein determining whether the early result of the previous instruction is valid comprises:
determining whether the previous instruction is an instruction executable by an early execution logic circuit of the pipeline microprocessor; and
indicating that the early result is invalid if the previous instruction is not an instruction executable by the early execution logic circuit.
24. The method for generating a status flag early in a pipeline microprocessor of claim 23, wherein the early execution logic circuit generates the early result before an execution unit of the pipeline microprocessor generates a valid instance.
25. The method for generating a status flag early in a pipeline microprocessor of claim 21, further comprising:
determining whether the previous instruction modifies the status flag; and
invalidating the first instance only if the previous instruction modifies the status flag.
26. The method for generating a status flag early in a pipeline microprocessor of claim 19, wherein generating the first instance does not require stalling the pipeline of the pipeline microprocessor, regardless of whether an input operand of the previous instruction is valid.
27. The method for generating a status flag early in a pipeline microprocessor of claim 19, further comprising:
determining whether the pipeline of the pipeline microprocessor has been flushed; and
when the pipeline of the pipeline microprocessor has been flushed, copying the architected status flag to the first instance of the status flag and marking the first instance of the status flag valid.
28. The method for generating a status flag early in a pipeline microprocessor of claim 19, further comprising:
determining whether all status-flag-modifying instructions in each pipeline stage after the stage of the pipeline that generates the first instance have updated the architected status flag of the pipeline microprocessor; and
if the status-flag-modifying instructions have updated the architected status flag of the pipeline microprocessor, copying the architected status flag to the first instance of the status flag and marking the first instance of the status flag valid.
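The validity rules recited in claims 19 through 28 can be sketched as a small software model. This is a minimal sketch under stated assumptions, not the claimed hardware: the class `EarlyFlagsModel`, its method names, and the `pending` counter are hypothetical; the flags are modeled as a bit mask using the x86 EFLAGS positions for CF (bit 0) and ZF (bit 6); and the model assumes that once the early (first) instance is invalid it stays invalid until one of the two revalidation conditions occurs.

```python
from dataclasses import dataclass

CF = 0x01  # x86 EFLAGS carry flag, bit 0
ZF = 0x40  # x86 EFLAGS zero flag, bit 6


@dataclass
class EarlyFlagsModel:
    arch_flags: int = 0       # later instance of the flags: always valid
    early_flags: int = 0      # early instance of the flags: may be invalid
    early_valid: bool = True
    pending: int = 0          # flag-modifying instructions not yet retired (assumed counter)

    def early_stage(self, modifies_flags, result_flags, result_valid):
        """Process one instruction at the early flag generation stage."""
        if not modifies_flags:
            return  # instructions that do not modify flags leave validity alone
        self.pending += 1
        if result_valid and self.early_valid:
            self.early_flags = result_flags
        else:
            self.early_valid = False  # invalid early result invalidates the early flags

    def late_stage(self, final_flags):
        """A flag-modifying instruction updates the architected flags."""
        self.arch_flags = final_flags
        self.pending -= 1
        if self.pending == 0:
            self._revalidate()  # no flag modifiers remain below the early stage

    def flush(self):
        """A pipeline flush discards all in-flight instructions."""
        self.pending = 0
        self._revalidate()

    def _revalidate(self):
        self.early_flags = self.arch_flags
        self.early_valid = True

    def early_condition(self, mask):
        """Return True or False if resolvable early, or None if not."""
        if not self.early_valid:
            return None
        return bool(self.early_flags & mask)


model = EarlyFlagsModel()
model.early_stage(True, ZF, True)      # valid early result sets ZF
first = model.early_condition(ZF)      # resolvable early
model.early_stage(True, 0, False)      # invalid early result
second = model.early_condition(ZF)     # not resolvable early
model.late_stage(ZF)                   # first instruction retires
model.late_stage(CF)                   # second retires, flags are revalidated
third = model.early_condition(CF)
print(first, second, third)
```

A conditional instruction that receives `None` from `early_condition` would simply wait for the architected flags, which matches the requirement that generating the early instance never stalls the pipeline.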
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/771,678 | 2004-02-04 | ||
US10/771,678 US7100024B2 (en) | 2003-02-04 | 2004-02-04 | Pipelined microprocessor, apparatus, and method for generating early status flags |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1629802A (en) | 2005-06-22 |
CN100343799C (en) | 2007-10-17 |
Family
ID=34860772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2005100051447A Active CN100343799C (en) | 2004-02-04 | 2005-01-28 | Apparatus and method for generating early status flags of pipelined microprocessor |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN100343799C (en) |
TW (1) | TWI273485B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8521996B2 (en) * | 2009-02-12 | 2013-08-27 | Via Technologies, Inc. | Pipelined microprocessor with fast non-selective correct conditional branch instruction resolution |
US9052890B2 (en) * | 2010-09-25 | 2015-06-09 | Intel Corporation | Execute at commit state update instructions, apparatus, methods, and systems |
CN105993000B (en) * | 2013-10-27 | 2021-05-07 | 超威半导体公司 | Processor and method for floating point register aliasing |
CN107193768B (en) * | 2016-03-15 | 2021-06-29 | 厦门旌存半导体技术有限公司 | Method and device for inquiring queue state |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5944812A (en) * | 1996-12-13 | 1999-08-31 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
CN1397879A (en) * | 2001-05-04 | 2003-02-19 | 智慧第一公司 | Appts. system and method of imaginary branch target address high speed buffer storage branch |
US6647489B1 (en) * | 2000-06-08 | 2003-11-11 | Ip-First, Llc | Compare branch instruction pairing within a single integer pipeline |
CN1460928A (en) * | 2002-11-15 | 2003-12-10 | 威盛-赛瑞斯公司 | System and method for renewing logical circuit optimization of state register |
- 2004-09-16 TW TW93128090A patent/TWI273485B/en active
- 2005-01-28 CN CNB2005100051447A patent/CN100343799C/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5944812A (en) * | 1996-12-13 | 1999-08-31 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
US6647489B1 (en) * | 2000-06-08 | 2003-11-11 | Ip-First, Llc | Compare branch instruction pairing within a single integer pipeline |
CN1397879A (en) * | 2001-05-04 | 2003-02-19 | 智慧第一公司 | Appts. system and method of imaginary branch target address high speed buffer storage branch |
CN1460928A (en) * | 2002-11-15 | 2003-12-10 | 威盛-赛瑞斯公司 | System and method for renewing logical circuit optimization of state register |
Also Published As
Publication number | Publication date |
---|---|
CN1629802A (en) | 2005-06-22 |
TWI273485B (en) | 2007-02-11 |
TW200527287A (en) | 2005-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1658154A (en) | Pipeline work micro processor, apparatus and method for performing early correction of conditional branch instruction mispredictions | |
KR101754462B1 (en) | Method and apparatus for implementing a dynamic out-of-order processor pipeline | |
CN1629801A (en) | Pipeline type microprocessor, device and method for generating early stage instruction results | |
JP5889986B2 (en) | System and method for selectively committing the results of executed instructions | |
JP3565504B2 (en) | Branch prediction method in processor and processor | |
CN1129843C (en) | Use composite data processor system and instruction system | |
CN1188778C (en) | Zoning transmit queue and distribution strategy | |
CN1095117C (en) | Forwarding of results of store instructions | |
CN1148650C (en) | Microprocessor and method for processing instruction thereby | |
US20160055004A1 (en) | Method and apparatus for non-speculative fetch and execution of control-dependent blocks | |
US7117347B2 (en) | Processor including fallback branch prediction mechanism for far jump and far call instructions | |
US20050216714A1 (en) | Method and apparatus for predicting confidence and value | |
CN1742257A (en) | Data speculation based on addressing patterns identifying dual-purpose register | |
JP2004145903A (en) | Superscalar microprocessor | |
CN101681259A (en) | A system and method for using a local condition code register for accelerating conditional instruction execution in a pipeline processor | |
KR101723711B1 (en) | Converting conditional short forward branches to computationally equivalent predicated instructions | |
JP5301554B2 (en) | Method and system for accelerating a procedure return sequence | |
CN100343799C (en) | Apparatus and method for generating early status flags of pipelined microprocessor | |
CN1260647C (en) | Locking source registers in data processing apparatus | |
US7426631B2 (en) | Methods and systems for storing branch information in an address table of a processor | |
CN1035190A (en) | Microcode based on operand length and contraposition shifts | |
CN1068445C (en) | Superscalar microprocessor instruction pipeline including instruction dispatch and release control | |
JP2001060152A (en) | Information processor and information processing method capable of suppressing branch prediction | |
TWI284282B (en) | Processor including branch prediction mechanism for far jump and far call instructions | |
EP4020170A1 (en) | Methods, systems, and apparatuses to optimize partial flag updating instructions via dynamic two-pass execution in a processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |