US6003127A - Pipeline processing apparatus for reducing delays in the performance of processing operations - Google Patents
- Publication number
- US6003127A (application No. 08/725,709)
- Authority
- US
- United States
- Prior art keywords
- instruction
- stage
- branch
- processing
- processing cycle
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30058—Conditional branch instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/3016—Decoding the operand specifier, e.g. specifier format
- G06F9/30167—Decoding the operand specifier, e.g. specifier format of immediate specifier, e.g. constants
Definitions
- the immediate branch processing can be quickly performed.
- FIG. 3 is a block diagram showing a micro processor in accordance with a second embodiment of the present invention. This micro processor realizes the speed-up of the processing in response to the register direct branch instruction.
- the micro processor of the second embodiment comprises bidirectional switch 5-8 provided in the output bus 6-6 of operating section 2-1 in the same manner as the first embodiment in addition to the components of the conventional micro processor shown in FIG. 6. Furthermore, the micro processor of the second embodiment comprises a bus 6-10 connecting an intermediate point between bidirectional switch 5-4 and operating section 2-1 and an intermediate point between bidirectional switch 5-8 and program counter 2-3. Bus 6-10, associated with a bidirectional switch 5-9, has a function of allowing the direct setting to program counter 2-3.
- bidirectional switches 5-3 and 5-9 are opened while bidirectional switch 5-8 is closed.
- the data (register value) in the register file 2-2 is directly set to program counter 2-3 via a direct-setting bus 6-10. Accordingly, in the same manner as in the first embodiment, passing through operating section 2-1 is no longer required to complete the setting of the branch address at the stage preceding the EX stage. Thus, the branch processing can be immediately started.
- the register direct branch processing can be quickly performed.
- FIG. 4 is a block diagram showing a micro processor in accordance with a third embodiment of the present invention which is substantially the combination of the above-described first embodiment and the second embodiment.
- A fourth embodiment of the present invention will be explained with reference to FIGS. 8 through 10.
- FIG. 8 shows the arrangement of the micro processor in accordance with the fourth embodiment.
- the micro processor of the fourth embodiment comprises decoder 1 which reads out an intended instruction from a memory (ROM 8 or RAM 9) via data bus 3 in accordance with the data of address bus 4 and decodes the readout instruction, and data path 2 controlled by decoder 1 via a control bus 5.
- Data path 2 comprises operating section 2-1 performing logical operations, arithmetic operations, shift operations and so on, register file 2-2 storing the computation data, and program counter 2-3 counting the address of the present program.
- the flag of operation result is supplied to decoder 1 via a flag bus 6.
- FIG. 9 shows the detailed arrangement of decoder 1.
- An instruction register 1-1 stores the instruction sent through data bus 3.
- a condition register 1-2 varies in response to the value of instruction register 1-1, the present control condition and the comparison signal 1-13.
- a condition register 1-3 memorizes the condition of the conditional branch instruction.
- a flag register 1-4 memorizes the operation result flag entered from data path 2 through flag bus 6.
- a comparator 1-5 compares the memorized data between condition register 1-3 and flag register 1-4, and then generates a comparison signal 1-13 based on the comparison result.
- EX register 1-7, MA register 1-9 and WB register 1-11 store the control information relating to the EX stage, MA stage and WB stage, respectively.
- ID decoder 1-6, EX decoder 1-8, MA decoder 1-10 and WB decoder 1-12 decode the control information of respective stages and send out the control signals to corresponding units of data path 2 via control bus 5.
- Each of the above-described registers operates in synchronism with a system clock (not shown).
- The characteristic feature of decoder 1 in accordance with this fourth embodiment resides in the provision of comparator 1-5.
- Comparator 1-5 is disposed in parallel with EX register 1-7. Accordingly, the fourth embodiment of the present invention makes it possible to execute the above-described comparison in this comparator 1-5 independently of the control and execution of the EX stage by EX register 1-7 and EX decoder 1-8.
- FIG. 10 shows the pipeline processing flow in the micro processor comprising the decoder shown in FIG. 9.
- each instruction processing cycle is dissected or divided into sequential five stages of IF (instruction fetch), ID (instruction decode), EX (execution of operation), MA (memory access) and WB (write back). Respective instruction processing cycles are processed in parallel or in a concurrent manner. In other words, a plurality of instruction processing cycles are executed at timings overlapped partly, so as to realize a 5-stage pipeline processing system.
- the branch condition of the conditional branch instruction is sent from ID decoder 1-6 to condition register 1-3 and is stored there.
- the operation result flag of the n-1 instruction processing cycle is sent from data path 2 through flag bus 6 to flag register 1-4 and is stored there.
- the branch condition stored in condition register 1-3 and the operation result flag stored in flag register 1-4 are both entered into comparator 1-5.
- Comparator 1-5 generates a comparison signal indicating whether or not the branch condition is satisfied. This comparison is executed at the time when EX register 1-7 and EX decoder 1-8 cooperatively cause the units of data path 2 to execute the processing for the branch address.
- n+2 instruction is changed to the NOP instruction when the value stored in condition register 1-2 is varied in response to comparison signal 1-13.
- the branch address calculated at the EX stage of the n instruction processing cycle is sent out to address bus 4 to jump to the designated address, thereby changing the program.
- the length of the ID stage in the instruction processing cycle which fetched the conditional branch instruction is influenced only by the operation time, not by the sum of the operation time and the comparison time. Hence, it becomes possible to shorten the entire processing time.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
Abstract
A pipeline processing apparatus for performing processing operations in a succession of processing cycles, in which each cycle is composed of a succession of stages that include an instruction decoding stage for decoding an instruction associated with the cycle and an execution stage for executing an operation dependent on the instruction, and the processing cycles include a first cycle which starts at a first time and a second cycle that begins at a second time that is after the first time and that overlaps the first cycle in time. The apparatus is constructed and controlled for causing a branch instruction to be decoded in the instruction decoding stage of the first cycle; and for effecting a calculation in the execution stage of the first cycle, dependent on the branch instruction decoded in the instruction decoding stage of the first cycle.
Description
1. Field of the Invention
This invention relates to a speed-up technique for realizing a quick processing in response to a branch instruction or in an exceptional treatment or handling based on an internal vector.
2. Related Art
In general, instruction address operations in a micro processor are classified into two categories, i.e. fixed command operations in accordance with ordinary instructions and non-fixed command operations in accordance with branch instructions or the like.
In the case of ordinary instructions, the instruction addresses are calculated during an IF (instruction fetch) stage. In the case of branch instructions, on the other hand, the instruction addresses are calculated during an EX (execution of operation) stage. Branch instructions generally use addressing modes such as a program counter relative branch, an immediate branch, and a register direct branch. Among them, the program counter relative branch is an instruction that computes the branch address at the EX stage by operating on the program counter and a relative value. The immediate branch and the register direct branch, on the other hand, both treat their branch addresses as immediate values, and hence they set an immediate value to the program counter in the EX stage (refer to the H8/327 and SH7032 Programming Manuals of Hitachi, or RISC System by K. Ohmori, Kaibundo Publishing Co., Ltd.).
As an example of this kind of conventional technology, a micro processor adopting a 5-stage pipeline processing system will be explained hereinafter.
The five sequential stages of this pipeline processing consist of IF (instruction fetch), ID (instruction decode), EX (execution of operation), MA (memory access) and WB (write back) stages. FIG. 5 shows the sequential flow representing the ordinary 5-stage pipeline processing.
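The overlap among instruction processing cycles can be pictured with a small sketch. The following Python snippet is illustrative only: the function name `pipeline_chart` and the assumption that one instruction enters the pipeline per clock without stalls are not taken from the patent text.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]

def pipeline_chart(num_instructions):
    """Return one row per instruction; column t holds the stage active at clock t.

    Assumption (not from the patent): an ideal pipeline in which a new
    instruction processing cycle starts every clock and nothing stalls.
    """
    total_clocks = num_instructions + len(STAGES) - 1
    chart = []
    for i in range(num_instructions):
        row = ["--"] * total_clocks          # "--" marks an idle slot
        for s, name in enumerate(STAGES):
            row[i + s] = name                # instruction i is shifted by i clocks
        chart.append(row)
    return chart

for row in pipeline_chart(3):
    print(" ".join(row))
```

Each row is shifted one clock to the right of the row above it, which is the partly overlapped timing that a 5-stage pipeline relies on.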
FIG. 6 shows the arrangement of a conventional micro processor which comprises a decoder 1 and a data path 2. Data path 2 comprises an operating section 2-1 performing logical operations, arithmetic operations, shift operations and so on, a register file 2-2 storing the operation or computation data, a program counter 2-3 counting the address of the present program, and an address unit 2-4 selectively switching the output to an address bus 4 from operating section 2-1 or program counter 2-3. Operating section 2-1 through address unit 2-4 are respectively controlled in response to the signals of control buses 7-1 through 7-4 fed from decoder 1.
The micro processor designates an address in a memory (not shown) by outputting data through address bus 4, reads out the instruction stored in the designated address through data bus 3, and then decodes the readout instruction in decoder 1, thereby controlling the data path 2.
An immediate bus 6-1 is provided between decoder 1 and operating section 2-1. Read buses 6-2 and 6-4 are provided to read out the data from register file 2-2. Reference numerals 6-3 and 6-5 represent input buses of operating section 2-1, while 6-6 represents an output bus of operating section 2-1. Furthermore, reference numeral 6-7 represents a read bus of program counter 2-3, and 6-8 represents an input bus of address unit 2-4. There are also provided a plurality of bidirectional switches 5-1 through 5-6 to switch the above-described buses 6-1, 6-2, 6-4, 6-5, 6-6 and 6-7.
Regarding operation timing, each stage of IF through WB has the following relationship or correspondence to each of units 2-1 through 2-4 constituting the data path 2.
Program counter 2-3 operates during the IF stage. Decoder 1 (control section) and register file 2-2 operate during the ID stage. Operating section 2-1 operates during the EX stage. Address unit 2-4 operates during the MA stage. Register file 2-2 also operates during the WB stage.
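The stage-to-unit correspondence listed above can be written down as a simple lookup table. This is only a restatement of the paragraph in Python; the names `STAGE_UNITS` and `stages_using` are hypothetical helpers introduced for illustration.

```python
# Units active in each stage, as listed in the paragraph above.
STAGE_UNITS = {
    "IF": ["program counter 2-3"],
    "ID": ["decoder 1", "register file 2-2"],
    "EX": ["operating section 2-1"],
    "MA": ["address unit 2-4"],
    "WB": ["register file 2-2"],
}

def stages_using(unit):
    """Return the stages, in pipeline order, during which a unit operates."""
    return [stage for stage, units in STAGE_UNITS.items() if unit in units]

print(stages_using("register file 2-2"))
```

The lookup makes it visible that the register file is the only unit named in two stages (read in ID, write in WB).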
The immediate branch instruction in the above-described conventional micro processor is executed according to the pipeline flow shown in FIG. 7. More specifically, the branch address decoded in decoder 1 is entered from immediate bus 6-1 to operating section 2-1 in EX stage 301 and then is set through output bus 6-6 to program counter 2-3.
Similarly, in executing the register direct branch instruction in the above-described micro processor, the branch address is read out from register file 2-2 and is entered through buses 6-4 and 6-5 to operating section 2-1, and is then set via bus 6-6 to program counter 2-3.
In this manner, according to the above-described conventional micro processor, the branch address is always set to the program counter 2-3 via operating section 2-1 in response to the branch instruction or in the exceptional treatment. Hence, in terms of the processing stage flow, the above-described micro processor is forced to pass through the EX stage every time, resulting in a significant delay in processing.
Furthermore, FIG. 11 shows the relationship between the instruction processing cycle and each stage in a micro processor adopting the 5-stage pipeline processing system.
In this example, it is assumed that a conditional branch instruction is fetched in the n instruction processing cycle. This conditional branch instruction is decoded in the ID stage. The ID stage must then wait for the operation result coming from the EX stage of the immediately preceding n-1 instruction processing cycle. Then, still within the ID stage of the n instruction processing cycle, the operation result obtained from the n-1 instruction processing cycle is compared with the branch condition. Thereafter, the processing flow proceeds to the EX stage, in which the branch address is calculated.
For this reason, the ID stage of the n instruction processing cycle requires a relatively long time L, equal to the sum of a first duration spent waiting for the operation result of the n-1 instruction processing cycle and a second duration spent comparing that operation result with the condition of the branch instruction. In other words, in the above-described conventional pipeline processing, the stage that takes the longest time delays the processing of the other stages.
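The effect of the long ID stage can be illustrated with a short calculation. The sketch below is not part of the patent: the function names and all timing figures are hypothetical, and it assumes a lock-step pipeline in which every stage is paced by the slowest one.

```python
def id_stage_time(t_wait, t_compare):
    """Conventional ID stage length L: wait for the n-1 result, then compare."""
    return t_wait + t_compare

def pipeline_period(stage_times):
    """Assumed lock-step pipeline: the slowest stage sets the pace for all."""
    return max(stage_times.values())

# Hypothetical figures in nanoseconds, chosen only for illustration.
stage_times = {
    "IF": 10,
    "ID": id_stage_time(t_wait=10, t_compare=4),
    "EX": 10,
    "MA": 10,
    "WB": 10,
}
print(pipeline_period(stage_times))  # prints 14
```

With these assumed numbers the comparison adds 4 ns to the ID stage alone, yet every stage of every instruction processing cycle is stretched from 10 ns to 14 ns.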
Accordingly, in view of the above-described problems encountered in the related art, a principal object of the present invention is to realize quick processing in response to a branch instruction and in the exceptional treatment.
Furthermore, another object of the present invention is to effectively execute the comparison processing in response to a given conditional branch so as to shorten the processing time.
In order to accomplish this and other related objects, according to a first aspect of the present invention, a micro processor comprises: a decoder for decoding a branch instruction; an operating section for executing logical, arithmetic, and shift operations; a program counter for counting the address of the present program; a direct-setting bus for allowing the decoder to directly set an immediate value to the program counter without passing through an output bus of the operating section; and a switch for selectively connecting the direct-setting bus or the output bus to the program counter.
According to a second aspect of the present invention, a micro processor comprises: an operating section for executing logical, arithmetic, and shift operations; a register file for storing an operation result of the operating section; a program counter for counting the address of the present program; a direct-setting bus for allowing the register file to directly set a register value to the program counter without passing through an output bus of the operating section; and a switch for selectively connecting the direct-setting bus or the output bus to the program counter.
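The switch recited in the first and second aspects behaves like a two-input selector in front of the program counter. The following behavioral sketch is an assumption-laden illustration, not the claimed circuit: `PcSource` and `next_program_counter` are hypothetical names, and the bus values are arbitrary.

```python
from enum import Enum

class PcSource(Enum):
    """Which bus the switch connects to the program counter (hypothetical model)."""
    OUTPUT_BUS = "output bus of the operating section"
    DIRECT_BUS = "direct-setting bus"

def next_program_counter(source, output_bus_value, direct_bus_value):
    """Return the value loaded into the program counter.

    Exactly one bus is connected at a time. With DIRECT_BUS selected, the
    immediate value (first aspect) or register value (second aspect) reaches
    the program counter without passing through the operating section.
    """
    if source is PcSource.DIRECT_BUS:
        return direct_bus_value
    return output_bus_value

# A branch takes its address from the direct-setting bus.
print(hex(next_program_counter(PcSource.DIRECT_BUS, 0x0104, 0x2000)))
```

Modeling the selection as a pure function keeps the point of the two aspects visible: only the source of the program counter value changes, not the program counter itself.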
Furthermore, a third aspect of the present invention provides a pipeline processing apparatus for dividing each instruction processing cycle into a plurality of sequential stages, and executing the processing of respective stages in parallel at partly overlapping timings. This pipeline processing apparatus comprises: a means for fetching a branch instruction; and a means for executing a comparison of a branch condition relating to the branch instruction at a timing for executing an operating stage of an instruction processing cycle which fetched the branch instruction, the operating stage of the instruction processing cycle being provided to calculate a branch address.
According to the features of preferred embodiments, it is desirable that the instruction processing cycle which fetched the branch instruction calculates the branch address without waiting for the comparison result of the branch condition, so that the calculation of the branch address and the comparison of the branch condition can be executed in a concurrent manner.
Furthermore, a fourth aspect of the present invention provides a decoding apparatus comprising a group of registers and a group of decoders, for controlling each unit in a data path so as to control a plurality of sequential stages of a pipeline processing, wherein a comparator for judging a branch condition relating to a branch instruction is disposed in parallel to a register dedicated to an operating stage of the pipeline processing.
According to the features of the preferred embodiments, it is preferable that the registers comprise a condition register for memorizing a decoding result of the branch instruction at a decode stage and a flag register for memorizing an operation result flag at the operation stage through the data path, and the data memorized in the condition register and the flag register are entered into the comparator.
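The relationship between the condition register, the flag register and the comparator can be sketched as follows. Everything specific here is an assumption made for illustration: the patent text does not define the flag names (`zero`, `negative`), the condition mnemonics (`EQ`, `NE`, `MI`, `PL`), or the function names `compare` and `ex_stage`.

```python
def compare(condition, flags):
    """Hypothetical comparator: True when the stored branch condition holds.

    `condition` stands for the content of the condition register and
    `flags` for the content of the flag register.
    """
    checks = {
        "EQ": flags["zero"],
        "NE": not flags["zero"],
        "MI": flags["negative"],
        "PL": not flags["negative"],
    }
    return checks[condition]

def ex_stage(pc, displacement, condition, flags):
    """Produce the branch address and the comparison result in the same stage.

    In hardware the two computations would proceed concurrently; Python
    runs them one after the other, but neither depends on the other.
    """
    branch_address = pc + displacement   # work of the data path
    taken = compare(condition, flags)    # work of the comparator
    return branch_address, taken

print(ex_stage(0x100, 0x20, "EQ", {"zero": True, "negative": False}))
```

Because the address calculation never reads `taken`, the comparison can be moved out of the decode stage without delaying the branch address.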
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed description which is to be read in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic block diagram showing the overall arrangement of a micro processor in accordance with a first embodiment of the present invention;
FIG. 2 is a view showing the operation of a pipeline processing system in response to an immediate branch instruction in accordance with the first embodiment of the present invention;
FIG. 3 is a schematic block diagram showing the overall arrangement of a micro processor in accordance with a second embodiment of the present invention;
FIG. 4 is a schematic block diagram showing the overall arrangement of a micro processor in accordance with a third embodiment of the present invention;
FIG. 5 is a view showing the operation of a pipeline processing system in a conventional micro processor;
FIG. 6 is a schematic block diagram showing the overall arrangement of the conventional micro processor;
FIG. 7 is a view showing the operation of a pipeline processing system in the conventional micro processor;
FIG. 8 is a schematic block diagram showing the overall arrangement of a micro processor in accordance with a fourth embodiment of the present invention;
FIG. 9 is a block diagram showing the detailed arrangement of a decoder unit incorporated in the micro processor in accordance with the fourth embodiment of the present invention;
FIG. 10 is a view showing the operation of a pipeline processing system in accordance with the fourth embodiment of the present invention; and
FIG. 11 is a view showing the operation of a pipeline processing system in the conventional micro processor.
Preferred embodiments of the present invention will be explained in greater detail hereinafter with reference to the accompanying drawings. Identical parts are denoted by the same reference numerals throughout the views.
FIG. 1 is a block diagram showing a micro processor in accordance with a first embodiment of the present invention. This micro processor realizes the speed-up of the processing in response to the immediate branch instruction.
The micro processor of the first embodiment, as understood from the comparison between FIGS. 1 and 6, comprises an immediate bus 6-9 associated with a bidirectional switch 5-7 and a bidirectional switch 5-8 provided in the output bus 6-6 in addition to the components provided in the conventional micro processor shown in FIG. 6. Immediate bus 6-9 with bidirectional switch 5-7 has a function of allowing decoder 1 to directly set an immediate value to program counter 2-3.
FIG. 2 shows the pipeline flow of the immediate branch processing in this micro processor. More specifically, an immediate branch instruction is fetched at IF stage 500; this instruction is then decoded by decoder 1 at ID stage 501 and, at the same time, its immediate value is directly set through immediate bus 6-9 to program counter 2-3. Accordingly, there is no need to execute the next EX stage. In other words, the processing can be started from the n+2 processing cycle by directly fetching the instruction from the branch address. In this case, switch 5-7 is opened and switch 5-8 is closed, thereby switching the input bus of data to program counter 2-3.
In this manner, according to the first embodiment, the immediate branch processing can be quickly performed.
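The timing benefit described for FIG. 2 can be sketched as follows. This is only an illustrative model, not part of the disclosed apparatus: the function name `first_target_fetch_cycle` is hypothetical, each stage is assumed to occupy exactly one time slot, and the conventional path is assumed to write program counter 2-3 during the EX stage.

```python
# Illustrative model only: one time slot per stage, cycle k starts its IF stage in slot k.
STAGES = ["IF", "ID", "EX"]


def first_target_fetch_cycle(n, pc_set_stage):
    """Return the first processing cycle whose IF stage can fetch from the branch address.

    n            -- processing cycle that fetched the branch instruction
    pc_set_stage -- stage of cycle n in which the program counter receives the branch address
    """
    # Stage s of cycle n occupies time slot n + index(s).
    slot_pc_written = n + STAGES.index(pc_set_stage)
    # Assumption: the IF stage starting in the following slot sees the new counter value.
    return slot_pc_written + 1


# First embodiment: the immediate value reaches the program counter during the ID stage.
print(first_target_fetch_cycle(0, "ID"))  # cycle n+2
# Assumed conventional path: the program counter is written during the EX stage.
print(first_target_fetch_cycle(0, "EX"))  # cycle n+3
```

Under these assumptions the model reproduces the statement above that processing can start from the n+2 processing cycle when the program counter is set during the ID stage.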
FIG. 3 is a block diagram showing a micro processor in accordance with a second embodiment of the present invention. This micro processor realizes the speed-up of the processing in response to the register direct branch instruction.
The micro processor of the second embodiment, as understood from the comparison between FIGS. 3 and 6, comprises bidirectional switch 5-8 provided in the output bus 6-6 of operating section 2-1 in the same manner as the first embodiment in addition to the components of the conventional micro processor shown in FIG. 6. Furthermore, the micro processor of the second embodiment comprises a bus 6-10 connecting an intermediate point between bidirectional switch 5-4 and operating section 2-1 and an intermediate point between bidirectional switch 5-8 and program counter 2-3. Bus 6-10, associated with a bidirectional switch 5-9, has a function of allowing the direct setting to program counter 2-3.
When the register direct branch processing is performed, bidirectional switches 5-3 and 5-9 are opened while bidirectional switch 5-8 is closed. With this switching operation, the data (register value) in the register file 2-2 is directly set to program counter 2-3 via a direct-setting bus 6-10. Accordingly, in the same manner as in the first embodiment, passing through operating section 2-1 is no longer required to complete the setting of the branch address at the stage preceding the EX stage. Thus, the branch processing can be immediately started.
In this manner, according to the second embodiment, the register direct branch processing can be quickly performed.
FIG. 4 is a block diagram showing a micro processor in accordance with a third embodiment of the present invention which is substantially the combination of the above-described first embodiment and the second embodiment.
According to this third embodiment, it becomes possible to speed up the operation of the micro processor in both the immediate branch processing and the register direct branch processing.
A fourth embodiment of the present invention will be explained with reference to FIGS. 8 through 10.
FIG. 8 shows the arrangement of the micro processor in accordance with the fourth embodiment. The micro processor of the fourth embodiment comprises decoder 1 which reads out an intended instruction from a memory (ROM 8 or RAM 9) via data bus 3 in accordance with the data of address bus 4 and decodes the readout instruction, and data path 2 controlled by decoder 1 via a control bus 5.
FIG. 9 shows the detailed arrangement of decoder 1. An instruction register 1-1 stores the instruction sent through data bus 3. A condition register 1-2 varies in response to the value of instruction register 1-1, the present control condition and the comparison signal 1-13. A condition register 1-3 memorizes the condition of the conditional branch instruction. A flag register 1-4 memorizes the operation result flag entered from data path 2 through flag bus 6. A comparator 1-5 compares the memorized data between condition register 1-3 and flag register 1-4, and then generates a comparison signal 1-13 based on the comparison result. EX register 1-7, MA register 1-9 and WB register 1-11 store the control information relating to the EX stage, MA stage and WB stage, respectively. ID decoder 1-6, EX decoder 1-8, MA decoder 1-10 and WB decoder 1-12 decode the control information of respective stages and send out the control signals to corresponding units of data path 2 via control bus 5. Each of the above-described registers operates in synchronism with a system clock (not shown).
The characteristic arrangement of decoder 1 in accordance with this fourth embodiment resides in the provision of comparator 1-5. Comparator 1-5 is disposed in parallel with EX register 1-7. Accordingly, the fourth embodiment of the present invention makes it possible to execute the above-described comparison in this comparator 1-5 independently of the control and execution of the EX stage by EX register 1-7 and EX decoder 1-8.
FIG. 10 shows the pipeline processing flow in the micro processor comprising the decoder shown in FIG. 9.
According to this micro processor, each instruction processing cycle is divided into five sequential stages: IF (instruction fetch), ID (instruction decode), EX (execution of operation), MA (memory access) and WB (write back). The respective instruction processing cycles are processed in parallel, that is, concurrently. In other words, a plurality of instruction processing cycles are executed with partly overlapping timings, so as to realize a 5-stage pipeline processing system.
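The overlap of instruction processing cycles in the 5-stage pipeline can be visualized with a short sketch. The helper names `stage_at` and `occupancy` are hypothetical, and the model assumes that every stage takes one time slot and that a new processing cycle starts in every slot.

```python
# Stage order taken from the description: IF, ID, EX, MA, WB.
STAGES = ("IF", "ID", "EX", "MA", "WB")


def stage_at(cycle, slot):
    """Return the stage that processing cycle `cycle` occupies in time slot `slot`, or None."""
    offset = slot - cycle  # cycle k is assumed to begin its IF stage in slot k
    return STAGES[offset] if 0 <= offset < len(STAGES) else None


def occupancy(num_cycles):
    """Build one row per processing cycle and one column per time slot."""
    total_slots = num_cycles + len(STAGES) - 1
    return [
        [stage_at(cycle, slot) or "--" for slot in range(total_slots)]
        for cycle in range(num_cycles)
    ]


for row in occupancy(3):
    print(" ".join(row))
```

Each printed row is shifted one slot to the right of the previous one, which is the partial overlap of processing cycles described above.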
As shown in (A) of FIG. 10, when a conditional branch instruction is fetched in the n instruction processing cycle, the ID stage of this instruction processing cycle performs only the decoding of the instruction, without performing the comparison between the operation result of the n-1 cycle and the branch condition, and the processing flow then proceeds directly to the EX stage to calculate the branch address.
The branch condition of the conditional branch instruction is sent from ID decoder 1-6 to condition register 1-3 and is stored there. The operation result flag of the n-1 instruction processing cycle is sent from data path 2 through flag bus 6 to flag register 1-4 and is stored there. Then, the branch condition stored in condition register 1-3 and the operation result flag stored in flag register 1-4 are both entered into comparator 1-5. Comparator 1-5 generates a comparison signal indicating whether or not the branch condition is established. This comparison is executed at the time when EX register 1-7 and EX decoder 1-8 cooperatively cause the units of data path 2 to execute the processing for the branch address.
Accordingly, in the flow (A) of FIG. 10, the comparison for checking the establishment/non-establishment of the branch condition can be regarded as being executed at the ID stage of the n+1 instruction processing cycle.
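The parallel arrangement of the comparator and the EX-stage control can be sketched in software as follows. This is a minimal model under stated assumptions, not the circuit itself: the `BranchCondition` encoding (a flag name plus an expected value), the flag names "Z" and "C", and the PC-relative address calculation are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchCondition:
    """Hypothetical content of condition register 1-3: which flag to test and the value it must have."""
    flag: str
    expected: bool


def comparator(condition_register, flag_register):
    """Model of comparator 1-5: compare the stored condition with the stored flags."""
    return flag_register[condition_register.flag] == condition_register.expected


def branch_ex_slot(pc, displacement, condition_register, flag_register):
    """Model the time slot in which both operations take place.

    The address calculation stands for the EX stage of cycle n and the comparison
    for the work done in parallel with it; neither result depends on the other.
    """
    branch_address = pc + displacement  # assumed PC-relative addressing
    comparison_signal = comparator(condition_register, flag_register)
    return branch_address, comparison_signal


# Hypothetical example: branch if the zero flag left by the n-1 cycle is set.
flags = {"Z": True, "C": False}
print(branch_ex_slot(0x100, 0x20, BranchCondition("Z", True), flags))
```

Because the two results are computed independently, neither has to wait for the other, which is the point of placing the comparator in parallel with EX register 1-7.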
When the branch condition is established as a result of the above-described comparison, execution of the EX stage in the n+1 processing cycle is abandoned in response to the output of comparator 1-5 sent to EX decoder 1-8. Similarly, executions of the MA stage and the WB stage are abandoned in response to the output of comparator 1-5 sent to MA register 1-9 and WB register 1-11, respectively.
Then, the next n+2 instruction is changed to the NOP instruction when the value stored in condition register 1-2 is varied in response to comparison signal 1-13. Subsequently, in the n+3 instruction processing cycle, the branch address calculated at the EX stage of the n instruction processing cycle is sent out to address bus 4 to jump to the designated address, thereby changing the program.
Executions of the MA stage and the WB stage in the n instruction processing cycle are abandoned in response to comparison signal 1-13, too.
On the other hand, when the branch condition is not established, the next n+1 instruction is executed without any jump, as shown in (B) of FIG. 10.
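The two flows (A) and (B) of FIG. 10 can be summarized in a small sketch. The function name `resolve_branch` and the dictionary layout are hypothetical; the sketch only records which stages are carried out for each processing cycle according to the description, and uses `None` to mean that no jump address is issued.

```python
ALL_STAGES = ("IF", "ID", "EX", "MA", "WB")


def resolve_branch(comparison_signal, branch_address):
    """Summarize what happens after comparator 1-5 reports its result.

    comparison_signal -- True when the branch condition is established
    branch_address    -- address calculated at the EX stage of cycle n
    """
    if comparison_signal:
        return {
            # Cycle n keeps IF, ID and EX; its MA and WB stages are abandoned.
            "n": ALL_STAGES[:3],
            # Cycle n+1 was already fetched and decoded; EX, MA and WB are abandoned.
            "n+1": ALL_STAGES[:2],
            # The n+2 instruction is replaced by a NOP.
            "n+2": ("NOP",),
            # Cycle n+3 fetches from the calculated branch address.
            "n+3_fetch_from": branch_address,
        }
    # Flow (B): the n+1 instruction is executed normally and no jump takes place.
    return {"n+1": ALL_STAGES, "n+3_fetch_from": None}


print(resolve_branch(True, 0x120))
print(resolve_branch(False, 0x120))
```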
As explained above, the length of the ID stage in the instruction processing cycle which fetched the conditional branch instruction is determined only by the operation time, not by the sum of the operation time and the comparison time. Hence, it becomes possible to shorten the entire processing time.
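The saving can be illustrated with a short worked example. The function name `id_stage_length` and the figures of 20 and 8 (taken here as nanoseconds) are hypothetical and chosen only for illustration; the description gives no actual timings.

```python
def id_stage_length(operation_time, comparison_time, compare_in_same_id_stage):
    """Length of the ID stage of the cycle that fetched the conditional branch."""
    if compare_in_same_id_stage:
        # The comparison follows the operation within the same stage, so the times add up.
        return operation_time + comparison_time
    # The comparison is moved out of this stage, so only the operation time remains.
    return operation_time


# Hypothetical figures in nanoseconds.
serial = id_stage_length(20, 8, True)
overlapped = id_stage_length(20, 8, False)
print(serial, overlapped, serial - overlapped)
```

With these assumed figures the stage shrinks from 28 to 20, a saving equal to the comparison time.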
As this invention may be embodied in several forms without departing from the spirit of essential characteristics thereof, the present embodiments described are therefore intended to be only illustrative and not restrictive, since the scope of the invention is defined by the appended claims rather than by the description preceding them, and all changes that fall within the metes and bounds of the claims, or equivalents of such metes and bounds, are therefore intended to be embraced by the claims.
Claims (3)
1. A pipeline processing apparatus for performing processing operations in a succession of processing cycles, in which each processing cycle is composed of a succession of stages that include an instruction fetch stage for fetching an instruction, an instruction decoding stage for decoding the instruction fetched by the instruction fetch stage associated with the same cycle and an execution stage for executing an operation dependent on the instruction decoded in the same cycle, and in which each of the succession of processing cycles has a beginning, the beginning of each processing cycle precedes in time the beginning of a respective succeeding processing cycle, each processing cycle overlaps in time the respective succeeding processing cycle, and the instruction fetch stage of at least one processing cycle is operative to fetch a branch instruction, the branch instruction having a branch condition, said pipeline processing apparatus comprising:
first means for decoding, in the instruction decoding stage of the at least one processing cycle, the branch instruction fetched by the instruction fetch stage of the at least one processing cycle without performing a branch condition comparison for checking establishment of the branch condition;
second means for effecting, in the execution stage of the at least one processing cycle, a calculation of a branch address dependent on the branch instruction decoded in the instruction decoding stage of the at least one cycle; and
third means for executing the branch condition comparison in the instruction decoding stage of a processing cycle succeeding the at least one processing cycle concurrently with the calculation of the branch address in the execution stage of the at least one processing cycle.
2. A decoding apparatus for decoding information in respective stages of a succession of processing cycles, in which each processing cycle is composed of a succession of stages that include an instruction fetch stage for fetching an instruction, an instruction decoding stage for decoding the instruction fetched by the instruction fetch stage associated with the same cycle and an execution stage for executing an operation dependent on the instruction decoded in the instruction decoding stage of the same cycle, and in which each of the succession of processing cycles overlaps in time a respective succeeding processing cycle, and the instruction fetch stage of at least one processing cycle is operative to fetch a branch instruction, the branch instruction having a branch condition, said decoding apparatus comprising:
a plurality of register units and a plurality of decoder units connected to an operating section of a data path via a data bus for processing information in each processing cycle stage; and
a comparator for judging a branch condition, wherein:
said plurality of register units include an execution register unit for processing the information in the execution stage of each processing cycle, said execution register unit being connected in parallel with said comparator; and
said decoding apparatus is operative for concurrently calculating a branch address in said execution register unit and judging establishment of a branch condition in said comparator in response to a fetched branch instruction so that the branch condition comparison for the branch instruction of the at least one processing cycle is executed in the instruction decoding stage of a processing cycle which follows the at least one processing cycle concurrently with calculation of the branch address performed in said execution register from information in the execution stage of the at least one processing cycle.
3. The decoding apparatus of claim 2 wherein said plurality of registers further include:
a condition register connected to said comparator for storing a decoding result of said branch instruction decoded in the decoding stage of the at least one processing cycle; and
a flag register connected to said comparator for storing an operation result flag representing a result of the operation executed in the execution stage of the at least one processing cycle, wherein
said comparator is operative for comparing the decoding result stored in said condition register with the operation result flag stored in said flag register.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/429,022 US6308263B1 (en) | 1995-10-04 | 1999-10-29 | Pipeline processing apparatus for reducing delays in the performance of processing operations |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP7-257725 | 1995-10-04 | ||
JP25772595A JPH09101888A (en) | 1995-10-04 | 1995-10-04 | Microprocessor |
JP26040995A JP2924735B2 (en) | 1995-10-06 | 1995-10-06 | Pipeline operation device and decoder device |
JP7-260409 | 1995-10-06 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/429,022 Division US6308263B1 (en) | 1995-10-04 | 1999-10-29 | Pipeline processing apparatus for reducing delays in the performance of processing operations |
Publications (1)
Publication Number | Publication Date |
---|---|
US6003127A (en) | 1999-12-14 |
Family
ID=26543369
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/725,709 Expired - Lifetime US6003127A (en) | 1995-10-04 | 1996-10-04 | Pipeline processing apparatus for reducing delays in the performance of processing operations |
US09/429,022 Expired - Fee Related US6308263B1 (en) | 1995-10-04 | 1999-10-29 | Pipeline processing apparatus for reducing delays in the performance of processing operations |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/429,022 Expired - Fee Related US6308263B1 (en) | 1995-10-04 | 1999-10-29 | Pipeline processing apparatus for reducing delays in the performance of processing operations |
Country Status (1)
Country | Link |
---|---|
US (2) | US6003127A (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7178138B2 (en) * | 2001-01-24 | 2007-02-13 | Texas Instruments Incorporated | Method and tool for verification of algorithms ported from one instruction set architecture to another |
DE60138805D1 (en) * | 2001-06-29 | 2009-07-09 | Texas Instruments Inc | Method for improving the visibility of calculation of the effective addresses in pipeline architecture |
US7524353B2 (en) * | 2004-10-21 | 2009-04-28 | Climax Engineered Materials, Llc | Densified molybdenum metal powder and method for producing same |
US7721073B2 (en) * | 2006-01-23 | 2010-05-18 | Mips Technologies, Inc. | Conditional branch execution in a processor having a data mover engine that associates register addresses with memory addresses |
US7721074B2 (en) * | 2006-01-23 | 2010-05-18 | Mips Technologies, Inc. | Conditional branch execution in a processor having a read-tie instruction and a data mover engine that associates register addresses with memory addresses |
US7721075B2 (en) | 2006-01-23 | 2010-05-18 | Mips Technologies, Inc. | Conditional branch execution in a processor having a write-tie instruction and a data mover engine that associates register addresses with memory addresses |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS635433A (en) * | 1986-06-26 | 1988-01-11 | Fujitsu Ltd | Branch control system |
US4760519A (en) * | 1983-07-11 | 1988-07-26 | Prime Computer, Inc. | Data processing apparatus and method employing collision detection and prediction |
US4777587A (en) * | 1985-08-30 | 1988-10-11 | Advanced Micro Devices, Inc. | System for processing single-cycle branch instruction in a pipeline having relative, absolute, indirect and trap addresses |
JPH02254541A (en) * | 1989-03-29 | 1990-10-15 | Fujitsu Ltd | Control system for conditional branch instruction |
US5088030A (en) * | 1986-03-28 | 1992-02-11 | Kabushiki Kaisha Toshiba | Branch address calculating system for branch instructions |
JPH0460720A (en) * | 1990-06-29 | 1992-02-26 | Hitachi Ltd | Control system for condition branching instruction |
JPH0498426A (en) * | 1990-08-13 | 1992-03-31 | Nec Corp | Microprocessor |
JPH052494A (en) * | 1991-06-26 | 1993-01-08 | Hitachi Ltd | Interruption control system |
JPH0573310A (en) * | 1991-09-18 | 1993-03-26 | Fujitsu Ltd | Dynamic decision system for conditional branch |
US5202967A (en) * | 1988-08-09 | 1993-04-13 | Matsushita Electric Industrial Co., Ltd. | Data processing apparatus for performing parallel decoding and parallel execution of a variable word length instruction |
JPH05224926A (en) * | 1992-02-14 | 1993-09-03 | Kobe Nippon Denki Software Kk | Condition branch instruction control system |
JPH05257875A (en) * | 1992-03-16 | 1993-10-08 | Hitachi Ltd | Method for evading interruption processing start delay |
JPH05298095A (en) * | 1992-04-17 | 1993-11-12 | Hitachi Ltd | Pipeline arithmetic unit |
US5317703A (en) * | 1990-06-29 | 1994-05-31 | Hitachi, Ltd. | Information processing apparatus using an advanced pipeline control method |
JPH06161751A (en) * | 1992-11-18 | 1994-06-10 | Nec Ibaraki Ltd | Instruction buffer controller |
US5333284A (en) * | 1990-09-10 | 1994-07-26 | Honeywell, Inc. | Repeated ALU in pipelined processor design |
JPH0793150A (en) * | 1993-09-21 | 1995-04-07 | Matsushita Electric Ind Co Ltd | Information processor |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4003033A (en) * | 1975-12-22 | 1977-01-11 | Honeywell Information Systems, Inc. | Architecture for a microprogrammed device controller |
US4179737A (en) * | 1977-12-23 | 1979-12-18 | Burroughs Corporation | Means and methods for providing greater speed and flexibility of microinstruction sequencing |
JPH0776917B2 (en) * | 1984-12-29 | 1995-08-16 | ソニー株式会社 | Micro computer |
Non-Patent Citations (8)
Title |
---|
"H08/300 Series Programming Manual (3rd Edition)"; Mar. 1993; p. 69. |
"Quantitative Approach for Computer Architecture Design, Realization, and Evaluation (Nikkei BP)" May 31, 1993. |
"RISC System" translated by Ohmori; Nov. 1, 1991; p. 75. |
"SH7000 Series Programming Manual (1st Edition)"; Sep. 1993; p. 148. |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210161419A1 (en) * | 2018-04-20 | 2021-06-03 | Nippon Telegraph And Telephone Corporation | Component concentration measurement device and component concentration measurement method |
CN117008977A (en) * | 2023-08-08 | 2023-11-07 | 上海合芯数字科技有限公司 | Instruction execution method, system and computer equipment with variable execution period |
CN117008977B (en) * | 2023-08-08 | 2024-03-19 | 上海合芯数字科技有限公司 | Instruction execution method, system and computer equipment with variable execution period |
Also Published As
Publication number | Publication date |
---|---|
US6308263B1 (en) | 2001-10-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5404552A (en) | Pipeline risc processing unit with improved efficiency when handling data dependency | |
US5325495A (en) | Reducing stall delay in pipelined computer system using queue between pipeline stages | |
JP3842474B2 (en) | Data processing device | |
EP0730223B1 (en) | Pipeline data processing apparatus for executing a plurality of data processes having a data-dependent relationship | |
US6003127A (en) | Pipeline processing apparatus for reducing delays in the performance of processing operations | |
JPH0772864B2 (en) | Digital signal processor | |
US20080065870A1 (en) | Information processing apparatus | |
JP3599499B2 (en) | Central processing unit | |
US6182211B1 (en) | Conditional branch control method | |
JP2584156B2 (en) | Program-controlled processor | |
JP3335735B2 (en) | Arithmetic processing unit | |
JP2689914B2 (en) | Information processing device | |
JPH0218729B2 (en) | ||
JPH01271840A (en) | Microcomputer | |
JPH0793151A (en) | Instruction supplying device | |
JP3461887B2 (en) | Variable length pipeline controller | |
JP2636192B2 (en) | Information processing device | |
JP2924735B2 (en) | Pipeline operation device and decoder device | |
US5802346A (en) | Method and system for minimizing the delay in executing branch-on-register instructions | |
US20100153688A1 (en) | Apparatus and method for data process | |
JPS6116334A (en) | Data processor | |
JP3431503B2 (en) | Information processing apparatus and program control method | |
JP4702004B2 (en) | Microcomputer | |
JPH09101888A (en) | Microprocessor | |
JPH11327929A (en) | Program controller |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: NIPPONDENSO CO., LTD., JAPAN; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HAYAKAWA,HIROSHI;FUKUMOTO, HARUTSUGU;TANAKA, HIROAKI;REEL/FRAME:008263/0034; Effective date: 19960926 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FPAY | Fee payment | Year of fee payment: 4 |
| FPAY | Fee payment | Year of fee payment: 8 |
| FPAY | Fee payment | Year of fee payment: 12 |