CN108287730A - A kind of processor pipeline structure - Google Patents
A processor pipeline structure
- Publication number
- CN108287730A (application CN201810338781.3A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- module
- unit
- long period
- writes back
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
Abstract
The invention discloses a processor pipeline structure comprising an instruction storage unit, an instruction fetch unit, an execution unit, a memory access unit and a write-back unit. The write-back unit comprises a first write-back module and a second write-back module. The first write-back module arbitrates the write-back order of the long-cycle instruction execution results output by the execution unit or the memory access unit, so that the write-back order is consistent with the dispatch order of the corresponding long-cycle instructions. The second write-back module arbitrates the write-back order between the single-cycle instruction results output by the execution unit and the long-cycle instruction results output by the first write-back module, with long-cycle instructions having the higher priority. By improving the internal structure of each pipeline stage unit, the invention solves the problem that existing processor pipeline structures cannot simultaneously achieve low power consumption, low cost with small area, and high performance.
Description
Technical field
The invention belongs to the field of processor hardware design and, more particularly, relates to an ultra-low-power, high-performance processor pipeline structure.
Background technology
In recent years, as integrated circuit manufacturing processes have continued to advance, processor integration density and performance have kept improving, and power consumption has grown accordingly. With the widespread use of mobile devices and the rapid development of the Internet of Things, demand for processors that combine low power consumption, low cost, small area and high performance keeps increasing. Delivering high performance while reducing power consumption and cost has become a new research focus for designers.
The invention patent with grant publication number CN 104699463 B discloses a novel pipeline structure that builds a register stack to reduce the large dynamic power consumption produced by register toggling in the data path. That pipeline structure is mainly suited to reducing the dynamic power caused by register toggling during large-volume, high-speed transfers. Where the transfer volume is small and the error rate is low it brings no benefit, and it also increases design complexity and processor area, which raises cost.
The invention patent with grant publication number CN 101464721 B discloses a design method for controlling performance and power consumption. It monitors the performance of a pipelined processor and, when it detects that processor throughput has dropped, reconfigures the pipeline to switch from a high-performance mode to a low-performance mode in order to reduce power consumption. The system and its design are very complex and are not suitable for low-cost, small-area processor designs; the patent's own detailed description acknowledges this, noting that development is very complex and time-consuming.
The invention patent with grant publication number CN 103218029 B discloses a pipeline structure that controls the supply voltage. It changes the register structure of the existing pipeline, adds error-correction circuits built into the registers and an error-correction circuit outside the pipeline, lowers the supply voltage further, and adjusts the voltage in real time according to the number of errors, so that core power consumption is further reduced. This design does not consider reducing processor area or cost; although it reduces power consumption to some extent, it also raises system complexity and cost.
In summary, although existing processors reduce power consumption to some extent, they do so at the price of greater system complexity, larger processor area and higher cost.
Invention content
In view of the above defects of, or needs for improvement in, the prior art, the present invention provides an ultra-low-power, high-performance processor pipeline structure. By improving the internal structure of each pipeline stage unit, it aims to solve the problem that existing processor pipeline structures cannot simultaneously achieve low power consumption, low cost with small area, and high performance.
To achieve the above object, according to one aspect of the present invention, a processor pipeline structure is provided, comprising an instruction storage unit, an instruction fetch unit, an execution unit, a memory access unit and a write-back unit. The first end of the instruction fetch unit is connected to the instruction storage unit, and its second end is connected to the first end of the execution unit. The second end of the execution unit is connected to the first end of the write-back unit, and its third end is connected to the first end of the memory access unit. The second end of the memory access unit is connected to the second end of the write-back unit.
The instruction fetch unit fetches one instruction from the instruction storage unit within one clock cycle. The execution unit decodes and executes the instructions output by the instruction fetch unit, and the results of instruction execution are written back to the register file by the write-back unit.
The write-back unit comprises a first write-back module and a second write-back module. The first end of the first write-back module is connected to the second end of the execution unit, its second end is connected to the second end of the memory access unit, and its third end is connected to the first end of the second write-back module. The second end of the second write-back module is connected to the fourth end of the execution unit.
The first write-back module arbitrates the write-back order of the long-cycle instruction execution results output by the execution unit or the memory access unit, so that the write-back order is consistent with the dispatch order of the corresponding long-cycle instructions. The second write-back module arbitrates the write-back order between the single-cycle instruction results output by the execution unit and the long-cycle instruction results output by the first write-back module, with long-cycle instructions having the higher priority.
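The priority rule of the second write-back module can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the patented circuit: the function name `second_writeback_arbiter`, the `(register index, value)` request shape, and the idea that at most one result is granted per cycle are all hypothetical choices made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical request shape: (destination register index, value to write).
WbRequest = Tuple[int, int]


def second_writeback_arbiter(
    long_req: Optional[WbRequest],
    single_req: Optional[WbRequest],
) -> Tuple[Optional[WbRequest], bool]:
    """Choose which result is written back in this cycle.

    Returns (granted_request, single_cycle_must_wait).
    Assumption: only one result can be written back per cycle.
    """
    if long_req is not None:
        # A long-cycle result from the first write-back module has priority;
        # a competing single-cycle result has to wait.
        return long_req, single_req is not None
    # No long-cycle result this cycle: a single-cycle result writes back freely.
    return single_req, False
```

For example, `second_writeback_arbiter((5, 10), (6, 20))` grants the long-cycle request `(5, 10)` and reports that the single-cycle request must wait.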
Preferably, in the above processor pipeline structure, the instruction fetch unit comprises a first program counter, a second program counter, a PC generation module, a partial decode module, a branch prediction module and an instruction register.
The first end of the instruction register is connected to the first end of the instruction storage unit, and its second end is connected to the first end of the execution unit. The first end of the partial decode module is connected to the second end of the instruction storage unit, its second end to the first end of the branch prediction module, and its third end to the first end of the PC generation module. The second end of the branch prediction module is connected to the second end of the PC generation module. The third end of the PC generation module is connected to the first end of the first program counter, its fourth end to the second end of the first program counter, its fifth end to the third end of the instruction storage unit, and its sixth end to the third end of the execution unit. The third end of the first program counter is connected to the first end of the second program counter. The second end of the second program counter is connected to the third end of the execution unit.
The partial decode module decodes the current instruction fetched from the instruction storage unit to determine whether it is an ordinary instruction or a branch/jump instruction. If it is an ordinary instruction, the partial decode module sends it directly to the PC generation module, which generates the address of the next instruction to be fetched from the current instruction and the current instruction address sent by the first program counter.
If it is a branch/jump instruction, the partial decode module sends it to the branch prediction module. The branch prediction module obtains the jump target address of the current instruction by static prediction, and the PC generation module generates the address of the next instruction to be fetched from that jump target address.
The partial decode module, the branch prediction module and the PC generation module are combinational logic, so the decoding of the current instruction, the branch prediction and the generation of the next fetch address are all completed within the same clock cycle.
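The fetch-stage flow described above (partial decode, static prediction, next-PC generation) can be sketched in Python as a pure function, which mirrors the fact that the three modules are combinational. The sketch assumes 32-bit RISC-V instructions (so the sequential address is PC + 4) and the backward-taken, forward-not-taken rule given in the detailed description; the function names are hypothetical, and indirect jumps (jalr) are deliberately left out.

```python
OPCODE_BRANCH = 0b1100011  # RISC-V conditional branches (BEQ, BNE, ...)
OPCODE_JAL = 0b1101111     # RISC-V unconditional direct jump
OPCODE_JALR = 0b1100111    # RISC-V unconditional indirect jump


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_offset(inst: int) -> int:
    """Decode the B-type immediate imm[12|10:5|4:1|11]."""
    imm = ((((inst >> 31) & 0x1) << 12) | (((inst >> 7) & 0x1) << 11)
           | (((inst >> 25) & 0x3F) << 5) | (((inst >> 8) & 0xF) << 1))
    return sign_extend(imm, 13)


def jal_offset(inst: int) -> int:
    """Decode the J-type immediate imm[20|10:1|11|19:12]."""
    imm = ((((inst >> 31) & 0x1) << 20) | (((inst >> 12) & 0xFF) << 12)
           | (((inst >> 20) & 0x1) << 11) | (((inst >> 21) & 0x3FF) << 1))
    return sign_extend(imm, 21)


def next_pc(pc: int, inst: int) -> int:
    """Generate the next fetch address from the current PC and instruction."""
    opcode = inst & 0x7F  # partial decode: only the opcode field is examined
    if opcode == OPCODE_JALR:
        # Indirect jumps need a register read and are outside this sketch.
        raise NotImplementedError("jalr is not modelled here")
    if opcode == OPCODE_JAL:
        return (pc + jal_offset(inst)) & 0xFFFFFFFF  # always taken
    if opcode == OPCODE_BRANCH:
        offset = branch_offset(inst)
        if offset < 0:  # backward branch: statically predicted taken
            return (pc + offset) & 0xFFFFFFFF
    # Ordinary instruction, or forward branch predicted not taken.
    return (pc + 4) & 0xFFFFFFFF
```

As a usage example, `next_pc(0x100, 0xFE000EE3)` (the encoding of `beq x0, x0, -4`) yields `0xFC`, while a forward branch or an ordinary instruction at `0x100` yields `0x104`.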
Preferably, in the above processor pipeline structure, the execution unit comprises a decoding module, a dispatch module, an instruction tracking module, a single-cycle instruction operation module, a long-cycle instruction operation module and a commit module.
The first end of the decoding module is connected to the second end of the instruction register, its second end to the second end of the second program counter, and its third end to the first end of the dispatch module. The second end of the dispatch module is connected to the first end of the instruction tracking module, its third end to the first end of the single-cycle instruction operation module, its fourth end to the first end of the long-cycle instruction operation module, and its fifth end to the first end of the memory access unit. The second end of the long-cycle instruction operation module is connected to the first end of the first write-back module. The second end of the instruction tracking module is connected to the second end of the first write-back module. The second end of the single-cycle instruction operation module is connected to the second end of the second write-back module, its third end to the second end of the memory access unit, and its fourth end to the first end of the commit module. The second end of the commit module is connected to the third end of the second write-back module, and its third end to the sixth end of the PC generation module.
The instruction tracking module stores information about long-cycle instructions that have been dispatched but not yet written back. When dispatching an instruction, the dispatch module compares the information of the instruction being dispatched with each long-cycle instruction entry stored in the instruction tracking module, to determine whether the current instruction has a data dependency on a long-cycle instruction that has been dispatched but not yet written back. If not, the instruction is dispatched normally; if so, dispatch is stalled and resumes only after the related long-cycle instruction has finished executing and the data dependency has been released.
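The dependency check performed at dispatch can be sketched as follows. This is a minimal Python illustration, assuming the RAW and WAW checks named later in the text; `TrackEntry` and `must_stall` are hypothetical names, and special handling of a constant-zero register is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class TrackEntry:
    """One outstanding long-cycle instruction (dispatched, not yet written back)."""
    src_regs: Tuple[int, ...]  # stored to mirror the entry described in the text
    dest_reg: Optional[int]    # result register index, None if there is none


def must_stall(entries: List[TrackEntry],
               src_regs: Sequence[int],
               dest_reg: Optional[int]) -> bool:
    """Return True if the instruction being dispatched must wait."""
    for entry in entries:
        if entry.dest_reg is None:
            continue
        # RAW: the new instruction reads a register an outstanding one will write.
        raw = entry.dest_reg in src_regs
        # WAW: the new instruction writes the same register as an outstanding one.
        waw = dest_reg is not None and dest_reg == entry.dest_reg
        if raw or waw:
            return True
    return False
```

With one outstanding entry that writes register 3, an instruction reading register 3 (RAW) or writing register 3 (WAW) is stalled, while an instruction touching only other registers is dispatched normally.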
Preferably, in the above processor pipeline structure, the commit module comprises an exception judgment submodule and a branch prediction resolution submodule.
The first ends of the branch prediction resolution submodule and of the exception judgment submodule are both connected to the fourth end of the single-cycle instruction operation module, and their second ends are both connected to the sixth end of the PC generation module. The third end of the exception judgment submodule is connected to the third end of the second write-back module.
The branch prediction resolution submodule uses the operation result of the single-cycle instruction operation module to determine whether the next fetch address generated by the PC generation module is correct. If it is, no action is taken; if not, the wrong address is discarded and a new next fetch address is generated and fed back to the PC generation module.
The exception judgment submodule uses the operation result of the single-cycle instruction operation module to determine whether an error occurred while the current instruction was executing. If not, no action is taken; if so, the current instruction address is discarded and a new address is generated and fed back to the PC generation module.
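The two checks made at commit can be sketched as one Python function. This is an illustrative model only: `commit_check` and `trap_pc` are hypothetical names, the text does not say what the new address is after an error (a trap handler address is assumed here), and giving the exception check precedence over the misprediction check is also an assumption.

```python
from typing import Optional, Tuple


def commit_check(predicted_next_pc: int,
                 actual_next_pc: int,
                 exception: bool,
                 trap_pc: int) -> Tuple[bool, Optional[int]]:
    """Return (flush, redirect_pc) for the instruction being committed.

    flush is True when the speculatively generated fetch address must be
    discarded; redirect_pc is the new address fed back to PC generation.
    """
    if exception:
        # Error during execution: discard and redirect (assumed: to a handler).
        return True, trap_pc
    if predicted_next_pc != actual_next_pc:
        # Static prediction was wrong: discard and fetch from the real target.
        return True, actual_next_pc
    # Prediction correct and no error: nothing to do.
    return False, None
```

For instance, `commit_check(0x104, 0x200, False, 0x40)` returns `(True, 0x200)`, meaning the fetch address `0x104` is discarded and fetching restarts at `0x200`.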
Preferably, in the above processor pipeline structure, the instruction tracking module comprises multiple entries for storing long-cycle instruction information. Each entry stores the information of one long-cycle instruction, including its source operand register indexes and its result register index.
Preferably, in the above processor pipeline structure, the instruction tracking module is implemented as a FIFO. When the first write-back module writes back multiple long-cycle instructions, it arbitrates the write-back order of the different long-cycle instructions according to the order indicated by the FIFO read pointer. Once a long-cycle instruction has been written back, the instruction tracking module deletes the information corresponding to that instruction.
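The FIFO-based ordering can be modelled in a few lines of Python. This is a behavioural sketch, not the hardware: the class name `InstructionTracker`, the string tags, the `done` dictionary used as a stand-in for results waiting at the operation units, and the default depth of 2 are all assumptions made for illustration.

```python
from collections import deque
from typing import Optional, Tuple


class InstructionTracker:
    """FIFO of outstanding long-cycle instructions, oldest at the read pointer."""

    def __init__(self, depth: int = 2) -> None:
        self.depth = depth
        self.fifo = deque()  # left end plays the role of the FIFO read pointer
        self.done = {}       # tag -> result, for finished but unwritten results

    def dispatch(self, tag: str) -> bool:
        """Record a dispatched long-cycle instruction; False if the FIFO is full."""
        if len(self.fifo) >= self.depth:
            return False
        self.fifo.append(tag)
        return True

    def complete(self, tag: str, result: int) -> None:
        """An operation unit finished executing the tagged instruction."""
        self.done[tag] = result

    def write_back(self) -> Optional[Tuple[str, int]]:
        """Write back only the oldest instruction, and only once it is finished."""
        if self.fifo and self.fifo[0] in self.done:
            tag = self.fifo.popleft()  # entry is deleted once written back
            return tag, self.done.pop(tag)
        return None
```

If a divide is dispatched before a load and the load finishes first, `write_back()` returns `None` until the divide completes, after which the divide and then the load are written back in dispatch order.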
Preferably, in the above processor pipeline structure, the single-cycle instruction operation module is further used to generate the memory access address.
The memory access unit serves as the control module for memory access: based on that memory access address, it decides by address whether to obtain the corresponding instruction from the instruction storage part or the corresponding data from the data storage part.
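The address-based selection made by the memory access unit can be illustrated with a short Python sketch. The text gives no address map, so the base address and size below are purely hypothetical values, as is the function name `select_storage`.

```python
# Hypothetical address map, chosen only for illustration.
ITCM_BASE = 0x8000_0000
ITCM_SIZE = 0x1_0000  # 64 KiB


def select_storage(address: int) -> str:
    """Decide which storage part serves a memory access address."""
    if ITCM_BASE <= address < ITCM_BASE + ITCM_SIZE:
        # The address falls in the instruction storage range.
        return "instruction_storage"
    # Any other address is treated as a data storage access in this sketch.
    return "data_storage"
```

Under this assumed map, `select_storage(0x8000_0010)` selects the instruction storage part and `select_storage(0x9000_0000)` selects the data storage part.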
Preferably, in the above processor pipeline structure, the instruction storage unit is implemented with an instruction tightly coupled memory.
In general, compared with the prior art, the above technical solutions conceived by the present invention can achieve the following beneficial effects:
(1) The processor pipeline structure provided by the invention implements a two-level write-back of instructions through the first write-back module and the second write-back module. The first write-back module cooperates with the instruction tracking module to write back the different long-cycle instructions so that their write-back order strictly matches their dispatch order, which keeps the hardware structure simple and reduces processor area. The second write-back module arbitrates the write-back order of all single-cycle and long-cycle instructions, with long-cycle instructions having priority; in idle cycles when no long-cycle instruction is writing back, single-cycle instructions can write back freely. The two-level write-back strategy separates the commit of a long-cycle instruction from its write-back, so that even while a multi-cycle instruction is executing the pipeline is not blocked and subsequent single-cycle instructions can still write back and commit smoothly, which improves processor performance.
(2) The processor pipeline structure provided by the invention resolves data dependencies through the cooperation of the instruction tracking module and the dispatch module. The instruction tracking module stores information about long-cycle instructions that have been dispatched but not yet written back. When dispatching an instruction, the dispatch module compares the information of the instruction being dispatched with each long-cycle instruction entry in the instruction tracking module, to determine whether the current instruction has a RAW or WAW dependency on a long-cycle instruction that has been dispatched but not yet written back. If there is no data dependency, the instruction is dispatched normally; if there is, dispatch is stalled and resumes only after the related long-cycle instruction has finished executing and the dependency has been released. The invention resolves data dependencies by stalling the pipeline, with no need to bypass long-cycle instruction results directly to subsequent instructions waiting to be dispatched, which reduces the power consumption and area of the processor.
(3) The processor pipeline structure provided by the invention uses a single-cycle-access ITCM as the instruction memory, so the instruction fetch unit can fetch one instruction from the ITCM in one cycle. Replacing the traditional cache with an ITCM meets the real-time requirements of an ultra-low-power, small-area processor and reduces processor cost and area. The partial decode module, the branch prediction module and the PC generation module are combinational logic, so within one cycle the instruction fetch unit completes the instruction read, the partial decode, the branch prediction and the generation of the PC of the next instruction to be fetched. This achieves continuous instruction fetch and greatly improves processor performance.
Description of the drawings
Fig. 1 is an overall architecture diagram of a processor pipeline structure provided by an embodiment of the present invention;
Fig. 2 is a structure diagram of the instruction fetch unit of a processor pipeline structure provided by an embodiment of the present invention;
Fig. 3 is a structure diagram of the execution unit of a processor pipeline structure provided by an embodiment of the present invention;
Fig. 4 is a structure diagram of the write-back unit of a processor pipeline structure provided by an embodiment of the present invention.
Specific implementation mode
To make the purpose, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
The processor pipeline structure provided by the embodiment of the present invention is mainly suited to low-cost, small-area processors designed for embedded ultra-low-power scenarios, such as our company's self-developed Hummingbird E200 processor core based on the RISC-V architecture. The pipeline structure comprises multiple pipeline stage units, specifically an instruction storage unit, an instruction fetch unit, an execution unit, a memory access unit and a write-back unit. The first end of the instruction fetch unit is connected to the instruction storage unit, and its second end is connected to the first end of the execution unit. The second end of the execution unit is connected to the first end of the write-back unit, and its third end is connected to the first end of the memory access unit. The second end of the memory access unit is connected to the second end of the write-back unit.
The invention divides the pipeline stages mainly by clock cycle. The instruction storage unit and the instruction fetch unit belong to the first pipeline stage: the instruction storage unit stores instructions, and the instruction fetch unit fetches instructions from it continuously without interruption. The execution unit and the write-back unit belong to the second pipeline stage: the execution unit decodes and executes the instructions output by the instruction fetch unit, and the write-back unit writes the execution results back to the general register file (Register File, Regfile). Because the decoding, execution and write-back of an instruction take place in the same clock cycle, the execution unit and the write-back unit are both assigned to the second stage of the pipeline structure.
Some single-cycle instructions need only the two pipeline stages above to complete. Some long-cycle instructions need the memory access function of the memory access unit, which belongs to the third pipeline stage; however, the result output by the memory access unit is written back to the general register file by the write-back unit in the second pipeline stage. The ultra-low-power, high-performance processor pipeline structure of this embodiment is therefore a variable-length pipeline structure which, compared with existing linear pipeline structures, has fewer pipeline stages and so can reduce processor area and cost.
The instruction fetch unit fetches instructions from the instruction storage unit. To improve processor performance, instruction fetch must be both "fast" and "continuous". Because the pipeline structure of this embodiment is mainly suited to low-cost, small-area processors designed for embedded ultra-low-power scenarios, and the program code of embedded processor cores of this class is small, the invention uses an instruction tightly coupled memory (Instruction Tightly Coupled Memory, ITCM) as the instruction storage unit. The instruction fetch unit can fetch one instruction from the ITCM in one clock cycle, which achieves fast instruction fetch. Compared with a traditional I-Cache, this reduces processor cost and area while still meeting the real-time requirements of an ultra-low-power, small-area processor.
The instruction fetch unit comprises a first program counter, a second program counter, a PC generation module, a partial decode module, a branch prediction module and an instruction register.
The first end of the instruction register is connected to the first end of the ITCM, and its second end is connected to the first end of the execution unit. The first end of the partial decode module is connected to the second end of the ITCM, its second end to the first end of the branch prediction module, and its third end to the first end of the PC generation module. The second end of the branch prediction module is connected to the second end of the PC generation module. The third end of the PC generation module is connected to the first end of the first program counter, its fourth end to the second end of the first program counter, its fifth end to the third end of the ITCM, and its sixth end to the second end of the execution unit. The third end of the first program counter is connected to the first end of the second program counter. The second end of the second program counter is connected to the third end of the execution unit.
The instruction register stores the instruction fetched from the ITCM in a given clock cycle (called the current instruction) and sends it to the execution unit in the next clock cycle. The second program counter receives the current instruction address sent by the first program counter and sends it to the execution unit in the next clock cycle. The execution unit thus receives the current instruction and the current instruction address in the same cycle.
The partial decode module decodes the current instruction fetched from the ITCM to determine whether it is an ordinary instruction or a branch/jump instruction. If it is an ordinary instruction, the partial decode module sends it directly to the PC generation module, which generates the address of the next instruction to be fetched (i.e. the PC value) from the current instruction and the current instruction address sent by the first program counter. If it is a branch/jump instruction, the partial decode module sends it to the branch prediction module. The branch prediction module obtains the jump target address of the current instruction by static prediction, and the PC generation module generates the address of the next instruction to be fetched from that jump target address and sends it to both the first program counter and the ITCM.
The partial decode module, the branch prediction module and the PC generation module are combinational logic, so the decoding of the current instruction, the branch prediction, the generation of the next fetch address and the fetch of the current instruction are all completed within the same clock cycle. The instruction fetch unit of this embodiment can therefore fetch the current instruction and generate the next fetch address within one clock cycle, which achieves continuous instruction fetch and improves processor performance.
The branch prediction module uses a simple, flexible static prediction method: a conditional branch instruction that jumps backward is predicted as taken, and a conditional branch instruction that jumps forward is predicted as not taken. The details are as follows:
1. For conditional direct jump instructions, i.e. conditional branch instructions such as BEQ and BNE, the jump direction is predicted with the static prediction method above (a backward jump is predicted as taken; otherwise it is predicted as not taken). The jump target address is obtained by adding the offset given by the immediate to the instruction's PC.
2. For unconditional direct jump instructions, such as jal, the jump is always taken, so its direction need not be predicted. The jump target address is obtained by adding the offset given by the immediate to the instruction's PC.
3. For unconditional indirect jump instructions, such as jalr, the jump is always taken, so its direction need not be predicted. The base address needed to compute the jump target address comes from the operand indexed by rs1, which must be read from the general register file and may form a RAW data dependency with an instruction currently executing in the execution unit. The invention adopts different schemes depending on the rs1 index: if the index refers to the constant register, the constant is used directly with no register read; if it refers to the usual link register, that register is taken out directly over a dedicated hard-wired connection, and, to avoid a read-after-write data dependency, it must be confirmed that the write-back unit in the second pipeline stage is not writing back to that register. Specifically:
3.1 If the rs1 index is x0, the constant 0 is used directly for the base address calculation (the RISC-V architecture defines x0 as the constant 0), with no read from the Regfile.
3.2 If the rs1 index is x1, then, because x1 is commonly used as the link register for function-return jump instructions, x1 is taken directly from the execution unit's Regfile over a dedicated hard-wired connection, without occupying a Regfile read port. To guard against an instruction currently executing in the execution unit needing to write back the link register and causing a RAW data dependency, the branch prediction module must determine whether any instruction is currently writing back the link register.
3.3 If the rs1 index is a register other than x0 and x1 (referred to as xn), xn must be read from the Regfile through a Regfile read port, and it must be determined that the read port is currently idle and there is no resource conflict. In addition, to guard against an instruction currently executing in the execution unit needing to write back xn and causing a RAW data dependency, the branch prediction module must determine whether any instruction is currently writing back the Regfile.
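The three rs1 cases for jalr can be summarised in a small Python sketch. It is an illustrative model under stated assumptions: the function and parameter names are hypothetical, the register file is a plain list, and returning `None` stands for "the base address cannot be obtained safely in this cycle" (the text does not say what the hardware does in that case). The target computation follows the RISC-V definition of jalr, which clears the lowest bit of the sum.

```python
from typing import List, Optional


def jalr_base(rs1: int,
              regfile: List[int],
              x1_being_written: bool,
              regfile_being_written: bool,
              read_port_free: bool) -> Optional[int]:
    """Select the base address for a jalr target, or None if it must wait."""
    if rs1 == 0:
        # x0 is the constant 0 in RISC-V: no register read is needed.
        return 0
    if rs1 == 1:
        # x1 (link register) is hard-wired out, so no read port is used,
        # but a pending write-back to x1 would be a RAW dependency.
        return None if x1_being_written else regfile[1]
    # Any other register: needs an idle read port and no pending write-back.
    if not read_port_free or regfile_being_written:
        return None
    return regfile[rs1]


def jalr_target(base: int, imm: int) -> int:
    """RISC-V jalr target: (base + imm) with the least significant bit cleared."""
    return (base + imm) & 0xFFFFFFFE
```

For example, with `regfile[1] == 0x200` and no pending write to x1, `jalr_base(1, regfile, False, False, False)` returns `0x200` even though no read port is free, whereas the same call with `rs1 == 5` returns `None`.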
The execution unit comprises a decoding module, a dispatch module, an instruction tracking module, a single-cycle instruction operation module, a long-cycle instruction operation module and a commit module.
The first end of the decoding module is connected to the second end of the instruction register, its second end
to the second end of the second program counter, and its third end to the first end of the dispatch module. The
second end of the dispatch module is connected to the first end of the instruction trace module, its third end to
the first end of the single-cycle instruction computing module, its fourth end to the first end of the long-cycle
instruction computing module, and its fifth end to the first end of the memory access unit. The second end of the
instruction trace module is connected to the first end of the writeback unit. The second end of the single-cycle
instruction computing module is connected to the second end of the writeback unit, its third end to the second
end of the memory access unit, and its fourth end to the first end of the delivery module. The second end of the
long-cycle instruction computing module is connected to the third end of the writeback unit. The second end of
the delivery module is connected to the fourth end of the writeback unit, and its third end to the sixth end of
the PC generation module;
The decoding module decodes the fetched current instruction and the current instruction address to obtain the
operand register indexes, and reads the corresponding operands from the Read-Regfile according to those indexes;
The dispatch module sends the operands obtained by the decoding module to different computing units for
execution according to the instruction type. The single-cycle instruction computing module is mainly used to
compute and execute single-cycle instructions, and the long-cycle instruction computing module is mainly used to
compute and execute long-cycle instructions. The execution results of single-cycle and long-cycle instructions
are written back to the Write-Regfile through the writeback unit;
The delivery module delivers the calculation result of the single-cycle instruction computing module to the PC
generation module. The delivery module includes an exception judging submodule and a branch prediction analysis
submodule;
The first end of the branch prediction analysis submodule and the first end of the exception judging submodule
are both connected to the fourth end of the single-cycle instruction computing module, and the second end of the
branch prediction analysis submodule and the second end of the exception judging submodule are both connected to
the sixth end of the PC generation module. The third end of the exception judging submodule is connected to the
fourth end of the writeback unit;
The branch prediction analysis submodule uses the calculation result of the single-cycle instruction computing
module to judge whether the next instruction-fetch address generated by the PC generation module is correct. If
it is, no action is taken; if not, the wrong address is cleared and a new next instruction-fetch address is
generated and fed back to the PC generation module. The exception judging submodule uses the calculation result
of the single-cycle instruction computing module to judge whether an error occurred while the current
instruction was executing. If not, no action is taken; if so, the current instruction address is cleared and a
new address is generated and fed back to the PC generation module. The PC generation module sends the new address
to the ITCM, and the Fetch unit fetches the instruction from the ITCM again and sends it to the execution unit
for decoding and execution.
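The check performed by the branch prediction analysis submodule can be illustrated with a minimal sketch. The function name `check_prediction` and its parameters are hypothetical, and a fixed 4-byte instruction length is assumed for the fall-through address; the patent text does not specify either.

```python
from typing import Optional


def check_prediction(pc: int, predicted_next: int, taken: bool,
                     target: int, instr_bytes: int = 4) -> Optional[int]:
    """Compare the predicted next fetch address with the executed outcome.

    Returns None when the prediction was correct (no action needed), or the
    corrected next fetch address that should be fed back to PC generation.
    """
    # The address the fetch unit should really have used after this instruction.
    actual_next = target if taken else pc + instr_bytes
    if actual_next == predicted_next:
        return None          # prediction correct: do nothing
    return actual_next       # misprediction: discard the wrong address, redirect
```

A non-`None` return value corresponds to the "new next instruction-fetch address" that is fed back to the PC generation module.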
Because some long-cycle instructions may encounter errors during execution, the writeback unit needs an
interface to the exception judging submodule through which exceptions are raised. If an exception is raised, the
execution result of the long-cycle instruction is not written back to the Write-Regfile.
In a micro-architecture based on in-order dispatch, each time the dispatch module dispatches an instruction it
must check whether that instruction has a data dependence on any previously dispatched instruction that is still
executing and has not yet written back. Data dependences are of three kinds: write-after-read (WAR),
read-after-write (RAW) and write-after-write (WAW);
1. WAR dependence: the pipeline structure provided by the invention targets processors whose micro-architecture
dispatches in order and writes back in order, and an instruction reads its source operands from the general
register group at the moment it is dispatched. The write-back to the Write-Regfile by a later instruction
therefore cannot occur before an earlier instruction has read its operands from the Read-Regfile, so no data
conflict caused by a WAR dependence can arise.
2. RAW dependence: the instruction being dispatched is in the second stage of the pipeline. If the previously
dispatched instruction (the preceding instruction for short) is a single-cycle instruction (whose write-back is
also in the second stage of the pipeline), it has already finished executing and written its result back to the
Write-Regfile, so the instruction being dispatched cannot have a data conflict caused by a RAW dependence on it.
If the preceding instruction is a long-cycle instruction, it needs several cycles before it can write back its
result, so the instruction being dispatched may have a RAW dependence on the preceding long-cycle instruction.
3. WAW dependence: the instruction being dispatched is in the second stage of the pipeline. If the preceding
instruction is a single-cycle instruction, it has already finished executing and written its result back to the
Write-Regfile, so the instruction being dispatched cannot have a data conflict caused by a WAW dependence on it.
If the preceding instruction is a long-cycle instruction, it needs several cycles before it can write back its
result, so the instruction being dispatched may have a WAW dependence on the preceding long-cycle instruction.
In summary, in the pipeline structure provided by the invention, the instruction being dispatched can have RAW
and WAW dependences only on long-cycle instructions that have not finished executing.
To detect RAW and WAW dependences between the instruction currently being dispatched and preceding long-cycle
instructions, the invention provides an instruction trace module in the execution unit. This module stores
information about long-cycle instructions that have been dispatched but not yet written back, including but not
limited to the source operand register indexes and the result register index of each long-cycle instruction;
The instruction trace module is preferably implemented as a first-in, first-out (FIFO) queue. Each time the
dispatch module dispatches a long-cycle instruction, an entry is allocated for it in the instruction trace module
to store its source operand register indexes and result register index. After the writeback unit has written the
execution result of the long-cycle instruction back to the Write-Regfile, the instruction trace module removes
the corresponding entry, so the module always holds exactly the long-cycle instructions that have been dispatched
but not yet written back. When dispatching an instruction, the dispatch module compares the source operand
register indexes and result register index of that instruction with the information in every entry of the
instruction trace module to determine whether it has a RAW or WAW dependence on a dispatched long-cycle
instruction that has not yet written back. If there is no data dependence, the instruction is dispatched
normally; if there is, dispatch is paused until the relevant long-cycle instruction has finished executing and
the dependence is released. The FIFO depth defaults to two entries, so the information of two long-cycle
instructions can be stored at the same time. The number of entries preferably does not exceed four; otherwise the
operating speed of the processor is reduced.
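The dispatch-time check against the instruction trace module can be sketched as follows. This is a behavioural illustration only: the class and method names (`InstructionTraceFifo`, `try_dispatch`, `retire_oldest`) are hypothetical, registers are plain integer indexes, and stalling when the FIFO is full is an assumption that the text implies but does not state.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    srcs: Tuple[int, ...]   # source operand register indexes (stored, as in the text)
    dest: Optional[int]     # result register index, None if the instruction writes nothing


class InstructionTraceFifo:
    """Tracks long-cycle instructions that are dispatched but not yet written back."""

    def __init__(self, depth: int = 2):          # default depth of two entries
        self.depth = depth
        self.entries = deque()

    def has_dependence(self, srcs, dest) -> bool:
        for e in self.entries:
            if e.dest is None:
                continue
            if e.dest in srcs:                   # RAW: we read what it has yet to write
                return True
            if e.dest == dest:                   # WAW: we write what it has yet to write
                return True
        return False

    def try_dispatch(self, srcs, dest, is_long: bool) -> bool:
        """Return True if the instruction is dispatched, False if dispatch must stall."""
        if self.has_dependence(srcs, dest):
            return False
        if is_long:
            if len(self.entries) == self.depth:  # assumption: stall when no free entry
                return False
            self.entries.append(Entry(tuple(srcs), dest))
        return True

    def retire_oldest(self) -> Entry:
        """Called when the oldest long-cycle instruction has written back."""
        return self.entries.popleft()
```

Only the result register of each tracked entry takes part in the comparison, since WAR dependences cannot occur in this pipeline; the source indexes are kept in the entry because the text says they are stored.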
For conflicts caused by data dependences, the pipeline structure provided by the invention stalls the pipeline
and does not forward (bypass) the result of a long-cycle instruction directly to later instructions waiting to be
dispatched, which reduces the power consumption and area of the processor.
Some long-cycle instructions, such as Load and Store instructions and the "A" extension instructions, need the
memory access function of the memory access unit. After the dispatch module sends such an instruction to the
single-cycle instruction computing module, that module computes the memory access address and sends it to the
memory access unit. The memory access unit, acting as the control module for memory access, uses the address to
decide whether to fetch the corresponding instruction from the instruction storage unit or the corresponding
data from the data storage unit.
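As a rough illustration of this address-based routing, the sketch below computes an effective address and selects a target by address range. The address map (`ITCM_BASE`, `DTCM_BASE` and the sizes), the 32-bit address width and the function names are all hypothetical; the patent does not give concrete ranges.

```python
# Hypothetical address map, chosen only for illustration.
ITCM_BASE, ITCM_SIZE = 0x8000_0000, 0x1_0000   # instruction storage unit
DTCM_BASE, DTCM_SIZE = 0x9000_0000, 0x1_0000   # data storage unit


def effective_address(base: int, offset: int) -> int:
    """Base register value plus signed offset, wrapped to 32 bits (assumed width)."""
    return (base + offset) & 0xFFFF_FFFF


def route_access(addr: int) -> str:
    """Decide from the address which storage the memory access unit should use."""
    if ITCM_BASE <= addr < ITCM_BASE + ITCM_SIZE:
        return "instruction_storage"
    if DTCM_BASE <= addr < DTCM_BASE + DTCM_SIZE:
        return "data_storage"
    raise ValueError(f"address {addr:#010x} is outside the modelled address map")
```

In this model the address computation stands in for the single-cycle instruction computing module and `route_access` for the decision made by the memory access unit.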
The writeback unit includes a first write-back module and a second write-back module. The first end of the first
write-back module is connected to the second end of the long-cycle instruction computing module, its second end
to the second end of the instruction trace module, its third end to the first end of the second write-back
module, and its fourth end to the third end of the memory access unit. The second end of the second write-back
module is connected to the second end of the single-cycle instruction computing module, and its third end to the
third end of the exception judging submodule;
The first write-back module mainly arbitrates the write-back of long-cycle instruction results. As shown in
Fig. 4, the result of a long-cycle instruction, after processing by the long-cycle instruction computing module
or the memory access unit, first enters the first write-back module. The long-cycle results received by the first
write-back module may also come from a multiplier-divider, an FPU, an EAI coprocessor and so on. In theory these
long-cycle instructions need not write back strictly in dispatch order: dispatch order only has to be followed
when register conflicts occur, and otherwise they could write back out of order. To keep the hardware simple,
however, this embodiment writes back the results strictly in the order in which the long-cycle instructions were
dispatched. Because different long-cycle instructions take different numbers of cycles, and for some the number
of cycles is dynamic, their order cannot easily be inferred, so the order among them must be recorded in advance.
The instruction trace module provided in this embodiment records the information of long-cycle instructions:
each time the dispatch module dispatches a long-cycle instruction, an entry is allocated in the instruction trace
module to record its information. The FIFO pointer of this entry serves as the instruction tag (ITAG) of the
long-cycle instruction. After being dispatched, the long-cycle instruction carries its ITAG with it until its
result is written back;
The instruction trace module and the first write-back module cooperate to complete the write-back of all
long-cycle instructions. The result of each long-cycle instruction received by the first write-back module
includes the corresponding ITAG. Because the instruction trace module is a FIFO, its read pointer points to the
entry that entered the module earliest. The first write-back module sends the result of the long-cycle
instruction corresponding to that entry to the second write-back module, and at the same time the instruction
trace module deletes that entry. The first write-back module thus determines the write-back order of long-cycle
results from the order in which the read pointer of the instruction trace module advances, ensuring that the
write-back order strictly matches the dispatch order.
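The ITAG mechanism can be sketched as below: results may arrive in any order, but only the result whose ITAG matches the FIFO read pointer is released. The class name `FirstWritebackArbiter` and its methods are hypothetical, and holding early results in a dictionary is a modelling shortcut rather than a description of the actual hardware.

```python
from collections import deque


class FirstWritebackArbiter:
    """Releases long-cycle results strictly in dispatch order, using ITAGs."""

    def __init__(self, depth: int = 2):
        self.depth = depth
        self.fifo = deque()      # ITAGs in dispatch order; left end is the read pointer
        self.next_tag = 0        # write pointer, reused as the ITAG of the new entry
        self.results = {}        # ITAG -> result that has arrived but not been released

    def dispatch(self) -> int:
        """Allocate an entry for a long-cycle instruction and return its ITAG."""
        if len(self.fifo) >= self.depth:
            raise RuntimeError("no free entry: dispatch would have to stall")
        itag = self.next_tag
        self.next_tag = (self.next_tag + 1) % self.depth
        self.fifo.append(itag)
        return itag

    def complete(self, itag: int, value) -> None:
        """A computing unit or the memory access unit returns a result with its ITAG."""
        self.results[itag] = value

    def arbitrate(self):
        """Release (itag, value) for the oldest entry if its result has arrived, else None."""
        if self.fifo and self.fifo[0] in self.results:
            itag = self.fifo.popleft()           # entry is deleted as the result is released
            return itag, self.results.pop(itag)
        return None
```

For example, if two instructions are dispatched and the second finishes first, `arbitrate()` returns `None` until the first one completes, after which the two results are released in dispatch order.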
The second write-back module mainly receives the results of single-cycle instructions sent by the single-cycle
instruction computing module and the results of long-cycle instructions arbitrated by the first write-back
module, and arbitrates the write-back order of all instructions by priority. Because a long-cycle instruction
takes many cycles to execute, it is at an earlier position in the program flow than the single-cycle instruction
currently writing back, so the write-back of a long-cycle instruction has higher priority than that of a
single-cycle instruction. In idle cycles with no long-cycle write-back, single-cycle instructions can write back
freely; that is, a single-cycle instruction at a later position in the program flow can write back before a
long-cycle instruction at an earlier position (provided there is no data dependence). The pipeline structure
provided by the embodiment of the invention therefore also has out-of-order write-back capability.
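The priority rule of the second write-back module can be written as a one-cycle arbitration function. This sketch assumes a single write port, so that at most one result is written per cycle; that assumption and the name `second_writeback` are not taken from the patent.

```python
from typing import Optional, Tuple

Result = Tuple[str, int]   # (destination register name, value), illustrative only


def second_writeback(long_result: Optional[Result],
                     single_result: Optional[Result]
                     ) -> Tuple[Optional[Result], Optional[Result]]:
    """Return (granted, waiting) for one cycle of write-back arbitration."""
    if long_result is not None:
        # A long-cycle result is older in program order, so it wins the port;
        # a single-cycle result arriving in the same cycle has to wait.
        return long_result, single_result
    # Idle cycle for long-cycle write-back: the single-cycle result goes through.
    return single_result, None
```

When no long-cycle result is present, the single-cycle result is granted immediately, which is what allows a later single-cycle instruction to write back ahead of an earlier long-cycle one.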
Compared with existing processor pipeline structures, the processor pipeline structure provided by the invention
improves the internal structure of the units at each pipeline stage. A Partial Decode module, a branch prediction
module and a PC generation module, all built as combinational logic, are arranged in the Fetch unit, so that
fetching the current instruction and generating the next instruction-fetch address are completed within one
clock cycle; this allows instructions to be fetched continuously and improves processor performance. An
instruction trace module is arranged in the execution unit and data dependences are resolved by stalling the
pipeline, without forwarding long-cycle results directly to later instructions waiting to be dispatched, which
reduces the power consumption and area of the processor. The writeback unit is divided into a first write-back
module and a second write-back module, and this two-stage write-back strategy separates the delivery of a
long-cycle instruction from its write-back, so that even a long-cycle instruction taking many cycles does not
block the pipeline and later single-cycle instructions can still be written back and delivered smoothly,
improving processor performance. This solves the problem that existing processor pipeline structures cannot
combine low power consumption, low cost, small area and high performance.
Those skilled in the art will readily understand that the foregoing describes only preferred embodiments of the
present invention and is not intended to limit it; any modification, equivalent replacement or improvement made
within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A processor pipeline structure, comprising an instruction storage unit, a Fetch unit, an execution unit, a
memory access unit and a writeback unit, characterized in that the first end of the Fetch unit is connected to
the instruction storage unit and its second end to the first end of the execution unit; the second end of the
execution unit is connected to the first end of the writeback unit and its third end to the first end of the
memory access unit; the second end of the memory access unit is connected to the second end of the writeback
unit;
The Fetch unit fetches one instruction from the instruction storage unit within one clock cycle; the execution
unit decodes and executes the instruction output by the Fetch unit, and the result of instruction execution is
written back to the register group through the writeback unit;
The writeback unit includes a first write-back module and a second write-back module; the first end of the first
write-back module is connected to the second end of the execution unit, its second end to the second end of the
memory access unit, and its third end to the first end of the second write-back module; the second end of the
second write-back module is connected to the fourth end of the execution unit;
The first write-back module arbitrates the write-back order of the long-cycle instruction execution results
output by the execution unit or the memory access unit, so that the write-back order is consistent with the
dispatch order of the corresponding long-cycle instructions; the second write-back module arbitrates the
write-back order between the single-cycle instruction results output by the execution unit and the long-cycle
instruction results output by the first write-back module, with long-cycle instructions having the higher
priority.
2. The processor pipeline structure according to claim 1, characterized in that the Fetch unit includes a first
program counter, a second program counter, a PC generation module, a Partial Decode module, a branch prediction
module and an instruction register;
The first end of the instruction register is connected to the first end of the instruction storage unit, and its
second end to the first end of the execution unit; the first end of the Partial Decode module is connected to the
second end of the instruction storage unit, its second end to the first end of the branch prediction module, and
its third end to the first end of the PC generation module; the second end of the branch prediction module is
connected to the second end of the PC generation module; the third end of the PC generation module is connected
to the first end of the first program counter, its fourth end to the second end of the first program counter, its
fifth end to the third end of the instruction storage unit, and its sixth end to the third end of the execution
unit; the third end of the first program counter is connected to the first end of the second program counter; the
second end of the second program counter is connected to the third end of the execution unit;
The Partial Decode module decodes the current instruction fetched from the instruction storage unit to determine
whether it is an ordinary instruction or a branch/jump instruction; if it is an ordinary instruction, the Partial
Decode module sends the current instruction directly to the PC generation module, and the PC generation module
generates the address of the next instruction to fetch from the current instruction and the current instruction
address sent by the first program counter;
If it is a branch/jump instruction, the Partial Decode module sends the current instruction to the branch
prediction module; the branch prediction module obtains the jump target address of the current instruction by
static prediction, and the PC generation module generates the address of the next instruction to fetch from the
jump target address obtained by the branch prediction module;
The Partial Decode module, the branch prediction module and the PC generation module are combinational logic
structures, and the decoding of the current instruction, the branch prediction and the generation of the next
instruction-fetch address are completed within the same clock cycle.
3. The processor pipeline structure according to claim 1 or 2, characterized in that the execution unit includes
a decoding module, a dispatch module, an instruction trace module, a single-cycle instruction computing module, a
long-cycle instruction computing module and a delivery module;
The first end of the decoding module is connected to the second end of the instruction register, its second end
to the second end of the second program counter, and its third end to the first end of the dispatch module; the
second end of the dispatch module is connected to the first end of the instruction trace module, its third end to
the first end of the single-cycle instruction computing module, its fourth end to the first end of the long-cycle
instruction computing module, and its fifth end to the first end of the memory access unit; the second end of the
long-cycle instruction computing module is connected to the first end of the first write-back module; the second
end of the instruction trace module is connected to the second end of the first write-back module; the second end
of the single-cycle instruction computing module is connected to the second end of the second write-back module,
its third end to the second end of the memory access unit, and its fourth end to the first end of the delivery
module; the second end of the delivery module is connected to the third end of the second write-back module, and
its third end to the sixth end of the PC generation module;
The instruction trace module stores information about long-cycle instructions that have been dispatched but not
yet written back; when dispatching an instruction, the dispatch module compares the information of the
instruction being dispatched with the information of each long-cycle instruction stored in the instruction trace
module to determine whether the current instruction has a data dependence on a dispatched long-cycle instruction
that has not yet written back; if not, the instruction is dispatched normally; if so, dispatch is paused until
the relevant long-cycle instruction has finished executing and the data dependence is released.
4. The processor pipeline structure according to claim 3, characterized in that the delivery module includes an
exception judging submodule and a branch prediction analysis submodule;
The first end of the branch prediction analysis submodule and the first end of the exception judging submodule
are both connected to the fourth end of the single-cycle instruction computing module, and the second end of the
branch prediction analysis submodule and the second end of the exception judging submodule are both connected to
the sixth end of the PC generation module; the third end of the exception judging submodule is connected to the
third end of the second write-back module;
The branch prediction analysis submodule uses the operation result of the single-cycle instruction computing
module to judge whether the address of the next instruction to fetch generated by the PC generation module is
correct; if so, no action is taken; if not, the wrong address is cleared and a new next instruction-fetch address
is generated and fed back to the PC generation module;
The exception judging submodule uses the operation result of the single-cycle instruction computing module to
judge whether an error occurred while the current instruction was executing; if not, no action is taken; if so,
the current instruction address is cleared and a new address is generated and fed back to the PC generation
module.
5. The processor pipeline structure according to claim 3, characterized in that the instruction trace module
includes multiple entries for storing long-cycle instruction information, each entry storing the information of
one long-cycle instruction, the information including the source operand register indexes and the result
register index.
6. The processor pipeline structure according to claim 5, characterized in that the instruction trace module is
implemented as a FIFO; when writing back multiple long-cycle instructions, the first write-back module arbitrates
the write-back order of the different long-cycle instructions according to the order in which the read pointer
of the FIFO advances; after a long-cycle instruction has been written back, the instruction trace module deletes
the information corresponding to that long-cycle instruction.
7. The processor pipeline structure according to claim 3, characterized in that the single-cycle instruction
computing module is also used to generate the memory access address;
The memory access unit, acting as the control module for memory access, uses the memory access address to decide
whether to fetch the corresponding instruction from the instruction storage unit or the corresponding data from
the data storage unit.
8. The processor pipeline structure according to claim 1, characterized in that the instruction storage unit is
implemented using an instruction tightly coupled memory.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810210100 | 2018-03-14 | ||
CN2018102101005 | 2018-03-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108287730A true CN108287730A (en) | 2018-07-17 |
CN108287730B CN108287730B (en) | 2023-12-29 |
Family
ID=62834455
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810338781.3A Active CN108287730B (en) | 2018-03-14 | 2018-04-16 | Processor pipeline device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108287730B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109144573A (en) * | 2018-08-16 | 2019-01-04 | 胡振波 | Two-level pipeline framework based on RISC-V instruction set |
CN109933372A (en) * | 2019-02-26 | 2019-06-25 | 西安理工大学 | A kind of changeable framework low power processor of multi-mode dynamic |
CN109933368A (en) * | 2019-03-12 | 2019-06-25 | 苏州中晟宏芯信息科技有限公司 | A kind of transmitting of instruction and verification method and device |
CN109948200A (en) * | 2019-02-28 | 2019-06-28 | 西安理工大学 | A kind of low power processor of fine granularity control power supply supply |
CN110045989A (en) * | 2019-03-14 | 2019-07-23 | 西安理工大学 | A kind of switching at runtime formula low power processor |
CN110825437A (en) * | 2018-08-10 | 2020-02-21 | 北京百度网讯科技有限公司 | Method and apparatus for processing data |
CN111399912A (en) * | 2020-03-26 | 2020-07-10 | 超验信息科技(长沙)有限公司 | Instruction scheduling method, system and medium for multi-cycle instruction |
CN112181492A (en) * | 2020-09-23 | 2021-01-05 | 北京奕斯伟计算技术有限公司 | Instruction processing method, instruction processing device and chip |
CN113220347A (en) * | 2021-03-30 | 2021-08-06 | 深圳市创成微电子有限公司 | Instruction processing method based on multistage pipeline, floating point DSP and audio equipment |
CN114721724A (en) * | 2022-03-07 | 2022-07-08 | 电子科技大学 | RISC-V instruction set-based six-stage pipeline processor |
CN116225538A (en) * | 2023-05-06 | 2023-06-06 | 苏州萨沙迈半导体有限公司 | Processor and pipeline structure and instruction execution method thereof |
CN116881194A (en) * | 2023-09-01 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Processor, data processing method and computer equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001014161A (en) * | 1999-06-25 | 2001-01-19 | Matsushita Electric Works Ltd | Programmable controller |
CN103984530A (en) * | 2014-05-15 | 2014-08-13 | 中国航天科技集团公司第九研究院第七七一研究所 | Assembly line structure and method for improving execution efficiency of store command |
CN105426160A (en) * | 2015-11-10 | 2016-03-23 | 北京时代民芯科技有限公司 | Instruction classified multi-emitting method based on SPRAC V8 instruction set |
CN208580395U (en) * | 2018-03-14 | 2019-03-05 | 武汉市聚芯微电子有限责任公司 | A kind of processor pipeline structure |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825437B (en) * | 2018-08-10 | 2022-04-29 | 昆仑芯(北京)科技有限公司 | Method and apparatus for processing data |
CN110825437A (en) * | 2018-08-10 | 2020-02-21 | 北京百度网讯科技有限公司 | Method and apparatus for processing data |
CN109144573A (en) * | 2018-08-16 | 2019-01-04 | 胡振波 | Two-level pipeline framework based on RISC-V instruction set |
CN109933372B (en) * | 2019-02-26 | 2022-12-09 | 西安理工大学 | Multi-mode dynamic switchable architecture low-power-consumption processor |
CN109933372A (en) * | 2019-02-26 | 2019-06-25 | 西安理工大学 | A kind of changeable framework low power processor of multi-mode dynamic |
CN109948200B (en) * | 2019-02-28 | 2022-09-27 | 西安理工大学 | Low-power-consumption processor for fine-grained control of power supply |
CN109948200A (en) * | 2019-02-28 | 2019-06-28 | 西安理工大学 | A kind of low power processor of fine granularity control power supply supply |
CN109933368A (en) * | 2019-03-12 | 2019-06-25 | 苏州中晟宏芯信息科技有限公司 | A kind of transmitting of instruction and verification method and device |
CN109933368B (en) * | 2019-03-12 | 2023-07-11 | 北京市合芯数字科技有限公司 | Method and device for transmitting and verifying instruction |
CN110045989A (en) * | 2019-03-14 | 2019-07-23 | 西安理工大学 | A kind of switching at runtime formula low power processor |
CN110045989B (en) * | 2019-03-14 | 2023-11-14 | 合肥雷芯智能科技有限公司 | Dynamic switching type low-power-consumption processor |
CN111399912B (en) * | 2020-03-26 | 2022-11-22 | 超睿科技(长沙)有限公司 | Instruction scheduling method, system and medium for multi-cycle instruction |
CN111399912A (en) * | 2020-03-26 | 2020-07-10 | 超验信息科技(长沙)有限公司 | Instruction scheduling method, system and medium for multi-cycle instruction |
WO2022062230A1 (en) * | 2020-09-23 | 2022-03-31 | 北京磐易科技有限公司 | Instruction processing method, instruction processing apparatus, and chip |
CN112181492A (en) * | 2020-09-23 | 2021-01-05 | 北京奕斯伟计算技术有限公司 | Instruction processing method, instruction processing device and chip |
CN113220347A (en) * | 2021-03-30 | 2021-08-06 | 深圳市创成微电子有限公司 | Instruction processing method based on multistage pipeline, floating point DSP and audio equipment |
CN113220347B (en) * | 2021-03-30 | 2024-03-22 | 深圳市创成微电子有限公司 | Instruction processing method based on multistage pipeline, floating point type DSP and audio equipment |
CN114721724A (en) * | 2022-03-07 | 2022-07-08 | 电子科技大学 | RISC-V instruction set-based six-stage pipeline processor |
CN116225538A (en) * | 2023-05-06 | 2023-06-06 | 苏州萨沙迈半导体有限公司 | Processor and pipeline structure and instruction execution method thereof |
CN116881194A (en) * | 2023-09-01 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Processor, data processing method and computer equipment |
CN116881194B (en) * | 2023-09-01 | 2023-12-22 | 腾讯科技(深圳)有限公司 | Processor, data processing method and computer equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108287730A (en) | | A kind of processor pipeline structure |
US10684860B2 (en) | | High performance processor system and method based on general purpose units |
CN101373427B (en) | | Program execution control device |
US6381692B1 (en) | | Pipelined asynchronous processing |
US6240510B1 (en) | | System for processing a cluster of instructions where the instructions are issued to the execution units having a priority order according to a template associated with the cluster of instructions |
US7487340B2 (en) | | Local and global branch prediction information storage |
US9569214B2 (en) | | Execution pipeline data forwarding |
CN105426160A (en) | | Instruction classified multi-emitting method based on SPARC V8 instruction set |
CN101201734B (en) | | Method and device for predecoding executive instruction |
US20070288733A1 (en) | | Early Conditional Branch Resolution |
JP2005182825A5 (en) | | |
US9372698B2 (en) | | Method and apparatus for implementing dynamic portbinding within a reservation station |
US20220113966A1 (en) | | Variable latency instructions |
US20070288732A1 (en) | | Hybrid Branch Prediction Scheme |
US7725659B2 (en) | | Alignment of cache fetch return data relative to a thread |
US20070288731A1 (en) | | Dual Path Issue for Conditional Branch Instructions |
CN208580395U (en) | | A kind of processor pipeline structure |
JP2005309762A (en) | | Thread switching controller |
US20140129805A1 (en) | | Execution pipeline power reduction |
US20070288734A1 (en) | | Double-Width Instruction Queue for Instruction Execution |
CN116048627B (en) | | Instruction buffering method, apparatus, processor, electronic device and readable storage medium |
US20120144393A1 (en) | | Multi-issue unified integer scheduler |
US20080141252A1 (en) | | Cascaded Delayed Execution Pipeline |
CN116302106A (en) | | Apparatus, method, and system for facilitating improved bandwidth of branch prediction units |
EP3757772A1 (en) | | System, apparatus and method for a hybrid reservation station for a processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||