CN106610816B - Method and system for avoiding conflicts between instruction sets in a RISC-CPU - Google Patents
- Publication number
- CN106610816B (application CN201611246947.6A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- conflict
- relevant
- cpu
- risc
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/30134—Register stacks; shift registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/30141—Implementation provisions of register files, e.g. ports
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3814—Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3816—Instruction alignment, e.g. cache line crossing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
Abstract
The invention discloses a method and system for avoiding conflicts between instruction sets in a RISC-CPU. The method includes the following steps. Step 1: determine the conflict type according to the data dependences between the different instruction sets in the RISC-CPU. Step 2: for the current instruction, judge whether it needs to access the "register file" or "memory"; if so, go to Step 3; otherwise, continue with the next instruction. Step 3: for the current instruction, define a "coherence window", search the window for "relevant instructions", and judge whether any exist; if so, go to Step 4; otherwise, conclude that there is no read/write conflict. Step 4: according to the conflict type between the instruction sets, select a conflict-resolution method, and apply the chosen strategy to resolve the data conflict while preserving pipeline throughput.
Description
Technical field
The present invention relates to the field of computer processing technology, and in particular to a method and system for avoiding conflicts between instruction sets in a RISC-CPU.
Background art
In the design of an embedded RISC-CPU, every step, from selecting the CPU hardware instruction set and custom-designing instructions to designing the corresponding compiler, is extremely important.
The related art is as follows. The invention patent application "Reducing instruction conflicts in a processor" (application No. CN200810191060, applicant "generation meaning method (Beijing) semiconductor research and development Co., Ltd") resolves instruction conflicts at the instruction issue stage: two kinds of instructions are selected and sent to the subsequent parallel functional units, which determine which instruction is conflict-free, and one of them is then chosen by arbitration. This method is used to select one conflict-free instruction from a multi-issue instruction window.
The Intel invention patent application "Technique for performing memory disambiguation" (application No. CN200710087737) gives a solution for conflict avoidance between adjacent memory access instructions, but does not address conflict avoidance inside or between the other instruction sets within the coherence window defined between relevant instructions.
The Xidian University invention patent application "An assembler design method based on a very long instruction word (VLIW) application-specific instruction set processor" (application No. CN201310054280) performs instruction scheduling in the assembler and uses methods such as register renaming to remove the write-after-write (WAW) and write-after-read (WAR) conflicts caused by out-of-order execution.
In addition to not addressing conflict avoidance inside or between the other instruction sets within the coherence window, the Intel application No. CN200710087737 ("Technique for performing memory disambiguation") targets a different number of pipeline stages from the embedded RISC-CPU of the present patent, and does not mention the CPU hardware engineering design strategy proposed by this design, which proceeds from a prototype CPU based on an instruction subset to the full instruction set.
It can be seen that the prior art neither provides a method for resolving RAW dependence conflicts from the perspective of the instruction system as a whole, nor proposes a CPU hardware design engineering strategy that proceeds from a CPU prototype with a locally minimal instruction set to the full instruction set.
Summary of the invention
To remedy the shortcomings of the prior art, the invention discloses a method and system for avoiding conflicts between instruction sets in a RISC-CPU. It provides a mechanism for resolving conflicts between instruction sets, and an intelligent control hardware device with which a RISC-CPU resolves read/write conflicts.
To achieve the above object, the specific scheme of the invention is as follows:
A method for avoiding conflicts between instruction sets in a RISC-CPU includes the following steps:
Step 1: determine the conflict type according to the data dependences between the different instruction sets in the RISC-CPU;
Step 2: for the current instruction, judge whether it needs to access the "register file" or "memory"; if so, go to Step 3; otherwise, continue with the next instruction;
Step 3: for the current instruction, define a "coherence window", search the window for "relevant instructions", and judge whether any exist; if so, go to Step 4; otherwise, conclude that there is no read/write conflict;
Step 4: according to the conflict type between the instruction sets, select a conflict-resolution method, and apply the chosen strategy to resolve the data conflict while preserving pipeline throughput.
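Steps 2 and 3 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented hardware: the `Instr` record, the `ACCESSES_STATE` set, the fixed window of four instructions and the register names are all hypothetical, and only register read-after-write dependences are tracked.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical instruction record; the field names are illustrative only.
@dataclass(frozen=True)
class Instr:
    op: str                      # mnemonic, e.g. "ADD", "XOR", "LDR", "STR", "JMP"
    dest: Optional[str] = None   # register written, if any
    srcs: tuple = ()             # registers read

# Assumed set of operations that touch the register file or memory.
ACCESSES_STATE = {"ADD", "XOR", "LDR", "STR"}

def analyse(stream, window=4):
    """Return (producer_index, consumer_index, register) for each RAW pair found."""
    conflicts = []
    for i, cur in enumerate(stream):
        # Step 2: skip instructions that do not access the register file or
        # memory; instructions without a register destination are not tracked.
        if cur.op not in ACCESSES_STATE or cur.dest is None:
            continue
        # Step 3: scan the coherence window that follows the current instruction.
        for j in range(i + 1, min(i + 1 + window, len(stream))):
            later = stream[j]
            if cur.dest in later.srcs:
                conflicts.append((i, j, cur.dest))
            if later.dest == cur.dest:
                break  # register overwritten; later reads no longer depend on cur
    return conflicts
```

For the sequence ADD, XOR, JMP, STR in which ADD writes R1 and both XOR and STR read R1, the sketch reports two RAW pairs, both with ADD as the producer.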
Further, in Step 1, the architecture, instruction set and pipeline stages of the RISC-CPU are statistically analysed and classified, and all instructions are divided into a data processing instruction set, a memory access instruction set, a branch instruction set and an immediate instruction set.
Further, in Step 1, after the four instruction sets are obtained, the conflicts between them are statistically analysed: the read/write conflict types between the instruction sets are traversed and classified, yielding the conflicts that instruction data dependences can introduce. First, a later "dependent instruction" reads stale register data, or overwrites newly written data that has not yet been read; that is, the data dependences of the original instruction sequence are violated, causing functional errors. Second, the original instruction pipeline is interrupted, which includes the drop in instruction throughput caused by freezing the pipeline or inserting NOP instructions into it.
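The read/write conflict types named in this document (RAW, WAR, WAW) can be told apart by comparing the read and write sets of two instructions. The following Python sketch shows the standard comparison; the dictionary layout and register names are assumptions made for illustration and are not taken from the patent.

```python
def hazards(first, second):
    """Classify the conflicts between two instructions in program order.

    Each instruction is a dict with 'reads' and 'writes' sets of register
    names; `first` comes earlier in the instruction stream than `second`.
    """
    found = set()
    if first["writes"] & second["reads"]:
        found.add("RAW")   # second reads what first writes
    if first["reads"] & second["writes"]:
        found.add("WAR")   # second overwrites what first still has to read
    if first["writes"] & second["writes"]:
        found.add("WAW")   # both write the same location
    return found
```

Swapping the order of the two instructions turns a RAW pair into a WAR pair, which is why the classification has to follow program order.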
Further, the conflict types are classified as conflicts inside each of the data processing, memory access and immediate instruction sets, and conflicts between them.
Further, in Step 3, the "coherence window" is defined as follows. Superscalar instruction processing is assumed, and out-of-order execution is not considered. If the current instruction needs to write to a register, that is, if it operates on a specific register in the register file or on a location in memory, then its window is the instruction sequence reached by moving from the current instruction toward later instructions in the instruction stream, covering the instructions involved from the moment the current instruction is fetched until its result is committed.
Further, in Step 3, a "relevant instruction" is defined as follows. Based on the "coherence window" defined above, and without considering out-of-order execution: if the "current instruction" needs to write to some register or to some offset address in memory, then within the "coherence window" the search proceeds from the current instruction toward later instructions for one or more instructions that also need to write or read the location written by the "current instruction". These instructions have a data dependence on the current instruction, so a data read/write conflict may exist, which interrupts the pipeline and reduces instruction throughput.
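The two definitions can be made concrete with a small Python sketch. The sizing rule is an assumption introduced here, not a formula from the patent: with in-order issue, an instruction that occupies `stages_occupied` pipeline stages is followed into the pipeline by `(stages_occupied - 1) * issue_width` later instructions before it commits. The tuple layout and register names are likewise hypothetical.

```python
def coherence_window(stream, i, stages_occupied, issue_width=1):
    """Instructions that enter the pipeline between fetch and commit of stream[i]."""
    size = (stages_occupied - 1) * issue_width   # assumed sizing rule
    return stream[i + 1 : i + 1 + size]

def relevant_instructions(stream, i, dest, stages_occupied, issue_width=1):
    """Names of window members that read or write the register written by stream[i]."""
    window = coherence_window(stream, i, stages_occupied, issue_width)
    return [name for name, regs in window if dest in regs]

# Each entry is (mnemonic, set of registers the instruction touches).
program = [
    ("ADD", {"R1", "R2", "R3"}),   # current instruction, writes R1
    ("XOR", {"R4", "R1", "R5"}),
    ("JMP", set()),
    ("STR", {"R1", "R6"}),
    ("LDR", {"R7", "R1"}),         # falls outside the window of a 4-stage ADD
]
```

With a four-stage ADD as the current instruction, the window holds XOR, JMP and STR, and the relevant instructions are XOR and STR, which matches the ADD example the document gives for Fig. 2.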
Further, within the "coherence window", conflicts are resolved as follows. (1) The logic operation result available before the commit stage of the current instruction is fed directly into the arithmetic unit of the following relevant instruction, instead of waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction. (2) A delayed version of the operation result of the relevant instruction that follows the current instruction is created, and the output of the delayed version serves as a backup register resource for resolving the read/write conflict.
Further, in Step 4, the conflict-resolution methods specifically include:
(1) Early feedforward of intermediate operation data: before the final operation result is committed, the intermediate data is fed directly to the input of the next instruction, without waiting for the final commit stage;
(2) Feedforward of a delayed version of the data before the final operation result is committed;
(3) Coexistence of a pipeline-delayed version of the intermediate result with the original version: in the pipeline stage of the next relevant instruction, an additional delayed-version register is added, delayed by one or several beats, while the register version in the original normal pipeline stage is retained;
(4) No pipeline-freezing strategy is used;
(5) The method of inserting NOP instructions is not used.
Further, specifically:
RAW conflict resolution for ADD and XOR instructions: the operation result of the ADD instruction is fed forward to XOR as an input ahead of time, so XOR need not wait for the ADD instruction to commit before executing;
RAW conflict resolution for STORE and LOAD instructions: the result of STORE serves as an input to the address calculation of LOAD, through a delayed register added before the LOAD instruction;
RAW conflict resolution for ADD and STORE instructions: the operation result of the ADD instruction is fed directly to the next stage as the offset address used to compute the addressing address of the STORE instruction;
RAW conflict resolution for LOAD and ADD instructions: the result of the LOAD instruction is used directly as an input of ADD, and a version of the intermediate result of ADD delayed by one beat serves as the offset address for the DMEM access.
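The ADD and XOR case can be checked with a purely functional Python sketch. It models only which value of R1 the XOR observes, not pipeline timing; the register names and the instruction pair `ADD R1, R2, R3` followed by `XOR R4, R1, R5` are assumptions chosen for illustration.

```python
def run_pair(regs, forward):
    """Execute ADD R1, R2, R3 followed by XOR R4, R1, R5 on a register dict.

    With forward=True the XOR takes the ADD result straight from the ADD's
    execute stage; with forward=False it reads the register file before the
    ADD has committed and therefore sees the stale R1.
    """
    add_result = regs["R2"] + regs["R3"]            # available before commit
    r1_seen_by_xor = add_result if forward else regs["R1"]
    xor_result = r1_seen_by_xor ^ regs["R5"]
    out = dict(regs)
    out["R1"] = add_result                          # ADD commits
    out["R4"] = xor_result                          # XOR commits
    return out
```

Starting from R1 = 0, R2 = 3, R3 = 4 and R5 = 1, the forwarded run leaves 7 XOR 1 = 6 in R4, while the run without forwarding leaves 0 XOR 1 = 1, which is the functional error that a RAW conflict causes.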
Further, in Step 1, after all read/write conflicts that may exist have been classified, they are stored in a local archive table: the conflict retrieval table, which contains an index and a conflict type. Using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and the "relevant instructions" are searched for in advance. With the "coherence window" and "relevant instruction" so defined, the instruction stream is searched for read/write conflicts; if a dependence conflict between instructions exists, the conflict type stored locally in advance for that pair of instruction sets is looked up, and the conflict is avoided quickly and ahead of time.
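A conflict retrieval table of this kind might look like the following Python sketch. The document specifies only that the table holds an index and a conflict type; the producer and consumer columns, the strategy labels and the `lookup` helper are assumptions added here, with the four rows taken from the four RAW cases worked through in this document.

```python
# (index, producer, consumer, conflict type, strategy label)
CONFLICT_TABLE = [
    (0, "ADD", "XOR", "RAW", "forward"),
    (1, "STR", "LDR", "RAW", "delay"),
    (2, "ADD", "STR", "RAW", "forward"),
    (3, "LDR", "ADD", "RAW", "forward+delay"),
]

# Built once, so each lookup is a single dictionary access.
_BY_PAIR = {(p, c): (idx, kind, strat) for idx, p, c, kind, strat in CONFLICT_TABLE}

def lookup(producer, consumer):
    """Return (index, conflict type, strategy) for a pair, or None if unlisted."""
    return _BY_PAIR.get((producer, consumer))
```

Precomputing the table is what lets the conflict be recognised ahead of time: the search inside the window only has to produce a producer and consumer pair, and the strategy follows from one lookup.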
A system for avoiding conflicts between instruction sets in a RISC-CPU includes a conflict-resolution control unit applied in the RISC-CPU. The conflict-resolution control unit is configured as follows. After all read/write conflicts that may exist have been classified, they are stored in a local archive table: the conflict retrieval table, which contains an index and a conflict type. Using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and the "relevant instructions" are searched for in advance. With the "coherence window" and "relevant instruction" so defined, the instruction stream is searched for read/write conflicts; if a dependence conflict between instructions exists, the conflict type stored locally in advance for that pair of instruction sets is looked up, and the conflict is avoided quickly and ahead of time.
Beneficial effects of the present invention:
For the instruction set of the current RISC-CPU, the invention statistically analyses and classifies in advance the conflict types that exist between instructions and between instruction sets; the result of the search for conflicting "relevant instructions" then controls the "conflict search and resolution" unit, so that the pipeline runs smoothly inside the coherence window.
The invention provides an effective strategy for resolving conflicts between instruction sets in RISC-CPU design. It defines the instruction "coherence window" and "relevant instruction" in order to resolve read/write, write/read and write/write conflicts among the "relevant instructions" inside the "coherence window". Taking the resolution of RAW conflicts as an example, this document analyses the conflicts between several major instruction sets and gives a quantitative definition of, and a resolution for, the dependence conflicts between instructions. By adding a hardware unit that locates and resolves read/write conflicts, the instruction conflicts between the different instruction sets of the RISC-CPU can be effectively avoided while pipeline throughput is improved.
Description of the drawings
Fig. 1 shows the classification of the instruction sets involved in the RISC-CPU;
Fig. 2 is a schematic of the "coherence window" and "relevant instructions";
Fig. 3 shows the background and workflow proposed by the present invention;
Fig. 4 shows the added hardware unit for conflict location and resolution;
Fig. 5 shows RAW conflict resolution for ADD and XOR instructions;
Fig. 6 shows RAW conflict resolution for STORE and LOAD instructions;
Fig. 7 shows RAW conflict resolution for ADD and STORE instructions;
Fig. 8 shows RAW conflict resolution for LOAD and ADD instructions.
Detailed description of the embodiments:
The present invention is described in detail below with reference to the accompanying drawings:
As shown in Figure 1, the present invention proposes an effective method for resolving CPU instruction conflicts. In the design, all instructions are divided into four major classes (data processing instructions, branch instructions, immediate instructions and memory access instructions). A representative instruction subset is extracted from each class, and the RAW (read-after-write) direct data dependence conflicts inside each subset and between subsets are resolved. All potential RAW conflicts are analysed statistically in advance; the conflicts are then located and resolved so that the pipeline runs smoothly.
The RISC-CPU design proposed by the present invention divides all instructions into four instruction classes: data processing (Data-Processing), memory access (STORE/LOAD), branch (Branch) and immediate (Immediate). Within each instruction class and between instruction classes (except Branch), RAW (read-after-write) conflict resolution is used as the example. There are several kinds of RAW conflict (inside the data processing, memory access and immediate instruction sets, and between them), and the dependence conflicts caused by data dependences need to be resolved.
Definition of the "coherence window": superscalar instruction processing is assumed, and out-of-order execution is not considered. If the current instruction needs to write to a register, that is, if it operates on a specific register in the register file or on a location in memory, then its window is the instruction sequence reached by moving from the current instruction toward later instructions in the instruction stream, covering the instructions involved from the moment the current instruction is fetched until its result is committed.
Definition of a "relevant instruction": based on the "coherence window" defined above, and without considering out-of-order execution, if the "current instruction" needs to write to some register or to some offset address in memory, then within the "coherence window" the search proceeds from the current instruction toward later instructions for one or more instructions that also need to write or read the location written by the "current instruction". These instructions have a data dependence on the current instruction, so a data read/write conflict may exist, which interrupts the pipeline and reduces instruction throughput (CPI).
A "coherence window" and "relevant instructions" are given for each case. Fig. 2 shows an example of coherence windows, for instance those of the ADD (heavy solid line) and XOR (fine dotted line) instructions. Fig. 2 also shows an example of relevant instructions: the relevant instructions of the ADD instruction are the XOR and STORE instructions. Conflicts between instructions of the same class are handled first. For the data processing instruction subset (the ADD and XOR instructions, shown in Tables 1 and 2), RAW conflicts are resolved as shown in Figure 3. For the memory access instructions (LOAD/STORE), see Tables 3, 4, 5 and 6:
Table 1: Instruction format of the data processing (Data-Processing) instruction set
Table 2: Representatives of the data processing instruction set (ADD addition instruction, XOR exclusive-OR instruction)
Table 3: Memory-to-register (LOAD) instruction format
Table 4: Representative of the LOAD instruction class subset
LDR | RD←[RS] | LDR RD,[RS] |
Table 5: Register-to-memory (STORE) instruction format
Table 6: Representative of the STORE instruction class subset
STR | RD→[RS] | STR RD,[RS] |
RAW conflicts are resolved in each case, as shown in Figure 5.
Between instruction sets of different classes, RAW conflicts are resolved as shown in Figures 6, 7 and 8.
For the remaining instruction sets (immediate instructions and branch instructions), the RAW conflicts between them are not analysed further.
Within the "coherence window", conflicts are resolved as follows. (1) The logic operation result available before the commit stage of the current instruction is fed directly into the arithmetic unit of the following relevant instruction, instead of waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction. (2) A delayed version of the operation result of the relevant instruction that follows the current instruction is created, and the output of the delayed version serves as a backup register resource for resolving the read/write conflict.
This design analyses only the RAW conflicts between data processing and memory access instructions. Conflicts between the other instruction sets, and the avoidance of the remaining WAR/WAW conflicts, are not illustrated with specific examples.
Instruction conflict resolution according to the invention is explained with reference to Figs. 3 and 4. In RISC-CPU hardware design, read/write data dependences may exist between adjacent instructions: accesses to the same register in the register file, or to the same offset address in memory, conflict in their read/write order (RAW/WAR/WAW), forcing the pipeline to stall and reducing efficiency. In RISC-CPU design, the conflicts between instruction sets can therefore interrupt the pipeline and reduce instruction throughput.
The classic MIPS pipeline has five stages: stage 1, instruction fetch (fetch); stage 2, instruction decode (decode); stage 3, execute (execute); stage 4, memory access (memory access to internal memory, covering the two major classes load and store); stage 5, write-back (Write Back: the operation result is written back to the register file, register-bank, in the RISC-CPU). In the classic MIPS architecture, the longest instructions occupy all five pipeline stages (for example a memory access STORE instruction, followed by write-back to the register file). Common data operation instructions that do not access memory occupy four stages (for example the ADD addition instruction, which writes the final sum into a register of the register file and has no memory access stage). Some instructions finish in the decode stage (for example the jump instruction JMP). Conditional branch instructions need to change the order of the instruction stream, so instructions execute non-sequentially.
It can be seen that in RISC-CPU design, because different instruction sets occupy different numbers of pipeline stages and the ordering of instructions is highly random, reads and writes to the same register in the register file or the same memory offset address are very likely to produce data dependence conflicts.
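The cost of such a conflict in the five-stage pipeline can be estimated with a short worked calculation, written here as a Python sketch. The model is an assumption and not part of the patent: one instruction is fetched per cycle, stages are numbered 1 to 5 as above, and a value produced at the end of a stage cannot be consumed in that same cycle.

```python
def stall_cycles(distance, ready_stage, need_stage):
    """Stall cycles a dependent instruction needs under the assumed model.

    distance    -- how many instructions later the consumer is fetched (>= 1)
    ready_stage -- stage at whose end the producer's value becomes available
    need_stage  -- stage at whose start the consumer needs the value

    A producer fetched in cycle c is in stage s during cycle c + s - 1, and a
    consumer fetched `distance` cycles later reaches need_stage in cycle
    c + distance + need_stage - 1 + stalls. That cycle must be strictly later
    than c + ready_stage - 1, which gives the bound below.
    """
    return max(0, ready_stage - need_stage - distance + 1)

# No forwarding: value ready after write-back (5), needed at decode (2).
no_forwarding = stall_cycles(distance=1, ready_stage=5, need_stage=2)
# Forwarding: value ready after execute (3), needed at execute (3).
with_forwarding = stall_cycles(distance=1, ready_stage=3, need_stage=3)
```

Under these assumptions an adjacent dependent instruction waits three cycles without forwarding and none with it, which is the throughput loss that freezing the pipeline or inserting NOP instructions would incur.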
By analysing the conflicts that register data read/write dependences can cause between different instruction sets in the classic MIPS architecture, the present invention defines a "coherence window" for the current instruction and searches the window for the "relevant instructions" of the current instruction. Using the data dependences between the different instruction sets in the RISC-CPU, analysed in advance, it determines the conflict type and avoids the loss of pipeline efficiency. The conflicts are: (1) a later "dependent instruction" reads stale register data, or overwrites newly written data that has not yet been read, that is, the data dependences of the original instruction sequence are violated, causing functional errors; (2) the original instruction pipeline is interrupted (including the drop in instruction throughput caused by freezing the pipeline or inserting NOP instructions). Once the pipeline is simply frozen or delayed as a whole, the advantage of pipelining is weakened and instruction throughput falls.
In the examples analysed in the present invention, the following strategies resolve data conflicts while preserving pipeline throughput: (1) early feedforward of intermediate operation data before the final operation result is committed; (2) feedforward of a delayed version of the data before the final operation result is committed; (3) coexistence of a pipeline-delayed version of the intermediate result with the original version; (4) no pipeline-freezing strategy is used; (5) the method of inserting NOP instructions is not used. For example, in method (1), the intermediate operation data is fed forward early (that is, without waiting for the final commit stage), directly to the input of the next instruction. In method (3), an additional delayed-version register (delayed by one or several beats) is added in the pipeline stage of the next relevant instruction, while the register version in the original normal pipeline stage is retained.
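Method (3) can be pictured as a pipeline register that keeps its normal output and, alongside it, copies delayed by one or more beats. The Python class below is a behavioural sketch only; the class name, the `depth` parameter and the method names are hypothetical, and real hardware would implement this as a short chain of registers.

```python
from collections import deque

class DelayedVersionRegister:
    """Pipeline register that also exposes copies delayed by 1..depth beats."""

    def __init__(self, depth=2):
        # Index 0 holds the normal version; index k holds the value k beats old.
        self.history = deque([None] * (depth + 1), maxlen=depth + 1)

    def clock(self, value):
        """Latch a new value; the oldest delayed copy drops off the end."""
        self.history.appendleft(value)

    @property
    def normal(self):
        """Output of the original pipeline register, unchanged by the delays."""
        return self.history[0]

    def delayed(self, beats):
        """Output of the backup copy delayed by `beats` clock edges."""
        return self.history[beats]
```

Because `normal` always returns the most recently latched value, adding the delayed copies does not alter what the original pipeline stage passes on, which is the property the text requires of the delayed version.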
For the instruction set of the current RISC-CPU, the conflict types that exist between instructions and between instruction sets are statistically analysed and classified in advance; the result of the search for conflicting "relevant instructions" then controls the "conflict search and resolution" unit, so that the pipeline runs smoothly inside the coherence window. Figs. 5 and 7 use the mechanism of feeding intermediate results forward early; Figs. 6 and 8 use the delayed version of the computed result. The existing approach is the NOP mechanism, which pushes the pipeline back; this patent does not use that method, nor does it freeze the pipeline. The delayed version in this patent retains the original pipeline register and, based on the classification of relevant-instruction conflicts known in advance, keeps a backup delayed version. This resolves the local instruction conflicts inside the "coherence window" without disturbing the data transfer between the original, normal-version pipeline stages. This patent makes full use of several methods, such as data feedforward and delayed versions, to keep the pipeline smooth inside the coherence window. It manages instruction conflict resolution systematically, from statistical classification through location to resolution. The conflict-resolution framework of this patent can also absorb new conflict-resolution techniques, making the whole instruction conflict resolution system more complete.
Explanation of Fig. 1: Fig. 1 shows conflict resolution between several instruction sets (data processing, immediate, branch, memory access).
Explanation of Fig. 2: Fig. 2 gives the "coherence windows" of two instructions, the current ADD instruction and the current XOR instruction. For the ADD instruction it also gives the "relevant instructions" with which a data read/write conflict exists: XOR and STORE are both relevant instructions of the ADD instruction.
Explanation of Fig. 3: Fig. 3 gives the overall flow of the invention's method for resolving instruction conflicts based on the "coherence window" and "relevant instructions". The method can be added to a RISC-CPU hard core as an intelligent module that locates and resolves instruction read/write conflicts. Of course, development of the RISC-CPU hard core can only proceed once the architecture, number of pipeline stages, instruction set types, instruction formats and so on of the RISC-CPU being designed have been determined. Conventional methods do not locate and resolve conflicts quantitatively from a system perspective.
The patent first performs an advance statistical analysis of the read/write conflicts between instructions of different kinds or of the same kind and classifies the conflicts strictly. Once all read/write conflicts that may exist have been classified, they are stored in a local table, the conflict retrieval table (index, conflict type). Using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and the "relevant instructions" are searched for in advance. With the "coherence window" and "relevant instruction" strictly defined, the instruction stream is searched for read/write conflicts. If a dependence conflict between instructions exists, the conflict type stored in advance in the local table is looked up and the conflict is avoided quickly and ahead of time (by means of an added conflict resolution control unit). Tables 1, 2 and 3 of the invention list several instructions: Table 1 gives the data format and instruction mnemonics of the data processing instructions, and Tables 2 and 3 give the data format and instruction mnemonics of the memory access LOAD/STORE instructions.
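The conflict retrieval table described above can be pictured as a small lookup structure. The following Python sketch is illustrative only: the names `InstrSet`, `ConflictEntry`, `CONFLICT_TABLE` and `lookup_conflict` are hypothetical, and the table contents are assumptions modelled on the ADD/XOR, STORE/LOAD, ADD/STORE and LOAD/ADD cases mentioned in this document, not the patent's actual table.

```python
from enum import Enum
from typing import NamedTuple, Optional


class InstrSet(Enum):
    """The four instruction sets named in the text."""
    DATA_PROCESSING = "data processing"
    MEMORY_ACCESS = "memory access"
    BRANCH = "branch"
    IMMEDIATE = "immediate"


class ConflictEntry(NamedTuple):
    index: int
    conflict_type: str
    strategy: str


# Hypothetical table contents, keyed by (earlier instruction set, later instruction set).
CONFLICT_TABLE = {
    (InstrSet.DATA_PROCESSING, InstrSet.DATA_PROCESSING):
        ConflictEntry(0, "RAW", "early feed-forward"),
    (InstrSet.MEMORY_ACCESS, InstrSet.MEMORY_ACCESS):
        ConflictEntry(1, "RAW", "delayed version"),
    (InstrSet.DATA_PROCESSING, InstrSet.MEMORY_ACCESS):
        ConflictEntry(2, "RAW", "early feed-forward"),
    (InstrSet.MEMORY_ACCESS, InstrSet.DATA_PROCESSING):
        ConflictEntry(3, "RAW", "feed-forward plus delayed version"),
}


def lookup_conflict(earlier: InstrSet, later: InstrSet) -> Optional[ConflictEntry]:
    """Return the stored entry for this pair, or None if no conflict is recorded."""
    return CONFLICT_TABLE.get((earlier, later))
```

Because the classification is done ahead of time, resolving a detected dependence at run time reduces to a single table lookup on the pair of instruction sets involved.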
Figs. 5 and 7 show early feed-forward before the result of the preceding instruction is committed, and Figs. 6 and 8 show delayed-version output of the intermediate calculation results of subsequent relevant instructions. Both methods serve to prevent the pipeline from being interrupted. Outputting a relevant instruction's result with different delay beats requires some additional registers; these registers hold delayed versions of the calculation result and do not disturb the results of the original pipeline. Other conflict resolution techniques can be added within the technical framework of the invention.
If the current instruction does not access the register file or memory, the next instruction is analyzed and the "coherence window" slides down. If the "coherence window" contains no "relevant instruction", there is no read/write conflict and the analysis ends.
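The window scan described in this paragraph can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `Instr` record, the function names and the fixed `window` length are hypothetical (the patent defines the window by the span from fetch to commit, not by a fixed count), and only writes by the current instruction are tracked.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dest: Optional[str]          # location written, or None if nothing is written
    srcs: Tuple[str, ...] = ()   # locations read


def find_relevant(stream: List[Instr], i: int, window: int) -> List[int]:
    """Indices of instructions inside the window that read or write stream[i].dest."""
    cur = stream[i]
    if cur.dest is None:
        return []  # nothing written: move on to the next instruction
    hits = []
    for j in range(i + 1, min(i + 1 + window, len(stream))):
        later = stream[j]
        if cur.dest in later.srcs or later.dest == cur.dest:
            hits.append(j)
    return hits


def scan(stream: List[Instr], window: int) -> Dict[int, List[int]]:
    """Slide the window down the stream and collect every dependence found."""
    found = {}
    for i in range(len(stream)):
        hits = find_relevant(stream, i, window)
        if hits:
            found[i] = hits
    return found


# Example mirroring Fig. 2: XOR and STORE both depend on the ADD result in r1.
program = [
    Instr("ADD", "r1", ("r2", "r3")),
    Instr("XOR", "r4", ("r1", "r5")),
    Instr("STORE", None, ("r1", "r6")),
]
```

With a window of two instructions, `scan(program, 2)` reports instructions 1 and 2 as relevant to instruction 0 and nothing for the others.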
Explanation of Fig. 5: The operation result of the ADD instruction is fed to XOR as an input in advance, so XOR does not have to wait until the ADD instruction commits before executing, which keeps the pipeline flowing smoothly.
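The effect of the early feed-forward in Fig. 5 can be illustrated with a value-level model. The sketch below is an assumption-laden toy, not the patent's hardware: the register names, the initial values and the function `xor_after_add` are hypothetical, and the pipeline is reduced to "ADD has executed but not yet committed when XOR executes".

```python
def xor_after_add(forwarding: bool) -> int:
    """Model ADD r1, r2, r3 followed immediately by XOR r4, r1, r5; return r4."""
    regs = {"r1": 0, "r2": 5, "r3": 7, "r4": 0, "r5": 0b1010}

    # ADD has produced its result in the execute stage but has not committed it.
    add_result = regs["r2"] + regs["r3"]

    # XOR executes now. With forwarding it takes the uncommitted ADD result;
    # without forwarding (and without a stall) it reads the stale r1.
    r1_operand = add_result if forwarding else regs["r1"]
    regs["r4"] = r1_operand ^ regs["r5"]

    # ADD commits only after XOR has already executed.
    regs["r1"] = add_result
    return regs["r4"]
```

With forwarding the XOR sees 12 and produces 6; without it, and without freezing the pipeline, it would read the old value 0 and produce 10, which is the functional error the feed-forward path avoids.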
Explanation of Fig. 6: The result of STORE serves as the input to the address calculation of LOAD; a delay register is added ahead of the LOAD instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 7: The operation result of the ADD instruction is fed directly to the next stage as an offset address for calculating the address accessed by the STORE instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 8: The result loaded by the LOAD instruction is sent directly to the subsequent dependent instruction: it is used directly as the input of ADD, and the intermediate result of ADD, delayed by one beat, is used as the offset address for the DMEM access, which keeps the pipeline flowing smoothly.
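The "delayed version" registers used in Figs. 6 and 8 behave like a short shift register placed beside the normal pipeline register. The following Python sketch models that behaviour under assumptions: the class name `DelayedVersion` and its interface are hypothetical, and one call to `clock` stands for one pipeline beat.

```python
from collections import deque


class DelayedVersion:
    """Extra registers that output a value a fixed number of beats after it was captured."""

    def __init__(self, beats: int) -> None:
        if beats < 1:
            raise ValueError("delay must be at least one beat")
        # One slot per beat of delay; None marks a slot that holds no value yet.
        self._regs = deque([None] * beats, maxlen=beats)

    def clock(self, value):
        """Advance one beat: capture `value`, return the value captured `beats` beats ago."""
        delayed = self._regs[0]
        self._regs.append(value)  # the oldest slot is dropped automatically
        return delayed


# ADD intermediate results as they appear beat by beat in the normal pipeline register.
add_results = [12, 20, 28]
one_beat = DelayedVersion(1)
delayed_results = [one_beat.clock(v) for v in add_results]
```

The original list `add_results` is left untouched, which corresponds to the normal pipeline register keeping its value, while `delayed_results` holds the copy that arrives one beat later and could serve as the DMEM offset address.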
The invention uses the "coherence window" and "relevant instructions" to find and avoid instruction conflicts. It is a simple and effective strategy for resolving conflicts between multiple instruction sets and is beneficial for resolving common conflicts between RISC-CPU hardware instructions.
An intelligent conflict finding and resolution unit is added to the RISC-CPU design, and the approach can be applied to superscalar CPU designs. The invention is a technique for keeping the pipeline flowing smoothly that does not freeze the pipeline and does not insert NOP instructions, but instead combines early data feed-forward, delayed feed-forward, and multi-beat delayed versions of intermediate calculation results.
Although specific embodiments of the invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the invention without creative effort still fall within the scope of protection of the invention.
Claims (8)
1. A method for avoiding conflicts between instruction sets in a RISC-CPU, characterized by comprising the following steps:
Step 1: determining the conflict types according to the data dependences between different instruction sets in the RISC-CPU;
Step 2: for the current instruction, judging whether it needs to access the "register file" or "memory"; if so, proceeding to Step 3, otherwise continuing with the analysis of the next instruction;
Step 3: for the current instruction, defining a "coherence window", searching for "relevant instructions" within the "coherence window", and judging whether a "relevant instruction" exists; if so, proceeding to Step 4, otherwise concluding that there is no read/write conflict;
Step 4: selecting a conflict resolution method according to the conflict type between the instruction sets, and resolving the data conflict with the specific strategy while maintaining pipeline throughput;
Definition of the "coherence window": for superscalar instruction processing, out-of-order execution of instructions need not be considered; for the current instruction, if it needs to write to some register, the window is as follows: if the current instruction operates on a specific register in the register file or on some memory location, the window is the instruction sequence obtained by proceeding from the current instruction through the subsequent instruction stream, covering the period from the fetch of the instruction until the result of the current instruction is committed;
Definition of "relevant instruction": based on the defined "coherence window", and without considering out-of-order execution of instructions, if the "current instruction" needs to write to some register or to some offset address in memory, then the instructions following the current instruction within the "coherence window" are searched for one or more instructions that also need to write or read the location written by the "current instruction"; these instructions have a data dependence on the current instruction and may give rise to data read/write conflicts, interrupting the pipeline and reducing instruction throughput.
2. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that in Step 1, a statistical analysis and classification is carried out with respect to the architecture, instruction set and pipeline stages of the RISC-CPU, and all instructions are divided into a data processing instruction set, a memory access instruction set, a branch instruction set and an immediate instruction set.
3. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 2, characterized in that in Step 1, after the four instruction sets are obtained, the conflicts between the instruction sets are analyzed statistically and the read/write conflict types between the instruction sets are enumerated and classified, yielding the conflicts introduced by data dependences between instructions: a later "dependent instruction" reads stale register data or overwrites newly written data that has not yet been read, so that the data dependences of the original instruction sequence are violated and a functional error results; and the original instruction pipeline is interrupted, including the reduction in instruction throughput caused by freezing the pipeline or inserting NOP instructions into the pipeline.
4. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1 or 3, characterized in that the conflict types are classified specifically as conflicts within and between the data processing, memory access and immediate instruction sets;
RAW conflict resolution for the ADD and XOR instructions: the operation result of the ADD instruction is fed to XOR as an input in advance, so that XOR does not have to wait until the ADD instruction commits before executing;
RAW conflict resolution for the STORE and LOAD instructions: the result of STORE serves as the input to the address calculation of LOAD, and a delay register is added ahead of the LOAD instruction;
RAW conflict resolution for the ADD and STORE instructions: the operation result of the ADD instruction is fed directly to the next stage as an offset address for calculating the address accessed by the STORE instruction;
RAW conflict resolution for the LOAD and ADD instructions: the result of the LOAD instruction is used directly as the input of ADD, and the intermediate result of ADD, delayed by one beat, is used as the offset address for the DMEM access.
5. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that within the "coherence window", the conflict resolution methods include: (1) the logic operation result available before the commit stage of the current instruction is fed directly to the arithmetic unit of the subsequent relevant instruction, instead of waiting until the current instruction commits and only then feeding the final result to the arithmetic unit of the subsequent instruction; (2) a delayed version of the operation result is kept for the relevant instructions that follow the current instruction, and the delayed-version output serves as a backup register resource for resolving the read/write conflict.
6. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that in Step 4, the conflict resolution methods specifically include:
(1) early feed-forward of intermediate operation data before the final operation result is committed, meaning that the intermediate operation data is fed directly to the input of the next instruction without waiting for the final commit stage;
(2) feed-forward of a delayed version of the data before the final operation result is committed;
(3) coexistence of the pipeline delayed version of the intermediate calculation result with the original operation version: in the pipeline stage of the next relevant instruction, an additional delayed-version register is added that delays by one or several beats, while the register version in the original normal pipeline stage is retained;
(4) a strategy of not freezing the original pipeline;
(5) not using the method of inserting NOP instructions.
7. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that in Step 1, after all read/write conflicts that may exist have been classified, they are stored in a local table, the conflict retrieval table, comprising an index and a conflict type; using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, and the "coherence window" is defined and the "relevant instructions" are searched for in advance; with the "coherence window" and "relevant instruction" defined, the instruction stream is searched for read/write conflicts, and if a dependence conflict between instructions exists, the conflict type stored in advance in the local table is looked up and the conflict is avoided quickly and ahead of time.
8. A system for avoiding conflicts between instruction sets in a RISC-CPU, characterized by comprising a conflict resolution control unit applied in the RISC-CPU, the conflict resolution control unit being configured to: after all read/write conflicts that may exist have been classified, store them in a local table, the conflict retrieval table, comprising an index and a conflict type; using superscalar techniques, fetch multiple instructions from the instruction memory, store them in an instruction buffer window, and define the "coherence window" and search for "relevant instructions" in advance; with the "coherence window" and "relevant instruction" defined, search the instruction stream for read/write conflicts and, if a dependence conflict between instructions exists, look up the conflict type stored in advance in the local table and avoid the conflict quickly and ahead of time;
Definition of the "coherence window": for superscalar instruction processing, out-of-order execution of instructions need not be considered; for the current instruction, if it needs to write to some register, the window is as follows: if the current instruction operates on a specific register in the register file or on some memory location, the window is the instruction sequence obtained by proceeding from the current instruction through the subsequent instruction stream, covering the period from the fetch of the instruction until the result of the current instruction is committed;
Definition of "relevant instruction": based on the defined "coherence window", and without considering out-of-order execution of instructions, if the "current instruction" needs to write to some register or to some offset address in memory, then the instructions following the current instruction within the "coherence window" are searched for one or more instructions that also need to write or read the location written by the "current instruction"; these instructions have a data dependence on the current instruction and may give rise to data read/write conflicts, interrupting the pipeline and reducing instruction throughput.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611246947.6A CN106610816B (en) | 2016-12-29 | 2016-12-29 | The bypassing method and system to conflict between instruction set in a kind of RISC-CPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106610816A (en) | 2017-05-03 |
CN106610816B (en) | 2018-10-30 |
Family
ID=58636378
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611246947.6A Expired - Fee Related CN106610816B (en) | 2016-12-29 | 2016-12-29 | The bypassing method and system to conflict between instruction set in a kind of RISC-CPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106610816B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109189715B (en) * | 2018-08-16 | 2022-03-15 | 北京算能科技有限公司 | Programmable artificial intelligence accelerator execution unit and artificial intelligence acceleration method |
CN111221573B (en) * | 2018-11-26 | 2022-03-25 | 深圳云天励飞技术股份有限公司 | Management method of register access time sequence, processor, electronic equipment and computer readable storage medium |
CN113918216A (en) | 2020-07-10 | 2022-01-11 | 富泰华工业(深圳)有限公司 | Data read/write processing method, device and computer readable storage medium |
TWI758778B (en) * | 2020-07-10 | 2022-03-21 | 鴻海精密工業股份有限公司 | Data read-write processing method, apparatus, and computer readable storage medium thereof |
CN113312087B (en) * | 2021-06-17 | 2024-06-11 | 东南大学 | Cache optimization method based on RISC processor constant pool layout analysis and integration |
CN114238182B (en) * | 2021-12-20 | 2023-10-20 | 北京奕斯伟计算技术股份有限公司 | Processor, data processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5627982A (en) * | 1991-06-04 | 1997-05-06 | Matsushita Electric Industrial Co., Ltd. | Apparatus for simultaneously scheduling instructions from plural instruction stream into plural instruction executions units |
CN101067781A (en) * | 2006-03-07 | 2007-11-07 | 英特尔公司 | Technique to perform memory disambiguation |
CN101770357A (en) * | 2008-12-31 | 2010-07-07 | 世意法(北京)半导体研发有限责任公司 | Method for reducing instruction conflict in processor |
CN103116485A (en) * | 2013-01-30 | 2013-05-22 | 西安电子科技大学 | Assembler designing method based on specific instruction set processor for very long instruction words |
Also Published As
Publication number | Publication date |
---|---|
CN106610816A (en) | 2017-05-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181030 |