CN106610816B - Method and system for avoiding conflicts between instruction sets in a RISC-CPU - Google Patents


Info

Publication number
CN106610816B
Authority
CN
China
Prior art keywords
instruction
conflict
relevant
cpu
risc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201611246947.6A
Other languages
Chinese (zh)
Other versions
CN106610816A (en
Inventor
孙建辉
王春兴
王公堂
李登旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN201611246947.6A priority Critical patent/CN106610816B/en
Publication of CN106610816A publication Critical patent/CN106610816A/en
Application granted granted Critical
Publication of CN106610816B publication Critical patent/CN106610816B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file
    • G06F9/30134 Register stacks; shift registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30141 Implementation provisions of register files, e.g. ports
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3814 Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3816 Instruction alignment, e.g. cache line crossing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • G06F9/3869 Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The invention discloses a method and system for avoiding conflicts between instruction sets in a RISC-CPU, comprising the following steps. Step 1: determine the conflict types according to the data dependences between the different instruction sets of the RISC-CPU. Step 2: for the current instruction, determine whether it needs to access the "register file" or "memory"; if so, proceed to Step 3; otherwise, continue with the next instruction. Step 3: for the current instruction, define a "coherence window", search the "coherence window" for "relevant instructions", and determine whether any exist; if so, proceed to Step 4; otherwise, conclude that there is no read/write conflict. Step 4: according to the conflict type between the instruction sets, select a conflict-resolution method and resolve the data conflict with the chosen strategy while preserving pipeline throughput.

Description

Method and system for avoiding conflicts between instruction sets in a RISC-CPU
Technical field
The present invention relates to the technical field of computer processing, and in particular to a method and system for avoiding conflicts between instruction sets in a RISC-CPU.
Background technology
The design of an embedded RISC-CPU, from the selection and custom design of the CPU hardware instruction set to the design of the corresponding compiler, is extremely important.
The related art is as follows. The invention patent application "Reducing instruction conflicts in a processor" (application No. CN200810191060, applicant "generation meaning method (Beijing) semiconductor research and development Co., Ltd") resolves instruction conflicts at the instruction issue stage: it chooses between two instructions to be sent to the subsequent parallel functional units, checks which instruction is conflict-free, and arbitrates to select one of them. This method is used to select one conflict-free instruction from a multi-issue instruction window.
Intel's invention patent application No. CN200710087737, "Technique for performing memory disambiguation", gives a solution for avoiding conflicts between adjacent memory-access instructions, but it does not address conflict avoidance inside the remaining related instruction sets, or between instruction sets, over a coherence window defined among relevant instructions.
Xidian University's invention patent application No. CN201310054280, "An assembler design method for a very long instruction word application-specific instruction set processor", performs instruction scheduling through the design of the assembler, and uses methods such as register renaming to remove the write-after-write (W-A-W) and write-after-read (W-A-R) conflicts caused by out-of-order execution.
Intel's invention patent application No. CN200710087737, "Technique for performing memory disambiguation", as noted, gives a solution for avoiding conflicts between adjacent memory-access instructions but does not address conflict avoidance inside the remaining related instruction sets, or between instruction sets, over a coherence window defined among relevant instructions. In addition, its number of pipeline stages differs from that of the embedded RISC-CPU of the present patent, and it does not mention the CPU hardware engineering design strategy proposed here, which proceeds from a prototype CPU based on an instruction subset to the full instruction set.
As can be seen, the prior art neither provides a method for resolving RAW conflicts from the perspective of the whole instruction system, nor proposes a CPU hardware design engineering strategy that proceeds from a CPU prototype with a local minimal instruction set to the full instruction set.
Summary of the invention
To remedy the shortcomings of the prior art, the invention discloses a method and system for avoiding conflicts between instruction sets in a RISC-CPU, provides a mechanism for resolving conflicts between instruction sets, and provides an intelligent control hardware device for resolving read/write conflicts in a RISC-CPU.
To achieve the above object, the specific scheme of the invention is as follows:
A method for avoiding conflicts between instruction sets in a RISC-CPU, comprising the following steps:
Step 1: determine the conflict types according to the data dependences between the different instruction sets of the RISC-CPU;
Step 2: for the current instruction, determine whether it needs to access the "register file" or "memory"; if so, proceed to Step 3; otherwise, continue with the next instruction;
Step 3: for the current instruction, define a "coherence window", search the "coherence window" for "relevant instructions", and determine whether any exist; if so, proceed to Step 4; otherwise, conclude that there is no read/write conflict;
Step 4: according to the conflict type between the instruction sets, select a conflict-resolution method and resolve the data conflict with the chosen strategy while preserving pipeline throughput.
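The four steps above can be sketched in software as follows. This is only an illustration in Python, not the patented hardware: the `Instr` record, the `CONFLICT_TYPES` table, the window length `WINDOW`, and the function `analyze` are hypothetical names introduced here, and the window length of four instructions is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Instr:
    op: str                     # mnemonic, e.g. "ADD"
    dest: Optional[str] = None  # location written (None if nothing is written)
    srcs: tuple = ()            # locations read

# Step 1 (prepared in advance): conflict type per (earlier op, later op) pair.
# The entries are illustrative assumptions, not the patent's full table.
CONFLICT_TYPES = {("ADD", "XOR"): "RAW", ("ADD", "STORE"): "RAW"}

WINDOW = 4  # assumed coherence-window length, in instructions

def analyze(stream, i):
    cur = stream[i]
    # Step 2: does the current instruction access the register file or memory?
    if cur.dest is None and not cur.srcs:
        return "next"
    # Step 3: search the coherence window for relevant instructions.
    window = stream[i + 1 : i + 1 + WINDOW]
    relevant = [n for n in window if cur.dest is not None and cur.dest in n.srcs]
    if not relevant:
        return "no conflict"
    # Step 4: look up the conflict type to choose a resolution method.
    return [(n.op, CONFLICT_TYPES.get((cur.op, n.op), "unclassified"))
            for n in relevant]

stream = [
    Instr("ADD", "R1", ("R2", "R3")),
    Instr("XOR", "R4", ("R1", "R5")),
    Instr("JMP"),
]
```

Calling `analyze(stream, 0)` reports the XOR instruction as a RAW-relevant instruction of ADD, while the operand-free JMP is skipped in Step 2.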
Further, in Step 1, the architecture, instruction set, and pipeline stages of the RISC-CPU are statistically analyzed and classified, and the full set of instructions is divided into a data-processing instruction set, a memory-access instruction set, a branch instruction set, and an immediate instruction set.
Further, in Step 1, after the four instruction sets are obtained, the conflicts between instruction sets are statistically analyzed, and the read/write conflict types among the instruction sets are enumerated and classified. This yields the conflicts introduced by data dependences between instructions: a later "dependent instruction" reads stale register data, or overwrites newly written data that has not yet been read, so that the data dependences of the original instruction sequence are broken and a functional error results; and the original instruction pipeline is interrupted, including the loss of instruction throughput caused by freezing the pipeline or inserting NOP instructions.
Further, the conflict types are classified as follows: conflicts inside the data-processing, memory-access, and immediate instruction sets, and conflicts between them.
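The enumeration of conflict types over the instruction sets can be illustrated with a short Python sketch. It assumes that every ordered pair of the three classes named above is a candidate for each of the RAW, WAR, and WAW conflict types; the names `CLASSES`, `HAZARDS`, and `build_conflict_table` are hypothetical, and a real design would prune pairs that cannot occur in its pipeline.

```python
from itertools import product

# The three classes whose conflicts are classified (branch is left out).
CLASSES = ("data_processing", "memory_access", "immediate")
HAZARDS = ("RAW", "WAR", "WAW")

def build_conflict_table():
    """Map an integer index to (earlier class, later class, conflict type)."""
    table = {}
    index = 0
    for earlier, later in product(CLASSES, repeat=2):
        for hazard in HAZARDS:
            table[index] = (earlier, later, hazard)
            index += 1
    return table

table = build_conflict_table()
# Pairs with the same class on both sides are conflicts "inside" a set;
# the remaining pairs are conflicts "between" sets.
inside = [entry for entry in table.values() if entry[0] == entry[1]]
```

With three classes and three conflict types, the table holds 27 candidate entries, 9 of which are conflicts inside a single instruction set.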
Further, in Step 3, the "coherence window" is defined as follows: for superscalar instruction processing, without considering out-of-order execution, if the current instruction needs to write a register, or operates on a specific register in the register file or a location in memory, then the window is the instruction sequence that follows the current instruction in the instruction stream, from the point at which the current instruction is fetched until its result is committed.
Further, in Step 3, a "relevant instruction" is defined as follows: based on the "coherence window" defined above, and without considering out-of-order execution, if the "current instruction" needs to write a register or a memory offset address, then its relevant instructions are the one or more later instructions within the "coherence window" that also need to write or read the location written by the "current instruction". These instructions have a data dependence on the current instruction and may cause a data read/write conflict, which interrupts the pipeline and reduces instruction throughput.
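The two definitions can be made concrete with a small Python sketch. It assumes an in-order five-stage pipeline that fetches one instruction per cycle, so that `depth - 1` later instructions are fetched before the current one commits; the dictionary layout and the function name `relevant_instructions` are hypothetical, and the STORE destination is left as `None` for simplicity.

```python
def relevant_instructions(stream, i, depth=5):
    """Return (index, kind) pairs for the relevant instructions of stream[i]."""
    cur_dest = stream[i]["dest"]
    if cur_dest is None:
        return []          # nothing written, so no later instruction depends on it
    found = []
    # Coherence window: instructions fetched before stream[i] commits.
    for j in range(i + 1, min(i + depth, len(stream))):
        later = stream[j]
        if cur_dest in later["srcs"]:
            found.append((j, "RAW"))   # later instruction reads the written location
        elif later["dest"] == cur_dest:
            found.append((j, "WAW"))   # later instruction writes the same location
    return found

program = [
    {"op": "ADD",   "dest": "R1", "srcs": ("R2", "R3")},
    {"op": "XOR",   "dest": "R4", "srcs": ("R1", "R5")},
    {"op": "STORE", "dest": None, "srcs": ("R6", "R1")},
    {"op": "ADD",   "dest": "R1", "srcs": ("R7", "R8")},
]
```

For the first ADD, the sketch finds XOR and STORE as readers of `R1` and the second ADD as a later writer of `R1`.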
Further, within the "coherence window", the conflict-resolution methods include: (1) the logic operation result of the current instruction, available before its commit stage, is fed directly into the arithmetic unit of the following relevant instruction, rather than waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction; (2) a delayed version of the operation result is kept for the relevant instruction that follows the current instruction, and the output of the delayed version serves as a backup register resource for resolving the read/write conflict.
Further, in Step 4, the conflict-resolution methods specifically include:
(1) early forwarding of intermediate operation data before the final result is committed: the data does not wait for the final commit stage and is fed directly to the input of the next instruction;
(2) forwarding of a delayed version of the data before the final result is committed;
(3) coexistence of a pipeline-delayed version of the intermediate result with the original version: in the pipeline stage of the next relevant instruction, an additional delayed-version register is added, delaying by one or several cycles, while the register version in the original normal pipeline stage is retained;
(4) the pipeline is not frozen;
(5) NOP instructions are not inserted.
Further, specifically, RAW conflict resolution for the ADD and XOR instructions: the operation result of the ADD instruction is fed to XOR as an input ahead of time, without waiting for the ADD instruction to commit before XOR executes;
RAW conflict resolution for the STORE and LOAD instructions: the result of STORE serves as the address-calculation input of LOAD, through a delay register added before the LOAD instruction;
RAW conflict resolution for the ADD and STORE instructions: the operation result of the ADD instruction is fed directly to the next stage as the offset address for computing the access address of the STORE instruction;
RAW conflict resolution for the LOAD and ADD instructions: the result of the LOAD instruction is used directly as an input of ADD, and a version of the intermediate result of ADD delayed by one cycle serves as the offset address for the DMEM access.
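The four instruction pairs above amount to a lookup from an (earlier, later) instruction pair to one or more strategies. A possible Python rendering is sketched below; the `Strategy` enum, the `RAW_RESOLUTION` table, and the `resolve` function are hypothetical names, and the mapping is this sketch's reading of the four cases rather than a table taken from the patent.

```python
from enum import Enum

class Strategy(Enum):
    EARLY_FORWARD = "feed the pre-commit result to the next instruction"
    DELAYED_COPY = "use a delayed copy held in an extra register"

# (earlier instruction, later instruction) -> strategies applied to the RAW conflict
RAW_RESOLUTION = {
    ("ADD", "XOR"): (Strategy.EARLY_FORWARD,),
    ("STORE", "LOAD"): (Strategy.DELAYED_COPY,),
    ("ADD", "STORE"): (Strategy.EARLY_FORWARD,),
    ("LOAD", "ADD"): (Strategy.EARLY_FORWARD, Strategy.DELAYED_COPY),
}

def resolve(earlier, later):
    """Return the strategies for a pair, or an empty tuple if none is recorded."""
    return RAW_RESOLUTION.get((earlier, later), ())
```

Keeping the pairs in a table means that a newly analyzed pair only adds an entry and leaves the lookup logic unchanged.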
Further, in Step 1, after all possible read/write conflicts are classified, they are stored in a local archive table, the conflict retrieval table, which contains an index and a conflict type. Using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for in advance. Using the defined "coherence window" and "relevant instructions", the instruction stream is searched for read/write conflicts; if a relevant conflict exists between instructions, the conflict type stored locally in advance is looked up and the conflict is quickly avoided ahead of time.
A system for avoiding conflicts between instruction sets in a RISC-CPU comprises a conflict resolution control unit applied in the RISC-CPU. The conflict resolution control unit is configured to: store all possible read/write conflicts, once classified, in a local archive table, the conflict retrieval table, which contains an index and a conflict type; using superscalar techniques, fetch multiple instructions from the instruction memory and store them in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for in advance; and, using the defined "coherence window" and "relevant instructions", search the instruction stream for read/write conflicts and, if a relevant conflict exists between instructions, look up the conflict type stored locally in advance and quickly avoid the conflict ahead of time.
Beneficial effects of the present invention:
For the instruction set of the current RISC-CPU, the invention statistically analyzes and classifies in advance the conflict types that exist between instructions and between instruction sets. The results of the search for conflicting "relevant instructions" then drive the "conflict search and resolution" unit, so that the pipeline runs smoothly inside the coherence window.
The invention provides an effective strategy for resolving conflicts between instruction sets in RISC-CPU design. It defines the instruction "coherence window" and "relevant instruction" in order to resolve the read/write, write/read, and write/write conflicts among "relevant instructions" inside the "coherence window". Taking the resolution of RAW conflicts as an example, this document analyzes the conflicts between several major instruction sets and gives a quantified definition and resolution of the conflicts between them. By adding a hardware unit that locates and resolves read/write conflicts, the instruction conflicts between different instruction sets of the RISC-CPU can be effectively avoided while pipeline throughput is improved.
Description of the drawings
Fig. 1 shows the classification of the instruction sets of the RISC-CPU;
Fig. 2 is a schematic of the "coherence window" and "relevant instructions";
Fig. 3 shows the background and workflow of the proposed invention;
Fig. 4 shows the added hardware unit for conflict location and resolution;
Fig. 5 shows RAW conflict resolution for the ADD and XOR instructions;
Fig. 6 shows RAW conflict resolution for the STORE and LOAD instructions;
Fig. 7 shows RAW conflict resolution for the ADD and STORE instructions;
Fig. 8 shows RAW conflict resolution for the LOAD and ADD instructions.
Detailed description:
The present invention is described in detail below with reference to the accompanying drawings:
As shown in Fig. 1, the invention proposes an effective method for resolving CPU instruction conflicts. In the design, all instructions are divided into four major classes (data-processing instructions, branch instructions, immediate instructions, and memory-access instructions). A representative instruction subset is extracted from each class, and the RAW (read-after-write) direct data-dependence conflicts inside the subsets and between them are resolved. All potential RAW conflicts are statistically analyzed in advance, and the conflicts are then located and resolved so that the pipeline runs smoothly.
The RISC-CPU design proposed by the invention divides the full set of instructions into four instruction classes: data processing (Data-Processing), memory access (STORE/LOAD), branch (Branch), and immediate (Immediate). For conflicts inside each instruction class and between classes (branch excluded), the resolution of RAW (read-after-write) conflicts is used as the worked example. There are several kinds of RAW conflict (inside the data-processing, memory-access, and immediate instruction sets, and between them) for which the conflicts caused by data dependences need to be resolved.
"Coherence window" definition: for superscalar instruction processing, without considering out-of-order execution, if the current instruction needs to write a register, or operates on a specific register in the register file or a location in memory, then the window is the instruction sequence that follows the current instruction in the instruction stream, from the point at which the current instruction is fetched until its result is committed.
"Relevant instruction" definition: based on the "coherence window" defined above, and without considering out-of-order execution, if the "current instruction" needs to write a register or a memory offset address, then its relevant instructions are the one or more later instructions within the "coherence window" that also need to write or read the location written by the "current instruction". These instructions have a data dependence on the current instruction and may cause a data read/write conflict, which interrupts the pipeline and reduces instruction throughput (CPI).
For each case, the "coherence window" and "relevant instructions" are given. Fig. 2 shows an example of coherence windows, for the ADD (thick solid line) and XOR (thin dashed line) instructions. Fig. 2 also shows an example of relevant instructions, such as the XOR and STORE instructions that are relevant to the ADD instruction. For instruction sets of the same kind, for example the data-processing subset with the ADD and XOR instructions shown in Tables 1 and 2, RAW conflicts are resolved as shown in Fig. 3. For the memory-access instructions (LOAD/STORE), see Tables 3, 4, 5, and 6:
Table 1: Instruction format of the data-processing (Data-Processing instruction) instruction set
Table 2: Representative data-processing instructions (ADD addition instruction, XOR exclusive-or instruction)
Table 3: Memory-to-register (LOAD) instruction format
Table 4: Representative instruction in the LOAD instruction subset
LDR    RD←[RS]    LDR RD,[RS]
Table 5: Register-to-memory (STORE) instruction format
Table 6: Representative instruction in the STORE instruction subset
STR    RD→[RS]    STR RD,[RS]
RAW conflicts are resolved in each case, as shown in Fig. 5.
For instruction sets of different kinds, RAW conflicts are resolved as shown in Figs. 6, 7, and 8.
For the remaining instruction sets (immediate instructions and branch instructions), the RAW conflicts between them are not analyzed further.
Within the "coherence window", the conflict-resolution methods include: (1) the logic operation result of the current instruction, available before its commit stage, is fed directly into the arithmetic unit of the following relevant instruction, rather than waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction; (2) a delayed version of the operation result is kept for the relevant instruction that follows the current instruction, and the output of the delayed version serves as a backup register resource for resolving the read/write conflict.
This design analyzes only the RAW conflicts between data-processing and memory-access instructions. Conflicts between the other instruction sets, and the avoidance of the remaining WAR/WAW conflicts, are not illustrated with specific examples.
The instruction conflict resolution of the invention is explained with reference to Figs. 3 and 4. In RISC-CPU hardware design, adjacent instructions may have data read/write dependences; that is, accesses to the same register in the register file, or to the same offset address in memory, may conflict in their read/write order (RAW/WAR/WAW), forcing the pipeline to stall and reducing efficiency. In a RISC-CPU design, the conflicts between instruction sets can therefore interrupt the pipeline and reduce instruction throughput. The classic five-stage MIPS pipeline consists of: stage 1, instruction fetch (fetch); stage 2, instruction decode (decode); stage 3, execute (execute); stage 4, memory access (memory access, to internal memory, covering the two major classes load and store); and stage 5, write-back (Write Back, where the operation result is written back to the register file, Register-bank, of the RISC-CPU). In the classic MIPS architecture, the longest instructions occupy all five stages (for example, the memory-access STORE instruction, followed by write-back to the register file); ordinary data operation instructions that do not access memory occupy four stages (for example, the ADD instruction writes its sum into a register of the register file and has no memory-access stage); some instructions finish in the decode stage (for example, the jump instruction JMP); and conditional branch instructions change the order of the instruction stream, so that instructions execute non-sequentially. Thus, in a RISC-CPU design, because different instruction sets occupy different numbers of pipeline stages and the ordering of instructions is highly variable, reads and writes of the same register in the register file or the same memory offset address are very likely to produce data-dependence conflicts.
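The cost of such a conflict can be estimated with simple stage arithmetic. The Python sketch below is a textbook-style illustration, not a calculation taken from the patent: it assumes that every instruction moves through the five stage slots one cycle apart, that a value becomes usable in the cycle after the producing stage, and that a register written in write-back can be read only in the following cycle. The function name `stall_cycles` is hypothetical.

```python
def stall_cycles(produce_stage, need_stage, distance=1):
    """Cycles a consumer must wait behind a producer issued `distance` cycles earlier.

    Stages are numbered 1..5 (fetch, decode, execute, memory access, write-back).
    The value is usable in the cycle after the producer leaves `produce_stage`,
    and the consumer needs it when it enters `need_stage`.
    """
    return max(0, produce_stage - need_stage - distance + 1)

# ADD followed immediately by XOR, no forwarding:
# the result is written in stage 5 and read by the consumer in stage 2.
without_forwarding = stall_cycles(produce_stage=5, need_stage=2)

# Same pair when the execute-stage result is fed straight to the next execute stage.
with_forwarding = stall_cycles(produce_stage=3, need_stage=3)
```

Under these assumptions the back-to-back pair costs three stall cycles without forwarding and none with it, which is the throughput loss that freezing the pipeline or inserting NOP instructions would incur.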
By analyzing the classic MIPS architecture for the conflicts that register read/write dependences can cause between different instruction sets, the invention defines a "coherence window" for the current instruction and searches the "coherence window" for the "relevant instructions" of the current instruction. From the data dependences between the different instruction sets of the RISC-CPU, analyzed in advance, it determines the conflict type and avoids the loss of pipeline efficiency: (1) a later "dependent instruction" reads stale register data, or overwrites newly written data that has not yet been read, so that the data dependences of the original instruction sequence are broken and a functional error results; (2) the original instruction pipeline is interrupted (including the loss of instruction throughput caused by freezing the pipeline or inserting NOP instructions). Once the pipeline is frozen, or simply delayed as a whole, the advantage of pipelining is weakened and instruction throughput falls.
In the examples analyzed in the invention, the following strategies are used to resolve data conflicts while preserving pipeline throughput: (1) early forwarding of intermediate operation data before the final result is committed; (2) forwarding of a delayed version of the data before the final result is committed; (3) coexistence of a pipeline-delayed version of the intermediate result with the original version; (4) the pipeline is not frozen; (5) NOP instructions are not inserted. For example, in method (1), the intermediate operation data is forwarded early (that is, it does not wait for the final commit stage) and is fed directly to the input of the next instruction. In method (3), an additional delayed-version register (delaying by one or several cycles) is added in the pipeline stage of the next relevant instruction, while the register version in the original normal pipeline stage is retained.
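Method (3), in which a delayed version is kept alongside the original pipeline register, can be modelled in a few lines of Python. The class `StageRegister` and its fields are hypothetical names for this sketch, which models only the data movement on each clock edge and not any real register-transfer-level design.

```python
class StageRegister:
    """Pipeline register with an extra chain of delayed copies kept beside it."""

    def __init__(self, delay=1):
        self.normal = None              # value seen by the normal pipeline path
        self.delayed = [None] * delay   # delayed[-1] is the oldest copy

    def clock(self, value):
        # Shift the old normal value into the delay chain, then latch the new one.
        self.delayed = [self.normal] + self.delayed[:-1]
        self.normal = value

one_beat = StageRegister(delay=1)
for value in (10, 20):
    one_beat.clock(value)

two_beats = StageRegister(delay=2)
for value in (1, 2, 3):
    two_beats.clock(value)
```

After two clocks, `one_beat.normal` holds 20 while `one_beat.delayed[-1]` still holds 10, so a relevant instruction can read the earlier value without disturbing the normal path.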
Existing conflict type between instruction is instructed in advance for the instruction set of current RISC-CPU The statistical analysis and classification to conflict between collection, then by conflict " relevant instruction " searching as a result, control " conflicts and finds and solve Certainly " unit achievees the purpose that coherence window internal pipeline is smooth.Fig. 4,6 are to shift to an earlier date feed forward mechanism using intermediate calculation results; Fig. 5,7 are the delay version methods for utilizing result of calculation.The current method having had, is NOP mechanism, and this method can be by flowing water Line postpones backward, and this patent does not use this method, meanwhile, this patent does not use the method for freezing assembly line, and this is specially The delay version of profit is to retain original pipeline register, while being directed to the relevant instruction conflict categorization results being known in advance, standby Part delay version reaches the internal local instruction conflict of solution " coherence window " in this way, meanwhile, original normal version will not be destroyed This flowing water segment data is transmitted.This patent needs that data feed-forward, a variety of methods such as delay version is made full use of to reach relevant window Mouth is internal, and assembly line is smooth.This patent sorts out the Conflict solving of instruction from statistics, the management of carry out system is solved to positioning. Meanwhile this patent Conflict solving frame, new instruction solution technology can also be absorbed, so that entire instruction conflict solves system It is more complete.
Explanation to Fig. 1:Fig. 1 is the conflict between several instruction set (data processing, immediate, branch, storage access) It solves.
Explanation to Fig. 2:Fig. 2 provides " coherence window " that two kinds of instructions are instructed for current ADD instruction, current XOR; And for ADD instruction, " the relevant instruction " there are reading and writing data conflict is given:XOR, STORE, the two instructions are The relevant instruction of ADD instruction.Explanation to Fig. 3:Give referring to based on " coherence window " and " relevant instruction " solution for the present invention Enable the overall flow of collision method.This method, can as in a kind of RISC-CPU instruction read/write conflict intelligent positioning with Module is solved, is added in RISC-CPU stones.Certainly, for the framework for the RISC-CPU for needing current design, flowing water hop count, After instruction set type, instruction format etc. determine, the exploitation for carrying out RISC-CPU stones again is needed.Conventional method, will not conflict Solution carry out quantization positioning and solve from the angle of system.
The patent, the first read/write conflict between variety classes or type command of the same race carry out advance statistical analysis, And the stringent classification to conflict, it would be possible to after existing all read/write conflicts are sorted out, be stored in local archive table:Conflict Retrieval table (index, conflict type);Using superscalar techniques, from command memory, a plurality of instruction, storage to instruction are taken out In buffer window, carries out " coherence window " in advance and define searching with " relevant instruction ";Pass through stringent definition " relevant window Mouthful " and " relevant instruction ", the searching of the read/write conflict of instruction stream is carried out, if there is the relevant conflict between instruction, is then passed through The conflict type being stored in advance between local instruction set is searched, quickly evading in advance for conflicting (passes through addition Conflict solving control unit).Table 1,2,3 in the invention is to list several instructions, and table 1 is the data of data processing instructions Format and instruction mnemonic, table 2 and data format and instruction mnemonic that table 3 is memory access LOAD/STORE instructions Symbol;
Figs. 5 and 7 show early feedforward before the result of the preceding instruction is committed, and Figs. 6 and 8 show the output of a delayed version of the intermediate calculation result of the subsequent relevant instruction. Both methods serve to prevent the pipeline from being interrupted. Outputting the result of a relevant instruction at different delay beats requires some additional registers; these registers hold delayed versions of the calculation result and do not disturb the results of the original pipeline. Other conflict resolution techniques can also be incorporated into the technical framework of the present invention.
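The delayed-version registers can be pictured with a small behavioural model. This is a sketch under assumptions: the class name `DelayedResult` and its interface are invented for illustration, and a real design would express this in hardware description language rather than Python. The point it shows is that the normal pipeline register stays untouched while extra registers carry a copy that lags by a chosen number of beats.

```python
from collections import deque


class DelayedResult:
    """Normal pipeline register plus a chain of extra delay registers (hypothetical model)."""

    def __init__(self, beats: int = 1):
        if beats < 1:
            raise ValueError("beats must be at least 1")
        self.current = None  # register in the original pipeline stage
        # Extra registers; the leftmost entry is the oldest (most delayed) copy.
        self._delay = deque([None] * beats, maxlen=beats)

    def clock(self, new_value):
        """Advance one beat: the old value moves into the delay chain, then the new value is latched."""
        self._delay.append(self.current)
        self.current = new_value

    @property
    def delayed(self):
        """Value that was in the normal register `beats` beats ago."""
        return self._delay[0]


reg = DelayedResult(beats=1)
reg.clock(10)  # first result enters the normal register
reg.clock(20)  # next result arrives; the first is still readable as the delayed version
```

After the two clock calls, `reg.current` holds the newer result and `reg.delayed` holds the result from one beat earlier, so a later stage can consume the older value without stalling the stage that produced the newer one.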
If the current instruction does not access the register file or memory, the next instruction is analyzed and the "coherence window" slides down. If there is no "relevant instruction" in the "coherence window", there is no read/write conflict and the analysis ends.
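The sliding search can be sketched as follows. The `Instr` record, the operand names, and the fixed `window` length are assumptions for illustration; in the patent the window covers the instructions between fetch and commit of the current instruction, which depends on the pipeline depth. The sketch treats a later instruction as relevant when it reads or writes the location the current instruction writes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dest: Optional[str] = None   # register or memory location written, if any
    srcs: Tuple[str, ...] = ()   # registers or memory locations read


def relevant_instructions(stream: List[Instr], i: int, window: int) -> List[int]:
    """Indices of later instructions inside the window that touch what stream[i] writes."""
    cur = stream[i]
    if cur.dest is None:
        return []
    end = min(i + 1 + window, len(stream))
    return [j for j in range(i + 1, end)
            if cur.dest in stream[j].srcs or cur.dest == stream[j].dest]


def scan(stream: List[Instr], window: int) -> Dict[int, List[int]]:
    """Slide the window over the stream; an empty list means no read/write conflict."""
    report: Dict[int, List[int]] = {}
    for i, ins in enumerate(stream):
        if ins.dest is None and not ins.srcs:
            continue  # no register file or memory access: move on to the next instruction
        report[i] = relevant_instructions(stream, i, window)
    return report


# Hypothetical stream modelled on Fig. 2: XOR and STORE both depend on ADD.
stream = [
    Instr("ADD", dest="r3", srcs=("r1", "r2")),
    Instr("XOR", dest="r5", srcs=("r3", "r4")),
    Instr("NOP"),
    Instr("STORE", dest="M[r6]", srcs=("r3", "r6")),
]
report = scan(stream, window=3)
```

With this stream, the ADD at index 0 is reported with relevant instructions at indices 1 and 3, the NOP is skipped, and the XOR and STORE have no relevant instructions of their own.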
Explanation of Fig. 5: The operation result of the ADD instruction is fed forward to XOR as an input in advance, so XOR does not have to wait until the ADD instruction is committed before executing, which keeps the pipeline flowing smoothly.
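The effect of early feedforward on the ADD and XOR pair can be shown with a minimal operand-selection sketch. The register names, values, and the `read_operand` helper are assumptions for illustration: a result that has been computed but not yet committed is taken from the forwarding path instead of the register file.

```python
from typing import Dict


def read_operand(reg: str, regfile: Dict[str, int], in_flight: Dict[str, int]) -> int:
    """Prefer a computed-but-uncommitted result over the committed register value."""
    return in_flight[reg] if reg in in_flight else regfile[reg]


# Committed register file; r3 still holds a stale value because ADD has not committed.
regfile = {"r1": 5, "r2": 7, "r3": 0, "r4": 10}

# ADD r3 <- r1 + r2 has finished its computation and sits in the forwarding path.
in_flight = {"r3": regfile["r1"] + regfile["r2"]}

# XOR r5 <- r3 ^ r4 with forwarding uses the fresh ADD result.
xor_forwarded = (read_operand("r3", regfile, in_flight)
                 ^ read_operand("r4", regfile, in_flight))

# Without forwarding (and without stalling) XOR would read the stale r3.
xor_stale = regfile["r3"] ^ regfile["r4"]
```

Here `xor_forwarded` is 6 (12 XOR 10) while `xor_stale` is 10, which is the functional error the forwarding path avoids without freezing the pipeline or inserting NOP instructions.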
Explanation of Fig. 6: The result of STORE serves as the address calculation input of LOAD; a delay register is added in front of the LOAD instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 7: The operation result of the ADD instruction is fed directly into the next stage as the offset address used to calculate the access address of the STORE instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 8: The result of the LOAD instruction is sent directly to the subsequent dependent instruction: it serves directly as the input of ADD, and the version of the intermediate result of ADD delayed by one beat serves as the offset address of the DMEM access, which keeps the pipeline flowing smoothly.
The present invention uses the "coherence window" and "relevant instructions" to find and avoid instruction conflicts. It is a simple and effective strategy for resolving conflicts among multiple instruction sets, and it is beneficial for resolving common hardware instruction conflicts in a RISC-CPU.
The intelligent conflict finding and resolving unit is added to the RISC-CPU design and can also be applied to superscalar CPU designs. The present invention is a smooth-pipeline implementation technique that combines feeding data forward in advance, not freezing the pipeline, not inserting NOP instructions, delayed feedforward, and multi-beat delayed versions of intermediate calculation results.
Although specific embodiments of the present invention are described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort still fall within the scope of protection of the present invention.

Claims (8)

1. A method for avoiding conflicts between instruction sets in a RISC-CPU, characterized in that it comprises the following steps:
Step 1: Determine the conflict types according to the data dependence relations between the different instruction sets in the RISC-CPU;
Step 2: For the current instruction, judge whether it needs to access the "register file" or "memory"; if so, proceed to step 3; otherwise, continue by analyzing the next instruction;
Step 3: For the current instruction, define a "coherence window", search for "relevant instructions" within the "coherence window", and judge whether a "relevant instruction" exists; if so, proceed to step 4; otherwise, judge that there is no read/write conflict;
Step 4: According to the conflict type between the instruction sets, select a conflict resolution method and resolve the data conflict with the specific strategy while maintaining pipeline throughput efficiency;
Definition of the "coherence window": for superscalar instruction processing, out-of-order execution of instructions need not be considered; if the current instruction needs to write to some register, that is, if it operates on a specific register in the register file or on some location in memory, the window is the sequence of instructions, counted from the current instruction forward into the subsequent instruction stream, that are involved from the time the current instruction is fetched until its result is committed;
Definition of the "relevant instruction": based on the defined "coherence window" and without considering out-of-order execution, if the "current instruction" needs to write to some register or to some offset address in memory, then the one or more instructions found within the "coherence window", searching from the current instruction toward subsequent instructions, that also need to write or read the location written by the "current instruction" are relevant instructions; these instructions have a data dependence relation and may have data read/write conflicts that interrupt the pipeline and reduce instruction throughput efficiency.
2. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that, in step 1, the architecture, instruction set and pipeline stages of the RISC-CPU are statistically analyzed and classified, and the full set of instructions is divided into a data processing instruction set, a memory access instruction set, a branch instruction set and an immediate instruction set.
3. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 2, characterized in that, in step 1, after the four instruction sets are obtained, the conflicts between instruction sets are statistically analyzed and the read/write conflict types among the instruction sets are traversed and classified, yielding the conflicts introduced by data dependences between instructions: a subsequent "dependent instruction" reads old register data or overwrites newly written data that has not yet been read, that is, the data dependence relations of the original instruction sequence are destroyed, causing functional errors; and the original instruction pipeline is interrupted, including the reduction in instruction throughput caused by freezing the pipeline or by inserting NOP instructions into the pipeline.
4. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1 or 3, characterized in that the conflict types are classified as conflicts within the data processing, memory access and immediate instruction sets and conflicts between them;
RAW conflict resolution for the ADD and XOR instructions: the operation result of the ADD instruction is fed forward to XOR as an input in advance, without waiting until the ADD instruction is committed before executing;
RAW conflict resolution for the STORE and LOAD instructions: the result of STORE serves as the address calculation input of LOAD, with a delay register added in front of the LOAD instruction;
RAW conflict resolution for the ADD and STORE instructions: the operation result of the ADD instruction is fed directly into the next stage as the offset address used to calculate the access address of the STORE instruction;
RAW conflict resolution for the LOAD and ADD instructions: the result of the LOAD instruction serves directly as the input of ADD, and the version of the intermediate result of ADD delayed by one beat serves as the offset address of the DMEM access.
5. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that, within the "coherence window", the conflict resolution methods include: (1) the logic operation result available before the commit stage of the current instruction is fed directly into the arithmetic unit of the subsequent relevant instruction, rather than waiting until the current instruction is committed and only then feeding the final result into the arithmetic unit of the subsequent instruction; (2) a delayed version is made of the operation result of the relevant instruction that follows the current instruction, and the delayed-version output is used as a backup register resource for resolving the read/write conflict.
6. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that, in step 4, the conflict resolution methods specifically include:
(1) early feedforward of intermediate operation data before the final operation result is committed, that is, the data is fed directly into the input of the next instruction without waiting for the final commit stage;
(2) feedforward of a delayed version of the data before the final operation result is committed;
(3) pipeline delayed versions of intermediate calculation results coexisting with the original version: in the pipeline stage of the next relevant instruction, additional delayed-version registers are added that delay by one beat or several beats, while the register version in the original normal pipeline stage is retained;
(4) a strategy of not freezing the original pipeline;
(5) not using the method of inserting NOP instructions.
7. The method for avoiding conflicts between instruction sets in a RISC-CPU according to claim 1, characterized in that, in step 1, after all read/write conflicts that may exist have been sorted out, they are stored in a local archive table, the conflict retrieval table, which includes an index and a conflict type; using superscalar techniques, multiple instructions are fetched from the instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for ahead of time; with the defined "coherence window" and "relevant instructions", the instruction stream is searched for read/write conflicts, and if a relevant conflict exists between instructions, the conflict type between instruction sets stored in advance in the local table is looked up and the conflict is quickly avoided in advance.
8. A system for avoiding conflicts between instruction sets in a RISC-CPU, characterized in that it comprises a conflict resolution control unit applied in the RISC-CPU, the conflict resolution control unit being used to: store all read/write conflicts that may exist, after they have been sorted out, in a local archive table, the conflict retrieval table, which includes an index and a conflict type; using superscalar techniques, fetch multiple instructions from the instruction memory and store them in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for ahead of time; with the defined "coherence window" and "relevant instructions", search the instruction stream for read/write conflicts, and if a relevant conflict exists between instructions, look up the conflict type between instruction sets stored in advance in the local table and quickly avoid the conflict in advance;
Definition of the "coherence window": for superscalar instruction processing, out-of-order execution of instructions need not be considered; if the current instruction needs to write to some register, that is, if it operates on a specific register in the register file or on some location in memory, the window is the sequence of instructions, counted from the current instruction forward into the subsequent instruction stream, that are involved from the time the current instruction is fetched until its result is committed;
Definition of the "relevant instruction": based on the defined "coherence window" and without considering out-of-order execution, if the "current instruction" needs to write to some register or to some offset address in memory, then the one or more instructions found within the "coherence window", searching from the current instruction toward subsequent instructions, that also need to write or read the location written by the "current instruction" are relevant instructions; these instructions have a data dependence relation and may have data read/write conflicts that interrupt the pipeline and reduce instruction throughput efficiency.
CN201611246947.6A 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU Expired - Fee Related CN106610816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611246947.6A CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611246947.6A CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Publications (2)

Publication Number Publication Date
CN106610816A CN106610816A (en) 2017-05-03
CN106610816B true CN106610816B (en) 2018-10-30

Family

ID=58636378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611246947.6A Expired - Fee Related CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Country Status (1)

Country Link
CN (1) CN106610816B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189715B (en) * 2018-08-16 2022-03-15 北京算能科技有限公司 Programmable artificial intelligence accelerator execution unit and artificial intelligence acceleration method
CN111221573B (en) * 2018-11-26 2022-03-25 深圳云天励飞技术股份有限公司 Management method of register access time sequence, processor, electronic equipment and computer readable storage medium
CN113918216A (en) 2020-07-10 2022-01-11 富泰华工业(深圳)有限公司 Data read/write processing method, device and computer readable storage medium
TWI758778B (en) * 2020-07-10 2022-03-21 鴻海精密工業股份有限公司 Data read-write processing method, apparatus, and computer readable storage medium thereof
CN113312087B (en) * 2021-06-17 2024-06-11 东南大学 Cache optimization method based on RISC processor constant pool layout analysis and integration
CN114238182B (en) * 2021-12-20 2023-10-20 北京奕斯伟计算技术股份有限公司 Processor, data processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5627982A (en) * 1991-06-04 1997-05-06 Matsushita Electric Industrial Co., Ltd. Apparatus for simultaneously scheduling instructions from plural instruction stream into plural instruction executions units
CN101067781A (en) * 2006-03-07 2007-11-07 英特尔公司 Technique to perform memory disambiguation
CN101770357A (en) * 2008-12-31 2010-07-07 世意法(北京)半导体研发有限责任公司 Method for reducing instruction conflict in processor
CN103116485A (en) * 2013-01-30 2013-05-22 西安电子科技大学 Assembler designing method based on specific instruction set processor for very long instruction words

Also Published As

Publication number Publication date
CN106610816A (en) 2017-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181030