CN106610816A - Avoidance method for conflict between instruction sets in RISC-CPU and avoidance system thereof - Google Patents

Info

Publication number
CN106610816A
Authority
CN
China
Prior art keywords
instruction
conflict
cpu
risc
relevant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611246947.6A
Other languages
Chinese (zh)
Other versions
CN106610816B (en)
Inventor
孙建辉
王春兴
王公堂
李登旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN201611246947.6A
Publication of CN106610816A
Application granted
Publication of CN106610816B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/30134Register stacks; shift registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30141Implementation provisions of register files, e.g. ports
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3814Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3816Instruction alignment, e.g. cache line crossing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • G06F9/3869Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The invention discloses a method for avoiding conflicts between instruction sets in a RISC-CPU, and a corresponding avoidance system. The method comprises the following steps. Step one: the conflict type is determined according to the data dependences between different instruction sets in the RISC-CPU. Step two: for the current instruction, it is determined whether the instruction accesses the register file or memory; if so, step three is performed, otherwise analysis continues with the next instruction. Step three: a "coherence window" is defined for the current instruction and searched for "coherent instructions"; if any exist, the process enters step four, otherwise there is no read/write conflict. Step four: a conflict-resolution method is selected according to the conflict type between the instruction sets, and the data conflict is resolved with a concrete strategy while pipeline throughput is preserved.

Description

Method and system for avoiding conflicts between instruction sets in a RISC-CPU
Technical field
The present invention relates to the field of computer processing technology, and in particular to a method and system for avoiding conflicts between instruction sets in a RISC-CPU.
Background
The design of an embedded RISC-CPU, from selecting the CPU hardware instruction set and customizing instructions through to designing the corresponding compiler, is extremely important.
The related art is as follows. Invention patent application No. CN200810191060, "Reducing instruction conflicts in a processor", filed by a Beijing-based semiconductor research and development company, resolves instruction conflicts at the instruction issue stage: it chooses between two kinds of instructions to be sent to the subsequent parallel functional units, checks whether the two are free of conflict, and otherwise arbitrates to select one of them. That method is aimed at selecting a conflict-free instruction from a multi-issue instruction window.
Invention patent application No. CN200710087737 by Intel, "Techniques for performing memory disambiguation", provides a solution for conflict avoidance between adjacent memory-access instructions, but it does not address the definition of a coherence window within or between the remaining related instruction sets, nor conflict avoidance between coherent instructions.
Invention patent application No. CN201310054280 by Xidian University, "An assembler design method based on a very long instruction word ASIP", performs instruction scheduling through the design of the assembler and uses methods such as register renaming to remove the write-after-write (WAW) and write-after-read (WAR) conflicts caused by out-of-order execution.
In addition to the limitations noted for it above, Intel's application No. CN200710087737 involves an embedded RISC-CPU whose number of pipeline stages differs from that of the present patent, and it does not mention the CPU hardware engineering design strategy proposed in this design, which proceeds from a prototype CPU based on an instruction subset to the full instruction set.
It can be seen that the prior art neither provides a method for resolving RAW dependence conflicts from the perspective of the instruction system as a whole, nor proposes a CPU hardware design engineering strategy that goes from a CPU prototype with a locally minimal instruction set to the full instruction set.
Summary of the invention
To remedy the deficiencies of the prior art, the invention discloses a method and system for avoiding conflicts between instruction sets in a RISC-CPU. It provides a mechanism for resolving conflicts between instruction sets, and an intelligent-control hardware device for resolving read/write conflicts in a RISC-CPU.
To achieve the above object, the specific scheme of the invention is as follows:
A method for avoiding conflicts between instruction sets in a RISC-CPU comprises the following steps:
Step 1: determine the conflict type according to the data dependences between different instruction sets in the RISC-CPU;
Step 2: for the current instruction, determine whether it needs to access the register file or memory; if so, go to step 3; otherwise, continue by analyzing the next instruction;
Step 3: for the current instruction, define a "coherence window", search the window for "coherent instructions", and determine whether any exist; if so, go to step 4; otherwise, conclude that there is no read/write conflict;
Step 4: select a conflict-resolution method according to the conflict type between the instruction sets, and resolve the data conflict with a specific strategy while preserving pipeline throughput.
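The control flow of steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented hardware: the tuple layout `(mnemonic, written location, read locations)`, the function name `analyze`, the fixed window length, and the register names in the usage example are all hypothetical.

```python
def analyze(stream, window_len=4):
    """Walk an instruction stream and report, per instruction, which decision the flow reaches.

    Each instruction is a tuple (mnemonic, written_location_or_None, read_locations).
    The coherence window is modelled (as an assumption) as the next `window_len` instructions.
    """
    outcomes = []
    for i, (op, writes, reads) in enumerate(stream):
        # Step 2: instructions touching neither the register file nor memory are skipped.
        if writes is None and not reads:
            outcomes.append((op, "skip", []))
            continue
        # Step 3: look for later instructions that read or write the location written here.
        window = stream[i + 1 : i + 1 + window_len]
        coherent = [
            later_op
            for (later_op, later_writes, later_reads) in window
            if writes is not None and (writes in later_reads or writes == later_writes)
        ]
        # Step 4 would select a resolution strategy; here we only record the decision.
        decision = "resolve" if coherent else "no conflict"
        outcomes.append((op, decision, coherent))
    return outcomes


example = [
    ("ADD", "R1", ("R2", "R3")),
    ("XOR", "R4", ("R1", "R5")),
    ("B", None, ()),
    ("STORE", "[R6]", ("R1", "R6")),
]
for row in analyze(example):
    print(row)
```

In this example the ADD is the only instruction that reaches step 4, because both the XOR and the STORE read the register it writes.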
Further, in step 1, the architecture, instruction set and pipeline stages of the RISC-CPU are statistically analyzed and classified, and the full instruction set is divided into a data-processing instruction set, a memory-access instruction set, a branch instruction set and an immediate instruction set.
Further, in step 1, after the four instruction sets are obtained, the conflicts between them are statistically analyzed, that is, the read/write conflict types among the instruction sets are enumerated and classified. The conflicts introduced by data dependences between instructions are of two kinds. First, a later dependent instruction reads stale register data, or overwrites newly written data that has not yet been read; the data dependences of the original instruction order are then broken and a functional error results. Second, the original instruction pipeline is interrupted, which includes the loss of instruction throughput caused by freezing the pipeline or by inserting NOP instructions into it.
Further, the conflict types are classified as conflicts within the data-processing, memory-access and immediate instruction sets, and conflicts between them.
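The read/write orderings referred to in this document are RAW, WAR and WAW. As a minimal sketch of how a pair of instructions might be assigned to these classes, assuming each instruction is summarized by the set of locations it writes and the set it reads (the function name `classify` and the register names are hypothetical):

```python
def classify(first_writes, first_reads, second_writes, second_reads):
    """Return the dependence kinds of a later instruction on an earlier one.

    All four arguments are sets of register or memory location names.
    """
    kinds = set()
    if first_writes & second_reads:
        kinds.add("RAW")   # later instruction reads what the earlier one writes
    if first_reads & second_writes:
        kinds.add("WAR")   # later instruction overwrites what the earlier one reads
    if first_writes & second_writes:
        kinds.add("WAW")   # both write the same location
    return kinds


# ADD R1, R2, R3 followed by XOR R4, R1, R5: the XOR reads R1, which the ADD writes.
print(classify({"R1"}, {"R2", "R3"}, {"R4"}, {"R1", "R5"}))
```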
Further, in step 3, the "coherence window" is defined as follows. For superscalar instruction processing, out-of-order execution of instructions is not considered. If the current instruction operates on (for example, writes) a specific register in the register file or a location in memory, its coherence window is the sequence of instructions that follow it in the instruction stream during the interval from the fetch of the current instruction until its result is committed.
Further, in step 3, a "coherent instruction" is defined as follows. Based on the coherence window defined above, and again without considering out-of-order execution: if the current instruction needs to write a register or an offset address in memory, its coherent instructions are the one or more instructions, found by searching forward from the current instruction within the coherence window, that also need to write or read the location written by the current instruction. Instructions with such data dependences may produce read/write data conflicts, which interrupt the pipeline and reduce instruction throughput.
Further, within the coherence window, the conflict-resolution methods include: (1) feeding the result computed before the commit stage of the current instruction directly into the arithmetic unit of the following coherent instruction, instead of waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction; (2) keeping a delayed version of the operation result of the coherent instruction that follows the current instruction, and using the output of the delayed version as a backup register resource for resolving the read/write conflict.
Further, in step 4, the conflict-resolution methods specifically include:
(1) early forwarding of intermediate data before the final result is committed: the intermediate data is fed directly into the input of the next instruction without waiting for the final commit stage;
(2) forwarding of a delayed version of the data before the final result is committed;
(3) keeping a pipelined delayed version of the intermediate result alongside the original version: in the pipeline stage of the next coherent instruction, an extra delayed-version register is added that delays the value by one or more cycles, while the register version in the original, normal pipeline stage is retained;
(4) not freezing the original pipeline;
(5) not inserting NOP instructions.
Further, specifically, RAW conflict resolution for the ADD and XOR instructions: the result of the ADD instruction is fed forward early to the XOR as its input, so the XOR does not have to wait for the ADD to commit before executing;
RAW conflict resolution for the STORE and LOAD instructions: the result of the STORE is used as the input to the address calculation of the LOAD, supplied through a delay register added ahead of the LOAD instruction;
RAW conflict resolution for the ADD and STORE instructions: the result of the ADD instruction is fed directly to the next stage as the offset address used to compute the address accessed by the STORE instruction;
RAW conflict resolution for the LOAD and ADD instructions: the result of the LOAD instruction is used directly as the input of the ADD, and a version of the ADD's intermediate result delayed by one cycle is used as the offset address for the DMEM access.
Further, in step 1, after all read/write conflicts that may exist have been classified, they are stored in a local table, the conflict retrieval table, which contains an index and a conflict type. Using superscalar techniques, multiple instructions are fetched from instruction memory and stored in an instruction buffer window, so that the coherence window can be defined and the coherent instructions searched for in advance. With the coherence window and coherent instructions defined, the instruction stream is searched for read/write conflicts; if a dependence conflict exists between instructions, the conflict type stored in advance in the local table is looked up, and the conflict is avoided early and quickly.
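A conflict retrieval table of this kind can be pictured as a small lookup structure. The sketch below is an assumption about its shape: the text only states that the table holds an index and a conflict type, so the key (producer, consumer), the `strategy` field, and the names `ConflictEntry` and `lookup` are hypothetical. The four entries paraphrase the four RAW cases described in this document.

```python
from typing import NamedTuple, Optional


class ConflictEntry(NamedTuple):
    index: int          # position in the retrieval table
    conflict_type: str  # e.g. "RAW"
    strategy: str       # illustrative description of the resolution


# Hypothetical table contents, keyed by (earlier instruction, later instruction).
CONFLICT_TABLE = {
    ("ADD", "XOR"): ConflictEntry(0, "RAW", "forward ADD result early to the XOR input"),
    ("STORE", "LOAD"): ConflictEntry(1, "RAW", "delay register ahead of the LOAD address calculation"),
    ("ADD", "STORE"): ConflictEntry(2, "RAW", "forward ADD result as the STORE offset address"),
    ("LOAD", "ADD"): ConflictEntry(3, "RAW", "forward LOAD result to ADD; delayed ADD result as DMEM offset"),
}


def lookup(producer: str, consumer: str) -> Optional[ConflictEntry]:
    """Return the pre-classified entry for a pair of mnemonics, or None if the pair is not listed."""
    return CONFLICT_TABLE.get((producer, consumer))


entry = lookup("ADD", "XOR")
print(entry.index, entry.conflict_type, entry.strategy)
```

Because the table is filled in ahead of time, the run-time work reduces to one lookup per coherent pair found in the instruction buffer window.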
A system for avoiding conflicts between instruction sets in a RISC-CPU comprises a conflict-resolution control unit applied in the RISC-CPU. The conflict-resolution control unit is configured as follows: after all read/write conflicts that may exist have been classified, they are stored in a local table, the conflict retrieval table, which contains an index and a conflict type; using superscalar techniques, multiple instructions are fetched from instruction memory and stored in an instruction buffer window, so that the coherence window can be defined and the coherent instructions searched for in advance; with the coherence window and coherent instructions defined, the instruction stream is searched for read/write conflicts; if a dependence conflict exists between instructions, the conflict type stored in advance in the local table is looked up, and the conflict is avoided early and quickly.
Beneficial effects of the invention:
For the instruction set of the current RISC-CPU, the invention statistically analyzes and classifies, in advance, the conflict types that exist between instructions. Then, based on the coherent instructions found by the conflict search, it controls the "conflict search and resolution" unit so that the pipeline runs smoothly within the coherence window.
The invention provides an effective strategy for resolving conflicts between instruction sets in RISC-CPU design. It defines the coherence window and the coherent instructions of an instruction in order to resolve the read/write, write/read and write/write conflicts among coherent instructions inside a coherence window. Using the resolution of RAW conflicts as the example, this document analyzes the conflicts between several major instruction sets. The engineering method quantitatively defines the dependences behind instruction conflicts and resolves them. By adding a hardware unit that locates and resolves read/write conflicts, it can effectively avoid the instruction conflicts that exist between different instruction sets of a RISC-CPU while improving pipeline throughput.
Brief description of the drawings
Fig. 1 shows the classification of the instruction sets of the RISC-CPU;
Fig. 2 is a schematic of the coherence window and coherent instructions;
Fig. 3 shows the background and workflow of the proposed invention;
Fig. 4 shows the added hardware unit for locating and resolving conflicts;
Fig. 5 shows RAW conflict resolution for the ADD and XOR instructions;
Fig. 6 shows RAW conflict resolution for the STORE and LOAD instructions;
Fig. 7 shows RAW conflict resolution for the ADD and STORE instructions;
Fig. 8 shows RAW conflict resolution for the LOAD and ADD instructions.
Detailed description:
The invention is described in detail below with reference to the drawings:
As shown in Fig. 1, the invention proposes an effective method for resolving CPU instruction conflicts. In the design, all instructions are divided into four classes (data-processing instructions, branch instructions, immediate instructions, memory-access instructions). A representative instruction subset is extracted from each class, and the RAW (read-after-write) true data dependence conflicts within each subset and between subsets are resolved. All potential RAW conflicts are identified statistically in advance and are then located and resolved, so that the pipeline runs smoothly.
The RISC-CPU design proposed by the invention divides the full instruction set into four instruction classes: data processing (Data-Processing), memory access (STORE/LOAD), branch (Branch) and immediate (Immediate). Within each class (except branch) and between classes, conflict resolution is explained using RAW (read-after-write) as the example. There are several RAW conflicts in total (within the data-processing, memory-access and immediate instruction sets, and between them), and the conflicts caused by these data dependences need to be resolved.
Definition of the "coherence window": for superscalar instruction processing, out-of-order execution of instructions is not considered. If the current instruction operates on (for example, writes) a specific register in the register file or a location in memory, its coherence window is the sequence of instructions that follow it in the instruction stream during the interval from the fetch of the current instruction until its result is committed.
Definition of a "coherent instruction": based on the coherence window defined above, and again without considering out-of-order execution, if the current instruction needs to write a register or an offset address in memory, its coherent instructions are the one or more instructions, found by searching forward from the current instruction within the coherence window, that also need to write or read the location written by the current instruction. Instructions with such data dependences may produce read/write data conflicts, which interrupt the pipeline and reduce instruction throughput efficiency (CPI).
For each case, the coherence window and the coherent instructions are given. Fig. 2 shows example coherence windows, for the ADD instruction (thick solid line) and the XOR instruction (thin dashed line). Fig. 2 also shows example coherent instructions: the coherent instructions of the ADD instruction are the XOR and STORE instructions. For instruction sets of the same kind, for example the data-processing subset with the ADD and XOR instructions shown in Tables 1 and 2, RAW conflicts are resolved as shown in Fig. 3. The memory-access instructions (LOAD/STORE) are shown in Tables 3, 4, 5 and 6:
Table 1: Instruction format of the data-processing (Data-Processing) instruction set
Table 2: Representatives of the data-processing instruction set (ADD addition instruction, XOR exclusive-OR instruction)
Table 3: Memory-to-register (LOAD) instruction format
Table 4: Representative of the LOAD instruction subset
LDR    RD ← [RS]    LDR RD, [RS]
Table 5: Register-to-memory (STORE) instruction format
Table 6: Representative of the STORE instruction subset
STR    RD → [RS]    STR RD, [RS]
RAW conflicts are resolved for each of these cases, as shown in Fig. 5.
For instruction sets of different kinds, RAW conflicts are resolved as shown in Figs. 6, 7 and 8.
For the remaining instruction sets (immediate instructions and branch instructions), the RAW conflicts between them are not analyzed further.
Within the coherence window, the conflict-resolution methods include: (1) feeding the result computed before the commit stage of the current instruction directly into the arithmetic unit of the following coherent instruction, instead of waiting until the current instruction has committed and only then feeding the final result into the arithmetic unit of the subsequent instruction; (2) keeping a delayed version of the operation result of the coherent instruction that follows the current instruction, and using the output of the delayed version as a backup register resource for resolving the read/write conflict.
This design analyzes only the RAW conflicts between data-processing and memory-access instructions. Conflicts between the other instruction sets, and the avoidance of the remaining WAR/WAW conflicts, are not illustrated with specific examples.
Instruction conflict resolution according to the invention is explained with reference to Figs. 3 and 4. In RISC-CPU hardware design, adjacent instructions may have read/write data dependences: accesses to the same register in the register file, or to the same offset address in memory, conflict in their read/write order (RAW/WAR/WAW), which stalls the pipeline and reduces efficiency. In the design of a RISC-CPU, conflicts between instruction sets therefore interrupt the pipeline and reduce instruction throughput. The classic MIPS pipeline has five stages: stage 1, instruction fetch (fetch); stage 2, instruction decode (decode); stage 3, execute (execute); stage 4, memory access (memory access, to internal memory, covering the two classes load and store); stage 5, write-back (Write Back, the result is written back to the register file, or register bank, of the RISC-CPU). In the classic MIPS architecture, the longest instructions occupy all five pipeline stages (for example the memory-access STORE instruction, followed by write-back to the register file). Ordinary data-operation instructions that do not access memory occupy four stages (for example the ADD instruction, whose sum is finally written into a register of the register file, with no memory-access stage in between). Some instructions finish in the decode stage (for example the jump instruction JMP). Conditional branch instructions need to change the order of the instruction stream and so cause non-sequential execution. It can be seen that, in RISC-CPU design, because different instruction sets occupy different numbers of pipeline stages and the arrangement of instructions is highly random, reads and writes to the same register in the register file or the same memory offset address are very likely to produce data dependence conflicts.
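The five-stage list allows a rough count of how many bubble cycles a dependent pair would need if the pipeline were stalled rather than bypassed. The sketch below assumes a single-issue pipeline in which an instruction issued in cycle t occupies stage k (0-based) in cycle t + k, and assumes that a value can only be consumed in a cycle after the one that produces it. The function name `bubbles_needed` and its parameters are hypothetical, not taken from the patent.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # fetch, decode, execute, memory access, write-back


def bubbles_needed(distance, produce_stage, consume_stage):
    """Bubble cycles required so the consumer's stage starts after the producer's stage ends.

    distance      : how many instructions later the consumer is issued (1 = adjacent)
    produce_stage : stage at whose end the producer's value becomes available
    consume_stage : stage in which the consumer needs the value
    """
    p = STAGES.index(produce_stage)
    c = STAGES.index(consume_stage)
    # The consumer reaches its stage in cycle distance + bubbles + c;
    # that must be strictly later than cycle p.
    return max(0, p - c - distance + 1)


# Adjacent ADD -> XOR, value only visible after write-back, operands read in decode.
print(bubbles_needed(1, "WB", "ID"))
# Same pair, with the execute-stage result forwarded to the next execute stage.
print(bubbles_needed(1, "EX", "EX"))
```

Under these assumptions, an XOR issued directly after an ADD would need three bubbles if it had to wait for write-back, and none if the ADD's execute-stage result is forwarded to the XOR's execute stage.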
By analyzing the classic MIPS architecture and the conflicts that register read/write dependences may cause between different instruction sets, the invention defines a coherence window for the current instruction and searches the window for the coherent instructions of the current instruction. Using the data dependences between different instruction sets in the RISC-CPU, analyzed in advance, it determines the conflict type and avoids the loss of pipeline efficiency. The consequences to be avoided are: (1) a later dependent instruction reads stale register data, or overwrites newly written data that has not yet been read, so that the data dependences of the original instruction order are broken and a functional error results; (2) the original instruction pipeline is interrupted (including the loss of instruction throughput caused by freezing the pipeline or inserting NOP instructions). Once the pipeline is frozen, or simply delayed as a whole, the advantage of pipelining is weakened and instruction throughput falls.
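The first consequence, reading stale register data, can be made concrete with a small value-level model. This is an illustration only: the register names, initial values and the function `run` are hypothetical, and the model collapses the pipeline to the single moment at which the later instruction needs its operand.

```python
def run(forwarding: bool) -> int:
    """Model ADD R1, R2, R3 immediately followed by XOR R4, R1, R5.

    The ADD has computed its result but has not yet written it back
    when the XOR needs R1.
    """
    regs = {"R1": 0, "R2": 3, "R3": 4, "R5": 0b1010}

    # ADD result sits in a pipeline register; the register file still holds the old R1.
    add_result = regs["R2"] + regs["R3"]

    # With forwarding the XOR takes the in-flight value; without it, it reads the stale R1.
    r1_seen_by_xor = add_result if forwarding else regs["R1"]
    return r1_seen_by_xor ^ regs["R5"]


print(run(forwarding=True))   # uses the new R1 (7)
print(run(forwarding=False))  # uses the stale R1 (0), a functional error
```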
In the examples analyzed in the invention, the following strategies are used to resolve data conflicts while preserving pipeline throughput: (1) early forwarding of intermediate data before the final result is committed; (2) forwarding of a delayed version of the data before the final result is committed; (3) keeping a pipelined delayed version of the intermediate result alongside the original version; (4) not freezing the original pipeline; (5) not inserting NOP instructions. For example, in method (1) the intermediate data is forwarded early (without waiting for the final commit stage) and fed directly into the input of the next instruction. In method (3), an extra delayed-version register (delaying by one or more cycles) is added in the pipeline stage of the next coherent instruction, while the register version in the original, normal pipeline stage is retained.
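Method (3) can be pictured as a pipeline register with an extra shadow copy that lags by a fixed number of clock cycles. The following is a behavioural sketch under that reading, not a description of the actual circuit; the class name `DelayedRegister` and its interface are hypothetical.

```python
from collections import deque


class DelayedRegister:
    """A pipeline register that also keeps a copy of its value from `delay` clocks ago."""

    def __init__(self, delay: int = 1):
        self.current = None                                   # the original, normal version
        self._history = deque([None] * delay, maxlen=delay)   # the extra delayed-version registers

    def clock(self, value):
        """One clock edge: the old value moves into the delay chain, the new value is latched."""
        self._history.append(self.current)
        self.current = value

    @property
    def delayed(self):
        """Value that was latched `delay` clock edges before the most recent one."""
        return self._history[0]


reg = DelayedRegister(delay=1)
reg.clock(10)
reg.clock(20)
print(reg.current, reg.delayed)
```

The normal version (`current`) keeps advancing every cycle, so the original pipeline is not frozen, while a coherent instruction that needs the earlier value can read `delayed`.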
For the instruction set of the current RISC-CPU, the conflict types that exist between instructions are statistically analyzed and classified in advance. Then, based on the coherent instructions found by the conflict search, the "conflict search and resolution" unit is controlled so that the pipeline runs smoothly within the coherence window. Figs. 5 and 7 use the mechanism of forwarding intermediate results early; Figs. 6 and 8 use the method of a delayed version of the computed result. The method in common use today is the NOP mechanism, which pushes the pipeline back; this patent does not use it, nor does it freeze the pipeline. The delayed version in this patent retains the original pipeline registers and, for the coherent-instruction conflict classes known in advance, keeps a backup delayed version. This resolves local instruction conflicts inside the coherence window without disturbing the data transfer between pipeline stages of the original, normal version. This patent makes full use of data forwarding, delayed versions and other such methods to keep the pipeline smooth inside the coherence window. It manages conflict resolution systematically, from statistics and classification through location to resolution. The conflict-resolution framework of this patent can also absorb new instruction conflict-resolution techniques, making the overall system more complete.
Explanation of Fig. 1: Fig. 1 shows conflict resolution between the several instruction sets (data processing, immediate, branch, memory access).
Explanation of Fig. 2: Fig. 2 gives the coherence windows for two instructions, the current ADD instruction and the current XOR instruction. For the ADD instruction it also gives the coherent instructions that have read/write data conflicts with it: XOR and STORE. These two instructions are the coherent instructions of the ADD instruction.
Explanation of Fig. 3: Fig. 3 gives the overall flow of the invention's method of resolving instruction conflicts based on the coherence window and coherent instructions. The method can be added to a RISC-CPU hard core as a module that intelligently locates and resolves instruction read/write conflicts. Of course, once the architecture, number of pipeline stages, instruction set types, instruction formats and so on of the RISC-CPU currently being designed are determined, the RISC-CPU hard core needs to be developed again. Traditional methods do not locate and resolve conflicts quantitatively from a system perspective.
This patent first statistically analyzes, in advance, the read/write conflicts between instructions of different kinds or of the same kind, and classifies the conflicts strictly. After all read/write conflicts that may exist have been classified, they are stored in a local table, the conflict retrieval table (index, conflict type). Using superscalar techniques, multiple instructions are fetched from instruction memory and stored in an instruction buffer window, so that the coherence window can be defined and the coherent instructions searched for in advance. With the coherence window and coherent instructions strictly defined, the instruction stream is searched for read/write conflicts; if a dependence conflict exists between instructions, the conflict type stored in advance in the local table is looked up, and the conflict is avoided early and quickly (through the added conflict-resolution control unit). Tables 1, 2 and 3 in the invention list several instructions: Table 1 gives the data format and instruction mnemonics of the data-processing instructions, and Tables 2 and 3 give the data format and instruction mnemonics of the memory-access LOAD/STORE instructions.
Figs. 5 and 7 show early feedforward of a result before the preceding instruction has committed it, and Figs. 6 and 8 show the output of a delayed version of an intermediate calculation result for the subsequent relevant instruction. Both methods are intended to prevent the pipeline from being interrupted. Delivering results to relevant instructions with different numbers of delay cycles requires some additional registers. These registers hold delayed versions of the calculation result and do not disturb the results of the original pipeline. Other conflict resolution techniques can also be incorporated into the technical framework of the present invention.
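The extra registers that hold delayed versions of a result can be modeled as a short shift register placed beside the normal pipeline register. The sketch below is a behavioral illustration only, not the circuit of the patent; the class name `DelayLine`, the method `tick`, and the chosen depth are assumptions made for the example.

```python
from typing import List, Optional


class DelayLine:
    """Behavioral model of `depth` extra registers holding delayed copies of a result."""

    def __init__(self, depth: int) -> None:
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.regs: List[Optional[int]] = [None] * depth

    def tick(self, value: Optional[int]) -> Optional[int]:
        """One clock edge: shift `value` in and return the value from `depth` cycles ago."""
        out = self.regs[-1]
        self.regs = [value] + self.regs[:-1]
        return out


# The original result stream stays untouched; the delay line only adds a delayed copy.
results = [10, 20, 30]
one_cycle = DelayLine(depth=1)
delayed = [one_cycle.tick(v) for v in results]
```

Here `delayed` holds each result one cycle later than `results`, with an empty slot in the first cycle, which is the kind of one-cycle delayed version that Fig. 8 uses as the DMEM offset address.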
If the current instruction does not access the register file or memory, the next instruction is analyzed and the "coherence window" slides down. If there is no "relevant instruction" in the "coherence window", there is no read/write conflict and the analysis ends.
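The sliding-window search can be sketched in a few lines of Python. This is a simplified software model under stated assumptions: the `Instr` record, the function names `find_relevant` and `scan`, and the fixed `window` length are hypothetical, whereas the patent defines the window as running from instruction fetch until the current instruction's result is committed. The sample program reproduces the Fig. 2 situation in which XOR and STORE are the relevant instructions of ADD.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dst: Optional[str]          # register or memory location written, if any
    srcs: Tuple[str, ...] = ()  # registers or memory locations read


def find_relevant(stream: List[Instr], cur: int, window: int) -> List[int]:
    """Indices of later instructions in the window that read or write stream[cur].dst."""
    target = stream[cur].dst
    if target is None:
        return []
    end = min(len(stream), cur + 1 + window)
    return [j for j in range(cur + 1, end)
            if target in stream[j].srcs or stream[j].dst == target]


def scan(stream: List[Instr], window: int) -> Dict[int, List[int]]:
    """Slide the window over the stream and map each instruction to its relevant instructions."""
    conflicts: Dict[int, List[int]] = {}
    for cur, ins in enumerate(stream):
        if ins.dst is None and not ins.srcs:
            continue  # no register file or memory access: analyze the next instruction
        relevant = find_relevant(stream, cur, window)
        if relevant:
            conflicts[cur] = relevant
    return conflicts


program = [
    Instr("ADD", "r1", ("r2", "r3")),
    Instr("XOR", "r4", ("r1", "r5")),
    Instr("STORE", "mem", ("r1", "r6")),
    Instr("NOP", None),
]
found = scan(program, window=3)
```

An instruction missing from `found` corresponds to the "no read/write conflict" exit of the flow.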
Explanation of Fig. 5: The operation result of the ADD instruction is fed forward to XOR as an input ahead of time, so XOR does not have to wait until the ADD instruction has committed before executing, which keeps the pipeline flowing smoothly.
Explanation of Fig. 6: The result of STORE is used as the address calculation input of LOAD. A delay register is added ahead of the LOAD instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 7: The operation result of the ADD instruction is fed directly to the next stage as the offset address used to calculate the access address of the STORE instruction, which keeps the pipeline flowing smoothly.
Explanation of Fig. 8: The loaded result of the LOAD instruction is sent directly to the following dependent instruction: the result of the LOAD instruction is used directly as the input of ADD, and a version of the ADD intermediate result delayed by one cycle is used as the offset address for the DMEM access, which keeps the pipeline flowing smoothly.
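The early feedforward of Figs. 5 and 7 amounts to an operand selector that prefers a computed but not yet committed result over the stale register file value. The Python sketch below illustrates that idea for the ADD and XOR pair; the function `read_operand`, the register names, and the operand values are assumptions chosen for the example and do not come from the patent.

```python
from typing import Dict, Optional, Tuple


def read_operand(regfile: Dict[str, int], src: str,
                 in_flight: Optional[Tuple[str, int]]) -> int:
    """Return the forwarded value if the older instruction writes `src`, else the committed value."""
    if in_flight is not None and in_flight[0] == src:
        return in_flight[1]
    return regfile[src]


regfile = {"r1": 0, "r2": 3, "r3": 4, "r5": 6}

# ADD r1, r2, r3: result is computed but not yet committed to the register file.
add_result = ("r1", regfile["r2"] + regfile["r3"])

# XOR r4, r1, r5: reads r1 through the forwarding path instead of stalling.
xor_forwarded = (read_operand(regfile, "r1", add_result)
                 ^ read_operand(regfile, "r5", add_result))

# Without forwarding (and without a stall), XOR would read the stale r1.
xor_stale = read_operand(regfile, "r1", None) ^ read_operand(regfile, "r5", None)
```

Comparing `xor_forwarded` with `xor_stale` shows the functional error that a RAW conflict causes when neither forwarding nor a stall is applied.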
The present invention can use the "coherence window" and "relevant instructions" to find and avoid instruction conflicts. It is a simple and effective strategy for resolving conflicts between various instruction sets, and it is beneficial for resolving the common hardware instruction conflicts of a RISC-CPU.
The intelligent conflict finding and resolution unit is added to the RISC-CPU design and can be applied to superscalar CPU designs. To keep the pipeline flowing smoothly, the present invention combines techniques such as early data forwarding, not freezing the pipeline, not inserting NOP instructions, delayed feedforward, and multi-cycle delayed versions of intermediate calculation results.
Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those of ordinary skill in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort still fall within the scope of protection of the present invention.

Claims (10)

1. A method for avoiding conflicts between instruction sets in a RISC-CPU, characterized by comprising the following steps:
Step 1: Determine the conflict types according to the data dependencies between different instruction sets in the RISC-CPU;
Step 2: For the current instruction, judge whether it needs to access the "register file" or "memory"; if so, proceed to Step 3; otherwise, continue by analyzing the next instruction;
Step 3: For the current instruction, define a "coherence window", search for "relevant instructions" within the "coherence window", and judge whether a "relevant instruction" exists; if so, proceed to Step 4; otherwise, judge that there is no read/write conflict;
Step 4: According to the conflict type between the instruction sets, select a conflict resolution method, and resolve the data conflict with the specific strategy while maintaining pipeline throughput efficiency.
2. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1, characterized in that, in Step 1, statistical analysis and classification are carried out for the architecture, instruction set, and pipeline stages of the RISC-CPU, and the set of all instructions is divided into a data processing instruction set, a memory access instruction set, a branch instruction set, and an immediate instruction set.
3. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 2, characterized in that, in Step 1, after the four instruction sets are obtained, the conflicts between the instruction sets are statistically analyzed, and the read/write conflict types between the instruction sets are traversed and classified, yielding the conflicts introduced by the existing instruction data dependencies: a later "dependent instruction" reads stale register data or overwrites newly written data that has not yet been read, i.e., the data dependencies of the original instruction sequence are violated, causing functional errors; and the original instruction pipeline is interrupted, including the reduction in instruction throughput caused by freezing the pipeline or inserting NOP instructions into the pipeline.
4. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1 or 3, characterized in that the conflict types are classified specifically as: conflicts within, and conflicts between, the data processing, memory access, and immediate instruction sets;
Preferably, RAW conflict resolution for ADD and XOR instructions: the operation result of the ADD instruction is fed forward to XOR as an input ahead of time, so XOR does not have to wait until the ADD instruction has committed before executing;
RAW conflict resolution for STORE and LOAD instructions: the result of STORE is used as the address calculation input of LOAD, with a delay register added ahead of the LOAD instruction;
RAW conflict resolution for ADD and STORE instructions: the operation result of the ADD instruction is fed directly to the next stage as the offset address used to calculate the access address of the STORE instruction;
RAW conflict resolution for LOAD and ADD instructions: the result of the LOAD instruction is used directly as the input of ADD, and a version of the ADD intermediate result delayed by one cycle is used as the offset address for the DMEM access.
5. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1, characterized in that, in Step 3, the "coherence window" is defined as follows: for superscalar instruction processing, out-of-order execution of instructions need not be considered; for the current instruction, if it needs to write a certain register, i.e., if it operates on a specific register in the register file or on a certain location in memory, the window extends from the current instruction forward through the subsequent instruction stream and covers the instruction sequence from the fetch of the instruction until the result of the current instruction is committed.
6. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 5, characterized in that, in Step 3, the "relevant instruction" is defined as follows: based on the defined "coherence window" and without considering out-of-order execution of instructions, for the "current instruction", if it needs to write a certain register or a certain offset address in memory, then, within the "coherence window" and searching from the current instruction toward the subsequent instructions, the one or more instructions that also need to read or write the location written by the "current instruction" have a data dependency on it; such instructions may cause data read/write conflicts, which interrupt the pipeline and reduce instruction throughput efficiency.
7. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1, characterized in that the conflict resolution methods applied within the "coherence window" include: (1) the logic operation result available before the commit stage of the current instruction is fed directly to the arithmetic unit of the following relevant instruction, rather than waiting until the current instruction has committed and only then feeding the final result to the arithmetic unit of the subsequent instruction; (2) a delayed version of the operation result is produced for the relevant instruction that follows the current instruction, and the output of the delayed version serves as a backup register resource for resolving the read/write conflict.
8. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1, characterized in that, in Step 4, the conflict resolution methods specifically include:
(1) early feedforward of intermediate operation data before the final operation result is committed: the intermediate operation data is fed directly to the input of the next instruction without waiting for the final commit stage;
(2) feedforward of a delayed version of the data before the final operation result is committed;
(3) storing a pipelined delayed version of the intermediate calculation result alongside the original version: in the pipeline stage of the next relevant instruction, an additional delayed-version register is added that delays by one or several cycles, while the register version in the original normal pipeline stage is retained;
(4) a strategy of not freezing the original pipeline;
(5) a method of not inserting NOP instructions.
9. The method for avoiding conflicts between instruction sets in a RISC-CPU as claimed in claim 1, characterized in that, in Step 1, after all read/write conflicts that may exist have been classified, they are stored in a local archive table: the conflict retrieval table, comprising an index and a conflict type; using superscalar techniques, multiple instructions are fetched from instruction memory and stored in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for in advance; by defining the "coherence window" and the "relevant instruction", the instruction stream is searched for read/write conflicts; if a dependency conflict exists between instructions, the conflict type between instruction sets that was stored locally in advance is looked up, and the conflict is quickly avoided in advance.
10. A system for avoiding conflicts between instruction sets in a RISC-CPU, characterized by comprising a conflict resolution control unit applied in the RISC-CPU, the conflict resolution control unit being used to: store all read/write conflicts that may exist, after they have been classified, in a local archive table: the conflict retrieval table, comprising an index and a conflict type; using superscalar techniques, fetch multiple instructions from instruction memory and store them in an instruction buffer window, where the "coherence window" is defined and "relevant instructions" are searched for in advance; by defining the "coherence window" and the "relevant instruction", search the instruction stream for read/write conflicts; and, if a dependency conflict exists between instructions, look up the conflict type between instruction sets that was stored locally in advance and quickly avoid the conflict in advance.
CN201611246947.6A 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU Active CN106610816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611246947.6A CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611246947.6A CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Publications (2)

Publication Number Publication Date
CN106610816A true CN106610816A (en) 2017-05-03
CN106610816B CN106610816B (en) 2018-10-30

Family

ID=58636378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611246947.6A Active CN106610816B (en) 2016-12-29 2016-12-29 The bypassing method and system to conflict between instruction set in a kind of RISC-CPU

Country Status (1)

Country Link
CN (1) CN106610816B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5627982A (en) * 1991-06-04 1997-05-06 Matsushita Electric Industrial Co., Ltd. Apparatus for simultaneously scheduling instructions from plural instruction stream into plural instruction executions units
CN101067781A (en) * 2006-03-07 2007-11-07 英特尔公司 Technique to perform memory disambiguation
CN101770357A (en) * 2008-12-31 2010-07-07 世意法(北京)半导体研发有限责任公司 Method for reducing instruction conflict in processor
CN103116485A (en) * 2013-01-30 2013-05-22 西安电子科技大学 Assembler designing method based on specific instruction set processor for very long instruction words

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189715A (en) * 2018-08-16 2019-01-11 算丰科技(北京)有限公司 Programmable artificial intelligence accelerator execution unit and artificial intelligence accelerated method
CN111221573A (en) * 2018-11-26 2020-06-02 深圳云天励飞技术有限公司 Management method of register access time sequence, processor, electronic equipment and computer readable storage medium
US11256444B2 (en) 2020-07-10 2022-02-22 Hon Hai Precision Industry Co., Ltd. Method for processing read/write data, apparatus, and computer readable storage medium thereof
TWI758778B (en) * 2020-07-10 2022-03-21 鴻海精密工業股份有限公司 Data read-write processing method, apparatus, and computer readable storage medium thereof
CN113312087A (en) * 2021-06-17 2021-08-27 东南大学 Cache optimization method based on RISC processor constant pool layout analysis and integration
CN113312087B (en) * 2021-06-17 2024-06-11 东南大学 Cache optimization method based on RISC processor constant pool layout analysis and integration
CN114238182A (en) * 2021-12-20 2022-03-25 北京奕斯伟计算技术有限公司 Processor, data processing method and device
CN114238182B (en) * 2021-12-20 2023-10-20 北京奕斯伟计算技术股份有限公司 Processor, data processing method and device

Also Published As

Publication number Publication date
CN106610816B (en) 2018-10-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant