CN100444118C - Software and hardware combined command relative controlling method based on logic transmitting rank - Google Patents


Publication number
CN100444118C
Authority
CN
China
Prior art keywords
instruction
queue
microprocessor
correlativity
depv
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2007100345702A
Other languages
Chinese (zh)
Other versions
CN101021799A (en
Inventor
蒋江
高军
杨学军
张民选
邢座程
阳柳
曾献君
马驰远
李勇
陈海燕
李晋文
衣晓飞
张明
穆长富
倪晓强
唐遇星
张承义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CNB2007100345702A
Publication of CN101021799A
Application granted
Publication of CN100444118C
Legal status: Expired - Fee Related

Landscapes

  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a combined software/hardware instruction dependency control method based on logical issue numbers, comprising the following steps: (a) Compilation: for each instruction I, the compiler generates the instruction opcode, a logical issue number, and d dependency vectors between instruction I and the preceding w-1 instructions, which together form the operation vector. (b) Insertion into the instruction queue: before an instruction is inserted, the instruction control system must ensure that no instruction with the same logical issue number is present in the instruction queue IQ. (c) Dependency control: the instruction control system of the microprocessor hardware generates hazard control signals directly from the instruction's current-queue hazard vectors, without decoding the instruction opcode again for dependency detection. (d) Dependency maintenance.

Description

Combined software/hardware instruction dependency control method based on logical issue numbers
Technical Field
The present invention relates mainly to the field of microprocessor design, and in particular to a combined software/hardware instruction dependency control method based on logical issue numbers.
Background
In a microprocessor, the instruction control system governs how instructions execute, and one of its most important tasks is instruction dependency control. Modern microprocessors usually adopt superscalar and superpipelined techniques to increase parallelism: instructions execute out of order, simultaneously occupying different stages of multiple execution pipelines, which makes dependency control between instructions very complex.
Dependent instructions may cause hazards when they execute in the pipeline at the same time. When instructions with read-after-write (RAW), write-after-write (WAW), write-after-read (WAR), or structural hazards execute simultaneously, the instruction control system must control their execution to preserve the producer-consumer read/write order and the program's write order, and to prevent conflicts over limited functional units, thereby guaranteeing program correctness.
In modern microprocessor systems, dependencies between instructions are usually handled in one of two ways: dynamic scheduling by the microprocessor hardware or static scheduling by the software compiler.
With hardware dynamic scheduling, the instruction control system stalls the issue or execution of subsequent instructions that may cause a hazard until the hazard is resolved. Classic dependency control methods such as scoreboarding and Tomasulo's algorithm are implemented entirely in hardware and take no account of support that software could provide. The advantage of dynamic scheduling is that it can take the current usage of the microprocessor's internal resources into account and control instruction execution promptly and efficiently. However, implementing dependency control in hardware requires the instruction control system to check, simultaneously, the hazards among all operands of the many instructions in the issue window and in the execution pipelines; the hardware is very complex and costly to implement. This cost places a decisive limit on the feasible size of the issue window and thus restricts the exploitation of instruction-level parallelism. Moreover, for some instructions that operate on memory, such as vector instructions that operate on a contiguous region of memory, it is usually difficult for the hardware to determine at run time whether the target memory regions of different instructions conflict (for example, whether their address ranges overlap), so effective dynamic scheduling is not possible. In such cases, static detection and scheduling at compile time is more effective.
With software static scheduling, the compiler detects dependencies in advance at compile time and statically adjusts the execution order of, and the distance between, dependent instructions so as to avoid hazard-induced stalls at run time as far as possible. For example, VLIW microprocessors usually rely on static dependency detection and scheduling by the compiler, and the hardware performs no hazard detection. The advantage of compiler static scheduling is that the dependency analysis can consider more instructions: the detection and scheduling window is large, so a high degree of instruction-level parallelism can be exploited. In addition, because the hardware relies entirely on the compiler's static schedule and performs no hazard detection, the issue and control logic is relatively simple. However, execution is a dynamic process in which the program's execution path and the usage of hardware resources change constantly; branch directions, resource busy/idle states, and the like cannot be predicted with complete accuracy at compile time. The compiler can therefore usually produce only a relatively conservative schedule, which leads to inefficient or bloated code.
In summary, hardware dynamic scheduling and compiler static scheduling each have advantages and disadvantages for instruction dependency control. At present, the only microprocessors that use software/hardware cooperation for instruction dependency control are the Intel Itanium series based on the EPIC architecture. In EPIC, the compiler packs several mutually independent instructions into an instruction group, which is handed to the hardware for dynamic execution. Because the instructions within a group are independent, the hardware does not need to detect hazards between instructions of the same group during execution. However, the compiler makes no guarantee about dependencies between different instruction groups, so Itanium hardware still needs hazard detection logic. This approach therefore reduces design complexity and implementation cost only to a limited extent.
Summary of the Invention
The technical problem to be solved by the present invention is as follows: in view of the problems of the prior art, the invention provides a combined software/hardware instruction dependency control method based on logical issue numbers that takes full advantage of static compiler scheduling, namely its large instruction scheduling window and its ability to determine address-related dependencies at compile time, and thereby fully exploits instruction-level parallelism.
In the present invention, let the compiler and the instruction control system consider d kinds of dependency, denoted Dep_0, Dep_1, ..., Dep_(d-1). The value of d has a generally accepted range in microprocessor research; at the current level of microprocessor technology, d is usually less than or equal to 8. The dependencies considered typically include read-after-write (RAW), write-after-write (WAW), write-after-read (WAR), and structural dependencies.
For a general-purpose microprocessor, the invention uses the terms logical issue number, physical issue number, and dependency vector, defined as follows:
(a) Logical issue number (LIN): let w be the size of the hardware instruction issue window as known to the compiler. (The value of w has a generally accepted range in microprocessor research; at the current level of technology, w is usually at most 64 and is a power of 2, so let w = 2^q with 0 ≤ q ≤ 6.) At compile time, the compiler statically assigns each instruction I a q-bit binary logical issue number LIN(I) in cyclically increasing order: LIN(I) = i mod w, where i is the serial number of instruction I in the program. The serial number increases in instruction order but need not form an arithmetic progression.
(b) Physical issue number (PIN): at time t, the position PIN(I, t) of instruction I in the hardware instruction queue IQ is called the physical issue number of I, with 0 ≤ PIN(I, t) ≤ w-1. Depending on how the instruction queue is designed, the PIN of an instruction may change over time. One common scheme uses a tail pointer that indicates the first free entry of the queue, and a new instruction is inserted directly into that entry. When an instruction I finishes executing, it is removed from IQ and all instructions behind it move one position toward the head of the queue together; that is, their PIN values all decrease by 1 at the same time.
(c) Dependency vector: for any dependency Dep_j (0 ≤ j ≤ d-1) considered by the compiler and the instruction control system, the compiler produces, on the basis of logical issue numbers, a w-bit vector DepV_j(I) whose bits indicate whether dependency Dep_j exists between the current instruction I and each of the preceding w-1 instructions (that is, all instructions in the current compile-time scheduling window, or issue window). The position of any instruction J in DepV_j(I) is determined by its logical issue number: it is bit LIN(J) (0 to w-1) of DepV_j(I). When instruction J (J ≠ I) has dependency Dep_j with instruction I, the bit of DepV_j(I) corresponding to J is 1; otherwise it is 0. The bit corresponding to I itself is 0. The advantage of indexing by LIN(J) is that when the physical position PIN(J, t) of J in the instruction queue changes, the bit position of J in DepV_j(I) does not. In other words, the position of any instruction J in the dependency vector of any instruction I is unique and fixed, determined solely by LIN(J).
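Definitions (a) and (c) can be illustrated with a small sketch. It assigns LINs to a toy program and builds a RAW dependency vector for each instruction. The instruction format (a destination register plus a tuple of source registers), the function names, the consecutive serial numbers, and the simplified register-based RAW test are all assumptions made for illustration; they are not taken from the patent.

```python
from typing import List, Tuple

# Hypothetical toy format: (destination register, source registers)
Instr = Tuple[str, Tuple[str, ...]]

def lin(i: int, w: int) -> int:
    """Logical issue number of the instruction with serial number i."""
    return i % w

def raw_depv(program: List[Instr], i: int, w: int) -> int:
    """w-bit RAW dependency vector of instruction i.

    Bit LIN(J) is set when an earlier instruction J among the preceding
    w-1 instructions writes a register that instruction i reads.
    """
    _, srcs = program[i]
    vec = 0
    for j in range(max(0, i - (w - 1)), i):
        dest_j, _ = program[j]
        if dest_j in srcs:
            vec |= 1 << lin(j, w)
    return vec

program = [
    ("r1", ()),            # i = 0
    ("r2", ("r1",)),       # i = 1, reads r1 written by i = 0
    ("r3", ()),            # i = 2
    ("r4", ("r2", "r3")),  # i = 3, reads results of i = 1 and i = 2
    ("r5", ("r1",)),       # i = 4, i = 0 is outside the window when w = 4
]
w = 4
vectors = [raw_depv(program, i, w) for i in range(len(program))]
```

With w = 4, instruction 4 no longer sees instruction 0 because only the preceding w-1 = 3 instructions fall inside the window, which is why the hardware must prevent two instructions with the same LIN from coexisting in the queue.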
To solve the above technical problem, the invention proposes the following solution: a combined software/hardware instruction dependency control method based on logical issue numbers, characterized by the following steps:
(1) Compilation: for each instruction I, the compiler generates the instruction opcode OP(I), the logical issue number LIN(I), and d dependency vectors between I and the preceding w-1 instructions, producing the operation vector VOP(I) = {OP(I), LIN(I), DepV_0(I), DepV_1(I), ..., DepV_(d-1)(I)}. To match the information produced by the compiler, entry k of the instruction queue IQ has the structure IQ_Item(k) = {OP(k), LIN(k), IQ_DepV_0(k), IQ_DepV_1(k), ..., IQ_DepV_(d-1)(k)}, where IQ_DepV_j(k) (0 ≤ j ≤ d-1) is the w-bit current-queue hazard vector, for dependency Dep_j, of the instruction in queue entry k. In addition, the microprocessor hardware provides a w-bit instruction queue occupancy vector register MaskInIQ, which identifies all instructions currently in IQ. For each dependency Dep_j, the instruction control hardware provides a w-bit hazard mask vector register MaskDepV_j, which records the state, with respect to Dep_j, of all instructions currently in IQ;
(2) Insertion into the instruction queue: before an instruction is inserted into IQ, the instruction control system must ensure that no two instructions in IQ have the same logical issue number. During insertion, the hardware first preprocesses the operation vector VOP(I) of instruction I to produce the insertion operation vector IVOP(I), and then inserts IVOP(I) into the designated entry IQ_Item(k) of IQ according to the designed insertion order, where k is the insertion position; the value of k depends on how IQ is managed. Here IVOP(I) = {OP(I), LIN(I), IDepV_0(I), IDepV_1(I), ..., IDepV_(d-1)(I)}, with IDepV_j(I) = DepV_j(I) & MaskInIQ, 0 ≤ j ≤ d-1;
(3) Dependency control: the instruction control system generates hazard control signals directly from the instruction's current-queue hazard vectors, with no need to decode the opcode again for dependency detection. An instruction I may issue when Dep_OK(I), which indicates whether all of its hazards have been resolved, is 1, where Dep_OK(I) = &(Dep_OK_0(I), Dep_OK_1(I), ..., Dep_OK_(d-1)(I)) and Dep_OK_j(I) = ~(IQ_DepV_j(PIN(I, t)) & ~MaskDepV_j), 0 ≤ j ≤ d-1;
(4) Dependency maintenance: during the execution of any instruction I, once I satisfies dependency Dep_j, the instruction control system sets to 1 the bit that represents I in the corresponding hazard mask vector register MaskDepV_j. When any instruction I in IQ finishes executing, the instruction control system removes I from IQ and adjusts the positions of the subsequent instructions according to how IQ is managed. At the same time, it clears to 0 the bit that represents I in the occupancy vector register MaskInIQ and in all hazard mask vector registers (MaskDepV_0, MaskDepV_1, ..., MaskDepV_(d-1)).
Before instruction I is inserted into IQ, it first enters the insertion buffer InsertBuff. The instruction control system then checks whether an instruction with the same logical issue number is already in IQ, as follows: the logical issue number of I is decoded into a w-bit vector Decode_w(LIN(I)), and then &(Decode_w(LIN(I)) & MaskInIQ) is computed. If the result is 1, IQ already contains an instruction whose logical issue number is LIN(I); the instruction control system blocks the instruction in InsertBuff from entering IQ until the earlier instruction with logical issue number LIN(I) has finished executing and left IQ.
Compared with the prior art, the advantage of the invention is that it takes full advantage of static compiler scheduling, namely the large instruction scheduling window and the ability to determine address-related dependencies at compile time, and thereby fully exploits instruction-level parallelism. The results of compile-time dependency detection are passed to the hardware through dependency vectors based on logical issue numbers. The hardware instruction control system uses the compiler-generated dependency vectors for dependency control and no longer needs to perform dynamic dependency detection, which reduces hardware complexity and design overhead. Practical application of the invention in microprocessor design shows that the method, building on the capabilities of the software compiler, greatly reduces the complexity of the hardware's dependency detection logic.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of the combined software/hardware instruction dependency control method based on logical issue numbers according to the invention;
Fig. 2 is a schematic flowchart of the control method as applied in the YeS64 microprocessor.
Detailed Description
The invention is described in more detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the steps of the combined software/hardware instruction dependency control method based on logical issue numbers according to the invention are:
(1) Compilation: for each instruction I, the compiler generates the instruction opcode OP(I), the logical issue number LIN(I), and d dependency vectors between I and the preceding w-1 instructions, producing the operation vector VOP(I) = {OP(I), LIN(I), DepV_0(I), DepV_1(I), ..., DepV_(d-1)(I)}. To match the information produced by the compiler, entry k of the instruction queue IQ has the structure IQ_Item(k) = {OP(k), LIN(k), IQ_DepV_0(k), IQ_DepV_1(k), ..., IQ_DepV_(d-1)(k)}. In addition, the microprocessor hardware provides a w-bit instruction queue occupancy vector register MaskInIQ, which identifies all instructions currently in IQ. For each dependency Dep_j, the instruction control hardware provides a w-bit hazard mask vector register MaskDepV_j, which records the state, with respect to Dep_j, of all instructions currently in IQ;
(2) Insertion into the instruction queue: before an instruction is inserted into IQ, the instruction control system must ensure that no two instructions in IQ have the same logical issue number. During insertion, the hardware first preprocesses the operation vector VOP(I) of instruction I to produce the insertion operation vector IVOP(I), and then inserts IVOP(I) into the designated entry IQ_Item(k) of IQ according to the designed insertion order, where k is the insertion position; the value of k depends on how IQ is managed. Here IVOP(I) = {OP(I), LIN(I), IDepV_0(I), IDepV_1(I), ..., IDepV_(d-1)(I)}, with IDepV_j(I) = DepV_j(I) & MaskInIQ (0 ≤ j ≤ d-1);
(3) Dependency control: the instruction control system generates hazard control signals directly from the current-queue hazard vectors of the queue entry that holds the instruction, with no need to decode the opcode again for dependency detection. An instruction I may issue when Dep_OK(I), which indicates whether all of its hazards have been resolved, is 1, where Dep_OK(I) = &(Dep_OK_0(I), Dep_OK_1(I), ..., Dep_OK_(d-1)(I)) and Dep_OK_j(I) = ~(IQ_DepV_j(PIN(I, t)) & ~MaskDepV_j) (0 ≤ j ≤ d-1);
(4) Dependency maintenance: during the execution of any instruction I, once I satisfies dependency Dep_j, the instruction control system sets to 1 the bit that represents I in the corresponding hazard mask vector register MaskDepV_j. When any instruction I in IQ finishes executing, the instruction control system removes I from IQ and adjusts the positions of the subsequent instructions according to how IQ is managed. At the same time, it clears to 0 the bit that represents I in the occupancy vector register MaskInIQ and in all hazard mask vector registers (MaskDepV_0, MaskDepV_1, ..., MaskDepV_(d-1)).
In a specific embodiment, the detailed flow of the invention is:
(1) Compilation
In addition to the conventional compilation of instruction functionality, the software compiler generates a logical issue number and dependency vectors for every instruction and binds each instruction to its logical issue number and dependency vectors to form the hardware-executable code. Through the dependency vectors in the executable code, the compiler provides the hardware with inter-instruction dependency information.
The compiler knows that the size of the hardware instruction issue window is w; correspondingly, the compiler's dependency detection and scheduling window also has size w. At compile time, the compiler statically assigns the current instruction I a q-bit binary logical issue number LIN(I) in cyclically increasing order.
Based on LIN(I), the compiler considers the d kinds of dependency Dep_0, Dep_1, ..., Dep_(d-1) between the current instruction I and the preceding w-1 instructions and produces d w-bit dependency vectors DepV_0(I), DepV_1(I), ..., DepV_(d-1)(I). When compilation is complete, the compiler has generated for each instruction I an operation vector VOP(I) = {OP(I), LIN(I), DepV_0(I), DepV_1(I), ..., DepV_(d-1)(I)}, where OP(I) is the conventional opcode of I. Typically, the compiler considers read-after-write (RAW), write-after-write (WAW), and write-after-read (WAR) dependencies between the current instruction I and the preceding w-1 instructions. Structural dependencies are handled mainly by scheduling the instruction order so as to avoid functional-unit conflicts.
(2) Insertion into the instruction queue
To match the information produced by the compiler, in this method each entry k (0 ≤ k ≤ w-1) of the instruction queue IQ holds, in addition to the conventional opcode OP(k), the logical issue number LIN(k) and, for each dependency Dep_j, a w-bit current-queue hazard vector IQ_DepV_j(k). The structure of entry k is therefore IQ_Item(k) = {OP(k), LIN(k), IQ_DepV_0(k), IQ_DepV_1(k), ..., IQ_DepV_(d-1)(k)}.
In addition, the hardware provides a w-bit instruction queue occupancy vector register MaskInIQ, which uses the logical issue numbers of all instructions currently in IQ to identify them. Any instruction J in IQ is represented by bit LIN(J) (0 to w-1) of MaskInIQ, whose value is 1. The benefit of using LIN is that the occupancy state in MaskInIQ is independent of the physical position of the instruction in IQ: when the position of an instruction in IQ changes, its bit in MaskInIQ does not.
Before an instruction is inserted into IQ, the instruction control system must ensure that no two instructions in IQ have the same logical issue number. The reason is that the compiler's detection and scheduling window has size w; if the same logical issue number appeared twice in IQ, the distance between the two instructions with that number would be at least w, which is beyond the compiler's dependency detection range.
The instruction control hardware guarantees that logical issue numbers do not conflict as follows: before instruction I is inserted into IQ, it first enters the insertion buffer InsertBuff. The instruction control system then checks whether an instruction with the same logical issue number is already in IQ: the logical issue number of I is decoded into a w-bit vector Decode_w(LIN(I)), and then &(Decode_w(LIN(I)) & MaskInIQ) is computed. If the result is 1, IQ already contains an instruction whose logical issue number is LIN(I); the instruction control system blocks the instruction in InsertBuff from entering IQ until the earlier instruction with logical issue number LIN(I) has finished executing and left IQ.
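The conflict check can be sketched in software as follows. This is a minimal model, assuming that the check amounts to testing whether the bit of MaskInIQ selected by LIN(I) is set; the function names and the representation of MaskInIQ as a Python integer are illustrative assumptions, not part of the patent.

```python
def decode_w(lin: int, w: int) -> int:
    """One-hot decode of a logical issue number into a w-bit vector."""
    assert 0 <= lin < w
    return 1 << lin

def lin_conflict(lin: int, mask_in_iq: int, w: int) -> bool:
    """True when an instruction with the same LIN is still in the queue."""
    return (decode_w(lin, w) & mask_in_iq) != 0

def try_insert(lin: int, mask_in_iq: int, w: int):
    """Return (inserted, new_mask).

    On a conflict the instruction stays in the insertion buffer and the
    occupancy mask is left unchanged; otherwise its LIN bit is set.
    """
    if lin_conflict(lin, mask_in_iq, w):
        return False, mask_in_iq
    return True, mask_in_iq | decode_w(lin, w)

w = 8
mask = 0b00000101                      # LINs 0 and 2 are in the queue
blocked = try_insert(2, mask, w)       # same LIN already present
accepted = try_insert(1, mask, w)      # free LIN
```

In this model a blocked instruction would simply retry on a later cycle, after the older instruction with the same LIN has completed and its bit has been cleared.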
For processors that use coprocessing (in which the coprocessor executes instructions passively), the main processor can instead dynamically poll and compare MaskInIQ to determine whether a logical issue number conflict exists in the coprocessor's instruction queue IQ, and thus decide whether to send an instruction to the coprocessor. This approach likewise guarantees that no two instructions in IQ have the same logical issue number.
During insertion, the hardware first preprocesses the operation vector VOP(I) of instruction I to produce the insertion operation vector IVOP(I), and then inserts IVOP(I) into the designated entry IQ_Item(k) of IQ according to the designed insertion order, where k is the insertion position. The value of k depends on how IQ is managed. Usually, the insertion order is either first-free-entry or cyclically increasing.
The insertion operation vector of instruction I is IVOP(I) = {OP(I), LIN(I), IDepV_0(I), IDepV_1(I), ..., IDepV_(d-1)(I)}, where IDepV_j(I) = DepV_j(I) & MaskInIQ. The reason for inserting IDepV_j(I) into IQ_DepV_j(k) is that the compiler, compiling statically, cannot know the state of IQ when the instruction actually executes. The compiler assumes by default that the current instruction and the preceding w-1 instructions are all in IQ, and generates the dependency vectors between I and those w-1 instructions accordingly. In actual execution, some of the earlier instructions may already have completed by the time I is inserted into the queue. Therefore, before the dependency vector DepV_j(I) is inserted into the queue entry, the instruction control system clears to 0 the dependency bits of all instructions that have already completed, forming IDepV_j(I). Otherwise, because the compiler reuses 0 to w-1 cyclically as LINs, a false hazard could arise between the current instruction I and an instruction that follows it in program order.
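The preprocessing step is a bitwise AND of each dependency vector with the occupancy mask. The sketch below shows it on a dictionary-based operation vector; the dictionary layout and function names are assumptions chosen for illustration only.

```python
def insertion_depv(depv: int, mask_in_iq: int) -> int:
    """IDepV_j(I) = DepV_j(I) & MaskInIQ: drop bits of completed instructions."""
    return depv & mask_in_iq

def make_ivop(vop: dict, mask_in_iq: int) -> dict:
    """Build the insertion operation vector from a compiler operation vector.

    `vop` is a hypothetical layout: {"op": ..., "lin": ..., "depv": [d ints]}.
    """
    return {
        "op": vop["op"],
        "lin": vop["lin"],
        "idepv": [insertion_depv(v, mask_in_iq) for v in vop["depv"]],
    }

# The compiler marked dependencies on LINs 1 and 2 (first vector) and LIN 0
# (second vector), but only the instruction with LIN 2 is still in the queue.
vop = {"op": "add", "lin": 3, "depv": [0b0110, 0b0001]}
ivop = make_ivop(vop, mask_in_iq=0b0100)
```

Only the dependency on LIN 2 survives, so the instruction will not wait on producers that have already completed.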
After instruction I has been inserted into IQ, its logical issue number LIN(I) is, at any time t, independent of its physical issue number PIN(I, t) in IQ.
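The independence of LIN and PIN can be seen in a toy model of a collapsing queue, in which a completed instruction is removed and the instructions behind it move toward the head. The class below is an illustrative assumption, not the queue design of the patent; the list index stands in for the PIN and the stored value is the LIN.

```python
class CollapsingQueue:
    """Toy instruction queue: list index is the PIN, stored value is the LIN."""

    def __init__(self, w: int):
        self.w = w
        self.lins = []

    def insert(self, lin: int) -> None:
        # A new instruction goes into the first free entry at the tail.
        if len(self.lins) >= self.w:
            raise OverflowError("instruction queue is full")
        self.lins.append(lin)

    def retire(self, lin: int) -> None:
        # Removing an entry shifts every later entry one position to the head.
        self.lins.remove(lin)

    def pin(self, lin: int) -> int:
        return self.lins.index(lin)

q = CollapsingQueue(w=8)
for n in (5, 6, 7):
    q.insert(n)
pin_before = q.pin(7)
q.retire(5)
pin_after = q.pin(7)
```

The instruction with LIN 7 moves from position 2 to position 1, yet any vector indexed by LIN still refers to it through bit 7.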
(3) Dependency control
The hardware instruction control system generates the various hazard control signals directly from the current-queue hazard vectors IQ_DepV_j (0 ≤ j ≤ d-1) of the queue entry that holds the instruction. Once all hazards of an instruction have been resolved, the instruction may issue.
To match the compiler-generated dependency vectors, for each dependency Dep_j the instruction control hardware provides a w-bit hazard mask vector register MaskDepV_j, 0 ≤ j ≤ d-1. This register records the state, with respect to Dep_j, of every instruction currently in IQ. For example, for the read-after-write hazard, MaskDepV_RAW records whether each instruction in the queue has produced a usable result (at which point subsequent instructions may read that result); for the write-after-read hazard, MaskDepV_WAR records whether each instruction in the queue has finished reading its registers (at which point subsequent instructions may write those registers). Any instruction I in IQ is represented in MaskDepV_j by bit LIN(I) (0 to w-1): if I satisfies the property represented by Dep_j, bit LIN(I) of MaskDepV_j is 1; otherwise it is 0. The benefit of using LIN is that the instruction state in MaskDepV_j is independent of the physical position of the instruction in IQ.
Using the hazard mask vector registers, before any instruction I executes, for each dependency Dep_j (0 ≤ j ≤ d-1) the instruction control hardware does not need to decode the opcode again for dependency detection. It only needs the current-queue hazard vector IQ_DepV_j of the queue entry holding I, together with MaskDepV_j, to determine whether the Dep_j hazard of the instruction has been resolved: Dep_OK_j(I) = ~(IQ_DepV_j(PIN(I, t)) & ~MaskDepV_j), where PIN(I, t) is the position of instruction I in IQ at the current time t. When Dep_OK_j(I) is 1, the Dep_j hazard of instruction I has been resolved; otherwise it has not.
Once all hazards have been resolved, the current instruction I may issue, that is:
Dep_OK(I) = &(Dep_OK_0(I), Dep_OK_1(I), ..., Dep_OK_(d-1)(I)). When Dep_OK(I) is 1, all hazards of instruction I have been resolved; otherwise they have not.
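The issue condition can be modelled with integer bit vectors. The sketch assumes that the per-dependency formula is read as "no bit of IQ_DepV_j remains set after masking with MaskDepV_j", that is, as a reduction to a single bit; that reading, the function names, and the list representation of the d vectors are assumptions for illustration.

```python
from typing import List

def dep_ok_j(iq_depv_j: int, mask_depv_j: int, w: int) -> bool:
    """True when every instruction flagged in IQ_DepV_j has satisfied Dep_j."""
    full = (1 << w) - 1                       # confine ~ to w bits
    pending = iq_depv_j & (~mask_depv_j & full)
    return pending == 0

def dep_ok(iq_depv: List[int], mask_depv: List[int], w: int) -> bool:
    """AND over all d dependency kinds: the instruction may issue when True."""
    return all(dep_ok_j(v, m, w) for v, m in zip(iq_depv, mask_depv))

w = 4
# Waits on LINs 1 and 2; only LIN 1 has satisfied the dependency so far.
still_waiting = dep_ok_j(0b0110, 0b0010, w)
# Both producers have now satisfied the dependency.
resolved = dep_ok_j(0b0110, 0b0110, w)
can_issue = dep_ok([0b0110, 0b0000], [0b0111, 0b0000], w)
```

Because the check is a handful of AND and NOT operations on w-bit vectors, it involves no comparison of register specifiers between instructions.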
(4) correlativity is safeguarded
In the implementation of arbitrary instruction I, if satisfied correlativity Dep j, command control system will be corresponding hazard masking vector register MaskDepV jThe middle position (LIN (I) position) that characterizes instruction I puts 1.
When any instruction I in the instruction queue IQ finishes executing, the instruction control system removes instruction I from the instruction queue IQ and adjusts the positions of the subsequent instructions according to the control method of the instruction queue IQ. At the same time, the instruction control system clears to 0 the bit representing instruction I (bit LIN(I)) in the occupancy state vector register MaskInIQ and in all hazard mask vector registers (MaskDepV_0, MaskDepV_1, ..., MaskDepV_(d-1)). Otherwise, because the compiler reuses the numbers 0 to w-1 as instruction LINs, false dependencies could arise between instruction I and later instructions.
The present invention has been applied in the high-performance microprocessor YeS64, designed and implemented by the College of Computer of the National University of Defense Technology. Taking the high-performance microprocessor YeS64 as an example, Fig. 2 is a flowchart of the combined software and hardware instruction dependency control method based on the logical issue number in the YeS64 microprocessor.
The YeS64 compiler is aware that the instruction issue window of the YeS64 hardware holds 32 instructions. At compile time, the YeS64 compiler assigns each instruction I a logical issue number LIN(I) = i mod 32 in increasing order, where i is the sequence number of instruction I in the program.
In YeS64, the compiler considers only the write-after-read (WAR) and read-after-write (RAW) dependencies between the current instruction and the preceding 31 instructions, and produces two 32-bit dependency vectors, DepV_WAR(I) and DepV_RAW(I). In YeS64, a WAR dependency between instructions I and J (I before J in program order) means that J can issue only after I has issued; a RAW dependency between I and J means that J can issue only after I has finished executing. After compilation, the compiler produces one operation vector for each instruction: VOP(I) = {OP(I), LIN(I), DepV_WAR(I), DepV_RAW(I)}.
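The compile step described above can be illustrated with a small Python sketch. The toy instruction format (opcode, destination register, source registers) and the function name compile_vops are assumptions made for illustration only. The sketch is deliberately conservative: it flags a dependency whenever a register matches within the preceding 31 instructions, and it assumes that bit LIN(P) of a dependency vector stands for the earlier instruction P.

```python
W = 32  # issue window size known to the compiler

def compile_vops(program):
    """Build VOP(I) = {OP, LIN, DepV_WAR, DepV_RAW} for each instruction.

    program -- list of (opcode, dest_reg, src_regs) tuples; this toy
               format is hypothetical and not taken from the patent.
    """
    vops = []
    for i, (op, dest, srcs) in enumerate(program):
        depv_war = 0
        depv_raw = 0
        # Only the preceding W-1 instructions are examined
        for p in range(max(0, i - (W - 1)), i):
            _, p_dest, p_srcs = program[p]
            bit = 1 << (p % W)  # bit position = LIN of the earlier instruction
            if p_dest is not None and p_dest in srcs:
                depv_raw |= bit  # I reads what P writes
            if dest is not None and dest in p_srcs:
                depv_war |= bit  # I writes what P reads
        vops.append({"OP": op, "LIN": i % W,
                     "DepV_WAR": depv_war, "DepV_RAW": depv_raw})
    return vops
```

In a three-instruction program where the second instruction reads the register written by the first, the second instruction gets DepV_RAW = 0b1; the 33rd instruction of any program wraps around to LIN 0.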
Unlike a conventional microprocessor, in the YeS64 microprocessor each entry of the instruction queue IQ holds, in addition to the instruction opcode, one 5-bit logical issue number register and two 32-bit current queue hazard vector registers, namely LIN, IQ_DepV_WAR and IQ_DepV_RAW.
In addition, the instruction control system of the YeS64 microprocessor contains three 32-bit registers: MaskIssue, MaskInIQ and TailPoint. MaskInIQ identifies all instructions currently in the instruction queue IQ that have not yet finished executing; the position of any instruction J of the instruction queue IQ in MaskInIQ is bit LIN(J) (0 to 31). In YeS64, the RAW hazard mask vector register is MaskDepV_RAW = ~MaskInIQ. The WAR hazard mask vector register MaskDepV_WAR is called MaskIssue and identifies all instructions currently in the instruction queue IQ that have already issued. The queue pointer register TailPoint has exactly one bit set to 1 at any time, marking the first free entry of the instruction queue IQ.
Before the operation vector VOP(I) of instruction I is inserted into the instruction queue IQ, it first enters the insertion instruction buffer InsertBuff. The instruction control system then determines whether the same logical issue number already exists in the current instruction queue IQ. To do so, the hardware performs a 32-bit decode of the logical issue number of instruction I, forming Decode_32(LIN(I)), and then computes &(Decode_32(LIN(I)) & MaskInIQ). If the result is 1 (that is, bit LIN(I) of MaskInIQ is set), an instruction with logical issue number LIN(I) is already present in the current IQ. In that case, to prevent a logical issue number conflict, the instruction control system stalls instruction I in the instruction buffer InsertBuff and does not insert it into the instruction queue IQ until the previous instruction with logical issue number LIN(I) has finished executing and left the instruction queue IQ.
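The conflict check can be sketched as follows, assuming that Decode_32 produces a one-hot value with only bit LIN(I) set and that the check succeeds when that bit is also set in MaskInIQ. The function name lin_conflict is hypothetical.

```python
def lin_conflict(lin, mask_in_iq):
    """Return True if an instruction with this LIN is still in the queue.

    lin        -- logical issue number of the instruction waiting in InsertBuff
    mask_in_iq -- current value of MaskInIQ, modelled as an integer
    """
    decoded = 1 << lin  # one-hot model of Decode_32(LIN(I)) (assumption)
    return (decoded & mask_in_iq) != 0
```

While lin_conflict returns True, the waiting instruction would stay in InsertBuff; it becomes insertable once the earlier holder of that LIN completes and its MaskInIQ bit is cleared.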
While instruction I is being inserted into the instruction queue IQ, the microprocessor hardware first pre-processes the operation vector VOP(I) of instruction I to produce the insertion operation vector. In the actual implementation, the YeS64 instruction control hardware writes OP(I) and LIN(I) from VOP(I) directly into the corresponding fields of the first free entry k of the instruction queue IQ, as indicated by the TailPoint register; at the same time, it writes (DepV_WAR(I) & MaskInIQ) and (DepV_RAW(I) & MaskInIQ) into IQ_DepV_WAR(k) and IQ_DepV_RAW(k) respectively. After the insertion is complete, the TailPoint register is shifted right by one bit.
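A minimal Python sketch of this insertion step is given below. It makes several assumptions: the queue is modelled as a list whose tail stands in for the entry selected by TailPoint, the function name insert_instruction is hypothetical, and the sketch sets bit LIN(I) of MaskInIQ on insertion, which the description implies but does not spell out here.

```python
def insert_instruction(iq, vop, mask_in_iq):
    """Insert VOP(I) into the queue and return the updated MaskInIQ.

    iq         -- list of queue entries (index = physical position, PIN)
    vop        -- dict with keys OP, LIN, DepV_WAR, DepV_RAW
    mask_in_iq -- current MaskInIQ, modelled as an integer
    """
    entry = {
        "OP": vop["OP"],
        "LIN": vop["LIN"],
        # Keep only dependencies on instructions that are still in the queue
        "IQ_DepV_WAR": vop["DepV_WAR"] & mask_in_iq,
        "IQ_DepV_RAW": vop["DepV_RAW"] & mask_in_iq,
    }
    iq.append(entry)  # list tail plays the role of the first free entry
    # Assumption: the new instruction is marked as occupying the queue
    return mask_in_iq | (1 << vop["LIN"])
```

Masking with MaskInIQ drops dependencies on earlier instructions that have already completed, so a newly inserted instruction never waits on an instruction that is no longer in the queue.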
Both during and after the insertion of an instruction into the instruction queue IQ, the logical issue number LIN(I) of instruction I is independent of its physical issue number PIN(I, t) in the instruction queue IQ.
Before any instruction I is executed, the microprocessor hardware no longer needs to decode the instruction opcode and check for dependencies. It only needs IQ_DepV_WAR(PIN(I, t)), IQ_DepV_RAW(PIN(I, t)), MaskIssue and MaskInIQ to decide whether all hazards of the instruction are satisfied and whether it can issue. The method is: Dep_OK(I) = &(~(IQ_DepV_WAR(PIN(I, t)) & ~MaskIssue), ~(IQ_DepV_RAW(PIN(I, t)) & MaskInIQ)). When Dep_OK(I) is 1, instruction I can issue.
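The YeS64 issue test can be sketched in Python as below. Registers are modelled as 32-bit integers and queue entries as dictionaries; the names can_issue and issuable_pins are hypothetical. Skipping entries whose MaskIssue bit is already set is an added assumption, so that an instruction is not selected twice.

```python
FULL = 0xFFFFFFFF  # all 32 bits set

def can_issue(entry, mask_issue, mask_in_iq):
    """Evaluate Dep_OK for one queue entry."""
    # WAR: every instruction named in IQ_DepV_WAR must already have issued
    war_ok = ~(entry["IQ_DepV_WAR"] & ~mask_issue) & FULL
    # RAW: every instruction named in IQ_DepV_RAW must have left the queue
    raw_ok = ~(entry["IQ_DepV_RAW"] & mask_in_iq) & FULL
    return war_ok == FULL and raw_ok == FULL

def issuable_pins(iq, mask_issue, mask_in_iq):
    """Return the physical positions (PINs) of entries that may issue now."""
    return [pin for pin, e in enumerate(iq)
            if not (mask_issue >> e["LIN"]) & 1      # not yet issued (assumption)
            and can_issue(e, mask_issue, mask_in_iq)]
```

With three queued instructions where the second has a RAW dependency and the third a WAR dependency on the first, only the first is issuable initially; once the first has issued, the third becomes issuable while the second keeps waiting for the first to complete.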
After any instruction I issues, the WAR dependencies of all instructions in the instruction queue IQ on instruction I are satisfied, and the instruction control system sets to 1 the bit representing instruction I (bit LIN(I)) in the WAR hazard mask vector register MaskIssue.
When any instruction I finishes executing, the instruction control system removes instruction I from the instruction queue IQ, and all instructions that follow instruction I in the instruction queue IQ simultaneously move one position toward the head of the queue; that is, the PIN of every subsequent instruction decreases by 1, while the positions and PINs of the instructions ahead of instruction I remain unchanged. At the same time, the instruction control system clears to 0 the bit representing instruction I (bit LIN(I)) in MaskInIQ and MaskIssue.
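The issue and completion bookkeeping described above can be sketched as follows. The queue is again modelled as a Python list, so deleting an entry shifts later entries toward the head automatically; the function names issue and complete are hypothetical. The sketch performs only the register updates named in the text and leaves the IQ_DepV fields of the remaining entries untouched.

```python
def issue(mask_issue, lin):
    """Set bit LIN(I) of MaskIssue when instruction I issues."""
    return mask_issue | (1 << lin)

def complete(iq, pin, mask_in_iq, mask_issue):
    """Remove the entry at physical position pin and clear its LIN bits.

    Returns the updated (MaskInIQ, MaskIssue) pair.
    """
    lin = iq[pin]["LIN"]
    del iq[pin]  # later entries move toward the head: their PIN drops by 1
    clear = ~(1 << lin)
    return mask_in_iq & clear, mask_issue & clear
```

After a completion, an instruction that was at PIN 2 sits at PIN 1 but keeps its LIN, so its bit positions in MaskInIQ and MaskIssue do not move.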

Claims (2)

1. A combined software and hardware instruction dependency control method based on the logical issue number, characterized in that the steps are:
(1) Compilation: for any instruction I, the compiler generates the instruction opcode OP(I), the logical issue number LIN(I), and d dependency vectors between instruction I and the preceding w-1 instructions, forming the operation vector VOP(I) = {OP(I), LIN(I), DepV_0(I), DepV_1(I), ..., DepV_(d-1)(I)}; to match the information produced by the compiler, the structure of entry k of the instruction queue IQ is: IQ_Item(k) = {OP(k), LIN(k), IQ_DepV_0(k), IQ_DepV_1(k), ..., IQ_DepV_(d-1)(k)}; in addition, the microprocessor hardware provides a w-bit instruction queue occupancy state vector register MaskInIQ, which identifies all instructions currently in the IQ; for each dependency Dep_j, the microprocessor instruction control hardware provides a w-bit hazard mask vector register MaskDepV_j, which records the state, with respect to dependency Dep_j, of all instructions currently in the IQ;
(2) Insertion into the instruction queue: before an instruction is inserted into the instruction queue IQ, the microprocessor instruction control system must ensure that no two instructions in the instruction queue IQ have the same logical issue number; while the instruction is being inserted into the instruction queue IQ, the microprocessor hardware first pre-processes the operation vector VOP(I) of instruction I to produce the insertion operation vector IVOP(I); IVOP(I) is then inserted, in the designed insertion order, into the designated entry IQ_Item(k) of the instruction queue IQ, where k is the insertion position number; the value of k depends on the control method of the instruction queue IQ; here, IVOP(I) = {OP(I), LIN(I), IDepV_0(I), IDepV_1(I), ..., IDepV_(d-1)(I)}, IDepV_j(I) = DepV_j(I) & MaskInIQ, 0 ≤ j ≤ d-1;
(3) Dependency control: the microprocessor instruction control system generates the hazard control signals directly from the current queue hazard vectors of the instruction queue entry holding the instruction, without decoding the instruction opcode and checking for dependencies again; for any instruction I, when Dep_OK(I), which indicates whether all hazards are satisfied, is 1, the current instruction I can issue; here, Dep_OK(I) = &(Dep_OK_0(I), Dep_OK_1(I), ..., Dep_OK_(d-1)(I)), Dep_OK_j(I) = ~(IQ_DepV_j(PIN(I, t)) & ~MaskDepV_j), 0 ≤ j ≤ d-1;
(4) Dependency maintenance: during the execution of any instruction I, once dependency Dep_j has been satisfied, the microprocessor instruction control system sets to 1 the bit representing instruction I in the corresponding hazard mask vector register MaskDepV_j; when any instruction I in the instruction queue IQ finishes executing, the microprocessor instruction control system removes instruction I from the instruction queue IQ and adjusts the positions of the subsequent instructions according to the control method of the instruction queue IQ; at the same time, the microprocessor instruction control system clears to 0 the bit representing instruction I in the occupancy state vector register MaskInIQ and in all hazard mask vector registers (MaskDepV_0, MaskDepV_1, ..., MaskDepV_(d-1)).
2. The combined software and hardware instruction dependency control method based on the logical issue number according to claim 1, characterized in that: before instruction I is inserted into the instruction queue IQ, it first enters the insertion instruction buffer InsertBuff; the microprocessor instruction control system then determines whether the same logical issue number already exists in the current instruction queue IQ, the method being: perform a w-bit decode of the logical issue number of instruction I, forming Decode_w(LIN(I)), and then compute &(Decode_w(LIN(I)) & MaskInIQ); if the result is 1, an instruction with logical issue number LIN(I) is already present in the current IQ, and the microprocessor instruction control system prevents the instruction in the insertion instruction buffer InsertBuff from being inserted into the instruction queue IQ until the previous instruction with logical issue number LIN(I) has finished executing and left the instruction queue IQ.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2007100345702A CN100444118C (en) 2007-03-19 2007-03-19 Software and hardware combined command relative controlling method based on logic transmitting rank

Publications (2)

Publication Number Publication Date
CN101021799A CN101021799A (en) 2007-08-22
CN100444118C true CN100444118C (en) 2008-12-17

Family

ID=38709573

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007100345702A Expired - Fee Related CN100444118C (en) 2007-03-19 2007-03-19 Software and hardware combined command relative controlling method based on logic transmitting rank

Country Status (1)

Country Link
CN (1) CN100444118C (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103916412B (en) * 2012-12-31 2018-04-06 深圳市傲冠软件股份有限公司 A kind of method and system of information technoloy equipment novel maintenance
CN110825437B (en) * 2018-08-10 2022-04-29 昆仑芯(北京)科技有限公司 Method and apparatus for processing data
CN109412846B (en) * 2018-10-11 2021-08-24 杭州迪普科技股份有限公司 Configuration rollback method and device
CN110007966A (en) * 2019-04-10 2019-07-12 龚伟峰 A method of it reducing memory and reads random ordering
CN110874643B (en) * 2019-11-08 2021-01-12 安徽寒武纪信息科技有限公司 Conversion method and device of machine learning instruction, board card, mainboard and electronic equipment
CN112787937A (en) * 2021-01-21 2021-05-11 深圳市中网信安技术有限公司 Message forwarding method, terminal equipment and computer storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003008526A (en) * 2001-06-21 2003-01-10 Sony Corp Data processor
CN1613056A (en) * 2002-01-03 2005-05-04 英特尔公司 Dependence-chain processors
US20060167936A1 (en) * 2003-03-27 2006-07-27 Osamu Okauchi Data processing device
JP2006313546A (en) * 2005-05-04 2006-11-16 Arm Ltd Data processing system
CN1885283A (en) * 2006-06-05 2006-12-27 中国人民解放军国防科学技术大学 Method for decreasing data access delay in stream processor

Also Published As

Publication number Publication date
CN101021799A (en) 2007-08-22

Similar Documents

Publication Publication Date Title
CN100444118C (en) Software and hardware combined command relative controlling method based on logic transmitting rank
CN101681259B (en) System and method for using local condition code register for accelerating conditional instruction execution in pipeline processor
US9483243B2 (en) Interleaving data accesses issued in response to vector access instructions
JP6236443B2 (en) Sequence control for data element processing during vector processing.
EP3103015B1 (en) Deterministic and opportunistic multithreading
CN106257411B (en) Single instrction multithread calculating system and its method
CN100461094C (en) Instruction control method aimed at stream processor
EP2680132B1 (en) Staged loop instructions
US9766895B2 (en) Opportunity multithreading in a multithreaded processor with instruction chaining capability
TWI733798B (en) An apparatus and method for managing address collisions when performing vector operations
JP6427054B2 (en) Parallelizing compilation method and parallelizing compiler
US10514919B2 (en) Data processing apparatus and method for processing vector operands
KR20140131472A (en) Reconfigurable processor having constant storage register
WO2007107707A3 (en) Computer architecture
US10540156B2 (en) Parallelization method, parallelization tool, and in-vehicle device
US20090113403A1 (en) Replacing no operations with auxiliary code
US6862676B1 (en) Superscalar processor having content addressable memory structures for determining dependencies
CN103970511A (en) Processor capable of supporting multimode and multimode supporting method thereof
JP2004234038A (en) Low-power operation control device and program optimizing device
US20140129805A1 (en) Execution pipeline power reduction
CN100590592C (en) Processor and its instruction distributing method
EP4034994B1 (en) Retire queue compression
CN111241599B (en) Dynamic identification and maintenance method for processor chip safety dependence
CN110515659B (en) Atomic instruction execution method and device
CN110515656B (en) CASP instruction execution method, microprocessor and computer equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081217

Termination date: 20110319