CN111506347A - Renaming method based on instruction read-after-write correlation hypothesis - Google Patents

Publication number
CN111506347A
Authority
CN
China
Prior art keywords
phy
register
val
source
renaming
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010231038.5A
Other languages
Chinese (zh)
Other versions
CN111506347B (en)
Inventor
刘权胜
余红斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Saifang Technology Co ltd
Original Assignee
Shanghai Saifang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Saifang Technology Co ltd
Priority to CN202010231038.5A
Publication of CN111506347A
Application granted
Publication of CN111506347B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836: Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838: Dependency mechanisms, e.g. register scoreboarding
    • G06F9/384: Register renaming
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a renaming method based on the instruction read-after-write correlation hypothesis, and relates to the technical field of computer microelectronic chips. The invention comprises two stages. Stage 1: complete the RAT read and the register correlation determination; based on hypotheses about the read-after-write correlations among the instructions, obtain multiple candidate rename register mappings for each source register and, in parallel, generate onehot control signals for selecting the correct rename register. Stage 2: select the final rename register of each source register according to the onehot control signals generated in stage 1, and update the mapping between architectural registers and physical registers in the RAT. The invention can reach a higher clock frequency under the same conditions; renaming by hypothesis reduces the complexity of implementing rename, improves rename bandwidth, and offers a better way to implement a high-performance processor.

Description

Renaming method based on instruction read-after-write correlation hypothesis
Technical Field
The invention belongs to the technical field of computer microelectronic chips, and particularly relates to a renaming method based on the instruction read-after-write correlation hypothesis.
Background
Microprocessors have advanced enormously within a few decades. Processor performance has been improved continuously on several fronts, including hardware architecture, process technology, and the combination of software and hardware. Hardware architecture has evolved from single-issue scalar to multi-issue superscalar designs; from the first 3-stage pipeline to pipelines of several tens of stages; from in-order to out-of-order execution; from no cache to a 3-level cache; and from a physical single core to physical multi-core (CMP) and from a logical single core to logical multi-core (SMT). Including clustered systems for supercomputing, the instruction-level and thread-level parallelism that processors exploit has developed greatly. The instruction-level parallel bandwidth demanded of a single-core microprocessor keeps rising, and the logic complexity of the chip grows by multiples with it.
As the number of instructions processed per clock cycle increases, the combinational logic chain that compares and prioritizes the read-after-write relationships among instructions becomes longer and longer, which severely limits the clock frequency of the microprocessor. This method renames 8 instructions per clock cycle; it is not limited to a bandwidth of 8 instructions and covers all other bandwidths. The method applies to all processor architectures, such as X86 instruction set CPUs, RISC instruction set CPUs, GPUs and DSPs, and to physical single-core, physical multi-core (CMP) and logical multi-core (SMT) designs.
Currently, the number of source and destination registers that can be renamed across multiple instructions is limited. For example, with a pipeline bandwidth of 6 instructions, when the decoder dispatches instructions to rename, the 5th and 6th instructions are restricted from reading or writing a physical register; otherwise the instruction cannot enter rename in the current cycle and can only be renamed in the next cycle. That is, the complexity of instruction dependency checking in the rename stage is reduced by limiting the number of instructions renamed in one clock cycle.
Instructions that cannot enter rename in the current cycle are blocked. In this case, the throughput of the rename stage drops by 1/6 or 1/3. Because rename bandwidth becomes the bottleneck, this affects not only the pipeline after rename but also stalls the decoder and the modules before it. In view of these problems, it is important to provide a renaming method based on the instruction read-after-write correlation hypothesis.
Disclosure of Invention
The invention provides a renaming method based on the instruction read-after-write correlation hypothesis, which solves the problem that the number of instructions renamed per clock cycle is limited, to the detriment of performance.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention relates to a renaming method based on the instruction read-after-write correlation hypothesis, comprising two stages:
Stage 1: complete the RAT read and the register correlation determination; based on hypotheses about the read-after-write correlations among the instructions, obtain multiple candidate rename register mappings for each source register and, in parallel, generate onehot control signals for selecting the correct rename register;
Stage 2: select the final rename register of each source register according to the onehot control signals generated in stage 1, and update the mapping between architectural registers and physical registers in the RAT;
By generating multiple hypothesized rename register results in parallel, together with the signals that select the final valid result, the number of serial logic gates produced by priority-based read-after-write comparison among the 8 instructions is reduced, so that a higher clock frequency is obtained under the same conditions;
The method obtains multiple rename results by assuming that the various read-after-write correlations exist among the instructions, and then obtains the final result through the selection signals.
Furthermore, the method does not need to judge the read-after-write correlation among instructions directly, which eliminates the long combinational logic paths caused by the priority relationship among instructions; the hypothesis is not limited to a granularity of 1 instruction and applies to any instruction granularity.
Further, the multiple rename register mappings of each source register obtained in stage 1 are implemented with logical expressions that determine the renaming of each UOP; these include the logical expressions for each of the hypothesized cases of every instruction.
Furthermore, the invention applies to all processor architectures, such as X86 instruction set CPUs, RISC instruction set CPUs, GPUs and DSPs; to physical single-core, physical multi-core (CMP) and logical multi-core (SMT) designs; and to servers and clusters.
Further, the invention is not limited to a particular instruction-level parallel bandwidth, nor to a particular rename architecture, number of pipeline stages, or implementation process.
Compared with the prior art, the invention has the following beneficial effects:
The invention divides the renaming stage into 2 stages: stage 1 completes the RAT read and the register correlation determination; stage 2 obtains the final rename result and completes the update of the architectural-to-physical register mapping in the RAT. In stage 1, based on hypotheses about the read-after-write correlations among the instructions, multiple candidate rename register mappings are obtained for each source register and, in parallel, onehot control signals are generated for selecting the correct rename register. In stage 2, the final rename register of each source register is selected according to the onehot control signals generated in stage 1, and the architectural-to-physical register mapping in the RAT is updated. Because multiple rename register results and the signals that select the final valid result are generated in parallel, the number of serial logic gates produced by priority-based read-after-write comparison among instructions is reduced, and a higher clock frequency can be obtained under the same conditions. Renaming by hypothesis reduces the complexity of implementing rename, improves rename bandwidth, and offers a better way to implement a high-performance processor.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a multi-core CPU with N physical cores sharing L3 and memory according to an embodiment of the present invention;
FIG. 2 illustrates a single physical core in an embodiment of the present invention;
FIG. 3 shows a pipeline of a microprocessor according to an embodiment of the present invention;
FIG. 4 is a renaming scheme for an architectural register and a renaming register sharing a physical register according to the present invention;
FIG. 5 shows the information of the 8 UOPs that must be checked during the renaming stage according to an embodiment of the present invention;
FIG. 6 illustrates, along time and space axes, the renaming process of the UOPs of the i-th and (i+1)-th cycles according to an embodiment of the present invention;
in the drawings, the components represented by the respective reference numerals are listed below:
10-multi-core CPU; 20-position of the rename module in the microprocessor; 31-rename stage 1; 32-rename stage 2; 41-index numbers of the architectural registers and internal temporary registers supported by the microprocessor instruction set; 42-whether the architectural register or internal temporary register has been renamed; 43-effective data width of each physical register; 44-address mapping from architectural registers and internal temporary registers to physical registers; 45-R2 mapped to physical register 5 with a data width of 256 bits; 46-R3 mapped to physical register K-6 with a data width of 16 bits; 47-RM-4 mapped to physical register 14 with a data width of 64 bits; 48-physical registers shared by renaming and by the architectural and internal temporary registers supported by the microprocessor instruction set; 501-numbers of the 8 UOPs, in the order UOP0, UOP1, UOP2, UOP3, UOP4, UOP5, UOP6, UOP7; 502-whether the UOP has a 1st source register; 503-number of the UOP's 1st source register, defaulting to 0 if there is none; 504-width of the 1st source register; 505-whether the UOP has a 2nd source register; 506-number of the UOP's 2nd source register, defaulting to 0 if there is none; 507-width of the 2nd source register; 508-whether the UOP has a destination register; 509-number of the UOP's destination register, defaulting to 0 if there is none; 510-width of the destination register; 61-the 8 UOPs of the i-th cycle complete renaming in 2 stages, their 2nd stage being in the same cycle as the 1st stage of the 8 UOPs of the (i+1)-th cycle; 62-the 8 UOPs of the (i+1)-th cycle complete renaming in 2 stages, their 1st stage being in the same cycle as the 2nd stage of the 8 UOPs of the i-th cycle.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-6, the invention is a renaming method based on the instruction read-after-write correlation hypothesis, in which the renaming stage is divided into 2 stages: stage 1 completes the RAT read and the register correlation determination; stage 2 obtains the final rename result and completes the update of the architectural-to-physical register mapping in the RAT. In stage 1, based on hypotheses about the read-after-write correlations among the instructions, multiple candidate rename register mappings are obtained for each source register and, in parallel, onehot control signals are generated for selecting the correct rename register. In stage 2, the final rename register of each source register is selected according to the onehot control signals generated in stage 1, and the architectural-to-physical register mapping in the RAT is updated.
By generating multiple hypothesized rename register results in parallel, together with the signals that select the final valid result, the number of serial logic gates produced by priority-based read-after-write comparison among the 8 instructions is reduced, so a higher clock frequency can be obtained under the same conditions.
The method thus reduces the logic complexity of implementing high-bandwidth renaming.
FIG. 1 shows a multi-core CPU in which N physical cores share L3 and memory; each core may be a single-threaded or multi-threaded architecture.
FIG. 2 shows a single physical core, which may be a single-threaded or multi-threaded architecture. Table 1 gives the module division of the core with a functional description. In FIG. 2, 20 marks the location of the rename module in the microprocessor.
Table 1: function list of single physical core
FIG. 3 shows a pipeline of a microprocessor, where 31 is the rename 1 st stage and 32 is the rename 2 nd stage.
FIG. 4 shows a renaming architecture in which architectural registers and rename registers share the physical registers. Reference numeral 41 denotes the index numbers of the architectural registers and internal temporary registers supported by the microprocessor instruction set. 42 indicates whether the architectural register or internal temporary register has been renamed. 43 is the effective data width of each physical register; the supported data widths are 8, 16, 32, 48, 64, 80, 128 and 256 bits. Any data width can be supported by extending the width of the WIDTH field. 44 denotes the address mapping from architectural registers and internal temporary registers to physical registers. The address width of a physical register is Q, where Q satisfies 2^Q > K. 45 denotes that R2 maps to physical register 5 with a data width of 256 bits. 46 denotes that R3 maps to physical register K-6 with a data width of 16 bits. 47 denotes that RM-4 maps to physical register 14 with a data width of 64 bits. 48 denotes the physical registers shared by renaming and by the architectural and internal temporary registers supported by the microprocessor instruction set.
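The RAT layout described for FIG. 4 can be pictured as a small lookup table. The following is a minimal Python sketch rather than the patented circuit: the names `RatEntry` and `phys_addr_bits` and the value chosen for K are hypothetical, and K is assumed to be the number of physical registers.

```python
from dataclasses import dataclass

# Data widths (in bits) listed for field 43 of FIG. 4.
SUPPORTED_WIDTHS = (8, 16, 32, 48, 64, 80, 128, 256)


def phys_addr_bits(k: int) -> int:
    """Smallest Q satisfying 2**Q > K (K assumed to be the physical register count)."""
    q = 0
    while 2 ** q <= k:
        q += 1
    return q


@dataclass
class RatEntry:
    renamed: bool  # field 42: has this architectural register been renamed?
    width: int     # field 43: effective data width in bits
    phys: int      # field 44: physical register address (Q bits wide)

    def __post_init__(self) -> None:
        if self.width not in SUPPORTED_WIDTHS:
            raise ValueError(f"unsupported width: {self.width}")


K = 128  # hypothetical number of physical registers
rat = {
    2: RatEntry(renamed=True, width=256, phys=5),     # item 45: R2 -> physical register 5
    3: RatEntry(renamed=True, width=16, phys=K - 6),  # item 46: R3 -> physical register K-6
}
```

With the strict inequality 2^Q > K, a power-of-two K needs one more address bit than log2(K), which is why `phys_addr_bits(128)` evaluates to 8 in this sketch.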
FIG. 5 shows the information of the 8 UOPs that must be checked during the renaming stage. 501 denotes the numbers of the 8 UOPs, whose order is UOP0, UOP1, UOP2, UOP3, UOP4, UOP5, UOP6, UOP7. 502 indicates whether the UOP has a 1st source register. 503 is the number of the UOP's 1st source register, defaulting to 0 if there is none. 504 is the width of the 1st source register. 505 indicates whether the UOP has a 2nd source register. 506 is the number of the UOP's 2nd source register, defaulting to 0 if there is none. 507 is the width of the 2nd source register. 508 indicates whether the UOP has a destination register. 509 is the number of the UOP's destination register, defaulting to 0 if there is none. 510 is the width of the destination register.
FIG. 6 describes, along time and space axes, the renaming process of the UOPs of the i-th and (i+1)-th cycles. 61 indicates that the 8 UOPs of the i-th cycle complete renaming in 2 stages, and that their 2nd stage is in the same cycle as the 1st stage of the 8 UOPs of the (i+1)-th cycle. 62 indicates that the 8 UOPs of the (i+1)-th cycle complete renaming in 2 stages, and that their 1st stage is in the same cycle as the 2nd stage of the 8 UOPs of the i-th cycle. When the 8 UOPs of the (i+1)-th cycle read the RAT in their 1st stage, their source registers must be compared against the register numbers that the 8 UOPs of the i-th cycle are writing to the RAT in their 2nd stage, so that the latest physical register number of each source register is obtained.
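The overlap that FIG. 6 describes can be tabulated with a short Python sketch. The function name `rename_schedule` and the zero-based group and cycle numbering are illustrative assumptions, not taken from the patent.

```python
def rename_schedule(num_groups: int) -> dict:
    """Map each clock cycle to the (group, stage) pairs active in that cycle.

    Group g (8 UOPs) is assumed to perform rename stage 1 in cycle g and
    rename stage 2 in cycle g + 1.
    """
    schedule = {}
    for g in range(num_groups):
        schedule.setdefault(g, []).append((g, 1))      # stage 1: RAT read, candidates
        schedule.setdefault(g + 1, []).append((g, 2))  # stage 2: select, RAT write
    return schedule


# Two consecutive groups: stage 2 of group 0 shares cycle 1 with stage 1 of group 1.
print(rename_schedule(2))
```

The cycle in which both a stage-2 write and a stage-1 read are active is the one where the source registers of the younger group have to be compared with the destinations of the older group.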
The processing bandwidth of the renaming module is 8 UOPs, and each UOP has at most 2 source operands and 1 destination operand. The number of registers a UOP uses is determined by its valid bits. FIG. 5 holds the information of 8 UOPs; for each UOP it contains the architectural register number, valid bit, and architectural register width of the 2 source operands and the 1 destination operand. The method is not limited to 2 source registers per UOP and can be extended to any number of source and destination registers.
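The per-UOP information of FIG. 5 maps naturally onto a small record type. A minimal Python sketch follows; the class and field names (`Operand`, `Uop`, `register_count`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    valid: bool = False  # items 502 / 505 / 508: does the operand exist?
    reg: int = 0         # items 503 / 506 / 509: architectural number, 0 if absent
    width: int = 0       # items 504 / 507 / 510: register width in bits


@dataclass
class Uop:
    src1: Operand = field(default_factory=Operand)
    src2: Operand = field(default_factory=Operand)
    dst: Operand = field(default_factory=Operand)

    def register_count(self) -> int:
        """Number of registers the UOP uses, determined by the valid bits."""
        return sum(op.valid for op in (self.src1, self.src2, self.dst))


# A UOP with one source (R3) and one destination (R5), both 64 bits wide.
uop = Uop(src1=Operand(True, 3, 64), dst=Operand(True, 5, 64))
```

A rename group would then be a list of 8 such records in the order UOP0 to UOP7.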
In the renaming process, each source operand of an instruction obtains the physical register number corresponding to its architectural register; the destination register requests a free physical register number and updates the RAT.
Renaming is completed in 2 stages: stage 1 completes the RAT read and the register correlation determination; stage 2 obtains the final rename result and completes the update of the architectural-to-physical register mapping in the RAT. In stage 1, the 8 UOPs access the RAT in parallel and the register dependency determination is done. In cycle i+1 the 8 UOPs read the RAT simultaneously, while up to 8 UOPs of cycle i write the RAT, as shown in FIG. 6. The correlations of the 8 UOPs are determined as follows.
Up to 8 UOPs update the RAT in each clock cycle, and the UOPs of clock cycles i and i+1 overlap. Therefore, while reading the RAT, each source register must be compared against the destination registers that updated the RAT in the previous clock cycle, and the correct physical register number selected according to priority. In stage 1, the instructions of cycle i generate the onehot selection signal PRVi[7:0] for updating the RAT; this signal selects the latest RAT information as the value a source operand obtains when it queries the RAT. In stage 2 of the cycle-i instructions, the onehot selection signal is PRVi_1D[7:0]. When the instructions of cycle i+1 perform stage-1 renaming, PRVi_1D[7:0] selects the physical register number being written to the RAT. In the signal names below, X_n_k refers to operand k of UOPn (k = 0 and 1 are the 1st and 2nd source operands, k = 2 is the destination), a trailing i or i+1 is the clock-cycle subscript, and the suffix _1D marks the value registered into the next pipeline stage.
PRVi[7] indicates that the 8th UOP, UOP7, needs to update the RAT. The logical expression of PRVi[7] is as follows:
PRVi[7]=VAL_7_2i
PRVi[6] indicates that the 7th UOP, UOP6, needs to update the RAT and that UOP7 does not update the RAT entry of the same architectural register. The logical expression of PRVi[6] is as follows:
PRVi[6]=VAL_6_2i&
(~(VAL_6_2i&VAL_7_2i&(R_7_2i==R_6_2i)))
PRVi[5] indicates that the 6th UOP, UOP5, needs to update the RAT and that neither UOP6 nor UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[5] is as follows:
PRVi[5]=VAL_5_2i&
(~(VAL_5_2i&VAL_7_2i&(R_7_2i==R_5_2i)))&
(~(VAL_5_2i&VAL_6_2i&(R_6_2i==R_5_2i)))
PRVi[4] indicates that the 5th UOP, UOP4, needs to update the RAT and that none of UOP5 to UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[4] is as follows:
PRVi[4]=VAL_4_2i&
(~(VAL_4_2i&VAL_7_2i&(R_7_2i==R_4_2i)))&
(~(VAL_4_2i&VAL_6_2i&(R_6_2i==R_4_2i)))&
(~(VAL_4_2i&VAL_5_2i&(R_5_2i==R_4_2i)))
PRVi[3] indicates that the 4th UOP, UOP3, needs to update the RAT and that none of UOP4 to UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[3] is as follows:
PRVi[3]=VAL_3_2i&
(~(VAL_3_2i&VAL_7_2i&(R_7_2i==R_3_2i)))&
(~(VAL_3_2i&VAL_6_2i&(R_6_2i==R_3_2i)))&
(~(VAL_3_2i&VAL_5_2i&(R_5_2i==R_3_2i)))&
(~(VAL_3_2i&VAL_4_2i&(R_4_2i==R_3_2i)))
PRVi[2] indicates that the 3rd UOP, UOP2, needs to update the RAT and that none of UOP3 to UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[2] is as follows:
PRVi[2]=VAL_2_2i&
(~(VAL_2_2i&VAL_7_2i&(R_7_2i==R_2_2i)))&
(~(VAL_2_2i&VAL_6_2i&(R_6_2i==R_2_2i)))&
(~(VAL_2_2i&VAL_5_2i&(R_5_2i==R_2_2i)))&
(~(VAL_2_2i&VAL_4_2i&(R_4_2i==R_2_2i)))&
(~(VAL_2_2i&VAL_3_2i&(R_3_2i==R_2_2i)))
PRVi[1] indicates that the 2nd UOP, UOP1, needs to update the RAT and that none of UOP2 to UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[1] is as follows:
PRVi[1]=VAL_1_2i&
(~(VAL_1_2i&VAL_7_2i&(R_7_2i==R_1_2i)))&
(~(VAL_1_2i&VAL_6_2i&(R_6_2i==R_1_2i)))&
(~(VAL_1_2i&VAL_5_2i&(R_5_2i==R_1_2i)))&
(~(VAL_1_2i&VAL_4_2i&(R_4_2i==R_1_2i)))&
(~(VAL_1_2i&VAL_3_2i&(R_3_2i==R_1_2i)))&
(~(VAL_1_2i&VAL_2_2i&(R_2_2i==R_1_2i)))
PRVi[0] indicates that the 1st UOP, UOP0, needs to update the RAT and that none of UOP1 to UOP7 updates the RAT entry of the same architectural register. The logical expression of PRVi[0] is as follows:
PRVi[0]=VAL_0_2i&
(~(VAL_0_2i&VAL_7_2i&(R_7_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_6_2i&(R_6_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_5_2i&(R_5_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_4_2i&(R_4_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_3_2i&(R_3_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_2_2i&(R_2_2i==R_0_2i)))&
(~(VAL_0_2i&VAL_1_2i&(R_1_2i==R_0_2i)))
PRVi_1D is the PRVi signal registered into the next pipeline stage:
PRVi_1D<=PRVi
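The eight PRVi expressions above all follow one pattern: a UOP's destination write is selected only if it is valid and no later UOP in the same group writes the same architectural register. A minimal Python sketch of that pattern, assuming list inputs indexed by UOP number (the function name `prv` is hypothetical):

```python
def prv(val, reg):
    """Compute the PRV selection vector for one rename group.

    val[n] and reg[n] stand for VAL_n_2 and R_n_2: the destination valid bit
    and destination architectural register number of UOPn.
    """
    n = len(val)
    out = []
    for k in range(n):
        # Any later UOP (higher index) with a valid write to the same register?
        overwritten = any(val[j] and reg[j] == reg[k] for j in range(k + 1, n))
        out.append(bool(val[k]) and not overwritten)
    return out


# UOP0 and UOP3 both write R3, UOP1 and UOP7 both write R5.
val = [1, 1, 0, 1, 0, 0, 0, 1]
reg = [3, 5, 3, 3, 0, 0, 0, 5]
sel = prv(val, reg)  # only UOP3 (last writer of R3) and UOP7 (last writer of R5) are set
```

In hardware each bit is computed independently from comparators that all operate in parallel; the loop here only enumerates them. The vector is onehot per architectural register: at most one selected UOP writes any given register.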
The RAT is likewise updated according to PRVi, which determines whether each RAT entry needs to be written: if so, the new physical register number is written to the row of the corresponding architectural register; otherwise the row remains unchanged. PHY_W_0_2i_1D, PHY_W_1_2i_1D, PHY_W_2_2i_1D, PHY_W_3_2i_1D, PHY_W_4_2i_1D, PHY_W_5_2i_1D, PHY_W_6_2i_1D and PHY_W_7_2i_1D are the destination register widths at rename stage 2. The logical expressions are as follows:
PHY_W_0_2i_1D<=W_0_2i
PHY_W_1_2i_1D<=W_1_2i
PHY_W_2_2i_1D<=W_2_2i
PHY_W_3_2i_1D<=W_3_2i
PHY_W_4_2i_1D<=W_4_2i
PHY_W_5_2i_1D<=W_5_2i
PHY_W_6_2i_1D<=W_6_2i
PHY_W_7_2i_1D<=W_7_2i
Similarly, PHY_R_0_2i_1D, PHY_R_1_2i_1D, PHY_R_2_2i_1D, PHY_R_3_2i_1D, PHY_R_4_2i_1D, PHY_R_5_2i_1D, PHY_R_6_2i_1D and PHY_R_7_2i_1D are the physical register numbers of the destination registers at rename stage 2.
The update expression for architectural register R0, the first row of the RAT, is:
RAT[R0]<=({(Q+3){(PRVi_1D[0]&(R_0_2i_1D==R0))}}&{PHY_W_0_2i_1D,PHY_R_0_2i_1D})|
({(Q+3){(PRVi_1D[1]&(R_1_2i_1D==R0))}}&{PHY_W_1_2i_1D,PHY_R_1_2i_1D})|
({(Q+3){(PRVi_1D[2]&(R_2_2i_1D==R0))}}&{PHY_W_2_2i_1D,PHY_R_2_2i_1D})|
({(Q+3){(PRVi_1D[3]&(R_3_2i_1D==R0))}}&{PHY_W_3_2i_1D,PHY_R_3_2i_1D})|
({(Q+3){(PRVi_1D[4]&(R_4_2i_1D==R0))}}&{PHY_W_4_2i_1D,PHY_R_4_2i_1D})|
({(Q+3){(PRVi_1D[5]&(R_5_2i_1D==R0))}}&{PHY_W_5_2i_1D,PHY_R_5_2i_1D})|
({(Q+3){(PRVi_1D[6]&(R_6_2i_1D==R0))}}&{PHY_W_6_2i_1D,PHY_R_6_2i_1D})|
({(Q+3){(PRVi_1D[7]&(R_7_2i_1D==R0))}}&{PHY_W_7_2i_1D,PHY_R_7_2i_1D})
R1, R2, R3, ..., RM-1 have similar logical expressions, which are not repeated here. The RAT update expressions shown do not include reset logic or the recovery logic for branch instruction execution errors or other events.
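The RAT write can be sketched behaviourally in Python. This is an illustration under stated assumptions, not the patent's circuit: the RAT is modelled as a dictionary from architectural register number to a (width, physical register) pair, rows without a selected writer keep their old value as the text describes, and reset and recovery logic are omitted. The function name `update_rat` is hypothetical.

```python
def update_rat(rat, prv_1d, dst_reg_1d, phy_w_1d, phy_r_1d):
    """Return the RAT after the stage-2 writes of one rename group.

    prv_1d[n]     : PRVi_1D[n], write select for UOPn
    dst_reg_1d[n] : R_n_2i_1D, destination architectural register of UOPn
    phy_w_1d[n]   : PHY_W_n_2i_1D, destination width
    phy_r_1d[n]   : PHY_R_n_2i_1D, newly allocated physical register
    """
    new_rat = dict(rat)
    for n, selected in enumerate(prv_1d):
        if selected:
            new_rat[dst_reg_1d[n]] = (phy_w_1d[n], phy_r_1d[n])
    return new_rat


rat = {0: (64, 3), 1: (64, 4)}
# Both UOPs write R1; PRV has already picked one writer per register.
new_rat = update_rat(rat, [True, False], [1, 1], [32, 16], [9, 8])
```

Because PRV selects at most one writer per architectural register, the order in which the loop visits the UOPs does not affect the result.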
1.1 Renaming procedure for UOP0
The 2 source operands of UOP0 need only query the RAT according to the architectural register number of each source operand. For example, looking up architectural register R_0_0i+1 in the RAT can be written RAT[R_0_0i+1], and likewise for the others. Renaming is completed in 2 stages: in stage 1, the 2 source operands of the UOP in cycle i+1 are mapped from architectural registers to physical registers, and this mapping must take into account the stage-2 RAT writes of the cycle-i instructions. Stage 1 of the cycle-i+1 instructions and stage 2 of the cycle-i instructions are in the same pipeline stage.
For the instructions of cycle i+1, the stage-1 computation that maps the 2 source operands and the destination architectural register to physical registers is as follows.
The logical expression for the 1st source operand is as follows:
PHY_R_0_0i+1=({Q{(PRVi_1D[0]&(R_0_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_0_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_0_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_0_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_0_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_0_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_0_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_0_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_0_0i+1][Q-1:0])
PHY_VAL_0_0i+1=VAL_0_0i+1
Similarly, the logical expression of the 2nd source operand is as follows:
PHY_R_0_1i+1=({Q{(PRVi_1D[0]&(R_0_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_0_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_0_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_0_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_0_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_0_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_0_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_0_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_0_1i+1][Q-1:0])
PHY_VAL_0_1i+1=VAL_0_1i+1
The destination register of UOP0 is allocated the physical register PHY_R_0_2i+1. The valid-bit expression of the physical register allocated to the destination register is as follows:
PHY_VAL_0_2i+1=VAL_0_2i+1
For the instructions of cycle i+1, the stage-2 computation for the 2 source operands and the destination architectural register mapped to physical registers is as follows.
The logical expression for the 1st source operand is as follows:
PHY_R_0_0i+1_1D<=PHY_R_0_0i+1
PHY_VAL_0_0i+1_1D<=PHY_VAL_0_0i+1
The logical expression for the 2nd source operand is as follows:
PHY_R_0_1i+1_1D<=PHY_R_0_1i+1
PHY_VAL_0_1i+1_1D<=PHY_VAL_0_1i+1
the logical expression of the destination register is as follows:
PHY_R_0_2i+1_1D<=PHY_R_0_2i+1
PHY_VAL_0_2i+1_1D<=PHY_VAL_0_2i+1
R_0_2i_1D<=R_0_2i
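The stage-1 source mapping for UOP0 amounts to a bypass multiplexer in front of the RAT read. The Python sketch below is a behavioural illustration with hypothetical names (`map_source`); it assumes the stored RAT value is used whenever none of the previous group's selected writes matches the source register, which is how the prose above describes obtaining the latest physical register number.

```python
def map_source(src_reg, rat, prv_1d, dst_reg_1d, phy_r_1d):
    """Physical register for one source operand of a cycle-(i+1) UOP.

    rat           : architectural register -> physical register (stored mapping)
    prv_1d[n]     : PRVi_1D[n], set if UOPn of cycle i is writing the RAT
    dst_reg_1d[n] : R_n_2i_1D, architectural register being written
    phy_r_1d[n]   : PHY_R_n_2i_1D, physical register being written
    """
    for n, selected in enumerate(prv_1d):
        # Forward the in-flight write if it targets the register being read.
        if selected and dst_reg_1d[n] == src_reg:
            return phy_r_1d[n]
    return rat[src_reg]  # assumption: no matching in-flight write, use the RAT


rat = {1: 10, 2: 11}
prv_1d = [False, True] + [False] * 6
dst_reg_1d = [1, 2] + [0] * 6
phy_r_1d = [40, 41] + [0] * 6
```

Returning the first match is equivalent to the AND-OR form of the expressions, because PRV allows at most one selected write per architectural register.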
1.2 Renaming procedure for UOP1
The 2 source operands of UOP1 query the RAT according to their architectural register numbers, and it must also be determined whether the architectural register numbers of the 2 source operands of UOP1 are the same as the architectural register number of the destination register of UOP0, i.e. whether R_1_0i+1 and R_1_1i+1 are the same as R_0_2i+1. If a source operand of UOP1 has the same architectural register number as the destination register of UOP0, the physical register to which that source operand maps is the physical register allocated to the destination register of UOP0.
The method performs two operations in parallel: determining whether the architectural register numbers of the 2 source operands of UOP1 match that of the destination register of UOP0, and mapping the 2 source operands of UOP1 to physical registers. The serial sequence of first comparing and then mapping is thereby executed in parallel. This rests on the hypothesis approach: in stage 1, each of the 2 source operands of UOP1 is handled under 2 cases, and a physical register result is obtained for each case. In stage 2, the correct result is selected according to the selection logic generated in parallel in stage 1.
1.2.1 assume that there is no RAW correlation between UOP1 and UOP0
When there is no RAW correlation between UOP1 and UOP0, the mapping process of UOP1 is similar to that of UOP0.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_1_0i+1_A, is as follows:
PHY_R_1_0i+1_A=({Q{(PRVi_1D[0]&(R_1_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_1_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_1_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_1_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_1_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_1_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_1_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_1_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_1_0i+1][Q-1:0])
PHY_VAL_1_0i+1=VAL_1_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_1_1i+1_A, is as follows:
PHY_R_1_1i+1_A=({Q{(PRVi_1D[0]&(R_1_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_1_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_1_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_1_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_1_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_1_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_1_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_1_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_1_1i+1][Q-1:0])
PHY_VAL_1_1i+1=VAL_1_1i+1
The destination register of UOP1 is allocated physical register PHY_R_1_2i+1. The valid flag of the physical register allocated to the destination register is expressed as follows:
PHY_VAL_1_2i+1=VAL_1_2i+1
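The case-A expressions are AND-OR structures: each term is a Q-bit mask built by replicating a one-bit condition, ANDed with a candidate value, and all terms are ORed. The sketch below transcribes that structure literally for one source operand. The function names and the list/dict representation are assumptions made for illustration; note that, as written in the expression, the RAT term is enabled only when no bit of PRVi_1D is set, and the sketch keeps that behaviour.

```python
from typing import Dict, List


def replicate(bit: bool, q: int) -> int:
    """Model of {Q{bit}}: Q ones if bit is set, otherwise 0."""
    return (1 << q) - 1 if bit else 0


def case_a_lookup(src_arch: int, prv: List[bool],
                  prev_dest_arch: List[int], prev_dest_phys: List[int],
                  rat: Dict[int, int], q: int) -> int:
    """Literal transcription of the PHY_R_n_x_A expression.

    prv[k]            models PRVi_1D[k]
    prev_dest_arch[k] models R_k_2i_1D
    prev_dest_phys[k] models PHY_R_k_2i_1D
    """
    result = 0
    for k in range(len(prv)):
        hit = prv[k] and src_arch == prev_dest_arch[k]
        result |= replicate(hit, q) & prev_dest_phys[k]
    # RAT term: selected only when no previous-cycle destination is valid
    rat_low_bits = rat[src_arch] & ((1 << q) - 1)   # RAT[...][Q-1:0]
    result |= replicate(not any(prv), q) & rat_low_bits
    return result
```

Because the terms are simply ORed, the result is meaningful only when at most one term is enabled; the sketch does not add any check for that.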
1.2.2 assumed RAW correlation between UOP1 and UOP0
When there is a RAW correlation between UOP1 and UOP0, i.e., the architectural register number of the destination register of UOP0 is the same as that of a source register of UOP1, the physical register number of the UOP1 source register is the physical register number newly allocated to the UOP0 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_1_0i+1_B, is as follows:
PHY_R_1_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_1_1i+1_B, is as follows:
PHY_R_1_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case where UOP1 and UOP0 have no RAW correlation.
1.2.3 logic for determining whether a RAW correlation exists between UOP1 and UOP0
To determine whether UOP1 and UOP0 have a RAW correlation, it is only necessary to compare whether the source register number of UOP1 is the same as the destination register number of UOP0. The determination logic expressions for the 2 source registers are as follows:
CMP_R_1_0i+1=((R_1_0i+1==R_0_2i+1)&VAL_1_0&VAL_0_2)
CMP_R_1_1i+1=((R_1_1i+1==R_0_2i+1)&VAL_1_1&VAL_0_2)
In stage 2 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
PHY_R_1_0i+1_A_1D<=PHY_R_1_0i+1_A
PHY_R_1_0i+1_B_1D<=PHY_R_1_0i+1_B
PHY_R_1_1i+1_A_1D<=PHY_R_1_1i+1_A
PHY_R_1_1i+1_B_1D<=PHY_R_1_1i+1_B
CMP_R_1_0i+1_1D<=CMP_R_1_0i+1
CMP_R_1_1i+1_1D<=CMP_R_1_1i+1
The final physical register of the 1st source operand, PHY_R_1_0i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_1_0i+1_1D=({Q{(~CMP_R_1_0i+1_1D)}}&PHY_R_1_0i+1_A_1D)|
({Q{CMP_R_1_0i+1_1D}}&PHY_R_1_0i+1_B_1D)
The final physical register of the 2nd source operand, PHY_R_1_1i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_1_1i+1_1D=({Q{(~CMP_R_1_1i+1_1D)}}&PHY_R_1_1i+1_A_1D)|
({Q{CMP_R_1_1i+1_1D}}&PHY_R_1_1i+1_B_1D)
PHY_VAL_1_0i+1_1D<=PHY_VAL_1_0i+1
PHY_VAL_1_1i+1_1D<=PHY_VAL_1_1i+1
The logical expression of the destination register is as follows:
PHY_R_1_2i+1_1D<=PHY_R_1_2i+1
PHY_VAL_1_2i+1_1D<=PHY_VAL_1_2i+1
R_1_2i_1D<=R_1_2i
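As a bit-level illustration of the UOP1 comparison and stage-2 selection, the following sketch models `CMP_R_1_x` and the two-term AND-OR selector. The function names are hypothetical, and the pipeline registers (the `_1D` copies) are represented simply by passing the stage-1 values into the stage-2 function.

```python
def replicate(bit: bool, q: int) -> int:
    """Model of {Q{bit}}: Q ones if bit is set, otherwise 0."""
    return (1 << q) - 1 if bit else 0


def cmp_uop1(src_arch: int, src_valid: bool,
             uop0_dest_arch: int, uop0_dest_valid: bool) -> bool:
    """CMP_R_1_x: source equals UOP0's destination and both are valid."""
    return bool(src_arch == uop0_dest_arch and src_valid and uop0_dest_valid)


def select_uop1(cmp_1d: bool, cand_a_1d: int, cand_b_1d: int, q: int) -> int:
    """Stage-2 selector: case A when cmp is 0, case B when cmp is 1."""
    return ((replicate(not cmp_1d, q) & cand_a_1d) |
            (replicate(cmp_1d, q) & cand_b_1d))
```

The valid flags matter here: a register-number match against an invalid UOP0 destination leaves the selector at 0, so the case-A result is used.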
1.3 renaming procedure for UOP2
The 2 source operands of UOP2 query the RAT table according to the architectural register number of each source operand. It is also necessary to determine whether the architectural register numbers of the 2 source operands of UOP2 are the same as the architectural register numbers of the destination registers of UOP1 and UOP0, i.e., whether R_2_0i+1 and R_2_1i+1 are the same as R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP2 is the same as that of the destination register of UOP1 or UOP0, the physical register to which that source operand maps is the physical register allocated to the destination register of UOP1 or UOP0. If a source register of UOP2 has a RAW correlation with both UOP1 and UOP0, the source register takes the physical register number allocated to the destination register of UOP1, according to the priority order.
The source registers of UOP2 are handled under 3 cases: 1, assume that UOP2 has no RAW correlation with UOP1 or UOP0; 2, assume that UOP2 has a RAW correlation with UOP0 and no RAW correlation with UOP1; 3, assume that UOP2 has a RAW correlation with UOP1. In stage 1 of renaming, 3 candidate physical registers are obtained for each source register while the correlation of UOP2 with UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 3 physical register numbers of each source register according to the correlation logic.
1.3.1 assume that UOP2 is not RAW related to UOP1 and UOP0
When UOP2 has no RAW correlation with UOP1 or UOP0, the mapping process of UOP2 is similar to that of UOP0. In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_2_0i+1_A, is as follows:
PHY_R_2_0i+1_A=({Q{(PRVi_1D[0]&(R_2_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_2_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_2_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_2_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_2_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_2_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_2_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_2_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_2_0i+1][Q-1:0])
PHY_VAL_2_0i+1=VAL_2_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_2_1i+1_A, is as follows:
PHY_R_2_1i+1_A=({Q{(PRVi_1D[0]&(R_2_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_2_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_2_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_2_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_2_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_2_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_2_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_2_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_2_1i+1][Q-1:0])
PHY_VAL_2_1i+1=VAL_2_1i+1
The destination register of UOP2 is allocated physical register PHY_R_2_2i+1. The valid flag of the physical register allocated to the destination register is expressed as follows:
PHY_VAL_2_2i+1=VAL_2_2i+1
1.3.2 assume that there is a RAW correlation between UOP2 and UOP0, and that there is no RAW correlation between UOP2 and UOP1
When UOP2 has a RAW correlation with UOP0 and no RAW correlation with UOP1, i.e., the architectural register number of a source register of UOP2 is the same as that of the destination register of UOP0 and different from that of the destination register of UOP1, the physical register number of the UOP2 source register is the physical register number newly allocated to the UOP0 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_2_0i+1_B, is as follows:
PHY_R_2_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_2_1i+1_B, is as follows:
PHY_R_2_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.3.3 assumed that there is a RAW correlation between UOP2 and UOP1
When there is a RAW correlation between UOP2 and UOP1, i.e., the architectural register number of the destination register of UOP1 is the same as that of a source register of UOP2, the physical register number of the UOP2 source register is the physical register number newly allocated to the UOP1 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_2_0i+1_C, is as follows:
PHY_R_2_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_2_1i+1_C, is as follows:
PHY_R_2_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.3.4 logic for determining whether a RAW correlation exists between UOP2 and UOP1 or UOP0
To determine whether UOP2 has a RAW correlation with UOP0 and no RAW correlation with UOP1, the source register number of UOP2 is compared with the destination register number of UOP0 and also with the destination register number of UOP1. The determination logic expressions for the 2 source registers are as follows:
The selection logic of the 1st source operand, CMP_R_2_0i+1[1:0], is expressed as follows:
CMP_R_2_0i+1[0]=((R_2_0i+1==R_0_2i+1)&VAL_2_0&VAL_0_2)&
(~((R_2_0i+1==R_1_2i+1)&VAL_2_0&VAL_1_2))
CMP_R_2_0i+1[1]=((R_2_0i+1==R_1_2i+1)&VAL_2_0&VAL_1_2)
The selection logic of the 2nd source operand, CMP_R_2_1i+1[1:0], is expressed as follows:
CMP_R_2_1i+1[0]=((R_2_1i+1==R_0_2i+1)&VAL_2_1&VAL_0_2)&
(~((R_2_1i+1==R_1_2i+1)&VAL_2_1&VAL_1_2))
CMP_R_2_1i+1[1]=((R_2_1i+1==R_1_2i+1)&VAL_2_1&VAL_1_2)
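The 2-bit selector for a UOP2 source operand can be modelled as below. This is a sketch under assumed names (`cmp_uop2`, list-based inputs); it follows the expressions above, where bit 0 (match with UOP0) is masked whenever UOP1 also matches, which is what gives UOP1 priority.

```python
from typing import List, Tuple


def cmp_uop2(src_arch: int, src_valid: bool,
             dest_arch: List[int], dest_valid: List[bool]) -> Tuple[bool, bool]:
    """Return (bit0, bit1) of CMP_R_2_x for one source operand of UOP2.

    dest_arch[k] / dest_valid[k] describe the destination of UOPk, k = 0, 1.
    """
    hit0 = src_arch == dest_arch[0] and src_valid and dest_valid[0]
    hit1 = src_arch == dest_arch[1] and src_valid and dest_valid[1]
    bit0 = hit0 and not hit1   # UOP0 match, suppressed if UOP1 also matches
    bit1 = hit1                # UOP1 match always wins
    return bool(bit0), bool(bit1)
```

When both older UOPs write the register that UOP2 reads, only bit 1 is set, so the UOP1 destination is chosen in stage 2.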
In stage 2 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
PHY_R_2_0i+1_A_1D<=PHY_R_2_0i+1_A
PHY_R_2_0i+1_B_1D<=PHY_R_2_0i+1_B
PHY_R_2_0i+1_C_1D<=PHY_R_2_0i+1_C
PHY_R_2_1i+1_A_1D<=PHY_R_2_1i+1_A
PHY_R_2_1i+1_B_1D<=PHY_R_2_1i+1_B
PHY_R_2_1i+1_C_1D<=PHY_R_2_1i+1_C
CMP_R_2_0i+1_1D<=CMP_R_2_0i+1
CMP_R_2_1i+1_1D<=CMP_R_2_1i+1
The final physical register of the 1st source operand, PHY_R_2_0i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_2_0i+1_1D=({Q{(~(|CMP_R_2_0i+1_1D))}}&PHY_R_2_0i+1_A_1D)|
({Q{CMP_R_2_0i+1_1D[0]}}&PHY_R_2_0i+1_B_1D)|
({Q{CMP_R_2_0i+1_1D[1]}}&PHY_R_2_0i+1_C_1D)
The final physical register of the 2nd source operand, PHY_R_2_1i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_2_1i+1_1D=({Q{(~(|CMP_R_2_1i+1_1D))}}&PHY_R_2_1i+1_A_1D)|
({Q{CMP_R_2_1i+1_1D[0]}}&PHY_R_2_1i+1_B_1D)|
({Q{CMP_R_2_1i+1_1D[1]}}&PHY_R_2_1i+1_C_1D)
PHY_VAL_2_0i+1_1D<=PHY_VAL_2_0i+1
PHY_VAL_2_1i+1_1D<=PHY_VAL_2_1i+1
The logical expression of the destination register is as follows:
PHY_R_2_2i+1_1D<=PHY_R_2_2i+1
PHY_VAL_2_2i+1_1D<=PHY_VAL_2_2i+1
R_2_2i_1D<=R_2_2i
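The stage-2 selection for UOP2 is a three-term AND-OR selector. A minimal sketch follows, with hypothetical function names and plain integers standing in for the registered `_1D` signals. It also shows why the selector bits need to be mutually exclusive: if two bits were set, the candidates would be ORed together.

```python
from typing import Tuple


def replicate(bit: bool, q: int) -> int:
    """Model of {Q{bit}}: Q ones if bit is set, otherwise 0."""
    return (1 << q) - 1 if bit else 0


def select_uop2(cmp_1d: Tuple[bool, bool],
                cand_a: int, cand_b: int, cand_c: int, q: int) -> int:
    """PHY_R_2_x_1D: A if no selector bit is set, B for bit 0, C for bit 1."""
    none_hit = not (cmp_1d[0] or cmp_1d[1])     # ~(|CMP_R_2_x_1D)
    return ((replicate(none_hit, q) & cand_a) |
            (replicate(cmp_1d[0], q) & cand_b) |
            (replicate(cmp_1d[1], q) & cand_c))
```

With candidates 1, 2 and 4, the selector values (0,0), (1,0) and (0,1) return 1, 2 and 4 respectively, whereas the combination (1,1), which the comparison logic never produces, would return 6.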
1.4 renaming procedure for UOP3
The 2 source operands of UOP3 query the RAT table according to the architectural register number of each source operand. It is also necessary to determine whether the architectural register numbers of the 2 source operands of UOP3 are the same as the architectural register numbers of the destination registers of UOP2, UOP1 and UOP0, i.e., whether R_3_0i+1 and R_3_1i+1 are the same as R_2_2i+1, R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP3 is the same as that of the destination register of UOP2, UOP1 or UOP0, the physical register to which that source operand maps is the physical register allocated to the destination register of UOP2, UOP1 or UOP0. If a source register of UOP3 has a RAW correlation with UOP2, UOP1 and UOP0, the source register takes the physical register number allocated to the destination register of UOP2, according to the priority order.
The source registers of UOP3 are handled under 4 cases: 1, assume that UOP3 has no RAW correlation with UOP2, UOP1 or UOP0; 2, assume that UOP3 has a RAW correlation with UOP0 and no RAW correlation with UOP2 or UOP1; 3, assume that UOP3 has a RAW correlation with UOP1 and no RAW correlation with UOP2; 4, assume that UOP3 has a RAW correlation with UOP2. In stage 1 of renaming, 4 candidate physical registers are obtained for each source register while the correlation of UOP3 with UOP2, UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 4 physical register numbers of each source register according to the correlation logic.
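The priority order described above amounts to "the closest older UOP that writes the register wins". A plain sequential reference model makes this explicit and can serve as a check for any parallel implementation. The function name and the tuple layout below are assumptions for illustration only, and the case-A lookup is reduced to a dictionary access.

```python
from typing import Dict, List, Tuple


def rename_source_sequential(src_arch: int,
                             older_dests: List[Tuple[int, int, bool]],
                             rat: Dict[int, int]) -> int:
    """Reference renaming of one source operand.

    older_dests lists (dest_arch, dest_phys, valid) for the older UOPs of the
    same cycle in program order (UOP0 first).
    """
    phys = rat[src_arch]                  # case with no RAW correlation
    for dest_arch, dest_phys, valid in older_dests:
        if valid and dest_arch == src_arch:
            phys = dest_phys              # a later writer overrides an earlier one
    return phys
```

For UOP3 there are three older UOPs, so the loop can end in exactly four ways, matching the 4 cases enumerated above.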
1.4.1 assume that there is no RAW correlation between UOP3 and UOP2, UOP1 and UOP0
When UOP3 has no RAW correlation with UOP2, UOP1 or UOP0, the mapping process of UOP3 is similar to that of UOP0.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_3_0i+1_A, is as follows:
PHY_R_3_0i+1_A=({Q{(PRVi_1D[0]&(R_3_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_3_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_3_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_3_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_3_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_3_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_3_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_3_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_3_0i+1][Q-1:0])
PHY_VAL_3_0i+1=VAL_3_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_3_1i+1_A, is as follows:
PHY_R_3_1i+1_A=({Q{(PRVi_1D[0]&(R_3_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_3_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_3_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_3_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_3_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_3_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_3_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_3_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_3_1i+1][Q-1:0])
PHY_VAL_3_1i+1=VAL_3_1i+1
1.4.2 assume that there is a RAW correlation between UOP3 and UOP0, and that there is no RAW correlation between UOP3 and UOP2, UOP1
When UOP3 has a RAW correlation with UOP0 and no RAW correlation with UOP2 or UOP1, i.e., the architectural register number of a source register of UOP3 is the same as that of the destination register of UOP0 and different from those of the destination registers of UOP2 and UOP1, the physical register number of the UOP3 source register is the physical register number newly allocated to the UOP0 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_3_0i+1_B, is as follows:
PHY_R_3_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_3_1i+1_B, is as follows:
PHY_R_3_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.4.3 assume that there is a RAW correlation between UOP3 and UOP1, and that there is no RAW correlation between UOP3 and UOP2
When UOP3 has a RAW correlation with UOP1 and no RAW correlation with UOP2, i.e., the architectural register number of a source register of UOP3 is the same as that of the destination register of UOP1 and different from that of the destination register of UOP2, the physical register number of the UOP3 source register is the physical register number newly allocated to the UOP1 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_3_0i+1_C, is as follows:
PHY_R_3_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_3_1i+1_C, is as follows:
PHY_R_3_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.4.4 assumed RAW correlation between UOP3 and UOP2
When there is a RAW correlation between UOP3 and UOP2, i.e., the architectural register number of the destination register of UOP2 is the same as that of a source register of UOP3, the physical register number of the UOP3 source register is the physical register number newly allocated to the UOP2 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_3_0i+1_D, is as follows:
PHY_R_3_0i+1_D=PHY_R_2_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_3_1i+1_D, is as follows:
PHY_R_3_1i+1_D=PHY_R_2_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.4.5 logic for determining whether a RAW correlation exists between UOP3 and UOP2, UOP1 or UOP0
To determine whether UOP3 has a RAW correlation with UOP0 and no RAW correlation with UOP2 or UOP1, the source register number of UOP3 is compared with the destination register number of UOP0 and also with the destination register numbers of UOP2 and UOP1. The determination logic expressions for the 2 source registers are as follows:
The selection logic of the 1st source operand, CMP_R_3_0i+1[2:0], is expressed as follows:
CMP_R_3_0i+1[0]=((R_3_0i+1==R_0_2i+1)&VAL_3_0&VAL_0_2)&
(~((R_3_0i+1==R_1_2i+1)&VAL_3_0&VAL_1_2))&
(~((R_3_0i+1==R_2_2i+1)&VAL_3_0&VAL_2_2))
CMP_R_3_0i+1[1]=((R_3_0i+1==R_1_2i+1)&VAL_3_0&VAL_1_2)&
(~((R_3_0i+1==R_2_2i+1)&VAL_3_0&VAL_2_2))
CMP_R_3_0i+1[2]=((R_3_0i+1==R_2_2i+1)&VAL_3_0&VAL_2_2)
The selection logic of the 2nd source operand, CMP_R_3_1i+1[2:0], is expressed as follows:
CMP_R_3_1i+1[0]=((R_3_1i+1==R_0_2i+1)&VAL_3_1&VAL_0_2)&
(~((R_3_1i+1==R_1_2i+1)&VAL_3_1&VAL_1_2))&
(~((R_3_1i+1==R_2_2i+1)&VAL_3_1&VAL_2_2))
CMP_R_3_1i+1[1]=((R_3_1i+1==R_1_2i+1)&VAL_3_1&VAL_1_2)&
(~((R_3_1i+1==R_2_2i+1)&VAL_3_1&VAL_2_2))
CMP_R_3_1i+1[2]=((R_3_1i+1==R_2_2i+1)&VAL_3_1&VAL_2_2)
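One property worth checking for the 3-bit selector is that at most one bit is ever set, since the stage-2 selector ORs its terms together. The sketch below models the three bits from the raw match signals and enumerates all 8 input combinations; `cmp_uop3` and `all_outputs` are hypothetical names.

```python
from itertools import product
from typing import Tuple


def cmp_uop3(hit0: bool, hit1: bool, hit2: bool) -> Tuple[bool, bool, bool]:
    """Selector bits for one UOP3 source operand.

    hitK stands for the raw match '(R_3_x == R_K_2) & VAL_3_x & VAL_K_2'
    against the destination of UOPK.
    """
    bit0 = hit0 and not hit1 and not hit2   # UOP0, unless UOP1 or UOP2 match
    bit1 = hit1 and not hit2                # UOP1, unless UOP2 matches
    bit2 = hit2                             # UOP2 has the highest priority
    return bit0, bit1, bit2


# Enumerate every combination of raw matches.
all_outputs = [cmp_uop3(*hits) for hits in product([False, True], repeat=3)]
```

Every entry of `all_outputs` has at most one bit set, and an all-zero selector occurs only when none of the three older UOPs matches.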
In stage 2 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
PHY_R_3_0i+1_A_1D<=PHY_R_3_0i+1_A
PHY_R_3_0i+1_B_1D<=PHY_R_3_0i+1_B
PHY_R_3_0i+1_C_1D<=PHY_R_3_0i+1_C
PHY_R_3_0i+1_D_1D<=PHY_R_3_0i+1_D
PHY_R_3_1i+1_A_1D<=PHY_R_3_1i+1_A
PHY_R_3_1i+1_B_1D<=PHY_R_3_1i+1_B
PHY_R_3_1i+1_C_1D<=PHY_R_3_1i+1_C
PHY_R_3_1i+1_D_1D<=PHY_R_3_1i+1_D
CMP_R_3_0i+1_1D<=CMP_R_3_0i+1
CMP_R_3_1i+1_1D<=CMP_R_3_1i+1
The final physical register of the 1st source operand, PHY_R_3_0i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_3_0i+1_1D=({Q{(~(|CMP_R_3_0i+1_1D))}}&PHY_R_3_0i+1_A_1D)|
({Q{CMP_R_3_0i+1_1D[0]}}&PHY_R_3_0i+1_B_1D)|
({Q{CMP_R_3_0i+1_1D[1]}}&PHY_R_3_0i+1_C_1D)|
({Q{CMP_R_3_0i+1_1D[2]}}&PHY_R_3_0i+1_D_1D)
The final physical register of the 2nd source operand, PHY_R_3_1i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_3_1i+1_1D=({Q{(~(|CMP_R_3_1i+1_1D))}}&PHY_R_3_1i+1_A_1D)|
({Q{CMP_R_3_1i+1_1D[0]}}&PHY_R_3_1i+1_B_1D)|
({Q{CMP_R_3_1i+1_1D[1]}}&PHY_R_3_1i+1_C_1D)|
({Q{CMP_R_3_1i+1_1D[2]}}&PHY_R_3_1i+1_D_1D)
PHY_VAL_3_0i+1_1D<=PHY_VAL_3_0i+1
PHY_VAL_3_1i+1_1D<=PHY_VAL_3_1i+1
The logical expression of the destination register is as follows:
PHY_R_3_2i+1_1D<=PHY_R_3_2i+1
PHY_VAL_3_2i+1_1D<=PHY_VAL_3_2i+1
R_3_2i_1D<=R_3_2i
1.5 renaming procedure for UOP4
The 2 source operands of UOP4 query the RAT table according to the architectural register number of each source operand. It is also necessary to determine whether the architectural register numbers of the 2 source operands of UOP4 are the same as the architectural register numbers of the destination registers of UOP3, UOP2, UOP1 and UOP0, i.e., whether R_4_0i+1 and R_4_1i+1 are the same as R_3_2i+1, R_2_2i+1, R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP4 is the same as that of the destination register of UOP3, UOP2, UOP1 or UOP0, the physical register to which that source operand maps is the physical register allocated to the destination register of UOP3, UOP2, UOP1 or UOP0. If a source register of UOP4 has a RAW correlation with UOP3, UOP2, UOP1 and UOP0, the source register takes the physical register number allocated to the destination register of UOP3, according to the priority order.
The source registers of UOP4 are handled under 5 cases: 1, assume that UOP4 has no RAW correlation with UOP3, UOP2, UOP1 or UOP0; 2, assume that UOP4 has a RAW correlation with UOP0 and no RAW correlation with UOP3, UOP2 or UOP1; 3, assume that UOP4 has a RAW correlation with UOP1 and no RAW correlation with UOP3 or UOP2; 4, assume that UOP4 has a RAW correlation with UOP2 and no RAW correlation with UOP3; 5, assume that UOP4 has a RAW correlation with UOP3. In stage 1 of renaming, 5 candidate physical registers are obtained for each source register while the correlation of UOP4 with UOP3, UOP2, UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 5 physical register numbers of each source register according to the correlation logic.
1.5.1 assume that there is no RAW correlation between UOP4 and UOP3, UOP2, UOP1 and UOP0
When UOP4 has no RAW correlation with UOP3, UOP2, UOP1 or UOP0, the mapping process of UOP4 is similar to that of UOP0. In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_4_0i+1_A, is as follows:
PHY_R_4_0i+1_A=({Q{(PRVi_1D[0]&(R_4_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_4_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_4_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_4_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_4_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_4_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_4_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_4_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_4_0i+1][Q-1:0])
PHY_VAL_4_0i+1=VAL_4_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_4_1i+1_A, is as follows:
PHY_R_4_1i+1_A=({Q{(PRVi_1D[0]&(R_4_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_4_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_4_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_4_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_4_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_4_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_4_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_4_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_4_1i+1][Q-1:0])
PHY_VAL_4_1i+1=VAL_4_1i+1
1.5.2 assume that there is a RAW correlation between UOP4 and UOP0, and that there is no RAW correlation between UOP4 and UOP3, UOP2, UOP1
When UOP4 has a RAW correlation with UOP0 and no RAW correlation with UOP3, UOP2 or UOP1, i.e., the architectural register number of a source register of UOP4 is the same as that of the destination register of UOP0 and different from those of the destination registers of UOP3, UOP2 and UOP1, the physical register number of the UOP4 source register is the physical register number newly allocated to the UOP0 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_4_0i+1_B, is as follows:
PHY_R_4_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_4_1i+1_B, is as follows:
PHY_R_4_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.5.3 assume that there is a RAW correlation between UOP4 and UOP1, and that there is no RAW correlation between UOP4 and UOP3, UOP2
When UOP4 has a RAW correlation with UOP1 and no RAW correlation with UOP3 or UOP2, i.e., the architectural register number of a source register of UOP4 is the same as that of the destination register of UOP1 and different from those of the destination registers of UOP3 and UOP2, the physical register number of the UOP4 source register is the physical register number newly allocated to the UOP1 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_4_0i+1_C, is as follows:
PHY_R_4_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_4_1i+1_C, is as follows:
PHY_R_4_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case with no RAW correlation.
1.5.4 assume that there is a RAW correlation between UOP4 and UOP2, and that there is no RAW correlation between UOP4 and UOP3
When UOP4 has a RAW correlation with UOP2 and no RAW correlation with UOP3, i.e., the architectural register number of a source register of UOP4 is the same as that of the destination register of UOP2 and different from that of the destination register of UOP3, the physical register number of the UOP4 source register is the physical register number newly allocated to the UOP2 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_4_0i+1_D, is as follows:
PHY_R_4_0i+1_D=PHY_R_2_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_4_1i+1_D, is as follows:
PHY_R_4_1i+1_D=PHY_R_2_2i+1
1.5.5 assumed RAW correlation between UOP4 and UOP3
When there is a RAW correlation between UOP4 and UOP3, i.e., the architectural register number of the destination register of UOP3 is the same as that of a source register of UOP4, the physical register number of the UOP4 source register is the physical register number newly allocated to the UOP3 destination register.
In stage 1 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
The logical expression of the 1st source operand, PHY_R_4_0i+1_E, is as follows:
PHY_R_4_0i+1_E=PHY_R_3_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_4_1i+1_E, is as follows:
PHY_R_4_1i+1_E=PHY_R_3_2i+1
1.5.6 logic for determining whether a RAW correlation exists between UOP4 and UOP3, UOP2, UOP1 or UOP0
To determine whether UOP4 has a RAW correlation with UOP0 and no RAW correlation with UOP3, UOP2 or UOP1, the source register number of UOP4 is compared with the destination register number of UOP0 and also with the destination register numbers of UOP3, UOP2 and UOP1. The determination logic expressions for the 2 source registers are as follows:
The selection logic of the 1st source operand, CMP_R_4_0i+1[3:0], is expressed as follows:
CMP_R_4_0i+1[0]=((R_4_0i+1==R_0_2i+1)&VAL_4_0&VAL_0_2)&
(~((R_4_0i+1==R_1_2i+1)&VAL_4_0&VAL_1_2))&
(~((R_4_0i+1==R_2_2i+1)&VAL_4_0&VAL_2_2))&
(~((R_4_0i+1==R_3_2i+1)&VAL_4_0&VAL_3_2))
CMP_R_4_0i+1[1]=((R_4_0i+1==R_1_2i+1)&VAL_4_0&VAL_1_2)&
(~((R_4_0i+1==R_2_2i+1)&VAL_4_0&VAL_2_2))&
(~((R_4_0i+1==R_3_2i+1)&VAL_4_0&VAL_3_2))
CMP_R_4_0i+1[2]=((R_4_0i+1==R_2_2i+1)&VAL_4_0&VAL_2_2)&
(~((R_4_0i+1==R_3_2i+1)&VAL_4_0&VAL_3_2))
CMP_R_4_0i+1[3]=((R_4_0i+1==R_3_2i+1)&VAL_4_0&VAL_3_2)
The selection logic of the 2nd source operand, CMP_R_4_1i+1[3:0], is expressed as follows:
CMP_R_4_1i+1[0]=((R_4_1i+1==R_0_2i+1)&VAL_4_1&VAL_0_2)&
(~((R_4_1i+1==R_1_2i+1)&VAL_4_1&VAL_1_2))&
(~((R_4_1i+1==R_2_2i+1)&VAL_4_1&VAL_2_2))&
(~((R_4_1i+1==R_3_2i+1)&VAL_4_1&VAL_3_2))
CMP_R_4_1i+1[1]=((R_4_1i+1==R_1_2i+1)&VAL_4_1&VAL_1_2)&
(~((R_4_1i+1==R_2_2i+1)&VAL_4_1&VAL_2_2))&
(~((R_4_1i+1==R_3_2i+1)&VAL_4_1&VAL_3_2))
CMP_R_4_1i+1[2]=((R_4_1i+1==R_2_2i+1)&VAL_4_1&VAL_2_2)&
(~((R_4_1i+1==R_3_2i+1)&VAL_4_1&VAL_3_2))
CMP_R_4_1i+1[3]=((R_4_1i+1==R_3_2i+1)&VAL_4_1&VAL_3_2)
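The selector expressions for UOP1 through UOP4 all follow one pattern: bit k is the raw match against UOPk, masked by the raw matches against every UOP between UOPk and the consumer. A generic sketch of that pattern is given below. The function name and list-based interface are assumptions; each source operand is expected to pass its own valid flag (for the 2nd operand of UOP4, the flag corresponding to VAL_4_1).

```python
from typing import List


def priority_select_bits(src_arch: int, src_valid: bool,
                         dest_arch: List[int],
                         dest_valid: List[bool]) -> List[bool]:
    """Selector bits for one source operand of UOPn.

    dest_arch[k] / dest_valid[k] describe the destination of the older UOPk
    (k = 0 .. n-1) renamed in the same cycle.
    """
    hits = [bool(src_arch == d and src_valid and v)
            for d, v in zip(dest_arch, dest_valid)]
    bits = []
    for k, hit in enumerate(hits):
        closer_hit = any(hits[k + 1:])     # a UOP nearer to the consumer matches
        bits.append(hit and not closer_hit)
    return bits
```

For a UOP4 source reading register 9 while UOP0 and UOP2 both write register 9, only bit 2 is set; if the UOP2 destination is invalid, bit 0 is set instead.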
In stage 2 of cycle i+1, the mapping of the instruction's 2 source operands and destination architectural register to physical registers is computed as follows:
PHY_R_4_0i+1_A_1D<=PHY_R_4_0i+1_A
PHY_R_4_0i+1_B_1D<=PHY_R_4_0i+1_B
PHY_R_4_0i+1_C_1D<=PHY_R_4_0i+1_C
PHY_R_4_0i+1_D_1D<=PHY_R_4_0i+1_D
PHY_R_4_0i+1_E_1D<=PHY_R_4_0i+1_E
PHY_R_4_1i+1_A_1D<=PHY_R_4_1i+1_A
PHY_R_4_1i+1_B_1D<=PHY_R_4_1i+1_B
PHY_R_4_1i+1_C_1D<=PHY_R_4_1i+1_C
PHY_R_4_1i+1_D_1D<=PHY_R_4_1i+1_D
PHY_R_4_1i+1_E_1D<=PHY_R_4_1i+1_E
CMP_R_4_0i+1_1D<=CMP_R_4_0i+1
CMP_R_4_1i+1_1D<=CMP_R_4_1i+1
The final physical register of the 1st source operand, PHY_R_4_0i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_4_0i+1_1D=({Q{(~(|CMP_R_4_0i+1_1D))}}&PHY_R_4_0i+1_A_1D)|
({Q{CMP_R_4_0i+1_1D[0]}}&PHY_R_4_0i+1_B_1D)|
({Q{CMP_R_4_0i+1_1D[1]}}&PHY_R_4_0i+1_C_1D)|
({Q{CMP_R_4_0i+1_1D[2]}}&PHY_R_4_0i+1_D_1D)|
({Q{CMP_R_4_0i+1_1D[3]}}&PHY_R_4_0i+1_E_1D)
The final physical register of the 2nd source operand, PHY_R_4_1i+1_1D, is the correct physical register selected according to the selection logic:
PHY_R_4_1i+1_1D=({Q{(~(|CMP_R_4_1i+1_1D))}}&PHY_R_4_1i+1_A_1D)|
({Q{CMP_R_4_1i+1_1D[0]}}&PHY_R_4_1i+1_B_1D)|
({Q{CMP_R_4_1i+1_1D[1]}}&PHY_R_4_1i+1_C_1D)|
({Q{CMP_R_4_1i+1_1D[2]}}&PHY_R_4_1i+1_D_1D)|
({Q{CMP_R_4_1i+1_1D[3]}}&PHY_R_4_1i+1_E_1D)
PHY_VAL_4_0i+1_1D<=PHY_VAL_4_0i+1
PHY_VAL_4_1i+1_1D<=PHY_VAL_4_1i+1
The logical expression of the destination register is as follows:
PHY_R_4_2i+1_1D<=PHY_R_4_2i+1
PHY_VAL_4_2i+1_1D<=PHY_VAL_4_2i+1
R_4_2i_1D<=R_4_2i
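To gain some confidence that the hypothesis-based scheme for UOP4 gives the same answer as straightforward in-order renaming, the two can be compared exhaustively over a small register space. The sketch below is an assumption-laden model, not the patented logic itself: all names are hypothetical, the source operand is taken to be valid, and the case-A result is represented by a single value `rat_value`.

```python
from itertools import product
from typing import Sequence, Tuple


def parallel_rename(src: int, dests: Sequence[int], valids: Sequence[bool],
                    phys: Sequence[int], rat_value: int) -> int:
    """Two-stage model: five candidates (A..E) plus a 4-bit selector."""
    # Stage 1: candidates and selector bits, none depending on the other
    cands = [rat_value] + list(phys)
    hits = [src == d and v for d, v in zip(dests, valids)]
    cmp_bits = [hits[k] and not any(hits[k + 1:]) for k in range(4)]
    # Stage 2: AND-OR selection
    result = cands[0] if not any(cmp_bits) else 0
    for k in range(4):
        if cmp_bits[k]:
            result |= cands[k + 1]
    return result


def sequential_rename(src: int, dests: Sequence[int], valids: Sequence[bool],
                      phys: Sequence[int], rat_value: int) -> int:
    """Reference: walk UOP0..UOP3 in order, the last matching writer wins."""
    result = rat_value
    for d, v, p in zip(dests, valids, phys):
        if v and d == src:
            result = p
    return result


def check_all() -> Tuple[int, int]:
    """Return (number of cases tried, number of mismatches)."""
    phys = (0x21, 0x22, 0x23, 0x24)
    cases = mismatches = 0
    for src in range(3):
        for dests in product(range(3), repeat=4):
            for valids in product([False, True], repeat=4):
                cases += 1
                if (parallel_rename(src, dests, valids, phys, 0x10)
                        != sequential_rename(src, dests, valids, phys, 0x10)):
                    mismatches += 1
    return cases, mismatches
```

Calling `check_all()` covers 3 source registers, 3^4 destination assignments and 2^4 valid-flag patterns, and reports no mismatches for this model.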
1.6 renaming procedure for UOP5
The 2 source operands of UOP5 query the RAT table according to the architectural register number of each source operand. It is also necessary to determine whether the architectural register numbers of the 2 source operands of UOP5 are the same as the architectural register numbers of the destination registers of UOP4, UOP3, UOP2, UOP1 and UOP0, i.e., whether R_5_0i+1 and R_5_1i+1 are the same as R_4_2i+1, R_3_2i+1, R_2_2i+1, R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP5 is the same as that of the destination register of UOP4, UOP3, UOP2, UOP1 or UOP0, the physical register to which that source operand maps is the physical register allocated to the destination register of UOP4, UOP3, UOP2, UOP1 or UOP0. If a source register of UOP5 has a RAW correlation with UOP4, UOP3, UOP2, UOP1 and UOP0, the source register takes the physical register number allocated to the destination register of UOP4, according to the priority order.
The source registers of UOP5 are handled under 6 cases: 1, assume that UOP5 has no RAW correlation with UOP4, UOP3, UOP2, UOP1 or UOP0; 2, assume that UOP5 has a RAW correlation with UOP0 and no RAW correlation with UOP4, UOP3, UOP2 or UOP1; 3, assume that UOP5 has a RAW correlation with UOP1 and no RAW correlation with UOP4, UOP3 or UOP2; 4, assume that UOP5 has a RAW correlation with UOP2 and no RAW correlation with UOP4 or UOP3; 5, assume that UOP5 has a RAW correlation with UOP3 and no RAW correlation with UOP4; 6, assume that UOP5 has a RAW correlation with UOP4. In stage 1 of renaming, 6 candidate physical registers are obtained for each source register while the correlation of UOP5 with UOP4, UOP3, UOP2, UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 6 physical register numbers of each source register according to the correlation logic.
1.6.1 Assume that UOP5 has no RAW correlation with UOP4, UOP3, UOP2, UOP1 or UOP0
When UOP5 has no RAW correlation with UOP4, UOP3, UOP2, UOP1 or UOP0, the mapping process of UOP5 is similar to that of UOP0. Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_A, is as follows:
PHY_R_5_0i+1_A=({Q{(PRVi_1D[0]&(R_5_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_5_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_5_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_5_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_5_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_5_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_5_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_5_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_5_0i+1][Q-1:0])
PHY_VAL_5_0i+1=VAL_5_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_A, is as follows:
PHY_R_5_1i+1_A=({Q{(PRVi_1D[0]&(R_5_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_5_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_5_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_5_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_5_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_5_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_5_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_5_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_5_1i+1][Q-1:0])
PHY_VAL_5_1i+1=VAL_5_1i+1
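As an illustration only, the candidate-A expression above can be mirrored term by term in Python. The sketch assumes that PRVi_1D holds one valid bit per UOP of the previous cycle and that the RAT can be modeled as a list indexed by architectural register number; the function name `candidate_a` and its parameter names are hypothetical and do not come from the patent.

```python
def candidate_a(src, prv, prev_dst, prev_phy, rat):
    """Stage-1 candidate A for one source operand (no same-cycle RAW assumed).

    src      : architectural register number of the source operand
    prv      : assumed per-UOP valid bits of the previous cycle (PRVi_1D)
    prev_dst : previous-cycle destination architectural numbers (R_k_2i_1D)
    prev_phy : previous-cycle destination physical numbers (PHY_R_k_2i_1D)
    rat      : list modelling the RAT, indexed by architectural number
    """
    result = 0
    for k in range(len(prv)):
        # Term {Q{PRV[k] & (src == R_k_2_1D)}} & PHY_R_k_2_1D
        if prv[k] and src == prev_dst[k]:
            result |= prev_phy[k]
    # Term {Q{~(|PRV)}} & RAT[src]: as written, gated by "no PRV bit set"
    if not any(prv):
        result |= rat[src]
    return result
```

With no previous-cycle UOP valid, the function returns the plain RAT entry; with a valid matching previous-cycle destination, it returns that destination's physical register number.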
1.6.2 Assume that UOP5 has a RAW correlation with UOP0 and none with UOP4, UOP3, UOP2, UOP1
When UOP5 has a RAW correlation with UOP0 and none with UOP4, UOP3, UOP2 or UOP1, i.e. the destination register of UOP0 has the same architectural register number as the source register of UOP5 while the destination registers of UOP4, UOP3, UOP2 and UOP1 do not, the physical register number of the UOP5 source register is the physical register number newly allocated to the UOP0 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_B, is as follows:
PHY_R_5_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_B, is as follows:
PHY_R_5_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.6.3 Assume that UOP5 has a RAW correlation with UOP1 and none with UOP4, UOP3, UOP2
When UOP5 has a RAW correlation with UOP1 and none with UOP4, UOP3 or UOP2, i.e. the destination register of UOP1 has the same architectural register number as the source register of UOP5 while the destination registers of UOP4, UOP3 and UOP2 do not, the physical register number of the UOP5 source register is the physical register number newly allocated to the UOP1 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_C, is as follows:
PHY_R_5_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_C, is as follows:
PHY_R_5_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.6.4 Assume that UOP5 has a RAW correlation with UOP2 and none with UOP4, UOP3
When UOP5 has a RAW correlation with UOP2 and none with UOP4 or UOP3, i.e. the destination register of UOP2 has the same architectural register number as the source register of UOP5 while the destination registers of UOP4 and UOP3 do not, the physical register number of the UOP5 source register is the physical register number newly allocated to the UOP2 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_D, is as follows:
PHY_R_5_0i+1_D=PHY_R_2_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_D, is as follows:
PHY_R_5_1i+1_D=PHY_R_2_2i+1
1.6.5 Assume that UOP5 has a RAW correlation with UOP3 and none with UOP4
When UOP5 has a RAW correlation with UOP3 and none with UOP4, i.e. the destination register of UOP3 has the same architectural register number as the source register of UOP5 while the destination register of UOP4 does not, the physical register number of the UOP5 source register is the physical register number newly allocated to the UOP3 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_E, is as follows:
PHY_R_5_0i+1_E=PHY_R_3_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_E, is as follows:
PHY_R_5_1i+1_E=PHY_R_3_2i+1
1.6.6 Assume that UOP5 has a RAW correlation with UOP4
When UOP5 has a RAW correlation with UOP4, i.e. the destination register of UOP4 has the same architectural register number as the source register of UOP5, the physical register number of the UOP5 source register is the physical register number newly allocated to the UOP4 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_5_0i+1_F, is as follows:
PHY_R_5_0i+1_F=PHY_R_4_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_5_1i+1_F, is as follows:
PHY_R_5_1i+1_F=PHY_R_4_2i+1
1.6.7 Logic for determining RAW correlation between UOP5 and UOP4, UOP3, UOP2, UOP1 and UOP0
To determine that UOP5 has a RAW correlation with UOP0 and none with UOP4, UOP3, UOP2 or UOP1, it is only necessary to check that the source register number of UOP5 is the same as the destination register number of UOP0 and is not the same as the destination register numbers of UOP4, UOP3, UOP2 and UOP1. The determination logic expressions for the 2 source registers are as follows:
The selection logic of the 1st source operand, CMP_R_5_0i+1[4:0], is as follows:
CMP_R_5_0i+1[0]=((R_5_0i+1==R_0_2i+1)&VAL_5_0&VAL_0_2)&
(~((R_5_0i+1==R_1_2i+1)&VAL_5_0&VAL_1_2))&
(~((R_5_0i+1==R_2_2i+1)&VAL_5_0&VAL_2_2))&
(~((R_5_0i+1==R_3_2i+1)&VAL_5_0&VAL_3_2))&
(~((R_5_0i+1==R_4_2i+1)&VAL_5_0&VAL_4_2))
CMP_R_5_0i+1[1]=((R_5_0i+1==R_1_2i+1)&VAL_5_0&VAL_1_2)&
(~((R_5_0i+1==R_2_2i+1)&VAL_5_0&VAL_2_2))&
(~((R_5_0i+1==R_3_2i+1)&VAL_5_0&VAL_3_2))&
(~((R_5_0i+1==R_4_2i+1)&VAL_5_0&VAL_4_2))
CMP_R_5_0i+1[2]=((R_5_0i+1==R_2_2i+1)&VAL_5_0&VAL_2_2)&
(~((R_5_0i+1==R_3_2i+1)&VAL_5_0&VAL_3_2))&
(~((R_5_0i+1==R_4_2i+1)&VAL_5_0&VAL_4_2))
CMP_R_5_0i+1[3]=((R_5_0i+1==R_3_2i+1)&VAL_5_0&VAL_3_2)&
(~((R_5_0i+1==R_4_2i+1)&VAL_5_0&VAL_4_2))
CMP_R_5_0i+1[4]=((R_5_0i+1==R_4_2i+1)&VAL_5_0&VAL_4_2)
The selection logic of the 2nd source operand, CMP_R_5_1i+1[4:0], is as follows:
CMP_R_5_1i+1[0]=((R_5_1i+1==R_0_2i+1)&VAL_5_1&VAL_0_2)&
(~((R_5_1i+1==R_1_2i+1)&VAL_5_1&VAL_1_2))&
(~((R_5_1i+1==R_2_2i+1)&VAL_5_1&VAL_2_2))&
(~((R_5_1i+1==R_3_2i+1)&VAL_5_1&VAL_3_2))&
(~((R_5_1i+1==R_4_2i+1)&VAL_5_1&VAL_4_2))
CMP_R_5_1i+1[1]=((R_5_1i+1==R_1_2i+1)&VAL_5_1&VAL_1_2)&
(~((R_5_1i+1==R_2_2i+1)&VAL_5_1&VAL_2_2))&
(~((R_5_1i+1==R_3_2i+1)&VAL_5_1&VAL_3_2))&
(~((R_5_1i+1==R_4_2i+1)&VAL_5_1&VAL_4_2))
CMP_R_5_1i+1[2]=((R_5_1i+1==R_2_2i+1)&VAL_5_1&VAL_2_2)&
(~((R_5_1i+1==R_3_2i+1)&VAL_5_1&VAL_3_2))&
(~((R_5_1i+1==R_4_2i+1)&VAL_5_1&VAL_4_2))
CMP_R_5_1i+1[3]=((R_5_1i+1==R_3_2i+1)&VAL_5_1&VAL_3_2)&
(~((R_5_1i+1==R_4_2i+1)&VAL_5_1&VAL_4_2))
CMP_R_5_1i+1[4]=((R_5_1i+1==R_4_2i+1)&VAL_5_1&VAL_4_2)
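The CMP expressions above all follow one pattern: bit k is set when the source matches the destination of UOPk and matches no younger UOP in the group. A minimal Python sketch of that pattern follows; the function name `cmp_vector` and the list-based signature are assumptions made for illustration, not names from the patent.

```python
def cmp_vector(src, src_valid, dst, dst_valid):
    """Priority comparison vector for one source operand.

    dst[k] / dst_valid[k] describe the destination register of older UOP k
    (k = 0 is the oldest UOP in the group). Bit k of the result is set only
    if UOP k matches and no younger UOP (index > k) also matches.
    """
    n = len(dst)
    # Raw match terms: (R_src == R_k_2) & VAL_src & VAL_k_2
    match = [src_valid and dst_valid[k] and src == dst[k] for k in range(n)]
    # Mask each match with the negation of all younger matches
    return [match[k] and not any(match[k + 1:]) for k in range(n)]
```

For UOP5 the lists have 5 entries (UOP0 to UOP4), which corresponds to CMP_R_5_0i+1[4:0].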
Stage 2 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
PHY_R_5_0i+1_A_1D<=PHY_R_5_0i+1_A
PHY_R_5_0i+1_B_1D<=PHY_R_5_0i+1_B
PHY_R_5_0i+1_C_1D<=PHY_R_5_0i+1_C
PHY_R_5_0i+1_D_1D<=PHY_R_5_0i+1_D
PHY_R_5_0i+1_E_1D<=PHY_R_5_0i+1_E
PHY_R_5_0i+1_F_1D<=PHY_R_5_0i+1_F
PHY_R_5_1i+1_A_1D<=PHY_R_5_1i+1_A
PHY_R_5_1i+1_B_1D<=PHY_R_5_1i+1_B
PHY_R_5_1i+1_C_1D<=PHY_R_5_1i+1_C
PHY_R_5_1i+1_D_1D<=PHY_R_5_1i+1_D
PHY_R_5_1i+1_E_1D<=PHY_R_5_1i+1_E
PHY_R_5_1i+1_F_1D<=PHY_R_5_1i+1_F
CMP_R_5_0i+1_1D<=CMP_R_5_0i+1
CMP_R_5_1i+1_1D<=CMP_R_5_1i+1
The final physical register of the 1st source operand, PHY_R_5_0i+1_1D, is selected from the candidates according to the selection logic:
PHY_R_5_0i+1_1D=({Q{(~(|CMP_R_5_0i+1_1D))}}&PHY_R_5_0i+1_A_1D)|
({Q{CMP_R_5_0i+1_1D[0]}}&PHY_R_5_0i+1_B_1D)|
({Q{CMP_R_5_0i+1_1D[1]}}&PHY_R_5_0i+1_C_1D)|
({Q{CMP_R_5_0i+1_1D[2]}}&PHY_R_5_0i+1_D_1D)|
({Q{CMP_R_5_0i+1_1D[3]}}&PHY_R_5_0i+1_E_1D)|
({Q{CMP_R_5_0i+1_1D[4]}}&PHY_R_5_0i+1_F_1D)
The final physical register of the 2nd source operand, PHY_R_5_1i+1_1D, is selected from the candidates according to the selection logic:
PHY_R_5_1i+1_1D=({Q{(~(|CMP_R_5_1i+1_1D))}}&PHY_R_5_1i+1_A_1D)|
({Q{CMP_R_5_1i+1_1D[0]}}&PHY_R_5_1i+1_B_1D)|
({Q{CMP_R_5_1i+1_1D[1]}}&PHY_R_5_1i+1_C_1D)|
({Q{CMP_R_5_1i+1_1D[2]}}&PHY_R_5_1i+1_D_1D)|
({Q{CMP_R_5_1i+1_1D[3]}}&PHY_R_5_1i+1_E_1D)|
({Q{CMP_R_5_1i+1_1D[4]}}&PHY_R_5_1i+1_F_1D)
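The stage-2 expressions above form an AND-OR multiplexer: each select bit is replicated to Q bits, ANDed with its candidate, and the terms are ORed together. The following Python sketch models this with integers; the width `Q = 7` and the names `replicate` and `select_final` are assumptions chosen for illustration.

```python
Q = 7  # assumed width of a physical register number, for illustration only


def replicate(bit, width=Q):
    """Model of {Q{bit}}: all ones if bit is set, otherwise zero."""
    return (1 << width) - 1 if bit else 0


def select_final(cands, cmp_bits):
    """AND-OR selection of the final physical register number.

    cands[0] is candidate A; cands[k + 1] is the candidate paired with
    cmp_bits[k] (B, C, D, ...). cmp_bits is expected to be one-hot or zero.
    """
    # Default term: candidate A when no comparison bit is set
    result = replicate(not any(cmp_bits)) & cands[0]
    for k, bit in enumerate(cmp_bits):
        result |= replicate(bit) & cands[k + 1]
    return result
```

Because at most one select term is non-zero, the OR of all terms simply passes the chosen candidate through.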
PHY_VAL_5_0i+1_1D<=PHY_VAL_5_0i+1
PHY_VAL_5_1i+1_1D<=PHY_VAL_5_1i+1
The logical expression of the destination register is as follows:
PHY_R_5_2i+1_1D<=PHY_R_5_2i+1
PHY_VAL_5_2i+1_1D<=PHY_VAL_5_2i+1
R_5_2i_1D<=R_5_2i
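The `<=` lines above read as clocked transfers in which stage-1 values are captured into their `_1D` copies for use in stage 2. Assuming that reading, a small two-phase register model in Python shows why stage 2 only sees the captured values after the clock edge. The class `PipelineRegs` and its method names are hypothetical.

```python
class PipelineRegs:
    """Toy model of the stage-1 to stage-2 pipeline registers (_1D copies)."""

    def __init__(self):
        self.q = {}    # registered values visible to stage 2
        self._d = {}   # values sampled during the current cycle

    def sample(self, **signals):
        # Stage 1 drives the register inputs; outputs do not change yet
        self._d.update(signals)

    def clock(self):
        # Clock edge: all sampled inputs become visible at once
        self.q.update(self._d)
        self._d = {}
```

A stage-1 model would call `sample(...)` with the candidates and CMP vectors, and a stage-2 model would read `q` after `clock()`.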
1.7 Renaming procedure for UOP6
The 2 source operands of UOP6 query the RAT according to their architectural register numbers. At the same time, it must be determined whether the architectural register number of each of the 2 source operands of UOP6 is the same as the architectural register number of the destination register of UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, that is, whether R_6_0i+1 and R_6_1i+1 are the same as R_5_2i+1, R_4_2i+1, R_3_2i+1, R_2_2i+1, R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP6 is the same as that of the destination register of UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, the source operand maps to the physical register corresponding to that destination register. If a source register of UOP6 has a RAW correlation with all of UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0, it takes the physical register number corresponding to the destination register of UOP5, in accordance with the priority order.
The renaming of the source registers of UOP6 is divided into 7 cases: (1) assume that UOP6 has no RAW correlation with UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0; (2) assume that UOP6 has a RAW correlation with UOP0 and none with UOP5, UOP4, UOP3, UOP2, UOP1; (3) assume that UOP6 has a RAW correlation with UOP1 and none with UOP5, UOP4, UOP3, UOP2; (4) assume that UOP6 has a RAW correlation with UOP2 and none with UOP5, UOP4, UOP3; (5) assume that UOP6 has a RAW correlation with UOP3 and none with UOP5, UOP4; (6) assume that UOP6 has a RAW correlation with UOP4 and none with UOP5; (7) assume that UOP6 has a RAW correlation with UOP5. In stage 1 of renaming, 7 candidate physical registers are obtained for each source register while the correlation of UOP6 with UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 7 candidate physical register numbers of each source register according to the correlation logic.
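The case lists for UOP5 and UOP6 follow the same pattern, so the hypotheses for UOPn can be generated mechanically: one "no correlation" case plus one case per older UOP, each excluding all younger producers. The Python helper below is a sketch of that enumeration; the name `hypotheses` and the tuple layout are assumptions made for illustration.

```python
def hypotheses(n):
    """Enumerate the RAW hypotheses for UOPn (n older UOPs in the group).

    Returns a list of (producer, must_not_match) pairs, where producer is
    the index of the older UOP assumed to supply the value (None for the
    no-correlation case) and must_not_match lists the younger UOPs that
    must not match, from youngest to oldest.
    """
    # Case 1: no RAW correlation with any older UOP in the group
    cases = [(None, list(range(n - 1, -1, -1)))]
    # Cases 2..n+1: correlation with UOPk and none with UOPk+1..UOPn-1
    for k in range(n):
        cases.append((k, list(range(n - 1, k, -1))))
    return cases
```

For example, `hypotheses(6)` yields 7 entries, matching the 7 cases and the 7 candidate physical registers (A to G) described for UOP6.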
1.7.1 Assume that UOP6 has no RAW correlation with UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0
When UOP6 has no RAW correlation with UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, the mapping process of UOP6 is similar to that of UOP0. Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_A, is as follows:
PHY_R_6_0i+1_A=({Q{(PRVi_1D[0]&(R_6_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_6_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_6_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_6_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_6_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_6_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_6_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_6_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_6_0i+1][Q-1:0])
PHY_VAL_6_0i+1=VAL_6_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_A, is as follows:
PHY_R_6_1i+1_A=({Q{(PRVi_1D[0]&(R_6_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_6_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_6_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_6_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_6_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_6_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_6_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_6_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_6_1i+1][Q-1:0])
PHY_VAL_6_1i+1=VAL_6_1i+1
1.7.2 Assume that UOP6 has a RAW correlation with UOP0 and none with UOP5, UOP4, UOP3, UOP2, UOP1
When UOP6 has a RAW correlation with UOP0 and none with UOP5, UOP4, UOP3, UOP2 or UOP1, i.e. the destination register of UOP0 has the same architectural register number as the source register of UOP6 while the destination registers of UOP5, UOP4, UOP3, UOP2 and UOP1 do not, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP0 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_B, is as follows:
PHY_R_6_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_B, is as follows:
PHY_R_6_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.7.3 Assume that UOP6 has a RAW correlation with UOP1 and none with UOP5, UOP4, UOP3, UOP2
When UOP6 has a RAW correlation with UOP1 and none with UOP5, UOP4, UOP3 or UOP2, i.e. the destination register of UOP1 has the same architectural register number as the source register of UOP6 while the destination registers of UOP5, UOP4, UOP3 and UOP2 do not, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP1 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_C, is as follows:
PHY_R_6_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_C, is as follows:
PHY_R_6_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.7.4 Assume that UOP6 has a RAW correlation with UOP2 and none with UOP5, UOP4, UOP3
When UOP6 has a RAW correlation with UOP2 and none with UOP5, UOP4 or UOP3, i.e. the destination register of UOP2 has the same architectural register number as the source register of UOP6 while the destination registers of UOP5, UOP4 and UOP3 do not, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP2 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_D, is as follows:
PHY_R_6_0i+1_D=PHY_R_2_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_D, is as follows:
PHY_R_6_1i+1_D=PHY_R_2_2i+1
1.7.5 Assume that UOP6 has a RAW correlation with UOP3 and none with UOP5, UOP4
When UOP6 has a RAW correlation with UOP3 and none with UOP5 or UOP4, i.e. the destination register of UOP3 has the same architectural register number as the source register of UOP6 while the destination registers of UOP5 and UOP4 do not, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP3 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_E, is as follows:
PHY_R_6_0i+1_E=PHY_R_3_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_E, is as follows:
PHY_R_6_1i+1_E=PHY_R_3_2i+1
1.7.6 Assume that UOP6 has a RAW correlation with UOP4 and none with UOP5
When UOP6 has a RAW correlation with UOP4 and none with UOP5, i.e. the destination register of UOP4 has the same architectural register number as the source register of UOP6 while the destination register of UOP5 does not, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP4 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_F, is as follows:
PHY_R_6_0i+1_F=PHY_R_4_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_F, is as follows:
PHY_R_6_1i+1_F=PHY_R_4_2i+1
1.7.7 Assume that UOP6 has a RAW correlation with UOP5
When UOP6 has a RAW correlation with UOP5, i.e. the destination register of UOP5 has the same architectural register number as the source register of UOP6, the physical register number of the UOP6 source register is the physical register number newly allocated to the UOP5 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_6_0i+1_G, is as follows:
PHY_R_6_0i+1_G=PHY_R_5_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_6_1i+1_G, is as follows:
PHY_R_6_1i+1_G=PHY_R_5_2i+1
1.7.8 Logic for determining RAW correlation between UOP6 and UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0
To determine that UOP6 has a RAW correlation with UOP0 and none with UOP5, UOP4, UOP3, UOP2 or UOP1, it is only necessary to check that the source register number of UOP6 is the same as the destination register number of UOP0 and is not the same as the destination register numbers of UOP5, UOP4, UOP3, UOP2 and UOP1. The determination logic expressions for the 2 source registers are as follows:
The selection logic of the 1st source operand, CMP_R_6_0i+1[5:0], is as follows:
CMP_R_6_0i+1[0]=((R_6_0i+1==R_0_2i+1)&VAL_6_0&VAL_0_2)&
(~((R_6_0i+1==R_1_2i+1)&VAL_6_0&VAL_1_2))&
(~((R_6_0i+1==R_2_2i+1)&VAL_6_0&VAL_2_2))&
(~((R_6_0i+1==R_3_2i+1)&VAL_6_0&VAL_3_2))&
(~((R_6_0i+1==R_4_2i+1)&VAL_6_0&VAL_4_2))&
(~((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2))
CMP_R_6_0i+1[1]=((R_6_0i+1==R_1_2i+1)&VAL_6_0&VAL_1_2)&
(~((R_6_0i+1==R_2_2i+1)&VAL_6_0&VAL_2_2))&
(~((R_6_0i+1==R_3_2i+1)&VAL_6_0&VAL_3_2))&
(~((R_6_0i+1==R_4_2i+1)&VAL_6_0&VAL_4_2))&
(~((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2))
CMP_R_6_0i+1[2]=((R_6_0i+1==R_2_2i+1)&VAL_6_0&VAL_2_2)&
(~((R_6_0i+1==R_3_2i+1)&VAL_6_0&VAL_3_2))&
(~((R_6_0i+1==R_4_2i+1)&VAL_6_0&VAL_4_2))&
(~((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2))
CMP_R_6_0i+1[3]=((R_6_0i+1==R_3_2i+1)&VAL_6_0&VAL_3_2)&
(~((R_6_0i+1==R_4_2i+1)&VAL_6_0&VAL_4_2))&
(~((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2))
CMP_R_6_0i+1[4]=((R_6_0i+1==R_4_2i+1)&VAL_6_0&VAL_4_2)&
(~((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2))
CMP_R_6_0i+1[5]=((R_6_0i+1==R_5_2i+1)&VAL_6_0&VAL_5_2)
The selection logic of the 2nd source operand, CMP_R_6_1i+1[5:0], is as follows:
CMP_R_6_1i+1[0]=((R_6_1i+1==R_0_2i+1)&VAL_6_1&VAL_0_2)&
(~((R_6_1i+1==R_1_2i+1)&VAL_6_1&VAL_1_2))&
(~((R_6_1i+1==R_2_2i+1)&VAL_6_1&VAL_2_2))&
(~((R_6_1i+1==R_3_2i+1)&VAL_6_1&VAL_3_2))&
(~((R_6_1i+1==R_4_2i+1)&VAL_6_1&VAL_4_2))&
(~((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2))
CMP_R_6_1i+1[1]=((R_6_1i+1==R_1_2i+1)&VAL_6_1&VAL_1_2)&
(~((R_6_1i+1==R_2_2i+1)&VAL_6_1&VAL_2_2))&
(~((R_6_1i+1==R_3_2i+1)&VAL_6_1&VAL_3_2))&
(~((R_6_1i+1==R_4_2i+1)&VAL_6_1&VAL_4_2))&
(~((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2))
CMP_R_6_1i+1[2]=((R_6_1i+1==R_2_2i+1)&VAL_6_1&VAL_2_2)&
(~((R_6_1i+1==R_3_2i+1)&VAL_6_1&VAL_3_2))&
(~((R_6_1i+1==R_4_2i+1)&VAL_6_1&VAL_4_2))&
(~((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2))
CMP_R_6_1i+1[3]=((R_6_1i+1==R_3_2i+1)&VAL_6_1&VAL_3_2)&
(~((R_6_1i+1==R_4_2i+1)&VAL_6_1&VAL_4_2))&
(~((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2))
CMP_R_6_1i+1[4]=((R_6_1i+1==R_4_2i+1)&VAL_6_1&VAL_4_2)&
(~((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2))
CMP_R_6_1i+1[5]=((R_6_1i+1==R_5_2i+1)&VAL_6_1&VAL_5_2)
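One property worth checking for the CMP expressions above is that the vector is never more than one-hot, and that its set bit always points at the youngest matching UOP. The Python sketch below checks this exhaustively over all match patterns; it works on raw match bits rather than register numbers, and the names `cmp_from_matches` and `check_one_hot` are hypothetical.

```python
from itertools import product


def cmp_from_matches(match):
    """Apply the CMP masking pattern to a list of raw match bits."""
    n = len(match)
    return [match[k] and not any(match[k + 1:]) for k in range(n)]


def check_one_hot(n):
    """Return True if, for every one of the 2**n match patterns, the CMP
    vector is all zeros (no match) or one-hot at the youngest match."""
    for match in product([False, True], repeat=n):
        bits = cmp_from_matches(list(match))
        if any(match):
            youngest = max(k for k in range(n) if match[k])
            if bits != [k == youngest for k in range(n)]:
                return False
        elif any(bits):
            return False
    return True
```

For UOP6 there are 6 older UOPs, so `check_one_hot(6)` walks all 64 patterns.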
Stage 2 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
PHY_R_6_0i+1_A_1D<=PHY_R_6_0i+1_A
PHY_R_6_0i+1_B_1D<=PHY_R_6_0i+1_B
PHY_R_6_0i+1_C_1D<=PHY_R_6_0i+1_C
PHY_R_6_0i+1_D_1D<=PHY_R_6_0i+1_D
PHY_R_6_0i+1_E_1D<=PHY_R_6_0i+1_E
PHY_R_6_0i+1_F_1D<=PHY_R_6_0i+1_F
PHY_R_6_0i+1_G_1D<=PHY_R_6_0i+1_G
PHY_R_6_1i+1_A_1D<=PHY_R_6_1i+1_A
PHY_R_6_1i+1_B_1D<=PHY_R_6_1i+1_B
PHY_R_6_1i+1_C_1D<=PHY_R_6_1i+1_C
PHY_R_6_1i+1_D_1D<=PHY_R_6_1i+1_D
PHY_R_6_1i+1_E_1D<=PHY_R_6_1i+1_E
PHY_R_6_1i+1_F_1D<=PHY_R_6_1i+1_F
PHY_R_6_1i+1_G_1D<=PHY_R_6_1i+1_G
CMP_R_6_0i+1_1D<=CMP_R_6_0i+1
CMP_R_6_1i+1_1D<=CMP_R_6_1i+1
The final physical register of the 1st source operand, PHY_R_6_0i+1_1D, is selected from the candidates according to the selection logic:
PHY_R_6_0i+1_1D=({Q{(~(|CMP_R_6_0i+1_1D))}}&PHY_R_6_0i+1_A_1D)|
({Q{CMP_R_6_0i+1_1D[0]}}&PHY_R_6_0i+1_B_1D)|
({Q{CMP_R_6_0i+1_1D[1]}}&PHY_R_6_0i+1_C_1D)|
({Q{CMP_R_6_0i+1_1D[2]}}&PHY_R_6_0i+1_D_1D)|
({Q{CMP_R_6_0i+1_1D[3]}}&PHY_R_6_0i+1_E_1D)|
({Q{CMP_R_6_0i+1_1D[4]}}&PHY_R_6_0i+1_F_1D)|
({Q{CMP_R_6_0i+1_1D[5]}}&PHY_R_6_0i+1_G_1D)
The final physical register of the 2nd source operand, PHY_R_6_1i+1_1D, is selected from the candidates according to the selection logic:
PHY_R_6_1i+1_1D=({Q{(~(|CMP_R_6_1i+1_1D))}}&PHY_R_6_1i+1_A_1D)|
({Q{CMP_R_6_1i+1_1D[0]}}&PHY_R_6_1i+1_B_1D)|
({Q{CMP_R_6_1i+1_1D[1]}}&PHY_R_6_1i+1_C_1D)|
({Q{CMP_R_6_1i+1_1D[2]}}&PHY_R_6_1i+1_D_1D)|
({Q{CMP_R_6_1i+1_1D[3]}}&PHY_R_6_1i+1_E_1D)|
({Q{CMP_R_6_1i+1_1D[4]}}&PHY_R_6_1i+1_F_1D)|
({Q{CMP_R_6_1i+1_1D[5]}}&PHY_R_6_1i+1_G_1D)
PHY_VAL_6_0i+1_1D<=PHY_VAL_6_0i+1
PHY_VAL_6_1i+1_1D<=PHY_VAL_6_1i+1
The logical expression of the destination register is as follows:
PHY_R_6_2i+1_1D<=PHY_R_6_2i+1
PHY_VAL_6_2i+1_1D<=PHY_VAL_6_2i+1
R_6_2i_1D<=R_6_2i
1.8 Renaming procedure for UOP7
The 2 source operands of UOP7 query the RAT according to their architectural register numbers. At the same time, it must be determined whether the architectural register number of each of the 2 source operands of UOP7 is the same as the architectural register number of the destination register of UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, that is, whether R_7_0i+1 and R_7_1i+1 are the same as R_6_2i+1, R_5_2i+1, R_4_2i+1, R_3_2i+1, R_2_2i+1, R_1_2i+1 or R_0_2i+1. If the architectural register number of a source operand of UOP7 is the same as that of the destination register of UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, the source operand maps to the physical register corresponding to that destination register. If a source register of UOP7 has a RAW correlation with all of UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0, it takes the physical register number corresponding to the destination register of UOP6, in accordance with the priority order.
The renaming of the source registers of UOP7 is divided into 8 cases: (1) assume that UOP7 has no RAW correlation with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0; (2) assume that UOP7 has a RAW correlation with UOP0 and none with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1; (3) assume that UOP7 has a RAW correlation with UOP1 and none with UOP6, UOP5, UOP4, UOP3, UOP2; (4) assume that UOP7 has a RAW correlation with UOP2 and none with UOP6, UOP5, UOP4, UOP3; (5) assume that UOP7 has a RAW correlation with UOP3 and none with UOP6, UOP5, UOP4; (6) assume that UOP7 has a RAW correlation with UOP4 and none with UOP6, UOP5; (7) assume that UOP7 has a RAW correlation with UOP5 and none with UOP6; (8) assume that UOP7 has a RAW correlation with UOP6. In stage 1 of renaming, 8 candidate physical registers are obtained for each source register while the correlation of UOP7 with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0 is determined in parallel; in stage 2 of renaming, the correct result is selected from the 8 candidate physical register numbers of each source register according to the correlation logic.
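To see that choosing among pre-computed candidates gives the same mapping as renaming the UOPs one after another, the following Python sketch compares the two on an 8-UOP group. It makes several simplifying assumptions that are not taken from the patent: every UOP is valid and has 2 sources and 1 destination, no previous-cycle UOP is in flight (so candidate A is a plain RAT lookup), and newly allocated physical registers come from a hypothetical list `free`.

```python
def rename_reference(group, rat, free):
    """Sequential reference: rename UOPs one by one, updating a RAT copy."""
    rat = list(rat)
    out = []
    for (s0, s1, d), p in zip(group, free):
        out.append((rat[s0], rat[s1], p))
        rat[d] = p  # later UOPs see the new mapping
    return out


def rename_two_stage(group, rat, free):
    """Candidate-based rename: build all candidates, then select."""
    out = []
    for n, (s0, s1, _d) in enumerate(group):
        srcs = []
        for s in (s0, s1):
            # Stage 1: candidate A from the RAT, then one candidate per
            # older UOP in the group (its newly allocated register)
            cands = [rat[s]] + [free[k] for k in range(n)]
            match = [group[k][2] == s for k in range(n)]
            cmp_bits = [match[k] and not any(match[k + 1:]) for k in range(n)]
            # Stage 2: pick the candidate flagged by the one-hot vector
            sel = cands[0]
            for k, bit in enumerate(cmp_bits):
                if bit:
                    sel = cands[k + 1]
            srcs.append(sel)
        out.append((srcs[0], srcs[1], free[n]))
    return out
```

Each tuple in `group` is `(src0, src1, dst)` in architectural register numbers, and each output tuple holds the corresponding physical register numbers.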
1.8.1 Assume that UOP7 has no RAW correlation with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0
When UOP7 has no RAW correlation with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 or UOP0, the mapping process of UOP7 is similar to that of UOP0. Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_7_0i+1_A, is as follows:
PHY_R_7_0i+1_A=({Q{(PRVi_1D[0]&(R_7_0i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_7_0i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_7_0i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_7_0i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_7_0i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_7_0i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_7_0i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_7_0i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_7_0i+1][Q-1:0])
PHY_VAL_7_0i+1=VAL_7_0i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_7_1i+1_A, is as follows:
PHY_R_7_1i+1_A=({Q{(PRVi_1D[0]&(R_7_1i+1==R_0_2i_1D))}}&PHY_R_0_2i_1D)|
({Q{(PRVi_1D[1]&(R_7_1i+1==R_1_2i_1D))}}&PHY_R_1_2i_1D)|
({Q{(PRVi_1D[2]&(R_7_1i+1==R_2_2i_1D))}}&PHY_R_2_2i_1D)|
({Q{(PRVi_1D[3]&(R_7_1i+1==R_3_2i_1D))}}&PHY_R_3_2i_1D)|
({Q{(PRVi_1D[4]&(R_7_1i+1==R_4_2i_1D))}}&PHY_R_4_2i_1D)|
({Q{(PRVi_1D[5]&(R_7_1i+1==R_5_2i_1D))}}&PHY_R_5_2i_1D)|
({Q{(PRVi_1D[6]&(R_7_1i+1==R_6_2i_1D))}}&PHY_R_6_2i_1D)|
({Q{(PRVi_1D[7]&(R_7_1i+1==R_7_2i_1D))}}&PHY_R_7_2i_1D)|
({Q{(~(|PRVi_1D))}}&RAT[R_7_1i+1][Q-1:0])
PHY_VAL_7_1i+1=VAL_7_1i+1
1.8.2 Assume that UOP7 has a RAW correlation with UOP0 and none with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1
When UOP7 has a RAW correlation with UOP0 and none with UOP6, UOP5, UOP4, UOP3, UOP2 or UOP1, i.e. the destination register of UOP0 has the same architectural register number as the source register of UOP7 while the destination registers of UOP6, UOP5, UOP4, UOP3, UOP2 and UOP1 do not, the physical register number of the UOP7 source register is the physical register number newly allocated to the UOP0 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_7_0i+1_B, is as follows:
PHY_R_7_0i+1_B=PHY_R_0_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_7_1i+1_B, is as follows:
PHY_R_7_1i+1_B=PHY_R_0_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.8.3 Assume that UOP7 has a RAW correlation with UOP1 and none with UOP6, UOP5, UOP4, UOP3, UOP2
When UOP7 has a RAW correlation with UOP1 and none with UOP6, UOP5, UOP4, UOP3 or UOP2, i.e. the destination register of UOP1 has the same architectural register number as the source register of UOP7 while the destination registers of UOP6, UOP5, UOP4, UOP3 and UOP2 do not, the physical register number of the UOP7 source register is the physical register number newly allocated to the UOP1 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_7_0i+1_C, is as follows:
PHY_R_7_0i+1_C=PHY_R_1_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_7_1i+1_C, is as follows:
PHY_R_7_1i+1_C=PHY_R_1_2i+1
The valid flag of the source register is the same as in the case where there is no RAW correlation.
1.8.4 Assume that UOP7 has a RAW correlation with UOP2 and none with UOP6, UOP5, UOP4, UOP3
When UOP7 has a RAW correlation with UOP2 and none with UOP6, UOP5, UOP4 or UOP3, i.e. the destination register of UOP2 has the same architectural register number as the source register of UOP7 while the destination registers of UOP6, UOP5, UOP4 and UOP3 do not, the physical register number of the UOP7 source register is the physical register number newly allocated to the UOP2 destination register.
Stage 1 computation that maps the 2 source operands and the destination architectural register of the cycle i+1 instructions to physical registers:
The logical expression of the 1st source operand, PHY_R_7_0i+1_D, is as follows:
PHY_R_7_0i+1_D=PHY_R_2_2i+1
Similarly, the logical expression of the 2nd source operand, PHY_R_7_1i+1_D, is as follows:
PHY_R_7_1i+1_D=PHY_R_2_2i+1
1.8.5 assume that there is a RAW correlation between UOP7 and UOP3, and that there is no RAW correlation between UOP7 and UOP6, UOP5, UOP4
When UOP7 has a RAW correlation with UOP3 and none with UOP6, UOP5, UOP4, i.e. the destination register number of UOP3 is the same as the source architectural register number of UOP7 and the destination register numbers of UOP6, UOP5, UOP4 are not, the physical register number of the UOP7 source register is the newly allocated physical register number of the UOP3 destination register.
In stage 1, for the instructions of the (i+1)-th cycle, the mapping of the 2 source operands and the destination architectural register to physical registers is calculated as follows:
The logical expression PHY_R_7_0i+1_E of the 1st source operand is as follows:
PHY_R_7_0i+1_E=PHY_R_3_2i+1
Similarly, the logical expression PHY_R_7_1i+1_E of the 2nd source operand is as follows:
PHY_R_7_1i+1_E=PHY_R_3_2i+1
1.8.6 assume that there is a RAW correlation between UOP7 and UOP4, and that there is no RAW correlation between UOP7 and UOP6, UOP5
When UOP7 has a RAW correlation with UOP4 and none with UOP6, UOP5, i.e. the destination register number of UOP4 is the same as the source architectural register number of UOP7 and the destination register numbers of UOP6, UOP5 are not, the physical register number of the UOP7 source register is the newly allocated physical register number of the UOP4 destination register.
In stage 1, for the instructions of the (i+1)-th cycle, the mapping of the 2 source operands and the destination architectural register to physical registers is calculated as follows:
The logical expression PHY_R_7_0i+1_F of the 1st source operand is as follows:
PHY_R_7_0i+1_F=PHY_R_4_2i+1
Similarly, the logical expression PHY_R_7_1i+1_F of the 2nd source operand is as follows:
PHY_R_7_1i+1_F=PHY_R_4_2i+1
1.8.7 assume that there is a RAW correlation between UOP7 and UOP5, and that there is no RAW correlation between UOP7 and UOP6
When UOP7 has a RAW correlation with UOP5 and none with UOP6, i.e. the destination register number of UOP5 is the same as the source architectural register number of UOP7 and the destination register number of UOP6 is not, the physical register number of the UOP7 source register is the newly allocated physical register number of the UOP5 destination register.
In stage 1, for the instructions of the (i+1)-th cycle, the mapping of the 2 source operands and the destination architectural register to physical registers is calculated as follows:
The logical expression PHY_R_7_0i+1_G of the 1st source operand is as follows:
PHY_R_7_0i+1_G=PHY_R_5_2i+1
Similarly, the logical expression PHY_R_7_1i+1_G of the 2nd source operand is as follows:
PHY_R_7_1i+1_G=PHY_R_5_2i+1
1.8.8 assume that there is a RAW correlation between UOP7 and UOP6
When UOP7 has a RAW correlation with UOP6, i.e. the destination register number of UOP6 is the same as the source architectural register number of UOP7, the physical register number of the UOP7 source register is the newly allocated physical register number of the UOP6 destination register.
In stage 1, for the instructions of the (i+1)-th cycle, the mapping of the 2 source operands and the destination architectural register to physical registers is calculated as follows:
The logical expression PHY_R_7_0i+1_H of the 1st source operand is as follows:
PHY_R_7_0i+1_H=PHY_R_6_2i+1
Similarly, the logical expression PHY_R_7_1i+1_H of the 2nd source operand is as follows:
PHY_R_7_1i+1_H=PHY_R_6_2i+1
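The eight hypotheses for one source operand of UOP7 (cases A through H) can be illustrated with a small Python model. This is only a sketch, not the circuit of the invention: the function name `candidates_for_uop7`, the argument `rat_phys` (standing in for the case A result, assumed here to be the physical register read from the RAT) and the list `new_dest_phys` (standing in for PHY_R_0_2 through PHY_R_6_2) are names introduced for the example.

```python
def candidates_for_uop7(rat_phys, new_dest_phys):
    """Return the 8 candidate physical registers [A, B, ..., H] for one UOP7 source.

    rat_phys      -- case A: mapping used when no older UOP in the group matches
                     (assumed here to be the RAT read result)
    new_dest_phys -- physical registers newly allocated to the destinations of
                     UOP0..UOP6 in this cycle (PHY_R_0_2 ... PHY_R_6_2)
    """
    if len(new_dest_phys) != 7:
        raise ValueError("UOP7 has exactly 7 older UOPs in an 8-wide group")
    # Case B assumes a RAW correlation with UOP0, case C with UOP1, ...,
    # case H with UOP6. No register-number comparison is needed to form
    # the candidates, so all of them can be produced in parallel.
    return [rat_phys] + list(new_dest_phys)


cands = candidates_for_uop7(rat_phys=12, new_dest_phys=[40, 41, 42, 43, 44, 45, 46])
print(dict(zip("ABCDEFGH", cands)))
```

In this model candidate H is the physical register allocated to the UOP6 destination, matching the statement of case 1.8.8.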
1.8.9 Logic for judging whether UOP7 has a RAW correlation with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1 and UOP0
To judge that UOP7 has a RAW correlation with UOP0 and none with UOP6, UOP5, UOP4, UOP3, UOP2, UOP1, the source register number of UOP7 is compared with the destination register number of UOP0 and also with the destination register numbers of UOP6, UOP5, UOP4, UOP3, UOP2, UOP1. The same applies to the other cases, each of which only needs the comparisons with the younger UOPs. The judgment logic expressions for the 2 source registers are as follows:
The logical expressions of the selection logic CMP_R_7_0i+1[6:0] for the 1st source operand are as follows:
CMP_R_7_0i+1[0]=((R_7_0i+1==R_0_2i+1)&VAL_7_0&VAL_0_2)&
(~((R_7_0i+1==R_1_2i+1)&VAL_7_0&VAL_1_2))&
(~((R_7_0i+1==R_2_2i+1)&VAL_7_0&VAL_2_2))&
(~((R_7_0i+1==R_3_2i+1)&VAL_7_0&VAL_3_2))&
(~((R_7_0i+1==R_4_2i+1)&VAL_7_0&VAL_4_2))&
(~((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2))&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[1]=((R_7_0i+1==R_1_2i+1)&VAL_7_0&VAL_1_2)&
(~((R_7_0i+1==R_2_2i+1)&VAL_7_0&VAL_2_2))&
(~((R_7_0i+1==R_3_2i+1)&VAL_7_0&VAL_3_2))&
(~((R_7_0i+1==R_4_2i+1)&VAL_7_0&VAL_4_2))&
(~((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2))&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[2]=((R_7_0i+1==R_2_2i+1)&VAL_7_0&VAL_2_2)&
(~((R_7_0i+1==R_3_2i+1)&VAL_7_0&VAL_3_2))&
(~((R_7_0i+1==R_4_2i+1)&VAL_7_0&VAL_4_2))&
(~((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2))&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[3]=((R_7_0i+1==R_3_2i+1)&VAL_7_0&VAL_3_2)&
(~((R_7_0i+1==R_4_2i+1)&VAL_7_0&VAL_4_2))&
(~((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2))&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[4]=((R_7_0i+1==R_4_2i+1)&VAL_7_0&VAL_4_2)&
(~((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2))&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[5]=((R_7_0i+1==R_5_2i+1)&VAL_7_0&VAL_5_2)&
(~((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2))
CMP_R_7_0i+1[6]=((R_7_0i+1==R_6_2i+1)&VAL_7_0&VAL_6_2)
The logical expressions of the selection logic CMP_R_7_1i+1[6:0] for the 2nd source operand are as follows:
CMP_R_7_1i+1[0]=((R_7_1i+1==R_0_2i+1)&VAL_7_1&VAL_0_2)&
(~((R_7_1i+1==R_1_2i+1)&VAL_7_1&VAL_1_2))&
(~((R_7_1i+1==R_2_2i+1)&VAL_7_1&VAL_2_2))&
(~((R_7_1i+1==R_3_2i+1)&VAL_7_1&VAL_3_2))&
(~((R_7_1i+1==R_4_2i+1)&VAL_7_1&VAL_4_2))&
(~((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2))&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[1]=((R_7_1i+1==R_1_2i+1)&VAL_7_1&VAL_1_2)&
(~((R_7_1i+1==R_2_2i+1)&VAL_7_1&VAL_2_2))&
(~((R_7_1i+1==R_3_2i+1)&VAL_7_1&VAL_3_2))&
(~((R_7_1i+1==R_4_2i+1)&VAL_7_1&VAL_4_2))&
(~((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2))&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[2]=((R_7_1i+1==R_2_2i+1)&VAL_7_1&VAL_2_2)&
(~((R_7_1i+1==R_3_2i+1)&VAL_7_1&VAL_3_2))&
(~((R_7_1i+1==R_4_2i+1)&VAL_7_1&VAL_4_2))&
(~((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2))&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[3]=((R_7_1i+1==R_3_2i+1)&VAL_7_1&VAL_3_2)&
(~((R_7_1i+1==R_4_2i+1)&VAL_7_1&VAL_4_2))&
(~((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2))&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[4]=((R_7_1i+1==R_4_2i+1)&VAL_7_1&VAL_4_2)&
(~((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2))&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[5]=((R_7_1i+1==R_5_2i+1)&VAL_7_1&VAL_5_2)&
(~((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2))
CMP_R_7_1i+1[6]=((R_7_1i+1==R_6_2i+1)&VAL_7_1&VAL_6_2)
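The CMP expressions above amount to a priority encoding in which bit k is set only when UOP k writes the source register and no younger UOP (k+1 through 6) does. A minimal Python sketch of that behaviour follows; the function name `cmp_select` and its list-based arguments are assumptions made for the example and do not appear in the original text.

```python
def cmp_select(src, src_valid, dests, dest_valid):
    """Model CMP_R_7_x[6:0] for one source operand of UOP7.

    src, src_valid    -- architectural register number and valid flag of the source
    dests, dest_valid -- destination register numbers and valid flags of UOP0..UOP6
    Returns a list of 7 bits, index k corresponding to CMP[k].
    """
    # Raw match terms: (R_7_x == R_k_2) & VAL_7_x & VAL_k_2
    match = [bool(src_valid and dest_valid[k] and src == dests[k]) for k in range(7)]
    cmp_bits = []
    for k in range(7):
        # A younger UOP with the same destination overrides UOP k.
        younger_hit = any(match[k + 1:])
        cmp_bits.append(int(match[k] and not younger_hit))
    return cmp_bits


# UOP0 and UOP2 both write register 5; only the younger one (UOP2) is selected.
bits = cmp_select(src=5, src_valid=1, dests=[5, 3, 5, 1, 2, 7, 8], dest_valid=[1] * 7)
print(bits)  # [0, 0, 1, 0, 0, 0, 0]
```

At most one bit is ever set, and an all-zero vector corresponds to the case in which UOP7 has no RAW correlation with any older UOP of the group.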
In stage 2, for the instructions of the (i+1)-th cycle, the mapping of the 2 source operands and the destination architectural register to physical registers is calculated as follows:
PHY_R_7_0i+1_A_1D<=PHY_R_7_0i+1_A
PHY_R_7_0i+1_B_1D<=PHY_R_7_0i+1_B
PHY_R_7_0i+1_C_1D<=PHY_R_7_0i+1_C
PHY_R_7_0i+1_D_1D<=PHY_R_7_0i+1_D
PHY_R_7_0i+1_E_1D<=PHY_R_7_0i+1_E
PHY_R_7_0i+1_F_1D<=PHY_R_7_0i+1_F
PHY_R_7_0i+1_G_1D<=PHY_R_7_0i+1_G
PHY_R_7_0i+1_H_1D<=PHY_R_7_0i+1_H
PHY_R_7_1i+1_A_1D<=PHY_R_7_1i+1_A
PHY_R_7_1i+1_B_1D<=PHY_R_7_1i+1_B
PHY_R_7_1i+1_C_1D<=PHY_R_7_1i+1_C
PHY_R_7_1i+1_D_1D<=PHY_R_7_1i+1_D
PHY_R_7_1i+1_E_1D<=PHY_R_7_1i+1_E
PHY_R_7_1i+1_F_1D<=PHY_R_7_1i+1_F
PHY_R_7_1i+1_G_1D<=PHY_R_7_1i+1_G
PHY_R_7_1i+1_H_1D<=PHY_R_7_1i+1_H
CMP_R_7_0i+1_1D<=CMP_R_7_0i+1
CMP_R_7_1i+1_1D<=CMP_R_7_1i+1
The final physical register PHY_R_7_0i+1_1D of the 1st source operand is the correct physical register chosen according to the selection logic:
PHY_R_7_0i+1_1D=({Q{(~(|CMP_R_7_0i+1_1D))}}&PHY_R_7_0i+1_A_1D)|
({Q{CMP_R_7_0i+1_1D[0]}}&PHY_R_7_0i+1_B_1D)|
({Q{CMP_R_7_0i+1_1D[1]}}&PHY_R_7_0i+1_C_1D)|
({Q{CMP_R_7_0i+1_1D[2]}}&PHY_R_7_0i+1_D_1D)|
({Q{CMP_R_7_0i+1_1D[3]}}&PHY_R_7_0i+1_E_1D)|
({Q{CMP_R_7_0i+1_1D[4]}}&PHY_R_7_0i+1_F_1D)|
({Q{CMP_R_7_0i+1_1D[5]}}&PHY_R_7_0i+1_G_1D)|
({Q{CMP_R_7_0i+1_1D[6]}}&PHY_R_7_0i+1_H_1D)
The final physical register PHY_R_7_1i+1_1D of the 2nd source operand is the correct physical register chosen according to the selection logic:
PHY_R_7_1i+1_1D=({Q{(~(|CMP_R_7_1i+1_1D))}}&PHY_R_7_1i+1_A_1D)|
({Q{CMP_R_7_1i+1_1D[0]}}&PHY_R_7_1i+1_B_1D)|
({Q{CMP_R_7_1i+1_1D[1]}}&PHY_R_7_1i+1_C_1D)|
({Q{CMP_R_7_1i+1_1D[2]}}&PHY_R_7_1i+1_D_1D)|
({Q{CMP_R_7_1i+1_1D[3]}}&PHY_R_7_1i+1_E_1D)|
({Q{CMP_R_7_1i+1_1D[4]}}&PHY_R_7_1i+1_F_1D)|
({Q{CMP_R_7_1i+1_1D[5]}}&PHY_R_7_1i+1_G_1D)|
({Q{CMP_R_7_1i+1_1D[6]}}&PHY_R_7_1i+1_H_1D)
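The final selection is an AND-OR multiplexer: each candidate is masked by its replicated select bit and the masked values are ORed together. The Python sketch below mimics this on integers. The width `Q = 8` is an assumed value chosen for the example (the text leaves Q symbolic), and the helper names `replicate` and `select_final` are illustrative.

```python
Q = 8  # assumed bit width of a physical register number


def replicate(bit, width=Q):
    """Model the replication {Q{bit}}: all ones if bit is set, else zero."""
    return (1 << width) - 1 if bit else 0


def select_final(cmp_bits, cands):
    """AND-OR mux over candidates [A, B, ..., H] using the 7 select bits CMP[6:0]."""
    no_hit = int(not any(cmp_bits))          # ~(|CMP): no older UOP matched
    result = replicate(no_hit) & cands[0]    # candidate A
    for k in range(7):
        result |= replicate(cmp_bits[k]) & cands[k + 1]  # candidates B..H
    return result


cands = [12, 40, 41, 42, 43, 44, 45, 46]
print(select_final([0, 0, 0, 0, 0, 0, 0], cands))  # 12
print(select_final([0, 0, 1, 0, 0, 0, 0], cands))  # 42
```

Because the select vector is one-hot or all zero, exactly one term of the OR is non-zero, so no priority chain is needed in this stage.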
PHY_VAL_7_0i+1_1D<=PHY_VAL_7_0i+1
PHY_VAL_7_1i+1_1D<=PHY_VAL_7_1i+1
The logical expression of the destination register is as follows:
PHY_R_7_2i+1_1D<=PHY_R_7_2i+1
PHY_VAL_7_2i+1_1D<=PHY_VAL_7_2i+1
R_7_2i_1D<=R_7_2i
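As a cross-check of the two-stage scheme for one source operand of UOP7, the following Python sketch compares it against a plain serial priority scan on random inputs. It is a behavioural model under stated assumptions, not the hardware: register numbers are small integers, the no-correlation candidate is taken to be a RAT read result `rat_phys`, an invalid source is assumed to fall through to that candidate, and all function names are introduced for the example.

```python
import random


def stage1(src, src_valid, rat_phys, dests, dest_valid, new_dest_phys):
    """Stage 1: form all 8 candidates and the one-hot select in parallel."""
    cands = [rat_phys] + list(new_dest_phys)
    match = [bool(src_valid and dest_valid[k] and src == dests[k]) for k in range(7)]
    cmp_bits = [int(match[k] and not any(match[k + 1:])) for k in range(7)]
    return cands, cmp_bits  # values latched into the *_1D registers


def stage2(latched, width=8):
    """Stage 2: AND-OR selection of the final physical register."""
    cands, cmp_bits = latched

    def mask(bit):
        return (1 << width) - 1 if bit else 0

    out = mask(not any(cmp_bits)) & cands[0]
    for k in range(7):
        out |= mask(cmp_bits[k]) & cands[k + 1]
    return out


def reference(src, src_valid, rat_phys, dests, dest_valid, new_dest_phys):
    """Serial priority scan from the youngest older UOP (UOP6) down to UOP0."""
    for k in range(6, -1, -1):
        if src_valid and dest_valid[k] and src == dests[k]:
            return new_dest_phys[k]
    return rat_phys


def check(trials=1000, seed=0):
    """Return True if both models agree on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        dests = [rng.randrange(4) for _ in range(7)]
        dest_valid = [rng.randrange(2) for _ in range(7)]
        new_phys = rng.sample(range(32, 64), 7)
        args = (rng.randrange(4), rng.randrange(2), rng.randrange(32),
                dests, dest_valid, new_phys)
        if stage2(stage1(*args)) != reference(*args):
            return False
    return True


print(check())
```

The serial scan corresponds to the priority-ordered comparison that the method is intended to shorten; the two-stage model reaches the same result while keeping the candidate generation free of comparisons.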
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (5)

1. A method of renaming based on instruction read-after-write dependency assumptions, comprising two stages:
stage 1: completing RAT reading and judging the correlation attributes of the registers; based on the various read-after-write correlation assumptions for the instructions, obtaining multiple renaming register mappings of each source register, and simultaneously generating, in parallel, one-hot control signals for selecting the correct renaming register;
stage 2: selecting the final renaming register of each source register according to the one-hot control signals generated in stage 1, and updating the mapping relation between architectural registers and physical registers in the RAT table;
by generating multiple renaming register results in parallel under the assumptions and selecting the final valid result with the selection signals, the number of serial logic gate levels caused by priority-ordered read-after-write comparison across 8 instructions is reduced, so that a higher main frequency is obtained under the same conditions;
the method obtains a plurality of renaming results by assuming that various read-after-write correlations exist among the instructions, and then obtains the final result by means of the selection signals.
2. The method according to claim 1, wherein the method does not need to directly determine the read-after-write correlation between instructions, thereby eliminating the long combinational logic path caused by the priority relationship between instructions, and the assumption is not limited to a granularity of 1 instruction and is applicable to any instruction granularity.
3. The method of claim 1, wherein multiple renamed register maps for each source register are obtained in stage 1 and implemented using logic expressions that determine renaming for each UOP, the logic expressions including logic expression implementations for each hypothesis for each instruction.
4. The method according to claim 1, wherein the method is applicable to all processor architectures such as X86 instruction set CPU, RISC instruction set CPU, GPU, DSP, etc., physical single core and physical multi-Core (CMP) and logical multi-core (SMT), and server and cluster.
5. The method for renaming based on the read-after-write related assumption of the instruction as claimed in claim 1, wherein the invention is not limited to the bandwidth of the parallel instruction level and is not limited to the architecture, the pipeline level and the implementation process of renaming implementation.
CN202010231038.5A 2020-03-27 2020-03-27 Renaming method based on instruction read-after-write related hypothesis Active CN111506347B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010231038.5A CN111506347B (en) 2020-03-27 2020-03-27 Renaming method based on instruction read-after-write related hypothesis

Publications (2)

Publication Number Publication Date
CN111506347A true CN111506347A (en) 2020-08-07
CN111506347B CN111506347B (en) 2023-05-26

Family

ID=71864661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010231038.5A Active CN111506347B (en) 2020-03-27 2020-03-27 Renaming method based on instruction read-after-write related hypothesis

Country Status (1)

Country Link
CN (1) CN111506347B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022199035A1 (en) * 2021-03-22 2022-09-29 广东赛昉科技有限公司 Renaming method and system for fixed-constant-related instruction

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7171541B1 (en) * 1999-09-08 2007-01-30 Hajime Seki Register renaming system
CN101395573A (en) * 2006-02-28 2009-03-25 Mips技术公司 Distributive scoreboard scheduling in an out-of order processor
CN101593096A (en) * 2009-05-22 2009-12-02 西安交通大学 The implementation method that a kind of shared register dependencies is eliminated
CN101601008A (en) * 2007-01-24 2009-12-09 高通股份有限公司 Be used for the use of the register rename system of forwarding intermediate result between the composition instruction of extended instruction
CN102566976A (en) * 2010-12-27 2012-07-11 北京国睿中数科技股份有限公司 Register renaming system and method for managing and renaming registers
CN103116485A (en) * 2013-01-30 2013-05-22 西安电子科技大学 Assembler designing method based on specific instruction set processor for very long instruction words
CN103577159A (en) * 2012-08-07 2014-02-12 想象力科技有限公司 Multi-stage register renaming using dependency removal
CN104156196A (en) * 2014-06-12 2014-11-19 龚伟峰 Renaming pretreatment method
CN105045562A (en) * 2014-04-25 2015-11-11 美国博通公司 Branch prediction in a processor
CN106155636A (en) * 2015-05-11 2016-11-23 Arm 有限公司 Available register controlled for depositor renaming
CN108027736A (en) * 2015-10-28 2018-05-11 森蒂彼得塞米有限公司 Use the runtime code parallelization of the out of order renaming by being pre-allocated to physical register
US20190361703A1 (en) * 2019-06-04 2019-11-28 Dejan Spasov Method and apparatus for renaming source operands of instructions

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张军超; 张兆庆: "Register renaming technology in instruction scheduling" *
翟召岳: "Design of a reservation station based on a 32-bit superscalar processor" *

Also Published As

Publication number Publication date
CN111506347B (en) 2023-05-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant