CN113961248A - Register mapping method, processor, chip and electronic equipment - Google Patents

Register mapping method, processor, chip and electronic equipment

Info

Publication number
CN113961248A
Authority
CN
China
Prior art keywords
register
bit width
physical
area
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111342880.7A
Other languages
Chinese (zh)
Inventor
林志翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haiguang Information Technology Co Ltd
Original Assignee
Haiguang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haiguang Information Technology Co Ltd filed Critical Haiguang Information Technology Co Ltd
Priority to CN202111342880.7A priority Critical patent/CN113961248A/en
Publication of CN113961248A publication Critical patent/CN113961248A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30101Special purpose registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

Embodiments of the present application provide a register mapping method, a processor, a chip, and an electronic device. The method comprises: determining the target bit width actually used by the architectural register corresponding to the current instruction; determining, from a physical register file, a free register area whose bit width matches the target bit width, wherein the physical register file comprises a plurality of groups of physical registers, a register area is the unit area used for register mapping within a physical register, and the register areas of different groups of physical registers have different bit widths; and determining a target register area based on the free register area, the target register area being allocated to the architectural register. Embodiments of the present application can thereby improve the resource utilization of physical registers when physical registers are mapped to architectural registers.

Description

Register mapping method, processor, chip and electronic equipment
Technical Field
The embodiment of the application relates to the technical field of processors, in particular to a register mapping method, a processor, a chip and electronic equipment.
Background
Physical registers in a processor must be mapped before they are used. Specifically, based on the design of the processor's instruction set architecture, the processor allocates physical registers, during the register renaming stage, to the architectural registers specified by the instruction set; this process is referred to as register mapping. However, the resource utilization efficiency of physical registers under current register mapping approaches leaves room for improvement.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a register mapping method, a processor, a chip, and an electronic device, so as to improve resource utilization efficiency of a physical register when performing register mapping.
In order to achieve the above object, the embodiments of the present application provide the following technical solutions.
In a first aspect, an embodiment of the present application provides a register mapping method, including:
determining the target bit width actually used by the architectural register corresponding to the current instruction;
determining a free register area with the bit width matched with the target bit width from a physical register file; the physical register file comprises a plurality of groups of physical registers, the register area is a unit area used for register mapping in the physical registers, and the bit widths of the register areas corresponding to different groups of physical registers are different;
determining a target register area based on the free register area, the target register area being allocated to the architectural register.
In a second aspect, an embodiment of the present application provides a processor, including:
a decode unit to determine a target bit width of an architectural register used by a current instruction;
a renaming unit to determine, from a physical register file, a free register area whose bit width matches the target bit width, and to determine a target register area based on the free register area, the target register area being allocated to the architectural register;
wherein the physical register file comprises a plurality of groups of physical registers, the register area is a unit area used for register mapping in the physical registers, and the bit widths of the register areas corresponding to different groups of physical registers are different.
In a third aspect, an embodiment of the present application provides a chip including the processor as described in the second aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device including the chip as described in the third aspect.
According to the register mapping method provided by the embodiments of the present application, the physical register file can be organized into a plurality of groups of physical registers, and the unit areas used for register mapping (called register areas) in different groups have different bit widths. The physical register file can therefore offer register areas of several bit widths for register mapping, which provides the basis for accurately matching the target bit width actually used by an architectural register. On this basis, while the processor processes the current instruction, the embodiments of the present application can determine the target bit width actually used by the architectural register corresponding to the current instruction, and then determine, from the physical register file, a free register area whose bit width matches the target bit width. That is, using the register areas of different bit widths provided by the physical register file, the embodiments can determine a free register area whose bit width equals the target bit width, or a free register area whose bit width is greater than the target bit width but which can be further divided to obtain the target bit width, or multiple free register areas whose bit widths can be merged to obtain the target bit width, and so on.
Furthermore, in the embodiments of the present application, the target register area may be determined based on the free register area, so that the bit width of the target register area is exactly the target bit width used by the architectural register. By allocating the target register area to the architectural register, the register mapping process gives the architectural register a physical register area that corresponds to the bit width it actually uses, which makes fuller use of physical register resources. The embodiments of the present application can therefore allocate to an architectural register a target register area matching the target bit width it actually uses, and improve the resource utilization of the physical registers.
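The three matching cases described above (same width, divide a wider area, merge narrower areas) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name find_free_regions, the free-list layout (bit width mapped to a list of (physical register number, bit offset) pairs), the power-of-two widths, and the halving strategy used when dividing are all assumptions made for illustration.

```python
def find_free_regions(free, target):
    """Return a list of (reg, offset, width) areas that together provide
    `target` bits, or None if no suitable free area exists.

    `free` is a hypothetical structure: bit width -> list of free
    (physical register number, bit offset) pairs. Widths are assumed to be
    powers of two.
    """
    # Case 1: a free area whose width equals the target width.
    if free.get(target):
        reg, off = free[target].pop()
        return [(reg, off, target)]

    # Case 2: divide a wider free area. The area is halved repeatedly; the
    # upper half of each split is returned to the free list for its width.
    for width in sorted(w for w in free if w > target):
        if free[width]:
            reg, off = free[width].pop()
            while width > target:
                width //= 2
                free.setdefault(width, []).append((reg, off + width))
            return [(reg, off, target)]

    # Case 3: merge several narrower free areas of one width.
    for width in sorted((w for w in free if w < target), reverse=True):
        need = target // width
        if target % width == 0 and len(free[width]) >= need:
            return [(*free[width].pop(), width) for _ in range(need)]

    return None
```

In this sketch the order of the cases expresses a preference for an exact match; the source does not state which case takes priority, so that ordering is also an assumption.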
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is an architecture diagram of a processor according to an embodiment of the present disclosure.
Fig. 2 is a diagram illustrating an example of a vector register and an operation mask register according to an embodiment of the present application.
Fig. 3A is a diagram illustrating an example of partitioning a physical register file according to an embodiment of the present application.
Fig. 3B is a diagram illustrating another example of partitioning a physical register file according to an embodiment of the present application.
Fig. 4 is a flowchart of a register mapping method according to an embodiment of the present application.
Fig. 5 is an alternative flowchart for determining a target register area according to an embodiment of the present application.
Fig. 6 is another alternative flowchart for determining a target register area according to an embodiment of the present disclosure.
Fig. 7 is an exemplary diagram of a reading result of a renaming table provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The processor is the computational and control core of a computer architecture; examples include a Central Processing Unit (CPU), various Complex Instruction Set Computing (CISC) processors, various Reduced Instruction Set Computing (RISC) processors, and various Very Long Instruction Word (VLIW) processors. Processors are generally pipelined, and fig. 1 shows an architectural schematic of a processor 100. As shown in fig. 1, processor 100 may include: instruction fetch unit 101, decode unit 102, execution unit 103, rename table 104, physical register file 105, and rename unit 106.
Instruction fetch unit 101 may fetch instructions from an instruction cache according to fetch requests, such as fetch requests predicted by a branch prediction unit in the front end of the processor.
Decode unit 102 may decode and parse the instructions fetched by instruction fetch unit 101. The decoded result may be machine-executable operation information obtained by parsing the instruction; for example, the OpCode (operation code), operands, and control fields of the instruction are parsed to form a machine-executable uop (micro-instruction). In addition, the decode unit 102 decodes the current instruction according to the instruction set architecture, obtains from the OpCode the specific operation (addition, subtraction, etc.) to be executed by the subsequent execution unit, and obtains the architectural registers that hold the source and destination operands of the instruction, along with other required information. After decoding, the instruction may be expanded into the format required inside the processor, and the decoded instruction may carry a plurality of attributes, such as the opcode and operands.
The execution unit 103 may perform operations based on the decoded instructions and generate execution results. Given the architectural registers that hold the source and destination operands, as decoded by the decode unit 102, the execution unit 103 looks up the corresponding entries in the rename table 104 to obtain the numbers of the physical registers mapped to those architectural registers; reads the actually stored data from the physical register file 105 according to those physical register numbers to obtain a result; and writes the result back to the physical register corresponding to the destination register.
The renaming unit 106 changes the mapping relationship between architectural registers and physical registers in the rename table 104. Based on the architectural registers that hold the source and destination operands, as decoded by the decode unit 102, the renaming unit 106 reads from the physical register file 105 which physical registers are still free, and determines the mapping between each architectural register and the physical register actually allocated to it. The mapping is written into the rename table 104, and the physical register file marks the allocated physical register as occupied, which prevents subsequent instructions from being renamed onto the same physical register.
It should be noted that the rename table 104 may record the mapping relationship between the actually allocated physical registers and the architectural registers. The renaming unit 106 may be configured to re-establish or modify the mapping between the physical registers in the physical register file 105 and the architectural registers by changing the mapping in the renaming table 104. It should be noted that the architectural register is a register specified on the instruction set architecture and can be used by a programmer when using assembly language, and the physical register is a hardware resource actually existing in the processor.
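As a rough illustration of the rename table described above, the following Python sketch keeps a mapping from architectural register names to physical register numbers together with a list of free physical registers. The class and method names are hypothetical, and the sketch omits details of a real pipeline, such as releasing the previously mapped physical register when an instruction retires.

```python
class RenameTable:
    """Toy model of a rename table; names and structure are illustrative only."""

    def __init__(self, num_physical):
        # Architectural register name -> physical register number.
        self.mapping = {}
        # Physical registers that are not currently allocated.
        self.free = list(range(num_physical))

    def rename(self, arch_reg):
        """Allocate a free physical register to a destination architectural register."""
        if not self.free:
            raise RuntimeError("no free physical register")
        phys = self.free.pop(0)          # the register is now marked as occupied
        self.mapping[arch_reg] = phys    # record the new mapping
        return phys

    def lookup(self, arch_reg):
        """Return the physical register currently mapped to a source architectural register."""
        return self.mapping[arch_reg]


table = RenameTable(num_physical=4)
table.rename("k1")
table.rename("k2")
table.rename("k1")          # a later write to k1 receives a new physical register
print(table.lookup("k1"))
```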
It should be further noted that the fetch unit 101, decode unit 102, execution unit 103, physical register file 105, and rename unit 106 shown in fig. 1 may be logic circuit units in the processor 100. In addition, fig. 1 shows only part of the optional structure of the processor 100 by way of example. The processor 100 may also include other circuit devices; because they are not necessary for understanding the disclosure of the embodiments of the present application, they are not described here.
The instruction set tends to set the bit width of the physical register to the maximum bit width of the architectural register at the architectural level. Therefore, the renaming unit 106 directly allocates the whole physical register with the largest bit width to the architectural register when performing register mapping. However, the data bit width actually used by the architectural register often does not reach the maximum bit width, which results in unused bit width in the physical register and resource waste of the physical register. To more clearly explain the above problem, the following description will be made by taking an example of allocating a physical register to an operation mask register, which is a register for performing operation control on data held in a vector register. It should be noted that, during design, the physical registers may be divided into a vector register for storing operation data and an operation mask register for storing an operation mask, and during register mapping, the physical registers need to be mapped to corresponding architecture registers.
When an instruction such as a single instruction, multiple data (SIMD) instruction is executed, the processor stores multiple sets of data in a vector register and operates on those sets simultaneously; the operation mask register indicates which sets of data have the operation performed and which have the operation masked. Fig. 2 shows an example of the use of a vector register and an operation mask register. Referring to fig. 2, the vector register X contains 4 sets of data (Ax, Bx, Cx, Dx) and the vector register Y contains 4 sets of data (Ay, By, Cy, Dy). If particular program semantics (e.g., a constraint that only negative numbers are added) require that the addition not be performed on every set of data held in vector registers X and Y, a multi-bit operation mask for operation control may be held in an operation mask register.
Each bit of the operation mask stored in the operation mask register may correspond one-to-one to a set of data stored in the vector registers X and Y, thereby indicating for which data in X and Y the addition is performed and for which it is masked. Referring to fig. 2, a mask bit with the value 1 indicates that the addition is performed on the corresponding set of data in the vector registers, and a mask bit with the value 0 indicates that the addition is masked for the corresponding set. Taking the operation mask value 1010 in fig. 2 as an example: because the first and third mask bits are 1, the first and third sets of data in vector registers X and Y are added; because the second and fourth mask bits are 0, the addition is masked for the second and fourth sets. The operation result is (Ax + Ay, Bz, Cx + Cy, Dz), where Bz and Dz indicate that the register retains the value it held before the addition instruction was executed.
It can be seen that the operation mask in the operation mask register is also 4 bits when 4 sets of data are contained in the vector registers X and Y. That is, the bit width used to store the operation mask in the operation mask register corresponds to the number of data sets stored in the vector register, i.e., the number of valid mask bits in the operation mask register corresponds to the number of data sets stored in the vector register.
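The masked addition of fig. 2 can be mimicked with a short Python sketch. The function name masked_add and the use of Python lists for the vector registers and the mask are assumptions for illustration; the mask bits are listed in the same first-to-fourth order as the data sets.

```python
def masked_add(x, y, mask, dest):
    """Add x[i] + y[i] where mask[i] is 1; keep the old dest[i] where mask[i] is 0."""
    return [a + b if m else d for a, b, m, d in zip(x, y, mask, dest)]


x = [1, 2, 3, 4]            # stands in for (Ax, Bx, Cx, Dx)
y = [10, 20, 30, 40]        # stands in for (Ay, By, Cy, Dy)
old = [-1, -2, -3, -4]      # destination values before the instruction
mask = [1, 0, 1, 0]         # the 1010 mask from the example

print(masked_add(x, y, mask, old))   # [11, -2, 33, -4]
```

Four data sets need exactly four mask bits, which is the correspondence between mask width and number of data sets noted above.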
However, to accommodate every possible number of mask bits, the instruction set architecture often fixes the bit width of the operation mask register at the maximum number of data groups that a vector register can hold; that is, the physical register allocated to the operation mask register is sized for that maximum number of groups. When the number of mask bits actually used is lower than this maximum, part of the physical register allocated to the operation mask register goes unused, which wastes physical register resources.
For example, in the Intel AVX512 instruction set, the vector registers may be 128, 256, or 512 bits wide, and each group of data may be 8, 16, 32, or 64 bits wide. The maximum number of groups of data in a vector register is therefore 512/8 = 64, and the minimum is 128/64 = 2. If physical registers are uniformly allocated to the operation mask register according to the maximum number of groups (64), every physical register allocated to the operation mask register is 64 bits wide; whenever the number of mask bits actually used is lower than 64, the high-order bits of the allocated physical register are unused, which wastes physical register resources.
That is, even when the high-order bits of the physical register are not needed, a physical register of the maximum bit width is still allocated to the architectural register according to the architectural design of the instruction set, which is wasteful in two respects. In terms of power consumption, when the data bit width actually used by the architectural register is smaller than the maximum bit width but data of the maximum bit width is still transferred, invalid data is transferred unnecessarily and causes additional power overhead. In terms of area, when the data bit width actually used by the architectural register is smaller than the maximum bit width, the physical register is still occupied at its full width, so the unused bits cannot be used by other instructions and physical register area is wasted.
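The arithmetic in this example can be checked with a few lines of Python. The constant and function names are illustrative; the widths are the ones quoted in the paragraph above.

```python
VECTOR_WIDTHS = (128, 256, 512)     # vector register widths in bits
ELEMENT_WIDTHS = (8, 16, 32, 64)    # width of one group of data in bits
MASK_REGISTER_WIDTH = 64            # fixed mask register width in the example


def mask_bits(vector_width, element_width):
    """Number of data groups, and therefore of mask bits actually used."""
    return vector_width // element_width


def unused_bits(vector_width, element_width):
    """Bits of a 64-bit mask register that carry no mask information."""
    return MASK_REGISTER_WIDTH - mask_bits(vector_width, element_width)


for v in VECTOR_WIDTHS:
    for e in ELEMENT_WIDTHS:
        print(f"{v}-bit vector, {e}-bit groups: "
              f"{mask_bits(v, e)} mask bits, {unused_bits(v, e)} unused")
```

Only the 512-bit vector with 8-bit groups uses all 64 mask bits; every other combination leaves high-order bits of a fixed 64-bit mask register unused.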
In summary, it can be seen that, when performing register mapping, the resource utilization efficiency of the physical register needs to be improved. Therefore, the embodiment of the present application provides a novel register mapping scheme, which can allocate a physical register corresponding to a bit width to an architecture register based on a target bit width actually used by the architecture register, so as to improve a resource utilization rate of the physical register.
Following this idea, the embodiments of the present application do not directly allocate a complete physical register of the maximum bit width to the architectural register. Instead, physical registers are divided into register areas of different bit widths, and the register area serves as the unit area for register mapping within a physical register. This reduces the cases in which a complete physical register of the maximum bit width is mapped to an architectural register, and lets the register areas of different bit widths cover the various bit widths that architectural registers require in actual use. On this basis, the embodiments of the present application may divide the physical register file 105 into register areas, and fig. 3A shows an exemplary diagram of the physical register file 105. Referring to fig. 3A, the physical register file 105 may include a plurality of groups of physical registers 31 to 3n, where n is the number of groups and can be set according to actual conditions. A group may contain a plurality of physical registers. Each physical register uses a register area, rather than the entire physical register, as the unit area for register mapping, and the register areas in different groups have different bit widths; that is, different groups of physical registers are mapped in unit areas of different bit widths.
Referring to fig. 3A, the first group of physical registers 31 includes a plurality of register areas 310 and performs register mapping with the register area 310 as the unit area; the second group of physical registers 32 includes a plurality of register areas 320 and performs register mapping with the register area 320 as the unit area; and so on, up to the nth group of physical registers 3n, which includes a plurality of register areas 3n0 and performs register mapping with the register area 3n0 as the unit area. The register areas 310 to 3n0, which are the unit areas for register mapping of the groups 31 to 3n respectively, all have different bit widths.
In some embodiments, the bit width of the register area of each group of physical registers is matched to one required bit width of the architectural register, where the architectural register has a plurality of required bit widths. For example, if the architectural register takes n different required bit widths while the program actually runs, the embodiments of the present application may divide the physical register file 105 into n groups of physical registers 31 to 3n, where the bit width of the register area used for register mapping in each group corresponds to one required bit width of the architectural register. In one example, assuming that the required bit widths actually used by the architectural registers are 64 bits, 32 bits, 16 bits, and 8 bits, the physical register file 105 may be divided into 4 groups of physical registers whose register areas are 64, 32, 16, and 8 bits wide, respectively.
In further embodiments, the required bit width of the architectural register when actually used may include a maximum bit width (e.g., 64 bits) of the physical register, and at least one sub-bit width divided by the maximum bit width (e.g., 32 bits, 16 bits, and 8 bits divided by 64 bits). Based on this, when the physical register file is divided into a plurality of groups of physical registers, the embodiments of the present application may divide the bit width of each group of physical registers from the maximum bit width of the physical registers to a smaller bit width according to the group order of the plurality of groups of physical registers, so as to divide the register area of each group of physical registers, thereby obtaining a plurality of groups of physical registers with different bit widths in the register area. That is to say, the bit widths of the register areas corresponding to different sets of physical registers may be divided from the maximum bit width to a smaller value according to the set order of the multiple sets of physical registers.
In one example, the physical register file 105 is divided into 4 groups of physical registers whose register areas are 64, 32, 16, and 8 bits wide; fig. 3B shows this further example of dividing the physical register file 105. As shown in fig. 3B, the physical register file 105 has 64 physical registers numbered 0-63, and each physical register is 64 bits wide (this example assumes that 64 bits is the maximum bit width of a physical register). When the physical register file 105 is divided, the physical registers numbered 0-15 form the first group and remain unchanged: they are still mapped at the full 64-bit width of the physical register, that is, in units of 64-bit register areas. The physical registers numbered 16-31 form the second group; each 64-bit physical register is divided into 2 register areas of 32 bits, giving 32 register areas of 32 bits, so these registers are mapped in units of 32-bit register areas. The physical registers numbered 32-47 form the third group; each 64-bit physical register is divided into 4 register areas of 16 bits, giving 64 register areas of 16 bits, so these registers are mapped in units of 16-bit register areas.
The physical registers numbered 48-63 form the fourth group; each 64-bit physical register is divided into 8 register areas of 8 bits, giving 128 register areas of 8 bits, so these registers are mapped in units of 8-bit register areas. In summary, the unit area for register mapping is 64 bits wide for the physical registers numbered 0-15, 32 bits wide for those numbered 16-31, 16 bits wide for those numbered 32-47, and 8 bits wide for those numbered 48-63.
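The division in fig. 3B can be reproduced with a small Python sketch that enumerates the register areas of each group. The names GROUPS and build_regions, and the representation of a register area as a (physical register number, bit offset) pair, are assumptions for illustration.

```python
PHYS_WIDTH = 64  # bit width of every physical register in the example

# (physical register numbers, register area width) for the four groups of fig. 3B
GROUPS = [
    (range(0, 16), 64),
    (range(16, 32), 32),
    (range(32, 48), 16),
    (range(48, 64), 8),
]


def build_regions(groups, phys_width=PHYS_WIDTH):
    """Map each area width to the list of (register number, bit offset) areas."""
    regions = {}
    for regs, width in groups:
        regions[width] = [(r, off) for r in regs
                          for off in range(0, phys_width, width)]
    return regions


regions = build_regions(GROUPS)
print({width: len(areas) for width, areas in regions.items()})
# {64: 16, 32: 32, 16: 64, 8: 128}
```

The counts match the text: 16 areas of 64 bits, 32 areas of 32 bits, 64 areas of 16 bits, and 128 areas of 8 bits.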
It should be noted that dividing a physical register into a plurality of register areas, for example dividing a 64-bit physical register into 2 register areas of 32 bits, does not mean partitioning the physical register in hardware. Instead, the bit range that an instruction reads within the physical register is adjusted, so that different instructions can read different bit ranges of the same physical register; the physical register is thereby divided into register areas covering different bit ranges. For example, to divide a 64-bit physical register into 2 register areas of 32 bits, the range read by an instruction can be set to either the lower 32 bits or the upper 32 bits, so that the physical register is divided into a lower 32-bit register area and an upper 32-bit register area.
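Selecting a bit range within an unchanged 64-bit register can be illustrated with shift-and-mask operations in Python. The helper names read_region and write_region are hypothetical, and the sketch models the register value as a plain integer rather than describing any particular hardware read port.

```python
def read_region(value, offset, width):
    """Read `width` bits starting at bit `offset` of a register value."""
    return (value >> offset) & ((1 << width) - 1)


def write_region(value, offset, width, data):
    """Return the register value with the selected bit range replaced by `data`."""
    mask = ((1 << width) - 1) << offset
    return (value & ~mask) | ((data << offset) & mask)


reg = 0x1122334455667788                 # one 64-bit physical register
print(hex(read_region(reg, 0, 32)))      # lower 32-bit area: 0x55667788
print(hex(read_region(reg, 32, 32)))     # upper 32-bit area: 0x11223344
```

Writing one area leaves the other untouched, which is what allows two architectural registers to share one physical register in this model.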
On the basis that the physical register file is divided into a plurality of groups of physical registers, and that different groups of physical registers use register areas of different bit widths for register mapping, the embodiment of the application can flexibly allocate to the architectural register a register area matching the target bit width it actually uses. The bit width of the register area mapped to the architectural register thus adapts to the target bit width actually used by the architectural register, which maximizes the utilization of the register areas of the physical registers and improves the resource utilization of the physical registers.
In some embodiments, fig. 4 illustrates a flowchart of a register mapping method provided by an embodiment of the present application. It should be noted that the flow is shown for facilitating understanding of the disclosure of the embodiments of the present application, and the embodiments of the present application are not limited to the flow shown in fig. 4. Referring to fig. 4, the flow of the register mapping method may include the following steps.
Step S410, determining the target bit width actually used by the architectural register corresponding to the current instruction.
The current instruction may be an instruction currently being processed by a pipeline of the processor. In some embodiments, S410 may be performed at a decode stage of a processor. As an optional implementation, after the current instruction is fetched by the fetch unit, decoding analysis may be performed by the decoding unit, and according to a decoding analysis result of the current instruction, the embodiment of the present application may obtain attribute information of an architecture register operated by the current instruction, so as to determine, based on the attribute information, a target bit width actually used by the architecture register corresponding to the current instruction. It should be noted that the architectural register corresponding to the current instruction may be the architectural register operated by the current instruction, or may be a register associated with the architectural register operated by the current instruction.
Taking the case where the architectural register corresponding to the current instruction is an operation mask register as an example, when determining the target bit width actually used by the operation mask register, the embodiment of the present application may obtain attribute information of the vector register operated on by the current instruction (for example, a SIMD instruction) by parsing the OpCode of the current instruction, and then determine, based on the attribute information, the target bit width actually used by the architectural register associated with the vector register.
As an optional implementation, the embodiment of the present application determines, from the attribute information of the vector register, the actual number of data groups held in the vector register while the current instruction is processed, and takes this number as the number of mask bits actually used by the architectural register, thereby obtaining the target bit width actually used by the architectural register. In one example, the attribute information of the vector register may include the vector register width and the data width of each group of data stored in the vector register. By parsing the OpCode of the current instruction, the embodiment of the present application may obtain the vector register width of the vector register operated on by the current instruction and the data width of each group of data stored in it, calculate the actual number of data groups held in the vector register (the vector register width divided by the data width of each group of data), and determine the calculated number as the target bit width actually used by the operation mask register.
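The calculation described above is a single integer division. The following sketch shows it with a hypothetical helper `mask_bits_used`; the 512-bit vector width used in the example calls is an assumption for illustration and is not taken from the text.

```python
def mask_bits_used(vector_width, element_width):
    """Number of data groups in the vector register, one mask bit per group."""
    if element_width <= 0 or vector_width % element_width != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_width // element_width


# Assumed 512-bit vector register with different per-group data widths.
for element_width in (8, 16, 32, 64):
    print(element_width, mask_bits_used(512, element_width))
```

Under this assumption, data widths of 8, 16, 32, and 64 bits give target bit widths of 64, 32, 16, and 8 bits respectively, which line up with the four register-area widths of fig. 3B.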
Step S411, determining a free register area with the bit width matched with the target bit width from the physical register file.
In some embodiments, S411 may be performed at a register renaming stage of a processor. As an optional implementation, the target bit width actually used by the architecture register acquired in the decoding stage may be transmitted backwards in the pipeline operation of the processor, so that in the register renaming stage, the renaming unit may acquire the target bit width; furthermore, the renaming unit may determine, based on the divided physical register files described above, a free register area having a bit width matching the target bit width from the physical register files having register areas with different bit widths. Note that the idle state of the register area refers to a state in which the unoccupied register area can be allocated for use.
In some embodiments, the bit width of the free register area determined in step S411 may be the same as the target bit width. For example, under the condition that a plurality of groups of physical registers are partitioned by a physical register file and the bit widths of the register areas of the physical registers of the groups are different, the embodiment of the present application may determine a target group physical register with the same bit width as a target bit width of the register area from the physical register file, so that under the condition that the target group physical register has a register area in an idle state, a free register area with the same bit width as the target bit width is determined from the target group physical register. In an example, assuming that the target bit width is 32 bits, the physical register group with the bit width of 32 in the register area in the physical register file may be used as the target group physical register, so that in the embodiment of the present application, when the target group physical register has a register area in an idle state, the idle register area with the bit width of 32 bits may be determined.
In a possible case, the target group physical register (whose register-area bit width is the same as the target bit width) may have no register area in an idle state. In this case, the embodiment of the present application may wait for an occupied register area in the target group physical register to be released, and then select a register area in an idle state from the released register areas as the free register area.
For example, as shown in fig. 3B, when the target bit width is 64 bits, the register areas of the first group of physical registers, numbered 0 to 15 in the physical register file, are 64 bits wide, so the physical registers numbered 0 to 15 are used as the target group physical register. The embodiment of the present application may determine whether a register area in an idle state currently exists in the first group of physical registers numbered 0 to 15. If so, that register area is determined to be the free register area; if not, after an occupied register area in the first group of physical registers numbered 0 to 15 is released, a register area in an idle state is selected from the released register areas.
In other embodiments, the bit width of the free register area may differ from the target bit width. For example, the bit width of the free register area may be greater than the target bit width, provided the free register area can be further divided to obtain the target bit width; for another example, the bit width of the free register area may be smaller than the target bit width, provided a plurality of free register areas can be merged to obtain the target bit width. As an optional implementation, when the target group physical register (whose register-area bit width is the same as the target bit width) has no register area in an idle state, the embodiment of the present application determines, among the other groups of physical registers whose register-area bit width is greater than or smaller than the target bit width, an alternative group physical register that has a register area in an idle state. An idle register area wider than the target bit width, or a plurality of idle register areas narrower than the target bit width, in the alternative group physical register is then used as the free register area.
For example, when the target group physical register has no register area in an idle state, the embodiment of the present application may search the other groups of physical registers in the physical register file in the direction of increasing register-area bit width, and use the first group found to have a register area in an idle state as the alternative group physical register. For another example, when the target group physical register has no register area in an idle state, the embodiment of the present application may search the other groups of physical registers in the physical register file in the direction of decreasing register-area bit width, and use as the alternative group physical register the first group found to have register areas in an idle state whose bit widths can be merged to obtain the target bit width.
In an example, assuming that the target bit width is 32 bits and the target group physical register with 32-bit register areas has no register area in an idle state, the embodiment of the present application may look for a register area in an idle state in the physical register group with 64-bit register areas and use it as the free register area. In other examples, when the target group physical register with 32-bit register areas has no register area in an idle state, the embodiment of the present application may first look for idle register areas in the physical register group with 16-bit register areas; if none is found, or if the idle register areas found cannot be merged into a 32-bit register area, the embodiment of the present application may further look for idle register areas in the physical register group with 8-bit register areas.
For example, continuing with fig. 3B, when the target bit width is 64 bits, the physical registers numbered 0 to 15 in the physical register file are used as the target group physical register, and the embodiment of the present application may determine whether a register area in an idle state currently exists in this first group. If not, the physical register groups numbered 16 to 31, 32 to 47, and 48 to 63 are searched in turn, and the first group found to have register areas in an idle state that can be merged into a 64-bit register area is used as the alternative group physical register, so that the free register areas are determined from the alternative group physical register.
Step S412, determining a target register area based on the free register area, and allocating the target register area to the architecture register.
In some embodiments, one free register area may be the target register area if the bit width of the one free register area is the same as the target bit width. If the bit width of one free register area is greater than the target bit width, the free register area needs to be further divided in the embodiment of the present application, for example, one free register area is divided into a plurality of register areas with the same bit width as the target bit width by taking the target bit width as a unit, so that the divided register area is taken as the target register area. If the bit width of one free register area is smaller than the target bit width, in the embodiment of the present application, after determining a plurality of free register areas, merging the plurality of free register areas into a register area with the same bit width as the target bit width, so as to use the merged register area as the target register area.
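The division and merging described for step S412 can be sketched as follows. This is an illustration under stated assumptions: a register area is modelled as a tuple (physical register number, lowest bit, width), and merging is modelled as simply collecting free areas whose widths add up to the target bit width. The text does not say whether merged areas must be adjacent or lie in the same physical register, so no such constraint is imposed here. The names `split_area` and `merge_areas` are hypothetical.

```python
from typing import List, Tuple

Area = Tuple[int, int, int]  # (physical register number, lowest bit, width)


def split_area(area: Area, target_width: int) -> List[Area]:
    """Divide one free area into areas of target_width bits each."""
    reg, low, width = area
    if width % target_width != 0:
        raise ValueError("target width must evenly divide the area width")
    return [(reg, low + k * target_width, target_width)
            for k in range(width // target_width)]


def merge_areas(free_areas: List[Area], target_width: int) -> List[Area]:
    """Collect free areas whose widths sum to target_width; [] if not possible."""
    chosen, total = [], 0
    for area in free_areas:
        if total == target_width:
            break
        if total + area[2] <= target_width:
            chosen.append(area)
            total += area[2]
    return chosen if total == target_width else []


print(split_area((3, 0, 64), 32))
print(merge_areas([(32, 0, 16), (32, 16, 16), (33, 0, 16)], 32))
```

In the first call a 64-bit free area yields two 32-bit areas, one of which can serve as the target register area; in the second, two 16-bit free areas together form a 32-bit target register area.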
For example, the architectural register may be an operation mask register; in other examples, the architectural register may be a vector register. Of course, the architectural register may also take other forms, which is not limited in the embodiment of the present application.
According to the register mapping method provided by the embodiment of the application, the physical register file can be set into a plurality of groups of physical registers, and the unit areas (called register areas) used for register mapping in different groups of physical registers have different bit widths, so that the physical register file can provide the register areas used for register mapping with different bit widths, and a basis is provided for accurately matching the target bit width actually used by the architectural register. Based on this, in the process of processing the current instruction by the processor, the embodiment of the application can determine the actually used target bit width of the architecture register corresponding to the current instruction; determining an idle register area with the bit width matched with the target bit width from the physical register file based on the target bit width; that is to say, based on register areas with different bit widths provided by the physical register file, the embodiment of the present application can determine, from the physical register file, a free register area with a bit width that is the same as a target bit width, or a free register area with a bit width that is greater than the target bit width but can be further divided to obtain the target bit width, or bit widths can be merged to obtain multiple free register areas with the target bit width, and the like. 
Furthermore, in the embodiments of the present application, the target register area may be determined based on the free register area, so that the bit width of the target register area is exactly the target bit width used by the architectural register. By allocating the target register area to the architectural register, a physical register area corresponding to the target bit width actually used can be allocated to the architectural register in the register mapping process, thereby utilizing the resources of the physical registers to the maximum extent. Therefore, the embodiment of the application can allocate to the architectural register a target register area corresponding to the target bit width it actually uses, and improve the resource utilization of the physical registers.
In order to implement mapping between the architectural register and the physical register, a free register area matching with a target bit width needs to be determined from the physical register file, and then a target register area is determined based on the free register area, and the target register area is allocated to the architectural register. Fig. 5 is an alternative flowchart of determining a target register area according to an embodiment of the present application, and referring to fig. 5, the flowchart may include the following steps.
Step S521, determining a target group physical register whose register-area bit width is the same as the target bit width.
In some embodiments, the embodiments of the present application may determine a target set of physical registers having a same bit width as a target bit width of a register region from a plurality of sets of physical registers partitioned by a physical register file. Referring to fig. 3B, when the target bit width is 32 bits, the target set of physical registers having the same target bit width may be the second set of physical registers corresponding to numbers 16-31 of the register area having a bit width of 32 bits.
Step S522, determining whether a register area in an idle state exists in the target group physical register; if so, executing step S523, and if not, executing step S524.
Step S523, the register area in the idle state is determined as an idle register area.
Step S524, waiting for an occupied register area in the target group physical register to be released, and selecting a register area in an idle state from the released register areas as the free register area.
Step S525, taking the free register area as the target register area.
In the process of determining the free register area, if the target group physical register has a register area in an idle state, that register area is directly used as the free register area, and the free register area is then used as the target register area. If the target group physical register currently has no register area in an idle state, then after an occupied register area in the target group physical register is released, a register area in an idle state can be selected from the released register areas as the free register area, and the free register area is then used as the target register area.
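The flow of fig. 5 can be sketched as follows. This is a simplified software model, not the hardware design: the class `Group`, the function `select_target_area`, and the choice to represent "waiting" by returning `None` (so that the caller retries after a release) are all assumptions made for illustration.

```python
class Group:
    """One group of physical registers with register areas of a single width."""

    def __init__(self, area_width, area_count):
        self.area_width = area_width
        self.free = list(range(area_count))  # indices of areas in an idle state

    def allocate(self):
        """Take one idle area, or return None if every area is occupied."""
        return self.free.pop(0) if self.free else None

    def release(self, index):
        """Mark an occupied area as idle again."""
        self.free.append(index)


def select_target_area(groups, target_width):
    """Steps S521-S525; returns (area_width, index), or None when it must wait."""
    # S521: find the group whose area width equals the target bit width.
    group = next((g for g in groups if g.area_width == target_width), None)
    if group is None:
        raise ValueError("no group with the requested area width")
    # S522/S523: use an idle area if one exists.
    index = group.allocate()
    if index is None:
        return None  # S524: caller waits for a release and tries again
    return (group.area_width, index)  # S525: free area becomes the target area


groups = [Group(64, 16), Group(32, 32), Group(16, 64), Group(8, 128)]
print(select_target_area(groups, 32))
```

A caller that receives `None` would call `select_target_area` again once `release` has been invoked on the target group.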
As another optional implementation of determining the target register area, when the target group physical register has no register area in an idle state, the embodiment of the present application may search the physical register file for an alternative group physical register, determine a free register area from the alternative group physical register, and further divide or merge the free register area in the alternative group physical register to obtain the target register area. FIG. 6 is an alternative flow chart for determining a target register area according to the embodiment of the present application. Referring to fig. 6, the flow may include the following steps.
Step S631, determining a target group physical register with the same bit width as the target bit width of the register area.
Step S632, determining whether a register area in an idle state exists in the target group physical register; if so, executing step S633, and if not, executing step S634.
Step S633, determining the register area in the idle state as an idle register area, and taking the idle register area as a target register area.
Step S634, determining an alternative group physical register from the plurality of groups of physical registers.
In the embodiment of the application, when the target group physical register currently has no register area in an idle state, an alternative group physical register can be determined from the plurality of groups of physical registers. The alternative group physical register is a group of physical registers in the physical register file whose register-area bit width can form the target bit width and which has a register area in an idle state.
It should be noted that the alternative group physical register may be selected from the physical register groups whose register-area bit width is greater than the target bit width, or from the physical register groups whose register-area bit width is smaller than the target bit width. As an optional implementation, when the target group physical register currently has no register area in an idle state, the embodiment of the present application may search the physical registers whose register-area bit width is greater than the target bit width, in the direction of increasing register-area bit width, for the first alternative group physical register that has a register area in an idle state; or search the physical registers whose register-area bit width is smaller than the target bit width, in the direction of decreasing register-area bit width, for the first alternative group physical register that has a register area in an idle state.
Step S635, if the bit width of the free register area of the alternative group physical register is smaller than the target bit width, merging the plurality of free register areas in the alternative group physical register, and determining the merged register area as the target register area.
Step S636, if the bit width of the register area of the alternative group physical register is greater than the target bit width, dividing the register area of the alternative group physical register to obtain a target register area whose bit width is the same as the target bit width.
When the bit width of the register area of the alternative group physical register is greater than the target bit width, the embodiment of the application can further divide the register area of the alternative group physical register to obtain a register area whose bit width is the same as the target bit width, and use the divided register area as the target register area. As an optional implementation, the embodiment of the present application may divide a register area in an idle state selected from the alternative group physical register so that the bit width of the divided register area is the same as the target bit width, and determine the divided register area as the target register area.
For example, assuming that the target bit width is 32 bits, if an alternative group physical register is found in the direction of increasing the bit width of the register area, and the bit width of the register area corresponding to the alternative group physical register is 64 bits, the embodiment of the present application may divide the 64-bit free register area in the alternative group physical register into 2 32-bit free register areas, so that one 32-bit register area may be used as the target register area, and the other 32-bit register area may be occupied by other instructions, thereby implementing efficient utilization of resources of the physical register.
When the bit width of the register area of the alternative group physical register is smaller than the target bit width, the embodiment of the application can merge a plurality of idle register areas in the alternative group physical register to obtain a register area whose bit width is the same as the target bit width. As an optional implementation, the embodiment of the present application may merge a plurality of idle register areas selected from the alternative group physical register so that the bit width of the merged register area is the same as the target bit width, and determine the merged register area as the target register area.
For example, assuming that the target bit width is 32 bits, if an alternative set physical register is found in the direction of decreasing the bit width of the register area, and the bit width of the register area corresponding to the found alternative set physical register is 16 bits, two 16-bit-wide idle register areas in the alternative set physical register may be merged into one 32-bit-wide register area, and the merged register area may be used as the target register area.
That is, when the bit width of the register area of the alternative group physical register is greater than the target bit width, the register area of the alternative group physical register may be divided so that the bit width of the divided register area equals the target bit width; when the bit width of the register area of the alternative group physical register is smaller than the target bit width, register areas of the alternative group physical register may be merged so that the bit width of the merged register area equals the target bit width. The register area obtained by further dividing or merging in the alternative group physical register can then be used as the target register area, so that the resources in the physical registers are flexibly integrated and the resource utilization efficiency of the physical registers is improved.
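The flow of fig. 6 can be sketched in software as follows. Several points are assumptions made only for this sketch: idle register areas are kept in a dictionary keyed by area width; wider groups are tried (in increasing width) before narrower groups (in decreasing width), whereas the text presents these two search directions as alternatives; and the unused pieces of a divided area are returned to the idle list of the target width so that other instructions can take them. The function name `allocate` is hypothetical.

```python
def allocate(free, target):
    """free maps area width -> list of idle areas (register, low bit, width).

    Returns the list of areas forming the target register area, or None.
    """
    # S632/S633: an idle area of exactly the target width.
    if free.get(target):
        return [free[target].pop(0)]
    # S634 + S636: search wider groups in increasing width, then divide.
    for width in sorted(w for w in free if w > target):
        if free[width] and width % target == 0:
            reg, low, _ = free[width].pop(0)
            pieces = [(reg, low + k * target, target)
                      for k in range(width // target)]
            # Assumption: leftover pieces become idle areas of the target width.
            free.setdefault(target, []).extend(pieces[1:])
            return [pieces[0]]
    # S634 + S635: search narrower groups in decreasing width, then merge.
    for width in sorted((w for w in free if w < target), reverse=True):
        needed = target // width
        if target % width == 0 and len(free[width]) >= needed:
            return [free[width].pop(0) for _ in range(needed)]
    return None  # nothing available; the caller would wait for a release


free = {64: [(0, 0, 64)], 32: [], 16: [(32, 0, 16), (32, 16, 16)], 8: []}
print(allocate(free, 32))  # divides the 64-bit area
print(allocate(free, 32))  # reuses the leftover 32-bit half
print(allocate(free, 32))  # merges two 16-bit areas
```

With the sample state shown, three successive 32-bit requests are served by division, by the leftover half, and by merging, in that order.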
Further, if the target group physical register has no register area in an idle state, an alternative group physical register whose register-area bit width can form the target bit width may be selected from the plurality of groups of physical registers. If the alternative group physical register has no register area in an idle state either, then after an occupied register area in the target group physical register or the alternative group physical register is released, a register area in an idle state may be selected from the released register areas as the free register area. In some embodiments, regardless of whether the released register area belongs to the target group physical register or the alternative group physical register, a free register area may be determined based on the released register area, and the target register area may be determined based on that free register area.
After the target register area is determined, the register area may be allocated to the architectural register in the renaming stage, so as to implement mapping between the architectural register and the target register area in the physical register, and the mapping relationship between the architectural register and the target register area may be stored in the renaming table.
Fig. 7 is an alternative exemplary diagram of a read result of the rename table in the embodiment of the application. The rename table is used for recording the mapping relationship between the target register area and the architectural register. In some embodiments, the rename table may record the number of the physical register corresponding to the target register region, the number of the architectural register, and a read identifier of the target register region.
Referring to fig. 7, the read result of the rename table may be as shown, and accordingly, the read result includes at least the number of the physical register, the number of the architectural register, and the read identifier of the target register region. In some embodiments, the read result of the rename table may also include information such as target bit width, whether it is the last rename table entry, and so on. In the embodiment of the application, the reading identifier is used for indicating a bit width range of the target register area for reading data in the physical register; the bit width range of the read data includes all or a portion of the bit width range of the physical register.
According to the embodiment of the application, the read identifier can be set for a register area according to the number of register areas set in one physical register and the bit range of each register area within the physical register. Referring to fig. 7, when the target bit width is 32 bits and one physical register is divided into 2 register areas of 32 bits, if the physical register numbered 15 has a free register area available for allocation, the read identifier may be 01 or 10: a read identifier of 01 indicates that the register is divided and the lower 32-bit register area of the physical register is read, and a read identifier of 10 indicates that the register is divided and the upper 32-bit register area of the physical register is read. In some embodiments, the rename table includes a plurality of rename table entries; one rename table entry records the mapping relationship between the number of one architectural register and the number of one physical register, and the number of one architectural register is mapped to the numbers of one or more physical registers through one or more rename table entries.
When an architectural register is mapped to rename table entries, identification data may be set for the last rename table entry corresponding to the number of the architectural register, so that when the rename table is searched, the identification data indicates the last rename table entry corresponding to that number; the last rename table entry records at least the number of the last physical register to which the number of the architectural register is mapped. The identification data may be set in the last rename table entry itself to mark the current entry as the last rename table entry.
According to the embodiment of the application, by setting the read identifier in the rename table, a physical register area of the corresponding bit width can be dynamically selected for allocation, the mapping relationship between the architectural register and the register area in the physical register can be conveniently looked up through the read identifier, and accurate mapping between the architectural register and the physical register is ensured. Identification data is also set in the rename table entry, so that the last rename table entry corresponding to the number of the architectural register is indicated by the identification data when the rename table is searched, which further ensures accurate mapping between the architectural register and the target physical register area when the rename table is read.
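A rename table with a read identifier and a last-entry marker can be modelled as follows. This is a sketch under assumptions: the table is a plain list scanned in order, the field names are hypothetical, and only the two read identifier values given in the text (01 for the lower 32 bits, 10 for the upper 32 bits) are used. The text does not specify the table layout or the encodings for other divisions.

```python
from dataclasses import dataclass


@dataclass
class RenameEntry:
    arch: int      # number of the architectural register
    phys: int      # number of the physical register
    read_id: int   # read identifier: 0b01 lower 32 bits, 0b10 upper 32 bits
    last: bool     # identification data: True on the last entry for `arch`


def lookup(table, arch):
    """Collect the entries mapped to `arch`, stopping at the one marked last."""
    found = []
    for entry in table:
        if entry.arch != arch:
            continue
        found.append(entry)
        if entry.last:
            break
    return found


table = [
    RenameEntry(arch=1, phys=15, read_id=0b01, last=True),
    RenameEntry(arch=2, phys=16, read_id=0b01, last=False),
    RenameEntry(arch=2, phys=17, read_id=0b10, last=True),
]
print([(e.phys, bin(e.read_id)) for e in lookup(table, 2)])
```

Here architectural register 2 is mapped through two entries, and the marker on the second entry tells the lookup that no further entries need to be read.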
An embodiment of the present application further provides a processor, a structure of which can refer to fig. 1, where the processor at least includes:
a decode unit to determine a target bit width of an architectural register used by a current instruction;
the renaming unit is used for determining an idle register area with the bit width matched with the target bit width from the physical register file; determining a target register area based on the free register area, and allocating the target register area to the architecture register;
the physical register file comprises a plurality of groups of physical registers, the register area is a unit area used for register mapping in the physical registers, and the bit widths of the register areas corresponding to different groups of physical registers are different.
In some embodiments, a bit width of a register region corresponding to a set of physical registers is adapted to a required bit width of an architectural register, the architectural register having a plurality of required bit widths; the plurality of required bit widths comprise a maximum bit width of the physical register and at least one sub-bit width divided by the maximum bit width; the bit widths of the register areas corresponding to different groups of physical registers are divided from the maximum bit width to the minimum according to the group sequence of the multiple groups of physical registers.
In some embodiments, the renaming unit, configured to determine from the physical register file a free register region having a bit width that matches the target bit width, may include:
determining target group physical registers with the same bit width as the target bit width of a register area from a plurality of groups of physical registers; and selecting a register area in an idle state from the target group of physical registers as an idle register area.
In some embodiments, the renaming unit to determine the target register region based on the free register region includes: the free register area is used as a target register area.
In some embodiments, the renaming unit being configured to select a register area in an idle state from the target group of physical registers as the free register area may include:
if no register area in an idle state currently exists in the target group of physical registers, waiting for an occupied register area in the target group of physical registers to be released, and then selecting a register area in an idle state from the released register areas as the free register area.
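The exact-match selection and the wait-for-release behaviour described above can be sketched as follows. This is a minimal software model under stated assumptions, not the hardware design: the class `RegisterGroup`, the function `allocate_exact`, and the use of `None` as a "renaming must wait" signal are all hypothetical.

```python
from collections import deque
from typing import Optional


class RegisterGroup:
    """One group of physical registers whose register areas share a bit width."""

    def __init__(self, area_width: int, num_areas: int) -> None:
        self.area_width = area_width
        self.free = deque(range(num_areas))  # ids of areas in the idle state

    def release(self, area_id: int) -> None:
        """Return an occupied area to the idle state."""
        self.free.append(area_id)


def allocate_exact(groups: dict[int, RegisterGroup],
                   target_width: int) -> Optional[tuple[int, int]]:
    """Return (area_width, area_id), or None if renaming must wait."""
    group = groups[target_width]      # target group: same width as the target
    if not group.free:
        return None                   # hypothetical stall signal
    return (group.area_width, group.free.popleft())


groups = {w: RegisterGroup(w, num_areas=1) for w in (512, 256, 128)}
first = allocate_exact(groups, 256)   # takes the only idle 256-bit area
second = allocate_exact(groups, 256)  # no idle area left, so renaming waits
groups[256].release(first[1])         # the occupied area is released
third = allocate_exact(groups, 256)   # the released area is now selected
```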
In some embodiments, the renaming unit being configured to determine, from the physical register file, a free register area whose bit width matches the target bit width may include:
determining, from the plurality of groups of physical registers, a target group of physical registers whose register area bit width is the same as the target bit width; if no register area in an idle state currently exists in the target group of physical registers, determining, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists; and selecting a register area in an idle state from the alternative group of physical registers as the free register area.
In some embodiments, the renaming unit being configured to determine, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists may include: searching, in the direction of increasing register area bit width, among the physical registers whose register area bit width is larger than the target bit width, for the first alternative group of physical registers in which a register area in an idle state exists.
In some embodiments, the renaming unit being configured to determine the target register area based on the free register area includes: dividing the register area in an idle state selected from the alternative group of physical registers, so that the bit width of the divided register area is the same as the target bit width; and determining the divided register area as the target register area.
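The upward search followed by division can be sketched as below. The representation of an idle area as `(physical_register_number, low_bit)`, the function name `allocate_by_splitting`, and the choice to keep the low part of the divided area are assumptions made for illustration; how the remainder of a divided area is handled is not modelled here.

```python
from typing import Optional

# Maps a register-area bit width to the idle areas of that group, each given
# as (physical_register_number, low_bit). All values are illustrative.
FreeAreas = dict[int, list[tuple[int, int]]]


def allocate_by_splitting(free_areas: FreeAreas,
                          target_width: int) -> Optional[tuple[int, int, int]]:
    """Return (physical_register_number, low_bit, width) of the target area."""
    # Visit only groups with wider areas, in order of increasing bit width.
    larger = sorted(w for w in free_areas if w > target_width)
    for width in larger:
        if free_areas[width]:
            preg, low_bit = free_areas[width].pop(0)
            # Divide the wider area; the low target_width bits become the
            # target area (assumption). The remainder is not modelled.
            return (preg, low_bit, target_width)
    return None  # no wider group has an idle area


free_areas = {512: [(7, 0)], 256: [(4, 0)], 128: []}
target = allocate_by_splitting(free_areas, 128)  # the 128-bit group is empty
```

In this usage the 256-bit group is reached before the 512-bit group, so the narrowest sufficient area is divided first and the 512-bit area stays idle.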
In some embodiments, the renaming unit being configured to determine, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists may include: searching, in the direction of decreasing register area bit width, among the physical registers whose register area bit width is smaller than the target bit width, for an alternative group of physical registers in which register areas in an idle state exist.
In some embodiments, the renaming unit being configured to determine the target register area based on the free register area may include: merging a plurality of register areas in an idle state selected from the alternative group of physical registers, so that the bit width of the merged register area is the same as the target bit width; and determining the merged register area as the target register area.
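The downward search followed by merging can be sketched in the same style. The sketch assumes that a group qualifies only if it holds enough idle areas to add up to the target bit width; that condition, the `(physical_register_number, low_bit)` representation, and the name `allocate_by_merging` are illustrative assumptions rather than details taken from the application.

```python
from typing import Optional

# Maps a register-area bit width to the idle areas of that group, each given
# as (physical_register_number, low_bit). All values are illustrative.
FreeAreas = dict[int, list[tuple[int, int]]]


def allocate_by_merging(free_areas: FreeAreas,
                        target_width: int) -> Optional[list[tuple[int, int, int]]]:
    """Return the merged areas as (physical_register_number, low_bit, width)."""
    # Visit only groups with narrower areas, in order of decreasing bit width.
    smaller = sorted((w for w in free_areas if w < target_width), reverse=True)
    for width in smaller:
        needed = target_width // width
        if target_width % width == 0 and len(free_areas[width]) >= needed:
            parts = [free_areas[width].pop(0) for _ in range(needed)]
            return [(preg, low_bit, width) for preg, low_bit in parts]
    return None  # no narrower group can form the target bit width


free_areas = {
    512: [],
    256: [(4, 0)],                              # only one idle area: not enough
    128: [(3, 0), (3, 128), (5, 0), (5, 128)],  # four idle areas: 4 x 128 = 512
}
merged = allocate_by_merging(free_areas, 512)
```

A merged target area may span more than one physical register, which is why one architectural register number may need to map to several physical register numbers.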
In some embodiments, the renaming unit may be further configured to:
if no register area in an idle state currently exists in the target group of physical registers, and no register area in an idle state exists in any alternative group of physical registers whose register area bit width can form the target bit width, wait for an occupied register area in the target group of physical registers or in the alternative group of physical registers to be released, and then select a register area in an idle state from the released register areas as the free register area.
In some embodiments, the renaming unit being configured to allocate the target register area to the architectural register comprises: establishing, in the rename table, a mapping relation between the target register area and the architectural register.
In some embodiments, the renaming unit being configured to establish, in the rename table, the mapping relation between the target register area and the architectural register may include:
establishing, in the rename table, a mapping relation between the number of the physical register corresponding to the target register area and the number of the architectural register, and setting a read identifier for the target register area;
wherein the read identifier indicates the bit width range within the physical register from which the data of the target register area is read; the bit width range covers all or part of the bit width range of the physical register.
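The role of the read identifier can be illustrated with a short sketch. The encoding used here, a pair of `low_bit` and `width` fields, is an assumption made for the example; the application only states that the identifier indicates a bit width range within the physical register. The names `RenameEntry` and `read_area` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameEntry:
    arch_reg: int   # number of the architectural register
    phys_reg: int   # number of the physical register holding the target area
    low_bit: int    # read identifier (assumed encoding): lowest bit of the range
    width: int      # read identifier (assumed encoding): number of bits to read


def read_area(phys_value: int, entry: RenameEntry) -> int:
    """Return only the bits of the physical register selected by the entry."""
    mask = (1 << entry.width) - 1
    return (phys_value >> entry.low_bit) & mask


# A 16-bit physical register is used for brevity: upper half 0xAB, lower 0xCD.
upper = RenameEntry(arch_reg=2, phys_reg=9, low_bit=8, width=8)   # part of it
whole = RenameEntry(arch_reg=3, phys_reg=9, low_bit=0, width=16)  # all of it
upper_value = read_area(0xABCD, upper)
whole_value = read_area(0xABCD, whole)
```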
In some embodiments, the rename table comprises a plurality of rename table entries; one rename table entry records the mapping relation between the number of one architectural register and the number of one physical register, and the number of one architectural register is mapped to the numbers of one or more physical registers through one or more rename table entries. The renaming unit may be further configured to:
determine the last rename table entry corresponding to the number of the architectural register, and set identification data for that last rename table entry in the rename table, so that when the rename table is searched, the identification data indicates the last rename table entry corresponding to the number of the architectural register; the last rename table entry records at least the number of the last physical register to which the number of the architectural register is mapped.
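A lookup that relies on the identification data might look like the following sketch. It assumes the identification data is a single flag per entry and that each architectural register is mapped only once (stale mappings from earlier renames are not handled); the names `RenameEntry`, `map_arch_reg`, and `lookup` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RenameEntry:
    arch_reg: int           # number of the architectural register
    phys_reg: int           # number of one mapped physical register
    is_last: bool = False   # identification data (assumed to be one flag)


def map_arch_reg(table: list[RenameEntry], arch_reg: int,
                 phys_regs: list[int]) -> None:
    """Append one entry per physical register and flag the last one."""
    for i, preg in enumerate(phys_regs):
        last = (i == len(phys_regs) - 1)
        table.append(RenameEntry(arch_reg, preg, is_last=last))


def lookup(table: list[RenameEntry], arch_reg: int) -> list[int]:
    """Collect physical register numbers, stopping at the flagged last entry."""
    found = []
    for entry in table:
        if entry.arch_reg == arch_reg:
            found.append(entry.phys_reg)
            if entry.is_last:
                break   # no need to scan the rest of the table
    return found


table: list[RenameEntry] = []
map_arch_reg(table, 1, [10, 11])   # one architectural register, two entries
map_arch_reg(table, 2, [12])       # one architectural register, one entry
regs_for_1 = lookup(table, 1)
regs_for_2 = lookup(table, 2)
```

The flag lets the search stop as soon as the complete mapping has been gathered, instead of scanning every entry.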
An embodiment of the present application further provides a chip, which may include the above processor core. For the functions of the hardware devices in the processor core, refer to the descriptions of the corresponding parts above.
An embodiment of the present application further provides an electronic device, which may include the above chip; the electronic device may be a terminal device or a cloud server device.
Multiple embodiments of the present application have been described above. The alternatives described in the embodiments may be combined and cross-referenced with one another where they do not conflict, thereby extending to further possible embodiments, all of which are to be regarded as embodiments disclosed by the present application.
Although the embodiments of the present application are disclosed above, the present application is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present disclosure, and it is intended that the scope of the present disclosure be defined by the appended claims.

Claims (16)

1. A register mapping method, comprising:
determining the target bit width actually used by the architectural register corresponding to the current instruction;
determining, from a physical register file, a free register area whose bit width matches the target bit width; wherein the physical register file comprises a plurality of groups of physical registers, a register area is the unit area used for register mapping within a physical register, and the register areas corresponding to different groups of physical registers have different bit widths;
determining a target register area based on the free register area, and allocating the target register area to the architectural register.
2. The register mapping method according to claim 1, wherein the bit width of the register area corresponding to a set of physical registers is adapted to a required bit width of an architectural register, the architectural register having a plurality of required bit widths.
3. The register mapping method according to claim 2, wherein the plurality of required bit widths include a maximum bit width of the physical register and at least one sub-bit width obtained by dividing the maximum bit width; and, following the group order of the plurality of groups of physical registers, the bit widths of the register areas corresponding to the different groups of physical registers decrease from the maximum bit width downward.
4. The register mapping method according to claim 1, wherein said determining, from the physical register file, a free register area whose bit width matches the target bit width comprises:
determining, from the plurality of groups of physical registers, a target group of physical registers whose register area bit width is the same as the target bit width;
selecting a register area in an idle state from the target group of physical registers as the free register area;
and said determining a target register area based on the free register area comprises:
taking the free register area as the target register area.
5. The register mapping method according to claim 4, wherein the selecting a free-state register area from the target set of physical registers as the free register area comprises:
if no register area in an idle state currently exists in the target group of physical registers, waiting for an occupied register area in the target group of physical registers to be released, and then selecting a register area in an idle state from the released register areas as the free register area.
6. The register mapping method according to claim 1, wherein said determining, from the physical register file, a free register area whose bit width matches the target bit width comprises:
determining, from the plurality of groups of physical registers, a target group of physical registers whose register area bit width is the same as the target bit width;
if no register area in an idle state currently exists in the target group of physical registers, determining, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists;
selecting a register area in an idle state from the alternative group of physical registers as the free register area.
7. The register mapping method according to claim 6, wherein said determining, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists comprises:
searching, in the direction of increasing register area bit width, among the physical registers whose register area bit width is larger than the target bit width, for an alternative group of physical registers in which a register area in an idle state exists;
and said determining a target register area based on the free register area comprises:
dividing the register area in an idle state selected from the alternative group of physical registers, so that the bit width of the divided register area is the same as the target bit width; and determining the divided register area as the target register area.
8. The register mapping method according to claim 6, wherein said determining, from the plurality of groups of physical registers, an alternative group of physical registers whose register area bit width can form the target bit width and in which a register area in an idle state exists comprises:
searching, in the direction of decreasing register area bit width, among the physical registers whose register area bit width is smaller than the target bit width, for an alternative group of physical registers in which register areas in an idle state exist;
and said determining a target register area based on the free register area comprises:
merging a plurality of register areas in an idle state selected from the alternative group of physical registers, so that the bit width of the merged register area is the same as the target bit width; and determining the merged register area as the target register area.
9. The register mapping method according to claim 6, further comprising:
if no register area in an idle state currently exists in the target group of physical registers, and no register area in an idle state exists in any alternative group of physical registers whose register area bit width can form the target bit width, waiting for an occupied register area in the target group of physical registers or in the alternative group of physical registers to be released, and then selecting a register area in an idle state from the released register areas as the free register area.
10. The register mapping method according to any of claims 1-9, wherein said allocating the target register region to the architectural register comprises:
establishing, in a rename table, a mapping relation between the target register area and the architectural register.
11. The register mapping method according to claim 10, wherein said establishing, in the rename table, the mapping relation between the target register area and the architectural register comprises:
establishing, in the rename table, a mapping relation between the number of the physical register corresponding to the target register area and the number of the architectural register, and setting a read identifier for the target register area;
wherein the read identifier indicates the bit width range within the physical register from which the data of the target register area is read; the bit width range covers all or part of the bit width range of the physical register.
12. The register mapping method according to claim 11, wherein the rename table includes a plurality of rename table entries, one rename table entry records the mapping relation between the number of one architectural register and the number of one physical register, and the number of one architectural register is mapped to the numbers of one or more physical registers through one or more rename table entries; the method further comprises:
determining the last rename table entry corresponding to the number of the architectural register, and setting identification data for that last rename table entry in the rename table, so that when the rename table is searched, the identification data indicates the last rename table entry corresponding to the number of the architectural register; the last rename table entry records at least the number of the last physical register to which the number of the architectural register is mapped.
13. A processor, comprising:
a decode unit, configured to determine the target bit width of the architectural register used by the current instruction;
a renaming unit, configured to determine, from a physical register file, a free register area whose bit width matches the target bit width; to determine a target register area based on the free register area; and to allocate the target register area to the architectural register;
wherein the physical register file comprises a plurality of groups of physical registers, a register area is the unit area used for register mapping within a physical register, and the register areas corresponding to different groups of physical registers have different bit widths.
14. The processor of claim 13, wherein the bit width of the register area corresponding to a group of physical registers is adapted to one required bit width of an architectural register, the architectural register having a plurality of required bit widths; the plurality of required bit widths comprise a maximum bit width of the physical register and at least one sub-bit width obtained by dividing the maximum bit width; and, following the group order of the plurality of groups of physical registers, the bit widths of the register areas corresponding to the different groups of physical registers decrease from the maximum bit width downward.
15. A chip comprising a processor as claimed in any one of claims 13 to 14.
16. An electronic device comprising the chip of claim 15.
CN202111342880.7A 2021-11-12 2021-11-12 Register mapping method, processor, chip and electronic equipment Pending CN113961248A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111342880.7A CN113961248A (en) 2021-11-12 2021-11-12 Register mapping method, processor, chip and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111342880.7A CN113961248A (en) 2021-11-12 2021-11-12 Register mapping method, processor, chip and electronic equipment

Publications (1)

Publication Number Publication Date
CN113961248A true CN113961248A (en) 2022-01-21

Family

ID=79470380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111342880.7A Pending CN113961248A (en) 2021-11-12 2021-11-12 Register mapping method, processor, chip and electronic equipment

Country Status (1)

Country Link
CN (1) CN113961248A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116257350A (en) * 2022-09-06 2023-06-13 进迭时空(杭州)科技有限公司 Renaming grouping device for RISC-V vector register
CN116257350B (en) * 2022-09-06 2023-12-08 进迭时空(杭州)科技有限公司 Renaming grouping device for RISC-V vector register
CN115934166A (en) * 2022-11-08 2023-04-07 济南新语软件科技有限公司 Efficient operation method and system based on dynamically-configurable register

Similar Documents

Publication Publication Date Title
US11048517B2 (en) Decoupled processor instruction window and operand buffer
EP3314402B1 (en) Age-based management of instruction blocks in a processor instruction window
US9697004B2 (en) Very-long instruction word (VLIW) processor and compiler for executing instructions in parallel
US10768930B2 (en) Processor supporting arithmetic instructions with branch on overflow and methods
US8504808B2 (en) Cache memory apparatus having internal ALU
US5951670A (en) Segment register renaming in an out of order processor
CN113961248A (en) Register mapping method, processor, chip and electronic equipment
US20160378484A1 (en) Mapping instruction blocks based on block size
KR20180021165A (en) Bulk allocation of instruction blocks to processor instruction windows
CN110825437B (en) Method and apparatus for processing data
US7093100B2 (en) Translation look aside buffer (TLB) with increased translational capacity for multi-threaded computer processes
US20030065909A1 (en) Deferral of dependent loads until after execution of colliding stores
CN111638911A (en) Processor, instruction execution equipment and method
CN114153500A (en) Instruction scheduling method, instruction scheduling device, processor and storage medium
US6871343B1 (en) Central processing apparatus and a compile method
US7493481B1 (en) Direct hardware processing of internal data structure fields
CN115640047A (en) Instruction operation method and device, electronic device and storage medium
CN115576608A (en) Processor core, processor, chip, control equipment and instruction fusion method
KR20010053623A (en) Processor configured to selectively free physical registers upon retirement of instructions
US10534614B2 (en) Rescheduling threads using different cores in a multithreaded microprocessor having a shared register pool
US11321088B2 (en) Tracking load and store instructions and addresses in an out-of-order processor
CN112184536B (en) Method, apparatus, device and medium for processing image data based on GEMM
CN117519792A (en) Register release method, processor, chip and electronic equipment
CN114924793A (en) Processing unit, computing device, and instruction processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination