CN115878187A - Processor instruction processing apparatus and method supporting compressed instructions - Google Patents


Info

Publication number
CN115878187A
Authority
CN
China
Prior art keywords
instruction
instructions
buffer
address
hit
Prior art date
Legal status
Granted
Application number
CN202310057209.0A
Other languages
Chinese (zh)
Other versions
CN115878187B (en)
Inventor
郇丹丹
李祖松
Current Assignee
Beijing Micro Core Technology Co ltd
Original Assignee
Beijing Micro Core Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Micro Core Technology Co., Ltd.
Priority to CN202310057209.0A
Publication of CN115878187A
Application granted; publication of CN115878187B
Legal status: Active

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The device comprises a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module, and a processing module. The program counter reads the storage array of the N-way buffered instructions and the instruction tags corresponding to the storage array; the comparison module determines whether the instruction tags of the N-way buffered instructions hit, so as to select the hit-way buffered instruction corresponding to the hit instruction tag; the instruction length unification module expands the storage array of the hit-way buffered instruction into standard instructions of unified length; the pre-decoding module pre-decodes the standard instructions; and the processing module performs pipeline processing on the unified encoding information and the valid-instruction bit vector obtained by pre-decoding. The N-way buffered instructions are thus converted into standard instructions of unified length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.

Description

Processor instruction processing apparatus and method supporting compressed instructions
Technical Field
The present application relates to the field of processor technologies, and in particular, to a processor instruction processing apparatus and method supporting a compressed instruction.
Background
In the instruction set architectures (ISAs) of modern processors, compressed instructions are designed to reduce program code size and increase code density, thereby lowering the instruction cache miss rate, improving processor performance, and reducing power consumption, area, and cost.
Disclosure of Invention
A first aspect of the embodiments of the present application provides a processor instruction processing apparatus supporting compressed instructions. The apparatus includes a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module, and a processing module, where: the program counter is used for reading a storage array of N-way buffered instructions and the instruction tags corresponding to the storage array; the first instruction memory is used for storing the storage array; the second instruction memory is used for storing the instruction tags; the comparison module is used for determining whether the instruction tags of the N-way buffered instructions hit, so as to select the hit-way buffered instruction corresponding to the hit instruction tag; the instruction length unification module is used for receiving the hit-way buffered instruction sent by the comparison module and expanding the storage array of the hit-way buffered instruction into standard instructions of unified length, where the standard instructions comprise both compressed instructions and normal instructions; the pre-decoding module is used for pre-decoding the standard instructions to obtain the unified encoding information corresponding to the standard instructions and a bit vector of valid instructions; and the processing module is used for sending the unified encoding information and the valid-instruction bit vector to the corresponding pipeline stages of the processor for processing.
In an embodiment of the present application, the instruction length unification module is further configured to latch the last preset-threshold bits of instruction information in the hit-way buffered instruction and use that information as the next input of the instruction length unification module.
In one embodiment of the present application, the apparatus further comprises an address calculation module, wherein: the address calculation module is used for obtaining the initial instruction address of the first of the N-way buffered instructions.
In one embodiment of the present application, the apparatus further comprises a redirect-address calculation module, wherein: the redirect-address calculation module is used for sequentially calculating, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address as the N-way buffered instructions are executed in order, the target instruction address corresponding to the execution of each of the N-way buffered instructions, and for taking each target instruction address as the redirect address corresponding to that buffered instruction.
In one embodiment of the present application, the apparatus further comprises branch predictors at each stage, wherein: the branch predictor at each stage is used for receiving the target instruction address corresponding to each of the N-way buffered instructions and taking each target instruction address as an address required by the branch predictor at that stage.
The application provides a processor instruction processing apparatus supporting compressed instructions, comprising a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module, and a processing module. The program counter reads the storage array of the N-way buffered instructions and the instruction tags corresponding to the storage array; the comparison module determines whether the instruction tags of the N-way buffered instructions hit, so as to select the hit-way buffered instruction corresponding to the hit instruction tag; the instruction length unification module expands the storage array of the hit-way buffered instruction into standard instructions of unified length; the standard instructions are pre-decoded by the pre-decoding module; and the unified encoding information obtained by pre-decoding and the valid-instruction bit vector are processed in the pipeline stages by the processing module. The N-way buffered instructions are thereby converted into standard instructions of unified length for pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
A second aspect of the present application provides a processor instruction processing method supporting compressed instructions, where the method includes: reading a storage array of N-way buffered instructions and the instruction tags corresponding to the storage array; determining whether the instruction tags of the N-way buffered instructions hit, so as to select the hit-way buffered instruction corresponding to the hit instruction tag, and expanding the storage array of the hit-way buffered instruction into standard instructions of unified length, where the standard instructions comprise both compressed instructions and normal instructions; pre-decoding the standard instructions to obtain the unified encoding information corresponding to the standard instructions and a bit vector of valid instructions; and sending the unified encoding information and the valid-instruction bit vector to the pipeline stages that need them for processing.
In an embodiment of the present application, after pre-decoding the standard instructions to obtain the unified encoding information and the valid-instruction bit vector corresponding to the standard instructions, the method further includes: latching the last preset-threshold bits of instruction information in the hit-way buffered instruction, so that this instruction information is used as input together with the next hit-way buffered instruction.
In one embodiment of the present application, the method further comprises: obtaining the initial instruction address of the first of the N-way buffered instructions.
In one embodiment of the present application, the method further comprises: sequentially calculating, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address as the N-way buffered instructions are executed in order, the target instruction address corresponding to the execution of each of the N-way buffered instructions, and taking each target instruction address as the redirect address corresponding to that buffered instruction.
In one embodiment of the present application, the method further comprises: receiving the target instruction address corresponding to each of the N-way buffered instructions, and taking each target instruction address as an address required for branching by the branch predictor at each stage.
The application provides a processor instruction processing method supporting compressed instructions, which includes reading a storage array of N-way buffered instructions and the instruction tags corresponding to the storage array; determining whether the instruction tags of the N-way buffered instructions hit, selecting the hit-way buffered instruction corresponding to the hit instruction tag, and expanding the storage array of the hit-way buffered instruction into standard instructions of unified length, where the standard instructions comprise both compressed instructions and normal instructions; pre-decoding the standard instructions to obtain the unified encoding information corresponding to the standard instructions and a bit vector of valid instructions; and sending the unified encoding information and the valid-instruction bit vector to the pipeline stages that need them for processing. The N-way buffered instructions are thereby converted into standard instructions of unified length for pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
An embodiment of a third aspect of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the second aspect.
A fourth aspect of the present application is directed to a non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of the second aspect.
Other effects of the above alternatives will be described below with reference to specific embodiments.
Drawings
FIG. 1 is a block diagram of a processor instruction processing apparatus supporting compressed instructions according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a processor instruction processing method supporting compressed instructions, according to one embodiment;
FIG. 3 is an exemplary diagram of a hit way buffer instruction of one embodiment of the present application;
FIG. 4 is a diagram of an example unified standard instruction of one embodiment of the present application;
FIG. 5 is a diagram illustrating an example of a concatenation of a first instruction length in a unified hit way buffer instruction according to one embodiment of the present application;
FIG. 6 is an exemplary diagram of a first instruction length in an extended hit way buffer instruction according to one embodiment of the present application;
FIG. 7 is a diagram of an example of a second instruction length in an extended hit way buffer instruction according to one embodiment of the present application;
FIG. 8 is an exemplary diagram of the length of the last instruction in an extended hit way buffer instruction according to one embodiment of the present application;
FIG. 9 is an exemplary diagram of the redirect-address calculation of one embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
A processor instruction processing apparatus supporting a compressed instruction according to an embodiment of the present application is described below with reference to the drawings.
Fig. 1 is a schematic structural diagram of a processor instruction processing apparatus supporting compressed instructions according to an embodiment of the present application.
As shown in fig. 1, the processor instruction processing apparatus supporting compressed instructions comprises: a program counter 101, a first instruction memory 102, a second instruction memory 103, a comparison module 104, an instruction length unification module 105, a pre-decoding module 106, and a processing module 107, wherein:
in some embodiments, program counter 101 is in communication with first instruction memory 102 and second instruction memory 103, respectively, program counter 101 being configured to read a memory array of N-way buffered instructions and an instruction tag corresponding to the memory array.
Here, the N-way buffered instructions are N sequential ways of buffered instructions.
In addition, after the first of the N-way buffered instructions is fetched, the program counter 101 continues to advance, since the program counter 101 points to the next buffered instruction in the sequence.
In some embodiments, the first instruction memory 102 is used to receive and store the memory array sent by the program counter 101.
In some embodiments, the second instruction memory 103 receives and stores the instruction tag sent by the program counter 101.
The first instruction memory 102 and the second instruction memory 103 may be the same instruction memory, and are used for storing different data.
In some embodiments, the comparison module 104 is in communication with the first instruction memory 102 and the second instruction memory 103, and the comparison module 104 is specifically configured to determine whether an instruction tag of the N-way buffer instruction hits, so as to select a hit-way buffer instruction corresponding to the hit instruction tag.
Specifically, if no instruction tag of the N-way buffered instructions matches, the access is called a miss; if an instruction tag matches, it is called a hit, and the hit-way buffered instruction corresponding to the matching instruction tag is selected.
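As an illustration of the tag comparison just described, the following Python sketch models hit/miss selection in an N-way set-associative instruction cache. The field widths (6-bit offset, 6-bit index), array layout, and function names are illustrative assumptions, not the patent's implementation.

```python
# Illustrative sketch (not the patent's logic): selecting the hit way by
# comparing the tag portion of the fetch address against the stored
# instruction tags of an N-way set-associative instruction cache.

def split_address(addr, offset_bits=6, index_bits=6):
    """Split a fetch address into (tag, index, offset). Widths are assumed."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

def select_hit_way(fetch_addr, tag_array, data_array):
    """Compare the fetch tag with all N ways; return the hit-way
    buffered instructions, or None on a miss."""
    tag, index, _ = split_address(fetch_addr)
    for way in range(len(tag_array)):
        entry = tag_array[way][index]
        if entry is not None and entry == tag:   # instruction tag hit
            return data_array[way][index]        # hit-way buffered instruction
    return None                                  # miss: fill from next level

# Usage: a 2-way cache with 64 sets; fill one line in way 1.
N_WAYS, N_SETS = 2, 64
tags = [[None] * N_SETS for _ in range(N_WAYS)]
data = [[None] * N_SETS for _ in range(N_WAYS)]
t, i, _ = split_address(0x8000_0040)
tags[1][i], data[1][i] = t, b"\x13\x00\x00\x00" * 8
assert select_hit_way(0x8000_0040, tags, data) is not None  # hit
assert select_hit_way(0x9000_0040, tags, data) is None      # miss
```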
In some embodiments, the instruction length unification module 105 is in communication with the comparison module 104, and the instruction length unification module 105 is specifically configured to receive the hit way buffer instruction output by the comparison module 104, and expand the memory array of the hit way buffer instruction into a standard instruction with a unified length.
Specifically, for example, a 256-bit hit-way buffered instruction contains 32 bytes of instructions, in which both compressed instructions (assumed here to be 16-bit) and normal instructions (assumed here to be 32-bit) are mixed; the compressed instructions and normal instructions are therefore all expanded into standard 32-bit instructions of unified length (64 bytes in total).
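The expansion of a single instruction into the unified 32-bit format can be sketched as follows. This is a hedged Python illustration: it assumes the RISC-V convention, cited later in this description, that a 16-bit parcel whose lowest two bits are not 2'b11 is a compressed instruction, and it zero-extends such parcels to 32 bits.

```python
# Sketch of unified-length expansion for a single instruction slot.
# A parcel whose lowest two bits are 2'b11 is a normal 32-bit
# instruction; anything else is a 16-bit compressed instruction,
# widened to a 32-bit standard slot by zero-padding the upper 16 bits.

def is_compressed(parcel16):
    return (parcel16 & 0b11) != 0b11

def expand_slot(low16, high16):
    """Return (standard32, consumed_parcels) for the instruction
    beginning at the 16-bit parcel `low16`."""
    if is_compressed(low16):
        return low16, 1                    # upper 16 bits stay zero
    return (high16 << 16) | low16, 2       # splice the two halves

# A compressed instruction (low bits != 11) occupies one parcel:
std, n = expand_slot(0x0505, 0x0000)
assert n == 1 and std == 0x0000_0505

# A normal 32-bit instruction (low bits == 11) consumes two parcels:
std, n = expand_slot(0x0513, 0x0010)
assert n == 2 and std == 0x0010_0513
```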
In some embodiments, the pre-decoding module 106 is in communication with the instruction length unifying module 105, and the pre-decoding module 106 is specifically configured to pre-decode the standard instruction to obtain unified encoding information corresponding to the standard instruction and a bit vector of the valid instruction.
Specifically, when the instruction length unification module 105 outputs 32-bit standard instructions of unified length, the standard instructions are pre-decoded by the pre-decoding module 106 to obtain the unified encoding information of the 16 expanded standard instructions and the valid-instruction bit vector.
In some embodiments, processing module 107 is in communication with predecode module 106 for processing the bit vectors of the unified coding information and the valid instructions sent to the corresponding pipeline stages of the processor.
The processing module 107 may be a plurality of modules that need to call bit vectors of unified coding information and valid instructions when performing pipeline processing.
In addition, it is understood that the program counter 101, the first instruction memory 102, the second instruction memory 103, the comparison module 104, the instruction length unification module 105, the pre-decoding module 106, and the processing module 107 can communicate with one another through on-chip wiring, but are not limited thereto.
In some embodiments, the instruction length unification module 105 is further configured to latch the last preset-threshold bits of instruction information in the hit-way buffered instruction as the next input of the instruction length unification module 105. Taking 16-bit compressed instructions and 32-bit normal instructions as an example, the last 16 bits of instruction information in a hit-way buffered instruction may be the first half, i.e., the lower 16 bits, of a standard instruction whose remainder lies in the next hit-way buffered instruction. Because the first half of an instruction at the start of the next hit-way buffered instruction may thus come from the previous one, the instruction length unification module 105 must latch the last 16 bits of instruction information and feed them, together with the next hit-way buffered instruction, into the instruction length unification module 105.
The preset-threshold bits of instruction information may be determined based on the instruction lengths used in the hit-way buffered instruction, but are not limited thereto.
In some embodiments, the apparatus further comprises an address calculation module 108, wherein: the address calculation module 108 is in communication with the program counter 101, and is specifically configured to take the initial instruction address of the first of the N-way buffered instructions as the initial address of the N-way buffered instructions.
In some embodiments, the apparatus further comprises a redirect-address calculation module 109, wherein: the redirect-address calculation module 109 is in communication with the pre-decoding module 106 and the address calculation module 108, respectively, and is specifically configured to sequentially calculate, based on the initial instruction address and the offset (Offset) that increases sequentially from the initial instruction address as the N-way buffered instructions are executed in order, the target instruction address corresponding to the execution of each of the N-way buffered instructions, and to take each target instruction address as the redirect address corresponding to that buffered instruction.
In some embodiments, the apparatus further comprises branch predictors 110 at each stage, wherein: the branch predictor 110 of each stage is in communication with the redirect-address calculation module 109, and is specifically configured to receive the target instruction address corresponding to each of the N-way buffered instructions and to use each target instruction address as an address required for branching by the branch predictor 110 of that stage.
In summary, the present application provides a processor instruction processing apparatus supporting compressed instructions. The apparatus includes a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module, and a processing module. The program counter reads the storage array of the N-way buffered instructions and the instruction tags corresponding to the storage array; the comparison module determines whether an instruction tag of the N-way buffered instructions hits, so as to select the hit-way buffered instruction corresponding to the hit instruction tag; the instruction length unification module expands the storage array of the hit-way buffered instruction into standard instructions of unified length; the pre-decoding module pre-decodes the standard instructions; and the processing module performs pipeline processing on the unified encoding information obtained by pre-decoding and the valid-instruction bit vector. The N-way buffered instructions are thereby converted into standard instructions of unified length for pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
In addition, the present application further provides a processor instruction processing method supporting compressed instructions, whose flowchart is shown in fig. 2. Specifically:
step 201, reading a storage array of the N-way buffer instruction and an instruction tag corresponding to the storage array.
Step 202, determining whether the instruction tag of the N-way buffer instruction is hit, so as to select the hit way buffer instruction corresponding to the hit instruction tag, and extending the memory array of the hit way buffer instruction into a standard instruction with a uniform length, wherein the standard instruction includes a compression instruction and a normal instruction.
In some embodiments, as shown in fig. 3, fig. 3 shows a 32-byte hit-way buffered instruction (i.e., an instruction cache line) before unified-length expansion. A normal instruction that crosses the line boundary is counted with the line containing its upper 16 bits, and in the line containing its lower 16 bits that 16-bit parcel is set to invalid. The shaded portions are normal instructions and the unshaded portions are compressed instructions.
In addition, after the hit-way buffered instruction is expanded into standard instructions of unified length, the result may be, for example, 16 32-bit standard instructions. As shown in fig. 4, the hit-way buffered instruction is expanded into 16 32-bit standard instructions of unified length; the validity of these 16 standard instructions must then be determined, so that only the valid 32-bit standard instructions are pre-decoded, yielding an instruction-valid bit vector and 16 32-bit instructions of unified length.
As shown in FIG. 4, the valid bit vector of the 32-bit standard instructions is 16'b0010_1111_1010_1111, where 1 means valid and 0 means invalid. The lower two bits [241:240] of the top halfword [255:240] of the hit-way buffered instruction are 2'b11, so the last two bytes are half of a standard instruction and belong to the first standard instruction of the next hit-way buffered instruction.
Step 203, pre-decoding the standard instructions to obtain the unified encoding information corresponding to the standard instructions and the valid-instruction bit vector.
In some embodiments, after the standard instructions are pre-decoded to obtain the unified encoding information corresponding to the standard instructions and the valid-instruction bit vector, and again taking 16-bit compressed instructions and 32-bit normal instructions as an example, the last 16 bits (2 bytes) of instruction information in the hit-way buffered instruction (instruction cache line) may be latched, so that these 16 bits are carried into the next hit-way buffered instruction and the next buffered instruction to be executed is determined from them.
Specifically, as shown in fig. 5, if the carried-in 16 bits of instruction information are the first half of a 32-bit standard instruction, i.e., the lower 16 bits of the standard instruction, then the first instruction produced by the instruction length unification module is the concatenation of those 16 bits with the first 16 bits of the hit-way buffered instruction, as shown by the shaded portion in fig. 5.
In addition, as shown in fig. 6, if the carried-in 16 bits of instruction information are invalid, expansion starts from the first 16 bits of the hit-way buffered instruction: if these form a compressed instruction, the lower 16 bits are the instruction encoding of the compressed instruction and the upper 16 bits are padded with zeros; if the instruction is not a compressed instruction, the 32-bit instruction is kept as the first standard instruction after length unification, as shown in FIG. 6.
Whether an instruction is a compressed instruction is determined by the compressed-instruction identification bits in the instruction encoding. For example, in the RISC-V instruction set encoding, an instruction whose lowest two bits satisfy instruction[1:0] == 2'b11 (binary 11) is a normal 32-bit instruction; otherwise it is a compressed instruction.
After the expansion of the first 16-bit parcel of the hit-way buffered instruction is completed, expansion of the second parcel begins, as shown in fig. 7, i.e., starting from bit 16 of the hit-way buffered instruction. Using the same compressed-instruction criterion as in fig. 6, bits [17:16] are examined: according to whether bits [17:16] are binary 11, and whether the lower 16 bits at this position are actually the upper 16 bits of a standard instruction begun earlier, the instruction is identified as a normal or compressed instruction and expanded to unified length. Expansion then proceeds from the next 16-bit parcel, and so on.
Further, as shown in FIG. 8, expansion continues until the last 16-bit parcel of the hit-way buffered instruction is reached: the last two bytes are examined and, for a compressed instruction, the upper bits are zero-padded, as shown in FIG. 8.
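The whole expansion walk described above — concatenating a latched half-instruction (fig. 5), zero-extending compressed parcels (fig. 6), consuming two parcels for normal instructions (fig. 7), and latching a trailing half-instruction at the end of the line (fig. 8) — can be sketched end to end. This Python model is an illustration under stated assumptions, not the patent's logic; the line width (sixteen 16-bit parcels, i.e., 256 bits) follows the example given earlier.

```python
# End-to-end sketch of the expansion walk: scan a 256-bit hit-way
# buffered line (sixteen 16-bit parcels), carry in the latched last
# parcel of the previous line, and emit 16 unified 32-bit slots, a
# valid-instruction bit vector, and the carry-out parcel to latch.

def expand_line(parcels, carry_in=None):
    """parcels: 16 ints (16-bit each), lowest address first.
    carry_in: latched lower half of a 32-bit instruction split across
    lines, or None. Returns (slots, valid_bits, carry_out)."""
    assert len(parcels) == 16
    slots, valid = [0] * 16, [0] * 16
    i = 0
    if carry_in is not None:                     # fig. 5: concatenate halves
        slots[0] = (parcels[0] << 16) | carry_in
        valid[0] = 1
        i = 1
    while i < 16:
        p = parcels[i]
        if (p & 0b11) != 0b11:                   # compressed: zero-extend
            slots[i], valid[i] = p, 1
            i += 1
        elif i == 15:                            # fig. 8: split at line end
            return slots, valid, p               # latch low half; slot invalid
        else:                                    # normal 32-bit instruction
            slots[i] = (parcels[i + 1] << 16) | p
            valid[i] = 1                         # high-half slot stays invalid
            i += 2
    return slots, valid, None

# Usage: a line of compressed parcels whose last parcel starts a 32-bit op.
line = [0x0001] * 15 + [0x0513]
slots, valid, carry = expand_line(line)
assert carry == 0x0513 and valid[:15] == [1] * 15 and valid[15] == 0
slots2, valid2, _ = expand_line([0x0010] + [0x0001] * 15, carry_in=carry)
assert slots2[0] == 0x0010_0513 and valid2[0] == 1
```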
Step 204, sending the unified encoding information and the valid-instruction bit vector to the pipeline stages that need them for processing.
Furthermore, in some embodiments, based on the initial instruction address and the offsets that increase sequentially from it as the N-way buffered instructions are executed in order, the target instruction address corresponding to the execution of each buffered instruction is calculated in turn, and each target instruction address is taken as the redirect address of that buffered instruction. One implementation may be as shown in fig. 9. Specifically, during redirect-address calculation, if the first instruction of the hit-way buffered instruction (instruction cache line) is a standard instruction whose first half is the latched last 16 bits of instruction information, then each redirect address of the hit-way buffered instruction equals packet_PC + offset; otherwise, each redirect address of the fetched hit-way buffered instruction equals fetch_PC + offset.
Here, offset is the number of bytes by which each buffered instruction is offset from the first buffered instruction of the fetched hit-way line. FIG. 9 shows the case in which the latched last 16 bits of instruction information form the lower half of the first instruction of the fetched hit-way line, whose upper half is in the line itself; the address of each buffered instruction in the fetched line is calculated relative to the address of the first instruction of the fetched line.
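A minimal sketch of this redirect-address calculation follows, under the assumption that `packet_PC`/`fetch_PC` merely select the base address and each slot adds its byte offset within the line; the function name and parameters are illustrative.

```python
# Hedged sketch of the redirect-address calculation of fig. 9: each
# valid slot's redirect address is a base PC plus its byte offset
# within the line. When the line's first slot completes an instruction
# whose lower half was latched from the previous line, the base is the
# previous packet's PC (packet_PC); otherwise it is the fetch PC.

def redirect_addresses(fetch_pc, valid_bits, first_slot_uses_carry,
                       packet_pc=None, parcel_bytes=2):
    """Return the redirect (target) address for each valid slot."""
    base = packet_pc if first_slot_uses_carry else fetch_pc
    return [base + i * parcel_bytes if v else None
            for i, v in enumerate(valid_bits)]

# Usage: line fetched at 0x8000_0100; slot 0 completes an instruction
# that began 2 bytes earlier, in the previous line.
addrs = redirect_addresses(0x8000_0100, [1, 1, 0, 1] + [0] * 12,
                           first_slot_uses_carry=True,
                           packet_pc=0x8000_00FE)
assert addrs[0] == 0x8000_00FE and addrs[1] == 0x8000_0100
assert addrs[2] is None
```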
The application provides a processor instruction processing method supporting compressed instructions, which includes reading a storage array of N-way buffered instructions and the instruction tags corresponding to the storage array; determining whether the instruction tags of the N-way buffered instructions hit, selecting the hit-way buffered instruction corresponding to the hit instruction tag, and expanding the storage array of the hit-way buffered instruction into standard instructions of unified length, where the standard instructions comprise both compressed instructions and normal instructions; pre-decoding the standard instructions to obtain the unified encoding information corresponding to the standard instructions and a bit vector of valid instructions; and sending the unified encoding information and the valid-instruction bit vector to the pipeline stages that need them for processing. The N-way buffered instructions are thereby converted into standard instructions of unified length for pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
In order to implement the foregoing embodiment, the present application further provides an electronic device, including:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the aforementioned methods.
To achieve the above embodiments, the present application also proposes a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the aforementioned method.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of these terms are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, embodiments or examples, and features of different embodiments or examples, described in this specification can be combined by those skilled in the art without contradiction.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. A processor instruction processing device supporting compressed instructions, the device comprising a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module and a processing module, wherein:
the program counter is used for reading a storage array of the N-path buffering instruction and an instruction tag corresponding to the storage array;
the first instruction memory is used for storing the storage array; the second instruction memory is used for storing the instruction tag;
the comparison module is used for determining whether the instruction tags of the N paths of buffering instructions hit or not so as to select the hit path buffering instruction corresponding to the hit instruction tags;
the instruction length unifying module is used for receiving the hit-way buffer instruction sent by the comparison module and expanding a storage array of the hit-way buffer instruction into standard instructions of uniform length, wherein the standard instructions comprise compressed instructions and ordinary instructions;
the pre-decoding module is used for pre-decoding the standard instruction to obtain unified coding information corresponding to the standard instruction and a bit vector of an effective instruction;
and the processing module is used for sending the unified coding information and the bit vector of the effective instruction to a pipeline stage corresponding to the processor for processing.
2. The apparatus of claim 1, wherein the instruction length unification module is further configured to latch the instruction information of the last predetermined threshold number of bits in the hit-way buffer instruction, and to use the latched instruction information as a next input of the instruction length unification module.
3. The apparatus of claim 1, further comprising an address calculation module, wherein:
the address calculation module is used for obtaining an initial instruction address of a first way of buffer instruction among the N ways of buffer instructions.
4. The apparatus of claim 3, further comprising a reset direction address calculation module, wherein:
the reset direction address calculation module is used for sequentially calculating each target instruction address corresponding to each way of buffer instruction among the N ways of buffer instructions, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address as the N ways of buffer instructions are executed in order, and for taking each target instruction address as the reset direction address corresponding to each way of buffer instruction.
5. The apparatus of claim 4, further comprising branch predictors at each stage, wherein:
the branch predictors at each stage are used for receiving each target instruction address corresponding to each buffer instruction among the N ways of buffer instructions, and for taking each target instruction address as an address required by the branch predictor at each stage for branching.
6. A method for processing instructions in a processor that supports compressed instructions, the method comprising:
reading a storage array of N paths of buffer instructions and an instruction tag corresponding to the storage array;
determining whether the instruction tags of the N ways of buffer instructions hit, so as to select a hit-way buffer instruction corresponding to the hit instruction tag, and expanding a storage array of the hit-way buffer instruction into standard instructions of uniform length, wherein the standard instructions comprise compressed instructions and ordinary instructions;
pre-decoding the standard instruction to obtain unified coding information corresponding to the standard instruction and a bit vector of an effective instruction;
and sending the unified coding information and the bit vector of the valid instruction to the pipeline stage to be processed, so as to perform pipeline-stage processing.
7. The method of claim 6, further comprising, after said predecoding the standard instruction to obtain the unified coding information and the bit vector of the valid instruction corresponding to the standard instruction:
latching the instruction information of the last predetermined threshold number of bits in the hit-way buffer instruction, so as to input that instruction information to the next hit-way buffer instruction.
8. The method of claim 7, further comprising:
and obtaining the initial instruction address of the first path of buffering instruction in the N paths of buffering instructions.
9. The method of claim 8, further comprising:
sequentially calculating each target instruction address corresponding to each way of buffer instruction among the N ways of buffer instructions, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address as the N ways of buffer instructions are executed in order, and taking each target instruction address as the reset direction address corresponding to each way of buffer instruction.
10. The method of claim 9, further comprising:
receiving each target instruction address corresponding to each buffer instruction among the N ways of buffer instructions, and taking each target instruction address as an address required by the branch predictor at each stage for branching.
11. An electronic device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 6-10.
12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 6-10.
CN202310057209.0A 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions Active CN115878187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310057209.0A CN115878187B (en) 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions


Publications (2)

Publication Number Publication Date
CN115878187A true CN115878187A (en) 2023-03-31
CN115878187B CN115878187B (en) 2023-05-02

Family

ID=85758681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310057209.0A Active CN115878187B (en) 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions

Country Status (1)

Country Link
CN (1) CN115878187B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020091892A1 (en) * 2001-01-09 2002-07-11 Vondran Gary L. Method and apparatus for efficient cache mapping of compressed VLIW instructions
CN1492318A (en) * 1995-05-31 2004-04-28 Matsushita Electric Industrial Co., Ltd. Microprocessor for supporting program code length reduction
CN101164040A (en) * 2005-03-04 2008-04-16 高通股份有限公司 Power saving methods and apparatus for variable length instructions
CN108133452A (en) * 2017-12-06 2018-06-08 中国航空工业集团公司西安航空计算技术研究所 A kind of instruction issue processing circuit of unified stainer array
CN109918130A (en) * 2019-01-24 2019-06-21 中山大学 A kind of four level production line RISC-V processors with rapid data bypass structure
CN110780925A (en) * 2019-09-02 2020-02-11 芯创智(北京)微电子有限公司 Pre-decoding system and method of instruction pipeline


Also Published As

Publication number Publication date
CN115878187B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
US8898437B2 (en) Predecode repair cache for instructions that cross an instruction cache line
KR101059335B1 (en) Efficient Use of JHT in Processors with Variable Length Instruction Set Execution Modes
US6223277B1 (en) Data processing circuit with packed data structure capability
US5774710A (en) Cache line branch prediction scheme that shares among sets of a set associative cache
CN104657110B (en) Instruction cache with fixed number of variable length instructions
JP2000222205A (en) Method and device for reducing delay of set associative cache by set prediction
CN108108190B (en) Calculation method and related product
RU2602335C2 (en) Cache predicting method and device
US6499100B1 (en) Enhanced instruction decoding
US7949862B2 (en) Branch prediction table storing addresses with compressed high order bits
CN112631660A (en) Method for parallel instruction extraction and readable storage medium
US20080028189A1 (en) Microprocessor and Method of Instruction Alignment
EP1369776B1 (en) Information processor having delayed branch function
US5295248A (en) Branch control circuit
US5881258A (en) Hardware compatibility circuit for a new processor architecture
CN115878187B (en) Processor instruction processing apparatus and method supporting compressed instructions
CN101714076B (en) A processor and a method for decompressing instruction bundles
US6237087B1 (en) Method and apparatus for speeding sequential access of a set-associative cache
KR100719420B1 (en) Information processing device
US10732977B2 (en) Bytecode processing device and operation method thereof
US6047368A (en) Processor architecture including grouping circuit
US6654874B1 (en) Microcomputer systems having compressed instruction processing capability and methods of operating same
KR100528208B1 (en) Program control method
CN111209044B (en) Instruction compression method and device
JPH01183737A (en) Information processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant