CN115878187B - Processor instruction processing apparatus and method supporting compressed instructions - Google Patents

Processor instruction processing apparatus and method supporting compressed instructions

Info

Publication number
CN115878187B
Authority
CN
China
Prior art keywords
instruction
buffer
instructions
hit
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310057209.0A
Other languages
Chinese (zh)
Other versions
CN115878187A (en)
Inventor
郇丹丹
李祖松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Micro Core Technology Co ltd
Original Assignee
Beijing Micro Core Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Micro Core Technology Co ltd filed Critical Beijing Micro Core Technology Co ltd
Priority to CN202310057209.0A
Publication of CN115878187A
Application granted
Publication of CN115878187B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The apparatus comprises a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module and a processing module. The program counter reads a storage array of N-way buffer instructions and the instruction tags corresponding to the storage array, and the first and second instruction memories store the storage array and the instruction tags. The comparison module determines whether the instruction tags of the N-way buffer instructions hit, so as to select the corresponding hit-way buffer instruction when an instruction tag hits. The instruction length unification module expands the storage array of the hit-way buffer instruction into standard instructions of uniform length, the pre-decoding module pre-decodes the standard instructions, and the processing module sends the unified coding information and the valid-instruction bit vector obtained by pre-decoding to the corresponding pipeline stages for processing. The N-way buffer instructions are thus converted into standard instructions of uniform length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.

Description

Processor instruction processing apparatus and method supporting compressed instructions
Technical Field
The present disclosure relates to the field of processor technologies, and in particular, to a processor instruction processing apparatus and method supporting compressed instructions.
Background
Modern processor instruction set architectures (Instruction Set Architecture, ISA) define compressed instructions to reduce program code size and increase code density, which lowers the instruction cache miss rate, improves processor performance, and reduces power consumption, area, and cost. However, because compressed instructions shorten the instruction length, they make the pre-decoding and decoding logic more complex, which affects the clock frequency of a high-performance processor, and they also add logic to the program counter calculation, increasing its delay. A processor instruction processing apparatus and method supporting compressed instructions are therefore needed.
Disclosure of Invention
A first aspect of an embodiment of the present application proposes a processor instruction processing apparatus supporting compressed instructions, the apparatus including a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unifying module, a pre-decoding module, and a processing module, wherein: the program counter is used for reading a storage array of N-way buffer instructions and instruction tags corresponding to the storage array; the first instruction memory is used for storing the storage array; the second instruction memory is used for storing the instruction tags; the comparison module is used for determining whether an instruction tag of the N-way buffer instructions hits, so as to select the corresponding hit-way buffer instruction when an instruction tag hits; the instruction length unifying module is used for receiving the hit-way buffer instruction sent by the comparison module and expanding the storage array of the hit-way buffer instruction into standard instructions of uniform length, wherein the standard instructions include compressed instructions and normal instructions; the pre-decoding module is used for pre-decoding the standard instructions to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions; and the processing module is used for sending the unified coding information and the bit vector of valid instructions to the corresponding pipeline stages of the processor for processing.
In an embodiment of the present application, the instruction length unifying module is further configured to latch the instruction information of the last preset threshold number of bits in the hit-way buffer instruction and to use that instruction information as part of the next input of the instruction length unifying module.
In one embodiment of the present application, the apparatus further comprises an address calculation module, wherein: the address calculation module is used for obtaining the initial instruction address of the first of the N-way buffer instructions.
In one embodiment of the present application, the apparatus further comprises a redirect address calculation module, wherein: the redirect address calculation module is used for sequentially calculating the target instruction address corresponding to each of the N-way buffer instructions, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and for using each target instruction address as the redirect address corresponding to that buffer instruction.
In one embodiment of the present application, the apparatus further comprises branch predictors at each level, wherein: the branch predictors at each level are used for receiving the target instruction address corresponding to each of the N-way buffer instructions, and for using each target instruction address as an address required for branching by the branch predictors at each level.
The apparatus comprises a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decoding module and a processing module. The program counter reads a storage array of N-way buffer instructions and the instruction tags corresponding to the storage array, and the first and second instruction memories store them. The comparison module determines whether the instruction tags of the N-way buffer instructions hit, so as to select the corresponding hit-way buffer instruction when an instruction tag hits; the storage array of the hit-way buffer instruction is expanded into standard instructions of uniform length by the instruction length unification module; the standard instructions are pre-decoded by the pre-decoding module; and the processing module performs pipeline-stage processing on the unified coding information and the valid-instruction bit vector obtained by pre-decoding. The N-way buffer instructions are thus converted into standard instructions of uniform length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
A second aspect of an embodiment of the present application proposes a method for processing processor instructions supporting compressed instructions, the method including: reading a storage array of N-way buffer instructions and instruction tags corresponding to the storage array; determining whether an instruction tag of the N-way buffer instructions hits, selecting the hit-way buffer instruction corresponding to the hit instruction tag when an instruction tag hits, and expanding the storage array of the hit-way buffer instruction into standard instructions of uniform length, wherein the standard instructions include compressed instructions and normal instructions; pre-decoding the standard instructions to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions; and sending the unified coding information and the bit vector of valid instructions to the pipeline stage to be processed, so as to carry out pipeline-stage processing.
In one embodiment of the present application, after pre-decoding the standard instructions to obtain the unified coding information corresponding to the standard instructions and the bit vector of valid instructions, the method further includes: latching the instruction information of the last preset threshold number of bits in the hit-way buffer instruction, so as to input that instruction information together with the next hit-way buffer instruction.
In one embodiment of the present application, the method further comprises: obtaining the initial instruction address of the first of the N-way buffer instructions.
In one embodiment of the present application, the method further comprises: sequentially calculating the target instruction address corresponding to each of the N-way buffer instructions, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and using each target instruction address as the redirect address corresponding to that buffer instruction.
In one embodiment of the present application, the method further comprises: receiving the target instruction address corresponding to each of the N-way buffer instructions, and using each target instruction address as an address required for branching by the branch predictors at each level.
The application provides a processor instruction processing method supporting compressed instructions. A storage array of N-way buffer instructions and the instruction tags corresponding to the storage array are read, and whether an instruction tag of the N-way buffer instructions hits is determined, so that the hit-way buffer instruction corresponding to the hit instruction tag is selected. The storage array of the hit-way buffer instruction is expanded into standard instructions of uniform length, wherein the standard instructions include compressed instructions and normal instructions; the standard instructions are pre-decoded to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions; and the unified coding information and the bit vector of valid instructions are sent to the pipeline stage to be processed for pipeline-stage processing. The N-way buffer instructions are thus converted into standard instructions of uniform length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
An embodiment of a third aspect of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the second aspect.
An embodiment of a fourth aspect of the present application proposes a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of the second aspect.
Other effects of the above alternative will be described below in connection with specific embodiments.
Drawings
FIG. 1 is a schematic diagram of a processor instruction processing apparatus supporting compressed instructions according to one embodiment of the present application;
FIG. 2 is a flow diagram of a method of processing processor instructions supporting compressed instructions according to one embodiment of the present application;
FIG. 3 is an exemplary diagram of a hit way buffer instruction according to one embodiment of the present application;
FIG. 4 is an exemplary diagram of a uniform length standard instruction according to one embodiment of the present application;
FIG. 5 is an exemplary diagram of splicing the first instruction when unifying the instruction length of a hit-way buffer instruction according to one embodiment of the present application;
FIG. 6 is an exemplary diagram of expanding the length of the first instruction in a hit-way buffer instruction according to one embodiment of the present application;
FIG. 7 is an exemplary diagram of expanding the length of the second instruction in a hit-way buffer instruction according to one embodiment of the present application;
FIG. 8 is an exemplary diagram of expanding the length of the last instruction in a hit-way buffer instruction according to one embodiment of the present application;
FIG. 9 is an exemplary diagram of redirect address computation according to one embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
A processor instruction processing apparatus supporting compressed instructions according to an embodiment of the present application is described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a processor instruction processing apparatus supporting compressed instructions according to one embodiment of the present application.
As shown in fig. 1, the processor instruction processing apparatus supporting a compressed instruction includes: a program counter 101, a first instruction memory 102, a second instruction memory 103, a comparison module 104, an instruction length unification module 105, a pre-decoding module 106 and a processing module 107, wherein:
in some embodiments, the program counter 101 is in communication with the first instruction memory 102 and the second instruction memory 103, respectively, and the program counter 101 is configured to read a storage array of N-way buffered instructions and instruction tags corresponding to the storage array.
Here, the N-way buffer instructions are N consecutive ways of buffer instructions.
Further, after the first of the N buffer instructions is fetched, the program counter 101 continues sequentially with the next buffer instruction to which it points.
In some embodiments, the first instruction memory 102 is configured to receive and store a memory array sent by the program counter 101.
In some embodiments, the second instruction memory 103 is configured to receive and store the instruction tags sent by the program counter 101.
The first instruction memory 102 and the second instruction memory 103 may be the same instruction memory, and are used for storing different data.
In some embodiments, the comparing module 104 is in communication with the first instruction memory 102 and the second instruction memory 103, respectively, and the comparing module 104 is specifically configured to determine whether an instruction tag of the N-way buffer instruction hits, so as to select a hit-way buffer instruction corresponding to the hit instruction tag.
Specifically, in the case of not hitting the instruction tag of the N-way buffer instruction, this is called miss, and in the case of hitting the instruction tag of the N-way buffer instruction, this is called hit, i.e., the hit-way buffer instruction corresponding to the hit instruction tag is selected.
In some embodiments, the instruction length unifying module 105 communicates with the comparing module 104, and the instruction length unifying module 105 is specifically configured to receive the hit way buffer instruction output by the comparing module 104, and expand the storage array of the hit way buffer instruction into a standard instruction with a unified length.
Specifically, taking a 256-bit hit-way buffer instruction as an example, it contains 32 bytes of instructions (32 Byte instructions), among which there are both compressed instructions (assumed to be 16-bit instructions) and normal instructions (assumed to be 32-bit instructions); the compressed instructions and the normal instructions are expanded into standard instructions of a unified 32-bit length (64 Bytes of instructions in total).
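For orientation, the data shapes involved can be sketched as follows (a minimal illustration only; the type and field names are ours, not taken from the patent): a 256-bit fetch line is 16 halfwords of mixed 16-bit and 32-bit instructions, and length unification produces up to 16 uniform 32-bit slots together with a per-slot valid bit.

```c
#include <stdint.h>

/* Hypothetical view of one 256-bit hit-way buffer instruction:
 * 32 bytes of mixed 16-bit compressed and 32-bit normal instructions. */
typedef struct {
    uint16_t halfword[16];   /* 16 x 16 bits = 256 bits of raw instruction bytes */
} fetch_line_t;

/* Hypothetical output of length unification: up to 16 uniform 32-bit
 * standard instructions (64 bytes) plus one bit per slot marking validity. */
typedef struct {
    uint32_t std_instr[16];  /* expanded standard instructions, 32 bits each */
    uint16_t valid_mask;     /* bit i set => std_instr[i] holds a valid instruction */
} unified_line_t;
```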
In some embodiments, the pre-decoding module 106 is in communication with the instruction length unifying module 105, where the pre-decoding module 106 is specifically configured to pre-decode the standard instruction to obtain unified coding information corresponding to the standard instruction and a bit vector of the valid instruction.
Specifically, when the instruction length unifying module 105 outputs standard instructions of unified 32-bit length, the pre-decoding module 106 obtains the unified coding information of the 16 expanded standard instructions (Instructions x 16) and the bit vector of valid instructions (Valid instruction bit vector).
In some embodiments, the processing module 107 is in communication with the pre-decoding module 106 and is configured to send the unified coding information and the bit vector of valid instructions to the corresponding pipeline stages of the processor for processing.
The processing module 107 may be any of a plurality of modules that need the unified coding information and the valid-instruction bit vector when performing pipeline processing.
Further, it is understood that the foregoing program counter 101, the first instruction memory 102, the second instruction memory 103, the comparison module 104, the instruction length unifying module 105, the pre-decoding module 106, and the processing module 107 may all communicate with each other through copper wires, but are not limited thereto.
In some embodiments, the instruction length unifying module 105 is further configured to latch the instruction information of the last preset threshold number of bits in the hit-way buffer instruction and to use it as part of the next input of the instruction length unifying module 105. Taking 16-bit compressed instructions and 32-bit normal instructions as an example, the last 16 bits of instruction information in the hit-way buffer instruction may be the first half of a standard instruction, that is, the lower 16 bits of the standard instruction, whose remainder lies in the next hit-way buffer instruction. Because the first half of an instruction in the next hit-way buffer instruction may therefore reside in the previous hit-way buffer instruction, the instruction length unifying module 105 latches the last 16 bits of instruction information so that they are input into the instruction length unifying module 105 together with the next hit-way buffer instruction.
The preset threshold number of bits may be determined based on the instruction length corresponding to the hit-way buffer instruction, but is not limited thereto.
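A minimal software sketch of this cross-line latch, assuming 16-bit compressed and 32-bit normal instructions (the function and variable names are illustrative, not from the patent):

```c
#include <stdbool.h>
#include <stdint.h>

/* Latched low half of a 32-bit instruction that straddles two lines. */
static uint16_t latched_half;
static bool     latched_valid;

/* Called after expanding one hit-way buffer instruction. 'last_hw' is the
 * line's final halfword and 'consumed' tells whether it was already used as
 * the upper half of a 32-bit instruction earlier in the line. */
static void latch_trailing_half(uint16_t last_hw, bool consumed)
{
    /* Lowest two bits == 2'b11 marks a normal 32-bit instruction whose
     * upper half lies in the next line (RISC-V style encoding). */
    if (!consumed && (last_hw & 0x3u) == 0x3u) {
        latched_half  = last_hw;
        latched_valid = true;   /* splice with the next line's first halfword */
    } else {
        latched_valid = false;
    }
}
```

If the latched halfword is valid, it is spliced with the first halfword of the next hit-way buffer instruction, as described for Fig. 5 below.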
In some embodiments, the apparatus further comprises an address calculation module 108, wherein: the address calculation module 108 is further in communication with the program counter 101 and is specifically configured to take the initial instruction address of the first of the N-way buffer instructions as the starting address of the N-way buffer instructions.
In some embodiments, the apparatus further comprises a redirect address calculation module 109, wherein: the redirect address calculation module 109 is in communication with the pre-decoding module 106 and the address calculation module 108 respectively, and is specifically configured to sequentially calculate the target instruction address corresponding to each of the N-way buffer instructions, based on the initial instruction address and the offsets (Offset) that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and to use each target instruction address as the redirect address corresponding to that buffer instruction.
In some embodiments, the apparatus further comprises branch predictors 110 at each level, wherein: the branch predictors 110 are in communication with the redirect address calculation module 109 and are specifically configured to receive the target instruction address corresponding to each of the N-way buffer instructions and to use each target instruction address as an address required for branching by the branch predictors 110 at each level.
In summary, the application proposes a processor instruction processing apparatus supporting compressed instructions. The apparatus includes a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unifying module, a pre-decoding module and a processing module. The comparison module determines whether an instruction tag of the N-way buffer instructions hits, so as to select the hit-way buffer instruction corresponding to the hit instruction tag; the storage array of the hit-way buffer instruction is expanded into standard instructions of uniform length by the instruction length unifying module; the standard instructions are pre-decoded by the pre-decoding module; and the processing module performs pipeline-stage processing on the unified coding information and the valid-instruction bit vector obtained by pre-decoding. The N-way buffer instructions are thereby converted into standard instructions of uniform length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
In addition, the present application further provides a flow diagram of a processor instruction processing method supporting a compressed instruction, as shown in fig. 2, specifically:
step 201, a storage array of N-way buffer instructions and instruction tags corresponding to the storage array are read.
Step 202, determining whether an instruction tag of the N-way buffer instruction hits or not, so as to select a hit-way buffer instruction corresponding to the hit instruction tag, and expanding a storage array of the hit-way buffer instruction into a standard instruction with a uniform length, wherein the standard instruction comprises a compressed instruction and a normal instruction.
In some embodiments, as shown in fig. 3, fig. 3 shows a 32-byte hit-way buffer instruction (instruction cache line) before unified-length expansion. Normal instructions that cross the line are counted in the instruction line where their upper 16 bits are located, and in the instruction line where the lower 16 bits are located those 16 bits are set invalid; the shaded parts are normal instructions and the unshaded parts are compressed instructions.
In addition, after the hit-way buffer instruction is expanded into standard instructions of uniform length, the result is, for example, 16 32-bit standard instructions, as shown in fig. 4. Validity judgment then needs to be performed on the 16 32-bit standard instructions so that only the valid 32-bit standard instructions are pre-decoded, yielding an instruction valid bit vector and 16 32-bit instructions of uniform length.
As shown in FIG. 4, the valid bit vector of the 32-bit standard instructions is 16'b0010_1111_1010_1111, where 1 means valid and 0 means invalid. The lower 2 bits [241:240] of the halfword [255:240] of the hit-way buffer instruction are 2'b11, so the last two bytes are half of a standard instruction, and that standard instruction belongs to the next hit-way buffer instruction as its first standard instruction.
Step 203, pre-decoding the standard instruction to obtain unified coding information corresponding to the standard instruction and bit vectors of the valid instruction.
In some embodiments, after the standard instructions are pre-decoded to obtain the unified coding information corresponding to the standard instructions and the bit vector of valid instructions, taking 16-bit compressed instructions and 32-bit normal instructions as an example, the instruction information of the last 16 bits (2 bytes) of the hit-way buffer instruction (instruction cache line) may further be latched (the 16-bit previous instruction), so that the last 16 bits of instruction information are passed on to the next hit-way buffer instruction and the next buffer instruction to be executed is determined from them.
Specifically, as shown in fig. 5, if the incoming 16 bits of instruction information are the first half of a 32-bit standard instruction, i.e., the lower 16 bits of the standard instruction, then the first instruction output by the instruction length unifying module is formed by concatenating the 16 bits of instruction information with the first 16 bits of the hit-way buffer instruction, as shown in the shaded portion of fig. 5.
In addition, as shown in fig. 6, if the incoming 16 bits of instruction information are invalid, the first 16 bits of the hit-way buffer instruction are expanded: if the expansion is of a compressed instruction, the lower 16 bits are the 16-bit instruction code of the compressed instruction and the upper 16 bits are padded with zeros; if it is not a compressed instruction, the 32-bit instruction is kept as it is and used as the first standard instruction after instruction length unification, as shown in fig. 6.
Whether an instruction is a compressed instruction is judged from the compressed-instruction identification bits in the instruction code. For example, in the RISC-V instruction set architecture, if the lowest two bits of the instruction encoding are instruction[1:0] = 2'b11 (binary 11), the instruction is a normal 32-bit instruction; otherwise it is a compressed instruction.
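As a sketch, the RISC-V rule above reduces to a single check on the low halfword of a candidate instruction (the helper name is ours):

```c
#include <stdbool.h>
#include <stdint.h>

/* RISC-V encoding rule: instruction[1:0] == 2'b11 means a normal 32-bit
 * instruction; any other value in the low two bits means a 16-bit
 * compressed instruction. */
static bool is_compressed(uint16_t low_halfword)
{
    return (low_halfword & 0x3u) != 0x3u;
}
```

Because only the low two bits are needed, the check can in principle be applied to every 16-bit boundary of the fetch line in parallel.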
After the expansion of the first 16 bits of the hit-way buffer instruction is completed, the expansion of the second instruction of the hit-way buffer instruction starts, as shown in fig. 7, i.e., starting from bit 16 of the hit-way buffer instruction. Following the same compressed-instruction judgment as in fig. 6, bits [17:16] of the hit-way buffer instruction are examined: according to whether bits [17:16] are binary 11 and whether the 16 bits starting at bit 16 are the upper 16 bits of the preceding standard instruction, it is determined whether the instruction is a normal instruction or a compressed instruction, and its length is expanded accordingly. The third and subsequent instructions of the hit-way buffer instruction are handled in the same way, starting from the next 16-bit boundary.
In addition, as shown in fig. 8, the judgment continues in this way until the last 16 bits of the hit-way buffer instruction are reached: the last two bytes of the hit-way buffer instruction are judged, and the upper bits are zero-padded.
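Putting the rules of figs. 5 to 8 together, a sequential software model of the length-unification step might look like the following (a sketch under the 16-bit/32-bit assumption; all names are ours, and real hardware would evaluate the halfword boundaries in parallel rather than in a loop):

```c
#include <stdbool.h>
#include <stdint.h>

/* Expand one 256-bit line (16 halfwords hw[0..15]) into uniform 32-bit
 * standard instructions. 'carry_in' is the latched trailing halfword of the
 * previous line (used only if carry_in_valid); a new trailing halfword, if
 * any, is reported through carry_out / carry_out_valid. */
static void unify_length(const uint16_t hw[16],
                         uint16_t carry_in, bool carry_in_valid,
                         uint32_t std_instr[16], uint16_t *valid_mask,
                         uint16_t *carry_out, bool *carry_out_valid)
{
    int i = 0;
    *valid_mask = 0;
    *carry_out_valid = false;
    for (int k = 0; k < 16; k++)
        std_instr[k] = 0;                       /* invalid slots stay zero */

    /* Fig. 5: splice the latched low half of a straddling 32-bit instruction
     * with the first halfword of this line; the result occupies slot 0. */
    if (carry_in_valid) {
        std_instr[0] = ((uint32_t)hw[0] << 16) | carry_in;
        *valid_mask |= 1u;
        i = 1;
    }

    while (i < 16) {
        if ((hw[i] & 0x3u) != 0x3u) {
            /* Figs. 6/8: compressed instruction, the low 16 bits are the
             * instruction code and the upper 16 bits are zero-padded. */
            std_instr[i] = hw[i];
            *valid_mask |= (uint16_t)(1u << i);
            i += 1;
        } else if (i < 15) {
            /* Normal 32-bit instruction contained in this line: slot i is
             * valid, slot i+1 (its upper halfword) stays invalid. */
            std_instr[i] = ((uint32_t)hw[i + 1] << 16) | hw[i];
            *valid_mask |= (uint16_t)(1u << i);
            i += 2;
        } else {
            /* The last halfword is the low half of a 32-bit instruction that
             * crosses into the next line: slot 15 stays invalid and the
             * halfword is latched for the next line (fig. 3, step 203). */
            *carry_out = hw[15];
            *carry_out_valid = true;
            i += 1;
        }
    }
}
```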
Step 204, the unified coding information and the bit vector of valid instructions are sent to the pipeline stage to be processed, so as to perform pipeline-stage processing.
In addition, in some embodiments, one implementation of sequentially calculating each target instruction address corresponding to each of the N-way buffer instructions, based on the initial instruction address and the offsets that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and of using each target instruction address as the redirect address corresponding to that buffer instruction, may be as shown in fig. 9. Specifically, during redirect address calculation, it is checked whether the first instruction of the fetched hit-way buffer instruction (instruction cache line) is a standard instruction whose first half is the latched last 16 bits of instruction information; if so, each redirect address of the fetched hit-way buffer instruction is equal to packet_pc + offset; otherwise, each redirect address of the fetched hit-way buffer instruction is equal to fetch_pc + offset.
Here, the offset is the number of bytes of each instruction relative to the address of the first instruction of the fetched hit-way buffer instruction. Fig. 9 shows the case in which the first instruction of the fetched hit-way buffer instruction is a standard instruction whose first half is the latched last 16 bits of instruction information, and the redirect address of each instruction of the fetched hit-way buffer instruction is calculated relative to that first instruction.
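The address selection described above can be sketched as follows (packet_pc and fetch_pc are the two base addresses named in the text; everything else, including the per-instruction offset accumulation of 2 or 4 bytes, is an illustrative assumption):

```c
#include <stdbool.h>
#include <stdint.h>

/* Base address selection: if the first instruction of the fetched hit-way
 * buffer instruction is a 32-bit standard instruction whose first half is
 * the latched last 16 bits from the previous line, addresses are based on
 * packet_pc (the address of that latched halfword); otherwise they are
 * based on fetch_pc (the starting address of the current fetch). */
static uint64_t redirect_addr(uint64_t fetch_pc, uint64_t packet_pc,
                              bool first_uses_latched_half, uint64_t offset)
{
    uint64_t base = first_uses_latched_half ? packet_pc : fetch_pc;
    return base + offset;
}

/* Walk the expanded line and give every valid instruction its redirect
 * address; the offset grows by 4 bytes for a normal instruction and by
 * 2 bytes for a compressed one. */
static void fill_redirect_addrs(const uint32_t std_instr[16], uint16_t valid_mask,
                                uint64_t fetch_pc, uint64_t packet_pc,
                                bool first_uses_latched_half, uint64_t addr[16])
{
    uint64_t offset = 0;
    for (int i = 0; i < 16; i++) {
        if (!(valid_mask & (1u << i)))
            continue;                  /* slot does not start an instruction */
        addr[i] = redirect_addr(fetch_pc, packet_pc,
                                first_uses_latched_half, offset);
        offset += ((std_instr[i] & 0x3u) == 0x3u) ? 4 : 2;
    }
}
```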
The application provides a processor instruction processing method supporting compressed instructions. A storage array of N-way buffer instructions and the instruction tags corresponding to the storage array are read, and whether an instruction tag of the N-way buffer instructions hits is determined, so that the hit-way buffer instruction corresponding to the hit instruction tag is selected. The storage array of the hit-way buffer instruction is expanded into standard instructions of uniform length, wherein the standard instructions include compressed instructions and normal instructions; the standard instructions are pre-decoded to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions; and the unified coding information and the bit vector of valid instructions are sent to the pipeline stage to be processed for pipeline-stage processing. The N-way buffer instructions are thus converted into standard instructions of uniform length before pre-decoding, which simplifies the pre-decoding logic and improves the execution performance of the processor.
In order to achieve the above embodiments, the present application further proposes an electronic device including:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the aforementioned method.
To achieve the above embodiments, the present application also proposes a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the aforementioned method.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by those of ordinary skill in the art within the scope of the application.

Claims (6)

1. A processor instruction processing apparatus supporting compressed instructions, the apparatus comprising a program counter, a first instruction memory, a second instruction memory, a comparison module, an instruction length unification module, a pre-decode module, and a processing module, wherein:
the program counter is used for reading a storage array of N paths of buffer instructions and instruction labels corresponding to the storage array;
the first instruction memory is used for storing the storage array; the second instruction memory is used for storing the instruction tag;
the comparison module is used for determining whether the instruction tag of the N-way buffer instruction hits or not so as to select a corresponding hit-way buffer instruction when the instruction tag is hit;
the instruction length unification module is used for receiving the hit-way buffer instruction sent by the comparison module and expanding a storage array of the hit-way buffer instruction into standard instructions of unified length, wherein the standard instructions comprise compressed instructions and normal instructions;
the pre-decoding module is used for pre-decoding the standard instructions to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions;
the processing module is used for sending the unified coding information and the bit vector of valid instructions to the corresponding pipeline stage of the processor for processing;
the apparatus further comprises an address calculation module, wherein:
the address calculation module is used for obtaining an initial instruction address of a first of the N-way buffer instructions;
the apparatus further comprises a redirect address calculation module, wherein:
the redirect address calculation module is used for sequentially calculating a target instruction address corresponding to each of the N-way buffer instructions based on the initial instruction address and the offsets that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and for taking each target instruction address as a redirect address corresponding to that buffer instruction;
the apparatus further comprises branch predictors at each level, wherein:
the branch predictors at each level are used for receiving the target instruction address corresponding to each of the N-way buffer instructions, and for taking each target instruction address as an address required for branching by the branch predictors at each level.
2. The apparatus of claim 1, wherein the instruction length unification module is further configured to latch instruction information of a last preset threshold number of bits in the hit-way buffer instruction and to use the instruction information as part of a next input of the instruction length unification module.
3. A method of processor instruction processing supporting compressed instructions, which is performed using an apparatus according to any one of claims 1-2, the method comprising:
reading a storage array of N paths of buffer instructions and instruction labels corresponding to the storage array;
determining whether an instruction tag of the N-way buffer instructions hits, selecting the hit-way buffer instruction corresponding to the hit instruction tag when an instruction tag hits, and expanding a storage array of the hit-way buffer instruction into standard instructions of uniform length, wherein the standard instructions comprise compressed instructions and normal instructions;
pre-decoding the standard instructions to obtain unified coding information corresponding to the standard instructions and a bit vector of valid instructions;
sending the unified coding information and the bit vector of valid instructions to a pipeline stage to be processed, so as to carry out pipeline-stage processing;
the method further comprises the steps of:
obtaining an initial instruction address of a first of the N-way buffer instructions;
sequentially calculating a target instruction address corresponding to each of the N-way buffer instructions based on the initial instruction address and the offsets that increase sequentially from the initial instruction address when the N-way buffer instructions are executed in order, and taking each target instruction address as a redirect address corresponding to that buffer instruction;
receiving, by branch predictors at each level, the target instruction address corresponding to each of the N-way buffer instructions, and taking each target instruction address as an address required for branching by the branch predictors at each level.
4. The method of claim 3, further comprising, after said pre-decoding the standard instructions to obtain the unified coding information corresponding to the standard instructions and the bit vector of valid instructions:
latching instruction information of a last preset threshold number of bits in the hit-way buffer instruction, so as to input the instruction information of the last preset threshold number of bits together with a next hit-way buffer instruction.
5. An electronic device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 3-4.
6. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 3-4.
CN202310057209.0A 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions Active CN115878187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310057209.0A CN115878187B (en) 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310057209.0A CN115878187B (en) 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions

Publications (2)

Publication Number Publication Date
CN115878187A CN115878187A (en) 2023-03-31
CN115878187B (en) 2023-05-02

Family

ID=85758681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310057209.0A Active CN115878187B (en) 2023-01-16 2023-01-16 Processor instruction processing apparatus and method supporting compressed instructions

Country Status (1)

Country Link
CN (1) CN115878187B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5966514A (en) * 1995-05-31 1999-10-12 Matsushita Electric Industrial Co., Ltd. Microprocessor for supporting reduction of program codes in size
US6581131B2 (en) * 2001-01-09 2003-06-17 Hewlett-Packard Development Company, L.P. Method and apparatus for efficient cache mapping of compressed VLIW instructions
US7421568B2 (en) * 2005-03-04 2008-09-02 Qualcomm Incorporated Power saving methods and apparatus to selectively enable cache bits based on known processor state
CN108133452B (en) * 2017-12-06 2021-06-01 中国航空工业集团公司西安航空计算技术研究所 Instruction transmitting and processing circuit of unified stainer array
CN109918130A (en) * 2019-01-24 2019-06-21 中山大学 A kind of four level production line RISC-V processors with rapid data bypass structure
CN110780925B (en) * 2019-09-02 2021-11-16 芯创智(北京)微电子有限公司 Pre-decoding system and method of instruction pipeline

Also Published As

Publication number Publication date
CN115878187A (en) 2023-03-31

Similar Documents

Publication Publication Date Title
US8898437B2 (en) Predecode repair cache for instructions that cross an instruction cache line
US10177782B2 (en) Hardware apparatuses and methods for data decompression
KR101059335B1 (en) Efficient Use of JHT in Processors with Variable Length Instruction Set Execution Modes
US6223277B1 (en) Data processing circuit with packed data structure capability
CN104657110B (en) Instruction cache with fixed number of variable length instructions
US5774710A (en) Cache line branch prediction scheme that shares among sets of a set associative cache
CN101223504B (en) Caching instructions for a multiple-state processor
US5187793A (en) Processor with hierarchal memory and using meta-instructions for software control of loading, unloading and execution of machine instructions stored in the cache
CN108108190B (en) Calculation method and related product
MX2007010773A (en) Power saving methods and apparatus for variable length instructions.
US20160202985A1 (en) Variable Length Instruction Processor System and Method
US7949862B2 (en) Branch prediction table storing addresses with compressed high order bits
CN112631660A (en) Method for parallel instruction extraction and readable storage medium
US7546445B2 (en) Information processor having delayed branch function with storing delay slot information together with branch history information
CN115878187B (en) Processor instruction processing apparatus and method supporting compressed instructions
US7346737B2 (en) Cache system having branch target address cache
JPH03129432A (en) Branch control circuit
US5381532A (en) Microprocessor having branch aligner between branch buffer and instruction decoder unit for enhancing initiation of data processing after execution of conditional branch instruction
US6237087B1 (en) Method and apparatus for speeding sequential access of a set-associative cache
CN116339832A (en) Data processing device, method and processor
CN114003292B (en) Branch prediction method and device and processor core
US7827355B1 (en) Data processor having a cache with efficient storage of predecode information, cache, and method
JP4601624B2 (en) Direct memory access unit with instruction predecoder
CN116627506A (en) Micro instruction cache and operation method, processor core and instruction processing method
US7711926B2 (en) Mapping system and method for instruction set processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant