CN112199118A - Instruction merging method, out-of-order execution equipment, chip and storage medium - Google Patents

Publication number
CN112199118A
Authority
CN
China
Prior art keywords
instruction
single operator
target
unit
reservation station
Legal status
Withdrawn
Application number
CN202011089841.6A
Other languages
Chinese (zh)
Inventor
刘君
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines

Abstract

The embodiments of the application disclose an instruction merging method, an out-of-order execution device, a chip and a storage medium. The instruction merging method is applied to an out-of-order execution device that is configured with an instruction merging unit, and the method comprises the following steps: obtaining single operator instructions by parsing a software program, and storing the single operator instructions in a reservation station; determining, through the reservation station, the dependency relationship corresponding to the single operator instructions; determining, through the instruction merging unit and according to an instruction coding correspondence table and the dependency relationship, target single operator instructions corresponding to a target hardware unit from the single operator instructions stored in the reservation station; and merging, through the instruction merging unit, the target single operator instructions into a target multi-operator instruction, and dispatching the target multi-operator instruction to the target hardware unit so that the target hardware unit executes the target multi-operator instruction.

Description

Instruction merging method, out-of-order execution equipment, chip and storage medium
Technical Field
The present invention relates to the field of integrated circuit design technologies, and in particular, to an instruction merging method, an out-of-order execution device, a chip, and a storage medium.
Background
For an instruction containing multiple operations, such as multiply-accumulate, the common solution in the industry is to define a special multi-operator instruction in the instruction set and then have the compiler preferentially use the multi-operator instruction during the software compiling stage. That is to say, the existing technical solution mainly relies on software means, performing multiple computer operations in one clock cycle by defining additional multi-operator instructions.
However, the increase of redundant instructions means that more instruction set encoding space is required, which results in an increase of code size, and accordingly, more memory and cache space is required for storing larger code segments.
Disclosure of Invention
The embodiment of the application provides an instruction merging method, an out-of-order execution device, a chip and a storage medium, which can effectively reduce the size of an instruction, further reduce the size of the storage medium required by the chip and reduce the manufacturing cost of the chip.
The technical scheme of the embodiment of the application is realized as follows:
In a first aspect, an embodiment of the present application provides an instruction merging method, where the instruction merging method is applied to an out-of-order execution device configured with an instruction merging unit, and the method includes:
obtaining a single operator instruction by parsing a software program, and storing the single operator instruction in a reservation station;
determining, through the reservation station, the dependency relationship corresponding to the single operator instruction;
determining, through the instruction merging unit and according to an instruction coding correspondence table and the dependency relationship, target single operator instructions corresponding to a target hardware unit from the single operator instructions stored in the reservation station;
merging, through the instruction merging unit, the target single operator instructions into a target multi-operator instruction, and dispatching the target multi-operator instruction to the target hardware unit so that the target hardware unit executes the target multi-operator instruction.
In a second aspect, an embodiment of the present application provides an out-of-order execution device configured with an instruction merging unit, where the out-of-order execution device includes an acquisition unit, a storage unit, a determining unit, a merging unit and a dispatch unit, wherein:
the acquisition unit is configured to obtain the single operator instruction by parsing the software program;
the storage unit is configured to store the single operator instruction in a reservation station;
the determining unit is configured to determine, through the reservation station, the dependency relationship corresponding to the single operator instruction, and to determine, through the instruction merging unit and according to an instruction coding correspondence table and the dependency relationship, target single operator instructions corresponding to a target hardware unit from the single operator instructions stored in the reservation station;
the merging unit is configured to merge, through the instruction merging unit, the target single operator instructions into a target multi-operator instruction;
the dispatch unit is configured to dispatch the target multi-operator instruction to the target hardware unit, so that the target hardware unit executes the target multi-operator instruction.
In a third aspect, an out-of-order execution device is provided in an embodiment of the present application, and includes an instruction merging unit, a reservation station, a processor, and a memory storing instructions executable by the processor, and when the instructions are executed by the processor, the instruction merging method is implemented.
In a fourth aspect, embodiments of the present application provide a chip, where the chip includes a programmable logic circuit and/or program instructions, and when the chip runs, the instruction merging method described above is implemented.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a program is stored, and the program is applied to an out-of-order execution device, and when the program is executed by a processor, the method for merging instructions as described above is implemented.
The embodiments of the application provide an instruction merging method, an out-of-order execution device, a chip and a storage medium. The out-of-order execution device obtains single operator instructions by parsing a software program and stores the single operator instructions in a reservation station; determines, through the reservation station, the dependency relationship corresponding to the single operator instructions; determines, through the instruction merging unit and according to the instruction coding correspondence table and the dependency relationship, target single operator instructions corresponding to the target hardware unit from the single operator instructions stored in the reservation station; and merges, through the instruction merging unit, the target single operator instructions into a target multi-operator instruction and dispatches the target multi-operator instruction to the target hardware unit so that the target hardware unit executes it. Therefore, in the application, the out-of-order execution device can automatically merge the interdependent single operator instructions stored in the reservation station through the instruction merging unit at the instruction dispatching stage, and then dispatch the merged multi-operator instruction to a hardware unit that supports multi-operator instructions. This reduces the need to encode multi-operator instructions, so that the size of the instructions can be effectively reduced, the size of the storage medium required by the chip is further reduced, and the manufacturing cost of the chip is reduced.
Drawings
FIG. 1 is a first schematic flowchart of an implementation of the instruction merging method;
FIG. 2 is a first schematic diagram of single operator instructions stored in a reservation station;
FIG. 3 is a schematic diagram of an instruction coding correspondence table;
FIG. 4 is a second schematic flowchart of an implementation of the instruction merging method;
FIG. 5 is a second schematic diagram of single operator instructions stored in a reservation station;
FIG. 6 is a third schematic flowchart of an implementation of the instruction merging method;
FIG. 7 is a first schematic diagram of an implementation of instruction merging;
FIG. 8 is a second schematic diagram of an implementation of instruction merging;
FIG. 9 is a fourth schematic flowchart of an implementation of the instruction merging method;
FIG. 10 is a first schematic structural diagram of an out-of-order execution device;
FIG. 11 is a second schematic structural diagram of an out-of-order execution device.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are illustrative of the relevant application and are not limiting of the application. It should be noted that, for the convenience of description, only the parts related to the related applications are shown in the drawings.
Before the embodiments of the present invention are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to these terms and expressions.
Compiler: a compiler is responsible for compiling software code into a file that hardware can execute;
Instruction: in a computer system, an instruction is code that tells the computer to perform a particular operation, and each computer operation corresponds to an instruction;
Single operator instruction: a single operator instruction is an instruction that contains only one computer operation;
Multi-operator instruction: a multi-operator instruction is an instruction that contains multiple computer operations;
Coding space: the coding space is used for mapping computer instructions into binary numbers consisting of 0 and 1;
Clock cycle: a clock cycle is the basic unit of time in which a computer chip processes instructions. During one clock cycle, the computer chip may perform one or more operations;
Out-of-order execution processor: an out-of-order execution processor is a processor that executes instructions out of order by reordering them during instruction dispatch;
Pipeline: in a computer system, a pipeline is a structure in which a series of processing units are connected in series;
Instruction dispatch stage: the instruction dispatch stage is the stage in the processor pipeline that is responsible for dispatching instructions to the various processing units;
Instruction execution stage: the instruction execution stage is the stage in the processor pipeline that actually executes the received instructions;
Reservation station: in an out-of-order execution architecture, the reservation station is the unit responsible for storing decoded instructions so that they can be executed out of order;
Data dependency of an instruction: an instruction has a data dependency when it needs the result of a previous instruction as input in order to perform its own operation;
Arithmetic logic unit: an arithmetic logic unit is an arithmetic unit dedicated to performing arithmetic and logical operations.
A computer mainly comprises five parts: a controller (which analyzes and executes machine instructions and coordinates the work of all parts), an arithmetic unit (which performs arithmetic and logic operations on data according to control signals), a memory (internal memory stores intermediate results, and external memory stores information that must be kept for a long time), an input device (which receives external information) and an output device (which transmits information to the outside). A modern computer, which has become memory-centric, consists of three major parts: 1. a Central Processing Unit (CPU), whose core components are an Arithmetic Logic Unit (ALU) and a Control Unit (CU); 2. I/O devices, which are controlled by the CU; and 3. a Main Memory (MM), which is divided into Random Access Memory (RAM) and Read-Only Memory (ROM). The CPU and the MM together form the host, and the I/O devices may be referred to as external devices.
Generally, the execution of an instruction in a CPU mainly includes five stages: instruction fetch, instruction decode, memory access (operand fetch), instruction execution, and result write-back.
Specifically, the Instruction Fetch (IF) stage is the process of fetching an Instruction from main memory to an Instruction register. Wherein, the value in the Program Counter (PC) is used to indicate the location of the current instruction in the main memory. When an instruction is fetched, the value in the PC is automatically incremented according to the instruction word length.
After the Instruction is fetched, the computer immediately enters an Instruction Decode (ID) stage. In the instruction decoding stage, the instruction decoder splits and interprets the fetched instruction according to a predetermined instruction format, and identifies and distinguishes different instruction types and various methods for obtaining operands. In a computer controlled by combinational logic, an instruction decoder generates different control potentials for different instruction operation codes to form different micro-operation sequences; in a microprogram-controlled computer, an instruction decoder uses an instruction opcode to find an entry for a microprogram that executes the instruction, and execution is started from this entry.
Depending on what the instruction needs, main memory may have to be accessed to read operands, in which case the Memory access (MEM) stage is entered. The task of this stage is to obtain the address of the operand in main memory from the address code of the instruction, and to read the operand from main memory for use in the operation.
After the instruction fetch and instruction decode stages, the Execute instruction (EX) stage is then entered. The task of this stage is to perform various operations specified by the instruction, and to implement the function of the instruction. To this end, different parts of the CPU are connected to perform the required operations. For example, if an addition operation is required, the arithmetic logic unit ALU will be connected to a set of inputs providing the values to be added and a set of outputs containing the final result of the operation.
As a last stage, a result write-back (WB) stage "writes back" the execution result data of the execute instruction stage to some form of storage: the result data is often written to internal registers of the CPU for quick access by subsequent instructions; in some cases, the resulting data may also be written to a relatively slow, but inexpensive and large capacity main memory. Many instructions also change the state of flag bits in the program status word register that identify different operation results that may be used to affect program behavior.
After the instruction execution is completed and the result data is written back, if no external event (such as result overflow) occurs, the computer then fetches the next instruction address from the program counter PC, starts a new cycle, and fetches the next instruction in sequence in the next instruction cycle.
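The instruction cycle described above can be sketched as a toy interpreter. This is only an illustrative sketch: the tuple instruction format, the three opcodes (LOAD, ADD, MUL) and the function name run are assumptions made for the example and do not come from the application.

```python
# Toy instruction cycle: IF, ID, MEM (operand fetch), EX, WB.
# The instruction format and opcodes below are hypothetical.

def run(program, memory):
    registers = {}
    pc = 0  # program counter: index of the next instruction to fetch
    while pc < len(program):
        # IF: fetch the instruction the PC points to, then advance the PC.
        instruction = program[pc]
        pc += 1
        # ID: split the instruction into its opcode and operand fields.
        opcode, dest, src_a, src_b = instruction
        if opcode == "LOAD":
            # MEM: read the operand from main memory.
            value = memory[src_a]
        elif opcode == "ADD":
            # EX: perform the operation named by the opcode.
            value = registers[src_a] + registers[src_b]
        elif opcode == "MUL":
            value = registers[src_a] * registers[src_b]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # WB: write the result back to the destination register.
        registers[dest] = value
    return registers


memory = {0: 3, 1: 4}
program = [
    ("LOAD", "R1", 0, None),
    ("LOAD", "R2", 1, None),
    ("MUL", "R3", "R1", "R2"),
    ("ADD", "R4", "R3", "R1"),
]
result = run(program, memory)
```

Each loop iteration corresponds to one instruction cycle; a real CPU overlaps these stages in a pipeline rather than running them one instruction at a time.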
For an instruction containing multiple operations, such as multiply-accumulate, the common solution in the industry is to define a special multi-operator instruction in the instruction set and then have the compiler preferentially use the multi-operator instruction during the software compiling stage. Taking multiply-accumulate as an example, a processor that supports multiply-accumulate generally has 3 instructions: add, multiply, and multiply-accumulate. At program compile time, if the compiler finds that the result of a multiplication is the input of an addition, it emits a multiply-accumulate instruction directly.
That is to say, the existing technical solution mainly uses a software means to implement multiple computer operations in one clock cycle by defining additional multi-operator instructions. However, the increase of redundant instructions means that more instruction set encoding space is required, which results in an increase of code size, and accordingly, more memory and cache space is required for storing larger code segments.
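The existing compile-time approach can be illustrated with a minimal peephole pass. The sketch below is not taken from any real compiler: the (opcode, destination, sources) tuple format, the opcode name MAC and the function name fuse_multiply_accumulate are assumptions, and the pass assumes that the product register is a temporary that no later instruction reads.

```python
# Hypothetical compile-time fusion: a MUL whose result feeds the next ADD
# is replaced by one multi-operator MAC instruction.
# Assumption: the product register is not read by any later instruction.

def fuse_multiply_accumulate(instructions):
    fused = []
    i = 0
    while i < len(instructions):
        opcode, dest, sources = instructions[i]
        if opcode == "MUL" and i + 1 < len(instructions):
            next_opcode, next_dest, next_sources = instructions[i + 1]
            # Operands of the ADD that are not the product of the MUL.
            addends = [s for s in next_sources if s != dest]
            if next_opcode == "ADD" and len(addends) == 1:
                fused.append(("MAC", next_dest, sources + (addends[0],)))
                i += 2  # both single operator instructions are consumed
                continue
        fused.append(instructions[i])
        i += 1
    return fused


source_program = [
    ("MUL", "R3", ("R1", "R2")),
    ("ADD", "R5", ("R3", "R4")),
    ("SUB", "R6", ("R1", "R2")),
]
compiled = fuse_multiply_accumulate(source_program)
```

With this approach the MAC opcode must exist in the instruction set, which is the encoding-space cost that the paragraph above describes.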
In order to solve the existing problems, in the application, the out-of-order execution device can automatically merge the interdependent single operator instructions stored in the reservation station through the instruction merging unit at the instruction dispatching stage, and then dispatch the merged multi-operator instructions to the hardware unit supporting the multi-operator instructions, so that the coding requirement on the multi-operator instructions is reduced, the size of the instructions can be effectively reduced, the size of a storage medium required by a chip is further reduced, and the manufacturing cost of the chip is reduced.
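As a rough sketch of the idea, merging at the dispatch stage might look as follows. The contents of the correspondence table, the dictionary layout of a reservation station entry, the unit name multiply_accumulate_unit and the function name merge_for_dispatch are all hypothetical; the application does not specify them at this point, and the sketch assumes the intermediate product register is not needed by other instructions.

```python
# Hypothetical instruction coding correspondence table:
# (producer opcode, consumer opcode) -> (multi-operator opcode, hardware unit).
CORRESPONDENCE_TABLE = {("MUL", "ADD"): ("MAC", "multiply_accumulate_unit")}


def merge_for_dispatch(reservation_station):
    """Return (merged instruction, target unit) for the first mergeable pair, else None."""
    for producer in reservation_station:
        for consumer in reservation_station:
            if producer is consumer:
                continue
            # The consumer depends on the producer if it reads the producer's output.
            depends = producer["dest"] in consumer["sources"]
            key = (producer["opcode"], consumer["opcode"])
            if depends and key in CORRESPONDENCE_TABLE:
                merged_opcode, unit = CORRESPONDENCE_TABLE[key]
                extra = tuple(s for s in consumer["sources"] if s != producer["dest"])
                merged = {
                    "opcode": merged_opcode,
                    "sources": producer["sources"] + extra,
                    "dest": consumer["dest"],
                }
                return merged, unit
    return None


station = [
    {"opcode": "SUB", "sources": ("R1", "R2"), "dest": "R6"},
    {"opcode": "MUL", "sources": ("R1", "R2"), "dest": "R3"},
    {"opcode": "ADD", "sources": ("R3", "R4"), "dest": "R5"},
]
merged_result = merge_for_dispatch(station)
```

In this sketch the program itself contains only single operator instructions, so the multi-operator opcode exists only inside the hardware and needs no encoding in the instruction set.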
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
An embodiment of the present application provides an instruction merging method, which may be applied to an out-of-order execution device configured with an instruction merging unit. FIG. 1 is a schematic diagram of an implementation flow of the instruction merging method. As shown in FIG. 1, in an embodiment of the present application, the method by which the out-of-order execution device merges instructions may include the following steps:
Step 101, obtaining a single operator instruction by parsing a software program, and storing the single operator instruction in a reservation station.
In the embodiment of the application, when the out-of-order execution device runs a software program, it may first parse the software program to obtain the single operator instructions corresponding to the software program, and may then store the single operator instructions in the reservation station.
It should be noted that, in the embodiment of the present application, the out-of-order execution device may be any electronic device capable of performing out-of-order execution on instructions, including but not limited to: tablet computers, mobile phones, electronic readers, Personal Computers (PCs), notebook computers, in-vehicle devices, wearable devices, and the like. Accordingly, the target device is an electronic device for receiving screen projection data, for example, a tablet computer, a projection screen, a notebook computer, a display screen, or a fixed terminal such as a smart television.
In the field of computer engineering, out-of-order execution (OoOE/OOE) is a paradigm applied in high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by certain types of delay. In this paradigm, the processor executes instructions in an order determined by the availability of input data, rather than by their original order in the program. In this way, the processor avoids sitting idle while it waits for the operands of the next program instruction and can instead process other instructions that are immediately executable.
Specifically, out-of-order execution is a technique by which a Central Processing Unit (CPU) allows multiple instructions to be dispatched to the corresponding circuit units for processing in an order other than the one specified by the program. For example, suppose a certain section of a program has 7 instructions. After analyzing the idle state of each circuit unit and whether each instruction can be executed in advance, the CPU immediately sends the instructions that can be executed in advance to the corresponding circuits for execution. After the units have executed the instructions out of the specified order, the results must be rearranged by the corresponding circuits into the instruction order specified by the original program before being returned to the program.
That is, after analyzing the state of each circuit unit and the specific situation of whether each instruction can be executed in advance, the instruction capable of being executed in advance is immediately sent to the corresponding circuit unit to be executed, during which the instructions are executed out of the prescribed order, and then the result of each execution unit is rearranged in the instruction order by the rearrangement unit. The purpose of the out-of-order execution technology is to make the internal circuit of the CPU run at full load and correspondingly increase the speed of running programs of the CPU.
With in-order execution, the pipeline stalls as soon as an instruction with an unresolved dependency is encountered; with out-of-order execution, the processor can skip that instruction and issue the next independent one. The execution units can thus be kept busy, minimizing wasted time. For example, out-of-order execution may allow instructions 4-8 to be issued before instruction 3 is issued, and the execution results of these instructions can be retired immediately after instruction 3 retires (in-order retirement is necessary for x86 CPUs), which increases the actual decode rate by 25%.
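The difference between the two issue policies can be seen in a small simulation. This is a simplified sketch under stated assumptions: one instruction is issued per cycle, an instruction is described by a hypothetical (name, destination, sources, latency) tuple, and hazards other than read-after-write are ignored, as if registers were renamed.

```python
# Toy single-issue scheduler comparing in-order and out-of-order issue.
# Only read-after-write dependencies are modeled (hypothetical model).

def simulate(program, initial_registers, out_of_order):
    ready_at = {reg: 0 for reg in initial_registers}  # cycle at which a value is available
    pending = list(program)
    issue_order = []
    cycle = 0
    while pending:
        # In-order issue may only consider the oldest pending instruction.
        candidates = pending if out_of_order else pending[:1]
        for instr in candidates:
            name, dest, sources, latency = instr
            if all(ready_at.get(src, float("inf")) <= cycle for src in sources):
                ready_at[dest] = cycle + latency
                issue_order.append((cycle, name))
                pending.remove(instr)
                break  # at most one instruction is issued per cycle
        cycle += 1
    return issue_order


program = [
    ("I1", "R1", ("R0",), 3),       # long-latency load
    ("I2", "R2", ("R1", "R0"), 1),  # depends on I1
    ("I3", "R3", ("R0", "R0"), 1),  # independent
    ("I4", "R4", ("R0", "R0"), 1),  # independent
]
in_order = simulate(program, {"R0"}, out_of_order=False)
reordered = simulate(program, {"R0"}, out_of_order=True)
```

In this example the in-order pipeline stalls for two cycles behind I2 and issues its last instruction in cycle 5, while the out-of-order policy fills those cycles with I3 and I4 and issues its last instruction in cycle 3.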
Further, in the embodiments of the present application, the out-of-order execution device may be configured with an instruction merging unit, through which different single operator instructions that depend on one another may be combined and merged at the instruction dispatch stage.
It should be noted that, in the embodiment of the present application, the out-of-order execution device may parse the software program through the compiler, and convert the software code into the instruction executable by the hardware unit according to the instruction code. Specifically, the instruction obtained by the out-of-order execution device through analysis is a single operator instruction, that is, in the present application, the executable instructions corresponding to the software program are all single operator instructions.
It is understood that, in the present application, the out-of-order execution device may obtain, by parsing the software program, at least one single operator instruction corresponding to the software program, and may then store the single operator instruction in the reservation station, so as to be used for the subsequent instruction dispatching processing. The single operator instruction corresponding to the software program may include an addition instruction, a subtraction instruction, a multiplication instruction, a division instruction, a remainder instruction, and the like.
Reservation Stations (RSs) include one or more ordered queues. When multiple instructions are ready to be dispatched from the RS queue, meaning that the instructions satisfy the conditions for being dispatched to the execution units, one or more of the ready instructions are dispatched to the corresponding execution units. When the execution unit is available, and any operands necessary for instruction execution are also available, it means that the instruction is ready to be dispatched.
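The readiness condition described above can be sketched as a simple filter over the queue. The opcode-to-unit mapping, the unit names and the function name ready_to_dispatch are assumptions made for illustration only.

```python
# Hypothetical mapping from opcode to the execution unit that handles it.
UNIT_FOR_OPCODE = {"ADD": "adder", "SUB": "adder", "MUL": "multiplier"}


def ready_to_dispatch(queue, available_registers, free_units):
    """Return the queued instructions whose unit is free and whose operands are available."""
    ready = []
    for opcode, dest, sources in queue:
        unit_free = UNIT_FOR_OPCODE[opcode] in free_units
        operands_ready = all(src in available_registers for src in sources)
        if unit_free and operands_ready:
            ready.append((opcode, dest, sources))
    return ready


queue = [
    ("MUL", "R3", ("R1", "R2")),  # operands available, multiplier free
    ("ADD", "R5", ("R3", "R4")),  # R3 has not been produced yet
    ("SUB", "R6", ("R1", "R2")),  # operands available, but the adder is busy
]
ready = ready_to_dispatch(
    queue,
    available_registers={"R1", "R2", "R4"},
    free_units={"multiplier"},
)
```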
Further, in the embodiment of the present application, when the out-of-order execution device obtains a single operator instruction by parsing a software program and stores the single operator instruction in the reservation station, the out-of-order execution device may first parse the software program to obtain the operation code and register information corresponding to the single operator instruction; the operation code and the register information may then be stored in the reservation station.
It should be noted that, in the embodiments of the present application, an operation code (opcode) is the part or field of an instruction, usually represented by a code, that specifies the operation to be performed; in effect it is an instruction sequence number that tells the CPU which instruction needs to be executed. For example, the opcode corresponding to the add instruction may be ADD, the opcode corresponding to the add-with-carry instruction may be ADC, the opcode corresponding to the subtract instruction without borrow may be SUB, the opcode corresponding to the unsigned multiply instruction may be MUL, the opcode corresponding to the signed multiply instruction may be IMUL, the opcode corresponding to the unsigned divide instruction may be DIV, and the opcode corresponding to the signed divide instruction may be IDIV.
It is understood that, in the embodiments of the present application, the register information corresponding to an instruction may include a destination register and source registers, where the source registers correspond to the inputs of the instruction and the destination register corresponds to its output. For example, if the opcode of a single operator instruction is SUB, that is, the instruction is a subtraction instruction, the instruction may be executed by a corresponding hardware unit such as a subtraction circuit. Specifically, if the source registers of the single operator instruction are R1 and R2 and the destination register is R6, then the inputs of the subtraction circuit are R1 and R2, the output of the subtraction circuit is R6, and the result of executing the instruction is written to R6.
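The SUB example above can be written out as a short sketch. The register values, the OPERATIONS table and the function name execute are assumptions chosen for illustration; only the register names R1, R2 and R6 come from the text.

```python
import operator

# Hypothetical mapping from opcode to the operation its hardware unit performs.
OPERATIONS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def execute(opcode, sources, dest, registers):
    """Read the source registers, apply the operation, and write the destination register."""
    operands = [registers[src] for src in sources]
    registers[dest] = OPERATIONS[opcode](*operands)
    return registers[dest]


registers = {"R1": 9, "R2": 4}  # example input values
execute("SUB", ("R1", "R2"), "R6", registers)
```

After the call, R6 holds the difference of R1 and R2, while the source registers are left unchanged.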
Therefore, in the application, after at least one single operator instruction is obtained through analysis, the hardware in the out-of-order execution device can read the single operator instructions, and then the operation codes and the register information of the single operator instructions are put into the reservation station in a one-to-one correspondence manner.
For example, in the present application, FIG. 2 is a schematic diagram of single operator instructions stored in a reservation station. As shown in FIG. 2, a plurality of single operator instructions are stored in the reservation station, where, for each single operator instruction, the reservation station stores the corresponding opcode and register information (a source register, a destination register 0, and a destination register 1) in association with each other.
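Storing the opcode and register information of each parsed instruction can be sketched as follows. The textual form "OPCODE dest, src0, src1", the dictionary layout of an entry and the function name parse_instruction are assumptions made for this example and are not taken from FIG. 2.

```python
# Hypothetical textual instruction format: "OPCODE dest, src0, src1".

def parse_instruction(text):
    """Split one instruction into its opcode and register information."""
    opcode, operands = text.split(maxsplit=1)
    dest, *sources = [reg.strip() for reg in operands.split(",")]
    return {"opcode": opcode, "sources": tuple(sources), "dest": dest}


program_text = [
    "MUL R3, R1, R2",
    "ADD R5, R3, R4",
    "SUB R6, R1, R2",
]
# Each entry keeps the opcode together with the register information it belongs to.
reservation_station = [parse_instruction(line) for line in program_text]
```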
Step 102, determining, through the reservation station, the dependency relationship corresponding to the single operator instruction.
In the embodiment of the application, after the out-of-order execution device obtains the single operator instructions by parsing the software program and stores them in the reservation station, the reservation station can analyze the dependencies among the stored single operator instructions to determine the dependency relationship corresponding to the single operator instructions.
It will be appreciated that in embodiments of the present application, the dependencies between two or more single operator instructions may be interdependent or independent of each other.
It should be noted that, in the present application, when determining the dependency relationship among a plurality of single operator instructions, the reservation station may determine that two single operator instructions are interdependent if there is a data dependency between them, and may determine that two single operator instructions are independent of each other if there is no data dependency between them.
For example, in the present application, if the output of one instruction is the input of another instruction, i.e. the destination register corresponding to one instruction is the same as the source register corresponding to another instruction, then there is a data dependency between the two instructions, i.e. there is a mutual dependency between the two instructions.
Further, in the embodiment of the present application, when the out-of-order execution device determines the dependency relationship corresponding to the single operator instruction through the reservation station, according to the register information, a first destination register and a first source register corresponding to the first single operator instruction, and a second destination register and a second source register corresponding to the second single operator instruction may be determined; if the first destination register and the second source register are identical or the second destination register and the first source register are identical, it may be determined by a reservation station that the first single operator instruction and the second single operator instruction are interdependent.
Accordingly, if the first destination register and the second source register are not the same and the second destination register and the first source register are not the same, then the first single operator instruction and the second single operator instruction are determined to be independent of each other by the reservation station.
The first single operator instruction and the second single operator instruction may be any two different instructions in the single operator instructions stored by the reservation station.
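The register comparison described above can be sketched as follows. This is only a minimal illustration: the `Entry` class, its field names, and the register names are assumptions made for the example and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Entry:
    """One reservation-station entry (field names are illustrative)."""
    opcode: str
    dest: str                 # destination register
    sources: Tuple[str, ...]  # source registers

def interdependent(first: Entry, second: Entry) -> bool:
    """True if either instruction's destination is a source of the other."""
    return first.dest in second.sources or second.dest in first.sources

mul = Entry("MUL", dest="R12", sources=("R1", "R2"))
add = Entry("ADD", dest="R5", sources=("R12", "R3"))
sub = Entry("SUB", dest="R7", sources=("R8", "R9"))

print(interdependent(mul, add))  # True: MUL writes R12, ADD reads R12
print(interdependent(mul, sub))  # False: no shared register
```

The check is symmetric, matching the rule that either ordering of destination and source registers establishes interdependence.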
It will be appreciated that, in computer hardware that supports out-of-order execution, there is a unit called a reservation station that stores instructions waiting to be dispatched. The reservation station is responsible for checking the data dependencies of the instructions it holds. When all inputs of an instruction are ready, the instruction is dispatched to the next pipeline stage.
Step 103, determining, through the instruction merging unit, target single operator instructions corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship.
In an embodiment of the application, after determining, by a reservation station, a dependency relationship corresponding to a single operator instruction in an out-of-order execution device, the out-of-order execution device may determine, by a configured instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instruction stored by the reservation station according to an instruction coding correspondence table and the dependency relationship. The target single operator instruction is a plurality of instructions forming a multi-operator instruction supported by the target hardware unit, and the target hardware unit may be any one of hardware units which are configured in the out-of-order execution device and can support the multi-operator instruction.
It is understood that, in the present application, the hardware of the out-of-order execution device contains hardware units capable of performing multi-operator operations. These hardware units are the basis for performing multi-operator operations and are required both when the dedicated multi-operator instructions commonly used in the industry are adopted and when the hardware instruction merging of the present application is adopted.
It should be noted that, in the embodiment of the present application, the instruction coding correspondence table stores the correspondence among a hardware unit supporting a multi-operator instruction, the multi-operator instruction itself, and the single operator instructions constituting the multi-operator instruction. Specifically, each entry in the instruction coding correspondence table corresponds to one hardware unit supporting a multi-operator operation. Because the composition of each multi-operator operation is fixed, the instruction coding correspondence table is fixed once the multi-operator units inside the hardware are fixed.
That is, in the present application, the instruction encoding correspondence table corresponds to all hardware units capable of supporting a multi-operator instruction in the out-of-order execution apparatus.
For example, in the present application, fig. 3 is a schematic diagram of an instruction coding correspondence table. As shown in fig. 3, the table stores, in correspondence with each other, the hardware identifier of the hardware supporting a multi-operator instruction, the opcode of the multi-operator instruction, and the hardware identifiers and opcodes of the single operator instructions constituting the multi-operator instruction. For example, when the multi-operator instruction is a multiply-accumulate instruction, the hardware identifier of the hardware supporting the multiply-accumulate operation is 010 and the opcode of the multiply-accumulate instruction is MADD. The single operator instructions constituting the multiply-accumulate instruction are a multiply instruction and an add instruction, where the hardware identifier corresponding to the multiply instruction is 0010 and its opcode is MUL, and the hardware identifier corresponding to the add instruction is 1001 and its opcode is ADD.
Illustratively, as shown in fig. 3, when the multi-operator instruction is A, the hardware identifier of the hardware supporting A is A1 and the opcode of the A instruction is A2. The single operator instructions constituting the A instruction are B, C, D, and E, where the hardware identifier of the B instruction is B1 and its opcode is B2, the hardware identifier of the C instruction is C1 and its opcode is C2, the hardware identifier of the D instruction is D1 and its opcode is D2, and the hardware identifier of the E instruction is E1 and its opcode is E2.
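As an illustration only, the multiply-accumulate entry of the table described above might be modelled in software as a dictionary. The identifier and opcode values are those given for fig. 3; the dictionary layout, the key names, and the helper `component_opcodes` are assumptions made for this sketch, since in the application the table is a fixed hardware structure.

```python
# Instruction coding correspondence table, modelled as a plain dictionary.
# Keys are multi-operator opcodes; values follow the MADD example of fig. 3.
CORRESPONDENCE_TABLE = {
    "MADD": {
        "hardware_id": "010",
        "components": [
            {"opcode": "MUL", "hardware_id": "0010"},
            {"opcode": "ADD", "hardware_id": "1001"},
        ],
    },
}

def component_opcodes(multi_opcode):
    """Return the single operator opcodes that make up a multi-operator instruction."""
    entry = CORRESPONDENCE_TABLE[multi_opcode]
    return [c["opcode"] for c in entry["components"]]

print(component_opcodes("MADD"))  # ['MUL', 'ADD']
```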
Further, in an embodiment of the present application, when the instruction merging unit determines, according to the instruction code correspondence table and the dependency relationship, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station, the instruction merging unit may first query the instruction code correspondence table, and determine the target multi-operator instruction corresponding to the target hardware unit and an instruction to be merged which constitutes the target multi-operator instruction; and then the instruction merging unit queries the reservation station, and if the single operator instruction stored in the reservation station has the instruction to be merged and the dependency corresponding to the instruction to be merged is interdependent, the instruction to be merged is determined as the target single operator instruction by the instruction merging unit.
That is to say, in the present application, when the instruction merging unit in the out-of-order execution device merges different single operator instructions by using the instruction coding correspondence table and the dependency relationship between the single operator instructions stored in the reservation station, the instruction merging unit may first determine, based on that dependency relationship, instructions to be merged that are interdependent. It may then query the instruction coding correspondence table according to the instructions to be merged to determine whether there is a target multi-operator instruction, supported by a target hardware unit, that is composed of the instructions to be merged; if so, the instruction merging unit determines the instructions to be merged as the target single operator instructions.
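The dependency-first order just described can be sketched as follows. The tuple layout `(opcode, destination, sources)`, the reverse lookup `BY_COMPONENTS`, and the register names are hypothetical. The sketch also assumes that only a producer-to-consumer pair of two instructions is considered, which is narrower than the general case.

```python
# Hypothetical reverse lookup: ordered component opcodes -> multi-operator opcode.
BY_COMPONENTS = {("MUL", "ADD"): "MADD"}

# Hypothetical reservation-station contents: (opcode, dest, sources).
STATION = [
    ("MUL", "R12", ("R1", "R2")),
    ("ADD", "R5", ("R12", "R3")),
    ("SUB", "R7", ("R5", "R9")),
]

def mergeable_pairs(station):
    """Yield (producer, consumer, multi_opcode) for dependent pairs the table supports."""
    for prod in station:
        for cons in station:
            if prod is cons or prod[1] not in cons[2]:
                continue                      # no data dependency
            multi = BY_COMPONENTS.get((prod[0], cons[0]))
            if multi is not None:             # hardware supports this combination
                yield prod, cons, multi

for prod, cons, multi in mergeable_pairs(STATION):
    print(prod[0], "+", cons[0], "->", multi)  # MUL + ADD -> MADD
```

In this example ADD and SUB are also interdependent through R5, but no table entry covers that combination, so they are left for ordinary dispatch.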
Further, in the embodiment of the present application, when the instruction merging unit determines, according to the instruction code correspondence table and the dependency relationship, a target single operator instruction corresponding to the target hardware unit from the single operator instructions stored in the reservation station, the instruction merging unit may also determine, according to the instruction code correspondence table, a target multi-operator instruction that can be supported by the target hardware unit, so as to determine an instruction to be merged that constitutes the target multi-operator instruction, and then, may query, by using the instruction merging unit, all the single operator instructions stored in the reservation station according to the instruction to be merged, and if the instruction to be merged is stored in the reservation station and there is a data dependency between the instructions to be merged, may determine the instruction to be merged as the target single operator instruction.
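The table-first order described in the preceding paragraph can be sketched in the same way. Again, `TABLE`, `STATION`, `find_targets`, and the register names are hypothetical, and the sketch handles only multi-operator instructions made of two components whose first component produces an input of the second.

```python
from itertools import permutations

# Hypothetical table: multi-operator opcode -> ordered component opcodes.
TABLE = {"MADD": ("MUL", "ADD")}

# Hypothetical reservation-station contents: (opcode, dest, sources).
STATION = [
    ("SUB", "R7", ("R8", "R9")),
    ("MUL", "R12", ("R1", "R2")),
    ("ADD", "R5", ("R12", "R3")),
]

def find_targets(table, station):
    """Return (multi_opcode, producer_index, consumer_index) or None."""
    for multi_opcode, (first_op, second_op) in table.items():
        for i, j in permutations(range(len(station)), 2):
            op_i, dest_i, _ = station[i]
            op_j, _, srcs_j = station[j]
            # Both components must be present and linked by a data dependency.
            if (op_i, op_j) == (first_op, second_op) and dest_i in srcs_j:
                return multi_opcode, i, j
    return None

print(find_targets(TABLE, STATION))  # ('MADD', 1, 2)
```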
Step 104, merging the target single operator instructions into a target multi-operator instruction through the instruction merging unit, and dispatching the target multi-operator instruction to the target hardware unit so as to execute the target multi-operator instruction through the target hardware unit.
In the embodiment of the application, after the instruction merging unit determines the target single operator instructions corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship, the out-of-order execution device may merge the target single operator instructions into a target multi-operator instruction through the instruction merging unit, dispatch the target multi-operator instruction to the target hardware unit, and execute the target multi-operator instruction through the target hardware unit.
It can be understood that, in the embodiment of the present application, after the instruction merging unit determines, in the reservation station, the target single operator instruction corresponding to the target multi-operator instruction that can be supported by the target hardware unit, the instruction merging unit may perform merging processing on the target single operator instructions stored in the reservation station, and obtain the corresponding target multi-operator instruction after merging.
Further, in the embodiment of the present application, after obtaining the target multi-operator instruction through the merging process, the instruction merging unit may dispatch the target multi-operator instruction to a target hardware unit capable of supporting the target multi-operator instruction, and the target hardware unit may execute the target multi-operator instruction in the instruction execution stage.
That is, in the present application, after the instruction merging unit combines the interdependent single operator instructions in the reservation station into a corresponding multi-operator instruction based on the instruction coding correspondence table, the multi-operator instruction may be dispatched to the corresponding hardware unit and enter the instruction execution stage, so that the instruction merging unit serves as another source of instruction dispatch.
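The merging itself might look like the following sketch for the multiply-add case. The application does not specify how the operand fields of the merged instruction are formed, so the layout used here is purely an assumption: the add instruction's destination becomes the result register, and the intermediate register written by the multiply instruction is dropped from the source list. The function name and register names are hypothetical.

```python
def merge_mul_add(mul, add):
    """Fuse MUL and ADD entries into one MADD record.

    Assumed layout: the ADD's destination becomes the result register and
    the intermediate register written by MUL is dropped from the sources.
    """
    intermediate = mul["dest"]
    if intermediate not in add["sources"]:
        raise ValueError("ADD does not consume the MUL result")
    addend = [r for r in add["sources"] if r != intermediate]
    return {
        "opcode": "MADD",
        "dest": add["dest"],
        "sources": list(mul["sources"]) + addend,
    }

mul = {"opcode": "MUL", "dest": "R12", "sources": ["R1", "R2"]}
add = {"opcode": "ADD", "dest": "R5", "sources": ["R12", "R3"]}
print(merge_mul_add(mul, add))
# {'opcode': 'MADD', 'dest': 'R5', 'sources': ['R1', 'R2', 'R3']}
```

This sketch further assumes that no other instruction still needs the intermediate register R12; how that case is handled is not addressed here.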
Fig. 4 is a schematic view of an implementation flow of an instruction merging method, as shown in fig. 4, after determining, by an instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to an instruction encoding correspondence table and the dependency relationship, that is, after step 103, the method for performing instruction merging by an out-of-order execution device may further include the following steps:
Step 105, releasing the target single operator instructions through the reservation station.
In the embodiment of the application, the out-of-order execution device may release the target single operator instruction through the reservation station after determining the target single operator instruction corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship by the instruction merging unit.
It should be noted that, in the embodiment of the present application, since the target single operator instructions stored in the reservation station have been merged into the corresponding target multi-operator instruction by the instruction merging unit and dispatched to the corresponding target hardware unit for execution, the reservation station no longer needs to dispatch the target single operator instructions and can therefore release them.
It is understood that in the embodiment of the present application, when storing a single operator instruction by a reservation station, a valid bit may be set for each single operator instruction, where the valid bit may have a value of 0 or 1, and if the reservation station sets the valid bit of an instruction from 1 to 0, it indicates that the instruction is released by the reservation station.
Fig. 5 is a second schematic diagram of single operator instructions stored in the reservation station. As shown in fig. 5, if an add instruction is merged and dispatched as a target single operator instruction by the instruction merging unit, the reservation station may release the add instruction by setting the valid bit corresponding to the add instruction to 0.
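The valid-bit release described above can be sketched as follows. The list-of-dictionaries representation and the `release` helper are illustrative assumptions; only the convention that a valid bit of 0 marks a released entry comes from the text.

```python
station = [
    {"valid": 1, "opcode": "MUL", "dest": "R12"},
    {"valid": 1, "opcode": "ADD", "dest": "R5"},
    {"valid": 1, "opcode": "SUB", "dest": "R7"},
]

def release(station, indices):
    """Clear the valid bit of each merged entry, marking it as released."""
    for i in indices:
        station[i]["valid"] = 0

release(station, [0, 1])              # MUL and ADD were merged and dispatched
print([e["valid"] for e in station])  # [0, 0, 1]
```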
Fig. 6 is a schematic flow chart of an implementation of the instruction merging method, as shown in fig. 6, after determining, by the instruction merging unit, a target single operator instruction corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction encoding correspondence table and the dependency relationship, that is, after step 103, the method for performing instruction merging by the out-of-order execution device may further include the following steps:
Step 106, dispatching, through the reservation station, the single operator instructions other than the target single operator instructions.
In an embodiment of the present application, after determining, by the instruction merging unit, a target single operator instruction corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship, the reservation station may assign other single operator instructions except the target single operator instruction.
It should be noted that, in the embodiment of the present application, since the target single operator instructions stored in the reservation station have been merged into corresponding target multi-operator instructions by the instruction merging unit and are assigned to corresponding target hardware units for execution, the reservation station may directly assign other single operator instructions, other than the target single operator instructions, stored in the reservation station to corresponding hardware units one by one for execution processing.
It should be noted that, in the embodiment of the present application, while the instruction merging flow is being executed, the out-of-order execution device continuously receives new single operator instructions into the reservation station, and the instruction merging unit continuously merges instructions according to the instruction coding correspondence table.
Through the instruction merging method provided in the steps 101 to 106, in the instruction dispatching stage of the out-of-order execution processor, the related and interdependent single operator instructions are automatically combined through the instruction merging unit, so that the real-time instruction combination in the operation stage can be realized, the design of a compiler is simplified, the coding requirement on the multi-operator instructions is further reduced, and the coding space and the total code size are finally saved.
Specifically, in the present application, the instruction merging unit makes use of the reservation station's ability to check data dependencies, so that single operator instructions in the reservation station that depend on each other can be combined into multi-operator instructions in the instruction dispatching stage and then dispatched.
Further, in the present application, since the instruction merging unit implements merging of instructions, the encoding space only includes a single operator instruction, and the storage space is saved by shortening the binary length of instruction encoding. Accordingly, only a single operator instruction is stored in the reservation station. The instruction merging unit can retrieve the data dependence of the single operator instruction in the reservation station, and can also determine whether the single operator instruction can carry out instruction merging processing or not by inquiring the instruction coding corresponding table.
The embodiment of the application provides an instruction merging method, wherein an out-of-order execution device obtains single operator instructions by parsing a software program and stores the single operator instructions to a reservation station; determines the dependency relationship corresponding to the single operator instructions through the reservation station; determines, through the instruction merging unit, target single operator instructions corresponding to the target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship; and merges the target single operator instructions into a target multi-operator instruction through the instruction merging unit and dispatches the target multi-operator instruction to the target hardware unit so as to execute it through the target hardware unit. Therefore, in the application, the out-of-order execution device can automatically merge the interdependent single operator instructions stored in the reservation station through the instruction merging unit at the instruction dispatching stage, and then dispatch the merged multi-operator instructions to the hardware unit supporting them, so that the coding requirement for multi-operator instructions is reduced, the size of the instructions can be effectively reduced, the size of the storage medium required by a chip is further reduced, and the manufacturing cost of the chip is reduced.
Based on the foregoing embodiment, in yet another embodiment of the present application, fig. 7 is a first schematic diagram illustrating implementation of instruction merging, and as shown in fig. 7, the instruction merging method proposed in the present application is applied to an out-of-order execution device, where the out-of-order execution device is configured with an instruction merging unit, and the instruction merging unit is capable of implementing merging processing on a plurality of single operator instructions with dependencies. Meanwhile, the reservation station can also distribute other single operator instructions, and the difference is that the merged multi-operator instruction can be distributed to a hardware unit capable of supporting the multi-operator instruction through the instruction merging unit, and the reservation station directly distributes the stored single operator instruction to the hardware unit supporting the single operator instruction. The hardware unit may include various arithmetic units, arithmetic logic units, and the like.
Further, in the embodiment of the present application, fig. 8 is a second schematic diagram illustrating the implementation of instruction merging, as shown in fig. 8, the reservation station stores a valid bit, an opcode, a destination register, and a source register corresponding to each single operator instruction. The instruction merging unit can query the instruction coding correspondence table and the reservation station, so that the merging of the instructions can be carried out in real time according to the instruction data dependence inside the instruction coding correspondence table and the reservation station.
It will be appreciated that in embodiments of the present application, when determining the dependency relationship between a plurality of single operator instructions stored therein by a reservation station, if the output of one instruction is the input of another instruction, i.e. the destination register corresponding to one instruction is the same as the source register corresponding to another instruction, then there is a data dependency between the two instructions, i.e. there is a mutual dependency between the two instructions. For example, if the source register for the add instruction includes R12 and the destination register for the multiply instruction is R12, the output of the multiply instruction may be determined to be the input of the add instruction, and thus the add instruction and the multiply instruction may be considered interdependent.
It should be noted that, in the present application, when the instruction merging unit in the out-of-order execution device merges different single operator instructions by using the instruction coding correspondence table and the dependency relationship between the single operator instructions stored in the reservation station, the instruction merging unit may first determine, based on that dependency relationship, instructions to be merged that are interdependent. It may then query the instruction coding correspondence table according to the instructions to be merged to determine whether a target hardware unit exists that supports a target multi-operator instruction composed of the instructions to be merged; if such a target hardware unit exists, the instruction merging unit determines the instructions to be merged as the target single operator instructions.
Further, in an embodiment of the present application, after merging the single operator instructions stored in the reservation station into one multi-operator instruction through the merging process, the instruction merging unit may dispatch the multi-operator instruction to a hardware unit capable of supporting the multi-operator instruction, so that the multi-operator instruction may be executed in the instruction execution stage through the hardware unit.
It should be noted that, in the embodiment of the present application, since part of the single operator instructions stored in the reservation station have been merged into corresponding multi-operator instructions by the instruction merging unit and are assigned to corresponding hardware units for execution, other single operator instructions stored in the reservation station may be directly assigned to corresponding hardware units one by the reservation station for execution processing.
It is understood that in the present application, the instruction may be transmitted to the selector through the instruction merging unit and the reservation station, and then the selector selects a part of the instructions among the plurality of single operator instructions and the plurality of multi operator instructions for dispatch. For example, the merged 2 multi-operator instructions are transmitted to the selector by the instruction merging unit, while the reservation station transmits 4 single-operator instructions to the selector, and then the selector can select 4 instructions from the 6 instructions for the dispatch processing.
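The selector example above (4 instructions chosen from 2 merged and 4 single operator instructions) can be sketched as follows. The selection policy is not specified in the text; this sketch simply assumes that merged instructions are taken first and that the remaining slots are filled in reservation-station order. The instruction names and the `width` parameter are hypothetical.

```python
def select(merged, singles, width=4):
    """Pick at most `width` instructions; merged ones first (assumed policy)."""
    candidates = list(merged) + list(singles)
    return candidates[:width]

merged = ["MADD_a", "MADD_b"]          # from the instruction merging unit
singles = ["SUB", "AND", "OR", "XOR"]  # from the reservation station
print(select(merged, singles))  # ['MADD_a', 'MADD_b', 'SUB', 'AND']
```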
Further, in the embodiment of the present application, since the part of the single operator instructions stored in the reservation station has been merged into the corresponding multi-operator instructions by the instruction merging unit and is allocated to the corresponding hardware unit for execution, that is, the reservation station no longer needs to allocate the part of the single operator instructions, the part of the single operator instructions stored before can be released by the reservation station. In particular, if an add instruction and a multiply instruction are merged and dispatched by the instruction merge unit, the reservation station may free the add instruction and the multiply instruction by setting the valid bit corresponding to the add instruction and the multiply instruction to 0.
Fig. 9 is a fourth schematic flowchart of an implementation of the instruction merging method. As shown in fig. 9, taking a multiply-add instruction (a multiply instruction plus an add instruction) as an example, the method for instruction merging by the out-of-order execution device may include the following steps:
Step 201, parsing the program and obtaining the instructions.
The compiler parses the software program and converts the software code into hardware executable instructions based on the instruction code. In the present application, all executable instructions are single operator instructions.
Step 202, store the instructions through the reservation station and determine the dependencies between the instructions.
The computer hardware reads these single operator instructions (including multiply instructions, add instructions, etc.) and places them in the reservation station for subsequent instruction dispatch, at which point the dependencies between these instructions can be determined by the reservation station.
Step 203, determining the target single operator instructions through the instruction merging unit.
By querying the instruction coding correspondence table, the instruction merging unit determines that the current hardware supports a multiply-add multi-operator instruction and that the multiply-add instruction consists of a multiply instruction and an add instruction, i.e. the target single operator instructions are the multiply instruction and the add instruction.
Step 204, retrieving the target single operator instructions from the reservation station through the instruction merging unit.
The instruction merging unit searches the reservation station and finds the multiply instruction and the add instruction stored in it.
Step 205, merging, through the instruction merging unit, the target single operator instructions into the target multi-operator instruction by using the dependency relationship determined by the reservation station.
According to the dependency relationship, the instruction merging unit checks whether there is a data dependency between the multiply instruction and the add instruction in the reservation station. If the two instructions are interdependent, they are merged into a multiply-add instruction to be dispatched, i.e. the target multi-operator instruction is obtained. The dispatched multiply-add instruction is executed in a multiply-accumulate execution unit in the instruction execution stage.
Step 206, releasing the target single operator instructions through the reservation station.
The reservation station releases the corresponding multiply instruction and add instruction from the stored entry.
Step 207, dispatching the instructions separately through the reservation station and the instruction merging unit.
The instruction merging unit dispatches the target multi-operator instruction obtained by merging to a hardware unit supporting the target multi-operator instruction, and the reservation station dispatches the stored instructions other than the target single operator instructions to the corresponding hardware units.
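Steps 202 to 207 can be put together in the following sketch for the multiply-add example. It is a simplified software model under several assumptions that are not in the application: the dictionary layout of reservation-station entries, the two-component `TABLE`, the operand layout of the merged instruction, and the omission of operand-readiness checks and of any limit on the number of instructions dispatched per cycle.

```python
TABLE = {"MADD": ("MUL", "ADD")}  # multi-operator opcode -> components

def dispatch_cycle(station):
    """Run one merge-and-dispatch pass over the reservation station.

    Returns the list of (opcode, dest, sources) records that were dispatched.
    """
    dispatched = []
    for multi, (op_a, op_b) in TABLE.items():            # step 203
        for a in station:
            for b in station:                            # step 204
                if not (a["valid"] and b["valid"]):
                    continue
                if (a["opcode"], b["opcode"]) != (op_a, op_b):
                    continue
                if a["dest"] not in b["sources"]:        # step 205
                    continue
                rest = [r for r in b["sources"] if r != a["dest"]]
                dispatched.append((multi, b["dest"], a["sources"] + rest))
                a["valid"] = b["valid"] = 0              # step 206
    for e in station:                                    # step 207
        if e["valid"]:
            dispatched.append((e["opcode"], e["dest"], e["sources"]))
            e["valid"] = 0
    return dispatched

station = [                                              # step 202
    {"valid": 1, "opcode": "MUL", "dest": "R12", "sources": ["R1", "R2"]},
    {"valid": 1, "opcode": "SUB", "dest": "R7", "sources": ["R8", "R9"]},
    {"valid": 1, "opcode": "ADD", "dest": "R5", "sources": ["R12", "R3"]},
]
for record in dispatch_cycle(station):
    print(record)
```

With these inputs the multiply and add entries leave the station as a single MADD record, while the unrelated SUB entry is dispatched on its own.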
Through the instruction merging method provided in the above steps 201 to 207, in the instruction dispatching stage of the out-of-order execution processor, the instruction merging unit automatically combines the associated and interdependent single operator instructions, so that the real-time instruction combination in the operation stage can be realized, the necessity of multi-operator instruction coding is removed, the instruction set coding space is saved, and the total size of the code is reduced. Accordingly, the smaller overall code size means smaller memory and cache space, and finally, the manufacturing cost of the semiconductor chip can be effectively reduced.
Therefore, by introducing the instruction merging unit in the instruction dispatching stage, the instruction merging method provided by the application effectively reduces the encoding space of the instruction set and reduces the manufacturing cost of the semiconductor chip. Meanwhile, because the instruction merging unit is integrated directly into the existing pipeline structure, little additional hardware logic is required, which lowers the implementation cost of the method and greatly improves its feasibility.
Based on the foregoing embodiments, in another embodiment of the present application, FIG. 10 is a schematic diagram of the composition structure of an out-of-order execution device. As shown in FIG. 10, the out-of-order execution device 10 according to the embodiment of the present application may include an acquisition unit 11, a storage unit 12, a determining unit 13, a merging unit 14, a dispatch unit 15, and a release unit 16, wherein:
the acquisition unit 11 is configured to acquire a single operator instruction by parsing a software program;
the storage unit 12 is configured to store the single operator instruction to a reservation station;
the determining unit 13 is configured to determine, through the reservation station, the dependency relationship corresponding to the single operator instruction, and to determine, through the instruction merging unit and according to an instruction coding correspondence table and the dependency relationship, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station;
the merging unit 14 is configured to merge the target single operator instructions into a target multi-operator instruction through the instruction merging unit;
the dispatch unit 15 is configured to dispatch the target multi-operator instruction to the target hardware unit, so as to execute the target multi-operator instruction through the target hardware unit.
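As an illustration of the merging and dispatching performed by units 14 and 15, the following is a minimal software sketch and not the hardware described in the application. The `Instr` record, the `mul`/`add`/`mac` opcodes, the `mac_unit` name, and the rule that the merged instruction drops the intermediate register are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: opcode plus register information."""
    opcode: str
    dest: str
    sources: Tuple[str, ...]


def merge(producer: Instr, consumer: Instr, multi_opcode: str) -> Instr:
    """Fuse two dependent single operator instructions into one.

    Assumption: the intermediate register (producer.dest) is not needed by
    any other instruction, so it disappears from the merged operand list.
    """
    forwarded = tuple(s for s in consumer.sources if s != producer.dest)
    return Instr(multi_opcode, consumer.dest, producer.sources + forwarded)


def dispatch(instr: Instr, unit: str, queues: Dict[str, List[Instr]]) -> None:
    """Append the instruction to the queue of the named hardware unit."""
    queues.setdefault(unit, []).append(instr)


mul = Instr("mul", "r3", ("r1", "r2"))   # r3 = r1 * r2
add = Instr("add", "r5", ("r3", "r4"))   # r5 = r3 + r4
queues: Dict[str, List[Instr]] = {}
mac = merge(mul, add, "mac")             # r5 = r1 * r2 + r4
dispatch(mac, "mac_unit", queues)
```

Here the multiply-accumulate pair stands in for any combination that the target hardware unit supports as a single multi-operator instruction.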
Further, in an embodiment of the present application, the acquisition unit 11 is specifically configured to parse the software program to obtain an operation code and register information corresponding to the single operator instruction.
Further, in the embodiment of the present application, the storage unit 12 is specifically configured to store the operation code and the register information to the reservation station.
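The parsing and storing steps can be sketched as follows. This is a simplified software model under stated assumptions: the textual syntax `opcode dest, src, src`, the `SingleOpInstruction` record, and the list-backed `ReservationStation` class are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SingleOpInstruction:
    opcode: str                # operation code, e.g. "mul"
    dest: str                  # destination register
    sources: Tuple[str, ...]   # source registers


@dataclass
class ReservationStation:
    """Hypothetical reservation station modelled as a plain list of entries."""
    entries: List[SingleOpInstruction] = field(default_factory=list)

    def store(self, instr: SingleOpInstruction) -> None:
        self.entries.append(instr)


def parse_instruction(text: str) -> SingleOpInstruction:
    """Parse a line such as 'mul r3, r1, r2' (assumed syntax)."""
    opcode, _, operands = text.strip().partition(" ")
    regs = [r.strip() for r in operands.split(",") if r.strip()]
    if not opcode or len(regs) < 2:
        raise ValueError(f"cannot parse instruction: {text!r}")
    return SingleOpInstruction(opcode, regs[0], tuple(regs[1:]))


station = ReservationStation()
for line in ["mul r3, r1, r2", "add r5, r3, r4"]:
    station.store(parse_instruction(line))
```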
Further, in an embodiment of the present application, the determining unit 13 is specifically configured to determine, according to the register information, a first destination register and a first source register corresponding to a first single operator instruction, and a second destination register and a second source register corresponding to a second single operator instruction; when the first destination register and the second source register are the same or the second destination register and the first source register are the same, determining that the first single operator instruction and the second single operator instruction are mutually dependent through the reservation station; and when the first destination register is different from the second source register and the second destination register is different from the first source register, determining that the first single operator instruction and the second single operator instruction are independent of each other through the reservation station.
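The register comparison above can be expressed as a short predicate. The sketch below is illustrative only: the `Instr` record is hypothetical, and allowing several source registers per instruction is an assumption that generalizes the single source register named in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical record holding the register information of one instruction."""
    opcode: str
    dest: str
    sources: Tuple[str, ...]


def are_interdependent(first: Instr, second: Instr) -> bool:
    """True when one instruction's destination register feeds the other.

    Mirrors the rule in the text: dependent if first.dest equals a source of
    second, or second.dest equals a source of first; independent otherwise.
    """
    return first.dest in second.sources or second.dest in first.sources


mul = Instr("mul", "r3", ("r1", "r2"))
add = Instr("add", "r5", ("r3", "r4"))   # reads r3, which mul writes
sub = Instr("sub", "r8", ("r6", "r7"))   # shares no registers with mul
```

With these values, `mul` and `add` are interdependent in either argument order, while `mul` and `sub` are independent.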
Further, in an embodiment of the present application, the determining unit 13 is further specifically configured to query the instruction code correspondence table through the instruction merging unit, determine the target multi-operator instruction corresponding to the target hardware unit, and a to-be-merged instruction forming the target multi-operator instruction; and if the single operator instruction stored in the reservation station has the instruction to be merged and the dependency corresponding to the instruction to be merged is interdependent, determining the instruction to be merged as the target single operator instruction.
Further, in an embodiment of the present application, the determining unit 13 is further specifically configured to determine, by the instruction merging unit, the single operator instruction whose dependency relationship is interdependence as the instruction to be merged; the instruction merging unit queries the instruction code corresponding table according to the instruction to be merged to obtain a query result; and if the query result is that the instruction to be merged forms the target multi-operator instruction corresponding to the target hardware unit, determining the instruction to be merged as the target single-operator instruction.
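The two search orders described above (table first, then dependency check; or dependency first, then table query) can be compared in a small sketch. Everything named here is an assumption for illustration: the `Instr` record, the layout of `ENCODING_TABLE` as a mapping from an ordered opcode pair to a multi-operator opcode and hardware unit, and the restriction to pairs of instructions.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str
    dest: str
    sources: Tuple[str, ...]


# Hypothetical correspondence table: (first opcode, second opcode) ->
# (multi-operator opcode, hardware unit that executes it).
ENCODING_TABLE: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("mul", "add"): ("mac", "mac_unit"),
}

Match = Tuple[Instr, Instr, str, str]


def interdependent(a: Instr, b: Instr) -> bool:
    return a.dest in b.sources or b.dest in a.sources


def find_table_first(station: List[Instr]) -> Optional[Match]:
    """Start from each table entry, then look for matching dependent pairs."""
    for (op_a, op_b), (multi_op, unit) in ENCODING_TABLE.items():
        for a, b in permutations(station, 2):
            if (a.opcode, b.opcode) == (op_a, op_b) and interdependent(a, b):
                return a, b, multi_op, unit
    return None


def find_dependency_first(station: List[Instr]) -> Optional[Match]:
    """Start from each dependent pair, then query the table for it."""
    for a, b in permutations(station, 2):
        if interdependent(a, b):
            hit = ENCODING_TABLE.get((a.opcode, b.opcode))
            if hit is not None:
                return a, b, hit[0], hit[1]
    return None


sub = Instr("sub", "r8", ("r6", "r7"))
mul = Instr("mul", "r3", ("r1", "r2"))
add = Instr("add", "r5", ("r3", "r4"))
station = [sub, mul, add]
```

Both functions select the same `mul`/`add` pair for this station; they differ only in which side drives the search, which in hardware would trade table lookups against dependency comparisons.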
Further, in an embodiment of the present application, the release unit 16 is configured to release the target single operator instruction through the reservation station after the instruction merging unit has determined, according to the instruction coding correspondence table and the dependency relationship, the target single operator instruction corresponding to the target hardware unit from the single operator instructions stored in the reservation station.
Further, in an embodiment of the present application, the dispatch unit 15 is further configured to, after determining, by the instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to the instruction coding correspondence table and the dependency relationship, dispatch, by the reservation station, other single operator instructions except the target single operator instruction.
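The release and dispatch behaviour of the reservation station can be modelled roughly as below. This is a sketch under simplifying assumptions: instructions are represented by their text, and every entry leaves the station in a single step, whereas real hardware would wait for operands to become ready.

```python
from typing import List, Set, Tuple


def release_and_dispatch(
    station: List[str], targets: Set[str]
) -> Tuple[List[str], List[str]]:
    """Split station entries into released targets and individually dispatched rest.

    Assumption: the released entries are the ones already merged into a
    multi-operator instruction; all other entries are dispatched unchanged.
    """
    released = [entry for entry in station if entry in targets]
    dispatched = [entry for entry in station if entry not in targets]
    station.clear()  # simplified model: no entry stays behind
    return released, dispatched


station = ["sub r8, r6, r7", "mul r3, r1, r2", "add r5, r3, r4"]
targets = {"mul r3, r1, r2", "add r5, r3, r4"}
released, dispatched = release_and_dispatch(station, targets)
```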
Further, in an embodiment of the present application, FIG. 11 is a second schematic diagram of the composition structure of an out-of-order execution device. As shown in FIG. 11, the out-of-order execution device 10 according to the embodiment of the present application may further include a processor 17 and a memory 18 storing instructions executable by the processor 17. The out-of-order execution device 10 may also include a communication interface 19 and a bus 110 for connecting the processor 17, the memory 18, and the communication interface 19.
Further, the out-of-order execution apparatus 10 may further include an instruction merge unit 111 and a reservation station 112.
In an embodiment of the present application, the processor 17 may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor. It is understood that other electronic devices may be used to implement the above processor functions, and the embodiments of the present application are not specifically limited in this respect. The out-of-order execution device 10 may further include a memory 18 connected to the processor 17, where the memory 18 stores executable program code comprising computer operating instructions. The memory 18 may include a high-speed RAM and may also include a non-volatile memory, such as at least two disk memories.
In the embodiment of the present application, the bus 110 connects the communication interface 19, the processor 17, and the memory 18, and carries the communication among these devices.
In an embodiment of the present application, the memory 18 is used for storing instructions and data.
Further, in an embodiment of the present application, the processor 17 is configured to acquire a single operator instruction by parsing a software program and store the single operator instruction to the reservation station; determine, through the reservation station, the dependency relationship corresponding to the single operator instruction; determine, through the instruction merging unit and according to an instruction coding correspondence table and the dependency relationship, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station; and merge the target single operator instructions into a target multi-operator instruction through the instruction merging unit and dispatch the target multi-operator instruction to the target hardware unit, so as to execute the target multi-operator instruction through the target hardware unit.
In practical applications, the memory 18 may be a volatile memory, such as a Random-Access Memory (RAM); a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); or a combination of the above types of memory. The memory 18 provides instructions and data to the processor 17.
In addition, each functional module in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware or a form of a software functional module.
Based on this understanding, the technical solution of the present embodiment, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of the present embodiment. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
The embodiment of the present application provides an out-of-order execution device that acquires single operator instructions by parsing a software program and stores them to a reservation station; determines, through the reservation station, the dependency relationship corresponding to the single operator instructions; determines, through the instruction merging unit and according to the instruction coding correspondence table and the dependency relationship, the target single operator instructions corresponding to a target hardware unit from the single operator instructions stored in the reservation station; and merges the target single operator instructions into a target multi-operator instruction through the instruction merging unit and dispatches the target multi-operator instruction to the target hardware unit, so that the target hardware unit executes it. In other words, at the instruction dispatch stage the out-of-order execution device can automatically merge interdependent single operator instructions stored in the reservation station through the instruction merging unit, and then dispatch the merged multi-operator instruction to a hardware unit that supports it. This reduces the encoding space needed for multi-operator instructions, which can effectively reduce instruction size, further reduce the size of the storage medium required by the chip, and lower the manufacturing cost of the chip.
An embodiment of the present application provides a computer-readable storage medium, on which a program is stored, and the program, when executed by a processor, implements the instruction merging method as described above.
Specifically, the program instructions corresponding to the instruction merging method in the present embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the program instructions corresponding to the instruction merging method in the storage medium are read or executed by an electronic device, the following steps are performed:
acquiring a single operator instruction by parsing a software program, and storing the single operator instruction to a reservation station;
determining a dependency corresponding to the single operator instruction through the reservation station;
determining a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station through the instruction merging unit according to an instruction coding correspondence table and the dependency relationship;
merging the target single operator instructions into a target multi-operator instruction through the instruction merging unit, and dispatching the target multi-operator instruction to the target hardware unit so as to execute the target multi-operator instruction through the target hardware unit.
The embodiment of the present application provides a chip, where the chip includes a programmable logic circuit and/or a program instruction, and when the chip runs, the method for merging the instructions described above is implemented, and specifically includes the following steps:
acquiring a single operator instruction by parsing a software program, and storing the single operator instruction to a reservation station;
determining a dependency corresponding to the single operator instruction through the reservation station;
determining a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station through the instruction merging unit according to an instruction coding correspondence table and the dependency relationship;
merging the target single operator instructions into a target multi-operator instruction through the instruction merging unit, and dispatching the target multi-operator instruction to the target hardware unit so as to execute the target multi-operator instruction through the target hardware unit.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing device, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application.

Claims (11)

1. An instruction merging method is applied to an out-of-order execution device, wherein the out-of-order execution device configures an instruction merging unit, and the method comprises the following steps:
acquiring a single operator instruction by parsing a software program, and storing the single operator instruction to a reservation station;
determining a dependency corresponding to the single operator instruction through the reservation station;
determining a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station through the instruction merging unit according to an instruction coding correspondence table and the dependency relationship;
merging the target single operator instructions into a target multi-operator instruction through the instruction merging unit, and dispatching the target multi-operator instruction to the target hardware unit so as to execute the target multi-operator instruction through the target hardware unit.
2. The method of claim 1, wherein said acquiring a single operator instruction by parsing a software program and storing said single operator instruction to a reservation station comprises:
analyzing the software program to obtain an operation code and register information corresponding to the single operator instruction;
storing the opcode and the register information to the reservation station.
3. The method of claim 2, wherein said determining, by said reservation station, dependencies corresponding to said single operator instructions comprises:
determining a first destination register and a first source register corresponding to a first single operator instruction and a second destination register and a second source register corresponding to a second single operator instruction according to the register information;
when the first destination register and the second source register are the same or the second destination register and the first source register are the same, determining that the first single operator instruction and the second single operator instruction are mutually dependent through the reservation station;
and when the first destination register is different from the second source register and the second destination register is different from the first source register, determining that the first single operator instruction and the second single operator instruction are independent of each other through the reservation station.
4. The method according to claim 1, wherein the determining, by the instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to an instruction coding correspondence table and the dependency relationship comprises:
inquiring the instruction code corresponding table through the instruction merging unit, and determining the target multi-operator instruction corresponding to the target hardware unit and an instruction to be merged forming the target multi-operator instruction;
and if the single operator instruction stored in the reservation station has the instruction to be merged and the dependency corresponding to the instruction to be merged is interdependent, determining the instruction to be merged as the target single operator instruction.
5. The method according to claim 1, wherein the determining, by the instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to an instruction coding correspondence table and the dependency relationship comprises:
determining the single operator instruction with the dependence relationship being interdependence as an instruction to be merged by the instruction merging unit;
inquiring the instruction code corresponding table according to the instruction to be merged by the instruction merging unit to obtain an inquiry result;
and if the query result is that the instruction to be merged forms the target multi-operator instruction corresponding to the target hardware unit, determining the instruction to be merged as the target single-operator instruction.
6. The method according to claim 4 or 5, wherein after determining, by the instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to an instruction encoding correspondence table and the dependency relationship, the method further comprises:
releasing the target single operator instruction through the reservation station.
7. The method according to claim 4 or 5, wherein after determining, by the instruction merging unit, a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station according to an instruction encoding correspondence table and the dependency relationship, the method further comprises:
and dispatching other single operator instructions except the target single operator instruction through the reservation station.
8. An out-of-order execution device, wherein the out-of-order execution device configures an instruction merge unit, the out-of-order execution device comprising: an acquisition unit, a storage unit, a determination unit, a merging unit, a dispatch unit,
the acquisition unit is used for acquiring the single operator instruction by parsing a software program;
the storage unit is used for storing the single operator instruction to a reservation station;
the determining unit is used for determining the dependency corresponding to the single operator instruction through the reservation station; determining a target single operator instruction corresponding to a target hardware unit from the single operator instructions stored in the reservation station through the instruction merging unit according to an instruction coding correspondence table and the dependency relationship;
the merging unit is used for merging the target single operator instructions into a target multi-operator instruction through the instruction merging unit;
the dispatch unit is configured to dispatch the target multi-operator instruction to the target hardware unit, so as to execute the target multi-operator instruction through the target hardware unit.
9. An out-of-order execution device, comprising an instruction merge unit, a reservation station, a processor, a memory storing instructions executable by the processor, the instructions when executed by the processor implementing the method of any of claims 1 to 7.
10. A chip comprising programmable logic circuits and/or program instructions which, when run, implement the method of any one of claims 1 to 7.
11. A computer-readable storage medium, on which a program is stored, for use in an out-of-order execution apparatus, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.
CN202011089841.6A 2020-10-13 2020-10-13 Instruction merging method, out-of-order execution equipment, chip and storage medium Withdrawn CN112199118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011089841.6A CN112199118A (en) 2020-10-13 2020-10-13 Instruction merging method, out-of-order execution equipment, chip and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011089841.6A CN112199118A (en) 2020-10-13 2020-10-13 Instruction merging method, out-of-order execution equipment, chip and storage medium

Publications (1)

Publication Number Publication Date
CN112199118A true CN112199118A (en) 2021-01-08

Family

ID=74008797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011089841.6A Withdrawn CN112199118A (en) 2020-10-13 2020-10-13 Instruction merging method, out-of-order execution equipment, chip and storage medium

Country Status (1)

Country Link
CN (1) CN112199118A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327643A (en) * 2022-03-11 2022-04-12 上海聪链信息科技有限公司 Machine instruction preprocessing method, electronic device and computer-readable storage medium
CN115509608A (en) * 2022-11-23 2022-12-23 成都登临科技有限公司 Instruction optimization method and device, electronic equipment and computer-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1173931A (en) * 1995-09-01 1998-02-18 菲利浦电子北美公司 Method and appts. for custom operations of a processor
CN101377736A (en) * 2008-04-03 2009-03-04 威盛电子股份有限公司 Disorder performing microcomputer and macro instruction processing method
US20150026442A1 (en) * 2013-07-18 2015-01-22 Nvidia Corporation System, method, and computer program product for managing out-of-order execution of program instructions
CN110297662A (en) * 2019-07-04 2019-10-01 深圳芯英科技有限公司 Instruct method, processor and the electronic equipment of Out-of-order execution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨坤;高德远;黄小平;: "超标量RISC微处理器指令发射算法设计", 微电子学与计算机, no. 09 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327643A (en) * 2022-03-11 2022-04-12 上海聪链信息科技有限公司 Machine instruction preprocessing method, electronic device and computer-readable storage medium
CN114327643B (en) * 2022-03-11 2022-06-21 上海聪链信息科技有限公司 Machine instruction preprocessing method, electronic device and computer-readable storage medium
CN115509608A (en) * 2022-11-23 2022-12-23 成都登临科技有限公司 Instruction optimization method and device, electronic equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210108