CN117453290A - Execution method and device in out-of-order core based on RISCV-V instruction set - Google Patents

Execution method and device in out-of-order core based on RISCV-V instruction set

Info

Publication number
CN117453290A
Authority
CN
China
Prior art keywords
instruction
configuration
speculative
vector
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210847921.6A
Other languages
Chinese (zh)
Inventor
姚慧
欧阳鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Simm Computing Technology Co ltd
Original Assignee
Beijing Simm Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Simm Computing Technology Co ltd filed Critical Beijing Simm Computing Technology Co ltd
Priority to CN202210847921.6A priority Critical patent/CN117453290A/en
Publication of CN117453290A publication Critical patent/CN117453290A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files

Abstract

The embodiment of the invention discloses an execution method and device in an out-of-order core based on a RISCV-V instruction set. The method comprises: executing a configuration instruction that configures the parameters required for executing vector instructions, to obtain execution configuration parameters, wherein the configuration instruction carries a tag index; acquiring speculative configuration parameters from the file entry corresponding to the configuration instruction according to the tag index, wherein the vector instructions corresponding to the configuration instruction are executed according to the speculative configuration parameters; and determining that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters. With this method, a vector instruction does not need to wait for the execution configuration parameters produced by executing the configuration instruction and uses the speculative configuration parameters instead, so that the time window of the pipeline occupied while the configuration instruction executes is reduced and pipeline efficiency is improved.

Description

Execution method and device in out-of-order core based on RISCV-V instruction set
Technical Field
The invention relates to the technical field of computers, in particular to an execution method and device in an out-of-order core based on a RISCV-V instruction set.
Background
RISC-V (Reduced Instruction Set Computer-V), the fifth-generation reduced instruction set computer architecture, combines the advantages of the x86 and ARM instruction sets. Its instructions are simple and few in number, its code is compact, and its power consumption is low, so RISC-V has a wide range of applications.
RISC-V comprises a base instruction set and extended instruction sets. The base instruction sets include RV32I, RV32E, RV64I and RV128I, and the extensions include M, A, F, D, C and V. The V extension is the vector extension instruction set (RISCV-Vector, RISCV-V), which is implemented by a vector processor. The vector extension instruction set comprises vconfig configuration instructions and vector instructions. A vconfig configuration instruction configures the two parameters, vtype and vl, required for executing vector instructions; the execution of each vector instruction must depend on the configuration result of its corresponding vconfig configuration instruction, and the vtype and vl configured by one vconfig configuration instruction can be used by a plurality of vector instructions corresponding to it.
In the prior art, when a vconfig configuration instruction is executed, it is marked as a unique instruction in the decode stage. The vconfig configuration instruction can be sent to the INT issue queue only once the reorder buffer (ROB) has been emptied, and the instructions following it cannot be issued until the vconfig configuration instruction has been committed from the ROB. As a result, the vconfig configuration instruction occupies a large time window of the pipeline when it executes, blocks the flow of subsequent instructions, and lowers pipeline efficiency.
In summary, how to reduce the occupation of the time window of the pipeline when the vconfig configuration instruction is executed and improve the efficiency of the pipeline is a problem to be solved at present.
Disclosure of Invention
In view of this, the embodiment of the invention provides a method and a device for executing an RISCV-V instruction set in an out-of-order core, which can reduce occupation of a time window of a pipeline when a vconfig configuration instruction is executed and improve efficiency of the pipeline.
In a first aspect, an embodiment of the present invention provides a method for executing an RISCV-V instruction set in an out-of-order core, the method including:
executing a configuration instruction for configuring parameters required by vector instruction execution, and acquiring execution configuration parameters, wherein the configuration instruction carries a tag index;
acquiring a speculative configuration parameter from a file entry corresponding to the configuration instruction according to the tag index, wherein a vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameter;
and determining that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters.
Optionally, the method further comprises:
acquiring the configuration instruction;
determining a speculative configuration parameter according to the configuration instruction and the speculative strategy;
and writing the speculative configuration parameters into the file entry corresponding to the configuration instruction according to the tag index, so that the vector instructions corresponding to the configuration instruction are executed according to the speculative configuration parameters.
Optionally, the method further comprises:
determining that the configuration instruction speculation fails in response to the speculation configuration parameters being different from the execution configuration parameters;
and re-acquiring the vector instruction corresponding to the configuration instruction, so that the vector instruction corresponding to the configuration instruction is executed according to the execution configuration parameter.
Optionally, the re-acquiring the vector instruction corresponding to the configuration instruction specifically includes:
and re-acquiring a vector instruction corresponding to the configuration instruction according to the tag index of the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction.
Optionally, the method further comprises:
and flushing from the pipeline the vector instruction corresponding to the configuration instruction and any instruction whose tag index value is larger than the tag index value of the configuration instruction, and re-acquiring the flushed instructions.
Optionally, the method further comprises:
acquiring a vector instruction corresponding to the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction;
and acquiring the speculative configuration parameters from the file entry corresponding to the configuration instruction according to the tag index, so that the vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameters.
Optionally, before obtaining the speculative configuration parameter from the file entry corresponding to the configuration instruction according to the tag index, the method further includes:
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being valid, determining the vector instruction corresponding to the configuration instruction as an instruction to be executed; or,
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being invalid, blocking the vector instruction corresponding to the configuration instruction in a vector issue slot and waiting for the configuration state to be updated.
Optionally, the determining the speculative configuration parameter according to the configuration instruction and the speculative policy specifically includes:
and determining the speculative configuration parameters according to the type of the configuration instruction and a speculative policy, wherein the configuration instruction comprises a vsetvl instruction, a vsetvli instruction and a vsetivli instruction, and the speculative configuration parameters comprise a vtype value and a vl value.
Optionally, the determining the speculative configuration parameter according to the configuration instruction and the speculative policy specifically includes:
and when the configuration instruction is a vsetivli instruction, acquiring the speculative configuration parameters from a field of the vsetivli instruction.
Optionally, the determining the speculative configuration parameter according to the configuration instruction and the speculative policy specifically includes:
and when the configuration instruction is a vsetvli instruction, acquiring a vtype value in the speculative configuration parameter in a field of the vsetvli instruction, and acquiring the vl value in the speculative configuration parameter according to a preset speculative strategy.
Optionally, the determining the speculative configuration parameter according to the configuration instruction and the speculative policy specifically includes:
and when the configuration instruction is a vsetvl instruction, obtaining a vtype value in the speculative configuration parameter and a vl value in the speculative configuration parameter according to a preset speculative strategy.
In a second aspect, an embodiment of the present invention provides an apparatus for execution in an out-of-order core based on a RISCV-V instruction set, the apparatus including:
the execution unit is used for executing a configuration instruction for configuring parameters required by the execution of the vector instruction to obtain execution configuration parameters, wherein the configuration instruction carries a tag index;
the acquisition unit is used for acquiring the speculative configuration parameters from the file entries corresponding to the configuration instructions according to the tag indexes, wherein the vector instructions corresponding to the configuration instructions are executed according to the speculative configuration parameters;
and the determining unit is used for determining that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters.
Optionally, the acquiring unit is further configured to acquire the configuration instruction;
the determining unit is further used for determining a speculative configuration parameter according to the configuration instruction and a speculative policy;
the apparatus further comprises: and the writing unit is used for writing the speculative configuration parameters into file entries corresponding to the configuration instructions according to the tag indexes, so that vector instructions corresponding to the configuration instructions are executed according to the speculative configuration parameters.
Optionally, the determining unit is further configured to determine that the configuration instruction speculation fails in response to the speculation configuration parameter being different from the execution configuration parameter;
the obtaining unit is further configured to re-obtain a vector instruction corresponding to the configuration instruction, so that the vector instruction corresponding to the configuration instruction is executed according to the execution configuration parameter.
Optionally, the acquiring unit is specifically configured to:
and re-acquiring a vector instruction corresponding to the configuration instruction according to the tag index of the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction.
Optionally, the apparatus further comprises:
and the processing unit is used for flushing from the pipeline the vector instruction corresponding to the configuration instruction and any instruction whose tag index value is larger than the tag index value of the configuration instruction, and for re-acquiring the flushed instructions.
Optionally, the obtaining unit is further configured to obtain a vector instruction corresponding to the configuration instruction, where the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction;
the acquisition unit is further configured to acquire a speculative configuration parameter from a file entry corresponding to the configuration instruction according to the tag index, so that a vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameter.
Optionally, before acquiring the speculative configuration parameter from the file entry corresponding to the configuration instruction according to the tag index, the processing unit is further configured to:
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being valid, determining the vector instruction corresponding to the configuration instruction as an instruction to be executed; or,
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being invalid, blocking the vector instruction corresponding to the configuration instruction in a vector issue slot and waiting for the configuration state to be updated.
Optionally, the determining unit is specifically configured to:
and determining the speculative configuration parameters according to the type of the configuration instruction and a speculative policy, wherein the configuration instruction comprises a vsetvl instruction, a vsetvli instruction and a vsetivli instruction, and the speculative configuration parameters comprise a vtype value and a vl value.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetivli instruction, acquiring the speculative configuration parameter in a field of the vsetivli instruction.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetvli instruction, acquiring a vtype value in the speculative configuration parameter in a field of the vsetvli instruction, and acquiring the vl value in the speculative configuration parameter according to a preset speculative strategy.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetvl instruction, obtaining a vtype value in the speculative configuration parameter and a vl value in the speculative configuration parameter according to a preset speculative strategy.
In a third aspect, embodiments of the present invention provide computer program instructions which, when executed by a processor, implement a method as in the first aspect or any one of the possibilities of the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of the first aspect or any one of the possibilities of the first aspect.
In a fifth aspect, an embodiment of the present invention provides a chip comprising a memory and a processing core, the memory being configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processing core to implement the method of the first aspect or any one of the possibilities of the first aspect.
In a sixth aspect, an embodiment of the present invention provides a board, where the board includes the chip of the fifth aspect.
In a seventh aspect, an embodiment of the present invention provides a server, where the server includes the board card of the sixth aspect.
In the embodiment of the invention, a configuration instruction that configures the parameters required for executing vector instructions is executed to obtain execution configuration parameters, wherein the configuration instruction carries a tag index; speculative configuration parameters are acquired from the file entry corresponding to the configuration instruction according to the tag index; and the speculation for the configuration instruction is determined to be correct in response to the speculative configuration parameters being the same as the execution configuration parameters. With this method, a vector instruction does not need to wait for the execution configuration parameters produced by executing the configuration instruction and uses the speculative configuration parameters instead, so that the time window of the pipeline occupied while the configuration instruction executes is reduced and pipeline efficiency is improved.
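The optional steps of writing speculative parameters into a file entry and gating vector instructions on the entry's configuration state can be sketched as follows. This is only an illustrative Python model, not the hardware described by the embodiment: the names `ConfigEntry`, `SpecConfigFile`, `write_speculative` and `read_for_vector` are hypothetical, and the number of entries is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ConfigEntry:
    """One file entry, selected by the tag index (hypothetical layout)."""
    vtype: int = 0
    vl: int = 0
    valid: bool = False  # configuration state of the speculative parameters


class SpecConfigFile:
    """Toy model of the file that holds speculative configuration parameters."""

    def __init__(self, num_entries: int) -> None:
        self.entries: List[ConfigEntry] = [ConfigEntry() for _ in range(num_entries)]

    def write_speculative(self, tag: int, vtype: int, vl: int) -> None:
        # Written when the configuration instruction's speculation is determined.
        self.entries[tag] = ConfigEntry(vtype=vtype, vl=vl, valid=True)

    def read_for_vector(self, tag: int) -> Optional[Tuple[int, int]]:
        # A vector instruction carries the same tag as its configuration
        # instruction. None means "stay blocked in the vector issue slot".
        entry = self.entries[tag]
        if not entry.valid:
            return None
        return (entry.vtype, entry.vl)
```

In this sketch a vector instruction that receives `None` simply retries later, which stands in for waiting on the configuration state to be updated.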
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a schematic illustration of a prior art pipeline;
FIG. 2 is a flow chart of a method for executing configuration instructions in an out-of-order core based on a RISCV-V instruction set in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a configuration instruction format according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for executing configuration instructions in an out-of-order core based on a RISCV-V instruction set according to an embodiment of the present invention;
FIG. 5 is a flow chart of a method for executing configuration instructions in an out-of-order core based on a RISCV-V instruction set in an embodiment of the present invention;
FIG. 6 is a flow chart of a method for executing configuration instructions in an out-of-order core based on a RISCV-V instruction set in an embodiment of the present invention;
FIG. 7 is a schematic illustration of a pipeline in accordance with an embodiment of the present invention;
FIG. 8 is a schematic diagram of an execution device of a configuration instruction in an out-of-order core based on a RISCV-V instruction set according to an embodiment of the present invention.
Detailed Description
The present disclosure is described below based on examples, but the present disclosure is not limited to only these examples. In the following detailed description of the present disclosure, certain specific details are set forth in detail. The present disclosure may be fully understood by those skilled in the art without a review of these details. Well-known methods, procedures, flows, components and circuits have not been described in detail so as not to obscure the nature of the disclosure.
Moreover, those of ordinary skill in the art will appreciate that the drawings are provided herein for illustrative purposes and that the drawings are not necessarily drawn to scale.
Unless the context clearly requires otherwise, the words "comprise," "comprising," and the like throughout the application are to be construed as including but not being exclusive or exhaustive; that is, it is the meaning of "including but not limited to".
In the description of the present disclosure, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present disclosure, unless otherwise indicated, the meaning of "a plurality" is two or more.
In the prior art, RISC-V (Reduced Instruction Set Computer-V) comprises a base instruction set and extended instruction sets, wherein the base instruction sets include RV32I, RV32E, RV64I and RV128I, and the extensions include M, A, F, D, C and V. Taking the V extension as an example, the V extension is the vector extension instruction set (RISCV-Vector, RISCV-V), which is implemented by a vector processor. The vector extension instruction set includes configuration instructions (i.e., vconfig configuration instructions) and vector instructions. A vconfig configuration instruction configures the two parameters, vtype and vl, required when a vector instruction is executed, and each vector instruction must rely on the configuration result of its corresponding vconfig configuration instruction. Because instructions are executed in program order, each vector instruction depends on the closest preceding vconfig configuration instruction: after that vconfig configuration instruction is executed, its execution results vtype and vl are written into a control and status register (CSR), and when the vector instruction is executed it reads the register directly to obtain the correct configuration parameters vtype and vl. Moreover, each vconfig configuration instruction may correspond to a plurality of vector instructions and scalar instructions, which may be referred to as the v_coupled instructions of that vconfig configuration instruction. The vector instructions depend on the execution results vtype and vl obtained when the vconfig configuration instruction completes, whereas the scalar instructions do not; therefore, to improve the pipeline efficiency of instruction execution, the execution of vector instructions is the main consideration.
The v_coupled vector instructions that follow a vconfig configuration instruction rely on the execution results vtype and vl obtained when the vconfig configuration instruction completes, and the decode stage (Decode Stage), rename stage (Rename Stage), dispatch stage (Dispatch Stage), issue stage (Issue Stage), execute stage (Execute Stage), write-back stage (Write Back Stage), commit stage (Commit Stage) and other stages of the pipeline all need the parameters vtype and vl. The v_coupled vector instructions therefore have to wait until the vconfig configuration instruction has finished executing and its execution results have been stored in the CSR before they can proceed. In the prior art, when a vconfig configuration instruction is executed, it is marked as a unique instruction in the decode stage; the marked vconfig configuration instruction is sent to the INT issue queue only once the reorder buffer (ROB) has been emptied, and during this time the instructions following it cannot be sent to the INT issue queue. Only after the vconfig configuration instruction has been committed from the ROB can the instructions following it continue to be sent to the INT issue queue. A single vconfig configuration instruction thus occupies a large time window in the pipeline when executed, blocking the flow of the subsequent v_coupled vector instructions, and pipeline efficiency is low.
Specifically, as shown in FIG. 1, the pipeline (Pipeline) marks a vsetvl instruction, which is one of the vconfig configuration instructions, as a unique instruction in the decode stage. In the decode stage (Decode Stage) and rename stage (Rename Stage), it is determined whether an instruction carries the unique-instruction identifier. If the instruction does not carry the identifier, it is a scalar instruction or a vector instruction and enters the dispatch stage (Dispatch Stage) directly, then passes through the issue stage (Issue Stage) and execute stage (Execute Stage) to the commit stage (Commit Stage); if, in the commit stage, the instruction can be committed at the head of the ROB, the instruction is sent to the INT issue queue and the pipeline is released. If the instruction carries the unique-instruction identifier, it is further determined whether the ROB is empty: if so, the instruction enters the dispatch stage; if not, the pipeline is blocked and the instruction returns to the decode and rename stages until a message that the ROB is empty is received, whereupon it is determined again whether the instruction carries the unique-instruction identifier.
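The prior-art gating just described can be summarised in a few lines of Python. This is a simplified sketch, not a model of any particular core: `Instr`, `may_dispatch` and the `unique_in_flight` flag are hypothetical names, and the rule that younger instructions wait while a unique instruction is uncommitted is taken from the description of the prior art above.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    unique: bool  # set in the decode stage for vconfig instructions (prior art)


def may_dispatch(instr: Instr, rob_empty: bool, unique_in_flight: bool) -> bool:
    """Return True if the instruction may leave decode/rename for dispatch."""
    if unique_in_flight:
        # An older unique instruction has not committed yet: everything waits.
        return False
    if instr.unique:
        # A unique instruction itself may proceed only once the ROB is empty.
        return rob_empty
    # Ordinary scalar or vector instructions proceed directly.
    return True
```

The sketch makes the cost visible: every vconfig instruction first drains the ROB and then holds back all younger instructions until it commits.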
In summary, how to reduce the occupation of the time window of the pipeline when the vconfig configuration instruction is executed and improve the efficiency of the pipeline is a problem to be solved at present.
In the embodiment of the invention, to solve the problems that a vconfig configuration instruction occupies a large time window of the pipeline when executed and that pipeline efficiency is low, a method for executing the vconfig configuration instruction in an out-of-order core based on a RISCV-V instruction set is provided, comprising: executing a configuration instruction that configures the parameters required for executing vector instructions, to obtain execution configuration parameters, wherein the configuration instruction carries a tag index; acquiring speculative configuration parameters from the file entry corresponding to the configuration instruction according to the tag index, wherein the vector instructions corresponding to the configuration instruction are executed according to the speculative configuration parameters; and determining that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters. With this method, a vector instruction does not need to wait for the execution configuration parameters produced by executing the configuration instruction and uses the speculative configuration parameters instead, so that the time window of the pipeline occupied while the configuration instruction executes is reduced and pipeline efficiency is improved.
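The comparison step, together with the flush that the Disclosure describes for a failed speculation, can be sketched as below. The sketch is illustrative only: `InFlight`, `verify_speculation` and `select_flush` are hypothetical names, parameters are modelled as a `(vtype, vl)` tuple, and tag indexes are assumed to grow monotonically (wrap-around is ignored).

```python
from typing import List, NamedTuple, Tuple


class InFlight(NamedTuple):
    name: str
    tag: int         # tag index carried by the instruction
    is_config: bool  # True for a vconfig configuration instruction


def verify_speculation(speculative: Tuple[int, int], executed: Tuple[int, int]) -> bool:
    """Speculation is correct when (vtype, vl) match the executed values."""
    return speculative == executed


def select_flush(in_flight: List[InFlight], cfg_tag: int) -> List[InFlight]:
    """Pick the instructions to flush and re-acquire after a failed speculation.

    Flushes the instructions that share the configuration instruction's tag
    (other than the configuration instruction itself) and every instruction
    whose tag index is larger.
    """
    return [
        i for i in in_flight
        if (i.tag == cfg_tag and not i.is_config) or i.tag > cfg_tag
    ]
```

When `verify_speculation` returns True nothing is flushed; the vector instructions that already ran with the speculative parameters simply continue.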
FIG. 2 is a flowchart of a method for executing a vconfig configuration instruction in an out-of-order core based on a RISCV-V instruction set according to an embodiment of the present invention, specifically including:
Step S200, obtaining a vconfig configuration instruction.
Specifically, in the Fetch Stage (Fetch Stage) in the pipeline, a vconfig configuration instruction is fetched.
Step S201, determining speculative configuration parameters according to the vconfig configuration instruction and the speculative policy.
The speculative policy refers to the manner in which the speculative configuration parameters are obtained. For example, if the speculative configuration parameters include a vtype value and a vl value, the speculative policy may be a manner of obtaining the vtype value, a manner of obtaining the vl value, or a combination of the two.
Specifically, the configuration instruction carries a tag index.
Specifically, the speculative configuration parameters are determined according to the type of the vconfig configuration instruction. In the embodiment of the present invention, the vconfig configuration instructions include the vsetvl instruction, the vsetvli instruction and the vsetivli instruction, so there are three types of vconfig configuration instruction; the speculative policy is described in detail later.
In the embodiment of the present invention, the instruction formats of the vsetvl instruction, the vsetvli instruction and the vsetivli instruction all differ; as shown in FIG. 3, the instruction formats are as follows:
1. vsetivli instruction
The instruction format of the vsetivli instruction includes an immediate field (i.e., imm field) consisting of a zimm field and a uimm field. Illustratively, as shown in fig. 3, bits 0-6, bits 12-14, bit 30, and bit 31 store the instruction encoding, which may be the official encoded value for each type of instruction; bits 7-11 store the destination register (rd); bits 15-19 are the uimm field, which stores the value of the vl parameter, vl being the vector length register; bits 20-29 are the zimm field, which stores the value of the vtype parameter, vtype being the vector data type register used to interpret the default type of the vector register file contents.
Therefore, when the vconfig configuration instruction is a vsetivli instruction, the speculative configuration parameters, which include a vtype parameter and a vl parameter, can be obtained from the imm field in the decoding stage. Specifically, the vtype parameter is obtained from the zimm field and the vl parameter is obtained from the uimm field. Because the values of the vtype parameter and the vl parameter are read directly from the zimm field and the uimm field, the vsetivli instruction may also be referred to as an immediate-format vconfig configuration instruction.
In the embodiment of the present invention, the speculative configuration parameters acquired from the imm field are the same as the execution configuration parameters acquired by the vconfig configuration instruction in the execution stage; that is, the speculative configuration parameters used by the v_coupled vector instruction corresponding to the vconfig configuration instruction are correctly speculated parameters.
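The field extraction described above for the vsetivli instruction can be illustrated with a minimal sketch. The bit positions follow the description of fig. 3; the function names decode_vsetivli and speculative_params_vsetivli are hypothetical and serve only for illustration.

```python
def decode_vsetivli(inst: int) -> dict:
    """Split a 32-bit vsetivli encoding into the fields named in the text."""
    return {
        "rd": (inst >> 7) & 0x1F,      # bits 7-11: destination register
        "uimm": (inst >> 15) & 0x1F,   # bits 15-19: immediate holding the vl value
        "zimm": (inst >> 20) & 0x3FF,  # bits 20-29: immediate holding the vtype value
    }


def speculative_params_vsetivli(inst: int) -> tuple:
    """Return (vtype, vl) as read in the decoding stage; no prediction is needed."""
    fields = decode_vsetivli(inst)
    return fields["zimm"], fields["uimm"]
```

Because both values are immediates, the pair returned in the decoding stage already equals what the instruction later produces in the execution stage.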
2. vsetvli instruction
The instruction format of the vsetvli instruction includes a zimm field. Illustratively, as shown in fig. 3, bits 0-6, bits 12-14, and bit 31 store the instruction encoding; bits 7-11 store the destination register (rd); bits 15-19 store the source register rs1, in which the value of the vl parameter is stored; bits 20-30 are the zimm field, which stores the value of the vtype parameter.
When the vconfig configuration instruction is a vsetvli instruction, the value of the vtype parameter in the speculative configuration parameters is obtained from the zimm field in the decoding stage. The vl parameter in the speculative configuration parameters is stored in the source register rs1, so its accurate value is available only after the vsetvli instruction has finished executing in the execution stage. In the decoding stage, the value of the vl parameter can therefore only be obtained according to a preset speculative policy; the specific speculative policy is described later.
In the embodiment of the present invention, the value of the vtype parameter obtained from the zimm field is the same as the value of the vtype parameter obtained by the vconfig configuration instruction in the execution stage; that is, the value of the vtype parameter is a correctly speculated parameter. The value of the vl parameter, however, is obtained according to a preset speculative policy and may or may not be the same as the value of the vl parameter obtained in the execution stage. If the two values are the same, the vl parameter obtained according to the speculative policy is a correctly speculated parameter; if they differ, the vl parameter obtained according to the speculative policy is a mis-speculated parameter.
3. vsetvl instruction
For the instruction format of the vsetvl instruction, illustratively, as shown in fig. 3, bits 0-6, bits 12-14, bits 25-30, and bit 31 store the instruction encoding; bits 7-11 store the destination register (rd); bits 15-19 store the source register rs1, in which the value of the vl parameter is stored; bits 20-24 store the source register rs2, in which the value of the vtype parameter is stored.
When the vconfig configuration instruction is a vsetvl instruction, the value of the vtype parameter in the speculative configuration parameters is stored in the source register rs2 and the vl parameter in the speculative configuration parameters is stored in the source register rs1. The accurate values of the vtype parameter and the vl parameter are available only after the vsetvl instruction has finished executing in the execution stage. In the decoding stage, the value of the vtype parameter and the value of the vl parameter can therefore only be obtained according to a preset speculative policy; the specific speculative policy (i.e., the manner of obtaining the value of the vtype parameter and the value of the vl parameter) is described later.
In the embodiment of the present invention, the values of the vtype parameter and the vl parameter obtained according to the speculative policy (i.e., the speculative configuration parameters) may or may not be the same as the values of the vtype parameter and the vl parameter obtained in the execution stage (i.e., the execution configuration parameters). If the speculative value of the vtype parameter is the same as the value obtained in the execution stage, the vtype parameter is a correctly speculated parameter; if the values differ, the vtype parameter is a mis-speculated parameter. Likewise, if the speculative value of the vl parameter is the same as the value obtained in the execution stage, the vl parameter is a correctly speculated parameter; if the values differ, the vl parameter is a mis-speculated parameter.
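The three cases above can be summarized in a short sketch that identifies the type of a vconfig configuration instruction from its encoding bits and reports which parameters must come from a speculative policy. The bit positions follow the description of fig. 3, and the opcode and funct3 constants follow the published RISC-V vector encoding; the names classify_vconfig and NEEDS_SPECULATION are hypothetical.

```python
OPCODE_OP_V = 0x57   # bits 0-6, shared by the three configuration instructions
FUNCT3_CFG = 0b111   # bits 12-14

# Parameters that cannot be read from an immediate in the decoding stage.
NEEDS_SPECULATION = {
    "vsetivli": (),
    "vsetvli": ("vl",),
    "vsetvl": ("vtype", "vl"),
}


def classify_vconfig(inst: int):
    """Return the vconfig instruction type, or None for any other instruction."""
    if (inst & 0x7F) != OPCODE_OP_V or ((inst >> 12) & 0x7) != FUNCT3_CFG:
        return None
    if ((inst >> 31) & 0x1) == 0:
        return "vsetvli"                      # bit 31 = 0, zimm in bits 20-30
    if ((inst >> 30) & 0x3) == 0b11:
        return "vsetivli"                     # bits 31-30 = 11
    if ((inst >> 25) & 0x7F) == 0b1000000:
        return "vsetvl"                       # bit 31 = 1, bits 25-30 = 0
    return None
```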
Step S202, writing the speculative configuration parameters into file entries corresponding to the vconfig configuration instructions according to the tag indexes, so that vector instructions corresponding to the configuration instructions are executed according to the speculative configuration parameters.
Specifically, in the decoding stage, a corresponding file entry (vconfig table) is set for each vconfig configuration instruction. The file entry includes the speculative configuration parameters and the configuration state of the speculative configuration parameters, and different items are set for different speculative policies. If, under the speculative policy, the vsetvli instruction and the vsetivli instruction are speculated but the vsetvl instruction is not, only the state of the vl parameter needs to be set; if the vsetvl instruction is also included in the speculative policy, both the state of the vtype parameter and the state of the vl parameter need to be set. The following embodiments describe the case in which the vsetvl instruction is not speculated, that is, the file entry includes the vtype parameter, the vl parameter, and the configuration state of the vl parameter.
For example, for the vsetvli instruction, the value of the vtype parameter is obtained from the zimm field and written into the corresponding file entry, and the value of the vl parameter in the file entry is set to VLMAX (i.e., the maximum value of vl). As long as the value of the vtype parameter is correct and the actual value of the vl parameter is smaller than VLMAX, the speculation is correct. Because the vl parameter is speculated, however, the configuration state of the vl parameter is set to invalid when the speculative configuration parameters are written into the file entry, and is updated to valid once the speculation is determined to be correct. While the configuration state of the vl parameter is invalid, a subsequent v_coupled vector instruction that depends on the vsetvli instruction is marked unready (not ready) and blocked in the issue slot; once the configuration state of the vl parameter is valid, a subsequent v_coupled vector instruction that depends on the vsetvli instruction is marked ready and issued normally. For the vsetivli instruction, the value of the vtype parameter is obtained from the zimm field and the value of the vl parameter is obtained from the uimm field, and both are stored in the file entry.
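Writing a file entry as described above might look like the following sketch. The VconfigEntry layout, the write_entry function, and the choice to pass VLMAX in as an argument are assumptions for illustration; marking the vsetivli entry valid immediately is also an assumption, based on its parameters being exact in the decoding stage.

```python
from dataclasses import dataclass


@dataclass
class VconfigEntry:
    vtype: int
    vl: int
    vl_valid: bool  # configuration state of the vl parameter


def write_entry(table: dict, tag: int, kind: str, vtype: int,
                vlmax: int, uimm: int = 0) -> VconfigEntry:
    """Write speculative parameters into the entry selected by the tag index."""
    if kind == "vsetivli":
        # Both values are immediates, so the entry is exact (assumed valid at once).
        table[tag] = VconfigEntry(vtype=vtype, vl=uimm, vl_valid=True)
    elif kind == "vsetvli":
        # vl is speculated as VLMAX and stays invalid until confirmed.
        table[tag] = VconfigEntry(vtype=vtype, vl=vlmax, vl_valid=False)
    else:
        raise ValueError("vsetvl is not speculated in this embodiment")
    return table[tag]
```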
In the embodiment of the present invention, the vconfig configuration instruction and the corresponding v_coupled instruction carry the same tag index.
Step S203, executing a vconfig configuration instruction for configuring parameters required by the execution of the vector instruction, and obtaining an execution configuration parameter, where the configuration instruction carries a tag index.
Specifically, in the execution phase, the execution configuration parameters are acquired, that is, the new value of the vtype parameter and the new value of the vl parameter are acquired.
Step S204, obtaining the speculative configuration parameters from the file entries corresponding to the configuration instructions according to the tag indexes.
The vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameter.
Step S205, determining that the vconfig configuration instruction is correctly speculated in response to the speculative configuration parameters being the same as the execution configuration parameters.
Specifically, when the value of the vtype parameter in the speculative configuration parameters is the same as the new value of the vtype parameter in the execution configuration parameters, and the value of the vl parameter in the speculative configuration parameters is the same as the new value of the vl parameter in the execution configuration parameters, the speculation is correct.
In one possible implementation, the value of the vl parameter in the file entry is set to VLMAX, and as long as the value of the vtype parameter is correct, a new vl parameter having a value less than VLMAX is also considered correct for the speculation, where the value of the vl parameter is not required to be the same as the value of the new vl parameter.
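The comparison in step S205, together with the VLMAX variant just described, can be sketched as follows. The function name and the vl_is_vlmax flag are hypothetical; the flag indicates that the vl value in the file entry was set to VLMAX rather than predicted exactly.

```python
def speculation_correct(spec_vtype: int, spec_vl: int,
                        exec_vtype: int, exec_vl: int,
                        vl_is_vlmax: bool = False) -> bool:
    """Compare speculative and execution configuration parameters."""
    if spec_vtype != exec_vtype:
        return False
    if vl_is_vlmax:
        # The entry holds VLMAX: any new vl not exceeding it counts as correct.
        return exec_vl <= spec_vl
    return spec_vl == exec_vl
```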
In one possible implementation, the v_coupled vector instruction depends on the vconfig configuration instruction. Because the v_coupled vector instruction carries the same tag index as the vconfig configuration instruction, the speculative configuration parameters in the file entry can be looked up by the tag index, and the v_coupled vector instruction proceeds through the pipeline up to the issue stage using the speculative configuration parameters. It does not need to wait for the vconfig configuration instruction to finish executing before entering the pipeline; it only needs to read the configuration state of the speculative configuration parameters in the issue stage. In response to the configuration state being valid, i.e., vl_valid=1, the speculative configuration parameters are correctly speculated and the v_coupled vector instruction may be issued directly; or, in response to the configuration state being invalid, i.e., vl_valid=0, the v_coupled vector instruction is blocked in the vector issue slot and waits for the configuration state to be updated to valid. The foregoing describes the case in which the speculation is correct. The encoding of vl_valid is not limited by the present invention: vl_valid=0 may instead represent a valid configuration state and vl_valid=1 an invalid one, as long as each state corresponds to its own processing procedure.
In one possible implementation, each vconfig configuration instruction and its corresponding v_coupled instructions are assigned the same tag Vtag, and the file entry corresponding to the vconfig configuration instruction can be found through the tag Vtag.
In the embodiment of the present invention, after step S204, the method further includes step S206, as shown in fig. 4, specifically as follows:
Step S206, determining that the vconfig configuration instruction is mis-speculated in response to the speculative configuration parameters being different from the execution configuration parameters.
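The issue-stage check described above, in which a v_coupled vector instruction uses its Vtag to find the file entry and then either issues or is blocked, can be sketched as follows. The dictionary layout and the function name issue_decision are assumptions, and vl_valid=1 is taken to mean valid, as in the example encoding above.

```python
def issue_decision(table: dict, vtag: int) -> tuple:
    """Decide in the issue stage whether a v_coupled instruction may issue."""
    entry = table[vtag]              # same tag as its vconfig instruction
    if entry["vl_valid"] == 1:
        # Configuration state valid: issue with the parameters in the entry.
        return "issue", entry["vtype"], entry["vl"]
    # Configuration state invalid: stay blocked in the vector issue slot.
    return "blocked", None, None
```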
Specifically, taking the vsetvli instruction as an example, the value of the vl parameter is generated according to the speculative policy in the decoding stage. When this value differs from the value of the vl parameter generated by executing the vconfig configuration instruction, the vconfig configuration instruction is mis-speculated. The v_coupled vector instructions that follow the vconfig configuration instruction have used the speculated value of the vl parameter since the decoding stage; if that value were still used in the execution stage, an error would occur. Therefore, the following step S207 is executed.
Step S207, re-acquiring a v_coupled vector instruction subsequent to the vconfig configuration instruction, so that the vector instruction corresponding to the configuration instruction is executed according to the execution configuration parameter.
Specifically, taking the vsetvli instruction as an example, because the v_coupled vector instruction needs the correct value of the vl parameter, the v_coupled vector instructions that follow the vconfig configuration instruction are flushed and fetched again. After being fetched again, the v_coupled vector instruction obtains the execution configuration parameters from the control and status register (CSR); the execution configuration parameters generated by executing the vconfig configuration instruction are written to the CSR, and after this update is completed, the configuration state is updated to valid.
In one possible implementation, after the vconfig configuration instruction is acquired, that is, after step S200, the method further includes step S208, as shown in fig. 5, specifically as follows:
Step S208, continuing to acquire at least one v_coupled vector instruction corresponding to the vconfig configuration instruction.
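The recovery sequence of step S207 can be sketched as follows: the execution configuration parameters are written to the CSR, the configuration state is then marked valid, and the v_coupled vector instructions of the failed group are selected for refetch. The dictionary-based CSR and table, the in_flight list, and the function name are all assumptions for illustration; flushing of younger groups is not modelled here.

```python
def recover_from_misspeculation(csr: dict, table: dict, fail_vtag: int,
                                exec_vtype: int, exec_vl: int,
                                in_flight: list) -> list:
    """Apply step S207 and return the instructions that must be fetched again."""
    # The executed vconfig instruction writes its results to the CSR.
    csr["vtype"] = exec_vtype
    csr["vl"] = exec_vl
    # Only after the CSR update is the configuration state marked valid.
    table[fail_vtag]["vl_valid"] = 1
    # v_coupled instructions of this group are flushed and fetched again;
    # on re-entry they read vtype and vl from the CSR.
    return [inst for inst in in_flight if inst["vtag"] == fail_vtag]
```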
Specifically, the execution of the vconfig configuration instruction is not required to be completed, and only the speculative configuration parameters generated by the vconfig configuration instruction through the speculative policy in the decoding stage are required to be obtained.
In one possible implementation, after step S205, the method further includes step S209, as shown in fig. 6, specifically as follows:
Step S209, the vconfig configuration instruction and the subsequent v_coupled vector instructions exit the processing pipeline.
In the embodiment of the present invention, the renaming stage follows the decoding stage. Specifically, after a vsetvl instruction, a vsetvli instruction, or a vsetivli instruction is identified in the renaming stage, it is renamed like an ordinary instruction, that is, physical registers are allocated to it. In the decoding stage, the source registers and the destination register of the vsetvl instruction, the vsetvli instruction, or the vsetivli instruction are logical registers, whereas parameter values are actually stored in physical registers, and there is a correspondence between physical registers and logical registers. Renaming therefore allocates physical registers to the instruction, after which the instruction enters the subsequent stages.
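The correspondence between logical and physical registers can be illustrated with a generic renaming sketch. This is a textbook-style map table with a free list, not a structure taken from the embodiment; the class name, register counts, and the omission of special handling for register x0 are assumptions.

```python
class RenameMap:
    """Minimal logical-to-physical register map with a free list."""

    def __init__(self, num_logical: int = 32, num_physical: int = 64):
        self.map = list(range(num_logical))               # initial identity mapping
        self.free = list(range(num_logical, num_physical))

    def rename(self, rd: int, rs1=None, rs2=None) -> tuple:
        # Sources are looked up first so they see the mapping before this write.
        srcs = tuple(self.map[r] for r in (rs1, rs2) if r is not None)
        dst = self.free.pop(0)                            # allocate a physical register
        self.map[rd] = dst
        return srcs, dst
```

For example, a vsetvl instruction with rd, rs1, and rs2 would receive two physical source registers and one newly allocated physical destination register.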
In one possible implementation, the processing of a vconfig configuration instruction and a v_coupled vector instruction is illustrated by fig. 7, where the vconfig configuration instruction and the corresponding v_coupled vector instruction may be referred to as a set of instructions, and Vtag of the set of instructions is the same, and the specific processing flow is as follows:
The group of instructions passes through the fetch stage (Fetch Stage), the decode stage (Decode Stage), the rename stage (Rename Stage), the dispatch stage (Dispatch Stage), the issue stage (Issue Stage), and the execute stage (Execute Stage) of the pipeline, and then reaches the commit stage (Commit Stage), where instructions are committed in order in the ROB. In the commit stage, it is determined whether the vconfig configuration instruction was mis-speculated. If so, instructions younger than this group of instructions are flushed, where a younger instruction (i.e., one that entered later) is assigned a larger vtag value, i.e., vtag > fail_vtag. It is also necessary to determine whether a subsequent instruction depends on the value of the vtype parameter or the value of the vl parameter; this dependence is indicated by vtag_vld, and a dependent instruction must be flushed when the speculation fails.
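The flush condition at commit can be sketched as a single predicate. The names vtag, fail_vtag, and vtag_vld come from the description above; the function name must_flush is hypothetical. Using >= so that the v_coupled instructions of the failed group itself are also flushed is one reading of step S207, and tag wrap-around is ignored as a simplifying assumption.

```python
def must_flush(inst_vtag: int, inst_vtag_vld: bool, fail_vtag: int) -> bool:
    """Return True if an in-flight instruction is flushed after a failed speculation."""
    if not inst_vtag_vld:
        # The instruction does not depend on vtype or vl: speculation cannot hurt it.
        return False
    # Same group (equal tag) or younger group (larger tag) than the failed vconfig.
    return inst_vtag >= fail_vtag
```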
In the embodiment of the invention, different speculative strategies can be set according to different processor microarchitectures in the out-of-order core so as to obtain different effects.
In one possible implementation, for the vtype parameter, the value of the vtype parameter in the zimm field is taken directly for the vsetivli instruction and the vsetvli instruction, and for the vsetvl instruction one of the following four strategies may be used, specifically:
Strategy 1: the vsetvl instruction is marked as a unique instruction and executed in a manner known in the art.
Strategy 2: preferentially determining the valid vtype value of a vsetvl nearest to the vsetvl instruction in the file entry as the vtype value of the vsetvl instruction; if the file entry is empty, acquiring a vtype value stored in the CSR as a vtype value of the vsetvl instruction; if CSR is empty, the result is "exception".
Strategy 3: the historical vtype value is stored in the file entry, then hashed with the value of the partial address (PC) and the historical prediction information, and then indexed to determine the vtype value of the vsetvl instruction.
Strategy 4: a new item is filled in the file entry with its configuration state flag set to unready; v_coupled instructions are blocked in the decoding stage until the vsetvl instruction updates the value of the item in the execution stage and sets the flag to ready.
In one possible implementation, for the vl parameter, the value of the vl parameter in the uimm field is taken directly for the vsetivli instruction, and for the vsetvli instruction and the vsetvl instruction one of the following five strategies may be used, specifically:
Strategy 5: the vsetvl instruction is marked as a unique instruction and executed in a manner known in the art.
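Strategy 2 above can be sketched as a lookup with two fallbacks. The list-of-dictionaries layout (ordered oldest to youngest), the vtype_valid key, the use of None for an empty CSR, and the choice of LookupError for the "exception" outcome are assumptions for illustration.

```python
def predict_vtype_strategy2(entries: list, csr_vtype):
    """Pick the vtype for a vsetvl instruction following Strategy 2."""
    # Prefer the nearest (youngest) entry whose vtype is valid.
    for entry in reversed(entries):
        if entry["vtype_valid"]:
            return entry["vtype"]
    # No usable entry: fall back to the vtype value stored in the CSR.
    if csr_vtype is not None:
        return csr_vtype
    # Neither source is available: report the exception case.
    raise LookupError("no vtype available for speculation")
```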
Strategy 6: preferentially determining the valid vl value of a vsetvl nearest to the vsetvl instruction in the file entry as the vl value of the vsetvl instruction; if the file entry is empty, the vl value stored in the CSR is obtained as the vl value of the vsetvl instruction.
Strategy 7: the historic vl value is stored in the file entry, then hashed with the value of the partial address (PC) and the historic prediction information, and then indexed to determine the vl value of the vsetvl instruction.
Strategy 8: a new item is filled in the file entry with its configuration state flag set to unready; v_coupled instructions are blocked in the decoding stage until the vsetvl instruction updates the value of the item in the execution stage and sets the flag to ready.
Strategy 9: the vl value is set to VLMAX.
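Strategy 7 indexes stored vl values with a hash of part of the PC and historical prediction information. The description does not fix a hash function, so the XOR of PC bits with a history value used below is an assumption, as are the class name VlPredictor, the table size, and passing the history in explicitly.

```python
def predictor_index(pc: int, history: int, index_bits: int = 6) -> int:
    """Hash part of the PC with the history value (XOR is an assumed choice)."""
    mask = (1 << index_bits) - 1
    return ((pc >> 2) ^ history) & mask


class VlPredictor:
    """Table of historical vl values indexed by the hashed PC and history."""

    def __init__(self, index_bits: int = 6):
        self.index_bits = index_bits
        self.table = {}

    def predict(self, pc: int, history: int, fallback: int) -> int:
        # Use the stored vl if this index was seen before, else the fallback.
        idx = predictor_index(pc, history, self.index_bits)
        return self.table.get(idx, fallback)

    def train(self, pc: int, history: int, actual_vl: int) -> None:
        # Record the vl value produced in the execution stage.
        idx = predictor_index(pc, history, self.index_bits)
        self.table[idx] = actual_vl
```

A fallback such as VLMAX (Strategy 9) could be supplied for indexes that have not yet been trained.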
Different speculative strategies can be set according to different processor microarchitectures in the out-of-order core, and a specific combination mode can be shown in the following table 1:
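TABLE 1 (rendered as an image in the source; columns: combination, vtype, vl, scope of applicability, description, effect)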
The column headings in the first row of table 1 are as follows: "combination" means a combination of a manner of acquiring the vtype value and a manner of acquiring the vl value; "vtype" refers to the manner in which the vtype value is obtained; "vl" refers to the manner in which the vl value is obtained; "scope of applicability" refers to which types of vconfig configuration instructions each combination applies to, i.e., the scope of speculation; "description" refers to further description of the scope of applicability; "effect" refers to the effect produced when the corresponding combination is employed.
In the embodiment of the invention, for an in-order vector core or a core in which vector and scalar are decoupled, combination 1 can be selected: the hardware cost is small, the implementation is simple, and the impact on performance is low. For an out-of-order core or a core in which vector and scalar are coupled, selecting a speculative combination is more efficient. Specifically, combination 2 is simple to implement, but its speculation range is small and its degree of optimization is low; combinations 3/4 cover a wider range of application than combination 2, with high speculation accuracy and a good optimization effect; combinations 5/6 may be evaluated and selected where a wide speculation range is desired; combinations 7/8 use a better prediction algorithm, with a larger hardware cost and a higher speculation accuracy than combinations 5/6.
In the embodiment of the invention, an out-of-order core means that the renaming stage, the dispatch stage, the issue stage, the execution stage, and the write-back stage that follow the decoding stage of the pipeline may be executed out of order, that is, not in the order in which the instructions were fetched, while the commit stage commits in the order in which the instructions were fetched. Because instructions in these stages need not queue, the processing speed is improved. An in-order vector core means that instructions are executed one by one strictly in the order in which they were fetched.
FIG. 8 is a schematic diagram of an execution device in a RISCV-V instruction set based out-of-order core according to an embodiment of the present invention. As shown in fig. 8, the apparatus of the present embodiment includes an execution unit 801, an acquisition unit 802, and a determination unit 803.
The execution unit 801 executes a configuration instruction for configuring parameters required by the execution of the vector instruction, and obtains execution configuration parameters, wherein the configuration instruction carries a tag index; an obtaining unit 802, configured to obtain, according to the tag index, a speculative configuration parameter from a file entry corresponding to the configuration instruction, where a vector instruction corresponding to the configuration instruction has been executed according to the speculative configuration parameter; a determining unit 803, configured to determine that the configuration instruction is speculative correctly in response to the speculative configuration parameter being the same as the execution configuration parameter.
Optionally, the acquiring unit is further configured to acquire the configuration instruction;
the determining unit is further used for determining a speculative configuration parameter according to the configuration instruction and a speculative policy;
the apparatus further comprises: and the writing unit is used for writing the speculative configuration parameters into file entries corresponding to the configuration instructions according to the tag indexes, so that vector instructions corresponding to the configuration instructions are executed according to the speculative configuration parameters.
Optionally, the determining unit is further configured to determine that the configuration instruction speculation fails in response to the speculation configuration parameter being different from the execution configuration parameter;
the obtaining unit is further configured to re-obtain a vector instruction corresponding to the configuration instruction, so that the vector instruction corresponding to the configuration instruction is executed according to the execution configuration parameter.
Optionally, the acquiring unit is specifically configured to:
and re-acquiring a vector instruction corresponding to the configuration instruction according to the tag index of the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction.
Optionally, the apparatus further comprises:
a processing unit, configured to flush out of the pipeline, and fetch again, the vector instruction corresponding to the configuration instruction and any instruction whose tag index value is greater than the tag index value of the configuration instruction.
Optionally, the obtaining unit is further configured to obtain a vector instruction corresponding to the configuration instruction, where the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction;
the acquisition unit is further configured to acquire a speculative configuration parameter from a file entry corresponding to the configuration instruction according to the tag index, so that a vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameter.
Optionally, before acquiring the speculative configuration parameter from the file entry corresponding to the configuration instruction according to the tag index, the processing unit is further configured to:
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being valid, determine the vector instruction corresponding to the configuration instruction as an instruction to be executed; or,
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being invalid, block the vector instruction corresponding to the configuration instruction in a vector issue slot and wait for the configuration state to be updated.
Optionally, the determining unit is specifically configured to:
determine the speculative configuration parameters according to the type of the configuration instruction and a speculative policy, wherein the configuration instruction comprises a vsetvl instruction, a vsetvli instruction, and a vsetivli instruction, and the speculative configuration parameters comprise a vtype value and a vl value.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetivli instruction, acquiring the speculative configuration parameter in a field of the vsetivli instruction.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetvli instruction, acquiring a vtype value in the speculative configuration parameter in a field of the vsetvli instruction, and acquiring the vl value in the speculative configuration parameter according to a preset speculative strategy.
Optionally, the determining unit is specifically configured to:
and when the configuration instruction is a vsetvl instruction, obtaining a vtype value in the speculative configuration parameter and a vl value in the speculative configuration parameter according to a preset speculative strategy.
In an embodiment of the present invention, there is also provided computer program instructions which, when executed by a processor, implement the method of any of the above embodiments.
In an embodiment of the present invention, there is also provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any of the above embodiments.
An embodiment of the present invention provides a chip including a memory for storing one or more computer program instructions, and a processing core, where the one or more computer program instructions are executed by the processing core to implement the method of any of the above embodiments.
The embodiment of the invention provides a board card, which comprises a chip.
The embodiment of the invention provides a server, which comprises the board card.
As will be appreciated by one skilled in the art, aspects of embodiments of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of embodiments of the invention may take the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module," or "system." Furthermore, aspects of embodiments of the invention may take the form of: a computer program product embodied in one or more computer-readable media having computer-readable program code embodied thereon.
Any combination of one or more computer readable media may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of embodiments of the present invention, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, such as in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to: electromagnetic, optical, or any suitable combination thereof. The computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of embodiments of the present invention may be written in any combination of one or more programming languages, including: object oriented programming languages such as Java, Smalltalk, C++, etc.; and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer; partly on the user's computer as a stand-alone software package; partly on the user's computer and partly on a remote computer; or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Aspects of embodiments of the present invention are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations may be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. A method of executing in an out-of-order core based on a RISCV-V instruction set, the method comprising:
executing a configuration instruction, which configures parameters required for vector instruction execution, to obtain execution configuration parameters, wherein the configuration instruction carries a tag index;
acquiring speculative configuration parameters from a file entry corresponding to the configuration instruction according to the tag index, wherein a vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameters;
and determining that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters.
2. The method of claim 1, wherein the method further comprises:
acquiring the configuration instruction;
determining the speculative configuration parameters according to the configuration instruction and a speculative policy;
and writing the speculative configuration parameters into the file entry corresponding to the configuration instruction according to the tag index, so that the vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameters.
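
As a non-limiting illustration, the write and check steps of claims 1 and 2 can be modelled in software roughly as follows. This is a minimal behavioural sketch, not the claimed hardware: the names `VConfig`, `FileEntry`, `SpecConfigFile`, and `speculation_correct` are hypothetical, and the sketch assumes that the tag index directly selects one entry of a small table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VConfig:
    """Vector configuration parameters: the vtype value and the vl value."""
    vtype: int
    vl: int


@dataclass
class FileEntry:
    """One entry of the (hypothetical) speculative configuration file."""
    speculative: Optional[VConfig] = None
    valid: bool = False


class SpecConfigFile:
    """Table of speculative configurations, indexed by tag index (assumption)."""

    def __init__(self, num_entries: int) -> None:
        self.entries: List[FileEntry] = [FileEntry() for _ in range(num_entries)]

    def write_speculative(self, tag: int, cfg: VConfig) -> None:
        # Claim 2: write the speculative parameters into the entry chosen by the tag.
        entry = self.entries[tag]
        entry.speculative = cfg
        entry.valid = True

    def read_speculative(self, tag: int) -> Optional[VConfig]:
        # Claim 1: read the speculative parameters back using the same tag.
        return self.entries[tag].speculative


def speculation_correct(cfg_file: SpecConfigFile, tag: int, executed: VConfig) -> bool:
    """Speculation is correct when the speculative and executed parameters match."""
    return cfg_file.read_speculative(tag) == executed
```

In this sketch the comparison covers both vtype and vl at once, so a mismatch in either value counts as a failed speculation.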
3. The method of any one of claims 1-2, further comprising:
determining that the speculation for the configuration instruction has failed in response to the speculative configuration parameters being different from the execution configuration parameters;
and re-fetching the vector instruction corresponding to the configuration instruction, so that the vector instruction corresponding to the configuration instruction is executed according to the execution configuration parameters.
4. The method of claim 3, wherein re-fetching the vector instruction corresponding to the configuration instruction specifically comprises:
re-fetching the vector instruction corresponding to the configuration instruction according to the tag index of the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction.
5. The method of claim 3, further comprising:
flushing from the pipeline the vector instruction corresponding to the configuration instruction and any instruction whose tag index value is larger than the tag index value of the configuration instruction, and re-fetching the flushed instructions.
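
The recovery path of claims 3 to 5 can be illustrated with the following sketch. It is a simplified software model under stated assumptions: `InFlight` and `flush_on_misspeculation` are hypothetical names, every in-flight instruction is assumed to carry the tag index of the most recent configuration instruction, and tag wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InFlight:
    """An in-flight instruction with the tag index it carries (assumption)."""
    name: str
    tag: int
    is_vector: bool


def flush_on_misspeculation(
    pipeline: List[InFlight], config_tag: int
) -> Tuple[List[InFlight], List[InFlight]]:
    """Split the pipeline into (kept, flushed) after a failed speculation.

    Flushed: vector instructions carrying the failing configuration's tag,
    plus every instruction whose tag index is larger than that tag.
    """
    kept: List[InFlight] = []
    flushed: List[InFlight] = []
    for insn in pipeline:
        dependent_vector = insn.is_vector and insn.tag == config_tag
        younger = insn.tag > config_tag
        if dependent_vector or younger:
            flushed.append(insn)  # to be re-fetched and re-executed
        else:
            kept.append(insn)
    return kept, flushed
```

The flushed list is what would be re-fetched, so that the vector instructions then execute with the execution configuration parameters rather than the speculative ones.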
6. The method of any one of claims 1-5, further comprising:
acquiring a vector instruction corresponding to the configuration instruction, wherein the vector instruction corresponding to the configuration instruction carries the same tag index as the configuration instruction;
and acquiring the speculative configuration parameters from the file entry corresponding to the configuration instruction according to the tag index, so that the vector instruction corresponding to the configuration instruction is executed according to the speculative configuration parameters.
7. The method of claim 6, wherein prior to retrieving the speculative configuration parameters from the file entries corresponding to the configuration instructions according to the tag index, the method further comprises:
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being valid, determining the vector instruction corresponding to the configuration instruction to be an instruction to be executed; or,
in response to the configuration state of the speculative configuration parameters in the file entry corresponding to the configuration instruction being invalid, blocking the vector instruction corresponding to the configuration instruction in a vector issue slot and waiting for the configuration state to be updated.
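
The gating described in claim 7 can be sketched as a single check made before a vector instruction leaves the issue slot. The names `Entry` and `try_issue` are hypothetical, and returning `None` stands in for "remain blocked and retry later"; real hardware would stall the slot instead.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """File entry holding speculative vtype/vl and a configuration state bit."""
    valid: bool = False
    vtype: int = 0
    vl: int = 0


def try_issue(entries: List[Entry], tag: int) -> Optional[Tuple[int, int]]:
    """Return (vtype, vl) if the entry for this tag is valid, else None.

    None models the vector instruction staying blocked in the vector issue
    slot until the configuration state of its entry is updated.
    """
    entry = entries[tag]
    if not entry.valid:
        return None
    return entry.vtype, entry.vl
```

A caller would retry `try_issue` each cycle; once the speculative parameters have been written and the entry marked valid, the instruction becomes an instruction to be executed.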
8. The method according to any one of claims 2-7, wherein determining the speculative configuration parameters according to the configuration instruction and the speculative policy specifically comprises:
determining the speculative configuration parameters according to the type of the configuration instruction and the speculative policy, wherein the configuration instruction comprises a vsetvl instruction, a vsetvli instruction, and a vsetivli instruction, and the speculative configuration parameters comprise a vtype value and a vl value.
9. The method of claim 8, wherein determining the speculative configuration parameters according to the configuration instructions and the speculative policies comprises:
and when the configuration instruction is a vsetivli instruction, acquiring the speculative configuration parameter in a field of the vsetivli instruction.
10. The method of claim 8, wherein determining the speculative configuration parameters according to the configuration instructions and the speculative policies comprises:
and when the configuration instruction is a vsetvli instruction, acquiring the vtype value in the speculative configuration parameters from a field of the vsetvli instruction, and acquiring the vl value in the speculative configuration parameters according to a preset speculative policy.
11. The method of claim 8, wherein determining the speculative configuration parameters according to the configuration instructions and the speculative policies comprises:
and when the configuration instruction is a vsetvl instruction, acquiring the vtype value and the vl value in the speculative configuration parameters according to a preset speculative policy.
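
The three cases of claims 9 to 11 can be summarised in a short dispatch routine. This is only a sketch: `ConfigInsn` and `speculate` are hypothetical names, the claims do not define the preset speculative policy here, so reusing the most recently known configuration is an assumed placeholder, and taking vl directly from the vsetivli immediate ignores any clamping to the maximum vector length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VConfig:
    """Speculative configuration parameters: vtype value and vl value."""
    vtype: int
    vl: int


@dataclass(frozen=True)
class ConfigInsn:
    """Decoded configuration instruction (fields are illustrative)."""
    opcode: str                      # "vsetivli", "vsetvli", or "vsetvl"
    vtype_imm: Optional[int] = None  # vtype immediate, if encoded in the instruction
    avl_imm: Optional[int] = None    # vector length immediate (vsetivli only)


def speculate(insn: ConfigInsn, last: VConfig) -> VConfig:
    """Pick speculative vtype/vl by instruction type.

    `last` is the most recent known configuration, used here as a stand-in
    for the unspecified preset speculative policy (assumption).
    """
    if insn.opcode == "vsetivli":
        # Claim 9: both values are taken from fields of the instruction.
        return VConfig(vtype=insn.vtype_imm, vl=insn.avl_imm)
    if insn.opcode == "vsetvli":
        # Claim 10: vtype from the instruction field, vl from the policy.
        return VConfig(vtype=insn.vtype_imm, vl=last.vl)
    if insn.opcode == "vsetvl":
        # Claim 11: both values come from the policy.
        return last
    raise ValueError(f"not a configuration instruction: {insn.opcode}")
```

The less information the instruction encoding carries, the more the result depends on the policy, which is why a vsetvl speculation is the one most likely to need the recovery path.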
12. An execution apparatus in an out-of-order core based on a RISCV-V instruction set, the apparatus comprising:
an execution unit, configured to execute a configuration instruction, which configures parameters required for vector instruction execution, to obtain execution configuration parameters, wherein the configuration instruction carries a tag index;
an acquisition unit, configured to acquire speculative configuration parameters from a file entry corresponding to the configuration instruction according to the tag index;
and a determining unit, configured to determine that the speculation for the configuration instruction is correct in response to the speculative configuration parameters being the same as the execution configuration parameters.
13. Computer program instructions which, when executed by a processor, implement the method of any one of claims 1-11.
14. A computer readable storage medium, on which computer program instructions are stored, which computer program instructions, when executed by a processor, implement the method of any one of claims 1-11.
CN202210847921.6A 2022-07-19 2022-07-19 Execution method and device in out-of-order core based on RISCV-V instruction set Pending CN117453290A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210847921.6A CN117453290A (en) 2022-07-19 2022-07-19 Execution method and device in out-of-order core based on RISCV-V instruction set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210847921.6A CN117453290A (en) 2022-07-19 2022-07-19 Execution method and device in out-of-order core based on RISCV-V instruction set

Publications (1)

Publication Number Publication Date
CN117453290A true CN117453290A (en) 2024-01-26

Family

ID=89587928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210847921.6A Pending CN117453290A (en) 2022-07-19 2022-07-19 Execution method and device in out-of-order core based on RISCV-V instruction set

Country Status (1)

Country Link
CN (1) CN117453290A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination