WO2015024452A1 - Branch predicting method and related apparatus - Google Patents

Branch predicting method and related apparatus

Info

Publication number
WO2015024452A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
branch
read
prediction
register
Prior art date
Application number
PCT/CN2014/083882
Other languages
French (fr)
Chinese (zh)
Inventor
侯锐
冯煜晶
郭旭斌
张乾龙
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2015024452A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806 Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer

Definitions

  • The present invention relates to the field of computer systems, and in particular to a branch prediction method and related apparatus.
  • Real programs contain branch instructions.
  • The branching behavior of a branch instruction is often not determined until the back end of the pipeline. A branch instruction may therefore cause a control hazard and stall the pipeline.
  • Until the branch instruction has been executed, the processor cannot determine the address from which to fetch the next instruction.
  • Most processors use some form of branch prediction mechanism, so that the branch direction and target jump address of a conditional branch instruction can be predicted at the front end of the pipeline and the processor can speculatively fetch and execute instructions. If the branch prediction is correct, or its accuracy is high, the performance and power consumption of the processor can be greatly improved.
  • The Branch Target Address Cache (BTAC) is used to predict the target jump address of an indirect branch instruction.
  • The BTAC is organized like a cache: part of the program counter (PC) serves as the index and another part as the tag, for example the lower 8 bits of the PC as the index and the upper 8 bits of the PC as the tag.
  • Each entry of the BTAC corresponds to one index and one tag and has a valid bit that records whether the entry holds valid history information (the history information is the predicted target jump address), where the target jump address stored in the entry is a virtual address (VA). If the BTAC is full then, as with a cache, a replacement algorithm must determine which entry, for example the least recently used one, is replaced.
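The index/tag/valid-bit organization described above can be sketched as follows. This is a minimal illustration rather than the patent's hardware: it assumes a direct-mapped table (so no replacement policy is needed) and a 16-bit PC field, and the names `Btac`, `Entry`, `lookup` and `update` are hypothetical.

```python
from dataclasses import dataclass

INDEX_BITS = 8  # lower 8 bits of the PC select the entry (example from the text)
TAG_BITS = 8    # next 8 bits form the tag (assumes a 16-bit PC field)


@dataclass
class Entry:
    valid: bool = False  # does this entry hold valid history information?
    tag: int = 0
    target: int = 0      # predicted target jump address (a virtual address)


class Btac:
    """Direct-mapped sketch: one entry per index value (an assumption)."""

    def __init__(self):
        self.entries = [Entry() for _ in range(1 << INDEX_BITS)]

    @staticmethod
    def split(pc):
        index = pc & ((1 << INDEX_BITS) - 1)
        tag = (pc >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
        return index, tag

    def lookup(self, pc):
        index, tag = self.split(pc)
        entry = self.entries[index]
        # A hit requires a valid entry whose tag matches.
        return entry.target if entry.valid and entry.tag == tag else None

    def update(self, pc, target):
        index, tag = self.split(pc)
        self.entries[index] = Entry(True, tag, target)


btac = Btac()
btac.update(0x12A4, 0x8000)
hit = btac.lookup(0x12A4)
miss = btac.lookup(0x34A4)  # same index 0xA4, different tag
```

A set-associative variant would keep several entries per index and would then need the replacement algorithm the text mentions.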
  • One approach is the shared BTAC: multiple threads share the same BTAC, and each thread uses its own PC to index the contents stored in the BTAC. Although this approach saves area, the index address of the BTAC is the PC of each thread, and the PCs of different threads may be the same, so the history information of different threads is stored in the same BTAC, which reduces the accuracy of branch prediction.
  • The other is the exclusive BTAC: each thread has its own BTAC, which predicts branch target jump addresses only for that thread.
  • Aspects of the present invention provide a branch prediction method and related apparatus that address the loss of branch prediction accuracy when a BTAC is shared.
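The interference in a shared, PC-indexed BTAC can be shown with a toy model. The dictionary `shared_btac` and the helpers `train` and `predict` are hypothetical stand-ins for the hardware table, and the addresses are made up for illustration.

```python
shared_btac = {}  # hypothetical shared table: PC -> predicted target


def train(pc, target):
    """Record the resolved target of the branch at this PC."""
    shared_btac[pc] = target


def predict(pc):
    return shared_btac.get(pc)


# Thread 0 and thread 1 each execute an indirect branch at the same PC
# value (0x4000) but jump to different targets.
train(0x4000, 0xA000)  # thread 0 resolves its branch
train(0x4000, 0xB000)  # thread 1 overwrites the same entry

thread0_prediction = predict(0x4000)
thread0_mispredicts = thread0_prediction != 0xA000
```

Thread 0 now receives thread 1's target, which is the accuracy loss the shared design suffers from.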
  • A first aspect of the present invention provides a branch prediction method applied to a processor, where the processor includes a first branch target address prediction buffer and a second branch target address prediction buffer. The first branch target address prediction buffer stores correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction buffer stores correspondence information between a field of the program counter and predicted target jump addresses.
  • the above branch prediction method includes: reading an instruction from an instruction cache;
  • The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
  • The register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
  • the register identifier in the above instruction is a specific register identifier
  • Determining that the read instruction does not satisfy the register prediction condition is specifically: when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, it is determined that the read instruction does not satisfy the register prediction condition.
  • Before the instruction is read from the instruction cache, the method includes:
  • pre-decoding the instruction to be read to obtain type information of that instruction; and after the instruction is read, the method further includes: determining, according to the obtained type information, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
  • the type of the compiled instruction is specified as an unconditional indirect jump branch instruction.
  • A second aspect of the present invention provides a branch prediction apparatus applied to a processor, where the processor includes a first branch target address prediction buffer and a second branch target address prediction buffer. The first branch target address prediction buffer stores correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction buffer stores correspondence information between a partial field of the program counter and predicted target jump addresses, or correspondence information between all fields of the program counter and predicted target jump addresses.
  • the branch prediction device includes:
  • a reading unit configured to read an instruction from the instruction cache
  • a prediction acquiring unit, configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, acquire the predicted target jump address of that instruction from the first branch target address prediction buffer according to the register identifier in the instruction; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, acquire the predicted target jump address of that instruction from the second branch target address prediction buffer according to the program counter of the instruction;
  • The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction. According to the second aspect of the present invention, in a first possible implementation, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
  • the above branch prediction device further includes:
  • a determining unit, configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction read by the reading unit satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.
  • the branch prediction apparatus further includes:
  • a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit to obtain type information of that instruction;
  • the determining unit is further configured to determine, after the reading unit reads the instruction and according to the type information obtained by the pre-decoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
  • the foregoing branch prediction apparatus further includes:
  • the specifying unit is configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction when the function called when the above compiling unit compiles the high-level language is a standard library function.
  • In the present invention, a first BTAC and a second BTAC are provided. The first BTAC is indexed by register identifier (that is, it stores correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by the PC (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used.
  • Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, the accuracy of branch prediction is not affected even if the history information of multiple unconditional indirect jump branch instructions with the same target jump address is stored in the same entry of the first BTAC. The technical solution provided by the present invention therefore does not reduce branch prediction accuracy when the first BTAC is shared, so BTAC resources can be shared while prediction accuracy is preserved.
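The choice between the two BTACs can be sketched as a single dispatch function. This is an illustrative model only: the dictionary-based tables, the instruction records and their field names (`type`, `reg`, `pc`) are assumptions and do not come from the patent text.

```python
def predict_target(instr, first_btac, second_btac):
    """Pick the predictor by instruction type, as the text describes."""
    if instr["type"] == "uncond_indirect":
        # Register prediction condition met: index by register identifier.
        return first_btac.get(instr["reg"])
    # Otherwise index by (a field of) the program counter.
    return second_btac.get(instr["pc"])


first_btac = {"x30": 0x7000}    # register identifier -> predicted target
second_btac = {0x4000: 0x4100}  # PC -> predicted target

# Two threads execute indirect branches at different PCs through the same register.
t0 = {"type": "uncond_indirect", "reg": "x30", "pc": 0x1000}
t1 = {"type": "uncond_indirect", "reg": "x30", "pc": 0x2000}
cond = {"type": "cond_direct", "reg": None, "pc": 0x4000}

shared_hits = (predict_target(t0, first_btac, second_btac),
               predict_target(t1, first_btac, second_btac))
pc_hit = predict_target(cond, first_btac, second_btac)
```

Both threads read the same first-BTAC entry, which is harmless under the text's premise that branches using the same register identifier share one target.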
  • FIG. 1 is a schematic flow chart of an embodiment of a branch prediction method according to the present invention
  • FIG. 2 is a schematic flowchart of another embodiment of a branch prediction method according to the present invention
  • FIG. 3 is a schematic diagram of a branch prediction method provided by the present invention.
  • FIG. 4 is a schematic structural diagram of an embodiment of a branch prediction apparatus according to the present invention
  • FIG. 5 is a schematic structural diagram of another embodiment of a branch prediction apparatus according to the present invention.
  • Embodiments of the present invention provide a branch prediction method and related apparatus.
  • a branch prediction method provided by an embodiment of the present invention is described below.
  • The branch prediction method in this embodiment is applied to a processor that includes a first BTAC and a second BTAC. The first BTAC stores correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores correspondence information between a field of the PC and predicted target jump addresses. Optionally, the second BTAC stores correspondence information between a partial field of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses.
  • a branch prediction method in an embodiment of the present invention includes:
  • The instruction read in step 101 may be a branch instruction or a non-branch instruction.
  • Branch instructions can be classified in two ways. One classifies them by jump condition into conditional branch instructions and unconditional branch instructions: a conditional branch instruction performs the branch jump only when a certain condition is met, whereas an unconditional branch instruction need not satisfy any condition and always performs the branch jump. The other classifies them by target jump address into direct jump branch instructions and indirect jump branch instructions: for a direct jump branch instruction, the offset of the target jump address is specified directly in the instruction as an immediate value (that is, the number given in an instruction that uses immediate addressing), and the target jump address is calculated as the PC of the branch instruction plus that immediate offset; for an indirect jump branch instruction, the target jump address is specified in a register.
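The two classifications can be made concrete with a few small helpers. This is a sketch under simplifying assumptions: the register file is modelled as a dictionary, and the function names are hypothetical.

```python
def direct_target(branch_pc, imm_offset):
    """Direct jump: target = PC of the branch plus the immediate offset."""
    return branch_pc + imm_offset


def indirect_target(register_file, reg_id):
    """Indirect jump: the target is read from the named register."""
    return register_file[reg_id]


def is_taken(conditional, condition_met):
    """Unconditional branches always jump; conditional ones need the condition."""
    return (not conditional) or condition_met


registers = {3: 0x9000}
forward = direct_target(0x1000, 0x40)
backward = direct_target(0x1000, -0x20)
via_reg = indirect_target(registers, 3)
```

A direct target can be computed from the instruction bits alone, whereas an indirect target depends on run-time register contents, which is why the latter needs a target-address predictor such as a BTAC.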
  • Before an instruction is fetched from the L2 cache or memory into the instruction cache, it must be pre-decoded, so that part of the pre-decoding result can guide branch prediction.
  • In particular, the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the pre-decoding stage, so that the corresponding branch prediction can be performed according to that type.
  • The pre-decoding result (such as the type information of the instruction) is stored in the instruction cache together with the instruction. It should be noted that the pre-decoding of the instruction may be performed by the branch prediction device or by another device, which is not limited herein.
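Pre-decoding on instruction-cache fill might look like the sketch below. The textual toy encoding, the mnemonic table `KINDS` and the helpers `predecode` and `fill` are all hypothetical; real pre-decoders work on binary encodings.

```python
# Hypothetical mnemonic -> branch type table for a toy instruction set.
KINDS = {
    "B": "uncond_direct",
    "BEQ": "cond_direct",
    "BR": "uncond_indirect",
    "BLR": "uncond_indirect",
}


def predecode(text):
    """Return the instruction together with its pre-decoded type information."""
    op, *operands = text.replace(",", " ").split()
    kind = KINDS.get(op, "non_branch")
    reg = operands[0] if kind == "uncond_indirect" else None
    return {"text": text, "type": kind, "reg": reg}


icache = {}  # PC -> instruction plus pre-decoded result


def fill(pc, text):
    """Pre-decode while filling the instruction cache from L2 or memory."""
    icache[pc] = predecode(text)


fill(0x100, "BLR x3")
fill(0x104, "ADD x1, x2, x3")
```

When the instruction is later read from `icache`, its type is already available, so the predictor can be chosen without a full decode.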
  • The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
  • The first BTAC is provided in the processor.
  • For ease of description, the first BTAC is referred to as the SBTAC, and the second BTAC is referred to simply as the BTAC.
  • The hardware structure of the SBTAC is similar to that of the BTAC. The difference is that the BTAC is indexed by a partial field or all fields of the PC, whereas the SBTAC is indexed by register identifier. Since the SBTAC stores correspondence information between register identifiers and predicted target jump addresses, the branch prediction device can look up in the SBTAC the predicted target jump address that corresponds to the register identifier in the instruction.
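A register-indexed table of this kind can be sketched as a small fixed-size array. The class name `Sbtac`, the register count of 32 and the use of a register number as the identifier are assumptions made for illustration.

```python
NUM_REGS = 32  # assumption: 32 architectural registers


class Sbtac:
    """Sketch of a table indexed directly by register number."""

    def __init__(self, num_regs=NUM_REGS):
        self.valid = [False] * num_regs
        self.target = [0] * num_regs

    def lookup(self, reg_num):
        # No tag compare is needed: the register number is the full index.
        return self.target[reg_num] if self.valid[reg_num] else None

    def update(self, reg_num, resolved_target):
        self.valid[reg_num] = True
        self.target[reg_num] = resolved_target


sbtac = Sbtac()
cold = sbtac.lookup(30)
sbtac.update(30, 0x7F00)
warm = sbtac.lookup(30)
```

Compared with a PC-indexed table, the number of entries is bounded by the number of register identifiers rather than by the number of branch sites.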
  • The branch prediction device specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a branch and link register (BLR) instruction.
  • The BLR instruction is an unconditional indirect jump branch instruction. It is produced by a subroutine call or function call, which must return; the return address is stored in the link register.
  • The high-level language in the embodiment of the present invention does not refer to one particular language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, Easy Language, etc.
  • The standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to high-level language standards.
  • The register identifier in the embodiment of the present invention may be a register number, or may be another code or symbol that can be used to indicate the register.
  • The branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • In the embodiment of the present invention, a first BTAC and a second BTAC are provided. The first BTAC is indexed by register identifier (that is, it stores correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by the PC (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used.
  • Because the history information of multiple unconditional indirect jump branch instructions with the same target jump address can be stored in the same entry of the first BTAC without affecting the accuracy of branch prediction, BTAC resources can be shared while prediction accuracy is preserved.
  • The register identifiers of all registers may be used as indexes of predicted target jump addresses in the SBTAC.
  • Alternatively, only the register identifiers of some registers may be used as indexes of predicted target jump addresses in the SBTAC.
  • In that case, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. When the type of the read instruction is an unconditional indirect jump branch instruction and the register identifier in the read instruction is a specific register identifier, it is determined that the read instruction satisfies the register prediction condition. When the type of the read instruction is not an unconditional indirect jump branch instruction, or the register identifier in the read instruction is not a specific register identifier, it is determined that the read instruction does not satisfy the register prediction condition.
  • the branch prediction method in the embodiment of the present invention includes:
  • Before an instruction is fetched from the L2 cache or memory into the instruction cache, it must be pre-decoded, so that part of the pre-decoding result can guide branch prediction.
  • In particular, the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the pre-decoding stage, so that the corresponding branch prediction can be performed according to that type.
  • The pre-decoding result is stored in the instruction cache together with the instruction. It should be noted that the pre-decoding of the instruction may be performed by the branch prediction device or by another device, which is not limited herein.
  • When the type of the read instruction is an unconditional indirect jump branch instruction, it is determined whether the register identifier in the read instruction is a specific register identifier;
  • If the branch prediction device determines that the register identifier in the read instruction is a specific register identifier, step 203 is executed; if it determines that the register identifier in the read instruction is not a specific register identifier, step 204 is executed.
  • The first BTAC stores correspondence information between register identifiers and predicted target jump addresses.
  • The predicted target jump address of the read instruction is obtained from the second BTAC according to the program counter of the read instruction.
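The decision flow with a restricted set of registers can be sketched as follows. The mapping of comments to step numbers follows the surrounding description as far as it can be read; the set `SPECIFIC_REGS`, the mask `PC_FIELD_MASK` and the record field names are hypothetical values chosen for the example.

```python
SPECIFIC_REGS = {16, 17}  # hypothetical "specific register identifiers"
PC_FIELD_MASK = 0xFFFF    # hypothetical partial PC field used as the BTAC index


def predict(instr, sbtac, btac):
    """Return (predictor used, predicted target) for an already-read instruction."""
    # Check both parts of the register prediction condition.
    is_uncond_indirect = instr["type"] == "uncond_indirect"
    if is_uncond_indirect and instr["reg"] in SPECIFIC_REGS:
        # Condition satisfied: look up the first BTAC by register identifier.
        return "SBTAC", sbtac.get(instr["reg"])
    # Condition not satisfied: look up the second BTAC by the PC field.
    return "BTAC", btac.get(instr["pc"] & PC_FIELD_MASK)


sbtac = {16: 0xA000}
btac = {0x2000: 0xB000}

r1 = predict({"type": "uncond_indirect", "reg": 16, "pc": 0x1000}, sbtac, btac)
r2 = predict({"type": "uncond_indirect", "reg": 5, "pc": 0x2000}, sbtac, btac)
r3 = predict({"type": "cond_direct", "reg": None, "pc": 0x12000}, sbtac, btac)
```

Note that `r2` is an unconditional indirect branch but still falls back to the PC-indexed table, because register 5 is not in the specific set.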
  • The branch prediction device specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction.
  • The BLR instruction is an unconditional indirect jump branch instruction. It is produced by a subroutine call or function call, which must return; the return address is stored in the link register.
  • A high-level language in the embodiment of the present invention is defined relative to assembly language: it is programming that is closer to natural language and mathematical formulas, is largely independent of the machine's hardware system, and lets programs be written in a form that is easier to understand.
  • The high-level language in the embodiment of the present invention does not refer to one particular language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, Easy Language, etc.
  • The standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to high-level language standards.
  • The register identifier in the embodiment of the present invention may be a register number, or may be another code or symbol that can be used to indicate a register.
  • The branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • In the embodiment of the present invention, a first BTAC and a second BTAC are provided. The first BTAC is indexed by register identifier (that is, it stores correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by the PC (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used.
  • Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, the accuracy of branch prediction is not affected even if the history information of multiple unconditional indirect jump branch instructions with the same target jump address is stored in the same entry of the first BTAC. The technical solution provided by the present invention therefore does not reduce branch prediction accuracy when the first BTAC is shared, so BTAC resources can be shared while prediction accuracy is preserved.
  • For a branch instruction under a standard library function, the target jump address usually does not change. Therefore, to ensure that the contents of the SBTAC are not updated or invalidated by software process switches, in the embodiment of the present invention the branch instructions under standard library functions use the SBTAC for branch prediction.
  • the branch prediction method in the embodiment of the present invention includes:
  • step 303 is executed. If the standard library function is called, step 304 is performed.
  • The type of the compiled instruction is specified as a BLR instruction, and the compiled instruction is stored in the second-level cache or in memory.
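The compile-time marking of standard library calls can be sketched as below. This is only an illustration: the set `STANDARD_LIBRARY`, the register name `x16` and the function `compile_call` are hypothetical, and because the text does not say what is emitted for other calls, the direct-call branch is a placeholder assumption.

```python
STANDARD_LIBRARY = {"memcpy", "strlen", "printf"}  # illustrative subset only


def compile_call(callee, target_reg="x16"):
    """Emit a toy call instruction and its type for the given callee."""
    if callee in STANDARD_LIBRARY:
        # Standard library call: mark it as a BLR (unconditional indirect) instruction.
        return {"text": f"BLR {target_reg}", "type": "uncond_indirect"}
    # Placeholder assumption: any other call becomes a direct call.
    return {"text": f"BL {callee}", "type": "uncond_direct"}


lib_call = compile_call("memcpy")
user_call = compile_call("my_helper")
```

Only the instructions marked this way would later satisfy the register prediction condition and be predicted through the SBTAC.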
  • the steps 305-308 are similar to the steps 201-204 in the embodiment shown in FIG. 2, and the specific implementation manners may refer to the description in the corresponding steps, and details are not described herein again.
  • the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc.
  • The branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • In the embodiment of the present invention, a first BTAC and a second BTAC are provided. The first BTAC is indexed by register identifier (that is, it stores correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by the PC (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used.
  • Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, the accuracy of branch prediction is not affected even if the history information of multiple unconditional indirect jump branch instructions with the same target jump address is stored in the same entry of the first BTAC. The technical solution provided by the present invention therefore does not reduce branch prediction accuracy when the first BTAC is shared, so BTAC resources can be shared while prediction accuracy is preserved.
  • The embodiment of the present invention further provides a branch prediction apparatus applied to a processor that includes a first BTAC and a second BTAC. The first BTAC stores correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores correspondence information between a field of the PC and predicted target jump addresses. Optionally, the second BTAC stores correspondence information between a partial field of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses.
  • the branch prediction apparatus 400 in the embodiment of the present invention includes:
  • a reading unit 401 configured to read an instruction from the instruction cache
  • Before an instruction is fetched from the L2 cache or memory into the instruction cache, it must be pre-decoded, so that part of the pre-decoding result can guide branch prediction.
  • In particular, the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the pre-decoding stage, so that the corresponding branch prediction can be performed according to that type.
  • The pre-decoding result (such as the type information of the instruction) is stored in the instruction cache together with the instruction.
  • The pre-decoding of the instruction may be performed by the branch prediction device. In that case, the branch prediction device in the embodiment of the present invention may further include: a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit 401 to obtain type information of that instruction; and a determining unit, configured to determine, after the reading unit 401 reads the instruction and according to the type information obtained by the pre-decoding unit, whether the type of the instruction is an unconditional indirect jump branch instruction.
  • the foregoing pre-decoding operation of the instruction to be read by the reading unit 401 can also be performed by other devices, which is not limited herein.
  • a prediction obtaining unit 402, configured to: when it is determined that the instruction read by the reading unit 401 satisfies the register prediction condition, acquire the predicted target jump address of that instruction from the first BTAC according to the register identifier in the instruction; and when it is determined that the instruction read by the reading unit 401 does not satisfy the register prediction condition, acquire the predicted target jump address of that instruction from the second BTAC according to the PC of the instruction.
  • the above register prediction conditions include: The type of the instruction is an unconditional indirect jump branch instruction.
  • the foregoing register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
  • The branch prediction apparatus 400 further includes a determining unit, configured to: when the type of the instruction read by the reading unit 401 is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction read by the reading unit 401 satisfies the register prediction condition; and when the type of the instruction read by the reading unit 401 is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.
  • When a high-level language is compiled, if the function called during compilation is a standard library function, the branch prediction device specifies the type of the compiled instruction as BLR. In that case, on the basis of the branch prediction apparatus shown in FIG. 4, the branch prediction apparatus may further include: a compiling unit, configured to compile the high-level language; and a specifying unit, configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction, when the function called while the compiling unit compiles the high-level language is a standard library function.
  • A high-level language in the embodiment of the present invention is defined relative to assembly language: it is programming that is closer to natural language and mathematical formulas, is largely independent of the machine's hardware system, and lets programs be written in a form that is easier to understand.
  • The high-level language in the embodiment of the present invention does not refer to one particular language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, Easy Language, etc.
  • the standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to a high-level language standard.
  • the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc.
  • The branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • The branch prediction apparatus in the embodiment of the present invention may serve as the branch prediction apparatus in the foregoing method embodiments and may be used to implement all the technical solutions in those embodiments; the functions of its functional modules may be implemented according to the methods in the foregoing method embodiments.
  • the first BTAC and the second BTAC are provided. The first BTAC uses the register identifier as its index (that is, the first BTAC stores correspondence information between register identifiers and predicted target jump addresses), and the second BTAC uses the PC as its index (that is, the second BTAC stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used for branch prediction.
  • because unconditional indirect jump branch instructions that have the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions that share a target jump address in the same entry of the first BTAC does not affect the accuracy of branch prediction. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes it possible to share BTAC resources while preserving that accuracy.
  • the embodiment of the present invention further provides a computer storage medium that stores a program; when executed, the program performs some or all of the branch prediction method described in the foregoing method embodiments.
  • the embodiment of the present invention provides another branch prediction apparatus.
  • the branch prediction apparatus 500 in the embodiment of the present invention includes:
  • the input device 501, the output device 502, the memory 503, and the processor 504 (the number of processors of the branch prediction device may be one or more, and FIG. 5 takes a processor as an example).
  • the input device 501, the output device 502, the memory 503, and the processor 504 may be connected by a bus or other means, as exemplified by a bus connection as shown in FIG.
  • the memory 503 is used to store data input from the input device 501, and may also store information such as files needed for processing by the processor 504; the input device 501 and the output device 502 may include ports through which the branch prediction device 500 communicates with other devices, and may also include devices external to the branch prediction device 500, such as a display, a keyboard, a mouse, and a printer.
  • the input device 501 may include a mouse, a keyboard, and the like.
  • the output device 502 may include a display and the like.
  • the processor 504 includes a first BTAC and a second BTAC. The first BTAC stores correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores correspondence information between a field of the PC and predicted target jump addresses; optionally, the second BTAC stores correspondence information between a partial field of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses.
  • Processor 504 performs the following steps:
  • the above register prediction conditions include: The type of the instruction is an unconditional indirect jump branch instruction.
  • the foregoing register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
  • the program may be stored in a computer-readable storage medium; for example, the storage medium may include a read-only memory, a random access memory, a magnetic disk, an optical disc, and the like.

Abstract

A branch predicting method and a related apparatus. The method is applied in a processor that comprises: a first BTAC that stores information about a mapping between a register identifier and a predicted target jump-to address, and a second BTAC that stores information about a mapping between a field of a program counter and a predicted target jump-to address. The branch predicting method comprises: reading an instruction from an instruction cache; if it is determined that the instruction satisfies a register prediction condition, obtaining a predicted target jump-to address of the instruction from the first BTAC according to a register identifier of the instruction; and if it is determined that the instruction does not satisfy the register prediction condition, obtaining a predicted target jump-to address of the instruction from the second BTAC according to a program counter of the instruction. The problem that branch prediction accuracy is affected during BTAC sharing is thereby effectively solved.

Description

Branch Prediction Method and Related Apparatus
This application claims priority to Chinese Patent Application No. 201310367653.9, filed with the Chinese Patent Office on August 21, 2013 and entitled "Branch prediction method and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer systems, and in particular to a branch prediction method and a related apparatus.
Background
Most current processors use a pipelined structure so that a sequentially ordered instruction stream can be executed in parallel. This greatly improves the processor's execution efficiency. Ideally, each stage of the pipeline takes only one clock cycle, so one instruction can complete in every clock cycle. In practice the situation is less ideal, because dependencies between instructions limit the parallelism of instruction execution. Data dependencies, control dependencies (for example, branch instructions), resource contention, interrupts, and other factors all reduce instruction parallelism.
Real programs contain branch instructions, and the behavior of a branch instruction often cannot be determined until the back end of the pipeline. A branch instruction may therefore create a control hazard that stalls the pipeline, and the processor cannot determine the address from which to fetch the next instruction until the branch instruction has finished executing. Most processors employ some form of branch prediction mechanism so that the jump direction and the target jump address of a conditional branch instruction can be predicted at the front end of the pipeline, allowing the processor to fetch and execute instructions speculatively. If the prediction is correct, or the prediction accuracy is high, the processor's performance and power consumption can be improved considerably. If the prediction is wrong, the speculatively fetched instructions cannot be executed: the wrong instructions must be flushed from the buffer, and instructions must then be fetched again from the correct address and executed.
A branch target address cache (BTAC) is used to predict the target jump address of an indirect jump branch instruction. The BTAC has a cache structure: part of the instruction's program counter (PC) serves as the index and part as the tag, for example the low 8 bits of the PC as the index and the high 8 bits as the tag. Each entry of the BTAC corresponds to one index and one tag, and each entry has a valid bit that records whether the entry holds valid history information (the history information being the predicted target jump address). The target jump address stored in an entry is a virtual address (VA). If the BTAC is full, then, as with a cache, a replacement algorithm must decide which least recently used entry can have its contents replaced.
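The index/tag/valid-bit organization described above can be sketched roughly as follows. This is a minimal illustration, not the patent's hardware: it assumes a 16-bit PC, a direct-mapped table (so the replacement policy reduces to overwriting the indexed entry rather than the least-recently-used choice mentioned in the text), and the names `Entry` and `PcIndexedBtac` are hypothetical.

```python
from dataclasses import dataclass

INDEX_BITS = 8  # low 8 bits of the PC select the entry (example from the text)
TAG_BITS = 8    # high 8 bits are kept as the tag (assumes a 16-bit PC)


@dataclass
class Entry:
    valid: bool = False  # does this entry hold usable history information?
    tag: int = 0
    target: int = 0      # predicted target jump address (a virtual address)


class PcIndexedBtac:
    """Hypothetical direct-mapped BTAC indexed by part of the PC."""

    def __init__(self):
        self.entries = [Entry() for _ in range(1 << INDEX_BITS)]

    @staticmethod
    def split(pc):
        index = pc & ((1 << INDEX_BITS) - 1)
        tag = (pc >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
        return index, tag

    def lookup(self, pc):
        index, tag = self.split(pc)
        entry = self.entries[index]
        # A hit needs both a set valid bit and a matching tag.
        return entry.target if entry.valid and entry.tag == tag else None

    def update(self, pc, target):
        index, tag = self.split(pc)
        self.entries[index] = Entry(True, tag, target)


btac = PcIndexedBtac()
btac.update(0x12A4, 0x8000)
hit = btac.lookup(0x12A4)   # same index and same tag
miss = btac.lookup(0x13A4)  # same index, different tag
```

In this sketch a lookup only succeeds when the stored tag matches, which is why two branches that share the low PC bits do not silently reuse each other's target.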
In a multi-threaded processor, a BTAC can be organized in two ways:
One is a shared BTAC: multiple threads share the same BTAC, and each thread uses its own PC to index the contents stored in it. Although this saves area, the index address of the BTAC is the PC of each thread, and the PCs of different threads may be identical. Storing the history information of different threads in the same BTAC therefore affects the accuracy of branch prediction.
The other is a private BTAC: each thread has its own BTAC, which predicts branch target jump addresses for that thread only. Although this improves prediction accuracy to some extent compared with the shared approach, it wastes a great deal of hardware resources and area.
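The trade-off between the two organizations can be illustrated with a toy model. The sketch below assumes plain dictionaries in place of hardware tables, and the PC and target values are made up for illustration.

```python
def train(table, pc, target):
    """Record the resolved target of the branch at `pc`."""
    table[pc] = target


# Shared BTAC: one table for all threads, keyed by PC only.
shared = {}
train(shared, 0x4000, 0xA000)        # thread 0: branch at PC 0x4000 jumps to 0xA000
shared_pred_t1 = shared.get(0x4000)  # thread 1 looks up the same PC value

# Private BTACs: one table per thread (more area, no cross-thread aliasing).
private = {0: {}, 1: {}}
train(private[0], 0x4000, 0xA000)
private_pred_t1 = private[1].get(0x4000)
```

Here `shared_pred_t1` is thread 0's target, which is wrong whenever thread 1's branch at the same PC goes elsewhere, while `private_pred_t1` is simply a miss; the price of the private scheme is one full table per thread.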
Whether shared or private, these BTACs have one thing in common: as long as the BTAC is not full, history information for every branch instruction is recorded using the PC of the branch instruction as the index. In real programs, however, the following situation is very likely: several different branch instructions jump to the same target address. For a standard library function such as "Printf" in C++, for example, the assembly instructions obtained after compilation show that different branch instructions always jump to the same target address. With a conventional BTAC structure, these different branch instructions still occupy multiple entries in the BTAC to record their history information, even though they all jump to the same target address.
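A small count makes the redundancy concrete. In this sketch the call-site PCs and the library function address are hypothetical values, and a dictionary stands in for a PC-indexed BTAC.

```python
# Hypothetical PCs of four different branch instructions that all call
# the same library function.
call_sites = [0x1000, 0x1040, 0x10C8, 0x2004]
LIBRARY_FUNCTION_ADDR = 0x7F00  # hypothetical target address

# A PC-indexed table records one entry per branch instruction.
pc_indexed = {pc: LIBRARY_FUNCTION_ADDR for pc in call_sites}

entries_used = len(pc_indexed)
distinct_targets = len(set(pc_indexed.values()))
```

Four entries are consumed to remember a single distinct target, which is the waste the paragraph above describes.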
As can be seen from the above, in a multi-threaded processor the conflict between sharing BTAC resources and branch prediction accuracy is very pronounced.
Summary
Aspects of the present invention provide a branch prediction method and a related apparatus, which are used to solve the problem that branch prediction accuracy is affected when a BTAC is shared.
To solve the foregoing technical problem, the following technical solutions are provided:
A first aspect of the present invention provides a branch prediction method applied to a processor. The processor includes a first branch target address prediction cache and a second branch target address prediction cache. The first branch target address prediction cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction cache stores one-to-one correspondence information between a field of the program counter and predicted target jump addresses. The branch prediction method includes:
reading an instruction from an instruction cache;
if it is determined that the read instruction satisfies a register prediction condition, obtaining the predicted target jump address of the read instruction from the first branch target address prediction cache according to the register identifier of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition, obtaining the predicted target jump address of the read instruction from the second branch target address prediction cache according to the program counter of the read instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Based on the first aspect, in a first possible implementation, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
Determining that the read instruction satisfies the register prediction condition is specifically:
when the type of the instruction is an unconditional indirect jump branch instruction and the register identifier in the instruction is a specific register identifier, determining that the read instruction satisfies the register prediction condition.
Determining that the read instruction does not satisfy the register prediction condition is specifically: when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, determining that the read instruction does not satisfy the register prediction condition.
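The two-part condition of this implementation can be written as a small predicate. The sketch below is illustrative only: the `Instr` record, the type string `"uncond_indirect"`, and the choice of register 30 as the "specific" register are assumptions, since the text does not say which registers qualify.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    kind: str           # e.g. "uncond_indirect", "cond_direct" (hypothetical labels)
    reg: Optional[int]  # register identifier, or None if the instruction has none


SPECIFIC_REGS = frozenset({30})  # hypothetical set of "specific" register identifiers


def satisfies_register_condition(instr, specific_regs=None):
    """Base condition: the type is unconditional indirect jump.

    If `specific_regs` is given, the register identifier must also be
    one of the specific identifiers (first possible implementation).
    """
    if instr.kind != "uncond_indirect":
        return False
    if specific_regs is None:
        return True
    return instr.reg in specific_regs


ok = satisfies_register_condition(Instr("uncond_indirect", 30), SPECIFIC_REGS)
wrong_reg = satisfies_register_condition(Instr("uncond_indirect", 5), SPECIFIC_REGS)
wrong_type = satisfies_register_condition(Instr("cond_direct", 30), SPECIFIC_REGS)
```

Either failing check (wrong type or non-specific register) routes the instruction to the PC-indexed cache, matching the "or" in the negative case above.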
Based on the first aspect, or the first possible implementation of the first aspect, in a second possible implementation, before the instruction is read from the instruction cache, the method includes:
pre-decoding the instruction to be read to obtain type information of the instruction to be read; and after the instruction is read, the method includes: determining, according to the obtained type information, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
Based on the first aspect, or the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation, before the instruction is read, if the function called when a high-level language is compiled is a standard library function, the type of the compiled instruction is specified as an unconditional indirect jump branch instruction.
A second aspect of the present invention provides a branch prediction apparatus applied to a processor. The processor includes a first branch target address prediction cache and a second branch target address prediction cache. The first branch target address prediction cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction cache stores one-to-one correspondence information between a partial field of the program counter and predicted target jump addresses, or between all fields of the program counter and predicted target jump addresses. The branch prediction apparatus includes:
a reading unit, configured to read an instruction from an instruction cache;
a prediction obtaining unit, configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, obtain the predicted target jump address of the instruction from the first branch target address prediction cache according to the register identifier of the instruction; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, obtain the predicted target jump address of the instruction from the second branch target address prediction cache according to the program counter of the instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Based on the second aspect of the present invention, in a first possible implementation, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
The branch prediction apparatus further includes:
a determining unit, configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the instruction does not satisfy the register prediction condition.
Based on the second aspect of the present invention, or the first possible implementation of the second aspect, in a second possible implementation, the branch prediction apparatus further includes:
a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit to obtain type information of that instruction;
a judging unit, configured to determine, after the reading unit reads the instruction and according to the type information obtained by the pre-decoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
Based on the second aspect of the present invention, or the first possible implementation of the second aspect, in a third possible implementation, the branch prediction apparatus further includes:
a compiling unit, configured to compile a high-level language;
a specifying unit, configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction when the function called while the compiling unit compiles the high-level language is a standard library function.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC uses the register identifier as its index (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC uses the PC as its index (that is, the second BTAC stores one-to-one correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions that have the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions that share a target jump address in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes it possible to share BTAC resources while preserving that accuracy.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of one embodiment of a branch prediction method provided by the present invention;
FIG. 2 is a schematic flowchart of another embodiment of a branch prediction method provided by the present invention;
FIG. 3 is a schematic flowchart of yet another embodiment of a branch prediction method provided by the present invention;
FIG. 4 is a schematic structural diagram of one embodiment of a branch prediction apparatus provided by the present invention;
FIG. 5 is a schematic structural diagram of another embodiment of a branch prediction apparatus provided by the present invention.
Detailed Description
The embodiments of the present invention provide a branch prediction method and a related apparatus.
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
A branch prediction method provided by an embodiment of the present invention is described below. The branch prediction method is applied to a processor that includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between a field of the PC and predicted target jump addresses. Optionally, the second BTAC stores the correspondence information between a partial field of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses. Referring to FIG. 1, the branch prediction method in this embodiment includes:
101. Read an instruction from the instruction cache.
In this embodiment, the instruction read in step 101 may be a branch instruction or a non-branch instruction. Branch instructions can generally be classified in two ways. One classification is by jump condition, into conditional and unconditional branch instructions: a conditional branch instruction performs the branch jump only when a certain condition is met, whereas an unconditional branch instruction needs no condition and always performs the branch jump. The other classification is by target jump address, into direct jump and indirect jump branch instructions: for a direct jump branch instruction, the offset of the target jump address is specified directly in the instruction by an immediate (that is, the number given in an immediate-addressing instruction), and the target jump address is computed as the PC of the branch instruction itself plus that immediate offset; for an indirect jump branch instruction, the target jump address is specified in a register.
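The two classification axes and the two ways of forming a target address can be sketched as follows. The enum names, the function signatures, and the register file modeled as a dictionary are all illustrative assumptions rather than anything defined in the text.

```python
from enum import Enum


class Condition(Enum):
    CONDITIONAL = "conditional"
    UNCONDITIONAL = "unconditional"


class Addressing(Enum):
    DIRECT = "direct"      # target = PC of the branch + immediate offset
    INDIRECT = "indirect"  # target is read from a register


def target_address(pc, addressing, imm=0, reg_id=None, regs=None):
    """Compute the actual target jump address of a branch instruction."""
    if addressing is Addressing.DIRECT:
        return pc + imm
    return regs[reg_id]


def is_taken(condition, condition_met):
    """An unconditional branch always jumps; a conditional one only if its condition holds."""
    return condition is Condition.UNCONDITIONAL or condition_met


direct = target_address(0x1000, Addressing.DIRECT, imm=-0x20)
indirect = target_address(0x1000, Addressing.INDIRECT, reg_id=30, regs={30: 0x7F00})
```

The indirect case is the one a BTAC exists for: the target depends on a register value that is not available when the instruction is fetched.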
Before an instruction is fetched from the level-2 cache or from memory into the instruction cache, it must be pre-decoded so that part of the pre-decoding result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, the pre-decoding stage must identify the type of the branch instruction (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) so that the corresponding branch prediction can be performed according to that type. After pre-decoding, the pre-decoding result (such as the type information of the instruction) is stored in the instruction cache together with the instruction. Note that the pre-decoding operation may be performed by the branch prediction apparatus or by another apparatus, which is not limited herein.
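One way to picture pre-decoding is as a classification step performed when a line is filled into the instruction cache. The sketch below is a loose model: instructions are represented as text, the mnemonic table is hypothetical (it is not a real instruction encoding), and a dictionary stands in for the instruction cache.

```python
# Hypothetical mnemonic-to-type table used only for this illustration.
PREDECODE_TABLE = {
    "b": "uncond_direct",
    "beq": "cond_direct",
    "blr": "uncond_indirect",
}


def predecode(instr_text):
    """Classify an instruction by its mnemonic; anything unknown is a non-branch."""
    mnemonic = instr_text.split()[0]
    return PREDECODE_TABLE.get(mnemonic, "non_branch")


def fill_icache(icache, pc, instr_text):
    """Store the instruction together with its pre-decoded type information."""
    icache[pc] = (instr_text, predecode(instr_text))


icache = {}
fill_icache(icache, 0x1000, "add r1, r2, r3")
fill_icache(icache, 0x1004, "blr r30")
```

Because the type travels with the instruction, the predictor can pick a lookup path as soon as the instruction is read, without waiting for full decode.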
The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
102. When the read instruction satisfies the register prediction condition, obtain the predicted target jump address of the read instruction from the first BTAC according to the register identifier in the read instruction.
103. When the read instruction does not satisfy the register prediction condition, obtain the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction.
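Steps 101 to 103 amount to a two-way dispatch, sketched below under simplifying assumptions: dictionaries stand in for the instruction cache and both BTACs, the second BTAC is indexed by the low 8 bits of the PC with the tag check omitted, and every name and address is hypothetical.

```python
PC_INDEX_MASK = 0xFF  # assumed: low 8 bits of the PC index the second BTAC


def predict_target(pc, icache, first_btac, second_btac):
    # Step 101: read the instruction (with its pre-decoded type) from the instruction cache.
    instr = icache[pc]
    # Register prediction condition: unconditional indirect jump branch instruction.
    if instr["type"] == "uncond_indirect":
        # Step 102: look up the first BTAC by register identifier.
        return first_btac.get(instr["reg"])
    # Step 103: look up the second BTAC by (part of) the program counter.
    return second_btac.get(pc & PC_INDEX_MASK)


icache = {
    0x1004: {"type": "uncond_indirect", "reg": 30},
    0x2008: {"type": "uncond_indirect", "reg": 30},
    0x3010: {"type": "cond_direct", "reg": None},
}
first_btac = {30: 0x7F00}     # register identifier -> predicted target
second_btac = {0x10: 0x3400}  # PC field -> predicted target

p1 = predict_target(0x1004, icache, first_btac, second_btac)
p2 = predict_target(0x2008, icache, first_btac, second_btac)
p3 = predict_target(0x3010, icache, first_btac, second_btac)
```

The two indirect branches at different PCs are served by the same first-BTAC entry, while the conditional branch falls through to the PC-indexed lookup.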
本发明实施例在处理器中设置第一 BTAC, 为便于描述, 下面将第 一 BTAC描述为 SBTAC , 将第二 BTAC简称为 BTAC。 SBTAC的硬件 构造与 BTAC相类似, 不同的是, BTAC是以 PC的一部分字段或者全 部字段作为索引, 而 SBTAC是以寄存器标识作为索引。 由于 SBTAC中 存储着寄存器标识与预测目标跳转地址的——对应关系信息, 因此, 分 支预测装置能够根据上述指令中的寄存器标识 ,从 SBTAC中找到与该寄 存器标识相对应的预测目标跳转地址。  In the embodiment of the present invention, the first BTAC is set in the processor. For convenience of description, the first BTAC is described as SBTAC, and the second BTAC is simply referred to as BTAC. The hardware structure of SBTAC is similar to that of BTAC. The difference is that BTAC is indexed by a part of the PC field or all fields, and SBTAC is indexed by register identifier. Since the SBPAC stores the correspondence information of the register identifier and the predicted target jump address, the branch prediction device can find the predicted target jump address corresponding to the register identifier from the SBTAC according to the register identifier in the above instruction. .
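The two lookup paths of steps 102 and 103 can be sketched as follows. This is a simplified software model, not the hardware described in the specification: both tables are plain dictionaries, the PC index is assumed to be the low bits of a word-aligned PC, and the `update` method is an added assumption about how entries might be trained.

```python
from typing import Optional


class DualBtacPredictor:
    """Toy model of an SBTAC (register-indexed) next to a BTAC (PC-indexed)."""

    def __init__(self, pc_index_bits: int = 10):
        self.sbtac = {}  # register identifier -> predicted target jump address
        self.btac = {}   # PC index -> predicted target jump address
        self._pc_mask = (1 << pc_index_bits) - 1

    def _pc_index(self, pc: int) -> int:
        # Assumption: 4-byte instructions, index taken from part of the PC.
        return (pc >> 2) & self._pc_mask

    def predict(self, pc: int, is_uncond_indirect: bool,
                reg_id: Optional[int] = None) -> Optional[int]:
        # Step 102: the register prediction condition holds, so use the SBTAC.
        if is_uncond_indirect and reg_id is not None:
            return self.sbtac.get(reg_id)
        # Step 103: otherwise fall back to the PC-indexed BTAC.
        return self.btac.get(self._pc_index(pc))

    def update(self, pc: int, is_uncond_indirect: bool, target: int,
               reg_id: Optional[int] = None) -> None:
        # Hypothetical training step: record the resolved target in the same table
        # that predict() would consult for this instruction.
        if is_uncond_indirect and reg_id is not None:
            self.sbtac[reg_id] = target
        else:
            self.btac[self._pc_index(pc)] = target
```

In this model a branch at one PC that trains the SBTAC entry for a register also serves a later branch at a different PC that names the same register.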
In one application scenario, before step 101, if a function called when a high-level language is compiled is a standard library function, the branch prediction apparatus designates the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a Branch and Link Register (BLR) instruction. A BLR instruction is an unconditional indirect jump branch instruction that results from a subroutine call or function call and always returns; the return address is stored in the link register. It should be noted that "high-level language" in the embodiments of the present invention is used in contrast to assembly language: it is programming that is closer to natural language and mathematical formulas, is largely independent of the machine's hardware system, and lets programs be written in a way that is easier for people to understand. The term does not refer to any one specific language and may cover many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language (易语言). The standard library functions in the embodiments of the present invention belong to a library composed of basic functions written in advance according to a high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or may be any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multi-threaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, a first BTAC and a second BTAC are provided in the embodiments of the present invention. The first BTAC is indexed by register identifier (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, the second BTAC stores one-to-one correspondence information between some fields of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, the history information of multiple unconditional indirect jump branch instructions with the same target jump address can be stored in the same entry of the first BTAC without affecting the accuracy of branch prediction, so that BTAC resources can be shared while the accuracy of branch prediction is preserved.
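The resource-sharing argument can be made concrete with a small count. The sketch below assumes, as the text states, that branches naming the same register have the same target; the branch list and the function name `entries_used` are made up for illustration.

```python
def entries_used(branches):
    """Count table entries needed under PC indexing and under register indexing.

    `branches` is an iterable of (pc, reg_id, target) tuples for unconditional
    indirect jump branch instructions.
    """
    pc_indexed = {}
    reg_indexed = {}
    for pc, reg_id, target in branches:
        pc_indexed[pc] = target       # one entry per distinct branch address
        reg_indexed[reg_id] = target  # one entry per distinct register identifier
    return len(pc_indexed), len(reg_indexed)


# Hypothetical trace: three call sites use register 30, one uses register 17.
calls = [
    (0x1000, 30, 0x8000),
    (0x1040, 30, 0x8000),
    (0x2000, 30, 0x8000),
    (0x2100, 17, 0x9000),
]
```

For this trace, `entries_used(calls)` gives four entries under PC indexing and two under register indexing, because the three branches through register 30 share one entry.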
In the foregoing embodiment, the register identifiers of all registers are used as indexes for the predicted target jump addresses in the SBTAC. Alternatively, only the register identifiers of some registers may be used as indexes. In that case the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. When the type of the read instruction is an unconditional indirect jump branch instruction and the register identifier in the read instruction is a specific register identifier, it is determined that the read instruction satisfies the register prediction condition. When the type of the read instruction is not an unconditional indirect jump branch instruction, or the register identifier in the read instruction is not a specific register identifier, it is determined that the read instruction does not satisfy the register prediction condition. As shown in FIG. 2, the branch prediction method in this embodiment of the present invention includes:
201. Read an instruction from the instruction cache.
Before an instruction is fetched from the level-2 cache or memory into the instruction cache, it is pre-decoded so that part of the pre-decoding result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, the pre-decoding stage identifies the type of the branch instruction (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) so that the corresponding branch prediction can be performed according to that type. After pre-decoding, the pre-decoding result is stored in the instruction cache together with the instruction. It should be noted that the pre-decoding operation may be performed by the branch prediction apparatus or by another apparatus, which is not limited herein.
202. When the type of the read instruction is an unconditional indirect jump branch instruction, determine whether the register identifier in the read instruction is a specific register identifier.
If the branch prediction apparatus determines that the register identifier in the read instruction is a specific register identifier, step 203 is performed; if it determines that the register identifier in the read instruction is not a specific register identifier, step 204 is performed.
203. Obtain the predicted target jump address of the read instruction from the first BTAC according to the register identifier in the read instruction.
The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses.
204. When the type of the read instruction is not an unconditional indirect jump branch instruction, obtain the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction.
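The decision made in steps 202 to 204 reduces to a two-part predicate. The sketch below is illustrative only: the set `SPECIFIC_REGISTERS` and its contents are a hypothetical choice, since the specification does not say which registers are treated as specific.

```python
from typing import Optional

# Hypothetical set of register identifiers covered by the first BTAC.
SPECIFIC_REGISTERS = frozenset({30})


def satisfies_register_prediction_condition(
        is_uncond_indirect: bool,
        reg_id: Optional[int],
        specific=SPECIFIC_REGISTERS) -> bool:
    """True only for an unconditional indirect jump that names a specific register."""
    return bool(is_uncond_indirect and reg_id in specific)


def select_btac(is_uncond_indirect: bool, reg_id: Optional[int]) -> str:
    """Return which table the lookup goes to: "first" (step 203) or "second" (step 204)."""
    if satisfies_register_prediction_condition(is_uncond_indirect, reg_id):
        return "first"
    return "second"
```

An unconditional indirect jump through a register outside the set falls through to the second BTAC, which matches the branch from step 202 to step 204.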
In one application scenario, before step 201, if a function called when a high-level language is compiled is a standard library function, the branch prediction apparatus designates the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction. A BLR instruction is an unconditional indirect jump branch instruction that results from a subroutine call or function call and always returns; the return address is stored in the link register. It should be noted that "high-level language" in the embodiments of the present invention is used in contrast to assembly language: it is programming that is closer to natural language and mathematical formulas, is largely independent of the machine's hardware system, and lets programs be written in a way that is easier for people to understand. The term does not refer to any one specific language and may cover many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language (易语言). The standard library functions in the embodiments of the present invention belong to a library composed of basic functions written in advance according to a high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or may be any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multi-threaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, a first BTAC and a second BTAC are provided in the embodiments of the present invention. The first BTAC is indexed by register identifier (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, the second BTAC stores one-to-one correspondence information between some fields of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions with the same target jump address in the same entry of the first BTAC does not affect the accuracy of branch prediction. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting the accuracy of branch prediction, which makes it possible to share BTAC resources while preserving that accuracy.
For a branch instruction in a standard library function, the target jump address usually does not change. Therefore, to ensure that the contents of the SBTAC do not need to be updated or invalidated when software processes are switched, the embodiments of the present invention use the SBTAC for branch prediction of branch instructions in standard library functions. As shown in FIG. 3, the branch prediction method in this embodiment of the present invention includes:
301. Compile the high-level language.
302. Determine whether a standard library function is called.
Whether a standard library function is called can be determined during compilation. If no standard library function is called, step 303 is performed; if a standard library function is called, step 304 is performed.
303. Designate the type of the compiled instruction as another instruction type, and store the instruction in the level-2 cache or memory.
304. Designate the type of the compiled instruction as a BLR instruction, and store the instruction in the level-2 cache or memory.
Steps 305 to 308 are similar to steps 201 to 204 in the embodiment shown in FIG. 2. For their specific implementation, refer to the descriptions of the corresponding steps; details are not repeated here.
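The compile-time part of this flow (steps 301 to 304) can be sketched as a simple classification of call targets. This is a rough illustration under stated assumptions: the set `STANDARD_LIBRARY_FUNCTIONS` is a hypothetical stand-in for whatever the compiler knows about the standard library, and the string tags "BLR" and "OTHER" are illustrative labels rather than real instruction encodings.

```python
# Hypothetical list of standard library functions known to the compiler.
STANDARD_LIBRARY_FUNCTIONS = frozenset({"printf", "memcpy", "strlen"})


def designate_instruction_type(callee: str) -> str:
    """Steps 302 to 304: tag a call as "BLR" if it targets a standard library function."""
    if callee in STANDARD_LIBRARY_FUNCTIONS:
        return "BLR"    # step 304
    return "OTHER"      # step 303


def compile_calls(callees):
    """Step 301, reduced to the one decision that matters here: tagging each call."""
    return [(callee, designate_instruction_type(callee)) for callee in callees]
```

For example, `compile_calls(["memcpy", "my_helper"])` tags the first call as "BLR" and the second as "OTHER", so only the library call would later be predicted through the SBTAC.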
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or may be any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multi-threaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, a first BTAC and a second BTAC are provided in the embodiments of the present invention. The first BTAC is indexed by register identifier (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, the second BTAC stores one-to-one correspondence information between some fields of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions with the same target jump address in the same entry of the first BTAC does not affect the accuracy of branch prediction. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting the accuracy of branch prediction, which makes it possible to share BTAC resources while preserving that accuracy.
An embodiment of the present invention further provides a branch prediction apparatus applied to a processor. The processor includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the PC and predicted target jump addresses. Optionally, the second BTAC stores one-to-one correspondence information between some fields of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses. As shown in FIG. 4, the branch prediction apparatus 400 in this embodiment of the present invention includes:
a reading unit 401, configured to read an instruction from the instruction cache;
Before an instruction is fetched from the level-2 cache or memory into the instruction cache, it is pre-decoded so that part of the pre-decoding result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, the pre-decoding stage identifies the type of the branch instruction (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) so that the corresponding branch prediction can be performed according to that type. After pre-decoding, the pre-decoding result (for example, the type information of the instruction) is stored in the instruction cache together with the instruction. In one implementation, the pre-decoding operation may be performed by the branch prediction apparatus, in which case the branch prediction apparatus in this embodiment may further include: a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit 401 to obtain the type information of that instruction; and a judging unit, configured to judge, after the reading unit 401 reads the instruction and according to the type information obtained by the pre-decoding unit, whether the type of the instruction is an unconditional indirect jump branch instruction. The pre-decoding of the instruction to be read by the reading unit 401 may also be performed by another apparatus, which is not limited herein.
a prediction obtaining unit 402, configured to: when it is determined that the instruction read by the reading unit 401 satisfies the register prediction condition, obtain the predicted target jump address of that instruction from the first BTAC according to the register identifier in the instruction; and when it is determined that the instruction read by the reading unit 401 does not satisfy the register prediction condition, obtain the predicted target jump address of that instruction from the second BTAC according to the PC of the instruction. The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Optionally, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. In that case the branch prediction apparatus 400 further includes a determining unit, configured to: when the type of the instruction read by the reading unit 401 is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction satisfies the register prediction condition; and when the type of the instruction read by the reading unit 401 is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the instruction does not satisfy the register prediction condition.
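The division of work among the units of apparatus 400 can be mirrored in a small software model. This is only an analogy for hardware units: the class names follow the unit names in the text, while the dictionary-based instruction records, the `"uncond_indirect"` type tag, and the use of the full PC as the second BTAC key are assumptions made for the sketch.

```python
class ReadingUnit:
    """Stands in for reading unit 401: reads an instruction from the instruction cache."""

    def __init__(self, icache):
        self.icache = icache  # hypothetical model: PC -> instruction record (dict)

    def read(self, pc):
        return self.icache[pc]


class DeterminingUnit:
    """Decides whether an instruction satisfies the register prediction condition."""

    def __init__(self, specific_reg_ids):
        self.specific_reg_ids = frozenset(specific_reg_ids)

    def satisfies(self, instr):
        return (instr["type"] == "uncond_indirect"
                and instr.get("reg_id") in self.specific_reg_ids)


class PredictionObtainingUnit:
    """Stands in for prediction obtaining unit 402: picks the table and looks up the target."""

    def __init__(self, first_btac, second_btac):
        self.first_btac = first_btac    # register identifier -> predicted target
        self.second_btac = second_btac  # PC -> predicted target

    def obtain(self, instr, satisfied):
        if satisfied:
            return self.first_btac.get(instr["reg_id"])
        return self.second_btac.get(instr["pc"])


class BranchPredictionApparatus:
    """Wires the three units together, loosely following FIG. 4."""

    def __init__(self, icache, first_btac, second_btac, specific_reg_ids):
        self.reading_unit = ReadingUnit(icache)
        self.determining_unit = DeterminingUnit(specific_reg_ids)
        self.prediction_unit = PredictionObtainingUnit(first_btac, second_btac)

    def predict(self, pc):
        instr = self.reading_unit.read(pc)
        satisfied = self.determining_unit.satisfies(instr)
        return self.prediction_unit.obtain(instr, satisfied)
```

Keeping the condition check in its own unit means the set of specific registers can change without touching the lookup logic.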
In one application scenario, if a function called when a high-level language is compiled is a standard library function, the branch prediction apparatus designates the type of the compiled instruction as BLR. In this case, on the basis of the branch prediction apparatus shown in FIG. 4, the branch prediction apparatus may further include: a compiling unit, configured to compile the high-level language; and a designating unit, configured to designate, when a function called while the compiling unit compiles the high-level language is a standard library function, the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction. It should be noted that "high-level language" in the embodiments of the present invention is used in contrast to assembly language: it is programming that is closer to natural language and mathematical formulas, is largely independent of the machine's hardware system, and lets programs be written in a way that is easier for people to understand. The term does not refer to any one specific language and may cover many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language (易语言). The standard library functions in the embodiments of the present invention belong to a library composed of basic functions written in advance according to a high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or may be any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multi-threaded processor or to a single-threaded processor, which is not limited herein.
It should be noted that the branch prediction apparatus in this embodiment of the present invention may be the branch prediction apparatus described in the foregoing method embodiments and may be used to implement all the technical solutions in those embodiments. The functions of its functional modules may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the related descriptions there. Details are not repeated here.
As can be seen from the above, a first BTAC and a second BTAC are provided in the embodiments of the present invention. The first BTAC is indexed by register identifier (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, the second BTAC stores one-to-one correspondence information between some fields of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions with the same target jump address in the same entry of the first BTAC does not affect the accuracy of branch prediction. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting the accuracy of branch prediction, which makes it possible to share BTAC resources while preserving that accuracy.
An embodiment of the present invention further provides a computer storage medium. The computer storage medium stores a program, and execution of the program includes some or all of the arrangements of the branch prediction method and branch prediction apparatus described in the foregoing method embodiments.
An embodiment of the present invention provides another branch prediction apparatus. As shown in FIG. 5, the branch prediction apparatus 500 in this embodiment of the present invention includes:
an input device 501, an output device 502, a memory 503, and a processor 504 (the branch prediction apparatus may have one or more processors; FIG. 5 uses one processor as an example). In some embodiments of the present invention, the input device 501, the output device 502, the memory 503, and the processor 504 may be connected by a bus or in another manner; FIG. 5 uses a bus connection as an example. The memory 503 stores data input from the input device 501 and may also store information such as files that the processor 504 needs in order to process data. The input device 501 and the output device 502 may include ports through which the branch prediction apparatus 500 communicates with other devices, and may also include devices externally connected to the branch prediction apparatus 500, such as a display, a keyboard, a mouse, and a printer. Specifically, the input device 501 may include a mouse and a keyboard, and the output device 502 may include a display.
The processor 504 includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the PC and predicted target jump addresses. Optionally, the second BTAC stores one-to-one correspondence information between some fields of the PC and predicted target jump addresses, or between all fields of the PC and predicted target jump addresses.
The processor 504 performs the following steps:
reading an instruction from the instruction cache;
if it is determined that the read instruction satisfies the register prediction condition:
obtaining the predicted target jump address of the read instruction from the first branch target address cache (the first BTAC) according to the register identifier of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition:
obtaining the predicted target jump address of the read instruction from the second branch target address cache (the second BTAC) according to the program counter of the read instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Optionally, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
需要说明的是, 本发明实施例中的寄存器标识可以为寄存器号, 或 者, 寄存器标识也可以是其它能够用于指示寄存器的代码或者符号等, 本发明实施例中的分支预测方法可以应用于多线程处理器中, 也可以应 用于单线程处理器中, 此处不作限定。  It should be noted that the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc. The branch prediction method in the embodiment of the present invention may be applied to multiple The thread processor can also be applied to a single-thread processor, which is not limited herein.
需要说明的是, 本发明实施例中的分支预测装置可以如上述方法实 施例中的分支预测装置, 可以用于实现上述方法实施例中的全部技术方 案,其各个功能模块的功能可以根据上述方法实施例中的方法具体实现, 其具体实现过程可参照上述方法实施例中的相关描述, 此处不再赘述。  It should be noted that the branch prediction apparatus in the embodiment of the present invention may be used as the branch prediction apparatus in the foregoing method embodiment, and may be used to implement all the technical solutions in the foregoing method embodiments, and the functions of the respective functional modules may be according to the foregoing method. The method in the embodiment is specifically implemented. For the specific implementation process, reference may be made to the related description in the foregoing method embodiments, and details are not described herein again.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, the first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, the second BTAC stores one-to-one correspondence information between a partial field of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple unconditional indirect jump branch instructions with the same target jump address in the same entry of the first BTAC does not affect the accuracy of branch prediction. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes it possible to share BTAC resources while preserving that accuracy.
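The two-table arrangement summarized above can be sketched in software as follows. This is an illustrative model only, not the hardware of the embodiments: the class name `DualBTAC`, the dictionary-based tables, and the choice of the low-order `pc_index_bits` bits as the "partial field" of the program counter are all assumptions.

```python
from typing import Dict, Optional


class DualBTAC:
    """Toy model of a first (register-indexed) and second (PC-indexed) BTAC."""

    def __init__(self, pc_index_bits: int = 10) -> None:
        # First BTAC: register identifier -> predicted target jump address.
        self.first: Dict[int, int] = {}
        # Second BTAC: partial PC field -> predicted target jump address.
        self.second: Dict[int, int] = {}
        # Assumption: the partial field is the low-order bits of the PC.
        self.pc_mask = (1 << pc_index_bits) - 1

    def predict(self, pc: int, reg_id: Optional[int],
                meets_condition: bool) -> Optional[int]:
        """Return the predicted target, or None if no entry exists."""
        if meets_condition:
            return self.first.get(reg_id)
        return self.second.get(pc & self.pc_mask)

    def update(self, pc: int, reg_id: Optional[int],
               meets_condition: bool, target: int) -> None:
        """Record the resolved target in whichever table serves the instruction."""
        if meets_condition:
            self.first[reg_id] = target
        else:
            self.second[pc & self.pc_mask] = target
```

In this model, two unconditional indirect jumps at different PCs that use the same register identifier share one entry of the first table, which mirrors the resource sharing the paragraph above describes.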
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
A person of ordinary skill in the art can understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, which may include, for example, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The branch prediction method and related apparatus provided by the present invention have been described in detail above. A person of ordinary skill in the art may, based on the ideas of the embodiments of the present invention, make changes to the specific implementations and the scope of application. The content of this specification should not be construed as limiting the present invention.

Claims

1. A branch prediction method, applied to a processor, wherein the processor comprises a first branch target address prediction cache and a second branch target address prediction cache, the first branch target address prediction cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction cache stores one-to-one correspondence information between a field of the program counter and predicted target jump addresses, the branch prediction method comprising:
reading an instruction from an instruction cache;
if it is determined that the read instruction satisfies a register prediction condition:
obtaining, from the first branch target address prediction cache according to the register identifier of the read instruction, the predicted target jump address of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition:
obtaining, from the second branch target address prediction cache according to the program counter of the read instruction, the predicted target jump address of the read instruction;
wherein the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
2. The method according to claim 1, wherein
the register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
the determining that the read instruction satisfies the register prediction condition is specifically:
when the type of the instruction is an unconditional indirect jump branch instruction and the register identifier in the instruction is a specific register identifier, determining that the read instruction satisfies the register prediction condition;
the determining that the read instruction does not satisfy the register prediction condition is specifically:
when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, determining that the read instruction does not satisfy the register prediction condition.
3. The method according to claim 1 or 2, wherein
before the reading of the instruction from the instruction cache, the method comprises:
pre-decoding the instruction to be read to obtain type information of the instruction to be read;
and after the reading of the instruction, the method comprises: determining, according to the obtained type information of the instruction, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
4. The method according to any one of claims 1 to 3, wherein
before the reading of the instruction, if a function called when a high-level language is compiled is a standard library function, the type of the compiled instruction is designated as an unconditional indirect jump branch instruction.
5. A branch prediction apparatus, applied to a processor, wherein the processor comprises a first branch target address prediction cache and a second branch target address prediction cache, the first branch target address prediction cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction cache stores one-to-one correspondence information between a partial field of the program counter and predicted target jump addresses, or one-to-one correspondence information between all fields of the program counter and predicted target jump addresses, the branch prediction apparatus comprising:
a reading unit, configured to read an instruction from an instruction cache;
a prediction obtaining unit, configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, obtain, from the first branch target address prediction cache according to the register identifier of the instruction read by the reading unit, the predicted target jump address of the instruction read by the reading unit; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, obtain, from the second branch target address prediction cache according to the program counter of the instruction read by the reading unit, the predicted target jump address of the instruction read by the reading unit;
wherein the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
6. The branch prediction apparatus according to claim 5, wherein
the register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
the branch prediction apparatus further comprises:
a determining unit, configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in the instruction read by the reading unit is a specific register identifier, determine that the instruction read by the reading unit satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in the instruction read by the reading unit is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.
7. The branch prediction apparatus according to claim 5 or 6, wherein
the branch prediction apparatus further comprises:
a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit to obtain type information of the instruction to be read by the reading unit;
a judging unit, configured to: after the reading unit reads the instruction, determine, according to the type information of the instruction obtained by the pre-decoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
8. The branch prediction apparatus according to claim 5 or 6, wherein
the branch prediction apparatus further comprises:
a compiling unit, configured to compile a high-level language;
a designating unit, configured to: when a function called while the compiling unit compiles the high-level language is a standard library function, designate the type of the compiled instruction as an unconditional indirect jump branch instruction.
PCT/CN2014/083882 2013-08-21 2014-08-07 Branch predicting method and related apparatus WO2015024452A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310367653.9 2013-08-21
CN201310367653.9A CN104423929B (en) 2013-08-21 2013-08-21 A kind of branch prediction method and relevant apparatus

Publications (1)

Publication Number Publication Date
WO2015024452A1 true WO2015024452A1 (en) 2015-02-26

Family

ID=52483061

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/083882 WO2015024452A1 (en) 2013-08-21 2014-08-07 Branch predicting method and related apparatus

Country Status (2)

Country Link
CN (1) CN104423929B (en)
WO (1) WO2015024452A1 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10534609B2 (en) 2017-08-18 2020-01-14 International Business Machines Corporation Code-specific affiliated register prediction
US10558461B2 (en) 2017-08-18 2020-02-11 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10564974B2 (en) 2017-08-18 2020-02-18 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10579385B2 (en) 2017-08-18 2020-03-03 International Business Machines Corporation Prediction of an affiliated register
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10691600B2 (en) 2017-09-19 2020-06-23 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10713050B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing Table of Contents (TOC)-setting instructions in code with TOC predicting instructions
US10831457B2 (en) 2017-09-19 2020-11-10 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10884929B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10884748B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US10901741B2 (en) 2017-08-18 2021-01-26 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10908911B2 (en) 2017-08-18 2021-02-02 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US11061575B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US11150904B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
CN117093272A (en) * 2023-10-07 2023-11-21 飞腾信息技术有限公司 Instruction sending method and processor

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106155928A (en) * 2015-04-13 2016-11-23 上海芯豪微电子有限公司 A kind of storage hierarchy pre-fetching system and method
WO2016155623A1 (en) * 2015-03-30 2016-10-06 上海芯豪微电子有限公司 Information-push-based information system and method
CN105867880B (en) * 2016-04-01 2018-12-04 中国科学院计算技术研究所 It is a kind of towards the branch target buffer and design method that jump branch prediction indirectly
CN108062236A (en) * 2016-11-07 2018-05-22 杭州华为数字技术有限公司 A kind of software-hardware synergism branch instruction predictions method and device
CN109308191B (en) * 2017-07-28 2021-09-14 华为技术有限公司 Branch prediction method and device
CN111176729A (en) * 2018-11-13 2020-05-19 深圳市中兴微电子技术有限公司 Information processing method and device and computer readable storage medium
CN111209044B (en) * 2018-11-21 2022-11-25 展讯通信(上海)有限公司 Instruction compression method and device
CN111625280B (en) * 2019-02-27 2023-08-04 上海复旦微电子集团股份有限公司 Instruction control method and device and readable storage medium
CN110347432B (en) * 2019-06-17 2021-09-14 海光信息技术股份有限公司 Processor, branch predictor, data processing method thereof and branch prediction method
CN111638913B (en) * 2019-09-19 2023-05-12 中国科学院信息工程研究所 Processor chip branch predictor security enhancement method based on randomized index and electronic device
CN111026442B (en) * 2019-12-17 2022-08-02 天津国芯科技有限公司 Method and device for eliminating program unconditional jump overhead in CPU
CN111258649B (en) * 2020-01-21 2022-03-01 Oppo广东移动通信有限公司 Processor, chip and electronic equipment
CN111538535B (en) * 2020-04-28 2021-09-21 支付宝(杭州)信息技术有限公司 CPU instruction processing method, controller and central processing unit
CN112613039B (en) * 2020-12-10 2022-09-09 成都海光微电子技术有限公司 Performance optimization method and device for ghost vulnerability
CN113722243A (en) * 2021-09-03 2021-11-30 苏州睿芯集成电路科技有限公司 Advanced prediction method for direct jump and branch instruction tracking cache
CN115480826B (en) * 2022-09-21 2024-03-12 海光信息技术股份有限公司 Branch predictor, branch prediction method, branch prediction device and computing equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110055529A1 (en) * 2009-08-28 2011-03-03 Via Technologies, Inc. Efficient branch target address cache entry replacement
CN102117198A (en) * 2009-12-31 2011-07-06 上海芯豪微电子有限公司 Branch processing method
CN102662640A (en) * 2012-04-12 2012-09-12 苏州睿云智芯微电子有限公司 Double-branch target buffer and branch target processing system and processing method
CN103150142A (en) * 2011-12-07 2013-06-12 苹果公司 Next fetch predictor training with hysteresis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10133874A (en) * 1996-11-01 1998-05-22 Mitsubishi Electric Corp Branch predicting mechanism for superscalar processor
US20070294518A1 (en) * 2006-06-14 2007-12-20 Shen-Chang Wang System and method for predicting target address of branch instruction utilizing branch target buffer having entry indexed according to program counter value of previous instruction

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10884745B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US10558461B2 (en) 2017-08-18 2020-02-11 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10564974B2 (en) 2017-08-18 2020-02-18 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10579385B2 (en) 2017-08-18 2020-03-03 International Business Machines Corporation Prediction of an affiliated register
US11314511B2 (en) 2017-08-18 2022-04-26 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
US11150904B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
US11150908B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10929135B2 (en) 2017-08-18 2021-02-23 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US10908911B2 (en) 2017-08-18 2021-02-02 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US10534609B2 (en) 2017-08-18 2020-01-14 International Business Machines Corporation Code-specific affiliated register prediction
US10719328B2 (en) 2017-08-18 2020-07-21 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10901741B2 (en) 2017-08-18 2021-01-26 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10754656B2 (en) 2017-08-18 2020-08-25 International Business Machines Corporation Determining and predicting derived values
US10891133B2 (en) 2017-08-18 2021-01-12 International Business Machines Corporation Code-specific affiliated register prediction
US10884747B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Prediction of an affiliated register
US10884746B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10884748B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US10713051B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing table of contents (TOC)-setting instructions in code with TOC predicting instructions
US11010164B2 (en) 2017-09-19 2021-05-18 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10884929B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10831457B2 (en) 2017-09-19 2020-11-10 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10896030B2 (en) 2017-09-19 2021-01-19 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10725918B2 (en) 2017-09-19 2020-07-28 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10713050B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing Table of Contents (TOC)-setting instructions in code with TOC predicting instructions
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10949350B2 (en) 2017-09-19 2021-03-16 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10963382B2 (en) 2017-09-19 2021-03-30 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10977185B2 (en) 2017-09-19 2021-04-13 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10884930B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US11061575B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US11061576B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US11138127B2 (en) 2017-09-19 2021-10-05 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US11138113B2 (en) 2017-09-19 2021-10-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10691600B2 (en) 2017-09-19 2020-06-23 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10656946B2 (en) 2017-09-19 2020-05-19 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
CN117093272A (en) * 2023-10-07 2023-11-21 飞腾信息技术有限公司 Instruction sending method and processor
CN117093272B (en) * 2023-10-07 2024-01-16 飞腾信息技术有限公司 Instruction sending method and processor

Also Published As

Publication number Publication date
CN104423929B (en) 2017-07-14
CN104423929A (en) 2015-03-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14838748

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14838748

Country of ref document: EP

Kind code of ref document: A1