WO2015024452A1 - Branch prediction method and related apparatus - Google Patents

Branch prediction method and related apparatus

Info

Publication number
WO2015024452A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
branch
read
prediction
register
Prior art date
Application number
PCT/CN2014/083882
Other languages
English (en)
French (fr)
Inventor
侯锐
冯煜晶
郭旭斌
张乾龙
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2015024452A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806 Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer

Definitions

  • the present invention relates to the field of computer systems, and in particular, to a branch prediction method and related apparatus.
  • the actual program includes branch instructions.
  • the branching behavior of a branch instruction is often not determined until the back end of the pipeline. Therefore, a branch instruction may cause a control hazard and stall the pipeline.
  • the processor cannot determine from which address to fetch the next instruction until the branch instruction has been executed.
  • Most processors use some form of branch prediction mechanism, so that the branch direction and target jump address of a conditional branch instruction can be predicted at the front end of the pipeline, allowing the processor to fetch and execute instructions speculatively. If the branch prediction is correct, or its accuracy is high, the performance and power consumption of the processor can be greatly improved.
  • the Branch Target Address Cache (BTAC) is used to predict the target jump address of the indirect branch instruction.
  • The BTAC uses a cache structure, with part of the program counter (PC) as the index and part as the tag; for example, the lower 8 bits of the PC serve as the index and the upper 8 bits of the PC serve as the tag.
  • Each entry of the BTAC corresponds to an index and a tag, and each entry has a valid bit that records whether the entry stores valid history information (the history information being the predicted target jump address).
  • the target jump address stored in an entry is a virtual address (VA). If the BTAC is full then, as with a cache, a replacement algorithm must decide which entry, for example the least recently used one, is to be replaced.
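  • The BTAC organisation described above can be illustrated with a minimal Python sketch. The class name `Btac`, the 2-way associativity and the use of an ordered dictionary for least-recently-used replacement are assumptions made for illustration; the text only gives the 8-bit index and 8-bit tag as an example and does not fix a replacement algorithm.

```python
from collections import OrderedDict

INDEX_BITS = 8   # low 8 bits of the PC select a set (example from the text)
TAG_BITS = 8     # next 8 bits of the PC form the tag (example from the text)
WAYS = 2         # assumed associativity; the text does not specify one


class Btac:
    """Toy PC-indexed branch target address cache (illustrative only)."""

    def __init__(self):
        # One LRU-ordered dict per set: tag -> predicted target (virtual address).
        # Presence of a tag in the dict plays the role of the valid bit.
        self.sets = [OrderedDict() for _ in range(1 << INDEX_BITS)]

    @staticmethod
    def split(pc):
        """Split a PC into (index, tag)."""
        index = pc & ((1 << INDEX_BITS) - 1)
        tag = (pc >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
        return index, tag

    def lookup(self, pc):
        """Return the predicted target, or None when no valid entry matches."""
        index, tag = self.split(pc)
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)       # mark as most recently used
            return entries[tag]
        return None

    def update(self, pc, target):
        """Record the resolved target, evicting the LRU entry if the set is full."""
        index, tag = self.split(pc)
        entries = self.sets[index]
        if tag not in entries and len(entries) >= WAYS:
            entries.popitem(last=False)    # evict the least recently used entry
        entries[tag] = target
        entries.move_to_end(tag)
```

  • In this sketch a lookup that misses returns `None`, which stands in for an entry whose valid bit is clear.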
  • One is the shared BTAC: multiple threads share the same BTAC, and each thread uses its own PC to index the contents stored in the BTAC. Although this approach saves area, the index address of the BTAC is the PC of each thread, and the PCs of different threads may be the same, so history information of different threads is stored in the same BTAC, which degrades the accuracy of branch prediction.
  • The other is the exclusive BTAC: each thread is given its own BTAC, which predicts branch target jump addresses for that thread only.
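  • The accuracy problem of the shared scheme can be seen in a small Python sketch. The dictionary `shared_btac`, the helper `predict_and_train` and the two-thread trace are hypothetical; they only model two threads whose indirect branches sit at the same PC but jump to different targets.

```python
shared_btac = {}  # PC -> predicted target, shared by all threads (toy model)


def predict_and_train(pc, actual_target):
    """Return True if the shared table predicted correctly, then train it."""
    hit = shared_btac.get(pc) == actual_target
    shared_btac[pc] = actual_target
    return hit


# Two hypothetical threads alternate; both execute a branch at PC 0x400,
# but thread A jumps to 0x1000 and thread B jumps to 0x2000.
trace = [(0x400, 0x1000), (0x400, 0x2000)] * 3
interleaved = [predict_and_train(pc, target) for pc, target in trace]

# The same branch run by thread A alone, with a freshly emptied table.
shared_btac.clear()
alone = [predict_and_train(0x400, 0x1000) for _ in range(3)]
```

  • In this toy trace every interleaved prediction misses because each thread overwrites the other's history, whereas the single-thread run misses only on its first, untrained lookup.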
  • aspects of the present invention provide a branch prediction method and related apparatus, to solve the problem that the accuracy of branch prediction is degraded when a BTAC is shared.
  • a first aspect of the present invention provides a branch prediction method, which is applied to a processor, where the processor includes a first branch target address prediction buffer and a second branch target address prediction buffer.
  • the first branch target address prediction buffer stores correspondence information between register identifiers and predicted target jump addresses, and the second branch target address prediction buffer stores correspondence information between a field of the program counter and predicted target jump addresses.
  • the above branch prediction method includes: reading an instruction from an instruction cache;
  • the above register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
  • the foregoing register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
  • the foregoing determining that the read instruction satisfies the register prediction condition is specifically: when the type of the instruction is an unconditional indirect jump branch instruction and the register identifier in the instruction is a specific register identifier, determining that the read instruction satisfies the register prediction condition.
  • the foregoing determining that the read instruction does not satisfy the register prediction condition is specifically: when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, determining that the read instruction does not satisfy the register prediction condition.
  • the foregoing reading the instruction from the instruction cache includes:
  • the instruction to be read is pre-decoded to obtain type information of the instruction to be read; after the instruction is read, the method further includes: determining, according to the obtained type information of the instruction, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
  • the type of the compiled instruction is specified as an unconditional indirect jump branch instruction.
  • a second aspect of the present invention provides a branch prediction apparatus, which is applied to a processor, where the processor includes a first branch target address prediction buffer and a second branch target address prediction buffer.
  • the first branch target address prediction buffer stores correspondence information between register identifiers and predicted target jump addresses.
  • the second branch target address prediction buffer stores correspondence information between a partial field of the program counter and predicted target jump addresses, or correspondence information between all fields of the program counter and predicted target jump addresses.
  • the branch prediction device includes:
  • a reading unit configured to read an instruction from the instruction cache
  • a prediction acquiring unit configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, acquire the predicted target jump address of the instruction from the first branch target address prediction buffer according to the register identifier in the instruction read by the reading unit; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, acquire the predicted target jump address of the instruction from the second branch target address prediction buffer according to the program counter of the instruction read by the reading unit;
  • the foregoing register prediction condition includes: The type of the instruction is an unconditional indirect jump branch instruction. According to the second aspect of the present invention, in a first possible implementation, the foregoing register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
  • the above branch prediction device further includes:
  • a determining unit configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in the instruction read by the reading unit is a specific register identifier, determine that the instruction read by the reading unit satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in the instruction read by the reading unit is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.
  • the branch prediction apparatus further includes:
  • a pre-decoding unit configured to pre-decode an instruction to be read by the reading unit, to obtain type information of an instruction to be read by the reading unit
  • the determining unit is configured to determine, after the reading unit reads the instruction and according to the type information of the instruction obtained by the pre-decoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
  • the foregoing branch prediction apparatus further includes:
  • the specifying unit is configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction when the function called when the above compiling unit compiles the high-level language is a standard library function.
  • the first BTAC and the second BTAC are provided. The first BTAC uses the register identifier as its index (that is, it stores correspondence information between register identifiers and predicted target jump addresses).
  • the second BTAC uses the PC as its index (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used for branch prediction.
  • the target jump addresses of unconditional indirect jump branch instructions with the same register identifier are necessarily the same, so even if the history information of multiple unconditional indirect jump branch instructions having the same target jump address is stored in the same entry of the first BTAC, the accuracy of branch prediction is not affected.
  • the technical solution provided by the present invention therefore does not degrade the accuracy of branch prediction when the first BTAC is shared, so that BTAC resources can be shared while the accuracy of branch prediction is maintained.
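  • The selection between the two buffers summarised above can be sketched in Python as follows. The `Instruction` record, the `kind` strings, the plain dictionaries standing in for the two BTACs and the 16-bit PC field are all assumptions made for illustration, not the structures used by the invention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instruction:
    pc: int
    kind: str                  # assumed labels, e.g. "uncond_indirect"
    reg: Optional[int] = None  # register number named by an indirect branch


sbtac = {}   # first BTAC: register identifier -> predicted target
btac = {}    # second BTAC: PC field -> predicted target
PC_FIELD_MASK = 0xFFFF  # assumed width of the PC field used as the index


def meets_register_condition(insn):
    """Basic register prediction condition: unconditional indirect jump."""
    return insn.kind == "uncond_indirect" and insn.reg is not None


def predict_target(insn):
    """Use the first BTAC when the condition holds, else the second BTAC."""
    if meets_register_condition(insn):
        return sbtac.get(insn.reg)
    return btac.get(insn.pc & PC_FIELD_MASK)
```

  • Because the first table is keyed by register identifier rather than by PC, two branches at different addresses (or in different threads) that jump through the same register share one entry in this sketch.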
  • FIG. 1 is a schematic flow chart of an embodiment of a branch prediction method according to the present invention
  • FIG. 2 is a schematic flowchart of another embodiment of a branch prediction method according to the present invention
  • FIG. 3 is a schematic diagram of a branch prediction method provided by the present invention.
  • FIG. 4 is a schematic structural diagram of an embodiment of a branch prediction apparatus according to the present invention
  • FIG. 5 is a schematic structural diagram of another embodiment of a branch prediction apparatus according to the present invention.
  • Embodiments of the present invention provide a branch prediction method and related apparatus.
  • a branch prediction method provided by an embodiment of the present invention is described below.
  • the branch prediction method in the embodiment is applied to a processor, where the processor includes a first BTAC and a second BTAC, and the first BTAC stores correspondence information between register identifiers and predicted target jump addresses.
  • the second BTAC stores correspondence information between a field of the PC and predicted target jump addresses; optionally, the second BTAC stores correspondence information between a partial field of the PC and predicted target jump addresses, or correspondence information between all fields of the PC and predicted target jump addresses.
  • a branch prediction method in an embodiment of the present invention includes:
  • the instruction read by step 101 may be a branch instruction or a non-branch instruction.
  • branch instructions can be classified in two ways. One is by jump condition, dividing branch instructions into conditional branch instructions and unconditional branch instructions: a conditional branch instruction performs the branch jump only when a certain condition is met, whereas an unconditional branch instruction does not need to satisfy any condition and always performs the branch jump.
  • the other is by target jump address, dividing branch instructions into direct jump branch instructions and indirect jump branch instructions. For a direct jump branch instruction, the offset of the target jump address is given directly in the instruction as an immediate value (that is, the number given in an instruction that uses immediate addressing), and the target jump address is calculated as the PC of the branch instruction itself plus that offset. For an indirect jump branch instruction, the target jump address is specified in a register.
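  • The difference between the two ways of obtaining a target jump address can be shown with a short Python sketch. The function names and the register file modelled as a dictionary are hypothetical.

```python
def direct_target(pc, offset):
    """Direct jump: PC of the branch itself plus the immediate offset."""
    return pc + offset


def indirect_target(registers, reg):
    """Indirect jump: the target is whatever the named register holds."""
    return registers[reg]


# Hypothetical machine state for illustration.
registers = {30: 0x4000}
backward = direct_target(0x1000, -0x10)    # jump back by 16 bytes
via_reg = indirect_target(registers, 30)   # target read from register 30
```

  • The direct target is known from the instruction bits alone, while the indirect target depends on register contents at run time, which is why it has to be predicted.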
  • before an instruction is fetched from the L2 cache or memory into the instruction cache, the instruction needs to be pre-decoded, so that part of the pre-decoding result can be used to guide branch prediction.
  • the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) needs to be identified in the pre-decoding stage,
  • so that the corresponding branch prediction can be performed according to the type of the branch instruction.
  • the pre-decoding result (such as the type information of the instruction) is stored in the instruction cache together with the instruction. It should be noted that the foregoing pre-decoding operation may be performed by the branch prediction device or by other devices, which is not limited herein.
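  • A rough Python sketch of storing pre-decoded type information alongside each instruction is given below. The mnemonic table, the textual instruction format and the helper names are hypothetical; a real pre-decoder would inspect opcode bits rather than text.

```python
# Hypothetical mnemonic -> (condition, addressing) table for illustration.
TYPE_TABLE = {
    "b":   ("unconditional", "direct"),
    "beq": ("conditional", "direct"),
    "br":  ("unconditional", "indirect"),
    "blr": ("unconditional", "indirect"),
}

icache = {}  # PC -> (instruction text, pre-decoded type info)


def predecode(text):
    """Return the branch type of an instruction, or None for a non-branch."""
    mnemonic = text.split()[0].lower()
    return TYPE_TABLE.get(mnemonic)


def fill_icache(pc, text):
    """Pre-decode on the way into the instruction cache and keep both parts."""
    icache[pc] = (text, predecode(text))


def is_unconditional_indirect(pc):
    """Check the stored type info of the instruction read at this PC."""
    _, info = icache[pc]
    return info == ("unconditional", "indirect")
```

  • Keeping the type information next to the instruction lets the read stage test the register prediction condition without decoding the instruction again.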
  • the above register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
  • the first BTAC is set in the processor.
  • for ease of description, the first BTAC is referred to below as the SBTAC,
  • and the second BTAC is referred to simply as the BTAC.
  • the hardware structure of the SBTAC is similar to that of the BTAC. The difference is that the BTAC is indexed by a partial field or all fields of the PC, whereas the SBTAC is indexed by register identifier. Since the SBTAC stores correspondence information between register identifiers and predicted target jump addresses, the branch prediction device can find, in the SBTAC, the predicted target jump address corresponding to the register identifier in the above instruction.
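  • A register-indexed table of the kind described can be sketched in Python as below. The class name `Sbtac`, the count of 32 registers and the per-entry valid flag are assumptions for illustration; the text states only that the SBTAC resembles the BTAC but is indexed by register identifier.

```python
NUM_REGS = 32  # assumed number of architectural registers


class Sbtac:
    """Toy first BTAC: one entry per register identifier."""

    def __init__(self):
        self.valid = [False] * NUM_REGS   # valid bit per entry
        self.target = [0] * NUM_REGS      # predicted target jump address

    def lookup(self, reg):
        """Return the predicted target for this register, or None if invalid."""
        return self.target[reg] if self.valid[reg] else None

    def update(self, reg, target):
        """Record the resolved target of a branch through this register."""
        self.target[reg] = target
        self.valid[reg] = True
```

  • Since the register number indexes the table directly, no tag comparison is needed in this sketch.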
  • when a high-level language is compiled, if the function called is a standard library function, the branch prediction device specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a branch and link register (BLR) instruction.
  • the BLR instruction is an unconditional indirect jump branch instruction. It results from a subroutine call or function call, which must return, and the return address is stored in the Link Register.
  • the high-level language in the embodiment of the present invention does not specifically refer to a specific language, and may include a plurality of programming languages, such as java, c, C++, C#, pascal, python, lisp, prolog, FoxPro, VC, easy language, etc.
  • the standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to high-level language standards.
  • the register identifier in the embodiment of the present invention may be a register number, or may be another code or symbol that can be used to indicate a register.
  • the branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • the first BTAC and the second BTAC are provided. The first BTAC uses the register identifier as its index (that is, it stores correspondence information between register identifiers and predicted target jump addresses).
  • the second BTAC uses the PC as its index (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used for branch prediction.
  • the history information of multiple unconditional indirect jump branch instructions having the same target jump address can be stored in the same entry of the first BTAC without affecting the accuracy of branch prediction, so that BTAC resources can be shared while the accuracy of branch prediction is maintained.
  • the register identifiers of all the registers may be used as indexes of predicted target jump addresses in the SBTAC.
  • alternatively, the register identifiers of only some of the registers may be used as indexes of predicted target jump addresses in the SBTAC.
  • in that case, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. When the type of the read instruction is an unconditional indirect jump branch instruction and the register identifier in the read instruction is a specific register identifier, it is determined that the read instruction satisfies the register prediction condition. When the type of the read instruction is not an unconditional indirect jump branch instruction, or the register identifier in the read instruction is not a specific register identifier, it is determined that the read instruction does not satisfy the register prediction condition.
  • the branch prediction method in the embodiment of the present invention includes:
  • before an instruction is fetched from the L2 cache or memory into the instruction cache, the instruction needs to be pre-decoded, so that part of the pre-decoding result can be used to guide branch prediction.
  • the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) needs to be identified in the pre-decoding stage,
  • so that the corresponding branch prediction can be performed according to the type of the branch instruction.
  • the predecoded result and instructions are saved together in the instruction cache. It should be noted that the foregoing pre-decoding operation on the instruction may be performed by the branch prediction device, or may be performed by other devices, which is not limited herein.
  • if the type of the read instruction is an unconditional indirect jump branch instruction, it is determined whether the register identifier in the read instruction is a specific register identifier;
  • if the branch prediction device determines that the register identifier in the read instruction is a specific register identifier, step 203 is executed; if the branch prediction device determines that the register identifier in the read instruction is not a specific register identifier, step 204 is executed.
  • the first BTAC stores correspondence information between register identifiers and predicted target jump addresses.
  • the predicted target jump address of the read instruction is obtained from the second BTAC according to the program counter of the read instruction.
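  • The decision that leads to step 203 or step 204 can be sketched in Python as below. The set `SPECIFIC_REGS`, the `kind` label and the dictionaries passed in as the two tables are hypothetical stand-ins chosen for illustration.

```python
SPECIFIC_REGS = {30}  # assumed: only register 30 is tracked in the first BTAC


def choose_table(kind, reg):
    """Return "SBTAC" (step 203) or "BTAC" (step 204) for a read instruction."""
    if kind == "uncond_indirect" and reg in SPECIFIC_REGS:
        return "SBTAC"
    return "BTAC"


def predict(pc, kind, reg, sbtac, btac):
    """Look up the predicted target in whichever table the condition selects."""
    if choose_table(kind, reg) == "SBTAC":
        return sbtac.get(reg)   # step 203: look up by register identifier
    return btac.get(pc)         # step 204: look up by program counter
```

  • An unconditional indirect branch through a register outside the specific set falls back to the PC-indexed table in this sketch, matching the "otherwise" branch of the flow.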
  • when a high-level language is compiled, if the function called is a standard library function, the branch prediction device specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction.
  • the BLR instruction is an unconditional indirect jump branch instruction. It results from a subroutine call or function call, which must return, and the return address is stored in the Link Register.
  • the high-level language in the embodiment of the present invention is defined relative to assembly language: it is a form of programming that is closer to natural language and mathematical formulas, is largely independent of the hardware system of the machine, and allows programs to be written in a way that is easier to understand.
  • the high-level language in the embodiment of the present invention does not specifically refer to a specific language, and may include a plurality of programming languages, such as java, c, C++, C#, pascal, python, lisp, prolog, FoxPro, VC, easy language, etc.
  • the standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to high-level language standards.
  • the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc.
  • the branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • the first BTAC and the second BTAC are provided. The first BTAC uses the register identifier as its index (that is, it stores correspondence information between register identifiers and predicted target jump addresses).
  • the second BTAC uses the PC as its index (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used for branch prediction.
  • the target jump addresses of unconditional indirect jump branch instructions with the same register identifier are necessarily the same, so even if the history information of multiple unconditional indirect jump branch instructions having the same target jump address is stored in the same entry of the first BTAC, the accuracy of branch prediction is not affected.
  • the technical solution provided by the present invention therefore does not degrade the accuracy of branch prediction when the first BTAC is shared, so that BTAC resources can be shared while the accuracy of branch prediction is maintained.
  • for branch instructions under standard library functions, the target jump address usually does not change. Therefore, to ensure that the contents of the SBTAC are not updated or invalidated by software process switches, in the embodiment of the present invention the branch instructions under standard library functions use the SBTAC for branch prediction.
  • the branch prediction method in the embodiment of the present invention includes:
  • if a standard library function is not called, step 303 is executed; if a standard library function is called, step 304 is performed.
  • the type of the compiled instruction is specified as a BLR instruction. The compiled instruction is stored in the second-level cache or in memory.
  • the steps 305-308 are similar to the steps 201-204 in the embodiment shown in FIG. 2, and the specific implementation manners may refer to the description in the corresponding steps, and details are not described herein again.
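  • The compile-time marking of standard library calls can be sketched in Python as below. The set `STANDARD_LIBRARY`, the dictionary-shaped "instruction" and the choice of a direct `BL` for all other calls are assumptions made purely for illustration; the text only states that calls to standard library functions are specified as BLR instructions.

```python
# Assumed examples of standard library functions.
STANDARD_LIBRARY = {"printf", "memcpy", "strlen"}


def compile_call(callee):
    """Emit a toy call instruction, marking standard library calls as BLR."""
    if callee in STANDARD_LIBRARY:
        # Standard library call: specified as an unconditional indirect jump.
        return {"op": "BLR", "type": "uncond_indirect", "callee": callee}
    # Assumption: any other call is emitted as an ordinary direct call.
    return {"op": "BL", "type": "uncond_direct", "callee": callee}


library_call = compile_call("memcpy")
user_call = compile_call("my_helper")
```

  • Only the instructions marked this way would later satisfy the register prediction condition and be predicted from the SBTAC.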
  • the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc.
  • the branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • the first BTAC and the second BTAC are provided. The first BTAC uses the register identifier as its index (that is, it stores correspondence information between register identifiers and predicted target jump addresses).
  • the second BTAC uses the PC as its index (that is, it stores correspondence information between a partial field of the program counter and predicted target jump addresses). When the read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used for branch prediction.
  • the target jump addresses of unconditional indirect jump branch instructions with the same register identifier are necessarily the same, so even if the history information of multiple unconditional indirect jump branch instructions having the same target jump address is stored in the same entry of the first BTAC, the accuracy of branch prediction is not affected.
  • the technical solution provided by the present invention therefore does not degrade the accuracy of branch prediction when the first BTAC is shared, so that BTAC resources can be shared while the accuracy of branch prediction is maintained.
  • the embodiment of the present invention further provides a branch prediction apparatus, which is applied to a processor, where the processor includes a first BTAC and a second BTAC, and the first BTAC stores correspondence information between register identifiers and predicted target jump addresses.
  • the second BTAC stores correspondence information between a field of the PC and predicted target jump addresses; optionally, the second BTAC stores correspondence information between a partial field of the PC and predicted target jump addresses, or correspondence information between all fields of the PC and predicted target jump addresses.
  • the branch prediction apparatus 400 in the embodiment of the present invention includes:
  • a reading unit 401 configured to read an instruction from the instruction cache
  • before an instruction is fetched from the L2 cache or memory into the instruction cache, the instruction needs to be pre-decoded, so that part of the pre-decoding result can be used to guide branch prediction.
  • the type of a branch instruction (such as whether it is a conditional branch instruction or an indirect jump branch instruction) needs to be identified in the pre-decoding stage,
  • so that the corresponding branch prediction can be performed according to the type of the branch instruction.
  • the pre-decoded result (such as the type information of the instruction) and the instruction are stored together in the instruction cache.
  • the foregoing pre-decoding operation may be performed by the branch prediction apparatus, in which case the branch prediction apparatus in the embodiment of the present invention may further include: a pre-decoding unit, configured to pre-decode the instruction to be read by the reading unit 401 to obtain type information of the instruction to be read; and a determining unit, configured to determine, after the reading unit 401 reads the instruction and according to the type information of the instruction obtained by the pre-decoding unit, whether the type of the instruction currently read by the reading unit 401 is an unconditional indirect jump branch instruction.
  • the foregoing pre-decoding operation of the instruction to be read by the reading unit 401 can also be performed by other devices, which is not limited herein.
  • the prediction obtaining unit 402 is configured to: when it is determined that the instruction read by the reading unit 401 satisfies the register prediction condition, acquire the predicted target jump address of the instruction from the first BTAC according to the register identifier in the instruction read by the reading unit 401; and when it is determined that the instruction read by the reading unit 401 does not satisfy the register prediction condition, acquire the predicted target jump address of the instruction from the second BTAC according to the PC of the instruction read by the reading unit 401.
  • the above register prediction conditions include: The type of the instruction is an unconditional indirect jump branch instruction.
  • the foregoing register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
  • the branch prediction apparatus 400 further includes a determining unit, configured to: when the type of the instruction read by the reading unit 401 is an unconditional indirect jump branch instruction and the register identifier in the instruction read by the reading unit 401 is a specific register identifier, determine that the instruction read by the reading unit 401 satisfies the register prediction condition;
  • and when the type of the instruction read by the reading unit 401 is not an unconditional indirect jump branch instruction, or the register identifier in the instruction read by the reading unit 401 is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.
  • when a high-level language is compiled, if the function called is a standard library function, the branch prediction apparatus specifies the type of the compiled instruction as a BLR instruction. Accordingly, on the basis of the branch prediction apparatus shown in FIG. 4, the branch prediction apparatus may further include: a compiling unit, configured to compile the high-level language; and a specifying unit, configured to, when the function called while the compiling unit compiles the high-level language is a standard library function, specify the type of the compiled instruction as an unconditional indirect jump branch instruction, for example as a BLR instruction.
  • the high-level language in the embodiment of the present invention is defined relative to assembly language: it is a form of programming that is closer to natural language and mathematical formulas, is largely independent of the hardware system of the machine, and allows programs to be written in a way that is easier to understand.
  • the high-level language in the embodiment of the present invention does not specifically refer to a specific language, and may include many programming languages, such as java, c, C++, C#, pascal, python, lisp, prolog, FoxPro, VC, easy language, etc.
  • the standard library function in the embodiment of the present invention refers to a library composed of some basic functions pre-written according to a high-level language standard.
  • the register identifier in the embodiment of the present invention may be a register number, or the register identifier may be other codes or symbols that can be used to indicate a register, etc.
  • the branch prediction method in the embodiment of the present invention may be applied to a multi-thread processor or to a single-thread processor, which is not limited herein.
  • the branch prediction apparatus in the embodiment of the present invention may serve as the branch prediction apparatus in the foregoing method embodiments and may be used to implement all the technical solutions in those embodiments; the functions of its functional modules may be implemented according to the methods in the foregoing method embodiments.
  • the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used.
  • because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy.
  • in other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.
  • the embodiments of the present invention further provide a computer storage medium that stores a program; when executed, the program performs some or all of the arrangements of the branch prediction method and branch prediction apparatus described in the foregoing embodiments.
  • the embodiments of the present invention provide another branch prediction apparatus.
  • the branch prediction apparatus 500 in the embodiments of the present invention includes:
  • an input device 501, an output device 502, a memory 503, and a processor 504 (the branch prediction apparatus may have one or more processors; FIG. 5 takes one processor as an example).
  • the input device 501, the output device 502, the memory 503, and the processor 504 may be connected by a bus or in another manner; FIG. 5 takes a bus connection as an example.
  • the memory 503 is used to store data input from the input device 501 and may also store information such as files needed by the processor 504 to process data; the input device 501 and the output device 502 may include ports through which the branch prediction apparatus 500 communicates with other devices, and may also include peripherals attached to the branch prediction apparatus 500, such as a display, a keyboard, a mouse, and a printer.
  • specifically, the input device 501 may include a mouse and a keyboard, and the output device 502 may include a display.
  • the processor 504 includes a first BTAC and a second BTAC. The first BTAC stores correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores correspondence information between fields of the PC and predicted target jump addresses; optionally, the second BTAC stores this correspondence for partial fields of the PC or for all fields of the PC.
  • the processor 504 performs the following steps: reading an instruction from the instruction cache; if the read instruction satisfies the register prediction condition, obtaining its predicted target jump address from the first BTAC according to its register identifier; otherwise, obtaining its predicted target jump address from the second BTAC according to its program counter.
  • the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
  • optionally, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
  • the program may be stored in a computer-readable storage medium; the storage medium may include, for example, a read-only memory, a random access memory, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

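A branch prediction method and related apparatus, applied to a processor. The processor includes a first BTAC that stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and a second BTAC that stores one-to-one correspondence information between fields of the program counter and predicted target jump addresses. The branch prediction method includes: reading an instruction from an instruction cache; if it is determined that the instruction satisfies a register prediction condition, obtaining the predicted target jump address of the instruction from the first BTAC according to the register identifier of the instruction; and if it is determined that the instruction does not satisfy the register prediction condition, obtaining the predicted target jump address of the instruction from the second BTAC according to the program counter of the instruction. This effectively solves the problem that branch prediction accuracy is affected when a BTAC is shared.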

Description

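A Branch Prediction Method and Related Apparatus
This application claims priority to Chinese Patent Application No. 201310367653.9, filed with the Chinese Patent Office on August 21, 2013 and entitled "A Branch Prediction Method and Related Apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer systems, and in particular to a branch prediction method and related apparatus.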
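Background
Most current processors use a pipelined structure so that a sequentially executed instruction stream can be executed in parallel. This way of processing instructions greatly improves processor execution efficiency. Ideally, each stage of the pipeline takes only one clock cycle, so one instruction can complete in every clock cycle. In practice the situation is less ideal, because instructions may depend on one another, which limits the parallelism of instruction execution. Factors such as data dependences, control dependences (for example, branch instructions), resource contention, and interrupts all reduce instruction parallelism.
Real programs contain branch instructions, and the behavior of a branch instruction often cannot be determined until the back end of the pipeline. A branch instruction may therefore create a control hazard that stalls the pipeline, and the processor cannot determine the address from which to fetch the next instruction until the branch instruction has finished executing. Most processors adopt some form of branch prediction so that the jump direction and target jump address of a conditional branch instruction can be predicted at the front end of the pipeline, allowing the processor to fetch and execute instructions speculatively. If the prediction is correct, or its accuracy is high, processor performance and power consumption improve significantly. If the prediction is wrong, the speculatively fetched instructions cannot be executed: they must be flushed from the buffer, and instructions must then be fetched again from the correct address and executed.
The branch target address cache (BTAC) is used to predict the target jump address of indirect jump branch instructions. The BTAC is organized as a cache: part of the instruction's program counter (PC) serves as the index and part as the tag, for example the low 8 bits of the PC as the index and the high 8 bits as the tag. Each entry of the BTAC corresponds to one index and one tag, and each entry has a valid bit that records whether the entry holds valid history information (the history information being the predicted target jump address). The target jump address stored in an entry is a virtual address (VA). When the BTAC is full, a replacement algorithm must decide, as in a cache, which entry (for example, the least recently used one) can have its contents replaced.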
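The BTAC organization described above can be sketched as follows. This is a minimal illustration, not the patent's hardware: it assumes a 16-bit PC so that the low 8 bits form the index and the high 8 bits form the tag, it is direct-mapped (so the replacement algorithm is omitted), and the names `Btac`, `Entry`, `lookup`, and `update` are hypothetical.

```python
from dataclasses import dataclass

INDEX_BITS = 8  # low 8 bits of the PC select an entry (the text's example)
TAG_BITS = 8    # next 8 bits form the tag (assumes a 16-bit PC for illustration)


@dataclass
class Entry:
    valid: bool = False  # set once the entry holds history information
    tag: int = 0
    target: int = 0      # predicted target jump address (a virtual address)


class Btac:
    """Direct-mapped, PC-indexed BTAC sketch; replacement policy is omitted."""

    def __init__(self):
        self.entries = [Entry() for _ in range(1 << INDEX_BITS)]

    @staticmethod
    def split(pc):
        # Separate the PC into the index field and the tag field.
        index = pc & ((1 << INDEX_BITS) - 1)
        tag = (pc >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
        return index, tag

    def lookup(self, pc):
        # A hit requires a valid entry whose tag matches the PC's tag field.
        index, tag = self.split(pc)
        entry = self.entries[index]
        return entry.target if entry.valid and entry.tag == tag else None

    def update(self, pc, target):
        # Record the resolved target as history for this PC.
        index, tag = self.split(pc)
        self.entries[index] = Entry(valid=True, tag=tag, target=target)
```

A lookup with a PC that shares the index but not the tag of a stored branch misses, which is the role the tag plays in the description.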
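In a multithreaded processor, the BTAC can be organized in two ways:
One is a shared BTAC: multiple threads share the same BTAC, and each thread indexes the contents of the BTAC with its own PC. This saves area, but because the BTAC is indexed by each thread's PC and different threads may have identical PCs, history information from different threads is stored in the same BTAC, which degrades branch prediction accuracy.
The other is a private BTAC: each thread has its own BTAC, which provides branch target prediction for that thread. Compared with the shared approach this improves prediction accuracy to some extent, but it wastes a great deal of hardware resources and area.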
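The trade-off between the two organizations can be illustrated with a toy replay. This is a sketch under stated assumptions: tables are plain dictionaries keyed by the full PC, the trace is invented for illustration, and `count_hits` is a hypothetical helper.

```python
def count_hits(trace, table_for_thread):
    """Replay (thread, pc, actual_target) events and count correct predictions."""
    hits = 0
    for thread, pc, actual in trace:
        table = table_for_thread(thread)
        if table.get(pc) == actual:
            hits += 1
        table[pc] = actual  # train the table with the resolved target
    return hits


# Two threads execute a branch at the same PC but with different targets.
trace = [(0, 0x400, 0x9000), (1, 0x400, 0xA000)] * 4

# Shared BTAC: every thread reads and writes the same table.
shared = {}
shared_hits = count_hits(trace, lambda thread: shared)

# Private BTAC: each thread has its own table.
private = {0: {}, 1: {}}
private_hits = count_hits(trace, lambda thread: private[thread])
```

In this trace the shared table never predicts correctly because each thread overwrites the other's history, while the private tables miss only on the first encounter, at the cost of one table per thread.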
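Whether shared or private, BTACs have one thing in common: as long as the BTAC is not full, all history information related to branch instructions is recorded using the PC of the branch instruction as the index. In real programs, however, the following situation is very likely: several different branch instructions jump to the same target address. For a standard library function such as "Printf" in C++, the assembly instructions produced by compilation show that different branch instructions always jump to the same target address. With a conventional BTAC structure, these branch instructions still occupy multiple entries in the BTAC to record their history information, even though they jump to the same target address.
As can be seen, in a multithreaded processor the conflict between BTAC resource sharing and branch prediction accuracy is especially pronounced.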
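Summary
Aspects of the present invention provide a branch prediction method and related apparatus to solve the problem that branch prediction accuracy is affected when a BTAC is shared.
To solve the foregoing technical problem, the following technical solutions are provided:
A first aspect of the present invention provides a branch prediction method applied to a processor. The processor includes a first branch target address cache (first BTAC) and a second branch target address cache (second BTAC). The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the program counter and predicted target jump addresses. The branch prediction method includes: reading an instruction from an instruction cache;
if it is determined that the read instruction satisfies a register prediction condition:
obtaining the predicted target jump address of the read instruction from the first BTAC according to the register identifier of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition:
obtaining the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.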
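Based on the first aspect, in a first possible implementation, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
determining that the read instruction satisfies the register prediction condition is specifically:
when the type of the instruction is an unconditional indirect jump branch instruction and the register identifier in the instruction is a specific register identifier, determining that the read instruction satisfies the register prediction condition;
determining that the read instruction does not satisfy the register prediction condition is specifically:
when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, determining that the read instruction does not satisfy the register prediction condition.
Based on the first aspect or the first possible implementation of the first aspect, in a second possible implementation, before the instruction is read from the instruction cache, the method includes:
predecoding the instruction to be read to obtain type information of the instruction to be read;
and after the instruction is read, the method includes: determining, according to the obtained type information, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
Based on the first aspect, or the first or second possible implementation of the first aspect, in a third possible implementation, before the instruction is read, if the function called when a high-level language is compiled is a standard library function, the type of the compiled instruction is specified as an unconditional indirect jump branch instruction.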
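A second aspect of the present invention provides a branch prediction apparatus applied to a processor. The processor includes a first branch target address cache (first BTAC) and a second branch target address cache (second BTAC). The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses, or between all fields of the program counter and predicted target jump addresses. The branch prediction apparatus includes:
a reading unit, configured to read an instruction from an instruction cache;
a prediction obtaining unit, configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, obtain the predicted target jump address of the instruction from the first BTAC according to the register identifier of the instruction; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, obtain the predicted target jump address of the instruction from the second BTAC according to the program counter of the instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Based on the second aspect of the present invention, in a first possible implementation, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier;
the branch prediction apparatus further includes:
a determining unit, configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction read by the reading unit satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.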
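Based on the second aspect of the present invention or the first possible implementation of the second aspect, in a second possible implementation, the branch prediction apparatus further includes:
a predecoding unit, configured to predecode the instruction to be read by the reading unit to obtain type information of that instruction;
a judging unit, configured to determine, after the reading unit reads the instruction and according to the type information obtained by the predecoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
Based on the second aspect of the present invention or the first possible implementation of the second aspect, in a third possible implementation, the branch prediction apparatus further includes:
a compiling unit, configured to compile a high-level language;
a specifying unit, configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction when the function called while the compiling unit compiles the high-level language is a standard library function.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.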
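Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of one embodiment of a branch prediction method provided by the present invention;
FIG. 2 is a schematic flowchart of another embodiment of a branch prediction method provided by the present invention;
FIG. 3 is a schematic flowchart of yet another embodiment of a branch prediction method provided by the present invention;
FIG. 4 is a schematic structural diagram of one embodiment of a branch prediction apparatus provided by the present invention;
FIG. 5 is a schematic structural diagram of another embodiment of a branch prediction apparatus provided by the present invention.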
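Detailed Description
The embodiments of the present invention provide a branch prediction method and related apparatus.
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The following describes a branch prediction method provided by an embodiment of the present invention. The method is applied to a processor that includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the PC and predicted target jump addresses. Optionally, the second BTAC stores this correspondence for partial fields of the PC, or for all fields of the PC. Referring to FIG. 1, the branch prediction method in this embodiment includes: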
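101. Read an instruction from the instruction cache.
In this embodiment, the instruction read in step 101 may be a branch instruction or a non-branch instruction. Branch instructions are generally classified in two ways. One is by jump condition, into conditional branch instructions and unconditional branch instructions: a conditional branch instruction jumps only when a certain condition is satisfied, whereas an unconditional branch instruction needs no condition and always jumps. The other is by target jump address, into direct jump branch instructions and indirect jump branch instructions: for a direct jump branch instruction, the offset of the target jump address is specified in the instruction itself as an immediate (that is, the number given in an instruction that uses immediate addressing), and the target jump address is computed by adding that offset to the PC of the branch instruction; for an indirect jump branch instruction, the target jump address is specified in a register.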
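The two classification axes and the two ways of obtaining a target address can be written out as a short sketch. The enum, the function names, and the dictionary-based register file are assumptions made for illustration; address wrap-around and instruction encodings are ignored.

```python
from enum import Enum


class BranchKind(Enum):
    CONDITIONAL_DIRECT = "conditional direct"
    CONDITIONAL_INDIRECT = "conditional indirect"
    UNCONDITIONAL_DIRECT = "unconditional direct"
    UNCONDITIONAL_INDIRECT = "unconditional indirect"


def classify(conditional, indirect):
    """Combine the jump-condition axis with the target-address axis."""
    if conditional:
        return BranchKind.CONDITIONAL_INDIRECT if indirect else BranchKind.CONDITIONAL_DIRECT
    return BranchKind.UNCONDITIONAL_INDIRECT if indirect else BranchKind.UNCONDITIONAL_DIRECT


def direct_target(pc, imm_offset):
    # Direct jump: the target is the branch's own PC plus the immediate offset.
    return pc + imm_offset


def indirect_target(registers, reg_id):
    # Indirect jump: the target is whatever address the named register holds.
    return registers[reg_id]
```

The direct target is known from the instruction alone, whereas the indirect target depends on run-time register contents, which is why the indirect case needs a predicted address.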
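Before an instruction is fetched from the level-2 cache or memory into the instruction cache, it must be predecoded so that part of the predecode result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, its type (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the predecode stage so that the appropriate branch prediction can be performed for that type. After predecoding, the predecode result (for example, the instruction's type information) is stored in the instruction cache together with the instruction. It should be noted that the predecode operation may be performed by the branch prediction apparatus or by another apparatus, which is not limited herein.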
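As a rough illustration of storing the predecode result alongside the instruction, the sketch below tags each instruction with a type when it is filled into a toy instruction cache. The mnemonic table, the string-based instruction format, and the function names are all hypothetical; a real predecoder would inspect opcode bits.

```python
# Hypothetical mnemonic-to-type table used only for this sketch.
TYPE_BY_MNEMONIC = {
    "BEQ": "conditional_direct",
    "B": "unconditional_direct",
    "BLR": "unconditional_indirect",
}


def predecode(mnemonic):
    """Return the instruction type; anything unknown is treated as a non-branch."""
    return TYPE_BY_MNEMONIC.get(mnemonic, "non_branch")


def fill_icache(icache, pc, instr):
    """Fill one instruction from L2/memory, storing its predecoded type with it."""
    mnemonic = instr.split()[0]
    icache[pc] = (instr, predecode(mnemonic))


def read_icache(icache, pc):
    """Read back the instruction together with its stored type information."""
    return icache[pc]
```

Because the type travels with the instruction, the later read step can decide which predictor to consult without decoding the instruction again.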
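The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
102. When the read instruction satisfies the register prediction condition, obtain the predicted target jump address of the read instruction from the first BTAC according to the register identifier in the read instruction.
103. When the read instruction does not satisfy the register prediction condition, obtain the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction.
In this embodiment, a first BTAC is provided in the processor. For ease of description, the first BTAC is referred to below as the SBTAC, and the second BTAC is referred to simply as the BTAC. The hardware structure of the SBTAC is similar to that of the BTAC; the difference is that the BTAC is indexed by some or all fields of the PC, whereas the SBTAC is indexed by register identifier. Because the SBTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, the branch prediction apparatus can use the register identifier in the instruction to find the corresponding predicted target jump address in the SBTAC.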
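Steps 101 to 103 amount to choosing a table by instruction type. The following sketch assumes both tables are dictionaries and that the second table is keyed by the full PC (the text also allows partial PC fields); the `Instr` record and `predict_target` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

UNCONDITIONAL_INDIRECT = "unconditional_indirect"


@dataclass
class Instr:
    pc: int
    kind: str                     # type information produced by predecoding
    reg_id: Optional[int] = None  # register identifier, if the instruction has one


def predict_target(instr, sbtac, btac):
    """Pick the table by the register prediction condition, then look up."""
    if instr.kind == UNCONDITIONAL_INDIRECT:  # register prediction condition
        return sbtac.get(instr.reg_id)        # step 102: index by register identifier
    return btac.get(instr.pc)                 # step 103: index by program counter


sbtac = {12: 0x7000}     # register identifier -> predicted target
btac = {0x200: 0x3000}   # PC -> predicted target
```

A lookup that finds no entry returns `None` here; what the hardware does on such a miss is not specified in this passage.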
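In one application scenario, before step 101, if the function called when a high-level language is compiled is a standard library function, the branch prediction apparatus specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example a Branch and Link Register (BLR) instruction. A BLR instruction is an unconditional indirect jump branch instruction that arises from a subroutine call or function call and always returns; the return address is stored in the link register. It should be noted that the high-level language in the embodiments of the present invention is defined relative to assembly language: it is a form of programming closer to natural language and mathematical formulas, largely independent of the machine's hardware system, in which programs are written in a way that is easier for people to understand. The high-level language in the embodiments does not refer to any one specific language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language. The standard library functions in the embodiments refer to a library composed of basic functions written in advance according to the high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multithreaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, the history information of multiple such instructions can be stored in the same entry of the first BTAC without affecting branch prediction accuracy, so BTAC resource sharing can be achieved while preserving prediction accuracy.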
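The entry-sharing argument can be checked with a small count. The example takes the text's premise as given (branches through the same register identifier have the same target) and uses invented PCs, register number, and target address.

```python
# Three call sites (distinct PCs) branch through register 12 to one target.
branches = [(0x100, 12, 0x7000), (0x200, 12, 0x7000), (0x300, 12, 0x7000)]

pc_indexed = {}   # conventional BTAC: one entry per branch PC
reg_indexed = {}  # first BTAC (SBTAC): one entry per register identifier
for pc, reg_id, target in branches:
    pc_indexed[pc] = target
    reg_indexed[reg_id] = target

entries_pc = len(pc_indexed)    # entries consumed when indexing by PC
entries_reg = len(reg_indexed)  # entries consumed when indexing by register
```

Under that premise the register-indexed table needs a single entry where the PC-indexed table needs three, and the shared entry still yields the correct target for every call site.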
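The foregoing embodiment uses the register identifiers of all registers as indexes to the predicted target jump addresses in the SBTAC. An embodiment of the present invention may instead use the register identifiers of only some registers as indexes to the predicted target jump addresses in the SBTAC, in which case the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. When the type of the read instruction is an unconditional indirect jump branch instruction and the register identifier in the read instruction is a specific register identifier, it is determined that the read instruction satisfies the register prediction condition; when the type of the read instruction is not an unconditional indirect jump branch instruction, or the register identifier in the read instruction is not a specific register identifier, it is determined that the read instruction does not satisfy the register prediction condition. As shown in FIG. 2, the branch prediction method in this embodiment includes:
201. Read an instruction from the instruction cache.
Before an instruction is fetched from the level-2 cache or memory into the instruction cache, it must be predecoded so that part of the predecode result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, its type (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the predecode stage so that the appropriate branch prediction can be performed for that type. After predecoding, the predecode result is stored in the instruction cache together with the instruction. It should be noted that the predecode operation may be performed by the branch prediction apparatus or by another apparatus, which is not limited herein.
202. When the type of the read instruction is an unconditional indirect jump branch instruction, determine whether the register identifier in the read instruction is a specific register identifier.
If the branch prediction apparatus determines that the register identifier in the read instruction is a specific register identifier, step 203 is performed; if it determines that the register identifier in the read instruction is not a specific register identifier, step 204 is performed.
203. Obtain the predicted target jump address of the read instruction from the first BTAC according to the register identifier in the read instruction.
The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses.
204. When the type of the read instruction is not an unconditional indirect jump branch instruction, obtain the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction.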
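The extended condition in steps 202 to 204 is a conjunction of the type check and a membership check. In the sketch below, the set of "specific" registers is a hypothetical choice, since the text does not say which registers are tracked, and the function names are illustrative.

```python
SPECIFIC_REGS = frozenset({12, 30})  # hypothetical choice of tracked registers


def satisfies_register_condition(kind, reg_id, specific_regs=SPECIFIC_REGS):
    """True only for an unconditional indirect branch through a tracked register."""
    return kind == "unconditional_indirect" and reg_id in specific_regs


def select_table(kind, reg_id):
    """Steps 202-204: name the table that serves the prediction."""
    if satisfies_register_condition(kind, reg_id):
        return "first BTAC"   # step 203
    return "second BTAC"      # step 204
```

An unconditional indirect branch through an untracked register therefore falls back to the PC-indexed table, the same as any other branch type.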
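In one application scenario, before step 201, if the function called when a high-level language is compiled is a standard library function, the branch prediction apparatus specifies the type of the compiled instruction as an unconditional indirect jump branch instruction, for example a BLR instruction. A BLR instruction is an unconditional indirect jump branch instruction that arises from a subroutine call or function call and always returns; the return address is stored in the link register. It should be noted that the high-level language in the embodiments of the present invention is defined relative to assembly language: it is a form of programming closer to natural language and mathematical formulas, largely independent of the machine's hardware system, in which programs are written in a way that is easier for people to understand. The high-level language in the embodiments does not refer to any one specific language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language. The standard library functions in the embodiments refer to a library composed of basic functions written in advance according to the high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multithreaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.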
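For a branch instruction under a standard library function, the target jump address usually does not change. Therefore, to ensure that the contents of the SBTAC are not updated or invalidated because of software process switches, this embodiment uses the SBTAC for branch prediction of branch instructions under standard library functions. As shown in FIG. 3, the branch prediction method in this embodiment includes:
301. Compile the high-level language.
302. Determine whether a standard library function is called.
Whether a standard library function is called can be determined during compilation. If no standard library function is called, step 303 is performed; if a standard library function is called, step 304 is performed.
303. Specify the type of the compiled instruction as another instruction type, and store the instruction in the level-2 cache or memory.
304. Specify the type of the compiled instruction as a BLR instruction, and store the instruction in the level-2 cache or memory.
Steps 305 to 308 are similar to steps 201 to 204 in the embodiment shown in FIG. 2; for their specific implementation, refer to the descriptions of the corresponding steps. Details are not repeated here.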
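Steps 301 to 304 can be pictured as a compile-time tagging pass. The sketch below is only an illustration: the set of standard library names is an arbitrary subset, and `instruction_type_for_call` and `compile_calls` are hypothetical helpers, not part of any real compiler.

```python
STANDARD_LIBRARY = frozenset({"printf", "memcpy", "strlen"})  # illustrative subset


def instruction_type_for_call(callee):
    """Step 302 decides; step 304 emits BLR, step 303 emits another type."""
    return "BLR" if callee in STANDARD_LIBRARY else "OTHER"


def compile_calls(callees):
    """Tag each call site in order with the instruction type chosen for it."""
    return [(callee, instruction_type_for_call(callee)) for callee in callees]


tagged = compile_calls(["printf", "my_helper", "memcpy"])
```

Only the call sites tagged as BLR later satisfy the register prediction condition, so only standard library calls are steered to the SBTAC.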
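It should be noted that the register identifier in the embodiments of the present invention may be a register number, or any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multithreaded processor or to a single-threaded processor, which is not limited herein.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.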
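An embodiment of the present invention further provides a branch prediction apparatus applied to a processor. The processor includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the PC and predicted target jump addresses. Optionally, the second BTAC stores this correspondence for partial fields of the PC, or for all fields of the PC. As shown in FIG. 4, the branch prediction apparatus 400 in this embodiment includes:
A reading unit 401, configured to read an instruction from the instruction cache.
Before an instruction is fetched from the level-2 cache or memory into the instruction cache, it must be predecoded so that part of the predecode result can guide branch prediction. For example, before a branch instruction is fetched from the level-2 cache or memory into the instruction cache, its type (for example, whether it is a conditional branch instruction or an indirect jump branch instruction) must be identified in the predecode stage so that the appropriate branch prediction can be performed for that type. After predecoding, the predecode result (for example, the instruction's type information) is stored in the instruction cache together with the instruction. In one implementation, the predecode operation may be performed by the branch prediction apparatus, in which case the branch prediction apparatus may further include: a predecoding unit, configured to predecode the instruction to be read by the reading unit 401 to obtain type information of that instruction; and a judging unit, configured to determine, after the reading unit 401 reads the instruction and according to the type information obtained by the predecoding unit, whether the type of the instruction is an unconditional indirect jump branch instruction. The predecode operation on the instruction to be read by the reading unit 401 may also be performed by another apparatus, which is not limited herein.
A prediction obtaining unit 402, configured to: when it is determined that the instruction read by the reading unit 401 satisfies the register prediction condition, obtain the predicted target jump address of that instruction from the first BTAC according to the register identifier in the instruction; and when it is determined that the instruction read by the reading unit 401 does not satisfy the register prediction condition, obtain the predicted target jump address of that instruction from the second BTAC according to the PC of the instruction. The register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Optionally, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier. In that case the branch prediction apparatus 400 further includes a determining unit, configured to: when the type of the instruction read by the reading unit 401 is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction read by the reading unit 401 satisfies the register prediction condition; and when the type of the instruction read by the reading unit 401 is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.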
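In one application scenario, when a high-level language is compiled, if the function called during compilation is a standard library function, the branch prediction apparatus specifies the type of the compiled instruction as BLR. In that case, on the basis of the branch prediction apparatus shown in FIG. 4, the apparatus may further include: a compiling unit, configured to compile the high-level language; and a specifying unit, configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction (for example, a BLR instruction) when the function called while the compiling unit compiles the high-level language is a standard library function. It should be noted that the high-level language in the embodiments of the present invention is defined relative to assembly language: it is a form of programming closer to natural language and mathematical formulas, largely independent of the machine's hardware system, in which programs are written in a way that is easier for people to understand. The high-level language in the embodiments does not refer to any one specific language and may include many programming languages, such as Java, C, C++, C#, Pascal, Python, Lisp, Prolog, FoxPro, VC, and Easy Language. The standard library functions in the embodiments refer to a library composed of basic functions written in advance according to the high-level language standard.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multithreaded processor or to a single-threaded processor, which is not limited herein.
It should be noted that the branch prediction apparatus in this embodiment may be the branch prediction apparatus in the foregoing method embodiments and may be used to implement all the technical solutions in those embodiments. The functions of its functional modules may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the related descriptions there. Details are not repeated here.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.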
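An embodiment of the present invention further provides a computer storage medium that stores a program; when executed, the program performs some or all of the arrangements of the branch prediction method and branch prediction apparatus described in the foregoing embodiments.
An embodiment of the present invention provides another branch prediction apparatus. As shown in FIG. 5, the branch prediction apparatus 500 in this embodiment includes:
An input device 501, an output device 502, a memory 503, and a processor 504 (the branch prediction apparatus may have one or more processors; FIG. 5 takes one processor as an example). In some embodiments of the present invention, the input device 501, the output device 502, the memory 503, and the processor 504 may be connected by a bus or in another manner; FIG. 5 takes a bus connection as an example. The memory 503 is used to store data input from the input device 501 and may also store information such as files needed by the processor 504 to process data. The input device 501 and the output device 502 may include ports through which the branch prediction apparatus 500 communicates with other devices, and may also include peripherals attached to the branch prediction apparatus 500, such as a display, a keyboard, a mouse, and a printer. Specifically, the input device 501 may include a mouse and a keyboard, and the output device 502 may include a display.
The processor 504 includes a first BTAC and a second BTAC. The first BTAC stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second BTAC stores one-to-one correspondence information between fields of the PC and predicted target jump addresses. Optionally, the second BTAC stores this correspondence for partial fields of the PC, or for all fields of the PC.
The processor 504 performs the following steps: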
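reading an instruction from the instruction cache;
if it is determined that the read instruction satisfies the register prediction condition:
obtaining the predicted target jump address of the read instruction from the first BTAC according to the register identifier of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition:
obtaining the predicted target jump address of the read instruction from the second BTAC according to the program counter of the read instruction;
where the register prediction condition includes: the type of the instruction is an unconditional indirect jump branch instruction.
Optionally, the register prediction condition further includes: the register identifier in the instruction is a specific register identifier.
It should be noted that the register identifier in the embodiments of the present invention may be a register number, or any other code or symbol that can be used to indicate a register. The branch prediction method in the embodiments of the present invention may be applied to a multithreaded processor or to a single-threaded processor, which is not limited herein.
It should be noted that the branch prediction apparatus in this embodiment may be the branch prediction apparatus in the foregoing method embodiments and may be used to implement all the technical solutions in those embodiments. The functions of its functional modules may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the related descriptions there. Details are not repeated here.
As can be seen from the above, the embodiments of the present invention provide a first BTAC and a second BTAC. The first BTAC is indexed by register identifier (that is, it stores one-to-one correspondence information between register identifiers and predicted target jump addresses), and the second BTAC is indexed by PC (that is, it stores one-to-one correspondence information between partial fields of the program counter and predicted target jump addresses). When a read instruction satisfies the register prediction condition, the first BTAC is used for branch prediction; otherwise, the second BTAC is used. Because unconditional indirect jump branch instructions with the same register identifier necessarily have the same target jump address, storing the history information of multiple such instructions in the same entry of the first BTAC does not affect branch prediction accuracy. In other words, the technical solution provided by the present invention allows the first BTAC to be shared without affecting branch prediction accuracy, which makes BTAC resource sharing possible while preserving prediction accuracy.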
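It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of actions. A person skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For a part not detailed in one embodiment, refer to the related descriptions of the other embodiments.
A person of ordinary skill in the art may understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include, for example, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing describes in detail the branch prediction method and related apparatus provided by the present invention. A person of ordinary skill in the art may, based on the ideas of the embodiments of the present invention, make changes to the specific implementations and the scope of application. The content of this specification shall not be construed as a limitation on the present invention.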

Claims

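1. A branch prediction method, applied to a processor, wherein the processor comprises a first branch target address cache and a second branch target address cache, the first branch target address cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address cache stores one-to-one correspondence information between fields of a program counter and predicted target jump addresses, the branch prediction method comprising:
reading an instruction from an instruction cache;
if it is determined that the read instruction satisfies a register prediction condition:
obtaining the predicted target jump address of the read instruction from the first branch target address cache according to the register identifier of the read instruction;
if it is determined that the read instruction does not satisfy the register prediction condition:
obtaining the predicted target jump address of the read instruction from the second branch target address cache according to the program counter of the read instruction;
wherein the register prediction condition comprises: the type of the instruction is an unconditional indirect jump branch instruction.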
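2. The method according to claim 1, wherein
the register prediction condition further comprises: the register identifier in the instruction is a specific register identifier;
determining that the read instruction satisfies the register prediction condition is specifically:
when the type of the instruction is an unconditional indirect jump branch instruction and the register identifier in the instruction is a specific register identifier, determining that the read instruction satisfies the register prediction condition;
determining that the read instruction does not satisfy the register prediction condition is specifically:
when the type of the instruction is not an unconditional indirect jump branch instruction, or the register identifier in the instruction is not a specific register identifier, determining that the read instruction does not satisfy the register prediction condition.
3. The method according to claim 1 or 2, wherein
before the reading of the instruction from the instruction cache, the method comprises:
predecoding the instruction to be read to obtain type information of the instruction to be read;
and after the reading of the instruction, the method comprises: determining, according to the obtained type information, whether the type of the currently read instruction is an unconditional indirect jump branch instruction.
4. The method according to any one of claims 1 to 3, wherein
before the reading of the instruction, if the function called when a high-level language is compiled is a standard library function, the type of the compiled instruction is specified as an unconditional indirect jump branch instruction.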
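5. A branch prediction apparatus, applied to a processor, wherein the processor comprises a first branch target address cache and a second branch target address cache, the first branch target address cache stores one-to-one correspondence information between register identifiers and predicted target jump addresses, and the second branch target address cache stores one-to-one correspondence information between partial fields of a program counter and predicted target jump addresses, or between all fields of the program counter and predicted target jump addresses, the branch prediction apparatus comprising:
a reading unit, configured to read an instruction from an instruction cache;
a prediction obtaining unit, configured to: when it is determined that the instruction read by the reading unit satisfies a register prediction condition, obtain the predicted target jump address of the instruction from the first branch target address cache according to the register identifier of the instruction read by the reading unit; and when it is determined that the instruction read by the reading unit does not satisfy the register prediction condition, obtain the predicted target jump address of the instruction from the second branch target address cache according to the program counter of the instruction read by the reading unit;
wherein the register prediction condition comprises: the type of the instruction is an unconditional indirect jump branch instruction.
6. The branch prediction apparatus according to claim 5, wherein
the register prediction condition further comprises: the register identifier in the instruction is a specific register identifier;
the branch prediction apparatus further comprises:
a determining unit, configured to: when the type of the instruction read by the reading unit is an unconditional indirect jump branch instruction and the register identifier in that instruction is a specific register identifier, determine that the instruction read by the reading unit satisfies the register prediction condition; and when the type of the instruction read by the reading unit is not an unconditional indirect jump branch instruction, or the register identifier in that instruction is not a specific register identifier, determine that the read instruction does not satisfy the register prediction condition.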
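7. The branch prediction apparatus according to claim 5 or 6, wherein
the branch prediction apparatus further comprises:
a predecoding unit, configured to predecode the instruction to be read by the reading unit to obtain type information of that instruction;
a judging unit, configured to determine, after the reading unit reads the instruction and according to the type information obtained by the predecoding unit, whether the type of the instruction currently read by the reading unit is an unconditional indirect jump branch instruction.
8. The branch prediction apparatus according to claim 5 or 6, wherein
the branch prediction apparatus further comprises:
a compiling unit, configured to compile a high-level language;
a specifying unit, configured to specify the type of the compiled instruction as an unconditional indirect jump branch instruction when the function called while the compiling unit compiles the high-level language is a standard library function.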
PCT/CN2014/083882 2013-08-21 2014-08-07 Branch prediction method and related apparatus WO2015024452A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310367653.9 2013-08-21
CN201310367653.9A CN104423929B (zh) 2013-08-21 2013-08-21 Branch prediction method and related apparatus

Publications (1)

Publication Number Publication Date
WO2015024452A1 true WO2015024452A1 (zh) 2015-02-26

Family

ID=52483061

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/083882 WO2015024452A1 (zh) 2013-08-21 2014-08-07 Branch prediction method and related apparatus

Country Status (2)

Country Link
CN (1) CN104423929B (zh)
WO (1) WO2015024452A1 (zh)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10534609B2 (en) 2017-08-18 2020-01-14 International Business Machines Corporation Code-specific affiliated register prediction
US10558461B2 (en) 2017-08-18 2020-02-11 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10564974B2 (en) 2017-08-18 2020-02-18 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10579385B2 (en) 2017-08-18 2020-03-03 International Business Machines Corporation Prediction of an affiliated register
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10691600B2 (en) 2017-09-19 2020-06-23 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10713051B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing table of contents (TOC)-setting instructions in code with TOC predicting instructions
US10831457B2 (en) 2017-09-19 2020-11-10 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10884748B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US10884930B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10901741B2 (en) 2017-08-18 2021-01-26 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10908911B2 (en) 2017-08-18 2021-02-02 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US11061576B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US11150904B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
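CN114265623A (zh) * 2021-11-29 2022-04-01 中电科申泰信息科技有限公司 Branch predictor for an embedded processor
CN117093272A (zh) * 2023-10-07 2023-11-21 飞腾信息技术有限公司 Instruction sending method and processor
CN117762493A (zh) * 2023-12-27 2024-03-26 江苏华创微系统有限公司 Method and apparatus for supporting masking of illegal addresses by the core of a DSP processor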

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
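WO2016155623A1 (zh) * 2015-03-30 2016-10-06 上海芯豪微电子有限公司 Information system and method based on information pushing
CN106155928A (zh) * 2015-04-13 2016-11-23 上海芯豪微电子有限公司 Memory hierarchy prefetch system and method
CN105867880B (zh) * 2016-04-01 2018-12-04 中国科学院计算技术研究所 Branch target buffer for indirect jump branch prediction and design method
CN108062236A (zh) * 2016-11-07 2018-05-22 杭州华为数字技术有限公司 Software-hardware cooperative branch instruction prediction method and apparatus
CN109308191B (zh) * 2017-07-28 2021-09-14 华为技术有限公司 Branch prediction method and apparatus
CN111176729A (zh) * 2018-11-13 2020-05-19 深圳市中兴微电子技术有限公司 Information processing method, apparatus, and computer-readable storage medium
CN111209044B (zh) * 2018-11-21 2022-11-25 展讯通信(上海)有限公司 Instruction compression method and apparatus
CN111625280B (zh) * 2019-02-27 2023-08-04 上海复旦微电子集团股份有限公司 Instruction control method and apparatus, and readable storage medium
CN110347432B (zh) * 2019-06-17 2021-09-14 海光信息技术股份有限公司 Processor, branch predictor and data processing method thereof, and branch prediction method
CN111638913B (zh) * 2019-09-19 2023-05-12 中国科学院信息工程研究所 Security enhancement method for a processor chip branch predictor based on randomized indexing, and electronic apparatus
CN111026442B (zh) * 2019-12-17 2022-08-02 天津国芯科技有限公司 Method and apparatus for eliminating program unconditional jump overhead in a CPU
CN111258649B (zh) * 2020-01-21 2022-03-01 Oppo广东移动通信有限公司 Processor, chip, and electronic device
CN111538535B (zh) * 2020-04-28 2021-09-21 支付宝(杭州)信息技术有限公司 CPU instruction processing method, controller, and central processing unit
CN112613039B (zh) * 2020-12-10 2022-09-09 成都海光微电子技术有限公司 Performance optimization method and apparatus for the Spectre vulnerability
CN113722243A (zh) * 2021-09-03 2021-11-30 苏州睿芯集成电路科技有限公司 Method for look-ahead prediction of direct jumps, and branch instruction trace cache
CN115480826B (zh) * 2022-09-21 2024-03-12 海光信息技术股份有限公司 Branch predictor, branch prediction method, apparatus, and computing device
CN117389629B (zh) * 2023-11-02 2024-06-04 北京市合芯数字科技有限公司 Branch prediction method, apparatus, electronic device, and medium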

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110055529A1 (en) * 2009-08-28 2011-03-03 Via Technologies, Inc. Efficient branch target address cache entry replacement
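CN102117198A (zh) * 2009-12-31 2011-07-06 上海芯豪微电子有限公司 Branch processing method
CN102662640A (zh) * 2012-04-12 2012-09-12 苏州睿云智芯微电子有限公司 Dual branch target buffer, and branch target processing system and processing method
CN103150142A (zh) * 2011-12-07 2013-06-12 苹果公司 Next fetch predictor training with hysteresis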

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10133874A (ja) * 1996-11-01 1998-05-22 Mitsubishi Electric Corp Branch prediction mechanism for superscalar processors
US20070294518A1 (en) * 2006-06-14 2007-12-20 Shen-Chang Wang System and method for predicting target address of branch instruction utilizing branch target buffer having entry indexed according to program counter value of previous instruction

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10891133B2 (en) 2017-08-18 2021-01-12 International Business Machines Corporation Code-specific affiliated register prediction
US11150908B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10564974B2 (en) 2017-08-18 2020-02-18 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10579385B2 (en) 2017-08-18 2020-03-03 International Business Machines Corporation Prediction of an affiliated register
US11314511B2 (en) 2017-08-18 2022-04-26 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
US11150904B2 (en) 2017-08-18 2021-10-19 International Business Machines Corporation Concurrent prediction of branch addresses and update of register contents
US10929135B2 (en) 2017-08-18 2021-02-23 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US10908911B2 (en) 2017-08-18 2021-02-02 International Business Machines Corporation Predicting and storing a predicted target address in a plurality of selected locations
US10901741B2 (en) 2017-08-18 2021-01-26 International Business Machines Corporation Dynamic fusion of derived value creation and prediction of derived values in a subroutine branch sequence
US10884747B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Prediction of an affiliated register
US10719328B2 (en) 2017-08-18 2020-07-21 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10558461B2 (en) 2017-08-18 2020-02-11 International Business Machines Corporation Determining and predicting derived values used in register-indirect branching
US10754656B2 (en) 2017-08-18 2020-08-25 International Business Machines Corporation Determining and predicting derived values
US10534609B2 (en) 2017-08-18 2020-01-14 International Business Machines Corporation Code-specific affiliated register prediction
US10884746B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Determining and predicting affiliated registers based on dynamic runtime control flow analysis
US10884748B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US10884745B2 (en) 2017-08-18 2021-01-05 International Business Machines Corporation Providing a predicted target address to multiple locations based on detecting an affiliated relationship
US11138127B2 (en) 2017-09-19 2021-10-05 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10963382B2 (en) 2017-09-19 2021-03-30 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US11061575B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US10713050B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing Table of Contents (TOC)-setting instructions in code with TOC predicting instructions
US10896030B2 (en) 2017-09-19 2021-01-19 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10713051B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing table of contents (TOC)-setting instructions in code with TOC predicting instructions
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US10884930B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10949350B2 (en) 2017-09-19 2021-03-16 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US11061576B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US10725918B2 (en) 2017-09-19 2020-07-28 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10691600B2 (en) 2017-09-19 2020-06-23 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10977185B2 (en) 2017-09-19 2021-04-13 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US11010164B2 (en) 2017-09-19 2021-05-18 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US11138113B2 (en) 2017-09-19 2021-10-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10831457B2 (en) 2017-09-19 2020-11-10 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10656946B2 (en) 2017-09-19 2020-05-19 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10884929B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
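CN114265623A (zh) * 2021-11-29 2022-04-01 中电科申泰信息科技有限公司 Branch predictor for an embedded processor
CN117093272B (zh) * 2023-10-07 2024-01-16 飞腾信息技术有限公司 Instruction sending method and processor
CN117093272A (zh) * 2023-10-07 2023-11-21 飞腾信息技术有限公司 Instruction sending method and processor
CN117762493A (zh) * 2023-12-27 2024-03-26 江苏华创微系统有限公司 Method and apparatus for supporting a DSP processor core in masking illegal addresses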

Also Published As

Publication number Publication date
CN104423929A (zh) 2015-03-18
CN104423929B (zh) 2017-07-14

Similar Documents

Publication Publication Date Title
WO2015024452A1 (zh) Branch prediction method and related apparatus
US7437537B2 (en) Methods and apparatus for predicting unaligned memory access
JP5043560B2 (ja) Program execution control device
JP5917616B2 (ja) Method and apparatus for changing the sequential flow of a program using advance notice techniques
EP2628072B1 (en) An instruction sequence buffer to enhance branch prediction efficiency
EP2628076B1 (en) An instruction sequence buffer to store branches having reliably predictable instruction sequences
TWI423123B (zh) Universal branch system for invalidating speculative instructions, and its method, identifier, and computer-readable storage medium
US8069336B2 (en) Transitioning from instruction cache to trace cache on label boundaries
US9367471B2 (en) Fetch width predictor
RU2417407C2 (ru) Methods and apparatus for emulating the branch prediction behavior of an explicit subroutine call
US9529596B2 (en) Method and apparatus for scheduling instructions in a multi-strand out of order processor with instruction synchronization bits and scoreboard bits
JP2008530714A5 (zh)
JP2009536770A (ja) Block-based branch target address cache
JP2013080497A (ja) Sliding-window, block-based branch target address cache
JP7064273B2 (ja) Load/store unit with partitioned reorder queues using a single CAM port
US20220197662A1 (en) Accessing A Branch Target Buffer Based On Branch Instruction Information
US20220197657A1 (en) Segmented branch target buffer based on branch instruction type
CN113568663A (zh) Code prefetch instruction
US20230315453A1 (en) Forward conditional branch event for profile-guided-optimization (pgo)
US20230195456A1 (en) System, apparatus and method for throttling fusion of micro-operations in a processor

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14838748; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 14838748; Country of ref document: EP; Kind code of ref document: A1)