WO2020199058A1 - Processing method for branch instruction, branch predictor and processor - Google Patents

Processing method for branch instruction, branch predictor and processor

Info

Publication number
WO2020199058A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
address
branch
field
target
Prior art date
Application number
PCT/CN2019/080694
Other languages
English (en)
French (fr)
Inventor
杜白
陈运必
杨庆庆
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2019/080694 priority Critical patent/WO2020199058A1/zh
Priority to CN201980093772.5A priority patent/CN113544640A/zh
Publication of WO2020199058A1 publication Critical patent/WO2020199058A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38 Concurrent instruction execution, e.g. pipeline or look ahead

Definitions

  • This application relates to the field of computer technology, in particular to a method for processing branch instructions, a branch predictor and a processor.
  • In order to meet users' increasing demands on processor performance, processors usually use pipeline technology, which overlaps the execution of instructions, to improve efficiency. Normally, after the processor obtains a branch instruction, it predicts the branch instruction at an early stage of the pipeline and continues fetching instructions at the jump address indicated by the prediction result, without waiting for a later stage of the pipeline to return the actual jump result of the branch instruction. This reduces the "bubbles" in the pipeline and improves the processing efficiency of the processor.
  • At present, a two-bit branch prediction (local 2-bit branch prediction, local 2b) algorithm is usually used to predict the jump result of a branch instruction.
  • The local 2b algorithm updates a 2-bit counter according to the historical jump information of the branch instruction and uses the counter value to make the jump prediction.
  • However, when this algorithm is used for jump prediction, the prediction is made only from the historical jump information of the branch instruction itself, and other influencing factors are not taken into account, which leads to low accuracy of the prediction result.
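  • By way of illustration only (the type and function names below are editorial assumptions, not part of the described solution), the local 2b scheme can be modeled as one 2-bit saturating counter per branch: values 0 and 1 predict "not taken", values 2 and 3 predict "taken", and the counter moves one step toward the actual outcome after each resolution.
      /* Minimal 2-bit saturating counter of the kind used by the local 2b scheme. */
      typedef unsigned char ctr2_t;                 /* holds a value in 0..3 */
      static int predict_taken(ctr2_t c) {          /* 00/01 -> not taken, 10/11 -> taken */
          return c >= 2;
      }
      static ctr2_t update_ctr(ctr2_t c, int taken) {
          if (taken)  return (ctr2_t)(c < 3 ? c + 1 : 3);   /* move toward "taken", saturate */
          else        return (ctr2_t)(c > 0 ? c - 1 : 0);   /* move toward "not taken"       */
      }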
  • the present application provides a method for processing branch instructions, a branch predictor and a processor, which improve the accuracy of prediction results.
  • In a first aspect, an embodiment of the present application provides a method for processing branch instructions, and the method may include: searching a branch prediction table for a target branch entry corresponding to a target branch instruction according to a partial address of the target branch instruction, where the target branch entry includes a first field, the first field is used to indicate M pieces of prediction information, and the M pieces of prediction information are predicted based on the historical jump information of the target branch instruction and information that affects the prediction result of the target branch instruction, with M being an integer greater than or equal to 2; and, if the target branch entry is found in the branch prediction table, predicting whether to perform a jump operation on the target branch instruction according to the prediction information indicated by the first field.
  • Because the multiple pieces of prediction information are predicted based on the historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction, whether to perform a jump operation on the target branch instruction can be predicted jointly from the multiple pieces of historical jump information of the target branch instruction and the information that affects its prediction result, thereby improving the accuracy of the prediction result.
  • the information that affects the prediction result of the target branch instruction includes the historical jump information of other branch instructions that affect the jump of the target branch instruction. In this way, the multiple pieces of historical jump information of the target branch instruction and the historical jump information of the other branch instructions that affect its jump jointly predict whether to perform a jump operation on the target branch instruction, thereby improving the accuracy of the prediction result.
  • the target branch entry further includes a second field, and the second field is used to indicate the first address. If the prediction result indicates that the jump operation is performed, the target address of the jump operation is the first address.
  • the target branch entry further includes a fourth field, and the first address is determined according to the fourth field and the second field; the second field is used to indicate a second address, where the second address is the part of the first address that differs from the address of the target branch instruction. This reduces the number of bits occupied by the target branch entry while still providing multiple pieces of prediction information.
  • the fourth field is the number of bits of the second address; or the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address.
  • the second field can be set in the last column of the target branch entry.
  • the M pieces of prediction information include m1 pieces of first prediction information corresponding to a first rule and m2 pieces of second prediction information corresponding to a second rule. In this case the method may further include: obtaining the actual jump result of the target branch instruction; comparing the prediction result corresponding to the first rule and the prediction result corresponding to the second rule with the actual jump result respectively; and determining, according to the comparison result, the prediction information used for prediction, where the prediction information used for prediction is one of the first prediction information and the second prediction information. The prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • Because the M pieces of prediction information correspond to two prediction rules, whether to perform a jump operation on the target branch instruction can also be predicted based on two different prediction rules, so that the branch instruction processing method provided in the embodiments of the present application can be adapted to more prediction scenarios.
  • determining the prediction information used for prediction according to the comparison result may include: if the prediction result corresponding to the first rule is the same as the actual jump result, determining the first prediction information as the prediction information used for prediction; or, if the prediction result corresponding to the second rule is the same as the actual jump result, determining the second prediction information as the prediction information used for prediction.
  • the first prediction information and the second prediction information are updated according to the actual jump result, thereby improving the accuracy of the first prediction information and the second prediction information; when the updated first prediction information and second prediction information are subsequently used to predict whether to perform a jump operation on the target branch instruction, the accuracy of the prediction result can be further improved.
  • an embodiment of the present application provides a branch predictor, and the branch predictor may include:
  • the search unit is used to search a branch prediction table for the target branch entry corresponding to the target branch instruction according to the partial address of the target branch instruction; wherein the target branch entry includes a first field, the first field is used to indicate M pieces of prediction information, and the M pieces of prediction information are predicted based on the historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction; M is an integer greater than or equal to 2;
  • the prediction unit is configured to, if the target branch entry is found in the branch prediction table, predict whether to perform a jump operation on the target branch instruction according to the prediction information indicated by the first field.
  • the information that affects the prediction result of the target branch instruction includes historical jump information of other branch instructions that affect the jump of the target branch instruction.
  • the target branch entry further includes a second field, and the second field is used to indicate the first address. If the prediction result indicates that the jump operation is performed, the target address of the jump operation is the first address.
  • the target branch entry further includes a third field, and the third field is used to indicate branch attributes. The branch predictor may further include a processing unit, configured to, if there are idle bits in the third field, store at least one of the M pieces of prediction information in the idle bits.
  • the target branch entry further includes a fourth field, and the first address is determined according to the fourth field and the second field; the second field is used to indicate a second address, where the second address is the part of the first address that differs from the address of the target branch instruction.
  • the fourth field is the number of bits of the second address; or the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address.
  • the second field is set in the last column of the target branch entry.
  • the M pieces of prediction information include m1 pieces of first prediction information corresponding to the first rule and m2 pieces of second prediction information corresponding to the second rule;
  • the processing unit is also used to obtain the actual jump result of the target branch instruction, respectively compare the prediction result corresponding to the first rule and the prediction result corresponding to the second rule with the actual jump result, and then determine, according to the comparison result, the prediction information used for prediction, where the prediction information used for prediction is one of the first prediction information and the second prediction information. Among them, the prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • the processing unit is specifically configured to determine the first prediction information as the prediction information used for prediction if the prediction result corresponding to the first rule is the same as the actual jump result; or, to determine the second prediction information as the prediction information used for prediction if the prediction result corresponding to the second rule is the same as the actual jump result.
  • the first prediction information and the second prediction information are updated according to the actual jump result.
  • the first rule is the global enhanced (GL) algorithm and the second rule is the two-bit branch prediction (local2b) algorithm; or, the first rule is the local2b algorithm and the second rule is the GL algorithm.
  • an embodiment of the present application also provides a branch predictor, which may include a prediction logic circuit and a memory;
  • a memory for storing a branch prediction table, where the branch prediction table includes multiple branch entries
  • the prediction logic circuit is used to find the target branch entry corresponding to the target branch instruction among multiple branch entries according to the partial address of the target branch instruction; wherein the target branch entry includes a first field, and the first field is used to indicate M predictions Information, the M prediction information is predicted based on the historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction; if the target branch entry is found in the branch prediction table, the prediction is based on the prediction indicated by the first field Information, predict whether to perform a jump operation on the target branch instruction; where M is an integer greater than or equal to 2.
  • the information that affects the prediction result of the target branch instruction includes historical jump information of other branch instructions that affect the jump of the target branch instruction.
  • the target branch entry further includes a second field.
  • the second field is used to indicate the first address. If the prediction result indicates that a jump operation is performed, the target address of the jump operation is the first address.
  • the target branch entry further includes a third field, the third field is used to indicate branch attributes, and the branch predictor further includes an update logic circuit;
  • the update logic circuit is used for storing at least one prediction information of the M pieces of prediction information in the idle bit if there is an idle bit in the third field.
  • the target branch entry further includes a fourth field, and the first address is determined according to the fourth field and the second field; the second field is used to indicate a second address, where the second address is the part of the first address that differs from the address of the target branch instruction.
  • the fourth field is the number of bits of the second address; or the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address.
  • the second field is set in the last column of the target branch entry.
  • the M pieces of prediction information include m1 pieces of first prediction information corresponding to the first rule and m2 pieces of second prediction information corresponding to the second rule;
  • the prediction logic circuit is also used to compare the prediction result corresponding to the first rule and the prediction result corresponding to the second rule with the actual jump result respectively, and to determine, according to the comparison result, the prediction information used for prediction, where the prediction information used for prediction is one of the first prediction information and the second prediction information; the prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • the prediction logic circuit is specifically configured to determine the first prediction information as the prediction information used for prediction if the prediction result corresponding to the first rule is the same as the actual jump result; or, to determine the second prediction information as the prediction information used for prediction if the prediction result corresponding to the second rule is the same as the actual jump result.
  • the update logic circuit is also used to update the first prediction information and the second prediction information according to the actual jump result.
  • the first rule is the global enhanced (GL) algorithm and the second rule is the two-bit branch prediction (local2b) algorithm; or, the first rule is the local2b algorithm and the second rule is the GL algorithm.
  • an embodiment of the present application further provides a processor, which may include: a processing circuit and the branch predictor described in any possible implementation manner of the second aspect or the third aspect;
  • the branch predictor is used to predict whether to perform a jump operation on the target branch instruction, and if the prediction result indicates that the jump operation is performed, the processing circuit uses the first address in the target branch entry in the branch predictor as the target address of the jump operation, Perform a jump operation on the target branch instruction.
  • the embodiments of the present application also provide a readable storage medium, on which a computer program is stored; when the computer program is executed, it is used to execute any one of the possible implementations of the first aspect above The processing method of the branch instruction.
  • an embodiment of the present application also provides a chip on which a computer program is stored. When the computer program is executed by a processor, it is used to execute the method for processing branch instructions described in any one of the possible implementations of the first aspect.
  • With the branch instruction processing method, branch predictor, and processor provided by the embodiments of the present application, when predicting whether to perform a jump operation on a target branch instruction, the target branch entry corresponding to the target branch instruction is first looked up in the branch prediction table according to the partial address of the target branch instruction. If the target branch entry is found in the branch prediction table, then, because the target branch entry includes a first field and the first field indicates multiple pieces of prediction information that are predicted based on the historical jump information of the target branch instruction and the information that affects its prediction result, the multiple pieces of historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction jointly predict whether to perform a jump operation on the target branch instruction, thereby improving the accuracy of the prediction result.
  • FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a method for processing branch instructions provided by an embodiment of the application
  • FIG. 3 is a schematic structural diagram of a branch entry provided by an embodiment of this application.
  • FIG. 4 is a schematic structural diagram of another branch entry provided by an embodiment of this application.
  • FIG. 5 is a schematic diagram of processing an instruction provided by an embodiment of this application.
  • FIG. 6 is a schematic structural diagram of yet another branch entry provided by an embodiment of this application.
  • Figure 7 is a schematic structural diagram of a branch entry provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of the address and the first address of a target branch instruction provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of another branch entry provided by an embodiment of this application.
  • FIG. 10 is a schematic flowchart of another method for processing branch instructions provided by an embodiment of the application.
  • FIG. 11 is a schematic diagram of a selection prediction rule provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of a prediction rule switching provided by an embodiment of this application.
  • FIG. 13 is a schematic structural diagram of a branch predictor provided by an embodiment of this application.
  • FIG. 14 is a schematic structural diagram of another branch predictor provided by an embodiment of the application.
  • FIG. 15 is a schematic structural diagram of a processor provided by an embodiment of the application.
  • FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the application.
  • the system architecture may include an instruction memory, an instruction fetcher, and a branch predictor.
  • the instruction memory is used to store multiple instructions.
  • the instruction fetcher fetches instructions from the instruction memory and sends the instruction address to the branch predictor. If the branch predictor finds the instruction in the branch prediction table according to the instruction address, it determines that the instruction is a branch instruction; when predicting whether the branch instruction jumps, the prediction can be made based on the historical jump information of the branch instruction stored in the branch predictor, and the prediction result is sent to the selector. The prediction result can be information such as the jump address and whether to jump.
  • If the prediction result indicates that a jump operation is performed, the jump address corresponding to the jump operation is sent to the selector.
  • the selector determines, between the jump address and the instruction address sent by the fetcher, the address used to query the next instruction, and sends that address to the instruction memory, so that the instruction memory can find the corresponding instruction according to the address and send the instruction to the fetcher (for example, the instruction address sent by the fetcher to the selector can be the instruction address sent by the fetcher to the branch predictor plus 1, where plus 1 denotes the address of the next sequential instruction).
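  • The flow described above can be illustrated by the following self-contained toy in C (the table layout, row count, and names are editorial assumptions, not part of the described solution): the fetcher asks a tiny branch prediction table for the next fetch address and falls back to the sequential address on a miss.
      #include <stdint.h>
      #include <stdio.h>
      #define ROWS 4
      /* One toy branch-prediction-table row: tag, predicted target, 2-bit taken counter. */
      struct row { uint32_t tag; uint32_t target; unsigned ctr; int valid; };
      static struct row table[ROWS];
      /* Return the address the fetcher should use next for a given instruction address. */
      static uint32_t next_fetch(uint32_t pc)
      {
          struct row *r = &table[pc % ROWS];                   /* low-order bits pick the row    */
          if (r->valid && r->tag == pc / ROWS && r->ctr >= 2)
              return r->target;                                /* known branch, predicted taken  */
          return pc + 1;                                       /* otherwise the next instruction */
      }
      int main(void)
      {
          table[10 % ROWS] = (struct row){ .tag = 10 / ROWS, .target = 40, .ctr = 3, .valid = 1 };
          printf("%u\n", (unsigned)next_fetch(10));            /* prints 40: entry hit, jump     */
          printf("%u\n", (unsigned)next_fetch(11));            /* prints 12: miss, fall through  */
          return 0;
      }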
  • an embodiment of the application provides a branch instruction prediction method.
  • branch instructions can be divided into four categories according to the first dimension (direct target jump or indirect target jump) and the second dimension (non-conditional jump or conditional jump).
  • Among them, the jump address of a direct target jump is fixed, while the target of an indirect target jump can be a variable, such as the value of a register, and the value of this register can change. A branch instruction with an unconditional jump will definitely jump, whereas a conditional jump generally determines whether to perform the jump operation based on the value of a certain variable; for example, when the value of the variable is 1, it is determined that the jump operation is performed, and when the value of the variable is 0, it is determined that the jump operation is not performed.
  • The four types of branch instructions obtained from these two dimensions are as follows. The first type of branch instruction is a direct-target, unconditional-jump branch instruction: its jump address is fixed and a jump operation is bound to be performed, so it requires no jump prediction. The second type of branch instruction is a direct-target, conditional-jump branch instruction: its jump address is fixed, but whether to perform a jump operation needs to be predicted. The third type of branch instruction is an indirect-target, unconditional-jump branch instruction: its jump address needs to be predicted, and the jump operation will inevitably be executed, so only the jump address needs to be predicted. The fourth type of branch instruction is an indirect-target, conditional-jump branch instruction: both the jump address and whether to perform a jump operation need to be predicted.
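  • Expressed as illustrative flags in C (the names are editorial assumptions), the classification above along the two dimensions is:
      enum target_kind    { DIRECT_TARGET, INDIRECT_TARGET };   /* first dimension  */
      enum condition_kind { UNCONDITIONAL, CONDITIONAL };        /* second dimension */
      /* type 1: DIRECT_TARGET   + UNCONDITIONAL -> nothing needs prediction          */
      /* type 2: DIRECT_TARGET   + CONDITIONAL   -> predict taken/not-taken only      */
      /* type 3: INDIRECT_TARGET + UNCONDITIONAL -> predict the target address only   */
      /* type 4: INDIRECT_TARGET + CONDITIONAL   -> predict both target and direction */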
  • the prediction information of a branch instruction may include information used to predict whether the branch instruction will perform a jump operation, or it may include the historical prediction results of the branch instruction; for example, the historical jump information stored by the local2b algorithm and by gshare-like algorithms can be used as the prediction information of the branch instruction.
  • FIG. 2 is a schematic flowchart of a method for processing branch instructions provided by an embodiment of the application.
  • the method for processing branch instructions may include:
  • the target branch entry includes a first field, the first field is used to indicate M pieces of prediction information, and the M pieces of prediction information are predicted based on the historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction; M is an integer greater than or equal to 2, that is, the first field is used to indicate at least two pieces of prediction information.
  • the information that affects the prediction result of the target branch instruction may be historical jump information of other branch instructions that affect the jump of the target branch instruction, or other information, which can be specifically set according to actual needs.
  • FIG. 3 is a schematic structural diagram of a branch entry provided in an embodiment of this application.
  • the branch entry may include the branch address of the branch instruction, where the branch address may be understood as the high-order address of the branch instruction; the branch entry may also include the first field used to indicate the M pieces of prediction information of the branch instruction and, of course, the instruction type.
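  • Purely as an illustration of such an entry (the field names and widths below are editorial assumptions, not part of the described solution), a branch entry of the kind shown in FIG. 3 could be laid out as:
      #include <stdint.h>
      #define M 9                              /* assumed number of prediction-information items */
      struct branch_entry {                    /* illustrative layout only                       */
          uint32_t branch_addr_hi : 20;        /* high-order (tag) part of the branch address    */
          uint32_t instr_type     : 2;         /* one of the four branch-instruction types       */
          uint8_t  pred_info[M];               /* first field: M pieces of prediction info,      */
      };                                       /* each stored here as one small counter          */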
  • the address of the target branch instruction is usually not stored completely in the branch prediction table.
  • Taking a 32-bit address of the target branch instruction as an example, several bits can be selected from it; for example, 10 low-order bits are taken to determine its position in the branch prediction table, such as which row it occupies.
  • Then 20 of the remaining 22 bits can be used for comparison to determine whether the corresponding location of the branch prediction table stores the target instruction.
  • In this case the remaining 2 bits may be treated as invalid; alternatively, there may be no invalid bits and all of the remaining 22 bits are used for comparison to determine whether the corresponding location of the branch prediction table stores the target instruction.
  • Therefore, the partial address of the target branch instruction can be understood as the low-order address of the target branch instruction, that is, the 10-bit address in this example.
  • It can also be a low-order address of more than 10 bits, as long as the row of the target branch instruction in the branch prediction table can be determined according to the partial address.
  • It should be noted that the embodiment of the present application only takes a 32-bit address of the target branch instruction as an example for description, which does not mean that the embodiment of the present application is limited thereto.
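  • A minimal sketch of the 32-bit example above in C (the 10-bit/20-bit split follows the worked example; which 20 of the remaining 22 bits are kept, and the function names, are editorial assumptions):
      #include <stdint.h>
      /* Split a 32-bit instruction address into a row index and a comparison tag. */
      static uint32_t table_index(uint32_t addr) { return addr & 0x3FFu; }            /* low 10 bits  */
      static uint32_t table_tag(uint32_t addr)   { return (addr >> 10) & 0xFFFFFu; }  /* next 20 bits */
      /* An entry matches when the tag stored at row table_index(addr) equals table_tag(addr). */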
  • the target branch entry corresponding to the target branch instruction can be searched in the branch prediction table according to the partial address of the target branch instruction.
  • In the initial state, the branch prediction table is empty, that is, no branch entry is stored in the branch prediction table.
  • Each time the processor processes a branch instruction, it stores the relevant information of that branch instruction, in the form of a branch entry, in the branch prediction table in the branch predictor; the branch prediction table is stored in the memory of the branch predictor. In this way, the branch prediction table stores at least one branch entry, each corresponding to a branch instruction.
  • The at least one branch entry in the branch prediction table may or may not include the target branch entry corresponding to the target branch instruction. Therefore, the target branch entry corresponding to the target branch instruction is searched for in the branch prediction table according to the partial address of the target branch instruction, so that the corresponding prediction result is determined according to the search result.
  • the processor in this application may include a branch predictor and a processing circuit. Among them, the branch predictor is mainly used to predict whether to perform a jump operation on the target branch instruction. If the prediction result indicates that the jump operation is performed, the processing circuit uses the first address in the target branch entry in the branch predictor as the target of the jump operation Address, perform a jump operation on the target branch instruction.
  • the prediction result can be information such as the jump address or whether to jump.
  • If the target branch entry is found in the branch prediction table, whether to perform a jump operation on the target branch instruction can be predicted according to the prediction information indicated by the first field in the target branch entry. For example, it is possible to predict whether to perform a jump operation on the target branch instruction according to N pieces of prediction information among the M pieces of prediction information indicated by the first field. It is understandable that when N is less than M, the prediction is made according to part of the M pieces of prediction information; when N is equal to M, the prediction is made according to all of the M pieces of prediction information. Predicting in this way improves the accuracy of the prediction result. It can be understood that the number of pieces of prediction information used when predicting whether to perform a jump operation on the target branch instruction can be set according to actual needs, and the embodiment of the present application does not make further restrictions here.
  • For example, when predicting whether to perform a jump operation on a first branch instruction, the prediction information not only includes the historical jump information of the first branch instruction, but may also take into account information on other factors that affect the first branch instruction, for example, the historical jump information of other branch instructions (such as a second branch instruction, a third branch instruction, and a fourth branch instruction).
  • For example, the M pieces of prediction information may include 8 pieces of prediction information, so that whether to perform a jump operation on the first branch instruction can be predicted according to at least two of the 8 pieces of prediction information.
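  • One way such prediction information could be organized (a sketch only; the application does not fix a concrete encoding, and all names below are editorial assumptions) is to keep, for example, 8 two-bit counters for the branch and let the recent outcomes of other branches, held in a small global history register, select which counter is consulted and updated:
      #include <stdint.h>
      #define GL_COUNTERS 8                     /* 8 pieces of prediction info, chosen by 3 history bits */
      static uint8_t global_hist;               /* recent outcomes of other branches, one bit each       */
      static int gl_predict(const uint8_t ctrs[GL_COUNTERS])
      {
          return ctrs[global_hist & (GL_COUNTERS - 1)] >= 2;      /* selected 2-bit counter: >=2 = taken */
      }
      static void gl_update(uint8_t ctrs[GL_COUNTERS], int taken)
      {
          uint8_t *c = &ctrs[global_hist & (GL_COUNTERS - 1)];
          if (taken && *c < 3) (*c)++;                            /* strengthen or weaken the counter    */
          if (!taken && *c > 0) (*c)--;
          global_hist = (uint8_t)((global_hist << 1) | (taken ? 1u : 0u));   /* record this outcome      */
      }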
  • After predicting whether to perform a jump operation on the target branch instruction, a prediction result is obtained.
  • The prediction result can be represented by two bits, for example 00, 01, 10, and 11. Normally, the prediction results 00 and 01 indicate that no jump operation is performed, and the prediction results 10 and 11 indicate that the jump operation is performed. Therefore, after the prediction result is obtained, whether to perform the jump operation can be determined according to the prediction result.
  • If the prediction result indicates that the jump operation is performed, the first address is used as the target address of the jump operation.
  • That is, the branch predictor instructs the processing circuit in the processor to use the first address as the target address of the jump operation and to perform the jump operation on the target branch instruction, causing the jump to the first address; on the contrary, if the prediction result indicates that the jump operation is not performed, no jump operation is performed on the target branch instruction.
  • the target branch entry further includes a second field, and the second field is used to indicate the first address.
  • the first address can be understood as a complete jump address or a partial jump address, but the complete jump address can be determined according to the partial jump address.
  • FIG. 4 is a schematic structural diagram of another branch entry provided in an embodiment of this application.
  • In this way, if the prediction result indicates that the jump operation is performed, the jump operation can be performed on the target branch instruction according to the first address indicated by the second field included in the target branch entry, so that the jump to the first address is executed and branch instructions continue to be fetched at that address for prediction until all instructions are executed.
  • It can be seen that, when predicting whether to perform a jump operation on the target branch instruction, the target branch entry corresponding to the target branch instruction is first looked up in the branch prediction table according to the partial address of the target branch instruction. If the target branch entry is found in the branch prediction table, then, because the target branch entry includes the first field and the first field indicates multiple pieces of prediction information predicted based on the historical jump information of the target branch instruction and the information that affects its prediction result, whether to perform a jump operation on the target branch instruction can be predicted based on the multiple pieces of historical jump information of the target branch instruction and the historical jump information of other branch instructions that affect its jump, thereby improving the accuracy of the prediction result.
  • FIG. 5 is a schematic diagram of processing an instruction provided by an embodiment of this application.
  • instruction A, instruction B, instruction C, instruction D, and instruction E are sequentially stored in the instruction memory.
  • the instruction fetcher fetches instruction A from the instruction memory and sends the address of instruction A to the branch predictor.
  • Because a branch entry stores the address of the corresponding branch instruction, the branch predictor can look up the branch prediction table according to the address of instruction A. If the corresponding branch entry is found in the branch prediction table, it is determined that instruction A is a branch instruction.
  • Then, the multiple pieces of historical jump information of instruction A stored in the branch entry corresponding to instruction A and the historical jump information of other branch instructions that affect the jump of instruction A jointly predict whether to perform a jump operation on instruction A. If the prediction result indicates that the jump operation is performed, the first address is obtained from the branch entry corresponding to instruction A, and the processing circuit in the processor is instructed to perform the jump operation on instruction A.
  • The instruction fetcher then sends the address of instruction D (the predicted jump target) to the instruction memory, the instruction memory sends instruction D to the processor according to the address of instruction D, and the processor processes instruction D. It can be seen that, when predicting whether to perform a jump operation on instruction A, the prediction is made based on the multiple pieces of historical jump information corresponding to instruction A and the historical jump information of other branch instructions that affect the jump of instruction A, thereby improving the accuracy of the prediction result.
  • the embodiment shown in FIG. 2 describes in detail that when predicting whether to perform a jump operation on a target branch instruction, whether to perform a jump operation on the target branch instruction can be predicted based on multiple prediction information, thereby improving the accuracy of the prediction result.
  • However, because the target branch entry includes multiple pieces of prediction information, the length of the target branch entry increases, that is, the number of bits occupied by the target branch entry increases. In the embodiments of the present application, the number of bits occupied by the target branch entry may be reduced through at least two possible implementation manners.
  • In one possible implementation, when the target branch entry includes a third field for indicating branch attributes and there are idle bits in the third field, at least one piece of prediction information can be stored by reusing the idle bits, thereby reducing the number of bits occupied by the target branch entry. For an example, refer to FIG. 6, which is a schematic structural diagram of yet another branch entry provided in an embodiment of this application.
  • In another possible implementation, since the address of the target branch instruction and the first address are relatively close, they share many identical bits. Therefore, the number of bits used to store the first address can be compressed to reduce the number of bits occupied by the target branch entry. Below, these two possible implementations are described in detail.
  • In the first possible implementation, at least one piece of prediction information is stored by reusing idle bits, which reduces the number of bits occupied by the target branch entry.
  • The specific solution is as follows: since the target branch entry also includes a third field for indicating branch attributes, if there are idle bits in the third field, at least one of the M pieces of prediction information can be stored in the idle bits. By reusing the idle bits in the third field to store at least one piece of prediction information, there is no need to increase the number of bits of the target branch entry, thereby reducing the number of bits occupied by the target branch entry.
  • Besides indicating the type of branch instruction, the branch attributes can also indicate, for example, whether the branch instruction is for one-time use or whether it should be replaced first.
  • Each attribute occupies a certain number of bits.
  • the relevant information in the branch entry is read each time, and then the usable branch attributes and the unusable branch attributes are determined according to the type of the branch instruction, and the fields of the unusable branch attributes are free. For example, in each branch entry, there are 10 branch attribute bit fields.
  • Assume the types of branch instructions are divided into three types: A, B, and C.
  • A type A branch instruction requires all 10 branch attribute bit fields, a type B branch instruction requires 8 branch attribute bit fields, and a type C branch instruction requires 5 branch attribute bit fields.
  • Therefore, a type B branch instruction frees up 2 idle branch attribute bit fields, and these 2 idle bit fields can be used to store at least one piece of prediction information; similarly, a type C branch instruction frees up 5 idle branch attribute bit fields, and these 5 idle bit fields can be used to store at least one piece of prediction information, thereby reducing the number of bits occupied by the target branch entry.
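  • A sketch of this reuse in C (the widths 10/8/5 come from the example above; the packing itself and all names are editorial assumptions): spare branch-attribute bits of a type B or type C entry hold extra prediction bits.
      #include <stdint.h>
      #define ATTR_FIELD_BITS 10                        /* every entry reserves 10 attribute bits */
      static unsigned attr_bits_used(char type)         /* bits actually needed per type          */
      {
          switch (type) {
          case 'A': return 10;                          /* no idle bits                           */
          case 'B': return 8;                           /* 2 idle bits for prediction info        */
          case 'C': return 5;                           /* 5 idle bits for prediction info        */
          default:  return ATTR_FIELD_BITS;
          }
      }
      /* Store prediction bits into the idle (unused) part of the attribute field. */
      static uint16_t pack_idle_bits(uint16_t attr_field, char type, uint16_t pred_bits)
      {
          unsigned used = attr_bits_used(type);
          uint16_t mask = (uint16_t)(((1u << (ATTR_FIELD_BITS - used)) - 1u) << used);
          return (uint16_t)((attr_field & ~mask) | ((uint16_t)(pred_bits << used) & mask));
      }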
  • In the second possible implementation, the number of bits used to store the first address is compressed to reduce the number of bits occupied by the target branch entry.
  • The specific solution is as follows: the target branch entry also includes a fourth field, and the first address is determined according to the fourth field and the second field; the second field is used to indicate a second address, where the second address is the part of the first address that differs from the address of the target branch instruction, which reduces the number of bits occupied by the target branch entry.
  • For an example, refer to FIG. 7, which is a schematic structural diagram of a branch entry provided in an embodiment of this application.
  • The fourth field is the number of bits of the second address; or the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address.
  • Because the address of the target branch instruction is relatively close to the first address, the two addresses share many identical bits. Therefore, the number of bits used to store the first address can be compressed to reduce the number of bits occupied by the target branch entry.
  • For example, see FIG. 8, which is a schematic diagram of the address of a target branch instruction and the first address provided by an embodiment of the application. It can be seen that many bit positions of the address of the target branch instruction and of the first address are the same. Therefore, in order to save the number of bits occupied by the first address, only the second address can be indicated through the second field, where the second address is the part of the first address that differs from the address of the target branch instruction.
  • When the second address is indicated through the second field, since the number of differing bits between the branch instruction address and the target address changes dynamically, the target branch entry also needs to include a fourth field so that the second address can be read out correctly; that is, the first address can be determined according to the fourth field and the second address indicated by the second field, thereby determining the jump address of the target branch instruction.
  • The fourth field may directly be the number of bits of the second address; or the fourth field may be the number of bits in which the address of the target branch instruction is the same as the first address; or the fourth field may be the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address, so that the first address can be determined according to the fourth field and the second address indicated by the second field.
  • For example, assuming that the address of the target branch instruction and the first address are both 32 bits and that the address of the target branch instruction shares the same 20-bit address with the first address, the address of the target branch instruction differs from the first address in 12 bits, and these 12 differing bits are the second address.
  • In this case, the fourth field can directly be the number of bits of the second address, that is, the fourth field can be 12, and the first address can be determined according to the fourth field value 12 and the second address.
  • Alternatively, the fourth field may be the number of bits in which the address of the target branch instruction is the same as the first address, that is, the fourth field may be 20, and the first address can be determined according to the fourth field value 20 and the second address.
  • Alternatively, the fourth field may be the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address, that is, the fourth field may be 44 (the sum of 32 bits and 12 bits), and the first address can be determined according to the fourth field value 44 and the second address.
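  • A sketch of this compression in C for the 32-bit example above (20 shared high-order bits, 12 differing low-order bits; assuming, as in FIG. 8, that the differing bits are the low-order bits and taking the fourth field as the number of differing bits; names are editorial assumptions):
      #include <stdint.h>
      struct compressed_target {
          uint16_t second_field;                /* the 12 bits of the first address that differ  */
          uint8_t  fourth_field;                /* here: the number of differing (stored) bits   */
      };
      static struct compressed_target compress(uint32_t first_addr)
      {
          struct compressed_target c = { (uint16_t)(first_addr & 0xFFFu), 12 };  /* low 12 bits  */
          return c;                             /* the 20 shared high-order bits are not stored  */
      }
      static uint32_t first_address(uint32_t branch_addr, struct compressed_target c)
      {
          uint32_t mask = (1u << c.fourth_field) - 1u;            /* low c.fourth_field bits      */
          return (branch_addr & ~mask) | (c.second_field & mask); /* shared bits + stored bits    */
      }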
  • In addition, in order to reduce the number of bits occupied by the target branch entry, the second field can be set in the last column of the target branch entry.
  • When the technical solution provided by the embodiment of the application is applied to the first type of branch instruction, the structure of the above branch entry can be used directly to store the jump address, and the prediction information in the branch entry is not used.
  • When the technical solution provided by the embodiment of the application is applied to the third type of branch instruction, because the third type of branch instruction only needs to predict the jump address, the jump address is taken by default to be the same as the last time, and the prediction information in the branch entry is not used.
  • When the technical solution provided by the embodiment of the present application is applied to the fourth type of branch instruction, the fourth type of branch instruction needs to predict both the jump address and whether to perform a jump operation; the prediction of whether to perform a jump operation is similar to that of the second type of branch instruction, and for details, please refer to the prediction method of the second type of branch instruction, which is not repeated in this embodiment of the present application.
  • It is understandable that the embodiments of the present application only take these four types of branch instructions as examples for description, which does not mean that the embodiments of the present application are limited thereto.
  • FIG. 9 is a schematic structural diagram of another branch entry provided in an embodiment of this application. It is understandable that the embodiment of this application only takes the prediction information corresponding to the two rules in the M pieces of prediction information as an example for description, and the prediction information corresponding to the three rules may also be included, which can be specifically set according to actual needs.
  • the number of rules to which the prediction information included in the M pieces of prediction information belongs is not further limited in this embodiment of the application.
  • For example, the first rule is the global enhanced (GL) algorithm and the second rule is the two-bit branch prediction (local2b) algorithm; or, the first rule is the local2b algorithm and the second rule is the GL algorithm.
  • Of course, other algorithms can also be used, which can be set according to actual needs; the embodiment of the present application does not make further restrictions here.
  • For example, the M pieces of prediction information corresponding to the target branch instruction may include 1 piece of prediction information corresponding to the local2b algorithm and 8 pieces of prediction information corresponding to the GL algorithm.
  • The prediction information corresponding to the GL algorithm includes not only the historical jump information of the target branch instruction but also information on other factors that affect the target branch instruction, for example, the historical jump information of other branch instructions (such as the second branch instruction, the third branch instruction, and the first branch instruction).
  • That is, the M pieces of prediction information can include 9 pieces of prediction information (1 piece of prediction information corresponding to the local2b algorithm and 8 pieces of prediction information corresponding to the GL algorithm), and whether to perform a jump operation on the target branch instruction can be predicted according to at least two of the 9 pieces of prediction information.
  • Figure 10 is a schematic flowchart of another branch instruction processing method provided by an embodiment of the application.
  • On the basis of the embodiment shown in FIG. 2, the branch instruction processing method may also include the following.
  • When the processor processes the target branch instruction, a certain interval of time is needed before the actual jump result of the target branch instruction is obtained. Therefore, the jump result of the target branch instruction is first predicted using the prediction information corresponding to the first rule and the prediction information corresponding to the second rule, and the prediction result corresponding to the first rule and the prediction result corresponding to the second rule are obtained.
  • After the actual jump result of the target branch instruction is obtained, the following S1002 can be executed: respectively compare the prediction result corresponding to the first rule and the prediction result corresponding to the second rule with the actual jump result, and determine the prediction information used for prediction according to the comparison result.
  • Among them, the prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • the prediction information used in the prediction is one of the first prediction information and the second prediction information, and the prediction information used in the prediction can be understood as the prediction information used in the next prediction.
  • After the prediction result corresponding to the first rule, the prediction result corresponding to the second rule, and the actual jump result are obtained, the prediction result corresponding to the first rule and the prediction result corresponding to the second rule are respectively compared with the actual jump result. If the prediction result corresponding to the first rule is the same as the actual jump result, the first prediction information is determined as the prediction information used for prediction, that is, the first prediction information is used next time to predict whether to perform a jump operation on the target branch instruction; on the contrary, if the prediction result corresponding to the second rule is the same as the actual jump result, the second prediction information is determined as the prediction information used for prediction. Here, the prediction information used for prediction can be understood as the prediction information used for the next prediction.
  • the selection counter may be updated according to the comparison result first, and then the prediction information used for prediction is determined according to the value of the selection counter.
  • FIG. 11 is a schematic diagram of a selection prediction rule provided by an embodiment of this application.
  • Take the first rule and the second rule as the local2b algorithm and the GL algorithm respectively, where the GL algorithm is the prediction algorithm provided in this embodiment of the application. The selection counter is represented by a 2-bit value, namely 00, 01, 10, and 11, where 00 and 01 indicate that the local2b algorithm is used, and 10 and 11 indicate that the GL algorithm is used.
  • the counting rule for the selection counter can be: when the local2b algorithm is used to predict whether to perform a jump operation on the target branch instruction, if the prediction result is correct, the value of the selection counter is decreased by 1, and if the prediction result is wrong, the value of the selection counter is increased by 1; When the GL algorithm is used to predict whether to perform a jump operation on the target branch instruction, if the prediction result is correct, the value of the selection counter is increased by 1, and if the prediction result is wrong, the value of the selection counter is decreased by 1.
  • For example, see FIG. 12, which is a schematic diagram of prediction rule switching provided by an embodiment of the application. If the prediction result of the local2b algorithm is the same as the actual jump result, it means that the local2b algorithm predicted correctly, and the value of the selection counter is decreased by 1; since the value of the selection counter is already 00, the value of the selection counter remains unchanged, that is, it is still 00. If the prediction result of the local2b algorithm is not the same as the actual jump result, it means that the local2b algorithm predicted incorrectly, the value of the selection counter is increased by 1, and the value of the selection counter changes from 00 to 01; 01 still represents the local2b algorithm, so the local2b algorithm remains in use.
  • If the prediction result of the local2b algorithm is again not the same as the actual jump result, the local2b algorithm predicted incorrectly once more, the value of the selection counter is increased by 1, and the value of the selection counter changes from 01 to 10; 10 represents the GL algorithm, so at this point the predictor switches to the GL algorithm and subsequently uses the GL algorithm for prediction. If the prediction result of the GL algorithm is then the same as the actual jump result, it means that the GL algorithm predicted correctly, the value of the selection counter is increased by 1, and the value of the selection counter changes from 10 to 11; 11 represents the GL algorithm, so the GL algorithm remains in use. In this way, the prediction information used for prediction is determined according to the value of the selection counter.
  • It should be noted that the embodiment of this application only takes 00 and 01 indicating use of the local2b algorithm and 10 and 11 indicating use of the GL algorithm as an example; 00 and 01 may also indicate use of the GL algorithm while 10 and 11 indicate use of the local2b algorithm, which can be set according to actual needs.
  • The embodiment of the present application does not make further restrictions here.
  • In the latter case, the corresponding counting rule of the selection counter can be: when the GL algorithm is used to predict whether to perform a jump operation on the target branch instruction, if the prediction result is correct, the value of the selection counter is decreased by 1, and if the prediction result is wrong, the value of the selection counter is increased by 1; when the local2b algorithm is used to predict whether to perform a jump operation on the target branch instruction, if the prediction result is correct, the value of the selection counter is increased by 1, and if the prediction result is wrong, the value of the selection counter is decreased by 1.
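  • The 2-bit selection counter described above can be sketched in C as follows (matching the first encoding in the text, where 00 and 01 select the local2b algorithm and 10 and 11 select the GL algorithm; the function names are editorial assumptions):
      static unsigned sel_ctr = 0;                    /* 2-bit value: 0..1 -> local2b, 2..3 -> GL */
      static int use_gl(void) { return sel_ctr >= 2; }
      /* Update the selector from the outcome of the algorithm that was actually used. */
      static void update_selector(int used_gl, int prediction_was_correct)
      {
          int toward_gl = used_gl ? prediction_was_correct : !prediction_was_correct;
          if (toward_gl && sel_ctr < 3) sel_ctr++;    /* GL right, or local2b wrong -> +1 */
          if (!toward_gl && sel_ctr > 0) sel_ctr--;   /* GL wrong, or local2b right -> -1 */
      }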
  • the first prediction information and the second prediction information can also be updated according to the actual jump result, thereby improving the accuracy of the first prediction information and the second prediction information. Then, by predicting whether to perform a jump operation on the target branch instruction through the updated first prediction information and the second prediction information, the accuracy of the prediction result can also be improved.
  • FIG. 13 is a schematic structural diagram of a branch predictor 130 provided by an embodiment of this application.
  • the branch predictor 130 may include:
  • the search unit 1301 is configured to search for the target branch entry corresponding to the target branch instruction in the branch prediction table according to the partial address of the target branch instruction; wherein, the target branch entry includes a first field, and the first field is used to indicate M prediction information, The M pieces of prediction information are predicted based on the historical jump information of the target branch instruction and the information affecting the prediction result of the target branch instruction; M is an integer greater than or equal to 2.
  • the prediction unit 1302 is configured to, if the target branch entry is found in the branch prediction table, predict whether to perform a jump operation on the target branch instruction according to the prediction information indicated by the first field.
  • the information that affects the prediction result of the target branch instruction includes historical jump information of other branch instructions that affect the jump of the target branch instruction.
  • the target branch entry further includes a second field, and the second field is used to indicate the first address. If the prediction result indicates that the jump operation is performed, the target address of the jump operation is the first address.
  • the target branch entry further includes a third field, and the third field is used to indicate branch attributes.
  • the branch predictor 130 may further include a processing unit 1303, configured to, if there are idle bits in the third field, store at least one of the M pieces of prediction information in the idle bits.
  • the target branch entry further includes a fourth field, the fourth field is used to indicate that the first address is determined according to the fourth field and the second field; the second field is used to indicate the second address, and the second address is in the first address An address different from the address of the target branch instruction.
  • Optionally, the fourth field is the number of bits of the second address; or, the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or, the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address. A sketch of this compression is given below.
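As a minimal sketch of the compression implied by the second and fourth fields: only the low-order bits of the target that differ from the branch instruction's address are stored, together with their count, and the full first address is rebuilt at prediction time. The sketch assumes 32-bit addresses and takes the fourth field to be the number of bits of the second address (the first of the three encodings above); function names are illustrative.

```c
#include <stdint.h>

/* Rebuild the first address (jump target) from the branch address, the stored
 * second address and the fourth field, here taken as the number of low-order
 * bits in which the two addresses differ. */
uint32_t rebuild_first_address(uint32_t branch_pc, uint32_t second_addr,
                               unsigned diff_bits /* fourth field */) {
    if (diff_bits == 0)
        return branch_pc;                       /* addresses identical          */
    if (diff_bits >= 32)
        return second_addr;                     /* nothing shared: full target  */
    uint32_t low_mask = (1u << diff_bits) - 1u;
    return (branch_pc & ~low_mask) | (second_addr & low_mask);
}

/* Matching compression step: keep only the differing low-order bits of the target. */
uint32_t compress_first_address(uint32_t branch_pc, uint32_t target,
                                unsigned *diff_bits_out) {
    uint32_t x = branch_pc ^ target;
    unsigned n = 0;
    while (n < 32 && (x >> n))                  /* index just above the highest differing bit */
        n++;
    *diff_bits_out = n;                         /* fourth field                 */
    return (n >= 32) ? target : (target & ((1u << n) - 1u));   /* second field */
}
```

With 32-bit addresses that share their upper 20 bits, this stores 12 bits plus the small fourth field instead of the full 32-bit target.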
  • Optionally, the second field is set in the last column of the target branch entry.
  • Optionally, the M pieces of prediction information include m1 pieces of first prediction information corresponding to the first rule and m2 pieces of second prediction information corresponding to the second rule.
  • The processing unit 1303 is also used to obtain the actual jump result of the target branch instruction; to compare the prediction result corresponding to the first rule and the prediction result corresponding to the second rule, respectively, with the actual jump result; and to determine, according to the comparison result, the prediction information used for prediction, which is one of the first prediction information and the second prediction information. The prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • Optionally, the processing unit 1303 is specifically configured to determine the first prediction information as the prediction information used for prediction if the prediction result corresponding to the first rule is the same as the actual jump result; or, to determine the second prediction information as the prediction information used for prediction if the prediction result corresponding to the second rule is the same as the actual jump result.
  • Optionally, the first rule is the global enhanced GL algorithm and the second rule is the two-bit branch prediction local2b algorithm; or, the first rule is the local2b algorithm and the second rule is the GL algorithm. A sketch contrasting the two rules is given below.
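The patent does not spell out the internals of the GL algorithm, so the sketch below pairs a conventional local 2-bit saturating counter (the local2b rule) with a gshare-like table of 2-bit counters indexed by a short global history of other branches' outcomes as a stand-in for the global enhanced rule. The 3-bit history, the table of 8 counters and every name here are assumptions, chosen only to match the "1 local2b + 8 GL" example of M = 9 pieces of prediction information in the description.

```c
#include <stdint.h>

#define GL_ENTRIES 8u                 /* 3 bits of global history -> 8 counters (assumed) */

typedef struct {
    uint8_t local2b;                  /* one 2-bit counter (local2b rule)                  */
    uint8_t gl[GL_ENTRIES];           /* 2-bit counters selected by global history (GL)    */
} PredInfo;

static uint8_t global_history;        /* recent outcomes of other branch instructions      */

static int is_taken(uint8_t c)        { return c >= 2; }   /* 10 and 11 predict "jump"    */

static uint8_t bump(uint8_t c, int taken) {
    if (taken) return (uint8_t)((c < 3) ? c + 1 : 3);
    return (uint8_t)((c > 0) ? c - 1 : 0);
}

int predict_local2b(const PredInfo *p) {
    return is_taken(p->local2b);
}

int predict_gl(const PredInfo *p) {
    return is_taken(p->gl[global_history & (GL_ENTRIES - 1u)]);
}

/* After the branch resolves: train both rules and shift this outcome into the history. */
void train(PredInfo *p, int taken) {
    unsigned i = global_history & (GL_ENTRIES - 1u);
    p->local2b = bump(p->local2b, taken);
    p->gl[i]   = bump(p->gl[i], taken);
    global_history = (uint8_t)((global_history << 1) | (taken ? 1u : 0u));
}
```

The selection counter from the earlier sketch then decides which of the two predictions is actually used for a given branch.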
  • The branch predictor 130 provided by the embodiment of the present application can be used to execute the technical solutions in the embodiments of the branch instruction processing method shown in FIG. 2 to FIG. 12; its implementation principle and technical effect are similar to those of the branch instruction processing method embodiments and will not be repeated here.
  • FIG. 14 is a schematic structural diagram of another branch predictor 140 provided by an embodiment of this application.
  • The branch predictor 140 may include a prediction logic circuit 1401 and a memory 1402.
  • The memory 1402 is configured to store a branch prediction table, where the branch prediction table includes multiple branch entries.
  • The prediction logic circuit 1401 is used to find, according to the partial address of the target branch instruction, the target branch entry corresponding to the target branch instruction among the multiple branch entries; the target branch entry includes a first field, the first field is used to indicate M pieces of prediction information, and the M pieces of prediction information are predicted based on the historical jump information of the target branch instruction and the information that affects the prediction result of the target branch instruction; if the target branch entry is found in the branch prediction table, whether to perform a jump operation on the target branch instruction is predicted according to the prediction information indicated by the first field; M is an integer greater than or equal to 2.
  • Optionally, the information that affects the prediction result of the target branch instruction includes historical jump information of other branch instructions that affect the jump of the target branch instruction.
  • Optionally, the target branch entry further includes a second field, and the second field is used to indicate the first address. If the prediction result indicates that a jump operation is performed, the target address of the jump operation is the first address.
  • Optionally, the target branch entry further includes a third field, and the third field is used to indicate branch attributes; the branch predictor 140 further includes an update logic circuit 1403.
  • The update logic circuit 1403 is configured to store at least one of the M pieces of prediction information in free bits of the third field if such free bits exist.
  • Optionally, the target branch entry further includes a fourth field, and the fourth field is used to indicate that the first address is determined according to the fourth field and the second field; the second field is used to indicate a second address, and the second address is the part of the first address that is different from the address of the target branch instruction.
  • Optionally, the fourth field is the number of bits of the second address; or, the fourth field is the number of bits in which the address of the target branch instruction is the same as the first address; or, the fourth field is the sum of the number of bits of the address of the target branch instruction and the number of bits of the second address.
  • Optionally, the second field is set in the last column of the target branch entry.
  • Optionally, the M pieces of prediction information include m1 pieces of first prediction information corresponding to the first rule and m2 pieces of second prediction information corresponding to the second rule.
  • The update logic circuit 1403 is also used to obtain the actual jump result of the target branch instruction.
  • The prediction logic circuit 1401 is also used to compare the prediction result corresponding to the first rule and the prediction result corresponding to the second rule, respectively, with the actual jump result, and to determine, according to the comparison result, the prediction information used for prediction, which is one of the first prediction information and the second prediction information. The prediction result corresponding to the first rule is predicted based on the m1 pieces of first prediction information, and the prediction result corresponding to the second rule is predicted based on the m2 pieces of second prediction information; m1 and m2 are both integers greater than or equal to 1, and the sum of m1 and m2 is less than or equal to M.
  • Optionally, the prediction logic circuit 1401 is specifically configured to determine the first prediction information as the prediction information used for prediction if the prediction result corresponding to the first rule is the same as the actual jump result; or, to determine the second prediction information as the prediction information used for prediction if the prediction result corresponding to the second rule is the same as the actual jump result.
  • Optionally, the update logic circuit 1403 is also used to update the first prediction information and the second prediction information according to the actual jump result.
  • Optionally, the first rule is the global enhanced GL algorithm and the second rule is the two-bit branch prediction local2b algorithm; or, the first rule is the local2b algorithm and the second rule is the GL algorithm.
  • The branch predictor 140 provided by the embodiment of the present application can be used to execute the technical solutions in the embodiments of the branch instruction processing method shown in FIG. 2 to FIG. 12; its implementation principle and technical effect are similar to those of the branch instruction processing method embodiments and will not be repeated here.
  • FIG. 15 is a schematic structural diagram of a processor 150 provided in an embodiment of the present application.
  • The processor 150 may include the branch predictor 1501 described in the foregoing FIG. 2 to FIG. 12 and a processing circuit 1502.
  • The branch predictor 1501 is used to predict whether to perform a jump operation on the target branch instruction. If the prediction result indicates that the jump operation is performed, the processing circuit 1502 uses the first address in the target branch entry in the branch predictor 1501 as the target address of the jump operation to perform the jump operation on the target branch instruction, as sketched below.
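To show how the processing circuit 1502 might consume the prediction, here is a simplified fetch-redirect step: on a predicted-taken branch the next fetch address becomes the first address from the target branch entry, otherwise fetch falls through. The Prediction structure, the 4-byte instruction size and the function name are assumptions made for the sketch.

```c
#include <stdint.h>

/* Result handed from the branch predictor 1501 to the processing circuit 1502. */
typedef struct {
    int      hit;                     /* a target branch entry was found            */
    int      taken;                   /* prediction: perform the jump operation?    */
    uint32_t first_address;           /* target address from the entry              */
} Prediction;

/* Next fetch address chosen by the processing circuit (sketch). */
uint32_t next_fetch_pc(uint32_t pc, Prediction pred) {
    if (pred.hit && pred.taken)
        return pred.first_address;    /* predicted jump: redirect fetch             */
    return pc + 4u;                   /* fall through (assumed 4-byte instructions) */
}
```

This is the point at which a correct prediction removes pipeline bubbles: fetch is redirected before the branch actually resolves.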
  • The processor 150 provided by the embodiment of the present application may be used to execute the technical solutions in the embodiments of the branch instruction processing method shown in FIG. 2 to FIG. 12; its implementation principles and technical effects are similar to those of the branch instruction processing method embodiments and will not be repeated here.
  • An embodiment of the present application also provides a readable storage medium on which a computer program is stored; when the computer program is executed, it is used to execute the branch instruction processing method described in any one of the possible implementations of the first aspect. Its implementation principle and technical effect are similar to those of the method embodiments and will not be repeated here.
  • An embodiment of the present application also provides a chip on which a computer program is stored; when the computer program is executed by a processor, it is used to execute the branch instruction processing method described in any one of the possible implementations of the first aspect. Its implementation principle and technical effect are similar to those of the method embodiments and will not be repeated here.
  • It should be noted that the division of modules or units in the embodiments of the present application is illustrative and is only a logical function division; there may be other division methods in actual implementation.
  • The functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module.
  • The above-mentioned integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor execute all or part of the steps of the method described in each embodiment of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or other media that can store program code.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When software is used for implementation, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are executed in whole or in part.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer program or instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server integrating one or more available media.
  • The available medium may be a magnetic medium, such as a floppy disk, a hard disk, or a magnetic tape; an optical medium, such as a digital versatile disc (DVD); or a semiconductor medium, such as a solid state drive (solid state disk, SSD).

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

Embodiments of the present application provide a branch instruction processing method, a branch predictor, and a processor. When predicting whether to perform a jump operation on a target branch instruction, a target branch entry corresponding to the target branch instruction is first looked up in a branch prediction table according to a partial address of the target branch instruction. If the target branch entry is found in the branch prediction table, then, since the target branch entry includes a first field, the first field is used to indicate M pieces of prediction information, the M pieces of prediction information are predicted based on historical jump information of the target branch instruction and information affecting the prediction result of the target branch instruction, and M is an integer greater than or equal to 2, whether to perform the jump operation on the target branch instruction can be predicted jointly from the historical jump information of the target branch instruction and the information affecting the prediction result of the target branch instruction, thereby improving the accuracy of the prediction result.

Description

分支指令的处理方法、分支预测器及处理器 技术领域
本申请涉及计算机技术领域,尤其涉及一种分支指令的处理方法、分支预测器及处理器。
背景技术
为了满足用户对处理器性能日益增高的需求,处理器通常使用能够重叠执行指令的流水线技术来提高效率。通常情况下,处理器获取到分支指令后,在流水线前级就会对该分支指令进行跳转预测,并在预测结果指示的跳转地址上继续获取指令,而无需等待流水线后级返回该分支指令的跳转结果,从而可以减少流水线的“气泡”,提高处理器的处理效率。
现有技术中,通常采用两比特分支预测(local 2-bit branch-prediction,local 2b)算法对分支指令的跳转结果进行预测。该local 2b算法是根据分支指令的历史跳转信息对2bit计数器进行更新,并使用计数器结果进行跳转预测。但是,采用该算法进行跳转预测时,只是根据该分支指令的历史跳转信息进行预测,并未考虑到其它的影响因素,从而导致预测结果的准确度不高。
发明内容
本申请提供一种分支指令的处理方法、分支预测器及处理器,提高了预测结果的准确度。
第一方面,本申请实施例提供一种分支指令的处理方法,该分支指令的处理方法可以包括:
根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的;M为大于或者等于2的整数;
若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预测是否对目标分支指令执行跳转操作。
由此可见,本申请实施例提供的分支指令的处理方法,在预测是否对目标分支指令执行跳转操作时,先根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;若在分支预测表中查找到目标分支条目,由于该目标分支条目包括第一字段,且第一字段用于指示多个预测信息,该多个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的,这样就可以根据多个目标分支指令的历史跳转信息和影响目标分支指令的预测结果的信息共同预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。
在一种可能的实现方式中,影响目标分支指令的预测结果的信息包括影响目标分支指令跳转的其它分支指令的历史跳转信息,即根据多个目标分支指令的历史跳转信息和影响目标分支指令跳转的其它分支指令的历史跳转信息共同预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。
在一种可能的实现方式中,目标分支条目还包括第二字段,第二字段用于指示第一地址,若预测结果指示执行跳转操作,则跳转操作的目标地址即为第一地址。
在一种可能的实现方式中,目标分支条目还包括第三字段,第三字段用于指示分支属性;该方法还可以包括:
若第三字段中存在空闲比特,则将M个预测信息中的至少一个预测信息存放在空闲比特中,以通过复用第三字段中的空闲比特存放至少一个预测信息,这样就无需额外增加目标分支条目的比特数,从而减少了目标分支条目占用的比特数。
在一种可能的实现方式中,目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址,实现了在保证多个预测信息的情况下,减少目标分支条目所占的比特数。
在一种可能的实现方式中,第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
在一种可能的实现方式中,由于第四字段指示的压缩控制的值不同,第二地址的长度也会发生变化,为了不影响其它字段所占的比特域的位置,可以将第二字段设置在目标分支条目的最后一列。
在一种可能的实现方式中,M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息,该方法还可以包括:
获取目标分支指令的实际跳转结果;
分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较;第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M;
根据比较结果,确定预测使用的预测信息,预测使用的预测信息为第一预测信息和第二预测信息中的一个,当多个预测信息包括两种预测规则时,还可以实现根据两种不同的规则预测是否对目标分支指令执行跳转操作,以使本申请实施例提供的分支指令的处理方法可以适应于更多的预测场景。
在一种可能的实现方式中,根据比较结果,确定预测使用的预测信息,可以包括:
若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息;或者,
若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息。
在一种可能的实现方式中,根据实际跳转结果更新第一预测信息和第二预测信息,从而提高了第一预测信息和第二预测信息的准确度,这样后续再通过更新后的第一预测信息 和第二预测信息预测是否对目标分支指令执行跳转操作,可以进一步提高预测结果的准确度。
第二方面,本申请实施例提供一种分支预测器,该分支预测器可以包括:
查找单元,用于根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的;M为大于或者等于2的整数;
预测单元,用于若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预测是否对目标分支指令执行跳转操作。
在一种可能的实现方式中,影响目标分支指令的预测结果的信息包括影响目标分支指令跳转的其它分支指令的历史跳转信息。
在一种可能的实现方式中,目标分支条目还包括第二字段,第二字段用于指示第一地址,若预测结果指示执行跳转操作,则跳转操作的目标地址即为第一地址。
在一种可能的实现方式中,目标分支条目还包括第三字段,第三字段用于指示分支属性,该分支预测器可以包括:处理单元,用于若第三字段中存在空闲比特,则将M个预测信息中的至少一个预测信息存放在空闲比特中。
在一种可能的实现方式中,目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址。
在一种可能的实现方式中,第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
在一种可能的实现方式中,第二字段设置在目标分支条目的最后一列。
在一种可能的实现方式中,M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息;
处理单元,还用于获取目标分支指令的实际跳转结果;并分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较;再根据比较结果,确定预测使用的预测信息,预测使用的预测信息为第一预测信息和第二预测信息中的一个。其中,第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
在一种可能的实现方式中,处理单元,具体用于若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息;或者,若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息。
在一种可能的实现方式中,根据实际跳转结果更新第一预测信息和第二预测信息。
在一种可能的实现方式中,第一规则为全局增强GL算法,第二规则为两比特分支预测local2b算法;或者,第一规则为local2b算法,第二规则为GL算法。
第三方面,本申请实施例还提供一种分支预测器,该分支预测器可以包括预测逻辑电路和存储器;
存储器,用于存储分支预测表,其中,分支预测表中包括多个分支条目;
预测逻辑电路,用于根据目标分支指令的部分地址,在多个分支条目中查找与目标分支指令对应的目标分支条目;其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的;若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预测是否对目标分支指令执行跳转操作;其中,M为大于或者等于2的整数。
在一种可能的实现方式中,影响目标分支指令的预测结果的信息包括影响目标分支指令跳转的其它分支指令的历史跳转信息。
在一种可能的实现方式中,目标分支条目还包括第二字段,第二字段用于指示第一地址,若预测结果指示执行跳转操作,跳转操作的目标地址即为第一地址。
在一种可能的实现方式中,目标分支条目还包括第三字段,第三字段用于指示分支属性,分支预测器还包括更新逻辑电路;
更新逻辑电路,用于若第三字段中存在空闲比特,则将M个预测信息中的至少一个预测信息存放在空闲比特中。
在一种可能的实现方式中,目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址。
在一种可能的实现方式中,第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
在一种可能的实现方式中,第二字段设置在目标分支条目的最后一列。
在一种可能的实现方式中,M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息;
更新逻辑电路,还用于获取目标分支指令的实际跳转结果;
预测逻辑电路,还用于分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较,并根据比较结果,确定预测使用的预测信息,预测使用的预测信息为第一预测信息和第二预测信息中的一个;其中,第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
在一种可能的实现方式中,预测逻辑电路,具体用于若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息;或者,若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息。
在一种可能的实现方式中,更新逻辑电路,还用于根据实际跳转结果更新第一预测信息和第二预测信息。
在一种可能的实现方式中,第一规则为全局增强GL算法,第二规则为两比特分支预测local2b算法;或者,第一规则为local2b算法,第二规则为GL算法。
第四方面,本申请实施例还提供一种处理器,该处理器可以包括:处理电路和上述第二方面或第三方面任一种可能的实现方式中所述的分支预测器;
其中,分支预测器用于预测是否对目标分支指令执行跳转操作,若预测结果指示执行 跳转操作,则处理电路将分支预测器中目标分支条目中的第一地址作为跳转操作的目标地址,对目标分支指令执行跳转操作。
第五方面,本申请实施例还提供一种可读存储介质,可读存储介质上存储有计算机程序;计算机程序被执行时,用于执行如上述第一方面任一种可能的实现方式所述的分支指令的处理方法。
第六方面,本申请实施例还提供一种芯片,芯片上存储有计算机程序,在计算机程序被处理器执行时,用于执行如上述第一方面任一种可能的实现方式所述的分支指令的处理方法。
本申请实施例提供的分支指令的处理方法、分支预测器及处理器,在预测是否对目标分支指令执行跳转操作时,先根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;若在分支预测表中查找到目标分支条目,由于该目标分支条目包括第一字段,且第一字段用于指示多个预测信息,该多个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的,这样就可以根据多个目标分支指令的历史跳转信息和影响目标分支指令的预测结果的信息共同预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。
附图说明
图1为本申请实施例提供的一种系统架构示意图;
图2为本申请实施例提供的一种分支指令的处理方法的流程示意图;
图3为本申请实施例提供的一种分支条目的结构示意图;
图4为本申请实施例提供的另一种分支条目的结构示意图;
图5为本申请实施例提供的一种指令的处理示意图;
图6为本申请实施例提供的又一种分支条目的结构示意图;
图7为本申请实施例提供的一种分支条目的结构示意图
图8为本申请实施例提供的一种目标分支指令的地址和第一地址的示意图;
图9为本申请实施例提供的另一种分支条目的结构示意图;
图10为本申请实施例提供的另一种分支指令的处理方法的流程示意图;
图11为本申请实施例提供的一种选择预测规则的示意图;
图12为本申请实施例提供的一种预测规则切换的示意图;
图13为本申请实施例提供的一种分支预测器的结构示意图;
图14为本申请实施例提供的另一种分支预测器的结构示意图;
图15为本申请实施例提供的一种处理器的结构示意图。
具体实施方式
图1为本申请实施例提供的一种系统架构示意图,该系统架构可以包括指令存储器、取指器及分支预测器。示例的,可参见图1所示,指令存储器用于储存多个指令,取指器从指令存储器中获取指令,并将指令地址发送给分支预测器,若分支预测器根据指令地址在其分支预测表中查找到该指令,则确定该指令为分支指令,并对分支指令进行跳转预测 时,可以根据分支预测器中存储的该分支指令的历史跳转信息进行预测,并向选择器发送预测结果,预测结果可以是跳转地址,是否跳转等信息。若确定执行预测操作,则向选择器发送该预测操作对应的跳转地址,选择器在该跳转地址和取指器发送的分支指令的地址中,确定用于查询分支指令的地址,并将该地址发送给指令存储器,使得存储器可以根据该地址查找到对应的指令,并将该指令发送给取指器(示例的,取指器发给选择器的指令地址可以取指器发给分支预测器的指令地址加1,加1是指下一跳指令的地址)。
现有技术中,在通过分支预测器对分支指令进行预测时,通常采用Local2b算法对分支指令是否跳转进行预测。但该算法只是根据该分支指令的历史跳转信息进行预测,并未考虑到其它会影响因素,从而导致预测结果的准确度不高。为了提高预测结果的准确度,本申请实施例提供了一种分支指令的预测方法,在预测是否对目标分支指令执行跳转操作时,先根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;若在分支预测表中查找到目标分支条目,由于该目标分支条目包括第一字段,且第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的,M为大于或者等于2的整数,这样就可以根据第一字段指示的多个(至少两个)预测信息,预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。
在详细描述本申请实施例提供的分支指令的处理方法的技术方案之前,先解释一下本申请实施例中涉及的几个概念。首先,对于分支指令,可以根据第一维度(直接目标跳转或间接目标跳转)、及第二维度(非条件跳转或条件跳转)将分支指令分为四类。其中,直接目标跳转的跳转地址是固定的,而间接目标跳转的目标可以是个变量,例如某个寄存器的值,而这个寄存器的值可以变化;非条件跳转的分支指令是一定跳转,而条件跳转一般会根据某个变量的值确定是否执行跳转操作,例如,当变量的值为1时,确定执行跳转操作;当变量的值为0时,确定不执行跳转操作。通过这两个维度将分别指令划分的四类分支指令分别为:第一类分支指令为直接目标跳转,且非条件跳转的分支指令,该第一类分支指令的跳转地址固定,且必然会执行跳转操作,即该第一类分支指令不需要进行跳转预测;第二类分支指令为直接目标跳转,且条件跳转的分支指令,该第二类分支指令的跳转地址固定,但需要预测是否执行跳转操作;第三类分支指令为间接目标跳转,且非条件跳转的分支指令,该第三类分支指令的跳转地址需要预测,且必然会执行跳转操作,即该第三类分支指令只需要预测跳转地址;第四类分支指令为间接目标跳转,且条件跳转的分支指令,该第四类分支指令需要预测跳转地址,且需要预测是否执行跳转操作,即该第四类分支指令需要预测跳转地址和是否执行跳转操作。其次,对于不同类型的分支指令,分支指令的预测信息可以包括用于预测分支指令是否执行跳转操作的信息,也可以包括分支指令的历史预测结果,这些都可以作为分支指令的预测信息。例如,预测信息local2b算法和gshare-like算法中存储的历史跳转信息。下面,以分支指令为第二类分支指令(即跳转地址固定,但需要预测是否跳转的分支指令)为例,详细描述本申请实施例提供的分支指令的处理方法的技术方案。
需要说明的是,本申请的实施例中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A, B可以是单数或者复数。在本申请的文字描述中,字符“/”一般表示前后关联对象是一种“或”的关系。
图2为本申请实施例提供的一种分支指令的处理方法的流程示意图,示例的,请参见图2所示,该分支指令的处理方法可以包括:
S201、根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目。
其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响所述目标分支指令的预测结果的信息预测得到的,M为大于或者等于2的整数,即第一字段用于至少两个预测信息。可选的,影响所述目标分支指令的预测结果的信息可以为影响目标分支指令跳转的其它分支指令的历史跳转信息,也可以为其它信息,具体可以根据实际需要进行设置,在此,本申请实施例不做具体限制。示例的,请参见图3所示,图3为本申请实施例提供的一种分支条目的结构示意图。该分支条目中可以包括分支指令的分支地址,该分支地址可以理解为分支指令的高位地址,也可以包括用于指示分支指令的M个属性信息的第一字段,当然,也可以包括指令类型。
在解释目标分支指令的部分地址之前,先介绍下目标分支指令的地址。目标分支指令的地址是该目标指令在目标分支指令内的地址,该目标分支指令的地址通常不会完整的储存在分支预测表中。以目标分支指令的地址为32比特为例,可以从中选取若干比特,例如从低位取10比特来确定其在分支预测表中的位置,比如哪一行。再使用剩余22比特中的20比特来对比确定分支预测表的对应位置储存的是否是该目标指令。剩余2比特可以是无效比特,也可以没有无效比特,使用剩余的所有22比特进行对比确定分支预测表的对应位置储存的是否是该目标指令。对应的,当从低位取10比特来确定其在分支预测表中的位置时,该目标分支指令的部分地址可以理解为该目标分支指令的低位地址,即10比特地址,当然,也可以包括该低位地址的大于10比特的地址,只要根据该部分地址在分支预测表中可以确定该目标分支指令所在的行即可。可以理解的是,本申请实施例只是以目标分支指令的地址为32比特为例进行说明,但并不代表本申请实施例仅局限于此。
在确定目标分支指令的部分地址之后,就可以根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目。对于分支预测表而言,在初始状态下,该分支预测表为空,即该分支预测表中没有存储分支条目,处理器在分支指令处理过程中,每处理一条分支指令,就会将该分支指令的相关信息以分支条目的形式存放在分支预测器中的分支预测表中,分支预测表是存储在分支预测器中的存储器中,这样分支预测表中就会存放至少一个分支指令各自对应的分支条目,该分支预测表中至少一个分支条目可能包括目标分支指令对应的目标分支条目,也可能不包括目标分支指令对应的目标分支条目,因此,可以根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目,以根据查找结果确定相应的预测结果。可以理解的是,本申请中的处理器可以包括分支预测器和处理电路。其中,分支预测器主要用于预测是否对目标分支指令执行跳转操作,若预测结果指示执行跳转操作,则处理电路将分支预测器中目标分支条目中的第一地址作为跳转操作的目标地址,对所述目标分支指令执行跳转操作。
S202、若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预 测是否对目标分支指令执行跳转操作。
可以理解的是,预测结果可以是跳转地址,或者是否跳转等信息。
若在分支预测表中查找到目标分支条目,则可以根据该目标分支条目中第一字段指示的预测信息预测是否对目标分支指令执行跳转操作。示例的,可以根据第一字段指示的M个预测信息中的N个预测信息,预测是否对目标分支指令执行跳转操作。可以理解的是,当N小于M时,表示可以根据M个预测信息中的部分预测信息预测是否对目标分支指令执行跳转操作;当N等于M时,表示可以根据M个预测信息中的所有预测信息预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。可以理解的是,在根据预测信息预测是否对目标分支指令执行跳转操作时,预测信息的个数可以根据实际需要进行设置,在此,本申请实施例不做进一步地限制。
示例的,以目标分支指令为第一分支指令,在预测是否对该第一分支指令执行跳转操作时,该预测信息中不仅包括第一分支指令的历史跳转信息,还可以考虑其它影响该第一分支指令的因素信息,例如,其它分支指令(如第二分支指令、第三分支指令和第四分支指令)的历史跳转信息。若第二分支指令、第三分支指令和第四分支指令的历史跳转信息为上一次的跳转结果,且跳转结果用两比特的0和1表示时,可以有8种组合的跳转结果,对应有8个预测信息,则M个预测信息可以包括8个预测信息,这样就可以根据8个预测信息中的至少两个预测信息预测是否对第一分支指令执行跳转操作。
在预测是否对目标分支指令执行跳转操作后,就可以得到预测结果。示例的,预测结果可以用两比特表示,例如,预测结果可以为00、01、10及11。通常情况下,预测结果00和01表示不执行跳转操作,预测结果10和11表示执行跳转操作,因此,在得到预测结果之后,就可以根据该预测结果确定是否执行跳转操作。
可选的,若预测结果指示执行跳转操作,则将第一地址作为跳转操作的目标地址。示例的,在将第一地址作为跳转操作的目标地址执行跳转操作时,分支预测器指示处理器中的处理电路将第一地址作为跳转操作的目标地址,对目标分支指令执行跳转操作,使得跳转至该第一地址;相反的,预测结果指示不执行跳转操作,则不对该目标分支指令进行处理。
可选的,目标分支条目还包括第二字段,第二字段用于指示第一地址。可以理解的是,该第一地址可以理解为完整的跳转地址,也可以理解为部分跳转地址,但根据该部分跳转地址可以确定完整的跳转地址。示例的,请参见图4所示,图4为本申请实施例提供的另一种分支条目的结构示意图。
若预测结果指示执行跳转操作,则可以根据目标分支条目包括的用于第一地址到的第二字段,对目标分支指令执行跳转操作,使得跳转至该第一地址,并在第一地址上继续获取分支指令进行预测,直至执行完所有指令。
由此可见,本申请实施例提供的分支指令的处理方法,在预测是否对目标分支指令执行跳转操作时,先根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;若在分支预测表中查找到目标分支条目,由于该目标分支条目包括第一字段,且第一字段用于指示多个预测信息,该多个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的,这样就可以根据多个目标分支指令的历史跳转信息和影响目标分支指令跳转的其它分支指令的历史跳转信息共同预测 是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。
示例的,为了更清楚地说明本申请实施例提供的分支指令的处理方法,示例的,请参见图5所示,图5为本申请实施例提供的一种指令的处理示意图。结合图5可以看出,指令存储器中依次存储有指令A、指令B、指令C、指令D及指令E,取指器从指令存储器中取出指令A,并将指令A的地址发送给分支预测器,由于分支预测器中包括多个分支指令对应的分支条目,分支条目中存储有分支指令的地址,因此分支预测器可以根据指令A的地址在分支预测表中查找,若在分支预测表中查找到该指令A,则可以确定该指令A为分支指令,并进一步根据该指令A对应的分支条目中的指令A的多个历史跳转信息和影响所述指令A跳转的其它分支指令的历史跳转信息共同预测是否对该指令A执行跳转操作,若预测结果指示执行跳转操作,则在该指令A对应的分支条目中获取第一地址,并指示处理器中的处理电路对指令A执行跳转操作。若第一地址为指令D的地址,则取指器将指令D的地址发送给指令存储器,指令存储器中根据指令D的地址将指令D发给处理器,处理器对指令D进行处理。可以看出,在预测是否对指令A执行跳转操作时,是根据该指令A对应的多个历史跳转信息和影响所述指令A跳转的其它分支指令的历史跳转信息共同预测是否对该指令A执行跳转操作,从而提高了预测结果的准确度。
图2所示的实施例详细描述了在预测是否对目标分支指令执行跳转操作时,可以基于多个预测信息预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。但由于目标分支条目中包括多个预测信息,这样会增加目标分支条目的长度,即目标分支条目所占的比特数增加,此时,可以考虑在保证多个预测信息的情况下,减少目标分支条目所占的比特数。可选的,在本申请实施例中,可以通过至少两种可能的实现方式减少目标分支条目所占的比特数。在一种可能的实现方式中,当目标分支条目包括用于指示分支属性的第三字段,且第三字段中存在空闲比特时,可以通过复用空闲比特存放至少一个预测信息,以减少目标分支条目所占的比特数。示例的,可参见图6所示,图6为本申请实施例提供的又一种分支条目的结构示意图。在另一种可能的实现方式中,由于目标分支指令的地址和第一地址比较接近,目标分支指令的地址和第一地址有很多位是相同的,因此,可以通过压缩第一地址所占的比特数,以减少目标分支条目所占的比特数。下面,将对这两种可能的实现方式进行详细地说明。
在一种可能的实现方式中,通过复用空闲比特存放至少一个预测信息,以减少目标分支条目所占的比特数。具体方案为:由于目标分支条目还包括用于指示分支属性的第三字段,若第三字段中存在空闲比特,则可以将M个预测信息中的至少一个预测信息存放在空闲比特中,以通过复用第三字段中的空闲比特存放至少一个预测信息,这样就无需额外增加目标分支条目的比特数,从而减少了目标分支条目占用的比特数。
可以理解的是,分支属性表示分支指令的类型,也可以表示分支指令是否为一次性用途,也可以表示是否要优先替换等。每一种属性占用一定的比特数。现有技术中,每次读取分支条目中的相关信息,然后根据分支指令的类型,确定可使用的分支属性和不可使用的分支属性,不可使用的分支属性的字段是空闲的。例如,在每一个分支条目中,有10个分支属性的比特域,分支指令的类型分为A,B,C三种,A类型的分支指令需要全部10个分支属性的比特域,B类型的分支指令需要8个分支属性的比特域,C类型的分支指令需要5个分支属性的比特域,可以看出,B类型的分支指令可以空出2个分支属性的空 闲比特域,这2个分支属性的空闲比特域就可以用于存放至少一个预测信息;类似的,C类型的分支指令可以空出5个分支属性的空闲比特域,这5个分支属性的空闲比特域就可以用于存放至少一个预测信息,从而减少了目标分支条目占用的比特数。
在另一种可能的实现方式中,通过压缩第一地址所占的比特数,以减少目标分支条目所占的比特数。具体方案为:由于目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址,从而减少了目标分支条目占用的比特数。示例的,请参见图7所示,图7为本申请实施例提供的一种分支条目的结构示意图。
可选的,在根据第四字段和第二字段确定第一地址时,该第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
在该种可能的实现方式中,由于目标分支指令的地址和第一地址比较接近,目标分支指令的地址和第一地址有很多位是相同的,因此,可以通过压缩第一地址所占的比特数,以减少目标分支条目所占的比特数。示例的,可参见图8所示,图8为本申请实施例提供的一种目标分支指令的地址和第一地址的示意图。可以看出,目标分支指令的地址和第一地址有很多位地址是相同的,因此,为了节省第一地址所占的比特数,可以通过第二字段只指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址。可选的,在通过第二字段指示第二地址时,由于分支指令地址和目标地址相同的bit数是动态变化的,因此,目标分支条目还需要包括第四字段,第四字段用于指示根据第二地址可以正确读取第一地址,即可以根据该第四字段和第二字段指示的第二地址确定第一地址,从而确定目标分支指令的跳转地址。
可选的,在根据该第四字段和第二字段指示的第二地址确定第一地址时,该第四字段可以直接为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址和第二地址的比特数之和,从而可以根据该第四字段和第二字段指示的第二地址确定第一地址。
以目标分支指令的地址和第一地址均为32比特,且目标分支指令的地址与第一地址有20位地址相同为例,可以看出,目标分支指令的地址与第一地址有12位地址不同,此时,第四字段可以直接为第二地址的比特数,即第四字段可以为12,则可以根据第四字段12和第二地址确定第一地址。或者,第四字段也可以为目标分支指令的地址与第一地址相同的比特数,即第四字段可以为20,则可以根据第四字段20和第二地址确定第一地址。或者,第四字段也可以为目标分支指令的地址和第二地址的比特数,即第四字段可以为44比特(32比特与12比特之和),则可以根据第四字段44和第二地址确定第一地址,从而确定第一地址。
可以理解的是,由于第四字段指示的压缩控制的值不同,第二地址的长度也会发生变化,为了不影响其它字段所占的比特域的位置,可以将第二字段设置在目标分支条目的最后一列。
需要说明的是,在通过上述两种可能的实现方式减少目标分支条目所占的比特数时,这两种可能的实现方式可以独立使用,也可以混合使用,具体可以根据实际需要进行设置,在此,本申请实施例不做具体限制。
由此可见,在本申请实施例中,通过图2所示的实施例,在预测是否对目标分支指令执行跳转操作时,可以基于多个预测信息预测是否对目标分支指令执行跳转操作,从而提高了预测结果的准确度。但由于目标分支条目中包括多个预测信息,这样会增加目标分支条目的长度,通过上述两种可能的实现方式,实现了在保证多个预测信息的情况下,减少目标分支条目所占的比特数。
需要说明的是,在通过上述实施例描述本申请实施例提供的技术方案时,只是以分支指令为第二类分支指令为例进行说明,该本申请实施例提供的技术方案也可以应用于其它三种类型的分支指令。示例的,当本申请实施例提供的技术方案应用于第一类分支指令时,由于该第一类分支指令不需要进行跳转预测,可以直接使用上述分支条目的结构储存跳转地址,只是不使用分支条目中的预测信息;当本申请实施例提供的技术方案应用于第三类分支指令时,由于该第三类分支指令只需要预测跳转地址,因此,在跳转时,跳转地址默认和上次的跳转地址相同,且不使用分支条目中的预测信息;当本申请实施例提供的技术方案应用于第四类分支指令时,由于该第四类分支指令需要预测跳转地址和是否执行跳转操作,因此,在跳转时,跳转地址默认和上次的跳转地址相同,且使用分支条目中的预测信息对是否执行跳转操作进行预测,其预测方式与第二类分支指令的预测方法类似,具体可以参见第二类分支指令的预测方法,在此,本申请实施例不再进行赘述。可以理解的是,本申请实施例只是以这四种类型的分支指令为例进行说明,但并不代表本申请实施例仅局限于此。
由于单一的预测规则难以适应于所有的分支指令预测场景,因此,可以通过两种不同的规则预测是否对目标分支指令执行跳转操作。基于上述图2所示的实施例,当M个预测信息中包括两种规则对应的预测信息时,就可以实现根据两种不同的规则预测是否对目标分支指令执行跳转操作。示例的,请参见图9所示,图9为本申请实施例提供的另一种分支条目的结构示意图。可以理解的是,本申请实施例只是以M个预测信息中包括两种规则对应的预测信息为例进行说明,也可以包括三种规则对应的预测信息,具体可以根据实际需要进行设置,在此,对于M个预测信息中包括的预测信息所属的规则的个数,本申请实施例不做进一步地限制。
可选的,第一规则为全局增强GL算法,第二规则为两比特分支预测local2b算法;或者,第一规则为local2b算法,第二规则为GL算法,当然,也可以为其它算法,具体可以根据实际需要进行设置,在此,本申请实施例不做进一步地限制。
示例的,以第一规则和第二规则分别为local2b算法和GL算法为例,预测是否对目标分支指令执行跳转操作时,该目标分支指令对应的M个预测信息可以包括local2b算法对应的1个预测信息,及GL算法对应的预测信息。其中,GL算法对应的预测信息不仅包括目标分支指令的历史跳转信息,还可以考虑其它影响该目标分支指令的因素信息,例如,其它分支指令(如第二分支指令、第三分支指令和第四分支指令)的历史跳转信息。若第二分支指令、第三分支指令和第四分支指令的历史跳转信息为上一次的跳转结果,且跳转结果用两比特的0和1表示时,可以有8种组合的跳转结果,对应有8个预测信息,则M个预测信息可以包括9个预测信息(local2b算法对应的1个预测信息和GL算法对应的8个预测信息),这样就可以根据9个预测信息中的至少两个预测信息预测是否对目标分支指令执行跳转操作。
进一步地,当M个预测信息中包括两种规则对应的预测信息,并通过该两种规则对应的预测信息预测是否对目标分支指令执行跳转操作时,如何确定使用哪种规则对应的预测信息预测是否对目标分支指令执行跳转操作,示例的,请参见图10所示,图10为本申请实施例提供的另一种分支指令的处理方法的流程示意图,该分支指令的处理方法还可以包括:
S1001、获取目标分支指令的实际跳转结果。
处理器在对目标分支指令进行处理时,需要间隔一定的时间才能获取到目标分支指令的实际跳转结果,因此,会分别通过第一规则对应的预测信息和第二规则对应的预测信息对该目标分支指令的跳转结果进行预测,得到第一规则对应的预测结果和第二规则对应的预测结果。待获取到目标分支指令的实际跳转结果之后,就可以执行下述S1002:
S1002、分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较。
其中,第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
S1003、根据比较结果,确定预测使用的预测信息。
其中,预测使用的预测信息为第一预测信息和第二预测信息中的一个,预测使用的预测信息可以理解为下一次预测使用的预测信息。
在分别得到第一规则对应的预测结果、第二规则对应的预测结果及实际跳转结果之后,分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较,若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息,即下一次通过第一预测信息预测是否对目标分支指令执行跳转操作;相反的,若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息,从而得到预测使用的预测信息。其中,预测使用的预测信息可以理解为下一次预测使用的预测信息。
可选的,在根据比较结果确定预测使用的预测信息时,可以先根据比较结果,更新选择计数器,再根据该选择计数器的值确定预测使用的预测信息。示例的,请参见如图11所示,图11为本申请实施例提供的一种选择预测规则的示意图。
示例的,以第一规则和第二规则分别为local2b算法和GL算法为例,其中,GL算法为本申请实施例提供的预测算法,选择计数器以2比特的值表示,分别为00、01、10及11,00和01表示采用local2b算法,10和11表示采用GL算法。选择计数器的计数规则可以为:当采用local2b算法预测是否对目标分支指令执行跳转操作时,若预测结果正确,则选择计数器的值减1,若预测结果错误,则选择计数器的值加1;当采用GL算法预测是否对目标分支指令执行跳转操作时,若预测结果正确,则选择计数器的值加1,若预测结果错误,则选择计数器的值减1。
示例的,若初始状态下选择计数器的值为00,且使用local2b算法进行预测,可参见图12所示,图12为本申请实施例提供的一种预测规则切换的示意图,若local2b算法的预测结果与实际跳转结果相同,则说明local2b算法预测正确,选择计数器的值减1,由于选择计数器本身的值已经为00,因此,可以先保持选择计数器的值不变,即选择计数器的值 仍为00;若local2b算法的预测结果与实际跳转结果不相同,则说明local2b算法预测错误,选择计数器的值加1,选择计数器的值由00变为01,01表示的是local2b算法,此时还是保持采用local2b算法不变;若再次使用local2b算法的预测结果与实际跳转结果不相同,则说明local2b算法预测错误,选择计数器的值加1,选择计数器的值由01变为10,10表示的是GL算法,此时切换到GL算法,后续在使用GL算法进行预测,若GL算法的预测结果与实际跳转结果不相同,则说明local2b算法预测错误,选择计数器的值加1,选择计数器的值由10变为11,11表示的是GL算法,此时还是保持采用GL算法不变,从而根据该选择计数器的值确定预测使用的预测信息。
可以理解的是,本申请实施例只是以00和01表示采用local2b算法,10和11表示采用GL算法为例进行说明,也可以用00和01表示采用GL算法,10和11表示采用local2b算法,具体可以根据实际需要进行设置,在此,本申请实施例不做进一步地限制。当00和01表示采用GL算法,10和11表示采用local2b算法时,对应的选择计数器的计数规则可以为:当采用GL算法预测是否对目标分支指令执行跳转操作时,若预测结果正确,则选择计数器的值减1,若预测结果错误,则选择计数器的值加1;当采用local2b算法预测是否对目标分支指令执行跳转操作时,若预测结果正确,则选择计数器的值加1,若预测结果错误,则选择计数器的值减1。
由此可见,在本申请实施例中,通过图2所示的实施例,在预测是否对目标分支指令执行跳转操作时,可以基于多个预测信息预测是否对目标分支指令执行跳转操作,其中,多个预测信息是根据所述目标分支指令的历史跳转信息及影响所述目标分支指令的预测结果的信息预测得到的,因此提高了预测结果的准确度。进一步地,当多个预测信息包括两种预测规则时,还可以实现根据两种不同的规则预测是否对目标分支指令执行跳转操作,以使本申请实施例提供的分支指令的处理方法可以适应于更多的预测场景。
需要说明的是,在获取到实际跳转结果之后,还可以根据实际跳转结果更新第一预测信息和第二预测信息,从而提高了第一预测信息和第二预测信息的准确度,这样后续再通过更新后的第一预测信息和第二预测信息预测是否对目标分支指令执行跳转操作,还可以提高预测结果的准确度。
图13为本申请实施例提供的一种分支预测器130的结构示意图,示例的,请参见图13所示,该分支预测器130可以包括:
查找单元1301,用于根据目标分支指令的部分地址,在分支预测表中查找目标分支指令对应的目标分支条目;其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的;M为大于或者等于2的整数。
预测单元1302,用于若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预测是否对目标分支指令执行跳转操作。
可选的,影响目标分支指令的预测结果的信息包括影响目标分支指令跳转的其它分支指令的历史跳转信息。
可选的,目标分支条目还包括第二字段,第二字段用于指示第一地址,若预测结果指示执行跳转操作,则跳转操作的目标地址即为第一地址。
可选的,目标分支条目还包括第三字段,第三字段用于指示分支属性,该分支预测器 130可以包括:处理单元1303,用于若第三字段中存在空闲比特,则将M个预测信息中的至少一个预测信息存放在空闲比特中。
可选的,目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址。
可选的,第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
可选的,第二字段设置在目标分支条目的最后一列。
可选的,M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息。
处理单元1303,还用于获取目标分支指令的实际跳转结果;并分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较;再根据比较结果,确定预测使用的预测信息,预测使用的预测信息为第一预测信息和第二预测信息中的一个。其中,第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
可选的,处理单元1303,具体用于若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息;或者,若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息。
可选的,根据实际跳转结果更新第一预测信息和第二预测信息。
可选的,第一规则为全局增强GL算法,第二规则为两比特分支预测local2b算法;或者,第一规则为local2b算法,第二规则为GL算法。
本申请实施例提供的分支预测器130,可以用于执行上述图2-图12所示的分支指令的处理方法实施例中的技术方案,其实现原理和技术效果与分支指令的处理方法实施例中实现原理和技术效果类似,此处不再进行赘述。
图14为本申请实施例提供的另一种分支预测器140的结构示意图,示例的,请参见图14所示,该分支预测器140可以包括预测逻辑电路1401和存储器1402;
存储器1402,用于存储分支预测表,其中,分支预测表中包括多个分支条目。
预测逻辑电路1401,用于根据目标分支指令的部分地址,在多个分支条目中查找与目标分支指令对应的目标分支条目;其中,目标分支条目包括第一字段,第一字段用于指示M个预测信息,M个预测信息是根据目标分支指令的历史跳转信息及影响目标分支指令的预测结果的信息预测得到的;若在分支预测表中查找到目标分支条目,则根据第一字段指示的预测信息,预测是否对目标分支指令执行跳转操作;其中,M为大于或者等于2的整数。
可选的,影响目标分支指令的预测结果的信息包括影响目标分支指令跳转的其它分支指令的历史跳转信息。
可选的,目标分支条目还包括第二字段,第二字段用于指示第一地址,若预测结果指示执行跳转操作,跳转操作的目标地址即为第一地址。
可选的,目标分支条目还包括第三字段,第三字段用于指示分支属性,分支预测器140还包括更新逻辑电路1403。
更新逻辑电路1403,用于若第三字段中存在空闲比特,则将M个预测信息中的至少一个预测信息存放在空闲比特中。
可选的,目标分支条目还包括第四字段,第四字段用于指示根据第四字段和第二字段确定第一地址;第二字段用于指示第二地址,第二地址为第一地址中与目标分支指令的地址不同的地址。
可选的,第四字段为第二地址的比特数;或者,第四字段为目标分支指令的地址与第一地址相同的比特数;或者,第四字段为目标分支指令的地址的比特数与第二地址的比特数之和。
可选的,第二字段设置在目标分支条目的最后一列。
可选的,M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息。
更新逻辑电路1403,还用于获取目标分支指令的实际跳转结果。
预测逻辑电路1401,还用于分别将第一规则对应的预测结果和第二规则对应的预测结果,与实际跳转结果进行比较,并根据比较结果,确定预测使用的预测信息,预测使用的预测信息为第一预测信息和第二预测信息中的一个;其中,第一规则对应的预测结果为根据m1个第一预测信息预测得到的,第二规则对应的预测结果为根据m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
可选的,预测逻辑电路1401,具体用于若第一规则对应的预测结果与实际跳转结果相同,则将第一预测信息确定为预测使用的预测信息;或者,若第二规则对应的预测结果与实际跳转结果相同,则将第二预测信息确定为预测使用的预测信息。
可选的,更新逻辑电路1403,还用于根据实际跳转结果更新第一预测信息和第二预测信息。
可选的,第一规则为全局增强GL算法,第二规则为两比特分支预测local2b算法;或者,第一规则为local2b算法,第二规则为GL算法。
本申请实施例提供的分支预测器140,可以用于执行上述图2-图12所示的分支指令的处理方法实施例中的技术方案,其实现原理和技术效果与分支指令的处理方法实施例中实现原理和技术效果类似,此处不再进行赘述。
本申请实施例还提供一种处理器150,示例的,请参见图15所示,图15为本申请实施例提供的一种处理器150的结构示意图,该处理器150可以包括:上述图2-图12中所述的分支预测器1501和处理电路1502。
其中,分支预测器1501用于预测是否对目标分支指令执行跳转操作,若预测结果指示执行跳转操作,则处理电路1502将分支预测器1501中目标分支条目中的第一地址作为跳转操作的目标地址,对目标分支指令执行跳转操作。
本申请实施例提供的处理器150,可以用于执行上述图2-图12所示的分支指令的处理方法实施例中的技术方案,其实现原理和技术效果与分支指令的处理方法实施例中实现原理和技术效果类似,此处不再进行赘述。
本申请实施例还提供一种可读存储介质,可读存储介质上存储有计算机程序;计算机 程序被执行时,用于执行如上述第一方面任一种可能的实现方式所述的分支指令的处理方法,其实现原理和技术效果与方法实施例中实现原理和技术效果类似,此处不再进行赘述。
本申请实施例还提供一种芯片,芯片上存储有计算机程序,在计算机程序被处理器执行时,用于执行如上述第一方面任一种可能的实现方式所述的分支指令的处理方法,其实现原理和技术效果与方法实施例中实现原理和技术效果类似,此处不再进行赘述。
需要说明的是,本申请实施例中对模块(或单元)的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。在本申请的实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机程序或指令。在计算机上加载和执行所述计算机程序或指令时,全部或部分地执行本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机程序或指令可以存储在计算机可读存储介质中,或者通过所述计算机可读存储介质进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是集成一个或多个可用介质的服务器等数据存储设备。所述可用介质可以是磁性介质,例如,软盘、硬盘、磁带;也可以是光介质,例如,数字多功能光盘(digital versatile disc,DVD);还可以是半导体介质,例如,固态硬盘(solid state disk,SSD)。
在本申请的各个实施例中,如果没有特殊说明以及逻辑冲突,不同的实施例之间的术语和/或描述具有一致性、且可以相互引用,不同的实施例中的技术特征根据其内在的逻辑关系可以组合形成新的实施例。
可以理解的是,在本申请的实施例中涉及的各种数字编号仅为描述方便进行的区分,并不用来限制本申请的实施例的范围。上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定。

Claims (24)

  1. 一种分支指令的处理方法,其特征在于,包括:
    根据目标分支指令的部分地址,在分支预测表中查找所述目标分支指令对应的目标分支条目;其中,所述目标分支条目包括第一字段,所述第一字段用于指示M个预测信息,所述M个预测信息是根据所述目标分支指令的历史跳转信息及影响所述目标分支指令的预测结果的信息预测得到的;M为大于或者等于2的整数;
    若在所述分支预测表中查找到所述目标分支条目,则根据所述第一字段指示的预测信息,预测是否对所述目标分支指令执行跳转操作。
  2. 根据权利要求1所述的方法,其特征在于,
    所述影响所述目标分支指令的预测结果的信息包括影响所述目标分支指令跳转的其它分支指令的历史跳转信息。
  3. 根据权利要求1或2所述的方法,其特征在于,所述目标分支条目还包括第二字段,所述第二字段用于指示第一地址,若预测结果指示执行跳转操作,则所述跳转操作的目标地址即为所述第一地址。
  4. 根据权利要求1-3任一项所述的方法,其特征在于,所述目标分支条目还包括第三字段,所述第三字段用于指示分支属性;所述方法还包括:
    若所述第三字段中存在空闲比特,则将所述M个预测信息中的至少一个预测信息存放在所述空闲比特中。
  5. 根据权利要求3所述的方法,其特征在于,所述目标分支条目还包括第四字段,所述第四字段用于指示根据所述第四字段和第二字段确定第一地址;所述第二字段用于指示第二地址,所述第二地址为所述第一地址中与所述目标分支指令的地址不同的地址。
  6. 根据权利要求5所述的方法,其特征在于,
    所述第四字段为所述第二地址的比特数;或者,所述第四字段为所述目标分支指令的地址与所述第一地址相同的比特数;或者,所述第四字段为所述目标分支指令的地址的比特数与所述第二地址的比特数之和。
  7. 根据权利要求5或6所述的方法,其特征在于,
    所述第二字段设置在所述目标分支条目的最后一列。
  8. 根据权利要求1-7任一项所述的方法,其特征在于,所述M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息,所述方法还包括:
    获取所述目标分支指令的实际跳转结果;
    分别将所述第一规则对应的预测结果和所述第二规则对应的预测结果,与所述实际跳转结果进行比较;所述第一规则对应的预测结果为根据所述m1个第一预测信息预测得到的,所述第二规则对应的预测结果为根据所述m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M;
    根据比较结果,确定预测使用的预测信息,所述预测使用的预测信息为所述第一预测信息和所述第二预测信息中的一个。
  9. 根据权利要求8所述的方法,其特征在于,所述根据比较结果,确定预测使用的预测信息,包括:
    若所述第一规则对应的预测结果与所述实际跳转结果相同,则将所述第一预测信息确定为所述预测使用的预测信息;或者,
    若所述第二规则对应的预测结果与所述实际跳转结果相同,则将所述第二预测信息确定为所述预测使用的预测信息。
  10. 根据权利要求8或9所述的方法,其特征在于,
    根据所述实际跳转结果更新所述第一预测信息和所述第二预测信息。
  11. 根据权利要求8-10任一项所述的方法,其特征在于,
    所述第一规则为全局增强GL算法,所述第二规则为两比特分支预测local2b算法;或者,所述第一规则为local2b算法,所述第二规则为GL算法。
  12. 一种分支预测器,其特征在于,包括预测逻辑电路和存储器;
    所述存储器,用于存储分支预测表,其中,所述分支预测表中包括多个分支条目;
    所述预测逻辑电路,用于根据目标分支指令的部分地址,在所述多个分支条目中查找与所述目标分支指令对应的目标分支条目;其中,所述目标分支条目包括第一字段,所述第一字段用于指示M个预测信息,M个预测信息是根据所述目标分支指令的历史跳转信息及影响所述目标分支指令的预测结果的信息预测得到的;若在所述分支预测表中查找到所述目标分支条目,则根据所述第一字段指示的预测信息,预测是否对所述目标分支指令执行跳转操作;其中,M为大于或者等于2的整数。
  13. 根据权利要求12所述的分支预测器,其特征在于,
    所述影响所述目标分支指令的预测结果的信息包括影响所述目标分支指令跳转的其它分支指令的历史跳转信息。
  14. 根据权利要求12或13所述的分支预测器,其特征在于,所述目标分支条目还包括第二字段,所述第二字段用于指示第一地址,若预测结果指示执行跳转操作,所述跳转操作的目标地址即为所述第一地址。
  15. 根据权利要求12-14任一项所述的分支预测器,其特征在于,所述目标分支条目还包括第三字段,所述第三字段用于指示分支属性,所述分支预测器还包括更新逻辑电路;
    所述更新逻辑电路,用于若所述第三字段中存在空闲比特,则将所述M个预测信息中的至少一个预测信息存放在所述空闲比特中。
  16. 根据权利要求14所述的分支预测器,其特征在于,所述目标分支条目还包括第四字段,所述第四字段用于指示根据所述第四字段和第二字段确定第一地址;所述第二字段用于指示第二地址,所述第二地址为所述第一地址中与所述目标分支指令的地址不同的地址。
  17. 根据权利要求16所述的分支预测器,其特征在于,
    所述第四字段为所述第二地址的比特数;或者,所述第四字段为所述目标分支指令的地址与所述第一地址相同的比特数;或者,所述第四字段为所述目标分支指令的地址的比特数与所述第二地址的比特数之和。
  18. 根据权利要求16或17所述的分支预测器,其特征在于,
    所述第二字段设置在所述目标分支条目的最后一列。
  19. 根据权利要求15所述的分支预测器,其特征在于,所述M个预测信息包括第一规则对应的m1个第一预测信息和第二规则对应的m2个第二预测信息;
    所述更新逻辑电路,还用于获取所述目标分支指令的实际跳转结果;
    所述预测逻辑电路,还用于分别将所述第一规则对应的预测结果和所述第二规则对应的预测结果,与所述实际跳转结果进行比较,并根据比较结果,确定预测使用的预测信息,所述预测使用的预测信息为所述第一预测信息和所述第二预测信息中的一个;其中,所述第一规则对应的预测结果为根据所述m1个第一预测信息预测得到的,所述第二规则对应的预测结果为根据所述m2个第二预测信息预测得到的;m1和m2均为大于或者等于1的整数,且m1与m2的和小于或者等于M。
  20. 根据权利要求19所述的分支预测器,其特征在于,
    所述预测逻辑电路,具体用于若所述第一规则对应的预测结果与所述实际跳转结果相同,则将所述第一预测信息确定为所述预测使用的预测信息;或者,若所述第二规则对应的预测结果与所述实际跳转结果相同,则将所述第二预测信息确定为所述预测使用的预测信息。
  21. 根据权利要求19或20所述的分支预测器,其特征在于,
    所述更新逻辑电路,还用于根据所述实际跳转结果更新所述第一预测信息和所述第二预测信息。
  22. 根据权利要求19-21任一项所述的分支预测器,其特征在于,
    所述第一规则为全局增强GL算法,所述第二规则为两比特分支预测local2b算法;或者,所述第一规则为local2b算法,所述第二规则为GL算法。
  23. 一种处理器,其特征在于,包括:
    处理电路和上述权利要求12-22任一项所述的分支预测器;
    其中,分支预测器用于预测是否对目标分支指令执行跳转操作,若预测结果指示执行跳转操作,则处理电路将所述分支预测器中目标分支条目中的第一地址作为跳转操作的目标地址,对所述目标分支指令执行跳转操作。
  24. 一种可读存储介质,其特征在于,
    所述可读存储介质上存储有计算机程序;所述计算机程序被执行时,用于执行如权利要求1-11任一项所述的分支指令的处理方法。
PCT/CN2019/080694 2019-03-30 2019-03-30 分支指令的处理方法、分支预测器及处理器 WO2020199058A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/080694 WO2020199058A1 (zh) 2019-03-30 2019-03-30 分支指令的处理方法、分支预测器及处理器
CN201980093772.5A CN113544640A (zh) 2019-03-30 2019-03-30 分支指令的处理方法、分支预测器及处理器

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/080694 WO2020199058A1 (zh) 2019-03-30 2019-03-30 分支指令的处理方法、分支预测器及处理器

Publications (1)

Publication Number Publication Date
WO2020199058A1 true WO2020199058A1 (zh) 2020-10-08

Family

ID=72664741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/080694 WO2020199058A1 (zh) 2019-03-30 2019-03-30 分支指令的处理方法、分支预测器及处理器

Country Status (2)

Country Link
CN (1) CN113544640A (zh)
WO (1) WO2020199058A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613039A (zh) * 2020-12-10 2021-04-06 海光信息技术股份有限公司 一种针对幽灵漏洞的性能优化方法及装置
CN115225572A (zh) * 2022-07-13 2022-10-21 阿里巴巴(中国)有限公司 路由信息的处理方法、装置、电子设备和存储介质

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113868899B (zh) * 2021-12-03 2022-03-04 苏州浪潮智能科技有限公司 一种分支指令处理方法、系统、设备及计算机存储介质
CN114840258B (zh) * 2022-05-10 2023-08-22 苏州睿芯集成电路科技有限公司 一种多层级混合算法过滤式分支预测方法及预测系统
CN117389629B (zh) * 2023-11-02 2024-06-04 北京市合芯数字科技有限公司 分支预测方法、装置、电子设备及介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100205405A1 (en) * 2009-02-12 2010-08-12 Jin Tai-Song Static branch prediction method and code execution method for pipeline processor, and code compiling method for static branch prediction
US20130275726A1 (en) * 2011-01-07 2013-10-17 Fujitsu Limited Arithmetic processing apparatus and branch prediction method
US20140019737A1 (en) * 2012-07-16 2014-01-16 International Business Machines Corporation Branch Prediction For Indirect Jumps
CN105867880A (zh) * 2016-04-01 2016-08-17 中国科学院计算技术研究所 一种面向间接跳转分支预测的分支目标缓冲器及设计方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6230261B1 (en) * 1998-12-02 2001-05-08 I. P. First, L.L.C. Method and apparatus for predicting conditional branch instruction outcome based on branch condition test type
GB2528676B (en) * 2014-07-25 2016-10-26 Imagination Tech Ltd Conditional Branch Prediction Using A Long History
CN112579166B (zh) * 2020-12-08 2022-11-15 海光信息技术股份有限公司 一种多级分支预测器跳过训练标识的确定方法及装置


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613039A (zh) * 2020-12-10 2021-04-06 海光信息技术股份有限公司 一种针对幽灵漏洞的性能优化方法及装置
CN112613039B (zh) * 2020-12-10 2022-09-09 成都海光微电子技术有限公司 一种针对幽灵漏洞的性能优化方法及装置
CN115225572A (zh) * 2022-07-13 2022-10-21 阿里巴巴(中国)有限公司 路由信息的处理方法、装置、电子设备和存储介质
CN115225572B (zh) * 2022-07-13 2023-05-26 阿里巴巴(中国)有限公司 路由信息的处理方法、装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN113544640A (zh) 2021-10-22

Similar Documents

Publication Publication Date Title
WO2020199058A1 (zh) 分支指令的处理方法、分支预测器及处理器
US9529595B2 (en) Branch processing method and system
US9367471B2 (en) Fetch width predictor
JP2018063684A (ja) 分岐予測器
US20080250232A1 (en) Data Processing Device, Data Processing Program, and Recording Medium Recording Data Processing Program
WO2015024452A1 (zh) 一种分支预测方法及相关装置
US8578141B2 (en) Loop predictor and method for instruction fetching using a loop predictor
US11379243B2 (en) Microprocessor with multistep-ahead branch predictor
US11416256B2 (en) Selectively performing ahead branch prediction based on types of branch instructions
JP6796717B2 (ja) 分岐ターゲットバッファの圧縮
CN109308191B (zh) 分支预测方法及装置
US8473727B2 (en) History based pipelined branch prediction
US11442727B2 (en) Controlling prediction functional blocks used by a branch predictor in a processor
CN111078296B (zh) 分支预测方法、分支预测单元及处理器核
US8707014B2 (en) Arithmetic processing unit and control method for cache hit check instruction execution
US8151096B2 (en) Method to improve branch prediction latency
US9652245B2 (en) Branch prediction for indirect jumps by hashing current and previous branch instruction addresses
US9569220B2 (en) Processor branch cache with secondary branches
WO2019183877A1 (zh) 分支预测的方法与装置
US11436146B2 (en) Storage control apparatus, processing apparatus, computer system, and storage control method
CN114610388A (zh) 一种指令跳转方法、处理器及电子设备
TW201905683A (zh) 多標籤分支預測表
US20150134939A1 (en) Information processing system, information processing method and memory system
US20230367596A1 (en) Instruction prediction method and apparatus, system, and computer-readable storage medium
US12014176B2 (en) Apparatus and method for pipeline control

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19923531

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19923531

Country of ref document: EP

Kind code of ref document: A1