CN114816536B - Branch prediction processing method, device, equipment and storage medium


Info

Publication number
CN114816536B
CN114816536B (application CN202210753823.6A)
Authority
CN
China
Prior art keywords
instruction
branch
predicted
branch prediction
prediction result
Prior art date
Legal status
Active
Application number
CN202210753823.6A
Other languages
Chinese (zh)
Other versions
CN114816536A (en)
Inventor
冯明鹤
高军
王小岛
李晋阳
Current Assignee
Phytium Technology Co Ltd
Original Assignee
Phytium Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Phytium Technology Co Ltd filed Critical Phytium Technology Co Ltd
Priority to CN202210753823.6A priority Critical patent/CN114816536B/en
Publication of CN114816536A publication Critical patent/CN114816536A/en
Application granted granted Critical
Publication of CN114816536B publication Critical patent/CN114816536B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F 9/3861 Recovery, e.g. branch miss-prediction, exception handling
    • G06F 9/3863 Recovery using multiple copies of the architectural state, e.g. shadow registers
    • G06F 9/3802 Instruction prefetching
    • G06F 9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F 9/3806 Instruction prefetching for branches using address prediction, e.g. return stack, branch history buffer
    • G06F 9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F 9/3842 Speculative instruction execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The present disclosure provides a branch prediction processing method, apparatus, device and storage medium, relating to the technical field of processors. The method mainly includes: obtaining a branch prediction result of an instruction to be predicted, the instruction to be predicted being a branch instruction; judging whether the branch prediction result meets a first preset condition; and if the branch prediction result meets the first preset condition, backing up the branch path of the instruction to be predicted to obtain a backup path. With the branch prediction processing method, apparatus, device and storage medium of the present disclosure, when a branch misprediction occurs, the path in the correct branch direction of the instruction to be predicted is taken out of the backup path and execution continues, and the instruction execution results do not need to be flushed, so that instruction execution delay is avoided and the performance of the processor is improved.

Description

Branch prediction processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of processor technologies, and in particular, to a branch prediction processing method, apparatus, device, and storage medium.
Background
When a processor executes a branch instruction, the next instruction can be determined only after the branch instruction finishes executing, which leaves several pipeline stages idle and reduces the performance of the processor. To solve this problem, existing processors generally adopt branch prediction: the branch direction and target address are predicted before the execute stage, and the instruction jump is then performed according to the prediction result. However, a branch misprediction may still occur, and among its causes, mispredicting the branch direction has the greatest impact.
In the prior art, if a branch misprediction occurs, the whole branch path is usually flushed, i.e. the execution results of the instructions after the misprediction point are all discarded, and instructions are then fetched from the correct branch direction and execution restarts. This approach causes instruction execution delay, because the instructions in the correct branch direction must be re-executed, and flushing the instruction execution results also incurs a certain overhead, thereby greatly affecting the performance of the processor.
Disclosure of Invention
The present disclosure provides a branch prediction processing method, apparatus, device and storage medium, so as to at least solve the above technical problems in the prior art.
According to a first aspect of the present disclosure, there is provided a branch prediction processing method, including: obtaining a branch prediction result of an instruction to be predicted, wherein the instruction to be predicted is a branch instruction; judging whether the branch prediction result meets a first preset condition; and if the branch prediction result meets the first preset condition, backing up the branch path of the instruction to be predicted to obtain a backup path.
In one embodiment, the determining whether the branch prediction result satisfies the first preset condition includes: judging whether the historical jump count corresponding to the instruction to be predicted is smaller than a first preset threshold; if the historical jump count is smaller than the first preset threshold, the branch prediction result meets the first preset condition; and if the historical jump count is not smaller than the first preset threshold, the branch prediction result does not meet the first preset condition.
In an embodiment, the backing up the branch path of the instruction to be predicted to obtain a backup path includes: judging whether the branch length of the instruction to be predicted meets a second preset condition; if the branch length of the instruction to be predicted meets the second preset condition, writing the backup path into a first issue queue; and if the branch length of the instruction to be predicted does not meet the second preset condition, saving the backup path to a second issue queue and writing the main path corresponding to the branch prediction result into the first issue queue.
In one embodiment, the determining whether the branch length of the instruction to be predicted satisfies the second preset condition includes: judging whether the number of instructions in the branch of the instruction to be predicted is smaller than a second preset threshold, or judging whether the branch of the instruction to be predicted is within one block; if so, the branch length of the instruction to be predicted meets the second preset condition; if not, it does not.
In an embodiment, the method further comprises: when the branch prediction result is judged to be wrong, taking the path in the correct branch direction of the instruction to be predicted out of the backup path and continuing execution.
In one embodiment, whether the branch prediction result is wrong is determined by: obtaining an execution result of the instruction to be predicted; and if the execution result is different from the branch prediction result, judging that the branch prediction result is wrong.
In an embodiment, the method further comprises: writing the main path corresponding to the branch prediction result into the first issue queue when the branch prediction result does not meet the first preset condition.
According to a second aspect of the present disclosure, there is provided a branch prediction processing apparatus, the apparatus including: the first obtaining module is used for obtaining a branch prediction result of an instruction to be predicted, and the instruction to be predicted is a branch instruction; the first judgment module is used for judging whether the branch prediction result meets a first preset condition or not; and the backup module is used for backing up the branch path of the instruction to be predicted to obtain a backup path when the branch prediction result meets a first preset condition.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the methods of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of the present disclosure.
According to the branch prediction processing method, apparatus, device and storage medium of the present disclosure, when the branch prediction result meets the first preset condition, the branch path of the instruction to be predicted is backed up to obtain a backup path, so that when a branch is mispredicted the processor can take the path in the correct branch direction of the instruction to be predicted out of the backup path and continue execution, without flushing the instruction execution results; instruction execution delay is thus avoided, and the performance of the processor is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Fig. 1 is a flowchart illustrating a branch prediction processing method according to a first embodiment of the present disclosure;
FIG. 2 is a flow chart diagram illustrating a branch prediction processing method according to a third embodiment of the present disclosure;
fig. 3 is a flowchart illustrating a branch prediction processing method according to a fifth embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a branch prediction processing apparatus according to an eighth embodiment of the present disclosure;
fig. 5 shows a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, features and advantages of the present disclosure more apparent and understandable, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
Fig. 1 shows a schematic flowchart of a branch prediction processing method according to a first embodiment of the present disclosure, and as shown in fig. 1, the method mainly includes:
step S101, a branch prediction result of an instruction to be predicted is obtained, and the instruction to be predicted is a branch instruction.
In this embodiment, a branch prediction result of an instruction to be predicted needs to be obtained first, where the instruction to be predicted is a branch instruction, the branch prediction result includes a branch direction and a target address of the instruction to be predicted, and the branch direction indicates whether the instruction to be predicted jumps. A branch instruction is a special instruction that can change the execution flow of a program or call a subroutine, so the next instruction cannot be determined until the branch instruction is executed.
However, when a processor executes a branch instruction, the next instruction can be determined only after the branch instruction finishes executing; that is, before the branch instruction completes, the fetch and decode stages of the pipeline are idle, which reduces the performance of the processor. To solve this problem, conventional processors adopt branch prediction when executing a branch instruction: the branch direction and target address are predicted before the branch instruction executes, and the next instruction is then fetched in advance according to the branch prediction result, so that pipeline stalls are avoided and the performance of the processor is improved.
In one implementation, the branch prediction may be performed on the instruction to be predicted through static branch prediction or dynamic branch prediction, and a branch prediction result of the instruction to be predicted is obtained, where the branch prediction result includes a branch direction and a target address of the instruction to be predicted.
Step S102, judging whether the branch prediction result meets a first preset condition.
In this embodiment, after obtaining the branch prediction result of the instruction to be predicted, it is further required to determine whether the branch prediction result satisfies a first preset condition.
In one embodiment, a counter is set for the instruction to be predicted to record its historical jump count; the counter is incremented by one when the branch jumps and decremented by one when it does not. The processor generally determines the branch prediction result according to this counter (the historical jump count): the more often the instruction to be predicted has jumped during past execution, the greater the probability that it will jump, and the instruction is then in a "strong jump" state; the less often it has jumped, the smaller the probability that it will jump, and the instruction is then in a "weak jump" state. If the branch prediction result is in the "strong jump" state, the instruction to be predicted is very likely to jump; if it is in the "weak jump" state, the instruction may or may not jump. Judging whether the branch prediction result meets the first preset condition is thus judging whether the branch prediction result is in the "weak jump" state.
Step S103, if the branch prediction result meets the first preset condition, the branch path of the instruction to be predicted is backed up to obtain a backup path.
In this embodiment, if the branch prediction result meets the first preset condition, that is, the branch prediction result is in a "weak jump" state, the branch path of the instruction to be predicted is backed up to obtain a backup path, and the backup path is used to ensure that the processor can continue to execute the instruction corresponding to the correct branch direction when the branch prediction result is incorrect.
In an implementation, if the branch prediction result is in the "weak jump" state, the probability that the branch prediction result is wrong is relatively high. Instructions are still fetched and executed according to the branch direction in the branch prediction result, but if the branch prediction result turns out to be wrong after the instruction to be predicted finishes executing, the path corresponding to the correct branch direction can be taken directly from the backup path and executed; the already-executed branch path does not need to be flushed, and the instructions in the correct branch direction do not need to be re-fetched and re-executed.
In an implementation, the backup path may be stored in the issue queue corresponding to the instruction to be predicted; when the instruction corresponding to the branch prediction result is issued, the backup path is carried along with it, and once a branch misprediction occurs, execution starts directly from the correct branch direction. Alternatively, the backup path may be stored in another issue queue, and when a branch misprediction occurs, the path corresponding to the correct branch direction is taken directly from that queue and executed.
In the first embodiment of the present disclosure, when the branch prediction result meets the first preset condition, that is, when the branch prediction result is in the "weak jump" state, the branch path of the instruction to be predicted is backed up to obtain a backup path. This ensures that when a branch misprediction occurs, the processor can take the path in the correct branch direction of the instruction to be predicted out of the backup path and continue execution, so that instruction execution delay is avoided and the performance of the processor is improved.
In the second embodiment of the present disclosure, step S102 mainly includes:
judging whether the historical jump count corresponding to the instruction to be predicted is smaller than a first preset threshold; if the historical jump count is smaller than the first preset threshold, the branch prediction result meets the first preset condition; and if the historical jump count is not smaller than the first preset threshold, the branch prediction result does not meet the first preset condition.
In this embodiment, whether the branch prediction result meets the first preset condition, that is, whether the branch prediction result is in the "weak jump" state, may be determined according to whether the historical jump count corresponding to the instruction to be predicted is smaller than the first preset threshold: if the historical jump count is smaller than the first preset threshold, the branch prediction result meets the first preset condition; if not, it does not.
In an implementation, a counter corresponding to the instruction to be predicted records its historical jump count: if the instruction to be predicted jumps, the counter is incremented by one; if it does not jump, the counter is decremented by one. If the value of the counter is smaller than the first preset threshold, the instruction to be predicted is in the "weak jump" state, i.e. the branch prediction result meets the first preset condition; if the value of the counter is not smaller than the first preset threshold, the instruction to be predicted is in the "strong jump" state, i.e. the branch prediction result does not meet the first preset condition. The first preset threshold may be set according to the actual situation: if it is large, the branch path must be backed up for most instructions to be predicted, which increases storage cost; if it is small, the branch path is not backed up for most instructions to be predicted, which may delay the execution of some instructions. Storage cost and the accuracy of instruction fetching therefore need to be weighed together to set a reasonable first preset threshold.
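The counter mechanism described above can be sketched in software. The following is an illustrative sketch only, not the patented hardware design: the class name, the saturation limit of 3, and the threshold value of 2 are all assumptions made for this example; the patent only requires comparing the historical jump count with a first preset threshold.

```python
class JumpCounter:
    """Saturating counter recording a branch's historical jump count."""

    def __init__(self, max_value=3):
        self.value = 0              # historical jump count
        self.max_value = max_value  # saturation limit (assumed)

    def update(self, taken):
        # incremented by one on a jump, decremented by one otherwise
        if taken:
            self.value = min(self.value + 1, self.max_value)
        else:
            self.value = max(self.value - 1, 0)

    def meets_first_preset_condition(self, threshold=2):
        # "weak jump" state: count below the first preset threshold,
        # so the branch path should be backed up
        return self.value < threshold


counter = JumpCounter()
counter.update(True)                           # one jump so far
print(counter.meets_first_preset_condition())  # 1 < 2 -> True
counter.update(True)
counter.update(True)
print(counter.meets_first_preset_condition())  # 3 >= 2 -> False
```

After three jumps the counter reaches the "strong jump" state and, per the embodiment, no backup would be made for subsequent predictions of this branch.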
In the second embodiment of the present disclosure, whether the branch prediction result meets the first preset condition, that is, whether it is in the "weak jump" state, is determined from the historical jump count corresponding to the instruction to be predicted, which helps decide subsequently whether to back up the branch path of an instruction in the "weak jump" state.
Fig. 2 is a flowchart illustrating a branch prediction processing method according to a third embodiment of the present disclosure, and as shown in fig. 2, backing up a branch path of an instruction to be predicted to obtain a backup path includes:
in step S201, it is determined whether the branch length of the instruction to be predicted satisfies a second preset condition.
In this embodiment, it is necessary to determine whether the branch length of the instruction to be predicted meets a second preset condition. In general, the branch of the instruction to be predicted can be classified as a long branch or a short branch according to the number of instructions between the source address and the target address of the branch jump; judging whether the branch length meets the second preset condition is thus judging whether the branch of the instruction to be predicted is a short branch.
In step S202, if the branch length of the instruction to be predicted satisfies the second preset condition, the backup path is written into the first issue queue.
In this embodiment, if the branch length of the instruction to be predicted satisfies the second preset condition, that is, the branch is a short branch, the number of instructions between the source address and the target address of the branch jump is small, and so is the number of instructions in the backup path. The whole backup path is therefore written directly into the first issue queue, the issue queue corresponding to the instruction to be predicted.
In an implementation, when the instruction corresponding to the branch prediction result is issued and executed, it can be issued together with the backup path. If the branch prediction result turns out to be wrong, the path in the correct branch direction of the instruction to be predicted can be executed directly, without clearing the execution results of the predicted path or re-fetching and re-decoding the instructions in the correct branch direction. For example, suppose the instruction path is instruction 1 - instruction 2 - instruction 3 - instruction 4 - instruction 5 - instruction 6, the instruction to be predicted is instruction 3, and the branch prediction result is that instruction 3 jumps to instruction 6. The branch path instruction 3 - instruction 4 - instruction 5 - instruction 6 can then be written entirely into the first issue queue as the backup path, and the whole backup path is issued together at issue time. If the prediction that instruction 3 jumps to instruction 6 is wrong, instruction 4 - instruction 5 - instruction 6 can be executed directly; instruction 4, instruction 5 and instruction 6 need not be re-fetched or re-decoded.
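The short-branch example above can be illustrated with a small behavioural sketch. The instruction names and queue variables are hypothetical, and only the queue contents are modelled, not real issue hardware:

```python
# Instruction stream from the example: instr1 .. instr6
stream = ["instr1", "instr2", "instr3", "instr4", "instr5", "instr6"]

# The branch at instr3 is predicted to jump to instr6 (a short branch),
# so the whole branch path instr3..instr6 is written into the first
# issue queue as the backup path.
first_issue_queue = stream[2:]     # ["instr3", ..., "instr6"]

predicted_taken = True             # prediction: instr3 jumps to instr6
actual_taken = False               # execution result: no jump

if predicted_taken != actual_taken:
    # Misprediction: instr4..instr6 are already fetched, decoded and
    # queued, so execution resumes at instr4 without a flush.
    resume_path = first_issue_queue[first_issue_queue.index("instr4"):]
else:
    resume_path = []

print(resume_path)   # ['instr4', 'instr5', 'instr6']
```

The key point the sketch shows is that on a misprediction the fall-through instructions are already in the queue, so no re-fetch or re-decode is needed.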
Step S203, if the branch length of the instruction to be predicted does not satisfy the second preset condition, the backup path is saved to the second issue queue, and the main path corresponding to the branch prediction result is written into the first issue queue.
In this embodiment, if the branch length of the instruction to be predicted does not satisfy the second preset condition, that is, the branch is a long branch, the number of instructions between the source address and the target address of the branch jump is large, and so is the number of instructions in the backup path. Writing the whole backup path directly into the first issue queue would then crowd out space for other instructions, so the backup path is stored in another queue, the second issue queue, and only the main path corresponding to the branch prediction result is written into the first issue queue.
In an implementation, the main path corresponding to the branch prediction result can be issued directly from the first issue queue and executed. If the branch prediction result turns out to be wrong, the path in the correct branch direction of the instruction to be predicted can be taken directly from the second issue queue and executed, without re-fetching or re-decoding it. For example, suppose the instruction path is instruction 1 - instruction 2 - instruction 3 - instruction 4 - instruction 5 - instruction 6 - instruction 7 - instruction 8, the instruction to be predicted is instruction 3, and the branch prediction result is that instruction 3 jumps to instruction 8. The branch path instruction 4 - instruction 5 - instruction 6 - instruction 7 - instruction 8 can then be saved to the second issue queue as the backup path, and only instruction 3 and instruction 8 are written into the first issue queue. If the prediction that instruction 3 jumps to instruction 8 is wrong, the backup path can be fetched from the second issue queue and execution resumes from instruction 4.
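The long-branch example above can be sketched the same way; again the names are hypothetical and only queue contents are modelled:

```python
# Instruction stream from the example: instr1 .. instr8
stream = [f"instr{i}" for i in range(1, 9)]

# The branch at instr3 is predicted to jump to instr8 (a long branch):
# only the main path goes into the first issue queue, while the
# fall-through path instr4..instr8 is saved to the second issue queue.
first_issue_queue = ["instr3", "instr8"]   # main (predicted) path
second_issue_queue = stream[3:]            # backup: instr4..instr8

mispredicted = True                        # instr3 did not actually jump
if mispredicted:
    # Take the backup from the second issue queue; execution resumes
    # from instr4 without flushing or re-decoding.
    resume_path = second_issue_queue
else:
    resume_path = []

print(resume_path[0])   # instr4
```

Compared with the short-branch case, the first issue queue holds only two entries here, which is the space saving that motivates the second queue.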
In the third embodiment of the present disclosure, the backup path is stored in different issue queues according to the branch length of the instruction to be predicted: if the branch of the instruction to be predicted is a short branch, the backup path is written into the first issue queue; if it is a long branch, the backup path is stored in the second issue queue, so that the backup does not occupy space in the first issue queue.
In the fourth embodiment of the present disclosure, step S201 mainly includes:
judging whether the number of instructions in the branch of the instruction to be predicted is smaller than a second preset threshold, or judging whether the branch of the instruction to be predicted is within one block; if so, the branch length of the instruction to be predicted meets the second preset condition; if not, it does not.
In this embodiment, whether the branch length of the instruction to be predicted meets the second preset condition, that is, whether the branch is a short branch, can be determined from the number of instructions in the branch (the number of instructions between the source address and the target address of the branch jump), or from whether the branch of the instruction to be predicted lies within one block. If so, the branch length meets the second preset condition; if not, it does not. Specifically, a block represents a prediction step size, for example 8 consecutive instructions; if the branch of the instruction to be predicted and its target address are within one block, the branch can be considered a short branch.
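A sketch of the second preset condition follows. The block size of 8 comes from the example above; the threshold value of 6 is purely illustrative, and instruction indices stand in for real byte addresses:

```python
BLOCK_SIZE = 8                 # prediction step size from the example
SECOND_PRESET_THRESHOLD = 6    # assumed value; the patent leaves it tunable


def is_short_branch(source_index, target_index,
                    block_size=BLOCK_SIZE,
                    threshold=SECOND_PRESET_THRESHOLD):
    """Second preset condition: the branch contains fewer instructions
    than the threshold, or source and target lie within one block."""
    instructions_in_branch = target_index - source_index - 1
    same_block = (source_index // block_size) == (target_index // block_size)
    return instructions_in_branch < threshold or same_block


print(is_short_branch(2, 5))     # 2 instructions in between -> True
print(is_short_branch(2, 100))   # long branch across blocks -> False
```

Either sub-condition alone suffices, matching the "or" in the embodiment.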
In an implementation, the second preset threshold may be set according to the actual situation. For example, experiments may be run offline in advance with different second preset thresholds, and the optimal value chosen according to the final instruction execution effect and processor performance; alternatively, adaptive hardware may monitor the execution effect and processor performance under different second preset thresholds and automatically switch to the optimal one.
In the fourth embodiment of the present disclosure, whether a branch of an instruction to be predicted is a short branch is determined from the number of instructions in the branch or from whether the branch lies within one block, which makes it convenient to store the backup path in the appropriate issue queue according to the branch length, thereby improving the performance of the processor.
Fig. 3 is a flowchart illustrating a branch prediction processing method according to a fifth embodiment of the disclosure, and as shown in fig. 3, the method further includes:
and step S104, when the branch prediction result is judged to be wrong, taking out the path in the correct branch direction of the instruction to be predicted from the backup path and continuing to execute.
In this embodiment, after the branch path of the instruction to be predicted has been backed up to obtain the backup path, the path in the correct branch direction of the instruction to be predicted can be taken directly from the backup path and execution continued whenever the branch prediction result is judged to be wrong.
In an implementation manner, after the instruction to be predicted is executed, its correct branch direction can be obtained. If the correct branch direction is inconsistent with the branch direction in the branch prediction result, the branch prediction result is wrong, and the path in the correct branch direction of the instruction to be predicted can be taken directly from the backup path to continue execution. If the backup path is stored in the issue queue corresponding to the instruction to be predicted, the backup path is carried when the instruction corresponding to the branch prediction result is issued, and once a branch misprediction occurs, execution proceeds directly from the correct branch direction of the backup path. If the backup path is stored in another issue queue, then on a branch misprediction the path corresponding to the correct branch direction is taken directly from that queue and executed.
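The recovery flow just described can be sketched as follows. The queue entries and their fields are hypothetical, since the patent does not specify a concrete layout:

```python
def recover_on_mispredict(branch_pc, first_queue, second_queue):
    # prefer the backup carried in the first (primary) issue queue, then fall
    # back to the second issue queue used for long branches
    for queue in (first_queue, second_queue):
        for entry in queue:
            if entry["pc"] == branch_pc and entry["is_backup"]:
                return entry["path"]  # instructions on the correct direction
    return None  # no backup stored: the pipeline must re-fetch and re-decode
```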
In the fifth embodiment of the present disclosure, when the branch prediction result is wrong, execution continues directly with the instructions in the correct branch direction of the instruction to be predicted; those instructions do not need to be fetched and decoded again, so the delay of instruction execution is avoided and the performance of the processor is further improved.
In a sixth embodiment of the present disclosure, whether the branch prediction result is erroneous is determined by:
acquiring an execution result of an instruction to be predicted; if the execution result is different from the branch prediction result, the branch prediction result is judged to be wrong.
In this embodiment, after the instruction to be predicted is executed, its execution result may be obtained; the execution result includes the correct branch direction and the correct target address of the instruction to be predicted. If the execution result is different from the branch prediction result, the branch prediction result is wrong, and the instructions in the correct branch direction of the instruction to be predicted may be taken from the backup path according to the execution result to continue execution.
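The comparison in this embodiment reduces to checking both fields of the execution result against the prediction. The (direction, target address) pair layout below is an assumption for illustration:

```python
def prediction_wrong(executed, predicted):
    # each value is a (branch_direction, target_address) pair
    exec_dir, exec_target = executed
    pred_dir, pred_target = predicted
    # the prediction is wrong if either the direction or the target differs
    return exec_dir != pred_dir or exec_target != pred_target
```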
In a seventh embodiment of the present disclosure, the method further comprises:
and writing the main path corresponding to the branch prediction result into the first issue queue when the branch prediction result does not meet the first preset condition.
In this embodiment, if the branch prediction result does not satisfy the first preset condition, that is, if the branch prediction result is in the "strong jump" state, the confidence that the instruction to be predicted will jump is very high and the probability of the jump is very high. The branch path of the instruction to be predicted therefore does not need to be backed up; the main path corresponding to the branch prediction result is written directly into the first issue queue, and the next instruction is executed according to the branch prediction result.
In practical applications, the branch path of the instruction to be predicted may also be backed up when the branch prediction result does not satisfy the first preset condition, that is, when the branch prediction result is in the "strong jump" state, with the instructions in the correct branch direction taken from the backup path to continue execution when the branch prediction result is wrong. This guards against the low-probability event that a prediction in the "strong jump" state is still wrong, avoiding the resulting impact on processor performance. However, if the "strong jump" and "weak jump" states are not distinguished and the branch paths of all instructions to be predicted are backed up, too much processor cache may be consumed. Therefore, in practice, processor performance and processor cache may be weighed together to decide whether to back up the branch path of the instruction to be predicted when the branch prediction result does not meet the first preset condition, that is, when the branch prediction result is in the "strong jump" state.
The B.cond instruction jumps to the target address for execution when the condition specified by the instruction is met. The following further describes the branch prediction processing method provided by an embodiment of the present disclosure using the execution process of a B.cond instruction as an example:
First, branch prediction is performed on the B.cond instruction to obtain its branch prediction result, and whether the branch prediction result meets the first preset condition is judged according to the historical jump times corresponding to the B.cond instruction. If the branch prediction result does not meet the first preset condition, the main path corresponding to the branch prediction result is written into the first issue queue. If the branch prediction result meets the first preset condition, whether the branch length of the B.cond instruction meets the second preset condition is judged according to the number of instructions in the branch or whether the branch lies within one block. If the branch length meets the second preset condition, the backup path is written into the first issue queue; if it does not, the backup path is saved to the second issue queue and the main path corresponding to the branch prediction result is written into the first issue queue. After the B.cond instruction finishes executing, if the execution result is different from the branch prediction result, the branch prediction result is wrong, and the path in the correct branch direction of the instruction to be predicted can be taken from the backup path to continue execution. In this way the delay of instruction execution is avoided and the performance of the processor is improved.
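A hedged sketch, not from the patent, of the B.cond handling flow just described. The threshold value and the path objects are illustrative assumptions, and the short-branch test is passed in as a precomputed flag:

```python
FIRST_THRESHOLD = 3    # illustrative "strong jump" cutoff for historical jump times

def handle_branch(history_jumps, is_short, main_path, backup_path,
                  first_queue, second_queue, threshold=FIRST_THRESHOLD):
    if history_jumps >= threshold:
        # "strong jump": confidence is high, no backup is needed
        first_queue.append(main_path)
    elif is_short:
        # "weak jump", short branch: the backup path goes into the first queue
        first_queue.append(backup_path)
    else:
        # "weak jump", long branch: backup to the second queue,
        # while the main path issues normally from the first queue
        second_queue.append(backup_path)
        first_queue.append(main_path)
```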
Fig. 4 is a schematic structural diagram of a branch prediction processing apparatus according to an eighth embodiment of the present disclosure, and as shown in fig. 4, the apparatus mainly includes:
a first obtaining module 40, configured to obtain a branch prediction result of an instruction to be predicted, where the instruction to be predicted is a branch instruction; a first judging module 41, configured to judge whether the branch prediction result meets a first preset condition; and a backup module 42, configured to back up the branch path of the instruction to be predicted to obtain a backup path when the branch prediction result meets the first preset condition.
In an implementation manner, the first judging module 41 is configured to judge whether the historical jump times corresponding to the instruction to be predicted are less than a first preset threshold; if the historical jump times are less than the first preset threshold, the branch prediction result meets the first preset condition; if not, the branch prediction result does not meet the first preset condition.
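The check performed by this module can be sketched directly; the threshold value is an assumption, since the patent leaves it configurable:

```python
FIRST_PRESET_THRESHOLD = 3   # illustrative value for the first preset threshold

def meets_first_condition(historical_jump_times, threshold=FIRST_PRESET_THRESHOLD):
    # fewer historical jumps than the threshold means a "weak jump" prediction,
    # so the first preset condition (back up the branch path) is met
    return historical_jump_times < threshold
```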
In one embodiment, the backup module 42 includes: a judging submodule 421, configured to judge whether the branch length of the instruction to be predicted meets a second preset condition; a writing submodule 422, configured to write the backup path into the first issue queue if the branch length of the instruction to be predicted meets the second preset condition; and a saving submodule 423, configured to save the backup path to the second issue queue and write the main path corresponding to the branch prediction result into the first issue queue if the branch length of the instruction to be predicted does not meet the second preset condition.
In an implementation manner, the judging submodule 421 is configured to judge whether the number of instructions in the branch of the instruction to be predicted is less than a second preset threshold, or whether the branch of the instruction to be predicted lies within one block; if so, the branch length of the instruction to be predicted meets the second preset condition; if not, it does not.
In one embodiment, the apparatus further comprises: and the execution module 43 is configured to, when the branch prediction result is judged to be incorrect, take the path in the correct branch direction of the instruction to be predicted from the backup path and continue executing.
In one embodiment, the apparatus further comprises: a second obtaining module 44, configured to obtain an execution result of the instruction to be predicted; a second determining module 45, configured to determine that the branch prediction result is incorrect if the execution result is different from the branch prediction result.
In an implementation manner, the apparatus further includes a writing module 46, configured to write the main path corresponding to the branch prediction result into the first issue queue when the branch prediction result does not satisfy the first preset condition.
The present disclosure also provides an electronic device and a readable storage medium according to an embodiment of the present disclosure.
FIG. 5 illustrates a schematic block diagram of an example electronic device 500 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the device 500 includes a computing unit 501, which may perform various appropriate actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 502 or loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 501 performs the respective methods and processes described above, such as the branch prediction processing method. For example, in some embodiments, the branch prediction processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the branch prediction processing method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the branch prediction processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that the various flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; the present disclosure is not limited herein.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality" means two or more unless specifically limited otherwise.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present disclosure, and all the changes or substitutions should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (9)

1. A branch prediction processing method, the method comprising:
obtaining a branch prediction result of an instruction to be predicted, wherein the instruction to be predicted is a branch instruction;
judging whether the branch prediction result meets a first preset condition or not;
the branch prediction result meets a first preset condition, and a branch path of the instruction to be predicted is backed up to obtain a backup path;
wherein, the backing up the branch path of the instruction to be predicted to obtain a backup path includes:
judging whether the branch length of the instruction to be predicted meets a second preset condition or not;
if the branch length of the instruction to be predicted meets a second preset condition, writing the backup path into a first issue queue;
and if the branch length of the instruction to be predicted does not meet a second preset condition, saving the backup path to a second issue queue, and writing the main path corresponding to the branch prediction result into the first issue queue.
2. The method of claim 1, wherein determining whether the branch prediction result satisfies a first predetermined condition comprises:
judging whether the historical jump times corresponding to the instruction to be predicted is smaller than a first preset threshold value or not;
if the historical jump times are smaller than the first preset threshold, the branch prediction result meets the first preset condition;
and if the historical jump times are not less than the first preset threshold, the branch prediction result does not meet the first preset condition.
3. The method as claimed in claim 1, wherein said determining whether the branch length of the instruction to be predicted satisfies a second preset condition comprises:
judging whether the number of the instructions in the branches of the instructions to be predicted is smaller than a second preset threshold value or not, or judging whether the branches of the instructions to be predicted are in a block or not;
if so, the branch length of the instruction to be predicted meets a second preset condition;
and if the judgment result is negative, the branch length of the instruction to be predicted does not meet a second preset condition.
4. The method of claim 1, further comprising:
and when the branch prediction result is judged to be wrong, taking out the path in the correct branch direction of the instruction to be predicted from the backup path and continuing to execute.
5. The method of claim 4, wherein determining whether the branch prediction result is incorrect is performed by:
acquiring an execution result of the instruction to be predicted;
and if the execution result is different from the branch prediction result, judging that the branch prediction result is wrong.
6. The method according to any one of claims 1 to 5, further comprising:
and writing the main path corresponding to the branch prediction result into the first issue queue when the branch prediction result does not meet the first preset condition.
7. A branch prediction processing apparatus, comprising:
the first obtaining module is used for obtaining a branch prediction result of an instruction to be predicted, and the instruction to be predicted is a branch instruction;
the first judgment module is used for judging whether the branch prediction result meets a first preset condition or not;
the backup module is used for backing up the branch path of the instruction to be predicted to obtain a backup path when the branch prediction result meets a first preset condition;
wherein, the backing up the branch path of the instruction to be predicted to obtain a backup path includes:
judging whether the branch length of the instruction to be predicted meets a second preset condition or not;
judging whether the branch length of the instruction to be predicted meets a second preset condition; if the branch length of the instruction to be predicted meets the second preset condition, writing the backup path into a first issue queue;
and if the branch length of the instruction to be predicted does not meet the second preset condition, saving the backup path to a second issue queue, and writing the main path corresponding to the branch prediction result into the first issue queue.
8. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
9. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
CN202210753823.6A 2022-06-30 2022-06-30 Branch prediction processing method, device, equipment and storage medium Active CN114816536B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210753823.6A CN114816536B (en) 2022-06-30 2022-06-30 Branch prediction processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210753823.6A CN114816536B (en) 2022-06-30 2022-06-30 Branch prediction processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114816536A CN114816536A (en) 2022-07-29
CN114816536B true CN114816536B (en) 2022-09-20

Family

ID=82523415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210753823.6A Active CN114816536B (en) 2022-06-30 2022-06-30 Branch prediction processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114816536B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117420770B (en) * 2023-12-01 2024-04-26 上海频准激光科技有限公司 Data simulation system for multipath laser control

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109705A (en) * 2019-05-14 2019-08-09 核芯互联科技(青岛)有限公司 A kind of superscalar processor branch prediction method for supporting embedded edge calculations

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9477481B2 (en) * 2014-06-27 2016-10-25 International Business Machines Corporation Accurate tracking of transactional read and write sets with speculation
US9703667B2 (en) * 2015-02-22 2017-07-11 International Business Machines Corporation Hardware-based edge profiling
CN106843812A (en) * 2016-12-23 2017-06-13 北京北大众志微系统科技有限责任公司 A kind of method and device for realizing the prediction of indirect branch associated software
CN109960607B (en) * 2017-12-22 2021-04-20 龙芯中科技术股份有限公司 Error recovery method and device of prediction stack and storage medium
CN110147293A (en) * 2019-05-20 2019-08-20 江南大学 A method of reducing microprocessor soft error neurological susceptibility
CN111459549B (en) * 2020-04-07 2022-11-01 上海兆芯集成电路有限公司 Microprocessor with highly advanced branch predictor
US11768686B2 (en) * 2020-07-27 2023-09-26 Nvidia Corporation Out of order memory request tracking structure and technique

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109705A (en) * 2019-05-14 2019-08-09 核芯互联科技(青岛)有限公司 A kind of superscalar processor branch prediction method for supporting embedded edge calculations

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Probabilistic Monte Carlo Framework for Branch Prediction;Bhargava Kalla;《2017 IEEE International Conference on Cluster Computing (CLUSTER)》;20170908;651-652 *
Multi-path Trace Processor; Du, Guiran; China Doctoral Dissertations Full-text Database; 20040115; I137-3 *

Also Published As

Publication number Publication date
CN114816536A (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN114816536B (en) Branch prediction processing method, device, equipment and storage medium
CN113778644B (en) Task processing method, device, equipment and storage medium
US9652245B2 (en) Branch prediction for indirect jumps by hashing current and previous branch instruction addresses
CN113032258B (en) Electronic map testing method and device, electronic equipment and storage medium
CN113590287B (en) Task processing method, device, equipment, storage medium and scheduling system
US20230177143A1 (en) Operating a secure code segment on a processor core of a processing unit
CN113139891A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112462921B (en) Application management method, device and storage medium
CN114386577A (en) Method, apparatus, and storage medium for executing deep learning model
CN113012682A (en) False wake-up rate determination method, device, apparatus, storage medium, and program product
EP3905654A1 (en) Method and apparatus for detecting echo delay and electronic device
CN115729688B (en) Multithreading scheduling method and device for processor, electronic equipment and storage medium
CN115495312B (en) Service request processing method and device
CN114721725B (en) Branch instruction execution method and device, electronic equipment and storage medium
CN115878362A (en) Operating system abnormity positioning method, device, equipment and storage medium
CN115981985A (en) Code block monitoring method and device, electronic equipment and readable storage medium
CN115098167A (en) Instruction execution method and device
CN115129462A (en) Method and device for determining processor load
CN117873775A (en) Data updating method, device, equipment and storage medium
CN116719621A (en) Data write-back method, device, equipment and medium for mass tasks
CN116126669A (en) Performance detection method and device for heterogeneous acceleration program and storage medium
CN117271113A (en) Task execution method, device, electronic equipment and storage medium
CN114217872A (en) Application program starting method and device, electronic equipment and storage medium
CN116107927A (en) Data processing device, data processing method and electronic equipment
CN115600687A (en) Model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant