CN112445520A - Branch prediction optimization method for conditional branch instructions in loops - Google Patents
- Publication number: CN112445520A
- Application number: CN201910794939.2A
- Authority: CN (China)
- Prior art keywords: instruction, conditional, branch, condition, loop
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30058—Conditional branch instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The invention discloses a branch prediction optimization method for conditional branch instructions in loops, comprising the following steps: S11, determining whether one loop iteration contains enough instructions for the operation instruction with a condition flag to be placed at least N instructions ahead of the conditional branch instruction with a conditional branch flag bit; S12, if the condition is met, the compiler generates the assembly code directly; if it is not met, the compiler calculates the loop unrolling factor from the loop body code size and the value N, unrolls the loop, and then generates the assembly code; S13, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance; S14, the conditional branch instruction is evaluated according to its condition flag bit: if a jump is indicated, the processor fetches from the jump target as directed by the conditional branch flag; otherwise the processor fetches instructions sequentially; S15, regardless of whether a jump or no jump is predicted, the condition flag bit is invalidated after use and the branch instruction's condition flag bit is cleared to 0. The invention avoids the performance loss caused by mispredicting the last branch of a loop.
Description
Technical Field
The invention relates to a branch prediction optimization method for a conditional branch instruction in a loop, and belongs to the technical field of computers.
Background
Processor performance is often limited by branch instructions: when a branch is taken and instructions must be fetched from a new target address, the pipeline has to be drained of the invalid instructions that follow the branch, and many cycles are lost. Branch prediction is an effective way to reduce this loss, and its success rate is critical to processor performance.
Branch prediction mechanisms can be divided into static branch prediction and dynamic branch prediction. Static branch prediction is independent of the execution result of the branch, needs no history, and is simple to implement; it can be further divided into compiler prediction and fixed hardware prediction. Dynamic branch prediction is divided into one-level branch prediction, two-level adaptive branch prediction, and hybrid prediction, and its accuracy is much higher than that of static branch prediction.
For a loop that executes n times, existing branch prediction techniques can correctly predict the jump direction of the first n-1 iterations but cannot predict the last branch at the end of the nth iteration. In other words, the prediction always fails when the loop ends and execution leaves the loop, which costs program performance.
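The exit misprediction described above can be reproduced with a small simulation. This is a minimal sketch that assumes a 2-bit saturating counter as the dynamic predictor; the patent text does not name a specific predictor, and the function name `simulate_two_bit_counter` is hypothetical.

```python
def simulate_two_bit_counter(iterations, initial_state=3):
    """Count mispredictions of a loop-closing branch under a 2-bit saturating counter.

    Assumption (not from the patent): states 0-1 predict "not taken" and
    states 2-3 predict "taken". The loop branch is taken on passes 1..n-1
    and not taken on pass n, when the loop exits.
    """
    state = initial_state
    mispredictions = 0
    for i in range(1, iterations + 1):
        taken = i < iterations            # back edge is taken except on the last pass
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Saturating update toward the actual outcome.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


if __name__ == "__main__":
    print(simulate_two_bit_counter(100))
```

With a counter that starts in the strongly-taken state, the only misprediction in this model is the final, not-taken branch, which is the case the method targets.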
Disclosure of Invention
The invention aims to provide a branch prediction optimization method for conditional branch instructions in loops that avoids the performance loss caused by mispredicting the last branch of a loop.
To achieve this aim, the invention adopts the following technical solution: a branch prediction optimization method for conditional branch instructions in loops, comprising the following steps:
S11, the compiler processes a for loop in the user program and determines whether one loop iteration contains enough instructions for the operation instruction with a condition flag to be placed at least N instructions ahead of the conditional branch instruction with a conditional branch flag bit;
S12, the compiler generates the operation instruction with a condition flag and the conditional branch instruction with a conditional branch flag bit; if the number of instructions in one loop iteration satisfies the lead of at least N instructions, the compiler generates the assembly code directly; if it does not, the compiler calculates the loop unrolling factor from the loop body code size and the value N, unrolls the loop, and generates the assembly code;
S13, at run time, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance, that is, it sets the condition flag bit to valid and sets the corresponding conditional branch flag according to the operation result;
S14, the conditional branch instruction is evaluated according to its condition flag bit: if a jump is indicated, the processor fetches from the jump target as directed by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, regardless of whether a jump or no jump is predicted, the condition flag bit is invalidated after use and the branch instruction's condition flag bit is cleared to 0.
The technical solution may be further improved as follows:
1. In the above solution, N is determined by the hardware and is an integer.
Owing to the above technical solution, the invention has the following advantage over the prior art:
The method combines the operation instruction with a condition flag provided by the processor and the conditional branch instruction with a conditional branch flag bit, so that the branch direction is predicted correctly when the loop ends and execution leaves the loop, avoiding the performance loss caused by mispredicting the last branch of the loop.
Drawings
FIG. 1 is a flow diagram of a method for branch prediction optimization for an in-loop conditional branch instruction according to the present invention.
Detailed Description
Embodiment: a branch prediction optimization method for conditional branch instructions in loops, comprising the following steps:
S11, the compiler processes a for loop in the user program and determines whether one loop iteration contains enough instructions for the operation instruction with a condition flag to be placed at least N instructions ahead of the conditional branch instruction with a conditional branch flag bit;
S12, the compiler generates the operation instruction with a condition flag and the conditional branch instruction with a conditional branch flag bit; if the number of instructions in one loop iteration satisfies the lead of at least N instructions, the compiler generates the assembly code directly; if it does not, the compiler calculates the loop unrolling factor from the loop body code size and the value N, unrolls the loop, and generates the assembly code;
S13, at run time, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance, that is, it sets the condition flag bit to valid and sets the corresponding conditional branch flag according to the operation result;
S14, the conditional branch instruction is evaluated according to its condition flag bit: if a jump is indicated, the processor fetches from the jump target as directed by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, regardless of whether a jump or no jump is predicted, the condition flag bit is invalidated after use and the branch instruction's condition flag bit is cleared to 0.
N is determined by the hardware and is an integer. The jump direction of the conditional branch is computed by the operation instruction with a condition flag. To guarantee that the prediction succeeds, this instruction must be placed ahead of the conditional branch instruction; how many instructions ahead depends on the number of pipeline stages of the processor.
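The placement requirement can be illustrated with a small check over a loop body. This is only a sketch: the mnemonics `cmp.flag` and `br.flag` are hypothetical stand-ins for the flag-setting operation and the flagged branch, and counting the lead as the difference in instruction positions is an assumption, since the text does not fix an exact counting convention.

```python
def flag_lead_distance(loop_body, flag_op, branch_op):
    """Return how many instruction slots the flag-setting op leads the branch by.

    Assumption: the lead is the difference between the two positions in the
    instruction list of one loop iteration.
    """
    return loop_body.index(branch_op) - loop_body.index(flag_op)


def meets_lead_requirement(loop_body, flag_op, branch_op, n):
    """True if the flag-setting op is at least n instructions ahead of the branch."""
    return flag_lead_distance(loop_body, flag_op, branch_op) >= n


if __name__ == "__main__":
    # Hypothetical loop body; only the relative positions matter here.
    body = ["load", "add", "cmp.flag", "store", "mul", "br.flag"]
    print(flag_lead_distance(body, "cmp.flag", "br.flag"))
    print(meets_lead_requirement(body, "cmp.flag", "br.flag", n=4))
```

In this example the lead is 3 instructions, so a hypothetical hardware requirement of N = 4 would not be satisfied without rescheduling or unrolling.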
The embodiment is further explained below:
The compiler processes the for loops in the program, and the assembly code it generates uses the operation instruction with a condition flag and the conditional branch instruction with a conditional branch flag bit provided by the processor. At run time, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance, and the processor fetches instructions according to that branch flag bit.
Note that for the branch flag bit to work correctly, the operation instruction with a condition flag must be at least N instructions ahead of the conditional branch instruction, which leaves the processor time to process the branch flag before fetching; the specific value of N depends on the hardware implementation. The compiler must take this condition into account when generating assembly code. When a loop iteration contains too few instructions to satisfy it, the compiler unrolls the loop so that the generated instructions meet the condition.
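The text does not give the formula the compiler uses to derive the unrolling factor from the loop body size and N. The sketch below shows one plausible calculation under stated assumptions: the unrolled loop holds `u * work_instructions` replicated work instructions plus one flag-setting operation and one branch, and the flag-setting operation can be scheduled first, so the largest achievable lead is `u * work_instructions + 1`. The function name and this cost model are hypothetical.

```python
import math


def unroll_factor(work_instructions, n):
    """Smallest unroll factor u >= 1 such that u * work_instructions + 1 >= n.

    Assumptions (not from the patent): the flag-setting op is placed first,
    the branch last, and only the work instructions are replicated.
    """
    if work_instructions <= 0:
        raise ValueError("loop body must contain at least one work instruction")
    return max(1, math.ceil((n - 1) / work_instructions))


if __name__ == "__main__":
    print(unroll_factor(work_instructions=2, n=6))
    print(unroll_factor(work_instructions=10, n=4))
```

A factor of 1 means the loop already satisfies the condition and no unrolling is needed. As with any loop unrolling, a trip count that is not a multiple of the factor requires a remainder loop or equivalent handling.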
The specific process of the branch prediction optimization method for conditional branch instructions in loops is as follows:
Step S11, the compiler processes the for loop in the user program.
Specifically, the compiler processes the for loop in the user program and determines whether one loop iteration contains enough instructions for the operation instruction with a condition flag to be placed at least N instructions ahead of the conditional branch instruction.
Step S12, the compiler generates an operation instruction with a condition flag and a conditional branch instruction with a conditional branch flag bit.
Specifically, when the above condition can be satisfied, the compiler generates the assembly code directly; when it cannot, the compiler unrolls the loop, calculating the unrolling factor from the loop body code size and the value N, and then generates the assembly code.
Step S13, at run time, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance.
Specifically, the operation instruction with a condition flag sets the condition flag bit to valid and sets the corresponding conditional branch flag according to the operation result.
Step S14, the processor fetches instructions as directed by the conditional branch flag.
Specifically, the conditional branch instruction is evaluated according to its condition flag bit: if a jump is indicated, the processor fetches from the jump target; otherwise it fetches sequentially.
Step S15, the branch instruction's condition flag bit is cleared to 0.
Specifically, the condition flag bit is invalidated after use, regardless of whether a jump or no jump was predicted.
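The run-time behaviour of steps S13 to S15 can be modelled in a few lines. This is a simplified sketch, not the hardware design: the `BranchFlag` class and `run_loop` function are hypothetical, a flag-setting operation that leads the branch by fewer than N instructions is modelled as not having set the flag in time, and falling back to "predict taken" when the flag is invalid is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BranchFlag:
    valid: bool = False   # condition flag bit: is the hint usable?
    taken: bool = False   # conditional branch flag: direction of the branch

    def set(self, taken):
        """S13: the flag-setting op marks the flag valid and records the direction."""
        self.valid = True
        self.taken = taken

    def consume(self):
        """S14 + S15: read the hint at fetch time, then invalidate and clear it."""
        direction = self.taken if self.valid else None
        self.valid = False
        self.taken = False
        return direction


def run_loop(trip_count, lead, n):
    """Count mispredictions of the loop-closing branch over trip_count passes."""
    flag = BranchFlag()
    mispredictions = 0
    for i in range(1, trip_count + 1):
        actual_taken = i < trip_count
        if lead >= n:
            flag.set(actual_taken)       # hint is ready before the branch is fetched
        hint = flag.consume()
        # Assumed fallback when no valid hint exists: predict taken.
        predicted = hint if hint is not None else True
        if predicted != actual_taken:
            mispredictions += 1
    return mispredictions


if __name__ == "__main__":
    print(run_loop(trip_count=100, lead=4, n=3))
    print(run_loop(trip_count=100, lead=2, n=3))
```

In this model a sufficient lead removes the exit misprediction, while an insufficient lead leaves the single misprediction on the final pass.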
With this method, the operation instruction with a condition flag provided by the processor is combined with the conditional branch instruction with a conditional branch flag bit, so that the branch direction is predicted correctly when the loop ends and execution leaves the loop, which avoids the performance loss caused by mispredicting the last branch of the loop.
To facilitate a better understanding of the invention, the terms used herein are briefly explained as follows:
Processor pipeline: a technique that decomposes each instruction into multiple steps and overlaps the steps of different instructions, so that several instructions are processed in parallel and program execution is accelerated.
Conditional branch instruction: an instruction that tests the state of a flag bit, or the result of a logical operation on flag bits; if the branch condition is met, execution transfers to the instruction at the target address, otherwise the next instruction is executed.
Branch prediction: a technique for avoiding the pipeline stalls caused by branch instructions, in which the processor predicts the direction a program branch will take in order to speed up execution.
Compiler: a program that translates one language (typically a high-level language) into another language (typically a low-level language).
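The benefit of overlapping instruction steps, and the cost of a mispredicted branch, can be shown with the textbook cycle count for an ideal pipeline. This is a sketch under simplifying assumptions that are not part of the patent: one instruction enters the pipeline per cycle, there are no stalls other than mispredictions, and each misprediction costs a fixed `flush_penalty` in cycles.

```python
def unpipelined_cycles(instructions, stages):
    """Each instruction runs through all stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages, mispredictions=0, flush_penalty=0):
    """Ideal pipeline: fill once, then retire one instruction per cycle.

    Assumption: every misprediction adds a fixed flush penalty.
    """
    if instructions == 0:
        return 0
    return stages + instructions - 1 + mispredictions * flush_penalty


if __name__ == "__main__":
    print(unpipelined_cycles(100, 5))
    print(pipelined_cycles(100, 5))
    print(pipelined_cycles(100, 5, mispredictions=1, flush_penalty=4))
```

For a short loop that is entered many times, the per-exit penalty is paid on every entry, which is why removing the final misprediction can matter.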
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.
Claims (2)
1. A branch prediction optimization method for conditional branch instructions in loops, characterized by comprising the following steps:
S11, the compiler processes a for loop in the user program and determines whether one loop iteration contains enough instructions for the operation instruction with a condition flag to be placed at least N instructions ahead of the conditional branch instruction with a conditional branch flag bit, so that the branch prediction hint succeeds;
S12, based on the result of S11, the compiler generates the operation instruction with a condition flag and the conditional branch instruction with a conditional branch flag bit; if the number of instructions in one loop iteration allows the operation instruction with a condition flag to be at least N instructions ahead of the conditional branch instruction with a conditional branch flag bit, the compiler generates the assembly code directly; if it does not, the compiler calculates the loop unrolling factor from the loop body code size and the value N, unrolls the loop, and generates the assembly code;
S13, at run time, the operation instruction with a condition flag sets the branch flag bit of the conditional branch instruction in advance, that is, it sets the condition flag bit to valid and sets the corresponding conditional branch flag according to the operation result;
S14, the conditional branch instruction is evaluated according to its condition flag bit: if a jump is indicated, the processor fetches from the jump target as directed by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, regardless of whether a jump or no jump is predicted, the condition flag bit is invalidated after use and the branch instruction's condition flag bit is cleared to 0.
2. The branch prediction optimization method for conditional branch instructions in loops of claim 1, wherein N is determined by hardware and is an integer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910794939.2A CN112445520B (en) | 2019-08-27 | 2019-08-27 | Branch prediction optimization method for conditional branch instructions in loop |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910794939.2A CN112445520B (en) | 2019-08-27 | 2019-08-27 | Branch prediction optimization method for conditional branch instructions in loop |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112445520A (en) | 2021-03-05 |
CN112445520B (en) | 2022-11-15 |
Family
ID=74741282
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910794939.2A | Active, CN112445520B (en) | 2019-08-27 | 2019-08-27 | Branch prediction optimization method for conditional branch instructions in loop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112445520B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1928810A (en) * | 2005-09-09 | 2007-03-14 | 上海采微电子科技有限公司 | Micro-processor with cycling jump forecasting unit |
CN102736894A (en) * | 2011-04-01 | 2012-10-17 | 中兴通讯股份有限公司 | Method and system for coding jump instruction |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118245115A (en) * | 2024-05-27 | 2024-06-25 | 北京微核芯科技有限公司 | Prediction method and device for transfer instruction |
CN118245115B (en) * | 2024-05-27 | 2024-07-26 | 北京微核芯科技有限公司 | Prediction method and device for transfer instruction |
Also Published As
Publication number | Publication date |
---|---|
CN112445520B (en) | 2022-11-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5917616B2 (en) | Method and apparatus for changing the sequential flow of a program using prior notification technology | |
KR101685247B1 (en) | Single cycle multi-branch prediction including shadow cache for early far branch prediction | |
US9875106B2 (en) | Computer processor employing instruction block exit prediction | |
CN107092467B (en) | Instruction sequence buffer for enhancing branch prediction efficiency | |
KR101523020B1 (en) | Combined branch target and predicate prediction | |
US8607209B2 (en) | Energy-focused compiler-assisted branch prediction | |
CN109101276B (en) | Method for executing instruction in CPU | |
EP0448499A2 (en) | Instruction prefetch method for branch-with-execute instructions | |
CN102662640B (en) | Double-branch target buffer and branch target processing system and processing method | |
EP1853995B1 (en) | Method and apparatus for managing a return stack | |
CN101299192A (en) | Non-aligning access and storage processing method | |
US20170139693A1 (en) | Code execution method and device | |
US20090019431A1 (en) | Optimised compilation method during conditional branching | |
CN107526622B (en) | Rapid exception handling method and device for Linux | |
CN112445520B (en) | Branch prediction optimization method for conditional branch instructions in loop | |
JPWO2009004709A1 (en) | Indirect branch processing program and indirect branch processing method | |
US9639370B1 (en) | Software instructed dynamic branch history pattern adjustment | |
CN111522584B (en) | Hardware circulation acceleration processor and hardware circulation acceleration method executed by same | |
CN101604255A (en) | The method that the binary translation by delayed skip instruction of intermediate language is realized | |
CN107943518B (en) | Local jump instruction fetch circuit | |
CN103838616A (en) | Tree program branch based computer program immediate compiling method | |
CN106325963B (en) | Self-adaptive dynamic compiling and scheduling method and device | |
CN1481527A (en) | Resource-saving hardware loop | |
CN105094750B (en) | A kind of the return address prediction technique and device of multiline procedure processor | |
JP2009009253A (en) | Program execution method, program, and program execution system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||