CN112445520A - Branch prediction optimization method for conditional branch instructions in loops - Google Patents

Branch prediction optimization method for conditional branch instructions in loops

Info

Publication number
CN112445520A
Authority
CN
China
Prior art keywords
instruction
conditional
branch
condition
loop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910794939.2A
Other languages
Chinese (zh)
Other versions
CN112445520B (en)
Inventor
钱宏
朱琪
王飞
吴伟
肖谦
管茂林
沈莉
周文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN201910794939.2A
Publication of CN112445520A
Application granted
Publication of CN112445520B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3005 Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30058 Conditional branch instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a branch prediction optimization method for a conditional branch instruction in a loop, comprising the following steps: S11, determining whether the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction with the conditional branch flag bit; S12, if the condition is met, the compiler generates the assembly code directly; if not, the compiler calculates the loop unrolling factor from the loop body code size and N, unrolls the loop, and then generates the assembly code; S13, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance; S14, the conditional branch instruction is resolved according to its condition flag bit: if the flag indicates a jump, the processor fetches from the branch target as indicated by the conditional branch flag; otherwise the processor fetches instructions sequentially; S15, whether a jump or no jump was predicted, the condition flag bit is invalidated after use, that is, the condition flag bit of the branch instruction is cleared to 0. The invention avoids the performance loss caused by mispredicting the last branch of the loop.

Description

Branch prediction optimization method for conditional branch instructions in loops
Technical Field
The invention relates to a branch prediction optimization method for a conditional branch instruction in a loop, and belongs to the technical field of computers.
Background
Processor performance is often limited by branch instructions: when a branch is taken and instructions must be fetched from a new target address, the invalid instructions that followed the branch have to be drained from the pipeline, which costs many cycles. Branch prediction is an effective way to reduce this loss, and its success rate is critical to processor performance.
Branch prediction mechanisms can be divided into static and dynamic branch prediction. Static branch prediction is independent of the execution result of the branch, needs no history, and is simple to implement; it can be further divided into compiler prediction and fixed hardware prediction. Dynamic branch prediction is divided into one-level prediction, two-level adaptive prediction, and hybrid prediction, and its accuracy is much higher than that of static branch prediction.
For a loop that executes n times, existing branch prediction techniques can correctly predict the jump direction for the first n-1 iterations but cannot predict the last branch at the end of the nth iteration. In other words, the prediction always fails when the loop ends and control leaves the loop, which costs the running program some performance.
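The behaviour described above can be reproduced with a small simulation. The sketch below assumes a classic 2-bit saturating counter as the dynamic predictor; the patent does not name a specific scheme, and the function name `simulate_two_bit_counter` is illustrative only.

```python
def simulate_two_bit_counter(trip_count, initial_state=3):
    """Return the iterations at which the loop's backward branch is mispredicted.

    Assumed predictor: a 2-bit saturating counter.
    States 0-1 predict "not taken", states 2-3 predict "taken".
    The backward branch is taken on every iteration except the last one.
    """
    state = initial_state
    mispredicted = []
    for iteration in range(1, trip_count + 1):
        taken = iteration < trip_count      # the last iteration leaves the loop
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredicted.append(iteration)
        # Move the counter one step toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredicted


print(simulate_two_bit_counter(100))  # [100]
```

Starting from the strongly-taken state, the only misprediction in a 100-iteration loop is at iteration 100, the loop exit.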
Disclosure of Invention
The invention aims to provide a branch prediction optimization method for a conditional branch instruction in a loop that avoids the performance loss caused by branch misprediction, in particular the misprediction of the last branch of the loop.
In order to achieve the purpose, the invention adopts the technical scheme that: a method of branch prediction optimization for an in-loop conditional branch instruction, comprising the steps of:
S11, the compiler processes the for loop in the user program and determines whether the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction with the conditional branch flag bit;
S12, the compiler generates the operation instruction with the condition tag and the conditional branch instruction with the conditional branch flag bit; if the loop body contains enough instructions to place the operation instruction at least N instructions ahead, the compiler generates the assembly code directly; otherwise the compiler calculates the loop unrolling factor from the loop body code size and N, unrolls the loop, and then generates the assembly code;
S13, at run time, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance, that is, it marks the condition flag bit as valid and sets the corresponding conditional branch flag according to the result of the operation;
S14, the conditional branch instruction is resolved according to its condition flag bit: if the flag indicates a jump, the processor fetches from the branch target as indicated by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, whether a jump or no jump was predicted, the condition flag bit is invalidated after use, that is, the condition flag bit of the branch instruction is cleared to 0.
A further refinement of the above technical scheme is as follows:
1. In the above scheme, N is determined by the hardware and is an integer.
By applying this technical scheme, the invention has the following advantages over the prior art:
The branch prediction optimization method of the invention combines the operation instruction with the condition tag provided by the processor with the conditional branch instruction with the conditional branch flag bit, so that the branch direction is predicted correctly when control leaves the loop after the final iteration. This avoids the performance loss caused by branch misprediction, including the misprediction of the last branch of the loop.
Drawings
FIG. 1 is a flow diagram of a method for branch prediction optimization for an in-loop conditional branch instruction according to the present invention.
Detailed Description
Embodiment: a branch prediction optimization method for a conditional branch instruction in a loop, comprising the following steps:
S11, the compiler processes the for loop in the user program and determines whether the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction with the conditional branch flag bit;
S12, the compiler generates the operation instruction with the condition tag and the conditional branch instruction with the conditional branch flag bit; if the loop body contains enough instructions to place the operation instruction at least N instructions ahead, the compiler generates the assembly code directly; otherwise the compiler calculates the loop unrolling factor from the loop body code size and N, unrolls the loop, and then generates the assembly code;
S13, at run time, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance, that is, it marks the condition flag bit as valid and sets the corresponding conditional branch flag according to the result of the operation;
S14, the conditional branch instruction is resolved according to its condition flag bit: if the flag indicates a jump, the processor fetches from the branch target as indicated by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, whether a jump or no jump was predicted, the condition flag bit is invalidated after use, that is, the condition flag bit of the branch instruction is cleared to 0.
N is determined by the hardware and is an integer. The jump direction of the conditional branch is computed by the operation instruction with the condition tag; to ensure that the branch is predicted correctly, this instruction must be placed ahead of the conditional branch instruction, and how many instructions ahead depends on the number of pipeline stages of the processor.
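The placement constraint can be pictured on a toy instruction list. In the sketch below, the mnemonics `op.cond` and `br.cond` are hypothetical stand-ins for the operation instruction with the condition tag and the flagged conditional branch, and "lead" is assumed to mean the number of instruction slots from the former to the latter; the patent does not define the measure more precisely.

```python
def schedule_body(work, setter="op.cond", branch="br.cond"):
    """Place the flag-setting instruction first and the flagged branch last (assumed layout)."""
    return [setter, *work, branch]


def lead(body, setter="op.cond", branch="br.cond"):
    """Number of instruction slots from the flag-setting instruction to the branch."""
    return body.index(branch) - body.index(setter)


def unroll(work, factor):
    """Repeat the work instructions `factor` times, tagging each copy with its index."""
    return [f"{ins}#{k}" for k in range(factor) for ins in work]


work = ["load", "add", "store"]
short_body = schedule_body(work)
long_body = schedule_body(unroll(work, 3))
print(lead(short_body), lead(long_body))  # 4 10
```

If the hardware required, say, N = 8, the three-instruction body would be too short on its own, while the body unrolled three times would satisfy the constraint.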
The examples are further explained below:
The compiler processes the for loops in the program, and the assembly code it generates uses the operation instruction with the condition tag and the conditional branch instruction with the conditional branch flag bit, both provided by the processor. At run time, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance, and the processor uses that branch flag bit to decide where to fetch instructions from.
It should be noted that, for the branch flag bit to work correctly, the instruction with the condition tag must be placed at least N instructions ahead of the conditional branch instruction, which leaves the processor time to process the branch flag and fetch instructions; the specific value of N depends on the hardware implementation. The compiler must take this condition into account when generating assembly code. When the loop body contains too few instructions to satisfy the condition, the compiler unrolls the loop so that the generated instructions satisfy it.
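The compiler's decision can be sketched as a small calculation. The patent states only that the unrolling factor is derived from the loop body code size and N; the formula below is an assumption, based on a layout in which the flag-setting instruction is scheduled first and the branch last, so that the lead equals `factor * body_len + 1`. The names `unroll_factor` and `plan_loop` are hypothetical.

```python
def unroll_factor(body_len, n):
    """Smallest factor for which factor * body_len + 1 >= n (assumed lead model)."""
    if body_len < 1 or n < 1:
        raise ValueError("body_len and n must be positive integers")
    needed = n - 1                           # slots the work instructions must fill
    return max(1, -(-needed // body_len))    # integer ceiling division


def plan_loop(body_len, n):
    """Return ('direct', 1) when no unrolling is needed, else ('unroll', factor)."""
    factor = unroll_factor(body_len, n)
    return ("direct" if factor == 1 else "unroll", factor)


print(plan_loop(10, 4))  # ('direct', 1)
print(plan_loop(3, 8))   # ('unroll', 3)
```

A real compiler would also have to account for data dependences that limit how early the flag-setting instruction can be hoisted, which this sketch ignores.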
The invention provides a branch prediction optimization method aiming at a conditional branch instruction in a loop, which comprises the following specific processes:
Step S11, the compiler processes the for loop in the user program.
Specifically, the compiler processes the for loop in the user program and determines whether the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction.
Step S12, the compiler generates the operation instruction with the condition tag and the conditional branch instruction with the conditional branch flag bit.
Specifically, if the above condition can be satisfied, the compiler generates the assembly code directly; if it cannot, the compiler unrolls the loop, calculating the unrolling factor from the loop body code size and N, and then generates the assembly code.
Step S13, at run time, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance.
Specifically, the operation instruction with the condition tag marks the condition flag bit as valid and sets the corresponding conditional branch flag according to the result of the operation.
Step S14, the processor fetches instructions as indicated by the conditional branch flag.
Specifically, the conditional branch instruction is resolved according to its condition flag bit: if the flag indicates a jump, the processor fetches from the branch target; otherwise it fetches sequentially.
Step S15, the condition flag bit of the branch instruction is cleared to 0.
Specifically, the condition flag bit is invalidated after use, regardless of whether a jump or no jump was predicted.
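The run-time steps S13 to S15 can be modelled in a few lines. This is a behavioural sketch, not the hardware design: the class `BranchFlag`, its method names, and the always-taken fallback used when the flag is invalid are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class BranchFlag:
    valid: bool = False   # condition flag bit: has a direction been recorded?
    taken: bool = False   # conditional branch flag: the recorded direction

    def set(self, taken):
        """S13: the tagged operation marks the flag valid and records the direction."""
        self.valid = True
        self.taken = taken

    def consume(self, fallback_taken=True):
        """S14/S15: use the flag if valid (else an assumed fallback), then clear it."""
        direction = self.taken if self.valid else fallback_taken
        self.valid = False
        self.taken = False
        return direction


def run_loop(trip_count, use_flag=True):
    """Count fetch-direction mispredictions of the loop's backward branch."""
    flag = BranchFlag()
    mispredictions = 0
    for i in range(trip_count):
        continues = i + 1 < trip_count    # actual direction of the backward branch
        if use_flag:
            flag.set(continues)           # issued at least N instructions early
        if flag.consume() != continues:
            mispredictions += 1
    return mispredictions


print(run_loop(100, use_flag=True), run_loop(100, use_flag=False))  # 0 1
```

With the flag set ahead of time, the exit branch is fetched in the right direction; without it, the assumed always-taken fallback misses exactly once, at the loop exit.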
With this branch prediction optimization method, the operation instruction with the condition tag provided by the processor is combined with the conditional branch instruction with the conditional branch flag bit, so that the branch direction is predicted correctly when control leaves the loop after the final iteration. This avoids the performance loss caused by branch misprediction, including the misprediction of the last branch of the loop.
To facilitate a better understanding of the invention, the terms used herein are briefly explained as follows:
Processor pipeline: a technique that decomposes an instruction into multiple steps and overlaps the steps of different instructions, so that several instructions are processed in parallel and program execution is accelerated.
Conditional branch instruction: an instruction that branches based on the state of a flag bit or on the result of a logical operation on flag bits; if the branch condition is met, execution transfers to the instruction at the target address, otherwise execution continues with the next instruction.
Branch prediction: a technique for reducing the pipeline stalls caused by branch instructions, in which the processor predicts the direction a program branch will take, thereby speeding up execution.
Compiler: a program that translates one language (typically a high-level language) into another language (typically a low-level language).
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (2)

1. A branch prediction optimization method for a conditional branch instruction in a loop, characterized by comprising the following steps:
S11, the compiler processes the for loop in the user program and determines whether the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction with the conditional branch flag bit, so that the branch prediction hint succeeds;
S12, based on the determination of S11, the compiler generates the operation instruction with the condition tag and the conditional branch instruction with the conditional branch flag bit; if the loop body contains enough instructions for the operation instruction with the condition tag to be placed at least N instructions ahead of the conditional branch instruction with the conditional branch flag bit, the compiler generates the assembly code directly; if it does not, the compiler calculates the loop unrolling factor from the loop body code size and N, unrolls the loop, and then generates the assembly code;
S13, at run time, the operation instruction with the condition tag sets the branch flag bit of the conditional branch instruction in advance, that is, it marks the condition flag bit as valid and sets the corresponding conditional branch flag according to the result of the operation;
S14, the conditional branch instruction is resolved according to its condition flag bit: if the flag indicates a jump, the processor fetches from the branch target as indicated by the conditional branch flag; otherwise the processor fetches instructions sequentially;
S15, whether a jump or no jump was predicted, the condition flag bit is invalidated after use, that is, the condition flag bit of the branch instruction is cleared to 0.
2. The branch prediction optimization method for a conditional branch instruction in a loop according to claim 1, characterized in that N is determined by the hardware and is an integer.
CN201910794939.2A 2019-08-27 2019-08-27 Branch prediction optimization method for conditional branch instructions in loop Active CN112445520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910794939.2A CN112445520B (en) 2019-08-27 2019-08-27 Branch prediction optimization method for conditional branch instructions in loop

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910794939.2A CN112445520B (en) 2019-08-27 2019-08-27 Branch prediction optimization method for conditional branch instructions in loop

Publications (2)

Publication Number Publication Date
CN112445520A (en) 2021-03-05
CN112445520B (en) 2022-11-15

Family

ID=74741282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910794939.2A Active CN112445520B (en) 2019-08-27 2019-08-27 Branch prediction optimization method for conditional branch instructions in loop

Country Status (1)

Country Link
CN (1) CN112445520B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1928810A (en) * 2005-09-09 2007-03-14 上海采微电子科技有限公司 Micro-processor with cycling jump forecasting unit
CN102736894A (en) * 2011-04-01 2012-10-17 中兴通讯股份有限公司 Method and system for coding jump instruction

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118245115A (en) * 2024-05-27 2024-06-25 北京微核芯科技有限公司 Prediction method and device for transfer instruction
CN118245115B (en) * 2024-05-27 2024-07-26 北京微核芯科技有限公司 Prediction method and device for transfer instruction

Also Published As

Publication number Publication date
CN112445520B (en) 2022-11-15

Similar Documents

Publication Publication Date Title
JP5917616B2 (en) Method and apparatus for changing the sequential flow of a program using prior notification technology
KR101685247B1 (en) Single cycle multi-branch prediction including shadow cache for early far branch prediction
US9875106B2 (en) Computer processor employing instruction block exit prediction
CN107092467B (en) Instruction sequence buffer for enhancing branch prediction efficiency
KR101523020B1 (en) Combined branch target and predicate prediction
US8607209B2 (en) Energy-focused compiler-assisted branch prediction
CN109101276B (en) Method for executing instruction in CPU
EP0448499A2 (en) Instruction prefetch method for branch-with-execute instructions
CN102662640B (en) Double-branch target buffer and branch target processing system and processing method
EP1853995B1 (en) Method and apparatus for managing a return stack
CN101299192A (en) Non-aligning access and storage processing method
US20170139693A1 (en) Code execution method and device
US20090019431A1 (en) Optimised compilation method during conditional branching
CN107526622B (en) Rapid exception handling method and device for Linux
CN112445520B (en) Branch prediction optimization method for conditional branch instructions in loop
JPWO2009004709A1 (en) Indirect branch processing program and indirect branch processing method
US9639370B1 (en) Software instructed dynamic branch history pattern adjustment
CN111522584B (en) Hardware circulation acceleration processor and hardware circulation acceleration method executed by same
CN101604255A (en) The method that the binary translation by delayed skip instruction of intermediate language is realized
CN107943518B (en) Local jump instruction fetch circuit
CN103838616A (en) Tree program branch based computer program immediate compiling method
CN106325963B (en) Self-adaptive dynamic compiling and scheduling method and device
CN1481527A (en) Resource-saving hardware loop
CN105094750B (en) A kind of the return address prediction technique and device of multiline procedure processor
JP2009009253A (en) Program execution method, program, and program execution system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant