CN117389629A - Branch prediction method, device, electronic equipment and medium - Google Patents

Branch prediction method, device, electronic equipment and medium

Info

Publication number
CN117389629A
Authority
CN
China
Prior art keywords
branch
tag
jump
instruction
branch instruction
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311447344.2A
Other languages
Chinese (zh)
Inventor
郭津榜
刘洋
张稚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hexin Technology Co ltd
Beijing Hexin Digital Technology Co ltd
Original Assignee
Hexin Technology Co ltd
Beijing Hexin Digital Technology Co ltd
Application filed by Hexin Technology Co ltd, Beijing Hexin Digital Technology Co ltd
Priority to CN202311447344.2A
Publication of CN117389629A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution
    • G06F9/3848 Speculative instruction execution using hybrid branch prediction, e.g. selection between prediction techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution
    • G06F9/3844 Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables

Abstract

The application provides a branch prediction method, a device, electronic equipment and a medium. The method comprises the following steps: after determining in the instruction decode stage that the instruction fetched under the current thread is a branch instruction, determining a tag table entry in a branch tag table according to partial address bits of the virtual address of the branch instruction under the current thread, wherein the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, different branch instructions correspond to different tag table entries, and each tag table entry comprises a tag of the corresponding branch instruction, the tag comprising the bits from the highest bit of the virtual address of the branch instruction to the highest bit of the partial address bits; determining, according to the determined tag table entry, a history table entry in a branch history table and reading the jump information in the history table entry, wherein the history table entry corresponds to the tag table entry; and executing or not executing the jump according to the read jump information. The method improves the accuracy of branch prediction and thereby the performance of the processor.

Description

Branch prediction method, device, electronic equipment and medium
Technical Field
The present disclosure relates to computer technology, and in particular, to a branch prediction method, apparatus, electronic device, and medium.
Background
In computer architecture, the execution of an instruction can be divided into three stages: instruction fetch, instruction decode and instruction execution. In the instruction fetch stage, the processor fetches the instruction from instruction storage according to the virtual address of the instruction. In the instruction decode stage, the fetched instruction is split and interpreted according to a preset instruction format, and the instruction category and the methods for obtaining its operands are identified. In the instruction execution stage, the operations specified by the instruction are completed, realizing the function of the instruction. When the processor processes a branch instruction, a jump may occur depending on whether the branch condition is true or false. If instruction fetch, decode and execution are simply carried out in sequence, then when a branch instruction that needs to jump is found in the execution stage, sequential fetching has to be restarted with the calculated jump address as the start address, which wastes processor power and reduces execution efficiency.
Branch prediction is a common optimization technique used by many high-performance processors to improve instruction throughput. It predicts the jump behavior of a branch instruction ahead of the instruction execution stage and speculatively fetches subsequent instructions. Because the processor does not have to wait until the execution stage of the branch instruction to calculate the jump address and fetch again, its operating efficiency improves when the prediction is correct or the prediction accuracy is high.
The branch history table is a data structure used by most dynamic branch prediction techniques. It adopts a fully-connected structure and is indexed by part of the bits of the virtual address of a branch instruction; different table entries correspond to different branch instructions, and each entry stores prediction information on whether the corresponding branch instruction jumps. After a branch instruction is identified in the instruction decode stage, the branch history table is queried according to the virtual address of the branch instruction to obtain the prediction of whether to jump. If the query result is to jump, the jump address of the branch instruction is obtained, and instruction fetch, decode and execution are performed from the jump address; if the query result is not to jump, no jump is executed. However, because the branch history table is indexed by only part of the bits of the virtual address, in practice different branch instructions may share the same index bits and therefore the same entry, which lowers the accuracy of the jump prediction obtained from the branch history table and affects processor performance.
Disclosure of Invention
The application provides a branch prediction method, a device, electronic equipment and a medium, which are used for improving the accuracy of branch prediction and improving the performance of a processor.
In one aspect, the present application provides a branch prediction method, including:
after judging that an instruction fetched under a current thread is a branch instruction in an instruction decoding stage, acquiring part of address bits of a virtual address of the branch instruction under the current thread;
determining a tag table entry in a branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; each tag table entry comprises a tag of the corresponding branch instruction, wherein the tag of a branch instruction comprises the bits from the highest bit of the virtual address of the branch instruction to the highest bit of the partial address bits;
determining, according to the determined tag table entry, a history table entry in a branch history table, and reading the jump information in the history table entry, wherein the history table entry corresponds to the tag table entry;
if the read jump information indicates a confirmed jump, acquiring the jump address corresponding to the branch instruction and executing the jump; and if the read jump information indicates a refused jump, not executing the jump.
Optionally, the position of the history entry corresponding to any branch instruction in the branch history table is the same as the position of the tag entry corresponding to the branch instruction in the branch tag table.
Optionally, the determining a history entry in the branch history table according to the determined tag entry includes:
determining a history table entry in the branch history table according to the position of the determined tag table entry in the branch tag table, wherein the position of the history table entry in the branch history table is the same as the position of the tag table entry in the branch tag table.
Optionally, the branch tag table includes tag entries corresponding to the branch instructions under a plurality of different threads, the same branch instruction under the different threads corresponds to different tag entries, and the tag entries further include thread numbers of threads where the corresponding branch instructions are located.
Optionally, the branch tag table adopts a multi-way set-associative structure; the determining a tag table entry in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread comprises:
determining a tag group in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises a plurality of tag groups, each tag group comprises a plurality of tag table entries, and different tag groups correspond to different partial address bits;
determining a tag table entry from the tag group according to the tag of the branch instruction under the current thread and the thread number of the current thread, wherein the tag in the tag table entry is the same as the tag of the branch instruction under the current thread, and the thread number in the tag table entry is the same as the thread number of the current thread.
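The lookup inside one tag group can be sketched as follows. This is only an illustrative model: the `TagEntry` class, the `valid` flag and the function name `find_way` are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TagEntry:
    valid: bool = False   # assumed validity flag, not named in the application
    tag: int = 0          # high-order virtual address bits of the branch
    thread_id: int = 0    # thread number of the thread the branch belongs to


def find_way(tag_group: List[TagEntry], tag: int, thread_id: int) -> Optional[int]:
    """Return the way whose tag AND thread number both match, or None on a miss."""
    for way, entry in enumerate(tag_group):
        if entry.valid and entry.tag == tag and entry.thread_id == thread_id:
            return way
    return None


# The same branch address under two threads occupies two different entries.
group = [TagEntry(True, 0x10, 0), TagEntry(True, 0x10, 1)]
way_for_thread_1 = find_way(group, 0x10, 1)
```

Matching on the thread number as well as the tag is what keeps identical virtual addresses in different threads from sharing one entry.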
Optionally, after determining a tag entry in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread, the method further includes:
generating hit information, wherein the number of bits of the hit information is the same as the number of ways of the branch tag table, different bits of the hit information correspond to different ways of the branch tag table, the bit of the hit information corresponding to the way in which the determined tag table entry is located has a first value, and the other bits have a second value;
the determining a history table entry in the branch history table according to the position of the determined tag table entry in the branch tag table comprises:
determining a history group in the branch history table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch history table adopts a multi-way set-associative structure, the branch history table comprises a plurality of history groups, each history group comprises a plurality of history table entries, and different history groups correspond to different partial address bits;
determining, from the history group, the history table entry in the corresponding way according to the position of the first value in the hit information.
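As a rough illustration of the hit information, the sketch below encodes it as a one-hot list with one bit per way and uses it to pick the history table entry. The choice of 1 as the first value and 0 as the second value, the four-way configuration and the helper names are assumptions for the example.

```python
from typing import List, Optional

NUM_WAYS = 4  # assumed number of ways


def make_hit_info(hit_way: Optional[int]) -> List[int]:
    """One bit per way: 1 (assumed first value) for the hit way, 0 elsewhere."""
    return [1 if way == hit_way else 0 for way in range(NUM_WAYS)]


def select_history_entry(history_group: List[int], hit_info: List[int]) -> Optional[int]:
    """Pick the history entry in the way marked by the first value; None on a miss."""
    if 1 not in hit_info:
        return None
    return history_group[hit_info.index(1)]


hit_info = make_hit_info(2)
jump_info = select_history_entry([0b00, 0b01, 0b11, 0b10], hit_info)
```

Because the tag table and the history table share the same group index and way position, the hit information alone is enough to select the history entry.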
Optionally, the method further comprises:
if no corresponding tag table entry exists in the branch tag table, determining a miss and not executing the jump.
Optionally, the method further comprises:
in the case of a miss, selecting the least recently used tag table entry from the tag group according to a least recently used algorithm, and replacing the tag and the thread number in that tag table entry with the tag of the branch instruction under the current thread and the thread number of the current thread.
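A minimal sketch of the replacement on a miss is given below. The application only names a least recently used algorithm; tracking recency with a `last_used` timestamp, and the names `TagEntry` and `replace_on_miss`, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TagEntry:
    tag: int
    thread_id: int
    last_used: int  # hypothetical timestamp of the most recent access


def replace_on_miss(tag_group: List[TagEntry], tag: int, thread_id: int, now: int) -> int:
    """Overwrite the least recently used entry in the group; return its way."""
    victim = min(range(len(tag_group)), key=lambda way: tag_group[way].last_used)
    tag_group[victim] = TagEntry(tag, thread_id, now)
    return victim


group = [TagEntry(0xA, 0, 5), TagEntry(0xB, 1, 2), TagEntry(0xC, 0, 9)]
victim_way = replace_on_miss(group, 0xD, 1, now=10)
```

Hardware would more likely keep a few LRU state bits per group than full timestamps; the timestamp is used here only to keep the example short.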
Optionally, the acquiring a jump address corresponding to the branch instruction and executing the jump if the read jump information indicates a confirmed jump, and not executing the jump if the read jump information indicates a refused jump, comprises:
if the high-order bit of the read jump information is 1, acquiring the jump address corresponding to the branch instruction and executing the jump;
if the high-order bit of the read jump information is 0, not executing the jump.
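The high-order-bit test can be written as a one-line check. The 2-bit width used here is an assumption; the application states only that the high-order bit of the jump information decides the outcome.

```python
JUMP_INFO_BITS = 2  # assumed width of the jump information field


def predict_taken(jump_info: int) -> bool:
    """Predict a jump when the high-order bit of the jump information is 1."""
    return ((jump_info >> (JUMP_INFO_BITS - 1)) & 1) == 1


predictions = [predict_taken(value) for value in (0b00, 0b01, 0b10, 0b11)]
```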
In another aspect, the present application provides a branch prediction apparatus comprising:
an acquisition module, configured to acquire partial address bits of the virtual address of the branch instruction under the current thread after it is determined in the instruction decode stage that the instruction fetched under the current thread is a branch instruction;
a determining module, configured to determine a tag table entry in a branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; each tag table entry comprises a tag of the corresponding branch instruction, wherein the tag of a branch instruction comprises the bits from the highest bit of the virtual address of the branch instruction to the highest bit of the partial address bits;
a reading module, configured to determine, according to the determined tag table entry, a history table entry in a branch history table and read the jump information in the history table entry, wherein the history table entry corresponds to the tag table entry;
a jump module, configured to acquire the jump address corresponding to the branch instruction and execute the jump if the read jump information indicates a confirmed jump, and not execute the jump if the read jump information indicates a refused jump.
Optionally, the position of the history entry corresponding to any branch instruction in the branch history table is the same as the position of the tag entry corresponding to the branch instruction in the branch tag table.
Optionally, the determining module is specifically configured to:
determine a history table entry in the branch history table according to the position of the determined tag table entry in the branch tag table, wherein the position of the history table entry in the branch history table is the same as the position of the tag table entry in the branch tag table.
Optionally, the branch tag table includes tag entries corresponding to the branch instructions under a plurality of different threads, the same branch instruction under the different threads corresponds to different tag entries, and the tag entries further include thread numbers of threads where the corresponding branch instructions are located.
Optionally, the branch tag table adopts a multi-way set-associative structure; the determining module is further specifically configured to:
determine a tag group in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises a plurality of tag groups, each tag group comprises a plurality of tag table entries, and different tag groups correspond to different partial address bits;
determine a tag table entry from the tag group according to the tag of the branch instruction under the current thread and the thread number of the current thread, wherein the tag in the tag table entry is the same as the tag of the branch instruction under the current thread, and the thread number in the tag table entry is the same as the thread number of the current thread.
Optionally, the determining module is further configured to:
generate hit information, wherein the number of bits of the hit information is the same as the number of ways of the branch tag table, different bits of the hit information correspond to different ways of the branch tag table, the bit of the hit information corresponding to the way in which the determined tag table entry is located has a first value, and the other bits have a second value;
the determining a history table entry in the branch history table according to the position of the determined tag table entry in the branch tag table comprises:
determining a history group in the branch history table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch history table adopts a multi-way set-associative structure, the branch history table comprises a plurality of history groups, each history group comprises a plurality of history table entries, and different history groups correspond to different partial address bits;
determining, from the history group, the history table entry in the corresponding way according to the position of the first value in the hit information.
Optionally, the jump module is further configured to:
if no corresponding tag table entry exists in the branch tag table, determine a miss and not execute the jump.
Optionally, the apparatus further includes:
an updating module, configured to, in the case of a miss, select the least recently used tag table entry from the tag group according to a least recently used algorithm, and replace the tag and the thread number in that tag table entry with the tag of the branch instruction under the current thread and the thread number of the current thread.
Optionally, the jump module is specifically configured to:
if the high-order bit of the read jump information is 1, acquire the jump address corresponding to the branch instruction and execute the jump;
if the high-order bit of the read jump information is 0, not execute the jump.
In yet another aspect, the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored in the memory to implement the method as described above.
In yet another aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, are configured to implement the method as described above.
In the branch prediction method, the device, the electronic equipment and the medium, a branch tag table storing the high-order virtual address bits of branch instructions is provided. After it is determined in the instruction decode stage that the instruction fetched under the current thread is a branch instruction, a tag table entry is determined in the branch tag table according to part of the bits of the virtual address of the branch instruction, the high-order virtual address bits contained in the tag table entry being the same as those of the branch instruction. A history table entry corresponding to the tag table entry is then determined in the branch history table, and the jump of the branch instruction is executed or not executed according to the jump information in the history table entry. By comparing the high-order virtual address bits of the branch instruction, the history table entry corresponding to the branch instruction can be accurately located in the branch history table, so that whether the branch instruction jumps can be accurately predicted and the performance of the processor is effectively improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic flow chart of an instruction execution process according to a first embodiment of the present application;
FIG. 2 is a schematic flow chart of a branch prediction method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of another instruction execution process according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a branch tag table according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of jump information update according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a branch prediction apparatus according to a second embodiment of the present application;
FIG. 7 is a schematic structural diagram of a branch prediction electronic device according to a third embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
The modules in this application are functional modules or logic modules. A module may take the form of software, whose functions are implemented by a processor executing program code, or the form of hardware. The term "and/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
In computer architecture, the execution of an instruction can be divided into three stages: instruction fetch, instruction decode and instruction execution. In the instruction fetch stage, the processor fetches the instruction from instruction storage according to the virtual address of the instruction. In the instruction decode stage, the fetched instruction is split and interpreted according to a preset instruction format, and the instruction category and the methods for obtaining its operands are identified. In the instruction execution stage, the operations specified by the instruction are completed, realizing the function of the instruction. When the processor processes a branch instruction, a jump may occur depending on whether the branch condition is true or false. If instruction fetch, decode and execution are simply carried out in sequence, then when a branch instruction that needs to jump is found in the execution stage, sequential fetching has to be restarted with the calculated jump address as the start address, which wastes processor power and reduces execution efficiency.
Branch prediction is a common optimization technique used by many high-performance processors to improve instruction throughput. It predicts the jump behavior of a branch instruction ahead of the instruction execution stage and speculatively fetches subsequent instructions. Because the processor does not have to wait until the execution stage of the branch instruction to calculate the jump address and fetch again, its operating efficiency improves when the prediction is correct or the prediction accuracy is high.
The branch history table is a data structure used by most dynamic branch prediction techniques. It adopts a fully-connected structure and is indexed by part of the bits of the virtual address of a branch instruction; different table entries correspond to different branch instructions, and each entry stores prediction information on whether the corresponding branch instruction jumps.
FIG. 1 is a flow chart of an instruction execution process according to an embodiment of the present application. As shown in FIG. 1, instructions are fetched sequentially and decoded. After a branch instruction is identified in the instruction decode stage, the branch history table is queried according to the virtual address of the branch instruction to obtain the prediction of whether to jump. If the query result is to jump, address redirection is performed: the jump address of the branch instruction is obtained, and instruction fetch, decode and execution are performed from the jump address. If the query result is not to jump, no jump is performed and sequential instruction fetching continues. The branch history table is updated based on the instruction after the jump or the execution result of the branch instruction.
However, because the branch history table is indexed by only part of the bits of the virtual address, in practice different branch instructions may share the same index bits and therefore the same entry, which lowers the accuracy of the jump prediction obtained from the branch history table and affects processor performance.
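The aliasing problem can be made concrete with a toy model. The 4-bit index taken from the lowest address bits and the two example addresses are assumptions chosen for illustration; the application does not fix an index width or position.

```python
INDEX_BITS = 4  # assumed index width


def bht_index(vaddr: int) -> int:
    """Use the low INDEX_BITS bits of the virtual address as the table index."""
    return vaddr & ((1 << INDEX_BITS) - 1)


# A plain branch history table: one prediction per index, with no tag check.
bht = [False] * (1 << INDEX_BITS)

branch_a = 0x1234  # suppose this branch is always taken
branch_b = 0x5674  # suppose this branch is never taken

bht[bht_index(branch_a)] = True               # entry trained by branch_a
aliased = bht_index(branch_a) == bht_index(branch_b)
prediction_for_b = bht[bht_index(branch_b)]   # reads the entry trained by branch_a
```

Both addresses end in the same four bits, so `branch_b` reads the prediction trained by `branch_a`, which is the kind of misprediction the tag comparison is meant to avoid.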
The technical solution provided by the present application aims to solve the above technical problem of the related art.
In the embodiments of the present application, a branch tag table storing the high-order virtual address bits of branch instructions is provided. After it is determined in the instruction decode stage that the instruction fetched under the current thread is a branch instruction, a tag table entry is determined in the branch tag table according to part of the bits of the virtual address of the branch instruction, the high-order virtual address bits contained in the tag table entry being the same as those of the branch instruction. A history table entry corresponding to the tag table entry is then determined in the branch history table, and the jump of the branch instruction is executed or not executed according to the jump information in the history table entry. By comparing the high-order virtual address bits of the branch instruction, the history table entry corresponding to the branch instruction can be accurately located in the branch history table, so that whether the branch instruction jumps can be accurately predicted and the performance of the processor is effectively improved.
The technical solutions of the present application are illustrated in the following specific examples. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Example 1
FIG. 2 is a flow chart illustrating a branch prediction method according to an embodiment of the present application. As shown in fig. 2, the branch prediction method provided in this embodiment may include:
S201, after determining in the instruction decode stage that the instruction fetched under the current thread is a branch instruction, acquiring partial address bits of the virtual address of the branch instruction under the current thread;
S202, determining a tag table entry in a branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; each tag table entry comprises a tag of the corresponding branch instruction, wherein the tag of a branch instruction comprises the bits from the highest bit of the virtual address of the branch instruction to the highest bit of the partial address bits;
S203, determining, according to the determined tag table entry, a history table entry in a branch history table, and reading the jump information in the history table entry, wherein the history table entry corresponds to the tag table entry;
S204, if the read jump information indicates a confirmed jump, acquiring the jump address corresponding to the branch instruction and executing the jump; if the read jump information indicates a refused jump, not executing the jump.
In practical applications, the execution body of this embodiment may be a branch prediction apparatus, which may be implemented as a computer program, for example application software; or as a medium storing the relevant computer program, for example a USB drive or a cloud disk; or as a physical device integrated with or installed with the relevant computer program, for example a chip or a server.
Because the branch history table is indexed by only part of the bits of the virtual address of a branch instruction, the remaining bits of the virtual address can be compared in order to accurately locate the history table entry corresponding to the branch instruction. Specifically, a branch tag table is provided. The branch tag table comprises a plurality of tag table entries, different tag table entries correspond to different branch instructions under the threads, and each tag table entry contains the high-order bits of the virtual address of the corresponding branch instruction as a tag. Before the branch history table is queried, part of the bits of the virtual address of the branch instruction are taken as an index, and the bits from the highest bit of the virtual address to the highest bit of the index bits are taken as the tag of the branch instruction. According to the index and the tag, the tag table entry corresponding to the branch instruction is determined in the branch tag table; the tag in that entry is the same as the tag of the branch instruction. The history table entry corresponding to the tag table entry is then determined in the branch history table, so that the jump prediction for the branch instruction can be accurately obtained from the branch history table.
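Splitting a virtual address into index and tag might look like the sketch below. The field positions are assumptions: the index is taken as 6 bits starting at bit 2, and the tag is read here as all bits above the index field, which is one plausible reading of the tag definition above.

```python
INDEX_LOW = 2    # assumed lowest bit of the index field
INDEX_BITS = 6   # assumed index width


def split_vaddr(vaddr: int) -> tuple:
    """Return (index, tag): index bits select the group, higher bits form the tag."""
    index = (vaddr >> INDEX_LOW) & ((1 << INDEX_BITS) - 1)
    tag = vaddr >> (INDEX_LOW + INDEX_BITS)
    return index, tag


index_a, tag_a = split_vaddr(0x1040)
index_b, tag_b = split_vaddr(0x2040)
```

The two example addresses map to the same index but carry different tags, so a tag comparison can tell them apart where the index alone cannot.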
Fig. 3 is a flow chart of another instruction execution process according to an embodiment of the present application. As shown in fig. 3, instruction fetching and instruction decoding are performed in sequence. When the instruction fetched under the current thread is determined in the instruction decoding stage to be a branch instruction, the tag entry corresponding to the branch instruction is queried in the branch tag table according to the index and the tag of the branch instruction. If the tag entry corresponding to the branch instruction is found, that is, the tag hits, the history entry corresponding to the tag entry is obtained from the branch history table and used as the history entry corresponding to the branch instruction. The jump information of the branch instruction is read from that history entry. If the jump information indicates a confirmed jump, address redirection is performed: the jump address corresponding to the branch instruction is obtained, an instruction is fetched according to the jump address, and instruction decoding and instruction execution are performed. If the jump information indicates a rejected jump, the sequential instruction fetching process continues.
In this example, a branch tag table storing the high-order virtual address of the branch instruction is set, after the instruction fetched under the current thread is determined to be the branch instruction in the instruction decoding stage, a tag table entry is determined in the branch tag table according to part of bits of the virtual address of the branch instruction, where the high-order virtual address included in the tag table entry is the same as the high-order virtual address of the branch instruction; and determining a history table entry corresponding to the tag table entry in the branch history table, and executing the jump or non-jump of the branch instruction according to the jump information in the history table entry. By comparing the high-order virtual address of the branch instruction, the history table entry corresponding to the branch instruction in the branch history table can be accurately obtained, so that whether the branch instruction jumps or not can be accurately predicted, and the performance of the processor is effectively improved.
In practical applications, there may be multiple ways of mapping the tag table entry to the history table entry, and in one example, the position of the history table entry corresponding to any branch instruction in the branch history table is the same as the position of the tag table entry corresponding to the branch instruction in the branch tag table.
Specifically, the position of each history entry in the branch history table is the same as the position, in the branch tag table, of the tag entry corresponding to that history entry, and the history entry and the tag entry correspond to the same branch instruction. For example, if the history entry corresponding to a branch instruction is located at the first position of the branch history table, the tag entry corresponding to that branch instruction is located at the first position of the branch tag table.
In this example, by making the position of the history entry corresponding to the branch instruction in the branch history table and the position of the tag entry corresponding to the branch instruction in the branch tag table the same, the corresponding history entry can be determined in the branch history table based on the position of the tag entry in the branch tag table.
Based on the positional correspondence of the tag entries with the history entries, in one example, determining one history entry in the branch history table includes:
And determining a history table item in the branch history table according to the determined position of the tag table item in the branch tag table, wherein the position of the history table item in the branch history table is the same as the position of the tag table item in the branch tag table.
Specifically, according to the position of the tag entry corresponding to the branch instruction in the branch tag table, the history entry located at the same position in the branch history table is obtained. For example, if the tag entry corresponding to the branch instruction is located at the first position of the branch tag table, the history entry at the first position of the branch history table is obtained and used as the history entry corresponding to the tag entry, that is, the history entry corresponding to the branch instruction.
In this example, the history entries with the same positions in the branch history table as the corresponding tag entries in the branch tag table are obtained, and the corresponding history entries can be accurately obtained based on the determined tag entries, so that the jump information in the history entries corresponding to the branch instruction can be accurately obtained, and the accuracy of branch prediction is improved.
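As an illustration only, the positional correspondence described above can be sketched in Python as follows. The flat lists, the table contents, and the names `branch_tag_table`, `branch_history_table` and `find_history_entry` are assumptions made for this sketch and are not part of the embodiment.

```python
# Hypothetical parallel tables: the entry at position i of the tag table and
# the entry at position i of the history table describe the same branch.
branch_tag_table = [0x4001, 0x4002, 0x7FFF]   # illustrative tags
branch_history_table = [0b11, 0b00, 0b10]     # illustrative jump information


def find_history_entry(tag):
    """Return the history entry stored at the same position as the matching
    tag entry, or None when no tag entry matches (a miss)."""
    for position, stored_tag in enumerate(branch_tag_table):
        if stored_tag == tag:
            # Same position in both tables, so no extra mapping is needed.
            return branch_history_table[position]
    return None
```

Because the two tables share positions, no separate mapping structure is required in this sketch; the position found in the tag table is reused directly as the position in the history table.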
For processors supporting hyper-threading, the branch history table may be shared by multiple threads. Because the jump tendency of the same branch instruction is not necessarily the same under different threads, while the corresponding virtual addresses may be identical, the threads can interfere with one another. In one example, the branch tag table includes tag entries corresponding to branch instructions under a plurality of different threads, the same branch instruction under different threads corresponds to different tag entries, and each tag entry further includes the thread number of the thread in which the corresponding branch instruction is located.
Specifically, since different threads have different thread numbers, a tag entry in the branch tag table may include, in addition to the tag corresponding to the branch instruction, the thread number of the thread in which the branch instruction is located. Because the thread numbers differ, the tag entries corresponding to the same branch instruction under different threads in the branch tag table are different, so the history entries corresponding to the same branch instruction under different threads in the branch history table are also different. For a given branch instruction under a given thread, the corresponding history entry can therefore be uniquely determined, which effectively avoids mutual interference among multiple threads.
In this example, for instruction execution of multiple threads, the thread number of the branch instruction is saved through the branch tag table, so that the history entry corresponding to the branch instruction can be accurately acquired based on the virtual address and the thread information of the branch instruction under the current thread, mutual interference of multiple threads is avoided, and accuracy of branch prediction is improved.
The branch tag table may have various structures. In one example, the branch tag table adopts a multi-way set-associative structure, and the determining a tag table entry in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread comprises:
determining a tag group in the branch tag table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises a plurality of tag groups, each tag group comprises a plurality of tag table items, and different tag groups correspond to different part address bits;
and determining a tag table item from the tag group according to the tag of the branch instruction under the current thread and the thread number of the current thread, wherein the tag in the tag table item is the same as the tag of the branch instruction under the current thread, and the thread number in the tag table item is the same as the thread number of the current thread.
Fig. 4 is a schematic structural diagram of a branch tag table provided in an embodiment of the present application. As shown in fig. 4, the branch tag table adopts a multi-way set-associative structure comprising M groups and N ways. Different tag groups correspond to different values of the partial address bits of the virtual address, the branch tag table comprises M×N tag entries in total, and each tag entry contains the tag of a branch instruction under the corresponding thread and the thread number of that thread. Some low-order address bits of the virtual address of the branch instruction under the current thread are used as an index to determine the corresponding tag group in the branch tag table. The bit width of the partial address bits used for indexing can be determined according to the number of groups in the branch tag table: assuming that the branch tag table contains 2^n groups, that is, each way of the set-associative structure contains 2^n tag entries, the required index width is n bits. The number of entries and the structure of the branch tag table can be determined according to the requirements of the hardware design and are not limited herein. Taking a branch tag table with a four-way set-associative structure as an example, assume that the branch tag table has 256 tag entries in total and that the virtual address of the branch instruction has 64 bits, [63:0]. The branch tag table then has 64 tag groups, each way contains 64 tag entries, and the 6 bits [11:6] of the virtual address of the branch instruction are selected as the index into the branch tag table to determine the tag group corresponding to the branch instruction. For example, if bits [11:6] of the virtual address of the branch instruction are 000000, group 0 in the branch tag table is determined as the tag group corresponding to the branch instruction.
The bits from the most significant bit of the virtual address of the branch instruction down to the bit just above the most significant bit of the partial address bits are used as the tag. For example, assume that the virtual address of the branch instruction has 64 bits, [63:0], and that the 6 bits [11:6] of the virtual address are used as the index into the branch tag table; bits [63:12] of the virtual address are then selected as the tag of the branch instruction. The tags and thread numbers stored in all the tag entries of the determined tag group are compared with the tag of the branch instruction under the current thread and the thread number of the current thread, and the tag entry whose stored tag and thread number both match is taken as the tag entry corresponding to the branch instruction under the current thread.
In this example, according to the partial address bit of the virtual address of the branch instruction under the current thread, a tag group is determined in the branch tag table, and the tag table entry in the tag group, which has the same tag as the tag of the branch instruction and the same thread number as the thread number of the branch instruction, is determined to be the tag table entry corresponding to the branch instruction, so that the tag table entry corresponding to the branch instruction under the current thread can be accurately obtained.
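The lookup in the four-way example can be sketched as follows, purely for illustration. The sketch assumes the index is bits [11:6] and the tag is bits [63:12] of a 64-bit virtual address, as in the example above; the `valid` flag, the class name `TagEntry`, and the functions `split_address` and `lookup` are hypothetical and are not described in the embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INDEX_LOW, INDEX_BITS = 6, 6           # index = virtual address bits [11:6]
TAG_LOW = INDEX_LOW + INDEX_BITS       # tag   = virtual address bits [63:12]
NUM_GROUPS, NUM_WAYS = 1 << INDEX_BITS, 4


@dataclass
class TagEntry:
    valid: bool = False   # assumed valid flag, not part of the described entry
    tag: int = 0          # high-order virtual address bits
    thread: int = 0       # thread number of the owning thread


def split_address(vaddr: int) -> Tuple[int, int]:
    """Split a 64-bit virtual address into (group index, tag)."""
    vaddr &= (1 << 64) - 1
    index = (vaddr >> INDEX_LOW) & (NUM_GROUPS - 1)
    tag = vaddr >> TAG_LOW
    return index, tag


# 64 tag groups of 4 ways each: 256 tag entries in total.
tag_table = [[TagEntry() for _ in range(NUM_WAYS)] for _ in range(NUM_GROUPS)]


def lookup(vaddr: int, thread: int) -> Optional[int]:
    """Return the way whose tag and thread number both match, else None."""
    index, tag = split_address(vaddr)
    for way, entry in enumerate(tag_table[index]):
        if entry.valid and entry.tag == tag and entry.thread == thread:
            return way
    return None
```

With an entry for tag 1 and thread 0 stored in some way of group 1, `lookup(0x1040, 0)` returns that way, while `lookup(0x1040, 1)` returns `None` because the thread number differs even though the virtual address is the same.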
After the tag entry corresponding to the branch instruction under the current thread is determined in the branch tag table, the history entry corresponding to the branch instruction can be determined in the branch history table based on the group and the way in which the tag entry is located in the branch tag table. In one example, after determining a tag entry in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread, the method further includes:
generating hit information, wherein the number of bits of the hit information is the same as the number of ways of the branch tag table, different bits of the hit information correspond to different ways of the branch tag table, the bit of the hit information corresponding to the way in which the determined tag table entry is located has a first value, and the other bits have a second value;
the determining a history table entry in the branch history table according to the determined position of the tag table entry in the branch tag table, including:
determining a history group in the branch history table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch history table adopts a multi-way set-associative structure, the branch history table comprises a plurality of history groups, each history group comprises a plurality of history table entries, and different history groups correspond to different partial address bits;
and determining, from the history group, the history table entry in the corresponding way according to the position of the first value in the hit information.
Specifically, the structure of the branch history table is the same as that of the branch tag table and is not described here again. The partial address bits of the virtual address of the branch instruction under the current thread are used as an index into the branch history table to determine the history group corresponding to the branch instruction, so that the history entry corresponding to the branch instruction can then be determined based solely on the way in which the corresponding tag entry is located in the branch tag table. After the tag entry corresponding to the branch instruction is determined in the branch tag table, hit information with the same number of bits as the number of ways of the branch tag table is generated, in which the bit corresponding to the way where the tag entry is located has the first value and the other bits have the second value. Assuming that the branch tag table adopts a four-way set-associative structure comprising way 0, way 1, way 2 and way 3, the hit information comprises four bits, and when the tag entry is located in way 1, the generated hit information is 0010. After the history group corresponding to the branch instruction is determined in the branch history table, a way in the history group is determined according to the hit information, and the history entry at that position is taken as the history entry corresponding to the branch instruction under the current thread.
In the example, the way of the tag table item in the branch tag table is represented by the hit information, and the way of the history table item in the branch history table can be accurately determined based on the hit information, so that the history table item corresponding to the branch instruction under the current thread is accurately determined, and the accuracy of branch prediction is improved.
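A minimal sketch of the hit information follows, assuming a four-way table, a first value of 1, a second value of 0, and way 0 mapped to the least significant bit; these choices match the 0010 example for way 1 but are otherwise assumptions, as are the function names `make_hit_info` and `select_history_entry`.

```python
NUM_WAYS = 4  # four-way example; the real number of ways is a design choice


def make_hit_info(hit_way: int) -> int:
    """Build hit information with one bit per way: the bit of the hit way
    is 1 (assumed first value), all other bits are 0 (assumed second value)."""
    if not 0 <= hit_way < NUM_WAYS:
        raise ValueError("way out of range")
    return 1 << hit_way


def select_history_entry(history_group, hit_info: int):
    """Pick the history entry of the way whose hit bit is set.

    history_group is the list of history entries (one per way) selected
    by the same index bits that selected the tag group."""
    for way in range(NUM_WAYS):
        if (hit_info >> way) & 1:
            return history_group[way]
    return None  # no bit set: nothing to select
```

For a tag entry found in way 1, `format(make_hit_info(1), "04b")` yields the string `0010`, and `select_history_entry` then returns the second element of the selected history group.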
Each history entry of the branch history table stores the jump information of the corresponding branch instruction, and whether to execute the jump can be determined based on the jump information. In one example, acquiring the jump address corresponding to the branch instruction and executing the jump if the read jump information indicates a confirmed jump, and not executing the jump if the read jump information indicates a rejected jump, includes:
if the read high order of the jump information is 1, acquiring a jump address corresponding to the branch instruction and executing jump;
and if the high order of the read jump information is 0, not executing the jump.
Specifically, the branch history table keeps a history of the jump behavior of the branch instruction, from which it can be predicted whether the jump should be executed this time; its specific form may vary. Illustratively, each history entry of the branch history table holds 2 bits of jump information. A high-order bit of 1 indicates that the predicted result is to jump, and a high-order bit of 0 indicates that the predicted result is not to jump; a low-order bit of 1 indicates a higher accuracy of the prediction result, and a low-order bit of 0 indicates a lower accuracy of the prediction result. Fig. 5 is a schematic flow chart of updating jump information according to an embodiment of the present application. As shown in fig. 5, when the jump information is 00 or 01, no jump is executed; when the jump information is 11 or 10, the jump address corresponding to the branch instruction is acquired and the jump is executed.
Because the branch history table may mispredict, after the jump is executed or not executed for the branch instruction based on the jump information, the branch history table updates the jump information based on the actual execution result of the branch instruction. When the jump information is 00, it is updated to 10 if the actual execution result of the branch instruction is a jump, and to 01 if the actual execution result is no jump. When the jump information is 01, it is updated to 00 if the actual execution result is a jump, and remains 01 if the actual execution result is no jump. When the jump information is 10, it is updated to 11 if the actual execution result is a jump, and to 00 if the actual execution result is no jump. When the jump information is 11, it remains 11 if the actual execution result is a jump, and is updated to 10 if the actual execution result is no jump.
In this example, the jump condition of the branch instruction can be predicted based on the high order of the jump information, and the jump information of the branch history table can be updated based on the actual execution condition of the branch instruction, so that the accuracy of branch prediction can be improved.
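The prediction rule and the update rule for the 2-bit jump information can be written as a small transition table. The sketch below encodes exactly the eight transitions listed above; the names `NEXT_JUMP_INFO`, `predict_jump` and `update_jump_info` are hypothetical.

```python
# (current jump information, branch actually jumped) -> next jump information
NEXT_JUMP_INFO = {
    (0b00, True): 0b10, (0b00, False): 0b01,
    (0b01, True): 0b00, (0b01, False): 0b01,
    (0b10, True): 0b11, (0b10, False): 0b00,
    (0b11, True): 0b11, (0b11, False): 0b10,
}


def predict_jump(jump_info: int) -> bool:
    """Predict a jump when the high-order bit of the jump information is 1."""
    return (jump_info >> 1) & 1 == 1


def update_jump_info(jump_info: int, jumped: bool) -> int:
    """Return the new jump information after the branch actually executes."""
    return NEXT_JUMP_INFO[(jump_info, jumped)]
```

Under this encoding, an entry at 01 needs two consecutive actual jumps (01 to 00, then 00 to 10) before the prediction flips to "jump", so a single atypical outcome does not immediately reverse a prediction whose low-order bit is 1.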
Because the branch tag table contains tag entries for only some branch instructions, it may contain no tag entry corresponding to the branch instruction under the current thread, that is, a miss may occur. In one example, the method further comprises:
if the corresponding tag table item does not exist in the branch tag table, the miss is judged, and the jump is not executed.
Specifically, after a tag group is determined in the branch tag table according to the partial address bits of the virtual address of the branch instruction, if no tag entry in the tag group has the same tag and thread number as the branch instruction, the branch tag table contains no tag entry corresponding to the branch instruction under the current thread. In that case a miss is determined, the jump is not executed, and the sequential instruction fetching process continues.
In this example, if no corresponding tag entry exists in the branch tag table, a miss is determined, which indicates that no jump information of the branch instruction exists in the branch history table, and no jump is executed, so as to continue the sequential instruction fetching process.
Since the branch tag table holds tag and thread information for only some branch instructions, in order to increase the hit rate of the branch tag table, in one example, the method further comprises:
if a miss occurs, selecting the least recently accessed tag table entry from the tag group according to a least recently used algorithm, and replacing the tag and the thread number in that tag table entry with the tag of the branch instruction under the current thread and the thread number of the current thread.
Specifically, the least recently used (LRU) algorithm evicts data based on its historical access record; its core idea is that "if data has been accessed recently, the probability that it will be accessed again is higher." If the branch tag table misses, the least recently accessed tag entry in the tag group corresponding to the branch instruction is determined based on the least recently used algorithm, and the tag and the thread number stored in that tag entry are replaced with the tag of the virtual address of the branch instruction under the current thread and the thread number of the current thread.
In this example, the hit rate of the query branch tag table can be improved by updating the branch tag table with the least recently used algorithm, so that the jump condition of more branch instructions can be predicted.
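The miss handling and replacement within one tag group can be sketched as follows. The representation of a group as a list of `(tag, thread)` pairs, the explicit recency list `lru_order`, and the function name `lookup_or_replace` are assumptions of this sketch; treating a newly written entry as most recently used is likewise an assumption, and any handling of the corresponding history entry on replacement is not described in the text and is left out.

```python
def lookup_or_replace(group, lru_order, tag, thread):
    """Look up (tag, thread) in one tag group; on a miss, replace the
    least recently used way.

    group     : list with one (tag, thread) pair or None per way
    lru_order : list of way numbers, least recently used first
    Returns (hit, way): hit is True on a tag hit, and way is the way that
    matched or, on a miss, the way that was replaced.
    """
    for way, entry in enumerate(group):
        if entry == (tag, thread):
            # Hit: mark this way as most recently used.
            lru_order.remove(way)
            lru_order.append(way)
            return True, way
    # Miss: no jump is predicted; evict the least recently used way and
    # store the tag and thread number of the current branch instruction.
    victim = lru_order.pop(0)
    group[victim] = (tag, thread)
    lru_order.append(victim)  # assumed: new entry counts as most recent
    return False, victim
```

Starting from an empty four-way group with `lru_order = [0, 1, 2, 3]`, a first access misses and fills way 0, a repeated access with the same tag and thread number hits way 0, and an access with the same tag but a different thread number misses and fills way 1.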
In the branch prediction method, the device, the electronic equipment and the medium provided by the embodiment of the invention, a branch tag table storing the high-order virtual address of a branch instruction is set, and after the instruction fetched under the current thread is judged to be the branch instruction in the instruction decoding stage, a tag table entry is determined in the branch tag table according to part of bits of the virtual address of the branch instruction, wherein the high-order virtual address contained in the tag table entry is the same as the high-order virtual address of the branch instruction; and determining a history table entry corresponding to the tag table entry in the branch history table, and executing the jump or non-jump of the branch instruction according to the jump information in the history table entry. By comparing the high-order virtual address of the branch instruction, the history table entry corresponding to the branch instruction in the branch history table can be accurately obtained, so that whether the branch instruction jumps or not can be accurately predicted, and the performance of the processor is effectively improved.
Example two
FIG. 6 is a schematic diagram of a branch prediction apparatus according to an embodiment of the present application. As shown in fig. 6, the branch prediction apparatus provided in this embodiment may include:
An obtaining module 61, configured to obtain a partial address bit of a virtual address of a branch instruction under a current thread after determining that the instruction fetched under the current thread is the branch instruction in an instruction decoding stage;
a determining module 62, configured to determine a tag table entry in a branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; each tag table entry comprises the tag of the corresponding branch instruction, wherein the tag of the branch instruction comprises the bits from the highest bit of the virtual address of the branch instruction to the highest bit of the partial address bits;
a reading module 63, configured to determine a history entry in a branch history table according to the determined tag entry, and read jump information in the history entry, where the history entry corresponds to the tag entry;
a jump module 64, configured to acquire a jump address corresponding to the branch instruction and execute a jump if the read jump information indicates that a jump is confirmed; and if the read jump information represents refusal jump, executing no jump.
In practical applications, the branch prediction apparatus may be implemented as a computer program, for example, application software; alternatively, it may be implemented as a medium storing the relevant computer program, for example, a USB flash drive or a cloud disk; still alternatively, it may be implemented as a physical device integrated with or installed with the relevant computer program, for example, a chip or a server.
Because the branch history table is indexed using only some of the bits of the virtual address of the branch instruction, the remaining bits of the virtual address can be compared in order to accurately obtain the history entry corresponding to the branch instruction in the branch history table. Specifically, a branch tag table is provided. The branch tag table comprises a plurality of tag entries, different tag entries correspond to different branch instructions under a thread, and each tag entry contains the high-order bits of the virtual address of the corresponding branch instruction as a tag. Before the branch history table is queried, some of the bits of the virtual address of the branch instruction are taken as an index, and the bits from the most significant bit of the virtual address down to the bit just above the most significant index bit are taken as the tag of the branch instruction. A tag entry corresponding to the branch instruction is then determined in the branch tag table according to the index and the tag of the branch instruction, the tag stored in that tag entry being the same as the tag of the branch instruction. Finally, the history entry corresponding to the tag entry is determined in the branch history table, so that the jump prediction result corresponding to the branch instruction can be accurately obtained from the branch history table.
Specifically, instruction fetching and instruction decoding are performed in sequence. When the instruction fetched under the current thread is determined in the instruction decoding stage to be a branch instruction, the tag entry corresponding to the branch instruction is queried in the branch tag table according to the index and the tag of the branch instruction. If the tag entry corresponding to the branch instruction is found, that is, the tag hits, the history entry corresponding to the tag entry is obtained from the branch history table and used as the history entry corresponding to the branch instruction. The jump information of the branch instruction is read from that history entry. If the jump information indicates a confirmed jump, address redirection is performed: the jump address corresponding to the branch instruction is obtained, an instruction is fetched according to the jump address, and instruction decoding and instruction execution are performed. If the jump information indicates a rejected jump, the sequential instruction fetching process continues.
In this example, a branch tag table storing the high-order virtual address of the branch instruction is set, after the instruction fetched under the current thread is determined to be the branch instruction in the instruction decoding stage, a tag table entry is determined in the branch tag table according to part of bits of the virtual address of the branch instruction, where the high-order virtual address included in the tag table entry is the same as the high-order virtual address of the branch instruction; and determining a history table entry corresponding to the tag table entry in the branch history table, and executing the jump or non-jump of the branch instruction according to the jump information in the history table entry. By comparing the high-order virtual address of the branch instruction, the history table entry corresponding to the branch instruction in the branch history table can be accurately obtained, so that whether the branch instruction jumps or not can be accurately predicted, and the performance of the processor is effectively improved.
In practical applications, there may be multiple ways of mapping the tag table entry to the history table entry, and in one example, the position of the history table entry corresponding to any branch instruction in the branch history table is the same as the position of the tag table entry corresponding to the branch instruction in the branch tag table.
Specifically, the position of each history entry in the branch history table is the same as the position, in the branch tag table, of the tag entry corresponding to that history entry, and the history entry and the tag entry correspond to the same branch instruction. For example, if the history entry corresponding to a branch instruction is located at the first position of the branch history table, the tag entry corresponding to that branch instruction is located at the first position of the branch tag table.
In this example, by making the position of the history entry corresponding to the branch instruction in the branch history table and the position of the tag entry corresponding to the branch instruction in the branch tag table the same, the corresponding history entry can be determined in the branch history table based on the position of the tag entry in the branch tag table.
Based on the positional correspondence of the tag entries with the history entries, in one example, the determining module 62 is specifically configured to:
And determining a history table item in the branch history table according to the determined position of the tag table item in the branch tag table, wherein the position of the history table item in the branch history table is the same as the position of the tag table item in the branch tag table.
Specifically, according to the position of the tag entry corresponding to the branch instruction in the branch tag table, the history entry located at the same position in the branch history table is obtained. For example, if the tag entry corresponding to the branch instruction is located at the first position of the branch tag table, the history entry at the first position of the branch history table is obtained and used as the history entry corresponding to the tag entry, that is, the history entry corresponding to the branch instruction.
In this example, the history entries with the same positions in the branch history table as the corresponding tag entries in the branch tag table are obtained, and the corresponding history entries can be accurately obtained based on the determined tag entries, so that the jump information in the history entries corresponding to the branch instruction can be accurately obtained, and the accuracy of branch prediction is improved.
For processors supporting hyper-threading, the branch history table may be shared by multiple threads. Because the jump tendency of the same branch instruction is not necessarily the same under different threads, while the corresponding virtual addresses may be identical, the threads can interfere with one another. In one example, the branch tag table includes tag entries corresponding to branch instructions under a plurality of different threads, the same branch instruction under different threads corresponds to different tag entries, and each tag entry further includes the thread number of the thread in which the corresponding branch instruction is located.
Specifically, since different threads have different thread numbers, a tag entry in the branch tag table may include, in addition to the tag corresponding to the branch instruction, the thread number of the thread in which the branch instruction is located. Because the thread numbers differ, the tag entries corresponding to the same branch instruction under different threads in the branch tag table are different, so the history entries corresponding to the same branch instruction under different threads in the branch history table are also different. For a given branch instruction under a given thread, the corresponding history entry can therefore be uniquely determined, which effectively avoids mutual interference among multiple threads.
In this example, for instruction execution of multiple threads, the thread number of the branch instruction is saved through the branch tag table, so that the history entry corresponding to the branch instruction can be accurately acquired based on the virtual address and the thread information of the branch instruction under the current thread, mutual interference of multiple threads is avoided, and accuracy of branch prediction is improved.
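The thread-qualified matching described above can be illustrated with a minimal sketch. The names `TagEntry` and `same_branch` and the sample values are hypothetical and are not taken from the disclosure; the sketch only shows that two threads executing a branch at the same virtual address do not alias onto one tag entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagEntry:
    """Hypothetical model of one tag entry: a tag plus a thread number."""
    tag: int        # upper virtual-address bits of the branch instruction
    thread_id: int  # number of the thread in which the branch is located


def same_branch(entry: TagEntry, tag: int, thread_id: int) -> bool:
    """An entry matches only if both the tag and the thread number agree."""
    return entry.tag == tag and entry.thread_id == thread_id


# Two threads execute a branch at the same virtual address (same tag).
entry_t0 = TagEntry(tag=0x4000, thread_id=0)
entry_t1 = TagEntry(tag=0x4000, thread_id=1)

print(same_branch(entry_t0, 0x4000, 0))  # thread 0 matches its own entry
print(same_branch(entry_t0, 0x4000, 1))  # thread 1 does not match thread 0's entry
```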
The branch tag table may have various structures. In one example, the branch tag table adopts a multi-way set-associative structure, and the determining module 62 is further specifically configured to:
determining a tag group in the branch tag table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises a plurality of tag groups, each tag group comprises a plurality of tag entries, and different tag groups correspond to different values of the partial address bits;
and determining a tag entry from the tag group according to the tag of the branch instruction under the current thread and the thread number of the current thread, wherein the tag in the tag entry is the same as the tag of the branch instruction under the current thread, and the thread number in the tag entry is the same as the thread number of the current thread.
The branch tag table adopts a multi-way set-associative structure. Assume that it comprises M groups and N ways; different tag groups correspond to different values of the partial address bits of the virtual address, the branch tag table contains M × N tag entries in total, and each tag entry comprises the tag of a branch instruction under the corresponding thread and the thread number of that thread. Some of the low-order address bits of the virtual address of the branch instruction under the current thread are used as an index to determine the corresponding tag group in the branch tag table. The bit width of the partial address bits used for indexing is determined by the number of groups in the branch tag table: if the branch tag table contains 2^n groups, that is, each way of the set-associative structure includes 2^n tag entries, the required index width is n bits. The number of entries and the structure of the branch tag table can be determined according to the requirements of the hardware design and are not limited herein. Taking a four-way set-associative branch tag table as an example, assume that the branch tag table holds 256 tag entries and that the virtual address of the branch instruction has 64 bits, [63:0]. The branch tag table then has 64 tag groups, each way contains 64 tag entries, and bits [11:6] of the virtual address, 6 bits in total, are selected as the index into the branch tag table to determine the tag group corresponding to the branch instruction. For example, if bits [11:6] of the virtual address of the branch instruction are 000000, group 0 in the branch tag table is determined as the tag group corresponding to the branch instruction.
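The index arithmetic in the four-way example can be checked with a short sketch. The function names `index_width` and `group_index` are illustrative assumptions; the starting bit position 6 is taken from the [11:6] example and is passed in as a parameter rather than fixed.

```python
def index_width(num_entries: int, num_ways: int) -> int:
    """Number of index bits n such that the table has 2**n groups."""
    groups, remainder = divmod(num_entries, num_ways)
    if remainder or groups <= 0 or groups & (groups - 1):
        raise ValueError("group count must be a positive power of two")
    return groups.bit_length() - 1


def group_index(vaddr: int, low_bit: int, width: int) -> int:
    """Extract `width` bits of `vaddr`, starting at bit `low_bit`."""
    return (vaddr >> low_bit) & ((1 << width) - 1)


# 256 entries in 4 ways give 64 groups, so 6 index bits are needed.
n = index_width(num_entries=256, num_ways=4)
print(n)

# A virtual address whose bits [11:6] are 000001 selects group 1.
print(group_index(0x0000_0000_0000_0040, low_bit=6, width=n))
```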
The bits from the most significant bit of the virtual address of the branch instruction down to the bit immediately above the most significant bit of the partial address bits are used as the tag. For example, assume that the virtual address of the branch instruction has 64 bits, [63:0], and that bits [11:6] of the virtual address, 6 bits in total, are used as the index into the branch tag table; bits [63:12] of the virtual address are then selected as the tag of the branch instruction. The tags and thread numbers stored in all tag entries of the determined tag group are compared with the tag and thread number of the branch instruction under the current thread, and the tag entry whose stored tag and thread number are both the same is taken as the tag entry corresponding to the branch instruction under the current thread.
In this example, according to the partial address bit of the virtual address of the branch instruction under the current thread, a tag group is determined in the branch tag table, and the tag table entry in the tag group, which has the same tag as the tag of the branch instruction and the same thread number as the thread number of the branch instruction, is determined to be the tag table entry corresponding to the branch instruction, so that the tag table entry corresponding to the branch instruction under the current thread can be accurately obtained.
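The lookup described above can be sketched in software as follows, assuming the four-way, 64-group geometry of the example with index bits [11:6] and tag bits [63:12]. The table layout (a list of groups, each a list of `(tag, thread_id)` slots) and the names `split_address` and `lookup` are assumptions made for illustration, not the hardware structure of the disclosure.

```python
from typing import List, Optional, Tuple

INDEX_LOW, INDEX_WIDTH = 6, 6            # index = vaddr[11:6], as in the example
NUM_GROUPS, NUM_WAYS = 1 << INDEX_WIDTH, 4

Slot = Optional[Tuple[int, int]]         # (tag, thread_id), or None if empty
tag_table: List[List[Slot]] = [[None] * NUM_WAYS for _ in range(NUM_GROUPS)]


def split_address(vaddr: int) -> Tuple[int, int]:
    """Return (group index, tag) for a 64-bit virtual address."""
    vaddr &= (1 << 64) - 1
    index = (vaddr >> INDEX_LOW) & (NUM_GROUPS - 1)
    tag = vaddr >> (INDEX_LOW + INDEX_WIDTH)   # vaddr[63:12]
    return index, tag


def lookup(vaddr: int, thread_id: int) -> Optional[Tuple[int, int]]:
    """Return (group, way) of the matching tag entry, or None on a miss."""
    index, tag = split_address(vaddr)
    for way, slot in enumerate(tag_table[index]):
        if slot == (tag, thread_id):     # tag and thread number must both match
            return index, way
    return None


# Install an entry for thread 0 in way 2 of the indexed group, then query it.
branch_vaddr = 0x0000_7F00_1234_5040
group, tag = split_address(branch_vaddr)
tag_table[group][2] = (tag, 0)

print(lookup(branch_vaddr, thread_id=0))  # hit: (group, way)
print(lookup(branch_vaddr, thread_id=1))  # miss for another thread: None
```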
After the tag entry corresponding to the branch instruction under the current thread is determined in the branch tag table, the history entry corresponding to the branch instruction can be determined in the branch history table based on the group and the way in which the tag entry is located in the branch tag table. In one example, the determining module 62 is further configured to:
generating hit information, wherein the number of bits of the hit information is the same as the number of ways of the branch tag table, different bits of the hit information correspond to different ways of the branch tag table, the value of the bit in the hit information corresponding to the way of the determined tag entry is a first value, and the values of the other bits are a second value;
the determining a history entry in the branch history table according to the determined position of the tag entry in the branch tag table includes:
determining a history group in the branch history table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch history table adopts a multi-way set-associative structure, the branch history table comprises a plurality of history groups, each history group comprises a plurality of history entries, and different history groups correspond to different values of the partial address bits;
and determining the history entry in the corresponding way from the history group according to the position of the first value in the hit information.
Specifically, the structure of the branch history table is the same as that of the branch tag table and is not described again here. The partial address bits of the virtual address of the branch instruction under the current thread are used as an index into the branch history table to determine the history group corresponding to the branch instruction, so that only the way of the tag entry corresponding to the branch instruction in the branch tag table is still needed to determine the history entry. After the tag entry corresponding to the branch instruction is determined in the branch tag table, hit information with the same number of bits as the number of ways of the branch tag table is generated, in which the bit corresponding to the way where the tag entry is located takes the first value and the other bits take the second value. Assuming that the branch tag table adopts a four-way set-associative structure comprising way 0, way 1, way 2 and way 3, the hit information comprises four bits, and when the tag entry is located in way 1, the generated hit information is 0010. After the history group corresponding to the branch instruction is determined in the branch history table, the way within the history group is selected according to the hit information, and the history entry at that position is taken as the history entry corresponding to the branch instruction under the current thread.
In the example, the way of the tag table item in the branch tag table is represented by the hit information, and the way of the history table item in the branch history table can be accurately determined based on the hit information, so that the history table item corresponding to the branch instruction under the current thread is accurately determined, and the accuracy of branch prediction is improved.
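A minimal sketch of the hit information follows. It assumes that the first value is 1, the second value is 0, and bit i of the hit information corresponds to way i, which reproduces the 0010 pattern given for way 1; the disclosure does not fix these values, and the function names are hypothetical.

```python
from typing import List

NUM_WAYS = 4


def make_hit_info(hit_way: int) -> int:
    """One-hot vector with bit `hit_way` set (first value 1, second value 0)."""
    if not 0 <= hit_way < NUM_WAYS:
        raise ValueError("way out of range")
    return 1 << hit_way


def format_hit_info(hit_info: int) -> str:
    """Render the hit information as a bit string, way 0 on the right."""
    return format(hit_info, f"0{NUM_WAYS}b")


def select_history_entry(history_group: List[int], hit_info: int) -> int:
    """Pick the history entry in the way marked by the single set bit."""
    if hit_info == 0 or hit_info & (hit_info - 1):
        raise ValueError("hit information must have exactly one bit set")
    return history_group[hit_info.bit_length() - 1]


hit = make_hit_info(1)                      # tag entry found in way 1
print(format_hit_info(hit))                 # 0010
history_group = [0b00, 0b11, 0b01, 0b10]    # 2-bit jump information per way
print(select_history_entry(history_group, hit))
```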
The history entries of the branch history table store the jump information of the corresponding branch instructions, based on which it can be decided whether to execute the jump. In one example, the jump module 64 is specifically configured to:
if the high-order bit of the read jump information is 1, acquiring a jump address corresponding to the branch instruction and executing the jump;
and if the high-order bit of the read jump information is 0, not executing the jump.
Specifically, the branch history table records the history of the jump behavior of the branch instruction, from which it can be predicted whether to execute the jump this time; its specific form may vary. Illustratively, each history entry of the branch history table stores 2 bits of jump information. When the high-order bit of the jump information is 1, the prediction is that the jump is executed, and when the high-order bit is 0, the prediction is that the jump is not executed; a low-order bit of 1 indicates higher confidence in the prediction result, and a low-order bit of 0 indicates lower confidence. For example, when the jump information is 00 or 01, no jump is executed; when the jump information is 11 or 10, the jump address corresponding to the branch instruction is acquired and the jump is executed. Because the branch history table can mispredict, after the jump is executed or not executed for the branch instruction based on the jump information, the jump information is updated based on the actual execution result of the branch instruction.
When the jump information is 00, if the actual execution result of the branch instruction is a jump, the jump information corresponding to the branch instruction is updated to 10; if the actual execution result is no jump, the jump information is updated to 01. When the jump information is 01, if the actual execution result is a jump, the jump information is updated to 00; if the actual execution result is no jump, the jump information remains 01. When the jump information is 10, if the actual execution result is a jump, the jump information is updated to 11; if the actual execution result is no jump, the jump information is updated to 00. When the jump information is 11, if the actual execution result is a jump, the jump information remains 11; if the actual execution result is no jump, the jump information is updated to 10.
In this example, the jump condition of the branch instruction can be predicted based on the high order of the jump information, and the jump information of the branch history table can be updated based on the actual execution condition of the branch instruction, so that the accuracy of branch prediction can be improved.
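The eight transitions listed above can be written as a lookup table. The sketch below encodes exactly those transitions; the names `NEXT_STATE`, `predict_taken` and `update` are illustrative and do not come from the disclosure.

```python
# (current jump information, actual result is a jump) -> new jump information
NEXT_STATE = {
    (0b00, True): 0b10, (0b00, False): 0b01,
    (0b01, True): 0b00, (0b01, False): 0b01,
    (0b10, True): 0b11, (0b10, False): 0b00,
    (0b11, True): 0b11, (0b11, False): 0b10,
}


def predict_taken(jump_info: int) -> bool:
    """The high-order bit of the 2-bit jump information is the prediction."""
    return bool((jump_info >> 1) & 1)


def update(jump_info: int, actually_taken: bool) -> int:
    """Return the jump information after the branch has actually executed."""
    return NEXT_STATE[(jump_info, actually_taken)]


# A branch that starts at 00 and is then taken three times in a row.
state = 0b00
for _ in range(3):
    print(format(state, "02b"), "predict jump:", predict_taken(state))
    state = update(state, actually_taken=True)
print(format(state, "02b"))
```

Read this way, a correct prediction sets the low-order bit to 1, a misprediction with the low-order bit at 1 clears it, and a misprediction with the low-order bit at 0 flips the high-order bit, so two consecutive mispredictions are needed to reverse a high-confidence prediction.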
In the above case, the branch tag table holds tag entries for only some of the branch instructions under the current thread, so the branch tag table may contain no tag entry corresponding to the branch instruction under the current thread, i.e., a miss. In one example, the jump module 64 is further configured to:
if the corresponding tag table item does not exist in the branch tag table, the miss is judged, and the jump is not executed.
Specifically, after a tag group is determined in the branch tag table according to the partial address bits of the virtual address of the branch instruction, if the tag group contains no tag entry whose tag and thread number are the same as those of the branch instruction, the branch tag table holds no tag entry corresponding to the branch instruction under the current thread; a miss is determined, the jump is not executed, and sequential instruction fetching continues.
In this example, if no corresponding tag entry exists in the branch tag table, a miss is determined, which indicates that no jump information of the branch instruction exists in the branch history table, and no jump is executed, so as to continue the sequential instruction fetching process.
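The effect of a miss on instruction fetching can be sketched as follows. The function name `next_fetch_address`, the use of `None` to represent a miss, and the fixed instruction size of 4 bytes are assumptions made for illustration; the disclosure only states that sequential fetching continues.

```python
from typing import Optional


def next_fetch_address(pc: int,
                       jump_info: Optional[int],
                       jump_target: Optional[int],
                       instr_bytes: int = 4) -> int:
    """Choose the next fetch address for a branch instruction at `pc`.

    `jump_info` is None on a branch tag table miss; `instr_bytes` is an
    assumed fixed instruction size used for sequential fetching.
    """
    if jump_info is None:              # miss: no jump, keep fetching in order
        return pc + instr_bytes
    if (jump_info >> 1) & 1:           # hit, high-order bit 1: execute the jump
        if jump_target is None:
            raise ValueError("a predicted jump needs a jump address")
        return jump_target
    return pc + instr_bytes            # hit, high-order bit 0: no jump


print(hex(next_fetch_address(0x1000, None, None)))      # miss
print(hex(next_fetch_address(0x1000, 0b10, 0x2000)))    # predicted jump
print(hex(next_fetch_address(0x1000, 0b01, 0x2000)))    # predicted no jump
```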
Since the branch tag table holds the tag and thread information of only a portion of the branch instructions, in order to increase the hit rate of the branch tag table, in one example, the apparatus further comprises:
the updating module, configured to, in the event of a miss under the current thread, select the least recently accessed tag entry from the tag group according to the least recently used algorithm, and replace the tag and the thread number in that tag entry with the tag of the branch instruction under the current thread and the thread number of the current thread.
Specifically, the least recently used algorithm evicts data according to its access history; the core idea is that data accessed recently is more likely to be accessed again. On a miss in the branch tag table, the least recently accessed tag entry in the tag group corresponding to the branch instruction is determined based on the least recently used algorithm, and the tag and thread number stored in that tag entry are replaced with the tag of the virtual address of the branch instruction under the current thread and the thread number of the current thread.
In this example, the hit rate of the query branch tag table can be improved by updating the branch tag table with the least recently used algorithm, so that the jump condition of more branch instructions can be predicted.
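The replacement step can be sketched for a single tag group as below. The class `TagGroup`, the per-way timestamps used to track recency, and the choice to treat empty ways as the oldest are all assumptions for illustration; hardware implementations typically track recency differently, and the disclosure does not specify how.

```python
from typing import List, Optional, Tuple


class TagGroup:
    """Hypothetical model of one tag group with least-recently-used replacement."""

    def __init__(self, num_ways: int = 4) -> None:
        self.slots: List[Optional[Tuple[int, int]]] = [None] * num_ways
        self.last_used: List[int] = [0] * num_ways   # 0 means never used
        self.clock = 0

    def _touch(self, way: int) -> None:
        self.clock += 1
        self.last_used[way] = self.clock

    def access(self, tag: int, thread_id: int) -> bool:
        """Return True on a hit; on a miss, replace the least recently used way."""
        key = (tag, thread_id)
        for way, slot in enumerate(self.slots):
            if slot == key:
                self._touch(way)
                return True
        victim = min(range(len(self.slots)), key=lambda w: self.last_used[w])
        self.slots[victim] = key       # overwrite tag and thread number
        self._touch(victim)
        return False


group = TagGroup(num_ways=2)
print(group.access(0xA, 0))   # miss, fills way 0
print(group.access(0xB, 0))   # miss, fills way 1
print(group.access(0xA, 0))   # hit, way 0 becomes most recent
print(group.access(0xC, 0))   # miss, evicts the entry for tag 0xB
print(group.slots)
```

The sketch leaves the paired history entry untouched on replacement, since the text above does not say how it is initialised.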
In the branch prediction apparatus provided in this embodiment, a branch tag table storing the high-order virtual address of a branch instruction is set, and after determining that an instruction fetched under a current thread is a branch instruction in an instruction decoding stage, a tag entry is determined in the branch tag table according to a part of bits of the virtual address of the branch instruction, where the high-order virtual address included in the tag entry is the same as the high-order virtual address of the branch instruction; and determining a history table entry corresponding to the tag table entry in the branch history table, and executing the jump or non-jump of the branch instruction according to the jump information in the history table entry. By comparing the high-order virtual address of the branch instruction, the history table entry corresponding to the branch instruction in the branch history table can be accurately obtained, so that whether the branch instruction jumps or not can be accurately predicted, and the performance of the processor is effectively improved.
Example III
Fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the disclosure. As shown in Fig. 7, the electronic device includes a processor 291 and a memory 292, and may also include a communication interface 293 and a bus 294. The processor 291, the memory 292, and the communication interface 293 may communicate with each other via the bus 294. The communication interface 293 may be used for information transfer. The processor 291 may call logic instructions in the memory 292 to perform the methods of the above-described embodiments.
Further, the logic instructions in memory 292 described above may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product.
The memory 292 is a computer-readable storage medium that may be used to store a software program, a computer-executable program, and program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 291 executes functional applications and data processing by running software programs, instructions and modules stored in the memory 292, i.e., implements the methods of the method embodiments described above.
The memory 292 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created according to the use of the terminal device, etc. Further, the memory 292 may include high-speed random access memory, and may also include non-volatile memory.
The disclosed embodiments provide a non-transitory computer readable storage medium having stored therein computer-executable instructions that, when executed by a processor, are configured to implement the method of the previous embodiments.
Example IV
The disclosed embodiments provide a computer program product comprising a computer program which, when executed by a processor, implements the method provided by any of the embodiments of the disclosure described above.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (11)

1. A method of branch prediction, comprising:
after judging that an instruction fetched under a current thread is a branch instruction in an instruction decoding stage, acquiring part of address bits of a virtual address of the branch instruction under the current thread;
determining a tag table entry in a branch tag table according to part of address bits of a virtual address of a branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; the tag table entry comprises a tag of a corresponding branch instruction, wherein the tag of the branch instruction comprises a highest bit of a virtual address of the branch instruction to a highest bit of the partial address bits;
according to the determined tag table item, a history table item is determined in a branch history table, and jump information in the history table item is read, wherein the history table item corresponds to the tag table item;
if the read jump information represents confirmation jump, acquiring a jump address corresponding to the branch instruction and executing jump; and if the read jump information represents refusal jump, not executing the jump.
2. The method of claim 1, wherein a location of a history entry corresponding to any branch instruction in the branch history table is the same as a location of a tag entry corresponding to the branch instruction in the branch tag table.
3. The method of claim 2, wherein said determining a history entry in a branch history table based on said determined tag entry comprises:
and determining a history table item in the branch history table according to the determined position of the tag table item in the branch tag table, wherein the position of the history table item in the branch history table is the same as the position of the tag table item in the branch tag table.
4. The method of claim 1, wherein the branch tag table includes tag entries corresponding to branch instructions under a plurality of different threads, the same branch instruction under a different thread corresponding to a different tag entry, and the tag entries further include thread numbers of threads in which the corresponding branch instruction is located.
5. The method of claim 4, wherein the branch tag table adopts a multi-way set-associative structure; the determining a tag table entry in the branch tag table according to the partial address bits of the virtual address of the branch instruction under the current thread comprises:
determining a tag group in the branch tag table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises a plurality of tag groups, each tag group comprises a plurality of tag table entries, and different tag groups correspond to different values of the partial address bits;
and determining a tag table entry from the tag group according to the tag of the branch instruction under the current thread and the thread number of the current thread, wherein the tag in the tag table entry is the same as the tag of the branch instruction under the current thread, and the thread number in the tag table entry is the same as the thread number of the current thread.
6. The method of claim 5, wherein after determining a tag entry in the branch tag table based on the partial address bits of the virtual address of the branch instruction under the current thread, further comprising:
generating hit information, wherein the number of bits of the hit information is the same as the number of ways of the branch tag table, different bits of the hit information correspond to different ways of the branch tag table, the value of the bit in the hit information corresponding to the way of the determined tag table entry is a first value, and the values of the other bits are a second value;
the determining a history table entry in a branch history table according to the determined position of the tag table entry in the branch tag table includes:
determining a history group in the branch history table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch history table adopts a multi-way set-associative structure, the branch history table comprises a plurality of history groups, each history group comprises a plurality of history table entries, and different history groups correspond to different values of the partial address bits;
and determining the history table entry in the corresponding way from the history group according to the position of the first value in the hit information.
7. The method of claim 6, wherein the method further comprises:
if the corresponding tag table item does not exist in the branch tag table, the miss is judged, and the jump is not executed.
8. The method of claim 7, wherein the method further comprises:
if the branch instruction is not hit, selecting the least recently accessed tag table entry from the tag group according to a least recently used algorithm, and replacing the tag and the thread number in the tag table entry with the tag of the branch instruction under the current thread and the thread number of the current thread.
9. The method according to any one of claims 1-8, wherein if the read jump information characterizes a confirm jump, then obtaining a jump address corresponding to the branch instruction and executing the jump; if the read jump information represents refusal jump, not executing jump, including:
if the high-order bit of the read jump information is 1, acquiring a jump address corresponding to the branch instruction and executing the jump;
and if the high-order bit of the read jump information is 0, not executing the jump.
10. A branch prediction apparatus, comprising:
the acquisition module is used for acquiring partial address bits of the virtual address of the branch instruction under the current thread after judging that the instruction taken out under the current thread is the branch instruction in the instruction decoding stage;
the determining module is used for determining a tag table entry in the branch tag table according to partial address bits of the virtual address of the branch instruction under the current thread; the branch tag table comprises tag table entries corresponding to a plurality of branch instructions, and different branch instructions correspond to different tag table entries; the tag table entry comprises a tag of a corresponding branch instruction, wherein the tag of the branch instruction comprises a highest bit of a virtual address of the branch instruction to a highest bit of the partial address bits;
the reading module is used for determining a history table item in a branch history table according to the determined tag table item and reading jump information in the history table item, wherein the history table item corresponds to the tag table item;
the jump module is used for acquiring a jump address corresponding to the branch instruction and executing the jump if the read jump information represents the confirmation jump; and if the read jump information represents refusal jump, not executing the jump.
11. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-9.
CN202311447344.2A 2023-11-02 2023-11-02 Branch prediction method, device, electronic equipment and medium Pending CN117389629A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311447344.2A CN117389629A (en) 2023-11-02 2023-11-02 Branch prediction method, device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311447344.2A CN117389629A (en) 2023-11-02 2023-11-02 Branch prediction method, device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN117389629A true CN117389629A (en) 2024-01-12

Family

ID=89440623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311447344.2A Pending CN117389629A (en) 2023-11-02 2023-11-02 Branch prediction method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN117389629A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination