CN117234594A - Branch prediction method, electronic device and storage medium - Google Patents

Branch prediction method, electronic device and storage medium

Info

Publication number
CN117234594A
CN117234594A (Application CN202311106975.8A)
Authority
CN
China
Prior art keywords
branch prediction
branch
value
predictor
target
Prior art date
Legal status
Pending
Application number
CN202311106975.8A
Other languages
Chinese (zh)
Inventor
纪嘉龙
李文明
Current Assignee
Shanghai Processor Technology Innovation Center
Original Assignee
Shanghai Processor Technology Innovation Center
Priority date
Filing date
Publication date
Application filed by Shanghai Processor Technology Innovation Center
Priority to CN202311106975.8A
Publication of CN117234594A
Legal status: Pending


Abstract

The application discloses a branch prediction method, an electronic device, and a storage medium. The branch prediction method comprises: obtaining an instruction address and a history execution value of a global history register; determining an index value from the instruction address and the history execution value; determining a target branch prediction component from the index value and a branch prediction component table, wherein the branch prediction component table contains accurate predictor tags, branch target buffer tags, and prefetch target buffer tags, and the accurate predictor comprises a TAGE-class branch predictor and a bias-free neural predictor; and updating the currently determined target branch prediction component and the branch prediction component table. With the technical scheme of the application, the power consumption of a high-performance hybrid branch predictor can be reduced while a high-accuracy branch prediction result is ensured.

Description

Branch prediction method, electronic device and storage medium
Technical Field
The present application relates generally to the field of computer technology. More particularly, the present application relates to a branch prediction method, an electronic device, and a storage medium.
Background
Microprocessors and Very Long Instruction Word (VLIW) architectures employ Instruction Level Parallelism (ILP) to achieve high performance. ILP executes multiple instructions of a program simultaneously in one CPU cycle (clock cycle) to speed up memory references and computation, thereby improving processor performance. One limitation on ILP is the high frequency of branch instructions: 15% to 25% of the instructions in a program are branch instructions. Existing processors therefore employ some form of branch prediction, in which the branching behavior of branch instructions is predicted in the pipeline; by speculatively executing the predicted branch ahead of time on otherwise idle pipeline resources, execution time can be saved. Branch prediction is thus an important technique for improving the performance of general-purpose processors.
Existing research on branch prediction architectures has focused on the two most advanced major variants: the TAGE class and the perceptron class. The TAGE predictor is built on the observation that different branches achieve higher prediction accuracy with different history lengths, while perceptron-class predictors use learning algorithms to capture the correlation between branch history registers and branch outcomes. To further reduce prediction errors, the prior art has attempted to apply machine learning algorithms such as CNNs to branch prediction, but these require more latency to produce a prediction. In addition, hybrid predictors are a common technique in modern processors: a Branch Prediction Unit (BPU) is a component that combines a TAGE-class branch predictor with a perceptron-class branch predictor. However, the power consumption of the BPU in modern superscalar processors is quite significant, especially for high-performance hybrid branch predictors. For many applications it is not necessary to use every branch predictor in the BPU, and running all branch predictors simultaneously results in significant wasted power consumption.
In view of the foregoing, it is desirable to provide an innovative branch prediction method to reduce the power consumption of a high performance hybrid branch predictor and ensure high accuracy branch prediction results.
Disclosure of Invention
To solve at least one or more of the technical problems mentioned above, the present application proposes a branch prediction method, an electronic device, and a storage medium. The branch prediction method can reduce the power consumption of the high-performance hybrid branch predictor and ensure the branch prediction result with high accuracy.
In a first aspect, the present application provides a branch prediction method comprising: acquiring an instruction address and a history execution value of a global history register; determining an index value according to the instruction address and the history execution value; determining a target branch prediction component according to the index value and a branch prediction component table; and updating the currently determined target branch prediction component and the branch prediction component table.
In some embodiments, determining the index value from the instruction address and the history execution value includes: performing an exclusive OR operation on the instruction address and the history execution value to obtain the index value.
In some embodiments, determining the target branch prediction component from the index value and the branch prediction component table comprises: determining a branch predictor tag in the branch prediction component table based on the index value; and determining the target branch prediction component based on the branch predictor tag.
In some embodiments, in updating the currently determined target branch prediction component and the branch prediction component table, updating the branch prediction component table includes: determining a judgment value according to a history prediction component record and a preset record weight set; and updating the branch prediction component table according to the judgment value and a preset comparison value.
In some embodiments, determining the judgment value from the history prediction component record and the preset record weight set comprises: multiplying the n record values in the history prediction component record by the n weight values in the preset record weight set, respectively and in order, and adding the products to obtain the judgment value; wherein the n record values in the history prediction component record are arranged in recording time order, the n weight values in the preset record weight set are arranged from small to large, and n is a positive integer.
In some embodiments, the branch prediction component table contains accurate predictor tags, branch target buffer tags, and prefetch target buffer tags; the accurate predictor comprises a TAGE-class branch predictor and a bias-free neural predictor; and updating the branch prediction component table according to the judgment value and the preset comparison value comprises: if the judgment value is greater than or equal to the preset comparison value, updating the branch predictor tag corresponding to the current index value to the branch target buffer tag; if the judgment value is smaller than the preset comparison value, updating the branch predictor tag corresponding to the current index value to the accurate predictor tag.
In some embodiments, in updating the currently determined target branch prediction component and the branch prediction component table, updating the currently determined target branch prediction component includes: updating the prediction data table in the currently determined target branch prediction component.
In some embodiments, after updating the currently determined target branch prediction component and the branch prediction component table, the method further comprises: re-executing the steps from acquiring the instruction address and the history execution value of the global history register through determining the target branch prediction component according to the index value and the branch prediction component table, so as to determine the target branch prediction component corresponding to the next instruction.
In a second aspect, the present application provides an electronic device comprising:
a processor; and a memory having program code stored thereon for branch prediction, which when executed by the processor, causes the electronic device to implement the method as described above.
In a third aspect, the present application provides a non-transitory machine-readable storage medium having stored thereon program code for branch prediction, which when executed by a processor, is capable of implementing the method as described above.
The technical scheme provided by the application can comprise the following beneficial effects:
according to the branch prediction method, the electronic equipment and the storage medium, the index value is determined according to the instruction address and the history execution value by acquiring the instruction address and the history execution value of the global history register, and then the target branch prediction component is determined according to the index value and the branch prediction component table. Therefore, the corresponding target branch prediction component can be selected for branch prediction according to the current instruction, and the problem that many useless power consumption is caused by running all branch predictors simultaneously is solved.
Further, the application can update the currently determined target branch prediction component and branch prediction component table, so that when the next instruction needs to perform branch prediction, the corresponding target branch prediction component can be more accurately matched, and the accuracy of the branch prediction result is improved.
In general, the technical scheme of the application can reduce the power consumption of the high-performance hybrid branch predictor and ensure the branch prediction result with high accuracy.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. In the drawings, embodiments of the application are illustrated by way of example and not by way of limitation, and like reference numerals refer to similar or corresponding parts and in which:
FIG. 1 illustrates an exemplary flow chart of a branch prediction method of some embodiments of the application;
FIG. 2 illustrates an exemplary flow chart of a branch prediction method of other embodiments of the present application;
FIG. 3 illustrates an exemplary flow chart of a branch prediction method of further embodiments of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. For simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. Furthermore, the application has been set forth in numerous specific details in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the embodiments described herein. Moreover, this description should not be taken as limiting the scope of the embodiments described herein. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that the possible terms "first" or "second" and the like in the claims, specification and drawings of the present disclosure are used for distinguishing between different objects and not for describing a particular sequential order. The terms "comprises" and "comprising" when used in the specification and claims of the present application are taken to specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification and claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the term "and/or" as used in the present specification and claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the claims, the term "if" may be interpreted, depending on the context, as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Specific embodiments of the present application are described in detail below with reference to the accompanying drawings.
FIG. 1 illustrates an exemplary flow chart of a branch prediction method of some embodiments of the application. Referring to fig. 1, a branch prediction method according to an embodiment of the present application may include:
In step S101, the instruction address and the history execution value of the global history register are acquired. The instruction address (PC) is the address of the current instruction to be executed; the program counter has a direct path to the memory address register (MAR) and increments automatically to form the address of the next instruction. The Global History Register (GHR), also known as the global branch history register or global history shift register, tracks the outcomes of previously executed branch instructions. The branch history stored in the GHR summarizes the sequence of branch instructions encountered on the code path leading to the currently executed branch instruction, i.e., the history execution value. Assume the history execution value is 32 bits, where each bit records whether a branch was taken: 1 indicates taken and 0 indicates not taken. It will be appreciated that the width of the history execution value depends on the actual application, and the application is not limited in this respect.
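The shift-register behavior described above can be sketched as follows: a minimal Python sketch assuming a 32-bit register with the newest outcome shifted into the low bit; the constant and function names are illustrative, not from the patent.

```python
GHR_BITS = 32
GHR_MASK = (1 << GHR_BITS) - 1

def update_ghr(ghr: int, taken: bool) -> int:
    """Shift the newest branch outcome into the low bit of the
    32-bit global history register (1 = taken, 0 = not taken)."""
    return ((ghr << 1) | int(taken)) & GHR_MASK

# Record four branch outcomes: taken, not taken, taken, taken.
history = 0
for outcome in (True, False, True, True):
    history = update_ghr(history, outcome)
# history == 0b1011
```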
In step S102, an index value is determined from the instruction address and the history execution value. In an embodiment of the present application, the index value described above may be used to index the data item sequence numbers in the branch prediction unit table.
In step S103, a target branch prediction component is determined from the index value and the branch prediction component table. Illustratively, the branch prediction component table may include, but is not limited to, accurate predictor tags, branch target buffer tags, and prefetch target buffer tags. It will be appreciated that the accurate predictor tag maps to the accurate predictor, the branch target buffer tag maps to the branch target buffer, and the prefetch target buffer tag maps to the prefetch target buffer, so that once the index value selects the corresponding tag in the branch prediction component table, it can be determined whether the target branch prediction component is the accurate predictor, the branch target buffer, or the prefetch target buffer.
In an embodiment of the application, the aforementioned Accurate Predictor (APD) is one of the constituent parts of the Branch Prediction Unit (BPU), and may include a TAGE-class branch predictor (BATAGE) and a bias-free neural predictor (BFNP). BATAGE (proposed in Michaud, P.: An Alternative TAGE-like Conditional Branch Predictor. Vol. 15, Association for Computing Machinery, New York, NY, USA (2018)) is a variant of the TAGE-class branch predictor that uses two counters to record the number of taken and not-taken occurrences of a branch, instead of the single up/down counter in TAGE. BATAGE uses Bayesian confidence estimation to weigh confidence and optimize the allocation of tagged entries. BFNP filters biased branches and redundant branch instances out of the history, allowing the predictor to find correlated branches in the program's execution history without a large storage budget; compared with other perceptron predictor variants, it has the significant advantage of high performance and low resource usage.
In embodiments of the application, the accurate predictor and the branch target buffer (BTB) may be decoupled from the instruction cache (I-cache) to maximize performance, which requires a prefetch target buffer (Fetch Target Buffer, FTB). The return address stack (RAS) is a stack memory structure, implemented as a register file, that records the address of the instruction following a call instruction: an entry is pushed when the FTB predicts that a block jumps on a call instruction, and popped when the FTB predicts that a block jumps on a ret instruction. Each entry contains an address and a counter; when the same address is pushed repeatedly, the stack pointer is unchanged and the counter is incremented by one, which handles recursive calls in a program. After each prediction, the top-of-stack entry and the stack pointer are stored in a storage structure of the fetch target queue (FTQ), for recovery in the event of a misprediction. Embodiments of the application may use the FTB and the RAS together. To maintain high FTB performance, the data items in the FTB may be indexed by a start address (start), which is the starting address of the instruction block recorded in a data item, where an instruction block contains information on multiple instructions. The start address is generated in the prediction pipeline and, in practice, generally follows one of two rules: start is the end address (end) of the previous predicted block, or start is the target address of a redirection from outside the BPU. At most two branch instructions are recorded in an FTB data item.
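The push/pop-with-counter behavior of the RAS described above can be sketched as follows. This is a simplified Python model under the stated rules; the class and method names are illustrative, and a real RAS is a fixed-depth register file with misprediction recovery, which this sketch omits.

```python
class ReturnAddressStack:
    """Push on call, pop on ret; pushing the same address repeatedly
    increments a counter instead of growing the stack, which keeps the
    stack pointer unchanged for recursive calls."""

    def __init__(self):
        self.entries = []  # each entry is [return_address, counter]

    def push(self, ret_addr):
        if self.entries and self.entries[-1][0] == ret_addr:
            self.entries[-1][1] += 1  # recursive call to the same site
        else:
            self.entries.append([ret_addr, 1])

    def pop(self):
        addr, count = self.entries[-1]
        if count > 1:
            self.entries[-1][1] -= 1  # unwind one recursion level
        else:
            self.entries.pop()
        return addr

ras = ReturnAddressStack()
ras.push(0x100)
ras.push(0x100)  # recursion: counter bumps, stack depth stays 1
ras.push(0x200)
first = ras.pop()   # returns 0x200
second = ras.pop()  # returns 0x100, counter drops to 1
```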
In step S104, the currently determined target branch prediction component and the branch prediction component table are updated. In the embodiment of the application, they can be updated based on the history of previously selected branch predictors, so that when the next instruction requires branch prediction, a corresponding target branch prediction component can be matched more accurately, improving the accuracy of the branch prediction result.
According to the branch prediction method, the electronic device, and the storage medium of the application, the instruction address and the history execution value of the global history register are obtained, the index value is determined from them, and the target branch prediction component is then determined from the index value and the branch prediction component table. A suitable target branch prediction component can therefore be selected for each instruction, avoiding the significant wasted power consumption caused by running all branch predictors simultaneously. Further, the application updates the currently determined target branch prediction component and the branch prediction component table, so that when the next instruction requires branch prediction, a corresponding target branch prediction component can be matched more accurately, improving the accuracy of branch prediction. Overall, the technical scheme of the application reduces the power consumption of a high-performance hybrid branch predictor while ensuring high-accuracy branch prediction results.
In some embodiments, the determination of the target branch prediction component may be further designed. The determination steps of the target branch prediction unit will be described in detail below in conjunction with FIG. 2. FIG. 2 is a flow chart illustrating an exemplary method of branch prediction according to further embodiments of the present application, and referring to FIG. 2, the method of branch prediction according to embodiments of the present application may include:
In step S201, the instruction address and the history execution value of the global history register are acquired. In the embodiment of the application, the content of step S201 is substantially the same as that of step S101 and is not repeated here.
In step S202, the instruction address and the history execution value are XORed to obtain the index value. Exclusive OR is also called half addition; it is equivalent to binary addition without carry. The binary value obtained by the XOR is then interpreted as a natural number, giving the index value.
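The index computation of step S202 can be sketched as follows: a minimal Python sketch in which folding the XOR result into a fixed table size via a modulo is an assumption, since the patent does not specify the table dimensions.

```python
TABLE_ENTRIES = 256  # assumed table size; the patent does not fix one

def compute_index(pc: int, ghr_value: int) -> int:
    """XOR the instruction address with the history execution value,
    then fold the result into the table's index range."""
    return (pc ^ ghr_value) % TABLE_ENTRIES

# Example: pc = 0x4ACC, history value = 0b1011
idx = compute_index(0x4ACC, 0b1011)  # 0x4AC7 % 256
```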
In step S203, a branch predictor tag is determined in the branch prediction component table based on the index value. In embodiments of the application, the index value may be used to index a data item sequence number in the branch prediction component table, so that the branch predictor tag corresponding to that data item can be determined. Assuming an index value of 8, the branch predictor tag of data item 8 is indexed in the branch prediction component table. It will be appreciated that the specific index value depends on the actual application, and the application is not limited in this respect.
In step S204, the target branch prediction component is determined based on the branch predictor tag. In embodiments of the application, the target branch prediction component may be determined by mapping the branch predictor tag. Illustratively, a tag of 00 maps to using only the FTB, without any branch prediction component, to ensure high processor front-end performance; a tag of 01 maps to using the Branch Target Buffer (BTB) as the target branch prediction component; a tag of 10 maps to using the Accurate Predictor (APD) as the target branch prediction component. In this way, the energy consumption of accessing multiple branch predictors is reduced while high branch prediction performance is maintained. It will be appreciated that the branch predictor tags can be encoded in various ways; in practice, the encoding must be determined according to the actual application, and the application is not limited in this respect.
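The tag-to-component mapping of step S204 can be sketched as follows. The two-bit encodings 00/01/10 follow the text; the table layout and the names are illustrative.

```python
# Two-bit tags as described in the text:
#   00 -> use only the FTB, 01 -> BTB, 10 -> accurate predictor (APD)
TAG_TO_COMPONENT = {0b00: "FTB_ONLY", 0b01: "BTB", 0b10: "APD"}

def select_component(tag_table, index):
    """Look up the tag at the indexed entry and map it to a component."""
    return TAG_TO_COMPONENT[tag_table[index]]

tag_table = [0b00, 0b01, 0b10, 0b01]
component = select_component(tag_table, 2)  # tag 0b10 -> "APD"
```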
In some embodiments, the update of the currently determined target branch prediction component and the branch prediction component table, performed after the processor back-end finishes executing the current instruction, may be further designed. The update process is described in detail below in conjunction with FIG. 3. FIG. 3 illustrates an exemplary flow chart of a branch prediction method according to still further embodiments of the application. Referring to FIG. 3, the branch prediction method of an embodiment of the application may include:
In step S301, a judgment value is determined from the history prediction component record and the preset record weight set. In the embodiment of the application, specifically, the n record values in the history prediction component record may be multiplied, in order, by the n weight values in the preset record weight set, and the products added to obtain the judgment value. The n record values in the history prediction component record are arranged in recording time order, and the weight values in the preset record weight set are arranged from small to large: among the n weight values, the n-th weight value is greater than the (n-i)-th, the (n-i)-th is greater than the (n-2i)-th, and so on, down to the (n-ki)-th weight value being greater than the first, where n, k, and i are positive integers and n-ki is greater than 1.
It will be appreciated that, since the n record values in the history prediction component record are arranged in recording time order, the n-th record value stores the most recent branch prediction component usage, and recent records have more influence on the next component selection than earlier ones; the n weight values in the preset record weight set therefore increase in order. In particular, when setting the n weight values, the same weight may be given to every i consecutive record values, to prevent the newest record values from having an excessive impact on the selection. Assuming i equals 2, for example, the weights of the (n-1)-th and n-th record values may be equal.
As an example, assume the 8 record values in the history prediction component record are (1, 0, 1, 1, 0, 1, 0, 0), where 1 indicates that the BTB was used and 0 indicates that the APD was used. The preset record weight set may be (0.05, 0.05, 0.1, 0.1, 0.15, 0.15, 0.2, 0.2). Multiplying the 8 record values by the 8 weights element-wise and summing gives the judgment value, here 0.05 + 0.1 + 0.1 + 0.15 = 0.4.
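The worked example above can be reproduced as follows; the record and weight values are taken directly from the example in the text.

```python
# 1 = BTB was used, 0 = APD was used; oldest record first.
records = [1, 0, 1, 1, 0, 1, 0, 0]
# Weights arranged from small to large, equal within pairs (i = 2).
weights = [0.05, 0.05, 0.1, 0.1, 0.15, 0.15, 0.2, 0.2]

# Element-wise multiply and sum to obtain the judgment value.
judgment = sum(r * w for r, w in zip(records, weights))
# judgment is approximately 0.4 (0.05 + 0.1 + 0.1 + 0.15)
```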
In step S302, the branch prediction component table is updated according to the judgment value and the preset comparison value. In the embodiment of the application, if the judgment value is greater than or equal to the preset comparison value, the branch predictor tag corresponding to the current index value is updated to the branch target buffer tag, e.g., 01. If the judgment value is smaller than the preset comparison value, the branch predictor tag corresponding to the current index value is updated to the accurate predictor tag, e.g., 10.
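The threshold update of step S302 can be sketched as follows: a Python sketch in which the preset comparison value of 0.5 is an assumed placeholder, since the patent leaves the comparison value unspecified.

```python
BTB_TAG, APD_TAG = 0b01, 0b10
THRESHOLD = 0.5  # assumed preset comparison value; not fixed by the patent

def updated_tag(judgment: float) -> int:
    """A judgment value at or above the comparison value selects the
    BTB tag; below it, the accurate predictor tag."""
    return BTB_TAG if judgment >= THRESHOLD else APD_TAG

tag_table = [0b00] * 16
tag_table[8] = updated_tag(0.4)  # 0.4 < 0.5, so the APD tag is written
```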
In particular, embodiments of the application do not use a branch prediction component for fall-through instructions recorded in the FTB. For call instructions in the FTB and instructions that have used the RAS, the tag corresponding to the current index value must be set to 00.
In step S303, the prediction data table in the currently determined target branch prediction component is updated. In the embodiment of the application, updating the currently determined target branch prediction component may specifically mean updating its prediction data table. Exemplary update methods are described in: Michaud, P.: An Alternative TAGE-like Conditional Branch Predictor. Vol. 15, Association for Computing Machinery, New York, NY, USA (2018); Gope, D., Lipasti, M.H.: Bias-Free Branch Predictor. In: 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 521-532 (2014); and Reinman, G., Austin, T., Calder, B.: A scalable front-end architecture for fast instruction delivery. ACM SIGARCH Computer Architecture News 27(2), 234-245 (1999).
It may be understood that steps S301 to S302 constitute the process of updating the branch prediction component table, while step S303 is the process of updating the currently determined target branch prediction component. In practice, steps S301 to S302 and step S303 may be executed sequentially or in parallel; the execution order must be determined according to the actual application, and the application is not limited in this respect.
In step S304, the steps from obtaining the instruction address and the history execution value of the global history register through determining the target branch prediction component from the index value and the branch prediction component table are re-executed, so as to determine the target branch prediction component corresponding to the next instruction. In this way, when the next instruction requires branch prediction, the corresponding target branch prediction component can be matched more accurately, improving the accuracy of the branch prediction result.
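The lookup path that is re-executed for each instruction (claims 2 and 3) can be sketched as follows. This is a hedged illustration: the table size, tag encodings, and component names are assumptions, and a hardware implementation would of course realize this in logic rather than software.

```python
# Hypothetical sketch of the per-instruction lookup (claims 2-3):
# the index is the XOR of the instruction address with the global
# history register value, and the tag stored at that index selects
# the branch prediction component. Table size and tag encodings are
# illustrative assumptions.

TABLE_BITS = 10
PRECISE, BTB, FTB = 0b10, 0b01, 0b00

def index_of(pc, ghr):
    """Claim 2: index value = instruction address XOR history value."""
    return (pc ^ ghr) & ((1 << TABLE_BITS) - 1)

def select_component(table, pc, ghr):
    """Claim 3: the branch predictor tag chooses the target component."""
    tag = table[index_of(pc, ghr)]
    return {PRECISE: "accurate predictor (TAGE-like / bias-free neural)",
            BTB: "branch target buffer",
            FTB: "prefetch target buffer"}[tag]

table = [FTB] * (1 << TABLE_BITS)
table[index_of(0x400, 0b1011)] = BTB
print(select_component(table, 0x400, 0b1011))  # branch target buffer
```

Because the index folds in global history, the same static branch can be routed to different components depending on the path taken to reach it, which is what lets the scheme keep the accurate predictor powered down for easy branches.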
Corresponding to the foregoing method embodiments, the present application also provides an electronic device for executing the branch prediction method, and corresponding embodiments thereof.
Fig. 4 shows a block diagram of a hardware configuration of an electronic device 400 that may implement the branch prediction method of an embodiment of the present application. As shown in Fig. 4, the electronic device 400 may include a processor 410 and a memory 420. In the electronic device 400 of Fig. 4, only the constituent elements related to the present embodiment are shown. It will therefore be apparent to those of ordinary skill in the art that the electronic device 400 may also include common constituent elements other than those shown in Fig. 4, such as a fixed-point arithmetic unit.
The electronic device 400 may correspond to a computing device having various processing functions, such as generating a neural network, training or learning a neural network, quantizing a floating-point neural network into a fixed-point neural network, or retraining a neural network. For example, the electronic device 400 may be implemented as various types of devices, such as a personal computer (PC), a server device, or a mobile device.
The processor 410 controls all functions of the electronic device 400, for example by executing programs stored in the memory 420. The processor 410 may be implemented by a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), an Application Processor (AP), an artificial intelligence processor chip (IPU), or the like provided in the electronic device 400. However, the present application is not limited thereto.
In some embodiments, processor 410 may include an input/output (I/O) unit 411 and a computing unit 412. The I/O unit 411 may be used to receive various data such as instruction addresses and historical execution values of global history registers. Illustratively, the computing unit 412 may be configured to determine an index value based on the instruction address received via the I/O unit 411 and the historical execution value of the global history register, and further determine the target branch prediction unit based on the index value and the branch prediction unit table. This target branch prediction component may be output by I/O unit 411, for example. The output data may be provided to memory 420 for reading by other devices (not shown) or may be provided directly to other devices for use.
The memory 420 is hardware for storing various data processed in the electronic device 400. For example, the memory 420 may store processed data and data to be processed in the electronic device 400, such as data sets involved in the branch prediction method that have been or are to be processed by the processor 410, e.g., instruction addresses and history execution values of the global history register. Further, the memory 420 may store applications, drivers, and the like to be run by the electronic device 400; for example, it may store various programs related to the branch prediction method to be executed by the processor 410. The memory 420 may be a DRAM, but the present application is not limited thereto. The memory 420 may include at least one of volatile memory and nonvolatile memory. The nonvolatile memory may include Read-Only Memory (ROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), flash memory, Phase-change RAM (PRAM), Magnetic RAM (MRAM), Resistive RAM (RRAM), Ferroelectric RAM (FRAM), and the like. The volatile memory may include Dynamic RAM (DRAM), Static RAM (SRAM), Synchronous DRAM (SDRAM), and the like. In an embodiment, the memory 420 may include at least one of a Hard Disk Drive (HDD), a Solid-State Drive (SSD), a CompactFlash (CF) card, a Secure Digital (SD) card, a Micro-SD card, a Mini-SD card, an eXtreme Digital (xD) card, a cache, or a memory stick.
In summary, the specific functions implemented by the memory 420 and the processor 410 of the electronic device 400 provided in the embodiments of the present disclosure may be understood with reference to the foregoing embodiments, and can achieve the technical effects of those embodiments, which will not be repeated here.
In this embodiment, the processor 410 may be implemented in any suitable manner. For example, the processor 410 may take the form of, for example, a microprocessor or processor, and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, and an embedded microcontroller, among others.
It should also be appreciated that any module, unit, component, server, computer, terminal, or device that executes instructions illustrated herein may include or otherwise access a computer-readable medium, such as a storage medium, computer storage medium, or data storage device (removable and/or non-removable) such as a magnetic disk, optical disc, or magnetic tape. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data.
While various embodiments of the present application have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous modifications, changes, and substitutions will occur to those skilled in the art without departing from the spirit and scope of the application. It should be understood that various alternatives to the embodiments of the application described herein may be employed in practicing the application. The appended claims are intended to define the scope of the application and are therefore to cover all equivalents or alternatives falling within the scope of these claims.

Claims (10)

1. A method of branch prediction, comprising:
acquiring an instruction address and a history execution value of a global history register;
determining an index value according to the instruction address and the historical execution value;
determining a target branch prediction component from the index value and a branch prediction component table;
and updating the currently determined target branch prediction component and the branch prediction component table.
2. The branch prediction method of claim 1, wherein said determining an index value from said instruction address and said historical execution value comprises:
and performing exclusive OR operation on the instruction address and the history execution value to obtain the index value.
3. The branch prediction method of claim 1, wherein said determining a target branch prediction component from said index value and branch prediction component table comprises:
determining a branch predictor tag in the branch prediction component table based on the index value;
the target branch prediction component is determined from the branch predictor tag.
4. The branch prediction method of claim 3, wherein, in said updating the currently determined target branch prediction component and the branch prediction component table, updating the branch prediction component table comprises:
determining a judgment value according to the history prediction component record and a preset record weight set;
and updating the branch prediction component table according to the judgment value and a preset comparison value.
5. The branch prediction method according to claim 4, wherein determining the judgment value based on the history prediction component record and the preset record weight set comprises:
multiplying the n record values in the history prediction component record by the n weight values in the preset record weight set in order, respectively, and summing the products to obtain the judgment value;
wherein the n record values in the history prediction component record are arranged in recording-time order, the n weight values in the preset record weight set are arranged in ascending order, and n is a positive integer.
6. The branch prediction method of claim 4, wherein the branch prediction component table comprises an accurate predictor tag, a branch target buffer tag, and a prefetch target buffer tag; the accurate predictor comprises a TAGE-class branch predictor and a bias-free neural predictor; and wherein updating the branch prediction component table according to the judgment value and the preset comparison value comprises:
if the judgment value is greater than or equal to the preset comparison value, updating the branch predictor tag corresponding to the current index value to the branch target buffer tag;
and if the judgment value is smaller than the preset comparison value, updating the branch predictor tag corresponding to the current index value to the accurate predictor tag.
7. The branch prediction method of claim 1, wherein, in said updating the currently determined target branch prediction component and the branch prediction component table, updating the currently determined target branch prediction component comprises:
updating the prediction data table in the currently determined target branch prediction component.
8. The branch prediction method of claim 1, wherein, after said updating the currently determined target branch prediction component and the branch prediction component table, the method further comprises:
re-executing the steps from said obtaining an instruction address and a history execution value of a global history register through said determining a target branch prediction component from the index value and the branch prediction component table, so as to determine the target branch prediction component corresponding to the next instruction.
9. An electronic device, comprising:
a processor; and
a memory having program code stored thereon for branch prediction, which when executed by the processor, causes the electronic device to implement the method of any of claims 1-8.
10. A non-transitory machine readable storage medium having stored thereon program code for branch prediction, which when executed by a processor, causes the method of any of claims 1-8 to be implemented.
CN202311106975.8A 2023-08-29 2023-08-29 Branch prediction method, electronic device and storage medium Pending CN117234594A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311106975.8A CN117234594A (en) 2023-08-29 2023-08-29 Branch prediction method, electronic device and storage medium


Publications (1)

Publication Number Publication Date
CN117234594A true CN117234594A (en) 2023-12-15

Family

ID=89083571




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination