CN114003292B - Branch prediction method and device and processor core - Google Patents

Branch prediction method and device and processor core

Info

Publication number
CN114003292B
Authority
CN
China
Prior art keywords
record
attribute value
ghr
branch prediction
history information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111636402.7A
Other languages
Chinese (zh)
Other versions
CN114003292A (en)
Inventor
郑添
蔡刚
黄志洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ehiway Microelectronic Science And Technology Suzhou Co ltd
Original Assignee
Ehiway Microelectronic Science And Technology Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ehiway Microelectronic Science And Technology Suzhou Co ltd
Priority to CN202111636402.7A
Publication of CN114003292A
Application granted
Publication of CN114003292B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836: Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842: Speculative instruction execution
    • G06F9/3844: Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables

Abstract

The invention provides a branch prediction method, a branch prediction device, and a processor core. When a clock rising edge arrives, bit segments of n different lengths are taken from the GHR, and the PC value is hashed with each of the n GHR segments to generate indexes. When the clock falling edge arrives, the basic branch prediction unit T0 is accessed using the PC value, and the history information record tables corresponding to the GHR segments of different lengths are accessed using the generated indexes. When a history information record table contains a record that matches the index and the u value of that record is 1, the weight value of the record is output as the access result. All output weight attribute values are added, and prediction information is output according to the sum. By using GHR segments of different lengths and adding weight information to the history information record tables, the global history jump information in the GHR is brought into the prediction. Because the history lengths used are short, the added circuit resource cost is low, so higher prediction accuracy can be achieved with less hardware resource consumption.

Description

Branch prediction method and device and processor core
Technical Field
The present invention relates to the field of processors, and in particular, to a branch prediction method, an apparatus, and a processor core.
Background
A superscalar, superpipelined structure is generally adopted in modern processors to improve the utilization of hardware resources and increase CPU speed, but branch instructions limit, to some extent, the performance gain that such a structure can deliver. To keep the pipeline flowing and reduce the delay caused by branch instructions, processor designs usually employ a branch prediction device that predicts whether a branch instruction will be taken, so that subsequent instructions can be processed in advance.
However, when the branch prediction unit mispredicts, the instructions fetched ahead of time must be flushed from the pipeline, which stalls the pipeline. The prediction accuracy of the branch prediction device therefore has a great influence on processor performance.
Existing branch prediction devices mainly improve prediction accuracy by enlarging the Branch History Register (BHR) and the Pattern History Table (PHT), but this comes at a high hardware cost. The overhead is even greater for a global BHR, also called the GHR (Global History Register), and its PHT. Furthermore, simply increasing the capacities of the GHR and PHT yields limited improvement in prediction accuracy, mainly because of the branch aliasing problem. Gshare, the branch predictor most commonly used in processors, accesses the PHT using the XOR of the instruction address and the BHR as the index. This mitigates branch aliasing well, but the usable history length is limited and the hardware overhead grows exponentially with the history length. As research on branch prediction deepened, academia introduced neural networks for branch prediction. The Perceptron-based branch predictor uses history information through a weighted sum, so its hardware overhead is linear in the history length; longer histories can be used and prediction accuracy improves. However, the Perceptron branch predictor must learn continuously to adapt to changing inputs, its prediction accuracy is low during the learning period, and it requires a wide adder, which greatly increases computation delay. The TAGE branch predictor uses a PPM (Prediction by Partial Matching) algorithm; its hardware overhead is logarithmic in the branch history length, so it can use even longer history information and further improves prediction accuracy.
However, the TAGE branch predictor requires multiple prediction components with different history lengths, which increases hardware overhead and makes the implementation complex.
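For reference, the Gshare baseline described above can be sketched as follows. This is a minimal illustration and not code from the patent: the class name `Gshare`, the table size `INDEX_BITS`, and the initial counter value are assumptions chosen for the example.

```python
INDEX_BITS = 10  # assumed PHT size: 2**10 two-bit counters


class Gshare:
    """Gshare: index a table of 2-bit saturating counters with PC XOR GHR."""

    def __init__(self, index_bits=INDEX_BITS):
        self.mask = (1 << index_bits) - 1
        self.pht = [1] * (1 << index_bits)  # assumed start: weakly not taken
        self.ghr = 0  # global history register, newest outcome in bit 0

    def _index(self, pc):
        # XOR of instruction address and global history, cut to table size.
        return (pc ^ self.ghr) & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 mean "taken".
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        # Shift the actual outcome into the history.
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask
```

Because the table has one counter per possible index value, each extra history bit used in the index doubles the table size, which is the exponential overhead the background refers to.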
Disclosure of Invention
The technical problem to be solved by the invention is how to improve branch prediction accuracy by using a longer history length without increasing hardware overhead. To this end, the invention provides a branch prediction method, a branch prediction device, and a processor core.
To solve this technical problem, the invention adopts the following technical solution:
A branch prediction method, comprising the following steps:
Step 1: when a clock rising edge arrives, take bit segments of n different lengths from the GHR, and hash the current instruction address (PC value) with each of the n GHR segments to generate n history information indexes;
Step 2: when a clock falling edge arrives, access the basic branch prediction unit T0 using the current instruction address (PC value), and access the history information record tables T1-Tn, which correspond to the n GHR segments of different lengths, using the n generated history information indexes respectively. The basic branch prediction unit T0 consists of a plurality of two-bit saturating counters. Each record in each history information record table includes a tag attribute value used for matching against the history information index, a pred attribute value that provides the branch prediction direction, a u attribute value that indicates whether the branch prediction entry is valid, and a weight attribute value that indicates the branch jump strength;
when a history information record table among T1-Tn contains a record that matches the history information index and the u attribute value of that record is 1, the weight attribute value of the record is output as the access result of that table;
Step 3: add all the weight attribute values output by the history information record tables. When the next clock falling edge arrives, if the sum is greater than 0, the prediction information output is taken; if the sum is less than 0, the prediction information output is not taken; if the sum is equal to 0, the prediction information output is the output of the basic branch prediction unit T0.
Further, the method also includes step 4: after the prediction information is output, update the weight attribute value of the record that matches the history information index in each history information record table T1-Tn. The update rule is: when the highest bit of the pred attribute value of the matching record is 1, the weight attribute value of the record is the number of 1 bits in the GHR minus the number of 0 bits in the GHR; when the highest bit of the pred attribute value is 0, the weight attribute value of the record is the number of 0 bits in the GHR minus the number of 1 bits in the GHR.
Further, the method also includes step 5: after the processor core finishes executing according to the output prediction information and a feedback signal is obtained, update the pred attribute value of the record that matches the history information index in each history information record table T1-Tn. If the actual jump result in the feedback signal is taken, the pred value is incremented by 1; otherwise it is decremented by 1. Once the pred attribute value reaches its upper or lower limit, it no longer changes.
Further, when the tag attribute value of a record in the history information record tables T1-Tn is successfully matched with the history information index, the u attribute value of that record is set to 1; otherwise, it is set to 0.
Further, the number of record entries in each history information record table is the same as the number of bits of the corresponding length taken from the GHR; the tag attribute value of a record corresponds to the history flag address to which a one-bit jump result in the GHR is mapped; and the initial values of the pred, u, and weight attribute values of a record are 0.
Further, in step 1, bit segments of n different lengths are taken from the GHR, and each length satisfies:
$L(i) = \mathrm{int}\left(\alpha^{i-1} \cdot L(1) + 0.5\right)$
where L(1) is the number of bits of the 1st length taken from the GHR, L(i) is the number of bits of the i-th length taken from the GHR, and i takes values in the range 1 < i < n, which ensures that the minimum bit width taken from the GHR is 1; int() is the integer-part function; and α is user-defined, with 0 < α < 1.
The invention also provides a branch prediction device comprising a branch direction generation component and a branch target generation component, wherein the branch direction generation component generates the branch direction using the branch prediction method described above.
The invention also provides a processor core comprising a branch prediction unit, an instruction fetch unit, a decode unit, and an execution unit, wherein the branch prediction unit uses the branch prediction device described above.
By adopting the above technical solution, the invention has the following beneficial effects:
The branch prediction method, device, and processor core make full use of both the rising and falling edges of the clock. The PC value and the GHR value are obtained at the clock rising edge, bit segments of different lengths are taken from the GHR, and the n GHR segments are hashed with the PC value to obtain the history information indexes. At the clock falling edge, the n history information record tables corresponding to the GHR segments are accessed, and the weight values of the matched history information, which reflect jump strength, are output. In the next clock cycle, all weight values output by the n history information record tables are summed, and branch prediction information is output at the next clock edge according to the sum. Branch prediction is thus implemented as a two-stage pipeline, and the operating frequency is high.
The invention adds weight information to the history information record tables. This overcomes a defect of the prior art, which relies on pred alone: pred stops accumulating once it reaches its upper limit and can only represent the jump information of the most recently matched tag, so it cannot accurately reflect jump strength. With the weight added, the global history jump information in the GHR is brought in, so that the prediction draws on both the most recently matched information and the global information associated with the jump instruction. The jump prediction result can therefore be made more accurate.
Compared with static branch prediction, which provides a jump strategy directly, the invention uses history information record tables and records history matching information through the u attribute value, which gives a higher operating frequency.
The history lengths used by T1-Tn in the invention decrease exponentially, unlike the conventional logarithmic increase. Because the history lengths used are short, the subsequent adder tree performs low-bit-width additions and adds little circuit resource cost, so higher prediction accuracy can be achieved with less hardware resource consumption.
In addition, during prediction, when the history information cannot be matched, that is, when the history information record tables T1-Tn contain no matching information, the prediction is output by the T0 component. This avoids the prediction failures that would otherwise occur early on, when the weight values are 0, and improves prediction accuracy.
Drawings
FIG. 1 is a system flow diagram of the present invention;
FIG. 2 is a schematic diagram of two-stage pipelined prediction information generation;
FIG. 3 is a schematic diagram of the operation of predicting information at rising and falling clock edges;
FIG. 4 is a diagram of how values are taken from the GHR;
FIG. 5 is a diagram of a branch prediction apparatus;
FIG. 6 is a schematic diagram of branch target generation;
FIG. 7 is a diagram of a processor core assembly;
FIG. 8 is a diagram illustrating the connection relationship between the units of the processor core.
Detailed Description
The technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
FIGS. 1 to 4 show an embodiment of the branch prediction method of the present invention, which includes the following steps:
Step 1: when a clock rising edge arrives, bit segments of n different lengths are taken from the GHR, and the current instruction address (PC value) is hashed with each of the n GHR segments to generate n history information indexes.
In this embodiment, taking bit segments of n different lengths from the GHR and hashing them with the PC value reduces unnecessarily long warm-up time, avoids the low accuracy caused by persistent misses in the history information record tables T1-Tn, and improves the hit probability.
In this embodiment, the hash operation is an XOR of high-order and low-order bits.
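The text states only that the hash is an XOR of high-order and low-order bits; it does not give the exact bit selection. One plausible reading, sketched below under that caveat, folds the PC and the GHR segment down to the index width by XORing successive chunks and then XORs the two results. The function names, the 32-bit PC width, and the 9-bit output width (chosen to match the 9-bit tag of this embodiment) are assumptions for illustration.

```python
def fold_xor(value, width, out_bits):
    """XOR-fold the low `width` bits of `value` down to `out_bits` bits."""
    value &= (1 << width) - 1
    mask = (1 << out_bits) - 1
    folded = 0
    while value:
        folded ^= value & mask  # XOR the current low chunk into the result
        value >>= out_bits      # move the next higher chunk into place
    return folded


def history_index(pc, ghr, length, out_bits=9, pc_bits=32):
    """Hypothetical index: folded PC XOR folded newest `length` GHR bits."""
    segment = ghr & ((1 << length) - 1)
    return fold_xor(pc, pc_bits, out_bits) ^ fold_xor(segment, length, out_bits)
```

Calling `history_index` once per history length L(1)..L(n) would yield the n history information indexes of step 1.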
Bit segments of n different lengths are taken from the GHR, and each length satisfies:
$L(i) = \mathrm{int}\left(\alpha^{i-1} \cdot L(1) + 0.5\right)$
where L(1) is the number of bits of the 1st length taken from the GHR, L(i) is the number of bits of the i-th length taken from the GHR, and i takes values in the range 1 < i < n, which ensures that the minimum bit width taken from the GHR is 1; int() is the integer-part function; and α is user-defined, with 0 < α < 1.
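The length formula can be evaluated directly. The sketch below is illustrative only: the function name `history_lengths` and the sample values L(1) = 16 and α = 0.5 are assumptions, not parameters given in the patent.

```python
def history_lengths(first_length, alpha, n):
    """Return [L(1), ..., L(n)] with L(i) = int(alpha**(i-1) * L(1) + 0.5)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return [int(alpha ** (i - 1) * first_length + 0.5) for i in range(1, n + 1)]


# Assumed sample parameters: L(1) = 16, alpha = 0.5, n = 5.
print(history_lengths(16, 0.5, 5))  # [16, 8, 4, 2, 1]
```

With these sample values the lengths halve at each step and reach the minimum width of 1 at i = 5, which illustrates the exponentially decreasing history lengths the description relies on.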
Step 2: when a clock falling edge arrives, the basic branch prediction unit T0 is accessed using the current instruction address (PC value), and the history information record tables T1-Tn, which correspond to the n GHR segments of different lengths, are accessed using the n generated history information indexes respectively. The basic branch prediction unit T0 consists of a plurality of two-bit saturating counters. Each record in each history information record table includes a tag attribute value used for matching against the history information index, a pred attribute value that provides the branch prediction direction, a u attribute value that indicates whether the branch prediction entry is valid, and a weight attribute value that indicates the branch jump strength. When a history information record table among T1-Tn contains a record that matches the history information index and the u attribute value of that record is 1, the weight attribute value of the record is output as the access result of that table.
In this embodiment, the number of record entries in each history information record table is the same as the number of bits of the corresponding length taken from the GHR; the tag attribute value of a record corresponds to the history flag address to which a one-bit jump result in the GHR is mapped; and the initial values of the pred, u, and weight attribute values of a record are 0. The tag attribute thus reflects the historical instruction address to which the jump result represented by one GHR bit is mapped, and the pred, u, and weight attribute values record the jump outcome, match status, and jump strength of that entry. By adding weight information to the history information record tables, the invention overcomes a defect of the prior art, which relies on pred alone: pred stops accumulating once it reaches its upper limit and can only represent the jump information of the most recently matched tag, so it cannot accurately reflect jump strength. With the weight added, the global history jump information in the GHR is brought in, so that the prediction draws on both the most recently matched information and the global information associated with the jump instruction. The jump prediction result can therefore be made more accurate. The history lengths used by T1-Tn in the invention decrease exponentially, unlike the conventional logarithmic increase. Because the history lengths used are short, the subsequent adder tree performs low-bit-width additions and adds little circuit resource cost, so higher prediction accuracy can be achieved with less hardware resource consumption.
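The record layout and the table access of step 2 can be modeled in software as shown below. This is a behavioral sketch under stated assumptions, not the hardware design: the names `Record` and `lookup` are hypothetical, the table is modeled as a plain list searched by tag, and a table with no valid match is assumed to contribute 0 so that it does not affect the later sum.

```python
from dataclasses import dataclass


@dataclass
class Record:
    tag: int = 0     # matched against the history information index
    pred: int = 0    # 2-bit saturating direction counter
    u: int = 0       # 1 once the entry has been hit
    weight: int = 0  # signed jump strength


def lookup(table, index):
    """Return the weight of a valid matching record, or 0 if none exists."""
    for rec in table:
        if rec.tag == index and rec.u == 1:
            return rec.weight
    return 0  # assumed miss output: neutral for the summation in step 3
```

All four fields default to 0, matching the initial values given in this embodiment.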
As shown in FIGS. 2 and 3, when a clock rising edge arrives, the history information indexes are generated by hashing the GHR segments of different lengths with the PC; when a clock falling edge arrives, the components T1-Tn of different lengths are accessed through the history information indexes, and the weight values of the history information of different lengths are obtained. The method makes full use of both the rising and falling edges of the clock. The PC value and the GHR value are obtained at the clock rising edge, bit segments of different lengths are taken from the GHR, and the n GHR segments are hashed with the PC value to obtain the history information indexes. At the clock falling edge, the n history information record tables corresponding to the GHR segments are accessed, and the weight values of the matched history information, which reflect jump strength, are output. On the next clock rising edge, all weight values output by the n history information record tables are summed, and branch prediction information is given according to the sum. Branch prediction is thus implemented as a two-stage pipeline with a high operating frequency, which raises the overall operating frequency of the processor core.
Step 3: add all the weight attribute values output by the history information record tables. When the next clock falling edge arrives, if the sum is greater than 0, the prediction information output is taken; if the sum is less than 0, the prediction information output is not taken; if the sum is equal to 0, the prediction information output is the output of the basic branch prediction unit T0. The invention outputs the prediction information according to the weight values. If the indexes find no matching information in T1-Tn, the prediction information is output by the T0 component, which avoids the prediction failures that would otherwise occur early on, when the weight values are 0, and improves prediction accuracy.
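The decision rule of step 3 reduces to a sign test on the sum with a fallback to T0. The following sketch is illustrative: the function name `predict` is hypothetical, and reading a T0 counter value of 2 or 3 as taken is the usual convention for two-bit saturating counters rather than something the patent spells out.

```python
def predict(weights, t0_counter):
    """Combine table outputs; `t0_counter` is T0's 2-bit counter (0..3)."""
    total = sum(weights)  # models the adder tree over T1..Tn outputs
    if total > 0:
        return True   # taken
    if total < 0:
        return False  # not taken
    return t0_counter >= 2  # sum is 0 (tie or all misses): defer to T0
```

For example, `predict([3, -1, 0], 0)` follows the positive sum and ignores T0, while `predict([0, 0], 2)` falls back to T0's counter.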
Step 4: after the prediction information is output, update the weight attribute value of the record that matches the history information index in each history information record table T1-Tn. The update rule is: when the highest bit of the pred attribute value of the matching record is 1, the weight attribute value of the record is the number of 1 bits in the GHR minus the number of 0 bits in the GHR; when the highest bit of the pred attribute value is 0, the weight attribute value of the record is the number of 0 bits in the GHR minus the number of 1 bits in the GHR. In this embodiment, pred is 2 bits wide, tag is 9 bits wide, u is 1 bit wide, and weight is 8 bits wide. The bit widths of tag and weight can of course be customized as required. In this embodiment, as shown in FIG. 2, the GHR and T0-T4 are all implemented in BRAM (Block Random Access Memory) and are read and written on the clock falling edge, which allows the prediction information to be passed on quickly and reduces the input delay of the subsequent adder tree. The number of 1 bits in the GHR is obtained by an accumulator, as shown in FIG. 4, and the number of 0 bits in the GHR is obtained by subtracting the number of 1 bits from L(i).
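The weight update of step 4 can be sketched as below. The function name `updated_weight` is hypothetical, and the sketch assumes that the GHR segment for table Ti is the low-order L(i) bits of the register; the text does not state which bit positions are taken.

```python
def updated_weight(pred, ghr, length, pred_bits=2):
    """Weight per step 4, computed over the low `length` bits of `ghr`."""
    segment = ghr & ((1 << length) - 1)
    ones = bin(segment).count("1")  # the accumulator of FIG. 4
    zeros = length - ones           # L(i) minus the number of 1 bits
    pred_msb = (pred >> (pred_bits - 1)) & 1
    # pred MSB = 1: ones minus zeros; pred MSB = 0: zeros minus ones.
    return ones - zeros if pred_msb else zeros - ones
```

The result always lies between -L(i) and +L(i), so with the short history lengths used here it fits comfortably in the 8-bit weight field of this embodiment.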
Step 5: after the processor core finishes executing according to the output prediction information and a feedback signal is obtained, update the pred attribute value of the record that matches the history information index in each history information record table T1-Tn. If the actual jump result in the feedback signal is taken, the pred value is incremented by 1; otherwise it is decremented by 1. Once the pred value reaches its upper or lower limit, it no longer changes.
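Step 5 describes a saturating counter update. A minimal sketch follows, with the hypothetical function name `updated_pred` and the 2-bit pred width of this embodiment as the default:

```python
def updated_pred(pred, taken, pred_bits=2):
    """Saturating increment on taken, saturating decrement otherwise."""
    upper = (1 << pred_bits) - 1  # 3 for the 2-bit pred of this embodiment
    if taken:
        return min(upper, pred + 1)
    return max(0, pred - 1)
```

Because pred saturates at its upper and lower limits, it cannot express how strongly a branch is biased beyond that range, which is the limitation the weight attribute is intended to address.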
In this embodiment, when a tag in T1-Tn is successfully matched with the index generated by the hash operation, the u field of that component is set to 1; otherwise it is set to 0. For an entry in T1-Tn whose tag has never been hit, u remains 0; after the first hit, u is 1. Compared with static branch prediction, which provides a jump strategy directly, recording history matching information through the u attribute value gives a higher operating frequency.
As shown in FIG. 3, when the clock rising edge arrives, PC0 and the GHR are hashed to generate an index. When the clock falling edge arrives, the history information is accessed. At the end of this clock cycle, the value of PC1 is obtained. Prediction information is generated during the second clock cycle and is registered as PC2. PC2 is then input to the branch prediction device to perform prediction. In the next cycle, PC3 is generated. Specifically, PC2 is generated from PC0, and PC3 is generated from PC1. With this dual-clock-edge design, the clock frequency can be high.
The present invention also provides a branch prediction device, as shown in FIG. 5, comprising a branch direction generation component and a branch target generation component, wherein the branch direction generation component generates the branch direction using the branch prediction method described above. As shown in FIG. 6, after the branch jump direction is determined, the branch target generation component determines the branch jump target through a Branch Target Buffer (BTB) indexed by the lower 10 bits of PC0, and generates PC2.
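The target selection described for FIG. 6 can be modeled as a table lookup keyed by the lower 10 bits of PC0. In the sketch below only the 10-bit index comes from the text; the function name `next_pc`, the untagged list-based BTB with `None` for empty slots, the 4-byte fall-through, and the fall-through on a BTB miss are assumptions for illustration.

```python
BTB_INDEX_BITS = 10  # from the embodiment: lower 10 bits of PC0


def next_pc(btb, pc0, taken, fallthrough=4):
    """Pick PC2: the BTB target if predicted taken, else the sequential PC."""
    if taken:
        target = btb[pc0 & ((1 << BTB_INDEX_BITS) - 1)]
        if target is not None:
            return target
    # Not taken, or no target recorded yet (assumed behavior on a miss).
    return pc0 + fallthrough
```

A BTB for this sketch would be created as `[None] * (1 << BTB_INDEX_BITS)` and filled in as branch targets are resolved.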
The present invention also provides a processor core, as shown in FIG. 7, comprising a branch prediction unit, an instruction fetch unit, a decode unit, and an execution unit, wherein the branch prediction unit uses the aforementioned branch prediction device. The connections between the units of the processor core are shown in FIG. 8.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A branch prediction method, comprising the steps of:
step 1: when a clock rising edge arrives, taking bit segments of n different lengths from the GHR, and hashing the current instruction address (PC value) with each of the n GHR segments to generate n history information indexes;
step 2: when a clock falling edge arrives, accessing a basic branch prediction unit T0 using the current instruction address (PC value), and accessing history information record tables T1-Tn, which correspond to the n GHR segments of different lengths, using the n generated history information indexes respectively; the basic branch prediction unit T0 consisting of a plurality of two-bit saturating counters, and each record in each history information record table including a tag attribute value used for matching against the history information index, a pred attribute value that provides the branch prediction direction, a u attribute value that indicates whether the branch prediction entry is valid, and a weight attribute value that indicates the branch jump strength;
when a history information record table among T1-Tn contains a record that matches the history information index and the u attribute value of that record is 1, outputting the weight attribute value of the record as the access result of that table;
step 3: adding all the weight attribute values output by the history information record tables; when the next clock falling edge arrives, if the sum is greater than 0, outputting the prediction information as taken; if the sum is less than 0, outputting the prediction information as not taken; if the sum is equal to 0, outputting the output of the basic branch prediction unit T0 as the prediction information.
2. The branch prediction method according to claim 1, further comprising step 4: after the prediction information is output, updating the weight attribute value of the record that matches the history information index in each history information record table T1-Tn, wherein, when the highest bit of the pred attribute value of the matching record is 1, the weight attribute value of the record is the number of 1 bits in the GHR minus the number of 0 bits in the GHR, and when the highest bit of the pred attribute value of the matching record is 0, the weight attribute value of the record is the number of 0 bits in the GHR minus the number of 1 bits in the GHR.
3. The branch prediction method according to claim 2, further comprising step 5: after the processor core finishes executing according to the output prediction information and a feedback signal is obtained, updating the pred attribute value of the record that matches the history information index in each history information record table T1-Tn, wherein, if the actual jump result in the feedback signal is taken, the pred value is incremented by 1, otherwise the pred value is decremented by 1, and the pred value no longer changes once it reaches its upper or lower limit.
4. The branch prediction method according to claim 3, wherein when the tag attribute value of a record in the history information record tables T1-Tn is successfully matched with the history information index, the u attribute value of that record is set to 1; otherwise, the u attribute value is set to 0.
5. The branch prediction method according to claim 3, wherein the number of record entries in each history information record table is the same as the number of bits of the corresponding length taken from the GHR, the tag attribute value of a record corresponds to the history flag address to which a one-bit jump result in the GHR is mapped, and the initial values of the pred, u, and weight attribute values of a record are 0.
6. The branch prediction method according to claim 5, wherein in step 1 bit segments of n different lengths are taken from the GHR, and each length satisfies:
$L(i) = \mathrm{int}\left(\alpha^{i-1} \cdot L(1) + 0.5\right)$
where L(1) is the number of bits of the 1st length taken from the GHR, L(i) is the number of bits of the i-th length taken from the GHR, and i takes values in the range 1 < i < n, which ensures that the minimum bit width taken from the GHR is 1; int() is the integer-part function; and α is user-defined, with 0 < α < 1.
7. A branch prediction device comprising a branch direction generation component and a branch target generation component, characterized in that the branch direction generation component generates the branch direction using the branch prediction method of any one of claims 1-6.
8. A processor core comprising a branch prediction unit, an instruction fetch unit, a decode unit, and an execution unit, wherein the branch prediction unit uses the branch prediction apparatus of claim 7.
CN202111636402.7A 2021-12-30 2021-12-30 Branch prediction method and device and processor core Active CN114003292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111636402.7A CN114003292B (en) 2021-12-30 2021-12-30 Branch prediction method and device and processor core

Publications (2)

Publication Number Publication Date
CN114003292A CN114003292A (en) 2022-02-01
CN114003292B CN114003292B (en) 2022-03-15

Family

ID=79932230

Country Status (1)

Country Link
CN (1) CN114003292B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106030516A (en) * 2013-10-25 2016-10-12 超威半导体公司 Bandwidth increase in branch prediction unit and level 1 instruction cache
CN110555475A (en) * 2019-08-29 2019-12-10 华南理工大学 few-sample target detection method based on semantic information fusion
TW202112808A (en) * 2019-06-05 2021-04-01 日商中外製藥股份有限公司 Antibody cleavage site-binding molecule

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7707398B2 (en) * 2007-11-13 2010-04-27 Applied Micro Circuits Corporation System and method for speculative global history prediction updating

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Record Branch Prediction: An Optimized Scheme for Two-level Branch Predictors;T. Chen;《2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems》;20121218;1526-1533 *
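Research on Design Space Exploration and Optimization of the TAGE Branch Predictor;周朝兵;《China Master's Theses Full-text Database, Information Science and Technology》;20200215 (No. 02);I137-168 *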

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant