US20170371669A1 - Branch target predictor - Google Patents

Branch target predictor

Info

Publication number
US20170371669A1
Authority
US
United States
Prior art keywords
way
fetch address
identifier
entry
prediction data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/192,794
Inventor
Anil Krishna
Gregory Wright
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Priority to US15/192,794 priority Critical patent/US20170371669A1/en
Assigned to QUALCOMM INCORPORATED reassignment QUALCOMM INCORPORATED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KRISHNA, ANIL, WRIGHT, GREGORY
Priority to PCT/US2017/029452 priority patent/WO2017222635A1/en
Priority to CN201780033792.4A priority patent/CN109219798A/en
Priority to EP17721035.8A priority patent/EP3475811A1/en
Publication of US20170371669A1 publication Critical patent/US20170371669A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3804Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3005Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30058Conditional branch instructions
    • G06F9/30061Multi-way branch instructions, e.g. CASE
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842Speculative instruction execution
    • G06F9/3844Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables

Definitions

  • the present disclosure is generally related to a branch target predictor.
  • examples of computing devices include wireless computing devices, such as portable wireless telephones, personal digital assistants (PDAs), and paging devices that are small, lightweight, and easily carried by users, as well as laptop and desktop computers and servers.
  • a computing device may include a processor that is operable to execute different instructions in an instruction set (e.g., a program).
  • the instruction set may include direct branches and indirect branches.
  • An indirect branch may specify the fetch address of the next instruction to be executed from an instruction memory.
  • the next instruction may be indirectly fetched because the instruction address is resident in some other storage element (e.g., a processor register).
  • the indirect branch may not embed the offset to the address of the target instruction within one of the instruction fields in the branch instruction.
  • Non-limiting examples of an indirect branch include a computed jump, an indirect jump, and a register-indirect jump.
  • the processor may predict the fetch address. To predict the fetch address, the processor may use multiple predictor tables, where each predictor table includes multiple prediction entries, and where each prediction entry stores a fetch address.
  • because each prediction entry stores an entire fetch address and multiple predictor tables may include similar entries, in certain scenarios, there may be a relatively large amount of overhead at each predictor table.
  • for example, each prediction entry in a predictor table may not be used by an application, multiple predictor tables may include identical predictor entries (e.g., target duplication), and the number of predictor table entries may not be capable of adjustment independently from the number of target instructions.
  • the processor may also utilize a stored global history from past indirect branches to predict the fetch address. For example, the processor may predict the fetch address based on predicted fetch addresses for the previous ten indirect branches to provide context. Each fetch address stored in the global history may utilize approximately ten bits of storage. For example, twenty previously predicted fetch addresses stored in the global history may utilize approximately two-hundred bits of storage. Thus, a relatively large amount of storage may be used for the global history.
  • an apparatus for predicting a fetch address of a next instruction to be fetched includes a memory system, first selection logic, and second selection logic.
  • the memory system includes a plurality of predictor tables and a target table.
  • the plurality of predictor tables includes a first predictor table and a second predictor table.
  • the first predictor table includes a first entry having a first way identifier
  • the second predictor table includes a second entry having a second way identifier.
  • the target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
  • the first way and the second way are associated with an active fetch address.
  • the first way identifier and the second way identifier may “point” to a similar way.
  • the first way identifier and the second way identifier may point to different ways.
  • the first selection logic is coupled to select the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
  • the second selection logic is configured to select the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
  • the historical prediction data may include an “abbreviated version” of the previously used fetch addresses (e.g., some bits of previously used fetch addresses) as opposed to the entire fetch addresses, data associated with way identifiers of the previously used fetch addresses, or a combination of both.
  • the most significant bits of a fetch address may not substantially change from one fetch address to another fetch address.
  • lower-order bits of a fetch address, or a hash of the fetch address, may therefore be stored instead of the entire fetch address.
  • the historical prediction data may include a way number (e.g., a way identifier) in the target table for each previously used fetch address.
  • the historical prediction data may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address. This reduction in bits may reduce the overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
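As a rough sketch of the abbreviated scheme above, a history entry can be packed as a few low-order address bits plus a way number, and the savings over full entries checked directly. The field widths here are illustrative assumptions, not values fixed by the text:

```python
# Illustrative widths for an abbreviated global-history entry (assumptions).
ADDR_BITS = 4  # e.g., keep 4 low-order bits of each previous fetch address
WAY_BITS = 2   # e.g., 2 bits identify one of 4 ways in the target table

def pack_history_entry(fetch_address: int, way: int) -> int:
    """Pack an abbreviated history entry: a few address bits plus a way number."""
    addr_part = fetch_address & ((1 << ADDR_BITS) - 1)
    return (addr_part << WAY_BITS) | (way & ((1 << WAY_BITS) - 1))

# Twenty abbreviated 6-bit entries vs. twenty full ~10-bit addresses:
print(20 * (ADDR_BITS + WAY_BITS), "bits vs", 20 * 10, "bits")  # 120 bits vs 200 bits
```

With these example widths, the history shrinks from roughly two hundred bits to 120 bits, consistent with the overhead reduction claimed above.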
  • a method for predicting a fetch address of a next instruction to be fetched includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data.
  • a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
  • the method also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer.
  • a target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address.
  • the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
  • a non-transitory computer-readable medium includes commands for predicting a fetch address of a next instruction to be fetched.
  • the commands, when executed by a processor, cause the processor to perform operations including selecting a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data.
  • a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
  • the operations also include selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer.
  • a target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address.
  • the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
  • an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data.
  • the means for storing data includes a plurality of predictor tables and a target table.
  • the plurality of predictor tables includes a first predictor table and a second predictor table.
  • the first predictor table includes a first entry having a first way identifier
  • the second predictor table includes a second entry having a second way identifier.
  • the target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
  • the first way and the second way are associated with an active fetch address.
  • the apparatus also includes means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
  • the apparatus also includes means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
  • FIG. 1 is a processing system that is operable to predict a fetch address of a target instruction;
  • FIG. 2 depicts predictor tables included in the processing system of FIG. 1 ;
  • FIG. 3 is a method for predicting a fetch address of a target instruction;
  • FIG. 4 is a block diagram of a device that includes the processing system of FIG. 1 .
  • a processing system 100 that is operable to predict a fetch address of a target instruction is shown.
  • a fetch address corresponds to a location in memory where an address for the target instruction (e.g., the next instruction to be executed) is stored.
  • the processing system 100 may also be referred to as a “memory system.”
  • the processing system 100 may predict the fetch address of the target instruction based on an active fetch address 110 .
  • the active fetch address 110 may be based on a current program counter (PC) value.
  • the processing system 100 includes a plurality of predictor tables, a global history table 112 , first selection logic 114 , a target table 118 , and second selection logic 120 .
  • the first selection logic 114 includes a first multiplexer and the second selection logic 120 includes a second multiplexer.
  • the plurality of predictor tables includes a predictor table 102 , a predictor table 104 , a predictor table 106 , and a predictor table 108 . Although four predictor tables 102 - 108 are shown, in other implementations, the processing system 100 may include additional (or fewer) predictor tables. As a non-limiting example, the processing system 100 may include eight predictor tables in another implementation.
  • Each predictor table 102 - 108 includes multiple entries that identify different fetch addresses.
  • the predictor table 102 includes a first plurality of entries 150
  • the predictor table 104 includes a second plurality of entries 160
  • the predictor table 106 includes a third plurality of entries 170
  • the predictor table 108 includes a fourth plurality of entries 180 .
  • different predictor tables 102 - 108 may have different sizes.
  • different predictor tables 102 - 108 may have a different number of entries.
  • the fourth plurality of entries 180 may include more entries than the second plurality of entries 160 .
  • the predictor tables 102 - 108 of the processing system 100 are shown in greater detail in FIG. 2 .
  • the active fetch address 110 is provided to each predictor table 102 - 108 to determine whether a “hit” exists at the predictor tables 102 - 108 .
  • the processing system 100 may determine whether each predictor table 102 - 108 includes an entry that matches the active fetch address 110 .
  • the active fetch address 110 is “0X80881323”. It should be understood that the active fetch address 110 (and other addresses described herein) is merely for illustrative purposes and should not be construed as limiting.
  • the predictor table 102 includes an entry 152 , an entry 154 , an entry 156 , and an entry 158 .
  • each entry 152 - 158 may be included in the first plurality of entries 150 of FIG. 1 .
  • the entry 152 may include a tag “0X80881323” and may include a way identifier “A”
  • the entry 154 may include a tag “0X80881636” and may include a way identifier “B”
  • the entry 156 may include a tag “0X80882399” and may include a way identifier “C”
  • the entry 158 may include a tag “0X80883456” and may include a way identifier “D”.
  • each tag may include a subset of a fetch address hashed together with other information (e.g., a particular number of previously seen fetch addresses).
  • Each tag may include enough information to associate the remainder of the entry's content with the fetch address that looks up that entry.
  • each tag may be used as an identification mechanism for a fetch address. For ease of illustration, the way identifiers are identified by a single capitalized letter.
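The tag formation described above can be sketched as follows. The XOR-fold hash, the mixing constant, and the 12-bit tag width are assumptions for illustration; the text only says a tag may include a subset of the fetch address hashed together with other information such as previously seen fetch addresses:

```python
def make_tag(fetch_addr: int, history: int, tag_bits: int = 12) -> int:
    """Form a predictor-table tag by mixing the fetch address with a
    (possibly folded) history value and keeping tag_bits low-order bits.

    The XOR with a multiplied history value is one simple, illustrative
    mixing choice; the patent does not specify a particular hash.
    """
    mask = (1 << tag_bits) - 1
    return (fetch_addr ^ (history * 0x9E37)) & mask

# With no history, the tag is just the low-order address bits:
print(hex(make_tag(0x80881323, 0)))  # 0x323
```

Different history values then map the same fetch address to (usually) different tags, which is what lets each predictor table be indexed with a different amount of history.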
  • the predictor table 104 includes an entry 162 , an entry 164 , an entry 166 , and an entry 168 .
  • each entry 162 - 168 may be included in the second plurality of entries 160 of FIG. 1 .
  • the entry 162 may include a tag “0X80884635” and may include the way identifier “A”
  • the entry 164 may include a tag “0X80881323” and may include the way identifier “B”
  • the entry 166 may include a tag “0X80881493” and may include the way identifier “C”
  • the entry 168 may include a tag “0X80889999” and may include the way identifier “D”.
  • the predictor table 106 includes an entry 172 , an entry 174 , an entry 176 , and an entry 178 .
  • each entry 172 - 178 may be included in the third plurality of entries 170 of FIG. 1 .
  • the entry 172 may include a tag “0X80884639” and may include the way identifier “A”
  • the entry 174 may include a tag “0X80882395” and may include the way identifier “B”
  • the entry 176 may include a tag “0X80888723” and may include the way identifier “C”
  • the entry 178 may include a tag “0X80881321” and may include the way identifier “D”.
  • the predictor table 108 includes an entry 182 , an entry 184 , an entry 186 , and an entry 188 .
  • each entry 182 - 188 may be included in the fourth plurality of entries 180 of FIG. 1 .
  • the entry 182 may include a tag “0X80885245” and may include the way identifier “A”, the entry 184 may include a tag
  • the entry 186 may include a tag “0X80881323” and may include the way identifier “C”
  • the entry 188 may include a tag “0X80888888” and may include the way identifier “D”.
  • a processor may determine that the entry 152 in the predictor table 102 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “A” to the first selection logic 114 as an output tag indicator 103 of the predictor table 102 . The processor may also determine that the entry 164 in the predictor table 104 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “B” to the first selection logic 114 as an output tag indicator 105 of the predictor table 104 .
  • the processor may determine that there are no entries in the predictor table 106 that match the active fetch address 110 . Thus, the processor may not provide a way identifier to the first selection logic 114 as an output tag indicator 107 of the predictor table 106 . The processor may determine that the entry 186 in the predictor table 108 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “C” to the first selection logic 114 as an output tag indicator 109 of the predictor table 108 .
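The per-table hit checks in this example can be modeled as simple tag lookups that yield a way identifier on a hit and nothing on a miss. This is a behavioral sketch; hardware would compare all tags in parallel:

```python
from typing import Optional

# Predictor tables from FIG. 2, modeled as tag -> way-identifier maps.
# (Entry 184's tag is not given in the text, so it is omitted here.)
predictor_table_102 = {0x80881323: "A", 0x80881636: "B", 0x80882399: "C", 0x80883456: "D"}
predictor_table_104 = {0x80884635: "A", 0x80881323: "B", 0x80881493: "C", 0x80889999: "D"}
predictor_table_106 = {0x80884639: "A", 0x80882395: "B", 0x80888723: "C", 0x80881321: "D"}
predictor_table_108 = {0x80885245: "A", 0x80881323: "C", 0x80888888: "D"}

def lookup(table: dict, active_fetch_address: int) -> Optional[str]:
    """Return the entry's way identifier on a tag hit, or None on a miss."""
    return table.get(active_fetch_address)

active = 0x80881323
indicators = [lookup(t, active) for t in
              (predictor_table_102, predictor_table_104,
               predictor_table_106, predictor_table_108)]
print(indicators)  # ['A', 'B', None, 'C'] — hits in tables 102, 104, 108; miss in 106
```

This reproduces the example above: tables 102, 104, and 108 hit with way identifiers “A”, “B”, and “C”, while table 106 produces no indicator.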
  • each output tag indicator 103 , 105 , 107 , 109 provides a different way identifier to the first selection logic 114 .
  • the first selection logic 114 may be configured to select the output tag indicator of the predictor table that has an entry matching the active fetch address 110 and that utilizes a largest amount of historical prediction data (associated with the global history table 112 ), as explained below.
  • the output tag indicators 103 , 105 , 109 correspond to entries 152 , 164 , 186 , respectively, having tags that identify the active fetch address 110 .
  • the first selection logic 114 may determine which output tag indicator 103 , 105 , 109 to select based on the amount of historical prediction data associated with each output tag indicator 103 , 105 , 109 . In a scenario where only one output tag indicator corresponds to an entry having a tag that identifies the active fetch address 110 , the first selection logic 114 may select that output tag indicator.
  • the global history table 112 includes (e.g., stores) historical prediction data 113 .
  • the historical prediction data 113 includes a history of previous fetch addresses for indirect branches.
  • the historical prediction data 113 may include data to identify fetch addresses for previous indirect branches and way numbers associated with the fetch addresses.
  • Each fetch address in the historical prediction data 113 may be an “abbreviated version” of a fetch address, to reduce overhead.
  • the historical prediction data 113 may store some bits (e.g., a subset) of each previous fetch address as opposed to the entire fetch address.
  • the historical prediction data 113 may include a way number (e.g., a way identifier) in the target table 118 for each previously used fetch address.
  • the historical prediction data 113 may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address.
  • the processing system 100 may provide the historical prediction data 113 to the predictor table 104 , to the predictor table 106 , and to the predictor table 108 .
  • the processing system 100 may provide a first amount of the historical prediction data 113 to the predictor table 104 with the active fetch address 110 to generate the output tag indicator 105
  • the processing system 100 may provide a second amount of the historical prediction data 113 (that is greater than the first amount) to the predictor table 106 with the active fetch address 110 to generate the output tag indicator 107
  • the processing system 100 may provide a third amount of the historical prediction data 113 (that is greater than the second amount) to the predictor table 108 with the active fetch address 110 to generate the output tag indicator 109 .
  • the output tag indicator 103 may not be as reliable as the output tag indicators 105 , 107 , 109 that are generated based on increasing amounts of the historical prediction data 113 . Furthermore, because the output tag indicator 107 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 105 , the output tag indicator 107 may be more reliable than the output tag indicator 105 . Similarly, because the output tag indicator 109 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 107 , the output tag indicator 109 may be more reliable than the output tag indicator 107 .
  • the output tag indicators 103 , 105 , 109 correspond to entries 152 , 164 , 186 , respectively, having tags that identify the active fetch address 110 .
  • the first selection logic 114 may determine which output tag indicator 103 , 105 , 109 to select based on the amount of historical prediction data 113 associated with each output tag indicator 103 , 105 , 109 . Because the output tag indicator 109 is associated with more historical prediction data 113 than the other output tag indicators 103 , 105 , the first selection logic 114 may select that output tag indicator 109 as a selected way pointer 116 .
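The selection rule above reduces to a priority pick over the table outputs, ordered from the least-history table to the most-history table:

```python
from typing import Optional, Sequence

def select_way_pointer(indicators: Sequence[Optional[str]]) -> Optional[str]:
    """Pick the way identifier from the hitting predictor table that was
    indexed with the largest amount of historical prediction data.

    `indicators` is ordered from least-history table to most-history table
    (None marks a miss), so we scan from the end. Returns None on no hit.
    """
    for way_id in reversed(indicators):
        if way_id is not None:
            return way_id
    return None

# With the FIG. 1/2 example outputs ("A", "B", miss, "C"), the hit backed by
# the most history wins, matching the selected way pointer 116 in the text.
print(select_way_pointer(["A", "B", None, "C"]))  # C
```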
  • the processing system 100 may provide the selected way pointer 116 to the second selection logic 120 .
  • the target table 118 includes multiple fetch addresses that are separated by sets (e.g., rows) and ways (e.g., columns).
  • the target table 118 includes four sets (e.g., “Set 1 ”, “Set 2 ”, “Set 3 ”, and “Set 4 ”).
  • the target table 118 may also include four ways (e.g., “Way A”, “Way B”, “Way C”, and “Way D”).
  • although the target table 118 is shown to include four sets and four ways, in other implementations, the target table 118 may include additional (or fewer) ways and sets.
  • as a non-limiting example, the target table 118 may include sixteen sets and thirty-two ways.
  • the processing system 100 may provide the active fetch address 110 to the target table 118 .
  • the active fetch address 110 may indicate a particular set of fetch addresses in the target table 118 to be selected. In the illustrative example of FIG. 1 , the active fetch address 110 indicates that “Set 3 ” is where the predicted fetch address 140 is located in the target table 118 .
  • Each way in the target table 118 corresponds to a particular way identifier in the predictor tables 102 - 108 .
  • each entry in the predictor tables 102 - 108 can include way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D”.
  • the entries that include way identifier “A” are associated with “Way A”, the entries that include way identifier “B” are associated with “Way B”, the entries that include way identifier “C” are associated with “Way C”, and the entries that include way identifier “D” are associated with “Way D”.
  • the second selection logic 120 may select “Way C” as the selected way of the predicted fetch address 140 .
  • the second selection logic 120 may select the predicted fetch address 140 in the target table 118 as a fetch address 122 for a target instruction based on the way indicated by the selected way pointer 116 and the set indicated by the active fetch address 110 .
  • the fetch address 122 may be used by the processing system to locate the address of the next instruction to be executed (e.g., the target instruction).
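The final selection can be sketched as a two-coordinate lookup: the active fetch address indicates the set (row) and the selected way pointer indicates the way (column). The modulo set-index function and the stored addresses below are placeholder assumptions; the text does not specify how the set is derived from the active fetch address:

```python
# Target table: sets (rows) x ways (columns) of stored fetch addresses.
# The concrete addresses here are synthetic placeholders, not from the patent.
WAYS = ["A", "B", "C", "D"]
NUM_SETS = 4

target_table = [[0x1000 * (s + 1) + w for w in range(len(WAYS))]
                for s in range(NUM_SETS)]

def predict_fetch_address(active_fetch_address: int, way_pointer: str) -> int:
    """Select the predicted fetch address using the set indicated by the
    active fetch address and the way indicated by the selected way pointer."""
    set_index = active_fetch_address % NUM_SETS  # assumed set-index function
    return target_table[set_index][WAYS.index(way_pointer)]

predicted = predict_fetch_address(0x80881323, "C")
```

The returned value plays the role of fetch address 122: the location from which the address of the next instruction to be executed is obtained.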
  • the techniques described with respect to FIGS. 1-2 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
  • the techniques described with respect to FIGS. 1-2 may also utilize an efficient methodology to determine the way of the predicted fetch address 140 in the target table 118 .
  • the techniques may use the predictor tables 102 - 108 (e.g., the way identifier in the predictor tables 102 - 108 ) to determine the selected way of the predicted fetch address 140 in the target table 118 .
  • a method 300 for predicting a fetch address of a next instruction to be fetched is shown.
  • the method 300 may be performed by the processing system 100 of FIG. 1 .
  • the method 300 includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, at 302 .
  • a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
  • the first selection logic 114 may select way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D” as the selected way pointer 116 based on the active fetch address 110 and the historical prediction data 113 .
  • the predictor table 102 includes the selected entry 152 having way identifier “A”, the predictor table 104 includes the selected entry 164 having way identifier “B”, the predictor table 106 includes the selected entry 178 having way identifier “D”, and the predictor table 108 includes the selected entry 186 having way identifier “C”.
  • the method 300 also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, at 304 .
  • a target table includes a first way storing the first fetch address and a second way storing the second fetch address.
  • the first way and the second way may be associated with the active fetch address.
  • the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
  • the second selection logic 120 may select the fetch address associated with the entry 186 as the predicted fetch address 140 based on the selected way pointer 116 .
  • the first selection logic 114 includes a first multiplexer
  • the second selection logic 120 includes a second multiplexer.
  • the method 300 may also include storing the historical prediction data 113 at the global history table 112 that is accessible to the processor (e.g., the processing system 100 ).
  • the historical prediction data 113 includes one or more fetch addresses for one or more previous indirect branches.
  • the method 300 may also include storing most significant bits of each fetch address of the one or more fetch addresses at the global history table to reduce overhead.
  • the method 300 includes generating the first entry based on a first amount of the historical prediction data.
  • the entries 162 - 168 in the predictor table 104 may be generated based on the first amount of the historical prediction data 113 .
  • the method 300 may also include generating the second entry based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data.
  • the entries 172 - 178 in the predictor table 106 may be generated based on the second amount of the historical prediction data 113 that is greater than the first amount of the historical prediction data 113 .
  • the method 300 includes selecting the second way identifier as the way pointer if the second entry (e.g., the entry generated based on a larger amount of the historical prediction data) matches the active fetch address.
  • the method 300 may also include selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
  • the method 300 of FIG. 3 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
  • the method 300 may also efficiently determine the way of the predicted fetch address 140 in the target table 118 .
  • the techniques may use the predictor tables 102 - 108 (e.g., the way identifier in the predictor tables 102 - 108 ) to determine the selected way of the predicted fetch address 140 in the target table 118 .
  • the method 300 of FIG. 3 may be implemented via hardware (e.g., a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), etc.) of a processing unit, such as a central processing unit (CPU), a digital signal processor (DSP), or a controller, via a firmware device, or any combination thereof.
  • the method 300 can be performed by a processor that executes instructions.
  • the device 400 includes a processor 410 (e.g., a central processing unit (CPU), a digital signal processor (DSP), etc.) coupled to a memory 432 .
  • the processor 410 may include the processing system 100 of FIG. 1 .
  • the memory 432 may be a memory device, such as a random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
  • the memory device may include commands (e.g., the commands 460 ) that, when executed by a computer (e.g., processor 410 ), may cause the computer to perform the method 300 of FIG. 3 .
  • FIG. 4 also shows a display controller 426 that is coupled to the processor 410 and to a display 428 .
  • An encoder/decoder (CODEC) 434 may be coupled to the processor 410 , as shown.
  • a speaker 436 and a microphone 438 can be coupled to the CODEC 434 .
  • FIG. 4 also shows a wireless controller 440 coupled to the processor 410 and to an antenna 442 .
  • the processor 410 , the display controller 426 , the memory 432 , the CODEC 434 , and the wireless controller 440 are included in a system-in-package or system-on-chip device (e.g., a mobile station modem (MSM)) 422 .
  • an input device 430 such as a touchscreen and/or keypad, and a power supply 444 are coupled to the system-on-chip device 422 .
  • the display 428 , the input device 430 , the speaker 436 , the microphone 438 , the antenna 442 , and the power supply 444 are external to the system-on-chip device 422 .
  • each of the display 428 , the input device 430 , the speaker 436 , the microphone 438 , the antenna 442 , and the power supply 444 can be coupled to a component of the system-on-chip device 422 , such as an interface or a controller.
  • an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data.
  • the means for storing data may include a memory system component (e.g., components storing the tables) of the processing system 100 of FIG. 1 , one or more other devices, circuits, modules, or instructions to store data, or any combination thereof.
  • the means for storing data may include a plurality of predictor tables and a target table.
  • the plurality of predictor tables may include a first predictor table and a second predictor table.
  • the first predictor table may include a first entry having a first way identifier.
  • the second predictor table may include a second entry having a second way identifier.
  • the target table may include a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
  • the first way and the second way may be associated with an active fetch address.
  • the apparatus may also include means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
  • the means for selecting the first way identifier or the second way identifier may include the first selection logic 114 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first way identifier or the second way identifier, or any combination thereof.
  • the apparatus may also include means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
  • the means for selecting the first fetch address or the second fetch address may include the second selection logic 120 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first fetch address or the second fetch address, or any combination thereof.
  • the foregoing disclosed devices and functionalities may be designed and configured into computer files (e.g., RTL, GDSII, GERBER, etc.) stored on computer readable media. Some or all such files may be provided to fabrication handlers who fabricate devices based on such files. Resulting products include semiconductor wafers that are then cut into semiconductor die and packaged into a semiconductor chip. The chips are then employed in devices, such as a communications device (e.g., a mobile phone), a tablet, a laptop, a personal digital assistant (PDA), a set top box, a music player, a video player, an entertainment unit, a navigation device, a fixed location data unit, a server, or a computer.
  • a software module may reside in a memory device, such as random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
  • An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device.
  • the memory device may be integral to the processor.
  • the processor and the storage medium may reside in an application-specific integrated circuit (ASIC).
  • the ASIC may reside in a computing device or a user terminal.
  • the processor and the storage medium may reside as discrete components in a computing device or a user terminal.

Abstract

A method for predicting a fetch address of a next instruction to be fetched includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. The method also includes selecting a first or second fetch address as a predicted fetch address based on the way pointer. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.

Description

    I. FIELD
  • The present disclosure is generally related to a branch target predictor.
  • II. DESCRIPTION OF RELATED ART
  • Advances in technology have resulted in more powerful computing devices. For example, there currently exists a variety of computing devices, including wireless computing devices, such as portable wireless telephones, personal digital assistants (PDAs), and paging devices that are small, lightweight, and easily carried by users, laptop and desktop computers, and servers.
  • A computing device may include a processor that is operable to execute different instructions in an instruction set (e.g., a program). The instruction set may include direct branches and indirect branches. An indirect branch may specify the fetch address of the next instruction to be executed from an instruction memory. The next instruction may be indirectly fetched because the instruction address is resident in some other storage element (e.g., a processor register). Thus, the indirect branch may not embed the offset to the address of the target instruction within one of the instruction fields in the branch instruction. Non-limiting examples of an indirect branch include a computed jump, an indirect jump, and a register-indirect jump. In order to attempt to increase performance at the processor, the processor may predict the fetch address. To predict the fetch address, the processor may use multiple predictor tables, where each predictor table includes multiple prediction entries, and where each prediction entry stores a fetch address.
  • Because each prediction entry stores an entire fetch address and multiple prediction tables may include similar entries, in certain scenarios, there may be a relatively large amount of overhead at each predictor table. For example, each prediction entry in a predictor table may not be used by an application, multiple predictor tables may include identical predictor entries (e.g., target duplication), and the number of predictor table entries may not be capable of adjustment independently from the number of target instructions.
  • The processor may also utilize a stored global history from past indirect branches to predict the fetch address. For example, the processor may predict the fetch address based on predicted fetch addresses for the previous ten indirect branches to provide context. Each fetch address stored in the global history may utilize approximately ten bits of storage. For example, twenty previously predicted fetch addresses stored in the global history may utilize approximately two-hundred bits of storage. Thus, a relatively large amount of storage may be used for the global history.
  • III. SUMMARY
  • According to one implementation of the present disclosure, an apparatus for predicting a fetch address of a next instruction to be fetched includes a memory system, first selection logic, and second selection logic. The memory system includes a plurality of predictor tables and a target table. The plurality of predictor tables includes a first predictor table and a second predictor table. The first predictor table includes a first entry having a first way identifier, and the second predictor table includes a second entry having a second way identifier. The target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way are associated with an active fetch address. According to one implementation, the first way identifier and the second way identifier may “point” to a similar way. According to another implementation, the first way identifier and the second way identifier may point to different ways. The first selection logic is configured to select the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. The second selection logic is configured to select the first fetch address or the second fetch address as a predicted fetch address based on the way pointer. By using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, an amount of overhead may be reduced. Additionally, the historical prediction data may include an “abbreviated version” of the previously used fetch addresses (e.g., some bits of previously used fetch addresses) as opposed to the entire fetch addresses, data associated with way identifiers of the previously used fetch addresses, or a combination of both.
The most significant bits of a fetch address may not substantially change from one fetch address to another fetch address. Lower order bits (or a hash function) may be used to reduce a particular fetch address into a smaller number of bits. According to one example, the historical prediction data may include a way number (e.g., a way identifier) in the target table for each previously used fetch address. Thus, instead of 64-bit previously used fetch addresses, the historical prediction data may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address. This reduction in bits may reduce the overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
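The bit savings described above can be sketched with a small packing function; the specific bit widths and the packing layout here are illustrative assumptions, not values fixed by the disclosure:

```python
def history_token(fetch_addr: int, way: int,
                  addr_bits: int = 4, way_bits: int = 2) -> int:
    """Pack a few low-order fetch-address bits together with a
    target-table way number into one small history token, instead of
    storing the full 64-bit fetch address."""
    addr_part = fetch_addr & ((1 << addr_bits) - 1)  # abbreviated address
    way_part = way & ((1 << way_bits) - 1)           # way identifier
    return (addr_part << way_bits) | way_part
```

With these assumed widths, each history entry occupies addr_bits + way_bits = 6 bits rather than 64.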
  • According to another implementation of the present disclosure, a method for predicting a fetch address of a next instruction to be fetched includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. The method also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
  • According to another implementation of the present disclosure, a non-transitory computer-readable medium includes commands for predicting a fetch address of a next instruction to be fetched. The commands, when executed by a processor, cause the processor to perform operations including selecting a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. The operations also include selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
  • According to another implementation of the present disclosure, an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data. The means for storing data includes a plurality of predictor tables and a target table. The plurality of predictor tables includes a first predictor table and a second predictor table. The first predictor table includes a first entry having a first way identifier, and the second predictor table includes a second entry having a second way identifier. The target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way are associated with an active fetch address. The apparatus also includes means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. The apparatus also includes means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
  • IV. BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a processing system that is operable to predict a fetch address of a target instruction;
  • FIG. 2 depicts predictor tables included in the processing system of FIG. 1;
  • FIG. 3 is a method for predicting a fetch address of a target instruction; and
  • FIG. 4 is a block diagram of a device that includes the processing system of FIG. 1.
  • V. DETAILED DESCRIPTION
  • Referring to FIG. 1, a processing system 100 that is operable to predict a fetch address of a target instruction is shown. As used herein, a fetch address corresponds to a location in memory where an address for the target instruction (e.g., the next instruction to be executed) is stored. The processing system 100 may also be referred to as a “memory system.”
  • As explained below, the processing system 100 may predict the fetch address of the target instruction based on an active fetch address 110. According to one implementation, the active fetch address 110 may be based on a current program counter (PC) value. The processing system 100 includes a plurality of predictor tables, a global history table 112, first selection logic 114, a target table 118, and second selection logic 120. According to one implementation, the first selection logic 114 includes a first multiplexer and the second selection logic 120 includes a second multiplexer.
  • The plurality of predictor tables includes a predictor table 102, a predictor table 104, a predictor table 106, and a predictor table 108. Although four predictor tables 102-108 are shown, in other implementations, the processing system 100 may include additional (or fewer) predictor tables. As a non-limiting example, the processing system 100 may include eight predictor tables in another implementation.
  • Each predictor table 102-108 includes multiple entries that identify different fetch addresses. For example, the predictor table 102 includes a first plurality of entries 150, the predictor table 104 includes a second plurality of entries 160, the predictor table 106 includes a third plurality of entries 170, and the predictor table 108 includes a fourth plurality of entries 180. According to one implementation, different predictor tables 102-108 may have different sizes. To illustrate, different predictor tables 102-108 may have a different number of entries. As a non-limiting example, the fourth plurality of entries 180 may include more entries than the second plurality of entries 160.
  • The predictor tables 102-108 of the processing system 100 are shown in greater detail in FIG. 2. The active fetch address 110 is provided to each predictor table 102-108 to determine whether a “hit” exists at the predictor tables 102-108. For example, the processing system 100 may determine whether each predictor table 102-108 includes an entry that matches the active fetch address 110. According to the example illustrated in FIG. 2, the active fetch address 110 is “0X80881323”. It should be understood that the active fetch address 110 (and other addresses described herein) is merely for illustrative purposes and should not be construed as limiting.
  • The predictor table 102 includes an entry 152, an entry 154, an entry 156, and an entry 158. According to one implementation, each entry 152-158 may be included in the first plurality of entries 150 of FIG. 1. The entry 152 may include a tag “0X80881323” and may include a way identifier “A”, the entry 154 may include a tag “0X80881636” and may include a way identifier “B”, the entry 156 may include a tag “0X80882399” and may include a way identifier “C”, and the entry 158 may include a tag “0X80883456” and may include a way identifier “D”. According to one implementation, each tag may include a subset of a fetch address hashed together with other information (e.g., a particular number of previously seen fetch addresses). Each tag may include enough information that the remainder of the entry's content can be associated with the fetch address used to look up that entry. Thus, each tag may be used as an identification mechanism for a fetch address. For ease of illustration, the way identifiers are identified by a single capitalized letter.
  • The predictor table 104 includes an entry 162, an entry 164, an entry 166, and an entry 168. According to one implementation, each entry 162-168 may be included in the second plurality of entries 160 of FIG. 1. The entry 162 may include a tag “0X80884635” and may include the way identifier “A”, the entry 164 may include a tag “0X80881323” and may include the way identifier “B”, the entry 166 may include a tag “0X80881493” and may include the way identifier “C”, and the entry 168 may include a tag “0X80889999” and may include the way identifier “D”.
  • The predictor table 106 includes an entry 172, an entry 174, an entry 176, and an entry 178. According to one implementation, each entry 172-178 may be included in the third plurality of entries 170 of FIG. 1. The entry 172 may include a tag “0X80884639” and may include the way identifier “A”, the entry 174 may include a tag “0X80882395” and may include the way identifier “B”, the entry 176 may include a tag “0X80888723” and may include the way identifier “C”, and the entry 178 may include a tag “0X80881321” and may include the way identifier “D”.
  • The predictor table 108 includes an entry 182, an entry 184, an entry 186, and an entry 188. According to one implementation, each entry 182-188 may be included in the fourth plurality of entries 180 of FIG. 1. The entry 182 may include a tag “0X80885245” and may include the way identifier “A”, the entry 184 may include a tag “0X80889823” and may include the way identifier “B”, the entry 186 may include a tag “0X80881323” and may include the way identifier “C”, and the entry 188 may include a tag “0X80888888” and may include the way identifier “D”.
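A tag of the sort described above (a subset of a fetch address hashed together with other information) might be formed as follows; the XOR-fold hash, the function name, and the tag width are purely assumed for illustration, since the patent does not specify a hash:

```python
def make_tag(fetch_addr: int, history: int, tag_bits: int = 16) -> int:
    """Fold a fetch-address subset and a history value into a short
    tag (assumed XOR hash; the disclosure leaves the hash unspecified)."""
    mask = (1 << tag_bits) - 1
    return (fetch_addr ^ history) & mask
```

With no history the tag is just the low-order address bits; mixing in history makes the same address produce different tags in tables that use different history lengths.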
  • A processor (e.g., in the processing system 100 of FIG. 1) may determine that the entry 152 in the predictor table 102 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “A” to the first selection logic 114 as an output tag indicator 103 of the predictor table 102. The processor may also determine that the entry 164 in the predictor table 104 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “B” to the first selection logic 114 as an output tag indicator 105 of the predictor table 104.
  • The processor may determine that there are no entries in the predictor table 106 that match the active fetch address 110. Thus, the processor may not provide a way identifier to the first selection logic 114 as an output tag indicator 107 of the predictor table 106. The processor may determine that the entry 186 in the predictor table 108 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “C” to the first selection logic 114 as an output tag indicator 109 of the predictor table 108.
  • In the illustrative example, each output tag indicator 103, 105, 107, 109 provides a different way identifier to the first selection logic 114. The first selection logic 114 may be configured to select the output tag indicator of the predictor table that has an entry matching the active fetch address 110 and that utilizes the largest amount of historical prediction data (associated with the global history table 112), as explained below. As described above, the output tag indicators 103, 105, 109 correspond to entries 152, 164, 186, respectively, having tags that identify the active fetch address 110. Thus, as explained below, the first selection logic 114 may determine which output tag indicator 103, 105, 109 to select based on the amount of historical prediction data associated with each output tag indicator 103, 105, 109. In a scenario where only one output tag indicator corresponds to an entry having a tag that identifies the active fetch address 110, the first selection logic 114 may select that output tag indicator.
  • Referring back to FIG. 1, the global history table 112 includes (e.g., stores) historical prediction data 113. The historical prediction data 113 includes a history of previous fetch addresses for indirect branches. For example, the historical prediction data 113 may include data to identify fetch addresses for previous indirect branches and way numbers associated with the fetch addresses. Each fetch address in the historical prediction data 113 may be an “abbreviated version” of a fetch address, to reduce overhead. For example, the historical prediction data 113 may store some bits (e.g., a subset) of each previous fetch address as opposed to the entire fetch address. The historical prediction data 113 may include a way number (e.g., a way identifier) in the target table 118 for each previously used fetch address. Thus, instead of 64-bit previously used fetch addresses, the historical prediction data 113 may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address.
  • The processing system 100 may provide the historical prediction data 113 to the predictor table 104, to the predictor table 106, and to the predictor table 108. For example, the processing system 100 may provide a first amount of the historical prediction data 113 to the predictor table 104 with the active fetch address 110 to generate the output tag indicator 105, the processing system 100 may provide a second amount of the historical prediction data 113 (that is greater than the first amount) to the predictor table 106 with the active fetch address 110 to generate the output tag indicator 107, and the processing system 100 may provide a third amount of the historical prediction data 113 (that is greater than the second amount) to the predictor table 108 with the active fetch address 110 to generate the output tag indicator 109.
  • Because the processing system 100 generates the output tag indicator 103 from the predictor table 102 based solely on the active fetch address 110, the output tag indicator 103 may not be as reliable as the output tag indicators 105, 107, 109 that are generated based on increasing amounts of the historical prediction data 113. Furthermore, because the output tag indicator 107 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 105, the output tag indicator 107 may be more reliable than the output tag indicator 105. Similarly, because the output tag indicator 109 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 107, the output tag indicator 109 may be more reliable than the output tag indicator 107.
  • In the example illustrated in FIG. 2, the output tag indicators 103, 105, 109 correspond to entries 152, 164, 186, respectively, having tags that identify the active fetch address 110. Thus, the first selection logic 114 may determine which output tag indicator 103, 105, 109 to select based on the amount of historical prediction data 113 associated with each output tag indicator 103, 105, 109. Because the output tag indicator 109 is associated with more historical prediction data 113 than the other output tag indicators 103, 105, the first selection logic 114 may select the output tag indicator 109 as a selected way pointer 116. The processing system 100 may provide the selected way pointer 116 to the second selection logic 120.
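The behavior of the first selection logic 114 in this example, preferring the hit backed by the most historical prediction data, can be sketched as below; representing each predictor table as a tag-to-way dictionary is an assumption made purely for illustration:

```python
def select_way_pointer(active_tag, tables):
    """tables is ordered from least to most historical prediction
    data used. Return the way identifier from the matching table
    that used the largest amount of history, or None on a total miss."""
    selected = None
    for table in tables:        # later tables use more history
        way = table.get(active_tag)
        if way is not None:
            selected = way      # a longer-history hit wins
    return selected

# The FIG. 2 example: tables 102, 104, 106, 108 in history order.
tables = [
    {"0X80881323": "A"},        # predictor table 102 (no history)
    {"0X80881323": "B"},        # predictor table 104
    {},                         # predictor table 106 (miss)
    {"0X80881323": "C"},        # predictor table 108 (most history)
]
```

Calling select_way_pointer("0X80881323", tables) yields "C", playing the role of the selected way pointer 116.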
  • The target table 118 includes multiple fetch addresses that are separated by sets (e.g., rows) and ways (e.g., columns). In the illustrative example, the target table 118 includes four sets (e.g., “Set 1”, “Set 2”, “Set 3”, and “Set 4”). The target table 118 may also include four ways (e.g., “Way A”, “Way B”, “Way C”, and “Way D”). Although the target table 118 is shown to include four sets and four ways, in other implementations, the target table 118 may include additional (or fewer) ways and sets. As a non-limiting example, the target table 118 may include sixteen sets and thirty-two ways.
  • The processing system 100 may provide the active fetch address 110 to the target table 118. The active fetch address 110 may indicate a particular set of fetch addresses in the target table 118 to be selected. In the illustrative example of FIG. 1, the active fetch address 110 indicates that “Set 3” is where the predicted fetch address 140 is located in the target table 118.
  • Each way in the target table 118 corresponds to a particular way identifier in the predictor tables 102-108. As described with respect to the example in FIG. 2, each entry in the predictor tables 102-108 can include way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D”. The entries that include way identifier “A” are associated with “Way A”, the entries that include way identifier “B” are associated with “Way B”, the entries that include way identifier “C” are associated with “Way C”, and the entries that include way identifier “D” are associated with “Way D”. Because the first selection logic 114 selected the output tag indicator 109 as the selected way pointer 116 and the output tag indicator 109 corresponds to the way identifier “C” (e.g., the way identifier associated with the entry 186), the second selection logic 120 may select “Way C” as the selected way of the predicted fetch address 140.
  • Thus, the second selection logic 120 may select the predicted fetch address 140 in the target table 118 as a fetch address 122 for a target instruction based on the way indicated by the selected way pointer 116 and the set indicated by the active fetch address 110. The fetch address 122 may be used by the processing system to locate the address of the next instruction to be executed (e.g., the target instruction).
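Combining the two selections, set from the active fetch address and way from the selected way pointer, the final lookup can be sketched as follows; the modulo set-index function and the example target address are assumptions, since the disclosure only states that the active fetch address indicates the set:

```python
def predict_fetch_address(target_table, active_fetch_addr, way_pointer,
                          num_sets=4):
    """target_table is a list of per-set dictionaries mapping a way
    identifier to the fetch address stored in that way."""
    set_index = active_fetch_addr % num_sets  # assumed set-index function
    return target_table[set_index][way_pointer]

# Hypothetical target table: the entry in the active address's set,
# way "C", holds a made-up target fetch address.
table = [dict() for _ in range(4)]
table[0x80881323 % 4]["C"] = "0X80890000"  # hypothetical address
```

Here predict_fetch_address(table, 0x80881323, "C") returns the stored address, playing the role of the fetch address 122.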
  • The techniques described with respect to FIGS. 1-2 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores a subset of the bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction. The techniques described with respect to FIGS. 1-2 may also utilize an efficient methodology to determine the way of the predicted fetch address 140 in the target table 118. For example, the techniques may use the predictor tables 102-108 (e.g., the way identifier in the predictor tables 102-108) to determine the selected way of the predicted fetch address 140 in the target table 118.
  • Referring to FIG. 3, a method 300 for predicting a fetch address of a next instruction to be fetched is shown. The method 300 may be performed by the processing system 100 of FIG. 1.
  • The method 300 includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, at 302. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. For example, referring to FIGS. 1-2, the first selection logic 114 may select way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D” as the selected way pointer 116 based on the active fetch address 110 and the historical prediction data 113. The predictor table 102 includes the selected entry 152 having way identifier “A”, the predictor table 104 includes the selected entry 164 having way identifier “B”, the predictor table 106 includes the selected entry 178 having way identifier “D”, and the predictor table 108 includes the selected entry 186 having way identifier “C”.
  • The method 300 also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, at 304. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way may be associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier. For example, referring to FIGS. 1-2, the second selection logic 120 may select the fetch address associated with the entry 186 as the predicted fetch address 140 based on the selected way pointer 116.
  • According to one implementation of the method 300, the first selection logic 114 includes a first multiplexer, and the second selection logic 120 includes a second multiplexer. The method 300 may also include storing the historical prediction data 113 at the global history table 112 that is accessible to the processor (e.g., the processing system 100). The historical prediction data 113 includes one or more fetch addresses for one or more previous indirect branches. The method 300 may also include storing most significant bits of each fetch address of the one or more fetch addresses at the global history table to reduce overhead.
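  • Storing only the most significant bits of each previous fetch address, as described above, can be sketched as a simple shift register. The constants below (MSB_BITS, ADDRESS_BITS, HISTORY_DEPTH) are assumed values for illustration; the disclosure does not specify them.

```python
MSB_BITS = 8       # assumed number of retained most-significant bits
ADDRESS_BITS = 32  # assumed fetch address width
HISTORY_DEPTH = 4  # assumed number of previous indirect branches tracked

def update_history(history, fetch_address):
    """Shift the top MSB_BITS of a newly used fetch address into the global
    history value, discarding the oldest entry once HISTORY_DEPTH
    abbreviated addresses are held."""
    msbs = fetch_address >> (ADDRESS_BITS - MSB_BITS)
    history = ((history << MSB_BITS) | msbs) & ((1 << (MSB_BITS * HISTORY_DEPTH)) - 1)
    return history
```

Because each entry occupies MSB_BITS rather than ADDRESS_BITS, the history register is a fraction of the size a full-address history would require, which is the overhead reduction the text attributes to the global history table 112.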
  • According to one implementation, the method 300 includes generating the first entry based on a first amount of the historical prediction data. For example, the entries 162-168 in the predictor table 104 may be generated based on the first amount of the historical prediction data 113. The method 300 may also include generating the second entry based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data. For example, the entries 172-178 in the predictor table 106 may be generated based on the second amount of the historical prediction data 113 that is greater than the first amount of the historical prediction data 113. According to one implementation, the method 300 includes selecting the second way identifier as the way pointer if the second entry (e.g., the entry generated based on a larger amount of the historical prediction data) matches the active fetch address. The method 300 may also include selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
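  • The priority rule stated in this implementation can be expressed as a short conditional. The sketch below is a minimal software model; the entry representation as (tag, way_id) pairs and the function name are illustrative assumptions.

```python
def select_way_pointer(active_fetch_address, first_entry, second_entry):
    """first_entry was generated from a smaller amount of historical
    prediction data than second_entry; each entry is a (tag, way_id) pair
    or None. Returns the selected way identifier, or None if neither
    entry matches the active fetch address."""
    if second_entry is not None and second_entry[0] == active_fetch_address:
        return second_entry[1]  # larger-history entry matches: it wins
    if first_entry is not None and first_entry[0] == active_fetch_address:
        return first_entry[1]   # fall back to the smaller-history entry
    return None
```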
  • The method 300 of FIG. 3 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction. The method 300 may also efficiently determine the way of the predicted fetch address 140 in the target table 118. For example, the techniques may use the predictor tables 102-108 (e.g., the way identifier in the predictor tables 102-108) to determine the selected way of the predicted fetch address 140 in the target table 118.
  • In particular implementations, the method 300 of FIG. 3 may be implemented via hardware (e.g., a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), etc.) of a processing unit, such as a central processing unit (CPU), a digital signal processor (DSP), or a controller, via a firmware device, or any combination thereof. As an example, the method 300 can be performed by a processor that executes instructions.
  • Referring to FIG. 4, a block diagram of a device 400 is depicted. The device 400 includes a processor 410 (e.g., a central processing unit (CPU), a digital signal processor (DSP), etc.) coupled to a memory 432. The processor 410 may include the processing system 100 of FIG. 1.
  • The memory 432 may be a memory device, such as a random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM). The memory device may include commands (e.g., the commands 460) that, when executed by a computer (e.g., processor 410), may cause the computer to perform the method 300 of FIG. 3.
  • FIG. 4 also shows a display controller 426 that is coupled to the processor 410 and to a display 428. An encoder/decoder (CODEC) 434 may be coupled to the processor 410, as shown. A speaker 436 and a microphone 438 can be coupled to the CODEC 434. FIG. 4 also shows a wireless controller 440 coupled to the processor 410 and to an antenna 442. In a particular implementation, the processor 410, the display controller 426, the memory 432, the CODEC 434, and the wireless controller 440 are included in a system-in-package or system-on-chip device (e.g., a mobile station modem (MSM)) 422. In a particular implementation, an input device 430, such as a touchscreen and/or keypad, and a power supply 444 are coupled to the system-on-chip device 422. Moreover, in a particular implementation, as illustrated in FIG. 4, the display 428, the input device 430, the speaker 436, the microphone 438, the antenna 442, and the power supply 444 are external to the system-on-chip device 422. However, each of the display 428, the input device 430, the speaker 436, the microphone 438, the antenna 442, and the power supply 444 can be coupled to a component of the system-on-chip device 422, such as an interface or a controller.
  • In conjunction with the described implementations, an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data. For example, the means for storing data may include a memory system component (e.g., components storing the tables) of the processing system 100 of FIG. 1, one or more other devices, circuits, modules, or instructions to store data, or any combination thereof. The means for storing data may include a plurality of predictor tables and a target table. The plurality of predictor tables may include a first predictor table and a second predictor table. The first predictor table may include a first entry having a first way identifier, and the second predictor table may include a second entry having a second way identifier. The target table may include a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way may be associated with an active fetch address.
  • The apparatus may also include means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. For example, the means for selecting the first way identifier or the second way identifier may include the first selection logic 114 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first way identifier or the second way identifier, or any combination thereof.
  • The apparatus may also include means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer. For example, the means for selecting the first fetch address or the second fetch address may include the second selection logic 120 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first fetch address or the second fetch address, or any combination thereof.
  • The foregoing disclosed devices and functionalities may be designed and configured into computer files (e.g., RTL, GDSII, GERBER, etc.) stored on computer readable media. Some or all such files may be provided to fabrication handlers who fabricate devices based on such files. Resulting products include semiconductor wafers that are then cut into semiconductor die and packaged into a semiconductor chip. The chips are then employed in devices, such as a communications device (e.g., a mobile phone), a tablet, a laptop, a personal digital assistant (PDA), a set top box, a music player, a video player, an entertainment unit, a navigation device, a fixed location data unit, a server, or a computer.
  • Those of skill would further appreciate that the various illustrative logical blocks, configurations, modules, circuits, and algorithm steps described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software executed by a processing device such as a hardware processor, or combinations of both. Various illustrative components, blocks, configurations, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or executable software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
  • The steps of a method or algorithm described in connection with the implementations disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in a memory device, such as random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM). An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device. In the alternative, the memory device may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or a user terminal.
  • The previous description of the disclosed implementations is provided to enable a person skilled in the art to make or use the disclosed implementations. Various modifications to these implementations will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other implementations without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the implementations shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.

Claims (30)

What is claimed is:
1. An apparatus for predicting a fetch address of a next instruction to be fetched, the apparatus comprising:
a memory system storing:
a plurality of predictor tables including a first predictor table and a second predictor table, the first predictor table including a first entry having a first way identifier and the second predictor table including a second entry having a second way identifier; and
a target table comprising:
a first way storing a first fetch address associated with the first way identifier; and
a second way storing a second fetch address associated with the second way identifier, the first way and the second way associated with an active fetch address;
first selection logic coupled to select the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data; and
second selection logic configured to select the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
2. The apparatus of claim 1, wherein the first selection logic comprises a first multiplexer, and wherein the second selection logic comprises a second multiplexer.
3. The apparatus of claim 1, wherein the memory system further comprises a global history table storing the historical prediction data.
4. The apparatus of claim 3, wherein the historical prediction data comprises one or more fetch addresses for one or more previous indirect branches.
5. The apparatus of claim 4, wherein the global history table stores at least a portion of bits of each fetch address of the one or more fetch addresses or a hashed version of the portion of bits.
6. The apparatus of claim 1, wherein the first entry is generated based on a first amount of the historical prediction data, and wherein the second entry is generated based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data.
7. The apparatus of claim 6, wherein the first selection logic selects the second way identifier as the way pointer if the second entry matches the active fetch address.
8. The apparatus of claim 6, wherein the first selection logic selects the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
9. The apparatus of claim 1, wherein the first predictor table includes a first number of entries, and wherein the second predictor table includes a second number of entries that is different than the first number of entries.
10. The apparatus of claim 1, wherein each way in the target table corresponds to a way identifier in the plurality of predictor tables.
11. A method for predicting a fetch address of a next instruction to be fetched, the method comprising:
selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, wherein a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier; and
selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, wherein a target table includes a first way storing the first fetch address and a second way storing the second fetch address, the first way and the second way associated with the active fetch address;
wherein the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
12. The method of claim 11, wherein a first multiplexer of the processor selects the first way identifier or the second way identifier, and wherein a second multiplexer of the processor selects the first fetch address or the second fetch address.
13. The method of claim 11, further comprising storing the historical prediction data at a global history table accessible to the processor.
14. The method of claim 13, wherein the historical prediction data comprises one or more fetch addresses for one or more previous indirect branches.
15. The method of claim 14, further comprising storing most significant bits of each fetch address of the one or more fetch addresses at the global history table.
16. The method of claim 11, further comprising:
generating the first entry based on a first amount of the historical prediction data; and
generating the second entry based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data.
17. The method of claim 16, further comprising selecting the second way identifier as the way pointer if the second entry matches the active fetch address.
18. The method of claim 16, further comprising selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
19. The method of claim 11, wherein the first predictor table includes a first number of entries, and wherein the second predictor table includes a second number of entries that is different than the first number of entries.
20. The method of claim 11, wherein each way in the target table corresponds to a way identifier in the plurality of predictor tables.
21. A non-transitory computer-readable medium comprising commands for predicting a fetch address of a next instruction to be fetched, the commands, when executed by a processor, cause the processor to perform operations comprising:
selecting a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, wherein a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier; and
selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, wherein a target table includes a first way storing the first fetch address and a second way storing the second fetch address, the first way and the second way associated with the active fetch address;
wherein the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
22. The non-transitory computer-readable medium of claim 21, wherein the operations further comprise storing the historical prediction data at a global history table accessible to the processor.
23. The non-transitory computer-readable medium of claim 22, wherein the historical prediction data comprises one or more fetch addresses for one or more previous indirect branches.
24. The non-transitory computer-readable medium of claim 23, wherein the operations further comprise storing most significant bits of each fetch address of the one or more fetch addresses at the global history table.
25. The non-transitory computer-readable medium of claim 21, wherein the operations further comprise:
generating the first entry based on a first amount of the historical prediction data; and
generating the second entry based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data.
26. The non-transitory computer-readable medium of claim 25, wherein the operations further comprise selecting the second way identifier as the way pointer if the second entry matches the active fetch address.
27. The non-transitory computer-readable medium of claim 25, wherein the operations further comprise selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
28. An apparatus for predicting a fetch address of a next instruction to be fetched, the apparatus comprising:
means for storing data comprising:
a plurality of predictor tables including a first predictor table and a second predictor table, the first predictor table including a first entry having a first way identifier and the second predictor table including a second entry having a second way identifier; and
a target table comprising:
a first way storing a first fetch address associated with the first way identifier; and
a second way storing a second fetch address associated with the second way identifier, the first way and the second way associated with an active fetch address;
means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data; and
means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
29. The apparatus of claim 28, further comprising means for storing the historical prediction data.
30. The apparatus of claim 28, wherein the means for selecting the first way identifier or the second way identifier comprises a first multiplexer, and wherein the means for selecting the first fetch address or the second fetch address comprises a second multiplexer.
US15/192,794 2016-06-24 2016-06-24 Branch target predictor Abandoned US20170371669A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US15/192,794 US20170371669A1 (en) 2016-06-24 2016-06-24 Branch target predictor
PCT/US2017/029452 WO2017222635A1 (en) 2016-06-24 2017-04-25 Branch target predictor
CN201780033792.4A CN109219798A (en) 2016-06-24 2017-04-25 Branch target prediction device
EP17721035.8A EP3475811A1 (en) 2016-06-24 2017-04-25 Branch target predictor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US15/192,794 US20170371669A1 (en) 2016-06-24 2016-06-24 Branch target predictor

Publications (1)

Publication Number Publication Date
US20170371669A1 true US20170371669A1 (en) 2017-12-28

Family

ID=58664897

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/192,794 Abandoned US20170371669A1 (en) 2016-06-24 2016-06-24 Branch target predictor

Country Status (4)

Country Link
US (1) US20170371669A1 (en)
EP (1) EP3475811A1 (en)
CN (1) CN109219798A (en)
WO (1) WO2017222635A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10956161B2 (en) 2017-04-27 2021-03-23 International Business Machines Corporation Indirect target tagged geometric branch prediction using a set of target address pattern data

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7069426B1 (en) * 2000-03-28 2006-06-27 Intel Corporation Branch predictor with saturating counter and local branch history table with algorithm for updating replacement and history fields of matching table entries
US7165169B2 (en) * 2001-05-04 2007-01-16 Ip-First, Llc Speculative branch target address cache with selective override by secondary predictor based on branch instruction type
US7707397B2 (en) * 2001-05-04 2010-04-27 Via Technologies, Inc. Variable group associativity branch target address cache delivering multiple target addresses per cache line
US20060218385A1 (en) * 2005-03-23 2006-09-28 Smith Rodney W Branch target address cache storing two or more branch target addresses per index
US8935517B2 (en) * 2006-06-29 2015-01-13 Qualcomm Incorporated System and method for selectively managing a branch target address cache of a multiple-stage predictor
US20080209190A1 (en) * 2007-02-28 2008-08-28 Advanced Micro Devices, Inc. Parallel prediction of multiple branches
US7844807B2 (en) * 2008-02-01 2010-11-30 International Business Machines Corporation Branch target address cache storing direct predictions
CN101819523B (en) * 2009-03-04 2014-04-02 威盛电子股份有限公司 Microprocessor and related instruction execution method
US20130346727A1 (en) * 2012-06-25 2013-12-26 Qualcomm Incorporated Methods and Apparatus to Extend Software Branch Target Hints
GB2506462B (en) * 2013-03-13 2014-08-13 Imagination Tech Ltd Indirect branch prediction
US9983878B2 (en) * 2014-05-15 2018-05-29 International Business Machines Corporation Branch prediction using multiple versions of history data


Also Published As

Publication number Publication date
EP3475811A1 (en) 2019-05-01
WO2017222635A1 (en) 2017-12-28
CN109219798A (en) 2019-01-15

Similar Documents

Publication Publication Date Title
JP6744423B2 (en) Implementation of load address prediction using address prediction table based on load path history in processor-based system
US10831491B2 (en) Selective access to partitioned branch transfer buffer (BTB) content
US9201658B2 (en) Branch predictor for wide issue, arbitrarily aligned fetch that can cross cache line boundaries
EP2423821A2 (en) Processor, apparatus, and method for fetching instructions and configurations from a shared cache
US10901484B2 (en) Fetch predition circuit for reducing power consumption in a processor
US9311098B2 (en) Mechanism for reducing cache power consumption using cache way prediction
US9804969B2 (en) Speculative addressing using a virtual address-to-physical address page crossing buffer
US9367468B2 (en) Data cache way prediction
KR20180058797A (en) Method and apparatus for cache line deduplication through data matching
EP2972898A1 (en) Externally programmable memory management unit
EP2962187A2 (en) Vector register addressing and functions based on a scalar register data value
WO2017030678A1 (en) Determining prefetch instructions based on instruction encoding
US20140201494A1 (en) Overlap checking for a translation lookaside buffer (tlb)
US20180081686A1 (en) Providing memory dependence prediction in block-atomic dataflow architectures
CN107533513B (en) Burst translation look-aside buffer
US20170371669A1 (en) Branch target predictor
WO2021061269A1 (en) Storage control apparatus, processing apparatus, computer system, and storage control method
TW202036284A (en) Branch prediction based on load-path history
US10437592B2 (en) Reduced logic level operation folding of context history in a history register in a prediction system for a processor-based system
EP2856304B1 (en) Issuing instructions to execution pipelines based on register-associated preferences, and related instruction processing circuits, processor systems, methods, and computer-readable media
US20170046266A1 (en) Way Mispredict Mitigation on a Way Predicted Cache
US20140181405A1 (en) Instruction cache having a multi-bit way prediction mask
US10162752B2 (en) Data storage at contiguous memory addresses
US8850109B2 (en) Content addressable memory data clustering block architecture
US20230393853A1 (en) Selectively updating branch predictors for loops executed from loop buffers in a processor

Legal Events

Date Code Title Description
AS Assignment

Owner name: QUALCOMM INCORPORATED, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KRISHNA, ANIL;WRIGHT, GREGORY;REEL/FRAME:039925/0478

Effective date: 20160914

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION