US20170371669A1 - Branch target predictor - Google Patents
Branch target predictor
- Publication number
- US20170371669A1 (application US15/192,794)
- Authority
- US
- United States
- Prior art keywords
- way
- fetch address
- identifier
- entry
- prediction data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3806—Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30058—Conditional branch instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30061—Multi-way branch instructions, e.g. CASE
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3844—Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables
Definitions
- the present disclosure is generally related to a branch target predictor.
- examples of computing devices include wireless computing devices, such as portable wireless telephones, personal digital assistants (PDAs), and paging devices that are small, lightweight, and easily carried by users, as well as laptop and desktop computers and servers.
- a computing device may include a processor that is operable to execute different instructions in an instruction set (e.g., a program).
- the instruction set may include direct branches and indirect branches.
- An indirect branch may specify the fetch address of the next instruction to be executed from an instruction memory.
- the next instruction may be indirectly fetched because the instruction address is resident in some other storage element (e.g., a processor register).
- the indirect branch may not embed the offset to the address of the target instruction within one of the instruction fields in the branch instruction.
- Non-limiting examples of an indirect branch include a computed jump, an indirect jump, and a register-indirect jump.
- the processor may predict the fetch address. To predict the fetch address, the processor may use multiple predictor tables, where each predictor table includes multiple prediction entries, and where each prediction entry stores a fetch address.
- because each prediction entry stores an entire fetch address and multiple predictor tables may include similar entries, in certain scenarios, there may be a relatively large amount of overhead at each predictor table.
- to illustrate, some prediction entries in a predictor table may not be used by an application, multiple predictor tables may include identical predictor entries (e.g., target duplication), and the number of predictor table entries may not be capable of adjustment independently from the number of target instructions.
- the processor may also utilize a stored global history from past indirect branches to predict the fetch address. For example, the processor may predict the fetch address based on predicted fetch addresses for the previous ten indirect branches to provide context. Each fetch address stored in the global history may utilize approximately ten bits of storage. For example, twenty previously predicted fetch addresses stored in the global history may utilize approximately two-hundred bits of storage. Thus, a relatively large amount of storage may be used for the global history.
- an apparatus for predicting a fetch address of a next instruction to be fetched includes a memory system, first selection logic, and second selection logic.
- the memory system includes a plurality of predictor tables and a target table.
- the plurality of predictor tables includes a first predictor table and a second predictor table.
- the first predictor table includes a first entry having a first way identifier
- the second predictor table includes a second entry having a second way identifier.
- the target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
- the first way and the second way are associated with an active fetch address.
- the first way identifier and the second way identifier may “point” to the same way.
- the first way identifier and the second way identifier may point to different ways.
- the first selection logic is coupled to select the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
- the second selection logic is configured to select the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
- the historical prediction data may include an “abbreviated version” of the previously used fetch addresses (e.g., some bits of previously used fetch addresses) as opposed to the entire fetch addresses, data associated with way identifiers of the previously used fetch addresses, or a combination of both.
- the most significant bits of a fetch address may not substantially change from one fetch address to another fetch address.
- lower order bits of the fetch address, or a hash of the fetch address, may therefore be stored instead of the entire fetch address.
- the historical prediction data may include a way number (e.g., a way identifier) in the target table for each previously used fetch address.
- the historical prediction data may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address. This reduction in bits may reduce the overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
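The storage reduction described above can be checked with simple arithmetic. The figures below are the illustrative bit widths given in the text (roughly ten bits per full fetch address, three to five address bits and two to three way bits per abbreviated entry); they are not values fixed by the design:

```python
HISTORY_DEPTH = 20  # previously used fetch addresses kept in the history

# Conventional global history: approximately ten bits per full fetch address.
conventional_bits = HISTORY_DEPTH * 10

# Abbreviated history: a few address bits plus a small way identifier.
ADDR_BITS = 4   # within the "three to five bits" range from the text
WAY_BITS = 2    # within the "two to three bits" range from the text
abbreviated_bits = HISTORY_DEPTH * (ADDR_BITS + WAY_BITS)

print(conventional_bits, abbreviated_bits)  # 200 versus 120 bits of storage
```

Even with these coarse assumptions, the abbreviated form stores well under the two-hundred bits that the conventional history would use for the same depth.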
- a method for predicting a fetch address of a next instruction to be fetched includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data.
- a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
- the method also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer.
- a target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address.
- the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
- a non-transitory computer-readable medium includes commands for predicting a fetch address of a next instruction to be fetched.
- the commands, when executed by a processor, cause the processor to perform operations including selecting a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data.
- a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
- the operations also include selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer.
- a target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address.
- the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
- an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data.
- the means for storing data includes a plurality of predictor tables and a target table.
- the plurality of predictor tables includes a first predictor table and a second predictor table.
- the first predictor table includes a first entry having a first way identifier
- the second predictor table includes a second entry having a second way identifier.
- the target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
- the first way and the second way are associated with an active fetch address.
- the apparatus also includes means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
- the apparatus also includes means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
- FIG. 1 is a processing system that is operable to predict a fetch address of a target instruction;
- FIG. 2 depicts predictor tables included in the processing system of FIG. 1 ;
- FIG. 3 is a method for predicting a fetch address of a target instruction; and
- FIG. 4 is a block diagram of a device that includes the processing system of FIG. 1 .
- a processing system 100 that is operable to predict a fetch address of a target instruction is shown.
- a fetch address corresponds to a location in memory where an address for the target instruction (e.g., the next instruction to be executed) is stored.
- the processing system 100 may also be referred to as a “memory system.”
- the processing system 100 may predict the fetch address of the target instruction based on an active fetch address 110 .
- the active fetch address 110 may be based on a current program counter (PC) value.
- the processing system 100 includes a plurality of predictor tables, a global history table 112 , first selection logic 114 , a target table 118 , and second selection logic 120 .
- the first selection logic 114 includes a first multiplexer and the second selection logic 120 includes a second multiplexer.
- the plurality of predictor tables includes a predictor table 102 , a predictor table 104 , a predictor table 106 , and a predictor table 108 . Although four predictor tables 102 - 108 are shown, in other implementations, the processing system 100 may include additional (or fewer) predictor tables. As a non-limiting example, the processing system 100 may include eight predictor tables in another implementation.
- Each predictor table 102 - 108 includes multiple entries that identify different fetch addresses.
- the predictor table 102 includes a first plurality of entries 150
- the predictor table 104 includes a second plurality of entries 160
- the predictor table 106 includes a third plurality of entries 170
- the predictor table 108 includes a fourth plurality of entries 180 .
- different predictor tables 102 - 108 may have different sizes.
- different predictor tables 102 - 108 may have a different number of entries.
- the fourth plurality of entries 180 may include more entries than the second plurality of entries 160 .
- the predictor tables 102 - 108 of the processing system 100 are shown in greater detail in FIG. 2 .
- the active fetch address 110 is provided to each predictor table 102 - 108 to determine whether a “hit” exists at the predictor tables 102 - 108 .
- the processing system 100 may determine whether each predictor table 102 - 108 includes an entry that matches the active fetch address 110 .
- the active fetch address 110 is “0X80881323”. It should be understood that the active fetch address 110 (and other addresses described herein) is merely for illustrative purposes and should not be construed as limiting.
- the predictor table 102 includes an entry 152 , an entry 154 , an entry 156 , and an entry 158 .
- each entry 152 - 158 may be included in the first plurality of entries 150 of FIG. 1 .
- the entry 152 may include a tag “0X80881323” and may include a way identifier “A”
- the entry 154 may include a tag “0X80881636” and may include a way identifier “B”
- the entry 156 may include a tag “0X80882399” and may include a way identifier “C”
- the entry 158 may include a tag “0X80883456” and may include a way identifier “D”.
- each tag may include a subset of a fetch address hashed together with other information (e.g., a particular number of previously seen fetch addresses).
- Each tag may include enough information that the remainder of the entry's content can be associated with the fetch address used to look up that entry.
- each tag may be used as an identification mechanism for a fetch address. For ease of illustration, the way identifiers are identified by a single capitalized letter.
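One way to picture such a tag is as a hash that folds a subset of the fetch address together with abbreviated values of previously seen fetch addresses. The XOR fold and the twelve-bit tag width below are assumptions for illustration; the patent does not specify the hash function:

```python
def make_tag(fetch_address: int, history: list[int], tag_bits: int = 12) -> int:
    """Hypothetical tag function: a subset of the fetch address hashed
    together with abbreviated previously seen fetch addresses."""
    mask = (1 << tag_bits) - 1
    tag = fetch_address & mask      # subset of the fetch address
    for h in history:
        tag ^= h & mask             # mix in abbreviated history bits
    return tag

# With no history, the tag is simply the low twelve bits of the address.
print(hex(make_tag(0x80881323, [])))  # 0x323
```

A table indexed with more history entries hashes more values into its tag, which is why its hits carry more context than a hit in a table indexed with little or no history.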
- the predictor table 104 includes an entry 162 , an entry 164 , an entry 166 , and an entry 168 .
- each entry 162 - 168 may be included in the second plurality of entries 160 of FIG. 1 .
- the entry 162 may include a tag “0X80884635” and may include the way identifier “A”
- the entry 164 may include a tag “0X80881323” and may include the way identifier “B”
- the entry 166 may include a tag “0X80881493” and may include the way identifier “C”
- the entry 168 may include a tag “0X80889999” and may include the way identifier “D”.
- the predictor table 106 includes an entry 172 , an entry 174 , an entry 176 , and an entry 178 .
- each entry 172 - 178 may be included in the third plurality of entries 170 of FIG. 1 .
- the entry 172 may include a tag “0X80884639” and may include the way identifier “A”
- the entry 174 may include a tag “0X80882395” and may include the way identifier “B”
- the entry 176 may include a tag “0X80888723” and may include the way identifier “C”
- the entry 178 may include a tag “0X80881321” and may include the way identifier “D”.
- the predictor table 108 includes an entry 182 , an entry 184 , an entry 186 , and an entry 188 .
- each entry 182 - 188 may be included in the fourth plurality of entries 180 of FIG. 1 .
- the entry 182 may include a tag “0X80885245” and may include the way identifier “A”, the entry 184 may include a tag
- the entry 186 may include a tag “0X80881323” and may include the way identifier “C”
- the entry 188 may include a tag “0X80888888” and may include the way identifier “D”.
- a processor may determine that the entry 152 in the predictor table 102 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “A” to the first selection logic 114 as an output tag indicator 103 of the predictor table 102 . The processor may also determine that the entry 164 in the predictor table 104 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “B” to the first selection logic 114 as an output tag indicator 105 of the predictor table 104 .
- the processor may determine that there are no entries in the predictor table 106 that match the active fetch address 110 . Thus, the processor may not provide a way identifier to the first selection logic 114 as an output tag indicator 107 of the predictor table 106 . The processor may determine that the entry 186 in the predictor table 108 matches the active fetch address 110 . Based on this determination, the processor may provide the way identifier “C” to the first selection logic 114 as an output tag indicator 109 of the predictor table 108 .
- each output tag indicator 103 , 105 , 107 , 109 provides a different way identifier to the first selection logic 114 .
- the first selection logic 114 may be configured to select the output tag indicator of the predictor table that has an entry matching the active fetch address 110 and that utilizes a largest amount of historical prediction data (associated with the global history table 112 ), as explained below.
- the output tag indicators 103 , 105 , 109 correspond to entries 152 , 164 , 186 , respectively, having tags that identify the active fetch address 110 .
- the first selection logic 114 may determine which output tag indicator 103 , 105 , 109 to select based on the amount of historical prediction data associated with each output tag indicator 103 , 105 , 109 . In a scenario where only one output tag indicator corresponds to an entry having a tag that identifies the active fetch address 110 , the first selection logic 114 may select that output tag indicator.
- the global history table 112 includes (e.g., stores) historical prediction data 113 .
- the historical prediction data 113 includes a history of previous fetch addresses for indirect branches.
- the historical prediction data 113 may include data to identify fetch addresses for previous indirect branches and way numbers associated with the fetch addresses.
- Each fetch address in the historical prediction data 113 may be an “abbreviated version” of a fetch address, to reduce overhead.
- the historical prediction data 113 may store some bits (e.g., a subset) of each previous fetch address as opposed to the entire fetch address.
- the historical prediction data 113 may include a way number (e.g., a way identifier) in the target table 118 for each previously used fetch address.
- the historical prediction data 113 may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address.
- the processing system 100 may provide the historical prediction data 113 to the predictor table 104 , to the predictor table 106 , and to the predictor table 108 .
- the processing system 100 may provide a first amount of the historical prediction data 113 to the predictor table 104 with the active fetch address 110 to generate the output tag indicator 105
- the processing system 100 may provide a second amount of the historical prediction data 113 (that is greater than the first amount) to the predictor table 106 with the active fetch address 110 to generate the output tag indicator 107
- the processing system 100 may provide a third amount of the historical prediction data 113 (that is greater than the second amount) to the predictor table 108 with the active fetch address 110 to generate the output tag indicator 109 .
- because the output tag indicator 103 is generated without the historical prediction data 113 , the output tag indicator 103 may not be as reliable as the output tag indicators 105 , 107 , 109 that are generated based on increasing amounts of the historical prediction data 113 . Furthermore, because the output tag indicator 107 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 105 , the output tag indicator 107 may be more reliable than the output tag indicator 105 . Similarly, because the output tag indicator 109 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 107 , the output tag indicator 109 may be more reliable than the output tag indicator 107 .
- the output tag indicators 103 , 105 , 109 correspond to entries 152 , 164 , 186 , respectively, having tags that identify the active fetch address 110 .
- the first selection logic 114 may determine which output tag indicator 103 , 105 , 109 to select based on the amount of historical prediction data 113 associated with each output tag indicator 103 , 105 , 109 . Because the output tag indicator 109 is associated with more historical prediction data 113 than the other output tag indicators 103 , 105 , the first selection logic 114 may select the output tag indicator 109 as a selected way pointer 116 .
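The behavior of the first selection logic 114 can be sketched as a priority choice among the tag hits, preferring the matching table that used the most historical prediction data. The tuple encoding below is a hypothetical representation for illustration, not the hardware interface:

```python
def select_way_pointer(tag_hits):
    """tag_hits: one (history_amount, way_identifier) pair per predictor
    table; way_identifier is None when the table had no matching entry.
    Returns the way identifier from the hit that used the most history."""
    best_amount, best_way = -1, None
    for history_amount, way_id in tag_hits:
        if way_id is not None and history_amount > best_amount:
            best_amount, best_way = history_amount, way_id
    return best_way

# FIG. 1 example: tables 102/104/108 hit with "A"/"B"/"C"; table 106 misses.
print(select_way_pointer([(0, "A"), (1, "B"), (2, None), (3, "C")]))  # C
```

With only one hit, that hit's way identifier is returned; with no hits, no way pointer is produced, matching the scenarios described above.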
- the processing system 100 may provide the selected way pointer 116 to the second selection logic 120 .
- the target table 118 includes multiple fetch addresses that are separated by sets (e.g., rows) and ways (e.g., columns).
- the target table 118 includes four sets (e.g., “Set 1 ”, “Set 2 ”, “Set 3 ”, and “Set 4 ”).
- the target table 118 may also include four ways (e.g., “Way A”, “Way B”, “Way C”, and “Way D”).
- although the target table 118 is shown as including four sets and four ways, in other implementations, the target table 118 may include additional (or fewer) ways and sets.
- as a non-limiting example, the target table 118 may include sixteen sets and thirty-two ways.
- the processing system 100 may provide the active fetch address 110 to the target table 118 .
- the active fetch address 110 may indicate a particular set of fetch addresses in the target table 118 to be selected. In the illustrative example of FIG. 1 , the active fetch address 110 indicates that “Set 3 ” is where the predicted fetch address 140 is located in the target table 118 .
- Each way in the target table 118 corresponds to a particular way identifier in the predictor tables 102 - 108 .
- each entry in the predictor tables 102 - 108 can include way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D”.
- the entries that include way identifier “A” are associated with “Way A”, the entries that include way identifier “B” are associated with “Way B”, the entries that include way identifier “C” are associated with “Way C”, and the entries that include way identifier “D” are associated with “Way D”.
- based on the selected way pointer 116 , the second selection logic 120 may select “Way C” as the selected way of the predicted fetch address 140 .
- the second selection logic 120 may select the predicted fetch address 140 in the target table 118 as a fetch address 122 for a target instruction based on the way indicated by the selected way pointer 116 and the set indicated by the active fetch address 110 .
- the fetch address 122 may be used by the processing system to locate the address of the next instruction to be executed (e.g., the target instruction).
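Putting the two selections together, the second selection logic 120 indexes the target table by the set indicated by the active fetch address and the way named by the selected way pointer. Deriving the set index from low-order address bits is an assumption here; the patent only states that the active fetch address indicates the set:

```python
def predict_fetch_address(target_table, active_fetch_address, way_pointer,
                          num_sets=4):
    set_index = active_fetch_address % num_sets  # assumed set-index function
    return target_table[set_index][way_pointer]  # way chosen by the pointer

# Hypothetical 4-set x 4-way target table of fetch addresses.
target_table = {
    s: {w: 0x80880000 + (s << 8) + (ord(w) - ord("A")) for w in "ABCD"}
    for s in range(4)
}
addr = predict_fetch_address(target_table, 0x80881323, "C")
```

Because the fetch addresses live only in this one table, the predictor tables need to store just small way identifiers, which is the source of the overhead reduction discussed next.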
- the techniques described with respect to FIGS. 1-2 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
- the techniques described with respect to FIGS. 1-2 may also utilize an efficient methodology to determine the way of the predicted fetch address 140 in the target table 118 .
- the techniques may use the predictor tables 102 - 108 (e.g., the way identifier in the predictor tables 102 - 108 ) to determine the selected way of the predicted fetch address 140 in the target table 118 .
- a method 300 for predicting a fetch address of a next instruction to be fetched is shown.
- the method 300 may be performed by the processing system 100 of FIG. 1 .
- the method 300 includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, at 302 .
- a first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier.
- the first selection logic 114 may select way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D” as the selected way pointer 116 based on the active fetch address 110 and the historical prediction data 113 .
- the predictor table 102 includes the selected entry 152 having way identifier “A”, the predictor table 104 includes the selected entry 164 having way identifier “B”, the predictor table 106 includes the selected entry 178 having way identifier “D”, and the predictor table 108 includes the selected entry 186 having way identifier “C”.
- the method 300 also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, at 304 .
- a target table includes a first way storing the first fetch address and a second way storing the second fetch address.
- the first way and the second way may be associated with the active fetch address.
- the first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
- the second selection logic 120 may select the fetch address associated with the entry 186 as the predicted fetch address 140 based on the selected way pointer 116 .
- the first selection logic 114 includes a first multiplexer
- the second selection logic 120 includes a second multiplexer.
- the method 300 may also include storing the historical prediction data 113 at the global history table 112 that is accessible to the processor (e.g., the processing system 100 ).
- the historical prediction data 113 includes one or more fetch addresses for one or more previous indirect branches.
- the method 300 may also include storing most significant bits of each fetch address of the one or more fetch addresses at the global history table to reduce overhead.
- the method 300 includes generating the first entry based on a first amount of the historical prediction data.
- the entries 162 - 168 in the predictor table 104 may be generated based on the first amount of the historical prediction data 113 .
- the method 300 may also include generating the second entry based on a second amount of the historical prediction data that is greater than the first amount of the historical prediction data.
- the entries 172 - 178 in the predictor table 106 may be generated based on the second amount of the historical prediction data 113 that is greater than the first amount of the historical prediction data 113 .
- the method 300 includes selecting the second way identifier as the way pointer if the second entry (e.g., the entry generated based on a larger amount of the historical prediction data) matches the active fetch address.
- the method 300 may also include selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address.
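For two tables, the selection rule in the two steps above reduces to a simple priority check, since the second entry was generated with more historical prediction data. A minimal sketch, with placeholder way names:

```python
def choose_way(first_hit, second_hit, first_way="A", second_way="B"):
    """Prefer the second entry (generated with more historical prediction
    data) when it matches the active fetch address; otherwise fall back to
    the first entry. The way names here are placeholders."""
    if second_hit:
        return second_way
    if first_hit:
        return first_way
    return None

print(choose_way(first_hit=True, second_hit=True))   # B: second entry wins
print(choose_way(first_hit=True, second_hit=False))  # A: fallback
```

Extending this priority chain across all of the predictor tables yields the longest-history-wins behavior of the first selection logic 114.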
- the method 300 of FIG. 3 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
- the method 300 may also efficiently determine the way of the predicted fetch address 140 in the target table 118 .
- the techniques may use the predictor tables 102 - 108 (e.g., the way identifier in the predictor tables 102 - 108 ) to determine the selected way of the predicted fetch address 140 in the target table 118 .
- the method 300 of FIG. 3 may be implemented via hardware (e.g., a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), etc.) of a processing unit, such as a central processing unit (CPU), a digital signal processor (DSP), or a controller, via a firmware device, or any combination thereof.
- the method 300 can be performed by a processor that executes instructions.
- the device 400 includes a processor 410 (e.g., a central processing unit (CPU), a digital signal processor (DSP), etc.) coupled to a memory 432 .
- the processor 410 may include the processing system 100 of FIG. 1 .
- the memory 432 may be a memory device, such as a random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
- the memory device may include commands (e.g., the commands 460 ) that, when executed by a computer (e.g., processor 410 ), may cause the computer to perform the method 300 of FIG. 3 .
- FIG. 4 also shows a display controller 426 that is coupled to the processor 410 and to a display 428 .
- An encoder/decoder (CODEC) 434 may be coupled to the processor 410 , as shown.
- a speaker 436 and a microphone 438 can be coupled to the CODEC 434 .
- FIG. 4 also shows a wireless controller 440 coupled to the processor 410 and to an antenna 442 .
- the processor 410 , the display controller 426 , the memory 432 , the CODEC 434 , and the wireless controller 440 are included in a system-in-package or system-on-chip device (e.g., a mobile station modem (MSM)) 422 .
- an input device 430 such as a touchscreen and/or keypad, and a power supply 444 are coupled to the system-on-chip device 422 .
- the display 428 , the input device 430 , the speaker 436 , the microphone 438 , the antenna 442 , and the power supply 444 are external to the system-on-chip device 422 .
- each of the display 428 , the input device 430 , the speaker 436 , the microphone 438 , the antenna 442 , and the power supply 444 can be coupled to a component of the system-on-chip device 422 , such as an interface or a controller.
- an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data.
- the means for storing data may include a memory system component (e.g., components storing the tables) of the processing system 100 of FIG. 1 , one or more other devices, circuits, modules, or instructions to store data, or any combination thereof.
- the means for storing data may include a plurality of predictor tables and a target table.
- the plurality of predictor tables may include a first predictor table and a second predictor table.
- the first predictor table may include a first entry having a first way identifier
- the second predictor table may include a second entry having a second way identifier.
- the target table may include a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier.
- the first way and the second way may be associated with an active address.
- the apparatus may also include means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data.
- the means for selecting the first way identifier or the second way identifier may include the first selection logic 114 of FIG. 1 , one or more other devices, circuits, modules, or instructions to select the first way identifier or the second way identifier, or any combination thereof
- the apparatus may also include means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
- the means for selecting the first fetch address or the second fetch address may include the second selection logic 120 of FIG. 1 , one or more other devices, circuits, modules, or instructions to select the first fetch address or the second fetch address, or any combination thereof
- the foregoing disclosed devices and functionalities may be designed and configured into computer files (e.g. RTL, GDSII, GERBER, etc.) stored on computer readable media. Some or all such files may be provided to fabrication handlers who fabricate devices based on such files. Resulting products include semiconductor wafers that are then cut into semiconductor die and packaged into a semiconductor chip. The chips are then employed in devices, such as a communications device (e.g., a mobile phone), a tablet, a laptop, a personal digital assistant (PDA), a set top box, a music player, a video player, an entertainment unit, a navigation device, a fixed location data unit, a server, or a computer.
- a software module may reside in a memory device, such as random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
- An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device.
- the memory device may be integral to the processor.
- the processor and the storage medium may reside in an application-specific integrated circuit (ASIC).
- the ASIC may reside in a computing device or a user terminal.
- the processor and the storage medium may reside as discrete components in a computing device or a user terminal.
Abstract
Description
- The present disclosure is generally related to a branch target predictor.
- Advances in technology have resulted in more powerful computing devices. For example, there currently exists a variety of computing devices, including wireless computing devices, such as portable wireless telephones, personal digital assistants (PDAs), and paging devices that are small, lightweight, and easily carried by users, laptop and desktop computers, and servers.
- A computing device may include a processor that is operable to execute different instructions in an instruction set (e.g., a program). The instruction set may include direct branches and indirect branches. An indirect branch may specify the fetch address of the next instruction to be executed from an instruction memory. The next instruction may be indirectly fetched because the instruction address is resident in some other storage element (e.g., a processor register). Thus, the indirect branch may not embed the offset to the address of the target instruction within one of the instruction fields in the branch instruction. Non-limiting examples of an indirect branch include a computed jump, an indirect jump, and a register-indirect jump. In order to attempt to increase performance at the processor, the processor may predict the fetch address. To predict the fetch address, the processor may use multiple predictor tables, where each predictor table includes multiple prediction entries, and where each prediction entry stores a fetch address.
- Because each prediction entry stores an entire fetch address and multiple prediction tables may include similar entries, in certain scenarios, there may be a relatively large amount of overhead at each predictor table. For example, each prediction entry in a predictor table may not be used by an application, multiple predictor tables may include identical predictor entries (e.g., target duplication), and the number of predictor table entries may not be capable of adjustment independently from the number of target instructions.
- The processor may also utilize a stored global history from past indirect branches to predict the fetch address. For example, the processor may predict the fetch address based on predicted fetch addresses for the previous ten indirect branches to provide context. Each fetch address stored in the global history may utilize approximately ten bits of storage. For example, twenty previously predicted fetch addresses stored in the global history may utilize approximately two-hundred bits of storage. Thus, a relatively large amount of storage may be used for the global history.
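The storage figures above can be made concrete with a small back-of-the-envelope calculation. The per-entry bit widths below are the illustrative values used in this description (roughly ten bits per conventional history entry; a few address bits plus a target-table way number for the abbreviated scheme described later), not fixed design parameters:

```python
ENTRIES = 20  # previously predicted fetch addresses kept in the history

# Conventional global history: roughly ten bits per stored fetch address.
conventional_bits = ENTRIES * 10

# Abbreviated history: a few low-order address bits (e.g., 4) plus a
# target-table way number (e.g., 2 bits) per entry -- assumed midpoints
# of the ranges given in this disclosure.
abbreviated_bits = ENTRIES * (4 + 2)

print(conventional_bits, abbreviated_bits)  # 200 120
```

Even with these rough figures, the abbreviated scheme stores well under the two-hundred bits that twenty full conventional history entries would use.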
- According to one implementation of the present disclosure, an apparatus for predicting a fetch address of a next instruction to be fetched includes a memory system, first selection logic, and second selection logic. The memory system includes a plurality of predictor tables and a target table. The plurality of predictor tables includes a first predictor table and a second predictor table. The first predictor table includes a first entry having a first way identifier, and the second predictor table includes a second entry having a second way identifier. The target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way are associated with an active address. According to one implementation, the first way identifier and the second way identifier may “point” to a similar way. According to another implementation, the first way identifier and the second way identifier may point to different ways. The first selection logic is coupled to select the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. The second selection logic is configured to select the first fetch address or the second fetch address as a predicted fetch address based on the way pointer. By using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, an amount of overhead may be reduced. Additionally, the historical prediction data may include an “abbreviated version” of the previously used fetch addresses (e.g., some bits of previously used fetch addresses) as opposed to the entire fetch addresses, data associated with way identifiers of the previously used fetch addresses, or a combination of both. 
The most significant bits of a fetch address may not substantially change from one fetch address to another fetch address. Lower order bits (or a hash function) may be used to reduce a particular fetch address into a smaller number of bits. According to one example, the historical prediction data may include a way number (e.g., a way identifier) in the target table for each previously used fetch address. Thus, instead of 64-bit previously used fetch addresses, the historical prediction data may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address. This reduction in bits may reduce the overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction.
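One way to realize this abbreviation is to pack a few low-order bits of each previous fetch address together with its target-table way number. The helper below is a hypothetical sketch; the function name, the bit widths, and the use of plain low-order bits (rather than a stronger hash) are assumptions, not part of the disclosed design:

```python
def abbreviate_history_entry(fetch_address: int, way: int,
                             addr_bits: int = 4, way_bits: int = 2) -> int:
    """Pack a few low-order fetch-address bits and a way identifier into
    one small history entry (addr_bits + way_bits total, e.g. 6 bits)."""
    addr_part = fetch_address & ((1 << addr_bits) - 1)  # keep low-order bits
    way_part = way & ((1 << way_bits) - 1)
    return (addr_part << way_bits) | way_part

# A 64-bit fetch address collapses to a 6-bit history entry.
print(bin(abbreviate_history_entry(0x80881323, way=2)))  # 0b1110
```

Each history entry then costs six bits instead of a full address, in line with the ranges quoted above.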
- According to another implementation of the present disclosure, a method for predicting a fetch address of a next instruction to be fetched includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. The method also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
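The first selection in the method above (choosing a way identifier as the way pointer) can be modeled as taking the hit from the predictor table that consumed the most history. This is a behavioral sketch, not the disclosed selection logic; it assumes the tables' outputs are ordered from least to most history used, with `None` marking a table whose entry did not match the active fetch address:

```python
def select_way_pointer(tag_indicators):
    """tag_indicators: one way identifier (or None for a miss) per
    predictor table, ordered from least to most history consumed.
    The hit from the longest-history table wins."""
    for way_id in reversed(tag_indicators):
        if way_id is not None:
            return way_id
    return None  # no predictor table matched: no prediction

# Mirrors the FIG. 2 example: tables hit "A", "B", miss, "C" in order.
print(select_way_pointer(["A", "B", None, "C"]))  # C
```

When every longer-history table misses, the sketch falls back to the shortest-history hit, matching the fallback behavior described for the two-table case.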
- According to another implementation of the present disclosure, a non-transitory computer-readable medium includes commands for predicting a fetch address of a next instruction to be fetched. The commands, when executed by a processor, cause the processor to perform operations including selecting a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. The operations also include selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way are associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier.
- According to another implementation of the present disclosure, an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data. The means for storing data includes a plurality of predictor tables and a target table. The plurality of predictor tables includes a first predictor table and a second predictor table. The first predictor table includes a first entry having a first way identifier, and the second predictor table includes a second entry having a second way identifier. The target table includes a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way are associated with an active address. The apparatus also includes means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. The apparatus also includes means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer.
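The target-table side of the apparatus can likewise be sketched as a set read indexed by the active fetch address followed by a way select driven by the way pointer. The set-index rule (low-order address bits) and the table contents below are illustrative assumptions:

```python
# Target table modeled as a list of sets (rows), each mapping a way
# identifier (column) to a stored fetch address; contents are made up.
target_table = [
    {},                                  # set 0
    {},                                  # set 1
    {},                                  # set 2
    {"B": 0x80885000, "C": 0x80886000},  # set 3
]

def select_predicted_fetch_address(active_fetch_address, way_pointer):
    """Assumed indexing: low-order address bits pick the set; the way
    pointer chosen by the first selection logic picks the way."""
    set_index = active_fetch_address % len(target_table)
    return target_table[set_index].get(way_pointer)

# 0x80881323 % 4 == 3, so way pointer "C" reads set 3, way C.
print(hex(select_predicted_fetch_address(0x80881323, "C")))  # 0x80886000
```

A miss (a way pointer with no entry in the indexed set) returns `None` here, standing in for the no-prediction case.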
-
FIG. 1 is a processing system that is operable to predict a fetch address of a target instruction; -
FIG. 2 depicts predictor tables included in the processing system of FIG. 1; -
FIG. 3 is a method for predicting a fetch address of a target instruction; and -
FIG. 4 is a block diagram of a device that includes the processing system of FIG. 1. - Referring to
FIG. 1, a processing system 100 that is operable to predict a fetch address of a target instruction is shown. As used herein, a fetch address corresponds to a location in memory where an address for the target instruction (e.g., the next instruction to be executed) is stored. The processing system 100 may also be referred to as a “memory system.” - As explained below, the
processing system 100 may predict the fetch address of the target instruction based on an active fetch address 110. According to one implementation, the active fetch address 110 may be based on a current program counter (PC) value. The processing system 100 includes a plurality of predictor tables, a global history table 112, first selection logic 114, a target table 118, and second selection logic 120. According to one implementation, the first selection logic 114 includes a first multiplexer and the second selection logic 120 includes a second multiplexer. - The plurality of predictor tables includes a predictor table 102, a predictor table 104, a predictor table 106, and a predictor table 108. Although four predictor tables 102-108 are shown, in other implementations, the
processing system 100 may include additional (or fewer) predictor tables. As a non-limiting example, the processing system 100 may include eight predictor tables in another implementation. - Each predictor table 102-108 includes multiple entries that identify different fetch addresses. For example, the predictor table 102 includes a first plurality of
entries 150, the predictor table 104 includes a second plurality of entries 160, the predictor table 106 includes a third plurality of entries 170, and the predictor table 108 includes a fourth plurality of entries 180. According to one implementation, different predictor tables 102-108 may have different sizes. To illustrate, different predictor tables 102-108 may have a different number of entries. As a non-limiting example, the fourth plurality of entries 180 may include more entries than the second plurality of entries 160. - The predictor tables 102-108 of the
processing system 100 are shown in greater detail in FIG. 2. The active fetch address 110 is provided to each predictor table 102-108 to determine whether a “hit” exists at the predictor tables 102-108. For example, the processing system 100 may determine whether each predictor table 102-108 includes an entry that matches the active fetch address 110. According to the example illustrated in FIG. 2, the active fetch address 110 is “0X80881323”. It should be understood that the active fetch address 110 (and other addresses described herein) is merely for illustrative purposes and should not be construed as limiting. - The predictor table 102 includes an
entry 152, an entry 154, an entry 156, and an entry 158. According to one implementation, each entry 152-158 may be included in the first plurality of entries 150 of FIG. 1. The entry 152 may include a tag “0X80881323” and may include a way identifier “A”, the entry 154 may include a tag “0X80881636” and may include a way identifier “B”, the entry 156 may include a tag “0X80882399” and may include a way identifier “C”, and the entry 158 may include a tag “0X80883456” and may include a way identifier “D”. According to one implementation, each tag may include a subset of a fetch address hashed together with other information (e.g., a particular number of previously seen fetch addresses). Each tag may include enough information that the remainder of the entry's content can be associated with the fetch address that looks up that entry. Thus, each tag may be used as an identification mechanism for a fetch address. For ease of illustration, the way identifiers are identified by a single capitalized letter. - The predictor table 104 includes an
entry 162, an entry 164, an entry 166, and an entry 168. According to one implementation, each entry 162-168 may be included in the second plurality of entries 160 of FIG. 1. The entry 162 may include a tag “0X80884635” and may include the way identifier “A”, the entry 164 may include a tag “0X80881323” and may include the way identifier “B”, the entry 166 may include a tag “0X80881493” and may include the way identifier “C”, and the entry 168 may include a tag “0X80889999” and may include the way identifier “D”. - The predictor table 106 includes an
entry 172, an entry 174, an entry 176, and an entry 178. According to one implementation, each entry 172-178 may be included in the third plurality of entries 170 of FIG. 1. The entry 172 may include a tag “0X80884639” and may include the way identifier “A”, the entry 174 may include a tag “0X80882395” and may include the way identifier “B”, the entry 176 may include a tag “0X80888723” and may include the way identifier “C”, and the entry 178 may include a tag “0X80881321” and may include the way identifier “D”. - The predictor table 108 includes an
entry 182, an entry 184, an entry 186, and an entry 188. According to one implementation, each entry 182-188 may be included in the fourth plurality of entries 180 of FIG. 1. The entry 182 may include a tag “0X80885245” and may include the way identifier “A”, the entry 184 may include a tag “0X80889823” and may include the way identifier “B”, the entry 186 may include a tag “0X80881323” and may include the way identifier “C”, and the entry 188 may include a tag “0X80888888” and may include the way identifier “D”. - A processor (e.g., in the
processing system 100 of FIG. 1) may determine that the entry 152 in the predictor table 102 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “A” to the first selection logic 114 as an output tag indicator 103 of the predictor table 102. The processor may also determine that the entry 164 in the predictor table 104 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “B” to the first selection logic 114 as an output tag indicator 105 of the predictor table 104. - The processor may determine that there are no entries in the predictor table 106 that match the active fetch
address 110. Thus, the processor may not provide a way identifier to the first selection logic 114 as an output tag indicator 107 of the predictor table 106. The processor may determine that the entry 186 in the predictor table 108 matches the active fetch address 110. Based on this determination, the processor may provide the way identifier “C” to the first selection logic 114 as an output tag indicator 109 of the predictor table 108. - In the illustrative example, each
output tag indicator 103, 105, 109 is provided to the first selection logic 114. The first selection logic 114 may be configured to select the output tag indicator of the predictor table that has an entry matching the active fetch address 110 and that utilizes a largest amount of historical prediction data (associated with the global history table 112), as explained below. As described above, the output tag indicators 103, 105, 109 correspond to entries 152, 164, 186 that match the active fetch address 110. Thus, as explained below, the first selection logic 114 may determine which output tag indicator is generated using the largest amount of historical prediction data; if that output tag indicator corresponds to an entry that matches the active fetch address 110, the first selection logic 114 may select that output tag indicator. - Referring back to
FIG. 1, the global history table 112 includes (e.g., stores) historical prediction data 113. The historical prediction data 113 includes a history of previous fetch addresses for indirect branches. For example, the historical prediction data 113 may include data to identify fetch addresses for previous indirect branches and way numbers associated with the fetch addresses. Each fetch address in the historical prediction data 113 may be an “abbreviated version” of a fetch address, to reduce overhead. For example, the historical prediction data 113 may store some bits (e.g., a subset) of each previous fetch address as opposed to the entire fetch address. The historical prediction data 113 may include a way number (e.g., a way identifier) in the target table 118 for each previously used fetch address. Thus, instead of 64-bit previously used fetch addresses, the historical prediction data 113 may include some bits (e.g., three to five bits) for each previously used fetch address and a relatively small number of bits (e.g., two to three bits) to identify the way of each previously used fetch address. - The
processing system 100 may provide the historical prediction data 113 to the predictor table 104, to the predictor table 106, and to the predictor table 108. For example, the processing system 100 may provide a first amount of the historical prediction data 113 to the predictor table 104 with the active fetch address 110 to generate the output tag indicator 105, the processing system 100 may provide a second amount of the historical prediction data 113 (that is greater than the first amount) to the predictor table 106 with the active fetch address 110 to generate the output tag indicator 107, and the processing system 100 may provide a third amount of the historical prediction data 113 (that is greater than the second amount) to the predictor table 108 with the active fetch address 110 to generate the output tag indicator 109. - Because the
processing system 100 generates the output tag indicator 103 from the predictor table 102 based solely on the active fetch address 110, the output tag indicator 103 may not be as reliable as the output tag indicators 105, 107, 109 that are generated using the historical prediction data 113. Furthermore, because the output tag indicator 107 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 105, the output tag indicator 107 may be more reliable than the output tag indicator 105. Similarly, because the output tag indicator 109 is generated using more of the historical prediction data 113 than the amount of historical prediction data 113 used to generate the output tag indicator 107, the output tag indicator 109 may be more reliable than the output tag indicator 107. - In the example illustrated in
FIG. 2, the output tag indicators 103, 105, 109 correspond to entries 152, 164, 186 that match the active fetch address 110. Thus, the first selection logic 114 may determine which output tag indicator to select based on the amount of historical prediction data 113 associated with each output tag indicator. Because the output tag indicator 109 is associated with more historical prediction data 113 than the other output tag indicators 103, 105, the first selection logic 114 may select that output tag indicator 109 as a selected way pointer 116. The processing system 100 may provide the selected way pointer 116 to the second selection logic 120. - The target table 118 includes multiple fetch addresses that are separated by sets (e.g., rows) and ways (e.g., columns). In the illustrative example, the target table 118 includes four sets (e.g., “
Set 1”, “Set 2”, “Set 3”, and “Set 4”). The target table 118 may also include four ways (e.g., “Way A”, “Way B”, “Way C”, and “Way D”). Although the target table 118 is shown to include four sets and four ways, in other implementations, the target table 118 may include additional (or fewer) ways and sets. As a non-limiting example, the target table 118 may include sixteen sets and thirty-two ways. - The
processing system 100 may provide the active fetch address 110 to the target table 118. The active fetch address 110 may indicate a particular set of fetch addresses in the target table 118 to be selected. In the illustrative example of FIG. 1, the active fetch address 110 indicates that “Set 3” is where the predicted fetch address 140 is located in the target table 118. - Each way in the target table 118 corresponds to a particular way identifier in the predictor tables 102-108. As described with respect to the example in
FIG. 2, each entry in the predictor tables 102-108 can include way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D”. The entries that include way identifier “A” are associated with “Way A”, the entries that include way identifier “B” are associated with “Way B”, the entries that include way identifier “C” are associated with “Way C”, and the entries that include way identifier “D” are associated with “Way D”. Because the first selection logic 114 selected the output tag indicator 109 as the selected way pointer 116 and the output tag indicator 109 corresponds to the way identifier “C” (e.g., the way identifier associated with the entry 186), the second selection logic 120 may select “Way C” as the selected way of the predicted fetch address 140. - Thus, the
second selection logic 120 may select the predicted fetch address 140 in the target table 118 as a fetch address 122 for a target instruction based on the way indicated by the selected way pointer 116 and the set indicated by the active fetch address 110. The fetch address 122 may be used by the processing system to locate the address of the next instruction to be executed (e.g., the target instruction). - The techniques described with respect to
FIGS. 1-2 may reduce an amount of overhead (compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction). For example, by using a separate table (e.g., the target table) to store multiple fetch addresses as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may include reduced overhead (compared to a conventional processing system) because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction. The techniques described with respect to FIGS. 1-2 may also utilize an efficient methodology to determine the way of the predicted fetch address 140 in the target table 118. For example, the techniques may use the predictor tables 102-108 (e.g., the way identifier in the predictor tables 102-108) to determine the selected way of the predicted fetch address 140 in the target table 118. - Referring to
FIG. 3, a method 300 for predicting a fetch address of a next instruction to be fetched is shown. The method 300 may be performed by the processing system 100 of FIG. 1. - The
method 300 includes selecting, at a processor, a first way identifier or a second way identifier as a way pointer based on an active fetch address and historical prediction data, at 302. A first predictor table includes a first entry having the first way identifier and a second predictor table includes a second entry having the second way identifier. For example, referring to FIGS. 1-2, the first selection logic 114 may select way identifier “A”, way identifier “B”, way identifier “C”, or way identifier “D” as the selected way pointer 116 based on the active fetch address 110 and the historical prediction data 113. The predictor table 102 includes the selected entry 152 having way identifier “A”, the predictor table 104 includes the selected entry 164 having way identifier “B”, the predictor table 106 includes the selected entry 178 having way identifier “D”, and the predictor table 108 includes the selected entry 186 having way identifier “C”. - The
method 300 also includes selecting a first fetch address or a second fetch address as a predicted fetch address based on the way pointer, at 304. A target table includes a first way storing the first fetch address and a second way storing the second fetch address. The first way and the second way may be associated with the active fetch address. The first fetch address is associated with the first way identifier and the second fetch address is associated with the second way identifier. For example, referring to FIGS. 1-2, the second selection logic 120 may select the fetch address associated with the entry 186 as the predicted fetch address 140 based on the selected way pointer 116. - According to one implementation of the
method 300, the first selection logic 114 includes a first multiplexer, and the second selection logic 120 includes a second multiplexer. The method 300 may also include storing the historical prediction data 113 at the global history table 112 that is accessible to the processor (e.g., the processing system 100). The historical prediction data 113 includes one or more fetch addresses for one or more previous indirect branches. The method 300 may also include storing the most significant bits of each fetch address of the one or more fetch addresses at the global history table to reduce overhead. - According to one implementation, the
method 300 includes generating the first entry based on a first amount of the historical prediction data. For example, the entries 162-168 in the predictor table 104 may be generated based on the first amount of the historical prediction data 113. The method 300 may also include generating the second entry based on a second amount of the historical prediction data that is greater than the first amount. For example, the entries 172-178 in the predictor table 106 may be generated based on the second amount of the historical prediction data 113, which is greater than the first amount. According to one implementation, the method 300 includes selecting the second way identifier as the way pointer if the second entry (e.g., the entry generated based on the larger amount of the historical prediction data) matches the active fetch address. The method 300 may also include selecting the first way identifier as the way pointer if the second entry fails to match the active fetch address and the first entry matches the active fetch address. - The
method 300 of FIG. 3 may reduce overhead compared to the overhead of a conventional processing system for predicting a fetch address of a target instruction. For example, by using a separate table (e.g., the target table) to store multiple fetch addresses, as opposed to storing multiple (and sometimes identical) fetch addresses at different predictor tables, the amount of overhead may be reduced. Additionally, the global history table 112 may incur reduced overhead because the global history table 112 stores an “abbreviated version” of the previously used fetch addresses (e.g., stores the most significant bits of previously used fetch addresses) as opposed to the entire addresses. This reduction in bits may further reduce the amount of overhead at the processing system compared to conventional processing systems for predicting a fetch address of a target instruction. The method 300 may also efficiently determine the way of the predicted fetch address 140 in the target table 118. For example, the techniques may use the way identifiers in the predictor tables 102-108 to determine the selected way of the predicted fetch address 140 in the target table 118. - In particular implementations, the
method 300 of FIG. 3 may be implemented via hardware (e.g., a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), etc.) of a processing unit, such as a central processing unit (CPU), a digital signal processor (DSP), or a controller, via a firmware device, or any combination thereof. As an example, the method 300 can be performed by a processor that executes instructions. - Referring to
FIG. 4, a block diagram of a device 400 is depicted. The device 400 includes a processor 410 (e.g., a central processing unit (CPU), a digital signal processor (DSP), etc.) coupled to a memory 432. The processor 410 may include the processing system 100 of FIG. 1. - The
memory 432 may be a memory device, such as a random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM). The memory device may include commands (e.g., the commands 460) that, when executed by a computer (e.g., the processor 410), may cause the computer to perform the method 300 of FIG. 3. -
FIG. 4 also shows a display controller 426 that is coupled to the processor 410 and to a display 428. An encoder/decoder (CODEC) 434 may be coupled to the processor 410, as shown. A speaker 436 and a microphone 438 can be coupled to the CODEC 434. FIG. 4 also shows a wireless controller 440 coupled to the processor 410 and to an antenna 442. In a particular implementation, the processor 410, the display controller 426, the memory 432, the CODEC 434, and the wireless controller 440 are included in a system-in-package or system-on-chip device 422 (e.g., a mobile station modem (MSM)). In a particular implementation, an input device 430, such as a touchscreen and/or keypad, and a power supply 444 are coupled to the system-on-chip device 422. Moreover, in a particular implementation, as illustrated in FIG. 4, the display 428, the input device 430, the speaker 436, the microphone 438, the antenna 442, and the power supply 444 are external to the system-on-chip device 422. However, each of the display 428, the input device 430, the speaker 436, the microphone 438, the antenna 442, and the power supply 444 can be coupled to a component of the system-on-chip device 422, such as an interface or a controller. - In conjunction with the described implementations, an apparatus for predicting a fetch address of a next instruction to be fetched includes means for storing data. For example, the means for storing data may include a memory system component (e.g., components storing the tables) of the
processing system 100 of FIG. 1, one or more other devices, circuits, modules, or instructions to store data, or any combination thereof. The means for storing data may include a plurality of predictor tables and a target table. The plurality of predictor tables may include a first predictor table and a second predictor table. The first predictor table may include a first entry having a first way identifier, and the second predictor table may include a second entry having a second way identifier. The target table may include a first way that stores a first fetch address associated with the first way identifier and a second way that stores a second fetch address associated with the second way identifier. The first way and the second way may be associated with an active fetch address. - The apparatus may also include means for selecting the first way identifier or the second way identifier as a way pointer based on the active fetch address and historical prediction data. For example, the means for selecting the first way identifier or the second way identifier may include the
first selection logic 114 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first way identifier or the second way identifier, or any combination thereof. - The apparatus may also include means for selecting the first fetch address or the second fetch address as a predicted fetch address based on the way pointer. For example, the means for selecting the first fetch address or the second fetch address may include the
second selection logic 120 of FIG. 1, one or more other devices, circuits, modules, or instructions to select the first fetch address or the second fetch address, or any combination thereof. - The foregoing disclosed devices and functionalities may be designed and configured into computer files (e.g., RTL, GDSII, GERBER, etc.) stored on computer-readable media. Some or all such files may be provided to fabrication handlers who fabricate devices based on such files. Resulting products include semiconductor wafers that are then cut into semiconductor die and packaged into a semiconductor chip. The chips are then employed in devices, such as a communications device (e.g., a mobile phone), a tablet, a laptop, a personal digital assistant (PDA), a set top box, a music player, a video player, an entertainment unit, a navigation device, a fixed location data unit, a server, or a computer.
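The first selection step of method 300 — predictor entries built from different amounts of historical prediction data, with a matching longer-history entry taking priority over a matching shorter-history entry — can be sketched as follows. The table layout, index hashing, and names here are illustrative assumptions, not the patent's implementation:

```python
# Illustrative sketch of the way-pointer selection in method 300.
# Each predictor table maps an index to a (tag, way_id) entry; tables
# later in the list were built with a larger amount of historical
# prediction data, so a matching entry there takes priority.
# All sizes, hashes, and names are assumptions for illustration.

def table_index(fetch_addr: int, history: int, bits: int = 4) -> int:
    # Fold the active fetch address with a slice of history into an index.
    return (fetch_addr ^ history) & ((1 << bits) - 1)

def select_way_pointer(fetch_addr, history_slices, predictor_tables):
    """First selection logic: return the way identifier of the matching
    entry from the table built with the most history, or None."""
    chosen = None
    for table, hist in zip(predictor_tables, history_slices):
        entry = table.get(table_index(fetch_addr, hist))
        if entry is not None and entry[0] == fetch_addr:  # tag match
            chosen = entry[1]  # later (longer-history) match overrides
    return chosen

# A short-history table predicts way "B"; a longer-history table
# predicts way "D". The longer-history match wins.
short_hist = {table_index(0x40, 0b01): (0x40, "B")}
long_hist = {table_index(0x40, 0b0110): (0x40, "D")}
print(select_way_pointer(0x40, [0b01, 0b0110], [short_hist, long_hist]))
```

If the longer-history table misses, the selection falls back to the shorter-history match, mirroring the fallback rule described for the first and second entries above.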
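The second selection step — using the way pointer to choose one of several fetch addresses that the target table stores for the active fetch address — might look like the sketch below. The set count, way labels, and addresses are assumptions for illustration only:

```python
# Illustrative sketch of the target-table lookup in method 300.
# The active fetch address selects a set of ways; the way pointer from
# the predictor tables selects which way's stored fetch address becomes
# the predicted fetch address. Sizes and values are assumptions.

NUM_SETS = 16

def predict_fetch_address(target_table, active_fetch_addr, way_pointer):
    # Second selection logic, acting like a multiplexer over the ways.
    ways = target_table[active_fetch_addr % NUM_SETS]
    return ways.get(way_pointer)

# One set holds two candidate targets for the same active fetch address;
# the way pointer decides which one is predicted.
target_table = [dict() for _ in range(NUM_SETS)]
target_table[0x40 % NUM_SETS] = {"A": 0x1000, "D": 0x2000}
print(hex(predict_fetch_address(target_table, 0x40, "D")))  # -> 0x2000
```

Storing both candidate targets in one target table, rather than duplicating full addresses in every predictor table, is the overhead saving the description attributes to the separate target table.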
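The overhead reduction from storing only the most significant bits of previous indirect-branch fetch addresses in the global history table can be illustrated as follows; the 32-bit address width and the number of kept bits are assumptions, since the description does not fix these values:

```python
# Illustrative sketch: the global history table keeps an "abbreviated
# version" of each previously used fetch address, i.e. only its most
# significant bits, rather than the entire address. The 32-bit address
# width and 8 kept bits are assumptions for illustration.

ADDR_BITS = 32
KEPT_MSBS = 8

def abbreviate(fetch_addr: int) -> int:
    # Keep only the top KEPT_MSBS bits of the fetch address.
    return fetch_addr >> (ADDR_BITS - KEPT_MSBS)

# Record the abbreviated targets of three previous indirect branches:
# each history entry now costs 8 bits instead of 32.
history = [abbreviate(t) for t in (0x80001234, 0x80005678, 0x4000ABCD)]
print([hex(h) for h in history])  # -> ['0x80', '0x80', '0x40']
```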
- Those of skill would further appreciate that the various illustrative logical blocks, configurations, modules, circuits, and algorithm steps described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software executed by a processing device such as a hardware processor, or combinations of both. Various illustrative components, blocks, configurations, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or executable software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
- The steps of a method or algorithm described in connection with the implementations disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in a memory device, such as random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM). An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device. In the alternative, the memory device may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or a user terminal.
- The previous description of the disclosed implementations is provided to enable a person skilled in the art to make or use the disclosed implementations. Various modifications to these implementations will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other implementations without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the implementations shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.
Claims (30)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/192,794 US20170371669A1 (en) | 2016-06-24 | 2016-06-24 | Branch target predictor |
PCT/US2017/029452 WO2017222635A1 (en) | 2016-06-24 | 2017-04-25 | Branch target predictor |
CN201780033792.4A CN109219798A (en) | 2016-06-24 | 2017-04-25 | Branch target prediction device |
EP17721035.8A EP3475811A1 (en) | 2016-06-24 | 2017-04-25 | Branch target predictor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/192,794 US20170371669A1 (en) | 2016-06-24 | 2016-06-24 | Branch target predictor |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170371669A1 true US20170371669A1 (en) | 2017-12-28 |
Family
ID=58664897
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/192,794 Abandoned US20170371669A1 (en) | 2016-06-24 | 2016-06-24 | Branch target predictor |
Country Status (4)
Country | Link |
---|---|
US (1) | US20170371669A1 (en) |
EP (1) | EP3475811A1 (en) |
CN (1) | CN109219798A (en) |
WO (1) | WO2017222635A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10956161B2 (en) | 2017-04-27 | 2021-03-23 | International Business Machines Corporation | Indirect target tagged geometric branch prediction using a set of target address pattern data |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7069426B1 (en) * | 2000-03-28 | 2006-06-27 | Intel Corporation | Branch predictor with saturating counter and local branch history table with algorithm for updating replacement and history fields of matching table entries |
US7165169B2 (en) * | 2001-05-04 | 2007-01-16 | Ip-First, Llc | Speculative branch target address cache with selective override by secondary predictor based on branch instruction type |
US7707397B2 (en) * | 2001-05-04 | 2010-04-27 | Via Technologies, Inc. | Variable group associativity branch target address cache delivering multiple target addresses per cache line |
US20060218385A1 (en) * | 2005-03-23 | 2006-09-28 | Smith Rodney W | Branch target address cache storing two or more branch target addresses per index |
US8935517B2 (en) * | 2006-06-29 | 2015-01-13 | Qualcomm Incorporated | System and method for selectively managing a branch target address cache of a multiple-stage predictor |
US20080209190A1 (en) * | 2007-02-28 | 2008-08-28 | Advanced Micro Devices, Inc. | Parallel prediction of multiple branches |
US7844807B2 (en) * | 2008-02-01 | 2010-11-30 | International Business Machines Corporation | Branch target address cache storing direct predictions |
CN101819523B (en) * | 2009-03-04 | 2014-04-02 | 威盛电子股份有限公司 | Microprocessor and related instruction execution method |
US20130346727A1 (en) * | 2012-06-25 | 2013-12-26 | Qualcomm Incorporated | Methods and Apparatus to Extend Software Branch Target Hints |
GB2506462B (en) * | 2013-03-13 | 2014-08-13 | Imagination Tech Ltd | Indirect branch prediction |
US9983878B2 (en) * | 2014-05-15 | 2018-05-29 | International Business Machines Corporation | Branch prediction using multiple versions of history data |
- 2016
  - 2016-06-24 US US15/192,794 patent/US20170371669A1/en not_active Abandoned
- 2017
  - 2017-04-25 WO PCT/US2017/029452 patent/WO2017222635A1/en active Search and Examination
  - 2017-04-25 CN CN201780033792.4A patent/CN109219798A/en active Pending
  - 2017-04-25 EP EP17721035.8A patent/EP3475811A1/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
EP3475811A1 (en) | 2019-05-01 |
WO2017222635A1 (en) | 2017-12-28 |
CN109219798A (en) | 2019-01-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6744423B2 (en) | Implementation of load address prediction using address prediction table based on load path history in processor-based system | |
US10831491B2 (en) | Selective access to partitioned branch transfer buffer (BTB) content | |
US9201658B2 (en) | Branch predictor for wide issue, arbitrarily aligned fetch that can cross cache line boundaries | |
EP2423821A2 (en) | Processor, apparatus, and method for fetching instructions and configurations from a shared cache | |
US10901484B2 (en) | Fetch predition circuit for reducing power consumption in a processor | |
US9311098B2 (en) | Mechanism for reducing cache power consumption using cache way prediction | |
US9804969B2 (en) | Speculative addressing using a virtual address-to-physical address page crossing buffer | |
US9367468B2 (en) | Data cache way prediction | |
KR20180058797A (en) | Method and apparatus for cache line deduplication through data matching | |
EP2972898A1 (en) | Externally programmable memory management unit | |
EP2962187A2 (en) | Vector register addressing and functions based on a scalar register data value | |
WO2017030678A1 (en) | Determining prefetch instructions based on instruction encoding | |
US20140201494A1 (en) | Overlap checking for a translation lookaside buffer (tlb) | |
US20180081686A1 (en) | Providing memory dependence prediction in block-atomic dataflow architectures | |
CN107533513B (en) | Burst translation look-aside buffer | |
US20170371669A1 (en) | Branch target predictor | |
WO2021061269A1 (en) | Storage control apparatus, processing apparatus, computer system, and storage control method | |
TW202036284A (en) | Branch prediction based on load-path history | |
US10437592B2 (en) | Reduced logic level operation folding of context history in a history register in a prediction system for a processor-based system | |
EP2856304B1 (en) | Issuing instructions to execution pipelines based on register-associated preferences, and related instruction processing circuits, processor systems, methods, and computer-readable media | |
US20170046266A1 (en) | Way Mispredict Mitigation on a Way Predicted Cache | |
US20140181405A1 (en) | Instruction cache having a multi-bit way prediction mask | |
US10162752B2 (en) | Data storage at contiguous memory addresses | |
US8850109B2 (en) | Content addressable memory data clustering block architecture | |
US20230393853A1 (en) | Selectively updating branch predictors for loops executed from loop buffers in a processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: QUALCOMM INCORPORATED, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KRISHNA, ANIL;WRIGHT, GREGORY;REEL/FRAME:039925/0478 Effective date: 20160914 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |