WO2014100632A1 - Instruction cache having a multi-bit way prediction mask - Google Patents
Instruction cache having a multi-bit way prediction mask
- Publication number
- WO2014100632A1 (application PCT/US2013/077020)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- prediction mask
- way
- bit
- cache line
- cache
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0864—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using pseudo-associative means, e.g. set-associative or hashing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3806—Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
- G06F9/3832—Value prediction for operands; operand history buffers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/60—Details of cache memory
- G06F2212/608—Details relating to cache mapping
- G06F2212/6082—Way prediction in set-associative cache
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- a first technique stores the last (previous) "next way" for each cache line (i.e., a single "next way"). Based on the single "next way," the instruction cache enables a single driver that corresponds to the single "next way." When the correct way is not driven (i.e., not provided to the multiplexer), a misprediction occurs and a second access to the instruction cache data array is performed that drives the correct way (based on a completed tag lookup operation or a signal provided from control logic). When using the first technique, accuracy of predicting the correct "next way" is an issue because predictability of a given program (e.g., multiple instructions) may be erratic. Accordingly, the last "next way" for a cache line is not necessarily a good predictor and frequent mispredictions occur.
- FIG. 4 is a flow diagram of a second illustrative embodiment of a method to perform way prediction
- the data array 110 may include a plurality of ways 120-124 that each include a
- multiplexer may determine whether the contents provided via the enabled drivers 140-144 include one or more instructions to be executed next (e.g., after the instruction(s) corresponding to the last cache line fetched). If the ways enabled according to the prediction mask 152 set to the prediction mask value do not provide the one or more instructions to be executed next (e.g., the prediction mask does not predict a correct next way to be driven), a misprediction occurs. In a particular embodiment, the control logic 150 determines whether a misprediction occurs based on whether or not a multiplexer selects an output of the data array 110 that corresponds to a predicted way identified by the prediction mask 152.
- the control logic 150 may receive an instruction address associated with an instruction stored in a cache line (e.g., a way) of the data array 110. Based on the instruction address, the control logic 150 may identify a particular prediction mask value associated with the instruction address. For example, the control logic 150 may identify the particular prediction mask value from the plurality of prediction mask values that each correspond to a cache line of the data array 110. After the particular prediction mask value is identified, the control logic 150 may set one or more bits of the prediction mask 152 based on the identified value so that the value of the prediction mask 152 equals the particular prediction mask value.
- a power benefit may be realized on each data access of the instruction cache 102.
- a misprediction should only occur once for each successor way because, after a misprediction, a bit of the multi-bit way prediction mask 152 is updated to identify the particular cache line as a successor.
- the control logic 150 may set a value of the prediction mask 152.
- the value set for the prediction mask 152 may correspond to a particular cache line 220a-d of the data array 110 that was last accessed.
- the value of the prediction mask 152 may predict (e.g., identify) a "next way" (e.g., a subsequent way) with respect to the particular cache line 220a-d that was last fetched.
- the value set for the prediction mask 152 may correspond to the cache line A 220a.
- the prediction mask 152 of FIG. 2 may include four (4) bits, where each bit corresponds to a different way, and thus a different driver 240a-d, respectively.
- control logic 150 may also initialize a particular prediction mask value when
- the control logic 150 may determine (e.g., identify) which way is accessed next (e.g., subsequent to the particular cache line as a successor way). Upon determining the successor way, the control logic 150 may set a bit of the corresponding prediction mask value to identify the successor way.
- the control logic 150 may also determine (e.g., detect) whether a misprediction occurs as a result of the prediction mask 152 being applied to the data array 110. For example, when the prediction mask 152 is set to a value of a particular prediction mask value corresponding to a particular cache line 220a-d of the data array 110, the control logic 150 may determine whether the particular prediction mask value resulted in a misprediction. When the particular prediction mask value resulted in the misprediction, the control logic 150 may identify the correct way to be driven (e.g., the correct driver to be enabled) and update the particular prediction mask value based on the identified correct way.
- the third way corresponding to the cache line C 220c is identified by the control logic 150 as the correct way and the control logic 150 updates the cache line A prediction mask value 254 to "0011" reflecting the determination that the third way associated with the cache line C 220c is also a successor to the cache line A 220a.
- the tag portion 274 may be provided to the tag array 280 in parallel with the control logic 150 applying the prediction mask 152 (associated with an instruction currently being executed) that predicts one or more ways that may be associated with an instruction to be executed next.
- the tag portion 274 may be provided to the tag array 280 after (e.g., in response to) a misprediction.
- the tag array 280 may identify the location (e.g., a cache line or way) in the data array 110 that includes the instruction to be executed next.
- the tag array 280 may provide the location to the multiplexer 160 as the way select signal.
- a particular instruction is fetched from a particular cache line 220a-d of the data array 110 and executed (by an execution unit). Based on the particular instruction being fetched and/or executed, the control logic 150 identifies a prediction mask value corresponding to the particular cache line. The control logic 150 may set the prediction mask 152 to the prediction mask value to selectively enable one or more drivers 240a-d of the data array 110. When the one or more drivers 240a-d are enabled, contents of selected cache lines (e.g., one or more of the cache lines 220a-d) corresponding to the one or more enabled drivers (e.g., one or more of the drivers 240a-d) may be provided to the multiplexer 260.
- the instruction cache may include the instruction cache 102 of FIG. 1.
- the method 300 may be performed by the control logic 150 of FIG. 1.
- a multi-bit way prediction mask corresponding to a cache line may be set to an initial value, at 302, and the cache line may be fetched, at 304.
- the multi-bit way prediction mask may be associated with an instruction cache including a data array having a plurality of cache lines.
- the value of the multi-bit way prediction mask may correspond to a cache line included in the data array of the instruction cache.
- the multi-bit prediction mask may be the prediction mask 152 and the data array may be the data array 110 of FIG. 1.
- the multi-bit way prediction mask value corresponding to the first cache line is not updated when the second cache line is determined not to be the first cache line. Rather, when the second cache line is determined, at 418, not to be the first cache line, processing advances to 408, and the multi-bit prediction mask value corresponding to the first cache line is not updated based on data (e.g., the second contents) being loaded into the second cache line.
- the apparatus may also include means for providing the multi-bit way prediction mask to a plurality of line drivers of the data array.
- the means for providing may include the control logic 150 of FIGS. 1-2, the processor 610, the control logic 686 of FIG. 6, one or more other devices or circuits configured to provide the multi-bit way prediction mask, or any combination thereof.
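The mask behaviour described in the excerpts above (one mask value per cache line, one bit per way, a bit set after each misprediction) can be illustrated with a small software model. This is a minimal sketch and not the claimed hardware: the class `WayPredictor`, its method names, the all-zero initial mask, and the use of a dictionary keyed by a cache-line label are all assumptions made for illustration. The bit order in `mask_string` (leftmost character is the first way) is inferred from the "0011" example, where the third way is added as a successor of cache line A.

```python
from dataclasses import dataclass, field

NUM_WAYS = 4  # assumption: four ways, matching the four-bit mask described for FIG. 2


@dataclass
class WayPredictor:
    """Toy model holding one multi-bit prediction mask value per cache line."""

    # Hypothetical storage: mask values keyed by a cache-line label.
    masks: dict = field(default_factory=dict)

    def mask_for(self, line_id):
        # Assumed initial value: no successor ways recorded yet.
        return self.masks.setdefault(line_id, 0)

    def enabled_ways(self, line_id):
        """Ways whose drivers would be enabled after fetching line_id."""
        mask = self.mask_for(line_id)
        return [w for w in range(NUM_WAYS) if mask & (1 << w)]

    def access(self, line_id, actual_next_way):
        """Return True if the mask covered the way that was actually needed.

        On a misprediction, set the bit for the correct way so that the
        same successor is predicted on later accesses.
        """
        hit = actual_next_way in self.enabled_ways(line_id)
        if not hit:
            self.masks[line_id] = self.mask_for(line_id) | (1 << actual_next_way)
        return hit

    def mask_string(self, line_id):
        # Assumed display order: leftmost character is the first way (index 0).
        mask = self.mask_for(line_id)
        return "".join("1" if mask & (1 << w) else "0" for w in range(NUM_WAYS))


if __name__ == "__main__":
    predictor = WayPredictor()
    predictor.access("A", 3)           # fourth way becomes a successor of line A
    print(predictor.mask_string("A"))  # 0001
    predictor.access("A", 2)           # third way mispredicts once, then is recorded
    print(predictor.mask_string("A"))  # 0011
    print(predictor.access("A", 2))    # True
```

In this model each successor way costs at most one misprediction per cache line, which is the property the excerpts attribute to the multi-bit mask; a single "next way" field would instead be overwritten every time the successor changes.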
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Advance Control (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2015549790A JP6212133B2 (ja) | 2012-12-20 | 2013-12-20 | マルチビットウェイ予測マスクを有する命令キャッシュ |
| CN201380065463.XA CN104854557B (zh) | 2012-12-20 | 2013-12-20 | 存取高速缓存的设备和方法 |
| EP13821338.4A EP2936303B1 (en) | 2012-12-20 | 2013-12-20 | Instruction cache having a multi-bit way prediction mask |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/721,317 US9304932B2 (en) | 2012-12-20 | 2012-12-20 | Instruction cache having a multi-bit way prediction mask |
| US13/721,317 | 2012-12-20 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014100632A1 (en) | 2014-06-26 |
Family
ID=49956453
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2013/077020 Ceased WO2014100632A1 (en) | 2012-12-20 | 2013-12-20 | Instruction cache having a multi-bit way prediction mask |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US9304932B2 (en) |
| EP (1) | EP2936303B1 (en) |
| JP (1) | JP6212133B2 (en) |
| CN (1) | CN104854557B (en) |
| WO (1) | WO2014100632A1 (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GR20150100422A (el) | 2015-09-28 | 2017-05-15 | Arm Limited | Αποθηκευση δεδομενων |
| WO2019010703A1 (zh) * | 2017-07-14 | 2019-01-17 | 华为技术有限公司 | 读、部分写数据方法以及相关装置 |
| US11620229B2 (en) * | 2020-02-21 | 2023-04-04 | SiFive, Inc. | Data cache with prediction hints for cache hits |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6356990B1 (en) * | 2000-02-02 | 2002-03-12 | International Business Machines Corporation | Set-associative cache memory having a built-in set prediction array |
| US20050050278A1 (en) * | 2003-09-03 | 2005-03-03 | Advanced Micro Devices, Inc. | Low power way-predicted cache |
| US20050246499A1 (en) * | 2004-04-30 | 2005-11-03 | Nec Corporation | Cache memory with the number of operated ways being changed according to access pattern |
Family Cites Families (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5142633A (en) | 1989-02-03 | 1992-08-25 | Digital Equipment Corporation | Preprocessing implied specifiers in a pipelined processor |
| US5287467A (en) | 1991-04-18 | 1994-02-15 | International Business Machines Corporation | Pipeline for removing and concurrently executing two or more branch instructions in synchronization with other instructions executing in the execution unit |
| US5604909A (en) | 1993-12-15 | 1997-02-18 | Silicon Graphics Computer Systems, Inc. | Apparatus for processing instructions in a computing system |
| JP3589485B2 (ja) * | 1994-06-07 | 2004-11-17 | 株式会社ルネサステクノロジ | セットアソシアティブ方式のメモリ装置およびプロセッサ |
| US5848433A (en) * | 1995-04-12 | 1998-12-08 | Advanced Micro Devices | Way prediction unit and a method for operating the same |
| US5781789A (en) * | 1995-08-31 | 1998-07-14 | Advanced Micro Devices, Inc. | Superscaler microprocessor employing a parallel mask decoder |
| US5826071A (en) | 1995-08-31 | 1998-10-20 | Advanced Micro Devices, Inc. | Parallel mask decoder and method for generating said mask |
| US5794028A (en) * | 1996-10-17 | 1998-08-11 | Advanced Micro Devices, Inc. | Shared branch prediction structure |
| US5995749A (en) * | 1996-11-19 | 1999-11-30 | Advanced Micro Devices, Inc. | Branch prediction mechanism employing branch selectors to select a branch prediction |
| US5845102A (en) * | 1997-03-03 | 1998-12-01 | Advanced Micro Devices, Inc. | Determining microcode entry points and prefix bytes using a parallel logic technique |
| JP3469042B2 (ja) * | 1997-05-14 | 2003-11-25 | 株式会社東芝 | キャッシュメモリ |
| US6016533A (en) * | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Way prediction logic for cache array |
| US7085920B2 (en) | 2000-02-02 | 2006-08-01 | Fujitsu Limited | Branch prediction method, arithmetic and logic unit, and information processing apparatus for performing brach prediction at the time of occurrence of a branch instruction |
| US6584549B2 (en) * | 2000-12-29 | 2003-06-24 | Intel Corporation | System and method for prefetching data into a cache based on miss distance |
| US20020194462A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Apparatus and method for selecting one of multiple target addresses stored in a speculative branch target address cache per instruction cache line |
| US7406569B2 (en) * | 2002-08-12 | 2008-07-29 | Nxp B.V. | Instruction cache way prediction for jump targets |
| US7587580B2 (en) * | 2005-02-03 | 2009-09-08 | Qualcomm Corporated | Power efficient instruction prefetch mechanism |
| US8046538B1 (en) * | 2005-08-04 | 2011-10-25 | Oracle America, Inc. | Method and mechanism for cache compaction and bandwidth reduction |
| US8275942B2 (en) * | 2005-12-22 | 2012-09-25 | Intel Corporation | Performance prioritization in multi-threaded processors |
| US8225046B2 (en) | 2006-09-29 | 2012-07-17 | Intel Corporation | Method and apparatus for saving power by efficiently disabling ways for a set-associative cache |
| US20080147989A1 (en) * | 2006-12-14 | 2008-06-19 | Arm Limited | Lockdown control of a multi-way set associative cache memory |
| US8151084B2 (en) * | 2008-01-23 | 2012-04-03 | Oracle America, Inc. | Using address and non-address information for improved index generation for cache memories |
| US8522097B2 (en) * | 2010-03-16 | 2013-08-27 | Qualcomm Incorporated | Logic built-in self-test programmable pattern bit mask |
| JP2011257800A (ja) * | 2010-06-04 | 2011-12-22 | Panasonic Corp | キャッシュメモリ装置、プログラム変換装置、キャッシュメモリ制御方法及びプログラム変換方法 |
| JP5954112B2 (ja) * | 2012-10-24 | 2016-07-20 | 富士通株式会社 | メモリ装置、演算処理装置、及びキャッシュメモリ制御方法 |
2012
- 2012-12-20: US application US13/721,317, patent US9304932B2 (en), not active (Expired - Fee Related)
2013
- 2013-12-20: CN application CN201380065463.XA, patent CN104854557B (zh), active
- 2013-12-20: JP application JP2015549790A, patent JP6212133B2 (ja), not active (Expired - Fee Related)
- 2013-12-20: WO application PCT/US2013/077020, publication WO2014100632A1 (en), not active (Ceased)
- 2013-12-20: EP application EP13821338.4A, patent EP2936303B1 (en), active
Also Published As
| Publication number | Publication date |
|---|---|
| JP2016505971A (ja) | 2016-02-25 |
| JP6212133B2 (ja) | 2017-10-11 |
| US20140181405A1 (en) | 2014-06-26 |
| EP2936303A1 (en) | 2015-10-28 |
| EP2936303B1 (en) | 2020-01-15 |
| CN104854557A (zh) | 2015-08-19 |
| US9304932B2 (en) | 2016-04-05 |
| CN104854557B (zh) | 2018-06-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9367468B2 (en) | | Data cache way prediction |
| US10838731B2 (en) | | Branch prediction based on load-path history |
| US9519586B2 (en) | | Methods and apparatus to reduce cache pollution caused by data prefetching |
| KR20180127379A (ko) | | 프로세서-기반 시스템들 내의 로드 경로 이력에 기반한 어드레스 예측 테이블들을 사용하는 로드 어드레스 예측들의 제공 |
| US11620224B2 (en) | | Instruction cache prefetch throttle |
| US9830152B2 (en) | | Selective storing of previously decoded instructions of frequently-called instruction sequences in an instruction sequence buffer to be executed by a processor |
| US11243772B2 (en) | | Efficient load value prediction |
| CN104871144B (zh) | | 使用虚拟地址到物理地址跨页缓冲器的推测性寻址 |
| CN103238134B (zh) | | 编码于分支指令中的双模态分支预测器 |
| EP3149594B1 (en) | | Method and apparatus for cache access mode selection |
| US10719327B1 (en) | | Branch prediction system |
| EP2936303B1 (en) | | Instruction cache having a multi-bit way prediction mask |
| US20170046266A1 (en) | | Way Mispredict Mitigation on a Way Predicted Cache |
| US12380026B2 (en) | | Optimizing cache energy consumption in processor-based devices |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13821338; Country of ref document: EP; Kind code of ref document: A1 |
| | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2013821338; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2015549790; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |