WO2014100632A1 - Instruction cache having a multi-bit way prediction mask - Google Patents

Instruction cache having a multi-bit way prediction mask

Info

Publication number
WO2014100632A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction mask
way
bit
cache line
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2013/077020
Other languages
English (en)
French (fr)
Inventor
Peter G. Sassone
Suresh K. Venkumahanti
Lucian Codrescu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-12-20
Filing date
2013-12-20
Publication date
2014-06-26
Application filed by Qualcomm Inc
Priority to JP2015549790A (patent JP6212133B2, ja)
Priority to CN201380065463.XA (patent CN104854557B, zh)
Priority to EP13821338.4A (patent EP2936303B1, en)
Publication of WO2014100632A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • G06F12/0864 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using pseudo-associative means, e.g. set-associative or hashing
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806 Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
    • G06F9/3824 Operand accessing
    • G06F9/383 Operand prefetching
    • G06F9/3832 Value prediction for operands; operand history buffers
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60 Details of cache memory
    • G06F2212/608 Details relating to cache mapping
    • G06F2212/6082 Way prediction in set-associative cache
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • a first technique stores the last (previous) "next way" for each cache line (i.e., a single "next way"). Based on the single "next way", the instruction cache enables the single driver that corresponds to it. When the correct way is not driven (i.e., not provided to the multiplexer), a misprediction occurs and a second access to the instruction cache data array is performed that drives the correct way (based on a completed tag lookup operation or a signal provided from control logic). With the first technique, prediction accuracy is an issue because the control flow of a given program (e.g., multiple instructions) may be erratic. Accordingly, the last "next way" for a cache line is not necessarily a good predictor, and frequent mispredictions occur.
  • FIG. 4 is a flow diagram of a second illustrative embodiment of a method to perform way prediction
  • the data array 110 may include a plurality of ways 120-124 that each include a
  • multiplexer may determine whether the contents provided via the enabled drivers 140-144 include one or more instructions to be executed next (e.g., after the instruction(s) corresponding to the last cache line fetched). If the ways enabled by the prediction mask 152 (set to the prediction mask value) do not provide the one or more instructions to be executed next (e.g., the prediction mask does not predict a correct next way to be driven), a misprediction occurs. In a particular embodiment, the control logic 150 determines whether a misprediction occurs based on whether or not a multiplexer selects an output of the data array 110 that corresponds to a predicted way identified by the prediction mask 152.
  • the control logic 150 may receive an instruction address associated with an instruction stored in a cache line (e.g., a way) of the data array 110. Based on the instruction address, the control logic 150 may identify a particular prediction mask value associated with the instruction address. For example, the control logic 150 may identify the particular prediction mask value from the plurality of prediction mask values that each correspond to a cache line of the data array 110. After the particular prediction mask value is identified, the control logic 150 may set one or more bits of the prediction mask 152 so that the value of the prediction mask 152 is the same as the particular prediction mask value.
  • a power benefit may be realized on each data access of the instruction cache 102.
  • a misprediction should only occur once for each successor way because, after a misprediction, a bit of the multi-bit way prediction mask 152 is updated to identify the particular cache line as a successor.
  • the control logic 150 may set a value of the prediction mask 152.
  • the value set for the prediction mask 152 may correspond to a particular cache line 220a-d of the data array 110 that was last accessed.
  • the value of the prediction mask 152 may predict (e.g., identify) a "next way" (e.g., a subsequent way) with respect to the particular cache line 220a-d that was last fetched.
  • the value set for the prediction mask 152 may correspond to the cache line A 220a.
  • the prediction mask 152 of FIG. 2 may include four (4) bits, where each bit corresponds to a different way, and thus a different driver 240a-d, respectively.
  • control logic 150 may also initialize a particular prediction mask value when
  • the control logic 150 may determine (e.g., identify) which way is accessed next (e.g., subsequent to the particular cache line as a successor way). Upon determining the successor way, the control logic 150 may set a bit of the corresponding prediction mask value to identify the successor way.
  • the control logic 150 may also determine (e.g., detect) whether a misprediction occurs as a result of the prediction mask 152 being applied to the data array 110. For example, when the prediction mask 152 is set to a value of a particular prediction mask value corresponding to a particular cache line 220a-d of the data array 110, the control logic 150 may determine whether the particular prediction mask value resulted in a misprediction. When the particular prediction mask value resulted in the misprediction, the control logic 150 may identify the correct way to be driven (e.g., the correct driver to be enabled) and update the particular prediction mask value based on the identified correct way.
  • the third way corresponding to the cache line C 220c is identified by the control logic 150 as the correct way and the control logic 150 updates the cache line A prediction mask value 254 to "0011" reflecting the determination that the third way associated with the cache line C 220c is also a successor to the cache line A 220a.
  • the tag portion 274 may be provided to the tag array 280 in parallel with the control logic 150 applying the prediction mask 152 (associated with an instruction currently being executed) that predicts one or more ways that may be associated with an instruction to be executed next.
  • the tag portion 274 may be provided to the tag array 280 after (e.g., in response to) a misprediction.
  • the tag array 280 may identify the location (e.g., a cache line or way) in the data array 110 that includes the instruction to be executed next.
  • the tag array 280 may provide the location to the multiplexer 160 as the way select signal.
  • a particular instruction is fetched from a particular cache line 220a-d of the data array 110 and executed (by an execution unit). Based on the particular instruction being fetched and/or executed, the control logic 150 identifies a prediction mask value corresponding to the particular cache line. The control logic 150 may set the prediction mask 152 to the prediction mask value to selectively enable one or more drivers 240a-d of the data array 110. When the one or more drivers 240a-d are enabled, contents of selected cache lines (e.g., one or more of the cache lines 220a-d) corresponding to the one or more enabled drivers (e.g., one or more of the drivers 240a-d) may be provided to the multiplexer 260.
  • the instruction cache may include the instruction cache 102 of FIG. 1.
  • the method 300 may be performed by the control logic 150 of FIG. 1.
  • a multi-bit way prediction mask corresponding to a cache line may be set to an initial value, at 302, and the cache line may be fetched, at 304.
  • the multi-bit way prediction mask may be associated with an instruction cache including a data array having a plurality of cache lines.
  • the value of the multi-bit way prediction mask may correspond to a cache line included in the data array of the instruction cache.
  • the multi-bit prediction mask may be the prediction mask 152 and the data array may be the data array 110 of FIG. 1.
  • the multi-bit way prediction mask value corresponding to the first cache line is not updated when the second cache line is determined not to be the first cache line. Rather, when the second cache line is determined, at 418, not to be the first cache line, processing advances to 408, and the multi-bit prediction mask value corresponding to the first cache line is not updated based on data (e.g., the second contents) being loaded into the second cache line.
  • the apparatus may also include means for providing the multi-bit way prediction mask to a plurality of line drivers of the data array.
  • the means for providing may include the control logic 150 of FIGS. 1-2, the processor 610, the control logic 686 of FIG. 6, one or more other devices or circuits configured to provide the multi-bit way prediction mask, or any combination thereof.
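
The mask behavior described in the definitions above can be sketched as a small software model. This is an illustrative sketch only, not the patented circuit: the class name `WayPredictionSet`, its methods, the way indices for cache lines A, C and D, and the left-to-right bit order of the printed mask are all assumptions chosen so that the example reproduces the "0011" value mentioned for cache line A.

```python
class WayPredictionSet:
    """Toy model: one set of a 4-way instruction cache with per-line masks."""

    def __init__(self, num_ways=4):
        self.num_ways = num_ways
        # One prediction mask value per cache line; bit w set means
        # "way w has been observed as a successor of this line".
        self.mask_values = [0] * num_ways
        self.mispredictions = 0

    def mask_string(self, way):
        """Render a mask value with way 0 leftmost (an assumed bit order)."""
        value = self.mask_values[way]
        return "".join("1" if value & (1 << w) else "0"
                       for w in range(self.num_ways))

    def enabled_drivers(self, current_way):
        """Ways whose line drivers the mask of current_way would enable."""
        value = self.mask_values[current_way]
        return [w for w in range(self.num_ways) if value & (1 << w)]

    def fetch_next(self, current_way, next_way):
        """Apply the mask of current_way; return True if next_way was driven."""
        if next_way in self.enabled_drivers(current_way):
            return True
        # Misprediction: the correct way is identified (e.g., by tag lookup)
        # and the mask value of the current line is updated to include it.
        self.mispredictions += 1
        self.mask_values[current_way] |= 1 << next_way
        return False


cache_set = WayPredictionSet()
A, C, D = 0, 2, 3  # hypothetical way indices for cache lines A, C and D
results = [cache_set.fetch_next(A, nxt) for nxt in (D, C, D, C)]
print(results, cache_set.mask_string(A), cache_set.mispredictions)
```

In this run, line A is followed by D, C, D, C. The first visit to each successor mispredicts and sets its bit; later visits to the same successor are predicted correctly, which matches the statement that a misprediction should occur only once per successor way. The script prints `[False, False, True, True] 0011 2`.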

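The accuracy difference between the first technique (a single stored "next way" per cache line) and a multi-bit way prediction mask can be illustrated by counting mispredictions over a sequence of successor ways. The sketch below is a simplified comparison under assumed conditions: the function names and the alternating successor sequence are hypothetical, both predictors start empty, and power cost (the mask may enable more than one driver per access) is not modeled.

```python
def single_next_way_mispredictions(successors):
    """Count mispredictions when only the last "next way" is remembered."""
    last_next_way = None
    misses = 0
    for way in successors:
        if way != last_next_way:
            misses += 1          # wrong (or no) driver enabled: second access
        last_next_way = way      # remember only the most recent successor
    return misses


def multi_bit_mask_mispredictions(successors):
    """Count mispredictions when every observed successor sets a mask bit."""
    mask = 0
    misses = 0
    for way in successors:
        if not mask & (1 << way):
            misses += 1          # successor seen for the first time
            mask |= 1 << way     # record it so later accesses predict it
    return misses


# Hypothetical pattern: a cache line whose successor alternates between ways 3 and 2.
pattern = [3, 2, 3, 2, 3, 2]
single_misses = single_next_way_mispredictions(pattern)
mask_misses = multi_bit_mask_mispredictions(pattern)
print(single_misses, mask_misses)
```

For this alternating pattern the single "next way" predictor mispredicts on every access, while the mask mispredicts only on the first visit to each successor.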
Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Advance Control (AREA)
PCT/US2013/077020 2012-12-20 2013-12-20 Instruction cache having a multi-bit way prediction mask Ceased WO2014100632A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2015549790A JP6212133B2 (ja) 2012-12-20 2013-12-20 Instruction cache having a multi-bit way prediction mask
CN201380065463.XA CN104854557B (zh) 2012-12-20 2013-12-20 Apparatus and method for accessing a cache
EP13821338.4A EP2936303B1 (en) 2012-12-20 2013-12-20 Instruction cache having a multi-bit way prediction mask

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/721,317 US9304932B2 (en) 2012-12-20 2012-12-20 Instruction cache having a multi-bit way prediction mask
US13/721,317 2012-12-20

Publications (1)

Publication Number Publication Date
WO2014100632A1 (en) 2014-06-26

Family

ID=49956453

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/077020 Ceased WO2014100632A1 (en) 2012-12-20 2013-12-20 Instruction cache having a multi-bit way prediction mask

Country Status (5)

Country Link
US (1) US9304932B2 (en)
EP (1) EP2936303B1 (en)
JP (1) JP6212133B2 (en)
CN (1) CN104854557B (en)
WO (1) WO2014100632A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GR20150100422A (el) 2015-09-28 2017-05-15 Arm Limited Data storage
WO2019010703A1 (zh) * 2017-07-14 2019-01-17 华为技术有限公司 Method for reading and partially writing data, and related apparatus
US11620229B2 (en) * 2020-02-21 2023-04-04 SiFive, Inc. Data cache with prediction hints for cache hits

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6356990B1 (en) * 2000-02-02 2002-03-12 International Business Machines Corporation Set-associative cache memory having a built-in set prediction array
US20050050278A1 (en) * 2003-09-03 2005-03-03 Advanced Micro Devices, Inc. Low power way-predicted cache
US20050246499A1 (en) * 2004-04-30 2005-11-03 Nec Corporation Cache memory with the number of operated ways being changed according to access pattern

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5142633A (en) 1989-02-03 1992-08-25 Digital Equipment Corporation Preprocessing implied specifiers in a pipelined processor
US5287467A (en) 1991-04-18 1994-02-15 International Business Machines Corporation Pipeline for removing and concurrently executing two or more branch instructions in synchronization with other instructions executing in the execution unit
US5604909A (en) 1993-12-15 1997-02-18 Silicon Graphics Computer Systems, Inc. Apparatus for processing instructions in a computing system
JP3589485B2 (ja) * 1994-06-07 2004-11-17 株式会社ルネサステクノロジ Set-associative memory device and processor
US5848433A (en) * 1995-04-12 1998-12-08 Advanced Micro Devices Way prediction unit and a method for operating the same
US5781789A (en) * 1995-08-31 1998-07-14 Advanced Micro Devices, Inc. Superscaler microprocessor employing a parallel mask decoder
US5826071A (en) 1995-08-31 1998-10-20 Advanced Micro Devices, Inc. Parallel mask decoder and method for generating said mask
US5794028A (en) * 1996-10-17 1998-08-11 Advanced Micro Devices, Inc. Shared branch prediction structure
US5995749A (en) * 1996-11-19 1999-11-30 Advanced Micro Devices, Inc. Branch prediction mechanism employing branch selectors to select a branch prediction
US5845102A (en) * 1997-03-03 1998-12-01 Advanced Micro Devices, Inc. Determining microcode entry points and prefix bytes using a parallel logic technique
JP3469042B2 (ja) * 1997-05-14 2003-11-25 株式会社東芝 Cache memory
US6016533A (en) * 1997-12-16 2000-01-18 Advanced Micro Devices, Inc. Way prediction logic for cache array
US7085920B2 (en) 2000-02-02 2006-08-01 Fujitsu Limited Branch prediction method, arithmetic and logic unit, and information processing apparatus for performing brach prediction at the time of occurrence of a branch instruction
US6584549B2 (en) * 2000-12-29 2003-06-24 Intel Corporation System and method for prefetching data into a cache based on miss distance
US20020194462A1 (en) * 2001-05-04 2002-12-19 Ip First Llc Apparatus and method for selecting one of multiple target addresses stored in a speculative branch target address cache per instruction cache line
US7406569B2 (en) * 2002-08-12 2008-07-29 Nxp B.V. Instruction cache way prediction for jump targets
US7587580B2 (en) * 2005-02-03 2009-09-08 Qualcomm Incorporated Power efficient instruction prefetch mechanism
US8046538B1 (en) * 2005-08-04 2011-10-25 Oracle America, Inc. Method and mechanism for cache compaction and bandwidth reduction
US8275942B2 (en) * 2005-12-22 2012-09-25 Intel Corporation Performance prioritization in multi-threaded processors
US8225046B2 (en) 2006-09-29 2012-07-17 Intel Corporation Method and apparatus for saving power by efficiently disabling ways for a set-associative cache
US20080147989A1 (en) * 2006-12-14 2008-06-19 Arm Limited Lockdown control of a multi-way set associative cache memory
US8151084B2 (en) * 2008-01-23 2012-04-03 Oracle America, Inc. Using address and non-address information for improved index generation for cache memories
US8522097B2 (en) * 2010-03-16 2013-08-27 Qualcomm Incorporated Logic built-in self-test programmable pattern bit mask
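JP2011257800A (ja) * 2010-06-04 2011-12-22 Panasonic Corp Cache memory device, program conversion device, cache memory control method, and program conversion method
JP5954112B2 (ja) * 2012-10-24 2016-07-20 富士通株式会社 Memory device, arithmetic processing device, and cache memory control method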

Also Published As

Publication number Publication date
JP2016505971A (ja) 2016-02-25
JP6212133B2 (ja) 2017-10-11
US20140181405A1 (en) 2014-06-26
EP2936303A1 (en) 2015-10-28
EP2936303B1 (en) 2020-01-15
CN104854557A (zh) 2015-08-19
US9304932B2 (en) 2016-04-05
CN104854557B (zh) 2018-06-01

Similar Documents

Publication Publication Date Title
US9367468B2 (en) Data cache way prediction
US10838731B2 (en) Branch prediction based on load-path history
US9519586B2 (en) Methods and apparatus to reduce cache pollution caused by data prefetching
KR20180127379A (ko) Providing load address predictions using address prediction tables based on load path history in processor-based systems
US11620224B2 (en) Instruction cache prefetch throttle
US9830152B2 (en) Selective storing of previously decoded instructions of frequently-called instruction sequences in an instruction sequence buffer to be executed by a processor
US11243772B2 (en) Efficient load value prediction
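CN104871144B (zh) Speculative addressing using a virtual address-to-physical address page crossing buffer
CN103238134B (zh) Bimodal branch predictor encoded in a branch instruction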
EP3149594B1 (en) Method and apparatus for cache access mode selection
US10719327B1 (en) Branch prediction system
EP2936303B1 (en) Instruction cache having a multi-bit way prediction mask
US20170046266A1 (en) Way Mispredict Mitigation on a Way Predicted Cache
US12380026B2 (en) Optimizing cache energy consumption in processor-based devices

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13821338; Country of ref document: EP; Kind code of ref document: A1)
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
WWE WIPO information: entry into national phase (Ref document number: 2013821338; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2015549790; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)