TW201407348A - Translation address cache for a microprocessor - Google Patents
- Publication number: TW201407348A
- Authority: TW (Taiwan)
- Prior art keywords: instruction, address, translation, microprocessor, alternate version
Classifications
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
- G06F9/30174—Runtime instruction translation, e.g. macros for non-native instruction set, e.g. Javabyte, legacy code
- G06F12/1018—Address translation using page tables, e.g. page table structures involving hashing techniques, e.g. inverted page tables
- G06F9/30181—Instruction operation extension or modification
- G06F9/3808—Instruction prefetching for instruction reuse, e.g. trace cache, branch target cache
- G06F12/0862—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
- G06F9/3017—Runtime instruction translation, e.g. macros
- G06F9/3802—Instruction prefetching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Advance Control (AREA)
Abstract
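Embodiments are provided relating to fetching instructions, and alternate versions achieving the same functionality as those instructions, from an instruction cache included in a microprocessor. In one example, a method is provided that comprises, at an example microprocessor, fetching an instruction from an instruction cache. The example method also includes hashing an address for the instruction to determine whether an alternate version of the instruction that achieves the same functionality as the instruction exists. The example method further includes, if the hashing results in a determination that such an alternate version exists, aborting fetching of the instruction and retrieving and executing the alternate version.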
Description
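The present invention relates to cache memory of a microprocessor, and more particularly to fetching instructions from an instruction cache included in a microprocessor.
Architectural-level instructions for microprocessors may be translated between an instruction set architecture (ISA) and a native architecture. In some microprocessors, software optimizations of ISA instructions may execute more efficiently than the ISA instructions on which those software optimizations are based. Some past approaches chained software optimizations to pass control from one software optimization to another. However, such approaches may be challenged by indirect branch processing, because it may be difficult to determine the target of an indirect branch.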
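In one embodiment of the present invention, a microprocessor is provided, comprising fetch logic operative to: fetch an instruction; hash an address for the instruction to determine whether an alternate version of the instruction that achieves the same functionality as the instruction exists; and, if the hashing results in a determination that such an alternate version exists, abort the fetching and retrieve and execute the alternate version.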
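100‧‧‧Microprocessor
102‧‧‧Pipeline
109‧‧‧Registers
110‧‧‧Memory hierarchy
110A‧‧‧L1 processor cache
110B‧‧‧L2 processor cache
110C‧‧‧L3 processor cache
110D‧‧‧Main memory
110E‧‧‧Secondary storage
110F‧‧‧Tertiary storage
110H‧‧‧Memory controller
120‧‧‧Fetch logic
122‧‧‧Instruction translation lookaside buffer
124‧‧‧Translation address cache
126‧‧‧Physical address multiplexer
128‧‧‧Instruction cache
130‧‧‧Native translation buffer
132‧‧‧Decode logic
134‧‧‧Scheduling logic
136‧‧‧Execution logic
138‧‧‧Memory logic
140‧‧‧Writeback logic
200‧‧‧4-way associative cache
300‧‧‧Method
400‧‧‧Method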
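FIG. 1 schematically shows a microprocessor according to an embodiment of the present invention;
FIG. 2 schematically shows a translation address cache according to an embodiment of the present invention;
FIG. 3A shows a portion of a flowchart for a method, according to an embodiment of the present invention, of fetching an instruction from an instruction cache and determining whether an alternate version of the instruction is stored in the instruction cache;
FIG. 3B shows another portion of the flowchart illustrated in FIG. 3A;
FIG. 3C shows another portion of the flowchart illustrated in FIGS. 3A and 3B;
FIG. 4 schematically shows a method, according to an embodiment of the present invention, of hashing a linear address for an instruction to generate a hash index and a disambiguation tag for the linear address; and
FIG. 5 schematically shows a translation address cache entry according to an embodiment of the present invention.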
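In modern microprocessors, architectural-level instructions may be translated between a source instruction set architecture (ISA), such as the Advanced RISC Machine (ARM) architecture or the x86 architecture, and an alternate ISA that achieves the same observable functionality as the source. For example, a set of one or more instructions of a source ISA may be translated into one or more micro-operations of a native architecture that perform the same function as the source ISA instructions. In some settings, the native micro-operations may provide enhanced or optimized performance relative to the source ISA instructions.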
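Some past approaches attempted to chain software optimizations of source instructions so that control passed from one software optimization to another via direct native branches. However, such approaches may be challenged by branch processing. Because a branch source may be dynamic during program execution, chain-wise handoff between software optimizations may not be feasible. For example, should an indirect branch occur, the indeterminate target of the branch may make it difficult to ascertain, at the time an optimization is generated, which software optimization should be retrieved. Consequently, the microprocessor may stall while the branch, and the software optimization for that branch, are determined from among potentially thousands of candidate optimizations.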
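Accordingly, various embodiments are disclosed herein that relate to fetching source information and alternate versions of the source information that achieve the same observable functionality as the source information (referred to herein as the same functionality) within an acceptable tolerance (e.g., within an acceptable tolerance of architecturally observable effect). It will be appreciated that virtually any suitable source information and any alternate version thereof may be employed without departing from the scope of the present invention. In some embodiments, a source may include an instruction, such as an instruction for an ISA. In addition to or instead of instructions, source information may include source data, and the alternate version may include an alternative form or version of the source data. Likewise, it will be appreciated that any suitable manner of transforming a source into an alternate version thereof (e.g., a software approach and/or a hardware approach) may be contemplated as being within the scope of the present invention. For illustrative purposes, the descriptions and figures presented herein refer to source instructions and translations of the source instructions as source information and alternate versions of the source information, respectively, though such embodiments are not limiting.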
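One example method includes, upon being directed to retrieve an instruction, hashing an address for that instruction so that it may be determined whether an alternate version of that instruction exists. The hashing is performed to determine whether there exists an alternate version of the instruction that achieves the same functionality, such as a native translation (e.g., a translation between a source instruction set architecture and a native micro-operation set architecture for various instructions that may be fetched for execution by the microprocessor). The example method further includes, if the hashing results in a determination that such an alternate version exists, aborting fetching of the instruction and retrieving and executing the alternate version.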
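The discussion herein will frequently refer to "retrieving" an instruction and then aborting that retrieval if certain conditions exist. In some embodiments, "retrieving" an instruction may include fetching an instruction. Further, when such an abort occurs, the retrieval process is terminated. The termination typically occurs before the retrieval process completes. For example, in one scenario, aborting the retrieval may occur while the physical address for an instruction is being retrieved. In another scenario, aborting the retrieval may occur after the physical address for an instruction has been retrieved but before the instruction is retrieved from memory. Aborting retrieval before the retrieval process completes may save the time spent accessing and retrieving the source from memory. It will be appreciated that, as used herein, retrieval is not limited to fetch scenarios, where fetch is typically completed prior to decode. For example, an instruction may be retrieved but aborted during decode, before decode, or at any suitable point.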
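A wide range of possibilities exists for mapping and translating between source information and translated versions of that information. By determining whether an alternate version exists and aborting retrieval of the instruction (e.g., an ISA instruction) if the alternate version does exist, the microprocessor avoids decode operations and may therefore offer enhanced performance relative to microprocessors that decode source ISA instructions. Additional performance enhancement may be realized in settings where the alternate version provides optimized performance through changes to the operations that allow the alternate version to proceed through execution more quickly than the source ISA instruction.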
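FIG. 1 schematically depicts an embodiment of a microprocessor 100 that may be employed in connection with the systems and methods described herein. Microprocessor 100 may include processor registers 109. Further, microprocessor 100 may include and/or may communicate with a memory hierarchy 110, which may include an L1 processor cache 110A, an L2 processor cache 110B, an L3 processor cache 110C, main memory 110D (e.g., one or more DRAM chips), secondary storage 110E (e.g., magnetic and/or optical storage units) and/or tertiary storage 110F (e.g., magnetic tape). It will be understood that the example memory/storage components are listed in increasing order of access time and capacity, though there are possible exceptions.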
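A memory controller 110H may be used to handle the protocol and provide the signal interface required of main memory 110D, and to schedule memory accesses. Memory controller 110H may be implemented on the processor die or on a separate die. It is to be understood that the memory hierarchy provided above is non-limiting and that other memory hierarchies may be used without departing from the scope of the present invention.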
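Microprocessor 100 also includes a pipeline, illustrated in simplified form in FIG. 1 as pipeline 102. Pipelining may allow more than one instruction to be in different stages of retrieval and execution concurrently. Put another way, a set of instructions may be passed through the various stages included in pipeline 102 (including fetch, decode, execution, and writeback stages, among others) while another instruction and/or data is retrieved from memory and acted upon by pipeline 102. Thus, downstream stages in pipeline 102 may be utilized while upstream stages are waiting for memory to return instructions and/or data. This approach may potentially accelerate instruction and data processing by the microprocessor relative to approaches that retrieve and execute instructions and/or data in an individual, serial manner.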
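As shown in FIG. 1, example pipeline 102 includes fetch logic 120, a native translation buffer 130, decode logic 132, scheduling logic 134, execution logic 136, memory logic 138, and writeback logic 140. Fetch logic 120 fetches a selected instruction from an instruction cache for execution. In the example shown in FIG. 1, fetch logic 120 includes an instruction translation lookaside buffer 122 for translating a linear address of the selected instruction into a physical address for the instruction to be fetched for execution. As used herein, a linear address for an instruction refers to an address that is translated/remapped by a page table to a physical address associated with a location in memory where the instruction is stored. In some embodiments, the linear address may include directory, table, and/or offset entries that may identify a page directory, a page table, and/or a page frame location in a page table where the physical address for the instruction may be found.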
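Instruction translation lookaside buffer 122 may perform virtually any suitable manner of translating linear addresses into physical addresses for those instructions. For example, in some embodiments, instruction translation lookaside buffer 122 may include content-addressable memory that stores a portion of a page table that maps linear addresses for instructions to physical addresses for those instructions.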
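By way of a non-limiting illustration, the following Python sketch models an instruction translation lookaside buffer as a small associative store that maps linear page numbers to physical frame numbers. The 4 KiB page size, the class name `InstructionTLB`, and the dictionary-based storage are assumptions chosen for illustration and are not specified in this description.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages; the page size is not given in the text
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class InstructionTLB:
    """Toy associative store holding a portion of a page table."""

    def __init__(self):
        # linear page number -> physical frame number
        self._entries = {}

    def insert(self, linear_page, physical_frame):
        self._entries[linear_page] = physical_frame

    def translate(self, linear_address):
        """Return the physical address, or None when no entry matches."""
        frame = self._entries.get(linear_address >> PAGE_SHIFT)
        if frame is None:
            return None
        return (frame << PAGE_SHIFT) | (linear_address & PAGE_MASK)


itlb = InstructionTLB()
itlb.insert(0x12345, 0x00ABC)
hit = itlb.translate(0x12345678)   # page 0x12345, offset 0x678
miss = itlb.translate(0x99999000)  # no entry for page 0x99999
```

In this sketch a hit keeps the page offset and replaces only the page number, while a miss returns `None` so that a caller could fall back to a slower lookup.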
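Fetch logic 120 also determines whether a native translation for the selected instruction exists. If such a native translation exists, the system aborts the instruction fetch and sends the native translation for execution instead. In the embodiment depicted in FIG. 1, fetch logic 120 includes a translation address cache 124 for storing the addresses of native translations.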
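Almost any suitable data storage architecture and logic may be used for translation address cache 124. For example, FIG. 2 schematically shows an embodiment of a 4-way associative cache 200 employed as a translation address cache. In the embodiment shown in FIG. 2, 1024 translation address entries may be stored in any of four ways, depending on the address scheme selected, with each way including 256 data locations. However, it will be appreciated that some embodiments may have fewer data ways and/or data locations while other embodiments may include more data ways and/or data locations without departing from the scope of the present invention.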
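The organization of FIG. 2 can be sketched, under stated assumptions, as four ways of 256 locations each. The list-of-lists layout and the helper names `capacity` and `candidates` are hypothetical and serve only to show how one index can select up to four entries.

```python
NUM_WAYS = 4
LOCATIONS_PER_WAY = 256  # one location per way for each 8-bit index

# ways[w][index] holds one translation address entry, or None when empty.
ways = [[None] * LOCATIONS_PER_WAY for _ in range(NUM_WAYS)]


def capacity(cache):
    """Total number of entries the cache can hold."""
    return sum(len(way) for way in cache)


def candidates(cache, index):
    """Return the (up to four) entries that share one index."""
    return [way[index] for way in cache if way[index] is not None]


# Two hypothetical entries that collide on index 0x2A but sit in different ways.
ways[0][0x2A] = "entry A"
ways[3][0x2A] = "entry B"
```

Here `capacity(ways)` evaluates to 1024, matching the 4 x 256 arrangement described above.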
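Continuing with FIG. 1, fetch logic 120 includes a physical address multiplexer 126 that multiplexes physical addresses received from instruction translation lookaside buffer 122 and translation address cache 124 and distributes them to instruction cache 128. In turn, instruction cache 128 retrieves the instructions and native translations stored for execution by microprocessor 100 with reference to the physical addresses for those instructions and native translations. If fetch logic 120 determines that a native translation exists for a selected instruction, the native translation is retrieved from instruction cache 128 and may be forwarded to an optional native translation buffer 130 in preparation for eventual distribution to scheduling logic 134. Alternatively, if fetch logic 120 determines that no native translation exists for the selected instruction, the selected instruction is retrieved from instruction cache 128 and forwarded to decode logic 132. Decode logic 132 decodes the selected instruction, for example by parsing opcodes, operands, and addressing modes, and generates a decoded set of one or more native instructions or micro-operations in preparation for eventual distribution to scheduling logic 134. Scheduling logic 134 schedules the native translations and decoded instructions for execution by execution logic 136.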
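The embodiment depicted in FIG. 1 shows instruction cache 128 as including a physically-indexed-physically-tagged (PIPT) instruction cache, so that an address for a native translation may be retrieved from translation address cache 124 concurrently with retrieval of the source address from instruction translation lookaside buffer 122. However, it will be understood that any suitable instruction cache 128 may be employed according to embodiments of the present invention. For example, in some embodiments, instruction cache 128 may include a linear-indexed-physically-tagged (LIPT) instruction cache. In some embodiments, fetch logic may concurrently retrieve an address for a source from an instruction translation lookaside buffer, an address for a native translation from a translation address cache, and the source from the LIPT instruction cache. If a native translation is available, the instruction may be discarded and the native translation may be retrieved from the LIPT cache for execution based on the address for the native translation. If no native translation is available, the instruction may be decoded and then executed.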
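Pipeline 102 may also include memory logic 138 for performing load and/or store operations and writeback logic 140 for writing the results of operations to an appropriate location such as register 109. Upon writeback, the microprocessor enters a state modified by the instruction, so that the result of the operations leading to the committed state may not be undone.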
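It should be understood that the stages shown in pipeline 102 above are illustrative of a typical RISC implementation and are not meant to be limiting. For example, in some embodiments, VLIW techniques may be implemented upstream of certain pipeline stages. In some other embodiments, the scheduling logic may be included in the fetch logic and/or the decode logic of the microprocessor. More generally, a microprocessor may include fetch, decode, and execution logic, with memory and writeback functionality being carried out by the execution logic. The present invention is equally applicable to these and other microprocessor implementations.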
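In the described examples, instructions may be fetched and executed one at a time or more than one at a time, possibly requiring multiple clock cycles. During this time, significant parts of the data path may be unused. In addition to or instead of single instruction fetching, pre-fetch methods may be used to improve performance and avoid latency bottlenecks associated with read and store operations (i.e., the reading of instructions and the loading of such instructions into processor registers and/or execution queues). Accordingly, it will be appreciated that virtually any suitable manner of fetching, scheduling, and dispatching instructions may be used without departing from the scope of the present invention.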
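FIGS. 3A-3C schematically show an embodiment of a method 300 for fetching a selected instruction from an instruction cache and determining whether a native translation for the selected instruction is stored in the instruction cache. While method 300 is described with respect to determining whether a native translation is available for an instruction, it will be understood that this scenario is merely an illustration of fetching an instruction and determining whether there exists an alternate version that achieves the same functionality as the instruction, and that method 300 is not limited to the example or setting described below. Thus, it will be appreciated that the processes described in method 300 are arranged and described by way of example and are not meant to be limiting. In some embodiments, the methods described herein may include additional or alternative processes, while in some embodiments, the methods described herein may include some processes that may be reordered or omitted without departing from the scope of the present invention. Further, it will be appreciated that the methods described herein may be performed using any suitable hardware, including the hardware described herein.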
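Turning to FIG. 3A, method 300 includes, at 302, being directed to fetch a selected instruction from an instruction cache. In some embodiments, the fetch process may be directed to retrieve an instruction with reference to a linear address for the selected instruction. For example, a selected instruction may be fetched from the instruction cache responsive to a branch to a target instruction pointer, such as a branch resulting from a branch predictor or from a branch validation point in a microprocessor pipeline. It will be appreciated that process 302 may include looking up a physical address for the selected instruction in an instruction translation lookaside buffer, as described in more detail below.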
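In some embodiments, fetching the selected instruction may include fetching a physical address for the selected instruction from an instruction translation lookaside buffer. In such embodiments, a linear address for the selected instruction may be received upon direction to the target instruction pointer. In turn, the linear address may be translated into a physical address for the selected instruction by the instruction translation lookaside buffer, which searches the physical addresses it stores with reference to the linear address. If the search does not hit upon the physical address for the selected instruction, the physical address may be determined via a page walk or via lookup in a higher-level translation lookaside buffer. Regardless of how the physical address is determined, once the physical address for the selected instruction is determined, it is provided to an instruction cache so that the selected instruction may be obtained.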
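The lookup-with-fallback behavior just described can be sketched as follows. This is a minimal illustration that assumes 4 KiB pages, a dictionary standing in for the instruction translation lookaside buffer, and a hypothetical `page_walk` callable; refilling the buffer after a miss is likewise an assumption rather than something stated in this description.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def lookup_physical(linear_address, itlb, page_walk):
    """Return (physical_address, source) for a linear address.

    itlb      -- dict mapping linear page -> physical frame (stand-in for the buffer)
    page_walk -- callable used on a miss (hypothetical slower lookup path)
    """
    page = linear_address >> PAGE_SHIFT
    offset = linear_address & PAGE_MASK
    frame = itlb.get(page)
    source = "itlb"
    if frame is None:
        frame = page_walk(page)
        itlb[page] = frame  # assumed refill so that later fetches hit
        source = "page_walk"
    return (frame << PAGE_SHIFT) | offset, source


page_table = {0x40: 0x7, 0x41: 0x9}  # hypothetical page table contents


def page_walk(page):
    return page_table[page]


itlb = {0x40: 0x7}
first = lookup_physical(0x41010, itlb, page_walk)   # misses, then walks
second = lookup_physical(0x41010, itlb, page_walk)  # hits after the refill
```

Either path yields the same physical address; only the cost of obtaining it differs.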
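At 304, method 300 comprises, while the physical address for the selected instruction is being obtained, hashing the linear address of the selected instruction to generate a hash index from the linear address. The hash index may then be used when determining whether a native translation for the selected instruction exists, as described in more detail below.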
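For example, direction to the target instruction pointer may cause the linear address to be hashed concurrently (within a suitable tolerance) with distribution of the linear address to an instruction translation lookaside buffer. However, it will be appreciated that any suitable manner of performing the hash may be employed at any suitable position within the process flow without departing from the scope of the present invention.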
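In some embodiments, the linear address may be hashed by a suitable hardware structure included in the microprocessor. For example, the linear address may be hashed by the fetch logic and/or the native translation address cache, though virtually any suitable hardware structure may be used to hash the linear address without departing from the scope of the present invention.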
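A wide variety of hash techniques may be employed. For example, in some embodiments, the hash index may be generated using an XOR hash function. A hash index may also be generated by hashing a plurality of portions of the linear address. In some other embodiments, a hash index may be generated by using a single portion of the linear address. FIG. 4 schematically shows a method of hashing a 48-bit linear address for an instruction to generate an 8-bit hash index using an XOR hash function. In the example shown in FIG. 4, the result of XOR'ing bits 0-7 with bits 8-15 is XOR'ed with bits 16-23 to generate the 8-bit hash index.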
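The XOR hash of FIG. 4 can be written out directly. The following Python sketch assumes that bit 0 is the least significant bit of the linear address; the function name `hash_index` and the sample address are hypothetical.

```python
def hash_index(linear_address):
    """XOR-fold bits 0-7, 8-15 and 16-23 into an 8-bit index."""
    low = linear_address & 0xFF           # bits 0-7
    mid = (linear_address >> 8) & 0xFF    # bits 8-15
    high = (linear_address >> 16) & 0xFF  # bits 16-23
    return low ^ mid ^ high


# Sample 48-bit address: the three low bytes are 0x78, 0x56 and 0x34.
index = hash_index(0x7FFF_1234_5678)
```

For the sample address, 0x78 XOR 0x56 gives 0x2E, and 0x2E XOR 0x34 gives 0x1A, so the entry would be looked up at index 0x1A of each way.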
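In some embodiments, a disambiguation tag may be generated when the linear address is hashed. The disambiguation tag may be used to discriminate among translation address entries for different alternate versions (for example, address entries for native translations of instructions) when more than one translation address entry in the translation address cache has the same index value. Thus, in some embodiments, the disambiguation tag may be used to disambiguate a plurality of translation address entries having identical translation address indices stored in the translation address cache. For example, FIG. 4 schematically shows a 40-bit disambiguation tag for the 48-bit linear address, generated from portions of the linear address not formed into the 8-bit hash index. Thus, in some embodiments, bits not used to generate the hash index may be used to generate the disambiguation tag. In the example shown in FIG. 4, bits 8-47 may be used to form the disambiguation tag. However, any suitable manner of generating the disambiguation tag may be employed without departing from the scope of the present invention.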
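As a sketch of how the tag complements the index, the following Python example takes the upper 40 bits of a 48-bit linear address as the disambiguation tag. It assumes bit positions numbered from 0, so that the 40-bit tag occupies bits 8 through 47; the function names and sample addresses are hypothetical.

```python
ADDRESS_BITS = 48
INDEX_BITS = 8
TAG_BITS = ADDRESS_BITS - INDEX_BITS  # 40


def hash_index(linear_address):
    """XOR-fold bits 0-7, 8-15 and 16-23 into an 8-bit index."""
    return (linear_address ^ (linear_address >> 8) ^ (linear_address >> 16)) & 0xFF


def disambiguation_tag(linear_address):
    """Take bits 8-47 of the linear address as a 40-bit tag."""
    return (linear_address >> INDEX_BITS) & ((1 << TAG_BITS) - 1)


# Two hypothetical addresses that collide on the index but differ in tag.
a = 0x0000_0000_0101
b = 0x0000_0000_0202
same_index = hash_index(a) == hash_index(b)
different_tag = disambiguation_tag(a) != disambiguation_tag(b)
```

In this particular sketch the index and tag together identify the address uniquely, because bits 0-7 can be recovered by XOR'ing the index with bits 8-15 and 16-23 held in the tag.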
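While the discussion above relates to hashing a linear address to obtain one or more translation address entries from a translation address cache, so that the translation address entries are indexed according to linear addresses, it will be appreciated that the translation address cache may be indexed according to any suitable address. For example, in some embodiments, a suitably configured translation address cache may be indexed according to physical addresses. Indexing a translation address cache according to physical addresses may save space within the translation address cache when two processes map to a shared library at different linear addresses. In some such scenarios, only one version of the shared library may be physically loaded into memory. By indexing according to physical address, a shared mapping may result in a single entry being obtained, while unshared mappings may result in different entries being obtained.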
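The space saving from physical indexing can be illustrated with a small Python sketch. The two mappings, their addresses, and the helper `keys_used` are hypothetical and exist only to count how many distinct cache keys each indexing choice would produce.

```python
# Hypothetical mappings: two processes map one shared library page at
# different linear addresses onto the same physical address.
process_a = {0x7F00_0000_1000: 0x0004_2000}
process_b = {0x7E55_0000_3000: 0x0004_2000}


def keys_used(mappings, by_physical):
    """Collect the distinct keys a translation address cache would index on."""
    keys = set()
    for mapping in mappings:
        for linear, physical in mapping.items():
            keys.add(physical if by_physical else linear)
    return keys


linear_keys = keys_used([process_a, process_b], by_physical=False)
physical_keys = keys_used([process_a, process_b], by_physical=True)
```

Indexing by linear address yields two keys for the same library code, whereas indexing by physical address yields one.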
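Turning to FIG. 3B, example method 300 includes, at 306, determining whether a valid native translation exists for the selected source instruction being fetched. In some embodiments, the determination of whether a valid native translation exists occurs concurrently (within an acceptable tolerance) with determination of the physical address for the selected instruction, including retrieval of the address from an instruction translation lookaside buffer. In such embodiments, if it is determined that a valid native translation does not exist, concurrent processing at one or more of these stages may allow the physical address fetch to continue without penalty. However, it will be understood that the determinations need not be concurrent in some embodiments.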
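Regardless of when the validity determination is performed, if it is determined that a valid native translation exists, fetching the source instruction is aborted, for example by aborting retrieval of the physical address for the source instruction. In turn, processing efficiency may be enhanced by avoiding decode steps and by permitting use of the alternate version.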
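In the embodiment shown in FIG. 3B, determining whether a valid native translation exists includes, at 308, obtaining one or more translation address entries for the hashed address, and, at 310, comparing a disambiguation tag generated during the hashing process with the one or more translation address disambiguation tags obtained with each of the translation address entries obtained.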
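A translation address entry stores a physical address at which a native translation is stored. Translation address entries may be looked up according to a translation address index associated therewith. For example, a hash index generated when hashing an address may be used to look up a particular translation address index in a translation address cache.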
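In some embodiments, more than one translation address entry may be obtained via lookup of a particular translation address index. For example, a hashed address used to look up a translation address index for a 4-way associative cache may result in the retrieval of up to four translation address entries. In such embodiments, each translation address entry has a respective translation address disambiguation tag that distinguishes that entry from other entries having an identical translation address index. Comparing the disambiguation tag generated by hashing the address with the disambiguation tags retrieved with the respective translation address entries may determine whether any of the entries obtained represents a physical address for a valid native translation. In some embodiments, comparison of the disambiguation tags may include a comparison of a valid bit. In such embodiments, agreement between the tags being compared may be found only if the valid bit is set to a preselected value, such as a value of 1.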
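The tag comparison with a valid bit can be sketched as follows. The `Entry` dataclass, its field names, and the sample values are assumptions for illustration; the sketch simply shows that an entry counts as a match only when its tag agrees and its valid bit is set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool                        # valid bit (True stands for the value 1)
    tag: int                           # disambiguation tag stored with the entry
    translation_physical_address: int  # where the native translation is stored


def find_translation(entries: List[Entry], tag: int) -> Optional[int]:
    """Compare the generated tag with each entry fetched for one index."""
    for entry in entries:
        if entry.valid and entry.tag == tag:
            return entry.translation_physical_address
    return None


# Up to four entries may come back from a 4-way lookup; three are shown here.
fetched = [
    Entry(valid=True, tag=0x11, translation_physical_address=0x1000),
    Entry(valid=False, tag=0x22, translation_physical_address=0x2000),
    Entry(valid=True, tag=0x33, translation_physical_address=0x3000),
]
```

A lookup for tag 0x33 returns 0x3000, while a lookup for tag 0x22 finds nothing because that entry's valid bit is clear.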
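In some embodiments, a translation address entry may include bits representative of the physical address for a native translation and bits representative of an assumed context for the native translation. Additionally, in some embodiments, a translation address entry may include one or more other bits related to the translation and/or aspects of the translation. FIG. 5 schematically shows an embodiment of a translation address entry that includes physical address bits, assumed context bits, and translation-related bits.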
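One possible way to pack the three fields of FIG. 5 into a single word is sketched below. The field widths (40 physical address bits, 8 assumed context bits, 4 translation-related bits) and the field ordering are assumptions; this description does not specify them.

```python
PHYS_BITS = 40     # assumed width of the physical address field
CONTEXT_BITS = 8   # assumed width of the assumed-context field
FLAG_BITS = 4      # assumed width of the translation-related field


def pack_entry(physical_address, assumed_context, flags):
    """Pack the three fields into one integer, physical address lowest."""
    if physical_address >= (1 << PHYS_BITS):
        raise ValueError("physical address does not fit")
    if assumed_context >= (1 << CONTEXT_BITS) or flags >= (1 << FLAG_BITS):
        raise ValueError("context or flags do not fit")
    return (flags << (PHYS_BITS + CONTEXT_BITS)) | (assumed_context << PHYS_BITS) | physical_address


def unpack_entry(word):
    """Split a packed entry back into (physical_address, assumed_context, flags)."""
    physical_address = word & ((1 << PHYS_BITS) - 1)
    assumed_context = (word >> PHYS_BITS) & ((1 << CONTEXT_BITS) - 1)
    flags = (word >> (PHYS_BITS + CONTEXT_BITS)) & ((1 << FLAG_BITS) - 1)
    return physical_address, assumed_context, flags


word = pack_entry(0x12_3456_7000, 0b1010_0001, 0b0011)
fields = unpack_entry(word)
```

Packing and unpacking are inverses, so `fields` recovers the three values passed to `pack_entry`.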
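Continuing with FIG. 3B, method 300 comprises, at 312, determining whether the disambiguation tag generated when hashing the address agrees with any of the disambiguation tags obtained with the translation address entries. If the disambiguation tags do not agree, method 300 advances to 330, depicted in FIG. 3C. If a disambiguation tag obtained from the translation address cache agrees with the disambiguation tag generated by the hashing, the agreement indicates that a valid disambiguation tag was obtained. In some embodiments, the existence of a valid disambiguation tag may lead to a determination that a valid translation exists. However, in some embodiments, the existence of a valid disambiguation tag alone may not support the conclusion that the entry associated with that tag includes a valid native translation. Thus, method 300 may branch at 314, discussed in more detail below, or alternatively may continue to 318, depicted in FIG. 3C.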
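As introduced above, in some embodiments, a translation address entry may include an assumed context for the native translation. As used herein, a current context describes a current working state of the microprocessor, and an assumed context describes a state of the microprocessor for which the native translation is valid. Thus, in some embodiments, even if a valid disambiguation tag for an entry is identified, the entry associated with that disambiguation tag may not include a valid native translation for the current context. In some examples, issuing a native translation for which the current context and the assumed context do not agree may cause an execution error or hazard.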
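It will be appreciated that the context may be included in any suitable part of the translation address entry and/or the translation address. In the example shown in FIG. 5, the context bits are illustrated as being included in the translation address entry. In such embodiments, the contexts optionally may be compared, as shown at 316 of FIG. 3C. Thus, instead of advancing to 318, method 300 optionally may branch at 314, comparing the current context for the microprocessor to the assumed context stored in the translation address entry. Turning to FIG. 3C, in such embodiments, method 300 may comprise, at 316, determining whether the current context agrees with the assumed context. In some embodiments, the current context may be compared to the assumed context to determine agreement. In one example scenario, agreement may be found if the assumed and current contexts agree based on a one-to-one comparison. If the contexts agree, method 300 continues to 318, where method 300 makes a determination that a valid native translation exists. If the contexts do not agree, method 300 advances to 330, where method 300 makes a determination that a valid native translation does not exist.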
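The decision path through 312, 314, 316, 318 and 330 can be summarized in a short Python sketch. Representing a context as an integer and making the context comparison switchable through a `check_context` flag are assumptions made for illustration.

```python
def valid_translation_exists(tag_matches, current_context, assumed_context,
                             check_context=True):
    """Mirror the decisions at 312 and 316 of method 300."""
    if not tag_matches:
        return False  # 312 -> 330: no agreeing disambiguation tag
    if check_context and current_context != assumed_context:
        return False  # 316 -> 330: contexts do not agree
    return True       # 318: a valid native translation exists


matching = valid_translation_exists(True, 0b01, 0b01)
stale = valid_translation_exists(True, 0b01, 0b10)
```

With the optional comparison enabled, a tag match alone is not sufficient: the entry in `stale` is rejected because its assumed context differs from the current one.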
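Additionally or alternatively, in some embodiments, bits for the assumed context may be included in the translation address, such as in the disambiguation tag and/or the hash. In such embodiments, including the assumed context in one or more parts of the address may allow two or more entries with different contexts and the same linear address to be stored concurrently within the translation address cache. It will be appreciated that implementation of such embodiments may depend upon application-specific considerations. For example, in some embodiments where set associativity is low, such as in a scenario where the addresses are directly mapped, including the assumed context in the hash may avoid a conflict miss. For example, the assumed context may be XOR'ed into the hash during hashing. In some other embodiments, such as those where the cycle time for hashing additional bits affects processing time more than the time for processing a comparatively wider disambiguation tag, the assumed context may be added to the disambiguation tag to avoid potential processing delays. As an example, the assumed context may be appended to the disambiguation tag. In still other embodiments, the assumed context may be included in both the hash and the disambiguation tag.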
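Both placements of the assumed context can be sketched in a few lines of Python. The 8-bit context width, the function names, and the sample address are assumptions; the hash and tag follow the 48-bit example of FIG. 4 with bit 0 taken as the least significant bit.

```python
CONTEXT_BITS = 8  # assumed width of the assumed-context field
CONTEXT_MASK = (1 << CONTEXT_BITS) - 1


def hash_index_with_context(linear_address, assumed_context):
    """XOR the assumed context into the 8-bit XOR hash of the address."""
    index = (linear_address ^ (linear_address >> 8) ^ (linear_address >> 16)) & 0xFF
    return index ^ (assumed_context & CONTEXT_MASK)


def tag_with_context(linear_address, assumed_context):
    """Append the assumed context below the 40-bit tag (bits 8-47)."""
    tag = (linear_address >> 8) & ((1 << 40) - 1)
    return (tag << CONTEXT_BITS) | (assumed_context & CONTEXT_MASK)


address = 0x7FFF_1234_5678
index_ctx1 = hash_index_with_context(address, 0x01)
index_ctx2 = hash_index_with_context(address, 0x02)
tag_ctx1 = tag_with_context(address, 0x01)
tag_ctx2 = tag_with_context(address, 0x02)
```

With the context folded into the hash, the two contexts land on different indices even in a direct-mapped arrangement; with the context appended to the tag, they may share an index but remain distinguishable by tag.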
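Once it is determined that a valid native translation exists, method 300 comprises, at 320, aborting fetching of the instruction. When the abort occurs, the fetch process is terminated. While the termination may occur after the instruction has been fetched, in some embodiments the termination may occur prior to completion of the fetch process. For example, in embodiments where fetching the instruction includes retrieving the physical address for the instruction from an instruction translation lookaside buffer, aborting fetching of the instruction may include aborting retrieval of the physical address from the instruction translation lookaside buffer.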
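At 322, method 300 includes sending the physical address for the native translation to the instruction cache, and, at 324, receiving the selected native translation from the instruction cache. In some embodiments, once the selected native translation is received from the instruction cache, it may be forwarded to a native translation buffer in preparation for eventual distribution to scheduling logic, where it is scheduled for execution.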
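Alternatively, in the embodiment shown in FIG. 3C, if a valid native translation does not exist, method 300 comprises, at 332, allowing the fetch from the instruction cache to complete. For example, in embodiments where fetching the instruction includes retrieving the physical address from an instruction translation lookaside buffer, method 300 may include, at 334, after receiving the physical address for the instruction from the instruction translation lookaside buffer, sending the physical address for the instruction to the instruction cache so that the instruction may be obtained from the instruction cache at 336.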
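Taken together, the two outcomes of method 300 can be sketched end to end. The following Python example is a simplified software model under several assumptions: dictionaries stand in for the translation address cache, the instruction translation lookaside buffer (keyed here by full linear address), and the instruction cache; the lookups run one after another rather than concurrently; and all names and addresses are hypothetical.

```python
def hash_index(linear_address):
    """XOR-fold bits 0-7, 8-15 and 16-23 into an 8-bit index."""
    return (linear_address ^ (linear_address >> 8) ^ (linear_address >> 16)) & 0xFF


def disambiguation_tag(linear_address):
    """Take bits 8-47 of the linear address as a 40-bit tag."""
    return (linear_address >> 8) & ((1 << 40) - 1)


def fetch(linear_address, translation_cache, itlb, instruction_cache):
    """Return (kind, payload), where kind is 'native' or 'source'."""
    index = hash_index(linear_address)
    tag = disambiguation_tag(linear_address)
    for entry_tag, translation_address in translation_cache.get(index, []):
        if entry_tag == tag:
            # 320-324: abandon the source fetch and use the translation's address.
            return "native", instruction_cache[translation_address]
    # 332-336: no valid translation, so the source fetch completes.
    source_address = itlb[linear_address]
    return "source", instruction_cache[source_address]


A = 0x401000  # hypothetical instruction with a native translation
B = 0x402000  # hypothetical instruction without one
translation_cache = {hash_index(A): [(disambiguation_tag(A), 0x9000)]}
itlb = {A: 0x1000, B: 0x2000}
instruction_cache = {0x1000: "source A", 0x2000: "source B", 0x9000: "native A"}

result_a = fetch(A, translation_cache, itlb, instruction_cache)
result_b = fetch(B, translation_cache, itlb, instruction_cache)
```

For `A` the source instruction is never read from the instruction cache, which is the saving the method aims for; for `B` the ordinary fetch path is followed.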
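Consequently, by determining the existence of alternate versions of the source material (in the examples described above, native translations that provide the same functionality as the source instructions) while fetching the source material, the methods described herein may offer enhanced processing relative to processing based on the source material alone. Further, by utilizing hardware structures to perform the concurrent determination, the methods described herein may be comparatively more efficient than software optimization-based schemes, particularly in branch processing scenarios.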
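This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples as understood by those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims.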
Claims (10)
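- A microprocessor comprising fetch logic operative to: fetch an instruction; hash an address for the instruction to determine whether an alternate version of the instruction which achieves the same functionality as the instruction exists; and, if the hashing results in a determination that such an alternate version exists, abort the fetching and retrieve and execute the alternate version.
- The microprocessor of claim 1, wherein the fetch logic is further operative to hash the address while the instruction is being fetched.
- The microprocessor of claim 2, wherein the fetch logic is further operative to generate a hash index via a hash of one or more portions of a linear address for the instruction, and to generate a disambiguation tag from other portions of the linear address for the instruction.
- The microprocessor of claim 1, wherein the fetch logic is further operative to: determine whether the alternate version exists by referencing a translation address index in a translation address cache of the microprocessor according to a hash index generated from the hash; and, if the alternate version exists, retrieve a physical address for the alternate version from the translation address cache.
- The microprocessor of claim 4, wherein the fetch logic is further operative to: obtain one or more translation address entries stored in the translation address cache according to the translation address index; compare a disambiguation tag generated from the hash with a disambiguation tag associated with each of the one or more translation address entries obtained; and, if the disambiguation tag generated from the hash agrees with a disambiguation tag obtained from the translation address cache, determine that the alternate version exists.
- The microprocessor of claim 4, wherein the fetch logic is further operative to: compare a current context for the microprocessor with an assumed context, the current context describing a current working state of the microprocessor, the assumed context describing a state of the microprocessor for which the alternate version is valid; and, if the current context agrees with the assumed context, determine that the alternate version exists, wherein the assumed context is included in one or more of the hash index, the disambiguation tag, or one or more translation address entries associated with the hash index and the disambiguation tag.
- The microprocessor of claim 4, further comprising fetch logic operative to: send the physical address for the alternate version to an instruction cache so that the alternate version may be obtained from the instruction cache; and send the alternate version obtained from the instruction cache to scheduling logic for scheduling the alternate version for execution.
- The microprocessor of claim 1, further comprising a translation address cache configured to store, for each alternate version stored within the translation address cache, a translation address entry comprising a physical address for the alternate version and an assumed context describing a state of the microprocessor for which the alternate version is valid.
- The microprocessor of claim 1, further comprising an instruction cache selected from the group consisting of a linear-indexed-physically-tagged instruction cache and a physically-indexed-physically-tagged instruction cache.
- The microprocessor of claim 1, wherein the fetch logic is further operative to retrieve a physical address for the instruction from an instruction translation lookaside buffer with reference to a linear address for the instruction.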
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/419,323 US10146545B2 (en) | 2012-03-13 | 2012-03-13 | Translation address cache for a microprocessor |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201407348A (zh) | 2014-02-16 |
TWI515567B (zh) | 2016-01-01 |
Family
ID=49044138
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW102108698A TWI515567B (zh) | Translation address cache for a microprocessor | 2012-03-13 | 2013-03-12 |
Country Status (4)
Country | Link |
---|---|
US (1) | US10146545B2 (zh) |
CN (1) | CN103309644B (zh) |
DE (1) | DE102013201767B4 (zh) |
TW (1) | TWI515567B (zh) |
Family Cites Families (187)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3815101A (en) | 1972-11-08 | 1974-06-04 | Sperry Rand Corp | Processor state and storage limits register auto-switch |
US3950729A (en) | 1973-08-31 | 1976-04-13 | Nasa | Shared memory for a fault-tolerant computer |
US4654790A (en) | 1983-11-28 | 1987-03-31 | Amdahl Corporation | Translation of virtual and real addresses to system addresses |
US4812981A (en) | 1985-10-24 | 1989-03-14 | Prime Computer, Inc. | Memory management system improving the efficiency of fork operations |
US4797814A (en) | 1986-05-01 | 1989-01-10 | International Business Machines Corporation | Variable address mode cache |
JP2589713B2 (ja) | 1987-11-20 | 1997-03-12 | 株式会社日立製作所 | データプロセッサ及びデータ処理システム |
US5179669A (en) | 1988-08-22 | 1993-01-12 | At&T Bell Laboratories | Multiprocessor interconnection and access arbitration arrangement |
JPH02288927A (ja) | 1989-01-18 | 1990-11-28 | Nec Corp | 共有メモリ管理方式 |
CA2011807C (en) | 1989-03-20 | 1999-02-23 | Katsumi Hayashi | Data base processing system using multiprocessor system |
JPH0354660A (ja) | 1989-07-21 | 1991-03-08 | Nec Corp | マルチプロセッサシステムにおける共有メモリ管理方式 |
US5123094A (en) | 1990-01-26 | 1992-06-16 | Apple Computer, Inc. | Interprocessor communications includes second CPU designating memory locations assigned to first CPU and writing their addresses into registers |
JPH04182858A (ja) | 1990-11-19 | 1992-06-30 | Mitsubishi Electric Corp | 共有メモリ管理方式 |
US5245702A (en) | 1991-07-05 | 1993-09-14 | Sun Microsystems, Inc. | Method and apparatus for providing shared off-screen memory |
US5696925A (en) | 1992-02-25 | 1997-12-09 | Hyundai Electronics Industries, Co., Ltd. | Memory management unit with address translation function |
US5414824A (en) | 1993-06-30 | 1995-05-09 | Intel Corporation | Apparatus and method for accessing a split line in a high speed cache |
US5446854A (en) | 1993-10-20 | 1995-08-29 | Sun Microsystems, Inc. | Virtual memory computer apparatus and address translation mechanism employing hashing scheme and page frame descriptor that support multiple page sizes |
US5649102A (en) | 1993-11-26 | 1997-07-15 | Hitachi, Ltd. | Distributed shared data management system for controlling structured shared data and for serializing access to shared data |
US5526504A (en) | 1993-12-15 | 1996-06-11 | Silicon Graphics, Inc. | Variable page size translation lookaside buffer |
US5956753A (en) | 1993-12-30 | 1999-09-21 | Intel Corporation | Method and apparatus for handling speculative memory access operations |
SG47981A1 (en) * | 1994-03-01 | 1998-04-17 | Intel Corp | Pipeline process of instructions in a computer system |
JPH0877347A (ja) | 1994-03-08 | 1996-03-22 | Texas Instr Inc <Ti> | 画像/グラフィックス処理用のデータ処理装置およびその操作方法 |
US5487146A (en) | 1994-03-08 | 1996-01-23 | Texas Instruments Incorporated | Plural memory access address generation employing guide table entries forming linked list |
US5963984A (en) | 1994-11-08 | 1999-10-05 | National Semiconductor Corporation | Address translation unit employing programmable page size |
US6813699B1 (en) | 1995-06-02 | 2004-11-02 | Transmeta Corporation | Speculative address translation for processor using segmentation and optional paging |
US5999189A (en) | 1995-08-04 | 1999-12-07 | Microsoft Corporation | Image compression to reduce pixel and texture memory requirements in a real-time image generator |
US5949785A (en) | 1995-11-01 | 1999-09-07 | Whittaker Corporation | Network access communications system and methodology |
US6298390B1 (en) | 1995-11-22 | 2001-10-02 | Sun Microsystems, Inc. | Method and apparatus for extending traditional operating systems file systems |
US6091897A (en) | 1996-01-29 | 2000-07-18 | Digital Equipment Corporation | Fast translation and execution of a computer program on a non-native architecture by use of background translator |
US6711667B1 (en) | 1996-06-28 | 2004-03-23 | Legerity, Inc. | Microprocessor configured to translate instructions from one instruction set to another, and to store the translated instructions |
US6031992A (en) | 1996-07-05 | 2000-02-29 | Transmeta Corporation | Combining hardware and software to provide an improved microprocessor |
US6012132A (en) | 1997-03-31 | 2000-01-04 | Intel Corporation | Method and apparatus for implementing a page table walker that uses a sliding field in the virtual addresses to identify entries in a page table |
US5870582A (en) | 1997-03-31 | 1999-02-09 | International Business Machines Corporation | Method and apparatus for completion of non-interruptible instructions before the instruction is dispatched |
AUPO647997A0 (en) | 1997-04-30 | 1997-05-22 | Canon Information Systems Research Australia Pty Ltd | Memory controller architecture |
GB9724031D0 (en) * | 1997-11-13 | 1998-01-14 | Advanced Telecommunications Mo | Cache memory operation |
US6091987A (en) | 1998-04-29 | 2000-07-18 | Medtronic, Inc. | Power consumption reduction in medical devices by employing different supply voltages |
US6591355B2 (en) | 1998-09-28 | 2003-07-08 | Technion Research And Development Foundation Ltd. | Distributed shared memory system with variable granularity |
US6862635B1 (en) | 1998-11-13 | 2005-03-01 | Cray Inc. | Synchronization techniques in a multithreaded environment |
US7007075B1 (en) | 1998-12-09 | 2006-02-28 | E-Lysium Transaction Systems Inc. | Flexible computer resource manager |
US6297832B1 (en) | 1999-01-04 | 2001-10-02 | Ati International Srl | Method and apparatus for memory access scheduling in a video graphics system |
US6362826B1 (en) | 1999-01-15 | 2002-03-26 | Intel Corporation | Method and apparatus for implementing dynamic display memory |
US6978462B1 (en) | 1999-01-28 | 2005-12-20 | Ati International Srl | Profiling execution of a sequence of events occuring during a profiled execution interval that matches time-independent selection criteria of events to be profiled |
US8065504B2 (en) | 1999-01-28 | 2011-11-22 | Ati International Srl | Using on-chip and off-chip look-up tables indexed by instruction address to control instruction execution in a processor |
US7941647B2 (en) | 1999-01-28 | 2011-05-10 | Ati Technologies Ulc | Computer for executing two instruction sets and adds a macroinstruction end marker for performing iterations after loop termination |
US7275246B1 (en) | 1999-01-28 | 2007-09-25 | Ati International Srl | Executing programs for a first computer architecture on a computer of a second architecture |
US6519694B2 (en) | 1999-02-04 | 2003-02-11 | Sun Microsystems, Inc. | System for handling load errors having symbolic entity generator to generate symbolic entity and ALU to propagate the symbolic entity |
US6535905B1 (en) | 1999-04-29 | 2003-03-18 | Intel Corporation | Method and apparatus for thread switching within a multithreaded processor |
US6714904B1 (en) | 1999-10-13 | 2004-03-30 | Transmeta Corporation | System for using rate of exception event generation during execution of translated instructions to control optimization of the translated instructions |
US6574749B1 (en) | 1999-10-29 | 2003-06-03 | Nortel Networks Limited | Reliable distributed shared memory |
US6751583B1 (en) | 1999-10-29 | 2004-06-15 | Vast Systems Technology Corporation | Hardware and software co-simulation including simulating a target processor using binary translation |
US6499090B1 (en) | 1999-12-28 | 2002-12-24 | Intel Corporation | Prioritized bus request scheduling mechanism for processing devices |
US6625715B1 (en) | 1999-12-30 | 2003-09-23 | Intel Corporation | System and method for translation buffer accommodating multiple page sizes |
US20010049818A1 (en) | 2000-02-09 | 2001-12-06 | Sanjeev Banerjia | Partitioned code cache organization to exploit program locallity |
US6457115B1 (en) | 2000-06-15 | 2002-09-24 | Advanced Micro Devices, Inc. | Apparatus and method for generating 64 bit addresses using a 32 bit adder |
ATE259081T1 (de) | 2000-07-06 | 2004-02-15 | Texas Instruments Inc | Mehrprozessorsystem prüfungsschaltung |
US6636223B1 (en) | 2000-08-02 | 2003-10-21 | Ati International. Srl | Graphics processing system with logic enhanced memory and method therefore |
US7162612B2 (en) | 2000-08-16 | 2007-01-09 | Ip-First, Llc | Mechanism in a microprocessor for executing native instructions directly from memory |
EP1213650A3 (en) | 2000-08-21 | 2006-08-30 | Texas Instruments France | Priority arbitration based on current task and MMU |
EP1182571B1 (en) | 2000-08-21 | 2011-01-26 | Texas Instruments Incorporated | TLB operations based on shared bit |
US6742104B2 (en) | 2000-08-21 | 2004-05-25 | Texas Instruments Incorporated | Master/slave processing system with shared translation lookaside buffer |
US6883079B1 (en) | 2000-09-01 | 2005-04-19 | Maxtor Corporation | Method and apparatus for using data compression as a means of increasing buffer bandwidth |
US6859208B1 (en) | 2000-09-29 | 2005-02-22 | Intel Corporation | Shared translation address caching |
US20020069402A1 (en) | 2000-10-05 | 2002-06-06 | Nevill Edward Colles | Scheduling control within a system having mixed hardware and software based instruction execution |
JP2002169696A (ja) | 2000-12-04 | 2002-06-14 | Mitsubishi Electric Corp | データ処理装置 |
US7356026B2 (en) | 2000-12-14 | 2008-04-08 | Silicon Graphics, Inc. | Node translation and protection in a clustered multiprocessor system |
US6925547B2 (en) | 2000-12-14 | 2005-08-02 | Silicon Graphics, Inc. | Remote address translation in a multiprocessor system |
US6560690B2 (en) * | 2000-12-29 | 2003-05-06 | Intel Corporation | System and method for employing a global bit for page sharing in a linear-addressed cache |
US6549997B2 (en) | 2001-03-16 | 2003-04-15 | Fujitsu Limited | Dynamic variable page size translation of addresses |
US7073044B2 (en) | 2001-03-30 | 2006-07-04 | Intel Corporation | Method and apparatus for sharing TLB entries |
US6658538B2 (en) | 2001-06-21 | 2003-12-02 | International Business Machines Corporation | Non-uniform memory access (NUMA) data processing system having a page table including node-specific data storage and coherency control |
US6523104B2 (en) | 2001-07-13 | 2003-02-18 | Mips Technologies, Inc. | Mechanism for programmable modification of memory mapping granularity |
US6901505B2 (en) | 2001-08-09 | 2005-05-31 | Advanced Micro Devices, Inc. | Instruction causing swap of base address from segment register with address from another register |
US6757784B2 (en) | 2001-09-28 | 2004-06-29 | Intel Corporation | Hiding refresh of memory and refresh-hidden memory |
US6823433B1 (en) | 2001-11-13 | 2004-11-23 | Advanced Micro Devices, Inc. | Memory management system and method for providing physical address based memory access security |
US6877077B2 (en) | 2001-12-07 | 2005-04-05 | Sun Microsystems, Inc. | Memory controller and method using read and write queues and an ordering queue for dispatching read and write memory requests out of order to reduce memory latency |
EP1331539B1 (en) | 2002-01-16 | 2016-09-28 | Texas Instruments France | Secure mode for processors supporting MMU and interrupts |
US6851008B2 (en) | 2002-03-06 | 2005-02-01 | Broadcom Corporation | Adaptive flow control method and apparatus |
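KR100921779B1 (ko) | 2002-04-18 | 2009-10-15 | Advanced Micro Devices, Inc. | Computer system including a central processing unit operable in a secure execution mode and a security services processor connected via a secure communication path |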
US8285743B2 (en) | 2002-06-24 | 2012-10-09 | International Business Machines Corporation | Scheduling viewing of web pages in a data processing system |
US7124327B2 (en) | 2002-06-29 | 2006-10-17 | Intel Corporation | Control over faults occurring during the operation of guest software in the virtual-machine architecture |
JP3982353B2 (ja) | 2002-07-12 | 2007-09-26 | NEC Corporation | Fault-tolerant computer apparatus, resynchronization method therefor, and resynchronization program |
EP1391820A3 (en) | 2002-07-31 | 2007-12-19 | Texas Instruments Incorporated | Concurrent task execution in a multi-processor, single operating system environment |
US6950925B1 (en) | 2002-08-28 | 2005-09-27 | Advanced Micro Devices, Inc. | Scheduler for use in a microprocessor that supports data-speculative execution |
GB2392998B (en) | 2002-09-16 | 2005-07-27 | Advanced Risc Mach Ltd | Handling interrupts during multiple access program instructions |
GB2393274B (en) | 2002-09-20 | 2006-03-15 | Advanced Risc Mach Ltd | Data processing system having an external instruction set and an internal instruction set |
US7398525B2 (en) | 2002-10-21 | 2008-07-08 | International Business Machines Corporation | Resource scheduling in workflow management systems |
US6981083B2 (en) | 2002-12-05 | 2005-12-27 | International Business Machines Corporation | Processor virtualization mechanism via an enhanced restoration of hard architected states |
US20040122800A1 (en) | 2002-12-23 | 2004-06-24 | Nair Sreekumar R. | Method and apparatus for hardware assisted control redirection of original computer code to transformed code |
US7191349B2 (en) | 2002-12-26 | 2007-03-13 | Intel Corporation | Mechanism for processor power state aware distribution of lowest priority interrupt |
US7203932B1 (en) | 2002-12-30 | 2007-04-10 | Transmeta Corporation | Method and system for using idiom recognition during a software translation process |
US20040128448A1 (en) | 2002-12-31 | 2004-07-01 | Intel Corporation | Apparatus for memory communication during runahead execution |
US7139876B2 (en) | 2003-01-16 | 2006-11-21 | Ip-First, Llc | Microprocessor and apparatus for performing fast speculative pop operation from a stack memory cache |
US7168077B2 (en) | 2003-01-31 | 2007-01-23 | Handysoft Corporation | System and method of executing and controlling workflow processes |
EP1447742A1 (en) | 2003-02-11 | 2004-08-18 | STMicroelectronics S.r.l. | Method and apparatus for translating instructions of an ARM-type processor into instructions for a LX-type processor |
US6965983B2 (en) | 2003-02-16 | 2005-11-15 | Faraday Technology Corp. | Simultaneously setting prefetch address and fetch address pipelined stages upon branch |
US6963963B2 (en) | 2003-03-25 | 2005-11-08 | Freescale Semiconductor, Inc. | Multiprocessor system having a shared main memory accessible by all processor units |
EP1611498B1 (en) | 2003-03-27 | 2010-03-10 | Nxp B.V. | Branch based activity monitoring |
US7003647B2 (en) | 2003-04-24 | 2006-02-21 | International Business Machines Corporation | Method, apparatus and computer program product for dynamically minimizing translation lookaside buffer entries across contiguous memory |
US7107441B2 (en) | 2003-05-21 | 2006-09-12 | Intel Corporation | Pre-boot interpreted namespace parsing for flexible heterogeneous configuration and code consolidation |
US7082508B2 (en) | 2003-06-24 | 2006-07-25 | Intel Corporation | Dynamic TLB locking based on page usage metric |
US7124255B2 (en) | 2003-06-30 | 2006-10-17 | Microsoft Corporation | Message based inter-process for high volume data |
GB0316532D0 (en) | 2003-07-15 | 2003-08-20 | Transitive Ltd | Method and apparatus for partitioning code in program code conversion |
US7225299B1 (en) | 2003-07-16 | 2007-05-29 | Transmeta Corporation | Supporting speculative modification in a data cache |
US7062631B1 (en) | 2003-07-17 | 2006-06-13 | Transmeta Corporation | Method and system for enforcing consistent per-physical page cacheability attributes |
US7418585B2 (en) | 2003-08-28 | 2008-08-26 | Mips Technologies, Inc. | Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts |
US20050050013A1 (en) | 2003-08-28 | 2005-03-03 | Sharp Laboratories Of America, Inc. | System and method for policy-driven device queries |
US7010648B2 (en) | 2003-09-08 | 2006-03-07 | Sun Microsystems, Inc. | Method and apparatus for avoiding cache pollution due to speculative memory load operations in a microprocessor |
US7921300B2 (en) | 2003-10-10 | 2011-04-05 | Via Technologies, Inc. | Apparatus and method for secure hash algorithm |
US7321958B2 (en) | 2003-10-30 | 2008-01-22 | International Business Machines Corporation | System and method for sharing memory by heterogeneous processors |
US7159095B2 (en) | 2003-12-09 | 2007-01-02 | International Business Machines Corporation | Method of efficiently handling multiple page sizes in an effective to real address translation (ERAT) table |
US7730489B1 (en) | 2003-12-10 | 2010-06-01 | Oracle America, Inc. | Horizontally scalable and reliable distributed transaction management in a clustered application server environment |
US7107411B2 (en) | 2003-12-16 | 2006-09-12 | International Business Machines Corporation | Apparatus method and system for fault tolerant virtual memory management |
US7496732B2 (en) | 2003-12-17 | 2009-02-24 | Intel Corporation | Method and apparatus for results speculation under run-ahead execution |
US7310722B2 (en) | 2003-12-18 | 2007-12-18 | Nvidia Corporation | Across-thread out of order instruction dispatch in a multithreaded graphics processor |
US7340565B2 (en) | 2004-01-13 | 2008-03-04 | Hewlett-Packard Development Company, L.P. | Source request arbitration |
US7293164B2 (en) | 2004-01-14 | 2007-11-06 | International Business Machines Corporation | Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions |
US7082075B2 (en) | 2004-03-18 | 2006-07-25 | Micron Technology, Inc. | Memory device and method having banks of different sizes |
US7383414B2 (en) | 2004-05-28 | 2008-06-03 | Oracle International Corporation | Method and apparatus for memory-mapped input/output |
US7234038B1 (en) | 2004-05-28 | 2007-06-19 | Sun Microsystems, Inc. | Page mapping cookies |
US20060004984A1 (en) | 2004-06-30 | 2006-01-05 | Morris Tonia G | Virtual memory management system |
US8190863B2 (en) | 2004-07-02 | 2012-05-29 | Intel Corporation | Apparatus and method for heterogeneous chip multiprocessors via resource allocation and restriction |
US7257699B2 (en) | 2004-07-08 | 2007-08-14 | Sun Microsystems, Inc. | Selective execution of deferred instructions in a processor that supports speculative execution |
US7194604B2 (en) | 2004-08-26 | 2007-03-20 | International Business Machines Corporation | Address generation interlock resolution under runahead execution |
US7890735B2 (en) | 2004-08-30 | 2011-02-15 | Texas Instruments Incorporated | Multi-threading processors, integrated circuit devices, systems, and processes of operation and manufacture |
US8001294B2 (en) | 2004-09-28 | 2011-08-16 | Sony Computer Entertainment Inc. | Methods and apparatus for providing a compressed network in a multi-processing system |
US7340582B2 (en) | 2004-09-30 | 2008-03-04 | Intel Corporation | Fault processing for direct memory access address translation |
US8843727B2 (en) | 2004-09-30 | 2014-09-23 | Intel Corporation | Performance enhancement of address translation using translation tables covering large address spaces |
US20060149931A1 (en) | 2004-12-28 | 2006-07-06 | Akkary Haitham | Runahead execution in a central processing unit |
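CN100573443C (zh) | 2004-12-30 | 2009-12-23 | Intel Corporation | Format selection for multi-format instructions in binary code translation from a mixed-source instruction set architecture to a single-target instruction set architecture |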
US7437517B2 (en) | 2005-01-11 | 2008-10-14 | International Business Machines Corporation | Methods and arrangements to manage on-chip memory to reduce memory latency |
US20060174228A1 (en) | 2005-01-28 | 2006-08-03 | Dell Products L.P. | Adaptive pre-fetch policy |
US7752627B2 (en) | 2005-02-04 | 2010-07-06 | Mips Technologies, Inc. | Leaky-bucket thread scheduler in a multithreading microprocessor |
US7948896B2 (en) | 2005-02-18 | 2011-05-24 | Broadcom Corporation | Weighted-fair-queuing relative bandwidth sharing |
US7209405B2 (en) | 2005-02-23 | 2007-04-24 | Micron Technology, Inc. | Memory device and method having multiple internal data buses and memory bank interleaving |
TWI309378B (en) | 2005-02-23 | 2009-05-01 | Altek Corp | Central processing unit having a micro-code engine |
US7447869B2 (en) | 2005-04-07 | 2008-11-04 | Ati Technologies, Inc. | Method and apparatus for fragment processing in a virtual memory system |
US20100161901A9 (en) * | 2005-04-14 | 2010-06-24 | Arm Limited | Correction of incorrect cache accesses |
US20060236074A1 (en) * | 2005-04-14 | 2006-10-19 | Arm Limited | Indicating storage locations within caches |
DE102005021749A1 (de) | 2005-05-11 | 2006-11-16 | Fachhochschule Dortmund | Method and device for program-controlled information processing |
US7299337B2 (en) | 2005-05-12 | 2007-11-20 | Traut Eric P | Enhanced shadow page table algorithms |
US7739668B2 (en) | 2005-05-16 | 2010-06-15 | Texas Instruments Incorporated | Method and system of profiling applications that use virtual memory |
US20060277398A1 (en) | 2005-06-03 | 2006-12-07 | Intel Corporation | Method and apparatus for instruction latency tolerant execution in an out-of-order pipeline |
US7814292B2 (en) | 2005-06-14 | 2010-10-12 | Intel Corporation | Memory attribute speculation |
US20070067505A1 (en) | 2005-09-22 | 2007-03-22 | Kaniyur Narayanan G | Method and an apparatus to prevent over subscription and thrashing of translation lookaside buffer (TLB) entries in I/O virtualization hardware |
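JP2007109116A (ja) | 2005-10-17 | 2007-04-26 | Fukuoka Pref Gov Sangyo Kagaku Gijutsu Shinko Zaidan | Estimation device, table management device, selection device, table management method, program for causing a computer to implement the table management method, and storage medium recording the program |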
US7739476B2 (en) | 2005-11-04 | 2010-06-15 | Apple Inc. | R and C bit update handling |
US7616218B1 (en) | 2005-12-05 | 2009-11-10 | Nvidia Corporation | Apparatus, system, and method for clipping graphics primitives |
US7519781B1 (en) | 2005-12-19 | 2009-04-14 | Nvidia Corporation | Physically-based page characterization data |
US7512767B2 (en) | 2006-01-04 | 2009-03-31 | Sony Ericsson Mobile Communications Ab | Data compression method for supporting virtual memory management in a demand paging system |
US7653803B2 (en) | 2006-01-17 | 2010-01-26 | Globalfoundries Inc. | Address translation for input/output (I/O) devices and interrupt remapping for I/O devices in an I/O memory management unit (IOMMU) |
JP4890033B2 (ja) | 2006-01-19 | 2012-03-07 | Hitachi, Ltd. | Storage device system and storage control method |
US7545382B1 (en) | 2006-03-29 | 2009-06-09 | Nvidia Corporation | Apparatus, system, and method for using page table entries in a graphics system to provide storage format information for address translation |
US20070240141A1 (en) | 2006-03-30 | 2007-10-11 | Feng Qin | Performing dynamic information flow tracking |
JP5010164B2 (ja) | 2006-03-31 | 2012-08-29 | Hitachi, Ltd. | Server apparatus and virtual machine control program |
US8621120B2 (en) | 2006-04-17 | 2013-12-31 | International Business Machines Corporation | Stalling of DMA operations in order to do memory migration using a migration in progress bit in the translation control entry mechanism |
US7702843B1 (en) | 2006-04-27 | 2010-04-20 | Vmware, Inc. | Determining memory conditions in a virtual machine |
US8035648B1 (en) | 2006-05-19 | 2011-10-11 | Nvidia Corporation | Runahead execution for graphics processing units |
US8707011B1 (en) | 2006-10-24 | 2014-04-22 | Nvidia Corporation | Memory access techniques utilizing a set-associative translation lookaside buffer |
US8706975B1 (en) | 2006-11-01 | 2014-04-22 | Nvidia Corporation | Memory access management block bind system and method |
CN100485689C (zh) | 2007-01-30 | 2009-05-06 | Inspur Communication Information System Co., Ltd. | Accelerated data query method based on file system cache |
WO2008097710A2 (en) | 2007-02-02 | 2008-08-14 | Tarari, Inc. | Systems and methods for processing access control lists (acls) in network switches using regular expression matching logic |
CN101042670A (zh) | 2007-04-24 | 2007-09-26 | Shanghai Hualong Information Technology Development Center | Instruction exception handling method |
US7895421B2 (en) | 2007-07-12 | 2011-02-22 | Globalfoundries Inc. | Mechanism for using performance counters to identify reasons and delay times for instructions that are stalled during retirement |
US7712092B2 (en) | 2007-10-01 | 2010-05-04 | The Board Of Trustees Of The Leland Stanford Junior University | Binary translation using peephole translation rules |
US7925923B1 (en) | 2008-01-31 | 2011-04-12 | Hewlett-Packard Development Company, L.P. | Migrating a virtual machine in response to failure of an instruction to execute |
US20090327661A1 (en) | 2008-06-30 | 2009-12-31 | Zeev Sperber | Mechanisms to handle free physical register identifiers for smt out-of-order processors |
US8145890B2 (en) | 2009-02-12 | 2012-03-27 | Via Technologies, Inc. | Pipelined microprocessor with fast conditional branch instructions based on static microcode-implemented instruction state |
US8533437B2 (en) | 2009-06-01 | 2013-09-10 | Via Technologies, Inc. | Guaranteed prefetch instruction |
US8364902B2 (en) | 2009-08-07 | 2013-01-29 | Via Technologies, Inc. | Microprocessor with repeat prefetch indirect instruction |
US20110078425A1 (en) * | 2009-09-25 | 2011-03-31 | Shah Manish K | Branch prediction mechanism for predicting indirect branch targets |
US8775153B2 (en) | 2009-12-23 | 2014-07-08 | Intel Corporation | Transitioning from source instruction set architecture (ISA) code to translated code in a partial emulation environment |
TWI506434B (zh) | 2010-03-29 | 2015-11-01 | Via Tech Inc | Prefetch unit, data prefetch method, computer program product, and microprocessor |
US8479176B2 (en) | 2010-06-14 | 2013-07-02 | Intel Corporation | Register mapping techniques for efficient dynamic binary translation |
US8719625B2 (en) | 2010-07-22 | 2014-05-06 | International Business Machines Corporation | Method, apparatus and computer program for processing invalid data |
US8549504B2 (en) | 2010-09-25 | 2013-10-01 | Intel Corporation | Apparatus, method, and system for providing a decision mechanism for conditional commits in an atomic region |
US8627044B2 (en) | 2010-10-06 | 2014-01-07 | Oracle International Corporation | Issuing instructions with unresolved data dependencies |
KR101612594B1 (ko) | 2011-01-27 | 2016-04-14 | Soft Machines, Inc. | Guest instruction to native instruction range based mapping using a conversion look aside buffer of a processor |
US20140019723A1 (en) | 2011-12-28 | 2014-01-16 | Koichi Yamada | Binary translation in asymmetric multiprocessor system |
US8898642B2 (en) | 2012-02-16 | 2014-11-25 | Unisys Corporation | Profiling and sequencing operators executable in an emulated computing system |
US9880846B2 (en) | 2012-04-11 | 2018-01-30 | Nvidia Corporation | Improving hit rate of code translation redirection table with replacement strategy based on usage history table of evicted entries |
US10241810B2 (en) | 2012-05-18 | 2019-03-26 | Nvidia Corporation | Instruction-optimizing processor with branch-count table in hardware |
US9384001B2 (en) | 2012-08-15 | 2016-07-05 | Nvidia Corporation | Custom chaining stubs for instruction code translation |
US9645929B2 (en) | 2012-09-14 | 2017-05-09 | Nvidia Corporation | Speculative permission acquisition for shared memory |
US9740553B2 (en) | 2012-11-14 | 2017-08-22 | Nvidia Corporation | Managing potentially invalid results during runahead |
US20140189310A1 (en) | 2012-12-27 | 2014-07-03 | Nvidia Corporation | Fault detection in instruction translations |
US10108424B2 (en) | 2013-03-14 | 2018-10-23 | Nvidia Corporation | Profiling code portions to generate translations |
US9547602B2 (en) | 2013-03-14 | 2017-01-17 | Nvidia Corporation | Translation lookaside buffer entry systems and methods |
US9582280B2 (en) | 2013-07-18 | 2017-02-28 | Nvidia Corporation | Branching to alternate code based on runahead determination |
- 2012
  - 2012-03-13: US application US13/419,323, patent US10146545B2 (Active)
- 2013
  - 2013-02-04: DE application DE102013201767.7A, patent DE102013201767B4 (Active)
  - 2013-03-12: TW application TW102108698A, patent TWI515567B (active)
  - 2013-03-13: CN application CN201310079112.6A, patent CN103309644B (Active)
Also Published As
Publication number | Publication date |
---|---|
CN103309644A (zh) | 2013-09-18 |
US20130246709A1 (en) | 2013-09-19 |
DE102013201767A1 (de) | 2013-09-19 |
DE102013201767B4 (de) | 2021-12-02 |
US10146545B2 (en) | 2018-12-04 |
CN103309644B (zh) | 2016-08-03 |
TWI515567B (zh) | 2016-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI515567B (zh) | | Translation address cache for a microprocessor |
US8533438B2 (en) | | Store-to-load forwarding based on load/store address computation source information comparisons |
TWI552069B (zh) | | Load-store dependency predictor, and processor and method for processing operations in a load-store dependency predictor |
US6581151B2 (en) | | Apparatus and method for speculatively forwarding storehit data based on physical page index compare |
JP2618175B2 (ja) | | History table of virtual address translation prediction for cache access |
US7600097B1 (en) | | Detecting raw hazards in an object-addressed memory hierarchy by comparing an object identifier and offset for a load instruction to object identifiers and offsets in a store queue |
US9131899B2 (en) | | Efficient handling of misaligned loads and stores |
US8190652B2 (en) | | Achieving coherence between dynamically optimized code and original code |
JP5608594B2 (ja) | | Preload instruction control |
US20090006803A1 (en) | | L2 Cache/Nest Address Translation |
JP5059749B2 (ja) | | Handling cache misses for instructions that cross a cache line boundary |
CN105446900A (zh) | | Processor and method for distinguishing system management mode entries |
KR102268601B1 (ko) | | Processor for data forwarding, operation method thereof, and system including the same |
KR20130140582A (ko) | | Zero cycle load |
CN107818053B (zh) | | Method and apparatus for accessing a cache |
TW201423584A (zh) | | Fetch width predictor |
JP2009217827A (ja) | | Cache accessing using a micro tag |
JP2011129103A (ja) | | Method and apparatus for performing load operations with high efficiency using a buffer |
KR101787851B1 (ko) | | Apparatus and method for a multiple page size translation lookaside buffer (TLB) |
US11989285B2 (en) | | Thwarting store-to-load forwarding side channel attacks by pre-forwarding matching of physical address proxies and/or permission checking |
CN103365627A (zh) | | System and method for data forwarding within an execution unit |
US20150339233A1 (en) | | Facilitating efficient prefetching for scatter/gather operations |
AU2016265131A1 (en) | | Method and apparatus for cache tag compression |
WO2014100653A1 (en) | | Speculative addressing using a virtual address-to-physical address page crossing buffer |
US20140095838A1 (en) | | Physical Reference List for Tracking Physical Register Sharing |