TW201407348A - Translation address cache for a microprocessor - Google Patents

Translation address cache for a microprocessor

Info

Publication number
TW201407348A
Authority
TW
Taiwan
Prior art keywords
instruction
address
translation
microprocessor
alternate version
Prior art date
Application number
TW102108698A
Other languages
English (en)
Other versions
TWI515567B (zh)
Inventor
Ross Segelken
Alexander Klaiber
Nathan Tuck
David Dunn
Original Assignee
Nvidia Corp
Priority date
Filing date
Publication date
Application filed by Nvidia Corp
Publication of TW201407348A
Application granted
Publication of TWI515567B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017Runtime instruction translation, e.g. macros
    • G06F9/30174Runtime instruction translation, e.g. macros for non-native instruction set, e.g. Javabyte, legacy code
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1009Address translation using page tables, e.g. page table structures
    • G06F12/1018Address translation using page tables, e.g. page table structures involving hashing techniques, e.g. inverted page tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181Instruction operation extension or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3808Instruction prefetching for instruction reuse, e.g. trace cache, branch target cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017Runtime instruction translation, e.g. macros
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Advance Control (AREA)

Abstract

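Embodiments are provided that relate to fetching, from an instruction cache included in a microprocessor, instructions and alternate versions that achieve the same functionality as those instructions. In one example, a method is provided that comprises, at an example microprocessor, fetching an instruction from an instruction cache. The example method also includes hashing an address for the instruction to determine whether an alternate version of the instruction that achieves the same functionality as the instruction exists. The example method further includes, if the hashing results in a determination that such an alternate version exists, aborting the fetch of the instruction and retrieving and executing the alternate version.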

Description

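Translation address cache for a microprocessor
The present invention relates to cache memory of a microprocessor, and more particularly to fetching instructions from an instruction cache included in a microprocessor.
Architectural-level instructions for microprocessors may be translated between an instruction set architecture (ISA) and a native architecture. In some microprocessors, software optimizations of ISA instructions may execute more efficiently than the ISA instructions on which those optimizations are based. Some past approaches chained software optimizations so that control passed from one software optimization to another. However, such approaches may be challenged by indirect branches, because it may be difficult to determine the target of an indirect branch.
One embodiment of the present invention provides a microprocessor comprising fetch logic operative to: fetch an instruction; hash an address for the instruction to determine whether an alternate version of the instruction that achieves the same functionality as the instruction exists; and, if the hashing results in a determination that such an alternate version exists, abort the fetch and retrieve and execute the alternate version.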
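100‧‧‧Microprocessor
102‧‧‧Pipeline
109‧‧‧Registers
110‧‧‧Memory hierarchy
110A‧‧‧L1 processor cache
110B‧‧‧L2 processor cache
110C‧‧‧L3 processor cache
110D‧‧‧Main memory
110E‧‧‧Secondary storage
110F‧‧‧Tertiary storage
110H‧‧‧Memory controller
120‧‧‧Fetch logic
122‧‧‧Instruction translation lookaside buffer
124‧‧‧Translation address cache
126‧‧‧Physical address multiplexer
128‧‧‧Instruction cache
130‧‧‧Native translation buffer
132‧‧‧Decode logic
134‧‧‧Scheduling logic
136‧‧‧Execution logic
138‧‧‧Memory logic
140‧‧‧Writeback logic
200‧‧‧4-way associative cache
300‧‧‧Method
400‧‧‧Method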
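FIG. 1 schematically shows a microprocessor according to an embodiment of the present invention;
FIG. 2 schematically shows a translation address cache according to an embodiment of the present invention;
FIG. 3A shows a portion of a flowchart for a method, according to an embodiment of the present invention, of fetching an instruction from an instruction cache and determining whether an alternate version of the instruction is stored in the instruction cache;
FIG. 3B shows another portion of the flowchart shown in FIG. 3A;
FIG. 3C shows another portion of the flowchart shown in FIGS. 3A and 3B;
FIG. 4 schematically shows a method, according to an embodiment of the present invention, of hashing a linear address for an instruction to generate a hash index and a disambiguation tag for the linear address; and
FIG. 5 schematically shows a translation address cache entry according to an embodiment of the present invention.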
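In modern microprocessors, architectural-level instructions may be translated between a source instruction set architecture (ISA), such as an Advanced RISC Machine (ARM) architecture or an x86 architecture, and an alternate ISA that achieves the same observable functionality as the source. For example, a set of one or more instructions of a source ISA may be translated into one or more micro-operations of a native architecture that perform the same function as the source ISA instructions. In some settings, the native micro-operations may provide enhanced or optimized performance relative to the source ISA instructions.
Some past approaches attempted to chain software optimizations of source instructions so that control passed from one software optimization to another via direct native branches. However, such approaches may be challenged by branched processes. Because the branch source may be dynamic during program execution, chain-wise handoff between software optimizations may not be feasible. For example, should an indirect branch occur, the indeterminate target of the branch may make it difficult to ascertain, at the time an optimization is created, which software optimization should be retrieved. Consequently, the microprocessor may stall while the branch and the software optimization for that branch are determined from among potentially thousands of candidate optimizations.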
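Accordingly, the embodiments disclosed herein relate to fetching source information and alternate versions of the source information that achieve the same observable functionality as the source information (referred to herein as the same functionality) within an acceptable tolerance (for example, within an acceptable tolerance of architecturally observable effect). It will be appreciated that virtually any suitable source information and any alternate version thereof may be employed without departing from the scope of the present invention. In some embodiments, a source may include an instruction, such as an instruction for an ISA. In addition to or instead of instructions, the source information may include source data, and the alternate version may include an alternate form or version of the source data. Likewise, it will be appreciated that any suitable manner of transforming a source into an alternate version thereof (for example, a software approach and/or a hardware approach) may be considered within the scope of the present invention. For illustrative purposes, the descriptions and figures presented herein refer to source instructions and translations of the source instructions as the source information and the alternate versions of the source information, respectively, though such embodiments are not limiting.
One example method includes, upon being directed to retrieve an instruction, hashing an address for that instruction so that it may be determined whether an alternate version of the instruction exists. The hashing is performed to determine whether there exists an alternate version of the instruction that achieves the same functionality, such as a native translation (for example, a translation between a source instruction set architecture and a native micro-operation set architecture for the various instructions that may be fetched for execution by the microprocessor). The example method further includes, if the hashing results in a determination that such an alternate version exists, aborting the fetch of the instruction and retrieving and executing the alternate version.
The discussion herein frequently refers to "retrieving" an instruction and then aborting that retrieval if certain conditions exist. In some embodiments, "retrieving" an instruction may include fetching an instruction. When such an abort occurs, the retrieval process is terminated, typically before the retrieval process completes. For example, in one scenario, the abort may occur while the physical address for an instruction is being retrieved. In another scenario, the abort may occur after the physical address for an instruction has been retrieved but before the instruction is retrieved from memory. Aborting retrieval before the retrieval process completes may save the time otherwise spent accessing and retrieving the source from memory. It will be appreciated that, as used herein, retrieval is not limited to fetch scenarios, in which the fetch typically completes before decode. For example, an instruction may be retrieved but aborted during decode, before decode, or at any other suitable point.
A wide range of possibilities exists for mapping and translating between source information and translated versions of that information. By determining whether an alternate version exists and aborting retrieval of the instruction (for example, an ISA instruction) if it does, the microprocessor may offer enhanced performance relative to microprocessors that decode source ISA instructions, because decode operations are avoided. Additional performance gains may be realized in settings where the alternate version is optimized through changes to its operations that allow it to execute more quickly than the source ISA instruction.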
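FIG. 1 schematically depicts an embodiment of a microprocessor 100 that may be employed in connection with the systems and methods described herein. Microprocessor 100 may include processor registers 109. Further, microprocessor 100 may include and/or may communicate with a memory hierarchy 110, which may include an L1 processor cache 110A, an L2 processor cache 110B, an L3 processor cache 110C, main memory 110D (for example, one or more DRAM chips), secondary storage 110E (for example, magnetic and/or optical storage units), and/or tertiary storage 110F (for example, magnetic tape). It will be understood that the example memory/storage components are listed in increasing order of access time and capacity, though there may be exceptions.
A memory controller 110H may be used to handle the protocol and provide the signal interface required by main memory 110D, and to schedule memory accesses. Memory controller 110H may be implemented on the processor die or on a separate die. It is to be understood that the memory hierarchy described above is not limiting, and other memory hierarchies may be used without departing from the scope of the present invention.
Microprocessor 100 also includes a pipeline, illustrated in simplified form in FIG. 1 as pipeline 102. Pipelining may allow more than one instruction to be in different stages of retrieval and execution concurrently. Put another way, a set of instructions may be passed through the various stages included in pipeline 102 (including fetch, decode, execute, and writeback stages, among others) while another instruction and/or data is retrieved from memory and acted upon by pipeline 102. Thus, downstream stages in pipeline 102 may be utilized while upstream stages wait for memory to return instructions and/or data. This approach may accelerate instruction and data processing by the microprocessor relative to approaches that retrieve and execute instructions and/or data in an individual, serial manner.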
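As shown in FIG. 1, example pipeline 102 includes fetch logic 120, a native translation buffer 130, decode logic 132, scheduling logic 134, execution logic 136, memory logic 138, and writeback logic 140. Fetch logic 120 fetches a selected instruction from an instruction cache for execution. In the example shown in FIG. 1, fetch logic 120 includes an instruction translation lookaside buffer 122 for translating a linear address of the selected instruction into a physical address of the instruction to be fetched for execution. As used herein, a linear address for an instruction refers to an address that is translated/remapped by a page table into a physical address associated with the location where the instruction is stored in memory. In some embodiments, the linear address may include directory, table, and/or offset entries that identify the page directory, the page table, and/or the page frame location within a page table where the physical address for the instruction may be found.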
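The directory, table, and offset fields of a linear address can be illustrated with a short sketch. The 10/10/12-bit split below is an assumption borrowed from classic two-level 32-bit x86 paging; the description does not fix any field widths, and the function name is hypothetical.

```python
def split_linear_address(linear: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into (directory, table, offset).

    Assumed layout (not taken from the patent): 10-bit directory index,
    10-bit table index, 12-bit offset within a 4 KiB page frame.
    """
    directory = (linear >> 22) & 0x3FF  # selects a page-directory entry
    table = (linear >> 12) & 0x3FF      # selects a page-table entry
    offset = linear & 0xFFF             # byte offset inside the page frame
    return directory, table, offset


print(split_linear_address(0x00403004))
```

Under this assumed layout, the directory and table fields locate the page frame, and the offset is carried over unchanged into the physical address.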
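Instruction translation lookaside buffer 122 may employ virtually any suitable manner of translating linear addresses into physical addresses for those instructions. For example, in some embodiments, instruction translation lookaside buffer 122 may include a content-addressable memory that stores a portion of a page table mapping linear addresses for instructions to physical addresses for those instructions.
Fetch logic 120 also determines whether a native translation for the selected instruction exists. If such a native translation exists, the system aborts the instruction fetch and sends the native translation for execution instead. In the embodiment depicted in FIG. 1, fetch logic 120 includes a translation address cache 124 for storing the addresses of native translations.
Almost any suitable data storage architecture and logic may be used for translation address cache 124. For example, FIG. 2 schematically shows an embodiment of a 4-way associative cache 200 employed as a translation address cache. In the embodiment shown in FIG. 2, 1024 translation address entries may be stored in any of four ways, depending on the address scheme selected, with each way including 256 data locations. However, it will be appreciated that some embodiments may have fewer data ways and/or data locations, while other embodiments may include more data ways and/or data locations, without departing from the scope of the present invention.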
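The geometry of the FIG. 2 example can be checked with a few lines of arithmetic. This is only a sketch of the stated numbers (four ways, 256 locations per way); the variable names are illustrative.

```python
WAYS = 4                 # ways in the FIG. 2 example
LOCATIONS_PER_WAY = 256  # data locations in each way

# Total capacity of the cache in translation address entries.
total_entries = WAYS * LOCATIONS_PER_WAY

# Number of index bits needed to select one of the 256 locations.
index_bits = (LOCATIONS_PER_WAY - 1).bit_length()

print(total_entries, index_bits)
```

An index of this width matches the 8-bit hash index that the description later derives from the linear address.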
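Continuing with FIG. 1, fetch logic 120 includes a physical address multiplexer 126 that multiplexes the physical addresses received from instruction translation lookaside buffer 122 and translation address cache 124 and distributes them to instruction cache 128. In turn, instruction cache 128 retrieves the instructions and native translations stored for execution by microprocessor 100 with reference to the physical addresses for those instructions and native translations. If fetch logic 120 determines that a native translation exists for a selected instruction, the native translation is retrieved from instruction cache 128 and may be forwarded to an optional native translation buffer 130 in preparation for eventual distribution to scheduling logic 134. Alternatively, if fetch logic 120 determines that no native translation exists for the selected instruction, the selected instruction is retrieved from instruction cache 128 and forwarded to decode logic 132. Decode logic 132 decodes the selected instruction, for example by parsing opcodes, operands, and addressing modes, and generates a decoded set of one or more native instructions or micro-operations in preparation for eventual distribution to scheduling logic 134. Scheduling logic 134 schedules the native translations and the decoded instructions for execution by execution logic 136.
The embodiment depicted in FIG. 1 illustrates instruction cache 128 as including a physically indexed, physically tagged (PIPT) instruction cache, so that the address of a native translation may be retrieved from translation address cache 124 concurrently with retrieval of the source address from instruction translation lookaside buffer 122. However, it will be understood that any suitable instruction cache 128 may be employed according to embodiments of the present invention. For example, in some embodiments, instruction cache 128 may include a linearly indexed, physically tagged (LIPT) instruction cache. In some embodiments, the fetch logic may concurrently retrieve an address for a source from the instruction translation lookaside buffer, an address for a native translation from the translation address cache, and the source from the LIPT instruction cache. If a native translation is available, the instruction may be discarded and the native translation may be retrieved from the LIPT cache, based on the address for the native translation, for execution. If no native translation is available, the instruction may be decoded and then executed.
Pipeline 102 may also include memory logic 138 for performing load and/or store operations and writeback logic 140 for writing the results of operations to an appropriate location, such as registers 109. Upon writeback, the microprocessor enters a state modified by the instruction, so that the results of the operations leading to the committed state may not be undone.
It should be understood that the stages shown above in pipeline 102 are illustrative of a typical RISC implementation and are not meant to be limiting. For example, in some embodiments, VLIW techniques may be implemented upstream of certain pipeline stages. In some other embodiments, the scheduling logic may be included in the fetch logic and/or the decode logic of the microprocessor. More generally, a microprocessor may include fetch, decode, and execution logic, with memory and writeback functionality carried out by the execution logic. The present invention is equally applicable to these and other microprocessor implementations.
In the described examples, instructions may be fetched and executed one at a time or more than one at a time, possibly requiring multiple clock cycles. During this time, significant parts of the data path may be unused. In addition to or instead of single-instruction fetching, prefetch methods may be used to improve performance and avoid latency bottlenecks associated with read and store operations (that is, the reading of instructions and the loading of such instructions into processor registers and/or execution queues). Accordingly, it will be appreciated that virtually any suitable manner of fetching, scheduling, and dispatching instructions may be used without departing from the scope of the present invention.
FIGS. 3A-3C schematically show an embodiment of a method 300 for fetching a selected instruction from an instruction cache and determining whether a native translation for the selected instruction is stored in the instruction cache. While method 300 is described with respect to determining whether a native translation is available for an instruction, it will be understood that this scenario merely illustrates fetching an instruction and determining whether an alternate version achieving the same functionality as the instruction exists, and that method 300 is not limited to the examples or settings described below. Thus, the processes described in method 300 are arranged and described for illustrative purposes and are not intended to be limiting. In some embodiments, the methods described herein may include additional or alternative processes, while in some embodiments they may include processes that may be reordered or omitted, without departing from the scope of the present invention. Further, it will be appreciated that the methods described herein may be performed using any suitable hardware, including the hardware described herein.
Turning to FIG. 3A, method 300 includes, at 302, being directed to fetch a selected instruction from an instruction cache. In some embodiments, the fetch process may be directed to retrieve an instruction with reference to a linear address for the selected instruction. For example, a selected instruction may be fetched from the instruction cache in response to a branch to a target instruction pointer, such as a branch originating from a branch predictor or from a branch validation point in the microprocessor pipeline. It will be appreciated that process 302 may include looking up the physical address for the selected instruction in an instruction translation lookaside buffer, as described in more detail below.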
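In some embodiments, fetching the selected instruction may include retrieving a physical address for the selected instruction from an instruction translation lookaside buffer. In such embodiments, a linear address for the selected instruction may be received upon direction to the target instruction pointer. In turn, the instruction translation lookaside buffer may translate the linear address into a physical address for the selected instruction by searching, with reference to the linear address, the physical addresses stored in the instruction translation lookaside buffer. If the search does not hit the physical address for the selected instruction, the physical address may be determined via a page walk or via a lookup in a higher-level translation lookaside buffer. Regardless of how the physical address is determined, once it is determined it is provided to the instruction cache so that the selected instruction may be obtained.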
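The hit-or-page-walk behavior described above can be modeled in software as a rough sketch. The class name, the 4 KiB page size, and the use of a dictionary as a stand-in for in-memory page tables are all assumptions made for illustration; the patent describes hardware and does not prescribe this structure.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


class InstructionTLB:
    """Toy model of an instruction TLB with a page-walk fallback."""

    def __init__(self, page_table: dict[int, int]):
        self.page_table = page_table       # stand-in for the page tables in memory
        self.entries: dict[int, int] = {}  # cached linear page -> physical frame
        self.page_walks = 0                # counts misses that needed a walk

    def translate(self, linear: int) -> int:
        page = linear >> PAGE_SHIFT
        offset = linear & ((1 << PAGE_SHIFT) - 1)
        frame = self.entries.get(page)
        if frame is None:
            # Miss: resolve the mapping via a (simulated) page walk and cache it.
            self.page_walks += 1
            frame = self.page_table[page]
            self.entries[page] = frame
        return (frame << PAGE_SHIFT) | offset


tlb = InstructionTLB({0x400: 0x9A})
print(hex(tlb.translate(0x400123)), tlb.page_walks)
```

A second translation within the same page hits the cached entry, so `page_walks` stays at 1.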
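At 304, method 300 comprises, while the physical address for the selected instruction is being obtained, hashing the linear address for the selected instruction to generate a hash index from the linear address. The hash index may then be used when determining whether a native translation for the selected instruction exists, as described in more detail below.
For example, direction to the target instruction pointer may cause the linear address to be hashed concurrently (within a suitable tolerance) with distribution of the linear address to the instruction translation lookaside buffer. However, it will be appreciated that any suitable manner of performing the hash may be employed at any suitable position within the process flow without departing from the scope of the present invention.
In some embodiments, the linear address may be hashed by a suitable hardware structure included in the microprocessor. For example, the linear address may be hashed by the fetch logic and/or the native translation address cache, though virtually any suitable hardware structure may be used to hash the linear address without departing from the scope of the present invention.
A wide variety of hash techniques may be employed. For example, in some embodiments, the hash index may be generated using an XOR hash function. A hash index may also be generated by hashing a plurality of portions of the linear address. In some other embodiments, a hash index may be generated from a single portion of the linear address. FIG. 4 schematically shows a method of hashing a 48-bit linear address for an instruction to generate an 8-bit hash index using an XOR hash function. In the example shown in FIG. 4, the result of XORing bits 0-7 with bits 8-15 is XORed with bits 16-23 to generate the 8-bit hash index.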
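The XOR fold described for FIG. 4 can be written out as a short software sketch. The function name is hypothetical, and bits are assumed to be numbered from 0 at the least significant end; in the microprocessor this would be done by hardware rather than code.

```python
def hash_index(linear: int) -> int:
    """Fold bits 0-23 of a 48-bit linear address into an 8-bit index."""
    bits_0_7 = linear & 0xFF
    bits_8_15 = (linear >> 8) & 0xFF
    bits_16_23 = (linear >> 16) & 0xFF
    # XOR the first two bytes, then XOR the result with the third byte.
    return (bits_0_7 ^ bits_8_15) ^ bits_16_23


print(hex(hash_index(0x123456789ABC)))
```

For the sample address, the three low bytes are 0xBC, 0x9A, and 0x78, which fold to 0x5E. Bits 24 and above do not affect the index in this scheme.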
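In some embodiments, a disambiguation tag may be generated when the linear address is hashed. The disambiguation tag may be used to distinguish among translation address entries for different alternate versions (for example, address entries for native translations of instructions) when more than one translation address entry in the translation address cache has the same index value. Thus, in some embodiments, a disambiguation tag may be used to discriminate among a plurality of translation address entries stored in the translation address cache that have the same translation address index. For example, FIG. 4 schematically shows the generation of a 40-bit disambiguation tag for the 48-bit linear address from portions of the linear address not formed into the 8-bit hash index. Thus, in some embodiments, bits not used to generate the hash index may be used to generate the disambiguation tag. In the example shown in FIG. 4, bits 8-47 may be used to form the disambiguation tag. However, any suitable manner of generating the disambiguation tag may be employed without departing from the scope of the present invention.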
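As a sketch of how the index and tag fit together, the following assumes bits are numbered from 0, so that a 40-bit tag taken from a 48-bit address spans bits 8 through 47. The function names are hypothetical. The last helper illustrates why this pair is sufficient to disambiguate: the low byte can be recovered from the index and the tag, so two different addresses can never share both.

```python
TAG_BITS = 40


def hash_index(linear: int) -> int:
    """8-bit XOR fold of bits 0-23 (the FIG. 4 example)."""
    return (linear ^ (linear >> 8) ^ (linear >> 16)) & 0xFF


def disambiguation_tag(linear: int) -> int:
    """Upper 40 bits (bits 8-47, numbering from 0) of a 48-bit address."""
    return (linear >> 8) & ((1 << TAG_BITS) - 1)


def rebuild_address(index: int, tag: int) -> int:
    """Recover the full address from an (index, tag) pair."""
    # index = b0 ^ b1 ^ b2, and b1, b2 are the two low bytes of the tag.
    low_byte = index ^ (tag & 0xFF) ^ ((tag >> 8) & 0xFF)
    return (tag << 8) | low_byte


linear = 0x123456789ABC
print(hex(disambiguation_tag(linear)))
```

Because `rebuild_address` inverts the pair, a match on both index and tag under these assumptions identifies exactly one linear address.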
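While the discussion above relates to hashing a linear address to obtain one or more translation address entries from a translation address cache, so that the translation address entries are indexed according to linear addresses, it will be appreciated that the translation address cache may be indexed according to any suitable address. For example, in some embodiments, a suitably configured translation address cache may be indexed according to physical addresses. Indexing a translation address cache according to physical addresses may save space within the cache when two processes map a shared library at different linear addresses. In some such scenarios, only one version of the shared library may be physically loaded into memory. By indexing according to physical address, a shared mapping may result in a single entry being obtained, while an unshared mapping may result in different entries being obtained.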
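The space saving from physical indexing can be illustrated with made-up numbers. All addresses and process labels below are hypothetical, chosen only to show two processes mapping one shared library page at different linear addresses.

```python
# (process, linear address, physical address); values are illustrative only.
fetches = [
    ("A", 0x7F001000, 0x00042000),  # shared library routine as seen by process A
    ("B", 0x7E553000, 0x00042000),  # same physical code as seen by process B
    ("A", 0x00400000, 0x00100000),  # private code of process A
    ("B", 0x00500000, 0x00200000),  # private code of process B
]

# Distinct keys the cache would need under each indexing choice.
keys_by_linear = {linear for _, linear, _ in fetches}
keys_by_physical = {physical for _, _, physical in fetches}

print(len(keys_by_linear), len(keys_by_physical))
```

With linear indexing the shared routine occupies two entries, while with physical indexing both processes resolve to a single entry.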
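Turning to FIG. 3B, example method 300 includes, at 306, determining whether a valid native translation exists for the selected source instruction being fetched. In some embodiments, the determination of whether a valid native translation exists occurs concurrently (within an acceptable tolerance) with the determination of the physical address for the selected instruction, including retrieval of the address from the instruction translation lookaside buffer. In such embodiments, if it is determined that a valid native translation does not exist, concurrent processing at one or more of these stages may allow the physical address fetch to continue without penalty. However, it will be appreciated that the determinations need not be concurrent in some embodiments.
Regardless of when the validity determination is performed, if it is determined that a valid native translation exists, fetching the source instruction is aborted, for example by aborting retrieval of the physical address for the source instruction. In turn, processing efficiency may be enhanced by avoiding decode steps and by permitting use of the alternate version.
In the embodiment shown in FIG. 3B, determining whether a valid native translation exists includes, at 308, obtaining one or more translation address entries for the hashed address and, at 310, comparing a disambiguation tag generated during the hashing process with the one or more translation address disambiguation tags obtained with each of the translation address entries.
A translation address entry stores a physical address at which a native translation is stored. Translation address entries may be looked up according to the translation address index associated with them. For example, a hash index generated when hashing an address may be used to look up a particular translation address index in a translation address cache.
In some embodiments, more than one translation address entry may be obtained via lookup of a particular translation address index. For example, a hashed address used to look up a translation address index in a 4-way associative cache may result in the retrieval of up to four translation address entries. In such embodiments, each translation address entry has a respective translation address disambiguation tag that distinguishes that entry from other entries having the same translation address index. Comparing the disambiguation tag generated by hashing the address with the disambiguation tags retrieved with the respective translation address entries may determine whether any of the entries obtained represents a physical address for a valid native translation. In some embodiments, comparison of the disambiguation tags may include a comparison of a valid bit. In such embodiments, agreement between the compared tags may be found only if the valid bit is set to a preselected value, such as a value of 1.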
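The lookup at 308-310 can be sketched as a small software model of a 4-way associative cache with a valid bit. The class and field names are hypothetical, and the replacement policy (fill an invalid way, otherwise overwrite way 0) is a placeholder assumption, since the description does not specify one.

```python
from dataclasses import dataclass

WAYS, SETS = 4, 256


@dataclass
class Entry:
    valid: bool = False
    tag: int = 0
    physical_address: int = 0


class TranslationAddressCache:
    def __init__(self) -> None:
        self.sets = [[Entry() for _ in range(WAYS)] for _ in range(SETS)]

    def insert(self, index: int, tag: int, physical_address: int) -> None:
        ways = self.sets[index]
        # Placeholder policy: use the first invalid way, else overwrite way 0.
        victim = next((e for e in ways if not e.valid), ways[0])
        victim.valid, victim.tag, victim.physical_address = True, tag, physical_address

    def lookup(self, index: int, tag: int):
        """Return the native translation's physical address, or None."""
        for entry in self.sets[index]:      # up to four candidate entries
            if entry.valid and entry.tag == tag:
                return entry.physical_address
        return None


cache = TranslationAddressCache()
cache.insert(0x5E, 0x123456789A, 0xBEEF00)
print(cache.lookup(0x5E, 0x123456789A), cache.lookup(0x5E, 0x1))
```

The valid bit matters here: without it, a lookup with tag 0 would falsely match an entry that was never filled.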
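In some embodiments, a translation address entry may include bits representing the physical address for a native translation and bits representing an assumed context for the native translation. Additionally, in some embodiments, a translation address entry may include one or more other bits related to the translation and/or aspects of the translation. FIG. 5 schematically shows an embodiment of a translation address entry that includes physical address bits, assumed context bits, and translation-related bits.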
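One way to picture such an entry is as a packed bit field. The field widths and ordering below are assumptions chosen for illustration only; FIG. 5 names the three groups of bits but this sketch does not claim to reproduce its actual layout.

```python
# Hypothetical widths: 40 physical address bits, 8 context bits, 4 other bits.
PHYS_BITS, CONTEXT_BITS, FLAG_BITS = 40, 8, 4


def pack_entry(physical: int, context: int, flags: int) -> int:
    """Pack the three fields into a single integer word."""
    if physical >> PHYS_BITS or context >> CONTEXT_BITS or flags >> FLAG_BITS:
        raise ValueError("field does not fit in its assumed width")
    return (flags << (PHYS_BITS + CONTEXT_BITS)) | (context << PHYS_BITS) | physical


def unpack_entry(word: int) -> tuple[int, int, int]:
    """Split a packed word back into (physical, context, flags)."""
    physical = word & ((1 << PHYS_BITS) - 1)
    context = (word >> PHYS_BITS) & ((1 << CONTEXT_BITS) - 1)
    flags = word >> (PHYS_BITS + CONTEXT_BITS)
    return physical, context, flags


print(unpack_entry(pack_entry(0xBEEF00, 0x2A, 0x3)))
```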
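Continuing with FIG. 3B, method 300 comprises, at 312, determining whether the disambiguation tag generated when hashing the address agrees with any of the disambiguation tags obtained with the translation address entries. If the disambiguation tags do not agree, method 300 advances to 330, depicted in FIG. 3C. If a disambiguation tag obtained from the translation address cache agrees with the disambiguation tag generated by the hashing, the agreement indicates that a valid disambiguation tag was obtained. In some embodiments, the existence of a valid disambiguation tag may lead to a determination that a valid translation exists. However, in some embodiments, the existence of a valid disambiguation tag alone may not support the conclusion that the entry associated with that tag includes a valid native translation. Thus, method 300 may branch at 314, discussed in more detail below, or alternatively may continue to 318, depicted in FIG. 3C.
As introduced above, in some embodiments, a translation address entry may include an assumed context for the native translation. As used herein, a current context describes the current working state of the microprocessor, and an assumed context describes a state of the microprocessor for which the native translation is valid. Thus, in some embodiments, even if a valid disambiguation tag for an entry is identified, the entry associated with that disambiguation tag may not include a valid native translation for the current context. In some examples, issuing a native translation whose assumed context does not agree with the current context may cause an execution error or hazard.
It will be appreciated that the context may be included in any suitable part of the translation address entry and/or the translation address. In the example shown in FIG. 5, the context bits are illustrated as being included in the translation address entry. In such embodiments, the contexts optionally may be compared, as shown at 316 of FIG. 3C. Thus, instead of advancing to 318, method 300 optionally may branch at 314 and compare the current context of the microprocessor with the assumed context stored in the translation address entry. Turning to FIG. 3C, in these embodiments, method 300 may comprise, at 316, determining whether the current context agrees with the assumed context. In one example scenario, agreement may be found if the assumed and current contexts agree based on a one-to-one comparison. If the contexts agree, method 300 continues to 318, where method 300 makes a determination that a valid native translation exists. If the contexts do not agree, method 300 advances to 330, where method 300 makes a determination that a valid native translation does not exist.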
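The decision at 312 through 318/330 can be summarized as a small predicate. This is a sketch only: the function name is hypothetical, contexts are modeled as plain integers compared one-to-one, and `None` stands for an embodiment that stores no assumed context in the entry.

```python
from typing import Optional


def valid_native_translation(tag_agrees: bool,
                             current_context: int,
                             assumed_context: Optional[int]) -> bool:
    """Decide whether a valid native translation exists for this fetch."""
    if not tag_agrees:
        return False  # 312 -> 330: no matching disambiguation tag
    if assumed_context is None:
        return True   # 312 -> 318: tag agreement alone is sufficient
    # 314/316: the stored assumed context must agree with the current context.
    return current_context == assumed_context


print(valid_native_translation(True, 7, 7), valid_native_translation(True, 7, 3))
```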
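Additionally or alternatively, in some embodiments, bits for the assumed context may be included in the translation address, such as in the disambiguation tag and/or the hash. In such embodiments, including the assumed context in one or more parts of the address may allow two or more entries with different contexts but the same linear address to be stored concurrently within the translation address cache. It will be appreciated that implementation of such embodiments may depend on application-specific considerations. For example, in some embodiments where set associativity is low, such as a scenario where the addresses are directly mapped, including the assumed context in the hash may avoid conflict misses. For example, the assumed context may be XORed into the hash during hashing. In some other embodiments, such as those where the cycle time for hashing additional bits affects processing time more than the time for processing a comparatively wider disambiguation tag, the assumed context may be added to the disambiguation tag to avoid potential processing delays. For example, the assumed context may be appended to the disambiguation tag. In still other embodiments, the assumed context may be included in both the hash and the disambiguation tag.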
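Both placements of the assumed context can be sketched briefly. The 8-bit context width, the function names, and the choice to append the context as the low bits of the tag are assumptions for illustration; the description only says the context may be XORed into the hash or appended to the tag.

```python
def index_with_context(linear: int, assumed_context: int) -> int:
    """XOR an (assumed 8-bit) context into the 8-bit folded index."""
    folded = (linear ^ (linear >> 8) ^ (linear >> 16)) & 0xFF
    return folded ^ (assumed_context & 0xFF)


def tag_with_context(linear: int, assumed_context: int, context_bits: int = 8) -> int:
    """Append the context below the 40-bit tag, giving a wider tag."""
    tag = (linear >> 8) & ((1 << 40) - 1)
    return (tag << context_bits) | (assumed_context & ((1 << context_bits) - 1))


linear = 0x123456789ABC
print(hex(index_with_context(linear, 1)), hex(index_with_context(linear, 2)))
print(hex(tag_with_context(linear, 1)))
```

With the context in the index, the same linear address under two contexts lands in different sets, which is what avoids conflict misses in a direct-mapped arrangement. With the context in the tag, both entries share a set but remain distinguishable.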
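Once it is determined that a valid native translation exists, method 300 comprises, at 320, aborting fetching the instruction. When the abort occurs, the fetch process is terminated. While the termination may occur after the instruction has been fetched, in some embodiments the termination may occur before the fetch process completes. For example, in embodiments where fetching the instruction includes retrieving a physical address for the instruction from an instruction translation lookaside buffer, aborting fetching the instruction may include aborting retrieval of the physical address from the instruction translation lookaside buffer.
At 322, method 300 comprises sending the physical address for the native translation to the instruction cache and, at 324, receiving the selected native translation from the instruction cache. In some embodiments, once the selected native translation is received from the instruction cache, it may be forwarded to a native translation buffer in preparation for eventual distribution to scheduling logic, where it is scheduled for execution.
Alternatively, in the embodiment shown in FIG. 3C, if a valid native translation does not exist, method 300 comprises, at 332, allowing the fetch from the instruction cache to complete. For example, where fetching the instruction includes retrieving a physical address from an instruction translation lookaside buffer, method 300 may include, at 334, after receiving the physical address for the instruction from the instruction translation lookaside buffer, sending that physical address to the instruction cache so that the instruction may be obtained from the instruction cache at 336.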
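The overall flow of method 300 can be condensed into one sequential sketch. Several things here are assumptions: the dictionary-based cache and TLB, the 4 KiB page size, the function names, and above all the sequential ordering, since the description has the hash lookup and the TLB lookup proceed concurrently in hardware.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def fold_index(linear: int) -> int:
    """8-bit XOR fold of bits 0-23, as in the FIG. 4 example."""
    return (linear ^ (linear >> 8) ^ (linear >> 16)) & 0xFF


def tag_of(linear: int) -> int:
    """Bits 8-47 (numbering from 0) of a 48-bit linear address."""
    return (linear >> 8) & ((1 << 40) - 1)


def fetch(linear, current_context, translation_cache, itlb):
    """Return ("native", address) or ("source", address) for one fetch.

    translation_cache: index -> list of (tag, assumed_context, native_address)
    itlb: linear page number -> physical frame number
    """
    index, tag = fold_index(linear), tag_of(linear)
    for entry_tag, assumed_context, native_address in translation_cache.get(index, []):
        if entry_tag == tag and assumed_context == current_context:
            # 318-324: abort the source fetch and use the native translation.
            return "native", native_address
    # 330-336: no valid translation, so let the source fetch complete.
    frame = itlb[linear >> PAGE_SHIFT]
    return "source", (frame << PAGE_SHIFT) | (linear & ((1 << PAGE_SHIFT) - 1))


cache = {0x5E: [(0x123456789A, 7, 0xBEEF00)]}
itlb = {0x123456789: 0x42}
print(fetch(0x123456789ABC, 7, cache, itlb))
print(fetch(0x123456789ABC, 3, cache, itlb))
```

In the first call the tag and context both agree, so the native translation's address is returned. In the second call the context differs, so the fetch falls through to the source instruction's physical address.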
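Consequently, by determining whether alternate versions of the source material exist (in the examples described above, native translations that provide the same functionality as the source instructions) while the source material is being fetched, the methods described herein may offer enhanced processing relative to processing based on the source material alone. Further, by using hardware structures to perform the concurrent determination, the methods described herein may be more efficient than schemes based on software optimization, particularly in branched processing scenarios.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims and may include other examples as understood by those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims.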

Claims (10)

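  1. A microprocessor comprising fetch logic operative to: fetch an instruction; hash an address for the instruction to determine whether an alternate version of the instruction that achieves the same functionality as the instruction exists; and, if the hashing results in a determination that such an alternate version exists, abort the fetch and retrieve and execute the alternate version.
  2. The microprocessor of claim 1, wherein the fetch logic is further operative to hash the address while the instruction is being fetched.
  3. The microprocessor of claim 2, wherein the fetch logic is further operative to generate a hash index from a hash of one or more portions of a linear address for the instruction, and to generate a disambiguation tag from other portions of the linear address for the instruction.
  4. The microprocessor of claim 1, wherein the fetch logic is further operative to: determine whether the alternate version exists by referencing a translation address index in a translation address cache of the microprocessor according to a hash index generated from the hashing; and, if the alternate version exists, retrieve a physical address for the alternate version from the translation address cache.
  5. The microprocessor of claim 4, wherein the fetch logic is further operative to: obtain one or more translation address entries stored in the translation address cache according to the translation address index; compare a disambiguation tag generated from the hashing with a disambiguation tag associated with each of the one or more translation address entries obtained; and, if the disambiguation tag generated from the hashing agrees with a disambiguation tag obtained from the translation address cache, determine that the alternate version exists.
  6. The microprocessor of claim 4, wherein the fetch logic is further operative to: compare a current context of the microprocessor with an assumed context, the current context describing a current working state of the microprocessor and the assumed context describing a state of the microprocessor for which the alternate version is valid; and, if the current context agrees with the assumed context, determine that the alternate version exists, wherein the assumed context is included in one or more of the hash index, the disambiguation tag, or one or more translation address entries associated with the hash index and the disambiguation tag.
  7. The microprocessor of claim 4, further comprising fetch logic operative to: send the physical address for the alternate version to an instruction cache so that the alternate version may be obtained from the instruction cache; and send the alternate version obtained from the instruction cache to scheduling logic to schedule the alternate version for execution.
  8. The microprocessor of claim 1, further comprising a translation address cache configured to store, for each alternate version stored within the translation address cache, a translation address entry including a physical address for the alternate version and an assumed context describing a state of the microprocessor for which the alternate version is valid.
  9. The microprocessor of claim 1, further comprising an instruction cache selected from the group consisting of a linearly indexed, physically tagged instruction cache and a physically indexed, physically tagged instruction cache.
  10. The microprocessor of claim 1, wherein the fetch logic is further operative to retrieve a physical address for the instruction from an instruction translation lookaside buffer with reference to a linear address for the instruction.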
TW102108698A 2012-03-13 2013-03-12 Translation address cache for a microprocessor TWI515567B (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/419,323 US10146545B2 (en) 2012-03-13 2012-03-13 Translation address cache for a microprocessor

Publications (2)

Publication Number Publication Date
TW201407348A TW201407348A (zh) 2014-02-16
TWI515567B TWI515567B (zh) 2016-01-01

Family

ID=49044138

Family Applications (1)

Application Number Title Priority Date Filing Date
TW102108698A TWI515567B (zh) 2012-03-13 2013-03-12 Translation address cache for a microprocessor

Country Status (4)

Country Link
US (1) US10146545B2 (zh)
CN (1) CN103309644B (zh)
DE (1) DE102013201767B4 (zh)
TW (1) TWI515567B (zh)

US7107441B2 (en) 2003-05-21 2006-09-12 Intel Corporation Pre-boot interpreted namespace parsing for flexible heterogeneous configuration and code consolidation
US7082508B2 (en) 2003-06-24 2006-07-25 Intel Corporation Dynamic TLB locking based on page usage metric
US7124255B2 (en) 2003-06-30 2006-10-17 Microsoft Corporation Message based inter-process for high volume data
GB0316532D0 (en) 2003-07-15 2003-08-20 Transitive Ltd Method and apparatus for partitioning code in program code conversion
US7225299B1 (en) 2003-07-16 2007-05-29 Transmeta Corporation Supporting speculative modification in a data cache
US7062631B1 (en) 2003-07-17 2006-06-13 Transmeta Corporation Method and system for enforcing consistent per-physical page cacheability attributes
US7418585B2 (en) 2003-08-28 2008-08-26 Mips Technologies, Inc. Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts
US20050050013A1 (en) 2003-08-28 2005-03-03 Sharp Laboratories Of America, Inc. System and method for policy-driven device queries
US7010648B2 (en) 2003-09-08 2006-03-07 Sun Microsystems, Inc. Method and apparatus for avoiding cache pollution due to speculative memory load operations in a microprocessor
US7921300B2 (en) 2003-10-10 2011-04-05 Via Technologies, Inc. Apparatus and method for secure hash algorithm
US7321958B2 (en) 2003-10-30 2008-01-22 International Business Machines Corporation System and method for sharing memory by heterogeneous processors
US7159095B2 (en) 2003-12-09 2007-01-02 International Business Machines Corporation Method of efficiently handling multiple page sizes in an effective to real address translation (ERAT) table
US7730489B1 (en) 2003-12-10 2010-06-01 Oracle America, Inc. Horizontally scalable and reliable distributed transaction management in a clustered application server environment
US7107411B2 (en) 2003-12-16 2006-09-12 International Business Machines Corporation Apparatus method and system for fault tolerant virtual memory management
US7496732B2 (en) 2003-12-17 2009-02-24 Intel Corporation Method and apparatus for results speculation under run-ahead execution
US7310722B2 (en) 2003-12-18 2007-12-18 Nvidia Corporation Across-thread out of order instruction dispatch in a multithreaded graphics processor
US7340565B2 (en) 2004-01-13 2008-03-04 Hewlett-Packard Development Company, L.P. Source request arbitration
US7293164B2 (en) 2004-01-14 2007-11-06 International Business Machines Corporation Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions
US7082075B2 (en) 2004-03-18 2006-07-25 Micron Technology, Inc. Memory device and method having banks of different sizes
US7383414B2 (en) 2004-05-28 2008-06-03 Oracle International Corporation Method and apparatus for memory-mapped input/output
US7234038B1 (en) 2004-05-28 2007-06-19 Sun Microsystems, Inc. Page mapping cookies
US20060004984A1 (en) 2004-06-30 2006-01-05 Morris Tonia G Virtual memory management system
US8190863B2 (en) 2004-07-02 2012-05-29 Intel Corporation Apparatus and method for heterogeneous chip multiprocessors via resource allocation and restriction
US7257699B2 (en) 2004-07-08 2007-08-14 Sun Microsystems, Inc. Selective execution of deferred instructions in a processor that supports speculative execution
US7194604B2 (en) 2004-08-26 2007-03-20 International Business Machines Corporation Address generation interlock resolution under runahead execution
US7890735B2 (en) 2004-08-30 2011-02-15 Texas Instruments Incorporated Multi-threading processors, integrated circuit devices, systems, and processes of operation and manufacture
US8001294B2 (en) 2004-09-28 2011-08-16 Sony Computer Entertainment Inc. Methods and apparatus for providing a compressed network in a multi-processing system
US7340582B2 (en) 2004-09-30 2008-03-04 Intel Corporation Fault processing for direct memory access address translation
US8843727B2 (en) 2004-09-30 2014-09-23 Intel Corporation Performance enhancement of address translation using translation tables covering large address spaces
US20060149931A1 (en) 2004-12-28 2006-07-06 Akkary Haitham Runahead execution in a central processing unit
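CN100573443C (zh) 2004-12-30 2009-12-23 Intel Corporation Format selection for multi-format instructions in binary code translation from a mixed source instruction set architecture to a single target instruction set architecture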
US7437517B2 (en) 2005-01-11 2008-10-14 International Business Machines Corporation Methods and arrangements to manage on-chip memory to reduce memory latency
US20060174228A1 (en) 2005-01-28 2006-08-03 Dell Products L.P. Adaptive pre-fetch policy
US7752627B2 (en) 2005-02-04 2010-07-06 Mips Technologies, Inc. Leaky-bucket thread scheduler in a multithreading microprocessor
US7948896B2 (en) 2005-02-18 2011-05-24 Broadcom Corporation Weighted-fair-queuing relative bandwidth sharing
US7209405B2 (en) 2005-02-23 2007-04-24 Micron Technology, Inc. Memory device and method having multiple internal data buses and memory bank interleaving
TWI309378B (en) 2005-02-23 2009-05-01 Altek Corp Central processing unit having a micro-code engine
US7447869B2 (en) 2005-04-07 2008-11-04 Ati Technologies, Inc. Method and apparatus for fragment processing in a virtual memory system
US20100161901A9 (en) * 2005-04-14 2010-06-24 Arm Limited Correction of incorrect cache accesses
US20060236074A1 (en) * 2005-04-14 2006-10-19 Arm Limited Indicating storage locations within caches
DE102005021749A1 (de) 2005-05-11 2006-11-16 Fachhochschule Dortmund Method and device for program-controlled information processing
US7299337B2 (en) 2005-05-12 2007-11-20 Traut Eric P Enhanced shadow page table algorithms
US7739668B2 (en) 2005-05-16 2010-06-15 Texas Instruments Incorporated Method and system of profiling applications that use virtual memory
US20060277398A1 (en) 2005-06-03 2006-12-07 Intel Corporation Method and apparatus for instruction latency tolerant execution in an out-of-order pipeline
US7814292B2 (en) 2005-06-14 2010-10-12 Intel Corporation Memory attribute speculation
US20070067505A1 (en) 2005-09-22 2007-03-22 Kaniyur Narayanan G Method and an apparatus to prevent over subscription and thrashing of translation lookaside buffer (TLB) entries in I/O virtualization hardware
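JP2007109116A (ja) 2005-10-17 2007-04-26 Fukuoka Pref Gov Sangyo Kagaku Gijutsu Shinko Zaidan Estimation device, table management device, selection device, table management method, program for causing a computer to implement the table management method, and storage medium recording the program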
US7739476B2 (en) 2005-11-04 2010-06-15 Apple Inc. R and C bit update handling
US7616218B1 (en) 2005-12-05 2009-11-10 Nvidia Corporation Apparatus, system, and method for clipping graphics primitives
US7519781B1 (en) 2005-12-19 2009-04-14 Nvidia Corporation Physically-based page characterization data
US7512767B2 (en) 2006-01-04 2009-03-31 Sony Ericsson Mobile Communications Ab Data compression method for supporting virtual memory management in a demand paging system
US7653803B2 (en) 2006-01-17 2010-01-26 Globalfoundries Inc. Address translation for input/output (I/O) devices and interrupt remapping for I/O devices in an I/O memory management unit (IOMMU)
JP4890033B2 (ja) 2006-01-19 2012-03-07 Hitachi, Ltd. Storage device system and storage control method
US7545382B1 (en) 2006-03-29 2009-06-09 Nvidia Corporation Apparatus, system, and method for using page table entries in a graphics system to provide storage format information for address translation
US20070240141A1 (en) 2006-03-30 2007-10-11 Feng Qin Performing dynamic information flow tracking
JP5010164B2 (ja) 2006-03-31 2012-08-29 Hitachi, Ltd. Server device and virtual machine control program
US8621120B2 (en) 2006-04-17 2013-12-31 International Business Machines Corporation Stalling of DMA operations in order to do memory migration using a migration in progress bit in the translation control entry mechanism
US7702843B1 (en) 2006-04-27 2010-04-20 Vmware, Inc. Determining memory conditions in a virtual machine
US8035648B1 (en) 2006-05-19 2011-10-11 Nvidia Corporation Runahead execution for graphics processing units
US8707011B1 (en) 2006-10-24 2014-04-22 Nvidia Corporation Memory access techniques utilizing a set-associative translation lookaside buffer
US8706975B1 (en) 2006-11-01 2014-04-22 Nvidia Corporation Memory access management block bind system and method
CN100485689C (zh) 2007-01-30 2009-05-06 Inspur Communication Information Systems Co., Ltd. Method for accelerated data query based on file system cache
WO2008097710A2 (en) 2007-02-02 2008-08-14 Tarari, Inc. Systems and methods for processing access control lists (acls) in network switches using regular expression matching logic
CN101042670A (zh) 2007-04-24 2007-09-26 Shanghai Hualong Information Technology Development Center Instruction exception handling method
US7895421B2 (en) 2007-07-12 2011-02-22 Globalfoundries Inc. Mechanism for using performance counters to identify reasons and delay times for instructions that are stalled during retirement
US7712092B2 (en) 2007-10-01 2010-05-04 The Board Of Trustees Of The Leland Stanford Junior University Binary translation using peephole translation rules
US7925923B1 (en) 2008-01-31 2011-04-12 Hewlett-Packard Development Company, L.P. Migrating a virtual machine in response to failure of an instruction to execute
US20090327661A1 (en) 2008-06-30 2009-12-31 Zeev Sperber Mechanisms to handle free physical register identifiers for smt out-of-order processors
US8145890B2 (en) 2009-02-12 2012-03-27 Via Technologies, Inc. Pipelined microprocessor with fast conditional branch instructions based on static microcode-implemented instruction state
US8533437B2 (en) 2009-06-01 2013-09-10 Via Technologies, Inc. Guaranteed prefetch instruction
US8364902B2 (en) 2009-08-07 2013-01-29 Via Technologies, Inc. Microprocessor with repeat prefetch indirect instruction
US20110078425A1 (en) * 2009-09-25 2011-03-31 Shah Manish K Branch prediction mechanism for predicting indirect branch targets
US8775153B2 (en) 2009-12-23 2014-07-08 Intel Corporation Transitioning from source instruction set architecture (ISA) code to translated code in a partial emulation environment
TWI506434B (zh) 2010-03-29 2015-11-01 Via Tech Inc Prefetch unit, data prefetch method, computer program product, and microprocessor
US8479176B2 (en) 2010-06-14 2013-07-02 Intel Corporation Register mapping techniques for efficient dynamic binary translation
US8719625B2 (en) 2010-07-22 2014-05-06 International Business Machines Corporation Method, apparatus and computer program for processing invalid data
US8549504B2 (en) 2010-09-25 2013-10-01 Intel Corporation Apparatus, method, and system for providing a decision mechanism for conditional commits in an atomic region
US8627044B2 (en) 2010-10-06 2014-01-07 Oracle International Corporation Issuing instructions with unresolved data dependencies
KR101612594B1 (ko) 2011-01-27 2016-04-14 Soft Machines, Inc. Guest instruction to native instruction range-based mapping using a conversion look-aside buffer of a processor
US20140019723A1 (en) 2011-12-28 2014-01-16 Koichi Yamada Binary translation in asymmetric multiprocessor system
US8898642B2 (en) 2012-02-16 2014-11-25 Unisys Corporation Profiling and sequencing operators executable in an emulated computing system
US9880846B2 (en) 2012-04-11 2018-01-30 Nvidia Corporation Improving hit rate of code translation redirection table with replacement strategy based on usage history table of evicted entries
US10241810B2 (en) 2012-05-18 2019-03-26 Nvidia Corporation Instruction-optimizing processor with branch-count table in hardware
US9384001B2 (en) 2012-08-15 2016-07-05 Nvidia Corporation Custom chaining stubs for instruction code translation
US9645929B2 (en) 2012-09-14 2017-05-09 Nvidia Corporation Speculative permission acquisition for shared memory
US9740553B2 (en) 2012-11-14 2017-08-22 Nvidia Corporation Managing potentially invalid results during runahead
US20140189310A1 (en) 2012-12-27 2014-07-03 Nvidia Corporation Fault detection in instruction translations
US10108424B2 (en) 2013-03-14 2018-10-23 Nvidia Corporation Profiling code portions to generate translations
US9547602B2 (en) 2013-03-14 2017-01-17 Nvidia Corporation Translation lookaside buffer entry systems and methods
US9582280B2 (en) 2013-07-18 2017-02-28 Nvidia Corporation Branching to alternate code based on runahead determination

Also Published As

Publication number Publication date
CN103309644A (zh) 2013-09-18
US20130246709A1 (en) 2013-09-19
DE102013201767A1 (de) 2013-09-19
DE102013201767B4 (de) 2021-12-02
US10146545B2 (en) 2018-12-04
CN103309644B (zh) 2016-08-03
TWI515567B (zh) 2016-01-01

Similar Documents

Publication Publication Date Title
TWI515567B (zh) Translation address cache for a microprocessor
US8533438B2 (en) Store-to-load forwarding based on load/store address computation source information comparisons
TWI552069B (zh) Load-store dependency predictor, and processor and method for processing operations in a load-store dependency predictor
US6581151B2 (en) Apparatus and method for speculatively forwarding storehit data based on physical page index compare
JP2618175B2 (ja) History table of virtual address translation prediction for cache access
US7600097B1 (en) Detecting raw hazards in an object-addressed memory hierarchy by comparing an object identifier and offset for a load instruction to object identifiers and offsets in a store queue
US9131899B2 (en) Efficient handling of misaligned loads and stores
US8190652B2 (en) Achieving coherence between dynamically optimized code and original code
JP5608594B2 (ja) Preload instruction control
US20090006803A1 (en) L2 Cache/Nest Address Translation
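JP5059749B2 (ja) Handling cache misses in instructions that cross a cache line boundary
CN105446900A (zh) Processor and method for distinguishing system management mode entries
KR102268601B1 (ko) Processor for data forwarding, operating method thereof, and system including the same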
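KR20130140582A (ko) Zero cycle load
CN107818053B (zh) Method and apparatus for accessing a cache
TW201423584A (zh) Fetch width predictor
JP2009217827A (ja) Cache accessing using micro tags
JP2011129103A (ja) Method and apparatus for performing load processing with high efficiency using a buffer
KR101787851B1 (ko) Apparatus and method for a multiple page size translation lookaside buffer (TLB)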
US11989285B2 (en) Thwarting store-to-load forwarding side channel attacks by pre-forwarding matching of physical address proxies and/or permission checking
CN103365627A (zh) System and method for data forwarding within an execution unit
US20150339233A1 (en) Facilitating efficient prefetching for scatter/gather operations
AU2016265131A1 (en) Method and apparatus for cache tag compression
WO2014100653A1 (en) Speculative addressing using a virtual address-to-physical address page crossing buffer
US20140095838A1 (en) Physical Reference List for Tracking Physical Register Sharing