TW201504940A - Method for populating a register view data structure by using register template snapshots - Google Patents
- Publication number
- TW201504940A (application number TW103109509A)
- Authority
- TW
- Taiwan
- Prior art keywords
- register
- blocks
- instruction
- data structure
- block
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3838—Dependency mechanisms, e.g. register scoreboarding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3853—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution of compound instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3854—Instruction completion, e.g. retiring, committing or graduating
- G06F9/3858—Result writeback, i.e. updating the architectural state or memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3861—Recovery, e.g. branch miss-prediction, exception handling
- G06F9/3863—Recovery, e.g. branch miss-prediction, exception handling using multiple copies of the architectural state, e.g. shadow registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
Abstract
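The present invention provides a method for populating a register view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.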
Description
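The present invention is generally related to digital computer systems, and more particularly to a system and method for selecting instructions comprising an instruction sequence.
Processors are required to handle multiple tasks that are either dependent on one another or totally independent. The internal state of such processors usually consists of registers that might hold different values at each particular instant of program execution. At each instant of program execution, the internal state image is called the architecture state of the processor.
When code execution is switched to run another function (e.g., another thread, process or program), the state of the machine/processor has to be saved so that the new function can utilize the internal registers to build its new state. Once the new function is terminated, its state can be discarded, and the state of the previous context is restored and execution resumes. Such a switch process is called a context switch and usually takes tens or hundreds of cycles, especially with modern architectures that employ a large number of registers (e.g., 64, 128, 256) and/or out of order execution.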
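In thread-aware hardware architectures, it is normal for the hardware to support multiple context states for a limited number of hardware-supported threads. In this case, the hardware duplicates all architecture state elements for each supported thread. This eliminates the need for a context switch when executing a new thread. However, this still has multiple drawbacks, namely the area, power and complexity of duplicating all architecture state elements (i.e., registers) for each additional thread supported in hardware. In addition, if the number of software threads exceeds the number of explicitly supported hardware threads, the context switch must still be performed.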
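This becomes common as parallelism is needed on a fine granularity basis, requiring a large number of threads. Hardware thread-aware architectures with duplicate context-state hardware storage do not help non-threaded software code and only reduce the number of context switches for software that is threaded. However, those threads are usually constructed for coarse grain parallelism and result in heavy software overhead for initiating and synchronizing, leaving fine grain parallelism, such as function calls and parallel execution of loops, without efficient threading initiation/auto generation. Such overheads are accompanied by the difficulty of auto parallelization of such code using state of the art compiler or user parallelization techniques for software code that is not explicitly or easily parallelized/threaded.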
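In one embodiment, the present invention is implemented as a method for populating a register view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.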
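The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.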
R0-R63‧‧‧registers
T0-T4‧‧‧register templates
20‧‧‧block
S1-S8‧‧‧sources
P1-P4‧‧‧ports
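The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements.
Figure 1 shows an overview diagram of a process for grouping instructions into a block and tracking dependencies among the instructions by using a register template.
Figure 2 shows an overview diagram of a register view, a source view, and an instruction view in accordance with one embodiment of the present invention.
Figure 3 shows a diagram that illustrates an exemplary register template and how the source view is populated by information from the register template in accordance with one embodiment of the present invention.
Figure 4 shows a diagram illustrating a first embodiment for dependency broadcasting within the source view. In this embodiment, each column comprises an instruction block.
Figure 5 shows a diagram illustrating a second embodiment for dependency broadcasting within the source view.
Figure 6 shows a diagram illustrating the selection of ready blocks for dispatch starting from the commit pointer and the broadcasting of the corresponding port assignments in accordance with one embodiment of the present invention.
Figure 7 shows an adder tree structure that is used to implement the selector array described in Figure 6 in accordance with one embodiment of the present invention.
Figure 8 shows exemplary logic of a selector array adder tree in greater detail.
Figure 9 shows a parallel implementation of the adder tree for implementing a selector array in accordance with one embodiment of the present invention.
Figure 10 shows an exemplary diagram illustrating how adder X from Figure 9 can be implemented by using carry save adders in accordance with one embodiment of the present invention.
Figure 11 shows a masking embodiment for masking ready bits for scheduling starting from the commit pointer and using the selector array adders in accordance with the present invention.
Figure 12 shows an overview diagram of how register view entries are populated by register templates in accordance with one embodiment of the present invention.
Figure 13 shows a first embodiment for a reduced register view footprint in accordance with one embodiment of the present invention.
Figure 14 shows a second embodiment for a reduced register footprint in accordance with one embodiment of the present invention.
Figure 15 shows an exemplary format of the delta between snapshots in accordance with one embodiment of the present invention.
Figure 16 shows a diagram of a process for creating register template snapshots upon allocation of instruction blocks in accordance with one embodiment of the present invention.
Figure 17 shows another diagram of a process for creating register template snapshots upon allocation of instruction blocks in accordance with one embodiment of the present invention.
Figure 18 shows an overview diagram of hardware for implementing the serial implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention.
Figure 19 shows an overview diagram of hardware for implementing a parallel implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention.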
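Figure 20 shows an overview diagram of the hardware for instruction block-based execution and how it works with the source view, the instruction view, the register templates, and the register view in accordance with one embodiment of the present invention.
Figure 21 shows an example of a chunking architecture in accordance with one embodiment of the present invention.
Figure 22 shows a depiction of how threads are allocated in accordance with their block numbers and thread identifier (ID) in accordance with one embodiment of the present invention.
Figure 23 shows an implementation of a scheduler using thread pointer maps that point to physical storage locations in order to manage multithreaded execution in accordance with one embodiment of the present invention.
Figure 24 shows another implementation of a scheduler using thread-based pointer maps in accordance with one embodiment of the present invention.
Figure 25 shows a diagram of a dynamic calendar-based allocation of execution resources to threads in accordance with one embodiment of the present invention.
Figure 26 diagrams a dual dispatch process in accordance with one embodiment of the present invention.
Figure 27 diagrams a dual dispatch transient multiply-accumulate in accordance with one embodiment of the present invention.
Figure 28 diagrams a dual dispatch architecturally visible state multiply-add in accordance with one embodiment of the present invention.
Figure 29 shows an overview diagram of the fetching and formation of instruction blocks for execution on grouped execution units in accordance with one embodiment of the present invention.
Figure 30 shows an exemplary diagram of instruction grouping in accordance with one embodiment of the present invention. In the Figure 30 embodiment, two instructions are shown with a third auxiliary operation.
Figure 31 shows how half block pairs within a block stack map onto the execution block units in accordance with one embodiment of the present invention.
Figure 32 shows a diagram depicting intermediate block results storage as a first level register file in accordance with one embodiment of the present invention.
Figure 33 shows an odd/even ports scheduler in accordance with one embodiment of the present invention.
Figure 34 shows a more detailed version of Figure 33, in which four execution units are shown receiving results from the scheduler array and writing outputs to a temporary register file segment.
Figure 35 shows a diagram depicting guest flag architecture emulation in accordance with one embodiment of the present invention.
Figure 36 shows a diagram illustrating the front end of the machine, the scheduler and the execution units, and a centralized flag register in accordance with one embodiment of the present invention.
Figure 37 shows a diagram of a centralized flag register emulation process as implemented by embodiments of the present invention.
Figure 38 shows a flowchart of the steps of a process 3800 of emulating centralized flag register behavior in a guest setting.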
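Although the present invention has been described in connection with one embodiment, the invention is not intended to be limited to the specific forms set forth herein. On the contrary, it is intended to cover such alternatives, modifications, and equivalents as can be reasonably included within the scope of the invention as defined by the appended claims.
In the following detailed description, numerous specific details such as specific method orders, structures, elements, and connections have been set forth. It is to be understood, however, that these and other specific details need not be utilized to practice embodiments of the present invention. In other circumstances, well-known structures, elements, or connections have been omitted, or have not been described in particular detail, in order to avoid unnecessarily obscuring this description.
References within the specification to "one embodiment" or "an embodiment" are intended to indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. The appearances of the phrase "in one embodiment" in various places within the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.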
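Some portions of the detailed description which follows are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals of a computer readable storage medium and are capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the present invention, discussions utilizing terms such as "processing" or "accessing" or "writing" or "storing" or "replicating" or the like refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.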
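Figure 1 shows an overview diagram of a process for grouping instructions into a block and tracking dependencies among the instructions by using a register template.
Figure 1 shows an instruction block having a header and a body. The block is created from a group of instructions. The block comprises an entity that encapsulates the group of instructions. In the present embodiment of the microprocessor, the level of abstraction is raised to blocks instead of individual instructions. Blocks are processed for dispatch instead of individual instructions. Each block is labeled with a block number. The machine's out of order management job is thereby simplified considerably. One key feature is to find a way to manage a larger number of instructions being processed without greatly increasing the management overhead of the machine.
Embodiments of the present invention achieve this objective by implementing instruction blocks, register templates, and inheritance vectors. In the block shown in Figure 1, the header of the block lists and encapsulates all the sources and destinations of the instructions of the block, as well as where those sources come from (e.g., from which blocks). The header includes the destinations that update the register template. The sources included in the header are concatenated with the block numbers stored in the register template.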
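The number of instructions that are processed out of order determines the management complexity of the out of order machine. More out of order instructions lead to greater complexity. Sources need to be compared against the destinations of prior instructions in the out of order dispatch window of the processor.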
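As shown in Figure 1, the register template has a field for each register from R0 to R63. Blocks write their respective block numbers into the register template fields that correspond to the block destinations. Each block reads the register fields that represent its register sources from the register template. When a block retires and writes its destination register contents into the register file, its number is erased from the register template. This means that those registers can then be read as sources from the register file itself.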
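The bookkeeping described above can be sketched in a few lines of Python. This is a minimal software model under stated assumptions, not the hardware of the patent: the class name `RegisterTemplate`, its methods, and the use of `None` for "read from the register file" are all hypothetical.

```python
NUM_REGS = 64  # R0..R63, as in Figure 1


class RegisterTemplate:
    """Hypothetical model: one field per register, holding the number of the
    newest in-flight block that writes it (None = value is in the register file)."""

    def __init__(self):
        self.fields = [None] * NUM_REGS

    def allocate(self, block_no, sources, destinations):
        # Read the source fields first: they name the producer blocks.
        deps = {r: self.fields[r] for r in sources}
        # Then stamp this block's number into its destination fields.
        for r in destinations:
            self.fields[r] = block_no
        return deps

    def retire(self, block_no):
        # Erase the retiring block's number; those registers are now
        # read as sources from the register file itself.
        for r, owner in enumerate(self.fields):
            if owner == block_no:
                self.fields[r] = None


rt = RegisterTemplate()
rt.allocate(10, sources=[1], destinations=[2, 3])
deps = rt.allocate(11, sources=[2, 5], destinations=[2])
rt.retire(10)
```

Here block 11 learns that its source R2 is produced by block 10, while R5 has no in-flight producer. After block 10 retires, R3 is cleared, but R2 still names block 11 because a younger block has overwritten that field.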
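In the present embodiment, the register template is updated in each machine cycle in which a block is allocated. As new template updates are generated, prior snapshots of the register template are stored into an array (e.g., the register view shown in Figure 2), one per block. This information is retained until the corresponding block is retired. This allows the machine to recover from miss-predictions and flush very quickly (e.g., by obtaining the last known dependency state).
In one embodiment, the register templates stored in the register view can be compressed (thereby saving storage space) by storing only the delta between successive snapshots (the incremental changes between snapshots). In this manner the machine obtains a shrunk register view. Further compression can be obtained by storing templates only for blocks that have a branch instruction.
If a recovery point is needed other than a branch miss-prediction, then a recovery is first obtained at the branch recovery point, and state can then be rebuilt by allocating instructions (but not executing them) until the machine reaches the sought-after recovery point.
It should be noted that in one embodiment, the term "register template" as used herein is synonymous with the term "inheritance vector" as described in US patent application number 13/428,440, which is incorporated herein by reference in its entirety.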
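Figure 2 shows an overview diagram of a register view, a source view, and an instruction view in accordance with one embodiment of the present invention. This figure shows one embodiment of a scheduler architecture (e.g., having a source view, instruction view, register view, etc.). Other implementations of a scheduler architecture that achieve the same functionality by combining or splitting one or more of the above cited structures are possible.
Figure 2 diagrams the functional entities supporting the operation of the register templates and the retention of the machine state. The left-hand side of Figure 2 shows register templates T0 through T4, with the arrows indicating the inheritance of information from one register template/inheritance vector to the next. The register view, source view, and instruction view each comprise data structures for storing information that relates to the instruction blocks. Figure 2 also shows an exemplary instruction block having a header, and how the instruction block includes both sources and destinations for the registers of the machine. Information about the registers referred to by the blocks is stored in the register view data structure. Information about the sources referred to by the blocks is stored in the source view data structure. Information about the instructions themselves referred to by the blocks is stored in the instruction view data structure. The register templates/inheritance vectors themselves comprise data structures storing dependency and inheritance information referred to by the blocks.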
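Figure 3 shows a diagram that illustrates an exemplary register template and how the source view is populated by information from the register template in accordance with one embodiment of the present invention.
In the present embodiment, it should be noted that the goal of the source view is to determine when particular blocks can be dispatched. When a block is dispatched, it broadcasts its block number to all remaining blocks. Any match against the sources of the other blocks (e.g., a compare) causes a ready bit (or some other type of indicator) to be set. When all ready bits are set (e.g., via an AND gate), the block is ready to be dispatched. Blocks are dispatched based on the readiness of the other blocks they depend on.
When multiple blocks are ready for dispatch, the oldest block is chosen for dispatch ahead of younger blocks. For example, in one embodiment a find-first circuit can be used to find the oldest block based on proximity to a commit pointer, and subsequent blocks based on their relative proximity to the commit pointer (e.g., working on each block's ready bit).
Referring still to Figure 3, in this example the register template snapshot created at the arrival of block 20 is being examined. As described above, the register template has a field for each register from R0 to R63. Blocks write their respective block numbers into the register template fields that correspond to the block destinations. Each block reads the register fields that represent its register sources from the register template. The first number is the block that wrote to the register, and the second number is the destination number of that block.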
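For example, when block 20 arrives, it reads the snapshot of the register template and looks up its own register sources in the register template to determine the latest block that wrote to each of its sources, and it populates the source view according to the updates that its destinations make to the previous register template snapshot. Subsequent blocks will update the register template with their own destinations. This is shown in the bottom left of Figure 3, where block 20 populates its sources: source 1, source 2, source 3, all the way to source 8.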
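Figure 4 shows a diagram illustrating a first embodiment for dependency broadcasting within the source view. In this embodiment, each column comprises an instruction block. When a block is allocated, it writes a mark (e.g., a 0) in the column of every block on which its sources depend. When any other block is dispatched, its number is broadcast across the exact column that relates to that block. It should be noted that 1 is the default value, indicating that there is no dependency on that block.
When all ready bits in a block are ready, that block is dispatched and its number is broadcast back to all the remaining blocks. The block number is compared against all the numbers stored in the sources of the other blocks. If there is a match, the ready bit for that source is set. For example, if the block number broadcast on source 1 equals 11, then the ready bit for source 1 of block 20 will be set.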
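The broadcast-and-compare behaviour described above can be modelled roughly as follows. This is an illustrative sketch only: `SourceViewEntry` and its fields are hypothetical names, and a source with no in-flight producer is assumed to be ready from the start.

```python
class SourceViewEntry:
    """Hypothetical source view entry for one block."""

    def __init__(self, block_no, producers):
        # producers[i] is the block number that source i waits on
        # (None means the source has no in-flight producer).
        self.block_no = block_no
        self.producers = list(producers)
        self.ready = [p is None for p in producers]

    def on_broadcast(self, dispatched_block_no):
        # Compare the broadcast block number against every stored source.
        for i, producer in enumerate(self.producers):
            if producer == dispatched_block_no:
                self.ready[i] = True

    def is_ready(self):
        # Logical AND over all ready bits.
        return all(self.ready)


entry20 = SourceViewEntry(20, producers=[11, None, 15])
entry20.on_broadcast(11)
ready_after_11 = entry20.is_ready()
entry20.on_broadcast(15)
ready_after_15 = entry20.is_ready()
```

Block 20 becomes dispatchable only once both of its producers, blocks 11 and 15 in this made-up example, have broadcast their numbers.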
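Figure 5 shows a diagram illustrating a second embodiment for dependency broadcasting within the source view. This embodiment is organized by sources as opposed to being organized by blocks. This is shown by the sources S1 through S8 across the source view data structure. In a manner similar to that described with Figure 4 above, in the Figure 5 embodiment, when all ready bits in a block are ready, that block is dispatched and its number is broadcast back to all the remaining blocks. The block number is compared against all the numbers stored in the sources of the other blocks. If there is a match, the ready bit for that source is set. For example, if the block number broadcast on source 1 equals 11, then the ready bit for source 1 of block 20 will be set.
The Figure 5 embodiment also shows how the compares are only enabled on the blocks between the commit pointer and the allocate pointer. All other blocks are invalid.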
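Figure 6 shows a diagram illustrating the selection of ready blocks for dispatch starting from the commit pointer and the broadcasting of the corresponding port assignments in accordance with one embodiment of the present invention. The source view data structure is shown on the left-hand side of Figure 6. The instruction view data structure is shown on the right-hand side of Figure 6. A selector array is shown between the source view and the instruction view. In this embodiment, the selector array dispatches four blocks per cycle via the four dispatch ports P1 through P4.
As described above, blocks are selected for dispatch from the commit pointer, wrapping around, to the allocate pointer (e.g., trying to honor dispatching older blocks first). The selector array is used to find the first four ready blocks starting from the commit pointer. The aim is to dispatch the oldest ready blocks. In one embodiment, the selector array can be implemented by using an adder tree structure. This is described with Figure 7 below.
Figure 6 also shows how the selector array is coupled to each of the four ports that pass through the entries in the instruction view. In this embodiment, the port couplings act as port enables: enabling one of the four ports allows the instruction view entry to pass down to the dispatch port and on to the execution units. Additionally, as described above, dispatched blocks are broadcast back through the source view. The block numbers of the blocks selected for dispatch are broadcast back (up to four). This is shown on the far right-hand side of Figure 6.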
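Figure 7 shows an adder tree structure that is used to implement the selector array described in Figure 6 in accordance with one embodiment of the present invention. The depicted adder tree implements the functionality of the selector array. The adder tree picks the first four ready blocks and mounts them to the four available ports for dispatch (e.g., read port 1 through read port 4). No arbitration is used. The actual logic that is used to enable a specific port is explicitly shown in entry number 1. For the sake of clarity, the logic is not specifically shown in the other entries. In this manner, Figure 7 shows one specific embodiment of how the direct selection of each particular port for block dispatch is implemented. It should be noted, however, that an embodiment that uses priority encoders can alternatively be implemented.
Figure 8 shows exemplary logic of a selector array adder tree in greater detail. In the Figure 8 embodiment, logic is shown for a range exceed bit. The range exceed bit ensures that no more than four blocks are selected for dispatch: if a fifth block is ready and the first four are also ready, the range exceed bit does not allow the fifth block to be dispatched. It should be noted that in the serial implementation the sum bits S0 to S3 are used both to enable the dispatch port and for propagation to the next adder stage.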
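One way to picture the serial adder chain and the range exceed bit is as a running count of ready entries. The sketch below is an assumption-laden software analogue rather than the gate-level logic of Figures 7 and 8; `select_for_dispatch` and `NUM_PORTS` are hypothetical names, and the entries are assumed to be listed oldest first.

```python
NUM_PORTS = 4  # dispatch ports P1..P4


def select_for_dispatch(ready_bits):
    """Map each selected entry index to a port number (1..NUM_PORTS).

    ready_bits is a list of 0/1 values ordered from the oldest entry on.
    """
    assignments = {}
    running_sum = 0
    for index, ready in enumerate(ready_bits):
        running_sum += ready                      # one serial adder stage
        range_exceeded = running_sum > NUM_PORTS  # plays the role of the range exceed bit
        if ready and not range_exceeded:
            assignments[index] = running_sum      # the sum doubles as the port enable
    return assignments


picked = select_for_dispatch([0, 1, 1, 0, 1, 1, 1, 0])
```

In this example five entries are ready, but only the four oldest receive ports; the fifth (index 6) is held back because the running sum exceeds four.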
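Figure 9 shows a parallel implementation of the adder tree for implementing a selector array in accordance with one embodiment of the present invention. The parallel implementation does not forward the sum from each adder to the next. In the parallel implementation, each adder uses all of its necessary inputs directly by means of a multi-input addition implementation, such as multi-input carry save adder trees. For example, the adder "X" sums all of the previous inputs. This parallel implementation is desirable when faster compute times are needed (e.g., single cycle).
Figure 10 shows an exemplary diagram illustrating how adder X from Figure 9 can be implemented by using carry save adders in accordance with one embodiment of the present invention. Figure 10 shows a structure that can add 32 inputs in a single cycle. The structure is put together using 4-by-2 carry save adders.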
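Figure 11 shows a masking embodiment for masking ready bits for scheduling starting from the commit pointer and using the selector array adders in accordance with the present invention. In this implementation, the selector array adders are trying to select the first four ready blocks to dispatch, starting from the commit pointer and potentially wrapping around to the allocate pointer. In this implementation, multi-input parallel adders are used. Additionally, in this implementation the sources are organized as circular buffers.
Figure 11 shows how the ready bits are ANDed with each of the two masks (individually or separately) and applied to the two adder trees in parallel. The first four are selected by using the two adder trees and comparing against the threshold of four. The "X" marks denote "exclude from the selection array for that adder tree", thus the "X" value is zero. The "Y" marks, on the other hand, denote "do include in the selection array for that adder tree", thus the "Y" value is one.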
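The two-mask arrangement can be illustrated with the following sketch. The text does not spell out how the masks are laid out, so this assumes one mask covers the entries from the commit pointer to the end of the array and the other covers the wrapped-around entries before the commit pointer; `select_wrapped` is a hypothetical helper.

```python
def select_wrapped(ready_bits, commit_ptr, ports=4):
    """Pick up to `ports` ready entries, oldest first, starting at commit_ptr
    and wrapping around the end of the array."""
    n = len(ready_bits)
    # Mask A ("Y" from the commit pointer to the end of the array).
    upper = [r if i >= commit_ptr else 0 for i, r in enumerate(ready_bits)]
    # Mask B ("Y" for the wrapped-around entries before the commit pointer).
    lower = [r if i < commit_ptr else 0 for i, r in enumerate(ready_bits)]
    total_upper = sum(upper)

    selected = []
    for i in range(n):
        if upper[i]:
            rank = sum(upper[: i + 1])                # first adder tree
        elif lower[i]:
            rank = total_upper + sum(lower[: i + 1])  # second adder tree
        else:
            continue
        if rank <= ports:                             # compare against the threshold
            selected.append((rank, i))
    return [i for _, i in sorted(selected)]


wrapped_pick = select_wrapped([1, 1, 0, 1, 0, 1, 1, 0], commit_ptr=5)
```

With the commit pointer at entry 5, the oldest ready entries are 5 and 6, followed by the wrapped entries 0 and 1; entry 3 would be the fifth and is excluded.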
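Figure 12 shows an overview diagram of how register view entries are populated by register templates in accordance with one embodiment of the present invention.
As described above, register view entries are populated by register templates. The register view stores snapshots of the register templates for each block in sequence. When a speculation is not valid (e.g., a branch miss-prediction), the register view holds a latest valid snapshot taken before the invalid speculation point. The machine can roll back its state to the last valid snapshot by reading that register view entry and loading it into the base of the register template. Each entry of the register view shows all of the register inheritance state. For example, in the Figure 12 embodiment, if the register view for block F is invalid, the machine state can be rolled back to an earlier, last valid register template snapshot.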
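The snapshot-and-rollback mechanism can be sketched as below. This is a simplified model under assumptions that the text does not state: block numbers are taken to increase monotonically (real hardware would wrap them), each snapshot is taken after the block's own destinations are stamped in, and `RegisterView` with its methods is a hypothetical name.

```python
class RegisterView:
    """Hypothetical register view: one register template snapshot per block."""

    def __init__(self):
        self.snapshots = {}  # block number -> copy of the register template

    def record(self, block_no, template_fields):
        self.snapshots[block_no] = list(template_fields)

    def roll_back(self, last_valid_block):
        # Drop the snapshots of all younger (speculative) blocks...
        for block_no in [b for b in self.snapshots if b > last_valid_block]:
            del self.snapshots[block_no]
        # ...and return the template state to reload.
        return list(self.snapshots[last_valid_block])


view = RegisterView()
template = [None] * 4  # a tiny 4-register machine for illustration
for block_no, destinations in [(1, [0]), (2, [1]), (3, [0])]:
    for reg in destinations:
        template[reg] = block_no
    view.record(block_no, template)

# Suppose block 3 turns out to be on a miss-predicted path.
template = view.roll_back(last_valid_block=2)
```

After the rollback, R0 again names block 1 as its producer, exactly as it did before block 3 was allocated.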
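Figure 13 shows a first embodiment for a reduced register view footprint in accordance with one embodiment of the present invention. The amount of memory needed to store the register view entries can be reduced by storing only those register template snapshots that belong to blocks containing branch instructions. When an exception occurs (e.g., a speculation is not valid, a branch miss-prediction, etc.), the last valid snapshot can be rebuilt from the branch instruction that occurred prior to the exception. Instructions are fetched from the branch prior to the exception down to the exception in order to build the last valid snapshot. The instructions are fetched but not executed. As shown in Figure 13, only those snapshots that include branch instructions are saved in the reduced register view. This greatly reduces the amount of memory needed to store the register template snapshots.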
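Figure 14 shows a second embodiment for a reduced register footprint in accordance with one embodiment of the present invention. The amount of memory needed to store the register view entries can be reduced by storing only a sequential subset of the snapshots (e.g., one out of every four snapshots). The change between successive snapshots can be stored as a "delta" from an original snapshot, using a comparatively smaller amount of memory than full successive snapshots. When an exception occurs (e.g., a speculation is not valid, a branch miss-prediction, etc.), the last valid snapshot can be rebuilt from the original snapshot that occurred prior to the exception. The deltas from that original snapshot and the successive snapshots are used to rebuild the last valid snapshot: the initial original state accumulates deltas to arrive at the state of the required snapshot.
Figure 15 shows an exemplary format of the delta between snapshots in accordance with one embodiment of the present invention. Figure 15 shows an original snapshot and two deltas. In one delta, R5 and R6 are the only registers being updated by B3. The rest of the entries are not changed. In another delta, R1 and R7 are the only registers being updated by B2. The rest of the entries are not changed.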
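As a rough illustration of the delta scheme, a snapshot can be treated as a mapping from register name to block number, with each delta holding only the fields that changed. The function names and the dictionary representation are assumptions made for this sketch; the register and block names follow the Figure 15 example, while the starting value `B1` is invented.

```python
def make_delta(previous, current):
    """Keep only the fields that differ between two successive snapshots."""
    return {reg: blk for reg, blk in current.items() if previous[reg] != blk}


def apply_deltas(original, deltas):
    """Rebuild a later snapshot by accumulating deltas onto the original."""
    snapshot = dict(original)
    for delta in deltas:
        snapshot.update(delta)
    return snapshot


original = {f"R{i}": "B1" for i in range(8)}  # assumed starting state
delta_b2 = {"R1": "B2", "R7": "B2"}           # B2 updates only R1 and R7
delta_b3 = {"R5": "B3", "R6": "B3"}           # B3 updates only R5 and R6

rebuilt = apply_deltas(original, [delta_b2, delta_b3])
recomputed = make_delta(original, apply_deltas(original, [delta_b2]))
```

Each delta here stores two fields instead of a full snapshot, which is where the storage saving comes from.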
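Figure 16 shows a diagram of a process for creating register template snapshots upon allocation of instruction blocks in accordance with one embodiment of the present invention. In this embodiment, the left-hand side of Figure 16 shows two de-multiplexers, and at the top of Figure 16 is a snapshot register template. Figure 16 shows a diagram for creating a subsequent register template from a previous register template (e.g., a serial implementation). This serial implementation shows how register template snapshots are created upon allocation of instruction blocks. Those snapshots serve to capture the latest register architectural state updates, which are used for dependency tracking (e.g., as described in Figures 1 through 4) as well as for updating the register view for handling miss-predictions/exceptions (e.g., as described in Figures 12 through 15).
The de-multiplexer functions by selecting which incoming source is passed on. For example, register R2 will de-multiplex to a 1 at the second output, while R8 will de-multiplex to a 1 at the seventh output, and so on.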
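Figure 17 shows another diagram of a process for creating register template snapshots upon allocation of instruction blocks in accordance with one embodiment of the present invention. The Figure 17 embodiment also shows the creation of a subsequent register template from a previous register template, as well as an example of register template block inheritance. This figure shows an example of how the register template is updated from allocated block numbers. For example, block Bf updates R2, R8, and R10. Bg updates R1 and R9. The dotted arrows indicate that the values are inherited from the prior snapshot. This process proceeds all the way down to block Bi. Thus, for example, since no snapshot updated register R7, its original value Bb will have propagated all the way down.
Figure 18 shows an overview diagram of hardware for implementing the serial implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention. The de-multiplexer is used to control a series of two-input multiplexers, each of which decides which of two block numbers is propagated down to the next stage: either the block number from the previous stage or the current block number.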
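The serial de-multiplexer/multiplexer stage can be approximated in software as follows. This is only a behavioural sketch: `demux_one_hot` and `next_template` are hypothetical helpers, and the eleven-register template pre-filled with `Bb` is an assumption chosen so the Figure 17 example (Bf updating R2, R8, R10 and Bg updating R1, R9) fits.

```python
def demux_one_hot(reg_index, width):
    """De-multiplexer: raise exactly one select line for a destination register."""
    return [1 if i == reg_index else 0 for i in range(width)]


def next_template(previous, block_no, destinations):
    """One serial stage: build the next register template from the previous one."""
    width = len(previous)
    select = [0] * width
    for dest in destinations:
        select = [s | h for s, h in zip(select, demux_one_hot(dest, width))]
    # Per-register two-input mux: take the current block number when selected,
    # otherwise inherit the block number from the previous stage.
    return [block_no if s else prev for s, prev in zip(select, previous)]


t0 = ["Bb"] * 11                            # assumed initial template, R0..R10
t1 = next_template(t0, "Bf", [2, 8, 10])    # Bf updates R2, R8, R10
t2 = next_template(t1, "Bg", [1, 9])        # Bg updates R1, R9
```

Because no stage selects R7, its original value `Bb` is inherited through every stage, matching the behaviour described for Figure 17.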
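Figure 19 shows an overview diagram of hardware for implementing a parallel implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention. This parallel implementation uses special encoded multiplexer controls to create a subsequent register template from a previous register template.
Figure 20 shows an overview diagram of the hardware for instruction block-based execution and how it works with the source view, the instruction view, the register templates, and the register view in accordance with one embodiment of the present invention.
In this implementation, the allocator scheduler in the dispatcher receives instructions fetched by the machine's front end. These instructions go through block formation in the manner described earlier. As described earlier, the blocks yield register templates, and these register templates are used to populate the register view. From the source view, the sources are transferred to the register file hierarchy, and there are broadcasts back to the source view in the manner described above. The instruction view transfers instructions to the execution units. The instructions are executed by the execution units as the sources needed by the instructions arrive from the register file hierarchy. These executed instructions are then transferred out of the execution units and back into the register file hierarchy.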
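Figure 21 shows an example of a chunking architecture in accordance with one embodiment of the present invention. The importance of chunking is that it reduces the number of write ports into each scheduler entry from four to one by using the four multiplexers shown, while still densely packing all the entries without forming bubbles.
The importance of chunking can be seen in the following example (noting that allocation of blocks in each cycle starts at the top position, in this case B0). Assume that in cycle 1, three instruction blocks are to be allocated to the scheduler entries (e.g., the three blocks will occupy the first three entries in the scheduler). In the next cycle (e.g., cycle 2), another two instruction blocks are to be allocated. In order to avoid creating bubbles in the scheduler array entries, the scheduler array entries would have to be built with support for four write ports. This is expensive in terms of power consumption, timing, area, and the like. The chunking structure above simplifies all scheduler arrays to have only one write port by using the multiplexing structure before allocating to the arrays. In the above example, B0 in cycle 2 will be selected by the last multiplexer, while B1 in cycle 2 will be selected by the first multiplexer (e.g., going from left to right).
In this manner, each chunk of entries needs only one write port per entry and four read ports per entry. There is a trade-off in cost because the multiplexers must be implemented; however, since there can be very many entries, that cost is recovered many times over by the savings from not having to implement four write ports per entry.
Figure 21 also shows an intermediate allocation buffer. If the scheduler arrays cannot accept all the chunks sent to them, the chunks can be stored temporarily in the intermediate allocation buffer. When the scheduler arrays have free space, the chunks are transferred from the intermediate allocation buffer to the scheduler arrays.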
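The rotation performed by the four multiplexers can be illustrated with a small sketch. It assumes the scheduler is organised as four columns (one per multiplexer, each with a single write port) and that a write pointer tracks the next free slot; `place_chunk` and `WIDTH` are hypothetical names.

```python
WIDTH = 4  # up to four blocks allocated per cycle, one scheduler column per multiplexer


def place_chunk(write_ptr, blocks):
    """Steer this cycle's blocks so that each column sees at most one write.

    Returns ({column: (row, block)}, new_write_ptr).
    """
    assert len(blocks) <= WIDTH
    writes = {}
    for offset, block in enumerate(blocks):
        position = write_ptr + offset
        column = position % WIDTH  # which multiplexer/column receives this block
        writes[column] = (position // WIDTH, block)
    return writes, write_ptr + len(blocks)


# Cycle 1 allocates three blocks, cycle 2 allocates two more.
cycle1, ptr = place_chunk(0, ["c1.B0", "c1.B1", "c1.B2"])
cycle2, ptr = place_chunk(ptr, ["c2.B0", "c2.B1"])
```

In cycle 2, B0 is steered to the last column and B1 wraps to the first column of the next row, so the entries stay densely packed while no column ever receives more than one write per cycle.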
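Figure 22 shows a depiction of how threads are allocated in accordance with their block numbers and thread ID in accordance with one embodiment of the present invention. Blocks are allocated to the scheduler array via a chunking implementation as described above. The blocks of each thread maintain a sequential order among themselves using the block number. The blocks from different threads can be interleaved (e.g., blocks for thread Th1 and blocks for thread Th2 are interleaved in the scheduler array). In this manner, blocks from different threads are present within the scheduler array.
Figure 23 shows an implementation of a scheduler using thread pointer maps that point to physical storage locations in order to manage multithreaded execution in accordance with one embodiment of the present invention. In this embodiment, management of the threads is implemented through the control of the thread maps. For example, Figure 23 shows a thread 1 map and a thread 2 map. The maps track the location of the blocks of the individual threads. The entries in a map are allocated to the blocks belonging to that thread. In this implementation, each thread has an allocation counter that counts for both threads. The overall count cannot exceed N divided by 2 (e.g., exceeding the space available). The allocation counters have adjustable thresholds in order to implement fairness in the allocation of the total entries from the pool. The allocation counters can prevent one thread from using all of the available space.
Figure 24 shows another implementation of a scheduler using thread-based pointer maps in accordance with one embodiment of the present invention. Figure 24 shows the relationship between the commit pointer and the allocate pointer. As shown, each thread has a commit pointer and an allocate pointer. The arrow shows how the allocate pointer for thread 2 can wrap around the physical storage when allocating blocks B1 and B2, but it cannot allocate block B9 until the commit pointer for thread 2 moves down. This is shown by the position of the commit pointer of thread 2 and the strikethrough. The right-hand side of Figure 24 shows the relationship between the allocation of blocks and the commit pointer as it moves around counterclockwise.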
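Figure 25 shows a diagram of a dynamic calendar-based allocation of execution resources to threads in accordance with one embodiment of the present invention. Fairness can be dynamically controlled using the allocation counters based on the forward progress of each thread. If both threads are making substantial forward progress, then both allocation counters are set to the same threshold (e.g., 9). However, if one thread makes slow forward progress, such as when suffering from an L2 cache miss or similar events, then the ratio of the threshold counters can be adjusted in favor of the thread that is still making substantial forward progress. If one thread is stalled or suspended (e.g., is in a wait or spin state waiting on an operating system (OS) or input/output (IO) response), the ratio can be adjusted completely to the other thread, with the exception of a single return entry that is reserved for the suspended thread to signal the release of the wait state.
In one embodiment, the process starts off with a ratio of 50%:50%. Upon detection of the L2 cache miss on block 22, the front end of the pipeline stalls any further fetch into the pipeline or allocation into the scheduler of thread 2 blocks. Upon retirement of thread 2 blocks from the scheduler, those entries are made available for thread 1 allocation until the new dynamic ratio of thread allocation is achieved. For example, 3 of the recently retired thread 2 blocks are returned to the pool for allocation to thread 1 instead of thread 2, making the thread 1 to thread 2 ratio 75%:25%.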
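The threshold adjustment can be illustrated numerically. The pool size of 12 entries and the `rebalance` helper are assumptions chosen so that moving 3 retired entries reproduces the 75%:25% ratio mentioned above; keeping one entry in reserve reflects the single return entry described for a suspended thread.

```python
def rebalance(thresholds, stalled, other, retired_entries):
    """Move retired entries of a stalled thread to the thread making progress.

    One entry always stays reserved for the stalled thread so that it can
    signal the release of its wait state.
    """
    moved = min(retired_entries, thresholds[stalled] - 1)
    updated = dict(thresholds)
    updated[stalled] -= moved
    updated[other] += moved
    return updated


thresholds = {"T1": 6, "T2": 6}  # assumed 12-entry pool, 50%:50% split
after = rebalance(thresholds, stalled="T2", other="T1", retired_entries=3)
ratio_t1 = after["T1"] / (after["T1"] + after["T2"])

# Even if every thread 2 entry retires, one entry remains reserved for it.
fully_shifted = rebalance(thresholds, stalled="T2", other="T1", retired_entries=10)
```

The second call shows the limiting case in which the ratio is adjusted almost completely toward thread 1.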
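It should be noted that a stall of thread 2 blocks at the front of the pipeline might require flushing those blocks from the front of the pipeline if there is no hardware mechanism to bypass them (e.g., by thread 1 blocks passing the stalled thread 2 blocks).
Figure 26 diagrams a dual dispatch process in accordance with one embodiment of the present invention. Multi-dispatch generally encompasses dispatching a block (having multiple instructions within it) multiple times, such that different instructions of the block can execute on each pass through the execution units. One example would be a dispatch of an address calculation instruction followed by a subsequent dispatch that consumes the resulting data. Another example would be a floating point operation, where the first part is executed as a fixed point operation and the second part is executed to complete the operation by performing rounding, flag generation/calculation, exponent adjustment, or the like. Blocks are allocated, committed, and retired atomically as a single entity.
A main benefit of multi-dispatch is that it avoids allocating multiple separate blocks into the machine window, thereby making the machine window effectively larger. A larger machine window means more opportunities for optimization and reordering.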
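Looking at the bottom left of Figure 26, an instruction block is depicted. This block cannot be dispatched in a single cycle because there is latency between the load address calculation and the load returning data from the caches/memory. So this block is first dispatched with its intermediate result being held as transient state (its result is delivered on the fly to the second dispatch without being visible to the architectural state). The first dispatch sends the two components 1 and 2 that are used in the address calculation and the dispatch of the LA. The second dispatch sends components 3 and 4, which are the execution parts of the load data upon the load returning data from the caches/memory.
Looking at the bottom right of Figure 26, a floating point multiply-accumulate operation is depicted. In the case where the hardware does not have sufficient incoming source bandwidth to dispatch the operation in a single phase, dual dispatch is used, as the multiply-accumulate figure shows. The first dispatch is a fixed point multiply as shown. The second dispatch is a floating point addition and rounding as shown. When both of these dispatched instructions execute, they effectively perform the floating point multiply/accumulate.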
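Figure 27 diagrams a dual dispatch transient multiply-accumulate in accordance with one embodiment of the present invention. As shown in Figure 27, the first dispatch is an integer 32-bit multiply, and the second dispatch is an integer accumulate add. The state communicated between the first dispatch and the second dispatch (the result of the multiply) is transient and not architecturally visible. The transient storage in one implementation can hold the results of more than one multiplier and can tag them to identify the corresponding multiply-accumulate pair, thereby allowing multiple multiply-accumulate pairs to be intermixed and dispatched in an arbitrary fashion (e.g., interleaved, etc.).
Note that other instructions can use this same hardware for their implementation (e.g., floating point, etc.).
Figure 28 diagrams a dual dispatch architecturally visible state multiply-add in accordance with one embodiment of the present invention. The first dispatch is a single precision multiply, and the second dispatch is a single precision add. In this implementation, the state information communicated between the first dispatch and the second dispatch (e.g., the result of the multiply) is architecturally visible, since this storage is an architecture state register.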
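Figure 29 shows an overview diagram of the fetching and formation of instruction blocks for execution on grouped execution units in accordance with one embodiment of the present invention. Embodiments of the present invention utilize a process whereby instructions are fetched and formed into blocks by the hardware or by a dynamic converter/JIT. The instructions in the blocks are organized such that the result of an earlier instruction in the block feeds a source of a subsequent instruction in the block. This is shown by the dotted arrows in the instruction block. This property enables the block to execute efficiently on the stacked execution units of the execution block. Instructions can also be grouped even if they can execute in parallel, such as when they share the same source (not shown explicitly in this figure).
One alternative to forming the blocks in hardware is to form them in software (statically or at runtime), where instruction pairs, triplets, quads, etc., are formed.
Other examples of instruction grouping functionality can be found in US patent 8,327,115.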
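Figure 30 shows an exemplary diagram of instruction grouping in accordance with one embodiment of the present invention. In the Figure 30 embodiment, two instructions are shown with a third auxiliary operation. The left-hand side of the Figure 30 instruction block comprises an upper half block/1 slot and a lower half block/1 slot. The vertical arrows coming down from the top indicate sources coming into the block, while the vertical arrows going down from the bottom indicate destinations going back to memory. Proceeding from the left-hand side of Figure 30 towards the right-hand side, the different possible instruction combinations are illustrated. In this implementation, each half block can receive three sources and can pass on two destinations. OP1 and OP2 are normal operations. Auxiliary OPs are auxiliary operations such as a logical, a shift, a move, a sign extend, a branch, etc. The benefit of dividing the block into two halves is that each half can be dispatched independently on its own, or the two halves can be dispatched together dynamically as one block (either for port utilization or because of resource constraints), based on dependency resolution. This gives better utilization of execution time, while having the two halves correspond to one block allows the machine to abstract the complexity of two half blocks so that they are managed like one block (i.e., at allocation and retirement).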
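Figure 31 shows how half block pairs within a block stack map onto the execution block units in accordance with one embodiment of the present invention. As shown in the execution block, each execution block has two slots, slot 1 and slot 2. The objective is to map the block onto the execution units such that the first half block executes on slot 1 and the second half block executes on slot 2. The objective is to allow the two half blocks to dispatch independently if the instruction group of each half block does not depend on the other half. The paired arrows coming into the execution block from the top are two 32-bit words of a source. The paired arrows leaving the execution block going down are two 32-bit words of a destination. Going from left to right in Figure 31, different exemplary combinations of instructions are shown that are capable of being stacked onto the execution block units.
The top of Figure 31 summarizes how the pairs of half blocks execute in a full block context or in any half block context. Each of the execution blocks has two slots/half blocks, and each of the half blocks/execution slots executes either a single, a paired, or a triplet group of operations. There are four block execution types. The first is parallel halves (which allows each half block to execute independently once its own sources are ready, although the two half blocks can still execute as one block on one execution unit if both halves are ready at the same time). The second is atomic parallel halves (which refers to half blocks that could execute in parallel because there is no dependency between the two halves, but that are forced to execute together as one block because resource sharing between the two halves makes it preferable or necessary for them to execute together atomically within the constraints of the resources available in each execution block). The third type is atomic serial halves (which requires the first half to forward data to the second half through transient forwarding, with or without internal storage). The fourth type is sequential halves (as in dual dispatch), where the second half depends on the first half and is dispatched on a later cycle than the first, and data is forwarded through external storage that is tracked for dependency resolution, similar to the dual dispatch case.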
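Figure 32 shows a diagram depicting intermediate block results storage as a first level register file in accordance with one embodiment of the present invention. Each group of registers represents an instruction block (representing two half blocks), in which both 32-bit results and 64-bit results can be supported by using two 32-bit registers to support one 64-bit register. The storage per block assumes a virtual block storage, which means that two half blocks from different blocks can write into the same virtual block storage. The combined result storage of two half blocks makes up one virtual block storage.
Figure 33 shows an odd/even ports scheduler in accordance with one embodiment of the present invention. In this implementation, the result storage is asymmetrical. Some of the result storage is three 64-bit result registers per half block, while other result storage is one 64-bit result register per half block; however, alternative implementations can use symmetrical storage per half block and could additionally employ a 64-bit and 32-bit partition as described in Figure 32. In these embodiments, storage is assigned per half block, as opposed to per block. This implementation reduces the number of ports needed for dispatch by using them as odd or even.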
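Figure 34 shows a more detailed version of Figure 33, in which four execution units are shown receiving results from the scheduler array and writing outputs to a temporary register file segment. The ports are attached at even and odd intervals. The left side of the scheduling array shows block numbers and the right side shows half block numbers.
Each core has even and odd ports into the scheduling array, where each port is connected to an odd or even half block position. In one implementation, the even ports and their corresponding half blocks can reside in a different core than the odd ports and their corresponding half blocks. In another implementation, the odd and even ports are distributed across multiple different cores as shown in this figure. As described in US patent application number 13/428,440, which is incorporated herein by reference in its entirety, the cores can be physical cores or virtual cores.
In certain types of blocks, one half of a block can be dispatched independently from the other half of the block. In other types of blocks, both halves of a block need to be dispatched simultaneously to the same execution block unit. In still other types of blocks, the two halves of a block need to be dispatched sequentially (the second half after the first half).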
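Figure 35 shows a diagram depicting guest flag architecture emulation in accordance with one embodiment of the present invention. The left-hand side of Figure 35 shows a centralized flag register having five flags. The right-hand side of Figure 35 shows a distributed flag architecture having distributed flag registers, wherein the flags are distributed amongst the registers themselves.
During architecture emulation, it is necessary for the distributed flag architecture to emulate the behavior of the centralized guest flag architecture. A distributed flag architecture can also be implemented by using multiple independent flag registers as opposed to a flag field associated with a data register. For example, data registers can be implemented as R0 to R15 while independent flag registers can be implemented as F0 to F3. Those flag registers are in this case not directly associated with the data registers.
Figure 36 shows a diagram illustrating the front end of the machine, the scheduler and the execution units, and a centralized flag register in accordance with one embodiment of the present invention. In this implementation, the front end categorizes incoming instructions based on the manner in which they update guest instruction flags. In one embodiment, the guest instructions are categorized into four native instruction types: T1, T2, T3, and T4. T1-T4 are instruction types that indicate which flag fields each guest instruction type updates. Guest instruction types update different guest instruction flags based on their type. For example, logical guest instructions update T1 type flags.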
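Figure 37 shows a diagram of a centralized flag register emulation process as implemented by embodiments of the present invention. The actors in Figure 37 comprise a latest update type table, a renaming table extension, physical registers, and distributed flag registers. Figure 37 is now described by the flowchart of Figure 38.
Figure 38 shows a flowchart of the steps of a process 3800 of emulating centralized flag register behavior in a guest setting.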
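In step 3801, the front end/dynamic converter (hardware or software) categorizes incoming instructions based on the manner in which they update guest instruction flags. In one embodiment, the guest instructions are categorized into four flag architectural types: T1, T2, T3, and T4. T1-T4 are instruction types that indicate which flag fields each guest instruction type updates. Guest instruction types update different guest flags based on their type. For example, logical guest instructions update T1 type flags, shift guest instructions update T2 type flags, arithmetic guest instructions update T3 type flags, and special guest instructions update T4 type flags. It should be noted that guest instructions can be an architectural instruction representation, while native instructions can be what the machine internally executes (e.g., microcode). Alternatively, guest instructions can be instructions from an emulated architecture (e.g., x86, Java, ARM code, etc.).
In step 3802, the order in which those instruction types update their respective guest flags is recorded in a latest update type table data structure. In one embodiment, this action is performed by the front end of the machine.
In step 3803, when those instruction types reach the scheduler (the in-order part of the allocation/renaming stage), the scheduler assigns an implicit physical destination that corresponds to the architectural type and records that assignment in a renaming/mapping table data structure.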
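In step 3804, when a subsequent guest instruction reaches the allocation/renaming stage in the scheduler and that instruction wants to read guest flag fields, (a) the machine determines which flag architectural types need to be accessed to perform the read; (b) if all needed flags are found in the same latest update flag type (e.g., as determined by the latest update type table), then the corresponding physical register (e.g., the one that maps to that latest flag type) is read to obtain the needed flags; (c) if the needed flags cannot all be found in the same latest update flag type, then each flag needs to be read from the corresponding physical register that maps to its individual latest update flag type.
In step 3805, each flag is read individually from the physical register that holds its most recently updated value, as tracked by the latest update flag type table.
It should be noted that if a latest update type is inclusive of another type, then all subset types have to map to the same physical register as the superset type.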
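Steps 3802 through 3805 can be sketched as two small tables. Everything in this model is hypothetical: the `FlagTracker` class, the physical register names `P7` and `P9`, and in particular the flag sets assigned to T1 and T3, which are assumptions made for illustration. The superset rule stated above is not modelled.

```python
# Assumed flag sets per type (illustrative only; not taken from the text).
FLAGS_BY_TYPE = {
    "T1": {"SF", "ZF"},              # logical guest instructions
    "T3": {"CF", "OF", "SF", "ZF"},  # arithmetic guest instructions
}


class FlagTracker:
    def __init__(self):
        self.latest_type = {}  # latest update type table: flag -> type
        self.rename = {}       # renaming/mapping table: type -> physical register

    def on_allocate(self, flag_type, physical_register):
        # Steps 3802/3803: record the update order and the implicit destination.
        self.rename[flag_type] = physical_register
        for flag in FLAGS_BY_TYPE[flag_type]:
            self.latest_type[flag] = flag_type

    def registers_to_read(self, needed_flags):
        # Steps 3804/3805: find the physical register holding each flag's latest value.
        return {flag: self.rename[self.latest_type[flag]] for flag in needed_flags}


tracker = FlagTracker()
tracker.on_allocate("T3", "P7")  # an arithmetic instruction, then...
tracker.on_allocate("T1", "P9")  # ...a younger logical instruction

same_register = tracker.registers_to_read({"SF", "ZF"})
split_read = tracker.registers_to_read({"CF", "ZF"})
```

A reader of SF and ZF finds both in the register of the latest logical instruction (case (b) of step 3804), whereas a reader of CF and ZF must fetch each flag from a different physical register (case (c)).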
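At retirement, the destination flag fields are merged with a cloned centralized/guest flag architecture register. It should be noted that the cloning is performed because the native architecture utilizes a distributed flag architecture as opposed to a single-register centralized flag architecture.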
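Examples of instructions that update certain flag types:
CF, OF, SF, ZR - arithmetic instructions and load/write flags instructions
SF, ZF, and conditional CF - logicals and shifts
SF, ZF - moves/loads, EXTR, some multiplies
ZF - POPCNT and STREX[P]
GE - SIMD instructions???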
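Examples of conditions/predications that read certain flags:
0000 EQ Equal Z=1
0001 NE Not equal, or unordered Z=0
0010 CS Carry set, greater than or equal, or unordered C=1
0011 CC Carry clear, less than C=0
0100 MI Minus, negative, less than N=1
0101 PL Plus, positive or zero, greater than or equal, or unordered N=0
0110 VS Overflow, unordered V=1
0111 VC No overflow, not unordered V=0
1000 HI Unsigned higher, greater than, or unordered C=1 and Z=0
1001 LS Unsigned lower or same, less than or equal C=0 or Z=1
1010 GE Signed greater than or equal, greater than or equal N=V
1011 LT Signed less than, less than, or unordered N!=V
1100 GT Signed greater than, greater than Z=0 and N=V
1101 LE Signed less than or equal, less than or equal, or unordered Z=1 or N!=V
1110 None (AL), always (unconditional), any flag set to any value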
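The condition list above can be transcribed directly into a lookup table. The sketch below is illustrative: the dictionary layout and the `condition_holds` helper are assumptions, while the encodings, mnemonics, and flag tests are copied from the list.

```python
# Condition code -> (mnemonic, predicate over the N, Z, C, V flags).
CONDITIONS = {
    0b0000: ("EQ", lambda f: f["Z"] == 1),
    0b0001: ("NE", lambda f: f["Z"] == 0),
    0b0010: ("CS", lambda f: f["C"] == 1),
    0b0011: ("CC", lambda f: f["C"] == 0),
    0b0100: ("MI", lambda f: f["N"] == 1),
    0b0101: ("PL", lambda f: f["N"] == 0),
    0b0110: ("VS", lambda f: f["V"] == 1),
    0b0111: ("VC", lambda f: f["V"] == 0),
    0b1000: ("HI", lambda f: f["C"] == 1 and f["Z"] == 0),
    0b1001: ("LS", lambda f: f["C"] == 0 or f["Z"] == 1),
    0b1010: ("GE", lambda f: f["N"] == f["V"]),
    0b1011: ("LT", lambda f: f["N"] != f["V"]),
    0b1100: ("GT", lambda f: f["Z"] == 0 and f["N"] == f["V"]),
    0b1101: ("LE", lambda f: f["Z"] == 1 or f["N"] != f["V"]),
    0b1110: ("AL", lambda f: True),
}


def condition_holds(code, flags):
    """Evaluate a 4-bit condition code against a dict of flag values."""
    _mnemonic, predicate = CONDITIONS[code]
    return predicate(flags)


flags = {"N": 1, "Z": 0, "C": 1, "V": 0}
signed_less_than = condition_holds(0b1011, flags)
unsigned_higher = condition_holds(0b1000, flags)
```

Each predicate reads only the flags named in its row, which is the information step 3804 needs in order to decide which flag types, and therefore which physical registers, a conditional instruction must access.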
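The foregoing description, for the purpose of explanation, has been given with reference to specific embodiments. However, the illustrated discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. Embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and its various embodiments with various modifications as may be suited to the particular use contemplated.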
Claims (19)
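- A method for populating a register view data structure by using register template snapshots, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.
- The method of claim 1, wherein the register view data structure comprises a scheduler architecture.
- The method of claim 1, wherein information about registers referred to by the blocks is stored in the register view data structure.
- The method of claim 1, wherein information about sources referred to by the blocks is stored in a source view data structure.
- The method of claim 1, wherein information about instructions referred to by the blocks is stored in an instruction view data structure.
- The method of claim 1, wherein the register templates comprise inheritance vectors, which further comprise data structures storing dependency and inheritance information referred to by the blocks.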
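- A non-transitory computer readable medium having computer readable code which, when executed by a computer system, causes the computer system to implement a method for populating a register view data structure by using register template snapshots, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.
- The computer readable medium of claim 7, wherein the register view data structure comprises a scheduler architecture.
- The computer readable medium of claim 7, wherein information about registers referred to by the blocks is stored in the register view data structure.
- The computer readable medium of claim 7, wherein information about sources referred to by the blocks is stored in a source view data structure.
- The computer readable medium of claim 7, wherein information about instructions referred to by the blocks is stored in an instruction view data structure.
- The computer readable medium of claim 7, wherein the register templates comprise inheritance vectors, which further comprise data structures storing dependency and inheritance information referred to by the blocks.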
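- A computer system having a processor coupled to a memory, the memory having computer readable code which, when executed by the computer system, causes the computer system to implement a method for populating a register view data structure by using register template snapshots, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.
- The computer system of claim 13, wherein the register view data structure comprises a scheduler architecture.
- The computer system of claim 13, wherein information about registers referred to by the blocks is stored in the register view data structure.
- The computer system of claim 13, wherein information about sources referred to by the blocks is stored in a source view data structure.
- The computer system of claim 13, wherein information about instructions referred to by the blocks is stored in an instruction view data structure.
- The computer system of claim 13, wherein the register templates comprise inheritance vectors, which further comprise data structures storing dependency and inheritance information referred to by the blocks.
- A method for populating a source view data structure by using register template snapshots, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks; populating a source view data structure, wherein the source view data structure stores sources corresponding to the instruction blocks as recorded by the plurality of register templates; and determining which of the plurality of instruction blocks are ready for dispatch by using the populated source view data structure.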
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361799006P | 2013-03-15 | 2013-03-15 |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201504940A true TW201504940A (zh) | 2015-02-01 |
TWI522909B TWI522909B (zh) | 2016-02-21 |
Family
ID=51533995
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW103109509A TWI522909B (zh) | 2013-03-15 | 2014-03-14 | Method for populating a register view data structure by using register template snapshots |
Country Status (3)
Country | Link |
---|---|
US (2) | US9575762B2 (zh) |
TW (1) | TWI522909B (zh) |
WO (1) | WO2014150806A1 (zh) |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2477109B1 (en) | 2006-04-12 | 2016-07-13 | Soft Machines, Inc. | Apparatus and method for processing an instruction matrix specifying parallel and dependent operations |
EP2523101B1 (en) | 2006-11-14 | 2014-06-04 | Soft Machines, Inc. | Apparatus and method for processing complex instruction formats in a multi- threaded architecture supporting various context switch modes and virtualization schemes |
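KR101685247B1 (ko) | 2010-09-17 | 2016-12-09 | Soft Machines, Inc. | Single cycle multi-branch prediction including shadow cache for early far branch prediction |
KR101620676B1 (ko) | 2011-03-25 | 2016-05-23 | Soft Machines, Inc. | Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines |
TWI533129B (zh) | 2011-03-25 | 2016-05-11 | Soft Machines, Inc. | Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines |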
US9274793B2 (en) | 2011-03-25 | 2016-03-01 | Soft Machines, Inc. | Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines |
KR101639854B1 (ko) | 2011-05-20 | 2016-07-14 | Soft Machines, Inc. | An interconnect structure to support the execution of instruction sequences by a plurality of engines |
EP2710481B1 (en) | 2011-05-20 | 2021-02-17 | Intel Corporation | Decentralized allocation of resources and interconnect structures to support the execution of instruction sequences by a plurality of engines |
KR101842550B1 (ko) | 2011-11-22 | 2018-03-28 | Soft Machines, Inc. | An accelerated code optimizer for a multiengine microprocessor |
EP2783281B1 (en) | 2011-11-22 | 2020-05-13 | Intel Corporation | A microprocessor accelerated code optimizer |
US9904625B2 (en) | 2013-03-15 | 2018-02-27 | Intel Corporation | Methods, systems and apparatus for predicting the way of a set associative cache |
EP2972845B1 (en) | 2013-03-15 | 2021-07-07 | Intel Corporation | A method for executing multithreaded instructions grouped onto blocks |
US10140138B2 (en) | 2013-03-15 | 2018-11-27 | Intel Corporation | Methods, systems and apparatus for supporting wide and efficient front-end operation with guest-architecture emulation |
US9811342B2 (en) | 2013-03-15 | 2017-11-07 | Intel Corporation | Method for performing dual dispatch of blocks and half blocks |
WO2014150806A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for populating register view data structure by using register template snapshots |
WO2014150971A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for dependency broadcasting through a block organized source view data structure |
KR102083390B1 (ko) | 2013-03-15 | 2020-03-02 | Intel Corporation | A method for emulating a guest centralized flag architecture by using a native distributed flag architecture |
US9891924B2 (en) | 2013-03-15 | 2018-02-13 | Intel Corporation | Method for implementing a reduced size register view data structure in a microprocessor |
US9569216B2 (en) | 2013-03-15 | 2017-02-14 | Soft Machines, Inc. | Method for populating a source view data structure by using register template snapshots |
WO2014150991A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for implementing a reduced size register view data structure in a microprocessor |
US9632825B2 (en) | 2013-03-15 | 2017-04-25 | Intel Corporation | Method and apparatus for efficient scheduling for asymmetrical execution units |
US9886279B2 (en) * | 2013-03-15 | 2018-02-06 | Intel Corporation | Method for populating and instruction view data structure by using register template snapshots |
US10275255B2 (en) | 2013-03-15 | 2019-04-30 | Intel Corporation | Method for dependency broadcasting through a source organized source view data structure |
US11977891B2 (en) | 2015-09-19 | 2024-05-07 | Microsoft Technology Licensing, Llc | Implicit program order |
US11681531B2 (en) | 2015-09-19 | 2023-06-20 | Microsoft Technology Licensing, Llc | Generation and use of memory access instruction order encodings |
Family Cites Families (478)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US727487A (en) | 1902-10-21 | 1903-05-05 | Swan F Swanson | Dumping-car. |
US3046775A (en) | 1957-09-25 | 1962-07-31 | American Cyanamid Co | Apparatus for treating reticulate material |
US3044951A (en) | 1958-06-05 | 1962-07-17 | Texaco Inc | Hydrocarbon conversion process |
US4075704A (en) | 1976-07-02 | 1978-02-21 | Floating Point Systems, Inc. | Floating point data processor for high speed operation |
US4228496A (en) | 1976-09-07 | 1980-10-14 | Tandem Computers Incorporated | Multiprocessor system |
US4245344A (en) | 1979-04-02 | 1981-01-13 | Rockwell International Corporation | Processing system with dual buses |
US4527237A (en) | 1979-10-11 | 1985-07-02 | Nanodata Computer Corporation | Data processing system |
US4414624A (en) | 1980-11-19 | 1983-11-08 | The United States Of America As Represented By The Secretary Of The Navy | Multiple-microcomputer processing |
US4524415A (en) | 1982-12-07 | 1985-06-18 | Motorola, Inc. | Virtual machine data processor |
US4597061B1 (en) | 1983-01-03 | 1998-06-09 | Texas Instruments Inc | Memory system using pipeline circuitry for improved system |
US4577273A (en) | 1983-06-06 | 1986-03-18 | Sperry Corporation | Multiple microcomputer system for digital computers |
US4682281A (en) | 1983-08-30 | 1987-07-21 | Amdahl Corporation | Data storage unit employing translation lookaside buffer pointer |
US4600986A (en) | 1984-04-02 | 1986-07-15 | Sperry Corporation | Pipelined split stack with high performance interleaved decode |
US4633434A (en) | 1984-04-02 | 1986-12-30 | Sperry Corporation | High performance storage unit |
JPS6140643A (ja) | 1984-07-31 | 1986-02-26 | Hitachi Ltd | Resource allocation control method for a system |
US4835680A (en) | 1985-03-15 | 1989-05-30 | Xerox Corporation | Adaptive processor array capable of learning variable associations useful in recognizing classes of inputs |
JPS6289149A (ja) | 1985-10-15 | 1987-04-23 | Agency Of Ind Science & Technol | Multi-port memory system |
JPH0658650B2 (ja) | 1986-03-14 | 1994-08-03 | Hitachi, Ltd. | Virtual computer system |
US4920477A (en) | 1987-04-20 | 1990-04-24 | Multiflow Computer, Inc. | Virtual address table look aside buffer miss recovery method and apparatus |
US4943909A (en) | 1987-07-08 | 1990-07-24 | At&T Bell Laboratories | Computational origami |
US5339398A (en) | 1989-07-31 | 1994-08-16 | North American Philips Corporation | Memory architecture and method of data organization optimized for hashing |
US5471593A (en) | 1989-12-11 | 1995-11-28 | Branigin; Michael H. | Computer processor with an efficient means of executing many instructions simultaneously |
US5197130A (en) | 1989-12-29 | 1993-03-23 | Supercomputer Systems Limited Partnership | Cluster architecture for a highly parallel scalar/vector multiprocessor system |
US5317754A (en) | 1990-10-23 | 1994-05-31 | International Business Machines Corporation | Method and apparatus for enabling an interpretive execution subset |
US5317705A (en) | 1990-10-24 | 1994-05-31 | International Business Machines Corporation | Apparatus and method for TLB purge reduction in a multi-level machine system |
US6282583B1 (en) | 1991-06-04 | 2001-08-28 | Silicon Graphics, Inc. | Method and apparatus for memory access in a matrix processor computer |
US5539911A (en) | 1991-07-08 | 1996-07-23 | Seiko Epson Corporation | High-performance, superscalar-based computer system with out-of-order instruction execution |
JPH0820949B2 (ja) * | 1991-11-26 | 1996-03-04 | Matsushita Electric Industrial Co., Ltd. | Information processing device |
GB2277181B (en) | 1991-12-23 | 1995-12-13 | Intel Corp | Interleaved cache for multiple accesses per clock in a microprocessor |
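KR100309566B1 (ko) | 1992-04-29 | 2001-12-15 | 리패치 | Method and apparatus for grouping multiple instructions, issuing grouped instructions simultaneously, and executing grouped instructions in a pipelined processor |
JP3637920B2 (ja) | 1992-05-01 | 2005-04-13 | Seiko Epson Corporation | System and method for retiring instructions in a superscalar microprocessor |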
EP0576262B1 (en) | 1992-06-25 | 2000-08-23 | Canon Kabushiki Kaisha | Apparatus for multiplying integers of many figures |
JPH0637202A (ja) | 1992-07-20 | 1994-02-10 | Mitsubishi Electric Corp | Package for microwave IC |
JPH06110781A (ja) | 1992-09-30 | 1994-04-22 | Nec Corp | Cache memory device |
US5493660A (en) | 1992-10-06 | 1996-02-20 | Hewlett-Packard Company | Software assisted hardware TLB miss handler |
US5513335A (en) | 1992-11-02 | 1996-04-30 | Sgs-Thomson Microelectronics, Inc. | Cache tag memory having first and second single-port arrays and a dual-port array |
US5819088A (en) | 1993-03-25 | 1998-10-06 | Intel Corporation | Method and apparatus for scheduling instructions for execution on a multi-issue architecture computer |
JPH0784883A (ja) | 1993-09-17 | 1995-03-31 | Hitachi Ltd | Method for purging the address translation buffer of a virtual computer system |
US6948172B1 (en) | 1993-09-21 | 2005-09-20 | Microsoft Corporation | Preemptive multi-tasking with cooperative groups of tasks |
US5469376A (en) | 1993-10-14 | 1995-11-21 | Abdallah; Mohammad A. F. F. | Digital circuit for the evaluation of mathematical expressions |
US5517651A (en) | 1993-12-29 | 1996-05-14 | Intel Corporation | Method and apparatus for loading a segment register in a microprocessor capable of operating in multiple modes |
US5761476A (en) | 1993-12-30 | 1998-06-02 | Intel Corporation | Non-clocked early read for back-to-back scheduling of instructions |
US5956753A (en) | 1993-12-30 | 1999-09-21 | Intel Corporation | Method and apparatus for handling speculative memory access operations |
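JP3048498B2 (ja) | 1994-04-13 | 2000-06-05 | Toshiba Corporation | Semiconductor memory device |
JPH07287668A (ja) | 1994-04-19 | 1995-10-31 | Hitachi Ltd | Data processing device |
CN1084005C (zh) | 1994-06-27 | 2002-05-01 | International Business Machines Corporation | Method and apparatus for dynamically controlling address space allocation |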
US5548742A (en) | 1994-08-11 | 1996-08-20 | Intel Corporation | Method and apparatus for combining a direct-mapped cache and a multiple-way cache in a cache memory |
US5813031A (en) | 1994-09-21 | 1998-09-22 | Industrial Technology Research Institute | Caching tag for a large scale cache computer memory system |
US5640534A (en) | 1994-10-05 | 1997-06-17 | International Business Machines Corporation | Method and system for concurrent access in a data cache array utilizing multiple match line selection paths |
US5835951A (en) | 1994-10-18 | 1998-11-10 | National Semiconductor | Branch processing unit with target cache read prioritization protocol for handling multiple hits |
JP3569014B2 (ja) | 1994-11-25 | 2004-09-22 | Fujitsu Limited | Processor and processing method supporting multiple contexts |
US5724565A (en) | 1995-02-03 | 1998-03-03 | International Business Machines Corporation | Method and system for processing first and second sets of instructions by first and second types of processing systems |
US5644742A (en) | 1995-02-14 | 1997-07-01 | Hal Computer Systems, Inc. | Processor structure and method for a time-out checkpoint |
US5675759A (en) | 1995-03-03 | 1997-10-07 | Shebanow; Michael C. | Method and apparatus for register management using issue sequence prior physical register and register association validity information |
US5634068A (en) | 1995-03-31 | 1997-05-27 | Sun Microsystems, Inc. | Packet switched cache coherent multiprocessor system |
US5751982A (en) | 1995-03-31 | 1998-05-12 | Apple Computer, Inc. | Software emulation system with dynamic translation of emulated instructions for increased processing speed |
US6209085B1 (en) | 1995-05-05 | 2001-03-27 | Intel Corporation | Method and apparatus for performing process switching in multiprocessor computer systems |
US6643765B1 (en) | 1995-08-16 | 2003-11-04 | Microunity Systems Engineering, Inc. | Programmable processor with group floating point operations |
US5710902A (en) | 1995-09-06 | 1998-01-20 | Intel Corporation | Instruction dependency chain identifier |
US6341324B1 (en) | 1995-10-06 | 2002-01-22 | Lsi Logic Corporation | Exception processing in superscalar microprocessor |
US5864657A (en) | 1995-11-29 | 1999-01-26 | Texas Micro, Inc. | Main memory system and checkpointing protocol for fault-tolerant computer system |
US5983327A (en) | 1995-12-01 | 1999-11-09 | Nortel Networks Corporation | Data path architecture and arbitration scheme for providing access to a shared system resource |
US5793941A (en) | 1995-12-04 | 1998-08-11 | Advanced Micro Devices, Inc. | On-chip primary cache testing circuit and test method |
US5911057A (en) | 1995-12-19 | 1999-06-08 | Texas Instruments Incorporated | Superscalar microprocessor having combined register and memory renaming circuits, systems, and methods |
US5699537A (en) | 1995-12-22 | 1997-12-16 | Intel Corporation | Processor microarchitecture for efficient dynamic scheduling and execution of chains of dependent instructions |
US6882177B1 (en) | 1996-01-10 | 2005-04-19 | Altera Corporation | Tristate structures for programmable logic devices |
US5754818A (en) | 1996-03-22 | 1998-05-19 | Sun Microsystems, Inc. | Architecture and method for sharing TLB entries through process IDS |
US5904892A (en) | 1996-04-01 | 1999-05-18 | Saint-Gobain/Norton Industrial Ceramics Corp. | Tape cast silicon carbide dummy wafer |
US5752260A (en) | 1996-04-29 | 1998-05-12 | International Business Machines Corporation | High-speed, multiple-port, interleaved cache with arbitration of multiple access addresses |
US5806085A (en) | 1996-05-01 | 1998-09-08 | Sun Microsystems, Inc. | Method for non-volatile caching of network and CD-ROM file accesses using a cache directory, pointers, file name conversion, a local hard disk, and separate small database |
US5829028A (en) | 1996-05-06 | 1998-10-27 | Advanced Micro Devices, Inc. | Data cache configured to store data in a use-once manner |
US6108769A (en) | 1996-05-17 | 2000-08-22 | Advanced Micro Devices, Inc. | Dependency table for reducing dependency checking hardware |
US5881277A (en) | 1996-06-13 | 1999-03-09 | Texas Instruments Incorporated | Pipelined microprocessor with branch misprediction cache circuits, systems and methods |
US5860146A (en) | 1996-06-25 | 1999-01-12 | Sun Microsystems, Inc. | Auxiliary translation lookaside buffer for assisting in accessing data in remote address spaces |
US5903760A (en) | 1996-06-27 | 1999-05-11 | Intel Corporation | Method and apparatus for translating a conditional instruction compatible with a first instruction set architecture (ISA) into a conditional instruction compatible with a second ISA |
US5974506A (en) | 1996-06-28 | 1999-10-26 | Digital Equipment Corporation | Enabling mirror, nonmirror and partial mirror cache modes in a dual cache system |
US6167490A (en) | 1996-09-20 | 2000-12-26 | University Of Washington | Using global memory information to manage memory in a computer network |
KR19980032776A (ko) | 1996-10-16 | 1998-07-25 | Tsutomu Kanai | Data processor and data processing system |
EP0877981B1 (en) | 1996-11-04 | 2004-01-07 | Koninklijke Philips Electronics N.V. | Processing device, reads instructions in memory |
US6385715B1 (en) | 1996-11-13 | 2002-05-07 | Intel Corporation | Multi-threading for a processor utilizing a replay queue |
US5978906A (en) | 1996-11-19 | 1999-11-02 | Advanced Micro Devices, Inc. | Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions |
US6253316B1 (en) | 1996-11-19 | 2001-06-26 | Advanced Micro Devices, Inc. | Three state branch history using one bit in a branch prediction mechanism |
US5903750A (en) | 1996-11-20 | 1999-05-11 | Institute For The Development Of Emerging Architectures, L.L.P. | Dynamic branch prediction for branch instructions with multiple targets |
US6212542B1 (en) | 1996-12-16 | 2001-04-03 | International Business Machines Corporation | Method and system for executing a program within a multiscalar processor by processing linked thread descriptors |
US6134634A (en) | 1996-12-20 | 2000-10-17 | Texas Instruments Incorporated | Method and apparatus for preemptive cache write-back |
US5918251A (en) | 1996-12-23 | 1999-06-29 | Intel Corporation | Method and apparatus for preloading different default address translation attributes |
US6016540A (en) | 1997-01-08 | 2000-01-18 | Intel Corporation | Method and apparatus for scheduling instructions in waves |
US6065105A (en) | 1997-01-08 | 2000-05-16 | Intel Corporation | Dependency matrix |
US5802602A (en) | 1997-01-17 | 1998-09-01 | Intel Corporation | Method and apparatus for performing reads of related data from a set-associative cache memory |
US6088780A (en) | 1997-03-31 | 2000-07-11 | Institute For The Development Of Emerging Architecture, L.L.C. | Page table walker that uses at least one of a default page size and a page size selected for a virtual address space to position a sliding field in a virtual address |
US6075938A (en) | 1997-06-10 | 2000-06-13 | The Board Of Trustees Of The Leland Stanford Junior University | Virtual machine monitors for scalable multiprocessors |
US6073230A (en) | 1997-06-11 | 2000-06-06 | Advanced Micro Devices, Inc. | Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches |
JPH1124929A (ja) | 1997-06-30 | 1999-01-29 | Sony Corp | Arithmetic processing device and method therefor |
US6128728A (en) | 1997-08-01 | 2000-10-03 | Micron Technology, Inc. | Virtual shadow registers and virtual register windows |
US6170051B1 (en) | 1997-08-01 | 2001-01-02 | Micron Technology, Inc. | Apparatus and method for program level parallelism in a VLIW processor |
US6085315A (en) | 1997-09-12 | 2000-07-04 | Siemens Aktiengesellschaft | Data processing device with loop pipeline |
US6101577A (en) | 1997-09-15 | 2000-08-08 | Advanced Micro Devices, Inc. | Pipelined instruction cache and branch prediction mechanism therefor |
US5901294A (en) | 1997-09-18 | 1999-05-04 | International Business Machines Corporation | Method and system for bus arbitration in a multiprocessor system utilizing simultaneous variable-width bus access |
US6185660B1 (en) | 1997-09-23 | 2001-02-06 | Hewlett-Packard Company | Pending access queue for providing data to a target register during an intermediate pipeline phase after a computer cache miss |
US5905509A (en) | 1997-09-30 | 1999-05-18 | Compaq Computer Corp. | Accelerated Graphics Port two level Gart cache having distributed first level caches |
US6226732B1 (en) | 1997-10-02 | 2001-05-01 | Hitachi Micro Systems, Inc. | Memory system architecture |
US5922065A (en) | 1997-10-13 | 1999-07-13 | Institute For The Development Of Emerging Architectures, L.L.C. | Processor utilizing a template field for encoding instruction sequences in a wide-word format |
US6178482B1 (en) | 1997-11-03 | 2001-01-23 | Brecis Communications | Virtual register sets |
US6021484A (en) | 1997-11-14 | 2000-02-01 | Samsung Electronics Co., Ltd. | Dual instruction set architecture |
US6256728B1 (en) | 1997-11-17 | 2001-07-03 | Advanced Micro Devices, Inc. | Processor configured to selectively cancel instructions from its pipeline responsive to a predicted-taken short forward branch instruction |
US6260131B1 (en) | 1997-11-18 | 2001-07-10 | Intrinsity, Inc. | Method and apparatus for TLB memory ordering |
US6016533A (en) | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Way prediction logic for cache array |
US6219776B1 (en) | 1998-03-10 | 2001-04-17 | Billions Of Operations Per Second | Merged array controller and processing element |
US6609189B1 (en) | 1998-03-12 | 2003-08-19 | Yale University | Cycle segmented prefix circuits |
JP3657424B2 (ja) | 1998-03-20 | 2005-06-08 | Matsushita Electric Industrial Co., Ltd. | Center device and terminal device for broadcasting program information |
US6216215B1 (en) | 1998-04-02 | 2001-04-10 | Intel Corporation | Method and apparatus for senior loads |
US6157998A (en) | 1998-04-03 | 2000-12-05 | Motorola Inc. | Method for performing branch prediction and resolution of two or more branch instructions within two or more branch prediction buffers |
US6115809A (en) | 1998-04-30 | 2000-09-05 | Hewlett-Packard Company | Compiling strong and weak branching behavior instruction blocks to separate caches for dynamic and static prediction |
US6205545B1 (en) | 1998-04-30 | 2001-03-20 | Hewlett-Packard Company | Method and apparatus for using static branch predictions hints with dynamically translated code traces to improve performance |
US6256727B1 (en) | 1998-05-12 | 2001-07-03 | International Business Machines Corporation | Method and system for fetching noncontiguous instructions in a single clock cycle |
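JPH11338710A (ja) | 1998-05-28 | 1999-12-10 | Toshiba Corp | Compiling method and apparatus for a processor having multiple kinds of instruction sets, and recording medium on which the method is programmed and recorded |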
US6272616B1 (en) | 1998-06-17 | 2001-08-07 | Agere Systems Guardian Corp. | Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths |
US6988183B1 (en) | 1998-06-26 | 2006-01-17 | Derek Chi-Lan Wong | Methods for increasing instruction-level parallelism in microprocessors and digital system |
US6260138B1 (en) | 1998-07-17 | 2001-07-10 | Sun Microsystems, Inc. | Method and apparatus for branch instruction processing in a processor |
US6122656A (en) | 1998-07-31 | 2000-09-19 | Advanced Micro Devices, Inc. | Processor configured to map logical register numbers to physical register numbers using virtual register numbers |
US6272662B1 (en) | 1998-08-04 | 2001-08-07 | International Business Machines Corporation | Distributed storage system using front-end and back-end locking |
JP2000057054A (ja) | 1998-08-12 | 2000-02-25 | Fujitsu Ltd | High-speed address translation system |
US8631066B2 (en) | 1998-09-10 | 2014-01-14 | Vmware, Inc. | Mechanism for providing virtual machines for use by multiple users |
US6339822B1 (en) | 1998-10-02 | 2002-01-15 | Advanced Micro Devices, Inc. | Using padded instructions in a block-oriented cache |
US6332189B1 (en) | 1998-10-16 | 2001-12-18 | Intel Corporation | Branch prediction architecture |
GB9825102D0 (en) | 1998-11-16 | 1999-01-13 | Insignia Solutions Plc | Computer system |
JP3110404B2 (ja) | 1998-11-18 | 2000-11-20 | NEC Kofu, Ltd. | Microprocessor device, method for accelerating its software instructions, and recording medium storing its control program |
US6490673B1 (en) | 1998-11-27 | 2002-12-03 | Matsushita Electric Industrial Co., Ltd | Processor, compiling apparatus, and compile program recorded on a recording medium |
US6519682B2 (en) | 1998-12-04 | 2003-02-11 | Stmicroelectronics, Inc. | Pipelined non-blocking level two cache system with inherent transaction collision-avoidance |
US7020879B1 (en) | 1998-12-16 | 2006-03-28 | Mips Technologies, Inc. | Interrupt and exception handling for multi-streaming digital processors |
US6477562B2 (en) | 1998-12-16 | 2002-11-05 | Clearwater Networks, Inc. | Prioritized instruction scheduling for multi-streaming processors |
US6247097B1 (en) | 1999-01-22 | 2001-06-12 | International Business Machines Corporation | Aligned instruction cache handling of instruction fetches across multiple predicted branch instructions |
US6321298B1 (en) | 1999-01-25 | 2001-11-20 | International Business Machines Corporation | Full cache coherency across multiple raid controllers |
JP3842474B2 (ja) | 1999-02-02 | 2006-11-08 | Renesas Technology Corp. | Data processing device |
US6327650B1 (en) | 1999-02-12 | 2001-12-04 | VLSI Technology, Inc. | Pipelined multiprocessing with upstream processor concurrently writing to local register and to register of downstream processor |
US6668316B1 (en) | 1999-02-17 | 2003-12-23 | Elbrus International Limited | Method and apparatus for conflict-free execution of integer and floating-point operations with a common register file |
US6732220B2 (en) | 1999-02-17 | 2004-05-04 | Elbrus International | Method for emulating hardware features of a foreign architecture in a host operating system environment |
US6418530B2 (en) | 1999-02-18 | 2002-07-09 | Hewlett-Packard Company | Hardware/software system for instruction profiling and trace selection using branch history information for branch predictions |
US6437789B1 (en) | 1999-02-19 | 2002-08-20 | Evans & Sutherland Computer Corporation | Multi-level cache controller |
US6850531B1 (en) | 1999-02-23 | 2005-02-01 | Alcatel | Multi-service network switch |
US6212613B1 (en) | 1999-03-22 | 2001-04-03 | Cisco Technology, Inc. | Methods and apparatus for reusing addresses in a computer |
US6529928B1 (en) | 1999-03-23 | 2003-03-04 | Silicon Graphics, Inc. | Floating-point adder performing floating-point and integer operations |
DE69938621D1 (de) | 1999-05-03 | 2008-06-12 | St Microelectronics Sa | Instruction issue in a computer |
US6449671B1 (en) | 1999-06-09 | 2002-09-10 | Ati International Srl | Method and apparatus for busing data elements |
US6473833B1 (en) | 1999-07-30 | 2002-10-29 | International Business Machines Corporation | Integrated cache and directory structure for multi-level caches |
US6643770B1 (en) | 1999-09-16 | 2003-11-04 | Intel Corporation | Branch misprediction recovery using a side memory |
US6772325B1 (en) | 1999-10-01 | 2004-08-03 | Hitachi, Ltd. | Processor architecture and operation for exploiting improved branch control instruction |
US6704822B1 (en) | 1999-10-01 | 2004-03-09 | Sun Microsystems, Inc. | Arbitration protocol for a shared data cache |
US6457120B1 (en) | 1999-11-01 | 2002-09-24 | International Business Machines Corporation | Processor and method including a cache having confirmation bits for improving address predictable branch instruction target predictions |
US7441110B1 (en) | 1999-12-10 | 2008-10-21 | International Business Machines Corporation | Prefetching using future branch path information derived from branch prediction |
US7107434B2 (en) | 1999-12-20 | 2006-09-12 | Board Of Regents, The University Of Texas | System, method and apparatus for allocating hardware resources using pseudorandom sequences |
AU2597401A (en) | 1999-12-22 | 2001-07-03 | Ubicom, Inc. | System and method for instruction level multithreading in an embedded processor using zero-time context switching |
US6557095B1 (en) * | 1999-12-27 | 2003-04-29 | Intel Corporation | Scheduling operations using a dependency matrix |
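CN1210649C (zh) | 2000-01-03 | 2005-07-13 | Advanced Micro Devices, Inc. | Scheduler capable of issuing and reissuing dependency chains, processor including the scheduler, and scheduling method |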
US6542984B1 (en) | 2000-01-03 | 2003-04-01 | Advanced Micro Devices, Inc. | Scheduler capable of issuing and reissuing dependency chains |
US6594755B1 (en) | 2000-01-04 | 2003-07-15 | National Semiconductor Corporation | System and method for interleaved execution of multiple independent threads |
US6728872B1 (en) | 2000-02-04 | 2004-04-27 | International Business Machines Corporation | Method and apparatus for verifying that instructions are pipelined in correct architectural sequence |
GB0002848D0 (en) | 2000-02-08 | 2000-03-29 | Siroyan Limited | Communicating instruction results in processors and compiling methods for processors |
GB2365661A (en) | 2000-03-10 | 2002-02-20 | British Telecomm | Allocating switch requests within a packet switch |
US6615340B1 (en) | 2000-03-22 | 2003-09-02 | Wilmot, Ii Richard Byron | Extended operand management indicator structure and method |
US6604187B1 (en) | 2000-06-19 | 2003-08-05 | Advanced Micro Devices, Inc. | Providing global translations with address space numbers |
US6557083B1 (en) | 2000-06-30 | 2003-04-29 | Intel Corporation | Memory system for multiple data types |
US6704860B1 (en) | 2000-07-26 | 2004-03-09 | International Business Machines Corporation | Data processing system and method for fetching instruction blocks in response to a detected block sequence |
US7206925B1 (en) | 2000-08-18 | 2007-04-17 | Sun Microsystems, Inc. | Backing Register File for processors |
US6728866B1 (en) | 2000-08-31 | 2004-04-27 | International Business Machines Corporation | Partitioned issue queue and allocation strategy |
US6721874B1 (en) | 2000-10-12 | 2004-04-13 | International Business Machines Corporation | Method and system for dynamically shared completion table supporting multiple threads in a processing system |
US7757065B1 (en) | 2000-11-09 | 2010-07-13 | Intel Corporation | Instruction segment recording scheme |
JP2002185513A (ja) | 2000-12-18 | 2002-06-28 | Hitachi Ltd | Packet communication network and packet transfer control method |
US6877089B2 (en) | 2000-12-27 | 2005-04-05 | International Business Machines Corporation | Branch prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program |
US6907600B2 (en) | 2000-12-27 | 2005-06-14 | Intel Corporation | Virtual translation lookaside buffer |
US6647466B2 (en) | 2001-01-25 | 2003-11-11 | Hewlett-Packard Development Company, L.P. | Method and apparatus for adaptively bypassing one or more levels of a cache hierarchy |
FR2820921A1 (fr) | 2001-02-14 | 2002-08-16 | Canon Kk | Device and method for transmission in a switch |
US6985951B2 (en) | 2001-03-08 | 2006-01-10 | International Business Machines Corporation | Inter-partition message passing method, system and program product for managing workload in a partitioned processing environment |
US6950927B1 (en) | 2001-04-13 | 2005-09-27 | The United States Of America As Represented By The Secretary Of The Navy | System and method for instruction-level parallelism in a programmable multiple network processor environment |
US7200740B2 (en) | 2001-05-04 | 2007-04-03 | Ip-First, Llc | Apparatus and method for speculatively performing a return instruction in a microprocessor |
US7707397B2 (en) | 2001-05-04 | 2010-04-27 | Via Technologies, Inc. | Variable group associativity branch target address cache delivering multiple target addresses per cache line |
US6658549B2 (en) | 2001-05-22 | 2003-12-02 | Hewlett-Packard Development Company, Lp. | Method and system allowing a single entity to manage memory comprising compressed and uncompressed data |
US6985591B2 (en) | 2001-06-29 | 2006-01-10 | Intel Corporation | Method and apparatus for distributing keys for decrypting and re-encrypting publicly distributed media |
US7203824B2 (en) | 2001-07-03 | 2007-04-10 | Ip-First, Llc | Apparatus and method for handling BTAC branches that wrap across instruction cache lines |
US7024545B1 (en) | 2001-07-24 | 2006-04-04 | Advanced Micro Devices, Inc. | Hybrid branch prediction device with two levels of branch prediction cache |
US6954846B2 (en) | 2001-08-07 | 2005-10-11 | Sun Microsystems, Inc. | Microprocessor and method for giving each thread exclusive access to one register file in a multi-threading mode and for giving an active thread access to multiple register files in a single thread mode |
US6718440B2 (en) | 2001-09-28 | 2004-04-06 | Intel Corporation | Memory access latency hiding with hint buffer |
US7150021B1 (en) | 2001-10-12 | 2006-12-12 | Palau Acquisition Corporation (Delaware) | Method and system to allocate resources within an interconnect device according to a resource allocation table |
US7117347B2 (en) | 2001-10-23 | 2006-10-03 | Ip-First, Llc | Processor including fallback branch prediction mechanism for far jump and far call instructions |
US7272832B2 (en) | 2001-10-25 | 2007-09-18 | Hewlett-Packard Development Company, L.P. | Method of protecting user process data in a secure platform inaccessible to the operating system and other tasks on top of the secure platform |
US6964043B2 (en) | 2001-10-30 | 2005-11-08 | Intel Corporation | Method, apparatus, and system to optimize frequently executed code and to use compiler transformation and hardware support to handle infrequently executed code |
GB2381886B (en) | 2001-11-07 | 2004-06-23 | Sun Microsystems Inc | Computer system with virtual memory and paging mechanism |
US7092869B2 (en) | 2001-11-14 | 2006-08-15 | Ronald Hilton | Memory address prediction under emulation |
US7363467B2 (en) | 2002-01-03 | 2008-04-22 | Intel Corporation | Dependence-chain processing using trace descriptors having dependency descriptors |
US6640333B2 (en) | 2002-01-10 | 2003-10-28 | Lsi Logic Corporation | Architecture for a sea of platforms |
US7055021B2 (en) | 2002-02-05 | 2006-05-30 | Sun Microsystems, Inc. | Out-of-order processor that reduces mis-speculation using a replay scoreboard |
US7331040B2 (en) | 2002-02-06 | 2008-02-12 | Transitive Limited | Condition code flag emulation for program code conversion |
US6839816B2 (en) | 2002-02-26 | 2005-01-04 | International Business Machines Corporation | Shared cache line update mechanism |
US6731292B2 (en) | 2002-03-06 | 2004-05-04 | Sun Microsystems, Inc. | System and method for controlling a number of outstanding data transactions within an integrated circuit |
JP3719509B2 (ja) | 2002-04-01 | 2005-11-24 | Sony Computer Entertainment Inc. | Serial operation pipeline, arithmetic device, arithmetic logic operation circuit, and operation method using a serial operation pipeline |
US7565509B2 (en) | 2002-04-17 | 2009-07-21 | Microsoft Corporation | Using limits on address translation to control access to an addressable entity |
US6920530B2 (en) | 2002-04-23 | 2005-07-19 | Sun Microsystems, Inc. | Scheme for reordering instructions via an instruction caching mechanism |
US7113488B2 (en) | 2002-04-24 | 2006-09-26 | International Business Machines Corporation | Reconfigurable circular bus |
US7281055B2 (en) | 2002-05-28 | 2007-10-09 | Newisys, Inc. | Routing mechanisms in systems having multiple multi-processor clusters |
US7117346B2 (en) | 2002-05-31 | 2006-10-03 | Freescale Semiconductor, Inc. | Data processing system having multiple register contexts and method therefor |
US6938151B2 (en) | 2002-06-04 | 2005-08-30 | International Business Machines Corporation | Hybrid branch prediction using a global selection counter and a prediction method comparison table |
US8024735B2 (en) | 2002-06-14 | 2011-09-20 | Intel Corporation | Method and apparatus for ensuring fairness and forward progress when executing multiple threads of execution |
JP3845043B2 (ja) | 2002-06-28 | 2006-11-15 | Fujitsu Limited | Instruction fetch control device |
JP3982353B2 (ja) | 2002-07-12 | 2007-09-26 | NEC Corporation | Fault-tolerant computer device, resynchronization method therefor, and resynchronization program |
US6944744B2 (en) | 2002-08-27 | 2005-09-13 | Advanced Micro Devices, Inc. | Apparatus and method for independently schedulable functional units with issue lock mechanism in a processor |
US6950925B1 (en) | 2002-08-28 | 2005-09-27 | Advanced Micro Devices, Inc. | Scheduler for use in a microprocessor that supports data-speculative execution |
US7546422B2 (en) | 2002-08-28 | 2009-06-09 | Intel Corporation | Method and apparatus for the synchronization of distributed caches |
TW200408242A (en) | 2002-09-06 | 2004-05-16 | Matsushita Electric Ind Co Ltd | Home terminal apparatus and communication system |
US6895491B2 (en) | 2002-09-26 | 2005-05-17 | Hewlett-Packard Development Company, L.P. | Memory addressing for a virtual machine implementation on a computer processor supporting virtual hash-page-table searching |
US7334086B2 (en) | 2002-10-08 | 2008-02-19 | Rmi Corporation | Advanced processor with system on a chip interconnect technology |
US6829698B2 (en) | 2002-10-10 | 2004-12-07 | International Business Machines Corporation | Method, apparatus and system for acquiring a global promotion facility utilizing a data-less transaction |
US7213248B2 (en) | 2002-10-10 | 2007-05-01 | International Business Machines Corporation | High speed promotion mechanism suitable for lock acquisition in a multiprocessor data processing system |
US7222218B2 (en) | 2002-10-22 | 2007-05-22 | Sun Microsystems, Inc. | System and method for goal-based scheduling of blocks of code for concurrent execution |
US20040103251A1 (en) | 2002-11-26 | 2004-05-27 | Mitchell Alsup | Microprocessor including a first level cache and a second level cache having different cache line sizes |
AU2003292451A1 (en) | 2002-12-04 | 2004-06-23 | Koninklijke Philips Electronics N.V. | Register file gating to reduce microprocessor power dissipation |
US6981083B2 (en) | 2002-12-05 | 2005-12-27 | International Business Machines Corporation | Processor virtualization mechanism via an enhanced restoration of hard architected states |
US7073042B2 (en) | 2002-12-12 | 2006-07-04 | Intel Corporation | Reclaiming existing fields in address translation data structures to extend control over memory accesses |
US20040117594A1 (en) | 2002-12-13 | 2004-06-17 | Vanderspek Julius | Memory management method |
US20040122887A1 (en) | 2002-12-20 | 2004-06-24 | Macy William W. | Efficient multiplication of small matrices using SIMD registers |
US7191349B2 (en) | 2002-12-26 | 2007-03-13 | Intel Corporation | Mechanism for processor power state aware distribution of lowest priority interrupt |
US20040139441A1 (en) | 2003-01-09 | 2004-07-15 | Kabushiki Kaisha Toshiba | Processor, arithmetic operation processing method, and priority determination method |
US6925421B2 (en) | 2003-01-09 | 2005-08-02 | International Business Machines Corporation | Method, system, and computer program product for estimating the number of consumers that place a load on an individual resource in a pool of physically distributed resources |
US7178010B2 (en) | 2003-01-16 | 2007-02-13 | Ip-First, Llc | Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack |
US7089374B2 (en) | 2003-02-13 | 2006-08-08 | Sun Microsystems, Inc. | Selectively unmarking load-marked cache lines during transactional program execution |
US7278030B1 (en) | 2003-03-03 | 2007-10-02 | Vmware, Inc. | Virtualization system for computers having multiple protection mechanisms |
US6912644B1 (en) | 2003-03-06 | 2005-06-28 | Intel Corporation | Method and apparatus to steer memory access operations in a virtual memory system |
US7111145B1 (en) | 2003-03-25 | 2006-09-19 | Vmware, Inc. | TLB miss fault handler and method for accessing multiple page tables |
US7143273B2 (en) | 2003-03-31 | 2006-11-28 | Intel Corporation | Method and apparatus for dynamic branch prediction utilizing multiple stew algorithms for indexing a global history |
CN1214666C (zh) | 2003-04-07 | 2005-08-10 | 华为技术有限公司 | Method for limiting position information request flow rate in position service |
US7058764B2 (en) | 2003-04-14 | 2006-06-06 | Hewlett-Packard Development Company, L.P. | Method of adaptive cache partitioning to increase host I/O performance |
US7469407B2 (en) | 2003-04-24 | 2008-12-23 | International Business Machines Corporation | Method for resource balancing using dispatch flush in a simultaneous multithread processor |
EP1471421A1 (en) | 2003-04-24 | 2004-10-27 | STMicroelectronics Limited | Speculative load instruction control |
US7290261B2 (en) | 2003-04-24 | 2007-10-30 | International Business Machines Corporation | Method and logical apparatus for rename register reallocation in a simultaneous multi-threaded (SMT) processor |
US7139855B2 (en) | 2003-04-24 | 2006-11-21 | International Business Machines Corporation | High performance synchronization of resource allocation in a logically-partitioned system |
US7055003B2 (en) | 2003-04-25 | 2006-05-30 | International Business Machines Corporation | Data cache scrub mechanism for large L2/L3 data cache structures |
US7007108B2 (en) | 2003-04-30 | 2006-02-28 | Lsi Logic Corporation | System method for use of hardware semaphores for resource release notification wherein messages comprises read-modify-write operation and address |
JP2007519052A (ja) | 2003-06-25 | 2007-07-12 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Instruction-controlled data processing device |
JP2005032018A (ja) | 2003-07-04 | 2005-02-03 | Semiconductor Energy Lab Co Ltd | Microprocessor using genetic algorithm |
US7149872B2 (en) | 2003-07-10 | 2006-12-12 | Transmeta Corporation | System and method for identifying TLB entries associated with a physical address of a specified range |
US7089398B2 (en) | 2003-07-31 | 2006-08-08 | Silicon Graphics, Inc. | Address translation using a page size tag |
US8296771B2 (en) | 2003-08-18 | 2012-10-23 | Cray Inc. | System and method for mapping between resource consumers and resource providers in a computing system |
US7133950B2 (en) | 2003-08-19 | 2006-11-07 | Sun Microsystems, Inc. | Request arbitration in multi-core processor |
US7594089B2 (en) | 2003-08-28 | 2009-09-22 | Mips Technologies, Inc. | Smart memory based synchronization controller for a multi-threaded multiprocessor SoC |
US7849297B2 (en) | 2003-08-28 | 2010-12-07 | Mips Technologies, Inc. | Software emulation of directed exceptions in a multithreading processor |
EP1658563B1 (en) | 2003-08-28 | 2013-06-05 | MIPS Technologies, Inc. | Apparatus, and method for initiation of concurrent instruction streams in a multithreading microprocessor |
US9032404B2 (en) | 2003-08-28 | 2015-05-12 | Mips Technologies, Inc. | Preemptive multitasking employing software emulation of directed exceptions in a multithreading processor |
US7111126B2 (en) | 2003-09-24 | 2006-09-19 | Arm Limited | Apparatus and method for loading data values |
JP4057989B2 (ja) | 2003-09-26 | 2008-03-05 | 株式会社東芝 | Scheduling method and information processing system |
US7373637B2 (en) | 2003-09-30 | 2008-05-13 | International Business Machines Corporation | Method and apparatus for counting instruction and memory location ranges |
US7047322B1 (en) | 2003-09-30 | 2006-05-16 | Unisys Corporation | System and method for performing conflict resolution and flow control in a multiprocessor system |
FR2860313B1 (fr) | 2003-09-30 | 2005-11-04 | Commissariat Energie Atomique | Component with a dynamically reconfigurable architecture |
TWI281121B (en) | 2003-10-06 | 2007-05-11 | Ip First Llc | Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence |
US8407433B2 (en) | 2007-06-25 | 2013-03-26 | Sonics, Inc. | Interconnect implementing internal controls |
US7395372B2 (en) | 2003-11-14 | 2008-07-01 | International Business Machines Corporation | Method and system for providing cache set selection which is power optimized |
US7243170B2 (en) | 2003-11-24 | 2007-07-10 | International Business Machines Corporation | Method and circuit for reading and writing an instruction buffer |
US20050120191A1 (en) | 2003-12-02 | 2005-06-02 | Intel Corporation (A Delaware Corporation) | Checkpoint-based register reclamation |
US20050132145A1 (en) | 2003-12-15 | 2005-06-16 | Finisar Corporation | Contingent processor time division multiple access of memory in a multi-processor system to allow supplemental memory consumer access |
US7310722B2 (en) | 2003-12-18 | 2007-12-18 | Nvidia Corporation | Across-thread out of order instruction dispatch in a multithreaded graphics processor |
US7293164B2 (en) | 2004-01-14 | 2007-11-06 | International Business Machines Corporation | Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions |
US20050204118A1 (en) | 2004-02-27 | 2005-09-15 | National Chiao Tung University | Method for inter-cluster communication that employs register permutation |
US20050216920A1 (en) | 2004-03-24 | 2005-09-29 | Vijay Tewari | Use of a virtual machine to emulate a hardware device |
EP1731998A1 (en) | 2004-03-29 | 2006-12-13 | Kyoto University | Data processing device, data processing program, and recording medium containing the data processing program |
US7383427B2 (en) | 2004-04-22 | 2008-06-03 | Sony Computer Entertainment Inc. | Multi-scalar extension for SIMD instruction set processors |
US20050251649A1 (en) | 2004-04-23 | 2005-11-10 | Sony Computer Entertainment Inc. | Methods and apparatus for address map optimization on a multi-scalar extension |
US7418582B1 (en) | 2004-05-13 | 2008-08-26 | Sun Microsystems, Inc. | Versatile register file design for a multi-threaded processor utilizing different modes and register windows |
US7478198B2 (en) | 2004-05-24 | 2009-01-13 | Intel Corporation | Multithreaded clustered microarchitecture with dynamic back-end assignment |
US7594234B1 (en) | 2004-06-04 | 2009-09-22 | Sun Microsystems, Inc. | Adaptive spin-then-block mutual exclusion in multi-threaded processing |
US7284092B2 (en) | 2004-06-24 | 2007-10-16 | International Business Machines Corporation | Digital data processing apparatus having multi-level register file |
US20050289530A1 (en) | 2004-06-29 | 2005-12-29 | Robison Arch D | Scheduling of instructions in program compilation |
EP1628235A1 (en) | 2004-07-01 | 2006-02-22 | Texas Instruments Incorporated | Method and system of ensuring integrity of a secure mode entry sequence |
US8044951B1 (en) | 2004-07-02 | 2011-10-25 | Nvidia Corporation | Integer-based functionality in a graphics shading language |
US7339592B2 (en) | 2004-07-13 | 2008-03-04 | Nvidia Corporation | Simulating multiported memories using lower port count memories |
US7398347B1 (en) | 2004-07-14 | 2008-07-08 | Altera Corporation | Methods and apparatus for dynamic instruction controlled reconfigurable register file |
EP1619593A1 (en) | 2004-07-22 | 2006-01-25 | Sap Ag | Computer-Implemented method and system for performing a product availability check |
JP4064380B2 (ja) | 2004-07-29 | 2008-03-19 | 富士通株式会社 | Arithmetic processing device and control method thereof |
US8443171B2 (en) | 2004-07-30 | 2013-05-14 | Hewlett-Packard Development Company, L.P. | Run-time updating of prediction hint instructions |
US7213106B1 (en) | 2004-08-09 | 2007-05-01 | Sun Microsystems, Inc. | Conservative shadow cache support in a point-to-point connected multiprocessing node |
US7318143B2 (en) | 2004-10-20 | 2008-01-08 | Arm Limited | Reuseable configuration data |
US20090150890A1 (en) | 2007-12-10 | 2009-06-11 | Yourst Matt T | Strand-based computing hardware and dynamically optimizing strandware for a high performance microprocessor system |
US7707578B1 (en) | 2004-12-16 | 2010-04-27 | Vmware, Inc. | Mechanism for scheduling execution of threads for fair resource allocation in a multi-threaded and/or multi-core processing system |
US7257695B2 (en) | 2004-12-28 | 2007-08-14 | Intel Corporation | Register file regions for a processing system |
US7996644B2 (en) | 2004-12-29 | 2011-08-09 | Intel Corporation | Fair sharing of a cache in a multi-core/multi-threaded processor by dynamically partitioning of the cache |
US8719819B2 (en) | 2005-06-30 | 2014-05-06 | Intel Corporation | Mechanism for instruction set based thread execution on a plurality of instruction sequencers |
US7050922B1 (en) | 2005-01-14 | 2006-05-23 | Agilent Technologies, Inc. | Method for optimizing test order, and machine-readable media storing sequences of instructions to perform same |
US7681014B2 (en) | 2005-02-04 | 2010-03-16 | Mips Technologies, Inc. | Multithreading instruction scheduler employing thread group priorities |
US7657891B2 (en) | 2005-02-04 | 2010-02-02 | Mips Technologies, Inc. | Multithreading microprocessor with optimized thread scheduler for increasing pipeline utilization efficiency |
EP1849095B1 (en) | 2005-02-07 | 2013-01-02 | Richter, Thomas | Low latency massive parallel data processing device |
US7400548B2 (en) | 2005-02-09 | 2008-07-15 | International Business Machines Corporation | Method for providing multiple reads/writes using a 2read/2write register file array |
US7343476B2 (en) | 2005-02-10 | 2008-03-11 | International Business Machines Corporation | Intelligent SMT thread hang detect taking into account shared resource contention/blocking |
US7152155B2 (en) | 2005-02-18 | 2006-12-19 | Qualcomm Incorporated | System and method of correcting a branch misprediction |
US20060200655A1 (en) | 2005-03-04 | 2006-09-07 | Smith Rodney W | Forward looking branch target address caching |
US8195922B2 (en) | 2005-03-18 | 2012-06-05 | Marvell World Trade, Ltd. | System for dynamically allocating processing time to multiple threads |
US20060212853A1 (en) | 2005-03-18 | 2006-09-21 | Marvell World Trade Ltd. | Real-time control apparatus having a multi-thread processor |
GB2424727B (en) | 2005-03-30 | 2007-08-01 | Transitive Ltd | Preparing instruction groups for a processor having multiple issue ports |
US8522253B1 (en) | 2005-03-31 | 2013-08-27 | Guillermo Rozas | Hardware support for virtual machine and operating system context switching in translation lookaside buffers and virtually tagged caches |
US7313775B2 (en) | 2005-04-06 | 2007-12-25 | Lsi Corporation | Integrated circuit with relocatable processor hardmac |
US20060230243A1 (en) * | 2005-04-06 | 2006-10-12 | Robert Cochran | Cascaded snapshots |
US20060230409A1 (en) | 2005-04-07 | 2006-10-12 | Matteo Frigo | Multithreaded processor architecture with implicit granularity adaptation |
US8230423B2 (en) | 2005-04-07 | 2012-07-24 | International Business Machines Corporation | Multithreaded processor architecture with operational latency hiding |
US20060230253A1 (en) | 2005-04-11 | 2006-10-12 | Lucian Codrescu | Unified non-partitioned register files for a digital signal processor operating in an interleaved multi-threaded environment |
US20060236074A1 (en) | 2005-04-14 | 2006-10-19 | Arm Limited | Indicating storage locations within caches |
US7437543B2 (en) | 2005-04-19 | 2008-10-14 | International Business Machines Corporation | Reducing the fetch time of target instructions of a predicted taken branch instruction |
US7461237B2 (en) | 2005-04-20 | 2008-12-02 | Sun Microsystems, Inc. | Method and apparatus for suppressing duplicative prefetches for branch target cache lines |
US8713286B2 (en) | 2005-04-26 | 2014-04-29 | Qualcomm Incorporated | Register files for a digital signal processor operating in an interleaved multi-threaded environment |
GB2426084A (en) | 2005-05-13 | 2006-11-15 | Agilent Technologies Inc | Updating data in a dual port memory |
US7861055B2 (en) | 2005-06-07 | 2010-12-28 | Broadcom Corporation | Method and system for on-chip configurable data ram for fast memory and pseudo associative caches |
US8010969B2 (en) | 2005-06-13 | 2011-08-30 | Intel Corporation | Mechanism for monitoring instruction set based thread execution on a plurality of instruction sequencers |
KR101355496B1 (ko) | 2005-08-29 | 2014-01-28 | 디 인벤션 사이언스 펀드 원, 엘엘씨 | Scheduling mechanism of a hierarchical processor including multiple parallel clusters |
JP2009508247A (ja) | 2005-09-14 | 2009-02-26 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Method and system for bus arbitration |
US7350056B2 (en) | 2005-09-27 | 2008-03-25 | International Business Machines Corporation | Method and apparatus for issuing instructions from an issue queue in an information handling system |
US7606975B1 (en) | 2005-09-28 | 2009-10-20 | Sun Microsystems, Inc. | Trace cache for efficient self-modifying code processing |
US7231106B2 (en) | 2005-09-30 | 2007-06-12 | Lucent Technologies Inc. | Apparatus for directing an optical signal from an input fiber to an output fiber within a high index host |
US7613131B2 (en) | 2005-11-10 | 2009-11-03 | Citrix Systems, Inc. | Overlay network infrastructure |
US7681019B1 (en) | 2005-11-18 | 2010-03-16 | Sun Microsystems, Inc. | Executing functions determined via a collection of operations from translated instructions |
US7861060B1 (en) | 2005-12-15 | 2010-12-28 | Nvidia Corporation | Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior |
US7634637B1 (en) | 2005-12-16 | 2009-12-15 | Nvidia Corporation | Execution of parallel groups of threads with per-instruction serialization |
US7770161B2 (en) | 2005-12-28 | 2010-08-03 | International Business Machines Corporation | Post-register allocation profile directed instruction scheduling |
US8423682B2 (en) | 2005-12-30 | 2013-04-16 | Intel Corporation | Address space emulation |
GB2435362B (en) | 2006-02-20 | 2008-11-26 | Cramer Systems Ltd | Method of configuring devices in a telecommunications network |
JP4332205B2 (ja) | 2006-02-27 | 2009-09-16 | 富士通株式会社 | Cache control device and cache control method |
US7543282B2 (en) | 2006-03-24 | 2009-06-02 | Sun Microsystems, Inc. | Method and apparatus for selectively executing different executable code versions which are optimized in different ways |
EP2477109B1 (en) | 2006-04-12 | 2016-07-13 | Soft Machines, Inc. | Apparatus and method for processing an instruction matrix specifying parallel and dependent operations |
US7610571B2 (en) | 2006-04-14 | 2009-10-27 | Cadence Design Systems, Inc. | Method and system for simulating state retention of an RTL design |
US7577820B1 (en) | 2006-04-14 | 2009-08-18 | Tilera Corporation | Managing data in a parallel processing environment |
CN100485636C (zh) | 2006-04-24 | 2009-05-06 | 华为技术有限公司 | Model-driven debugging method and device for carrier-grade service development |
US7804076B2 (en) | 2006-05-10 | 2010-09-28 | Taiwan Semiconductor Manufacturing Co., Ltd | Insulator for high current ion implanters |
US8145882B1 (en) * | 2006-05-25 | 2012-03-27 | Mips Technologies, Inc. | Apparatus and method for processing template based user defined instructions |
US20080126771A1 (en) | 2006-07-25 | 2008-05-29 | Lei Chen | Branch Target Extension for an Instruction Cache |
CN100495324C (zh) | 2006-07-27 | 2009-06-03 | 中国科学院计算技术研究所 | Depth-first exception handling method in a complex instruction set architecture |
US7904704B2 (en) | 2006-08-14 | 2011-03-08 | Marvell World Trade Ltd. | Instruction dispatching method and apparatus |
US8046775B2 (en) | 2006-08-14 | 2011-10-25 | Marvell World Trade Ltd. | Event-based bandwidth allocation mode switching method and apparatus |
US7539842B2 (en) | 2006-08-15 | 2009-05-26 | International Business Machines Corporation | Computer memory system for selecting memory buses according to physical memory organization information stored in virtual address translation tables |
US7594060B2 (en) | 2006-08-23 | 2009-09-22 | Sun Microsystems, Inc. | Data buffer allocation in a non-blocking data services platform using input/output switching fabric |
US7752474B2 (en) | 2006-09-22 | 2010-07-06 | Apple Inc. | L1 cache flush when processor is entering low power mode |
US7716460B2 (en) | 2006-09-29 | 2010-05-11 | Qualcomm Incorporated | Effective use of a BHT in processor having variable length instruction set execution modes |
US7774549B2 (en) | 2006-10-11 | 2010-08-10 | Mips Technologies, Inc. | Horizontally-shared cache victims in multiple core processors |
TWI337495B (en) | 2006-10-26 | 2011-02-11 | Au Optronics Corp | System and method for operation scheduling |
US7680988B1 (en) | 2006-10-30 | 2010-03-16 | Nvidia Corporation | Single interconnect providing read and write access to a memory shared by concurrent threads |
US7617384B1 (en) | 2006-11-06 | 2009-11-10 | Nvidia Corporation | Structured programming control flow using a disable mask in a SIMD architecture |
EP2523101B1 (en) * | 2006-11-14 | 2014-06-04 | Soft Machines, Inc. | Apparatus and method for processing complex instruction formats in a multi- threaded architecture supporting various context switch modes and virtualization schemes |
US7493475B2 (en) | 2006-11-15 | 2009-02-17 | Stmicroelectronics, Inc. | Instruction vector-mode processing in multi-lane processor by multiplex switch replicating instruction in one lane to select others along with updated operand address |
US7934179B2 (en) | 2006-11-20 | 2011-04-26 | Et International, Inc. | Systems and methods for logic verification |
US20080235500A1 (en) | 2006-11-21 | 2008-09-25 | Davis Gordon T | Structure for instruction cache trace formation |
JP2008130056A (ja) | 2006-11-27 | 2008-06-05 | Renesas Technology Corp | Semiconductor circuit |
WO2008077088A2 (en) | 2006-12-19 | 2008-06-26 | The Board Of Governors For Higher Education, State Of Rhode Island And Providence Plantations | System and method for branch misprediction prediction using complementary branch predictors |
US7783869B2 (en) | 2006-12-19 | 2010-08-24 | Arm Limited | Accessing branch predictions ahead of instruction fetching |
EP1940028B1 (en) | 2006-12-29 | 2012-02-29 | STMicroelectronics Srl | Asynchronous interconnection system for 3D inter-chip communication |
US8321849B2 (en) | 2007-01-26 | 2012-11-27 | Nvidia Corporation | Virtual architecture and instruction set for parallel thread computing |
TW200833002A (en) | 2007-01-31 | 2008-08-01 | Univ Nat Yunlin Sci & Tech | Distributed switching circuit having fairness |
US20080189501A1 (en) | 2007-02-05 | 2008-08-07 | Irish John D | Methods and Apparatus for Issuing Commands on a Bus |
US7685410B2 (en) | 2007-02-13 | 2010-03-23 | Global Foundries Inc. | Redirect recovery cache that receives branch misprediction redirects and caches instructions to be dispatched in response to the redirects |
US7647483B2 (en) | 2007-02-20 | 2010-01-12 | Sony Computer Entertainment Inc. | Multi-threaded parallel processor methods and apparatus |
JP4980751B2 (ja) | 2007-03-02 | 2012-07-18 | 富士通セミコンダクター株式会社 | Data processing device and memory read-active control method |
US8452907B2 (en) | 2007-03-27 | 2013-05-28 | Arm Limited | Data processing apparatus and method for arbitrating access to a shared resource |
US20080250227A1 (en) | 2007-04-04 | 2008-10-09 | Linderman Michael D | General Purpose Multiprocessor Programming Apparatus And Method |
US7716183B2 (en) * | 2007-04-11 | 2010-05-11 | Dot Hill Systems Corporation | Snapshot preserved data cloning |
US7941791B2 (en) | 2007-04-13 | 2011-05-10 | Perry Wang | Programming environment for heterogeneous processor resource integration |
US7769955B2 (en) | 2007-04-27 | 2010-08-03 | Arm Limited | Multiple thread instruction fetch from different cache levels |
US7711935B2 (en) | 2007-04-30 | 2010-05-04 | Netlogic Microsystems, Inc. | Universal branch identifier for invalidation of speculative instructions |
US8555039B2 (en) | 2007-05-03 | 2013-10-08 | Qualcomm Incorporated | System and method for using a local condition code register for accelerating conditional instruction execution in a pipeline processor |
US8219996B1 (en) | 2007-05-09 | 2012-07-10 | Hewlett-Packard Development Company, L.P. | Computer processor with fairness monitor |
CN101344840B (zh) | 2007-07-10 | 2011-08-31 | 苏州简约纳电子有限公司 | Microprocessor and method for executing instructions in a microprocessor |
US7937568B2 (en) | 2007-07-11 | 2011-05-03 | International Business Machines Corporation | Adaptive execution cycle control method for enhanced instruction throughput |
US20090025004A1 (en) | 2007-07-16 | 2009-01-22 | Microsoft Corporation | Scheduling by Growing and Shrinking Resource Allocation |
US8108545B2 (en) | 2007-08-27 | 2012-01-31 | International Business Machines Corporation | Packet coalescing in virtual channels of a data processing system in a multi-tiered full-graph interconnect architecture |
US7711929B2 (en) | 2007-08-30 | 2010-05-04 | International Business Machines Corporation | Method and system for tracking instruction dependency in an out-of-order processor |
US8725991B2 (en) | 2007-09-12 | 2014-05-13 | Qualcomm Incorporated | Register file system and method for pipelined processing |
US8082420B2 (en) | 2007-10-24 | 2011-12-20 | International Business Machines Corporation | Method and apparatus for executing instructions |
US7856530B1 (en) | 2007-10-31 | 2010-12-21 | Network Appliance, Inc. | System and method for implementing a dynamic cache for a data storage system |
US7877559B2 (en) | 2007-11-26 | 2011-01-25 | Globalfoundries Inc. | Mechanism to accelerate removal of store operations from a queue |
US8245232B2 (en) | 2007-11-27 | 2012-08-14 | Microsoft Corporation | Software-configurable and stall-time fair memory access scheduling mechanism for shared memory systems |
US7809925B2 (en) | 2007-12-07 | 2010-10-05 | International Business Machines Corporation | Processing unit incorporating vectorizable execution unit |
US8145844B2 (en) | 2007-12-13 | 2012-03-27 | Arm Limited | Memory controller with write data cache and read data cache |
US7831813B2 (en) | 2007-12-17 | 2010-11-09 | Globalfoundries Inc. | Uses of known good code for implementing processor architectural modifications |
US7870371B2 (en) | 2007-12-17 | 2011-01-11 | Microsoft Corporation | Target-frequency based indirect jump prediction for high-performance processors |
US20090165007A1 (en) | 2007-12-19 | 2009-06-25 | Microsoft Corporation | Task-level thread scheduling and resource allocation |
US8782384B2 (en) | 2007-12-20 | 2014-07-15 | Advanced Micro Devices, Inc. | Branch history with polymorphic indirect branch information |
US7917699B2 (en) | 2007-12-21 | 2011-03-29 | Mips Technologies, Inc. | Apparatus and method for controlling the exclusivity mode of a level-two cache |
US8645965B2 (en) | 2007-12-31 | 2014-02-04 | Intel Corporation | Supporting metered clients with manycore through time-limited partitioning |
US9244855B2 (en) | 2007-12-31 | 2016-01-26 | Intel Corporation | Method, system, and apparatus for page sizing extension |
US7877582B2 (en) | 2008-01-31 | 2011-01-25 | International Business Machines Corporation | Multi-addressable register file |
WO2009101563A1 (en) | 2008-02-11 | 2009-08-20 | Nxp B.V. | Multiprocessing implementing a plurality of virtual processors |
US7987343B2 (en) | 2008-03-19 | 2011-07-26 | International Business Machines Corporation | Processor and method for synchronous load multiple fetching sequence and pipeline stage result tracking to facilitate early address generation interlock bypass |
US7949972B2 (en) | 2008-03-19 | 2011-05-24 | International Business Machines Corporation | Method, system and computer program product for exploiting orthogonal control vectors in timing driven synthesis |
US9513905B2 (en) | 2008-03-28 | 2016-12-06 | Intel Corporation | Vector instructions to enable efficient synchronization and parallel reduction operations |
US8120608B2 (en) | 2008-04-04 | 2012-02-21 | Via Technologies, Inc. | Constant buffering for a computational core of a programmable graphics processing unit |
TWI364703B (en) | 2008-05-26 | 2012-05-21 | Faraday Tech Corp | Processor and early execution method of data load thereof |
US8145880B1 (en) | 2008-07-07 | 2012-03-27 | Ovics | Matrix processor data switch routing systems and methods |
WO2010004474A2 (en) | 2008-07-10 | 2010-01-14 | Rocketic Technologies Ltd | Efficient parallel computation of dependency problems |
JP2010039536A (ja) | 2008-07-31 | 2010-02-18 | Panasonic Corp | Program conversion device, program conversion method, and program conversion program |
US8316435B1 (en) | 2008-08-14 | 2012-11-20 | Juniper Networks, Inc. | Routing device having integrated MPLS-aware firewall with virtual security system support |
US8135942B2 (en) | 2008-08-28 | 2012-03-13 | International Business Machines Corporation | System and method for double-issue instructions using a dependency matrix and a side issue queue |
US7769984B2 (en) | 2008-09-11 | 2010-08-03 | International Business Machines Corporation | Dual-issuance of microprocessor instructions using dual dependency matrices |
US8225048B2 (en) | 2008-10-01 | 2012-07-17 | Hewlett-Packard Development Company, L.P. | Systems and methods for resource access |
US9244732B2 (en) | 2009-08-28 | 2016-01-26 | Vmware, Inc. | Compensating threads for microarchitectural resource contentions by prioritizing scheduling and execution |
US7941616B2 (en) | 2008-10-21 | 2011-05-10 | Microsoft Corporation | System to reduce interference in concurrent programs |
US8423749B2 (en) | 2008-10-22 | 2013-04-16 | International Business Machines Corporation | Sequential processing in network on chip nodes by threads generating message containing payload and pointer for nanokernel to access algorithm to be executed on payload in another node |
GB2464703A (en) | 2008-10-22 | 2010-04-28 | Advanced Risc Mach Ltd | An array of interconnected processors executing a cycle-based program |
WO2010049585A1 (en) | 2008-10-30 | 2010-05-06 | Nokia Corporation | Method and apparatus for interleaving a data block |
US8032678B2 (en) | 2008-11-05 | 2011-10-04 | Mediatek Inc. | Shared resource arbitration |
US7848129B1 (en) | 2008-11-20 | 2010-12-07 | Netlogic Microsystems, Inc. | Dynamically partitioned CAM array |
US8868838B1 (en) | 2008-11-21 | 2014-10-21 | Nvidia Corporation | Multi-class data cache policies |
US8171223B2 (en) | 2008-12-03 | 2012-05-01 | Intel Corporation | Method and system to increase concurrency and control replication in a multi-core cache hierarchy |
US8200949B1 (en) | 2008-12-09 | 2012-06-12 | Nvidia Corporation | Policy based allocation of register file cache to threads in multi-threaded processor |
US8312268B2 (en) | 2008-12-12 | 2012-11-13 | International Business Machines Corporation | Virtual machine |
US8099586B2 (en) | 2008-12-30 | 2012-01-17 | Oracle America, Inc. | Branch misprediction recovery mechanism for microprocessors |
US20100169578A1 (en) | 2008-12-31 | 2010-07-01 | Texas Instruments Incorporated | Cache tag memory |
US20100205603A1 (en) | 2009-02-09 | 2010-08-12 | Unisys Corporation | Scheduling and dispatching tasks in an emulated operating system |
JP5417879B2 (ja) | 2009-02-17 | 2014-02-19 | 富士通セミコンダクター株式会社 | Cache device |
US8505013B2 (en) | 2010-03-12 | 2013-08-06 | Lsi Corporation | Reducing data read latency in a network communications processor architecture |
US8805788B2 (en) * | 2009-05-04 | 2014-08-12 | Moka5, Inc. | Transactional virtual disk with differential snapshots |
US8332854B2 (en) * | 2009-05-19 | 2012-12-11 | Microsoft Corporation | Virtualized thread scheduling for hardware thread optimization based on hardware resource parameter summaries of instruction blocks in execution groups |
US8533437B2 (en) | 2009-06-01 | 2013-09-10 | Via Technologies, Inc. | Guaranteed prefetch instruction |
GB2471067B (en) | 2009-06-12 | 2011-11-30 | Graeme Roy Smith | Shared resource multi-thread array processor |
US9122487B2 (en) | 2009-06-23 | 2015-09-01 | Oracle America, Inc. | System and method for balancing instruction loads between multiple execution units using assignment history |
US8386754B2 (en) | 2009-06-24 | 2013-02-26 | Arm Limited | Renaming wide register source operand with plural short register source operands for select instructions to detect dependency fast with existing mechanism |
CN101582025B (zh) | 2009-06-25 | 2011-05-25 | 浙江大学 | Method for implementing a global register renaming table under a chip multiprocessor architecture |
US8397049B2 (en) | 2009-07-13 | 2013-03-12 | Apple Inc. | TLB prefetching |
US8539486B2 (en) | 2009-07-17 | 2013-09-17 | International Business Machines Corporation | Transactional block conflict resolution based on the determination of executing threads in parallel or in serial mode |
JP5423217B2 (ja) | 2009-08-04 | 2014-02-19 | 富士通株式会社 | Arithmetic processing device, information processing device, and control method of arithmetic processing device |
US8127078B2 (en) | 2009-10-02 | 2012-02-28 | International Business Machines Corporation | High performance unaligned cache access |
US20110082983A1 (en) | 2009-10-06 | 2011-04-07 | Alcatel-Lucent Canada, Inc. | Cpu instruction and data cache corruption prevention system |
US8695002B2 (en) | 2009-10-20 | 2014-04-08 | Lantiq Deutschland Gmbh | Multi-threaded processors and multi-processor systems comprising shared resources |
US8364933B2 (en) | 2009-12-18 | 2013-01-29 | International Business Machines Corporation | Software assisted translation lookaside buffer search mechanism |
JP2011150397A (ja) | 2010-01-19 | 2011-08-04 | Panasonic Corp | Bus arbitration device |
KR101699910B1 (ko) | 2010-03-04 | 2017-01-26 | 삼성전자주식회사 | Reconfigurable processor and control method thereof |
US20120005462A1 (en) | 2010-07-01 | 2012-01-05 | International Business Machines Corporation | Hardware Assist for Optimizing Code During Processing |
US8312258B2 (en) * | 2010-07-22 | 2012-11-13 | Intel Corporation | Providing platform independent memory logic |
CN101916180B (zh) | 2010-08-11 | 2013-05-29 | 中国科学院计算技术研究所 | Method and system for executing register-type instructions in a RISC processor |
US8751745B2 (en) | 2010-08-11 | 2014-06-10 | Advanced Micro Devices, Inc. | Method for concurrent flush of L1 and L2 caches |
US9201801B2 (en) | 2010-09-15 | 2015-12-01 | International Business Machines Corporation | Computing device with asynchronous auxiliary execution unit |
US8756329B2 (en) | 2010-09-15 | 2014-06-17 | Oracle International Corporation | System and method for parallel multiplexing between servers in a cluster |
KR101685247B1 (ko) | 2010-09-17 | 2016-12-09 | 소프트 머신즈, 인크. | Single-cycle multi-branch prediction including a shadow cache for early far branch prediction |
US20120079212A1 (en) | 2010-09-23 | 2012-03-29 | International Business Machines Corporation | Architecture for sharing caches among multiple processes |
TWI541721B (zh) | 2010-10-12 | 2016-07-11 | 軟體機器公司 | Method, system, and microprocessor for enhancing branch prediction efficiency using an instruction sequence buffer |
WO2012051281A2 (en) | 2010-10-12 | 2012-04-19 | Soft Machines, Inc. | An instruction sequence buffer to store branches having reliably predictable instruction sequences |
US8370553B2 (en) | 2010-10-18 | 2013-02-05 | International Business Machines Corporation | Formal verification of random priority-based arbiters using property strengthening and underapproximations |
US9047178B2 (en) | 2010-12-13 | 2015-06-02 | SanDisk Technologies, Inc. | Auto-commit memory synchronization |
US8677355B2 (en) | 2010-12-17 | 2014-03-18 | Microsoft Corporation | Virtual machine branching and parallel execution |
WO2012103245A2 (en) | 2011-01-27 | 2012-08-02 | Soft Machines Inc. | Guest instruction block with near branching and far branching sequence construction to native instruction block |
TWI533129B (zh) | 2011-03-25 | 2016-05-11 | Soft Machines, Inc. | Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines |
US9274793B2 (en) | 2011-03-25 | 2016-03-01 | Soft Machines, Inc. | Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines |
KR101620676B1 (ko) | 2011-03-25 | 2016-05-23 | Soft Machines, Inc. | Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines |
US20120254592A1 (en) | 2011-04-01 | 2012-10-04 | Jesus Corbal San Adrian | Systems, apparatuses, and methods for expanding a memory source into a destination register and compressing a source register into a destination memory location |
US9740494B2 (en) | 2011-04-29 | 2017-08-22 | Arizona Board Of Regents For And On Behalf Of Arizona State University | Low complexity out-of-order issue logic using static circuits |
US8843690B2 (en) | 2011-07-11 | 2014-09-23 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Memory conflicts learning capability |
US8930432B2 (en) | 2011-08-04 | 2015-01-06 | International Business Machines Corporation | Floating point execution unit with fixed point functionality |
US20130046934A1 (en) | 2011-08-15 | 2013-02-21 | Robert Nychka | System caching using heterogenous memories |
US8839025B2 (en) | 2011-09-30 | 2014-09-16 | Oracle International Corporation | Systems and methods for retiring and unretiring cache lines |
KR101842550B1 (ko) | 2011-11-22 | 2018-03-28 | Soft Machines, Inc. | An accelerated code optimizer for a multi-engine microprocessor |
EP2783281B1 (en) | 2011-11-22 | 2020-05-13 | Intel Corporation | A microprocessor accelerated code optimizer |
EP2783282B1 (en) | 2011-11-22 | 2020-06-24 | Intel Corporation | A microprocessor accelerated code optimizer and dependency reordering method |
US8930674B2 (en) | 2012-03-07 | 2015-01-06 | Soft Machines, Inc. | Systems and methods for accessing a unified translation lookaside buffer |
KR20130119285A (ko) | 2012-04-23 | 2013-10-31 | Electronics and Telecommunications Research Institute | Apparatus and method for resource allocation in a cluster computing environment |
US9684601B2 (en) | 2012-05-10 | 2017-06-20 | Arm Limited | Data processing apparatus having cache and translation lookaside buffer |
US9996348B2 (en) | 2012-06-14 | 2018-06-12 | Apple Inc. | Zero cycle load |
US9940247B2 (en) | 2012-06-26 | 2018-04-10 | Advanced Micro Devices, Inc. | Concurrent access to cache dirty bits |
US9229873B2 (en) | 2012-07-30 | 2016-01-05 | Soft Machines, Inc. | Systems and methods for supporting a plurality of load and store accesses of a cache |
US9430410B2 (en) | 2012-07-30 | 2016-08-30 | Soft Machines, Inc. | Systems and methods for supporting a plurality of load accesses of a cache in a single cycle |
US9916253B2 (en) | 2012-07-30 | 2018-03-13 | Intel Corporation | Method and apparatus for supporting a plurality of load accesses of a cache in a single cycle to maintain throughput |
US9740612B2 (en) | 2012-07-30 | 2017-08-22 | Intel Corporation | Systems and methods for maintaining the coherency of a store coalescing cache and a load cache |
US9710399B2 (en) | 2012-07-30 | 2017-07-18 | Intel Corporation | Systems and methods for flushing a cache with modified data |
US9678882B2 (en) | 2012-10-11 | 2017-06-13 | Intel Corporation | Systems and methods for non-blocking implementation of cache flush instructions |
US10037228B2 (en) | 2012-10-25 | 2018-07-31 | Nvidia Corporation | Efficient memory virtualization in multi-threaded processing units |
US9195506B2 (en) | 2012-12-21 | 2015-11-24 | International Business Machines Corporation | Processor provisioning by a middleware processing system for a plurality of logical processor partitions |
US9904625B2 (en) | 2013-03-15 | 2018-02-27 | Intel Corporation | Methods, systems and apparatus for predicting the way of a set associative cache |
US9891924B2 (en) | 2013-03-15 | 2018-02-13 | Intel Corporation | Method for implementing a reduced size register view data structure in a microprocessor |
US9811342B2 (en) | 2013-03-15 | 2017-11-07 | Intel Corporation | Method for performing dual dispatch of blocks and half blocks |
US9886279B2 (en) | 2013-03-15 | 2018-02-06 | Intel Corporation | Method for populating and instruction view data structure by using register template snapshots |
WO2014150971A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for dependency broadcasting through a block organized source view data structure |
US10275255B2 (en) | 2013-03-15 | 2019-04-30 | Intel Corporation | Method for dependency broadcasting through a source organized source view data structure |
EP2972794A4 (en) | 2013-03-15 | 2017-05-03 | Soft Machines, Inc. | A method for executing blocks of instructions using a microprocessor architecture having a register view, source view, instruction view, and a plurality of register templates |
KR102083390B1 (ko) | 2013-03-15 | 2020-03-02 | Intel Corporation | A method for emulating a guest centralized flag architecture by using a native distributed flag architecture |
WO2014150991A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for implementing a reduced size register view data structure in a microprocessor |
EP2972845B1 (en) | 2013-03-15 | 2021-07-07 | Intel Corporation | A method for executing multithreaded instructions grouped onto blocks |
US9632825B2 (en) | 2013-03-15 | 2017-04-25 | Intel Corporation | Method and apparatus for efficient scheduling for asymmetrical execution units |
US9569216B2 (en) | 2013-03-15 | 2017-02-14 | Soft Machines, Inc. | Method for populating a source view data structure by using register template snapshots |
WO2014150806A1 (en) | 2013-03-15 | 2014-09-25 | Soft Machines, Inc. | A method for populating register view data structure by using register template snapshots |
- 2014
- 2014-03-12 WO PCT/US2014/024276, patent WO2014150806A1 (en), status: Application Filing
- 2014-03-14 US US14/213,854, patent US9575762B2 (en), status: Active
- 2014-03-14 TW TW103109509A, patent TWI522909B (zh), status: IP Right Cessation
- 2017
- 2017-01-17 US US15/408,269, patent US10198266B2 (en), status: Active
Also Published As
Publication number | Publication date |
---|---|
US20140281428A1 (en) | 2014-09-18 |
US10198266B2 (en) | 2019-02-05 |
TWI522909B (zh) | 2016-02-21 |
WO2014150806A1 (en) | 2014-09-25 |
US20170123805A1 (en) | 2017-05-04 |
US9575762B2 (en) | 2017-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
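TWI522909B (zh) | | Method for populating a register view data structure by using register template snapshots |
TWI533221B (zh) | | Method, non-transitory computer readable medium, and computer system for dependency broadcasting through a block organized source view data structure |
TWI522913B (zh) | | Method for implementing a reduced size register view data structure in a microprocessor |
TWI522908B (zh) | | Method for executing blocks of instructions using a microprocessor architecture having a register view, source view, instruction view, and a plurality of register templates |
TWI522912B (zh) | | Method for emulating a guest centralized flag architecture by using a native distributed flag architecture |
TWI619077B (zh) | | Method, computer readable medium, and computer system for executing multithreaded instructions grouped into blocks |
US10169045B2 (en) | | Method for dependency broadcasting through a source organized source view data structure |
US10146548B2 (en) | | Method for populating a source view data structure by using register template snapshots |
US9891924B2 (en) | | Method for implementing a reduced size register view data structure in a microprocessor |
US9886279B2 (en) | | Method for populating and instruction view data structure by using register template snapshots |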
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | MM4A | Annulment or lapse of patent due to non-payment of fees | |