TWI522912B - A method for emulating a guest centralized flag architecture by using a native distributed flag architecture - Google Patents


Info

Publication number
TWI522912B
Authority
TW
Taiwan
Prior art keywords
instruction
block
instructions
flag
register
Prior art date
Application number
TW103109493A
Other languages
English (en)
Other versions
TW201504942A (zh)
Inventor
Mohammad Abdallah
Original Assignee
Soft Machines, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Soft Machines, Inc.
Publication of TW201504942A
Application granted
Publication of TWI522912B

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145Instruction analysis, e.g. decoding, instruction word fields
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3854Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3858Result writeback, i.e. updating the architectural state or memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3005Arrangements for executing specific machine instructions to perform operations for flow control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30094Condition code generation, e.g. Carry, Zero flag
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017Runtime instruction translation, e.g. macros
    • G06F9/30174Runtime instruction translation, e.g. macros for non-native instruction set, e.g. Javabyte, legacy code
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3814Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • G06F9/384Register renaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3861Recovery, e.g. branch miss-prediction, exception handling
    • G06F9/3863Recovery, e.g. branch miss-prediction, exception handling using multiple copies of the architectural state, e.g. shadow registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3893Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs

Description

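A method for emulating a guest centralized flag architecture by using a native distributed flag architecture
The present invention relates generally to digital computer systems, and more particularly to a system and method for selecting instructions comprising an instruction sequence.
Processors are required to handle multiple tasks that are either dependent on one another or totally independent. The internal state of such processors usually consists of registers that might hold different values at each particular instant of program execution. At each instant of program execution, the internal state image is called the architecture state of the processor.
When code execution is switched to run another function (e.g., another thread, process, or program), the state of the machine/processor has to be saved so that the new function can utilize the internal registers to build its new state. Once the new function is terminated, its state can be discarded and the state of the previous context is restored and execution resumes. Such a switch process is called a context switch and usually takes tens or hundreds of cycles, especially with modern architectures that employ a large number of registers (e.g., 64, 128, 256) and/or out-of-order execution.
In thread-aware hardware architectures, it is normal for the hardware to support multiple context states for a limited number of hardware-supported threads. In this case, the hardware duplicates all architecture state elements for each supported thread. This eliminates the need for a context switch when executing a new thread. However, this still has multiple drawbacks, namely the area, power, and complexity of duplicating all architecture state elements (i.e., registers) for each additional thread supported in hardware. In addition, if the number of software threads exceeds the number of explicitly supported hardware threads, the context switch must still be performed.
This becomes common as parallelism is needed on a fine-granularity basis requiring a large number of threads. Hardware thread-aware architectures with duplicate context-state hardware storage do not help non-threaded software code and only reduce the number of context switches for software that is threaded. However, those threads are usually constructed for coarse-grain parallelism and result in heavy software overhead for initiating and synchronizing, leaving fine-grain parallelism, such as function calls and parallel loop execution, without efficient threading initiation/auto-generation. Such overheads are accompanied by the difficulty of auto-parallelizing such code using state-of-the-art compiler or user parallelization techniques for software code that is not explicitly or easily parallelized/threaded.
In one embodiment, the present invention is implemented as a method for emulating a guest centralized flag architecture by using a native distributed flag architecture. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.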
R0-R63‧‧‧registers
T0-T4‧‧‧register templates
20‧‧‧block
S1-S8‧‧‧sources
P1-P4‧‧‧ports
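The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements.
FIG. 1 shows an overview diagram of a process for grouping instructions into a block and tracking dependencies among the instructions by using a register template.
FIG. 2 shows an overview diagram of a register view, a source view, and an instruction view in accordance with one embodiment of the present invention.
FIG. 3 shows a diagram that illustrates an exemplary register template and how the source view is populated by information from the register template in accordance with one embodiment of the present invention.
FIG. 4 shows a diagram illustrating a first embodiment for dependency broadcasting within the source view. In this embodiment, each column comprises an instruction block.
FIG. 5 shows a diagram illustrating a second embodiment for dependency broadcasting within the source view.
FIG. 6 shows a diagram illustrating the selection of ready blocks for dispatch starting from the commit pointer, and the broadcasting of the corresponding port assignments, in accordance with one embodiment of the present invention.
FIG. 7 shows an adder tree structure that is used to implement the selector array described in FIG. 6 in accordance with one embodiment of the present invention.
FIG. 8 shows exemplary logic of a selector array adder tree in greater detail.
FIG. 9 shows a parallel implementation of the adder tree for implementing a selector array in accordance with one embodiment of the present invention.
FIG. 10 shows an exemplary diagram illustrating how adder X from FIG. 9 can be implemented by using carry save adders in accordance with one embodiment of the present invention.
FIG. 11 shows a masking embodiment for masking ready bits for scheduling starting from the commit pointer and using the selector array adders in accordance with the present invention.
FIG. 12 shows an overview diagram of how register view entries are populated by register templates in accordance with one embodiment of the present invention.
FIG. 13 shows a first embodiment for reducing the register view footprint in accordance with one embodiment of the present invention.
FIG. 14 shows a second embodiment for reducing the register view footprint in accordance with one embodiment of the present invention.
FIG. 15 shows an exemplary format of the delta between snapshots in accordance with one embodiment of the present invention.
FIG. 16 shows a diagram of a process for creating register template snapshots upon allocation of blocks of instructions in accordance with one embodiment of the present invention.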
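FIG. 17 shows another diagram of a process for creating register template snapshots upon allocation of blocks of instructions in accordance with one embodiment of the present invention.
FIG. 18 shows an overview diagram of hardware for implementing the serial implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention.
FIG. 19 shows an overview diagram of hardware for implementing a parallel implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention.
FIG. 20 shows an overview diagram of the hardware for instruction block-based execution and how it works with the source view, the instruction view, the register templates, and the register view in accordance with one embodiment of the present invention.
FIG. 21 shows an example of a chunking architecture in accordance with one embodiment of the present invention.
FIG. 22 shows a depiction of how threads are allocated in accordance with their block numbers and thread identifiers (IDs) in accordance with one embodiment of the present invention.
FIG. 23 shows an implementation of a scheduler using thread pointer maps that point to physical storage locations in order to manage multithreaded execution in accordance with one embodiment of the present invention.
FIG. 24 shows another implementation of a scheduler using thread-based pointer maps in accordance with one embodiment of the present invention.
FIG. 25 shows a diagram of a dynamic calendar-based allocation of execution resources to threads in accordance with one embodiment of the present invention.
FIG. 26 diagrams a dual dispatch process in accordance with one embodiment of the present invention.
FIG. 27 diagrams a dual dispatch transient multiply-accumulate in accordance with one embodiment of the present invention.
FIG. 28 diagrams a dual dispatch architecturally visible state multiply-add in accordance with one embodiment of the present invention.
FIG. 29 shows an overview diagram of the fetching and formation of instruction blocks for execution on grouped execution units in accordance with one embodiment of the present invention.
FIG. 30 shows an exemplary diagram of instruction grouping in accordance with one embodiment of the present invention. In the FIG. 30 embodiment, two instructions are shown with a third auxiliary operation.
FIG. 31 shows how half block pairs within a block stack map onto the execution block units in accordance with one embodiment of the present invention.
FIG. 32 shows a diagram depicting intermediate block results storage as a first level register file in accordance with one embodiment of the present invention.
FIG. 33 shows an odd/even ports scheduler in accordance with one embodiment of the present invention.
FIG. 34 shows a more detailed version of FIG. 33 in which four execution units are shown receiving results from the scheduler array and writing outputs to a temporary register file segment.
FIG. 35 shows a diagram depicting guest flag architecture emulation in accordance with one embodiment of the present invention.
FIG. 36 shows a diagram illustrating the front end of the machine, the scheduler and the execution units, and a centralized flag register in accordance with one embodiment of the present invention.
FIG. 37 shows a diagram of a centralized flag register emulation process as implemented by embodiments of the present invention.
FIG. 38 shows a flowchart of the steps of a process 3800 of emulating centralized flag register behavior in a guest setting.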
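Although the present invention has been described in connection with one embodiment, the invention is not intended to be limited to the specific forms set forth herein. On the contrary, it is intended to cover such alternatives, modifications, and equivalents as can be reasonably included within the scope of the invention as defined by the appended claims.
In the following detailed description, numerous specific details such as specific method orders, structures, elements, and connections have been set forth. It is to be understood, however, that these and other specific details need not be utilized to practice embodiments of the present invention. In other circumstances, well-known structures, elements, or connections have been omitted, or have not been described in particular detail, in order to avoid unnecessarily obscuring this description.
References within the specification to "one embodiment" or "an embodiment" are intended to indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. The appearances of the phrase "in one embodiment" in various places within the specification are not necessarily all referring to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
Some portions of the detailed description that follows are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer-executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals of a computer-readable storage medium and are capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as "processing," "accessing," "writing," "storing," "replicating," or the like refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer-readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission, or display devices.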
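FIG. 1 shows an overview diagram of a process for grouping instructions into a block and tracking dependencies among the instructions by using a register template.
FIG. 1 shows an instruction block having a header and a body. The block is created from a group of instructions. The block comprises an entity that encapsulates the group of instructions. In the present embodiment of the microprocessor, the level of abstraction is raised to blocks instead of individual instructions. Blocks are processed for dispatch instead of individual instructions. Each block is labeled with a block number. The machine's out-of-order management job is thereby greatly simplified. One key feature is to find a way to manage a larger number of instructions being processed without greatly increasing the management overhead of the machine.
Embodiments of the present invention achieve this objective by implementing instruction blocks, register templates, and inheritance vectors. In the block shown in FIG. 1, the header of the block lists and encapsulates all the sources and destinations of the instructions of the block and where those sources come from (e.g., from which blocks). The header includes the destinations that update the register template. The sources included in the header will be concatenated with the block numbers stored in the register template.
The number of instructions that are processed out of order determines the management complexity of the out-of-order machine. More out-of-order instructions lead to greater complexity. Sources need to be compared against the destinations of prior instructions in the out-of-order dispatch window of the processor.
As shown in FIG. 1, the register template has fields for each register from R0 to R63. Blocks write their respective block numbers into the register template fields that correspond to the block destinations. Each block reads the register fields that represent its register sources from the register template. When a block retires and writes its destination register contents into the register file, its number is erased from the register template. This means that those registers can be read as sources from the register file itself.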
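The register template behavior described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class name `RegisterTemplate`, its methods, and the use of `None` to mean "read from the register file" are hypothetical and do not come from the patent.

```python
NUM_REGS = 64  # fields for R0..R63, as in FIG. 1


class RegisterTemplate:
    """Hypothetical model: one field per register holding the latest writer block."""

    def __init__(self):
        # None means no in-flight writer: the source is read from the register file.
        self.fields = [None] * NUM_REGS

    def allocate(self, block_no, sources, destinations):
        """Read the producers of the sources, then record this block's destinations."""
        deps = {r: self.fields[r] for r in sources}
        for r in destinations:
            self.fields[r] = block_no
        return deps

    def retire(self, block_no):
        """Erase the block number wherever this block is still the latest writer."""
        for r, owner in enumerate(self.fields):
            if owner == block_no:
                self.fields[r] = None


template = RegisterTemplate()
template.allocate(20, sources=[1, 2], destinations=[3])   # block 20 writes R3
deps_21 = template.allocate(21, sources=[3], destinations=[4])
```

Here `deps_21` records that block 21 depends on block 20 for R3; after `template.retire(20)`, a later reader of R3 would see `None` and take the value from the register file.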
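In the present embodiment, the register template is updated each cycle of the machine whenever a block is allocated. As new template updates are generated, prior snapshots of the register templates are stored into an array (e.g., the register view shown in FIG. 2), one per block. This information is retained until the corresponding block is retired. This allows the machine to recover from mispredictions and flush very quickly (e.g., by obtaining the last known dependency state).
In one embodiment, the register templates stored in the register view can be compressed (thereby saving storage space) by storing only the delta between successive snapshots (the incremental changes between snapshots). In this manner the machine obtains a shrunk register view. Further compression can be obtained by only storing templates for blocks that have a branch instruction.
If a recovery point is needed other than a branch misprediction, then a recovery is first obtained at the branch recovery point, and state can then be rebuilt by allocating instructions (but not executing them) until the machine reaches the sought-after recovery point.
It should be noted that in one embodiment, the term "register template" as used herein is synonymous with the term "inheritance vector" as described in U.S. Patent Application No. 13/428,440, which is incorporated herein in its entirety by reference.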
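FIG. 2 shows an overview diagram of a register view, a source view, and an instruction view in accordance with one embodiment of the present invention. This figure shows one embodiment of a scheduler architecture (e.g., having a source view, instruction view, register view, etc.). Other implementations of a scheduler architecture that achieve the same functionality by combining or splitting one or more of the above-cited structures are possible.
FIG. 2 diagrams the functional entities supporting the operation of the register templates and retention of the machine state. The left-hand side of FIG. 2 shows register templates T0 through T4, with the arrows indicating the inheritance of information from one register template/inheritance vector to the next. The register view, source view, and instruction view each comprise data structures for storing information relating to the blocks of instructions. FIG. 2 also shows an exemplary instruction block having a header and how the instruction block includes both sources and destinations for the registers of the machine. Information about the registers referred to by the blocks is stored in the register view data structure. Information about the sources referred to by the blocks is stored in the source view data structure. Information about the instructions themselves referred to by the blocks is stored in the instruction view data structure. The register templates/inheritance vectors themselves comprise data structures storing the dependency and inheritance information referred to by the blocks.
FIG. 3 shows a diagram that illustrates an exemplary register template and how the source view is populated by information from the register template in accordance with one embodiment of the present invention.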
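In the present embodiment, it should be noted that the goal of the source view is to determine when particular blocks can be dispatched. When a block is dispatched, it broadcasts its block number to all remaining blocks. Any matches for sources of the other blocks (e.g., a compare) cause a ready bit (e.g., or some other type of indicator) to be set. When all ready bits are set (e.g., AND gate), the block is ready to be dispatched. Blocks are dispatched based on the readiness of other blocks they depend on.
When multiple blocks are ready for dispatch, the oldest block is chosen for dispatch ahead of younger blocks. For example, in one embodiment a find-first circuit can be used to find the oldest block based on proximity to a commit pointer and subsequent blocks based on relative proximity to the commit pointer (e.g., working on each block's ready bit).
Referring still to FIG. 3, in this example, the register template snapshot created at the arrival of block 20 is being examined. As described above, the register template has fields for each register from R0 to R63. Blocks write their respective block numbers into the register template fields that correspond to the block destinations. Each block reads the register fields that represent its register sources from the register template. The first number is the block that wrote to the register and the second number is the destination number of that block.
For example, when block 20 arrives, it reads the snapshot of the register template and looks up its own register sources in the register template to determine the latest block that wrote to each of its sources, and populates the source view according to the updates that its destinations make to the previous register template snapshot. Subsequent blocks will update the register template with their own destinations. This is shown in the bottom left of FIG. 3, where block 20 populates its sources: source 1, source 2, source 3, all the way to source 8.
FIG. 4 shows a diagram illustrating a first embodiment for dependency broadcasting within the source view. In this embodiment, each column comprises an instruction block. When a block is allocated, it marks (e.g., by writing 0) in all the block columns wherever its sources have dependencies on those blocks. When any other block is dispatched, its number is broadcast across the exact column that relates to that block. It should be noted that writing a 1 is the default value, indicating that there is no dependency on that block.
When all ready bits in a block are ready, that block is dispatched and its number is broadcast back to all the remaining blocks. The block number is compared against all the numbers stored in the sources of the other blocks. If there is a match, the ready bit for that source is set. For example, if the block number broadcast on source 1 equals 11, then the ready bit for source 1 of block 20 will be set.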
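The broadcast-and-compare mechanism can be modeled as follows. This is a minimal software sketch, not the hardware organization of FIG. 4 or FIG. 5; the class `SourceViewEntry` and its fields are hypothetical names introduced for illustration.

```python
class SourceViewEntry:
    """Hypothetical source view entry: one producer block number per source."""

    def __init__(self, block_no, source_producers):
        self.block_no = block_no
        self.producers = list(source_producers)
        # A source with no in-flight producer (None) is ready from the start.
        self.ready_bits = [p is None for p in self.producers]

    def on_broadcast(self, dispatched_block):
        """Compare the broadcast block number against every stored source."""
        for i, producer in enumerate(self.producers):
            if producer == dispatched_block:
                self.ready_bits[i] = True

    def is_ready(self):
        """AND of all ready bits: the block can be dispatched."""
        return all(self.ready_bits)


entry = SourceViewEntry(block_no=20, source_producers=[11, None, 15])
entry.on_broadcast(11)   # block 11 dispatches: source 1 of block 20 becomes ready
```

After the broadcast of block 11, the entry still waits on block 15; a second call `entry.on_broadcast(15)` would make `entry.is_ready()` true.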
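FIG. 5 shows a diagram illustrating a second embodiment for dependency broadcasting within the source view. This embodiment is organized by sources as opposed to being organized by blocks. This is shown by the sources S1 through S8 across the source view data structure. In a manner similar to that described with FIG. 4 above, in the FIG. 5 embodiment, when all ready bits in a block are ready, that block is dispatched and its number is broadcast back to all the remaining blocks. The block number is compared against all the numbers stored in the sources of the other blocks. If there is a match, the ready bit for that source is set. For example, if the block number broadcast on source 1 equals 11, then the ready bit for source 1 of block 20 will be set.
The FIG. 5 embodiment also shows how the compares are only enabled on the blocks between the commit pointer and the allocate pointer. All other blocks are invalid.
FIG. 6 shows a diagram illustrating the selection of ready blocks for dispatch starting from the commit pointer, and the broadcasting of the corresponding port assignments, in accordance with one embodiment of the present invention. The source view data structure is shown on the left-hand side of FIG. 6. The instruction view data structure is shown on the right-hand side of FIG. 6. A selector array is shown between the source view and the instruction view. In this embodiment, the selector array dispatches four blocks per cycle via the four dispatch ports P1 through P4.
As described above, blocks are selected for dispatch from the commit pointer, wrapping around to the allocate pointer (e.g., trying to honor dispatching older blocks first). The selector array is used to find the first four ready blocks starting from the commit pointer. It is desired to dispatch the oldest ready blocks. In one embodiment, the selector array can be implemented by using an adder tree structure. This will be described in FIG. 7 below.
FIG. 6 also shows how the selector array is coupled to each of the four ports that pass through the entries in the instruction view. In this embodiment, the port couplings act as port enables, enabling one of the four ports to activate and the corresponding instruction view entry to pass down to the dispatch port and on to the execution units. Additionally, as described above, dispatched blocks are broadcast back through the source view. The block numbers of the blocks selected for dispatch are broadcast back (up to four). This is shown on the far right-hand side of FIG. 6.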
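FIG. 7 shows an adder tree structure that is used to implement the selector array described in FIG. 6 in accordance with one embodiment of the present invention. The depicted adder tree implements the functionality of the selector array. The adder tree picks the first four ready blocks and mounts them to the four available ports for dispatch (e.g., read port 1 through read port 4). No arbitration is used. The actual logic that is used to specifically enable a specific port is explicitly shown in entry number 1. For the sake of clarity, the logic is not specifically shown in the other entries. In this manner, FIG. 7 shows one specific embodiment of how the direct selection of each particular port for block dispatch is implemented. It should be noted, however, that alternatively an embodiment that uses priority encoders can be implemented.
FIG. 8 shows exemplary logic of a selector array adder tree in greater detail. In the FIG. 8 embodiment, logic is shown for a range exceed bit. The range exceed bit ensures that no more than four blocks will be selected for dispatch: if a fifth block is ready while the first four are also ready, the range exceed bit will not allow the fifth block to be dispatched. It should be noted that, in the serial implementation, the sum bits S0 to S3 are both used to enable the dispatch port and propagated to the next adder stage.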
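The serial selection described for FIGS. 6 through 8 can be approximated in software as a running sum over the ready bits, starting at the commit pointer. The sketch below is an assumption-laden model, not the adder tree itself: the function name `select_for_dispatch`, the tuple output, and the convention that `commit_ptr == alloc_ptr` means an empty window are all hypothetical.

```python
def select_for_dispatch(ready_bits, commit_ptr, alloc_ptr, num_ports=4):
    """Return (entry index, port number) pairs for the oldest ready blocks.

    Walks the circular scheduler array from the commit pointer toward the
    allocate pointer. The running sum plays the role of the sum bits; the
    comparison against num_ports plays the role of the range exceed bit.
    Assumption: commit_ptr == alloc_ptr is treated as an empty window.
    """
    n = len(ready_bits)
    selected = []
    running_sum = 0
    i = commit_ptr
    while i != alloc_ptr:
        if ready_bits[i]:
            running_sum += 1
            if running_sum > num_ports:   # range exceed: no fifth block
                break
            selected.append((i, running_sum))
        i = (i + 1) % n
    return selected


ready = [1, 0, 1, 1, 1, 1, 0, 1]
picks = select_for_dispatch(ready, commit_ptr=6, alloc_ptr=5)
```

With the commit pointer at entry 6, the walk wraps around the end of the array, so entry 7 is the oldest ready block and receives port 1, followed by entries 0, 2, and 3; entry 4 is ready but is held back by the range check.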
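FIG. 9 shows a parallel implementation of the adder tree for implementing a selector array in accordance with one embodiment of the present invention. The parallel implementation does not forward the sum from each adder to the next. In the parallel implementation, each adder uses all its necessary inputs directly through a multi-input addition implementation, such as multi-input carry save adder trees. For example, the adder "X" sums all of the previous inputs. This parallel implementation is desirable in order to achieve faster compute times (e.g., single cycle).
FIG. 10 shows an exemplary diagram illustrating how adder X from FIG. 9 can be implemented by using carry save adders in accordance with one embodiment of the present invention. FIG. 10 shows a structure that can add 32 inputs in a single cycle. The structure is put together using 4-by-2 carry save adders.
FIG. 11 shows a masking embodiment for masking ready bits for scheduling starting from the commit pointer and using the selector array adders in accordance with the present invention. In this implementation, the selector array adders are trying to select the first four ready blocks to dispatch starting from the commit pointer, potentially wrapping around to the allocate pointer. In this implementation, multi-input parallel adders are used. Additionally, in this implementation a source of these circular buffers is utilized.
FIG. 11 shows how the ready bits are ANDed together with each of the two masks (individually or separately) and applied to the two adder trees in parallel. The first four are selected by using the two adder trees and comparing against the threshold of four. The "X" marks denote "exclude from the selection array for that adder tree"; thus the "X" value is zero. On the other hand, the "Y" marks denote "do include in the selection array for that adder tree"; thus the "Y" value is one.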
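One way to read the two-mask scheme is that one mask covers the entries from the commit pointer to the end of the array and the other covers the wrapped entries from the start of the array to the allocate pointer. The sketch below follows that reading; it is an interpretation, and the function name `select_with_masks` and the way the second count is offset by the first tree's total are assumptions rather than details given in the text.

```python
def select_with_masks(ready, commit_ptr, alloc_ptr, limit=4):
    """Pick up to `limit` ready entries using two masks over a circular array."""
    n = len(ready)
    if commit_ptr <= alloc_ptr:
        # No wrap-around: the second mask excludes everything ("X" = 0).
        mask_a = [commit_ptr <= i < alloc_ptr for i in range(n)]
        mask_b = [False] * n
    else:
        mask_a = [i >= commit_ptr for i in range(n)]   # older part of the window
        mask_b = [i < alloc_ptr for i in range(n)]     # wrapped, younger part
    tree_a = [bool(r) and m for r, m in zip(ready, mask_a)]
    tree_b = [bool(r) and m for r, m in zip(ready, mask_b)]

    picked = []
    count = 0
    for i in range(n):                 # first adder tree: counts from zero
        if tree_a[i]:
            count += 1
            if count <= limit:
                picked.append(i)
    count = sum(tree_a)                # second tree continues after the first
    for i in range(n):
        if tree_b[i]:
            count += 1
            if count <= limit:
                picked.append(i)
    return picked


chosen = select_with_masks([1, 0, 1, 1, 1, 1, 0, 1], commit_ptr=6, alloc_ptr=5)
```

In hardware the two counts would be formed in parallel; the sequential loops here only make the threshold comparison explicit.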
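FIG. 12 shows an overview diagram of how register view entries are populated by register templates in accordance with one embodiment of the present invention.
As described above, register view entries are populated by register templates. The register view stores snapshots of register templates for each block in sequence. When a speculation is not valid (e.g., a branch misprediction), the register view has a latest valid snapshot before the invalid speculation point. The machine can roll back its state to the last valid snapshot by reading that register view entry and loading it into the base of the register template. Each entry of the register view shows all of the register inheritance states. For example, in the FIG. 12 embodiment, if the register view for block F is invalid, the machine state can be rolled back to an earlier last valid register template snapshot.
FIG. 13 shows a first embodiment for reducing the register view footprint in accordance with one embodiment of the present invention. The amount of memory needed to store the register view entries can be reduced by only storing those register view template snapshots that contain branch instructions. When an exception occurs (e.g., a speculation is not valid, a branch misprediction, etc.), the last valid snapshot can be rebuilt from the branch instruction that occurred prior to the exception. Instructions are fetched from the branch prior to the exception down to the exception in order to build the last valid snapshot. The instructions are fetched but they are not executed. As shown in FIG. 13, only those snapshots that include branch instructions are saved in the reduced register view. This greatly reduces the amount of memory needed to store the register template snapshots.
FIG. 14 shows a second embodiment for reducing the register view footprint in accordance with one embodiment of the present invention. The amount of memory needed to store the register view entries can be reduced by only storing a sequential subset of the snapshots (e.g., one out of every four snapshots). The change between successive snapshots can be stored as a "delta" from an original snapshot using a comparatively smaller amount of memory than full successive snapshots. When an exception occurs (e.g., a speculation is not valid, a branch misprediction, etc.), the last valid snapshot can be rebuilt from the original snapshot that occurred prior to the exception. The "delta" from the original snapshot that occurred prior to the exception and the successive snapshots are used to rebuild the last valid snapshot. The initial original state can accumulate deltas to arrive at the state of the required snapshot.
FIG. 15 shows an exemplary format of the delta between snapshots in accordance with one embodiment of the present invention. FIG. 15 shows an original snapshot and two deltas. In one delta, R5 and R6 are the only registers being updated by B3. The rest of the entries are not changed. In another delta, R1 and R7 are the only registers being updated by B2. The rest of the entries are not changed.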
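The delta scheme of FIGS. 14 and 15 can be illustrated with dictionaries standing in for snapshots. This is a sketch only: the function names `make_delta` and `rebuild_snapshot`, the dictionary representation, and the starting values in `original` are assumptions; the two deltas mirror the FIG. 15 description (B2 updates R1 and R7, B3 updates R5 and R6), applied here in block-number order.

```python
def make_delta(previous, current):
    """Keep only the register fields that changed between two snapshots."""
    return {reg: blk for reg, blk in current.items() if previous.get(reg) != blk}


def rebuild_snapshot(original, deltas):
    """Accumulate deltas, oldest first, on top of a stored original snapshot."""
    snapshot = dict(original)
    for delta in deltas:
        snapshot.update(delta)
    return snapshot


# Hypothetical original snapshot; only a few registers are shown.
original = {"R1": "B0", "R5": "B1", "R6": "B1", "R7": "B0"}
delta_b2 = {"R1": "B2", "R7": "B2"}   # B2 updates only R1 and R7
delta_b3 = {"R5": "B3", "R6": "B3"}   # B3 updates only R5 and R6

latest = rebuild_snapshot(original, [delta_b2, delta_b3])
```

Storing `delta_b2` and `delta_b3` instead of two full 64-field snapshots is where the memory saving comes from; the cost is the accumulation step at recovery time.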
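FIG. 16 shows a diagram of a process for creating register template snapshots upon allocation of blocks of instructions in accordance with one embodiment of the present invention. In this embodiment, the left-hand side of FIG. 16 shows two de-multiplexers, and at the top of FIG. 16 is a snapshot register template. FIG. 16 shows a diagram for creating a subsequent register template from a previous register template (e.g., a serial implementation).
This serial implementation shows how register template snapshots are created upon allocation of blocks of instructions. Those snapshots serve to capture the latest register architectural state update that is used for dependency tracking (e.g., as described in FIGS. 1 through 4) as well as for updating the register view for handling mispredictions/exceptions (e.g., as described in FIGS. 12 through 15).
The de-multiplexer functions by selecting which incoming source is passed on. For example, register R2 will de-multiplex to a 1 at the second output, while R8 will de-multiplex to a 1 at the seventh output, and so on.
FIG. 17 shows another diagram of a process for creating register template snapshots upon allocation of blocks of instructions in accordance with one embodiment of the present invention. The FIG. 17 embodiment also shows creating a subsequent register template from a previous register template. The FIG. 17 embodiment also shows an example of register template block inheritance. This figure shows an example of how the register template is updated from allocated block numbers. For example, block Bf updates R2, R8, and R10. Bg updates R1 and R9. The dotted arrows indicate that the values are inherited from the prior snapshot. This process proceeds all the way down to block Bi. Thus, for example, since no snapshot updated register R7, its original value Bb will have propagated all the way down.
FIG. 18 shows an overview diagram of hardware for implementing the serial implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention. The de-multiplexer is used to control a series of two-input multiplexers that determine which of two block numbers will be propagated down to the next stage. It can be either the block number from the previous stage or the current block number.
FIG. 19 shows an overview diagram of hardware for implementing a parallel implementation of creating a subsequent register template from a previous register template in accordance with one embodiment of the present invention. This parallel implementation uses special encoded multiplexer controls to create a subsequent register template from a previous register template.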
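FIG. 20 shows an overview diagram of the hardware for instruction block-based execution and how it works with the source view, the instruction view, the register templates, and the register view in accordance with one embodiment of the present invention.
In this implementation, the allocator scheduler in the dispatcher receives instructions fetched by the machine's front end. These instructions go through block formation in the manner described earlier. As described earlier, the blocks yield register templates and these register templates are used to populate the register view. From the source view the sources are transferred to the register file hierarchy, and there are broadcasts back to the source view in the manner described above. The instruction view transfers instructions to the execution units. The instructions are executed by the execution units as the sources needed by the instructions come from the register file hierarchy. These executed instructions are then transferred out of the execution unit and back into the register file hierarchy.
FIG. 21 shows an example of a chunking architecture in accordance with one embodiment of the present invention. The importance of chunking is that it reduces the number of write ports into each scheduler entry from four to one by using the four multiplexers shown, while still densely packing all the entries without forming bubbles.
The importance of chunking can be seen in the following example (e.g., noting that allocation of blocks in each cycle starts at the top position, in this case B0). Assume that in cycle 1, three blocks of instructions are to be allocated to the scheduler entries (e.g., the three blocks will occupy the first three entries in the scheduler). In the next cycle (e.g., cycle 2), another two blocks of instructions are to be allocated. In order to avoid creating bubbles in the scheduler array entries, the scheduler array entries would have to be built with support for four write ports. This is expensive in terms of power consumption, timing, area, and the like. The chunking structure above simplifies all scheduler arrays to have only one write port by using the multiplexing structure before allocating to the arrays. In the above example, B0 in cycle 2 will be selected by the last multiplexer, while B1 in cycle 2 will be selected by the first multiplexer (e.g., going from left to right).
In this manner, each chunk of entries needs only one write port per entry and four read ports per entry. There is a trade-off in cost because the multiplexers must be implemented; however, that cost is made up many times over in the savings from not having to implement four write ports per entry, as there can be very many entries.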
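The multiplexer selection in the chunking example can be modeled as a rotation of the incoming blocks by the next free position. This is a sketch under the assumption that scheduler entry `e` is written through multiplexer `e % 4`; the function name `chunk_allocate` and that mapping are hypothetical and are chosen only because they reproduce the cycle 1 / cycle 2 example above.

```python
def chunk_allocate(next_free, incoming, num_muxes=4):
    """Route up to num_muxes incoming blocks so each multiplexer writes at most once.

    Returns the block chosen by each multiplexer (None if idle) and the new
    next free entry position. Entries are filled densely, with no bubbles.
    """
    assert len(incoming) <= num_muxes
    mux_select = [None] * num_muxes
    for j, block in enumerate(incoming):
        mux_select[(next_free + j) % num_muxes] = block
    return mux_select, next_free + len(incoming)


# Cycle 1: three blocks occupy the first three entries.
cycle1, next_free = chunk_allocate(0, ["B0", "B1", "B2"])
# Cycle 2: two more blocks; B0 goes through the last mux, B1 through the first.
cycle2, next_free = chunk_allocate(next_free, ["B0", "B1"])
```

Because consecutive entries always map to distinct multiplexers, no entry ever needs more than one write port, which is the saving the text describes.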
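FIG. 21 also shows an intermediate allocation buffer. If the scheduler arrays cannot accept all the chunks sent to them, then the chunks can be stored temporarily in the intermediate allocation buffer. When the scheduler arrays have free space, the chunks will be transferred from the intermediate allocation buffer to the scheduler arrays.
FIG. 22 shows a depiction of how threads are allocated in accordance with their block numbers and thread IDs in accordance with one embodiment of the present invention. Blocks are allocated to the scheduler array via a chunking implementation as described above. Each of the thread blocks maintains a sequential order among themselves using the block number. The blocks from different threads can be interleaved (e.g., blocks for thread Th1 and blocks for thread Th2 are interleaved in the scheduler array). In this manner, blocks from different threads are present within the scheduler array.
FIG. 23 shows an implementation of a scheduler using thread pointer maps that point to physical storage locations in order to manage multithreaded execution in accordance with one embodiment of the present invention. In this embodiment, management of the threads is implemented through the control of the thread maps. For example, here FIG. 23 shows a thread 1 map and a thread 2 map. The maps track the location of the blocks of the individual thread. The entries in the maps are allocated to the blocks belonging to that thread. In this implementation, each thread has an allocation counter that counts for both threads. The overall count cannot exceed N divided by 2 (e.g., exceeding the space available). The allocation counters have adjustable thresholds in order to implement fairness in the allocation of the total entries from the pool. The allocation counters can prevent one thread from using all of the available space.
FIG. 24 shows another implementation of a scheduler using thread-based pointer maps in accordance with one embodiment of the present invention. FIG. 24 shows a relationship between the commit pointer and the allocate pointer. As shown, each thread has a commit pointer and an allocate pointer. The arrow shows how the allocate pointer for thread 2 can wrap around the physical storage allocating blocks B1 and B2, but it cannot allocate block B9 until the commit pointer for thread 2 moves down. This is shown by the position of the commit pointer of thread 2 and the strikethrough. The right-hand side of FIG. 24 shows a relationship between the allocation of blocks and the commit pointer as it moves around counterclockwise.
FIG. 25 shows a diagram of a dynamic calendar-based allocation of execution resources to threads in accordance with one embodiment of the present invention. Fairness can be dynamically controlled using the allocation counters based on the forward progress of each thread. If both threads are making substantial forward progress, then both allocation counters are set to the same threshold (e.g., 9). However, if one thread makes slow forward progress, such as suffering from an L2 cache miss or similar events, then the ratio of the threshold counters can be adjusted in favor of the thread that is still making substantial forward progress. If one thread is stalled or suspended (e.g., is in a wait or spin state waiting on an operating system (OS) or input/output (IO) response), the ratio can be completely adjusted to the other thread, with the exception of a single return entry that is reserved for the suspended thread to signal the release of the wait state.
In one embodiment, the process starts off with a ratio of 50%:50%. Upon detection of the L2 cache miss on block 22, the front end of the pipeline stalls any further fetch into the pipeline or allocation into the scheduler of thread 2 blocks. Upon retirement of thread 2 blocks from the scheduler, those entries will be made available for thread 1 allocation until the point where the new dynamic ratio of thread allocation is achieved. For example, 3 of the recently retired thread 2 blocks will be returned to the pool for allocation to thread 1 instead of thread 2, making the thread 1 to thread 2 ratio 75%:25%.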
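The threshold adjustment can be sketched as two counters whose limits are rebalanced as entries retire. The model below is illustrative only: the class `AllocationCounters`, its methods, and the pool size of 12 entries are assumptions (12 is chosen so that moving 3 entries from a 6:6 split yields the 75%:25% ratio mentioned above), and the `reserve` parameter models the single return entry kept for a stalled thread.

```python
class AllocationCounters:
    """Hypothetical per-thread allocation thresholds over a shared entry pool."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        half = pool_size // 2
        self.threshold = {1: half, 2: half}   # start at 50%:50%

    def shift(self, from_thread, to_thread, entries, reserve=1):
        """Move retired entries to the other thread, keeping `reserve` behind."""
        movable = max(0, min(entries, self.threshold[from_thread] - reserve))
        self.threshold[from_thread] -= movable
        self.threshold[to_thread] += movable
        return movable

    def share(self, thread):
        """Percentage of the pool currently assigned to a thread."""
        return 100.0 * self.threshold[thread] / self.pool_size


counters = AllocationCounters(pool_size=12)
counters.shift(from_thread=2, to_thread=1, entries=3)   # thread 2 hit an L2 miss
```

After the shift, thread 1 holds 9 of the 12 entries and thread 2 holds 3. A further `shift` asking for more entries than thread 2 can give up would stop once only the reserved entry remains.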
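It should be noted that a stall of thread 2 blocks in the front of the pipeline might require flushing those blocks from the front of the pipeline if there is no hardware mechanism to bypass them (e.g., by thread 1 blocks passing the stalled thread 2 blocks).
FIG. 26 diagrams a dual dispatch process in accordance with one embodiment of the present invention. Multi-dispatch generally encompasses dispatching a block (having multiple instructions within) multiple times such that different instructions of the block can execute on each pass through the execution units. One example would be a dispatch of an address calculation instruction followed by a subsequent dispatch that consumes the resulting data. Another example would be a floating point operation, where the first part is executed as a fixed point operation and the second part is executed to complete the operation by performing rounding, flag generation/calculation, exponent adjustment, or the like. Blocks are allocated, committed, and retired atomically as a single entity.
A main benefit of multi-dispatch is that it avoids allocating multiple separate blocks into the machine window, thereby making the machine window effectively larger. A larger machine window means more opportunities for optimization and reordering.
Looking at the bottom left of FIG. 26, there is an instruction block depicted. This block cannot be dispatched in a single cycle because there is latency between the load address calculation and the load returning data from the caches/memory. So this block is first dispatched with its intermediate result being held as a transient state (its result is being delivered on the fly to the second dispatch without being visible to the architectural state). The first dispatch sends the two components 1 and 2 that are used in the address calculation and the dispatch of the LA. The second dispatch sends components 3 and 4, which are the execution parts of the load data upon the load returning data from the caches/memory.
Looking at the bottom right of FIG. 26, there is a floating point multiply-accumulate operation depicted. In the case where the hardware does not have sufficient bandwidth of incoming sources to dispatch the operation in a single phase, dual dispatch is used, as the multiply-accumulate figure shows. The first dispatch is a fixed point multiply as shown. The second dispatch is a floating point addition rounding as shown. When both of these dispatched instructions execute, they effectively perform the floating point multiply/accumulate.
FIG. 27 diagrams a dual dispatch transient multiply-accumulate in accordance with one embodiment of the present invention. As shown in FIG. 27, a first dispatch is an integer 32-bit multiply, and a second dispatch is an integer accumulate add. State communicated between the first dispatch and the second dispatch (the result of the multiply) is transient and not architecturally visible. The transient storage in one implementation can hold results of more than one multiplier and can tag them to identify the corresponding multiply-accumulate pair, thereby allowing intermixing of multiple multiply-accumulate pairs being dispatched in an arbitrary fashion (e.g., interleaved, etc.).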
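The tagged transient storage can be modeled as a small dictionary keyed by a pair tag. This is a behavioral sketch, not the hardware: the class `TransientMac`, its method names, and the string tags are hypothetical, and the product is kept as an unbounded Python integer rather than a fixed-width value.

```python
class TransientMac:
    """Hypothetical model of dual-dispatch multiply-accumulate with tagged transients."""

    def __init__(self):
        # tag -> product; never written to architecturally visible state.
        self.transient = {}

    def first_dispatch(self, tag, a, b):
        """Integer multiply: the product is held only in transient storage."""
        self.transient[tag] = a * b

    def second_dispatch(self, tag, accumulator):
        """Accumulate add: consume the matching product and free its slot."""
        return accumulator + self.transient.pop(tag)


mac = TransientMac()
mac.first_dispatch("pair0", 3, 4)
mac.first_dispatch("pair1", 5, 6)          # pairs may be interleaved
result1 = mac.second_dispatch("pair1", 10)
result0 = mac.second_dispatch("pair0", 1)
```

The tags are what allow `pair1` to complete before `pair0` without mixing up the products, which corresponds to the interleaving the paragraph above mentions.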
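It should be noted that other instructions can use this same hardware for their implementation (e.g., floating point, etc.).
FIG. 28 diagrams a dual dispatch architecturally visible state multiply-add in accordance with one embodiment of the present invention. The first dispatch is a single precision multiply, and the second dispatch is a single precision add. In this implementation, state information communicated between the first dispatch and the second dispatch (e.g., the result of the multiply) is architecturally visible since this storage is an architecture state register.
FIG. 29 shows an overview diagram of the fetching and formation of instruction blocks for execution on grouped execution units in accordance with one embodiment of the present invention. Embodiments of the present invention utilize a process whereby instructions are fetched and formed as blocks by the hardware or a dynamic converter/JIT. The instructions in the blocks are organized such that a result of an early instruction in the block feeds a source of a subsequent instruction in the block. This is shown by the dotted arrows in the block of instructions. This property enables the block to execute efficiently on the stacked execution units of the execution block. Instructions can also be grouped even if they can execute in parallel, such as if they share the same source (not shown explicitly in this figure).
One alternative to forming the blocks in hardware is to form them in software (statically or at runtime), where instruction pairs, triplets, quads, etc., are formed.
Other examples of instruction grouping functionality can be found in U.S. Patent No. 8,327,115.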
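FIG. 30 shows an exemplary diagram of instruction grouping in accordance with one embodiment of the present invention. In the FIG. 30 embodiment, two instructions are shown with a third auxiliary operation. The left-hand side of the FIG. 30 instruction block comprises an upper half block/1 slot and a lower half block/1 slot. The vertical arrows coming down from the top indicate sources coming into the block, while the vertical arrows going down from the bottom indicate destinations going back to memory. Proceeding from the left-hand side of FIG. 30 towards the right-hand side, different possible instruction combinations are illustrated. In this implementation, each half block can receive three sources and can pass on two destinations. OP1 and OP2 are normal operations. AuxiliaryOPs are auxiliary operations such as a logical, a shift, a move, a sign extend, a branch, etc. The benefit of dividing the block into two halves is to allow each half to dispatch on its own independently, or otherwise together as one block dynamically (either for port utilization or because of resource constraints), based upon dependency resolution, thus making better use of execution time; at the same time, having the two halves correspond to one block allows the machine to abstract the complexity of two half blocks so that they are managed like one block (i.e., at allocation and retirement).
FIG. 31 shows how half block pairs within a block stack map onto the execution block units in accordance with one embodiment of the present invention. As shown in the execution block, each execution block has two slots, slot 1 and slot 2. The objective is to map the block onto the execution units such that the first half block executes on slot 1 and the second half block executes on slot 2. The objective is to allow the two half blocks to dispatch independently if the instruction group of each half block does not depend on the other half. The paired arrows coming into the execution block from the top are two 32-bit words of a source. The paired arrows leaving the execution block going down are two 32-bit words of a destination. Going from left to right of FIG. 31, different exemplary combinations of instructions are shown that are capable of being stacked onto the execution block units.
The top of FIG. 31 summarizes how the pairs of half blocks execute in a full block context or any half block context. Each of the execution blocks has two slots/half blocks, and each one of the half blocks/execution slots executes either a single, paired, or triplet grouped operation. There are four types of block execution. The first is parallel halves (which allows each half block to independently execute once its own sources are ready, but the two half blocks can still execute as one block on one execution unit if both halves are ready at the same time). The second is atomic parallel halves (which refers to half blocks that can execute in parallel because there is no dependency between the two halves, but they are forced to execute together as one block because the resource sharing between the two halves makes it preferred or necessary for the two halves to execute together atomically within the constraint of the resources available in each execution block). The third type is atomic serial halves (which requires the first half to forward data to the second half through transient forwarding, with or without internal storage). The fourth type is sequential halves (as in dual dispatch), where the second half depends on the first half and is dispatched on a later cycle than the first one, and forwards the data through external storage that is tracked for dependency resolution, similar to the dual dispatch case.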
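FIG. 32 shows a diagram depicting intermediate block results storage as a first level register file in accordance with one embodiment of the present invention. Each group of registers represents a block of instructions (representing two half blocks) in which both 32-bit results and 64-bit results can be supported by using two 32-bit registers to support one 64-bit register. The storage per block assumes a virtual block storage, which means two half blocks from different blocks can write into the same virtual block storage. The combined results storage of two half blocks makes up one virtual block storage.
FIG. 33 shows an odd/even ports scheduler in accordance with one embodiment of the present invention. In this implementation, the result storage is asymmetrical. Some of the result storage is three 64-bit result registers per half block while other result storage is one 64-bit result register per half block; however, alternative implementations can use symmetrical storage per half block and additionally could also employ 64-bit and 32-bit partitions as described in FIG. 32. In these embodiments, storage is assigned per half block, as opposed to per block. This implementation reduces the number of ports needed for dispatch by using them as odd or even.
FIG. 34 shows a more detailed version of FIG. 33 in which four execution units are shown receiving results from the scheduler array and writing outputs to a temporary register file segment. The ports are attached at even and odd intervals. The left side of the scheduling array shows block numbers and the right side shows half block numbers.
Each core has even and odd ports into the scheduling array, where each port is connected to an odd or even half block position. In one implementation, the even ports and their corresponding half blocks can reside in a different core than the odd ports and their corresponding half blocks. In another implementation, the odd and even ports will be distributed across multiple different cores as shown in this figure. As described in U.S. Patent Application No. 13/428,440, which is incorporated herein in its entirety by reference, the cores can be physical cores or virtual cores.
In certain types of blocks, one half of a block can be dispatched independently from the other half of the block. In other types of blocks, both halves of a block need to be dispatched simultaneously to the same execution block units. In still other types of blocks, the two halves of a block need to be dispatched sequentially (the second half after the first half).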
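FIG. 35 shows a diagram depicting guest flag architecture emulation in accordance with one embodiment of the present invention. The left-hand side of FIG. 35 shows a centralized flag register having five flags. The right-hand side of FIG. 35 shows a distributed flag architecture having distributed flag registers wherein the flags are distributed among the registers themselves.
During architecture emulation, it is necessary for the distributed flag architecture to emulate the behavior of the centralized guest flag architecture. A distributed flag architecture can also be implemented by using multiple independent flag registers as opposed to a flag field associated with a data register. For example, data registers can be implemented as R0 to R15 while independent flag registers can be implemented as F0 to F3. Those flag registers in this case are not associated directly with the data registers.
FIG. 36 shows a diagram illustrating the front end of the machine, the scheduler and the execution units, and a centralized flag register in accordance with one embodiment of the present invention. In this implementation, the front end categorizes incoming instructions based on the manner in which they update guest instruction flags. In one embodiment, the guest instructions are categorized into four native instruction types, T1, T2, T3, and T4. T1-T4 are instruction types that indicate which flag fields each guest instruction type updates. Guest instruction types update different guest instruction flags based on their type. For example, logical guest instructions update T1 type flags.
FIG. 37 shows a diagram of a centralized flag register emulation process as implemented by embodiments of the present invention. The actors in FIG. 37 comprise a latest update type table, a renaming table extension, physical registers, and distributed flag registers. FIG. 37 is now described by the flowchart of FIG. 38.
FIG. 38 shows a flowchart of the steps of a process 3800 of emulating centralized flag register behavior in a guest setting.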
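In step 3801, the front end/dynamic converter (hardware or software) categorizes incoming instructions based on the manner in which they update guest instruction flags. In one embodiment, the guest instructions are categorized into four flag architectural types, T1, T2, T3, and T4. T1-T4 are instruction types that indicate which flag fields each guest instruction type updates. Guest instruction types update different guest flags based on their type. For example, logical guest instructions update T1 type flags, shift guest instructions update T2 type flags, arithmetic guest instructions update T3 type flags, and special guest instructions update type T4 flags. It should be noted that guest instructions can be architectural instruction representations while native can be what the machine internally executes (e.g., microcode). Alternatively, guest instructions can be instructions from an emulated architecture (e.g., x86, Java, ARM code, etc.).
In step 3802, the order in which those instruction types update their respective guest flags is recorded in a latest update type table data structure. In one embodiment, this action is performed by the front end of the machine.
In step 3803, when those instruction types reach the scheduler (the in-order part of the allocation/renaming stage), the scheduler assigns an implicit physical destination that corresponds to the architectural type and records that assignment in a renaming/mapping table data structure.
In step 3804, when a subsequent guest instruction reaches the allocation/renaming stage in the scheduler and that instruction wants to read guest flag fields, (a) the machine determines which flag architectural types need to be accessed to perform the read; (b) if all needed flags are found in the same latest update flag type (e.g., as determined by the latest update type table), then the corresponding physical register (e.g., the one that maps to that latest flag type) is read to obtain the needed flags; (c) if all needed flags cannot be found in the same latest update flag type, then each flag needs to be read from the corresponding physical register that maps to the individual latest update flag type.
In step 3805, each flag is read individually from the physical register that holds its latest value that was last updated, as tracked by the latest update flag type table.
It should be noted that if a latest update type is inclusive of another type, then all subset types have to map to the same physical registers of the superset type.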
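Steps 3801 through 3805 can be approximated with two small tables. The sketch below is a simplified model and not the patented mechanism itself: the class `FlagEmulator`, the physical register names `P0`, `P1`, ..., and in particular the `TYPE_FLAGS` assignment of flags to T1-T4 are hypothetical, since the text does not state exactly which flag fields each type updates.

```python
# Assumed, illustrative mapping of type -> guest flags updated by that type.
TYPE_FLAGS = {
    "T1": {"SF", "ZF"},               # e.g., logical instructions
    "T2": {"SF", "ZF", "CF"},         # e.g., shift instructions
    "T3": {"CF", "OF", "SF", "ZF"},   # e.g., arithmetic instructions
    "T4": {"ZF"},                     # e.g., special instructions
}


class FlagEmulator:
    def __init__(self):
        self.latest_type = {}   # latest update type table: flag -> type
        self.rename = {}        # renaming table extension: type -> physical register
        self.next_preg = 0

    def allocate_writer(self, itype):
        """Steps 3802/3803: record the update and assign a physical destination."""
        preg = f"P{self.next_preg}"
        self.next_preg += 1
        self.rename[itype] = preg
        for flag in TYPE_FLAGS[itype]:
            self.latest_type[flag] = itype
        return preg

    def allocate_reader(self, needed_flags):
        """Steps 3804/3805: find the physical register holding each needed flag."""
        return {flag: self.rename[self.latest_type[flag]] for flag in needed_flags}


emu = FlagEmulator()
emu.allocate_writer("T3")                      # arithmetic: writes CF, OF, SF, ZF
one_reg = emu.allocate_reader({"CF", "ZF"})    # both flags come from one register
emu.allocate_writer("T1")                      # logical: overwrites SF, ZF only
two_regs = emu.allocate_reader({"CF", "ZF"})   # flags now live in two registers
```

When all needed flags resolve to a single register, as in `one_reg`, one read suffices (case (b)); otherwise, as in `two_regs`, each flag is read from the register of its own latest update type (case (c)).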
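At retirement, the destination flag fields are merged with a cloned centralized/guest flag architecture register. It should be noted that the cloning is performed due to the fact that the native architecture utilizes a distributed flag architecture as opposed to a single register centralized flag architecture.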
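Examples of instructions that update certain flag types:
CF, OF, SF, ZR: arithmetic instructions and load/write flags instructions
SF, ZF, and conditional CF: logicals and shifts
SF, ZF: moves/loads, EXTR, some multiplies
ZF: POPCNT and STREX[P]
GE: SIMD instructions???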
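Examples of conditions/predications that read certain flags:
0000  EQ  Equal  Z=1
0001  NE  Not equal, or unordered  Z=0
0010  CS b  Carry set, greater than or equal, or unordered  C=1
0011  CC c  Carry clear, less than  C=0
0100  MI  Minus, negative, less than  N=1
0101  PL  Plus, positive or zero, greater than or equal, unordered  N=0
0110  VS  Overflow, unordered  V=1
0111  VC  No overflow, not unordered  V=0
1000  HI  Unsigned higher, greater than, unordered  C=1 and Z=0
1001  LS  Unsigned lower or same, less than or equal  C=0 or Z=1
1010  GE  Signed greater than or equal, greater than or equal  N=V
1011  LT  Signed less than, less than, unordered  N!=V
1100  GT  Signed greater than, greater than  Z=0 and N=V
1101  LE  Signed less than or equal, less than or equal, unordered  Z=1 or N!=V
1110  None (AL)  Always (unconditional)  Any flag set to any value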
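The condition table above can be expressed directly as predicates over the four flags N, Z, C, and V. This is a small illustrative sketch; the dictionary `CONDITIONS` and the function `condition_holds` are hypothetical names, and the flag tests are taken from the rightmost column of the table.

```python
# Each entry: 4-bit condition code -> (mnemonic, predicate over flags n, z, c, v).
CONDITIONS = {
    0b0000: ("EQ", lambda n, z, c, v: z == 1),
    0b0001: ("NE", lambda n, z, c, v: z == 0),
    0b0010: ("CS", lambda n, z, c, v: c == 1),
    0b0011: ("CC", lambda n, z, c, v: c == 0),
    0b0100: ("MI", lambda n, z, c, v: n == 1),
    0b0101: ("PL", lambda n, z, c, v: n == 0),
    0b0110: ("VS", lambda n, z, c, v: v == 1),
    0b0111: ("VC", lambda n, z, c, v: v == 0),
    0b1000: ("HI", lambda n, z, c, v: c == 1 and z == 0),
    0b1001: ("LS", lambda n, z, c, v: c == 0 or z == 1),
    0b1010: ("GE", lambda n, z, c, v: n == v),
    0b1011: ("LT", lambda n, z, c, v: n != v),
    0b1100: ("GT", lambda n, z, c, v: z == 0 and n == v),
    0b1101: ("LE", lambda n, z, c, v: z == 1 or n != v),
    0b1110: ("AL", lambda n, z, c, v: True),
}


def condition_holds(code, n, z, c, v):
    """Evaluate a condition code against the current flag values (each 0 or 1)."""
    _mnemonic, predicate = CONDITIONS[code]
    return predicate(n, z, c, v)


signed_ge = condition_holds(0b1010, n=1, z=0, c=0, v=1)   # GE: N equals V
```

A reader of such a condition only needs the flags named in its predicate, which is why a flag-reading instruction may need one or several physical registers in a distributed flag scheme.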
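The foregoing description, for the purpose of explanation, has been described with reference to specific embodiments. However, the illustrated discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. Embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as may be suited to the particular use contemplated.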

Claims (19)

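  1. A method for emulating a guest centralized flag architecture by using a native distributed flag architecture, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.
  2. The method of claim 1, wherein the distributed flag architecture emulates the behavior of a centralized guest flag architecture.
  3. The method of claim 1, wherein a distributed flag architecture can be implemented using multiple independent flag registers.
  4. The method of claim 1, wherein guest instructions are categorized into four native instruction types.
  5. The method of claim 1, wherein guest instructions are categorized into four native instruction types, and guest instruction types update different guest instruction flags based on their type.
  6. The method of claim 1, wherein a front end/dynamic converter categorizes incoming instructions based on the manner in which they update guest instruction flags.
  7. A non-transitory computer readable medium having computer readable code which, when executed by a computer system, causes the computer system to implement a method for emulating a guest centralized flag architecture by using a native distributed flag architecture, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.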
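  8. The computer readable medium of claim 7, wherein the distributed flag architecture emulates the behavior of a centralized guest flag architecture.
  9. The computer readable medium of claim 7, wherein a distributed flag architecture can be implemented using multiple independent flag registers.
  10. The computer readable medium of claim 7, wherein guest instructions are categorized into four native instruction types.
  11. The computer readable medium of claim 7, wherein guest instructions are categorized into four native instruction types, and guest instruction types update different guest instruction flags based on their type.
  12. The computer readable medium of claim 7, wherein a front end/dynamic converter categorizes incoming instructions based on the manner in which they update guest instruction flags.
  13. A computer system having a processor coupled to a memory, the memory having computer readable code which, when executed by the computer system, causes the computer system to implement a method for emulating a guest centralized flag architecture by using a native distributed flag architecture, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.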
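  14. The computer system of claim 13, wherein the distributed flag architecture emulates the behavior of a centralized guest flag architecture.
  15. The computer system of claim 13, wherein a distributed flag architecture can be implemented using multiple independent flag registers.
  16. The computer system of claim 13, wherein guest instructions are categorized into four native instruction types.
  17. The computer system of claim 13, wherein guest instructions are categorized into four native instruction types, and guest instruction types update different guest instruction flags based on their type.
  18. The computer system of claim 13, wherein a front end/dynamic converter categorizes incoming instructions based on the manner in which they update guest instruction flags.
  19. A method for performing dual dispatch of blocks and half blocks, comprising: receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and performing a dual dispatch of the two half blocks for execution on an execution unit.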
TW103109493A 2013-03-15 2014-03-14 A method for emulating a guest centralized flag architecture by using a native distributed flag architecture TWI522912B (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US201361800487P 2013-03-15 2013-03-15

Publications (2)

Publication Number Publication Date
TW201504942A TW201504942A (zh) 2015-02-01
TWI522912B true TWI522912B (zh) 2016-02-21

Family

ID=51533998

Family Applications (1)

Application Number Title Priority Date Filing Date
TW103109493A TWI522912B (zh) 2013-03-15 2014-03-14 A method for emulating a guest centralized flag architecture by using a native distributed flag architecture

Country Status (6)

Country Link
US (3) US9823930B2 (zh)
EP (1) EP2972836B1 (zh)
KR (2) KR102083390B1 (zh)
CN (1) CN105247484B (zh)
TW (1) TWI522912B (zh)
WO (1) WO2014151043A1 (zh)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
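CN103646009B (zh) 2006-04-12 2016-08-17 Soft Machines, Inc. Apparatus and method for processing an instruction matrix specifying parallel and dependent operations
CN101627365B (zh) 2006-11-14 2017-03-29 Soft Machines, Inc. Multi-threaded architecture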
EP3156896B1 (en) 2010-09-17 2020-04-08 Soft Machines, Inc. Single cycle multi-branch prediction including shadow cache for early far branch prediction
EP2689327B1 (en) 2011-03-25 2021-07-28 Intel Corporation Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines
WO2012135041A2 (en) 2011-03-25 2012-10-04 Soft Machines, Inc. Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines
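CN103635875B (zh) 2011-03-25 2018-02-16 Intel Corporation Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines
TWI603198B (zh) 2011-05-20 2017-10-21 Intel Corporation Decentralized allocation of resources and interconnect structures to support the execution of instruction sequences by a plurality of engines
KR101639854B1 (ko) 2011-05-20 2016-07-14 Soft Machines, Inc. An interconnect structure to support the execution of instruction sequences by a plurality of engines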
US20150039859A1 (en) 2011-11-22 2015-02-05 Soft Machines, Inc. Microprocessor accelerated code optimizer
KR101703401B1 (ko) 2011-11-22 2017-02-06 Soft Machines, Inc. An accelerated code optimizer for a multiengine microprocessor
WO2014150806A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for populating register view data structure by using register template snapshots
WO2014150991A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for implementing a reduced size register view data structure in a microprocessor
KR101708591B1 (ko) 2013-03-15 2017-02-20 Soft Machines, Inc. A method for executing multithreaded instructions grouped into blocks
US10140138B2 (en) 2013-03-15 2018-11-27 Intel Corporation Methods, systems and apparatus for supporting wide and efficient front-end operation with guest-architecture emulation
KR102083390B1 (ko) * 2013-03-15 2020-03-02 Intel Corporation A method for emulating a guest centralized flag architecture by using a native distributed flag architecture
US9811342B2 (en) 2013-03-15 2017-11-07 Intel Corporation Method for performing dual dispatch of blocks and half blocks
US9632825B2 (en) 2013-03-15 2017-04-25 Intel Corporation Method and apparatus for efficient scheduling for asymmetrical execution units
WO2014150971A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for dependency broadcasting through a block organized source view data structure
US10275255B2 (en) 2013-03-15 2019-04-30 Intel Corporation Method for dependency broadcasting through a source organized source view data structure
US9886279B2 (en) 2013-03-15 2018-02-06 Intel Corporation Method for populating and instruction view data structure by using register template snapshots
US9569216B2 (en) 2013-03-15 2017-02-14 Soft Machines, Inc. Method for populating a source view data structure by using register template snapshots
US9891924B2 (en) 2013-03-15 2018-02-13 Intel Corporation Method for implementing a reduced size register view data structure in a microprocessor
US9904625B2 (en) 2013-03-15 2018-02-27 Intel Corporation Methods, systems and apparatus for predicting the way of a set associative cache
US10678544B2 (en) 2015-09-19 2020-06-09 Microsoft Technology Licensing, Llc Initiating instruction block execution using a register access instruction
US11681531B2 (en) 2015-09-19 2023-06-20 Microsoft Technology Licensing, Llc Generation and use of memory access instruction order encodings
US10871967B2 (en) * 2015-09-19 2020-12-22 Microsoft Technology Licensing, Llc Register read/write ordering
US20170315812A1 (en) 2016-04-28 2017-11-02 Microsoft Technology Licensing, Llc Parallel instruction scheduler for block isa processor
US10261785B2 (en) * 2017-02-28 2019-04-16 Microsoft Technology Licensing, Llc Arithmetic lazy flags representation for emulation
US11288072B2 (en) * 2019-09-11 2022-03-29 Ceremorphic, Inc. Multi-threaded processor with thread granularity
CN113535231A (zh) * 2020-04-17 2021-10-22 Cambricon Technologies Corporation Limited Method and device for reducing instruction jumps

Family Cites Families (534)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US727487A (en) 1902-10-21 1903-05-05 Swan F Swanson Dumping-car.
US4075704A (en) 1976-07-02 1978-02-21 Floating Point Systems, Inc. Floating point data processor for high speech operation
US4228496A (en) 1976-09-07 1980-10-14 Tandem Computers Incorporated Multiprocessor system
US4245344A (en) 1979-04-02 1981-01-13 Rockwell International Corporation Processing system with dual buses
US4527237A (en) 1979-10-11 1985-07-02 Nanodata Computer Corporation Data processing system
US4414624A (en) 1980-11-19 1983-11-08 The United States Of America As Represented By The Secretary Of The Navy Multiple-microcomputer processing
US4524415A (en) 1982-12-07 1985-06-18 Motorola, Inc. Virtual machine data processor
US4597061B1 (en) 1983-01-03 1998-06-09 Texas Instruments Inc Memory system using pipleline circuitry for improved system
US4577273A (en) 1983-06-06 1986-03-18 Sperry Corporation Multiple microcomputer system for digital computers
US4682281A (en) 1983-08-30 1987-07-21 Amdahl Corporation Data storage unit employing translation lookaside buffer pointer
US4633434A (en) 1984-04-02 1986-12-30 Sperry Corporation High performance storage unit
US4600986A (en) 1984-04-02 1986-07-15 Sperry Corporation Pipelined split stack with high performance interleaved decode
JPS6140643A (ja) 1984-07-31 1986-02-26 Hitachi Ltd System resource allocation control method
US4835680A (en) 1985-03-15 1989-05-30 Xerox Corporation Adaptive processor array capable of learning variable associations useful in recognizing classes of inputs
JPS6289149A (ja) 1985-10-15 1987-04-23 Agency Of Ind Science & Technol Multi-port memory system
JPH0658650B2 (ja) 1986-03-14 1994-08-03 Hitachi, Ltd. Virtual computer system
US4920477A (en) 1987-04-20 1990-04-24 Multiflow Computer, Inc. Virtual address table look aside buffer miss recovery method and apparatus
US4943909A (en) 1987-07-08 1990-07-24 At&T Bell Laboratories Computational origami
DE68926783T2 (de) 1988-10-07 1996-11-28 Martin Marietta Corp Parallel data processor
US5339398A (en) 1989-07-31 1994-08-16 North American Philips Corporation Memory architecture and method of data organization optimized for hashing
US5471593A (en) 1989-12-11 1995-11-28 Branigin; Michael H. Computer processor with an efficient means of executing many instructions simultaneously
US5197130A (en) 1989-12-29 1993-03-23 Supercomputer Systems Limited Partnership Cluster architecture for a highly parallel scalar/vector multiprocessor system
EP0463965B1 (en) 1990-06-29 1998-09-09 Digital Equipment Corporation Branch prediction unit for high-performance processor
US5317754A (en) 1990-10-23 1994-05-31 International Business Machines Corporation Method and apparatus for enabling an interpretive execution subset
US5317705A (en) 1990-10-24 1994-05-31 International Business Machines Corporation Apparatus and method for TLB purge reduction in a multi-level machine system
US6282583B1 (en) 1991-06-04 2001-08-28 Silicon Graphics, Inc. Method and apparatus for memory access in a matrix processor computer
US5539911A (en) 1991-07-08 1996-07-23 Seiko Epson Corporation High-performance, superscalar-based computer system with out-of-order instruction execution
JPH0820949B2 (ja) 1991-11-26 1996-03-04 Matsushita Electric Industrial Co., Ltd. Information processing device
GB2277181B (en) 1991-12-23 1995-12-13 Intel Corp Interleaved cache for multiple accesses per clock in a microprocessor
JP2647327B2 (ja) 1992-04-06 1997-08-27 International Business Machines Corporation Massively parallel computing system apparatus
KR100309566B1 (ko) 1992-04-29 2001-12-15 리패치 Method and apparatus for grouping multiple instructions, issuing grouped instructions simultaneously, and executing grouped instructions in a pipelined processor
DE69308548T2 (de) 1992-05-01 1997-06-12 Seiko Epson Corp Device and method for instruction completion in a superscalar processor
DE69329260T2 (de) 1992-06-25 2001-02-22 Canon Kk Apparatus for multiplying integers with many digits
JPH0637202A (ja) 1992-07-20 1994-02-10 Mitsubishi Electric Corp Package for microwave IC
JPH06110781A (ja) 1992-09-30 1994-04-22 Nec Corp Cache memory device
US5493660A (en) 1992-10-06 1996-02-20 Hewlett-Packard Company Software assisted hardware TLB miss handler
US5513335A (en) 1992-11-02 1996-04-30 Sgs-Thomson Microelectronics, Inc. Cache tag memory having first and second single-port arrays and a dual-port array
US5819088A (en) 1993-03-25 1998-10-06 Intel Corporation Method and apparatus for scheduling instructions for execution on a multi-issue architecture computer
US5548773A (en) 1993-03-30 1996-08-20 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Digital parallel processor array for optimum path planning
JPH0784883A (ja) 1993-09-17 1995-03-31 Hitachi Ltd Method for purging the address translation buffer of a virtual computer system
US6948172B1 (en) 1993-09-21 2005-09-20 Microsoft Corporation Preemptive multi-tasking with cooperative groups of tasks
US5469376A (en) 1993-10-14 1995-11-21 Abdallah; Mohammad A. F. F. Digital circuit for the evaluation of mathematical expressions
US5517651A (en) 1993-12-29 1996-05-14 Intel Corporation Method and apparatus for loading a segment register in a microprocessor capable of operating in multiple modes
US5956753A (en) 1993-12-30 1999-09-21 Intel Corporation Method and apparatus for handling speculative memory access operations
US5761476A (en) 1993-12-30 1998-06-02 Intel Corporation Non-clocked early read for back-to-back scheduling of instructions
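JP3048498B2 (ja) 1994-04-13 2000-06-05 Toshiba Corporation Semiconductor memory device
JPH07287668A (ja) 1994-04-19 1995-10-31 Hitachi Ltd Data processing device
CN1084005C (zh) 1994-06-27 2002-05-01 International Business Machines Corporation Method and apparatus for dynamically controlling address space allocation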
US5548742A (en) 1994-08-11 1996-08-20 Intel Corporation Method and apparatus for combining a direct-mapped cache and a multiple-way cache in a cache memory
US5813031A (en) 1994-09-21 1998-09-22 Industrial Technology Research Institute Caching tag for a large scale cache computer memory system
US5640534A (en) 1994-10-05 1997-06-17 International Business Machines Corporation Method and system for concurrent access in a data cache array utilizing multiple match line selection paths
US5835951A (en) 1994-10-18 1998-11-10 National Semiconductor Branch processing unit with target cache read prioritization protocol for handling multiple hits
JP3569014B2 (ja) 1994-11-25 2004-09-22 Fujitsu Limited Processor and processing method supporting multiple contexts
US5724565A (en) 1995-02-03 1998-03-03 International Business Machines Corporation Method and system for processing first and second sets of instructions by first and second types of processing systems
US5651124A (en) 1995-02-14 1997-07-22 Hal Computer Systems, Inc. Processor structure and method for aggressively scheduling long latency instructions including load/store instructions while maintaining precise state
US5675759A (en) 1995-03-03 1997-10-07 Shebanow; Michael C. Method and apparatus for register management using issue sequence prior physical register and register association validity information
US5751982A (en) * 1995-03-31 1998-05-12 Apple Computer, Inc. Software emulation system with dynamic translation of emulated instructions for increased processing speed
US5634068A (en) 1995-03-31 1997-05-27 Sun Microsystems, Inc. Packet switched cache coherent multiprocessor system
US6209085B1 (en) 1995-05-05 2001-03-27 Intel Corporation Method and apparatus for performing process switching in multiprocessor computer systems
US6643765B1 (en) * 1995-08-16 2003-11-04 Microunity Systems Engineering, Inc. Programmable processor with group floating point operations
US5710902A (en) 1995-09-06 1998-01-20 Intel Corporation Instruction dependency chain indentifier
US6341324B1 (en) 1995-10-06 2002-01-22 Lsi Logic Corporation Exception processing in superscalar microprocessor
US5864657A (en) 1995-11-29 1999-01-26 Texas Micro, Inc. Main memory system and checkpointing protocol for fault-tolerant computer system
US5983327A (en) 1995-12-01 1999-11-09 Nortel Networks Corporation Data path architecture and arbitration scheme for providing access to a shared system resource
US5793941A (en) 1995-12-04 1998-08-11 Advanced Micro Devices, Inc. On-chip primary cache testing circuit and test method
US5911057A (en) 1995-12-19 1999-06-08 Texas Instruments Incorporated Superscalar microprocessor having combined register and memory renaming circuits, systems, and methods
US5699537A (en) 1995-12-22 1997-12-16 Intel Corporation Processor microarchitecture for efficient dynamic scheduling and execution of chains of dependent instructions
US6882177B1 (en) 1996-01-10 2005-04-19 Altera Corporation Tristate structures for programmable logic devices
US5754818A (en) 1996-03-22 1998-05-19 Sun Microsystems, Inc. Architecture and method for sharing TLB entries through process IDS
US5904892A (en) 1996-04-01 1999-05-18 Saint-Gobain/Norton Industrial Ceramics Corp. Tape cast silicon carbide dummy wafer
US5752260A (en) 1996-04-29 1998-05-12 International Business Machines Corporation High-speed, multiple-port, interleaved cache with arbitration of multiple access addresses
US5806085A (en) 1996-05-01 1998-09-08 Sun Microsystems, Inc. Method for non-volatile caching of network and CD-ROM file accesses using a cache directory, pointers, file name conversion, a local hard disk, and separate small database
US5829028A (en) 1996-05-06 1998-10-27 Advanced Micro Devices, Inc. Data cache configured to store data in a use-once manner
US6108769A (en) 1996-05-17 2000-08-22 Advanced Micro Devices, Inc. Dependency table for reducing dependency checking hardware
US5958042A (en) 1996-06-11 1999-09-28 Sun Microsystems, Inc. Grouping logic circuit in a pipelined superscalar processor
US5881277A (en) 1996-06-13 1999-03-09 Texas Instruments Incorporated Pipelined microprocessor with branch misprediction cache circuits, systems and methods
US5860146A (en) 1996-06-25 1999-01-12 Sun Microsystems, Inc. Auxiliary translation lookaside buffer for assisting in accessing data in remote address spaces
US5903760A (en) * 1996-06-27 1999-05-11 Intel Corporation Method and apparatus for translating a conditional instruction compatible with a first instruction set architecture (ISA) into a conditional instruction compatible with a second ISA
US5974506A (en) 1996-06-28 1999-10-26 Digital Equipment Corporation Enabling mirror, nonmirror and partial mirror cache modes in a dual cache system
US6167490A (en) 1996-09-20 2000-12-26 University Of Washington Using global memory information to manage memory in a computer network
KR19980032776A (ko) 1996-10-16 1998-07-25 가나이 츠토무 Data processor and data processing system
KR19990076967A (ko) 1996-11-04 1999-10-25 요트.게.아. 롤페즈 Processing device and reading of instructions in memory
US6385715B1 (en) 1996-11-13 2002-05-07 Intel Corporation Multi-threading for a processor utilizing a replay queue
US5978906A (en) 1996-11-19 1999-11-02 Advanced Micro Devices, Inc. Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions
US6253316B1 (en) 1996-11-19 2001-06-26 Advanced Micro Devices, Inc. Three state branch history using one bit in a branch prediction mechanism
US5903750A (en) 1996-11-20 1999-05-11 Institute For The Development Of Emerging Architectures, L.L.P. Dynamic branch prediction for branch instructions with multiple targets
US6212542B1 (en) 1996-12-16 2001-04-03 International Business Machines Corporation Method and system for executing a program within a multiscalar processor by processing linked thread descriptors
US6134634A (en) 1996-12-20 2000-10-17 Texas Instruments Incorporated Method and apparatus for preemptive cache write-back
US5918251A (en) 1996-12-23 1999-06-29 Intel Corporation Method and apparatus for preloading different default address translation attributes
US6016540A (en) 1997-01-08 2000-01-18 Intel Corporation Method and apparatus for scheduling instructions in waves
US6065105A (en) 1997-01-08 2000-05-16 Intel Corporation Dependency matrix
US5802602A (en) 1997-01-17 1998-09-01 Intel Corporation Method and apparatus for performing reads of related data from a set-associative cache memory
JP3739888B2 (ja) 1997-03-27 2006-01-25 Sony Computer Entertainment Inc. Information processing apparatus and method
US6088780A (en) 1997-03-31 2000-07-11 Institute For The Development Of Emerging Architecture, L.L.C. Page table walker that uses at least one of a default page size and a page size selected for a virtual address space to position a sliding field in a virtual address
US6314511B2 (en) 1997-04-03 2001-11-06 University Of Washington Mechanism for freeing registers on processors that perform dynamic out-of-order execution of instructions using renaming registers
US6035120A (en) 1997-05-28 2000-03-07 Sun Microsystems, Inc. Method and apparatus for converting executable computer programs in a heterogeneous computing environment
US6075938A (en) 1997-06-10 2000-06-13 The Board Of Trustees Of The Leland Stanford Junior University Virtual machine monitors for scalable multiprocessors
US6073230A (en) 1997-06-11 2000-06-06 Advanced Micro Devices, Inc. Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches
JPH1124929A (ja) 1997-06-30 1999-01-29 Sony Corp Arithmetic processing unit and method thereof
US6658447B2 (en) 1997-07-08 2003-12-02 Intel Corporation Priority based simultaneous multi-threading
US6170051B1 (en) 1997-08-01 2001-01-02 Micron Technology, Inc. Apparatus and method for program level parallelism in a VLIW processor
US6128728A (en) 1997-08-01 2000-10-03 Micron Technology, Inc. Virtual shadow registers and virtual register windows
US6085315A (en) 1997-09-12 2000-07-04 Siemens Aktiengesellschaft Data processing device with loop pipeline
US6101577A (en) 1997-09-15 2000-08-08 Advanced Micro Devices, Inc. Pipelined instruction cache and branch prediction mechanism therefor
US5901294A (en) 1997-09-18 1999-05-04 International Business Machines Corporation Method and system for bus arbitration in a multiprocessor system utilizing simultaneous variable-width bus access
US6185660B1 (en) 1997-09-23 2001-02-06 Hewlett-Packard Company Pending access queue for providing data to a target register during an intermediate pipeline phase after a computer cache miss
US5905509A (en) 1997-09-30 1999-05-18 Compaq Computer Corp. Accelerated Graphics Port two level Gart cache having distributed first level caches
US6226732B1 (en) 1997-10-02 2001-05-01 Hitachi Micro Systems, Inc. Memory system architecture
US5922065A (en) 1997-10-13 1999-07-13 Institute For The Development Of Emerging Architectures, L.L.C. Processor utilizing a template field for encoding instruction sequences in a wide-word format
US6178482B1 (en) 1997-11-03 2001-01-23 Brecis Communications Virtual register sets
US6021484A (en) * 1997-11-14 2000-02-01 Samsung Electronics Co., Ltd. Dual instruction set architecture
US6256728B1 (en) 1997-11-17 2001-07-03 Advanced Micro Devices, Inc. Processor configured to selectively cancel instructions from its pipeline responsive to a predicted-taken short forward branch instruction
US6260131B1 (en) 1997-11-18 2001-07-10 Intrinsity, Inc. Method and apparatus for TLB memory ordering
US6016533A (en) 1997-12-16 2000-01-18 Advanced Micro Devices, Inc. Way prediction logic for cache array
US6219776B1 (en) 1998-03-10 2001-04-17 Billions Of Operations Per Second Merged array controller and processing element
US6609189B1 (en) 1998-03-12 2003-08-19 Yale University Cycle segmented prefix circuits
JP3657424B2 (ja) 1998-03-20 2005-06-08 Matsushita Electric Industrial Co., Ltd. Center device and terminal device for broadcasting program information
US6216215B1 (en) 1998-04-02 2001-04-10 Intel Corporation Method and apparatus for senior loads
US6157998A (en) 1998-04-03 2000-12-05 Motorola Inc. Method for performing branch prediction and resolution of two or more branch instructions within two or more branch prediction buffers
US6205545B1 (en) 1998-04-30 2001-03-20 Hewlett-Packard Company Method and apparatus for using static branch predictions hints with dynamically translated code traces to improve performance
US6115809A (en) 1998-04-30 2000-09-05 Hewlett-Packard Company Compiling strong and weak branching behavior instruction blocks to separate caches for dynamic and static prediction
US6256727B1 (en) 1998-05-12 2001-07-03 International Business Machines Corporation Method and system for fetching noncontiguous instructions in a single clock cycle
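JPH11338710A (ja) * 1998-05-28 1999-12-10 Toshiba Corp Compilation method and apparatus for a processor having multiple kinds of instruction sets, and recording medium on which the method is programmed and recorded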
US6272616B1 (en) 1998-06-17 2001-08-07 Agere Systems Guardian Corp. Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths
US6988183B1 (en) 1998-06-26 2006-01-17 Derek Chi-Lan Wong Methods for increasing instruction-level parallelism in microprocessors and digital system
US6260138B1 (en) 1998-07-17 2001-07-10 Sun Microsystems, Inc. Method and apparatus for branch instruction processing in a processor
US6122656A (en) 1998-07-31 2000-09-19 Advanced Micro Devices, Inc. Processor configured to map logical register numbers to physical register numbers using virtual register numbers
US6272662B1 (en) 1998-08-04 2001-08-07 International Business Machines Corporation Distributed storage system using front-end and back-end locking
JP2000057054A (ja) 1998-08-12 2000-02-25 Fujitsu Ltd High-speed address translation system
US6742111B2 (en) 1998-08-31 2004-05-25 Stmicroelectronics, Inc. Reservation stations to increase instruction level parallelism
US8631066B2 (en) * 1998-09-10 2014-01-14 Vmware, Inc. Mechanism for providing virtual machines for use by multiple users
US6339822B1 (en) 1998-10-02 2002-01-15 Advanced Micro Devices, Inc. Using padded instructions in a block-oriented cache
US6332189B1 (en) 1998-10-16 2001-12-18 Intel Corporation Branch prediction architecture
GB9825102D0 (en) 1998-11-16 1999-01-13 Insignia Solutions Plc Computer system
JP3110404B2 (ja) 1998-11-18 2000-11-20 NEC Kofu, Ltd. Microprocessor device, method for accelerating its software instructions, and recording medium storing its control program
US6490673B1 (en) 1998-11-27 2002-12-03 Matsushita Electric Industrial Co., Ltd Processor, compiling apparatus, and compile program recorded on a recording medium
US6519682B2 (en) 1998-12-04 2003-02-11 Stmicroelectronics, Inc. Pipelined non-blocking level two cache system with inherent transaction collision-avoidance
US6049501A (en) 1998-12-14 2000-04-11 Motorola, Inc. Memory data bus architecture and method of configuring multi-wide word memories
US7020879B1 (en) 1998-12-16 2006-03-28 Mips Technologies, Inc. Interrupt and exception handling for multi-streaming digital processors
US6477562B2 (en) 1998-12-16 2002-11-05 Clearwater Networks, Inc. Prioritized instruction scheduling for multi-streaming processors
US6247097B1 (en) 1999-01-22 2001-06-12 International Business Machines Corporation Aligned instruction cache handling of instruction fetches across multiple predicted branch instructions
US6321298B1 (en) 1999-01-25 2001-11-20 International Business Machines Corporation Full cache coherency across multiple raid controllers
JP3842474B2 (ja) 1999-02-02 2006-11-08 Renesas Technology Corp. Data processing device
US6327650B1 (en) 1999-02-12 2001-12-04 VLSI Technology, Inc. Pipelined multiprocessing with upstream processor concurrently writing to local register and to register of downstream processor
US6732220B2 (en) * 1999-02-17 2004-05-04 Elbrus International Method for emulating hardware features of a foreign architecture in a host operating system environment
US6668316B1 (en) 1999-02-17 2003-12-23 Elbrus International Limited Method and apparatus for conflict-free execution of integer and floating-point operations with a common register file
US6418530B2 (en) 1999-02-18 2002-07-09 Hewlett-Packard Company Hardware/software system for instruction profiling and trace selection using branch history information for branch predictions
US6437789B1 (en) 1999-02-19 2002-08-20 Evans & Sutherland Computer Corporation Multi-level cache controller
US6850531B1 (en) 1999-02-23 2005-02-01 Alcatel Multi-service network switch
US6212613B1 (en) 1999-03-22 2001-04-03 Cisco Technology, Inc. Methods and apparatus for reusing addresses in a computer
US6529928B1 (en) 1999-03-23 2003-03-04 Silicon Graphics, Inc. Floating-point adder performing floating-point and integer operations
US6708268B1 (en) 1999-03-26 2004-03-16 Microchip Technology Incorporated Microcontroller instruction set
EP1050808B1 (en) 1999-05-03 2008-04-30 STMicroelectronics S.A. Computer instruction scheduling
US6449671B1 (en) 1999-06-09 2002-09-10 Ati International Srl Method and apparatus for busing data elements
US6473833B1 (en) 1999-07-30 2002-10-29 International Business Machines Corporation Integrated cache and directory structure for multi-level caches
US6643770B1 (en) 1999-09-16 2003-11-04 Intel Corporation Branch misprediction recovery using a side memory
US6772325B1 (en) 1999-10-01 2004-08-03 Hitachi, Ltd. Processor architecture and operation for exploiting improved branch control instruction
US6704822B1 (en) 1999-10-01 2004-03-09 Sun Microsystems, Inc. Arbitration protocol for a shared data cache
US6457120B1 (en) 1999-11-01 2002-09-24 International Business Machines Corporation Processor and method including a cache having confirmation bits for improving address predictable branch instruction target predictions
US7441110B1 (en) 1999-12-10 2008-10-21 International Business Machines Corporation Prefetching using future branch path information derived from branch prediction
US7107434B2 (en) 1999-12-20 2006-09-12 Board Of Regents, The University Of Texas System, method and apparatus for allocating hardware resources using pseudorandom sequences
WO2001046827A1 (en) 1999-12-22 2001-06-28 Ubicom, Inc. System and method for instruction level multithreading in an embedded processor using zero-time context switching
US6557095B1 (en) 1999-12-27 2003-04-29 Intel Corporation Scheduling operations using a dependency matrix
KR100747128B1 (ko) 2000-01-03 2007-08-09 Advanced Micro Devices, Inc. Scheduler that discovers the non-speculative nature of an instruction after issue and reissues the instruction
US6542984B1 (en) 2000-01-03 2003-04-01 Advanced Micro Devices, Inc. Scheduler capable of issuing and reissuing dependency chains
US6594755B1 (en) 2000-01-04 2003-07-15 National Semiconductor Corporation System and method for interleaved execution of multiple independent threads
US6728872B1 (en) 2000-02-04 2004-04-27 International Business Machines Corporation Method and apparatus for verifying that instructions are pipelined in correct architectural sequence
GB0002848D0 (en) 2000-02-08 2000-03-29 Siroyan Limited Communicating instruction results in processors and compiling methods for processors
GB2365661A (en) 2000-03-10 2002-02-20 British Telecomm Allocating switch requests within a packet switch
US6615340B1 (en) 2000-03-22 2003-09-02 Wilmot, Ii Richard Byron Extended operand management indicator structure and method
US7140022B2 (en) 2000-06-02 2006-11-21 Honeywell International Inc. Method and apparatus for slack stealing with dynamic threads
US6604187B1 (en) 2000-06-19 2003-08-05 Advanced Micro Devices, Inc. Providing global translations with address space numbers
US6557083B1 (en) 2000-06-30 2003-04-29 Intel Corporation Memory system for multiple data types
US6704860B1 (en) 2000-07-26 2004-03-09 International Business Machines Corporation Data processing system and method for fetching instruction blocks in response to a detected block sequence
US7206925B1 (en) 2000-08-18 2007-04-17 Sun Microsystems, Inc. Backing Register File for processors
US6728866B1 (en) 2000-08-31 2004-04-27 International Business Machines Corporation Partitioned issue queue and allocation strategy
US6721874B1 (en) 2000-10-12 2004-04-13 International Business Machines Corporation Method and system for dynamically shared completion table supporting multiple threads in a processing system
US6639866B2 (en) 2000-11-03 2003-10-28 Broadcom Corporation Very small swing high performance asynchronous CMOS static memory (multi-port register file) with power reducing column multiplexing scheme
US7757065B1 (en) 2000-11-09 2010-07-13 Intel Corporation Instruction segment recording scheme
WO2002071211A2 (en) 2000-11-20 2002-09-12 Zucotto Wireless, Inc. Data processor having multiple operating modes
JP2002185513A (ja) 2000-12-18 2002-06-28 Hitachi Ltd Packet communication network and packet transfer control method
US6877089B2 (en) 2000-12-27 2005-04-05 International Business Machines Corporation Branch prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program
US6907600B2 (en) 2000-12-27 2005-06-14 Intel Corporation Virtual translation lookaside buffer
US6647466B2 (en) 2001-01-25 2003-11-11 Hewlett-Packard Development Company, L.P. Method and apparatus for adaptively bypassing one or more levels of a cache hierarchy
FR2820921A1 (fr) 2001-02-14 2002-08-16 Canon Kk Device and method for transmission in a switch
US6985951B2 (en) 2001-03-08 2006-01-10 International Business Machines Corporation Inter-partition message passing method, system and program product for managing workload in a partitioned processing environment
US6950927B1 (en) 2001-04-13 2005-09-27 The United States Of America As Represented By The Secretary Of The Navy System and method for instruction-level parallelism in a programmable multiple network processor environment
US7200740B2 (en) 2001-05-04 2007-04-03 Ip-First, Llc Apparatus and method for speculatively performing a return instruction in a microprocessor
US7707397B2 (en) 2001-05-04 2010-04-27 Via Technologies, Inc. Variable group associativity branch target address cache delivering multiple target addresses per cache line
US6658549B2 (en) 2001-05-22 2003-12-02 Hewlett-Packard Development Company, Lp. Method and system allowing a single entity to manage memory comprising compressed and uncompressed data
US6985591B2 (en) 2001-06-29 2006-01-10 Intel Corporation Method and apparatus for distributing keys for decrypting and re-encrypting publicly distributed media
US7203824B2 (en) 2001-07-03 2007-04-10 Ip-First, Llc Apparatus and method for handling BTAC branches that wrap across instruction cache lines
US7024545B1 (en) 2001-07-24 2006-04-04 Advanced Micro Devices, Inc. Hybrid branch prediction device with two levels of branch prediction cache
US6954846B2 (en) 2001-08-07 2005-10-11 Sun Microsystems, Inc. Microprocessor and method for giving each thread exclusive access to one register file in a multi-threading mode and for giving an active thread access to multiple register files in a single thread mode
US6718440B2 (en) 2001-09-28 2004-04-06 Intel Corporation Memory access latency hiding with hint buffer
US7150021B1 (en) 2001-10-12 2006-12-12 Palau Acquisition Corporation (Delaware) Method and system to allocate resources within an interconnect device according to a resource allocation table
US7117347B2 (en) 2001-10-23 2006-10-03 Ip-First, Llc Processor including fallback branch prediction mechanism for far jump and far call instructions
US7272832B2 (en) 2001-10-25 2007-09-18 Hewlett-Packard Development Company, L.P. Method of protecting user process data in a secure platform inaccessible to the operating system and other tasks on top of the secure platform
US6964043B2 (en) 2001-10-30 2005-11-08 Intel Corporation Method, apparatus, and system to optimize frequently executed code and to use compiler transformation and hardware support to handle infrequently executed code
GB2381886B (en) 2001-11-07 2004-06-23 Sun Microsystems Inc Computer system with virtual memory and paging mechanism
US7092869B2 (en) * 2001-11-14 2006-08-15 Ronald Hilton Memory address prediction under emulation
US7080169B2 (en) 2001-12-11 2006-07-18 Emulex Design & Manufacturing Corporation Receiving data from interleaved multiple concurrent transactions in a FIFO memory having programmable buffer zones
US20030126416A1 (en) 2001-12-31 2003-07-03 Marr Deborah T. Suspending execution of a thread in a multi-threaded processor
US7363467B2 (en) 2002-01-03 2008-04-22 Intel Corporation Dependence-chain processing using trace descriptors having dependency descriptors
US6640333B2 (en) 2002-01-10 2003-10-28 Lsi Logic Corporation Architecture for a sea of platforms
US7055021B2 (en) 2002-02-05 2006-05-30 Sun Microsystems, Inc. Out-of-order processor that reduces mis-speculation using a replay scoreboard
US7331040B2 (en) * 2002-02-06 2008-02-12 Transitive Limited Condition code flag emulation for program code conversion
US20030154363A1 (en) 2002-02-11 2003-08-14 Soltis Donald C. Stacked register aliasing in data hazard detection to reduce circuit
US6839816B2 (en) 2002-02-26 2005-01-04 International Business Machines Corporation Shared cache line update mechanism
US6731292B2 (en) 2002-03-06 2004-05-04 Sun Microsystems, Inc. System and method for controlling a number of outstanding data transactions within an integrated circuit
JP3719509B2 (ja) 2002-04-01 2005-11-24 Sony Computer Entertainment Inc. Serial arithmetic pipeline, arithmetic unit, arithmetic logic circuit, and arithmetic method using a serial arithmetic pipeline
US7565509B2 (en) 2002-04-17 2009-07-21 Microsoft Corporation Using limits on address translation to control access to an addressable entity
US6920530B2 (en) 2002-04-23 2005-07-19 Sun Microsystems, Inc. Scheme for reordering instructions via an instruction caching mechanism
US7113488B2 (en) 2002-04-24 2006-09-26 International Business Machines Corporation Reconfigurable circular bus
US6760818B2 (en) 2002-05-01 2004-07-06 Koninklijke Philips Electronics N.V. Memory region based data pre-fetching
US7281055B2 (en) 2002-05-28 2007-10-09 Newisys, Inc. Routing mechanisms in systems having multiple multi-processor clusters
US7117346B2 (en) 2002-05-31 2006-10-03 Freescale Semiconductor, Inc. Data processing system having multiple register contexts and method therefor
US6938151B2 (en) 2002-06-04 2005-08-30 International Business Machines Corporation Hybrid branch prediction using a global selection counter and a prediction method comparison table
US6735747B2 (en) 2002-06-10 2004-05-11 Lsi Logic Corporation Pre-silicon verification path coverage
US8024735B2 (en) 2002-06-14 2011-09-20 Intel Corporation Method and apparatus for ensuring fairness and forward progress when executing multiple threads of execution
JP3845043B2 (ja) 2002-06-28 2006-11-15 Fujitsu Limited Instruction fetch control device
JP3982353B2 (ja) 2002-07-12 2007-09-26 NEC Corporation Fault-tolerant computer apparatus, resynchronization method therefor, and resynchronization program
US6944744B2 (en) 2002-08-27 2005-09-13 Advanced Micro Devices, Inc. Apparatus and method for independently schedulable functional units with issue lock mechanism in a processor
US7546422B2 (en) 2002-08-28 2009-06-09 Intel Corporation Method and apparatus for the synchronization of distributed caches
US6950925B1 (en) 2002-08-28 2005-09-27 Advanced Micro Devices, Inc. Scheduler for use in a microprocessor that supports data-speculative execution
TW200408242A (en) 2002-09-06 2004-05-16 Matsushita Electric Ind Co Ltd Home terminal apparatus and communication system
US6895491B2 (en) 2002-09-26 2005-05-17 Hewlett-Packard Development Company, L.P. Memory addressing for a virtual machine implementation on a computer processor supporting virtual hash-page-table searching
US7334086B2 (en) 2002-10-08 2008-02-19 Rmi Corporation Advanced processor with system on a chip interconnect technology
US6829698B2 (en) 2002-10-10 2004-12-07 International Business Machines Corporation Method, apparatus and system for acquiring a global promotion facility utilizing a data-less transaction
US7213248B2 (en) 2002-10-10 2007-05-01 International Business Machines Corporation High speed promotion mechanism suitable for lock acquisition in a multiprocessor data processing system
US7222218B2 (en) 2002-10-22 2007-05-22 Sun Microsystems, Inc. System and method for goal-based scheduling of blocks of code for concurrent execution
US20040103251A1 (en) 2002-11-26 2004-05-27 Mitchell Alsup Microprocessor including a first level cache and a second level cache having different cache line sizes
WO2004051449A2 (en) 2002-12-04 2004-06-17 Koninklijke Philips Electronics N.V. Register file gating to reduce microprocessor power dissipation
US6981083B2 (en) 2002-12-05 2005-12-27 International Business Machines Corporation Processor virtualization mechanism via an enhanced restoration of hard architected states
US7073042B2 (en) 2002-12-12 2006-07-04 Intel Corporation Reclaiming existing fields in address translation data structures to extend control over memory accesses
US20040117594A1 (en) 2002-12-13 2004-06-17 Vanderspek Julius Memory management method
US20040122887A1 (en) 2002-12-20 2004-06-24 Macy William W. Efficient multiplication of small matrices using SIMD registers
US7191349B2 (en) 2002-12-26 2007-03-13 Intel Corporation Mechanism for processor power state aware distribution of lowest priority interrupt
US20040139441A1 (en) 2003-01-09 2004-07-15 Kabushiki Kaisha Toshiba Processor, arithmetic operation processing method, and priority determination method
US6925421B2 (en) 2003-01-09 2005-08-02 International Business Machines Corporation Method, system, and computer program product for estimating the number of consumers that place a load on an individual resource in a pool of physically distributed resources
US7178010B2 (en) 2003-01-16 2007-02-13 Ip-First, Llc Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack
US7089374B2 (en) * 2003-02-13 2006-08-08 Sun Microsystems, Inc. Selectively unmarking load-marked cache lines during transactional program execution
US7278030B1 (en) 2003-03-03 2007-10-02 Vmware, Inc. Virtualization system for computers having multiple protection mechanisms
US6912644B1 (en) 2003-03-06 2005-06-28 Intel Corporation Method and apparatus to steer memory access operations in a virtual memory system
US7111145B1 (en) 2003-03-25 2006-09-19 Vmware, Inc. TLB miss fault handler and method for accessing multiple page tables
US7143273B2 (en) 2003-03-31 2006-11-28 Intel Corporation Method and apparatus for dynamic branch prediction utilizing multiple stew algorithms for indexing a global history
CN1214666C (zh) 2003-04-07 2005-08-10 华为技术有限公司 Method for limiting location information request traffic in location services
US7058764B2 (en) 2003-04-14 2006-06-06 Hewlett-Packard Development Company, L.P. Method of adaptive cache partitioning to increase host I/O performance
US7469407B2 (en) 2003-04-24 2008-12-23 International Business Machines Corporation Method for resource balancing using dispatch flush in a simultaneous multithread processor
US7139855B2 (en) 2003-04-24 2006-11-21 International Business Machines Corporation High performance synchronization of resource allocation in a logically-partitioned system
EP1471421A1 (en) 2003-04-24 2004-10-27 STMicroelectronics Limited Speculative load instruction control
US7290261B2 (en) 2003-04-24 2007-10-30 International Business Machines Corporation Method and logical apparatus for rename register reallocation in a simultaneous multi-threaded (SMT) processor
US7055003B2 (en) 2003-04-25 2006-05-30 International Business Machines Corporation Data cache scrub mechanism for large L2/L3 data cache structures
US7007108B2 (en) 2003-04-30 2006-02-28 Lsi Logic Corporation System method for use of hardware semaphores for resource release notification wherein messages comprises read-modify-write operation and address
US7743238B2 (en) 2003-05-09 2010-06-22 Arm Limited Accessing items of architectural state from a register cache in a data processing apparatus when performing branch prediction operations for an indirect branch instruction
US7861062B2 (en) 2003-06-25 2010-12-28 Koninklijke Philips Electronics N.V. Data processing device with instruction controlled clock speed
JP2005032018A (ja) 2003-07-04 2005-02-03 Semiconductor Energy Lab Co Ltd Microprocessor using a genetic algorithm
US7149872B2 (en) 2003-07-10 2006-12-12 Transmeta Corporation System and method for identifying TLB entries associated with a physical address of a specified range
US7089398B2 (en) 2003-07-31 2006-08-08 Silicon Graphics, Inc. Address translation using a page size tag
US8296771B2 (en) 2003-08-18 2012-10-23 Cray Inc. System and method for mapping between resource consumers and resource providers in a computing system
US7133950B2 (en) 2003-08-19 2006-11-07 Sun Microsystems, Inc. Request arbitration in multi-core processor
US9032404B2 (en) 2003-08-28 2015-05-12 Mips Technologies, Inc. Preemptive multitasking employing software emulation of directed exceptions in a multithreading processor
US7594089B2 (en) 2003-08-28 2009-09-22 Mips Technologies, Inc. Smart memory based synchronization controller for a multi-threaded multiprocessor SoC
US7849297B2 (en) 2003-08-28 2010-12-07 Mips Technologies, Inc. Software emulation of directed exceptions in a multithreading processor
US7610473B2 (en) 2003-08-28 2009-10-27 Mips Technologies, Inc. Apparatus, method, and instruction for initiation of concurrent instruction streams in a multithreading microprocessor
US7111126B2 (en) 2003-09-24 2006-09-19 Arm Limited Apparatus and method for loading data values
JP4057989B2 (ja) 2003-09-26 2008-03-05 株式会社東芝 Scheduling method and information processing system
US7047322B1 (en) 2003-09-30 2006-05-16 Unisys Corporation System and method for performing conflict resolution and flow control in a multiprocessor system
FR2860313B1 (fr) 2003-09-30 2005-11-04 Commissariat Energie Atomique Component with dynamically reconfigurable architecture
US7373637B2 (en) 2003-09-30 2008-05-13 International Business Machines Corporation Method and apparatus for counting instruction and memory location ranges
TWI281121B (en) 2003-10-06 2007-05-11 Ip First Llc Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
US7395372B2 (en) 2003-11-14 2008-07-01 International Business Machines Corporation Method and system for providing cache set selection which is power optimized
US7243170B2 (en) 2003-11-24 2007-07-10 International Business Machines Corporation Method and circuit for reading and writing an instruction buffer
US20050120191A1 (en) 2003-12-02 2005-06-02 Intel Corporation (A Delaware Corporation) Checkpoint-based register reclamation
US20050132145A1 (en) 2003-12-15 2005-06-16 Finisar Corporation Contingent processor time division multiple access of memory in a multi-processor system to allow supplemental memory consumer access
US7310722B2 (en) 2003-12-18 2007-12-18 Nvidia Corporation Across-thread out of order instruction dispatch in a multithreaded graphics processor
US7293164B2 (en) 2004-01-14 2007-11-06 International Business Machines Corporation Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions
US7426749B2 (en) 2004-01-20 2008-09-16 International Business Machines Corporation Distributed computation in untrusted computing environments using distractive computational units
US20050204118A1 (en) 2004-02-27 2005-09-15 National Chiao Tung University Method for inter-cluster communication that employs register permutation
EP1738258A4 (en) 2004-03-13 2009-10-28 Cluster Resources Inc System and method implementing object triggers
US7478374B2 (en) 2004-03-22 2009-01-13 Intel Corporation Debug system having assembler correcting register allocation errors
US20050216920A1 (en) 2004-03-24 2005-09-29 Vijay Tewari Use of a virtual machine to emulate a hardware device
US8055885B2 (en) 2004-03-29 2011-11-08 Japan Science And Technology Agency Data processing device for implementing instruction reuse, and digital data storage medium for storing a data processing program for implementing instruction reuse
US7386679B2 (en) 2004-04-15 2008-06-10 International Business Machines Corporation System, method and storage medium for memory management
US7383427B2 (en) 2004-04-22 2008-06-03 Sony Computer Entertainment Inc. Multi-scalar extension for SIMD instruction set processors
US20050251649A1 (en) 2004-04-23 2005-11-10 Sony Computer Entertainment Inc. Methods and apparatus for address map optimization on a multi-scalar extension
US7418582B1 (en) 2004-05-13 2008-08-26 Sun Microsystems, Inc. Versatile register file design for a multi-threaded processor utilizing different modes and register windows
US7478198B2 (en) 2004-05-24 2009-01-13 Intel Corporation Multithreaded clustered microarchitecture with dynamic back-end assignment
US7594234B1 (en) 2004-06-04 2009-09-22 Sun Microsystems, Inc. Adaptive spin-then-block mutual exclusion in multi-threaded processing
US7284092B2 (en) 2004-06-24 2007-10-16 International Business Machines Corporation Digital data processing apparatus having multi-level register file
US20050289530A1 (en) * 2004-06-29 2005-12-29 Robison Arch D Scheduling of instructions in program compilation
EP1628235A1 (en) 2004-07-01 2006-02-22 Texas Instruments Incorporated Method and system of ensuring integrity of a secure mode entry sequence
US8044951B1 (en) 2004-07-02 2011-10-25 Nvidia Corporation Integer-based functionality in a graphics shading language
US7339592B2 (en) 2004-07-13 2008-03-04 Nvidia Corporation Simulating multiported memories using lower port count memories
US7398347B1 (en) 2004-07-14 2008-07-08 Altera Corporation Methods and apparatus for dynamic instruction controlled reconfigurable register file
EP1619593A1 (en) 2004-07-22 2006-01-25 Sap Ag Computer-Implemented method and system for performing a product availability check
JP4064380B2 (ja) 2004-07-29 2008-03-19 富士通株式会社 Arithmetic processing unit and control method therefor
US8443171B2 (en) 2004-07-30 2013-05-14 Hewlett-Packard Development Company, L.P. Run-time updating of prediction hint instructions
US7213106B1 (en) 2004-08-09 2007-05-01 Sun Microsystems, Inc. Conservative shadow cache support in a point-to-point connected multiprocessing node
US7318143B2 (en) 2004-10-20 2008-01-08 Arm Limited Reuseable configuration data
US20090150890A1 (en) 2007-12-10 2009-06-11 Yourst Matt T Strand-based computing hardware and dynamically optimizing strandware for a high performance microprocessor system
US7707578B1 (en) 2004-12-16 2010-04-27 Vmware, Inc. Mechanism for scheduling execution of threads for fair resource allocation in a multi-threaded and/or multi-core processing system
US7257695B2 (en) 2004-12-28 2007-08-14 Intel Corporation Register file regions for a processing system
US7996644B2 (en) 2004-12-29 2011-08-09 Intel Corporation Fair sharing of a cache in a multi-core/multi-threaded processor by dynamically partitioning of the cache
US8719819B2 (en) 2005-06-30 2014-05-06 Intel Corporation Mechanism for instruction set based thread execution on a plurality of instruction sequencers
US7050922B1 (en) 2005-01-14 2006-05-23 Agilent Technologies, Inc. Method for optimizing test order, and machine-readable media storing sequences of instructions to perform same
US7657891B2 (en) 2005-02-04 2010-02-02 Mips Technologies, Inc. Multithreading microprocessor with optimized thread scheduler for increasing pipeline utilization efficiency
US7681014B2 (en) 2005-02-04 2010-03-16 Mips Technologies, Inc. Multithreading instruction scheduler employing thread group priorities
US20060179277A1 (en) 2005-02-04 2006-08-10 Flachs Brian K System and method for instruction line buffer holding a branch target buffer
EP1849095B1 (en) 2005-02-07 2013-01-02 Richter, Thomas Low latency massive parallel data processing device
US7400548B2 (en) 2005-02-09 2008-07-15 International Business Machines Corporation Method for providing multiple reads/writes using a 2read/2write register file array
US7343476B2 (en) 2005-02-10 2008-03-11 International Business Machines Corporation Intelligent SMT thread hang detect taking into account shared resource contention/blocking
US7152155B2 (en) 2005-02-18 2006-12-19 Qualcomm Incorporated System and method of correcting a branch misprediction
US20060200655A1 (en) 2005-03-04 2006-09-07 Smith Rodney W Forward looking branch target address caching
US20060212853A1 (en) 2005-03-18 2006-09-21 Marvell World Trade Ltd. Real-time control apparatus having a multi-thread processor
US8195922B2 (en) 2005-03-18 2012-06-05 Marvell World Trade, Ltd. System for dynamically allocating processing time to multiple threads
GB2424727B (en) * 2005-03-30 2007-08-01 Transitive Ltd Preparing instruction groups for a processor having a multiple issue ports
US8522253B1 (en) 2005-03-31 2013-08-27 Guillermo Rozas Hardware support for virtual machine and operating system context switching in translation lookaside buffers and virtually tagged caches
US7313775B2 (en) 2005-04-06 2007-12-25 Lsi Corporation Integrated circuit with relocatable processor hardmac
US20060230243A1 (en) 2005-04-06 2006-10-12 Robert Cochran Cascaded snapshots
US8230423B2 (en) 2005-04-07 2012-07-24 International Business Machines Corporation Multithreaded processor architecture with operational latency hiding
US20060230409A1 (en) 2005-04-07 2006-10-12 Matteo Frigo Multithreaded processor architecture with implicit granularity adaptation
US7447869B2 (en) 2005-04-07 2008-11-04 Ati Technologies, Inc. Method and apparatus for fragment processing in a virtual memory system
US20060230253A1 (en) 2005-04-11 2006-10-12 Lucian Codrescu Unified non-partitioned register files for a digital signal processor operating in an interleaved multi-threaded environment
US20060236074A1 (en) 2005-04-14 2006-10-19 Arm Limited Indicating storage locations within caches
US7600135B2 (en) 2005-04-14 2009-10-06 Mips Technologies, Inc. Apparatus and method for software specified power management performance using low power virtual threads
US7437543B2 (en) 2005-04-19 2008-10-14 International Business Machines Corporation Reducing the fetch time of target instructions of a predicted taken branch instruction
US7461237B2 (en) 2005-04-20 2008-12-02 Sun Microsystems, Inc. Method and apparatus for suppressing duplicative prefetches for branch target cache lines
WO2006116044A2 (en) 2005-04-22 2006-11-02 Altrix Logic, Inc. Array of data processing elements with variable precision interconnect
US8713286B2 (en) 2005-04-26 2014-04-29 Qualcomm Incorporated Register files for a digital signal processor operating in an interleaved multi-threaded environment
US7630388B2 (en) 2005-05-04 2009-12-08 Arm Limited Software defined FIFO memory for storing a set of data from a stream of source data
GB2426084A (en) 2005-05-13 2006-11-15 Agilent Technologies Inc Updating data in a dual port memory
US7861055B2 (en) 2005-06-07 2010-12-28 Broadcom Corporation Method and system for on-chip configurable data ram for fast memory and pseudo associative caches
US8010969B2 (en) 2005-06-13 2011-08-30 Intel Corporation Mechanism for monitoring instruction set based thread execution on a plurality of instruction sequencers
KR101355496B1 (ko) 2005-08-29 2014-01-28 디 인벤션 사이언스 펀드 원, 엘엘씨 Scheduling mechanism for a hierarchical processor including multiple parallel clusters
EP1927054A1 (en) 2005-09-14 2008-06-04 Koninklijke Philips Electronics N.V. Method and system for bus arbitration
US7562271B2 (en) 2005-09-26 2009-07-14 Rambus Inc. Memory system topologies including a buffer device and an integrated circuit memory device
US7350056B2 (en) 2005-09-27 2008-03-25 International Business Machines Corporation Method and apparatus for issuing instructions from an issue queue in an information handling system
US7676634B1 (en) 2005-09-28 2010-03-09 Sun Microsystems, Inc. Selective trace cache invalidation for self-modifying code via memory aging
US7231106B2 (en) 2005-09-30 2007-06-12 Lucent Technologies Inc. Apparatus for directing an optical signal from an input fiber to an output fiber within a high index host
US7627735B2 (en) 2005-10-21 2009-12-01 Intel Corporation Implementing vector memory operations
US7613131B2 (en) 2005-11-10 2009-11-03 Citrix Systems, Inc. Overlay network infrastructure
US7681019B1 (en) 2005-11-18 2010-03-16 Sun Microsystems, Inc. Executing functions determined via a collection of operations from translated instructions
US7861060B1 (en) 2005-12-15 2010-12-28 Nvidia Corporation Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior
US7634637B1 (en) 2005-12-16 2009-12-15 Nvidia Corporation Execution of parallel groups of threads with per-instruction serialization
US7673111B2 (en) 2005-12-23 2010-03-02 Intel Corporation Memory system with both single and consolidated commands
US7770161B2 (en) 2005-12-28 2010-08-03 International Business Machines Corporation Post-register allocation profile directed instruction scheduling
US8423682B2 (en) 2005-12-30 2013-04-16 Intel Corporation Address space emulation
US7512767B2 (en) 2006-01-04 2009-03-31 Sony Ericsson Mobile Communications Ab Data compression method for supporting virtual memory management in a demand paging system
US20070186050A1 (en) 2006-02-03 2007-08-09 International Business Machines Corporation Self prefetching L2 cache mechanism for data lines
GB2435362B (en) 2006-02-20 2008-11-26 Cramer Systems Ltd Method of configuring devices in a telecommunications network
WO2007097019A1 (ja) 2006-02-27 2007-08-30 Fujitsu Limited Cache control device and cache control method
US7543282B2 (en) 2006-03-24 2009-06-02 Sun Microsystems, Inc. Method and apparatus for selectively executing different executable code versions which are optimized in different ways
US7500073B1 (en) 2006-03-29 2009-03-03 Sun Microsystems, Inc. Relocation of virtual-to-physical mappings
CN103646009B (zh) 2006-04-12 2016-08-17 索夫特机械公司 Apparatus and method for processing an instruction matrix specifying parallel and dependent operations
US7577820B1 (en) 2006-04-14 2009-08-18 Tilera Corporation Managing data in a parallel processing environment
US7610571B2 (en) 2006-04-14 2009-10-27 Cadence Design Systems, Inc. Method and system for simulating state retention of an RTL design
CN100485636C (zh) 2006-04-24 2009-05-06 华为技术有限公司 Debugging method and apparatus for model-driven development of carrier-grade services
US7804076B2 (en) 2006-05-10 2010-09-28 Taiwan Semiconductor Manufacturing Co., Ltd Insulator for high current ion implanters
US8145882B1 (en) 2006-05-25 2012-03-27 Mips Technologies, Inc. Apparatus and method for processing template based user defined instructions
US8108844B2 (en) 2006-06-20 2012-01-31 Google Inc. Systems and methods for dynamically choosing a processing element for a compute kernel
US20080126771A1 (en) 2006-07-25 2008-05-29 Lei Chen Branch Target Extension for an Instruction Cache
CN100495324C (zh) 2006-07-27 2009-06-03 中国科学院计算技术研究所 Depth-first exception handling method in a complex instruction set architecture
US8046775B2 (en) 2006-08-14 2011-10-25 Marvell World Trade Ltd. Event-based bandwidth allocation mode switching method and apparatus
US7904704B2 (en) 2006-08-14 2011-03-08 Marvell World Trade Ltd. Instruction dispatching method and apparatus
US7539842B2 (en) 2006-08-15 2009-05-26 International Business Machines Corporation Computer memory system for selecting memory buses according to physical memory organization information stored in virtual address translation tables
US7594060B2 (en) 2006-08-23 2009-09-22 Sun Microsystems, Inc. Data buffer allocation in a non-blocking data services platform using input/output switching fabric
US7657494B2 (en) 2006-09-20 2010-02-02 Chevron U.S.A. Inc. Method for forecasting the production of a petroleum reservoir utilizing genetic programming
US7752474B2 (en) 2006-09-22 2010-07-06 Apple Inc. L1 cache flush when processor is entering low power mode
US7716460B2 (en) 2006-09-29 2010-05-11 Qualcomm Incorporated Effective use of a BHT in processor having variable length instruction set execution modes
US7774549B2 (en) 2006-10-11 2010-08-10 Mips Technologies, Inc. Horizontally-shared cache victims in multiple core processors
TWI337495B (en) * 2006-10-26 2011-02-11 Au Optronics Corp System and method for operation scheduling
US7680988B1 (en) 2006-10-30 2010-03-16 Nvidia Corporation Single interconnect providing read and write access to a memory shared by concurrent threads
US8108625B1 (en) 2006-10-30 2012-01-31 Nvidia Corporation Shared memory with parallel access and access conflict resolution mechanism
US7617384B1 (en) 2006-11-06 2009-11-10 Nvidia Corporation Structured programming control flow using a disable mask in a SIMD architecture
CN101627365B (zh) 2006-11-14 2017-03-29 索夫特机械公司 Multithreaded architecture
US7493475B2 (en) 2006-11-15 2009-02-17 Stmicroelectronics, Inc. Instruction vector-mode processing in multi-lane processor by multiplex switch replicating instruction in one lane to select others along with updated operand address
US7934179B2 (en) 2006-11-20 2011-04-26 Et International, Inc. Systems and methods for logic verification
US20080235500A1 (en) 2006-11-21 2008-09-25 Davis Gordon T Structure for instruction cache trace formation
JP2008130056A (ja) 2006-11-27 2008-06-05 Renesas Technology Corp Semiconductor circuit
US7945763B2 (en) 2006-12-13 2011-05-17 International Business Machines Corporation Single shared instruction predecoder for supporting multiple processors
US20080148020A1 (en) 2006-12-13 2008-06-19 Luick David A Low Cost Persistent Instruction Predecoded Issue and Dispatcher
WO2008077088A2 (en) 2006-12-19 2008-06-26 The Board Of Governors For Higher Education, State Of Rhode Island And Providence Plantations System and method for branch misprediction prediction using complementary branch predictors
US7783869B2 (en) 2006-12-19 2010-08-24 Arm Limited Accessing branch predictions ahead of instruction fetching
EP1940028B1 (en) 2006-12-29 2012-02-29 STMicroelectronics Srl Asynchronous interconnection system for 3D inter-chip communication
US8321849B2 (en) 2007-01-26 2012-11-27 Nvidia Corporation Virtual architecture and instruction set for parallel thread computing
TW200833002A (en) 2007-01-31 2008-08-01 Univ Nat Yunlin Sci & Tech Distributed switching circuit having fairness
US20080189501A1 (en) 2007-02-05 2008-08-07 Irish John D Methods and Apparatus for Issuing Commands on a Bus
US7685410B2 (en) 2007-02-13 2010-03-23 Global Foundries Inc. Redirect recovery cache that receives branch misprediction redirects and caches instructions to be dispatched in response to the redirects
US7647483B2 (en) 2007-02-20 2010-01-12 Sony Computer Entertainment Inc. Multi-threaded parallel processor methods and apparatus
US20080209190A1 (en) 2007-02-28 2008-08-28 Advanced Micro Devices, Inc. Parallel prediction of multiple branches
JP4980751B2 (ja) 2007-03-02 2012-07-18 富士通セミコンダクター株式会社 Data processing device and memory read-active control method
US8452907B2 (en) 2007-03-27 2013-05-28 Arm Limited Data processing apparatus and method for arbitrating access to a shared resource
US20080250227A1 (en) * 2007-04-04 2008-10-09 Linderman Michael D General Purpose Multiprocessor Programming Apparatus And Method
US7716183B2 (en) 2007-04-11 2010-05-11 Dot Hill Systems Corporation Snapshot preserved data cloning
US7941791B2 (en) 2007-04-13 2011-05-10 Perry Wang Programming environment for heterogeneous processor resource integration
US7769955B2 (en) 2007-04-27 2010-08-03 Arm Limited Multiple thread instruction fetch from different cache levels
US7711935B2 (en) 2007-04-30 2010-05-04 Netlogic Microsystems, Inc. Universal branch identifier for invalidation of speculative instructions
US8555039B2 (en) 2007-05-03 2013-10-08 Qualcomm Incorporated System and method for using a local condition code register for accelerating conditional instruction execution in a pipeline processor
US8219996B1 (en) 2007-05-09 2012-07-10 Hewlett-Packard Development Company, L.P. Computer processor with fairness monitor
US9292436B2 (en) 2007-06-25 2016-03-22 Sonics, Inc. Various methods and apparatus to support transactions whose data address sequence within that transaction crosses an interleaved channel address boundary
CN101344840B (zh) 2007-07-10 2011-08-31 苏州简约纳电子有限公司 Microprocessor and method for executing instructions in a microprocessor
US7937568B2 (en) 2007-07-11 2011-05-03 International Business Machines Corporation Adaptive execution cycle control method for enhanced instruction throughput
US20090025004A1 (en) 2007-07-16 2009-01-22 Microsoft Corporation Scheduling by Growing and Shrinking Resource Allocation
US8433851B2 (en) 2007-08-16 2013-04-30 International Business Machines Corporation Reducing wiring congestion in a cache subsystem utilizing sectored caches with discontiguous addressing
US8108545B2 (en) 2007-08-27 2012-01-31 International Business Machines Corporation Packet coalescing in virtual channels of a data processing system in a multi-tiered full-graph interconnect architecture
US7711929B2 (en) 2007-08-30 2010-05-04 International Business Machines Corporation Method and system for tracking instruction dependency in an out-of-order processor
GB2452316B (en) 2007-08-31 2009-08-19 Toshiba Res Europ Ltd Method of Allocating Resources in a Computer.
US8725991B2 (en) 2007-09-12 2014-05-13 Qualcomm Incorporated Register file system and method for pipelined processing
US8082420B2 (en) 2007-10-24 2011-12-20 International Business Machines Corporation Method and apparatus for executing instructions
US7856530B1 (en) 2007-10-31 2010-12-21 Network Appliance, Inc. System and method for implementing a dynamic cache for a data storage system
CN100478918C (zh) 2007-10-31 2009-04-15 中国人民解放军国防科学技术大学 Design method for a segmented cache in a microprocessor, and segmented cache
JP2011503710A (ja) 2007-11-09 2011-01-27 プルラリティー リミテッド Shared memory system for a tightly coupled multiprocessor
US7877559B2 (en) 2007-11-26 2011-01-25 Globalfoundries Inc. Mechanism to accelerate removal of store operations from a queue
US8245232B2 (en) 2007-11-27 2012-08-14 Microsoft Corporation Software-configurable and stall-time fair memory access scheduling mechanism for shared memory systems
US7809925B2 (en) 2007-12-07 2010-10-05 International Business Machines Corporation Processing unit incorporating vectorizable execution unit
US8145844B2 (en) 2007-12-13 2012-03-27 Arm Limited Memory controller with write data cache and read data cache
US7870371B2 (en) 2007-12-17 2011-01-11 Microsoft Corporation Target-frequency based indirect jump prediction for high-performance processors
US7831813B2 (en) 2007-12-17 2010-11-09 Globalfoundries Inc. Uses of known good code for implementing processor architectural modifications
US20090165007A1 (en) 2007-12-19 2009-06-25 Microsoft Corporation Task-level thread scheduling and resource allocation
US8782384B2 (en) 2007-12-20 2014-07-15 Advanced Micro Devices, Inc. Branch history with polymorphic indirect branch information
US7917699B2 (en) 2007-12-21 2011-03-29 Mips Technologies, Inc. Apparatus and method for controlling the exclusivity mode of a level-two cache
US8645965B2 (en) 2007-12-31 2014-02-04 Intel Corporation Supporting metered clients with manycore through time-limited partitioning
US9244855B2 (en) 2007-12-31 2016-01-26 Intel Corporation Method, system, and apparatus for page sizing extension
CN101217495A (zh) 2008-01-11 2008-07-09 北京邮电大学 Traffic monitoring method and apparatus for a T-MPLS network environment
US7877582B2 (en) 2008-01-31 2011-01-25 International Business Machines Corporation Multi-addressable register file
WO2009101563A1 (en) 2008-02-11 2009-08-20 Nxp B.V. Multiprocessing implementing a plurality of virtual processors
US9021240B2 (en) 2008-02-22 2015-04-28 International Business Machines Corporation System and method for Controlling restarting of instruction fetching using speculative address computations
US7949972B2 (en) 2008-03-19 2011-05-24 International Business Machines Corporation Method, system and computer program product for exploiting orthogonal control vectors in timing driven synthesis
US7987343B2 (en) 2008-03-19 2011-07-26 International Business Machines Corporation Processor and method for synchronous load multiple fetching sequence and pipeline stage result tracking to facilitate early address generation interlock bypass
US9513905B2 (en) 2008-03-28 2016-12-06 Intel Corporation Vector instructions to enable efficient synchronization and parallel reduction operations
US8120608B2 (en) 2008-04-04 2012-02-21 Via Technologies, Inc. Constant buffering for a computational core of a programmable graphics processing unit
CR20170001A (es) 2008-04-28 2017-08-10 Genentech Inc Humanized anti-factor D antibodies
TWI364703B (en) 2008-05-26 2012-05-21 Faraday Tech Corp Processor and early execution method of data load thereof
US8131982B2 (en) 2008-06-13 2012-03-06 International Business Machines Corporation Branch prediction instructions having mask values involving unloading and loading branch history data
US8145880B1 (en) 2008-07-07 2012-03-27 Ovics Matrix processor data switch routing systems and methods
US8516454B2 (en) 2008-07-10 2013-08-20 Rocketick Technologies Ltd. Efficient parallel computation of dependency problems
JP2010039536A (ja) 2008-07-31 2010-02-18 Panasonic Corp Program conversion device, program conversion method, and program conversion program
US8316435B1 (en) 2008-08-14 2012-11-20 Juniper Networks, Inc. Routing device having integrated MPLS-aware firewall with virtual security system support
US8135942B2 (en) 2008-08-28 2012-03-13 International Business Machines Corporation System and method for double-issue instructions using a dependency matrix and a side issue queue
US7769984B2 (en) 2008-09-11 2010-08-03 International Business Machines Corporation Dual-issuance of microprocessor instructions using dual dependency matrices
US8225048B2 (en) 2008-10-01 2012-07-17 Hewlett-Packard Development Company, L.P. Systems and methods for resource access
US9244732B2 (en) 2009-08-28 2016-01-26 Vmware, Inc. Compensating threads for microarchitectural resource contentions by prioritizing scheduling and execution
US7941616B2 (en) 2008-10-21 2011-05-10 Microsoft Corporation System to reduce interference in concurrent programs
US8423749B2 (en) 2008-10-22 2013-04-16 International Business Machines Corporation Sequential processing in network on chip nodes by threads generating message containing payload and pointer for nanokernel to access algorithm to be executed on payload in another node
GB2464703A (en) 2008-10-22 2010-04-28 Advanced Risc Mach Ltd An array of interconnected processors executing a cycle-based program
WO2010049585A1 (en) 2008-10-30 2010-05-06 Nokia Corporation Method and apparatus for interleaving a data block
US8032678B2 (en) 2008-11-05 2011-10-04 Mediatek Inc. Shared resource arbitration
US7848129B1 (en) 2008-11-20 2010-12-07 Netlogic Microsystems, Inc. Dynamically partitioned CAM array
US8868838B1 (en) 2008-11-21 2014-10-21 Nvidia Corporation Multi-class data cache policies
US8171223B2 (en) 2008-12-03 2012-05-01 Intel Corporation Method and system to increase concurrency and control replication in a multi-core cache hierarchy
US8200949B1 (en) 2008-12-09 2012-06-12 Nvidia Corporation Policy based allocation of register file cache to threads in multi-threaded processor
US8312268B2 (en) 2008-12-12 2012-11-13 International Business Machines Corporation Virtual machine
US7870308B2 (en) 2008-12-23 2011-01-11 International Business Machines Corporation Programmable direct memory access engine
US8099586B2 (en) 2008-12-30 2012-01-17 Oracle America, Inc. Branch misprediction recovery mechanism for microprocessors
US20100169578A1 (en) 2008-12-31 2010-07-01 Texas Instruments Incorporated Cache tag memory
US20100205603A1 (en) 2009-02-09 2010-08-12 Unisys Corporation Scheduling and dispatching tasks in an emulated operating system
JP5417879B2 (ja) 2009-02-17 2014-02-19 富士通セミコンダクター株式会社 Cache device
JP2010226275A (ja) 2009-03-23 2010-10-07 Nec Corp Communication device and communication method
US8505013B2 (en) 2010-03-12 2013-08-06 Lsi Corporation Reducing data read latency in a network communications processor architecture
US8805788B2 (en) 2009-05-04 2014-08-12 Moka5, Inc. Transactional virtual disk with differential snapshots
US8332854B2 (en) * 2009-05-19 2012-12-11 Microsoft Corporation Virtualized thread scheduling for hardware thread optimization based on hardware resource parameter summaries of instruction blocks in execution groups
US8533437B2 (en) 2009-06-01 2013-09-10 Via Technologies, Inc. Guaranteed prefetch instruction
TW201044185A (en) 2009-06-09 2010-12-16 Zillians Inc Virtual world simulation systems and methods utilizing parallel coprocessors, and computer program products thereof
GB2471067B (en) 2009-06-12 2011-11-30 Graeme Roy Smith Shared resource multi-thread array processor
US9122487B2 (en) 2009-06-23 2015-09-01 Oracle America, Inc. System and method for balancing instruction loads between multiple execution units using assignment history
US8386754B2 (en) 2009-06-24 2013-02-26 Arm Limited Renaming wide register source operand with plural short register source operands for select instructions to detect dependency fast with existing mechanism
CN101582025B (zh) 2009-06-25 2011-05-25 Zhejiang University Method for implementing a global register rename table under a chip multiprocessor architecture
US8397049B2 (en) 2009-07-13 2013-03-12 Apple Inc. TLB prefetching
US8539486B2 (en) 2009-07-17 2013-09-17 International Business Machines Corporation Transactional block conflict resolution based on the determination of executing threads in parallel or in serial mode
JP5423217B2 (ja) 2009-08-04 2014-02-19 Fujitsu Limited Arithmetic processing device, information processing device, and control method for arithmetic processing device
US8127078B2 (en) 2009-10-02 2012-02-28 International Business Machines Corporation High performance unaligned cache access
US20110082983A1 (en) 2009-10-06 2011-04-07 Alcatel-Lucent Canada, Inc. Cpu instruction and data cache corruption prevention system
US8695002B2 (en) 2009-10-20 2014-04-08 Lantiq Deutschland Gmbh Multi-threaded processors and multi-processor systems comprising shared resources
US8364933B2 (en) 2009-12-18 2013-01-29 International Business Machines Corporation Software assisted translation lookaside buffer search mechanism
JP2011150397A (ja) 2010-01-19 2011-08-04 Panasonic Corp Bus arbitration device
KR101699910B1 (ko) 2010-03-04 2017-01-26 Samsung Electronics Co., Ltd. Reconfigurable processor and control method thereof
US20120005462A1 (en) 2010-07-01 2012-01-05 International Business Machines Corporation Hardware Assist for Optimizing Code During Processing
US8312258B2 (en) 2010-07-22 2012-11-13 Intel Corporation Providing platform independent memory logic
US8751745B2 (en) 2010-08-11 2014-06-10 Advanced Micro Devices, Inc. Method for concurrent flush of L1 and L2 caches
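CN101916180B (zh) 2010-08-11 2013-05-29 Institute of Computing Technology, Chinese Academy of Sciences Method and system for executing register-type instructions in a RISC processor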
US9201801B2 (en) 2010-09-15 2015-12-01 International Business Machines Corporation Computing device with asynchronous auxiliary execution unit
US8856460B2 (en) 2010-09-15 2014-10-07 Oracle International Corporation System and method for zero buffer copying in a middleware environment
EP3156896B1 (en) 2010-09-17 2020-04-08 Soft Machines, Inc. Single cycle multi-branch prediction including shadow cache for early far branch prediction
US20120079212A1 (en) 2010-09-23 2012-03-29 International Business Machines Corporation Architecture for sharing caches among multiple processes
US9733944B2 (en) 2010-10-12 2017-08-15 Intel Corporation Instruction sequence buffer to store branches having reliably predictable instruction sequences
EP2628072B1 (en) 2010-10-12 2016-10-12 Soft Machines, Inc. An instruction sequence buffer to enhance branch prediction efficiency
US8370553B2 (en) 2010-10-18 2013-02-05 International Business Machines Corporation Formal verification of random priority-based arbiters using property strengthening and underapproximations
US9047178B2 (en) 2010-12-13 2015-06-02 SanDisk Technologies, Inc. Auto-commit memory synchronization
US8677355B2 (en) 2010-12-17 2014-03-18 Microsoft Corporation Virtual machine branching and parallel execution
WO2012103245A2 (en) 2011-01-27 2012-08-02 Soft Machines Inc. Guest instruction block with near branching and far branching sequence construction to native instruction block
EP2689327B1 (en) 2011-03-25 2021-07-28 Intel Corporation Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines
CN103635875B (zh) 2011-03-25 2018-02-16 Intel Corporation Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines
WO2012135041A2 (en) 2011-03-25 2012-10-04 Soft Machines, Inc. Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines
US20120254592A1 (en) 2011-04-01 2012-10-04 Jesus Corbal San Adrian Systems, apparatuses, and methods for expanding a memory source into a destination register and compressing a source register into a destination memory location
US9740494B2 (en) 2011-04-29 2017-08-22 Arizona Board Of Regents For And On Behalf Of Arizona State University Low complexity out-of-order issue logic using static circuits
US8843690B2 (en) 2011-07-11 2014-09-23 Avago Technologies General Ip (Singapore) Pte. Ltd. Memory conflicts learning capability
US8930432B2 (en) 2011-08-04 2015-01-06 International Business Machines Corporation Floating point execution unit with fixed point functionality
US20130046934A1 (en) 2011-08-15 2013-02-21 Robert Nychka System caching using heterogenous memories
US8839025B2 (en) 2011-09-30 2014-09-16 Oracle International Corporation Systems and methods for retiring and unretiring cache lines
KR101703401B1 (ko) 2011-11-22 2017-02-06 Soft Machines, Inc. An accelerated code optimizer for a multiengine microprocessor
WO2013077872A1 (en) 2011-11-22 2013-05-30 Soft Machines, Inc. A microprocessor accelerated code optimizer and dependency reordering method
US20150039859A1 (en) 2011-11-22 2015-02-05 Soft Machines, Inc. Microprocessor accelerated code optimizer
US20130138888A1 (en) 2011-11-30 2013-05-30 Jama I. Barreh Storing a target address of a control transfer instruction in an instruction field
US8930674B2 (en) 2012-03-07 2015-01-06 Soft Machines, Inc. Systems and methods for accessing a unified translation lookaside buffer
KR20130119285A (ko) 2012-04-23 2013-10-31 Electronics and Telecommunications Research Institute Apparatus and method for resource allocation in a cluster computing environment
US9684601B2 (en) 2012-05-10 2017-06-20 Arm Limited Data processing apparatus having cache and translation lookaside buffer
US9996348B2 (en) 2012-06-14 2018-06-12 Apple Inc. Zero cycle load
US9940247B2 (en) 2012-06-26 2018-04-10 Advanced Micro Devices, Inc. Concurrent access to cache dirty bits
US9710399B2 (en) 2012-07-30 2017-07-18 Intel Corporation Systems and methods for flushing a cache with modified data
US9229873B2 (en) 2012-07-30 2016-01-05 Soft Machines, Inc. Systems and methods for supporting a plurality of load and store accesses of a cache
US9916253B2 (en) 2012-07-30 2018-03-13 Intel Corporation Method and apparatus for supporting a plurality of load accesses of a cache in a single cycle to maintain throughput
US9430410B2 (en) 2012-07-30 2016-08-30 Soft Machines, Inc. Systems and methods for supporting a plurality of load accesses of a cache in a single cycle
US9740612B2 (en) 2012-07-30 2017-08-22 Intel Corporation Systems and methods for maintaining the coherency of a store coalescing cache and a load cache
US9678882B2 (en) 2012-10-11 2017-06-13 Intel Corporation Systems and methods for non-blocking implementation of cache flush instructions
US10037228B2 (en) 2012-10-25 2018-07-31 Nvidia Corporation Efficient memory virtualization in multi-threaded processing units
US9195506B2 (en) 2012-12-21 2015-11-24 International Business Machines Corporation Processor provisioning by a middleware processing system for a plurality of logical processor partitions
GB2514956B (en) 2013-01-21 2015-04-01 Imagination Tech Ltd Allocating resources to threads based on speculation metric
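KR102083390B1 (ko) * 2013-03-15 2020-03-02 Intel Corporation Method for emulating a guest centralized flag architecture by using a native distributed flag architecture
KR101708591B1 (ko) 2013-03-15 2017-02-20 Soft Machines, Inc. Method for executing multithreaded instructions grouped into blocks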
US10140138B2 (en) 2013-03-15 2018-11-27 Intel Corporation Methods, systems and apparatus for supporting wide and efficient front-end operation with guest-architecture emulation
US9811342B2 (en) 2013-03-15 2017-11-07 Intel Corporation Method for performing dual dispatch of blocks and half blocks
EP2972794A4 (en) 2013-03-15 2017-05-03 Soft Machines, Inc. A method for executing blocks of instructions using a microprocessor architecture having a register view, source view, instruction view, and a plurality of register templates
US9112767B2 (en) 2013-03-15 2015-08-18 Cavium, Inc. Method and an accumulator scoreboard for out-of-order rule response handling
US9891924B2 (en) 2013-03-15 2018-02-13 Intel Corporation Method for implementing a reduced size register view data structure in a microprocessor
US9904625B2 (en) 2013-03-15 2018-02-27 Intel Corporation Methods, systems and apparatus for predicting the way of a set associative cache
WO2014150806A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for populating register view data structure by using register template snapshots
US9569216B2 (en) 2013-03-15 2017-02-14 Soft Machines, Inc. Method for populating a source view data structure by using register template snapshots
US10275255B2 (en) 2013-03-15 2019-04-30 Intel Corporation Method for dependency broadcasting through a source organized source view data structure
WO2014150991A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for implementing a reduced size register view data structure in a microprocessor
WO2014150971A1 (en) 2013-03-15 2014-09-25 Soft Machines, Inc. A method for dependency broadcasting through a block organized source view data structure
US9632825B2 (en) 2013-03-15 2017-04-25 Intel Corporation Method and apparatus for efficient scheduling for asymmetrical execution units
US9886279B2 (en) 2013-03-15 2018-02-06 Intel Corporation Method for populating and instruction view data structure by using register template snapshots
US9208066B1 (en) 2015-03-04 2015-12-08 Centipede Semi Ltd. Run-time code parallelization with approximate monitoring of instruction sequences

Also Published As

Publication number Publication date
KR20150130510A (ko) 2015-11-23
US20170123807A1 (en) 2017-05-04
EP2972836B1 (en) 2022-11-09
TW201504942A (zh) 2015-02-01
CN105247484A (zh) 2016-01-13
CN105247484B (zh) 2021-02-23
EP2972836A4 (en) 2017-08-02
WO2014151043A1 (en) 2014-09-25
EP2972836A1 (en) 2016-01-20
US20200341768A1 (en) 2020-10-29
KR20170089968A (ko) 2017-08-04
US20140281436A1 (en) 2014-09-18
US11656875B2 (en) 2023-05-23
US9823930B2 (en) 2017-11-21
KR102083390B1 (ko) 2020-03-02

Similar Documents

Publication Publication Date Title
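TWI522912B (zh) Method for emulating a guest centralized flag architecture by using a native distributed flag architecture
TWI522908B (zh) Method for executing blocks of instructions using a microprocessor architecture having a register view, source view, instruction view, and a plurality of register templates
TWI533221B (zh) Method, non-transitory computer-readable medium, and computer system for dependency broadcasting through a block organized source view data structure
TWI522913B (zh) Method for implementing a reduced size register view data structure in a microprocessor
TWI522909B (zh) Method for populating a register view data structure by using register template snapshots
TWI619077B (zh) Method, computer-readable medium, and computer system for executing multithreaded instructions grouped into blocks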
US10169045B2 (en) Method for dependency broadcasting through a source organized source view data structure
US10146548B2 (en) Method for populating a source view data structure by using register template snapshots
US9891924B2 (en) Method for implementing a reduced size register view data structure in a microprocessor
US20140317387A1 (en) Method for performing dual dispatch of blocks and half blocks
US20140281412A1 (en) Method for populating and instruction view data structure by using register template snapshots

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees