TW200842579A - Asynchronous processing system - Google Patents

Asynchronous processing system

Publication number
TW200842579A
Authority
TW
Taiwan
Prior art keywords
buffer
control unit
page number
conversion search
prefetch
Prior art date
Application number
TW096115360A
Other languages
Chinese (zh)
Other versions
TWI360052B (en
Inventor
Wei-min Zheng
Chang-Ju Chen
Original Assignee
Univ Nat Chiao Tung
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Nat Chiao Tung
Priority to TW096115360A
Publication of TW200842579A
Application granted
Publication of TWI360052B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

An asynchronous processing system performs signal transmission based on a handshake protocol and translates a virtual memory address into a physical memory address. The system comprises an asynchronous processor and a memory. It uses a plurality of translation lookaside buffers (TLBs), each holding relatively few entries, to reduce power consumption, and a prefetch buffer that supports the TLBs through a prefetch mechanism. The system can therefore improve address translation efficiency even in a multitasking environment. Because successive-stage components of the system communicate with each other via the handshake protocol, the reliability and efficiency of address translation are increased.

Description

IX. Description of the Invention

[Technical Field]

The present invention relates to a processing system, and more particularly to an asynchronous processing system.

[Prior Art]

Referring to FIG. 1, a conventional synchronous processor capable of translating a virtual memory address into a physical memory address includes a translation lookaside buffer (TLB) 11 and an adder 12. The TLB 11 stores many entries, each having a tag 111 and a physical page number 112; the tag 111 records a virtual page number and some management-related bits.

When the synchronous processor generates a virtual memory address containing a virtual page number 31 and an address offset 32, the TLB 11 searches its entries for the tag 111 that corresponds to the virtual page number 31 and sends the physical page number 112 of the matching entry to the adder 12. The adder 12 adds the address offset 32 recorded in the virtual memory address to the physical page number 112 to obtain the physical memory address.

To reduce lookup misses, conventional designs make the number of entries stored in the TLB 11 as large as possible. A larger TLB, however, means that more entries must be searched on every lookup, which increases the power consumption of the synchronous processor.
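
The conventional lookup described above can be sketched in software as follows. This is an illustrative sketch only, not the circuit of FIG. 1: the 12-bit offset width (4 KiB pages), the dictionary standing in for the TLB 11, and the function names are all assumptions. The description says the adder adds the offset to the physical page number; the sketch assumes the page number is first shifted left by the offset width so that the sum forms a complete address.

```python
from typing import Dict, Optional, Tuple

PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages; the patent states no page size


def split_virtual_address(vaddr: int) -> Tuple[int, int]:
    """Split a virtual address into (virtual page number 31, address offset 32)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


def translate_conventional(vaddr: int, tlb: Dict[int, int]) -> Optional[int]:
    """Look up the VPN in a single TLB (modeled as VPN -> PPN) and add the offset.

    Returns None on a TLB miss; miss handling is outside this sketch.
    """
    vpn, offset = split_virtual_address(vaddr)
    ppn = tlb.get(vpn)  # stands in for the tag search over all entries
    if ppn is None:
        return None
    # Adder 12: physical page number (shifted, by assumption) plus the offset.
    return (ppn << PAGE_OFFSET_BITS) + offset
```

In hardware every tag is compared on each lookup, which is why a larger single TLB costs more power per access; the dictionary here hides that cost.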

Another conventional synchronous processor that translates virtual memory addresses into physical memory addresses uses a prefetch mechanism to improve translation efficiency. See Gokul B. Kandiraju et al., "Going the Distance for TLB Prefetching: An Application-driven Study," in Proceedings of the 29th Annual Int'l Symp. on Computer Architecture, 2002; Ashley Saulsbury et al., "Recency-Based TLB Preloading," in Proceedings of the 27th Int'l Symp. on Computer Architecture, pp. 117-127, 2000; and Section 2.2.2 of the master's thesis by Zhang Jiwen (張繼文), 「降低程序環境切換導致效能損失之轉換搜尋緩衝器設計」 (a TLB design for reducing the performance loss caused by process context switches), Department of Information Engineering, National Chiao Tung University, June 2005 (ROC year 94).

Yet another conventional synchronous processor uses a plurality of TLBs to improve translation efficiency; see Section 3.1.1 of the same thesis.
However, all of the above conventional architectures are implemented in synchronous processors, whose defining characteristic is the clock: every component in a synchronous processor must operate in accordance with the system clock. With the rapid progress of very-large-scale integration (VLSI) in recent years, clock rates have been raised continually to improve the performance of synchronous processors. As circuit designs grow more complex, high clock rates bring many serious problems, such as clock skew. In addition, high clock rates and increasingly complex clock trees reduce circuit correctness and increase power consumption, which in turn causes heat dissipation problems.

Because synchronous processors have always been the market mainstream, most research has focused on them, while asynchronous processors are still at an early stage; research on translating virtual memory addresses into physical memory addresses in asynchronous processors is rarer still. Because an asynchronous processor operates without a clock, it avoids the problems of synchronous processors described above. Developing an asynchronous processor that can efficiently translate virtual memory addresses into physical memory addresses is therefore an important topic.

[Summary of the Invention]

Accordingly, the object of the present invention is to provide an asynchronous processing system that performs signal transmission based on a handshake protocol and can efficiently translate a virtual memory address into a physical memory address.

The asynchronous processing system of the present invention therefore comprises an asynchronous processor and a memory.

The memory stores a page table having many entries. Each entry has a tag and a physical page number, and the content recorded by the tag includes a virtual page number.

The asynchronous processor includes a control unit, N translation lookaside buffers (TLBs), N tag registers, a prefetch buffer, first to (N+1)-th AND gates, a multiplexer, an adder, and N NOT gates.

The control unit is electrically connected to the memory.

Each TLB records a plurality of entries. Each entry contains a tag and a physical page number, and the content recorded by the tag includes a virtual page number. All entries recorded in one TLB are associated with a single process.

The N tag registers correspond respectively to the N TLBs. Each tag register contains a current bit, a valid bit, a least-recently-used bit, and a work tag, all set by the control unit. The current bit indicates whether the corresponding TLB is currently selected; at any given time only one TLB is selected. The valid bit indicates whether the corresponding TLB is valid. The least-recently-used bit indicates how recently the corresponding TLB has been used. The work tag records a process number.

The prefetch buffer records a number of prefetched entries. Each prefetched entry contains a tag and a physical page number, and the content recorded by the tag includes a virtual page number. The control unit predicts which processes are likely to run next, fetches the data related to those processes from the memory in advance, and records it in the prefetch buffer.

Each TLB receives the virtual page number recorded in a virtual memory address and compares it with its own entries to determine a hit or a miss. The prefetch buffer also receives the virtual page number and compares it with its prefetched entries to determine a hit or a miss. On a hit, a TLB or the prefetch buffer outputs the physical page number recorded in the entry that hit.

The j-th AND gate receives the hit or miss signal of the j-th TLB and the current-bit value of the j-th tag register and performs an AND operation, where j is an integer from 1 to N inclusive.
The multiplexer is electrically connected to the N translation lookaside buffers (TLBs) and the prefetch buffer and receives the physical page numbers they output. Under the control of the control unit and the output signals of the first to N-th AND gates, it sends one physical page number to the adder.

The adder adds the address offset recorded in the virtual memory address to the physical page number from the multiplexer to obtain a physical memory address.

The N NOT gates are electrically connected to the outputs of the first to N-th AND gates, respectively, and invert their output signals.

The (N+1)-th AND gate performs an AND operation on the outputs of all the NOT gates and sends the result to the control unit. If the output of the (N+1)-th AND gate is logic 1, the control unit determines that none of the TLBs has hit.

When the control unit determines that none of the TLBs has hit but the prefetch buffer has hit, it controls the multiplexer to output the physical page number supplied by the prefetch buffer. The control unit also records the entry that hit in the prefetch buffer into one of the TLBs to update that TLB, and further fetches new prefetched entries from the memory and records them in the prefetch buffer to update it.

When the control unit determines that at least one TLB has hit and the prefetch buffer has also hit, it controls the multiplexer to output the physical page number supplied by a TLB; the output signals of the first to N-th AND gates determine which TLB's physical page number is output.

When the control unit determines that neither the TLBs nor the prefetch buffer has hit, it first updates the entries of the TLBs and then updates the prefetched entries of the prefetch buffer.

The asynchronous processing system performs signal transmission based on a handshake protocol. When the control unit needs to obtain a new entry from the memory, it first sends an access request signal to the memory. If the memory accepts the access, it sends an acknowledge signal to notify the control unit. The control unit then sends an update signal to request the new entry, and the memory sends the new entry to the control unit via an entry path.

[Embodiments]

The foregoing and other technical content, features, and effects of the present invention will be clearly presented in the following detailed description of a preferred embodiment with reference to the drawings.

Referring to FIG. 2, the preferred embodiment of the asynchronous processing system of the present invention comprises an asynchronous processor 5 and a memory 6. The asynchronous processor 5 includes N TLBs 51, N tag registers 52, a control unit 53, a prefetch buffer 54, a multiplexer 55, an adder 56, first to (N+1)-th AND gates 57, and N NOT gates 58. In this embodiment each TLB 51 is a fully associative TLB, but the invention is not limited thereto.

The memory 6 stores a page table (not shown) that records many entries. Each entry has a tag and a physical page number, and the content recorded by the tag includes a virtual page number.
Each translation lookaside buffer (TLB) 51 records many entries. Each entry has a tag and a physical page number, and the content recorded by the tag includes a virtual page number. The entries recorded in one TLB 51 are associated with a single process.

The N tag registers 52 correspond respectively to the N TLBs 51, and each tag register 52 contains the following fields: "current bit", "valid bit", "least recently used bit", and "work tag". The values of these fields are all set by the control unit 53.

The "current bit" indicates whether the corresponding TLB 51 is currently selected; at any given time only one TLB 51 is selected. The "valid bit" indicates whether the corresponding TLB 51 is valid. The "least recently used bit" indicates how recently the corresponding TLB 51 has been used, so that when all TLBs 51 are full and a new process arrives, the control unit 53 can clear all entries of the least recently used TLB 51 and assign it to the new process. The "work tag" in this embodiment is a process ID.

The prefetch buffer 54 records a number of prefetched entries. Each prefetched entry has a tag and a physical page number, and the content recorded by the tag includes a virtual page number. The control unit 53 predicts the processes likely to run next, fetches the entries related to those processes from the memory 6, and records them in advance as prefetched entries in the prefetch buffer 54. The prediction made by the control unit 53 may be based on spatial correlation or temporal correlation.

The asynchronous processing system of this embodiment generates a virtual memory address 7 containing a virtual page number and an address offset. Each TLB 51 receives the virtual page number and compares it with its own entries to determine a hit or a miss. The prefetch buffer 54 also receives the virtual page number and compares it with its prefetched entries to determine a hit or a miss. Whenever a TLB 51 or the prefetch buffer 54 hits, it sends the physical page number of the entry that hit to the multiplexer 55.

The j-th AND gate 57 (j = 1 to N) receives the hit signal (logic 1) or miss signal (logic 0) of the j-th TLB 51 and the "current bit" value of the j-th tag register 52, performs an AND operation, and sends its output to the multiplexer 55 and to the corresponding NOT gate 58.

The multiplexer 55 is electrically connected to all the TLBs 51 and the prefetch buffer 54 and receives the physical page numbers they output. Under the control of the control unit 53 and the N AND gates 57, it sends one of these physical page numbers to the adder 56.

The adder 56 adds the address offset recorded in the virtual memory address 7 to that physical page number to obtain the physical memory address 8.

The (N+1)-th AND gate 57 performs an AND operation on the outputs of all the NOT gates 58 and sends the result to the control unit 53. The control unit 53 can thus determine the overall hit status of the TLBs 51 from whether the output of the (N+1)-th AND gate 57 is logic 0 or logic 1: an output of logic 1 means that none of the TLBs 51 has hit.
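
The gating logic described above can be modeled in software roughly as follows. This is a sketch under stated assumptions rather than the patented circuit: the class and function names, the dictionary-based fully associative compare, the 12-bit offset width, and the shift applied before the adder are all hypothetical, and the least-recently-used bit is left out for brevity. The prefetch buffer 54 is modeled with the same lookup class as a TLB 51.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages; the patent states no page size


@dataclass
class TagRegister:
    current: bool = False           # "current bit": the matching TLB is selected
    valid: bool = False             # "valid bit"
    work_tag: Optional[int] = None  # process ID that owns the TLB


@dataclass
class Tlb:
    entries: Dict[int, int] = field(default_factory=dict)  # VPN -> PPN

    def lookup(self, vpn: int) -> Tuple[bool, Optional[int]]:
        """Fully associative compare: return (hit, PPN of the entry that hit)."""
        ppn = self.entries.get(vpn)
        return ppn is not None, ppn


def translate(vaddr, tlbs, tags, prefetch):
    """Return (physical address or None, source of the physical page number)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)

    results = [tlb.lookup(vpn) for tlb in tlbs]
    # AND gates 1..N: hit signal of TLB j AND current bit of tag register j.
    gated = [hit and tag.current for (hit, _), tag in zip(results, tags)]
    # NOT gates feeding the (N+1)-th AND gate: logic 1 means no selected TLB hit.
    all_tlbs_missed = all(not g for g in gated)
    pf_hit, pf_ppn = prefetch.lookup(vpn)

    # Multiplexer: a gated TLB hit takes priority over a prefetch-buffer hit.
    if not all_tlbs_missed:
        source, ppn = "tlb", results[gated.index(True)][1]
    elif pf_hit:
        source, ppn = "prefetch", pf_ppn
    else:
        return None, "miss"  # the control unit would have to refill from memory
    # Adder: page number (shifted, by assumption) plus the address offset.
    return (ppn << PAGE_OFFSET_BITS) + offset, source
```

Because only one current bit is set at a time, at most one gated signal can be logic 1, so a hit in a TLB that belongs to another process never reaches the multiplexer.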
When the control unit 53 determines that none of the translation lookaside buffers (TLBs) 51 has hit but the prefetch buffer 54 has hit, it controls the multiplexer 55 to output the physical page number supplied by the prefetch buffer 54. The control unit 53 also records the entry that hit in the prefetch buffer 54 into a TLB 51 to update that TLB 51. The control unit 53 then fetches new prefetched entries from the memory 6 and records them in the prefetch buffer 54 to update it.

When the control unit 53 determines that a TLB 51 has hit and the prefetch buffer 54 has also hit, it controls the multiplexer 55 to output the physical page number supplied by the TLB 51; the output signals of the first to N-th AND gates 57 determine which TLB 51's physical page number is output.

When the control unit 53 determines that neither the TLBs 51 nor the prefetch buffer 54 has hit, it updates the entries of the TLB 51 and then updates the prefetched entries of the prefetch buffer 54.

Note that because the present invention is an asynchronous processing system, it does not need to operate from a clock. Instead, signals between the control unit 53 and the memory 6, between the control unit 53 and the prefetch buffer 54, between the control unit 53 and each TLB 51, and between the control unit 53 and each tag register 52 are all transmitted using the handshake protocol, so the system operates correctly without a clock. For readability, FIG. 2 does not mark every signal transmitted between successive-stage components.

How two components communicate under the handshake protocol is illustrated with the control unit 53 and the memory 6. When the control unit 53 needs to obtain a new entry from the memory 6: (1) the control unit 53 first sends an access request signal 91 to the memory 6 to request access; (2) if the memory 6 accepts the access, it sends an acknowledge signal 92 to notify the control unit 53; (3) the control unit 53 then sends an update signal 93 to obtain the new entry from the memory 6; (4) the memory 6 then sends the new entry to the control unit 53 via an entry path 94.

In summary, the asynchronous processing system of the present invention uses a plurality of TLBs 51, each with relatively few entries, and thereby reduces the power consumption problem caused by the conventional use of a single TLB 11 with a very large number of entries. In addition, the prefetch mechanism of the added prefetch buffer 54 supports the TLBs 51, so that address translation performance improves even in a multiprogramming environment. Moreover, because the present invention is an asynchronous processing system, successive-stage components communicate via the handshake protocol, which avoids the problems that the clock causes in conventional synchronous processors and thereby increases the reliability and performance of address translation.
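
The three control-unit cases and the four-step handshake can be sketched together as follows. This is a minimal software sketch under stated assumptions, not the patented hardware: the names `Memory`, `fetch_entry`, and `handle_lookup` are hypothetical, the currently selected TLB and the prefetch buffer are plain dictionaries with no capacity limit, the memory always accepts an access, and the "next virtual page" prediction is just one simple instance of the spatial correlation the description mentions. A TLB hit is treated the same whether or not the prefetch buffer also hits, and page faults are not modeled.

```python
from typing import Dict, List, Tuple


class Memory:
    """Stands in for memory 6; the log strings mirror signals 91 to 94."""

    def __init__(self, page_table: Dict[int, int]) -> None:
        self.page_table = page_table  # VPN -> PPN
        self.log: List[str] = []      # order of handshake events, for inspection

    def access_request(self) -> bool:
        self.log.append("access request 91")
        self.log.append("acknowledge 92")  # this sketch always accepts the access
        return True

    def update(self, vpn: int) -> int:
        self.log.append("update 93")
        self.log.append("entry 94")        # entry travels back on the entry path
        return self.page_table[vpn]


def fetch_entry(memory: Memory, vpn: int) -> int:
    """Run the handshake: request, acknowledge, update, then receive the entry."""
    if not memory.access_request():
        raise RuntimeError("memory refused the access")
    return memory.update(vpn)


def handle_lookup(vpn: int, tlb: Dict[int, int], prefetch: Dict[int, int],
                  memory: Memory) -> Tuple[int, str]:
    """Apply the control unit's policy for one lookup in the selected TLB."""
    if vpn in tlb:                     # TLB hit: output the TLB's page number
        return tlb[vpn], "tlb hit"
    if vpn in prefetch:                # TLB miss, prefetch hit
        ppn, case = prefetch[vpn], "prefetch hit"
    else:                              # both miss: fetch the entry from memory
        ppn, case = fetch_entry(memory, vpn), "all miss"
    tlb[vpn] = ppn                     # update the TLB first ...
    nxt = vpn + 1                      # ... then refill the prefetch buffer
    if nxt in memory.page_table and nxt not in prefetch:
        prefetch[nxt] = fetch_entry(memory, nxt)
    return ppn, case
```

Each call to `fetch_entry` appends the four events in order, so the log shows one complete handshake per entry obtained from memory.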
The above is merely a preferred embodiment of the present invention and shall not limit the scope of its practice; all simple equivalent changes and modifications made in accordance with the claims and the description of the invention remain within the scope covered by this patent.

[Brief Description of the Drawings]

FIG. 1 is an architecture diagram of a conventional synchronous processor that can translate a virtual memory address into a physical memory address; and

FIG. 2 is an architecture diagram of an asynchronous processing system, according to an embodiment of the present invention, that can translate a virtual memory address into a physical memory address.

[Description of Main Reference Numerals]

5: asynchronous processor
51: translation lookaside buffer (TLB)
52: tag register
53: control unit
54: prefetch buffer
55: multiplexer
56: adder
57: AND gate
58: NOT gate
6: memory
7: virtual memory address
8: physical memory address
91: access request signal
92: acknowledge signal
93: update signal
94: entry path

Claims (1)

200842579 十、申請專利範圍: 1 _ 一種非同步處理系統,包含: 一記憶體,儲存一具有很多記錄項的分頁表,且每 一記錄項有一標籤和一實體分頁號’且該標籤所記錄Z 包含一虛擬分頁號;及 一非同步處理器,包括: 一控制單元,與該記憶體電連接; N個轉換搜尋缓衝器,每一轉換搜尋緩衝器中 記錄著複數個記錄項,且每一記錄項的内容包含一 標籤和一實體分頁號,而該標籤所記錄的包含一虛 擬分頁號’且一個轉換搜尋緩衝器所記錄的記錄項 都是與一個程序相關; 加n個標籤暫存器,分別與該N個轉換搜尋緩衝 相對應,且母一標籤暫存器包含由該控制單元設 定的一目前位元、一有效位元、一最近最少使用位 元 工作標籤,且該目前位元表示對應到的轉換 搜尋綾衝器目前是否有被選取,且同一時間下,只 印有個轉換搜尋緩衝器會被選取;而該有效位元 f 丁對應到的轉換搜尋緩衝器是否有效;該最近最 夕使用位兀指出對應到的轉換搜尋缓衝器最近被使 用到的情形’·而該工作標籤記錄著一程序號碼; 預取緩衝器,記錄著一些預取記錄項,且每 預取。己錄項的内容包含一標籤和一實體分頁號, 而該標籤所記錄的包含一虛擬分頁號,該控制單元 16 200842579 會預測接下來可能會進行到的程序,並先將與這些 私序相關的資料從該記憶體中取出並記錄於該預取 緩衝器中; 每一轉換搜尋緩衝器會接收一虛擬記憶體位址 中所記錄的一虛擬分頁號,且與各自的記錄項比較 以判斷是否命中或未命中,且該預取緩衝器也會接 收该虛擬分頁號,且與其預取記錄項比較以判斷是 否命中或未命中,且每一轉換搜尋緩衝器和該預取 緩衝器命中時,會輸出其命中之記錄項所記錄的實 體分頁號; 一第一至一第N及閘,第j及閘接收第j轉換 搜尋緩衝器的命中信號或未命中信號,且接收第』 標籤暫存器的目前位元值’並執行及運算,且」·為 1至N之間且含1與]^的整數; 一多工益,與該N個轉換搜尋緩衝器及該預取 緩衝器電連接’且可接收該N個轉換搜尋缓衝器及 該預取缓衝器傳來的實體分頁號,並受該控制單元 及該第-至該第N及閘之輸出信號控制,以將一實 體分頁號輸出; 一加法器,將該多工器傳來的實體分頁號加上 該虛擬記憶體位址中所記錄的一位址偏移量,而得 到一實體記憶體位址; N個反閘’分別電連接於該第-至第N及閘的 輸出端’且將該第一 5楚 弟至弟N及閘的輸出信號進行反 17 200842579 運算;及 一第N+1及閘,對所有反閘的輸出執行及運算 ,且將運算結果送至該控制單元,若第N+1及閘的 輸出為邏輯1,該控制單元則判斷所有轉換搜尋緩 衝器都沒有命中; §该控制單元判斷出所有轉換搜尋緩衝器都沒 有印中而该預取緩衝器有命中時,會控制該多工琴 輸出由該預取緩衝器傳至該多工器的實體分頁號, 且該控制單元也會將該預取緩衝器中命中的記錄項 記錄至該等轉換搜尋緩衝器的其中之一,以更新該 等轉換搜尋緩衝器的其中之一,且該控制單元會進 一步從該記憶體中取得新的預取記錄項並記錄至該 預取緩衝器,以更新該預取緩衝器; 當该控制單元判斷出該等轉換搜尋緩衝器有至 少一命中且該預取緩衝器也命中時,該控制單元會 控制該多工器輸出由該至少一轉換搜尋緩衝器傳至 該多工器的實體分頁號,且該多工器受該第二至該 第N及閘之輸出信號控制,以決定要將哪一個轉換 搜尋緩衝器傳來的實體分頁號輸出; 當該控制單元判斷出所有轉換搜尋緩衝器和該 預取緩衝器都沒有命中時,該控制單元會先更新該 等轉換搜尋緩衝器的記錄項,之後該控制單元會再 更新该預取緩衝器的預取記錄項; 該非同步處理系統是基於交握協定來執行信號 18 200842579 傳輸,且當該控制單元要從該記憶體中取得一新的 忑錄項蚪,該控制單元會先發出一存取請求信號給 該記憶體,以向該記憶體要求存取;若該記憶體接 又存取,則發出一認可信號通知該控制單元其接受 存取;之後該控制單元發出一更新信號以向該記憶 體取得新的記錄項;然後該記憶體經由-記錄項路 徑將新的記錄項送至該控制單元。 2·依據巾請專利範圍第丨項所述之㈣步處理系統,复中 丄每-個轉換搜尋緩衝器是—種全關聯式轉換搜尋緩衝 19200842579 X. 
Claims:

1. An asynchronous processing system, comprising:

a memory storing a page table having a plurality of records, each record having a tag and a physical page number, the tag recording a virtual page number; and

an asynchronous processor, comprising:

a control unit electrically connected to the memory;

N conversion search buffers, each conversion search buffer recording a plurality of records, each record containing a tag and a physical page number, the tag recording a virtual page number, and the records held by any one conversion search buffer all relating to a single program;

N tag registers corresponding respectively to the N conversion search buffers, each tag register including a current bit, a valid bit, a least-recently-used bit, and a work tag, all set by the control unit, wherein the current bit indicates whether the corresponding conversion search buffer is currently selected, only one conversion search buffer being selected at any one time; the valid bit indicates whether the corresponding conversion search buffer is valid; the least-recently-used bit indicates how recently the corresponding conversion search buffer has been used; and the work tag records a program number;

a prefetch buffer recording a number of prefetch records, each prefetch record containing a tag and a physical page number, the tag recording a virtual page number, wherein the control unit predicts the program likely to execute next, fetches the data relating to that program from the memory in advance, and records it in the prefetch buffer;

wherein each conversion search buffer receives the virtual page number recorded in a virtual memory address and compares it with its own records to determine a hit or a miss, the prefetch buffer likewise receives the virtual page number and compares it with its prefetch records to determine a hit or a miss, and any conversion search buffer or the prefetch buffer that hits outputs the physical page number recorded in the record that was hit;

a first to an N-th AND gate, the j-th AND gate receiving the hit signal or miss signal of the j-th conversion search buffer and the current-bit value of the j-th tag register and performing an AND operation on them, j being an integer from 1 to N inclusive;

a multiplexer electrically connected to the N conversion search buffers and the prefetch buffer, receiving the physical page numbers transmitted by the N conversion search buffers and the prefetch buffer, and controlled by the control unit and by the output signals of the first to N-th AND gates to output one physical page number;

an adder adding the physical page number output by the multiplexer to the address offset recorded in the virtual memory address to obtain a physical memory address;

N inverters electrically connected respectively to the outputs of the first to N-th AND gates and inverting the output signals of the first to N-th AND gates; and

an (N+1)-th AND gate performing an AND operation on the outputs of all the inverters and sending the result to the control unit, wherein if the output of the (N+1)-th AND gate is logic 1, the control unit determines that none of the conversion search buffers has hit;

wherein, when the control unit determines that none of the conversion search buffers has hit and the prefetch buffer has hit, the control unit controls the multiplexer to output the physical page number transmitted from the prefetch buffer, records the prefetch records of the prefetch buffer into one of the conversion search buffers to update that conversion search buffer, and further retrieves new prefetch records from the memory and records them in the prefetch buffer to update the prefetch buffer;

when the control unit determines that at least one conversion search buffer has hit and the prefetch buffer has also hit, the control unit controls the multiplexer to output the physical page number transmitted by the at least one conversion search buffer, the output signals of the first to N-th AND gates determining which conversion search buffer's physical page number is output;

when the control unit determines that neither the conversion search buffers nor the prefetch buffer has hit, the control unit first updates the records of the conversion search buffers and then updates the prefetch records of the prefetch buffer; and

the asynchronous processing system performs signal transmission based on a handshake protocol, wherein, when the control unit is to obtain a new record from the memory, the control unit first sends an access request signal to the memory to request access; if the memory accepts the access, it issues a grant signal to the control unit; the control unit then issues an update signal to the memory to retrieve the new record; and the memory then sends the new record to the control unit via a record path.

2. The asynchronous processing system according to claim 1, wherein the conversion search buffers are fully associative conversion search buffers.
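The lookup path recited in claim 1 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the claimed hardware: the names `TagRegister`, `SearchBuffer`, and `translate` are hypothetical, the 12-bit page offset is an assumed page size, the physical page number is assumed to be shifted left by the offset width before the adder step, and on a prefetch hit the record is assumed to be copied into the currently selected conversion search buffer. Refilling from memory over the handshake protocol is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages; the claim gives no page size


@dataclass
class TagRegister:
    current: bool = False  # corresponding buffer is the one currently selected
    valid: bool = False    # corresponding buffer holds valid records
    lru: bool = False      # least-recently-used bit (not exercised in this sketch)
    work_tag: int = 0      # program number


@dataclass
class SearchBuffer:
    # Maps virtual page number (the tag) to physical page number.
    records: Dict[int, int] = field(default_factory=dict)

    def lookup(self, vpn: int) -> Optional[int]:
        return self.records.get(vpn)


def translate(
    vaddr: int,
    buffers: List[SearchBuffer],
    tags: List[TagRegister],
    prefetch: SearchBuffer,
) -> Tuple[Optional[int], str]:
    """Return (physical address or None, source of the translation)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)

    # Every buffer compares the virtual page number in parallel.
    hits = [buf.lookup(vpn) for buf in buffers]
    # j-th AND gate: hit signal of buffer j AND current bit of tag register j.
    gated = [h is not None and t.current for h, t in zip(hits, tags)]
    # Inverters plus the (N+1)-th AND gate: true only if no gated output is set.
    all_miss = all(not g for g in gated)

    pf_ppn = prefetch.lookup(vpn)

    if not all_miss:
        # A gated hit wins, whether or not the prefetch buffer also hit.
        ppn = hits[gated.index(True)]
        source = "buffer"
    elif pf_ppn is not None:
        ppn = pf_ppn
        source = "prefetch"
        # Assumption: the record goes to the currently selected buffer.
        for buf, tag in zip(buffers, tags):
            if tag.current:
                buf.records[vpn] = pf_ppn
    else:
        # Both missed: the control unit would refill from memory (omitted).
        return None, "miss"

    # Adder step; the shift of the page number is an assumption of this model.
    return (ppn << PAGE_OFFSET_BITS) + offset, source
```

In this model a hit in a conversion search buffer whose current bit is 0 is masked by its AND gate, so it is treated the same as a miss, which is what lets several buffers hold records for different programs without interfering with the selected one.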
TW096115360A 2007-04-30 2007-04-30 Asynchronous processing system TW200842579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW096115360A TW200842579A (en) 2007-04-30 2007-04-30 Asynchronous processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW096115360A TW200842579A (en) 2007-04-30 2007-04-30 Asynchronous processing system

Publications (2)

Publication Number Publication Date
TW200842579A (en) 2008-11-01
TWI360052B (en) 2012-03-11

Family

ID=44822045

Family Applications (1)

Application Number Title Priority Date Filing Date
TW096115360A TW200842579A (en) 2007-04-30 2007-04-30 Asynchronous processing system

Country Status (1)

Country Link
TW (1) TW200842579A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8838935B2 (en) 2010-09-24 2014-09-16 Intel Corporation Apparatus, method, and system for implementing micro page tables
TWI461911B (en) * 2010-09-24 2014-11-21 Intel Corp Apparatus, method, and system for implementing micro page tables

Also Published As

Publication number Publication date
TWI360052B (en) 2012-03-11

Similar Documents

Publication Publication Date Title
JP3936378B2 (en) Address translation device
TWI451334B (en) Microprocessor and method for reducing tablewalk time
EP1944696B1 (en) Arithmetic processing apparatus, information processing apparatus, and method for accessing memory of the arithmetic processing apparatus
TWI506428B (en) Method and system for optimizing prefetching of cache memory lines
US20060047912A1 (en) System and method for high performance, power efficient store buffer forwarding
US7350016B2 (en) High speed DRAM cache architecture
CN107818053B (en) Method and apparatus for accessing a cache
US9418018B2 (en) Efficient fill-buffer data forwarding supporting high frequencies
US8769251B2 (en) Data processing apparatus and method for converting data values between endian formats
JPH01503011A (en) General purpose processor unit for digital data processing systems including cash management systems
US6954822B2 (en) Techniques to map cache data to memory arrays
TW201003396A (en) Microprocessor, method and computer program product that perform speculative tablewalks
US7730290B2 (en) Systems for executing load instructions that achieve sequential load consistency
US20180181329A1 (en) Memory aware reordered source
US8489851B2 (en) Processing of read requests in a memory controller using pre-fetch mechanism
US10318436B2 (en) Precise invalidation of virtually tagged caches
US20090006753A1 (en) Design structure for accessing a cache with an effective address
US10126966B1 (en) Rotated memory storage for fast first-bit read access
JP4131789B2 (en) Cache control apparatus and method
TW200842579A (en) Asynchronous processing system
JPH0371355A (en) Apparatus and method for retrieving cache
US6925014B1 (en) Integrated circuit and method of reading data from a memory device
US7055005B2 (en) Methods and apparatus used to retrieve data from memory into a RAM controller before such data is requested
JP2002312239A (en) Processor, system-on-chip device and method of access
JPH0336647A (en) Cache buffering control system

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees