TW305959B - The method and system for eliminating penalty of page/row missing - Google Patents

The method and system for eliminating penalty of page/row missing

Info

Publication number
TW305959B
TW305959B TW85106019A
Authority
TW
Taiwan
Prior art keywords
memory
page
memory access
eliminating
access
Prior art date
Application number
TW85106019A
Other languages
Chinese (zh)
Inventor
Cherng-Shenq Jan
Tian-Yow Pan
Original Assignee
Ind Tech Res Inst
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ind Tech Res Inst
Priority to TW85106019A
Application granted
Publication of TW305959B

Links

Abstract

A system for eliminating the penalty of page and row misses in memory access, comprising: one or more devices that can simultaneously issue one or several memory access commands; a system bus connected to the devices; a memory controller connected to the system bus, which includes dispatch logic for dispatching the commands received from the system bus, several command queues connected to the dispatch logic for holding the commands sent by the dispatch logic, and selection logic for designating which command queue accesses the memory; and a set of memory connected to the memory controller.

Description

The present invention relates to an improvement of the memory controller in a master-slave system. By means of a reordering scheme based on multiple memory command queues, it avoids the penalties caused by page misses and row misses, thereby reducing the average memory-access latency and increasing the performance of the whole system.

Figure 1 illustrates a multiprocessor system with a shared bus and shared memory. The processors are connected to the system bus and may be state-of-the-art processors such as the Pentium-Pro or the PowerPC 620, which already integrate a second-level cache and cache-coherence capability, so no glue logic is needed between the processors and the system bus.

An I/O bridge is generally used to transfer data between the system bus and the I/O bus, and the I/O bus may connect multiple I/O devices. The memory controller is used to exchange data between the system bus and the main memory, typically dynamic random access memory (DRAM). The memory controller usually contains a memory command queue, which temporarily stores the memory commands arriving from the processors or the I/O devices. Data read from, or to be written to, the DRAM is held temporarily in read and write data buffers.

The control signals for each DRAM access include the memory address at which the data is stored, a Row Address Strobe (RAS) that controls when the DRAM latches the row address, and a Column Address Strobe (CAS) that controls when the DRAM latches the column address. Other control signals, such as Write Enable (WE), are not shown in Figure 1.

Each transaction on the system bus can be divided into multiple phases, such as an arbitration phase, a request phase, an error phase, a snoop phase, a response phase, and a data phase. Pipelining or overlapping the phases of different transactions improves bus performance, and most advanced system buses have adopted a split-transaction mechanism, which supports out-of-order completion of transactions. As shown in Figure 1, the system bus connects several processors and an I/O bridge, all of which are master devices that can issue memory-access requests. When a device places a request on the system bus it also attaches a tag to it. If the request is a memory-access request, the memory records the tag when the request is received; when the result is ready, the memory drives the result together with the tag onto the bus, and each device on the bus checks whether the tag is one it issued earlier. The device that issued it takes the result and the transaction is completed. Because every tag is unique, each requesting device can correctly identify its own requests, so results may return in an order different from the order in which the requests were issued.

The greatest benefit of such a bus is that when one request needs a long time to complete, other requests, including those from the processors and other master devices, can still be issued and processed. Unlike a conventional bus, the other devices do not have to wait for the current bus owner to finish its transaction and release ownership before they can use the bus, so in high-speed multiprocessor systems this kind of bus has become a very efficient implementation. A split-transaction system bus such as the PowerPC 620 bus therefore allows out-of-order completion of transactions: some time after transactions A, B, and C enter the system bus, the results may come back in the order C, A, B. Because each processor attaches a tag to every transaction, the transactions on the system bus can be identified by their tags, and the tags are assigned independently rather than being defined by the order of the transactions, so each processor can recognize the results returned for its own commands.
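As an informal illustration of how tags allow a split-transaction bus to return results out of order, the C sketch below models a master device that issues several tagged read requests and later matches each returning response to the request that produced it by tag alone. The structure names, the tag width, and the printed trace are invented for this example; they are not taken from the patent or from any particular bus specification.

```c
/* Minimal model of tag-matched, out-of-order completion on a
 * split-transaction bus.  All names and sizes are illustrative only. */
#include <stdio.h>

#define MAX_TAGS 8                 /* outstanding requests allowed at once */

struct request {
    int      valid;                /* slot holds an outstanding request    */
    unsigned addr;                 /* memory address being read            */
};

static struct request pending[MAX_TAGS];

/* Issue a read request: record it under a free tag and "drive" it on the bus. */
static int issue(unsigned addr)
{
    for (int tag = 0; tag < MAX_TAGS; tag++) {
        if (!pending[tag].valid) {
            pending[tag].valid = 1;
            pending[tag].addr  = addr;
            printf("request  tag=%d addr=0x%x\n", tag, addr);
            return tag;
        }
    }
    return -1;                     /* no free tag: the master must stall */
}

/* A response arrives carrying only its tag; the master finds the originating
 * request by that tag, so responses may arrive in any order. */
static void complete(int tag, unsigned data)
{
    if (tag >= 0 && tag < MAX_TAGS && pending[tag].valid) {
        printf("response tag=%d addr=0x%x data=0x%x\n",
               tag, pending[tag].addr, data);
        pending[tag].valid = 0;    /* tag may now be reused */
    }
}

int main(void)
{
    int a = issue(0x1000);         /* transaction A */
    int b = issue(0x2000);         /* transaction B */
    int c = issue(0x3000);         /* transaction C */

    /* Results return in the order C, A, B, as in the example in the text. */
    complete(c, 0xCC);
    complete(a, 0xAA);
    complete(b, 0xBB);
    return 0;
}
```

Because the lookup uses only the tag, the order of the calls to complete() does not have to match the order of the calls to issue().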

The out-of-order-completion mechanism further improves system-bus performance and the design flexibility of the core logic, such as the memory controller and the I/O bridge. The split-transaction system bus used in the present invention has the ability to support out-of-order completion of results.

Figure 2 shows a typical main-memory structure. Apart from the row-address (RA) and column-address (CA) latch strips on the two sides, each block in the figure represents a DRAM cell array. The cell arrays are grouped into rows (ROW 1, ROW 2, and so on), and the output of one column of data is called a word. Figure 2 shows four rows of main memory, each row containing four DRAM cell arrays. Accessing data in the main memory involves the following steps. First, the row address is captured by the Row Address Strobe (RAS) and latched in the RA strip. Then the column address is captured by the Column Address Strobe (CAS) and latched in the CA strip. The column of the DRAM cell array is selected by the column address (the hatched column in the figure), and the row is selected by the row address in the same way. Finally, the memory location into which data flows, or out of which it is read, has been selected by the row and column addresses, as indicated by the circle symbol in Figure 2.
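How a single physical address supplies both the RAS-latched row address and the CAS-latched column address can be pictured with the short C sketch below. The field widths (a 10-bit column, a 12-bit row, and the remaining bits selecting one of the memory rows of Figure 2) are arbitrary assumptions made for illustration; a real controller derives them from the installed DRAM devices.

```c
/* Illustrative split of a physical address into the fields a DRAM controller
 * drives with RAS and CAS.  The bit widths are assumptions, not the patent's. */
#include <stdio.h>

#define COL_BITS 10   /* column address driven while CAS is asserted */
#define ROW_BITS 12   /* row address driven while RAS is asserted    */

struct dram_addr {
    unsigned col;     /* selects the column inside the open row (page) */
    unsigned row;     /* selects the DRAM row, i.e. the page           */
    unsigned bank;    /* selects ROW 1, ROW 2, ... of Figure 2         */
};

static struct dram_addr decode(unsigned phys)
{
    struct dram_addr a;
    a.col  =  phys                        & ((1u << COL_BITS) - 1);
    a.row  = (phys >> COL_BITS)           & ((1u << ROW_BITS) - 1);
    a.bank =  phys >> (COL_BITS + ROW_BITS);
    return a;
}

int main(void)
{
    struct dram_addr a = decode(0x00C0A344u);
    printf("bank=%u row=0x%x col=0x%x\n", a.bank, a.row, a.col);
    return 0;
}
```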

Once the first data access is complete, the access of a second datum can fall into one of three cases: a page hit, a page miss, or a row miss.

If the second datum has the same row address as the first and falls in the same page of main memory (the triangle symbol in Figure 2), the access of the second datum does not need the row address and RAS signal to be driven again, because the row address of the first datum is already latched in the RA strip. This case is called a page hit. A page miss means that the second datum lies in the same row (ROW) of main memory as the first but has a different row address (the cross symbol "X" in Figure 2). In this case the row address must be driven again, and RAS 1 must go through a long precharge period. Finally, if the second datum falls in a different row of main memory (the star symbol "*" in Figure 2), the row address must be driven into ROW 2; but since the precharge of RAS 2 has already been done, this does not add to the memory-access latency.

The timing diagrams for these cases are shown in Figures 3(a) through 3(c); they consider only two consecutive read operations from main memory.

Figure 3(a) is the page-hit case. The RA and CA of the first datum arrive one after the other at the DRAM cell array of ROW 1 and are latched by the RAS1 and CAS1 signals respectively; Word 1 appears on the data bus in clock cycle 5. Since the second datum is a page hit, no row-address activity is needed. After one cycle of CAS1 precharge, Word 2 appears on the data bus in clock cycle 9, and the two consecutive read operations complete in cycle 11.

Figure 3(b) is the page-miss case. Here RAS1 needs four cycles (cycle 6 through cycle 9) of precharge before it can capture the RA of the second datum. Because of the RAS1 precharge delay, the two read operations do not complete until cycle 15.

Figure 3(c) is the row-miss case. The RA of the second datum must be driven into ROW 2 (which is done in cycles 6 and 7), but since RAS 2 has already been precharged it can latch the RA in cycle 7, and the two reads complete in cycle 12. Comparing Figures 3(b) and 3(c) with Figure 3(a), a typical page miss produces a four-cycle penalty and a row miss produces a one-cycle penalty.
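The three cases and their relative costs can be condensed into the C sketch below, which classifies each access against the row address latched for the memory row touched by the previous access and charges the extra cycles suggested by Figures 3(a) through 3(c): none for a page hit, four for a page miss, one for a row miss. The bookkeeping is a deliberate simplification made only for illustration.

```c
/* Classify a DRAM access as page hit, page miss or row miss, following the
 * three cases of Figure 2, and charge the extra cycles of Figures 3(a)-(c).
 * Constants and state tracking are illustrative only. */
#include <stdio.h>

#define NUM_ROWS          4        /* ROW 1 .. ROW 4 of Figure 2 */
#define PAGE_MISS_PENALTY 4        /* extra cycles, Figure 3(b)  */
#define ROW_MISS_PENALTY  1        /* extra cycle,  Figure 3(c)  */

enum outcome { PAGE_HIT, PAGE_MISS, ROW_MISS };

struct row_state {
    int      open;                 /* a row address is latched in the RA strip */
    unsigned row_addr;             /* the row (page) currently open            */
};

static struct row_state rows[NUM_ROWS];
static int last_mem_row = -1;      /* memory row touched by the previous access */

static enum outcome classify(int mem_row, unsigned row_addr, int *penalty)
{
    enum outcome out;

    if (mem_row != last_mem_row) {
        out = ROW_MISS;            /* different ROW: its RAS is already precharged */
        *penalty = ROW_MISS_PENALTY;
    } else if (rows[mem_row].open && rows[mem_row].row_addr == row_addr) {
        out = PAGE_HIT;            /* same ROW, same latched row address */
        *penalty = 0;
    } else {
        out = PAGE_MISS;           /* same ROW, different row address: precharge */
        *penalty = PAGE_MISS_PENALTY;
    }
    rows[mem_row].open     = 1;    /* the new row address is now latched */
    rows[mem_row].row_addr = row_addr;
    last_mem_row           = mem_row;
    return out;
}

int main(void)
{
    static const char *name[] = { "page hit", "page miss", "row miss" };
    enum outcome o;
    int p;

    classify(0, 0x12, &p);             /* first access opens row 0x12 in ROW 1 */

    o = classify(0, 0x12, &p);         /* same ROW, same row address  */
    printf("%-9s +%d cycles\n", name[o], p);

    o = classify(0, 0x34, &p);         /* same ROW, new row address   */
    printf("%-9s +%d cycles\n", name[o], p);

    o = classify(1, 0x56, &p);         /* a different ROW             */
    printf("%-9s +%d cycles\n", name[o], p);
    return 0;
}
```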

The object of the present invention is to eliminate these penalties and thereby improve memory performance.

The present inventors have a co-pending application (U.S. patent application Ser. No. 08/570,441), still under examination in the U.S. Patent Office, which discloses a non-blocking mechanism for the memory read/write queues in a memory controller. In that invention, read operations are given a higher memory-access priority than write operations, so the time a processor waits for a read-data response is minimized.

U.S. Pat. No. 5,461,718 discloses a read buffering system that uses a bank of FIFOs to hold prefetched data at sequential addresses. When the central processing unit (CPU) later issues a read request for that data, the data is already held in the FIFO bank, so the memory does not have to be accessed again.

U.S. Pat. No. 5,265,236 discloses a method that performs the row-address comparison and the cache lookup simultaneously. If the cache lookup determines that a memory access is required, a page-hit memory-access request can be generated, so the memory controller does not incur the additional delay of examining the row address.

As shown in Figure 1, most current memory-controller designs use a single memory command queue operated in first-in-first-out (FIFO) order. Once a page miss or row miss occurs, the system-performance penalty is unavoidable.

As shown in Figure 4, the present invention uses multiple memory command queues. Each queue corresponds to a row of the main memory: Queue 1 corresponds to ROW 1 in Figure 2, Queue 2 corresponds to ROW 2, and so on. When a memory-access command reaches the memory controller, it is first placed in a unified command queue and is then dispatched by the dispatch logic to one of the multiple command queues. Commands that access the same row are dispatched to the same queue, so executing consecutive commands within the same queue can never produce a row miss. The selection logic decides which command queue is active, that is, which queue is currently accessing the DRAM; all other queues are in the standby state.

While the leading command of the active queue is accessing the DRAM, the leading command of a standby queue can use stolen cycles to drive its row address and the precharged RAS in advance. The selection logic switches the queue states at the appropriate moments according to its algorithm. In one such algorithm, the active queue is executed in FIFO order, and it hands the memory-access right to a standby queue only when one of two conditions occurs: a page miss is encountered, or the active queue becomes empty. In either case, because the standby queue has already completed its row-address selection and RAS precharge by the time it is granted active status, no page-miss penalty occurs during the exchange of the active queue.
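The switching rule just described (stay with the active queue while it runs, and hand the DRAM to a standby queue on a page miss or when the active queue drains) can be sketched as a small arbiter in C. The round-robin choice of the next queue and the row_open bookkeeping are assumptions made for the example; they stand in for the early row-address selection that the real controller performs in stolen address-bus cycles.

```c
/* Sketch of the selection logic: the active queue keeps the DRAM while its
 * leading commands hit in the open page; on a page miss, or when the active
 * queue empties, access passes to a standby queue whose row address was
 * already driven in stolen cycles.  All structures are illustrative. */
#include <stdio.h>

#define NUM_QUEUES 4

struct queue_state {
    int      count;                /* commands waiting in this per-row queue   */
    unsigned next_row;             /* row (page) wanted by its leading command */
    unsigned open_row;             /* row already latched by early RAS, if any */
    int      row_open;
};

static struct queue_state q[NUM_QUEUES];
static int active = 0;

/* Pick the next active queue, round-robin over non-empty standby queues. */
static int next_active(void)
{
    for (int i = 1; i <= NUM_QUEUES; i++) {
        int cand = (active + i) % NUM_QUEUES;
        if (q[cand].count > 0)
            return cand;
    }
    return active;                 /* nothing else to run */
}

/* Called once per leading command of the active queue. */
static void select_queue(void)
{
    int page_miss = q[active].row_open &&
                    q[active].open_row != q[active].next_row;

    if (q[active].count == 0 || page_miss) {
        active = next_active();
        /* The new active queue already performed early row-address
         * selection while standing by, so no page-miss penalty is
         * paid at the switch. */
    }
}

int main(void)
{
    q[0] = (struct queue_state){ .count = 1, .next_row = 7, .open_row = 3, .row_open = 1 };
    q[1] = (struct queue_state){ .count = 2, .next_row = 5, .open_row = 5, .row_open = 1 };

    select_queue();                /* queue 0 misses in its page ...           */
    printf("active queue is now %d\n", active);   /* ... so queue 1 takes over */
    return 0;
}
```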

Therefore, with the method provided above, the present invention avoids the penalties of page misses and row misses. Since all commands to the same row are gathered in the same queue, consecutively executed commands within one queue cannot produce a row miss. On the other hand, while a command is still waiting in a standby queue, the page-miss penalty is avoided by the row-address selection and RAS precharge completed earlier.

After the memory-access commands have been dispatched from the FIFO unified queue to the multiple queues, the order in which the commands entered the memory controller is no longer preserved. The selection logic chooses, according to its selection algorithm, an appropriate command queue to complete the memory access while avoiding the page-miss and row-miss penalties. The results of the commands therefore complete out of order, so either the memory controller must rearrange the results back into their original order before returning them, or the system bus must provide a mechanism that supports out-of-order completion. When commands are reordered, however, commands that read and write the same memory address must not be interchanged; otherwise a write-then-read sequence could become a read-then-write sequence and stale data would be read. This erroneous situation does not arise in the present invention, because accesses to the same address are dispatched by the dispatch logic to the same queue among the multiple command queues, and each command queue is executed in FIFO order, so accesses to the same address complete in the same order in which they entered the memory controller. If the queues are not executed in FIFO order, a mechanism similar to the one in the inventors' pending U.S. application (Ser. No. 08/570,441) mentioned above is needed to maintain data consistency.
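A rough software model of the dispatch step reads as follows: commands leave the unified FIFO and are appended to the per-row queue selected by the memory row their address falls in, which is also why two commands to the same address can never pass each other. The queue depth, the number of queues, and the address-to-row mapping are assumptions made for this sketch.

```c
/* Sketch of the dispatch logic of Figure 4: commands taken from the unified
 * FIFO are appended to the command queue of the memory row they address.
 * Sizes and the address-to-row mapping are illustrative assumptions. */
#include <stdio.h>

#define NUM_QUEUES  4              /* one queue per memory row (ROW 1..4) */
#define QUEUE_DEPTH 8

struct command {
    unsigned addr;
    int      is_write;
};

struct cmd_queue {
    struct command slot[QUEUE_DEPTH];
    int head, tail, count;         /* each per-row queue is itself a FIFO */
};

static struct cmd_queue queues[NUM_QUEUES];

/* Which memory row an address belongs to; here simply the top address bits. */
static int row_of(unsigned addr)
{
    return (addr >> 22) % NUM_QUEUES;
}

/* Dispatch one command from the unified queue into its per-row queue.
 * Commands with the same address always map to the same FIFO queue, so
 * reads and writes to one address keep their original order. */
static int dispatch(struct command cmd)
{
    struct cmd_queue *q = &queues[row_of(cmd.addr)];
    if (q->count == QUEUE_DEPTH)
        return -1;                 /* per-row queue full: stall the dispatch */
    q->slot[q->tail] = cmd;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return row_of(cmd.addr);
}

int main(void)
{
    struct command c1 = { 0x00400010, 0 };   /* lands in one row's queue */
    struct command c2 = { 0x00C00020, 1 };   /* lands in another         */
    printf("c1 -> queue %d\n", dispatch(c1));
    printf("c2 -> queue %d\n", dispatch(c2));
    return 0;
}
```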

The present invention is a method and system for eliminating the page-miss and row-miss penalties of memory access. Commands are sent over the system bus into a first-in-first-out command queue of the memory controller and are then distributed by the dispatch logic to the corresponding command queues. The leading command of a standby queue can use address cycles left unused by the active queue to send its row address and RAS signal to the memory in advance, and the memory is then accessed according to the choice made by the selection logic. A system and method designed in this way can be applied to any memory-access system in which page misses or row misses may occur, so as to avoid the penalties and thereby reduce the average memory-access latency and increase the performance of the whole system.

Figure 1 shows a shared-bus, shared-memory multiprocessor system, with particular emphasis on the design of the memory controller.
Figure 2 shows the basic architecture of a dynamic random access memory (DRAM).
Figure 3(a) is a timing diagram of two consecutive read operations when a page hit occurs.
Figure 3(b) is a timing diagram of two read operations when a page miss occurs.
Figure 3(c) is a timing diagram of two read operations when a row miss occurs.
Figure 4 shows a specific embodiment of the invention, illustrating the use of multiple memory command queues.
Figure 5(a) is a timing diagram of early row-address selection.
Figure 5(b) is a timing diagram of switching the active queue.
Figure 6 is an example of command-queue reordering in a hypothetical situation according to the invention.
Figure 7(a) is a timing diagram of an Extended Data Output (EDO) DRAM.
Figure 7(b) is a timing diagram of a Burst EDO DRAM.
Figure 7(c) is a timing diagram of a Synchronous DRAM.

A specific embodiment of the invention is shown in Figure 4. A memory-access command entering the memory controller is placed, according to the memory row it addresses, at the tail of the corresponding command queue. The selection logic picks one queue to access the DRAM, and this queue becomes the active queue. Suppose, for example, that Queue 1 is the active queue. Its leading command, meaning the command held at the rightmost position of the queue, drives the row address into the DRAM together with the RAS1 signal; the column address is then driven and latched by the CAS1 signal. The timing of these actions is the same as the first four cycles of Figure 3(a), the ordinary page-hit case. In Figure 3(a), however, the address bus is idle during cycles 5 and 6, and this idle time is a good opportunity for a standby queue to perform early row-address selection.

Figure 5(a) shows the timing diagram of early row-address selection. In cycles 5 and 6, the leading command of Queue 2, which corresponds to ROW 2, drives its row address and RAS2. By cycle 7, before the second command of Queue 1 drives its column address, the leading command of Queue 2 has already completed its row-address selection.
Description of the invention () The signal is given to the memory, and then the memory is accessed according to the selection of the selection logic. The system and method designed in this way can be used in any memory access system that may cause page faults or column faults to avoid penalties, thereby reducing the average memory access latency and increasing the performance of the entire system. Figure 1 shows the multiple-processor system of shared-bus and shared-memory. This figure emphasizes the design of the memory controller. Figure 2 shows the basic structure of dynamic random access memory (DRAM). Figure 3 (A) is a timing diagram of two consecutive read operations when a page hit occurs. Figure 3 (B) is a timing diagram of two read operations when a page miss occurs. Figure 3 (C) is a timing diagram of two read operations when a row miss occurs. FIG. 4 is a specific embodiment of the present invention. This figure illustrates the use of multiple memory instructions. Figure 5 (A) is a timing diagram of early row address selection (early Row Address selection). Figure 5 (B) is the sequence diagram of the switching active queue (switching active queue). Figure 6 is an example of the reordering of the command queue in the hypothetical situation of the present invention. Figure 7 (A) is the extended data output (Extended Data Output; EDO) Dynamic random access memory (DRAM) timing diagram. Figure 7 (B) is a timing diagram of Burst EDO dynamic random access memory (DRAM). Figure 7 (C) is a timing diagram of synchronous dynamic random access memory (Synchronous DRAM). The size of this paper is printed in Chinese National Standard (CNS) Α4 specification (210X 297mm) — I n 1 — line (please read the precautions on the back before filling in this page) Printed by the Employee Consumer Cooperative of the Central Bureau of Samples of the Ministry of Economic Affairs A7 B7 5. Description of the invention () The polite example of the present invention is shown in Figure 4. A memory access command to enter the memory controller is based on the command queue of the memory row, at the end of the command queue At the end, the selection logic selects a queue to access the dynamic random access memory (DRAM). This queue becomes the ongoing queue. For example, the queue is a certain ongoing queue. This leading command (leading command ) Along with the RAS1 signal, the row address is flowed into the dynamic random access memory (DRam). This leading command means the one stored in the rightmost part of the row. Then, the row address is locked by the CAS1 signal Latch and flow in. The sequence of the above actions is shown in the first four cycles of Figure 3 (A) (that is, the case of general page hits). In any case, Figure 3 (A) is in cycle 5 and cycle 6 ( between cycle 5 @ cycle 6), address bus A situation of idle is generated. At this time, a good opportunity for the early column address selection of the queue address to be performed by the queue is generated in cycle 5 and cycle 6. As shown in Figure 5 (A), the early column address selection Timing diagram. In cycles five and six, the first instruction of queue 2 corresponding to R0W2 flows into a column address and RAS2. In cycle seven, the second instruction in column 1 flows into the row Before the address, the beginning instruction of queue 2 has completed the selection of the row address. 
Similarly, before the third command of Queue 1 drives its column address in cycle 11, the row-address selection of Queue 3 has already been completed in cycles 9 and 10. The key point of the invention is therefore to let the standby queues steal part of the cycles of the active queue's operation. For example, in both Figure 3(a) (the single-command-queue design) and Figure 5(a) (the multiple-command-queue design) Word 2 appears on the data bus in cycle 9, but Figure 5(a) allows Queue 2 to complete its row-address selection in cycles 5 and 6, which Figure 3(a) does not.

Figure 5(b) shows an exchange of the active queue. Suppose that after the third command of Queue 1 has been executed, the fourth command of Queue 1 suffers a page miss. In cycles 13 and 14 the leading command of Queue 4 performs early row-address selection, and the queue states are then switched so that Queue 2 becomes the active queue and drives its column address in cycles 15 and 16. Because the leading command of Queue 2 completed its row-address selection back in cycles 5 and 6, the first word of Queue 2 (corresponding to ROW 2) appears on the data bus in cycle 17. Cycle 17 is also four cycles after Word 3 of Queue 1 (ROW 1) appeared, so the delay is the same as for a page hit. In cycles 17 and 18, Queue 1 uses stolen cycles to complete its row-address selection, and in cycles 19 and 20 Queue 2 drives the column address of its second command. In cycles 21 and 22 the address bus is idle, because all the standby queues (Queue 3, Queue 4 and Queue 1) have already completed the row-address selection of their leading commands. From Figures 5(a) and 5(b) we see that although a page miss occurs between Word 3 and Word 4 of Queue 1, the penalty is avoided by exchanging the active queue. It can also be observed that one word enters the data bus every four cycles, so page misses, row misses and active-queue exchanges need not be a concern at all.
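To make the cycle accounting concrete, the sketch below compares an in-order single command queue with the multi-queue controller for one short stream of reads, using the simplified costs quoted earlier: four cycles per word, plus four extra cycles for each page miss and one extra cycle for each row miss that the single queue encounters. The multi-queue controller is credited a steady word every four cycles, on the assumption that some standby queue has always completed its early row-address selection in stolen cycles. This is a back-of-the-envelope model, not a reproduction of the patent's timing diagrams.

```c
/* Back-of-the-envelope cycle count for one stream of reads.  The in-order
 * single queue pays the Figure 3 penalties (+4 for a page miss, +1 for a
 * row miss) between consecutive commands; the multi-queue controller is
 * credited a steady word every four cycles.  Purely illustrative numbers. */
#include <stdio.h>

#define CYCLES_PER_WORD   4
#define PAGE_MISS_PENALTY 4
#define ROW_MISS_PENALTY  1

struct access { int mem_row; unsigned page; };

int main(void)
{
    struct access s[] = {
        {0, 3}, {0, 3}, {0, 9},          /* ROW 1: a hit, then a page miss */
        {1, 5}, {1, 5}, {1, 7},          /* ROW 2: a hit, then a page miss */
        {0, 9}, {1, 7},                  /* two row misses                 */
    };
    int n = (int)(sizeof s / sizeof s[0]);

    int single = CYCLES_PER_WORD;                /* first word             */
    for (int i = 1; i < n; i++) {
        single += CYCLES_PER_WORD;
        if (s[i].mem_row != s[i - 1].mem_row)
            single += ROW_MISS_PENALTY;
        else if (s[i].page != s[i - 1].page)
            single += PAGE_MISS_PENALTY;
    }

    int multi = n * CYCLES_PER_WORD;             /* one word per 4 cycles  */

    printf("single queue : %d cycles\n", single);  /* 32 + 2*4 + 3*1 = 43  */
    printf("multi  queue : %d cycles\n", multi);   /* 32                   */
    return 0;
}
```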

Figure 6 shows one example of the method of the invention. Assume that memory-access commands (1) through (10) have already been dispatched into Queue 1 through Queue 4. To simplify the example, assume that no further commands arrive until command (10) has finished executing, and assume that page misses exist between commands (2) and (3), between commands (5) and (6), and between one further pair of commands later in the sequence.

Step 1: the selection logic selects Queue 1 as the active queue, and command (1) drives its row address and RAS1. As mentioned for Figure 5(a), a typical RAS step takes two cycles.

Step 2: command (1) drives its column address and CAS1, and command (5) of Queue 2 uses stolen cycles to perform an early row-address selection.

Step 3: command (2) completes its CA, and command (7) completes its RA. At this point the leading commands of Queue 2 and Queue 3 (commands (5) and (7)) have both completed their row-address selection.

Step 4: a page miss occurs between commands (2) and (3), so the active queue must be exchanged for Queue 2. Here command (5) completes its CA and command (10), the leading command of Queue 4, completes its RA.

Step 5: because a page miss appears again between commands (5) and (6), the active queue is exchanged for Queue 3. Command (7) completes its CA and command (3), now the leading command of Queue 1, performs its RA. The procedure continues in this manner until all commands have been executed.

So far we have assumed that the active queue is selected from the multiple queues in a cyclic (round-robin) manner: when Queue 1 suffers a page miss, for example, Queue 2 is selected as the next active queue. Other selection methods may also be used. One is to select the standby queue that holds the most commands, which helps prevent any queue from becoming too full. Another is to select a standby queue whose leading command is a read; this gives read commands a higher priority for DRAM access, so the utilization of a central processing unit (CPU) that is waiting for a read response can be improved. The cyclic method is the simplest to implement; the other methods have their own advantages but may increase the burden of designing the selection logic.
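Because the three candidate policies differ only in how the next active queue is picked, they can be written as interchangeable functions over the same queue state, as in the C sketch below. The queue bookkeeping and the tie-breaking rules are assumptions made for the example.

```c
/* Three interchangeable policies for choosing the next active queue, as
 * discussed above: round-robin (simplest), the fullest standby queue, or a
 * standby queue whose leading command is a read.  Illustrative only. */
#include <stdio.h>

#define NUM_QUEUES 4

struct queue_info {
    int count;                     /* commands waiting in the queue        */
    int leading_is_read;           /* 1 if the leading command is a read   */
};

static int round_robin(const struct queue_info *q, int active)
{
    for (int i = 1; i <= NUM_QUEUES; i++) {
        int cand = (active + i) % NUM_QUEUES;
        if (q[cand].count > 0)
            return cand;
    }
    return active;
}

static int fullest_queue(const struct queue_info *q, int active)
{
    int best = -1;
    for (int i = 0; i < NUM_QUEUES; i++) {
        if (i == active || q[i].count == 0)
            continue;
        if (best < 0 || q[i].count > q[best].count)
            best = i;
    }
    return best < 0 ? active : best;
}

static int read_first(const struct queue_info *q, int active)
{
    for (int i = 1; i <= NUM_QUEUES; i++) {
        int cand = (active + i) % NUM_QUEUES;
        if (q[cand].count > 0 && q[cand].leading_is_read)
            return cand;
    }
    return round_robin(q, active); /* no waiting read: fall back */
}

int main(void)
{
    struct queue_info q[NUM_QUEUES] = {
        { 1, 0 },   /* queue 0 (currently active)     */
        { 2, 0 },   /* queue 1: two writes waiting    */
        { 5, 0 },   /* queue 2: deepest standby queue */
        { 1, 1 },   /* queue 3: a read is waiting     */
    };
    printf("round-robin   -> queue %d\n", round_robin(q, 0));    /* 1 */
    printf("fullest queue -> queue %d\n", fullest_queue(q, 0));  /* 2 */
    printf("read first    -> queue %d\n", read_first(q, 0));     /* 3 */
    return 0;
}
```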
Figures 3(a) through 3(c) and Figures 5(a) and 5(b) are memory-access timing diagrams for a controller combined with an ordinary DRAM of today. When a different DRAM type or a different clock-cycle time is used, the timing diagrams change.

Figure 7(a) is the memory-access timing diagram of an Extended Data Output (EDO) DRAM, Figure 7(b) that of a Burst EDO DRAM, and Figure 7(c) that of a Synchronous DRAM. The examples of Figures 7(a) through 7(c) all use a single memory command queue. Even with these advanced DRAM devices, the address bus sits idle for long periods between the row address and the column address, and between two consecutive column addresses, and these idle periods are exactly where early row-address selection can be inserted (as indicated by the arrows in the figures). The multiple-memory-command-queue method of the present invention therefore gives better results than an ordinary memory device alone, and if the memory devices cited above are improved with the method of the invention, the reduction of the average memory-access latency and the increase of overall system performance will be even greater.

The foregoing is only one specific embodiment of the present invention and is not intended to limit the scope of the invention. All equivalent changes and modifications made within the scope of the claims of this invention are covered by the claims of the invention.

Claims (1)

1. A system for eliminating the page-miss and row-miss penalties of memory access, comprising:
1) one or more devices capable of simultaneously issuing memory access commands;
2) a system bus connected to the devices;
3) a memory controller connected to the system bus, comprising:
a) dispatch logic for redistributing the commands delivered by the system bus;
b) a plurality of command queues connected to the dispatch logic for temporarily storing the commands delivered by the dispatch logic;
c) selection logic for designating one of the command queues to perform the access to the memory;
4) a set of memory connected to the memory controller.
2. The system as claimed in claim 1, wherein the devices capable of simultaneously issuing memory access commands are one or more processors.
3. The system as claimed in claim 1, wherein the devices capable of simultaneously issuing memory access commands are one or more master devices.
4. The system as claimed in claim 1, wherein the devices capable of simultaneously issuing memory access commands are one or more processors that support a split-transaction mechanism.
5. The system as claimed in claim 1, wherein the devices capable of simultaneously issuing memory access commands are one or more master devices that support a split-transaction mechanism.
6. The system as claimed in claim 1, wherein the device capable of simultaneously issuing memory access commands is a Pentium-Pro.
7. The system as claimed in claim 1, wherein the device capable of simultaneously issuing memory access commands is a PowerPC 620.
8. The system as claimed in claim 1, wherein the system bus supports out-of-order completion of split transactions.
9. The system as claimed in claim 1, wherein the memory is dynamic random access memory (DRAM).
10. The system as claimed in claim 1, wherein the memory is extended data output (EDO) DRAM.
11. The system as claimed in claim 1, wherein the memory is burst EDO DRAM.
12. The system as claimed in claim 1, wherein the memory is synchronous DRAM.
13. The system as claimed in claim 1, wherein the memory controller further comprises a first-in-first-out command queue for temporarily storing the commands delivered by the system bus before passing them to the dispatch logic.
14. The system as claimed in claim 1, wherein the dispatch logic of the memory controller dispatches commands according to the correspondence between the command queues and the memory.
15. The system as claimed in claim 1, wherein the selection logic of the memory controller switches from the active command queue to a standby command queue when the active command queue encounters a page miss or a row miss, or when the active command queue is empty.
16. The system as claimed in claim 15, wherein the selection logic of the memory controller selects the standby command queue according to the leading command of the standby command queue.
17. A memory controller, comprising:
1) dispatch logic for redistributing the commands delivered to it;
2) a plurality of command queues connected to the dispatch logic for temporarily storing the commands delivered by the dispatch logic;
3) selection logic for designating one of the command queues to perform the access to the memory.
18. A method for eliminating the page-miss and row-miss penalties of memory access, which processes the memory access commands issued by one or more devices capable of simultaneously issuing memory access commands and then delivers the processed commands to the memory, the method comprising the steps of:
1) using dispatch logic to redistribute the commands delivered by the system bus;
2) using a plurality of command queues to temporarily store the commands delivered by the dispatch logic;
3) using selection logic to designate one of the command queues to perform the access to the memory.
19. The method as claimed in claim 18, wherein the devices capable of simultaneously issuing memory access commands are one or more processors.
20. The method as claimed in claim 18, wherein the devices capable of simultaneously issuing memory access commands are one or more master devices.
21. The method as claimed in claim 18, wherein the devices capable of simultaneously issuing memory access commands are one or more multiprocessors that support a split-transaction mechanism.
22. The method as claimed in claim 18, wherein the devices capable of simultaneously issuing memory access commands are one or more master devices that support a split-transaction mechanism.
23. The method as claimed in claim 18, wherein the device capable of simultaneously issuing memory access commands is a Pentium-Pro.
24. The method as claimed in claim 18, wherein the device capable of simultaneously issuing memory access commands is a PowerPC 620.
25. The method as claimed in claim 18, wherein the system bus supports out-of-order completion of split transactions.
26. The method as claimed in claim 18, wherein the memory is dynamic random access memory (DRAM).
27. The method as claimed in claim 18, wherein the memory is extended data output (EDO) DRAM.
28. The method as claimed in claim 18, wherein the memory is burst EDO DRAM.
29. The method as claimed in claim 18, wherein the memory is synchronous DRAM.
30. The method as claimed in claim 18, wherein the memory controller further comprises a first-in-first-out command queue for temporarily storing the commands delivered by the system bus before passing them to the dispatch logic.
31. The method as claimed in claim 18, wherein the dispatch logic of the memory controller dispatches commands according to the correspondence between the command queues and the memory.
32. The method as claimed in claim 18, wherein the selection logic of the memory controller switches from the active command queue to a standby command queue when the active command queue encounters a page miss or a row miss, or when the active command queue is empty.
33. The method as claimed in claim 32, wherein the selection logic of the memory controller selects the standby command queue according to the leading command of the standby command queue.
TW85106019A 1996-05-22 1996-05-22 The method and system for eliminating penalty of page/row missing TW305959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW85106019A TW305959B (en) 1996-05-22 1996-05-22 The method and system for eliminating penalty of page/row missing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW85106019A TW305959B (en) 1996-05-22 1996-05-22 The method and system for eliminating penalty of page/row missing

Publications (1)

Publication Number Publication Date
TW305959B true TW305959B (en) 1997-05-21

Family

ID=51566019

Family Applications (1)

Application Number Title Priority Date Filing Date
TW85106019A TW305959B (en) 1996-05-22 1996-05-22 The method and system for eliminating penalty of page/row missing

Country Status (1)

Country Link
TW (1) TW305959B (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI514144B (en) * 2011-12-29 2015-12-21 Intel Corp Aggregated page fault signaling and handling
US9891980B2 (en) 2011-12-29 2018-02-13 Intel Corporation Aggregated page fault signaling and handline
US10255126B2 (en) 2011-12-29 2019-04-09 Intel Corporation Aggregated page fault signaling and handling
US11275637B2 (en) 2011-12-29 2022-03-15 Intel Corporation Aggregated page fault signaling and handling

Similar Documents

Publication Publication Date Title
US5822772A (en) Memory controller and method of memory access sequence recordering that eliminates page miss and row miss penalties
US5611058A (en) System and method for transferring information between multiple buses
EP2223217B1 (en) System, apparatus, and method for modifying the order of memory accesses
KR100898710B1 (en) Multi-bank scheduling to improve performance on tree accesses in a dram based random access memory subsystem
EP1540485B1 (en) Out of order dram sequencer
US5870625A (en) Non-blocking memory write/read mechanism by combining two pending commands write and read in buffer and executing the combined command in advance of other pending command
US5388247A (en) History buffer control to reduce unnecessary allocations in a memory stream buffer
KR102519019B1 (en) Ordering of memory requests based on access efficiency
TW201234188A (en) Memory access device for memory sharing among multiple processors and access method for the same
JP2002530731A (en) Method and apparatus for detecting data collision on a data bus during abnormal memory access or performing memory access at different times
WO2005109218A1 (en) Memory controller with command look-ahead
US8271746B1 (en) Tiering of linear clients
US6549991B1 (en) Pipelined SDRAM memory controller to optimize bus utilization
US9620215B2 (en) Efficiently accessing shared memory by scheduling multiple access requests transferable in bank interleave mode and continuous mode
TW491970B (en) Page collector for improving performance of a memory
TW305959B (en) The method and system for eliminating penalty of page/row missing
JP5382113B2 (en) Storage control device and control method thereof
TW548547B (en) Method and system for maintaining cache coherency for write-through store operations in a multiprocessor system
US6502150B1 (en) Method and apparatus for resource sharing in a multi-processor system
KR100328726B1 (en) Memory access system and method thereof
KR20240000773A (en) Pim computing system and memory controller therof
JPH07295947A (en) Equipment and method for data transfer management
JP2005165508A (en) Direct memory access controller
EP1704487A2 (en) Dmac issue mechanism via streaming id method
US8713291B2 (en) Cache memory control device, semiconductor integrated circuit, and cache memory control method