TWI220944B - Cache controller unit architecture and applied method - Google Patents

Cache controller unit architecture and applied method

Info

Publication number
TWI220944B
Authority
TW
Taiwan
Prior art keywords
cache
dirty
column
write
buffer
Prior art date
Application number
TW92105886A
Other languages
Chinese (zh)
Other versions
TW200419343A (en)
Inventor
Hua-Chang Chi
Original Assignee
Faraday Tech Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Faraday Tech Corp
Priority to TW92105886A
Application granted
Publication of TWI220944B
Publication of TW200419343A


Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A cache controller unit (CCU) architecture with dirty line write-back auto-adjustment, suitable for high performance microprocessor systems with write-back cache memory. The CCU architecture includes a cache data control unit to access data between a cache memory and a CPU, a tag compare unit to compare an address sent by the CPU and a tag address sent by a tag memory and thus produce a cache hit signal, and a CCU state machine to control the data access direction of the cache data control unit and produce corresponding operations according to the tag compare result.

Description

Technical field of the invention

The present invention relates to a cache controller unit architecture for a high-performance microprocessor system having a write-back cache memory. In particular, the invention provides a method of operating a cache controller unit (CCU) that improves cache performance when a dirty cache line is written back.

Prior art

Most high-performance microprocessor systems today use a memory hierarchy. At the level closest to the central processing unit (CPU) core is a cache memory, usually built from high-speed static random access memory (SRAM). The cache is normally on the same chip as the CPU core, so it can operate synchronously with the CPU at the same clock rate. At a lower level is the main memory, which makes up the entire physical memory space available to the CPU. Main memory, for example dynamic random access memory (DRAM), is usually off-chip; it is slower and cheaper. The cache holds the contents of only a small portion of the memory locations in main memory. When the data address accessed by the CPU is in the cache (a hit), the CPU accesses the cache directly and loses no time waiting for data. When the address is not in the cache (a miss), the data must be accessed in main memory, which takes longer; in that case the CPU must stall until the data arrives from main memory.

In a microprocessor system, main memory can also be accessed by devices other than the CPU, for example various input/output (I/O) devices or direct memory access (DMA) masters.

The cache must keep the cached data coherent, and main memory should hold a copy identical to the data in the cache. There are two ways of making the data in main memory match the data in the cache. One is the write-through (WT) method: when the data to be written is in the cache, the CPU writes it to both the cache and main memory, so main memory always holds the same data as the cache. A WT cache is easier to design and makes coherency easier to maintain, but because every write must also go to the slower main memory, CPU performance suffers. The other is the write-back (WB) method: data is written only to the cache when it is present there, and the modified data is not written to main memory immediately; it is written to main memory only later, to maintain coherency. A modified, or "dirty", cache line (a dirty line) must be updated in main memory when a read miss causes the dirty line to be replaced. In that situation, the data of the dirty line must be read out of the cache and placed in main memory before it is replaced by the data of a new cache line. This causes two consecutive operations: writing the dirty line data from the cache to main memory, and reading the new cache line data from main memory into the cache. The CPU must stall during these operations, which lowers performance.

A common solution to this problem is to place a write buffer or register between the cache and main memory to hold temporarily the data and address of the dirty line being replaced (hereinafter the replaced line).
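The difference between the write-through and write-back methods described above can be illustrated with a minimal sketch. It assumes a cache holding a single line and allocation on every miss; the class `OneLineCache` and the helper `count_mem_writes` are hypothetical names used for illustration only and do not come from the patent.

```python
class OneLineCache:
    """A cache with a single line, just enough to contrast WT and WB."""

    def __init__(self, policy: str) -> None:
        if policy not in ("WT", "WB"):
            raise ValueError("policy must be 'WT' or 'WB'")
        self.policy = policy
        self.tag = None        # address tag of the line currently held
        self.dirty = False     # only ever set under the WB policy
        self.mem_writes = 0    # number of writes that reached main memory

    def access(self, tag: int, is_write: bool) -> None:
        if tag != self.tag:
            # Miss: the held line is replaced (allocation on miss is an assumption).
            if self.dirty:
                self.mem_writes += 1   # WB: the dirty line is written back first
            self.tag, self.dirty = tag, False
        if is_write:
            if self.policy == "WT":
                self.mem_writes += 1   # WT: every write also goes to main memory
            else:
                self.dirty = True      # WB: main memory is updated later


def count_mem_writes(policy: str) -> int:
    """Three writes to one line, then a read miss that replaces it."""
    cache = OneLineCache(policy)
    for _ in range(3):
        cache.access(tag=0xA, is_write=True)
    cache.access(tag=0xB, is_write=False)
    return cache.mem_writes
```

Under these assumptions the write-through cache sends all three writes to main memory, while the write-back cache performs a single write when the dirty line is replaced; that single write is the one that can stall the CPU.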

Fig. 1 is a block diagram of a typical microprocessor system with a cache memory, and Fig. 2 is a flowchart of its operation. In Fig. 1, the system comprises a CPU 11, a tag memory 12, a cache memory 13, a cache controller unit (CCU) 14, a main memory 15, a memory controller (MEMC) 16, a bus interface unit (BIU) 17 having a write buffer 171, and a system bus 18. As shown in Figs. 1 and 2, when the CPU 11 wants to access data in memory, it issues a read/write command and the address to be accessed to the CCU 14 (S1). The CCU 14 compares this address with the contents of the tag memory 12 to check whether the address is present in the cache (S2). The contents of the tag memory 12 include the high-order address bits of each cache line, and may include some control bits for each line, for example a valid bit indicating that the data in the line is valid and a dirty bit indicating that the data in the line has been modified. If the check finds the required address, data is read from the cache memory 13 to the CPU 11 (read operation) or written from the CPU 11 to the cache memory 13 (write operation) (S3). If the address is not in the cache, the required data is in the main memory 15, and the CCU 14 must redirect the access to the BIU 17 in order to reach the devices connected to the system bus 18, in particular the memory controller 16 responsible for accessing main memory. The BIU 17 usually contains a write buffer 171 that holds data and addresses to be written to main memory, so as to keep main memory consistent with the cache. The CCU 14 then issues a fill request to the BIU 17 (S4) and checks whether the replaced line is dirty or clean (S5). If the replaced line is dirty, the CCU 14 must wait until the write buffer has room for the current dirty line information (S6).

Once the write buffer has room, the CCU 14 places the dirty line information into the write buffer 171 (S7) and waits for the first requested word from the BIU 17 (S8), after which the CPU 11 resumes operation (S9).

However, this operation can produce the worst case shown in Fig. 3: a miss occurs while the BIU 17 has not yet written the data in the write buffer 171 to main memory, as shown by solid line A in Fig. 3. Because the BIU first serves [...], the CPU must wait until the dirty line information has been placed and the write buffer 171 has emptied (solid line C in Fig. 3), which also degrades CPU performance.

Summary of the invention

One object of the invention is therefore to provide a cache controller unit architecture for a high-performance microprocessor system having a write-back cache memory, which reduces the time the cache controller unit spends waiting to use the write buffer.

Another object of the invention is to provide an operating method for a high-performance microprocessor system having a cache memory, in which the write of the replaced data is actively deferred to the next cache miss when the write buffer has no room, thereby improving cache performance during dirty line write-back.

The invention provides a cache controller unit architecture comprising: a cache data control unit, for accessing data between a cache memory and a central processing unit (CPU); a tag compare unit, for comparing the [...]


[...] the fill address field is updated to the current miss line address; whether the replaced line is a dirty line is checked; when the replaced line is not dirty, the dirty flag is set inactive; when the replaced line is dirty, whether the write buffer is empty is checked; when the write buffer is not empty, the dirty flag is set active so that the replaced line is placed into the write buffer later, and the cache controller unit (CCU) and the central processing unit (CPU) are released without waiting for the write buffer to empty; when the write buffer is empty, the dirty flag is set inactive after the replaced line has been placed into the write buffer; and the operation of the CPU is continued on the basis of the first data word of the new cache line, fetched into the fill buffer and forwarded by the cache controller unit.

Embodiments

Throughout the text, elements with similar functions are denoted by the same reference numerals.

A cache memory may be subdivided into several levels. For example, the L1 (level 1) cache is closest to the CPU and is the fastest; the L2 (level 2) cache is the next level, slower than L1 but larger. The invention is a general principle and can be applied to a cache at any level. For ease of explanation, however, the following description assumes that the cache consists of only one level.

When a cache miss occurs, a line of data must be read from main memory and placed in the cache. A line, the basic unit of data read from main memory on a miss, usually consists of several CPU words.


For example, one of the following may be used: a direct-mapped cache architecture, in which a line of data can be placed at only one specific location in the cache; an n-way set-associative cache architecture, in which a line of data can be placed at one of n locations in the cache; or a fully associative cache architecture, in which a line of data can be placed at any location in the cache.

Fig. 4 is a block diagram of the cache controller unit (CCU) according to the invention. In Fig. 4, the CCU 41 comprises a cache data control unit 411, a tag compare unit 412, a fill buffer 413 and a CCU state machine 414. Besides the CCU 41, the microprocessor system also includes, as in Fig. 1, a CPU 11, a cache memory 13, a tag memory 12 and a bus interface unit (BIU) 17 having a write buffer 171.

As shown in Fig. 4, unit 412 is combinational logic that compares the address output of the CPU 11 (CPU_addr) with the tag address output of the tag memory 12 (Tag_addr). The tag address output also carries a valid bit, indicating that the corresponding cache line contains valid data, and a dirty bit, indicating that the corresponding cache line has been modified. If CPU_addr = Tag_addr and the valid bit is true, a HIT signal is sent to the state machine 414. The state machine 414, the control core of the CCU 41, directs unit 411 to read data from the cache memory 13 to the CPU 11 (read operation) or to write data from the CPU 11 to the cache memory 13 (write operation). If the HIT signal is not asserted, for example on a read miss, the state machine 414 issues a Fill_req signal asking the BIU 17 to fetch a new cache line.
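The comparison performed by the tag compare unit 412 can be sketched in software as follows. This is an illustrative model only, not the combinational logic of the patent: `TagEntry`, `tag_compare` and `state_machine_signal` are hypothetical names, and `cpu_tag` is assumed to be the high-order address bits that the CPU address shares with the tag memory entry.

```python
from typing import NamedTuple


class TagEntry(NamedTuple):
    """One tag memory entry: tag address plus the valid and dirty bits."""
    tag_addr: int
    valid: bool
    dirty: bool


def tag_compare(cpu_tag: int, entry: TagEntry) -> bool:
    """HIT is asserted only when the tags match and the valid bit is set."""
    return bool(entry.valid and cpu_tag == entry.tag_addr)


def state_machine_signal(cpu_tag: int, entry: TagEntry) -> str:
    """Signal chosen by the state machine: HIT on a hit, Fill_req on a miss."""
    return "HIT" if tag_compare(cpu_tag, entry) else "Fill_req"
```

A matching tag with a cleared valid bit is treated as a miss, which is why the valid bit takes part in the comparison.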


If the replaced line is dirty (its dirty bit is "1"), the dirty line must be written to main memory (Fig. 1) to keep the cache coherent. In this case the state machine 414 also issues a Write_req signal to the BIU 17, so that the dirty line data WB_data is moved from the cache memory 13 into the write buffer 171 when the write buffer is empty (WB_empty = 1). After the BIU 17 has fetched the data of the new cache line, it places the data Fill_data into the fill buffer 413. The data in the fill buffer is then sent to the CPU 11 so that the CPU can continue with its next instruction.

Fig. 5 shows an example of the fill buffer structure according to the invention. As shown in Fig. 5, the structure comprises a fill address field, which stores the miss line address, and several data word fields, which store the line of data being fetched. In this example a cache line contains 4 CPU words. The example is illustrative only and not limiting.
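The fill buffer layout of Fig. 5 can be sketched as a small data structure. This is a minimal sketch assuming the 4-word line of the example; the class name `FillBuffer` and its methods `start_fill`, `put_word` and `complete` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WORDS_PER_LINE = 4  # the example of Fig. 5; the text says this is not a limit


@dataclass
class FillBuffer:
    """A fill address field plus one data word field per CPU word of a line."""
    fill_addr: Optional[int] = None
    words: List[Optional[int]] = field(
        default_factory=lambda: [None] * WORDS_PER_LINE
    )

    def start_fill(self, miss_line_addr: int) -> None:
        """Record the miss line address and clear the data word fields."""
        self.fill_addr = miss_line_addr
        self.words = [None] * WORDS_PER_LINE

    def put_word(self, index: int, value: int) -> None:
        """Store one word delivered by the BIU (Fill_data)."""
        self.words[index] = value

    def complete(self) -> bool:
        """True once every word of the line has arrived."""
        return all(w is not None for w in self.words)
```

Keeping the address next to the data lets the controller write the buffered line into the cache later, using only what the fill buffer itself holds.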
Figs. 6a and 6b are flowcharts of the operation of the architecture of Fig. 4. As shown in Figs. 6a and 6b, when the CPU issues a read or write command (S1), the CCU checks the current cache state (S2). On a cache hit, the CCU accesses the data in the cache (S3) and the CPU continues without delay. On a cache miss, the CCU checks whether the dirty flag is active (S4), that is, whether dirty_flag = 1, where dirty_flag is a status bit used by the CCU to indicate whether the dirty line replaced because of the previous miss has been placed in the write buffer. If dirty_flag = 1, the CCU must wait until the write buffer is empty (S5). When the write buffer is empty, the CCU issues a fill request to the BIU to fetch a new cache line from main memory and, at the same time, reads the dirty line from the cache into the write buffer (S6). When dirty_flag is not active, the CCU issues a fill request to the BIU to fetch the new cache line from main memory (S7). Note that the dirty line read in step S6 is the one produced by the previous miss, not by the current miss.

Next, the data in the fill buffer is placed in the cache (S8). Note that the data in the fill buffer at this point was read for the previous miss; it is not the data of the new cache line for the current miss. Because main memory and the system bus are relatively slow, the data of the new cache line is still being fetched by the BIU at this time. In steps S6 and S8, the address used to access the cache comes from the fill address field of the fill buffer, which holds the address and data of the previous miss line. The CCU then updates the fill address field to the current miss line address (the new cache line address), replacing the previous one (S9), and checks whether the line now being replaced is dirty (S10).

If the replaced line is not dirty (the clean state in the figure), no update is needed and dirty_flag is set to 0 (S11). If the replaced line is dirty (the dirty state in the figure), the CCU checks whether the write buffer is empty (S12). If it is not empty, dirty_flag is set to 1 (S13) so that the dirty line is placed in the write buffer later; the CCU and CPU resources are thus released at once, with no stall waiting for the write buffer to empty. If the write buffer is empty, the replaced line is placed in the write buffer (S14) and dirty_flag is set to 0 (S15). The CCU then waits for the first data word of the miss line to arrive in the fill buffer (S16). On receiving that word, it forwards it to the CPU so that the CPU can continue (S17). The remaining words of the miss line can be loaded into the fill buffer after the CPU has resumed.

This flow is shown clearly in Fig. 7: when the CCU issues a fill request in the cycle in which the miss line is fetched, setting dirty_flag = 1 lets the CPU continue operating. CPU operation is therefore not delayed by a write buffer that is not yet empty. Note that from step S6 to step S16 the BIU is fetching the miss line (the new cache line) from main memory; because main memory and the system bus are comparatively slow, the bottleneck is usually the BIU's fetch of the miss line rather than the CCU's work in steps S6 to S16. Note also that in each pass through the flow of Fig. 6 only one of steps S6 and S14 is executed. If step S6 is executed, the write buffer is no longer empty, so when the CCU state machine reaches step S12 it goes on to step S13 rather than step S14, and then from step S13 to step S16, without waiting for the write buffer to empty in order to place the dirty line in it. Placing the dirty line in the write buffer is deferred until the CCU reaches step S4 on the next cache miss, at which point dirty_flag is active because step S13 was executed on the previous miss. With the invention the CCU may still have to wait for the write buffer to empty in step S5. Compared with the prior art, however, this is far less likely, because the BIU can use the time between the previous miss and the current miss to empty the write buffer.

The two key moments for placing a dirty line in the write buffer are steps S6 and S14. When a miss occurs, the invention actively chooses, according to the state of the write buffer, when to place the dirty line in the write buffer, and thereby optimizes CPU performance.
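The miss-handling decisions of steps S4 to S17 can be traced with a small control-flow model. This is a sketch under stated assumptions, not the patented hardware: the write buffer is assumed to hold one line, no data is moved, and the names `CCUModel`, `deferred_line`, `biu_drain` and `handle_miss` are hypothetical. A wait in step S5 is modeled by counting a stall and then letting the BIU drain the buffer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CCUModel:
    dirty_flag: bool = False             # a dirty line from the previous miss still awaits the write buffer
    deferred_line: Optional[int] = None  # address of that line (hypothetical bookkeeping)
    write_buffer: Optional[int] = None   # one-entry write buffer; None means empty
    fill_addr: Optional[int] = None      # fill address field of the fill buffer
    stalls: int = 0                      # misses on which the CPU had to wait in S5

    def biu_drain(self) -> None:
        """The BIU writes the buffered line to main memory, emptying the buffer."""
        self.write_buffer = None

    def handle_miss(self, miss_addr: int, victim_addr: int,
                    victim_dirty: bool) -> List[str]:
        """Return the step labels visited for one cache miss."""
        steps = ["S4"]
        if self.dirty_flag:
            if self.write_buffer is not None:  # S5: the only place the CPU can stall
                steps.append("S5-wait")
                self.stalls += 1
                self.biu_drain()
            steps.append("S6")                 # fill request; deferred dirty line goes to the write buffer
            self.write_buffer, self.deferred_line = self.deferred_line, None
        else:
            steps.append("S7")                 # fill request only
        steps.append("S8")                     # data of the previous miss: fill buffer to cache
        self.fill_addr = miss_addr             # S9: fill address field now names the current miss
        steps += ["S9", "S10"]
        if not victim_dirty:
            self.dirty_flag = False
            steps.append("S11")
        else:
            steps.append("S12")
            if self.write_buffer is not None:  # buffer busy: defer instead of stalling
                self.dirty_flag, self.deferred_line = True, victim_addr
                steps.append("S13")
            else:
                self.write_buffer = victim_addr
                steps.append("S14")
                self.dirty_flag = False
                steps.append("S15")
        steps += ["S16", "S17"]                # first word arrives and the CPU resumes
        return steps
```

For instance, starting with a busy write buffer, a miss with a dirty victim follows S4, S7, S8, S9, S10, S12, S13, S16, S17 and records no stall. If `biu_drain()` is called before the next miss, that miss takes S6 without waiting; if it is not, the model records one stall in S5, which is the residual case the text describes as unlikely.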

Although the invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Anyone skilled in the art may make changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.

Brief description of the drawings

To make the above and other objects, features and advantages of the invention easier to understand, a preferred embodiment is described in detail below with reference to the accompanying drawings:

Fig. 1 is a block diagram of a conventional microprocessor system with a cache memory;
Fig. 2 is a flowchart of the operation of Fig. 1;
Fig. 3 shows a series of operations of the cache controller in Fig. 1;
Fig. 4 is a block diagram of the cache controller unit according to the invention;
Fig. 5 shows an example of the contents of the fill buffer of Fig. 4 according to the invention;
Figs. 6a and 6b are a complete flowchart of the operation of Fig. 4 according to the invention; and
Fig. 7 shows a series of operations of the cache controller of Fig. 4 according to the invention.

Reference numerals

11: central processing unit (CPU)
12: tag memory
13: cache memory
14, 41: cache controller unit (CCU)
15: main memory
16: memory controller (MEMC)
17: bus interface unit (BIU)
18: system bus
171: write buffer
411: cache data control unit
412: tag compare unit
413: fill buffer
414: cache controller unit state machine (CCU SM)


Claims (8)

1220944 六、申請專利範圍 • 種具有辦列(dirty line) θ S 士田# 憶體控制器單元加# . 動凋1的快取記 鞅 减# ( U)茱構,適用於具有一回寫快& # & 埃輪一示鐵記憶體、一中央處理單元(CPU)及含有一^、 級衝器的-匯流排介面單 =寫入 ^該具有辦列⑷rty iine)回寫調^的處理益系 控制器單元架構包括·· 陕取圮憶體 央處理Ϊ取貝料控制單元,肖以存取上述快取記怜體刀* 央處理早7G之間的資料; L k體及中 一標蕺比較單元,用以比較中央 址輸出及標藏記憶體位70迗的一位 取擊中信號; U铩戴位址輸出’以產生—快 Μ :快取記憶體控制器單元狀態機器,用 戴比較、结果來控制該快取f ^上述標 薪旗標被設定且該寫入緩衝器淨空則將」:=:若-若該髒旗標未被設定則發出-填滿寫入 排;丨面早兀以*求該匯流排介面單元抓取一新i ^亥匯流 料,或者,若該寫入緩衝器未、列的資 該中央處理單元的操作以使一被取二以髒;標並繼續 及 j欣馮一新的髒列; 一填滿緩衝器,用以儲存該匯流排介面 訊亚將該貧訊提供給該快取記憶體控 7所送之資 作使用…,該資訊包含一失落列二;::大態機器操 2. ?口申請專利範圍第i項之具有髒列Ui立址。 寫自動調整的快取記憶體控制器單元([⑶)架 1 ne )回 一苒,其中,1220944 6. Scope of patent application • A type with a dirty line θ S 士 田 # Memory controller unit plus #. Cache 1 of the wither and withdrawal # (U) Zhu structure, suitable for a write-back Fast &# & Egypt round a memory, a central processing unit (CPU) and a bus interface sheet containing a ^, stage punch = write ^ should have the line ⑷rty iine) write back ^ The architecture of the processing unit of the controller includes: · The Shaanxi 圮 memory system processing Ϊ Ϊ shell material control unit, Xiao Yi to access the cache memory above the knife * Central processing data between the early 7G; L k body and Secondary standard 蕺 comparison unit, used to compare the central address output and a single bit hit signal of the hidden memory position 70 迗; U 铩 wears the address output 'to generate—Quick M: cache memory controller unit state machine Use the comparison and results to control the cache f ^ The above salary flag is set and the write buffer is cleared ": =: if-if the dirty flag is not set, issue-fill the write Row; 丨 the surface must use * to find the bus interface unit to grab a new i ^ bus assembly, or, if the write buffer The operation of the central processing unit is based on the operation of the central processing unit so that one is taken to be dirty; the standard is continued and the new dirty row is filled; a buffer is filled to store the bus interface. 
[...] information provided to the cache memory controller unit for use [...] state machine operation.

2. The cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 1, wherein the fill buffer includes a line address field to store a missed line address, and a data field to store the data of the missed line fetched by the central processing unit.

3. The cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 1, wherein a valid bit indicates that a corresponding cache line holds valid data, and a dirty bit indicates that the corresponding cache line has been modified.

4. The cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 3, wherein the corresponding cache line is the dirty line or the new cache line.

5. A method of operating a cache memory controller unit (CCU) with dirty line write-back auto-adjustment, for a high-performance microprocessor system having a central processing unit (CPU), a write-back cache memory, a tag memory, a main memory, and a bus interface unit (BIU) containing a write buffer, the method comprising the following steps:
when a read or write command is issued, checking whether the current cache line is a hit or a miss;
on a cache hit, accessing the data from the cache memory;
on a cache miss, checking whether the dirty flag is active;
when the dirty flag is inactive, issuing a fill request to the bus interface unit to request a new line from the main memory;
when the write buffer is empty and the dirty flag is active, issuing a fill request to the bus interface unit to request a new line from the main memory and, at the same time, reading the dirty line produced by the previous miss from the write-back cache memory into the write buffer;
placing the data read for the previous miss from a fill buffer into the write-back cache memory;
updating the fill address field in the fill buffer to the current missed line address;
checking whether the line currently to be replaced is a dirty line;
when the replaced line is not a dirty line, setting the dirty flag inactive;
when the replaced line is a dirty line, checking whether the write buffer is empty;
when the write buffer is not empty, setting the dirty flag active so that the replaced line is placed into the write buffer later, thereby releasing the cache memory controller unit and the central processing unit without waiting for the write buffer to empty;
when the write buffer is empty, setting the dirty flag inactive after the replaced line has been written into the write buffer; and
forwarding to the central processing unit the first data word of the new line, fetched from the fill buffer and transferred by the cache memory control unit, so that the central processing unit continues execution.

6. The method of operating a cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 5, wherein the dirty flag is a status bit used by the cache memory controller unit to indicate whether the dirty line to be replaced because of the previous miss has already been written into the write buffer.

7. The method of operating a cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 5, wherein the fill buffer stores, in the different steps above, the address and data of the previous missed line and of the current missed line, respectively.

8. The method of operating a cache memory controller unit (CCU) architecture with dirty line write-back auto-adjustment as claimed in claim 5, wherein the cache memory controller unit uses the fill buffer, which has a fill address field and multiple data word fields, the fill address field holding the missed line address and the data word fields holding the line of data to be fetched by the central processing unit.
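
The control flow recited in claim 5 can be sketched in Python under stated assumptions. The sketch is not taken from the patent: the names CcuState, FillBuffer, start_miss and finish_miss and the action strings are hypothetical, the write buffer is reduced to an "empty" sample that the caller supplies at each check, and the stall returned when the dirty flag is active while the write buffer is still busy is an assumption, because the claim does not spell that case out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FillBuffer:
    """Fill buffer of claims 2 and 8: one address field plus data word fields."""
    line_addr: Optional[int] = None
    words: List[int] = field(default_factory=list)


@dataclass
class CcuState:
    # True means a dirty victim line is still waiting in the cache
    # because the write buffer was busy at the previous miss.
    dirty_flag: bool = False
    fill: FillBuffer = field(default_factory=FillBuffer)


def start_miss(state: CcuState, wb_empty: bool) -> Optional[List[str]]:
    """First half of a miss: decide what to request from the BIU."""
    if not state.dirty_flag:
        return ["request_fill"]
    if wb_empty:
        # The deferred victim moves out while the new line is fetched.
        return ["request_fill", "deferred_dirty_line_to_write_buffer"]
    return None  # assumed stall: retry once the write buffer has drained


def finish_miss(state: CcuState, miss_addr: int,
                victim_dirty: bool, wb_empty: bool) -> List[str]:
    """Second half: retire the previous fill, then handle the new victim."""
    actions: List[str] = []
    if state.fill.line_addr is not None:
        actions.append("fill_buffer_to_cache")  # data of the previous miss
    state.fill.line_addr = miss_addr            # fill address := current miss
    if not victim_dirty:
        state.dirty_flag = False
    elif wb_empty:
        actions.append("victim_to_write_buffer")
        state.dirty_flag = False
    else:
        state.dirty_flag = True                 # defer; CPU is not held up
    actions.append("first_word_to_cpu")
    return actions


if __name__ == "__main__":
    ccu = CcuState()
    print(start_miss(ccu, wb_empty=False))   # ['request_fill']
    print(finish_miss(ccu, 0x100, victim_dirty=True, wb_empty=False))
    print(ccu.dirty_flag)                    # True: write-back was deferred
    print(start_miss(ccu, wb_empty=True))
```

Splitting the miss into two calls lets the caller sample the write buffer separately at each of the two checks in the claim, so the sketch does not have to assume how quickly the write buffer drains during a fill.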
TW92105886A 2003-03-18 2003-03-18 Cache controller unit architecture and applied method TWI220944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW92105886A TWI220944B (en) 2003-03-18 2003-03-18 Cache controller unit architecture and applied method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW92105886A TWI220944B (en) 2003-03-18 2003-03-18 Cache controller unit architecture and applied method

Publications (2)

Publication Number Publication Date
TWI220944B (en) 2004-09-11
TW200419343A (en) 2004-10-01

Family

ID=34132751

Family Applications (1)

Application Number Priority Date Filing Date Title
TW92105886A TWI220944B (en) 2003-03-18 2003-03-18 Cache controller unit architecture and applied method

Country Status (1)

Country Link
TW (1) TWI220944B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI411915B (en) * 2009-07-10 2013-10-11 Via Tech Inc Microprocessor, memory subsystem and method for caching data
US8700859B2 (en) 2009-09-15 2014-04-15 Via Technologies, Inc. Transfer request block cache system and method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200008759A (en) * 2018-07-17 2020-01-29 에스케이하이닉스 주식회사 Cache memory and memory system including the same, eviction method of cache memory

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI411915B (en) * 2009-07-10 2013-10-11 Via Tech Inc Microprocessor, memory subsystem and method for caching data
TWI506437B (en) * 2009-07-10 2015-11-01 Via Tech Inc Microprocessor, method for caching data and computer program product
US8700859B2 (en) 2009-09-15 2014-04-15 Via Technologies, Inc. Transfer request block cache system and method
TWI514143B (en) * 2009-09-15 2015-12-21 Via Tech Inc Transfer request block cache system and method

Also Published As

Publication number Publication date
TW200419343A (en) 2004-10-01

Similar Documents

Publication Publication Date Title
JP4748610B2 (en) Optimal use of buffer space by the storage controller that writes the retrieved data directly into memory
EP0817067A2 (en) Integrated processor/memory device with victim data cache
US6199142B1 (en) Processor/memory device with integrated CPU, main memory, and full width cache and associated method
JP3629519B2 (en) Programmable SRAM and DRAM cache interface
US20120023302A1 (en) Concurrent Atomic Operations with Page Migration in PCIe
TWI652576B (en) Memory system and processor system
KR102478527B1 (en) Signaling for Heterogeneous Memory Systems
US8407389B2 (en) Atomic operations with page migration in PCIe
US8549227B2 (en) Multiprocessor system and operating method of multiprocessor system
US5287512A (en) Computer memory system and method for cleaning data elements
TWI220944B (en) Cache controller unit architecture and applied method
JP2010146084A (en) Data processor including cache memory control section
US20040153610A1 (en) Cache controller unit architecture and applied method
US8312218B2 (en) Cache controller and cache control method
JP3964821B2 (en) Processor, cache system and cache memory
JP2005267148A (en) Memory controller
JP2006091995A (en) Write-back device of cache memory
JP4295815B2 (en) Multiprocessor system and method of operating multiprocessor system
EP1607869A1 (en) Data cache system
US20060129762A1 (en) Accessible buffer for use in parallel with a filling cacheline
JP4583981B2 (en) Image processing device
US7840757B2 (en) Method and apparatus for providing high speed memory for a processing unit
JPH05173879A (en) Cache memory system
JPS62226348A (en) Main memory and concurrently main memory control device
JPH056659A (en) Dynamic ram

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees