TWI739430B - Cache and method for managing cache - Google Patents

Info

Publication number
TWI739430B
Authority
TW
Taiwan
Prior art keywords
cache memory
control circuit
program
core
circuit
Prior art date
Application number
TW109116171A
Other languages
Chinese (zh)
Other versions
TW202038103A (en)
Inventor
林瑞源
盧彥儒
Original Assignee
瑞昱半導體股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 瑞昱半導體股份有限公司
Priority to TW109116171A
Publication of TW202038103A
Application granted
Publication of TWI739430B

Abstract

A cache and a method for managing a cache are provided. The cache includes a first level cache, a second level cache and a register. The first level cache includes a first control circuit. The second level cache includes a second control circuit. The register is coupled to the first control circuit and the second control circuit. The first control circuit and the second control circuit refer to a register value of the register to respectively control the first level cache and the second level cache to operate in an inclusive mode or an exclusive mode.

Description

Cache memory and method for managing cache memory

The present disclosure relates to cache memory, and in particular to multi-level cache memory.

FIG. 1 is an architecture diagram of a conventional electronic device. The electronic device 100 includes a processor 110, a first-level (L1) cache memory 120, a second-level (L2) cache memory 130, and a system memory 140. The L1 cache memory 120 and the L2 cache memory 130 are typically static random-access memory (SRAM), and the system memory 140 is typically dynamic random-access memory (DRAM). The L2 cache memory 130 includes a control circuit 132 and a storage circuit 136. The control circuit 132 writes data into the storage circuit 136 or reads data from it. The data structure of the storage circuit 136 and the algorithms the control circuit 132 uses to access it are well known to those of ordinary skill in the art and are not described further. The problems that the electronic device 100 encounters in the inclusive mode and in the exclusive mode of the cache memory are discussed in turn below. Both modes are likewise well known to those of ordinary skill in the art and are not described further.

FIG. 2 is a partial flowchart of the electronic device 100 operating in the inclusive mode. During data access, when the data misses in the L1 cache memory 120, the L1 cache memory 120 requests the data from the L2 cache memory 130 (step S210). In step S220, the control circuit 132 checks whether the storage circuit 136 stores the data requested by the L1 cache memory 120. Assuming the storage circuit 136 does not store it (that is, the L2 cache memory misses), the control circuit 132 requests the data from the system memory 140 (step S230). The L2 cache memory 130 then obtains the data from the system memory 140 (step S240) and returns it to the L1 cache memory 120 (step S250). After receiving the data returned by the L2 cache memory 130, the L1 cache memory 120 stores it. Finally, the L1 cache memory 120 broadcasts the data to the L2 cache memory 130 (step S260). In step S260, the control circuit 132 must check the tags of the storage circuit 136 and write the data into the storage circuit 136. Because the capacity of the L2 cache memory 130 is usually larger than that of the L1 cache memory 120, accessing the storage circuit 136 is correspondingly slower. For example, if accessing the L1 cache memory 120 takes one system clock cycle, accessing the storage circuit 136 may take two to three cycles. Because step S260 is relatively time-consuming, the control circuit 132 cannot process the next access command immediately, which stalls the processor 110.

FIG. 3 is a partial flowchart of the electronic device 100 operating in the exclusive mode. During data access, when the data misses in the L1 cache memory 120, the L1 cache memory 120 requests the data from the L2 cache memory 130 (step S310). In step S320, the control circuit 132 checks whether the storage circuit 136 stores the data requested by the L1 cache memory 120. Assuming the storage circuit 136 stores it (that is, the L2 cache memory hits), the control circuit 132 returns the data to the L1 cache memory 120 (step S330). The L1 cache memory 120 then evicts one line of data to the L2 cache memory 130 (step S340). In step S340, the control circuit 132 must check the tags of the storage circuit 136 and write the line data into an appropriate location in the storage circuit 136. Because accessing the storage circuit 136 is relatively time-consuming, step S340 may prevent the control circuit 132 from processing the next access command immediately, which stalls the processor 110.

In view of the shortcomings of the prior art, one object of the present disclosure is to provide a cache memory and a method for managing a cache memory that improve the performance of an electronic device.

The present disclosure provides a cache memory that includes a first-level cache memory, a second-level cache memory, and a register. The first-level cache memory includes a first control circuit. The second-level cache memory includes a second control circuit. The register is coupled to the first control circuit and the second control circuit. The first control circuit and the second control circuit refer to a register value of the register to respectively control the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode.

The present disclosure also provides a method for managing a cache memory. The cache memory includes a first-level cache memory, a second-level cache memory, and a register. The method includes the following steps: reading a register value of the register; and controlling, according to the register value, the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode.

The features, implementation, and effects of the present disclosure are described in detail below through embodiments, with reference to the drawings.

100, 400, 70: electronic device

110, 410, 72: processor

120, 420, 724, 734: L1 cache memory

130, 430, 74: L2 cache memory

140, 440: system memory

132, 432, 7241, 7341, 742: control circuit

136, 436, 7242, 7342, 746: storage circuit

434, 744: buffer circuit

720, 730: core

722, 732: processing unit

76: register

S210~S260, S310~S340, S510~S580, S610~S640: steps

FIG. 1 is an architecture diagram of a conventional electronic device; FIG. 2 is a partial flowchart of the conventional electronic device operating in the inclusive mode; FIG. 3 is a partial flowchart of the conventional electronic device operating in the exclusive mode; FIG. 4 is an architecture diagram of an embodiment of the electronic device of the present disclosure; FIG. 5 is a flowchart of an embodiment of the cache memory management method of the present disclosure; FIG. 6 is a flowchart of an embodiment of step S540 of FIG. 5; and FIG. 7 is an architecture diagram of another embodiment of the electronic device of the present disclosure.

The technical terms used in the following description follow the customary usage of the technical field. Where this specification explains or defines a term, that explanation or definition governs the interpretation of the term.

The present disclosure covers a cache memory and a method for managing a cache memory. Because some components of the cache memory may individually be known components, details of known components are omitted below to the extent that doing so does not affect the sufficiency of disclosure or the enablement of the device embodiments. In addition, part or all of the flow of the management method may take the form of software and/or firmware and may be carried out by the cache memory of the present disclosure or an equivalent device. To the extent that it does not affect the sufficiency of disclosure or the enablement of the method embodiments, the description of the method embodiments below focuses on the steps rather than the hardware.

FIG. 4 is an architecture diagram of an embodiment of the electronic device of the present disclosure. The electronic device 400 includes a processor 410, an L1 cache memory 420, an L2 cache memory 430, and a system memory 440. The L2 cache memory 430 includes a control circuit 432, a buffer circuit 434, and a storage circuit 436. The buffer circuit 434 stores data in a first-in first-out (FIFO) manner, whereas the storage circuit 436 does not. In some embodiments, the capacity of the buffer circuit 434 is smaller than that of the storage circuit 436, so the control circuit 432 can access the buffer circuit 434 faster than it can access the storage circuit 436. The storage circuit 436 stores a plurality of tags and a plurality of data items corresponding to those tags. The data structure of the storage circuit 436 is well known to those of ordinary skill in the art and is not described further. The buffer circuit 434 may be implemented with SRAM or with registers (for example, flip-flops), and the storage circuit 436 is implemented with SRAM. The L1 cache memory 420 and the L2 cache memory 430 can operate in the inclusive mode or the exclusive mode.

FIG. 5 is a flowchart of an embodiment of the cache memory management method of the present disclosure. The flow of FIG. 5 applies to both the inclusive mode and the exclusive mode. When the control circuit 432 obtains target data from the L1 cache memory 420 or the system memory 440 and is to store it, the control circuit 432 writes the target data into the buffer circuit 434 without checking the tags of the storage circuit 436 (step S510). Next, the control circuit 432 determines whether the L2 cache memory 430 is in an idle state (step S520). If it is not, the control circuit 432 further determines whether another piece of target data needs to be written into the L2 cache memory 430 (step S530). If so, the control circuit 432 writes that data into the buffer circuit 434 (step S510); if not, the control circuit 432 looks up and/or returns data, which includes accessing the buffer circuit 434 and/or the storage circuit 436 (step S540). After step S540, the flow returns to step S520.

When the L2 cache memory 430 is in the idle state (yes in step S520), the control circuit 432 determines whether the buffer circuit 434 is empty (step S550). If the buffer circuit 434 stores no data (yes in step S550), the flow returns to step S520. If the buffer circuit 434 is not empty (no in step S550), the control circuit 432 finds storage space in the storage circuit 436 (step S560), then reads the target data out of the buffer circuit 434 and writes it into the storage circuit 436 (step S570). In other words, steps S560 and S570 move the target data from the buffer circuit 434 to the storage circuit 436. After the move, the target data exists only in the storage circuit 436 and no longer in the buffer circuit 434; that is, the buffer circuit 434 and the storage circuit 436 never hold the same line of data at the same time. Once step S570 is done, the control circuit 432 has finished writing the target data into the L2 cache memory 430 (step S580), and the flow returns to step S520.
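
The write path of FIG. 5 can be sketched in software as follows. This is only a minimal illustrative model, not the patented circuit: the class name `L2Cache`, its methods, and the dictionary-based storage are assumptions made for the sketch, and the eviction choice in step S560 is simplified to "oldest inserted line" because the text leaves the algorithm open.

```python
from collections import deque


class L2Cache:
    """Toy model of the L2 cache of FIG. 4: a FIFO buffer in front of tagged storage."""

    def __init__(self, storage_lines=4):
        self.buffer = deque()          # buffer circuit 434: (tag, data) in arrival order
        self.storage = {}              # storage circuit 436: tag -> data
        self.storage_lines = storage_lines

    def write(self, tag, data):
        # Step S510: append to the buffer without checking the storage tags.
        self.buffer.append((tag, data))

    def drain_one(self):
        # Steps S550-S580: meant to be called only while the L2 cache is idle.
        if not self.buffer:            # S550: buffer empty, nothing to move
            return False
        tag, data = self.buffer.popleft()
        if tag not in self.storage and len(self.storage) >= self.storage_lines:
            # S560: no free space, so evict a line (oldest inserted, an assumption).
            victim = next(iter(self.storage))
            del self.storage[victim]
        self.storage[tag] = data       # S570: the line now lives only in storage
        return True                    # S580: write into the L2 cache completed


l2 = L2Cache(storage_lines=2)
l2.write(0x10, "A")
l2.write(0x20, "B")
pending = len(l2.buffer)               # two lines waiting in the buffer
while l2.drain_one():                  # idle period: move lines in FIFO order
    pass
```

The point of the split is that `write` never touches the tags, so the requester sees a short, fixed-cost operation, while the tag search is deferred to `drain_one`.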

In step S560, the storage space may be unoccupied space or space occupied by data that is about to be evicted. The control circuit 432 can identify the data to be evicted according to an algorithm (for example, least recently used, LRU) and the tags in the storage circuit 436.
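
As one possible reading of step S560, the sketch below picks a victim with an LRU policy. The function name `find_storage_space` and the use of an `OrderedDict` to track recency are assumptions for illustration; the text names LRU only as an example and does not describe how recency is tracked in hardware.

```python
from collections import OrderedDict


def find_storage_space(lines, capacity, new_tag):
    """Return the tag to evict before storing new_tag, or None if no eviction is needed.

    `lines` is an OrderedDict ordered from least to most recently used.
    """
    if new_tag in lines or len(lines) < capacity:
        return None                     # reuse the matching line or an unoccupied one
    return next(iter(lines))            # least recently used tag


lines = OrderedDict([(0xA, "a"), (0xB, "b"), (0xC, "c")])
lines.move_to_end(0xA)                  # a hit on 0xA makes it most recently used
victim = find_storage_space(lines, capacity=3, new_tag=0xD)
```

Here the storage is full, so the least recently used tag (`0xB` after the hit on `0xA`) is selected for eviction.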

As the flow of FIG. 5 shows, the buffer circuit 434 may hold several pieces of target data at once, and the control circuit 432 reads them out in first-in first-out order and writes them into the storage circuit 436. In some embodiments, data in the buffer circuit 434 has the same format as data in the storage circuit 436 (for example, both are in line-data format), which simplifies step S570.

Because in step S510 the control circuit 432 does not need to check the tags of the storage circuit 436 to find suitable storage space (whether empty space or space occupied by data about to be evicted), step S510 can in theory complete in one system clock cycle. By comparison, because the control circuit 432 must check the tags before writing target data into the storage circuit 436, writing the target data directly into the storage circuit 436 takes at least two system clock cycles, depending on the size of the storage circuit 436. In other words, the buffer circuit 434 speeds up the L2 cache memory 430.
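
To make the cycle comparison concrete, the short calculation below uses the figures from the paragraph above: one cycle per buffered write and a lower bound of two cycles per direct storage write. The burst length of eight writes and the assumption that the buffer never fills are illustrative only.

```python
BUFFER_WRITE_CYCLES = 1    # step S510, per the text
STORAGE_WRITE_CYCLES = 2   # lower bound given in the text; grows with storage size


def visible_write_cycles(num_writes, cycles_per_write):
    """Cycles the requester waits for back-to-back L2 writes."""
    return num_writes * cycles_per_write


# Hypothetical burst of 8 writes, assuming the buffer has room for all of them.
buffered = visible_write_cycles(8, BUFFER_WRITE_CYCLES)
direct = visible_write_cycles(8, STORAGE_WRITE_CYCLES)
saved = direct - buffered
```

Under these assumptions the buffered path costs 8 visible cycles against at least 16 for direct writes; the deferred tag checks still happen, but during idle periods.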

The idle state of step S520 covers two cases: (1) the control circuit 432 has no pending read or write operations; and (2) the L2 cache memory 430 has missed, during the period from when the control circuit 432 requests the data from the system memory 440 until the system memory 440 replies. Because one access to the system memory 440 usually takes far more system clock cycles than the control circuit 432 needs to write data into the storage circuit 436, the control circuit 432 has ample time in case (2) to perform steps S560 and S570.
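
The two idle cases can be expressed as a simple predicate, together with a rough bound on how many buffered lines fit into one memory access. Both function names and the latency figures (60 cycles for system memory, 3 cycles per storage write) are hypothetical values chosen for the sketch, not numbers from the text.

```python
def l2_is_idle(pending_operations, waiting_for_system_memory):
    """Step S520: idle if nothing is pending, or while a miss is outstanding."""
    return pending_operations == 0 or waiting_for_system_memory


def lines_movable_during_miss(memory_latency_cycles, storage_write_cycles):
    """Upper bound on buffer-to-storage moves that fit inside one memory access."""
    return memory_latency_cycles // storage_write_cycles


idle_no_work = l2_is_idle(pending_operations=0, waiting_for_system_memory=False)
idle_on_miss = l2_is_idle(pending_operations=1, waiting_for_system_memory=True)
busy = l2_is_idle(pending_operations=2, waiting_for_system_memory=False)
# Hypothetical latencies: 60 cycles for system memory, 3 cycles per storage write.
movable = lines_movable_during_miss(memory_latency_cycles=60, storage_write_cycles=3)
```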

In summary, whether the L2 cache memory 430 misses in the inclusive mode or hits in the exclusive mode, the operation of the L2 cache memory 430 takes only one system clock cycle from the perspective of the processor 410. The processor 410 therefore does not stall, which substantially improves the performance of the electronic device 400.

FIG. 6 is a flowchart of an embodiment of step S540 of FIG. 5. When the L1 cache memory 420 misses and requests data from the L2 cache memory 430, the control circuit 432 checks whether the buffer circuit 434 or the storage circuit 436 stores the target data (step S610). On a hit (that is, the buffer circuit 434 or the storage circuit 436 holds the target data; yes in step S620), the control circuit 432 reads the target data and returns it to the L1 cache memory 420 (step S630). On a miss (that is, neither the buffer circuit 434 nor the storage circuit 436 holds the target data; no in step S620), the control circuit 432 requests the data from the system memory 440 (step S640).
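
The lookup of FIG. 6 has to consult both the buffer and the storage, because a recently written line may not have been moved yet. A minimal sketch follows; the function name `lookup`, the list and dictionary representations, and the rule that the newest buffered entry for a tag wins are assumptions, since the text does not say whether the buffer can hold two entries with the same tag.

```python
def lookup(buffer, storage, tag):
    """Steps S610-S640: return (hit, data) for a request from the L1 cache.

    `buffer` is a list of (tag, data) pairs in arrival order; `storage` maps tag -> data.
    """
    # S610: check the buffer circuit; newest entry first (an assumption).
    for buffered_tag, data in reversed(buffer):
        if buffered_tag == tag:
            return True, data           # S620 yes -> S630: reply to the L1 cache
    # S610: check the storage circuit.
    if tag in storage:
        return True, storage[tag]       # S620 yes -> S630
    return False, None                  # S620 no -> S640: request from system memory


buffer = [(0x10, "A")]
storage = {0x20: "B"}
hit_in_buffer = lookup(buffer, storage, 0x10)
hit_in_storage = lookup(buffer, storage, 0x20)
miss = lookup(buffer, storage, 0x30)
```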

FIG. 7 is an architecture diagram of another embodiment of the electronic device of the present disclosure. The electronic device 70 includes a processor 72, an L2 cache memory 74, and a register 76. The processor 72 includes a core 720 and a core 730. The core 720 includes a processing unit 722 and an L1 cache memory 724. The L1 cache memory 724 includes a control circuit 7241 and a storage circuit 7242. The core 730 includes a processing unit 732 and an L1 cache memory 734. The L1 cache memory 734 includes a control circuit 7341 and a storage circuit 7342. In short, the processor 72 has a multi-core architecture in which the core 720 and the core 730 each have their own L1 cache memory (724 and 734, respectively) and share the L2 cache memory 74. The L2 cache memory 74 includes a control circuit 742, a buffer circuit 744, and a storage circuit 746. The functions of the control circuit 742, the buffer circuit 744, and the storage circuit 746 are similar to those of the control circuit 432, the buffer circuit 434, and the storage circuit 436, respectively, and are not described again. The control circuit 7241, the control circuit 7341, and the control circuit 742 are coupled to the register 76 and can read the register value stored in it.

The control circuit 7241 of the L1 cache memory 724, the control circuit 7341 of the L1 cache memory 734, and the control circuit 742 of the L2 cache memory 74 refer to the register value of the register 76 to respectively control the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 to operate in the inclusive mode or the exclusive mode. In other words, the L1 and L2 cache memories switch between the inclusive mode and the exclusive mode in a programmable manner. As a result, the operating mode of the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 does not have to be fixed at the design stage of the electronic device 70; instead, the user can set the register value of the register 76 after the circuit is completed, according to the actual application (that is, adjust it dynamically). In some embodiments, the register 76 may be a control register of the processor 72.
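
The shared-register arrangement can be modeled as several controllers reading one value. In the sketch below, the class names `ModeRegister` and `ControlCircuit` are hypothetical, and the encoding (1 for inclusive, 0 for exclusive) simply follows the example values mentioned in the application examples of this description; a real design could use any encoding.

```python
from enum import Enum


class CacheMode(Enum):
    EXCLUSIVE = 0   # "second value" in the application examples
    INCLUSIVE = 1   # "first value" in the application examples


class ModeRegister:
    """Stands in for register 76: one value read by every control circuit."""

    def __init__(self, value=1):
        self.value = value


class ControlCircuit:
    def __init__(self, name, register):
        self.name = name
        self.register = register

    def mode(self):
        # Each control circuit derives its mode from the shared register value.
        return CacheMode(self.register.value)


reg = ModeRegister(value=1)
circuits = [ControlCircuit(n, reg) for n in ("7241", "7341", "742")]
before = [c.mode() for c in circuits]
reg.value = 0                        # software reprograms the register
after = [c.mode() for c in circuits]
```

Because all three controllers read the same register, a single write switches the two L1 caches and the L2 cache together, which keeps the hierarchy in a consistent mode.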

The following are application examples of the electronic device 70.

Example 1: When the core 720 and the core 730 perform parallel processing (that is, execute the same program), the register value of the register 76 can be set to a first value (for example, 1), so that the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 operate in the inclusive mode.

Example 2: When the core 720 and the core 730 execute a first program and a second program, respectively, and the two programs share instructions and/or data, the register value of the register 76 can be set to the first value (for example, 1), so that the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 operate in the inclusive mode.

Example 3: When the core 720 and the core 730 execute a first program and a second program, respectively, and the two programs do not share instructions and/or data (that is, they are independent programs), the register value of the register 76 can be set to a second value (for example, 0), so that the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 operate in the exclusive mode.

In Examples 1 and 2, the inclusive mode helps reduce the number of data movements (that is, it raises the hit rate), so the performance of the electronic device 70 improves. In Example 3, the exclusive mode lets the L1 cache memory 724, the L1 cache memory 734, and the L2 cache memory 74 together store more instructions and/or data, so the performance of the electronic device 70 likewise improves.
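
The three examples amount to a small decision rule that system software could apply before programming the register. The function below is a sketch of that rule only; its name, its two boolean inputs, and the constants are assumptions, and how software would actually detect sharing between programs is outside what the text describes.

```python
INCLUSIVE_VALUE = 1   # "first value" in the examples
EXCLUSIVE_VALUE = 0   # "second value" in the examples


def choose_register_value(same_program, share_instructions_or_data):
    """Pick the register value for two cores according to Examples 1 to 3."""
    if same_program or share_instructions_or_data:
        return INCLUSIVE_VALUE      # Examples 1 and 2
    return EXCLUSIVE_VALUE          # Example 3


example_1 = choose_register_value(same_program=True, share_instructions_or_data=True)
example_2 = choose_register_value(same_program=False, share_instructions_or_data=True)
example_3 = choose_register_value(same_program=False, share_instructions_or_data=False)
```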

In some embodiments, the control circuit 432, the control circuit 7241, the control circuit 7341, and the control circuit 742 described above may each be implemented as a finite state machine made up of a plurality of logic circuits.

Because those of ordinary skill in the art can understand the implementation details and variations of the method embodiments from the disclosure of the device embodiments, repeated description is omitted here to avoid redundancy, to the extent that doing so does not affect the disclosure requirements or the enablement of the method embodiments. Note that in the foregoing drawings, the shapes, sizes, and proportions of the elements and the order of the steps are illustrative only; they are provided to help those of ordinary skill in the art understand the present disclosure and are not intended to limit it.

Although embodiments of the present disclosure are described above, they are not intended to limit the present disclosure. Those of ordinary skill in the art may vary the technical features of the present disclosure based on its explicit or implicit content, and all such variations may fall within the scope of patent protection sought. In other words, the scope of patent protection is defined by the claims of this specification.


Claims (10)

1. A cache memory, comprising: a first-level cache memory comprising a first control circuit; a second-level cache memory comprising a second control circuit; and a register coupled to the first control circuit and the second control circuit; wherein the first control circuit and the second control circuit refer to a register value of the register to respectively control the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode.

2. The cache memory of claim 1, wherein the second-level cache memory is shared by a first core and a second core of a processor, the first core executes a first program and the second core executes a second program, and when the first program and the second program are programs that share instructions or data, the register value corresponds to the inclusive mode.

3. The cache memory of claim 1, wherein the second-level cache memory is shared by a first core and a second core of a processor, the first core executes a first program and the second core executes a second program, and when the first program and the second program are not programs that share instructions or data, the register value corresponds to the exclusive mode.

4. The cache memory of claim 1, wherein the second-level cache memory further comprises: a storage circuit; and a buffer circuit configured to store data in a first-in first-out manner; wherein the second control circuit is configured to find a storage space in the storage circuit and write the data into the storage space.

5. The cache memory of claim 4, wherein when the second control circuit checks whether the second-level cache memory contains target data, the second control circuit checks whether the storage circuit and the buffer circuit store the target data.

6. The cache memory of claim 4, wherein when target data is written into the second-level cache memory, the second control circuit writes the target data into the buffer circuit without checking the storage circuit.

7. The cache memory of claim 4, wherein the capacity of the buffer circuit is smaller than the capacity of the storage circuit.

8. A method for managing a cache memory, applied to a cache memory comprising a first-level cache memory, a second-level cache memory, and a register, the method comprising: reading a register value of the register; and controlling, according to the register value, the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode.

9. The method of claim 8, wherein the second-level cache memory is shared by a first core and a second core of a processor, the first core executes a first program and the second core executes a second program, and when the first program and the second program are programs that share instructions or data, the register value corresponds to the inclusive mode.

10. The method of claim 8, wherein the second-level cache memory is shared by a first core and a second core of a processor, the first core executes a first program and the second core executes a second program, and when the first program and the second program are not programs that share instructions or data, the register value corresponds to the exclusive mode.
TW109116171A 2019-01-24 2019-01-24 Cache and method for managing cache TWI739430B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109116171A TWI739430B (en) 2019-01-24 2019-01-24 Cache and method for managing cache

Publications (2)

Publication Number Publication Date
TW202038103A TW202038103A (en) 2020-10-16
TWI739430B true TWI739430B (en) 2021-09-11

Family

ID=74091282

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109116171A TWI739430B (en) 2019-01-24 2019-01-24 Cache and method for managing cache

Country Status (1)

Country Link
TW (1) TWI739430B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070143550A1 (en) * 2005-12-19 2007-06-21 Intel Corporation Per-set relaxation of cache inclusion
US20130042070A1 (en) * 2011-08-08 2013-02-14 Arm Limited Shared cache memory control
US20170371784A1 (en) * 2016-06-24 2017-12-28 Advanced Micro Devices, Inc. Targeted per-line operations for remote scope promotion
US20180357175A1 (en) * 2017-06-13 2018-12-13 Alibaba Group Holding Limited Cache devices with configurable access policies and control methods thereof
