TWI703440B - Memory system, processing system thereof and method for operating memory stack - Google Patents

Memory system, processing system thereof and method for operating memory stack

Info

Publication number
TWI703440B
Authority
TW
Taiwan
Prior art keywords
memory
tag
command
tag value
address
Prior art date
Application number
TW106118496A
Other languages
Chinese (zh)
Other versions
TW201804328A (en)
Inventor
泰勒 斯托克斯戴爾
張牧天
鄭宏忠
Original Assignee
南韓商三星電子股份有限公司
Priority date
Filing date
Publication date
Application filed by 南韓商三星電子股份有限公司
Publication of TW201804328A
Application granted
Publication of TWI703440B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/0292 User address space allocation, e.g. contiguous or non contiguous base addressing using tables or multilevel address translation means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1024 Latency reduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60 Details of cache memory
    • G06F2212/6028 Prefetching based on hints or prefetch instructions
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C2207/00 Indexing scheme relating to arrangements for writing information into, or reading information out from, a digital store
    • G11C2207/10 Aspects relating to interfaces of memory device to external buses
    • G11C2207/107 Serial-parallel conversion of data or prefetch
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C2211/00 Indexing scheme relating to digital stores characterized by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C2211/56 Indexing scheme relating to G11C11/56 and sub-groups for features not covered by these groups
    • G11C2211/564 Miscellaneous aspects
    • G11C2211/5643 Multilevel memory comprising cache storage devices
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A system and method for using high bandwidth memory as cache memory. A high bandwidth memory may include a logic die, and, stacked on the logic die, a plurality of dynamic random access memory dies. The logic die may include a cache manager, that may interface to external systems through an external interface conforming to the JESD235A standard, and that may include an address translator, a command translator, and a tag comparator. The address translator may translate each physical address received through the external interface into a tag value, a tag address in the stack of memory dies, and a data address in the stack of memory dies. The tag comparator may determine whether a cache hit or cache miss has occurred, according to whether the tag value generated by the address translator matches the tag value stored at the tag address.

Description

Memory system, processing system thereof and method for operating memory stack

This application claims priority to and the benefit of U.S. Provisional Application No. 62/367062, filed on July 26, 2016 and titled "HBM WITH IN-MEMORY CACHE MANAGER", the entire content of which is incorporated herein by reference.

One or more aspects of embodiments according to the present invention relate to high-bandwidth memory, and more particularly to a system and method for using high-bandwidth memory as cache memory.

High Bandwidth Memory (HBM) is a high-performance random access memory interface for three-dimensional (3D) stacked dynamic random access memory (DRAM). Prior art systems that use high-bandwidth memory as cache memory may have a cache manager on the host to perform cache management functions. Such an arrangement may burden the host and the interface between the host and the high-bandwidth memory.

Therefore, there is a need for an improved system and method for using high-bandwidth memory as cache memory.

Aspects of embodiments of the present invention relate to a system and method for using high-bandwidth memory as cache memory. The high-bandwidth memory may include a logic die and a plurality of dynamic random access memory dies stacked on the logic die. The logic die may include a cache manager, which may interface to external systems through an external interface conforming to the JESD235A standard and which may include an address translator, a command translator, and a tag comparator. The address translator may translate each physical address received through the external interface into a tag value, a tag address in the stack of memory dies, and a data address in the stack of memory dies. The tag comparator may determine whether a cache hit or a cache miss has occurred according to whether the tag value generated by the address translator matches the tag value stored at the tag address. The command generator may generate commands. For example, when a write command is received through the external interface, the command generator may first generate a command to fetch the tag value to determine whether a cache hit has occurred and, if it has, may then generate a write command.

According to an embodiment of the present invention, there is provided a memory system including: a memory stack including a plurality of memory dies; and a logic die, the memory dies being stacked on and connected to the logic die, the logic die having an external interface that interfaces with the memory system, and the logic die including a cache manager.

In one embodiment, the cache manager includes an address translator configured to translate an address received through the external interface to produce: a first tag value; a data address in the memory stack; and a tag address in the memory stack.

In one embodiment, the cache manager includes a command translator configured to generate, in response to a read command received through the external interface: a first command to fetch a tag; and a second command to fetch a data word.

In one embodiment, the cache manager includes a tag comparator to generate a cache hit signal, the cache hit signal having: a value of true when the first tag value equals the value resulting from execution of the first command; and a value of false when the first tag value does not equal the value resulting from execution of the first command.

In one embodiment, the tag comparator is configured to send the cache hit signal through a first pin of the external interface.

In one embodiment, the cache manager is configured to send the value of a dirty bit and/or the value of a valid bit through a second pin of the external interface.

In one embodiment, the cache manager is configured to send the cache hit signal through the first pin during a first interval of time, and to send the value of a dirty bit through the first pin during a second interval of time.

In one embodiment, the cache manager is configured to execute the first command through a pseudo channel.

In one embodiment, the cache manager includes a mode selector indicating that either a parallel mode of operation or a serial mode of operation is selected, the cache manager being configured: to execute the first command and the second command in parallel when the mode selector indicates that the parallel mode of operation is selected; and to execute the first command before executing the second command when the mode selector indicates that the serial mode of operation is selected.

In one embodiment, the mode selector is configured to be controlled through the external interface.

In one embodiment, for any two data words stored in a first bank in the memory dies and accessible through different pseudo channels, the two corresponding tags are stored in different subarrays of the memory stack.

In one embodiment, the external interface is configured to operate in compliance with Joint Electron Device Engineering Council (JEDEC) standard JESD235A.

According to an embodiment of the present invention, there is provided a processing system including: a host processor; a first memory system connected to the host processor; and a second memory system connected to the host processor, the first memory system including: a memory stack including a plurality of memory dies; and a logic die, the memory dies being stacked on and connected to the logic die, the logic die having an external interface that interfaces with the first memory system, the logic die including a cache manager, and the second memory system being configured as backing store for the first memory system.

In one embodiment, the cache manager includes an address translator configured to translate an address received from the host processor through the external interface into: a first tag value; a data address in the memory stack; and a tag address in the memory stack.

In one embodiment, the cache manager includes a command translator configured to generate, in response to a read command received from the host processor through the external interface: a first command to fetch a tag; and a second command to fetch a data word.

In one embodiment, the cache manager includes a tag comparator to generate a cache hit signal, the cache hit signal having: a value of true when the first tag value equals the value resulting from execution of the first command; and a value of false when the first tag value does not equal the value resulting from execution of the first command.

In one embodiment, the external interface is configured to operate in compliance with Joint Electron Device Engineering Council standard JESD235A.

According to an embodiment of the present invention, there is provided a method for operating a memory stack, the memory stack including a plurality of memory dies and a logic die, the memory dies being stacked on and connected to the logic die, the logic die having an external interface that interfaces with a memory system, the method including: translating, by the logic die, an address received through the external interface to produce: a first tag value; a data address in the memory stack; and a tag address in the memory stack.

In one embodiment, the method includes generating, by the logic die, in response to a read command received through the external interface: a first command to fetch a tag; and a second command to fetch a data word.

In one embodiment, the method includes generating, by the logic die, a cache hit signal, the cache hit signal having: a value of true when the first tag value equals the value resulting from execution of the first command; and a value of false when the first tag value does not equal the value resulting from execution of the first command.

105: high-bandwidth memory stack

110: logic die

115: dynamic random access memory stack

205: host processor

210: core

215: level 1 cache

220: level 2 cache

225: first memory controller

230: off-chip main memory

235: second memory controller

240: silicon interposer

245: high-bandwidth memory interface

305: command and address lines

310: command translator

315: address translator

320: tag comparator

325: scheduler

330: data buffer

Bk0~Bk7: banks

Ch0~Ch7, PCh0, PCh1: channels

Sa0~Sa14: subarrays

These and other features and advantages of the present invention will be appreciated and understood with reference to the specification, claims, and appended drawings, in which:

FIG. 1 is a perspective view of a high-bandwidth memory stack according to an embodiment of the present invention.

FIG. 2A is a block diagram of a processing system employing a high-bandwidth memory stack as a level 3 cache, according to an embodiment of the present invention.

FIG. 2B is a block diagram of a high-bandwidth memory stack according to an embodiment of the present invention.

FIG. 3 is a block diagram of a high-bandwidth memory stack according to an embodiment of the present invention.

FIG. 4A is a storage diagram according to an embodiment of the present invention.

FIG. 4B is a storage diagram according to an embodiment of the present invention.

The detailed description set forth below in connection with the appended drawings is intended as a description of exemplary embodiments of a high-bandwidth memory with an in-memory cache manager provided in accordance with the present invention, and is not intended to represent the only forms in which the present invention may be constructed or utilized. The description sets forth the features of the present invention in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and structures may be accomplished by different embodiments that are also intended to be encompassed within the spirit and scope of the invention. As denoted elsewhere herein, like element numbers are intended to indicate like elements or features.

High-bandwidth memory (HBM) is a high-performance three-dimensional (3D) stacked dynamic random access memory (DRAM). Second-generation high-bandwidth memory may include up to 8 dies per stack and provide pin transfer rates of up to 2 gigatransfers per second (GT/s). The interface may include 8 channels, each 128 bits wide, for a total access width of 1024 bits. Second-generation high-bandwidth memory may reach a memory bandwidth of 256 gigabytes per second (GB/s) per package, and may have a storage capacity of up to 8 gigabytes (GB) per package. The interface of second-generation high-bandwidth memory may comply with a standard accepted by the Joint Electron Device Engineering Council (JEDEC), namely standard JESD235A.
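The 256 GB/s figure follows directly from the channel count, channel width, and pin rate quoted above. A quick arithmetic check, with constant and function names chosen here purely for illustration:

```python
# Peak bandwidth of the interface, computed from the figures in the text.
CHANNELS = 8                         # channels per stack
BITS_PER_CHANNEL = 128               # data width of each channel
TRANSFERS_PER_SECOND = 2_000_000_000 # 2 GT/s per pin


def peak_bandwidth_gb_per_s(channels=CHANNELS,
                            width_bits=BITS_PER_CHANNEL,
                            rate=TRANSFERS_PER_SECOND):
    """Return peak bandwidth in gigabytes per second (1 GB = 1e9 bytes)."""
    total_width_bits = channels * width_bits   # 1024-bit wide access
    bits_per_second = total_width_bits * rate
    return bits_per_second / 8 / 1e9           # bits -> bytes -> gigabytes


print(peak_bandwidth_gb_per_s())  # 256.0
```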

Referring to FIG. 1, the physical configuration of a high-bandwidth memory stack 105 may include a logic die 110 and a three-dimensional dynamic random access memory, or "dynamic random access memory stack", 115, which includes a plurality of dynamic random access memory dies (e.g., 8 such dies) stacked on top of the logic die 110. Interconnections are formed within the stack with through-silicon vias (TSVs). Prior art high-bandwidth memory stacks may include, in the logic die, connections and signal conditioning circuitry, as a result of which the dynamic random access memory channel interfaces are presented, essentially unaltered, to the host processor at the external interface of the high-bandwidth memory.

Referring to FIG. 2A, the high-bandwidth memory stack 105 may be connected to a host processor 205 (e.g., a central processing unit (CPU) or a graphics processing unit (GPU)). The host processor 205 may include a plurality of cores 210, each having a respective level 1 cache 215. A level 2 cache 220 may be connected to the level 1 caches 215, and a first memory controller 225 may provide an interface to the off-chip main memory 230. A second memory controller 235 may provide an interface to the high-bandwidth memory stack 105. The high-bandwidth memory stack 105 may include a cache manager (CM) in its logic die. The host processor 205 may employ the high-bandwidth memory stack 105, with its integrated cache manager, as a level 3 cache (or, e.g., as a level 4 cache in a system that also has a level 3 cache). The high-bandwidth memory interface 245 may be a JESD235A-compliant interface, i.e., it may provide the conductors and signaling protocols specified by JESD235A.

Referring to FIG. 2B, in some embodiments the high-bandwidth memory stack 105 may include the logic die 110, which may be connected to the dynamic random access memory in the dynamic random access memory stack 115 through eight internal interfaces, referred to as channels and shown in FIG. 2B as Ch0 to Ch7.

Referring to FIG. 3, in one embodiment the high-bandwidth memory stack 105 includes, as mentioned above, the dynamic random access memory stack 115 and the logic die 110, and the logic die 110 may include a plurality of components that implement a cache manager. Command and address lines 305 in the high-bandwidth memory interface 245 may be connected, in the logic die 110 of the high-bandwidth memory stack 105, to a command translator 310 and to an address translator 315. The command and address lines 305 may include, for example, 6 row command/address lines and 8 column command/address lines for each of the 8 channels of the high-bandwidth memory interface 245.

In operation, the address translator 315 may periodically receive a physical memory address on which a command (e.g., a read command or a write command) is to be executed. The address translator 315 may then translate the address into a tag value, a tag address, and a data address. The tag value may be used to determine whether a "cache hit" has occurred, i.e., whether the address in the cache is currently allocated to the address received through the high-bandwidth memory interface 245. For example, the cache manager may read (or "fetch") the tag value at the tag address and compare it (e.g., with the tag comparator 320, described in further detail below) to the tag value produced by the address translator 315. If the tag value formed (by the address translator 315) from the received physical address matches the tag value stored at the tag address in the dynamic random access memory stack 115, a cache hit has occurred, i.e., the address in the cache is currently allocated to the received address in the physical memory space of the processor. If the tag value formed from the received physical address does not match the tag value stored at the tag address in the dynamic random access memory stack 115 (a situation referred to herein as a "cache miss"), the address in the cache is not currently allocated to the received address in the physical memory space of the processor.
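The text does not say how the physical address is split into the three outputs. The following is a minimal sketch that assumes a direct-mapped cache with 64-byte data words and a tag region placed directly after the data region; every constant, the `Translated` type, and the `translate` helper are hypothetical and chosen only to make the three outputs concrete.

```python
from typing import NamedTuple

# All of the following sizes and base addresses are assumptions for illustration.
LINE_BYTES = 64            # assumed size of one cached data word
NUM_SETS = 1 << 20         # assumed number of cache entries (direct-mapped)
TAG_BYTES = 4              # assumed storage per tag entry
DATA_BASE = 0x0000_0000    # assumed start of the data region in the DRAM stack
TAG_BASE = 0x0400_0000     # assumed start of the tag region (NUM_SETS * LINE_BYTES)


class Translated(NamedTuple):
    tag_value: int     # value to compare against the stored tag
    tag_address: int   # where the stored tag lives in the DRAM stack
    data_address: int  # where the cached data word lives in the DRAM stack


def translate(physical_address: int) -> Translated:
    """Split a host physical address into tag value, tag address, data address."""
    line = physical_address // LINE_BYTES   # drop the offset within the data word
    index = line % NUM_SETS                 # selects the cache entry
    tag_value = line // NUM_SETS            # remaining upper bits identify the line
    return Translated(
        tag_value=tag_value,
        tag_address=TAG_BASE + index * TAG_BYTES,
        data_address=DATA_BASE + index * LINE_BYTES,
    )
```

Under these assumptions, two physical addresses that differ by `NUM_SETS * LINE_BYTES` map to the same tag address and data address but produce different tag values, which is the case the tag comparison is there to distinguish.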

The tag comparator 320 may be used to make the comparison, i.e., to compare the tag value formed from the received physical address to the tag value stored at the tag address. The output of the tag comparator 320 may be a signal referred to as the cache hit signal, which has a value of true (e.g., a binary value of 1) when a cache hit has occurred and a value of false (e.g., a binary value of 0) when a cache miss has occurred.

The command translator 310 may generate, in response to commands received through the high-bandwidth memory interface 245, commands to be executed on the dynamic random access memory stack 115. For example, if the command received through the high-bandwidth memory interface 245 is a read command, the command translator 310 may generate a command to read the data word stored at the data address and a command to read the tag value stored at the tag address. Each of these commands may include (e.g., consist of) a plurality of micro-operations; for example, a read command may include an activate operation followed by a read operation.

If the command received through the high-bandwidth memory interface 245 is a write command, the command translator 310 may first generate a command to read the tag stored at the tag address and, if the tag value matches the tag value generated by the address translator 315, may then generate a write command to write the data to the dynamic random access memory stack 115. As such, including the cache manager in the logic die 110 may make it unnecessary for the host processor 205 to implement the cache manager, and may improve efficiency by allowing the second command (the write command) to be generated in the logic die 110 rather than in the host processor 205.
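The write handling described above can be sketched as follows. This is an illustrative model only: the DRAM stack is represented by a plain dictionary, the mnemonics `ACT`, `RD`, and `WR` are labels chosen here for the micro-operations, and `handle_write` is a hypothetical helper rather than anything named in the source.

```python
from typing import Dict, List, Tuple

MicroOp = Tuple[str, int]  # (operation mnemonic, DRAM address); labels are illustrative


def fetch(address: int) -> List[MicroOp]:
    """An internal read command: activate the row, then read."""
    return [("ACT", address), ("RD", address)]


def store(address: int) -> List[MicroOp]:
    """An internal write command: activate the row, then write."""
    return [("ACT", address), ("WR", address)]


def handle_write(dram: Dict[int, object], tag_value: int, tag_address: int,
                 data_address: int, data: object) -> Tuple[bool, List[MicroOp]]:
    """Translate a host write into internal commands; return (hit, micro-ops issued)."""
    issued = fetch(tag_address)               # first command: fetch the stored tag
    hit = dram.get(tag_address) == tag_value  # tag comparison (cache hit signal)
    if hit:
        issued += store(data_address)         # second command: generated only on a hit
        dram[data_address] = data
    return hit, issued
```

In this model the host issues a single write and learns the outcome from the hit signal; the decision to issue the second command never leaves the logic die.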

The tag value and the data may be fetched in parallel or serially. For example, when the cache manager operates in parallel mode, the tag value and the data may be fetched in parallel, the fetched tag value may be compared by the tag comparator 320 with the tag value generated by the address translator 315, and, if the two tag values match, the data read may be returned through the high-bandwidth memory interface 245. Otherwise, a cache miss may be signaled to the host processor 205 through the high-bandwidth memory interface 245, as discussed in further detail below. When the cache manager operates in serial mode, the tag value may be fetched first, the fetched tag value may be compared by the tag comparator 320 with the tag value generated by the address translator 315, and, if the two tag values match, the data may be fetched and returned through the high-bandwidth memory interface 245. Otherwise, a cache miss may be signaled to the host processor 205 through the high-bandwidth memory interface 245. Serial mode may be more energy-efficient than parallel mode, because in serial mode the data fetch is performed only on a cache hit. Parallel mode may be faster than serial mode, because in parallel mode the tag fetch and the data fetch may be performed concurrently. The cache manager may include a mode selector (e.g., a bit in a control register) that controls whether the cache manager operates in parallel mode or in serial mode. The mode selector may be controlled through the high-bandwidth memory interface 245 (e.g., by the host processor 205), for example by a command that writes a new value to the control register.
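The trade-off between the two modes can be made concrete with a small model. The sketch below is an assumption-laden toy: the class, its fields, and the `data_fetches` counter are invented for illustration, and "parallel" is modeled simply as issuing the data read regardless of the tag comparison.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    PARALLEL = 0
    SERIAL = 1


@dataclass
class LookupResult:
    hit: bool
    data: Optional[bytes]
    data_fetches: int  # number of data reads issued to the DRAM stack


class ToyCacheManager:
    """Hypothetical model; `tags` and `data` stand in for DRAM contents."""

    def __init__(self, mode):
        self.mode = mode  # plays the role of the mode-selector bit
        self.tags = {}
        self.data = {}

    def read(self, tag_addr, data_addr, expected_tag):
        fetches = 0
        fetched = None
        if self.mode is Mode.PARALLEL:
            # Data fetch is issued alongside the tag fetch, hit or miss.
            fetched = self.data.get(data_addr)
            fetches += 1
        hit = self.tags.get(tag_addr) == expected_tag
        if hit and self.mode is Mode.SERIAL:
            # Data fetch is issued only after the tags are found to match.
            fetched = self.data.get(data_addr)
            fetches += 1
        return LookupResult(hit, fetched if hit else None, fetches)
```

On a miss, the serial manager reports zero data fetches while the parallel manager reports one, which is the energy difference the text describes; the latency benefit of parallel mode is not modeled here.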

The cache manager may store two bits of metadata with each tag: (i) a bit (the "valid bit") that indicates whether the corresponding data word is valid or invalid, and (ii) a bit (the "dirty bit") that indicates whether the corresponding data word is dirty or clean. Data in the cache are considered dirty if the data in the cache have been updated without the corresponding data in the backing storage having been updated (and clean otherwise), and data in the cache are considered invalid if the data in the backing storage have been updated without the corresponding data in the cache having been updated (and valid otherwise).
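The two metadata bits and the events that change them can be sketched as bit flags. The bit positions and function names below are assumptions chosen for the example; the text specifies only what each bit means, not how it is encoded.

```python
VALID = 0b01  # hypothetical bit position for the valid bit
DIRTY = 0b10  # hypothetical bit position for the dirty ("bad") bit


def cache_line_written(meta):
    """Cache copy updated, backing storage not yet updated: mark dirty."""
    return meta | DIRTY


def written_back(meta):
    """Backing storage brought up to date with the cache: mark clean."""
    return meta & ~DIRTY


def backing_storage_updated(meta):
    """Backing storage changed without the cache being updated: mark invalid."""
    return meta & ~VALID


def describe(meta):
    """Return the state of both bits as human-readable labels."""
    return ("valid" if meta & VALID else "invalid",
            "dirty" if meta & DIRTY else "clean")
```

For instance, `describe(cache_line_written(VALID))` yields `("valid", "dirty")`.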

In addition, a command received through the high-bandwidth memory interface 245 may result in a true or false value of the cache hit signal (corresponding to a cache hit or a cache miss, respectively), as described above. Upon completion of any command received by the high-bandwidth memory stack 105 through the high-bandwidth memory interface 245, the cache manager may generate three values: one each for the cache hit signal, the dirty bit, and the valid bit. These values may be conveyed to the host processor 205 through the high-bandwidth memory interface 245 using one or more pins of the interface that are not used for other functions, i.e., pins that are neither (i) any of the 212 pins of each of the eight channels nor (ii) any of the RESET, TEMP[2:0], or CATTRIP pins. Pins defined by the JESD235A standard as reserved for future use (RFU pins) may be used. For example, in one embodiment, an RFU pin is used to transmit the cache hit signal during the first data cycle of the data burst that follows execution of the command, and to transmit the value of the dirty bit during the next data cycle of the data burst. The cache hit signal and the dirty bit may be sent concurrently with the data of the data burst. In some embodiments, a plurality of RFU pins is used to transmit the cache hit signal, the dirty bit, and/or the valid bit.
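The single-pin embodiment, in which one reserved pin carries the cache hit signal in the first data cycle and the dirty ("bad") bit in the next, amounts to time-multiplexing. A rough sketch follows; the function names, the default burst length of 4, and the choice to drive 0 in the remaining cycles are all assumptions, not details from the text or from JESD235A.

```python
def rfu_pin_sequence(hit, dirty, burst_length=4):
    """Per-cycle values on one reserved pin during a data burst.

    Cycle 0 carries the cache hit signal and cycle 1 the dirty bit.
    The burst length and the idle value 0 are assumptions of this sketch.
    """
    if burst_length < 2:
        raise ValueError("at least two data cycles are needed")
    seq = [0] * burst_length
    seq[0] = int(hit)
    seq[1] = int(dirty)
    return seq


def decode_rfu_pin(seq):
    """Host-side view: recover (hit, dirty) from the sampled pin values."""
    return bool(seq[0]), bool(seq[1])
```

With several reserved pins available, the same values could instead be sent in a single cycle, one per pin.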

To invalidate data in the cache, the host processor 205 may send an "invalidate" signal to the logic die 110 through the high-bandwidth memory interface 245, together with the address of the data to be invalidated. The "invalidate" signal may be sent through the same pin of the high-bandwidth memory interface 245 (e.g., an RFU pin) that is used to send the cache hit signal to the host processor 205. The address may be sent over the command/address (CA) bus of the high-bandwidth memory interface 245. With this information, the logic die 110 may then update the corresponding valid bit stored in the dynamic random access memory stack 115.

In some embodiments, RFU pins are also used to maintain cache coherency, e.g., coherency between the backing storage (e.g., the off-chip main memory 230) and the cache in a multi-core system such as that of FIG. 2A, in which both the level 3 cache (implemented in the high-bandwidth memory stack 105) and the backing storage may be read and/or modified by each of the cores 210.

A scheduler 325 in the logic die 110 may receive commands and addresses from the command translator 310 and from the address translator 315, respectively, and schedule the execution of these commands on the dynamic random access memory stack 115. A data buffer 330 in the logic die 110 may be used to store data temporarily after they are received through the high-bandwidth memory interface 245 and/or after they are read from the dynamic random access memory stack 115. Both the scheduler 325 and the data buffer 330 may help to accommodate variations in the rates at which (i) commands are received through the high-bandwidth memory interface 245, (ii) commands are executed on the dynamic random access memory stack 115, (iii) data are sent or received through the high-bandwidth memory interface 245, and (iv) data are read from or written to the dynamic random access memory stack 115.

The JESD235A standard provides for operation in a pseudo channel mode, in which each of the eight 128-bit channels operates as two semi-independent pseudo channels. In this mode, each pair of pseudo channels shares the channel's row and column command bus as well as its CK and CKE inputs, but the two pseudo channels decode and execute commands individually. In some embodiments, this mode is used to store tag values. Each tag value may be a 32-bit word, so that using the entire 128-bit-wide data bus (the "DQ" bus) of a channel to read a tag value from, or write a tag value to, the dynamic random access memory stack 115 is significantly inefficient (e.g., 25% efficiency). In pseudo channel mode, only half of this bus width (i.e., 64 bits) may be used to read or write a tag value, resulting in higher efficiency (e.g., 50%).
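The 25% and 50% figures follow directly from the ratio of tag width to bus width, as the short calculation below shows. The function and constant names are chosen for this example only.

```python
def transfer_efficiency(payload_bits, bus_bits):
    """Fraction of the data bus width occupied by useful payload."""
    return payload_bits / bus_bits


TAG_BITS = 32             # one tag value, per the text
FULL_CHANNEL_BITS = 128   # full DQ bus of one channel
PSEUDO_CHANNEL_BITS = 64  # half of the bus, used in pseudo channel mode

full_channel = transfer_efficiency(TAG_BITS, FULL_CHANNEL_BITS)
pseudo_channel = transfer_efficiency(TAG_BITS, PSEUDO_CHANNEL_BITS)
```

Here `full_channel` evaluates to 0.25 and `pseudo_channel` to 0.5, matching the efficiencies quoted above.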

Each bank of the dynamic random access memory (banks Bk0 to Bk7) may include (i.e., consist of) 16 subarrays. Referring to FIG. 4A, in some embodiments each data word may be 64 bytes (512 bits) long and each tag value may be 4 bytes (32 bits) long, i.e., the ratio of the length of a data word to the length of a tag value may be 16:1. Tag values and data may be accessed through different channels; for example, data may be accessed through channels 1 to 15 (channel PCh1) and tag values may be accessed through channel 0 (channel PCh0). In such an embodiment, the data may be accessed in parallel, but tag accesses may experience bank conflicts, as shown by the dashed ellipses. Accordingly, referring to FIG. 4B, in one embodiment the tags are stored in different subarrays Sa0 to Sa14, which may be accessed in parallel using a method referred to herein as subarray level parallelism (SALP). In such an embodiment, tag accesses may be processed concurrently, i.e., tag access conflicts may be avoided even when the accesses are to the same bank.
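One way to picture the subarray-level placement is a rule that gives the tags of each data channel their own subarray, so that tag accesses landing in the same bank never share a subarray. The mapping below (data channel k to subarray k minus 1) is purely an illustrative assumption; the text states only that the tags are stored in different subarrays Sa0 to Sa14.

```python
DATA_WORD_BYTES = 64
TAG_BYTES = 4
SUBARRAYS_PER_BANK = 16
DATA_CHANNELS = range(1, 16)  # channels 1..15 carry data; channel 0 carries tags


def tag_subarray(data_channel):
    """Hypothetical placement rule: one tag subarray per data channel."""
    if data_channel not in DATA_CHANNELS:
        raise ValueError("data channels are numbered 1 to 15 in this sketch")
    return data_channel - 1  # Sa0 .. Sa14


def tags_conflict(channel_a, channel_b):
    """Two tag accesses to the same bank collide only if they share a subarray."""
    return tag_subarray(channel_a) == tag_subarray(channel_b)
```

Under this rule the 15 data channels map onto 15 distinct subarrays, leaving one of the 16 subarrays in the bank unused by tags.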

In view of the foregoing, a high-bandwidth memory stack 105 that includes a cache manager in the logic die 110 may have several advantages over a prior-art high-bandwidth memory stack lacking a cache manager. Using a high-bandwidth memory stack 105 that includes a cache manager may make it unnecessary to include a cache manager in the host processor 205, potentially reducing the size, cost, and power consumption of the host processor, or freeing the same resources for other uses in the host processor 205. Moreover, conditional execution may be faster when it is performed entirely within the high-bandwidth memory stack 105 than when it involves the host processor 205. For example, in the event of a cache hit, a write command may be executed more quickly when the cache hit determination is made in the high-bandwidth memory stack 105 than when it is made in a cache manager in the host processor 205.

It will be understood that, although the terms "first," "second," "third," etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections should not be limited by these terms. These terms are used only to distinguish one element, component, region, layer, or section from another. Thus, a first element, component, region, layer, or section discussed below could be termed a second element, component, region, layer, or section without departing from the spirit and scope of the inventive concept.

Spatially relative terms such as "beneath," "below," "lower," "under," "above," and "upper" may be used herein for ease of description to describe the relationship of one element or feature to another element or feature as illustrated in the figures. It will be understood that such spatially relative terms are intended to encompass different orientations of the device in use or in operation, in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "below," "beneath," or "under" other elements or features would then be oriented "above" the other elements or features. Thus, the example terms "below" and "under" can encompass both an orientation of above and an orientation of below. The device may be otherwise oriented (e.g., rotated 90 degrees or at other orientations), and the spatially relative descriptors used herein should be interpreted accordingly. In addition, it will also be understood that when a layer is referred to as being "between" two layers, it can be the only layer between the two layers, or one or more intervening layers may also be present.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the inventive concept. As used herein, the terms "substantially," "about," and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. As used herein, the term "major component" means a component constituting at least half, by weight, of a composition, and the term "major portion," when applied to a plurality of items, means at least half of the items.

As used herein, the singular forms "a" and "an" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one of," when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Further, the use of "may" when describing embodiments of the inventive concept refers to "one or more embodiments of the present invention." Also, the term "exemplary" is intended to refer to an example or illustration. As used herein, the terms "use," "using," and "used" may be considered synonymous with the terms "utilize," "utilizing," and "utilized," respectively.

It will be understood that when an element or layer is referred to as being "on," "connected to," "coupled to," or "adjacent to" another element or layer, it may be directly on, directly connected to, directly coupled to, or directly adjacent to the other element or layer, or one or more intervening elements or layers may be present. In contrast, when an element or layer is referred to as being "directly on," "directly connected to," "directly coupled to," or "immediately adjacent to" another element or layer, there are no intervening elements or layers present.

Any numerical range recited herein is intended to include all sub-ranges of the same numerical precision subsumed within the recited range. For example, a range of "1.0 to 10.0" is intended to include all sub-ranges between (and including) the recited minimum value of 1.0 and the recited maximum value of 10.0, that is, having a minimum value equal to or greater than 1.0 and a maximum value equal to or less than 10.0, such as, for example, 2.4 to 7.6. Any maximum numerical limitation recited herein is intended to include all lower numerical limitations subsumed therein, and any minimum numerical limitation recited in this specification is intended to include all higher numerical limitations subsumed therein.

Although exemplary embodiments of a high-bandwidth memory with an in-memory cache manager have been specifically described and illustrated herein, many modifications and variations will be apparent to those skilled in the art. Accordingly, it is to be understood that a high-bandwidth memory with an in-memory cache manager constructed according to the principles of this invention may be embodied other than as specifically described herein. The invention is also defined in the following claims and their equivalents.

105: High-bandwidth memory stack

110: Logic die

115: Dynamic random access memory stack

Claims (20)

1. A memory system, comprising: a memory stack comprising a plurality of memory dies; and a logic die, the memory dies being stacked on, and connected to, the logic die, the logic die having an external interface for interfacing with a host processor, the logic die comprising a cache manager configured to: translate an address, received through the external interface as part of a command, to generate a first tag value and a tag address; fetch a second tag value from a location in the memory stack corresponding to the tag address; compare the first tag value with the second tag value; and execute the command based on the comparison of the first tag value with the second tag value.

2. The memory system of claim 1, wherein the cache manager comprises an address translator configured to translate the address received through the external interface to generate: the first tag value; a data address in the memory stack; and the tag address in the memory stack.

3. The memory system of claim 2, wherein the cache manager comprises a command translator configured to generate, in response to a read command received through the external interface: a first command for fetching the second tag value; and a second command for fetching a data word.

4. The memory system of claim 3, wherein the cache manager comprises a tag comparator for generating a cache hit signal, the cache hit signal having a true value when the first tag value equals the second tag value, and a false value when the first tag value does not equal the second tag value.

5. The memory system of claim 4, wherein the tag comparator is configured to send the cache hit signal through a first pin of the external interface.

6. The memory system of claim 5, wherein the cache manager is configured to send a value of a dirty bit and/or a value of a valid bit through a second pin of the external interface.

7. The memory system of claim 5, wherein the cache manager is configured to send the cache hit signal through the first pin during a first interval of time, and to send a value of a dirty bit through the first pin during a second interval of time.

8. The memory system of claim 3, wherein the cache manager is configured to execute the first command through a pseudo channel.

9. The memory system of claim 3, wherein the cache manager comprises a mode selector indicating a selection of either a parallel mode of operation or a serial mode of operation, the cache manager being configured to: execute the first command and the second command in parallel when the mode selector indicates a selection of the parallel mode of operation; and execute the first command before executing the second command when the mode selector indicates a selection of the serial mode of operation.

10. The memory system of claim 9, wherein the mode selector is configured to be controlled through the external interface.

11. The memory system of claim 1, wherein, for any two data words stored in a first bank in the memory dies and accessible through different pseudo channels, two corresponding tags are stored in different subarrays of the memory stack.

12. The memory system of claim 1, wherein the external interface is configured to operate in compliance with Joint Electron Device Engineering Council (JEDEC) standard JESD235A.

13. A processing system, comprising: a host processor; a first memory system connected to the host processor; and a second memory system connected to the host processor, the first memory system comprising: a memory stack comprising a plurality of memory dies; and a logic die, the memory dies being stacked on, and connected to, the logic die, the logic die having an external interface for interfacing with the host processor, the logic die comprising a cache manager configured to: translate an address, received through the external interface as part of a command, to generate a first tag value and a tag address; fetch a second tag value from a location in the memory stack corresponding to the tag address; compare the first tag value with the second tag value; and execute the command based on the comparison of the first tag value with the second tag value, the second memory system being configured as backing storage for the first memory system.

14. The processing system of claim 13, wherein the cache manager comprises an address translator configured to translate the address received from the host processor through the external interface to generate: the first tag value; a data address in the memory stack; and the tag address in the memory stack.

15. The processing system of claim 14, wherein the cache manager comprises a command translator configured to generate, in response to a read command received from the host processor through the external interface: a first command for fetching the second tag value; and a second command for fetching a data word.

16. The processing system of claim 15, wherein the cache manager comprises a tag comparator for generating a cache hit signal, the cache hit signal having a true value when the first tag value equals the second tag value, and a false value when the first tag value does not equal the second tag value.

17. The processing system of claim 13, wherein the external interface is configured to operate in compliance with Joint Electron Device Engineering Council (JEDEC) standard JESD235A.

18. A method for operating a memory stack, the memory stack comprising a plurality of memory dies and a logic die, the memory dies being stacked on, and connected to, the logic die, the logic die having an external interface for interfacing with a host processor, the method comprising: translating, by the logic die, an address received through the external interface as part of a command, to generate a first tag value, a data address in the memory stack, and a tag address in the memory stack; fetching a second tag value from a location in the memory stack corresponding to the tag address; comparing the first tag value with the second tag value; and executing the command based on the comparison of the first tag value with the second tag value.

19. The method of claim 18, further comprising generating, by the logic die, in response to a read command received through the external interface: a first command for fetching the second tag value; and a second command for fetching a data word.

20. The method of claim 19, further comprising generating, by the logic die, a cache hit signal, the cache hit signal having a true value when the first tag value equals the second tag value, and a false value when the first tag value does not equal the second tag value.
TW106118496A 2016-07-26 2017-06-05 Memory system, processing system thereof and method for operating memory stack TWI703440B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662367062P 2016-07-26 2016-07-26
US62/367,062 2016-07-26
US15/272,339 2016-09-21
US15/272,339 US10180906B2 (en) 2016-07-26 2016-09-21 HBM with in-memory cache manager

Publications (2)

Publication Number Publication Date
TW201804328A TW201804328A (en) 2018-02-01
TWI703440B true TWI703440B (en) 2020-09-01

Family

ID=61010011

Family Applications (1)

Application Number Title Priority Date Filing Date
TW106118496A TWI703440B (en) 2016-07-26 2017-06-05 Memory system, processing system thereof and method for operating memory stack

Country Status (5)

Country Link
US (1) US10180906B2 (en)
JP (1) JP2018018513A (en)
KR (1) KR102404643B1 (en)
CN (1) CN107656878B (en)
TW (1) TWI703440B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10936221B2 (en) 2017-10-24 2021-03-02 Micron Technology, Inc. Reconfigurable memory architectures
US10628354B2 (en) * 2017-12-11 2020-04-21 Micron Technology, Inc. Translation system for finer grain memory architectures
KR102505913B1 (en) * 2018-04-04 2023-03-07 삼성전자주식회사 Memory module and memory system including memory module)
KR102605205B1 (en) * 2018-07-25 2023-11-24 에스케이하이닉스 주식회사 Memory device and processing system
CN110928810B (en) * 2018-09-20 2023-11-14 三星电子株式会社 Outward expansion high bandwidth storage system
KR20200065762A (en) * 2018-11-30 2020-06-09 에스케이하이닉스 주식회사 Memory system
US10915451B2 (en) * 2019-05-10 2021-02-09 Samsung Electronics Co., Ltd. Bandwidth boosted stacked memory
US11216385B2 (en) * 2019-05-15 2022-01-04 Samsung Electronics Co., Ltd. Application processor, system-on chip and method of operating memory management unit
US11226816B2 (en) * 2020-02-12 2022-01-18 Samsung Electronics Co., Ltd. Systems and methods for data placement for in-memory-compute
KR20220127601A (en) * 2021-03-11 2022-09-20 삼성전자주식회사 Memory system, memory device of performing internal processing operations with interface, operation method of the memory device having the same
US11901035B2 (en) 2021-07-09 2024-02-13 Taiwan Semiconductor Manufacturing Company, Ltd. Method of differentiated thermal throttling of memory and system therefor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6594728B1 (en) * 1994-10-14 2003-07-15 Mips Technologies, Inc. Cache memory with dual-way arrays and multiplexed parallel output
US20140181417A1 (en) * 2012-12-23 2014-06-26 Advanced Micro Devices, Inc. Cache coherency using die-stacked memory device with logic die

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2297398B (en) * 1995-01-17 1999-11-24 Advanced Risc Mach Ltd Accessing cache memories
JPH11212868A (en) * 1998-01-28 1999-08-06 Oki Electric Ind Co Ltd Snoop cash memory control system
US8341352B2 (en) * 2007-04-17 2012-12-25 International Business Machines Corporation Checkpointed tag prefetcher
KR101728067B1 (en) * 2010-09-03 2017-04-18 삼성전자 주식회사 Semiconductor memory device
KR20120079682A (en) * 2011-01-05 2012-07-13 삼성전자주식회사 Memory device having dram cache and system including the memory device
US20120221785A1 (en) * 2011-02-28 2012-08-30 Jaewoong Chung Polymorphic Stacked DRAM Memory Architecture
US20120297256A1 (en) 2011-05-20 2012-11-22 Qualcomm Incorporated Large Ram Cache
JP6012263B2 (en) * 2011-06-09 2016-10-25 株式会社半導体エネルギー研究所 Semiconductor memory device
US9753858B2 (en) * 2011-11-30 2017-09-05 Advanced Micro Devices, Inc. DRAM cache with tags and data jointly stored in physical rows
US9189399B2 (en) * 2012-11-21 2015-11-17 Advanced Micro Devices, Inc. Stack cache management and coherence techniques
US9053039B2 (en) 2012-12-21 2015-06-09 Advanced Micro Devices, Inc. Installation cache
US9477605B2 (en) 2013-07-11 2016-10-25 Advanced Micro Devices, Inc. Memory hierarchy using row-based compression
US9286948B2 (en) * 2013-07-15 2016-03-15 Advanced Micro Devices, Inc. Query operations for stacked-die memory device
CN104575584B (en) 2013-10-23 2018-11-30 钰创科技股份有限公司 System-in-package memory module with embedded memory
KR20150062646A (en) * 2013-11-29 2015-06-08 삼성전자주식회사 Electronic System and Operating Method of the same
US10078597B2 (en) * 2015-04-03 2018-09-18 Via Alliance Semiconductor Co., Ltd. System and method of distinguishing system management mode entries in a translation address cache of a processor

Also Published As

Publication number Publication date
US10180906B2 (en) 2019-01-15
US20180032437A1 (en) 2018-02-01
KR102404643B1 (en) 2022-06-02
KR20180012180A (en) 2018-02-05
CN107656878B (en) 2023-06-13
TW201804328A (en) 2018-02-01
CN107656878A (en) 2018-02-02
JP2018018513A (en) 2018-02-01

Similar Documents

Publication Publication Date Title
TWI703440B (en) Memory system, processing system thereof and method for operating memory stack
JP5464529B2 (en) Multi-mode memory device and method
JP6980912B2 (en) Swizzling in 3D stacked memory
US10838865B2 (en) Stacked memory device system interconnect directory-based cache coherence methodology
US10310976B2 (en) System and method for concurrently checking availability of data in extending memories
JP3807582B2 (en) Information processing apparatus and semiconductor device
US20140126274A1 (en) Memory circuit and method of operating the memory circuit
US8996818B2 (en) Bypassing memory requests to a main memory
JP7384806B2 (en) Scheduling memory requests for ganged memory devices
TWI634550B (en) Dram and access and operating method thereof
US20130191587A1 (en) Memory control device, control method, and information processing apparatus
KR20210063496A (en) Memory device including processing circuit, and electronic device including system on chip and memory device
CN107369473B (en) Storage system and operation method thereof
TWI553483B (en) Processor and method for accessing memory
Asifuzzaman et al. Demystifying the characteristics of high bandwidth memory for real-time systems
US11995005B2 (en) SEDRAM-based stacked cache system and device and controlling method therefor
US11928039B1 (en) Data-transfer test mode
US20240079036A1 (en) Standalone Mode
KR102343550B1 (en) Memory system using small active command
JP2009181221A (en) Memory control method