WO2024054316A1 - Hybrid memory architecture for advanced 3d systems - Google Patents

Hybrid memory architecture for advanced 3d systems

Info

Publication number
WO2024054316A1
WO2024054316A1 PCT/US2023/029044 US2023029044W WO2024054316A1 WO 2024054316 A1 WO2024054316 A1 WO 2024054316A1 US 2023029044 W US2023029044 W US 2023029044W WO 2024054316 A1 WO2024054316 A1 WO 2024054316A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
die
circuitry
stack
dies
Prior art date
Application number
PCT/US2023/029044
Other languages
French (fr)
Inventor
Divya Madapusi Srinivas Prasad
Niti Madan
Michael Ignatowski
Hyung-Dong Lee
Original Assignee
Advanced Micro Devices, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US18/199,837 (US20240088098A1)
Application filed by Advanced Micro Devices, Inc.
Publication of WO2024054316A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C5/00 Details of stores covered by group G11C11/00
    • G11C5/02 Disposition of storage elements, e.g. in the form of a matrix array
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/21 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C11/22 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using ferroelectric elements
    • G11C11/221 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using ferroelectric elements using ferroelectric capacitors
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/21 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C11/34 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
    • G11C11/40 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
    • G11C11/401 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells needing refreshing or charge regeneration, i.e. dynamic cells

Definitions

  • Embodiments of the invention generally relate to stacked memory dies having volatile and non-volatile based memory dies, and chip packages containing the same.
  • High bandwidth memory (HBM) and other stacked dynamic random-access memory (DRAM) memories have been proposed/enabled to alleviate off-chip memory access latency as well as increase memory density.
  • HBM high bandwidth memory
  • DRAM dynamic random-access memory
  • FeRAM ferro-electric random-access memory
  • MRAM magnetoresistive random-access memory
  • PCM phase-change memory
  • NVM non-volatile memory
  • SRAM static random-access memory
  • Non-volatile main memories such as FeRAMs, MRAMs, and volatile memories such as DRAMs (including HBM and other stacked variants of DRAM) are being considered and traded-off for achieving higher memory density, performance, and lower power.
  • DRAMs have been the most popular off-chip memory; however, even Double Data Rate 5 Synchronous DRAM (DDR5) has certain Performance-Power-Area (PPA) limitations from having to go off-chip to access data.
  • the typical DRAM bitcell consists of a one transistor and one capacitor (1T-1C) structure where the capacitor is formed by a dielectric layer sandwiched between conductor plates.
  • System interprocess communication (IPC) is often limited by DRAM bandwidth and latency, especially in memory-heavy workloads.
  • HBM has been introduced to provide increased bandwidth and memory density, allowing up to 8-12 layers of DRAM dies to be stacked on top of each other with an optional logic/memory interface die. This memory stack can either be connected to the CPU/GPU through silicon interposers (Fig. 1) or placed on top of the CPU/GPU themselves to provide superior connectivity and performance.
  • FeRAM is like 1T-1C DRAM, except that the capacitor is made of a ferroelectric material rather than the (linear) dielectric used in DRAM. Bits ‘0’ and ‘1’ are written as electric polarization orientations of the ferroelectric material in the dielectric. The benefit of this technology is refresh-free storage, which has the potential to offer more density and performance than DRAM.
  • MRAM, on the other hand, has a one transistor and one resistor (1T-1R) bitcell. Unlike DRAM and FeRAM, MRAM does not have a destructive read. However, MRAM is less reliable than FeRAM and has lower endurance and retention.
  • the memory technology is developed and “optimized” as an independent macro or for specific applications like deep neural networks (DNN), in the HBM case.
  • DNN deep neural networks
  • GPDDR graphics double data rate
  • DDR double data rate
  • More fine-grained optimizations of memory technology with logic technology and architecture are not deeply explored, and there is much to do to achieve superior performance and lower power products.
  • Non-linear power increase and diminishing improvement in performance and memory density from generation to generation require more design and co-optimization to alleviate the memory bottleneck.
  • High temperature memory dies such as those using non-volatile memory (NVM) technologies are in a memory stack with low temperature memory dies, such as those having volatile memory technologies.
  • the high temperature memory technologies could be used together, in some cases, on the same IC die as logic circuitry.
  • a memory stack is provided that includes a first memory IC die having high temperature memory circuitry, such as non-volatile memory, stacked below a second memory IC die.
  • the second memory IC die has low temperature memory circuitry, such as volatile memory circuitry.
  • in another example, a memory stack includes a first memory IC die stacked on a second memory IC die.
  • the first memory IC die includes memory circuitry that requires more frequent refresh rates as compared to the second memory IC die.
  • the first memory IC die includes memory circuitry operational at temperatures above 110 degrees Celsius without increased refresh rates as compared to operation at 95 degrees Celsius.
  • the second memory IC die includes memory circuitry requiring increased refresh rates at temperatures above 110 degrees Celsius as compared to operation at 95 degrees Celsius.
  • a memory stack includes a first memory IC die stacked on a second memory IC die.
  • the first memory IC die includes ferro-electric random-access memory (FeRAM).
  • the second memory IC die includes dynamic random-access memory (DRAM) circuitry.
  • a chip package having a memory stack mounted to a package substrate is provided.
  • the memory stack includes a plurality of first memory IC dies stacked on a second memory IC die.
  • the second memory IC die includes ferro-electric random-access memory (FeRAM) circuitry and optionally controller circuitry.
  • the second memory IC die is stacked on the package substrate.
  • the plurality of first memory IC dies includes DRAM circuitry.
  • the NVM technologies could be used together, in some cases on the same IC die as logic circuitry. Exploitation of specific properties of each of the technologies, in the stacked memory subsystem, can beneficially result in differentiated SoC performance.
  • a memory stack includes a first memory IC die having non-volatile memory (NVM) circuitry stacked below a second memory IC die.
  • the second memory IC die has volatile memory circuitry.
  • one IC memory die of the memory stack includes ferro-electric random-access memory (FeRAM) or static random-access memory (SRAM) circuitry, while another IC memory die of the memory stack includes volatile memory circuitry.
  • one IC memory die of the memory stack includes ferro-electric random-access memory (FeRAM) or static random-access memory (SRAM) circuitry, while another IC memory die of the memory stack includes dynamic random-access memory (DRAM) circuitry.
  • first processing in memory (PIM) circuitry is disposed in a second memory IC die of the memory stack while a second PIM circuitry is disposed in the third memory IC die of the memory stack.
  • PIM processing in memory
  • a memory stack includes a first memory IC die comprising first ferro-electric random-access memory (FeRAM) circuitry and first processing in memory (PIM) circuitry, a second memory IC die stacked on the first memory IC die, and a third memory IC die stacked on the first memory IC die.
  • the second memory IC die includes second FeRAM circuitry and second PIM circuitry.
  • the third memory IC die includes third FeRAM circuitry and third PIM circuitry.
  • a chip package includes a hybrid memory stack mounted on a substrate.
  • the hybrid memory stack includes both volatile and non-volatile memory IC dies.
  • Figure 1 depicts a chip package having a memory stack connected to a compute/processor die through an interposer.
  • Figure 2 depicts an IC die stack disposed on top of a compute/processor chip and a memory-interface/controller die.
  • Figure 3A depicts a high-temperature memory die stacked on top of a compute/processor chip and a memory-interface/controller die.
  • Figure 3B depicts a high-temperature memory, a compute/processor circuitry and memory-interface/controller circuitry integrated into a single integrated circuit (IC) die.
  • IC integrated circuit
  • Figure 4A is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die.
  • Figure 4B is a memory die stack disposed on top of a compute/processor chip, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die, wherein the high temperature memory integrated circuit die includes controller circuitry.
  • Figure 4C is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die having a buffer IC die disposed therebetween, the buffer IC including logic and non-volatile memory circuitry.
  • Figure 5A is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having processing in memory (PIM) circuitry with FeRAM and/or embedded DRAM based PIM local storage.
  • Figure 5B is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having additional PIM circuitry relative to FeRAM and/or embedded DRAM based PIM storage as compared to the memory die stack of Figure 5A.
  • Figure 5C is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having fine grained PIM circuitry with FeRAM and/or embedded DRAM based PIM local storage.
  • Figure 5D is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of high temperature memory integrated circuit dies, one or more of the high temperature memory integrated circuit dies having fine grained sub-bank level PIM circuitry with FeRAM based PIM local storage.
  • the disclosure herein addresses specific challenges of stacked DRAM subsystem with hybrid memory 3D organization and logic codesign.
  • the disclosed technology defines various methods, systems and devices to design compute- and application-aware advanced memory-based systems.
  • memory die stacks that utilize a mix of high and low operational temperature memory dies such that the high temperature die may function as a temperature buffer with adjacent heat generating logic dies.
  • memory die stacks that utilize a mix of volatile and non-volatile based memory dies, one example of which are stacked DRAM and FeRAM based memory die.
  • Figure 1 depicts a chip package 100 having a memory stack 102 connected to a compute/processor IC die 108 through an interposer 110. Any of the memory stacks described herein may be utilized in the chip package 100 depicted in Figure 1, or other suitable memory device.
  • the chip package 100 depicted in Figure 1 which may include any of the other memory stacks described below, also includes a package substrate 114 to which the interposer 110 is mounted.
  • the package substrate 114 of the chip package 100 may be coupled to a printed circuit board (PCB) 136 to form an electronic system 180, such as but not limited to the graphics card depicted in Figure 1.
  • PCB printed circuit board
  • the memory stack 102 generally includes at least one low temperature memory integrated circuit (LTMIC) die 104 stacked with at least one high temperature memory IC (HTMIC) die 106.
  • LTMIC low temperature memory integrated circuit
  • HTMIC high temperature memory IC
  • the space shown in Figure 1 between the dies 104, 106 is used to make solder connections (not shown) between the dies 104, 106.
  • the dies 104, 106 may be stacked directly in contact with each other using hybrid bonding techniques.
  • the HTMIC and LTMIC dies 106, 104 may be comparatively defined by at least one of the following definitions.
  • a HTMIC die 106 has a memory refresh requirement that is longer than another memory die in the memory stack 102, the memory die having the shorter memory refresh requirement comparatively referred to as the LTMIC die 104.
  • a HTMIC die 106 has a longer period between refresh (i.e., a longer refresh period) than recommended by Joint Electron Device Engineering Council (JEDEC) Solid State Technology Association standard JESD21-C, the memory die having memory refresh requirements following JESD21-C comparatively referred to as the LTMIC die 104.
  • JEDEC Joint Electron Device Engineering Council
  • the HTMIC die 106 has a memory refresh requirement that exceeds 60 microseconds between memory refreshes, the memory die requiring a memory refresh every 60 microseconds comparatively referred to as the LTMIC die 104.
  • the HTMIC die 106 is non-volatile memory, while the LTMIC die 104 is volatile memory.
  • the LTMIC die 104 can be defined as a memory die that can operate at temperatures up to 110 degrees Celsius (i.e., operational temperature) without having to increase the refresh rate. At temperatures above 110 degrees Celsius, the LTMIC die 104 requires increased refresh rates (as compared to operation at 95 degrees Celsius).
  • An example of a LTMIC die 104 is a dynamic random-access memory (DRAM) die.
  • Other examples of LTMIC dies 104 include volatile memory dies such as static random-access memory (SRAM), among others.
  • the HTMIC die 106 has an operational temperature that is greater than the operational temperature of the LTMIC die 104.
  • the HTMIC die 106 is a memory die that can operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius).
  • An example of a HTMIC die 106 is a ferroelectric random-access memory (FeRAM) die.
  • Other examples of HTMIC die 106 include non-volatile memory dies such as magnetoresistive random-access memory (MRAM), phase-change memory (PCM), flash memory, and resistive random-access memory (RRAM), among others.
  • the memory stack 102 may optionally include at least one controller IC die 120 stacked with the LTMIC die 104 and the HTMIC die 106.
  • the IC dies 104, 106, 120 may be electrically and mechanically connected by solder balls and/or hybrid bonding techniques, such that the functional circuitries within the IC dies 104, 106, 120 can communicate with each other and/or transmit data signals, power and/or ground therethrough.
  • the functional circuitries within the IC memory dies 104, 106 are arranged into multiple memory banks. Each bank has multiple rows and each row has multiple columns. Residing at each unique memory location within a bank is a memory cell. The memory cell may be addressed using its unique identifying row and column location within a particular bank of the memory dies 104, 106.
  • the functional circuitries within the IC dies 104, 106, 120 are coupled to the functional circuitry of the compute/processor IC die 108 via routings 112 formed in the interposer 110.
  • the routings 112 of the interposer 110 are connected to the functional circuitries within the IC dies 104, 106, 120 via solder connections 118.
  • the routings 112 of the interposer 110 also connect the functional circuitries of the IC dies 108, 120 to routing 122 formed in the package substrate 114 via the solder connections 118.
  • Solder balls 116 are utilized to connect the routings 122 of the package substrate 114 with routing 124 formed in the PCB 136.
  • the chip stack 102 and IC die 108 may be mounted directly to the package substrate 114.
  • the IC die 120 is generally a heat generating device. That is, the IC die 120 generates heat when in use. As the performance of the LTMIC die 104 may be diminished due to the heat generated by the IC die 120, performance of the chip package 100 is enhanced by separating the LTMIC die 104 from the heat generating IC die 120 by one or more HTMIC dies 106. Since the HTMIC die 106 is generally more heat resistant than the LTMIC dies 104, the HTMIC die 106 can be located adjacent the heat generating IC die 120 without significant reduction in performance while enabling the LTMIC dies 104 that are significantly spaced from the heat generating IC die 120 to also maintain robust levels of performance.
  • the chip stack 102 includes one HTMIC die 106 disposed between a plurality of LTMIC dies 104 and the controller IC die 120. Although four IC memory dies 104, 106 are illustrated in the single chip stack 102 shown in Figure 1, the number of LTMIC dies 104 and the number of HTMIC dies 106 comprising the chip stack 102 may vary from one to as many as can fit within the chip package 100. Additionally, although only one chip stack 102 is shown in Figure 1, one or more additional chip stacks may be disposed adjacent the chip stack 102 shown in Figure 1.
  • the controller IC die 120 includes functional logic circuitry that provides commands that enable the row and column of each bank of the memory dies 104, 106 to be addressed.
  • the controller IC die 120 controls the write/read operation from each memory bank.
  • the HBM memory can be put in low power modes by row address bus to save power on the I/O drivers.
  • clocks can be gated when in power-down or self-refresh modes.
  • Figure 2 depicts an IC die stack 202 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the IC die stack 202 may be used in place of the IC die stack 102 and mounted on the interposer 110 or the package substrate 114 in the chip package 100 and electronic system 180 illustrated in Figure 1.
  • the IC die stack 202 that can be used in the electronic system 180 includes a high-bandwidth memory (HBM) cube 204 stacked on top of the compute IC die 108 and the controller IC die 120.
  • the compute IC die (e.g., compute chip) 108 may be a central processing unit (CPU), graphics processing unit (GPU), field programmable gate array (FPGA), or other accelerator.
  • the HBM cube 204 includes a plurality of LTMIC dies 104 that are vertically stacked. Each LTMIC die 104 includes functional circuitry configured as memory circuitry 220.
  • the memory circuitry 220 is configured as volatile memory circuitry such as dynamic random-access memory (DRAM) and static random-access memory (SRAM), among others.
  • the LTMIC dies 104 may be referred to as DRAM IC dies.
  • the stacked DRAM IC dies may be connected by solder connections, hybrid bonding or other suitable connection.
  • Although three DRAM IC dies are illustrated in the HBM cube 204 depicted in Figure 2, the HBM cube 204 may alternatively have more or fewer than three DRAM IC dies.
  • the HTMIC die 106 includes functional circuitry configured as memory circuitry 222.
  • the memory circuitry 222 is configured as non-volatile random-access memory, such as ferroelectric random-access memory (FeRAM), magnetoresistive random-access memory (MRAM), phase-change memory (PCM), flash memory, and resistive random-access memory (RRAM), among others.
  • RRAM resistive random-access memory
  • the HTMIC die 106 may be referred to as a FeRAM IC die.
  • the memory circuitry 222 of the HTMIC die 106 has an operational temperature that is greater than the operational temperature of the memory circuitry 220 of the LTMIC die 104.
  • the memory circuitry 222 of the HTMIC die 106 is memory circuitry that can operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius).
  • the memory circuitry 220 of the LTMIC die 104 is memory circuitry that cannot operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius).
  • the HTMIC die 106 is a non-volatile random-access memory IC die that has faster refresh speed as compared to the LTMIC dies 104 that have volatile random-access memory.
  • the faster refresh speed enables faster communication with the controller IC die 120, which beneficially reduces latency within the IC die stack 202, and ultimately, the chip package 100 and the electronic system 180.
  • the HTMIC die 106 has memory circuitry 222 configured as FeRAM circuitry, while the LTMIC dies 104 have memory circuitry 220 configured as DRAM circuitry.
  • Figure 3A depicts another example of a memory stack 302 that includes a HTMIC die 106 stacked on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • a plurality of LTMIC dies 104 may be stacked on the HTMIC die 106 shown in Figure 3A, such as in the manner illustrated in Figure 2, to complete the memory stack 302.
  • the IC die stack 302 may be used in place of the IC die stack 102 and mounted on the interposer 110 or the package substrate 114 in the chip package 100 and electronic system 180 illustrated in Figure 1.
  • the HTMIC die 106 includes memory circuitry 222 configured as FeRAM circuitry.
  • the HTMIC die 106 is vertically stacked on top of the memory-interface/controller IC die 120, while the memory-interface/controller IC die 120 is vertically stacked on top of the compute/processor IC die 108.
  • the memory circuitry 222 of the HTMIC die 106 includes FeRAM or other suitable circuitry which is compatible with the controller circuitry 224 of the memory-interface/controller IC die 120.
  • the interconnections between the dies 104, 106, 108, 120 may be made by solder connections, hybrid bonding or other suitable connection.
  • Figure 3B depicts another example of a memory stack 322 that includes memory circuitry 222, a compute/processor circuitry 324 and memory-interface/controller circuitry 224 integrated into a single HTMIC die 106.
  • a plurality of LTMIC dies 104 may be stacked on the HTMIC die 106 shown in Figure 3B, such as in the manner illustrated in Figure 2, to complete the memory stack 322.
  • the additional dies stacked on the HTMIC dies 106 shown in Figure 3B may be one or more FeRAM dies, and/or one or more LTMIC dies 104 (such as DRAM IC dies), and/or one or more other type of memory dies.
  • the memory circuitry 222 is configured as FeRAM circuitry, which is compatible with the memory-interface/controller circuitry 224 so that the circuitries 222, 224 may be co-located within the same HTMIC die 106.
  • the FeRAM circuitry 222 is also compatible with the compute/processor circuitry 324 so that the circuitries 222, 324, 224 may be co-located within the same HTMIC die 106.
  • the memory circuitry 222 is disposed between the compute/processor circuitry 324 and the memory-interface/controller circuitry 224.
  • the compute/processor circuitry 324 may reside in an IC die neighboring the HTMIC die 106 that contains both the FeRAM and controller circuitries 222, 224.
  • stacked DRAM memory may be integrated with FeRAM based memory to form a hybrid memory stack or hybrid memory-logic assembly.
  • Hybrid memory and hybrid memory-logic assemblies utilize a mix of memory technologies that can be stacked, for example on top of a logic die.
  • a hybrid memory cube with a non-volatile memory (such as FeRAM and the like) IC die and a volatile memory (such as DRAM IC and the like) die has the HTMIC die 106 beneficially disposed closest to the logic IC die 120, as the HTMIC die 106 can tolerate higher heat dissipated from the logic die, while the LTMIC dies 104 disposed on top of the HTMIC die 106 could be placed closer to the heat spreader in the chip package (such as the chip package 100 depicted in Figure 1) to minimize temperature gradients and impact on performance and refresh rates associated with the LTMIC dies 104.
  • Figure 4A depicts an example of a memory die stack 400 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the memory die stack 400 includes at least one HTMIC die 106, such as a nonvolatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • Another alternative hybrid approach is to arrange the HTMIC and LTMIC dies 106, 104 within the same memory die stack 400 ranked by latency, with the fastest dies being closer to the memory-interface/controller IC die 120.
  • the ranked memory IC dies 104, 106 may include SRAM, DRAM, and non-volatile memory (NVM) IC dies arranged by latency all in the same memory die stack 400. This results in a hierarchical hardware managed cache for the LTMIC dies 104, such as DRAM IC dies, within the stacked memory cube, e.g., the memory die stack 400.
  • Figure 4B is a memory die stack 410 disposed on top of a compute/processor IC die 108.
  • the memory die stack 410 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • the HTMIC die 106 of the memory die stack 410 includes both memory circuitry 222 and controller circuitry 224.
  • the memory circuitry 222 of the HTMIC die 106 may include FeRAM (or other nonvolatile memory) circuitry and controller circuitry 224 integrated on the same die.
  • the IO/SA logic on each memory die 104 could be separated into a buffer IC die 422, to achieve higher performance and yield, and could also include FeRAM memory blocks that are logic compatible.
  • Figure 4C shows a memory die stack 420 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120, the memory die stack 420 including at least one HTMIC die 106, such as a non-volatile memory IC die, and at least one LTMIC die 104, such as a volatile memory die, having buffer IC dies 422 disposed therebetween.
  • the buffer IC die 422 includes both logic and non-volatile memory circuitry 224, 222.
  • the buffer IC die 422 may be hybrid bonded on each side to the other dies 104 comprising the HBM cube 204 of the die stack 420.
  • Memory stack and technology may be selected and designed in a “hierarchical” manner (hardware managed cache), using the faster memory or memories not requiring refresh closer to the logic IC die 120 to act as an “intermediate” layer that transfers data to denser, slower memories on the upper tiers of LTMIC dies 104, away from the logic IC die 120. This could help hide the latency and overhead due to the refresh needed for LTMIC dies 104, such as DRAM IC dies, on the top of the memory die stack 420.
  • Figures 5A through 5D illustrate some non-limiting examples of processing in memory (PIM) circuitry 502 utilized within a hybrid memory assembly, i.e., a memory die stack 500.
  • the PIM circuitry 502 includes processor or other logic circuitry integrated with memory circuitry 220/222 on a single IC die of the memory die stack 500.
  • the PIM circuitry 502 contains local storage circuitry 506.
  • the local storage circuitry 506 of the PIM circuitry 502 may be FeRAM and/or embedded DRAM (eDRAM).
  • eDRAM embedded DRAM
  • the FeRAM and/or eDRAM based local storage circuitry 506 generally are low leakage logic-compatible storage as compared to local logic-based high leakage registers. This allows area scaling of PIM circuitry 502 and reduced leakage compared to conventional PIM using registers in the logic-based logical storage. Thus, the amount of processing in memory may be increased within the same IC die area allocated for the PIM circuitry 502.
  • the memory die stack 500 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the memory die stack 500 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • One or more of the LTMIC dies 104 have processing in memory (PIM) circuitry 502 with FeRAM and/or embedded DRAM based PIM local storage circuitry 506.
  • a memory die stack 510 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the memory die stack 510 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • One or more of the LTMIC dies 104 have additional PIM circuitry 502 relative to FeRAM and/or embedded DRAM based PIM local storage circuitry 506 as compared to the memory die stack 500 of Figure 5A.
  • a memory die stack 520 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the memory die stack 520 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • One or more of the LTMIC dies 104 have fine grained PIM circuitry 502 with FeRAM and/or embedded DRAM based PIM local storage circuitry 506.
  • a memory die stack 530 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120.
  • the memory die stack 530 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies.
  • One or more of the LTMIC dies 104 have fine grained sub-bank level PIM circuitry 502 with FeRAM based PIM local storage circuitry 506.
  • the plurality of LTMIC dies 104 includes DRAM IC memory dies stacked on top of one or more SRAM IC memory dies.
  • the SRAM IC memory die(s) can be configured to function as a hardware managed cache for the DRAM IC memory dies, or as a part of the address space offering exceptionally low latency.
  • the HTMIC die 106 includes FeRAM memory arrays that allow easier integration of PIM circuitry 502 in logic-compatible FeRAM circuitry of the HTMIC die 106. All of the above examples enable enhanced SoC performance and power efficiency.

Abstract

Disclosed herein are stacked memory dies that utilize a mix of high and low operational temperature memory and non-volatile based memory dies, and chip packages containing the same. High temperature memory dies, such as those using non-volatile memory (NVM) technologies, are in a memory stack with low temperature memory dies, such as those having volatile memory technologies. In some cases, the high temperature memory technologies could be used together, in some cases, on the same IC die as logic circuitry. In one example, a memory stack is provided that includes a first memory IC die having high temperature memory circuitry, such as non-volatile memory, stacked below a second memory IC die. The second memory IC die has low temperature memory circuitry, such as volatile memory circuitry.

Description

HYBRID MEMORY ARCHITECTURE FOR ADVANCED 3D SYSTEMS
TECHNICAL FIELD
[0001] Embodiments of the invention generally relate to stacked memory dies having volatile and non-volatile based memory dies, and chip packages containing the same.
BACKGROUND
[0002] The memory wall (i.e., bandwidth limitations) has been referred to as one of the key limiters in pushing the bounds of computation in modern systems. High bandwidth memory (HBM) and other stacked dynamic random-access memory (DRAM) memories have been proposed/enabled to alleviate off-chip memory access latency as well as increase memory density. In addition to the traditional DRAM roadmap, several other memories are being explored that have not yet reached maturity for large scale manufacturing, e.g., technologies such as ferro-electric random-access memory (FeRAM), magnetoresistive random-access memory (MRAM), phase-change memory (PCM), etc. During this technology enablement phase, it is crucial to examine not only how new technologies would “replace” the classic roadmap, but also whether they can aid/complement/address the limitations of existing DRAM without adding much complexity, or how they can be used together with existing technologies to enhance specific properties to achieve superior system on a chip (SoC) performance and power efficiency.
[0003] Memory wall problems are currently being tackled by the industry with HBM-like solutions. Stacked DRAM and HBM as described in the JEDEC Solid State Technology Association (e.g., JEDEC) specifications address memory bandwidth and latency issues by replacing long off-chip connections with stacked (e.g., connected through silicon interposers) memory closer to the logic die. However, there exist yield challenges and overhead due to the non-linear power increase with memory capacity increase. Additionally, 3D stacking on logic brings new thermal challenges that can negatively impact retention in DRAM. On the other hand, other non-volatile memory (NVM) technologies like logic-compatible FeRAM do not have refresh requirements and can tolerate high temperature, but suffer from limited scalability/capacity and wearout, while static random-access memory (SRAM) is a faster but leaky memory system. Current solutions do not fully utilize hybrid memory systems as disclosed by the inventors herein to take advantage of the unique properties of each memory type, and hence do not maximize performance/power-efficiency potential.
[0004] Non-volatile main memories such as FeRAMs, MRAMs, and volatile memories such as DRAMs (including HBM and other stacked variants of DRAM) are being considered and traded-off for achieving higher memory density, performance, and lower power.
[0005] DRAMs have been the most popular off-chip memory; however, even Double Data Rate 5 Synchronous DRAM (DDR5) has certain Performance-Power-Area (PPA) limitations from having to go off-chip to access data. The typical DRAM bitcell consists of a one transistor and one capacitor (1T-1C) structure where the capacitor is formed by a dielectric layer sandwiched in between conductor plates. System interprocess communication (IPC) is often limited by DRAM bandwidth and latency, especially in memory-heavy workloads. HBM has been introduced to provide increased bandwidth and memory density, allowing up to 8-12 layers of DRAM dies to be stacked on top of each other with an optional logic/memory interface die. This memory stack can either be connected to the CPU/GPU through silicon interposers (Fig. 1) or placed on top of the CPU/GPU themselves to provide superior connectivity and performance.
[0006] FeRAM is like 1T-1C DRAM, except that the capacitor is made of a ferroelectric material versus a (linear) dielectric as used in DRAM. Bits ‘0’ and ‘1’ are written with electric polarization orientations of the ferroelectric material in the dielectric. The benefit of this technology is refresh-free storage which has potential to offer more density and performance over DRAM.
[0007] MRAM on the other hand has a one transistor and one resistor (1T-1R) bitcell. Unlike DRAM and FeRAM, MRAM does not have a destructive read. However, MRAM is less reliable compared to FeRAM and has lower endurance and retention.
[0008] Typically, the memory technology is developed and “optimized” as an independent macro or for specific applications like deep neural networks (DNN), in the HBM case. Some advancements, like graphics double data rate (GDDR) vs double data rate (DDR), have been developed to support high-bandwidth memory for graphics applications. However, more fine-grained optimizations of memory technology with logic technology and architecture are not deeply explored, and there is much to do to achieve superior performance and lower power products. Non-linear power increase and decreasing improvement in performance and memory density from generation to generation require more design and co-optimization to alleviate the memory bottleneck.
SUMMARY
[0009] Disclosed herein are stacked memory dies that utilize a mix of high and low operational temperature memory and non-volatile based memory dies, and chip packages containing the same. High temperature memory dies, such as those using non-volatile memory (NVM) technologies, are in a memory stack with low temperature memory dies, such as those having volatile memory technologies. In some cases, the high temperature memory technologies could be used together, in some cases, on the same IC die as logic circuitry. In one example, a memory stack is provided that includes a first memory IC die having high temperature memory circuitry, such as non-volatile memory, stacked below a second memory IC die. The second memory IC die has low temperature memory circuitry, such as volatile memory circuitry.
[0010] In another example, a memory stack is provided that includes a first memory IC die stacked on a second memory IC die. The first memory IC die includes memory circuitry that requires more frequent refresh rates as compared to the second memory IC die. In some other examples, the first memory IC die includes memory circuitry operational at temperatures above 110 degrees Celsius without increased refresh rates as compared to operation at 95 degrees Celsius. The second memory IC die includes memory circuitry requiring increased refresh rates at temperatures above 110 degrees Celsius as compared to operation at 95 degrees Celsius.
[0011] In another example, a memory stack is provided that includes a first memory IC die stacked on a second memory IC die. The first memory IC die includes ferro-electric random-access memory (FeRAM). The second memory IC die includes dynamic random-access memory (DRAM) circuitry.
[0012] In yet another example, a chip package having a memory stack mounted to a package substrate is provided. The memory stack includes a plurality of first memory IC dies stacked on a second memory IC die. The second memory IC die includes ferro-electric random-access memory (FeRAM) circuitry and optionally controller circuitry. The second memory IC die is stacked on the package substrate. The plurality of first memory IC dies includes DRAM circuitry.
[0013] Also disclosed herein are non-volatile memory (NVM) technologies, that may be utilized in a memory stack with volatile memory technologies. In some cases, the NVM technologies could be used together, in some cases on the same IC die as logic circuitry. Exploitation of specific properties of each of the technologies, in the stacked memory subsystem, can beneficially result in differentiated SoC performance.
[0014] In one example, a memory stack is provided that includes a first memory IC die having non-volatile memory (NVM) circuitry stacked below a second memory IC die. The second memory IC die has volatile memory circuitry.
[0015] In another example of a memory stack, one IC memory die of the memory stack includes ferro-electric random-access memory (FeRAM) or static random-access memory (SRAM) circuitry, while another IC memory die of the memory stack includes volatile memory circuitry.
[0016] In another example of a memory stack, one IC memory die of the memory stack includes ferro-electric random-access memory (FeRAM) or static random-access memory (SRAM) circuitry, while another IC memory die of the memory stack includes dynamic random-access memory (DRAM) circuitry.
[0017] In another example of a memory stack, first processing in memory (PIM) circuitry is disposed in a second memory IC die of the memory stack while a second PIM circuitry is disposed in the third memory IC die of the memory stack.
[0018] In another example of a memory stack, a first buffer IC die is disposed between one pair of memory IC dies, and a second buffer IC die is disposed between another pair of memory IC dies.
[0019] In another example, a memory stack includes a first memory IC die comprising first ferro-electric random-access memory (FeRAM) circuitry and first processing in memory (PIM) circuitry, a second memory IC die stacked on the first memory IC die, and a third memory IC die stacked on the first memory IC die. The second memory IC die includes second FeRAM circuitry and second PIM circuitry. The third memory IC die includes third FeRAM circuitry and third PIM circuitry.
[0020] In yet another example, a chip package is provided that includes a hybrid memory stack mounted on a substrate. The hybrid memory stack includes both volatile and non-volatile memory IC dies.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] Figure 1 depicts a chip package having a memory stack connected to a compute/processor die through an interposer.
[0022] Figure 2 depicts an IC die stack disposed on top of a compute/processor chip and a memory-interface/controller die.
[0023] Figure 3A depicts a high-temperature memory die stacked on top of a compute/processor chip and a memory-interface/controller die.
[0024] Figure 3B depicts a high-temperature memory, a compute/processor circuitry and memory-interface/controller circuitry integrated into a single integrated circuit (IC) die.
[0025] Figure 4A is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die.
[0026] Figure 4B is a memory die stack disposed on top of a compute/processor chip, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die, wherein the high temperature memory integrated circuit die includes controller circuitry.
[0027] Figure 4C is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including at least one low temperature memory integrated circuit die and at least one high temperature memory integrated circuit die having a buffer IC die disposed therebetween, the buffer IC including logic and non-volatile memory circuitry.
[0028] Figure 5A is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having processing in memory (PIM) circuitry with FeRAM and/or embedded DRAM based PIM local storage.
[0029] Figure 5B is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having additional PIM circuitry relative to FeRAM and/or embedded DRAM based PIM storage as compared to the memory die stack of Figure 5A.
[0030] Figure 5C is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of low temperature memory integrated circuit dies, one or more of the low temperature memory integrated circuit dies having fine grained PIM circuitry with FeRAM and/or embedded DRAM based PIM local storage.
[0031] Figure 5D is a memory die stack disposed on top of a compute/processor chip and a memory-interface/controller die, the memory die stack including a plurality of high temperature memory integrated circuit dies, one or more of the high temperature memory integrated circuit dies having fine grained sub-bank level PIM circuitry with FeRAM based PIM local storage.
DETAILED DESCRIPTION
[0032] The disclosure herein addresses specific challenges of stacked DRAM subsystem with hybrid memory 3D organization and logic codesign. The disclosed technology defines various methods, systems and devices to design compute- and application-aware advanced memory-based systems. Generally, disclosed are memory die stacks that utilize a mix of high and low operational temperature memory dies such that the high temperature die may function as a temperature buffer with adjacent heat generating logic dies. In particular, disclosed are memory die stacks that utilize a mix of volatile and non-volatile based memory dies, one example of which are stacked DRAM and FeRAM based memory die.
[0033] Figure 1 depicts a chip package 100 having a memory stack 102 connected to a compute/processor IC die 108 through an interposer 110. Any of the memory stacks described herein may be utilized in the chip package 100 depicted in Figure 1, or other suitable memory device. The chip package 100 depicted in Figure 1, which may include any of the other memory stacks described below, also includes a package substrate 114 to which the interposer 110 is mounted. The package substrate 114 of the chip package 100 may be coupled to a printed circuit board (PCB) 136 to form an electronic system 180, such as but not limited to the graphics card depicted in Figure 1.
[0034] The memory stack 102 generally includes at least one low temperature memory integrated circuit (LTMIC) die 104 stacked with at least one high temperature memory IC (HTMIC) die 106. The space shown in Figure 1 between the dies 104, 106 is used to make solder connections (not shown) between the dies 104, 106. Alternatively, the dies 104, 106 may be stacked directly in contact with each other using hybrid bonding techniques. The HTMIC and LTMIC dies 106, 104 may be comparatively defined by at least one of the following definitions. In one example, a HTMIC die 106 has a memory refresh requirement that is longer than another memory die in the memory stack 102, the memory die having the shorter memory refresh requirement comparatively referred to as the LTMIC die 104. In another example, a HTMIC die 106 has a longer period between refreshes (i.e., a longer refresh period) than recommended by Joint Electron Device Engineering Council (JEDEC) Solid State Technology Association standard JESD21-C, the memory die having memory refresh requirements following JESD21-C comparatively referred to as the LTMIC die 104. In another example, the HTMIC die 106 has a memory refresh requirement that exceeds 60 microseconds between memory refreshes, the memory die requiring a memory refresh every 60 microseconds comparatively referred to as the LTMIC die 104. In yet another example, the HTMIC die 106 is non-volatile memory, while the LTMIC die 104 is volatile memory. In still other examples, the LTMIC die 104 can be defined as a memory die that can operate at temperatures up to 110 degrees Celsius (i.e., operational temperature) without having to increase the refresh rate. At temperatures above 110 degrees Celsius, the LTMIC die 104 requires an increased refresh rate (as compared to operation at 95 degrees Celsius). An example of a LTMIC die 104 is a dynamic random-access memory (DRAM) die. Other examples of LTMIC dies 104 include volatile memory dies such as static random-access memory (SRAM), among others.
[0035] In still other examples, the HTMIC die 106 has an operational temperature that is greater than the operational temperature of the LTMIC die 104. Defined differently, the HTMIC die 106 is a memory die that can operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius). An example of a HTMIC die 106 is a ferroelectric random-access memory (FeRAM) die. Other examples of HTMIC dies 106 include non-volatile memory dies such as magnetoresistive random-access memory (MRAM), phase-change memory (PCM), flash memory, and resistive random-access memory (RRAM), among others.
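As a minimal sketch of two of the comparative definitions above (volatility and the 110 degrees Celsius refresh threshold), the following Python fragment labels a die as HTMIC or LTMIC. The class name, field names, and the example die values are hypothetical and are chosen only for illustration; they are not taken from the disclosure.

```python
from dataclasses import dataclass

# Threshold taken from the text; everything else below is a hypothetical model.
REFRESH_THRESHOLD_C = 110.0


@dataclass(frozen=True)
class MemoryDie:
    name: str
    volatile: bool
    # Highest temperature (deg C) at which the die operates without
    # raising its refresh rate; float("inf") models refresh-free memory.
    max_temp_no_extra_refresh_c: float


def classify(die: MemoryDie) -> str:
    """Return 'HTMIC' or 'LTMIC' using two of the example definitions."""
    if not die.volatile:
        return "HTMIC"  # non-volatile memory is treated as high temperature
    if die.max_temp_no_extra_refresh_c > REFRESH_THRESHOLD_C:
        return "HTMIC"  # tolerates more than 110 C without extra refresh
    return "LTMIC"


# Illustrative dies only; the numbers are assumptions, not measured data.
dram = MemoryDie("DRAM", volatile=True, max_temp_no_extra_refresh_c=110.0)
feram = MemoryDie("FeRAM", volatile=False, max_temp_no_extra_refresh_c=float("inf"))
print(classify(dram), classify(feram))
```

Because the definitions in the text are alternatives, an actual design could rely on any one of them; this sketch simply combines two for brevity.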
[0036] The memory stack 102 may optionally include at least one controller IC die 120 stacked with the LTMIC die 104 and the HTMIC die 106. The IC dies 104, 106, 120 may be electrically and mechanically connected by solder balls and/or hybrid bonding techniques, such that the functional circuitries within the IC dies 104, 106, 120 can communicate with each other and/or transmit data signals, power and/or ground therethrough.
[0037] The functional circuitries within the IC memory dies 104, 106 are arranged into multiple memory banks. Each bank has multiple rows and each row has multiple columns. Residing at each unique memory location within a bank is a memory cell. The memory cell may be addressed using its unique identifying row and column location within a particular bank of the memory dies 104, 106.
[0038] The functional circuitries within the IC dies 104, 106, 120 are coupled to the functional circuitry of the compute/processor IC die 108 via routings 112 formed in the interposer 110. The routings 112 of the interposer 110 are connected to the functional circuitries within the IC dies 104, 106, 120 via solder connections 118. The routings 112 of the interposer 110 also connect the functional circuitries of the IC dies 108, 120 to routing 122 formed in the package substrate 114 via the solder connections 118. Solder balls 116 are utilized to connect the routings 122 of the package substrate 114 with routing 124 formed in the PCB 136.
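The bank, row, and column addressing described above can be illustrated with a short Python sketch that maps a flat cell index to a (bank, row, column) tuple and back. The geometry parameters and function names are hypothetical; the disclosure does not specify an address mapping, so this is only one plausible arrangement.

```python
from typing import NamedTuple


class CellAddress(NamedTuple):
    bank: int
    row: int
    column: int


def decode(flat: int, banks: int, rows_per_bank: int, cols_per_row: int) -> CellAddress:
    """Split a flat cell index into bank, row, and column (assumed layout)."""
    cells_per_bank = rows_per_bank * cols_per_row
    if not 0 <= flat < banks * cells_per_bank:
        raise ValueError("address out of range")
    bank, offset = divmod(flat, cells_per_bank)
    row, column = divmod(offset, cols_per_row)
    return CellAddress(bank, row, column)


def encode(addr: CellAddress, rows_per_bank: int, cols_per_row: int) -> int:
    """Inverse of decode for the same assumed layout."""
    return (addr.bank * rows_per_bank + addr.row) * cols_per_row + addr.column


# Hypothetical geometry: 4 banks, 8 rows per bank, 16 columns per row.
addr = decode(300, banks=4, rows_per_bank=8, cols_per_row=16)
print(addr, encode(addr, rows_per_bank=8, cols_per_row=16))
```

Real devices commonly interleave banks and channels differently to spread traffic; the linear layout here is chosen only because it is the simplest to follow.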
[0039] In other embodiments where an interposer is not present, the chip stack 102 and IC die 108 may be mounted directly to the package substrate 114.
[0040] The IC die 120 is generally a heat generating device. That is, the IC die 120 generates heat when in use. As the performance of the LTMIC die 104 may be diminished due to the heat generated by the IC die 120, performance of the chip package 100 is enhanced by separating the LTMIC die 104 from the heat generating IC die 120 by one or more HTMIC dies 106. Since the HTMIC die 106 is generally more heat resistant than the LTMIC dies 104, the HTMIC die 106 can be located adjacent the heat generating IC die 120 without significant reduction in performance while enabling the LTMIC dies 104 that are significantly spaced from the heat generating IC die 120 to also maintain robust levels of performance.
[0041] In the example depicted in Figure 1, the chip stack 102 includes one HTMIC die 106 disposed between a plurality of LTMIC dies 104 and the controller IC die 120. Although four IC memory dies 104, 106 are illustrated in the single chip stack 102 shown in Figure 1, the number of LTMIC dies 104 and the number of HTMIC dies 106 comprising the chip stack 102 may vary from one to as many as can fit within the chip package 100. Additionally, although only one chip stack 102 is shown in Figure 1, one or more additional chip stacks may be disposed adjacent the chip stack 102 shown in Figure 1.
[0042] The controller IC die 120 includes functional logic circuitry that provides commands that enable the row and column identifying each bank of the memory dies 104, 106 to be addressed. The controller IC die 120 controls the write/read operation from each memory bank.
[0043] The HBM memory can be put in low power modes by row address bus to save power on the I/O drivers. To further reduce power consumption, clocks can be gated when in power-down or self-refresh modes.
[0044] Figure 2 depicts an IC die stack 202 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The IC die stack 202 may be used in place of the IC die stack 102 and mounted on the interposer 110 or the package substrate 114 in the chip package 100 and electronic system 180 illustrated in Figure 1.
[0045] In the example depicted in Figure 2, the IC die stack 202 that can be used in the electronic system 180 includes a high-bandwidth memory (HBM) cube 204 stacked on top of the compute IC die 108 and the controller IC die 120. The compute IC die (e.g., compute chip) 108 may be a central processing unit (CPU), graphics processing unit (GPU), field programmable gate array (FPGA), or other accelerator. The HBM cube 204 includes a plurality of LTMIC dies 104 that are vertically stacked. Each LTMIC die 104 includes functional circuitry configured as memory circuitry 220. In one example, the memory circuitry 220 is configured as volatile memory circuitry such as dynamic random-access memory (DRAM) and static random-access memory (SRAM), among others. In an example where the memory circuitry 220 is configured as DRAM circuitry, the LTMIC dies 104 may be referred to as DRAM IC dies. The stacked DRAM IC dies may be connected by solder connections, hybrid bonding or other suitable connection. Although three DRAM IC dies are illustrated in the HBM cube 204 depicted in Figure 2, the HBM cube 204 may alternatively have more or fewer than three DRAM IC dies.
[0046] Sandwiched between the LTMIC dies 104 and the controller IC die 120 in the HBM cube 204 is one or more HTMIC dies 106. In Figure 2, one HTMIC die 106 is shown, although more may be utilized. The HTMIC die 106 includes functional circuitry configured as memory circuitry 222. In one example, the memory circuitry 222 is configured as non-volatile random-access memory, such as ferroelectric random-access memory (FeRAM), magnetoresistive random-access memory (MRAM), phase-change memory (PCM), flash memory, and resistive random-access memory (RRAM), among others. In an example where the memory circuitry 222 is configured as FeRAM, the HTMIC die 106 may be referred to as a FeRAM IC die.
[0047] The memory circuitry 222 of the HTMIC die 106 has an operational temperature that is greater than the operational temperature of the memory circuitry 220 of the LTMIC die 104. Defined differently, the memory circuitry 222 of the HTMIC die 106 is memory circuitry that can operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius), while the memory circuitry 220 of the LTMIC die 104 is memory circuitry that cannot operate at temperatures above 110 degrees Celsius without having to increase the refresh rate (as compared to operation at 95 degrees Celsius).
[0048] In one example, the HTMIC die 106 is a non-volatile random-access memory IC die that has faster refresh speed as compared to the LTMIC dies 104 that have volatile random-access memory. Thus, in addition to the HTMIC die 106 performing better than the LTMIC dies 104 when placed closer to the controller IC die 120, the faster refresh speed enables faster communication with the controller IC die 120, which beneficially reduces latency within the IC die stack 202, and ultimately, the chip package 100 and the electronic system 180.
[0049] In the example depicted in Figure 2, the HTMIC die 106 has memory circuitry 222 configured as FeRAM circuitry, while the LTMIC dies 104 have memory circuitry 220 configured as DRAM circuitry.
[0050] Figure 3A depicts another example of a memory stack 302 that includes a HTMIC die 106 stacked on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. Although not shown in Figure 3A, a plurality of LTMIC dies 104 may be stacked on the HTMIC die 106 shown in Figure 3A, such as in the manner illustrated in Figure 2, to complete the memory stack 302. The IC die stack 302 may be used in place of the IC die stack 102 and mounted on the interposer 110 or the package substrate 114 in the chip package 100 and electronic system 180 illustrated in Figure 1. In the example depicted in Figure 3A, the HTMIC die 106 includes memory circuitry 222 configured as FeRAM circuitry. In the example depicted in Figure 3A, the HTMIC die 106 is vertically stacked on top of the memory-interface/controller IC die 120, while the memory-interface/controller IC die 120 is vertically stacked on top of the compute/processor IC die 108. The memory circuitry 222 of the HTMIC die 106 includes FeRAM or other suitable circuitry which is compatible with the controller circuitry 224 of the memory-interface/controller IC die 120. The interconnections between the dies 104, 106, 108, 120 may be made by solder connections, hybrid bonding or other suitable connection. The HTMIC die 106 stacked on top of the compute/processor IC die 120 forms a hybrid memory-logic assembly that may be later stacked with LTMIC dies 104, such as illustrated in Figure 2.
[0051] Figure 3B depicts another example of a memory stack 322 that includes memory circuitry 222, a compute/processor circuitry 324 and memory-interface/controller circuitry 224 integrated into a single HTMIC die 106. Although not shown in Figure 3B, a plurality of LTMIC dies 104 may be stacked on the HTMIC die 106 shown in Figure 3B, such as in the manner illustrated in Figure 2, to complete the memory stack 322. The additional dies stacked on the HTMIC die 106 shown in Figure 3B may be one or more FeRAM dies, and/or one or more LTMIC dies 104 (such as DRAM IC dies), and/or one or more other types of memory dies. In one example, the memory circuitry 222 is configured as FeRAM circuitry, which is compatible with the memory-interface/controller circuitry 224 so that the circuitries 222, 224 may be co-located within the same HTMIC die 106. Similarly, the FeRAM circuitry 222 is also compatible with the compute/processor circuitry 324 so that the circuitries 222, 324, 224 may be co-located within the same HTMIC die 106. In one example, the memory circuitry 222 is disposed between the compute/processor circuitry 324 and the memory-interface/controller circuitry 224. Optionally, the compute/processor circuitry 324 may reside in an IC die neighboring the HTMIC die 106 that contains both the FeRAM and controller circuitries 222, 224.
[0052] The advanced memory technology roadmap targets increased memory density and bandwidth, with minimal impact to power and performance, to alleviate the memory bottleneck to system performance. With the advancement in memory technology, memory stacking and novel non-volatile memories like FeRAMs, it is imperative that circuitry, architecture and memory interfacing principles be updated in pace with the memory technology itself. Improvements to memory technology are described below that leverage enhancements specific to HBM/other forms of stacked high temperature memory by integrating low temperature memory technology to create hybrid memory stacks. In one example, stacked DRAM memory may integrate FeRAM based memory to form a hybrid memory stack or hybrid memory-logic assembly.
[0053] Hybrid memory and hybrid memory-logic assemblies are disclosed that utilize a mix of memory technologies that can be stacked, for example on top of a logic die. For example, a hybrid memory cube with a non-volatile memory (such as FeRAM and the like) IC die and a volatile memory (such as DRAM IC and the like) die has the HTMIC die 106 beneficially disposed closest to the logic IC die 120, as the HTMIC die 106 can tolerate higher heat dissipated from the logic die, while the LTMIC dies 104 disposed on top of the HTMIC die 106 could be placed closer to the heat spreader in the chip package (such as the chip package 100 depicted in Figure 1) to minimize temperature gradients and impact on performance and refresh rates associated with the LTMIC dies 104.
[0054] Figure 4A depicts an example of a memory die stack 400 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The memory die stack 400 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. Another alternative hybrid approach is to arrange the HTMIC and LTMIC dies 106, 104 within the same memory die stack 400 ranked based on latency, with the fastest dies being closer to the memory-interface/controller IC die 120. The ranked memory IC dies 104, 106 may include SRAM, DRAM, and non-volatile memory (NVM) IC dies arranged by latency all in the same memory die stack 400. This results in a hierarchical hardware managed cache for the LTMIC dies 104, such as DRAM IC dies, within the stacked memory cube, e.g., the memory die stack 400.
[0055] Figure 4B is a memory die stack 410 disposed on top of a compute/processor IC die 108. The memory die stack 410 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. The HTMIC die 106 of the memory die stack 410 includes both memory circuitry 222 and controller circuitry 224. For example, the memory circuitry 222 of the HTMIC die 106 may include FeRAM (or other non-volatile memory) circuitry and controller circuitry 224 integrated on the same die. Such an arrangement is illustrated in the memory die stack 410 of Figure 4B which is enabled by the FeRAM configuration of the memory circuitry 222 being compatible with the logic technology of the controller circuitry 224.
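The latency-ranked arrangement described above can be sketched as a simple sort that places the fastest die nearest the memory-interface/controller die. The die names and latency figures below are hypothetical placeholders, not values from the disclosure, and thermal constraints discussed elsewhere are deliberately ignored in this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Die:
    kind: str
    latency_ns: float  # illustrative access latency, not measured data


def rank_stack(dies: List[Die]) -> List[Die]:
    """Order dies from bottom (nearest the controller) to top, fastest first."""
    return sorted(dies, key=lambda d: d.latency_ns)


# Hypothetical mix of dies in arbitrary starting order.
unordered = [Die("DRAM", 15.0), Die("NVM", 50.0), Die("SRAM", 2.0)]
for position, die in enumerate(rank_stack(unordered)):
    print(position, die.kind, die.latency_ns)
```

Python's `sorted` is stable, so dies with equal latency keep their input order, which would let a designer apply a secondary criterion (for example, heat tolerance) by pre-sorting on it first.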
[0056] Alternatively, the IO/SA logic on each memory die 104 could be separated into a buffer IC die 422, to achieve higher performance and yield, and could also include FeRAM memory blocks that are logic compatible. Such an example is illustrated in Figure 4C which shows a memory die stack 420 disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120, the memory die stack 420 including at least one HTMIC die 106, such as a non-volatile memory IC die, and at least one LTMIC die 104, such as a volatile memory die, having buffer IC dies 422 disposed therebetween. The buffer IC die 422 includes both logic and non-volatile memory circuitry 224, 222. The buffer IC die 422 may be hybrid bonded on each side to the other dies 104 comprising the HBM cube 204 of the die stack 420.
[0057] The memory stack and technology may be selected and designed in a “hierarchical” manner (hardware-managed cache), to use the faster memory and/or memories not requiring refresh closer to the logic IC die 120, where they act as an “intermediate” layer that transfers data to the denser, slower memories on the upper tiers of LTMIC dies 104, away from the logic IC die 120. This could help hide latency and the overhead due to the refresh needed for LTMIC dies 104, such as DRAM IC dies, at the top of the memory die stack 420.
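The latency-hiding effect of such an intermediate layer can be estimated with the textbook two-level average access time formula. The disclosure does not give this formula; the function name, parameters, and numbers below are assumptions introduced purely for illustration.

```python
def average_access_ns(hit_rate, fast_ns, slow_ns):
    """Textbook two-level estimate: every access pays the fast-tier latency,
    and misses additionally pay the slow-tier latency.

    hit_rate: fraction of accesses served by the fast tier near the logic die
    fast_ns:  assumed latency of the fast, refresh-free tier
    slow_ns:  assumed extra latency of the dense upper tier (including refresh stalls)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return fast_ns + (1.0 - hit_rate) * slow_ns


# Placeholder numbers, not measurements of any real device.
for rate in (0.0, 0.5, 0.9):
    print(rate, average_access_ns(rate, fast_ns=5.0, slow_ns=50.0))
```

Under these assumed numbers, a higher hit rate in the fast tier pulls the average access time toward the fast-tier latency, which is the sense in which the lower tiers can hide the latency and refresh overhead of the upper tiers.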
[0058] In cases where FeRAM or other non-volatile memory dies are used with multi-bit cell storage (e.g., NAND flash stores multiple bits in one cell), wear-out can become a concern, since each cell will be accessed “n” times, where “n” is the number of bits in a single cell, compared to a single-bit cell scenario. Hence, DRAM/SRAM and other volatile memories, which have enhanced endurance, can be used as a “standby” or hardware-managed cache for the non-volatile memory dies. This allows multiple writes to be combined (termed write levelling) into a single write to the non-volatile memory multi-bit cell, which beneficially reduces the number of writes to a single cell in the non-volatile memory IC die and increases the lifetime of the non-volatile memory circuitry. Reads may be combined in substantially the same manner.
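The write-combining behavior described above can be sketched in software as follows. This is a behavioral model under stated assumptions, not the hardware mechanism of the disclosure: the `WriteCombiningBuffer` class, its methods, and the per-address write counter are hypothetical names, and the non-volatile memory is modeled as a plain dictionary.

```python
class WriteCombiningBuffer:
    """Volatile staging buffer that merges repeated writes to one NVM address."""

    def __init__(self):
        self.nvm = {}               # models non-volatile cells: address -> value
        self.pending = {}           # models the volatile (DRAM/SRAM) staging area
        self.nvm_write_counts = {}  # address -> number of actual NVM writes

    def write(self, address, value):
        # Later writes to the same address overwrite the pending value,
        # so only the final value will reach the NVM cell.
        self.pending[address] = value

    def read(self, address):
        # Serve the newest data: pending value first, then the NVM copy.
        if address in self.pending:
            return self.pending[address]
        return self.nvm.get(address)

    def flush(self):
        # One NVM write per dirty address, regardless of how many writes merged.
        for address, value in self.pending.items():
            self.nvm[address] = value
            self.nvm_write_counts[address] = self.nvm_write_counts.get(address, 0) + 1
        self.pending.clear()


buffer = WriteCombiningBuffer()
for value in (1, 2, 3):
    buffer.write(0x10, value)   # three writes to the same address
buffer.flush()
print(buffer.nvm[0x10], buffer.nvm_write_counts[0x10])
```

In this model the three buffered writes cost the non-volatile cell a single write. A real design would also need a policy for when to flush (for example on eviction or power-down), which this sketch leaves out.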
[0059] Figures 5A through 5D illustrate some non-limiting examples of processing in memory (PIM) circuitry 502 utilized within a hybrid memory assembly, i.e., a memory die stack 500. The trade-off for PIM is usually between the speedup due to compute near memory and the area overhead, impact on memory density, and leakage due to integration of the PIM circuitry. The PIM circuitry 502 includes processor or other logic circuitry integrated with memory circuitry 220/222 on a single IC die of the memory die stack 500. The PIM circuitry 502 contains local storage circuitry 506.
[0060] The local storage circuitry 506 of the PIM circuitry 502 may be FeRAM and/or embedded DRAM (eDRAM). Advantageously, the FeRAM and/or eDRAM based local storage circuitry 506 generally are low leakage logic-compatible storage as compared to local logic-based high leakage registers. This allows area scaling of PIM circuitry 502 and reduced leakage compared to conventional PIM using registers in the logic-based logical storage. Thus, the amount of processing in memory may be increased within the same IC die area allocated for the PIM circuitry 502.
[0061] This can also be used towards more fine-grained PIM (e.g., at a sub-bank level; currently PIM is performed at a bank level) or increased memory density due to reduced PIM area.
[0062] Referring first to Figure 5A, the memory die stack 500 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The memory die stack 500 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. One or more of the LTMIC dies 104 have processing in memory (PIM) circuitry 502 with FeRAM and/or embedded DRAM based PIM local storage circuitry 506.
[0063] In Figure 5B, a memory die stack 510 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The memory die stack 510 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. One or more of the LTMIC dies 104 have additional PIM circuitry 502 relative to FeRAM and/or embedded DRAM based PIM local storage circuitry 506 as compared to the memory die stack 500 of Figure 5A.
[0064] In Figure 5C, a memory die stack 520 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The memory die stack 520 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. One or more of the LTMIC dies 104 have fine-grained PIM circuitry 502 with FeRAM and/or embedded DRAM based PIM local storage circuitry 506.
[0065] In Figure 5D, a memory die stack 530 is shown disposed on top of a compute/processor IC die 108 and a memory-interface/controller IC die 120. The memory die stack 530 includes at least one HTMIC die 106, such as a non-volatile memory IC die, and a plurality of LTMIC dies 104, such as volatile memory dies. One or more of the LTMIC dies 104 have fine-grained sub-bank level PIM circuitry 502 with FeRAM based PIM local storage circuitry 506. In an alternative hybrid approach, the plurality of LTMIC dies 104 includes DRAM IC memory dies stacked on top of one or more SRAM IC memory dies. The SRAM IC memory die(s) can be configured to function as a hardware-managed cache for the DRAM IC memory dies, or as a part of the address space offering exceptionally low latency. In the example depicted in Figure 5D, the HTMIC die 106 includes FeRAM memory arrays that allow easier integration of PIM circuitry 502 in logic-compatible FeRAM circuitry of the HTMIC die 106. All of the above examples enable enhanced SoC performance and power efficiency.

Claims

What is claimed is:
1. A memory stack comprising: a first memory IC die comprising memory and a second memory IC die stacked on the first memory IC die, the second memory IC die comprising memory circuitry requiring refresh rates more frequent than that of the first memory IC die.
2. The memory stack of claim 1, wherein the memory circuitry of the first memory IC die is non-volatile memory circuitry.
3. The memory stack of claim 2, wherein the non-volatile memory circuitry is ferro-electric random-access memory (FeRAM) or static random-access memory (SRAM) circuitry.
4. The memory stack of claim 3, wherein the memory circuitry of the second memory IC die is volatile memory circuitry.
5. The memory stack of claim 4, wherein the volatile memory circuitry is dynamic random-access memory (DRAM) circuitry.
6. The memory stack of claim 1, further comprising: a controller die stacked below and in contact with the first memory IC die; and a processor die stacked below and in contact with the controller die, the processor die includes processor circuitry that communicates with memory circuitries of the first and second memory IC dies through controller circuitry of the controller IC die.
7. The memory stack of claim 1, wherein the first memory IC die includes controller circuitry.
8. The memory stack of claim 7, further comprising: a processor die stacked below and in contact with the first memory IC die, the processor die includes processor circuitry that communicates with memory circuitries of the first and second memory IC dies through controller circuitry of the first memory IC die.
9. The memory stack of claim 1, further comprising: a third memory IC die stacked on the second memory IC die, the third memory IC die comprising dynamic random-access memory (DRAM) circuitry.
10. The memory stack of claim 9, wherein the third memory IC die has a greater latency than the second memory IC die, and second memory IC die has a greater latency than the first memory IC die.
11. The memory stack of claim 9 further comprising: first processing in memory (PIM) circuitry disposed in the second memory IC die; and second PIM circuitry disposed in the third memory IC die.
12. The memory stack of claim 1, wherein the first memory IC die includes controller circuitry.
13. The memory stack of claim 12, further comprising: a processor die stacked below and in contact with the first memory IC die, the processor die includes processor circuitry that communicates with memory circuitries of the first and second memory IC dies through the controller circuitry of the first memory IC die.
14. The memory stack of claim 1, further comprising: a first buffer IC die disposed between the first memory IC die and the second memory IC die.
15. A chip package comprising: a package substrate; and a memory stack stacked on the package substrate, the memory stack comprising: a plurality of first memory IC dies stacked on a second memory IC die, the second memory IC die having ferro-electric random-access memory (FeRAM) circuitry and optionally controller circuitry, the second memory IC die stacked on the package substrate, the plurality of first memory IC dies including DRAM circuitry.
PCT/US2023/029044 2022-09-09 2023-07-28 Hybrid memory architecture for advanced 3d systems WO2024054316A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202263405347P 2022-09-09 2022-09-09
US63/405,347 2022-09-09
US18/199,837 US20240088098A1 (en) 2022-09-09 2023-05-19 Hybrid memory architecture for advanced 3d systems
US18/199,837 2023-05-19

Publications (1)

Publication Number Publication Date
WO2024054316A1 true WO2024054316A1 (en) 2024-03-14

Family

ID=87847847

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/029044 WO2024054316A1 (en) 2022-09-09 2023-07-28 Hybrid memory architecture for advanced 3d systems

Country Status (1)

Country Link
WO (1) WO2024054316A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170160955A1 (en) * 2014-01-10 2017-06-08 Advanced Micro Devices, Inc. Page migration in a 3d stacked hybrid memory
US20190102330A1 (en) * 2017-10-02 2019-04-04 Micron Technology, Inc. Communicating data with stacked memory dies
US20190121560A1 (en) * 2017-10-24 2019-04-25 Micron Technology, Inc. Reconfigurable memory architectures
US20200272560A1 (en) * 2019-02-22 2020-08-27 Micron Technology, Inc. Memory device interface and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23762039

Country of ref document: EP

Kind code of ref document: A1