WO2013078536A1 - CPU with stacked memory - Google Patents

CPU with stacked memory

Info

Publication number
WO2013078536A1
Authority
WO
WIPO (PCT)
Prior art keywords
cpu
cpu die
die
area
disposed
Prior art date
Application number
PCT/CA2012/001086
Other languages
French (fr)
Inventor
Hong Beom Pyeon
Original Assignee
Mosaid Technologies Incorporated
Priority date
Filing date
Publication date
Application filed by Mosaid Technologies Incorporated filed Critical Mosaid Technologies Incorporated
Priority to EP12854424.4A priority Critical patent/EP2786409A4/en
Priority to CN201280068123.8A priority patent/CN104094402A/en
Priority to KR1020147018137A priority patent/KR20140109914A/en
Priority to JP2014543729A priority patent/JP2015502663A/en
Publication of WO2013078536A1 publication Critical patent/WO2013078536A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16Constructional details or arrangements
    • G06F1/18Packaging or power distribution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16Constructional details or arrangements
    • G06F1/20Cooling means
    • G06F1/203Cooling means for portable computers, e.g. for laptops
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C11/00Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/21Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C11/34Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
    • G11C11/40Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
    • G11C11/401Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells needing refreshing or charge regeneration, i.e. dynamic cells
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10Bump connectors; Manufacturing methods related thereto
    • H01L2224/15Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/16Structure, shape, material or disposition of the bump connectors after the connecting process of an individual bump connector
    • H01L2224/161Disposition
    • H01L2224/16151Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive
    • H01L2224/16221Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked
    • H01L2224/16225Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked the item being non-metallic, e.g. insulating substrate with or without metallisation
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/73Means for bonding being of different types provided for in two or more of groups H01L2224/10, H01L2224/18, H01L2224/26, H01L2224/34, H01L2224/42, H01L2224/50, H01L2224/63, H01L2224/71
    • H01L2224/732Location after the connecting process
    • H01L2224/73251Location after the connecting process on different surfaces
    • H01L2224/73253Bump and layer connectors
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L25/00Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof
    • H01L25/18Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof the devices being of types provided for in two or more different subgroups of the same main group of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2924/00Indexing scheme for arrangements or methods for connecting or disconnecting semiconductor or solid-state bodies as covered by H01L24/00
    • H01L2924/0001Technical content checked by a classifier
    • H01L2924/0002Not covered by any one of groups H01L24/00, H01L24/00 and H01L2224/00
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2924/00Indexing scheme for arrangements or methods for connecting or disconnecting semiconductor or solid-state bodies as covered by H01L24/00
    • H01L2924/15Details of package parts other than the semiconductor or other solid state devices to be connected
    • H01L2924/151Die mounting substrate
    • H01L2924/153Connection portion
    • H01L2924/1531Connection portion the connection portion being formed only on the surface of the substrate opposite to the die mounting surface
    • H01L2924/15311Connection portion the connection portion being formed only on the surface of the substrate opposite to the die mounting surface being a ball array, e.g. BGA

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Power Engineering (AREA)
  • Dram (AREA)
  • Semiconductor Memories (AREA)
  • Static Random-Access Memory (AREA)

Abstract

A multi-chip package has a substrate with electrical contacts for connection to an external device. A CPU die is disposed on the substrate and is in communication with the substrate. The CPU die has a plurality of processor cores occupying a first area of the CPU die, and an SRAM cache occupying a second area of the CPU die. A DRAM cache is disposed on the CPU die and is in communication with the CPU die. The DRAM cache has a plurality of stacked DRAM die. The plurality of stacked DRAM dies are substantially aligned with the second area of the CPU die, and substantially do not overlap the first area of the CPU die. A multi-chip package having a DRAM cache disposed on the substrate and a CPU die disposed on the DRAM cache is also disclosed.

Description

CPU With Stacked Memory
Cross-Reference To Related Application And Claim Of Priority
[0001] This application claims the benefit of priority to U.S. Provisional Patent Application No. 61/565,709, filed on December 1, 2011, the contents of which are hereby incorporated herein by reference in their entirety.
Field of the Invention
[0002] The invention relates generally to semiconductor devices, and more specifically to a CPU having stacked memory.
Background
[0003] The emergence of mobile consumer electronics, such as cellular telephones, laptop computers, Personal Digital Assistants (PDAs), and MP3 players, has increased the demand for compact, high performance memory devices. These memory devices are subject to increasingly stringent constraints in terms of the number of data bits that can be provided at defined operating speeds using the smallest possible device. In this context, the term "smallest" generally refers to the lateral area occupied by the memory device in a "lateral" X/Y plane, such as a plane defined by the primary surfaces of a printed circuit board or module board.
[0004] As a result of the constraints on the area occupied by the device, microchip designers have begun to vertically integrate the data storage capacity of their devices. Thus, multiple memory devices that might have previously been laid out adjacent to one another in a lateral plane are now vertically stacked one on top of the other in a Z plane relative to the lateral X/Y plane, thereby greatly increasing the memory density per area that the device occupies on the board.
[0005] Recent developments in the fabrication of through silicon vias (TSVs) have facilitated the trend towards vertically stacked semiconductor memory devices, by providing more efficient communication between stacked chips and by further reducing the area occupied by the device. Most 3-D stacking technologies have focused only on chip-level integration in the vertical direction. One performance bottleneck results from the speed difference between the increasingly-fast microprocessor and the relatively fixed latency times of the main memory (typically DRAM). In order to mitigate this performance bottleneck, the memory I/O interface has been improved in an attempt to keep pace with ever-accelerating CPU performance.
However, another limiting factor is the distance between the CPU and the memory, which contributes to signal distortion and degradation of signal integrity, and increases power consumption by the I/O signal connection. The distance between the CPU and the memory device is limited by the physical dimensions of memory and the CPU if these devices are both mounted next to each other on the same board. This distance can be reduced by stacking memory devices with the CPU. Two common stacking arrangements are memory over CPU (Figure 1) and CPU over memory (Figure 2). The arrangement of Figure 1 has disadvantages in terms of heat dissipation, because the heat from the CPU must be conducted through the DRAM stack to reach the heat sink. However, the arrangement of Figure 2 requires the CPU to communicate with external devices (via the board) using TSVs through the intervening DRAM stack, thereby increasing the TSV overhead of the DRAM stack and reducing storage capacity accordingly.
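For a rough sense of the speed gap mentioned above, the following sketch (illustrative only; the 3 GHz clock and 50 ns DRAM access latency are assumed, representative values, not figures from this disclosure) estimates how many processor cycles elapse during a single main-memory access.

```python
# Illustrative estimate of the CPU/DRAM speed gap described above.
# The 3 GHz clock and ~50 ns DRAM access time are assumed, representative
# values, not figures taken from this disclosure.

CPU_CLOCK_HZ = 3.0e9          # assumed CPU clock frequency
DRAM_ACCESS_NS = 50.0         # assumed end-to-end DRAM access latency

cycle_time_ns = 1e9 / CPU_CLOCK_HZ
stall_cycles = DRAM_ACCESS_NS / cycle_time_ns

print(f"CPU cycle time: {cycle_time_ns:.2f} ns")
print(f"Cycles spent per DRAM access: {stall_cycles:.0f}")
# -> roughly 150 cycles, which is why shortening and widening the
#    CPU-to-memory path (for example by stacking) matters.
```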
[0006] The processor cores of the CPU chip consume significant power and generate heat during normal operation. It is not atypical for the processor cores of the CPU chip to generate hot spots about 30°C (about 54°F) hotter than the cooler portions of the chip such as the area allocated to the level 2 (L2) SRAM cache. This high temperature can adversely affect the performance of adjacent DRAM devices, which are inherently temperature-sensitive, and which themselves consume a significant amount of power during operation. Higher temperatures contribute to degradation of memory performance, require more frequent refresh cycles, and increase power consumption in DRAM devices. The stacked arrangement exacerbates the heat dissipation problem, because multiple heat-generating dies are in close proximity and must share a heat sink. Thermal issues are one limiting factor in the maximum acceptable height of the DRAM stack, thereby limiting the memory capacity available to the CPU, as well as adversely affecting the proper operation of the DRAM chips provided.
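The temperature dependence of DRAM refresh can be made concrete with a small sketch. The 7.8 µs nominal refresh interval and the doubling of the refresh rate above 85°C follow common DDRx practice and are assumptions for illustration, not values taken from this disclosure.

```python
# Rough illustration of why hotter DRAM costs refresh bandwidth and power.
# The 7.8 us base refresh interval (tREFI) and the 2x refresh rate above
# 85 C reflect common DDRx practice and are assumptions, not values from
# this disclosure.

def refresh_interval_us(temp_c: float, base_trefi_us: float = 7.8) -> float:
    """Return the average refresh command interval at a given die temperature."""
    return base_trefi_us if temp_c <= 85.0 else base_trefi_us / 2.0

def refresh_commands_per_second(temp_c: float) -> float:
    return 1e6 / refresh_interval_us(temp_c)

for temp in (60.0, 85.0, 95.0):
    print(f"{temp:5.1f} C -> {refresh_commands_per_second(temp):,.0f} refresh commands/s")
# A die heated past 85 C by an adjacent hot spot must refresh twice as
# often, spending more power and more bus time on refresh.
```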
[0007] One approach to mitigating these thermal issues is to configure the CPU so that the hot spots are more evenly distributed over the area occupied by the processor cores. However, this increases design complexity and may conflict with optimized logic block placement in the CPU. In addition, this approach is of limited benefit when the CPU and the DRAM are stacked together, because the DRAM is still exposed to the same quantity of heat overall.
[0008] Therefore, there is a need to provide a stacked arrangement of a CPU and a DRAM memory wherein the stacked DRAM memory is exposed to reduced thermal effects.
[0009] There is also a need to provide a stacked arrangement of a CPU and a DRAM memory having efficient heat dissipation.
Summary
[0010] It is an object of the present invention to address one or more of the disadvantages of the prior art.
[0011] It is another object of the invention to provide a multi-chip package arrangement having a CPU chip stacked with a plurality of stacked DRAM chips, wherein the DRAM chips are positioned and dimensioned to substantially not overlap the processor cores of the CPU chip.
[0012] It is another object of the invention to provide a multi-chip package arrangement having a CPU chip stacked with a plurality of stacked DRAM chips, wherein the DRAM chips are positioned and dimensioned to substantially overlap only a cache portion of the CPU chip.
[0013] In one aspect, a multi-chip package comprises a substrate having electrical contacts for connection to an external device. A CPU die is disposed on the substrate and is in communication with the substrate. The CPU die has a plurality of processor cores occupying a first area of the CPU die; and an SRAM cache occupying a second area of the CPU die. A DRAM cache is disposed on the CPU die and is in communication with the CPU die. The DRAM cache comprises a plurality of stacked DRAM dies. The plurality of stacked DRAM dies are substantially aligned with the second area of the CPU die. The plurality of stacked DRAM dies substantially do not overlap the first area of the CPU die.
[0014] In a further aspect, a bulk material is disposed on the CPU die and is substantially aligned with the first area of the CPU die.
[0015] In a further aspect, the bulk material has a top surface substantially coplanar to a top surface of the plurality of stacked DRAM dies.
[0016] In a further aspect, a chip is disposed on the top surface of the bulk material and on the top surface of the plurality of stacked DRAM dies. The chip is in communication with the CPU die.
[0017] In a further aspect, the chip and the plurality of DRAM dies are in communication with the CPU die via through-silicon vias (TSVs).
[0018] In a further aspect, at least some of the TSVs pass through the bulk material.
[0019] In a further aspect, a heat sink is disposed on a top surface of the plurality of stacked DRAM dies.
[0020] In a further aspect, a heat sink is disposed on a top surface of the first area of the CPU die.
[0021] In a further aspect, a heat sink is disposed on a top surface of the bulk material.
[0022] In a further aspect, a heat sink is disposed on the top surface of the bulk material and on the top surface of the plurality of stacked DRAM dies.
[0023] In a further aspect, at least one die is disposed on the CPU die and is substantially aligned with the first area of the CPU die. The at least one die comprises at least one additional processor core.
[0024] In an additional aspect, a multi-chip package comprises a substrate having electrical contacts for connection to an external device. A DRAM cache is disposed on the substrate and is in communication with the CPU die. The DRAM cache comprises a plurality of stacked DRAM dies. A bulk material is disposed on the substrate. A CPU die is disposed on the DRAM cache and the substrate. The CPU die is in communication with the substrate. The CPU die comprises a plurality of processor cores occupying a first area of the CPU die; and an SRAM cache occupying a second area of the CPU die. The plurality of stacked DRAM dies are substantially aligned with the second area of the CPU die. The bulk material is substantially aligned with the first area of the CPU die.
[0025] In a further aspect, the bulk material has a top surface substantially coplanar to a top surface of the plurality of stacked DRAM dies.
[0026] In a further aspect, the substrate and the plurality of DRAM dies are in communication with the CPU die via through-silicon vias (TSVs).
[0027] In a further aspect, at least some of the TSVs pass through the bulk material.
[0028] In a further aspect, a heat sink is disposed on a top surface of the CPU die.
[0029] In a further aspect, at least one die is disposed on a top surface of the bulk material and is substantially aligned with the first area of the CPU die. The at least one die comprises at least one additional processor core.
[0030] Additional and/or alternative features, aspects, and advantages of embodiments of the present invention will become apparent from the following description, the accompanying drawings, and the appended claims.
Brief Description of the Drawings
[0031] Figure 1 is a schematic diagram of a memory-over-CPU stacking arrangement according to a prior art embodiment;
[0032] Figure 2 is a schematic diagram of a CPU-over-memory stacking arrangement according to a prior art embodiment;
[0033] Figure 3 is a schematic diagram of a CPU chip according to an embodiment;
[0034] Figure 4 is a schematic side elevation view of a memory-over-CPU stacking arrangement according to a first embodiment;
[0035] Figure 5 is a perspective view of the stacking arrangement of Figure 4;
[0036] Figure 6 is an exploded view of the stacking arrangement of Figure 4;
[0037] Figure 7 is a schematic side elevation view of a memory-over-CPU stacking arrangement according to a second embodiment;
[0038] Figure 8 is a schematic side elevation view of a memory-over-CPU stacking arrangement according to a third embodiment; and
[0039] Figure 9 is a schematic side elevation view of a CPU-over-memory stacking arrangement according to a fourth embodiment.
Detailed Description
[0040] Referring generally to Figures 3-6, a multi-chip package (MCP) 100 will be described according to a first embodiment. A CPU chip 102 is mounted on a substrate 104 which connects to external devices (not shown) via a ball grid array 106. It is contemplated that the substrate 104 may alternatively be electrically connectable to external devices using any other suitable form of electrical contacts, such as pins. The CPU chip 102 includes a processor region 108 containing two core processors 110, each with its respective level 1 (L1) cache 112. It is contemplated that the CPU chip 102 may alternatively have a single core processor 110 or more than two core processors 110. The CPU chip 102 also includes a non-core region 114 used as a cache region and containing, among other things, a level 2 (L2) SRAM cache 116 and associated circuitry. It is contemplated that other known types of memory may alternatively be used for the L2 cache 116, or that the non-core region may alternatively contain other logic circuitry used in support of the core processors 110. Each of the processor region 108 and the non-core region 114 may take up approximately half of the area of the CPU chip 102; however, it should be understood that the proportions of either may vary according to the desired performance characteristics of the CPU chip 102. A number of DRAM chips 118 are stacked on the top surface 120 of the CPU chip 102, using any suitable known method for adhering each DRAM chip 118 to adjacent chips. While three or four DRAM chips 118 are shown in various embodiments, it should be understood that any number of DRAM chips 118 may be stacked as needed to achieve the desired storage capacity for a particular MCP 100. The DRAM chips 118 are approximately the size of the non-core region 114 of the CPU chip 102, and are stacked on the non-core region 114 of the CPU chip 102 such that when the DRAM chips 118 are stacked they substantially overlap only the non-core region 114 and substantially do not overlap the processor region 108. As a result, the bottom DRAM chip 118 is in contact only with the relatively cooler non-core region 114 of the CPU chip 102 and not the relatively hotter processor region 108 of the CPU chip 102. In this arrangement, less heat is conducted from the CPU chip 102 to the stack of DRAM chips 118, resulting in reduced temperature and improved performance of the DRAM chips 118, and the ability to stack a greater number of DRAM chips 118 before thermal effects on performance become unacceptable. If increased processor capacity is desired, a die 128 having one or more additional core processors 110 may be stacked on top of the processor region 108 of the CPU chip 102. Stacking at least one die 128 containing additional processors 110 on top of the processor region 108 of the CPU chip 102 may enable the non-core region 114 to occupy a higher proportion of the area of the CPU chip 102, thereby enabling larger DRAM chips 118 to be stacked on the CPU chip 102 without overlapping the processor region 108.
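The placement rule described in this embodiment (the DRAM stack footprint substantially aligned with the non-core region 114 and substantially not overlapping the processor region 108) can be expressed as a simple geometric check. The sketch below is illustrative only; all die dimensions are invented placeholders, not values from this disclosure.

```python
# Minimal sketch of the placement rule described above: the DRAM stack
# footprint should lie over the non-core (cache) region 114 and stay off
# the processor region 108. All dimensions are invented placeholders.

from dataclasses import dataclass

@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Rect") -> bool:
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)

    def contains(self, other: "Rect") -> bool:
        return (other.x >= self.x and other.y >= self.y and
                other.x + other.w <= self.x + self.w and
                other.y + other.h <= self.y + self.h)

cpu_die      = Rect(0.0, 0.0, 12.0, 12.0)   # whole CPU die (mm), assumed size
core_region  = Rect(0.0, 0.0, 12.0, 6.0)    # processor region 108
cache_region = Rect(0.0, 6.0, 12.0, 6.0)    # non-core region 114
dram_stack   = Rect(0.5, 6.5, 11.0, 5.0)    # DRAM chip 118 footprint

assert cache_region.contains(dram_stack)     # aligned with the non-core region
assert not core_region.overlaps(dram_stack)  # no overlap with the hot cores
```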
[0041] A layer of bulk material 122, such as bulk silicon, is disposed on the processor region 108 of the chip 102. The bulk material 122 acts as a spacer to create a more uniformly-shaped package, and may also serve other functions. The thermal conductivity of the bulk material 122 may improve dissipation of the heat generated by the core processors 110 during their operation, and a heat sink 130 (Figure 5) may be disposed on the top surface of the bulk material 122 after a packaging compound 140 has been applied to the entire assembly, to further enhance its heat dissipation properties. If the top surface of the bulk material 122 is approximately coplanar with the top surface of the stack of DRAM chips 118 (as shown in Figure 4), the heat sink 130 may also be disposed on the top surface of the stack of DRAM chips 118.
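A one-dimensional thermal-resistance estimate illustrates why giving the processor region its own path to the heat sink through a bulk-silicon spacer, rather than through a stack of DRAM dies and bonding layers, may help. The thicknesses, footprint, and material conductivities below are assumed round numbers, not values from this disclosure.

```python
# 1-D thermal resistance sketch: core heat exiting through a bulk-silicon
# spacer vs. through a stack of DRAM dies with bonding/adhesive layers.
# All geometry and material values are assumed round numbers.

def r_theta(thickness_m: float, k_w_per_mk: float, area_m2: float) -> float:
    """Conduction resistance of a slab, in K/W."""
    return thickness_m / (k_w_per_mk * area_m2)

AREA = 6e-3 * 12e-3                 # assumed 6 mm x 12 mm processor-region footprint
K_SI, K_ADHESIVE = 150.0, 1.0       # W/(m*K), typical orders of magnitude

# Path A: a 400 um bulk-silicon spacer over the cores (as in Figure 4).
r_spacer = r_theta(400e-6, K_SI, AREA)

# Path B: four thinned DRAM dies (50 um each) plus four 20 um bond layers
# in the way (the memory-over-CPU arrangement of Figure 1).
r_dram_stack = 4 * r_theta(50e-6, K_SI, AREA) + 4 * r_theta(20e-6, K_ADHESIVE, AREA)

print(f"Through bulk-silicon spacer: {r_spacer:.3f} K/W")
print(f"Through DRAM stack + bonds:  {r_dram_stack:.3f} K/W")
# The low-conductivity bond layers dominate, so keeping the DRAM stack off
# the processor region improves the cores' path to the heat sink.
```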
[0042] In this configuration, the CPU chip 102 may communicate with each of the DRAM chips 118 using through-silicon vias (TSVs) 126 (shown in Figure 6) extending from the non-core region 114 of the CPU chip 102 that is positioned directly below the DRAM chips 118, resulting in a short signal path that allows rapid communication between the DRAM chips 118 and the SRAM cache 116. The CPU chip 102 communicates with external devices via the ball grid array 106. In this arrangement, both the core processors 110 and the DRAM chips 118 may be directly cooled via a thermal path to a heat sink without passing through the other. Although this arrangement results in a reduced area for each DRAM chip 118, the improved thermal isolation of the DRAM chips 118 from the core processors 110 enables more DRAM chips 118 to be stacked. As a result, storage capacity may be maintained or increased while maintaining an acceptable operating temperature, which in turn results in improved performance and reliability of the DRAM chips 118.
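To give a feel for the short signal path noted above, the sketch below compares a rough CPU-to-DRAM interconnect length through the TSV stack against a board-level route between side-by-side packages. All lengths and the per-millimetre delay are assumed, illustrative figures, not values from this disclosure.

```python
# Back-of-the-envelope comparison of the CPU-to-DRAM interconnect length
# through the TSV stack (Figure 6) vs. a board-level route between two
# side-by-side packages. All lengths and the per-mm delay are assumptions.

PS_PER_MM = 7.0                     # assumed package/board propagation delay

# TSV path: four stacked dies, ~100 um of TSV plus microbump per level.
tsv_path_mm = 4 * 0.1

# Board path: substrate escape plus PCB trace between adjacent packages.
board_path_mm = 30.0

for name, length in (("TSV stack", tsv_path_mm), ("Board trace", board_path_mm)):
    print(f"{name:11s}: {length:5.1f} mm, ~{length * PS_PER_MM:6.1f} ps flight time")
# Far less wire also means far less capacitance to drive, which is the
# power and signal-integrity benefit noted above.
```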
[0043] Referring now to Figure 7, the MCP 200 according to a second embodiment is similar to the MCP 100 of Figure 3, except that the bulk material 122 has been omitted. Corresponding parts have been given corresponding reference numerals and will not be described again in detail. In this configuration, separate heat sinks 232, 234 may optionally be placed directly on the top surface 124 of the processor region 108 and the top DRAM chip 118, thereby providing improved cooling of both the core processors 110 and the DRAM chips 118 relative to the configurations of Figures 1 and 2.
[0044] Referring now to Figure 8, the MCP 300 according to a third embodiment is similar to the MCP 100 of Figure 3. Corresponding parts have been given corresponding reference numerals and will not be described again in detail. The layer of bulk material 122 is approximately equal in height to the stack of DRAM chips 118, to facilitate packaging of the MCP 300. An additional chip 326, which may be a chip with relatively low thermal sensitivity and relatively low heat generation such as a MEMS or random logic based chip, is stacked on top of the DRAM chips 118 and the bulk material 122. The CPU chip 102 may communicate with the chip 326 via TSVs 126 passing through the bulk material 122, to minimize the TSV overhead of the DRAM chips 118. It is contemplated that multiple chips or other components such as a common heat sink 338 might additionally or alternatively be stacked on top of the DRAM chips 118 and the bulk material 122.
[0045] Referring now to Figure 9, the MCP 400 according to a fourth embodiment is similar to the MCP 100 of Figure 3. Corresponding parts have been given corresponding reference numerals and will not be described again in detail. In this embodiment, the chip 326 is mounted closest to the substrate 104. It is contemplated that multiple chips 326 may be used. The DRAM chips 118 are stacked on top of a portion of the chip 326, and the bulk material 122 is stacked on the remaining area of the chip 326. The CPU chip 102 is mounted on top of the DRAM chips 118 and the bulk material 122 such that the non-core region 114 of the CPU chip 102 substantially overlaps the DRAM chips 118 and the processor region 108 substantially overlaps the bulk material 122. It is contemplated that additional core processors 110 may be stacked above or below the processor region 108 of the CPU chip 102. If the additional core processors 110 are stacked below the processor region 108, the thickness of the bulk material 122 may be reduced accordingly. The CPU chip 102 may communicate with the substrate using TSVs 126 through the bulk material, thereby reducing the TSV overhead of the DRAM chips 118. A heat sink may optionally be mounted on the CPU chip 102 to provide cooling for both the core processors 110 and the DRAM chips 118.
[0046] Modifications and improvements to the above-described embodiments of the present invention may become apparent to those skilled in the art. The foregoing description is intended to be by way of example rather than limiting. The scope of the present invention is therefore intended to be limited solely by the scope of the appended claims.

Claims

1. A multi-chip package comprising:
a substrate having electrical contacts for connection to an external device;
a CPU die disposed on the substrate and being in communication with the substrate; the
CPU die comprising:
a plurality of processor cores occupying a first area of the CPU die; and an SRAM cache occupying a second area of the CPU die; and
a DRAM cache disposed on the CPU die and being in communication with the CPU die, the DRAM cache comprising a plurality of stacked DRAM dies,
the plurality of stacked DRAM dies being substantially aligned with the second area of the CPU die; and
the plurality of stacked DRAM dies substantially not overlapping the first area of the CPU die.
2. The multi-chip package of claim 1, further comprising:
a bulk material disposed on the CPU die and being substantially aligned with the first area of the CPU die.
3. The multi-chip package of claim 2, wherein:
the bulk material has a top surface substantially coplanar to a top surface of the plurality of stacked DRAM dies.
4. The multi-chip package of claim 3, further comprising:
a chip disposed on the top surface of the bulk material and on the top surface of the plurality of stacked DRAM dies, the chip being in communication with the CPU die.
5. The multi-chip package of claim 4, wherein:
the chip and the plurality of DRAM dies are in communication with the CPU die via through-silicon vias (TSVs).
6. The multi-chip package of claim 5, wherein at least some of the TSVs pass through the bulk material.
7. The multi-chip package of claim 1, further comprising a heat sink disposed on a top surface of the plurality of stacked DRAM dies.
8. The multi-chip package of claim 1, further comprising a heat sink disposed on a top surface of the first area of the CPU die.
9. The multi-chip package of claim 2, further comprising a heat sink disposed on a top surface of the bulk material.
10. The multi-chip package of claim 3, further comprising a heat sink disposed on the top surface of the bulk material and on the top surface of the plurality of stacked DRAM dies.
11. The multi-chip package of claim 1, further comprising at least one die disposed on the CPU die and being substantially aligned with the first area of the CPU die, the at least one die comprising at least one additional processor core.
12. A multi-chip package comprising:
a substrate having electrical contacts for connection to an external device;
a DRAM cache disposed on the substrate and being in communication with the CPU die, the DRAM cache comprising a plurality of stacked DRAM dies;
a bulk material disposed on the substrate; and
a CPU die disposed on the DRAM cache and the substrate, the CPU die being in communication with the substrate; the CPU die comprising:
a plurality of processor cores occupying a first area of the CPU die; and an SRAM cache occupying a second area of the CPU die,
the plurality of stacked DRAM dies being substantially aligned with the second area of the CPU die; and
the bulk material being substantially aligned with the first area of the CPU die.
13. The multi-chip package of claim 12, wherein:
the bulk material has a top surface substantially coplanar to a top surface of the plurality of stacked DRAM dies.
14. The multi-chip package of claim 12, wherein:
the substrate and the plurality of DRAM dies are in communication with the CPU die via through-silicon vias (TSVs).
15. The multi-chip package of claim 14, wherein at least some of the TSVs pass through the bulk material.
16. The multi-chip package of claim 12, further comprising a heat sink disposed on a top surface of the CPU die.
17. The multi-chip package of claim 12, further comprising at least one die disposed on a top surface of the bulk material and being substantially aligned with the first area of the CPU die, the at least one die comprising at least one additional processor core.
PCT/CA2012/001086 2011-12-01 2012-11-29 Cpu with stacked memory WO2013078536A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP12854424.4A EP2786409A4 (en) 2011-12-01 2012-11-29 Cpu with stacked memory
CN201280068123.8A CN104094402A (en) 2011-12-01 2012-11-29 CPU with stacked memory
KR1020147018137A KR20140109914A (en) 2011-12-01 2012-11-29 Cpu with stacked memory
JP2014543729A JP2015502663A (en) 2011-12-01 2012-11-29 CPU with stacked memory

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161565709P 2011-12-01 2011-12-01
US61/565,709 2011-12-01

Publications (1)

Publication Number Publication Date
WO2013078536A1 true WO2013078536A1 (en) 2013-06-06

Family

ID=48523861

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2012/001086 WO2013078536A1 (en) 2011-12-01 2012-11-29 Cpu with stacked memory

Country Status (7)

Country Link
US (1) US9158344B2 (en)
EP (1) EP2786409A4 (en)
JP (1) JP2015502663A (en)
KR (1) KR20140109914A (en)
CN (1) CN104094402A (en)
TW (1) TW201347101A (en)
WO (1) WO2013078536A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015015319A (en) * 2013-07-03 2015-01-22 キヤノン株式会社 Integrated circuit device

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9748002B2 (en) * 2013-10-23 2017-08-29 Etron Technology, Inc. System-in-package module with memory
US9287240B2 (en) 2013-12-13 2016-03-15 Micron Technology, Inc. Stacked semiconductor die assemblies with thermal spacers and associated systems and methods
US9607680B2 (en) * 2014-03-04 2017-03-28 Apple Inc. EDRAM/DRAM fabricated capacitors for use in on-chip PMUS and as decoupling capacitors in an integrated EDRAM/DRAM and PMU system
US9443744B2 (en) * 2014-07-14 2016-09-13 Micron Technology, Inc. Stacked semiconductor die assemblies with high efficiency thermal paths and associated methods
KR102307490B1 (en) * 2014-10-27 2021-10-05 삼성전자주식회사 Semiconductor package
KR102413441B1 (en) 2015-11-12 2022-06-28 삼성전자주식회사 Semiconductor package
JP5956708B1 (en) * 2015-11-30 2016-07-27 株式会社PEZY Computing DIE AND PACKAGE, DIE MANUFACTURING METHOD, AND PACKAGE GENERATION METHOD
US10032695B2 (en) 2016-02-19 2018-07-24 Google Llc Powermap optimized thermally aware 3D chip package
CN107615241A (en) 2016-03-31 2018-01-19 慧与发展有限责任合伙企业 Logical operation
JP6822253B2 (en) * 2017-03-22 2021-01-27 富士通株式会社 Electronic devices and their manufacturing methods, electronic components
US10153261B2 (en) * 2017-04-03 2018-12-11 Cisco Technology, Inc. Cooling system for high power application specific integrated circuit with embedded high bandwidth memory
CN111332231B (en) * 2018-06-22 2021-07-20 浙江航芯科技有限公司 Intelligent cabin system for automobile and automobile using same
US11171115B2 (en) * 2019-03-18 2021-11-09 Kepler Computing Inc. Artificial intelligence processor with three-dimensional stacked memory
US11836102B1 (en) 2019-03-20 2023-12-05 Kepler Computing Inc. Low latency and high bandwidth artificial intelligence processor
CN111033728A (en) * 2019-04-15 2020-04-17 长江存储科技有限责任公司 Bonded semiconductor device with programmable logic device and dynamic random access memory and method of forming the same
JP7311615B2 (en) 2019-04-30 2023-07-19 長江存儲科技有限責任公司 Junction semiconductor device with processor and NAND flash memory and method of forming same
CN110870062A (en) 2019-04-30 2020-03-06 长江存储科技有限责任公司 Bonded semiconductor device with programmable logic device and NAND flash memory and method of forming the same
US11844223B1 (en) 2019-05-31 2023-12-12 Kepler Computing Inc. Ferroelectric memory chiplet as unified memory in a multi-dimensional packaging
US11043472B1 (en) 2019-05-31 2021-06-22 Kepler Compute Inc. 3D integrated ultra high-bandwidth memory
WO2022219762A1 (en) * 2021-04-15 2022-10-20 ユニサンティス エレクトロニクス シンガポール プライベート リミテッド Semiconductor device having memory element
US11791233B1 (en) 2021-08-06 2023-10-17 Kepler Computing Inc. Ferroelectric or paraelectric memory and logic chiplet with thermal management in a multi-dimensional packaging

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7215022B2 (en) * 2001-06-21 2007-05-08 Ati Technologies Inc. Multi-die module
US7616470B2 (en) * 2006-06-16 2009-11-10 International Business Machines Corporation Method for achieving very high bandwidth between the levels of a cache hierarchy in 3-dimensional structures, and a 3-dimensional structure resulting therefrom
US8110899B2 (en) * 2006-12-20 2012-02-07 Intel Corporation Method for incorporating existing silicon die into 3D integrated stack
JP5070342B2 (en) 2007-10-23 2012-11-14 ヒューレット−パッカード デベロップメント カンパニー エル.ピー. All-optical high-speed distributed arbitration in computer system equipment
US8014166B2 (en) * 2008-09-06 2011-09-06 Broadpak Corporation Stacking integrated circuits containing serializer and deserializer blocks using through silicon via
US8624626B2 (en) * 2011-11-14 2014-01-07 Taiwan Semiconductor Manufacturing Co., Ltd. 3D IC structure and method

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
KGIL ET AL.: "PicoServer: Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor", ASPLOS XII, PROC. OF THE 12TH INTERNATIONAL CONFERENCE ON ARCHITECTURE, vol. 41, no. 11, 21 October 2006 (2006-10-21), pages 117 - 128, XP003031505 *
LI ET AL.: "Design and Management of 3D Chip Multiprocessors Using Network-in-Memory", PROC. OF THE 33RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, vol. 34, no. 2, May 2006 (2006-05-01), XP010925387 *
LOH ET AL.: "PROCESSOR DESIGN IN 3D DIE-STACKING TECHNOLOGIES", vol. 27, no. 3, June 2007 (2007-06-01), pages 31 - 48, XP011190647 *
LOH: "3D-STACKED MEMORY ARCHITECTURES FOR MULTI-CORE PROCESSORS", PROC. OF THE 35TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, vol. 36, no. 3, June 2008 (2008-06-01), pages 453 - 464, XP031281699 *
LOI ET AL.: "A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy", PROC. OF THE 43RD ANNUAL DESIGN AUTOMATION CONFERENCE, 28 July 2006 (2006-07-28), XP010936649 *
MADAN ET AL.: "Optimizing communication and capacity in a 3D stacked reconfigurable cache hierarchy", IEEE 15TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE, 18 February 2009 (2009-02-18), XP031435384 *
PUTTASWAMY ET AL.: "Implementing caches in a 3D technology for high performance processors", IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN VLSI IN COMPUTERS AND PROCESSORS, 5 October 2005 (2005-10-05), XP010846471 *
See also references of EP2786409A4 *
WOO ET AL.: "An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth", IEEE 16TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE, 14 January 2010 (2010-01-14), XP031640698 *
WOO ET AL.: "Heterogeneous die stacking of SRAM row cache and 3-D DRAM: An empirical design evaluation", IEEE 54TH INTERNATIONAL MIDWEST SYMPOSIUM IN CIRCUITS AND SYSTEMS, 10 August 2011 (2011-08-10), XP031941372 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015015319A (en) * 2013-07-03 2015-01-22 キヤノン株式会社 Integrated circuit device

Also Published As

Publication number Publication date
KR20140109914A (en) 2014-09-16
EP2786409A4 (en) 2015-09-09
CN104094402A (en) 2014-10-08
EP2786409A1 (en) 2014-10-08
TW201347101A (en) 2013-11-16
JP2015502663A (en) 2015-01-22
US20130141858A1 (en) 2013-06-06
US9158344B2 (en) 2015-10-13

Similar Documents

Publication Publication Date Title
US9158344B2 (en) CPU with stacked memory
US11562986B2 (en) Stacked semiconductor die assemblies with partitioned logic and associated systems and methods
US20200350224A1 (en) Stacked semiconductor die assemblies with multiple thermal paths and associated systems and methods
US8710676B2 (en) Stacked structure and stacked method for three-dimensional chip
US7928562B2 (en) Segmentation of a die stack for 3D packaging thermal management
US9287240B2 (en) Stacked semiconductor die assemblies with thermal spacers and associated systems and methods
US9748201B2 (en) Semiconductor packages including an interposer
CN104701287A (en) 3DIC packaging with hot spot thermal management features
TWI778197B (en) Stack packages including bridge dies
TWI810380B (en) System-in-packages including a bridge die
KR102307490B1 (en) Semiconductor package
TWI760518B (en) Semiconductor packages including a heat insulation wall
TW202032731A (en) System-in-packages including a bridge die
KR20210071818A (en) Reconstituted wafer assembly
TWI768118B (en) Semiconductor packages relating to thermal transfer plate and methods of manufacturing the same
US11721680B2 (en) Semiconductor package having a three-dimensional stack structure
TWI808451B (en) Semiconductor device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12854424

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2012854424

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2014543729

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20147018137

Country of ref document: KR

Kind code of ref document: A