CN216118778U - Stacking chip - Google Patents



Publication number
CN216118778U
Authority
CN
China
Prior art keywords
programmable gate
memory
gate array
storage
control unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202122118042.3U
Other languages
Chinese (zh)
Inventor
郭一欣
江喜平
左丰国
王嵩
周骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Ziguang Guoxin Semiconductor Co ltd
Original Assignee
Xian Unilc Semiconductors Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Unilc Semiconductors Co Ltd
Priority to CN202122118042.3U
Publication of CN216118778U

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The utility model provides a stacked chip comprising: a first programmable gate array assembly that includes a first interface module, the first interface module being embedded in the first programmable gate array assembly and including a first bonding lead-out region; and a first storage array assembly provided with a second bonding lead-out region. The first bonding lead-out region and the second bonding lead-out region are bonded to each other so as to connect the interconnection signals of the first programmable gate array assembly and the first storage array assembly. High-bandwidth, low-power storage access is thereby achieved.

Description

Stacked chip
Technical Field
The utility model relates to the field of integrated circuit technology, and more particularly to a stacked chip.
Background
With the rapid growth in the scale of application computing, the bandwidth and power-consumption overhead of storage access have become important factors limiting the development of large-scale computing circuits.
Summary of the Utility Model
The utility model provides a stacked chip that achieves high-bandwidth, low-power storage access.
To solve the above technical problem, the utility model adopts the following technical solution: a stacked chip is provided, comprising: a first programmable gate array assembly that includes a first interface module, the first interface module being embedded in the first programmable gate array assembly and including a first bonding lead-out region; and a first storage array assembly provided with a second bonding lead-out region; wherein the first bonding lead-out region and the second bonding lead-out region are bonded to each other so as to connect the interconnection signals of the first programmable gate array assembly and the first storage array assembly.
The beneficial effects of the utility model are as follows. Unlike the prior art, the stacked chip connects the interconnection signals of the first programmable gate array assembly and the first storage array assembly through the first bonding lead-out region and the second bonding lead-out region, and the first interface module carrying the first bonding lead-out region is embedded in the first programmable gate array assembly. This yields a three-dimensional heterogeneous integrated structure and achieves high-bandwidth, low-power storage access.
Drawings
To illustrate the technical solutions in the embodiments of the utility model more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the utility model; those skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
FIG. 1 is a schematic structural diagram of a first embodiment of the stacked chip of the utility model;
FIG. 2 is a schematic plan view of the first programmable gate array assembly of the utility model;
FIG. 3 is a schematic diagram of the storage access structure from the first programmable gate array assembly of FIG. 1 to the first storage array assembly;
FIG. 4 is a schematic structural diagram of a second embodiment of the stacked chip of the utility model;
FIG. 5 is a schematic diagram of the structure in which the first programmable gate array assembly and the second programmable gate array assembly of FIG. 4 share storage access to the first storage array assembly;
FIG. 6 is a schematic diagram of the structure in which the first programmable gate array assembly and the second programmable gate array assembly of FIG. 4 have independent storage access to the first storage array assembly;
FIG. 7 is a schematic structural diagram of a third embodiment of the stacked chip of the utility model;
FIG. 8 is a schematic diagram of the structure for shared storage access by the first programmable gate array assembly of FIG. 7 to the first storage array assembly and the second storage array assembly;
FIG. 9 is a schematic diagram of the structure for independent storage access by the first programmable gate array assembly of FIG. 7 to the first storage array assembly and the second storage array assembly;
FIG. 10 is a schematic structural diagram of a fourth embodiment of the stacked chip of the utility model;
FIG. 11 is a schematic diagram of the structure for shared storage access by the first programmable gate array assembly of FIG. 10 to the first storage array assembly and the second storage array assembly;
FIG. 12 is a schematic diagram of the structure for independent storage access by the first programmable gate array assembly of FIG. 10 to the first storage array assembly and the second storage array assembly;
FIG. 13 is a schematic structural diagram of a programmable routing network and programmable logic blocks;
FIG. 14 is a schematic diagram of a three-dimensional heterogeneous integrated structure among the functional elements 210, 220, 230.
Detailed Description
The technical solutions in the embodiments of the utility model are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the utility model. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the utility model.
The terms "first", "second" and "third" in the utility model are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first", "second" or "third" may explicitly or implicitly include at least one such feature. In the description of the utility model, "a plurality" means at least two, e.g., two or three, unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, rear, and so on) in the embodiments of the utility model are used only to explain the relative positions and movements of the components in a specific posture (as shown in the drawings); if that posture changes, the directional indicators change accordingly. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements, but may optionally include other steps or elements that are not listed or that are inherent to such a process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the utility model. Appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
FIG. 1 is a schematic structural diagram of a stacked chip according to a first embodiment of the utility model. Specifically, the stacked chip includes a first programmable gate array assembly 1 and a first storage array assembly 2. In the present application, the first programmable gate array assembly 1 and the first storage array assembly 2 are hybrid-bonded and integrated by three-dimensional heterogeneous integration. In three-dimensional heterogeneous integration, the metal layers of two chip assemblies are connected directly across the chips, and the physical and electrical parameters of the connection follow the process characteristics of semiconductor manufacturing. Compared with interconnection through an input/output (I/O) interface and/or an I/O circuit, the interconnection density and speed are greatly improved, and the interconnection is internal to the stacked chip, so that the stacked chip can achieve high bandwidth and low power consumption.
In one embodiment, the first storage array assembly 2 may be a Dynamic Random Access Memory (DRAM). In another embodiment, the first storage array assembly 2 may be a Static Random Access Memory (SRAM). Considering how the technology evolves, the first storage array assembly 2 may also be another type of memory, or a combination of SRAM and other types of memory, such as flash memory (Flash), resistive random access memory (RRAM or ReRAM), magnetoresistive memory (MRAM), ferroelectric memory (FeRAM), oxide resistive memory (OxRAM), conductive-bridging memory (CBRAM), phase change memory (PCM), spin-transfer torque memory (STT-MRAM), or electrically erasable programmable read-only memory (EEPROM); no specific limitation is imposed. Each type of memory has its own advantages and may require a memory controller as its storage access interface. The memory controller implements functions such as the physical interface, data read/write, data buffering, data prefetching, data refreshing, and data block remapping, and is likewise not specifically limited.
Specifically, as shown in FIG. 1, the first programmable gate array assembly 1 includes a first interface module 11, which is embedded in the first programmable gate array assembly 1. The first interface module 11 includes a first bonding lead-out region 111. The first storage array assembly 2 is provided with a second bonding lead-out region 21. The first bonding lead-out region 111 and the second bonding lead-out region 21 are bonded together through a three-dimensional heterogeneous integration bonding structure, so that the first programmable gate array assembly 1 and the first storage array assembly 2 are integrated three-dimensionally and heterogeneously, and the stacked chip forms a high-bandwidth, low-power programmable static storage-and-computing integrated structure. Three-dimensional heterogeneous integrated bonding greatly increases the interconnection density between the first programmable gate array assembly 1 and the first interface module 11, and in turn the interconnection density between the first programmable gate array assembly 1 and the first storage array assembly 2; it also reduces the distributed (parasitic) parameters of the interconnection, raises the interconnection bandwidth, and lowers the interconnection power consumption.
Specifically, the first programmable gate array assembly 1 includes a plurality of functional modules 13, and the first interface module 11 is located between the functional modules 13. An interface routing unit 137 is disposed on the side of the first interface module 11 that faces the functional modules 13, and connects the functional modules 13 with the first interface module 11. The functional modules 13 are connected to the interface routing unit 137 through internal metal layers, and the first interface module 11 is likewise connected to the interface routing unit 137 through internal metal layers. In one embodiment there is a single first interface module 11. In another embodiment there are at least two first interface modules 11, each inserted between the functional modules 13 and connected to them through an interface routing unit 137. FIG. 1 shows only one first interface module 11; the number is not limited and is set according to requirements.
In an embodiment, FIG. 2 is a schematic plan view of the first programmable gate array assembly 1. As shown in FIG. 2, the functional modules 13 include a programmable logic block (Logic Array Block, LAB, or Configurable Logic Block, CLB) 133, a storage block (Block Random Access Memory, BRAM) 134, a multiplication unit (Digital Signal Processor, DSP) 135, and a multiply-accumulate unit (MAC) 138. Note that the multiplication unit 135 is not a digital signal processor chip but an embedded programmable multiplication unit. In a specific embodiment, the functional modules 13 may be configured as required; this is not limited in the present application.
In this embodiment, the first bonding lead-out region 111 is a three-dimensional heterogeneous integrated interconnection resource of the first programmable gate array assembly 1. Through the first bonding lead-out region 111, the first programmable gate array assembly 1 is bonded directly to the second bonding lead-out region 21 of the first storage array assembly 2, so that the metal layers of the two assemblies are interconnected directly, with high density and low distributed parameters, to provide storage access. This avoids interconnecting the first programmable gate array assembly 1 and the first storage array assembly 2 through an IO interface and IO interface circuit, and thereby achieves high bandwidth and low power consumption.
In an embodiment, the first programmable gate array assembly 1 further comprises a programmable routing network. The functional modules 13 are interconnected with the programmable routing network through internal metal layers and are connected to the interface routing unit 137 through the programmable routing network. Specifically, the programmable routing network uses the internal metal layers of the first programmable gate array assembly 1 to establish, in a programmable manner, the interconnection and data exchange of all resources inside the first programmable gate array assembly 1; through it, the functional modules 13 establish wide, reconfigurable data interconnections among themselves and with storage. As shown in FIG. 2, the programmable routing network is connected to the storage routing unit 136, and the storage blocks BRAM 134 are interconnected with the storage routing unit 136 and thereby connected to the programmable routing network, so that all functional modules 13 in the first programmable gate array assembly 1 can access all storage blocks BRAM 134 through the storage routing unit 136 (this part is prior art). The programmable routing network is also connected to the interface routing unit 137, and the first storage array assembly 2 is interconnected with the interface routing unit 137 through the first interface module 11 and thereby connected to the programmable routing network, so that all functional modules 13 in the first programmable gate array assembly 1 can access all storage arrays on the first storage array assembly 2 through the interface routing unit 137.
Specifically, all functional modules 13 on the first programmable gate array assembly 1 are connected to the interface routing unit 137 through the programmable routing network, and the interface routing unit 137 is connected to the three-dimensional heterogeneous integrated bonding structure of the first interface module 11, which establishes storage access from the functional modules 13 to all storage arrays on the first storage array assembly 2. Because the programmable routing network is distributed widely over the first programmable gate array assembly 1 and is programmable, every functional module 13, whether near to or far from the first interface module 11, can establish a high-density on-chip metal-layer interconnection with the interface routing unit 137 through it. The first interface module 11 and the first storage array assembly 2 are interconnected directly across chips, with high density and low distributed parameters, through the first bonding lead-out region 111 and the second bonding lead-out region 21. This avoids the low interconnection density, low interconnection speed, and high interconnection power consumption of an IO interface and IO interface circuit, and gives all functional modules 13 high-bandwidth, low-power storage access to all storage arrays on the first storage array assembly 2.
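The access paths described above can be pictured as a small connectivity graph. The following Python sketch is only an illustration: the node names are hypothetical labels for the numbered elements of the first embodiment, and the breadth-first search stands in for whatever route the programmable routing network is actually configured to provide.

```python
from collections import deque

# Hypothetical connectivity model of the first embodiment. Each key is an
# illustrative label for a numbered element; each list holds the elements
# it connects to on the way toward storage.
LINKS = {
    "functional_module_13": ["programmable_routing_network"],
    "programmable_routing_network": [
        "storage_routing_unit_136",
        "interface_routing_unit_137",
    ],
    "storage_routing_unit_136": ["bram_134"],
    "interface_routing_unit_137": ["first_interface_module_11"],
    "first_interface_module_11": ["first_bonding_lead_out_region_111"],
    "first_bonding_lead_out_region_111": ["second_bonding_lead_out_region_21"],
    "second_bonding_lead_out_region_21": ["first_storage_array_assembly_2"],
}


def access_path(source, target, links=LINKS):
    """Return the shortest hop sequence from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# On-chip BRAM access (prior art) versus cross-chip storage array access.
bram_path = access_path("functional_module_13", "bram_134")
stacked_path = access_path("functional_module_13", "first_storage_array_assembly_2")
```

In this model both paths leave the functional module through the same programmable routing network; the cross-chip path differs only in that it passes through the interface routing unit 137, the first interface module 11, and the two bonded lead-out regions rather than through the storage routing unit 136.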
It can be understood that the storage blocks BRAM on a programmable gate array assembly are connected to the programmable routing network through the storage routing unit and provide high-bandwidth storage resources to the functional modules. However, limited by the area of the programmable gate array assembly, the capacity of the storage blocks BRAM is usually only tens of thousands to millions of bits, which cannot meet the needs of typical applications. In the prior art, large-capacity storage is added outside the programmable gate array assembly by connecting its IO to an external memory, and the storage blocks BRAM inside the assembly usually serve as a cache for that external large-capacity storage. With this approach, the external storage access bandwidth is far lower than the internal bandwidth, and the storage access power consumption is higher. The present application overcomes these defects: the interface routing unit 137 and the first interface module 11 are designed by analogy with the interconnection and storage access structure between the functional modules and the storage blocks BRAM, so that all functional modules 13 can establish high-density on-chip metal-layer interconnections with the interface routing unit 137 through the programmable routing network, and can then be interconnected with the first interface module 11 through the interface routing unit 137.
The first interface module 11 is connected with the first storage array assembly 2 by three-dimensional heterogeneous integration. In other words, in the present application the first programmable gate array assembly 1 and the first storage array assembly 2 establish high-density metal-layer interconnection between chips, whose physical and electrical parameters follow the characteristics of the semiconductor manufacturing process. The interconnection therefore inherits the high-density, high-speed bandwidth advantages and the low-power advantages of the on-chip interconnection between the storage blocks BRAM 134 and the functional modules 13 through the storage routing unit 136, while allowing the storage capacity to be expanded almost without limit. As shown in FIG. 2, the programmable logic blocks LAB/CLB 133, the storage blocks BRAM 134, the multiplication units DSP 135, the multiply-accumulate units MAC 138, and the other functional modules 13 are laid out in strips, as is the storage routing unit 136. These strips are repeated and combined in the first programmable gate array assembly 1 in any manner required, as shown in FIG. 2, and are interconnected programmably through the programmable routing network; the specific combination is not limited in the present application. In this embodiment, the first interface module 11 is shaped to match the functional modules 13 and is likewise laid out as a strip so that it can be embedded between them. Its size is based on that of the functional modules 13, and it extends, and so expands its capacity, along the length direction of the strips.
In one embodiment, the interface routing unit 137 is also shaped to match the functional modules 13 and laid out as a strip so that it can be embedded between them. Its size is based on that of the functional modules 13, and it extends along the first interface module 11 in the length direction of the strip to support the capacity expansion of the first interface module 11. A large-capacity storage access interconnection can thus be formed between the functional modules 13 and the first storage array assembly 2, with an interconnection density far higher than that achieved when the FPGA's internal IO circuits and/or external IO interface connect to external large-capacity storage, giving the stacked chip high-bandwidth, low-power storage access.
In the stacked chip of this embodiment, the interface routing unit 137 allows the bus width to be greatly increased. The interface routing unit 137 is connected directly to the three-dimensional heterogeneous integrated bonding structure and, through that interconnection structure, to the first storage array assembly 2, which gives access to the large-capacity storage array.
In this embodiment, the first interface module 11 is disposed on the first programmable gate array assembly 1 to provide storage access to the first storage array assembly 2. This differs from the conventional technique, in which the first programmable gate array assembly 1 is connected to large-capacity external storage through internal IO circuits and an external IO interface. The stacked chip of this embodiment saves the IO resources of the first programmable gate array assembly 1, provides an external-storage interconnection density far higher than is possible through IO, raises the storage access bandwidth, and lowers the storage access power consumption.
In an embodiment, a global bus, such as a NoC, AXI, or AHB bus, may also be provided on the first programmable gate array assembly 1 to allow programmable logic on the first programmable gate array assembly 1 to access storage across regions. The global bus may be disposed near the first interface module 11 or at another location relevant to storage access; this is not specifically limited.
In an embodiment, as shown in FIG. 2, an application-specific integrated circuit (ASIC) array unit 139 may also be disposed in the first programmable gate array assembly 1. The ASIC array unit 139 includes hard-core operation/processing units (Processing Elements) implemented as application-specific integrated circuits, such as any combination of one or more of a multiply-add computation array, a multiplication array, a systolic processor array, a hash computation array, various encoder arrays, a machine-learning dedicated layer array, a retrieval function array, an image/video processing array, a CPU, and an MCU. Like the functional modules 13, the ASIC array unit 139 is laid out as a strip in the first programmable gate array assembly 1 so that it can be embedded between the functional modules 13; its size follows that of the functional modules 13 and it extends, and so expands its capacity, along the length direction of the strips. It is widely interconnected through the programmable routing network and serves as a hard-core operation/processing extension circuit for the functional modules 13. The ASIC array unit 139 has limited or no programmability and is used to accelerate computation/processing for specific requirements. Its computation/processing density is much higher than that of the arbitrarily programmable functional modules 13, so it significantly increases the computation/processing density of the stacked chip.
In an embodiment, when a specific application places high demands on the ASIC array unit 139, the ASIC array unit 139 can be extended across chips in the same way that the first storage array assembly 2 extends the large-capacity storage of the first programmable gate array assembly 1 across chips: (1) the ASIC array unit 139 is designed to include hard-core operation/processing units implemented as application-specific integrated circuits, such as any combination of one or more of a multiply-add computation array, a systolic processor array, a hash computation array, various encoder arrays, a machine-learning dedicated layer array, a retrieval function array, an image/video processing array, a CPU, and an MCU; (2) an operation/processing interface module is designed on the first programmable gate array assembly 1 and establishes high-density cross-chip interconnection with the operation/processing units in the ASIC array unit 139 through three-dimensional heterogeneous integration; (3) an operation/processing interface routing unit is designed on the first programmable gate array assembly 1 and establishes high-density on-chip metal-layer interconnection between the programmable routing network and the operation/processing interface module. In this way, the functional modules 13 on the first programmable gate array assembly 1 schedule the operation/processing units in the ASIC array unit 139, and the computation inputs and results of those units are mapped, through storage access based on high-density three-dimensional heterogeneous integration, onto the large-capacity storage array on the first storage array assembly 2.
In one embodiment, the stacked chip further comprises a storage control unit 113, which controls the storage access of the first programmable gate array assembly 1 to the first storage array assembly 2. Specifically, the storage control unit 113 may be disposed on the first interface module 11, near the first interface module 11 on the first programmable gate array assembly 1, or on the first storage array assembly 2. The stacked chip of this embodiment avoids interconnection through a physical IO interface, which saves IO resources, provides an interconnection density far higher than that of an IO interface, raises the storage access bandwidth, and lowers the storage access power consumption. Signals inside the first programmable gate array assembly 1 are thus interconnected with the first storage array assembly 2 at high density and over a short distance.
In one preferred embodiment, the storage control unit 113 is disposed on the first interface module 11. This benefits the data flow, since every access from the programmable gate array assembly to the storage array assembly has to pass through the first interface module 11. More generally, placing the storage control unit 113 on the first programmable gate array assembly 1 yields higher density and speed, because the process performance of the programmable gate array assembly is better than that of the storage array assembly. In another preferred embodiment, the storage control unit 113 is disposed near the first interface module 11. This also inherits the process performance of the programmable gate array assembly, and it additionally reduces the area of the first interface module 11 and the area overhead of the three-dimensional heterogeneous integrated interconnection region; moreover, the storage control unit 113 can be combined with the programmable functional modules 13 so that some of its functions and/or parameters become programmable. In yet another preferred embodiment, the storage control unit 113 is disposed on the storage array assembly. Because the storage array assembly's process costs less per unit area than that of the programmable gate array assembly, this lowers the implementation cost, and it relatively increases the density of the programmable gate array assembly.
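The placement options above trade different benefits against one another. As a reading aid only, the following Python sketch tabulates them; the enum members and benefit tags are hypothetical shorthand for the advantages stated in the text, not terms used by the utility model.

```python
from enum import Enum


class Placement(Enum):
    """Candidate locations for the storage control unit 113 (illustrative names)."""
    ON_INTERFACE_MODULE = "on the first interface module 11"
    NEAR_INTERFACE_MODULE = "near the first interface module 11, on assembly 1"
    ON_STORAGE_ARRAY = "on the first storage array assembly 2"


# Benefit tags are shorthand for the advantages the description attributes
# to each placement. Both placements on assembly 1 share its process speed.
TRADEOFFS = {
    Placement.ON_INTERFACE_MODULE: {"on_data_path", "logic_process_speed"},
    Placement.NEAR_INTERFACE_MODULE: {
        "logic_process_speed",
        "smaller_interface_area",
        "partly_programmable",
    },
    Placement.ON_STORAGE_ARRAY: {"lower_cost", "frees_logic_area"},
}


def placements_with(benefit):
    """Return the names of all placements that offer the given benefit, sorted."""
    return sorted(p.name for p, tags in TRADEOFFS.items() if benefit in tags)


fast_options = placements_with("logic_process_speed")
cheap_options = placements_with("lower_cost")
```

Querying by benefit makes the overlap visible: the speed advantage is shared by both placements on the programmable gate array assembly, while the cost advantage belongs only to placement on the storage array assembly.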
In one embodiment, the stacked chip further comprises a physical layer 114, which performs level conversion on the three-dimensional heterogeneous integrated interconnection between the first programmable gate array assembly 1 and the first storage array assembly 2 when the core voltages of the two assemblies differ. In one embodiment, as shown in FIG. 1, the physical layer 114 is disposed on the first interface module 11. In another embodiment, the physical layer 114 is designed on the first programmable gate array assembly 1, typically on or near the first interface module 11, so that it inherits the process performance of the first programmable gate array assembly 1 and achieves higher density and speed. Alternatively, the physical layer 114 may be designed on the first storage array assembly 2, typically on or near the vertical projection area of the first interface module 11, which saves area on the first programmable gate array assembly 1 and increases its computation/processing density.
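The condition under which the physical layer 114 has to convert levels can be stated as a simple check. The sketch below is a hypothetical illustration: the function names, the tolerance parameter, and the example voltages are assumptions, since the description gives no voltage values.

```python
def needs_level_conversion(v_core_logic, v_core_storage, tolerance=0.0):
    """True when the two core voltages differ by more than the tolerance (volts)."""
    return abs(v_core_logic - v_core_storage) > tolerance


def conversion_direction(v_core_logic, v_core_storage):
    """Describe which way a signal leaving the logic assembly must be shifted."""
    if v_core_logic < v_core_storage:
        return "shift up toward storage"
    if v_core_logic > v_core_storage:
        return "shift down toward storage"
    return "none"


# Hypothetical core voltages: 0.8 V logic, 1.1 V storage.
required = needs_level_conversion(0.8, 1.1)
direction = conversion_direction(0.8, 1.1)
```

When the two assemblies happen to share a core voltage, the check returns False, which corresponds to the case in which the bonded interconnection needs no level conversion.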
In the present application, the physical and electrical parameters of the cross-chip three-dimensional heterogeneous integrated interconnection between the first programmable gate array component 1 and the first memory array component 2 follow the characteristics of the semiconductor process. Compared with a traditional PCB or 2.5D package, the number of interconnections (and thus the memory access bandwidth) between the first programmable gate array component 1 and the first memory array component 2 is increased by 2 to 4 orders of magnitude. Also compared with a traditional PCB or 2.5D package, the first programmable gate array component 1 and the first memory array component 2 are interconnected directly, without an IO interface and/or IO circuit; the interconnection distance is shorter and the distributed parameters of the interconnection are lower (in particular, the distributed capacitance of the interconnection line to the reference ground), so the power consumption overhead of memory access is significantly reduced. The first programmable gate array component 1 and the first memory array component 2 form a near-memory access architecture, which gives the functional module 13 on the first programmable gate array component 1 near-memory access and avoids the access conflicts and efficiency loss of a traditional shared bus. The IO overhead of interconnecting the first programmable gate array assembly 1 with an external mass storage device, as in the conventional art, is also saved.
In an embodiment of the present application, as shown in fig. 3, the case in which the storage control unit is disposed on the first interface module is taken as an example. Specifically, the storage control unit H21 is disposed on the first interface module H17. The first memory array module 2 includes a memory cell G13, and a second bond lead-out region G14 is provided on the memory cell G13. The storage control unit H21 is connected to the first bond lead-out region H19, and the first bond lead-out region H19 is connected to the second bond lead-out region G14 on the first memory array module 2.
Further, the first programmable gate array assembly 1 is provided with a programmable logic unit K23, and the programmable logic unit K23 is connected with the storage control unit H21 through an interface routing unit H22. The programmable logic unit K23 derives logic signals, and the storage control unit H21 controls the first programmable gate array assembly 1 to perform storage access to the first storage array assembly 2 based on the logic signals.
In this application, the number and positions of the first programmable gate array component 1 and the first memory array component 2 may be set as required. Fig. 4 is a schematic structural diagram of a stacked chip according to a second embodiment of the present invention. It differs from the first embodiment shown in fig. 1 in that the stacked chip of the present embodiment further includes a second programmable gate array component 3. The second programmable gate array assembly 3 is disposed on a side of the first programmable gate array assembly 1 away from the first memory array assembly 2. Specifically, the second programmable gate array assembly 3 includes a second interface module 31, and the second interface module 31 includes a third bond lead-out area 32. In this embodiment, the first interface module 11 further includes a fourth bonding lead-out area 12, and the third bonding lead-out area 32 is bonded to the fourth bonding lead-out area 12 so as to bond the second programmable gate array assembly 3 and the first programmable gate array assembly 1 together.
The stacked chip of this embodiment is provided with two layers of programmable gate array components, that is, the second programmable gate array component 3 and the first programmable gate array component 1, and the second programmable gate array component 3 and the first programmable gate array component 1 are connected in a bonding manner through the third bonding lead-out area 32 and the fourth bonding lead-out area 12. In this embodiment, the third bonding lead-out area 32 is a three-dimensional heterogeneous interconnection resource of the second programmable gate array component 3, that is, the second programmable gate array component 3 is directly connected to the first interface module 11 through the interconnection resource, and further interconnected with the first storage array component 2 through the interconnection resource (the first bonding lead-out area 111) in the first programmable gate array component 1, so as to implement storage access, and avoid interconnection with the first storage array component 2 by using an IO interface of the second programmable gate array component 3, and further achieve the purposes of high bandwidth and low power consumption.
In the stacked chip, adjacent components are interconnected through three-dimensional heterogeneous integration, and high-density metal layer interconnections inside the chip are built layer by layer. Because the components are designed for stacking and packaged in the same stacked chip, the functions that an IO circuit provides in the prior art are not needed, such as driving, external level boosting (on output), external level reduction (on input), a tri-state controller, electrostatic discharge (ESD) protection, and surge protection circuits. Cross-component high-density metal layer interconnections are built directly, without passing through a prior-art IO interface and/or IO circuit. This reduces the use of the IO structures of the programmable gate array assembly and increases the interconnection density and interconnection speed between the programmable gate array assembly and the storage array assembly. Since the three-dimensional heterogeneous integrated interconnection bypasses the traditional IO structure and the interconnection distance is short, the communication power consumption between chips is also reduced. As a result, the integration level of the stacked chip and the interconnection frequency between the programmable gate array component and the memory array component are improved, and the interconnection power consumption is reduced. The programmable routing network that widely interconnects the programmable resources on the programmable gate array component thus extends across chips to the large-capacity storage array on the storage chip and forms wide interconnections there, giving the programmable resources high-bandwidth, programmable, three-dimensional heterogeneous integrated storage access to the large-capacity storage array on the storage chip.
The multilayer chip combines the large capacity of an external memory with the key advantages of large bit width and high bandwidth, similar to those a programmable gate array component obtains when it is interconnected with a memory block BRAM (which has a small capacity in the prior art) through a programmable routing network. This fundamentally breaks through the bottlenecks of IO quantity, memory access bandwidth, and memory access power consumption that arise in the prior art when a programmable gate array chip is expanded with large-scale memory.
Compared with the first embodiment shown in fig. 1, the stacked chip of the present embodiment can further improve the computation density, and is beneficial to more complex reconfigurable computation. With the stacked chip of the present embodiment, more programmable gate array components can be set according to requirements, so as to increase the density of the programmable gate array components in the stacked chip.
It should be noted that the second programmable gate array assembly 3 may also be different from the first programmable gate array assembly 1, and different functional modules may be arranged according to actual needs. For example, in an embodiment, the functional modules of the first programmable gate array assembly 1 comprise programmable functional modules including, but not limited to, any combination of programmable logic blocks LAB/CLB, memory blocks BRAM, multiplication units DSP and multiply-accumulate units MAC; the functional modules of the second programmable gate array component 3 may partially/completely include an application specific integrated circuit array unit, which includes but is not limited to one or more arbitrary combinations of a multiply-add calculation array, a systolic processor array, a hash calculation array, various encoder arrays, a machine learning dedicated layer array, a search function array, an image/video processing array, and a CPU and MCU.
In the present embodiment, the first programmable gate array assembly 1 and the second programmable gate array assembly 3 share the same memory control unit 113 to access the same memory cell of the first memory array assembly 2. Specifically, in this embodiment, the storage control unit 113 may be disposed on or near the first interface module 11; the storage control unit 113 may also be provided on or near the second interface module 31; alternatively, the storage control unit 113 may also be provided on the first storage array assembly 2.
Specifically, in an embodiment, the first programmable gate array assembly 1 further includes a first programmable logic unit. The first programmable logic unit is connected to the storage control unit 113 and derives a first logic signal. The second programmable gate array assembly 3 further includes a second programmable logic unit. The second programmable logic unit is connected to the storage control unit 113 and derives a second logic signal. Based on the first logic signal and the second logic signal, the storage control unit 113 selects either the first programmable gate array component 1 or the second programmable gate array component 3 to access the first memory array component 2.
Specifically, as shown in fig. 5, the case in which the memory control unit H21 is disposed on the first interface module H17 is taken as an example. The first memory array assembly 2 includes a memory cell G13, a second bond lead-out region G14 is disposed on the memory cell G13, a first bond lead-out region H19 is disposed on the first interface module H17, and the first bond lead-out region H19 is bonded to the second bond lead-out region G14. The memory control unit H21 is provided on the first interface module H17 and is connected to the first bond lead-out region H19. The first interface module H17 is further provided with a fourth bonding lead-out region H24, which is connected to the memory control unit H21. The second interface module I27 is provided with a third bonding lead-out area I28, which is connected with the fourth bonding lead-out region H24. Further, in this embodiment, the first programmable gate array assembly 1 includes a first programmable logic unit H23, which is connected to the memory control unit H21. The second programmable gate array assembly 3 includes a second programmable logic unit I32, which is connected to the third bond lead-out area I28.
For example, in an embodiment, when the first programmable gate array assembly 1 needs to access the first memory array assembly 2, the first programmable logic unit H23 outputs the first logic signal to the memory control unit H21, and at this time, the memory control unit H21 controls the first programmable gate array assembly 1 to access the memory cell G13 on the first memory array assembly 2 through the first bonding lead-out region H19 and the second bonding lead-out region G14 based on the first logic signal. When the second programmable gate array assembly 3 needs to access the first memory array assembly 2, the second programmable logic unit I32 outputs a second logic signal to the memory control unit H21. At this time, the memory control unit H21 controls the second programmable gate array assembly 3 to access the memory cell G13 on the first memory array assembly 2 through the third bond lead-out region I28 and the fourth bond lead-out region H24 based on the second logic signal. Therefore, the memory control unit selects the first programmable gate array component 1 to access the first memory array component 2 or the second programmable gate array component 3 to access the first memory array component 2 based on the first logic signal and the second logic signal.
In this embodiment, only one storage controller unit H21 is designed, and the storage controller unit H21 may be located on or near the first interface module H17, on or near the second interface module I27, or on the first storage array assembly 2, which is not limited specifically. The memory cell G13 on the first memory array assembly 2 is connected to the memory control unit H21 through the second bond lead-out region G14 and the first bond lead-out region H19, and the memory control unit H21 can be directly connected to two sets of memory access interfaces (e.g., H19 and H24 in fig. 5), through which the multiple sets of programmable gate array assemblies share the memory access of the memory cell G13.
In one embodiment, the first programmable logic unit H23 and the second programmable logic unit I32 each include any combination of programmable logic blocks, memory blocks, multiplication units, multiply-accumulate units, hard core operation/processing units, and the like. The first programmable logic unit H23 derives a first logic signal and the second programmable logic unit I32 derives a second logic signal. According to the first logic signal and the second logic signal, the memory control unit H21 switches its memory access interface toward either the bond between the first bonding lead-out area H19 and the second bonding lead-out area G14 or the bond between the fourth bonding lead-out area H24 and the third bonding lead-out area I28. The first programmable logic unit H23 and the second programmable logic unit I32 thus use the interface in a time-sharing manner, which achieves shared memory access.
In this embodiment, the third bond lead-out area I28 is connected to the interface routing unit I30. And the interface routing unit I30 connects the second programmable logic unit I32 to the fourth bonded lead out region H24.
In this embodiment, one memory control unit H21 is shared, and the occupied area is small.
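The shared-controller behaviour described for fig. 5 can be illustrated with a small software model. The following is only a sketch under stated assumptions: the class name `SharedMemoryControlUnit`, the requester labels, and the round-robin tie-break used when both logic signals are asserted in the same cycle are all hypothetical; the application itself only states that the memory control unit selects one assembly based on the two logic signals.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical labels for the two requesters (not reference numerals from the figures).
REQ_FIRST = "first_pga"    # first programmable gate array assembly 1
REQ_SECOND = "second_pga"  # second programmable gate array assembly 3

Request = Tuple[int, int]  # (address, value) write request


@dataclass
class SharedMemoryControlUnit:
    """Toy model of one memory control unit shared by two requesters."""
    memory: Dict[int, int] = field(default_factory=dict)
    last_granted: str = REQ_SECOND  # so the first requester wins the first tie
    log: List[Tuple[int, str]] = field(default_factory=list)
    cycle: int = 0

    def select(self, first_signal: bool, second_signal: bool) -> Optional[str]:
        """Choose the requester for this cycle from the two logic signals."""
        if first_signal and second_signal:
            # Assumption: alternate on ties so neither requester starves.
            return REQ_SECOND if self.last_granted == REQ_FIRST else REQ_FIRST
        if first_signal:
            return REQ_FIRST
        if second_signal:
            return REQ_SECOND
        return None

    def step(self, first_req: Optional[Request] = None,
             second_req: Optional[Request] = None) -> Optional[str]:
        """Advance one cycle and serve at most one request (time sharing)."""
        granted = self.select(first_req is not None, second_req is not None)
        self.cycle += 1
        if granted is None:
            return None
        addr, value = first_req if granted == REQ_FIRST else second_req
        self.memory[addr] = value
        self.last_granted = granted
        self.log.append((self.cycle, granted))
        return granted
```

In this model a requester that is not granted simply presents its request again in the next cycle, which mirrors the time-sharing use of the single memory access interface.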
In another embodiment, the first programmable gate array assembly 1 and the second programmable gate array assembly 3 access different memory cells of the first memory array assembly 2 respectively by using independent memory control units. Specifically, the stacked chip includes a first memory control unit and a second memory control unit, the first programmable gate array assembly 1 accesses the memory cells of the first memory array assembly 2 by using the first memory control unit, and the second programmable gate array assembly 3 accesses the memory cells of the first memory array assembly 2 by using the second memory control unit.
In this embodiment, the second storage control unit is disposed on or near the second interface module 31, and the first storage control unit is disposed on or near the first interface module 11. In this embodiment, the first programmable gate array assembly 1 further includes: the first programmable logic unit is connected with the first storage control unit and leads out a first logic signal; the second programmable gate array assembly 3 further comprises: and the second programmable logic unit is connected with the second storage control unit and leads out a second logic signal.
When the first storage control unit and the second storage control unit both control all the storage units of the first storage array component 2, and the first programmable gate array component 1 and the second programmable gate array component 3 access the same storage unit at the same time, the first storage control unit controls the first programmable gate array component 1 to access the storage unit at a first time based on the first logic signal, and the second storage control unit controls the second programmable gate array component 3 to access the storage unit at a second time based on the second logic signal. When the first storage control unit and the second storage control unit respectively control different storage units of the first storage array component 2, they simultaneously control the first programmable gate array component 1 and the second programmable gate array component 3 to access those different storage units.
Specifically, in this embodiment, if the first memory control unit and the second memory control unit both control all the memory cells of the first memory array assembly 2, and if the first programmable gate array assembly 1 and the second programmable gate array assembly 3 access the same memory cell at the same time, the first memory control unit and the second memory control unit respectively control the first programmable gate array assembly 1 and the second programmable gate array assembly 3 to access the memory cell. Specifically, the first memory control unit controls the first programmable gate array component 1 to access the memory unit at a first time based on the first logic signal, and the second memory control unit controls the second programmable gate array component 3 to access the memory unit at a second time based on the second logic signal, so that time-sharing access of different programmable gate arrays to the same memory unit is realized, that is, access conflict is eliminated.
In particular, the first programmable gate array assembly 1 may comprise arbitration logic for the memory cells, which selects, based on the first logic signal and the second logic signal, whether a memory cell is accessed by the first memory control unit or the second memory control unit. When the first memory control unit of the first programmable gate array assembly 1 and the second memory control unit of the second programmable gate array assembly 3 simultaneously access the same area of the same memory cell of the first memory array assembly 2, the arbitration logic grants access to the first memory control unit or the second memory control unit in a time-sharing manner based on the first logic signal and the second logic signal. The arbitration logic may alternatively be provided on the first memory array assembly 2 or the second programmable gate array assembly 3. That is, the arbitration logic selects between the first programmable gate array component 1 and the second programmable gate array component 3 so that they access the first memory array component 2 in a time-sharing manner.
In another embodiment, when the first memory control unit and the second memory control unit respectively control different memory cells of the first memory array assembly, the first memory control unit and the second memory control unit simultaneously control the first programmable gate array assembly 1 and the second programmable gate array assembly 3 to access different memory cells of the first memory array assembly 2.
Specifically, when the first memory control unit of the first programmable gate array assembly 1 and the second memory control unit of the second programmable gate array assembly 3 simultaneously access different memory cells of the first memory array assembly 2, the memory control units are independent of each other. The arbitration logic for the memory cells in the first programmable gate array assembly 1 can therefore, based on the first logic signal and the second logic signal, establish access by both memory control units to the memory cells of the first memory array assembly 2 at the same time.
In the embodiment, each logic component is provided with an independent storage access interface, the access bandwidth is highest, and the logic components can be accessed simultaneously when the specific units accessing the storage array are different; when the specific units are the same, conflict occurs, and arbitration and time-sharing access are needed. Specifically, when the first memory control unit and the second memory control unit both control all the memory cells of the first memory array module 2, if the same memory cell is accessed at the same time, time-sharing access is required. When the storage units controlled by the first storage control unit and the second storage control unit are different, time-sharing access is not needed.
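The rule in this paragraph (simultaneous access when the targeted memory cells differ, arbitration and time-sharing when they coincide) can be sketched as a small scheduling function. This is a minimal illustration only: the function name `arbitrate`, the controller labels, and the choice to serve the first memory control unit before the second on a conflict are assumptions and are not specified by the application.

```python
from typing import List, Optional, Tuple

Grant = Tuple[str, int]  # (controller label, memory cell index)


def arbitrate(first_cell: Optional[int],
              second_cell: Optional[int]) -> List[List[Grant]]:
    """Return a list of time slots; each slot lists the grants served together."""
    requests = [("first", first_cell), ("second", second_cell)]
    active = [(name, cell) for name, cell in requests if cell is not None]
    if len(active) == 2 and active[0][1] == active[1][1]:
        # Conflict on the same cell: serialize into two time slots.
        # Serving "first" before "second" is an arbitrary choice for this sketch.
        return [[active[0]], [active[1]]]
    # No conflict: all active requests share a single time slot.
    return [active] if active else []
```

For example, `arbitrate(3, 7)` yields one slot holding both grants, whereas `arbitrate(5, 5)` yields two consecutive slots.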
In this embodiment, the second storage control unit is disposed on or near the second interface module 31, and the first storage control unit is disposed on or near the first interface module 11. The first memory control unit controls the first programmable gate array assembly 1 to access a part of the memory cells of the first memory array assembly 2 based on the first logic signal, and the second storage control unit controls the second programmable gate array component 3 to access the remaining storage units of the first storage array component 2 based on the second logic signal. The area of the first memory array component 2 accessed by the first programmable gate array component 1 does not overlap the area accessed by the second programmable gate array component 3. The first programmable logic unit, using the first memory control unit, and the second programmable logic unit, using the second memory control unit, independently and simultaneously access their respective memory cells on the first memory array component 2.
In this embodiment, each logic component has an independent storage access interface, so the access bandwidth is highest. The storage units of the first storage array component 2 are divided among the different programmable logic units, each of which accesses its share through its own storage control unit. Different programmable logic units can therefore access memory concurrently, and memory access efficiency is not reduced by arbitration or time-sharing access.
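The non-overlapping division of the memory cells can be pictured as splitting an address range into two regions, one per memory control unit. The sketch below is an assumption-laden illustration: the `Region` type, the `partition` helper, the contiguous split, and the example sizes are hypothetical and are not taken from the application, which does not say how the division is made.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Region:
    """Half-open range [start, end) of memory cell indices."""
    start: int
    end: int

    def contains(self, cell: int) -> bool:
        return self.start <= cell < self.end

    def overlaps(self, other: "Region") -> bool:
        return self.start < other.end and other.start < self.end


def partition(total_cells: int, split: int) -> Tuple[Region, Region]:
    """Split the cells into a part for the first controller and the rest for the second."""
    if not 0 < split < total_cells:
        raise ValueError("split must leave both regions non-empty")
    return Region(0, split), Region(split, total_cells)


# Example with made-up sizes: 1024 cells, the first 256 go to the first controller.
first_region, second_region = partition(1024, 256)
```

Because the two regions never overlap, a request checked with `contains` against its own region cannot collide with a request from the other controller, so no arbitration step is needed in this model.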
Specifically, referring to fig. 6, the first memory array assembly 2 includes a memory cell G13, wherein two second bond-out regions, i.e., a second bond-out region G14 and a second bond-out region G12, are disposed on the memory cell G13. Wherein the second bond pad out region G14 is connected to the first bond pad out region H19 on the first interface module H17 on the first programmable gate array assembly 1. The first interface module H17 of the first programmable gate array assembly 1 is provided with a first storage control unit H20, and the first storage control unit H20 is used for controlling the first programmable gate array assembly 1 to access the first storage array assembly 2. Specifically, the first memory control unit H20 is connected to the first bond lead-out region H19. The first programmable gate array assembly 1 is provided with a first programmable logic unit H23, and the first programmable logic unit H23 is connected to the first storage control unit H20 through an interface routing unit H22. When the first programmable gate array assembly 1 accesses the first memory array assembly 2, the first programmable logic unit H23 outputs a first logic signal to the first memory control unit H20, and the first memory control unit H20 controls the first programmable gate array assembly 1 to access a part of the memory cells G13 of the first memory array assembly 2 through the first bonding lead-out region H19 and the second bonding lead-out region G14 based on the first logic signal.
In addition, the second bond lead-out region G12 is connected to the first bond lead-out region H18 on the first interface module H17, and the first bond lead-out region H18 is connected to the third bond lead-out region I28 on the second programmable gate array assembly 3. The second programmable gate array assembly 3 further comprises a second programmable logic unit I32, which is connected through an interface routing unit I31 to a second memory control unit I29 located on a second interface module I27 of the second programmable gate array assembly 3. When the second programmable gate array assembly 3 accesses the first memory array assembly 2, the second programmable logic unit I32 outputs a second logic signal to the second memory control unit I29, and the second memory control unit I29, based on the second logic signal, controls the second programmable gate array assembly 3 to access the rest of the memory cells G13 of the first memory array assembly 2 through the third bond lead-out region I28, the first bond lead-out region H18 and the second bond lead-out region G12.
The first programmable gate array component 1 and the second programmable gate array component 3 can access the first memory array component 2 independently through the connection manner shown in fig. 6. It is understood that the programmable gate array component can also be 3-layer or 4-layer without limitation.
It should be noted that the first programmable gate array component 1 and the second programmable gate array component 3 of the present application may be FPGAs (field programmable gate arrays) or eFPGAs (embedded field programmable gate arrays). In a preferred embodiment, the first programmable gate array component 1 and the second programmable gate array component 3 are FPGAs or eFPGAs.
In the stacked chip of this embodiment, the memory access of the second programmable gate array component 3 to the first memory array component 2 does not pass through an IO interface and/or an IO circuit, so that the interconnection distance is closer, the interconnection distribution parameter is lower, and the power consumption overhead of the memory access is significantly reduced. In the chip manufacturing process, the second programmable gate array component 3 and the first programmable gate array component 1 can be produced simultaneously, and the second programmable gate array component 3 is bonded with the first programmable gate array component 1 and then bonded with the first storage array component 2, so that the process complexity can be reduced, and the cost can be saved. However, the memory access of the second programmable gate array assembly 3 to the first memory array assembly 2 needs to pass through the first interface module 11 and the second interface module 31, which causes a slight area loss.
The present application also proposes another embodiment in which a plurality of programmable gate array components implement hybrid memory access to at least one memory array component by combining the approaches of fig. 5 and fig. 6, that is, by designing both multiplexed and independent memory control units. In the same stacked chip, the programmable logic units in some areas use the multiplexed storage control unit shown in fig. 5 to realize storage access, while the programmable logic units in other areas use the independent storage control units shown in fig. 6.
The present application also proposes another embodiment, in which the second programmable gate array assembly 3 is disposed on a side of the first memory array assembly 2 away from the first programmable gate array assembly 1. That is, the first memory array component 2 is disposed between the second programmable gate array component 3 and the first programmable gate array component 1. The first memory array component 2 comprises a fourth bonding lead-out area, and the fourth bonding lead-out area and the third bonding lead-out area form three-dimensional heterogeneous integrated interconnection. In this embodiment, the second programmable gate array component 3 and the first programmable gate array component 1 can be directly interconnected with the first memory array component 2, so as to increase the programmable processing density and facilitate a larger memory access bandwidth.
In this embodiment, the memory access of the first programmable gate array assembly 1 to the first memory array assembly 2 only needs to pass through the first interface module 11, and the memory access of the second programmable gate array assembly 3 to the first memory array assembly 2 only needs to pass through the second interface module 31. This structure makes the interconnection distance between the second programmable gate array assembly 3 and the first memory array assembly 2 closer, which can further reduce the memory access power consumption. However, in the process of manufacturing the stacked chip with such a structure, the second programmable gate array component 3 needs to be bonded with the first memory array component 2 first, and then bonded with the first programmable gate array component 1.
Referring to fig. 7, a schematic structural diagram of a third embodiment of the stacked chip of the present invention is shown, which is different from the first embodiment shown in fig. 1 in that the stacked chip of the present embodiment further includes: a second storage array component 4. The second storage array component 4 is disposed on a side of the first storage array component 2 away from the first programmable gate array component 1, and the second storage array component 4 is disposed with a third bond lead-out area 41. In this embodiment, the first memory array assembly 2 further includes a fourth bonded lead-out region 12, and the third bonded lead-out region 41 and the fourth bonded lead-out region 12 constitute a three-dimensional heterogeneous integrated interconnection.
In this embodiment, more storage array components are integrated, which is beneficial to increasing the storage density and realizing larger storage access bandwidth. In addition, after a plurality of storage array components are uniformly produced and tested to form a standard product, the standard product is integrated with the logic component, which is beneficial to reducing the cost.
In one embodiment, the first programmable gate array assembly 1 accesses the first memory array assembly 2 and the second memory array assembly 4 using the same memory control unit. Specifically, when the first programmable gate array assembly 1 uses a single shared memory control unit to access both the first memory array assembly 2 and the second memory array assembly 4, the memory control unit, in order to avoid access conflicts, selects in a time-sharing manner whether the first programmable gate array assembly 1 accesses the first memory array assembly 2 or the second memory array assembly 4.
Referring to fig. 8, in the embodiment, the stacked chip further includes a memory control unit H21, and the memory control unit H21 is disposed on the first interface module H17. In this embodiment, the first interface module H17 includes two first bond pad out regions, i.e., a first bond pad out region H19 and a first bond pad out region H18. The first memory array module 2 is provided with a plurality of memory cells G13, and the memory cell G13 is provided with two second bond lead-out regions, namely a second bond lead-out region G12 and a second bond lead-out region G14. The second memory array module 4 is provided with a plurality of memory cells F01, and the memory cell F01 is provided with a third bond lead-out region I28.
Specifically, the first bond lead-out region H18 connects the second bond lead-out region G14. The memory control unit H21 is connected to the first bond lead-out region H18. Thus, the memory control unit H21 can control the first programmable gate array assembly 1 to access the first memory array assembly 2 through the first bonding lead-out region H18 and the second bonding lead-out region G14.
The first bond lead-out region H19 connects to the second bond lead-out region G12, and the second bond lead-out region G12 connects to the third bond lead-out region I28. Thus, the memory control unit H21 can control the first programmable gate array assembly 1 to access the second memory array assembly 4 through the first bonding lead-out region H19, the second bonding lead-out region G12 and the third bonding lead-out region I28. The second bond lead-out region G12 is not connected to the memory cell G13.
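The two access paths of fig. 8 can be sketched as a small connectivity graph. This is only an illustrative model, not part of the utility model: the node names reuse the reference signs from the text, the links from H21 to H19, from G14 to G13 and from I28 to F01 are assumptions inferred from the description, and `LINKS` and `find_path` are hypothetical names.

```python
from collections import deque

# Directed links for Fig. 8, keyed by reference sign.
# Assumptions: H21->H19, G14->G13 and I28->F01 are inferred from the text.
LINKS = {
    "H21": ["H18", "H19"],  # memory control unit to both first bond lead-out regions
    "H18": ["G14"],         # first bond region H18 bonds to second bond region G14
    "G14": ["G13"],         # G14 serves memory cell G13 of array 2
    "H19": ["G12"],         # first bond region H19 bonds to second bond region G12
    "G12": ["I28"],         # G12 only passes through; it is NOT connected to G13
    "I28": ["F01"],         # third bond region I28 serves memory cell F01 of array 4
}


def find_path(start, goal, links=LINKS):
    """Breadth-first search; returns the node list from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Calling `find_path("H21", "F01")` walks through G12 without touching G13, which mirrors the statement that G12 is not connected to the memory cell G13.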
In this embodiment, the first programmable gate array assembly 1 further includes a programmable logic unit K23, the programmable logic unit K23 is connected to the storage control unit H21 through an interface routing unit H22, and the programmable logic unit K23 derives logic signals. The memory control unit H21 selectively controls the first programmable gate array assembly 1 to access the first memory array assembly 2 or controls the first programmable gate array assembly 1 to access the second memory array assembly 4 in a time-sharing manner based on the logic signals. Specifically, the memory control unit H21 controls the first programmable gate array assembly 1 to access the first memory array assembly 2 at a first time and controls the first programmable gate array assembly 1 to access the second memory array assembly 4 at a second time based on the logic signals.
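The time-sharing behaviour described above can be illustrated with a toy model. It is a sketch under stated assumptions rather than the actual controller design: the class `SharedMemoryController`, the array names `array_2` and `array_4`, and the request tuple layout are all hypothetical, and the `select` field merely stands in for the logic signal led out of the programmable logic unit.

```python
class SharedMemoryController:
    """Toy model of one memory control unit shared by two memory array components."""

    def __init__(self):
        # Each memory array is modelled as a plain dict: address -> value.
        self.arrays = {"array_2": {}, "array_4": {}}
        self.log = []  # (time_slot, target) pairs; exactly one access per slot

    def run(self, requests):
        """Serve requests strictly one per time slot.

        Each request is (select, op, addr, value). `select` plays the role of
        the logic signal and picks which array is accessed in that slot.
        """
        results = []
        for slot, (select, op, addr, value) in enumerate(requests):
            array = self.arrays[select]
            self.log.append((slot, select))
            if op == "write":
                array[addr] = value
                results.append(None)
            else:  # any other op is treated as a read
                results.append(array.get(addr))
        return results
```

Because the single controller handles one request per slot, accesses to the two arrays never overlap; the price is that they are serialized.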
In one embodiment, the first programmable gate array assembly 1 accesses the first memory array assembly 2 and the second memory array assembly 4 using two different memory control units, respectively. Because each memory array assembly has its own memory control unit, there is no access conflict, so the two memory control units can simultaneously control the first programmable gate array assembly 1 to access the first memory array assembly 2 and the second memory array assembly 4. Specifically, the first memory control unit controls the first programmable gate array assembly 1 to access the first memory array assembly 2, and the second memory control unit controls the first programmable gate array assembly 1 to access the second memory array assembly 4.
Referring to fig. 9, in the present embodiment, the stacked chip further includes a first memory control unit H20 and a second memory control unit I29, and the first memory control unit H20 and the second memory control unit I29 are disposed on the first interface module H17. In this embodiment, the first interface module H17 includes two first bond lead-out regions, i.e., a first bond lead-out region H19 and a first bond lead-out region H18. The first memory array module 2 is provided with a plurality of memory cells G13, and the memory cell G13 is provided with two second bond lead-out regions, namely a second bond lead-out region G12 and a second bond lead-out region G14. The second memory array module 4 is provided with a plurality of memory cells F01, and the memory cell F01 is provided with a third bond lead-out region I28.
In the present embodiment, the first memory control unit H20 is connected to the first bond lead-out region H18, and the first bond lead-out region H18 is connected to the second bond lead-out region G14. Thus, the first memory control unit H20 can control the first programmable gate array assembly 1 to access the first memory array assembly 2 through the first bonding lead-out region H18 and the second bonding lead-out region G14.
Further, the second memory control unit I29 is connected to the first bond lead-out region H19, the first bond lead-out region H19 is connected to the second bond lead-out region G12, and the second bond lead-out region G12 is connected to the third bond lead-out region I28. Thus, the second memory control unit I29 can control the first programmable gate array assembly 1 to access the second memory array assembly 4 through the first bonding lead-out region H19, the second bonding lead-out region G12 and the third bonding lead-out region I28. The second bond lead-out region G12 is not connected to the memory cell G13.
In this embodiment, the first programmable gate array component 1 further includes a programmable logic unit K23. The programmable logic unit K23 is connected with the first storage control unit H20 and the second storage control unit I29, and logic signals are led out of the programmable logic unit K23. Specifically, the programmable logic unit K23 is connected to the first storage control unit H20 and the second storage control unit I29 through the interface routing unit H22. In this embodiment, the first memory control unit H20 controls the first programmable gate array assembly 1 to access the first memory array assembly 2 based on the logic signals, and the second memory control unit I29 simultaneously controls the first programmable gate array assembly 1 to access the second memory array assembly 4 based on the logic signals.
The present application also proposes another embodiment that mixes the approaches of fig. 8 and fig. 9: by designing multiplexed or independent memory control units, a plurality of memory array components provide hybrid memory access to at least one programmable gate array component. Within the same stacked chip, the programmable logic units in some areas implement storage access through the multiplexed storage control unit shown in fig. 8, while the programmable logic units in other areas implement storage access through the independent storage control units shown in fig. 9.
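The practical difference between a shared controller and two independent controllers can be shown with a rough slot count. This is a simplified sketch, not a performance model from the source: it assumes every access takes exactly one time slot and that all requests are ready at once, and `slots_needed` and the target names are hypothetical.

```python
from collections import Counter


def slots_needed(targets, dedicated):
    """Return how many time slots a batch of accesses occupies.

    targets   -- sequence of array names, one entry per access request
    dedicated -- True: one controller per array, so accesses to different
                 arrays overlap; False: one shared controller, so accesses
                 are served one per slot.
    """
    if not targets:
        return 0
    if dedicated:
        # Each controller works through its own queue in parallel.
        return max(Counter(targets).values())
    return len(targets)
```

For four requests alternating between the two arrays, the shared arrangement needs four slots while the dedicated arrangement needs two, at the cost of a second memory control unit on the interface module.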
In another embodiment, as shown in fig. 10, the second memory array component 4 can also be disposed on a side of the first programmable gate array component 1 away from the first memory array component 2. In this embodiment, the first interface module 11 further includes a fourth bonding lead-out area 12, and the third bonding lead-out area 41 and the fourth bonding lead-out area 12 form a three-dimensional heterogeneous integrated interconnection.
In this embodiment, more memory array components are integrated, which is beneficial to increasing the memory density. Moreover, because the first storage array component 2 and the second storage array component 4 are each directly connected to the first programmable gate array component 1, the three-dimensional heterogeneous integration interconnection distance is reduced; the storage access distance is short and the distributed parameters are small, so the storage access frequency and power consumption are better.
In one embodiment, the first programmable gate array assembly 1 accesses the first memory array assembly 2 and the second memory array assembly 4 using the same memory control unit. Specifically, when the first programmable gate array assembly 1 shares one memory control unit to access both the first memory array assembly 2 and the second memory array assembly 4, the memory control unit, in order to avoid access conflicts, may have the first programmable gate array assembly 1 access either the first memory array assembly 2 or the second memory array assembly 4 in a time-sharing manner.
Referring to fig. 11, in the present embodiment, the stacked chip further includes a memory control unit H21, and the memory control unit H21 is disposed on the first interface module H17. In this embodiment, the first interface module H17 includes two first bond lead-out regions, i.e., a first bond lead-out region H19 and a first bond lead-out region H18. A plurality of memory cells G13 are disposed on the first memory array module 2, and a second bond lead-out region G14 is disposed on the memory cell G13. The second memory array module 4 is provided with a plurality of memory cells F01, and the memory cell F01 is provided with a third bond lead-out region I28.
Specifically, the first bond lead-out region H18 connects the second bond lead-out region G14. The memory control unit H21 is connected to the first bond lead-out region H18. Thus, the memory control unit H21 can control the first programmable gate array assembly 1 to access the first memory array assembly 2 through the first bonding lead-out region H18 and the second bonding lead-out region G14.
The memory control unit H21 is connected to the first bond lead-out region H19, and the first bond lead-out region H19 is connected to the third bond lead-out region I28. Thus, the memory control unit H21 can control the first programmable gate array assembly 1 to access the second memory array assembly 4 through the first bonding lead-out region H19 and the third bonding lead-out region I28.
In this embodiment, the first programmable gate array assembly 1 further includes a programmable logic unit K23, the programmable logic unit K23 is connected to the storage control unit H21 through an interface routing unit H22, and the programmable logic unit K23 derives logic signals. The memory control unit H21 selectively controls the first programmable gate array assembly 1 to access the first memory array assembly 2 or controls the first programmable gate array assembly 1 to access the second memory array assembly 4 in a time-sharing manner based on the logic signals. Specifically, the memory control unit H21 controls the first programmable gate array assembly 1 to access the first memory array assembly 2 at a first time and controls the first programmable gate array assembly 1 to access the second memory array assembly 4 at a second time based on the logic signals.
In one embodiment, the first programmable gate array assembly 1 accesses the first memory array assembly 2 and the second memory array assembly 4 using two different memory control units, respectively. Because each memory array assembly has its own memory control unit, there is no access conflict, so the two memory control units can simultaneously control the first programmable gate array assembly 1 to access the first memory array assembly 2 and the second memory array assembly 4. Specifically, the first memory control unit controls the first programmable gate array assembly 1 to access the first memory array assembly 2, and the second memory control unit controls the first programmable gate array assembly 1 to access the second memory array assembly 4.
Referring to fig. 12, in the present embodiment, the stacked chip further includes a first memory control unit H20 and a second memory control unit I29, and the first memory control unit H20 and the second memory control unit I29 are disposed on the first interface module H17. In this embodiment, the first interface module H17 includes two first bond lead-out regions, i.e., a first bond lead-out region H19 and a first bond lead-out region H18. A plurality of memory cells G13 are disposed on the first memory array module 2, and a second bond lead-out region G14 is disposed on the memory cell G13. The second memory array module 4 is provided with a plurality of memory cells F01, and the memory cell F01 is provided with a third bond lead-out region I28.
In the present embodiment, the first memory control unit H20 is connected to the first bond lead-out region H18, and the first bond lead-out region H18 is connected to the second bond lead-out region G14. Thus, the first memory control unit H20 can control the first programmable gate array assembly 1 to access the first memory array assembly 2 through the first bonding lead-out region H18 and the second bonding lead-out region G14.
Further, the second memory control unit I29 is connected to the first bond lead-out region H19, and the first bond lead-out region H19 is connected to the third bond lead-out region I28. Thus, the second memory control unit I29 can control the first programmable gate array assembly 1 to access the second memory array assembly 4 through the first bonding lead-out region H19 and the third bonding lead-out region I28.
In this embodiment, the first programmable gate array component 1 further includes a programmable logic unit K23. The programmable logic unit K23 is connected with the first storage control unit H20 and the second storage control unit I29, and logic signals are led out of the programmable logic unit K23. Specifically, the programmable logic unit K23 is connected to the first storage control unit H20 and the second storage control unit I29 through the interface routing unit H22. In this embodiment, the first memory control unit H20 controls the first programmable gate array assembly 1 to access the first memory array assembly 2 based on the logic signals, and the second memory control unit I29 simultaneously controls the first programmable gate array assembly 1 to access the second memory array assembly 4 based on the logic signals.
The present application also proposes another embodiment that mixes the approaches of fig. 11 and fig. 12: by designing multiplexed or independent memory control units, a plurality of memory array components provide hybrid memory access to at least one programmable gate array component. Within the same stacked chip, the programmable logic units in some areas implement storage access through the multiplexed storage control unit shown in fig. 11, while the programmable logic units in other areas implement storage access through the independent storage control units shown in fig. 12.
In the present application, the storage array component may be a multilayer chip combined through three-dimensional heterogeneous integrated bonding. The application specific integrated circuit array component may be provided with any one or any combination of a multiply-add calculation array, a multiplication calculation array, a pulse processor array, a hash calculation array, various encoder arrays, a machine learning dedicated layer array, a retrieval function array, an image/video processing array, and hard-core operation/processing units such as a CPU (central processing unit) and an MCU (microcontroller unit), and is combined with the programmable gate array component to improve the processing density of the stacked chip.
Specifically, the component may be at least one of a die (or chip) and a wafer, but is not limited thereto, and may be any alternative conceivable by those skilled in the art. A wafer is a silicon wafer used for manufacturing silicon semiconductor circuits, and a chip or die is a piece of silicon obtained by dicing a wafer on which the semiconductor circuits have been manufactured. For example, the memory array component of the present application may be a memory array die (DRAM die or DRAM chip) or a memory array wafer (DRAM wafer).
Based on the same utility model concept as the method, the embodiment of the utility model also provides a three-dimensional heterogeneous integrated stacked chip structure. The stacked chip comprises components stacked in layers; each may be any one of the above components, and the components are interconnected through three-dimensional heterogeneous integration. The stacked chip may also be manufactured directly in units of wafers, with three-dimensional heterogeneous integration performed at the wafer level.
When the stacked chip is prepared, the preparation may also be carried out partially in units of wafers with three-dimensional heterogeneous integration. Specifically, there are two methods. In the first, some of the wafer layers are three-dimensionally heterogeneously integrated to form an intermediate product, and the remaining wafer layers are then integrated with the intermediate product iteratively until the preparation is finished. In the second, some of the wafer layers are three-dimensionally heterogeneously integrated to form an intermediate product, the intermediate product is cut into dies, and these dies are then three-dimensionally heterogeneously integrated, die to die, with the dies of the other components to complete the preparation.
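The two preparation methods can be written out as step lists to make their difference explicit. This is a purely illustrative sketch: the function names, the layer names and the step wording are hypothetical, and real process flows involve many steps (thinning, testing, and so on) that are omitted here.

```python
def wafer_level_flow(wafer_layers, first_group_size):
    """Method 1: bond a first group of wafers, then add the remaining wafers one by one."""
    first = wafer_layers[:first_group_size]
    rest = wafer_layers[first_group_size:]
    steps = ["wafer-bond " + " + ".join(first) + " -> intermediate"]
    for layer in rest:
        # Iterate: each remaining wafer is bonded onto the growing intermediate product.
        steps.append("wafer-bond " + layer + " onto intermediate")
    return steps


def die_level_flow(wafer_layers, first_group_size):
    """Method 2: bond a first group of wafers, dice, then finish die to die."""
    first = wafer_layers[:first_group_size]
    rest = wafer_layers[first_group_size:]
    steps = [
        "wafer-bond " + " + ".join(first) + " -> intermediate",
        "dice intermediate into dies",
    ]
    for layer in rest:
        # The other components are joined as dies rather than as whole wafers.
        steps.append("die-bond " + layer + " die onto intermediate die")
    return steps
```

For example, `die_level_flow(["fpga", "dram_a", "dram_b"], 2)` yields a wafer-bonding step, a dicing step, and one die-bonding step, whereas the wafer-level flow for the same input never dices before the stack is complete.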
Specifically, the stacked chip shown in fig. 4, composed of multiple layers of programmable gate array components and at least one layer of memory array component, can be manufactured by two methods. In the first, the multiple layers of programmable gate array components are three-dimensionally heterogeneously integrated in units of wafers to form an intermediate product, which improves the interconnection density, and this intermediate product is then three-dimensionally heterogeneously integrated with the intermediate product formed from the at least one layer of memory array component to obtain the stacked chip. In the second, the multiple layers of programmable gate array components are three-dimensionally heterogeneously integrated in units of wafers to form an intermediate product, the intermediate product is cut into dies and tested, and the dies are then integrated with the cut and tested intermediate product formed from the at least one layer of memory array component to obtain the stacked chip.
Similarly, the stacked chip shown in fig. 7, composed of multiple layers of memory array components and at least one layer of programmable gate array component, can be manufactured by two methods. In the first, the multiple layers of memory array components are three-dimensionally heterogeneously integrated in units of wafers to form an intermediate product, which improves the interconnection density, and this intermediate product is then three-dimensionally heterogeneously integrated with the intermediate product formed from the at least one layer of programmable gate array component to obtain the stacked chip. In the second, the multiple layers of memory array components are three-dimensionally heterogeneously integrated in units of wafers to form an intermediate product, the intermediate product is cut into dies and tested, and the dies are then integrated with the cut and tested intermediate product formed from the at least one layer of programmable gate array component to obtain the stacked chip.
The number and order of the layers of programmable gate array components and storage array components in the stacked chip depend on the application scenario, the engineering requirements, and complex trade-offs between production cost and production yield, so there is no single optimal result. Target products with different layer counts and layer orders also have diverse production and preparation processes, and differ noticeably in the design and reuse design of the memory controller.
In the programmable gate array component, the programmable function modules are widely interconnected with the programmable routing network. Referring to fig. 13, the programmable gate array component is an extension of field-programmable gate array / embedded field-programmable gate array (FPGA/eFPGA) technology and includes programmable logic blocks 11A and a programmable routing network 11B (interconnect). The programmable logic blocks 11A are interconnected with each other through the programmable routing network 11B and configured as a plurality of programmable function modules. At least a part of the programmable routing network 11B can be extended to the interface routing unit, so that large-capacity storage arrays are interconnected across layers through three-dimensional heterogeneous integration, forming large-capacity, high-bandwidth, programmable storage access.
Three-dimensional heterogeneous integration is a bonding technology for interconnecting stacked chips, such as a hybrid bonding process. To prepare the stacked chip, a three-dimensional heterogeneous integrated bonding layer is manufactured by a back end of line (BEOL) process on top of already prepared chips (such as a programmable gate array component or a storage array component), realizing high-density interconnection of signals between the chips.
Specifically, fig. 14 is taken as an example for explanation. In fig. 14, the stacked chip includes a functional component 210, a functional component 220, and a functional component 230, which may be programmable gate array components and/or memory array components. The functional components 210, 220 and 230 each comprise a top metal layer, internal metal layers, an active layer and a substrate. The top metal layer and the internal metal layers are used for intra-component signal interconnection; the active layer is used for realizing transistors and forming the module function; the substrate serves to protect the module and provide mechanical support. On the sides of the functional components 210 and 220 close to their top metal layers, three-dimensional heterogeneous integrated bonding layers are manufactured by a back-end process and interconnected to form a face-to-face interconnection structure. On the side of the functional component 220 close to its substrate and the side of the functional component 230 close to its top metal layer, three-dimensional heterogeneous integrated bonding layers are likewise manufactured by a back-end process and interconnected to form a back-to-face (or face-to-back) interconnection structure. Between the functional components 210, 220 and 230, cross-component signal interconnections can be established arbitrarily through three-dimensional heterogeneous integration. Two interconnection techniques apply, depending on whether the core voltages of functional component 210, functional component 220, and functional component 230 are the same.
When the core voltages of functional component 210 and functional component 230 are the same, take as an example functional circuit 1 in functional component 210, which needs to establish a cross-component interconnection with functional circuit 10 in functional component 230. The signal led out from functional circuit 1 through the internal metal layers of functional component 210 reaches, via the top metal layer of functional component 210, the face-to-face three-dimensional heterogeneous integrated bonding structure between functional component 210 and functional component 220, and is thereby interconnected to the top metal layer of functional component 220. The interconnect signal then passes through the internal metal layers of functional component 220 and through silicon vias (TSVs), which penetrate the active layer and the thinned substrate of functional component 220, to the back-to-face three-dimensional heterogeneous integrated bonding structure between functional component 220 and functional component 230, and is thereby interconnected to the top metal layer of functional component 230. Finally, through the internal metal layers of functional component 230, the interconnect signal reaches functional circuit 10 in functional component 230, completing the cross-component interconnection.
When the core voltages of the functional component 210 and the functional component 230 are different, take as an example the functional circuit 2 in the functional component 210, which needs to establish cross-component interconnection with the functional circuit 10 in the functional component 230. A level shift circuit 2 is designed in the functional component 210, and the level shift circuit 2 and the functional circuit 2 are interconnected within the functional component 210. After the level shift circuit 2 converts the interconnect signal of the functional circuit 2 to match the core voltage of the functional component 230, the functional circuit 20 in the functional component 230 is interconnected across the components using the aforementioned method. Alternatively, the level shift circuit 2 may be relocated to the functional component 230 or the functional component 220 and reached through the three-dimensional heterogeneous integration interconnection.
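The choice between the two interconnection techniques can be summarized in a short sketch that lists the hops a signal takes from functional component 210 to functional component 230 and inserts a level shift stage only when the core voltages differ. The function name, the hop labels and the voltage values used below are hypothetical; the hop order follows the description of fig. 14, and the level shift circuit is placed in component 210, which is only one of the placements the text allows.

```python
# Hop sequence from component 210 to component 230, following the Fig. 14 description.
HOPS_210_TO_230 = [
    "internal metal (210)",
    "top metal (210)",
    "face-to-face bond (210/220)",
    "top metal (220)",
    "internal metal (220)",
    "TSV (220)",
    "back-to-face bond (220/230)",
    "top metal (230)",
    "internal metal (230)",
]


def build_path(src_core_v, dst_core_v):
    """Return the ordered hops for a signal going from component 210 to 230.

    A level shift stage is inserted right after the source circuit when the
    two core voltages differ (assumed placement: inside component 210).
    """
    path = ["functional circuit (210)"]
    if src_core_v != dst_core_v:
        path.append("level shift circuit (210)")
    path.extend(HOPS_210_TO_230)
    path.append("functional circuit (230)")
    return path
```

With equal core voltages the path has eleven entries; with different core voltages the only change is the extra level shift entry in second position.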
In the stacked chip provided by the application, the programmable gate array component and the application specific integrated circuit array component do not access the storage of the storage array component through an IO interface and/or an IO circuit, so that the interconnection distance is shorter, and the power consumption overhead of the storage access is obviously reduced. And a programmable storage integrated structure with high bandwidth and low power consumption is realized by a three-dimensional heterogeneous integrated bonding mode.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A stacked chip, comprising:
the first programmable gate array component comprises a first interface module, the first interface module is embedded in the first programmable gate array component, and the first interface module comprises a first bonding lead-out area;
the first storage array assembly is provided with a second bonding lead-out area;
the first bonding lead-out region and the second bonding lead-out region are in bonding connection so as to connect the first programmable gate array assembly and the interconnection signals on the first memory array assembly together.
2. The stacked chip of claim 1, wherein said first programmable gate array component comprises a plurality of functional modules,
the number of the first interface modules is at least one, and the first interface modules are positioned among the plurality of functional modules and are connected with the functional modules through interface routing units.
3. The stacked chip of claim 2, wherein the functional module is strip-shaped inside, and the first interface module is arranged to extend along the strip-shaped functional module layout.
4. The stacked chip of claim 2, wherein said functional module is connected to an interface routing unit through an internal metal layer, said first interface module being interconnected with said interface routing unit through an internal metal layer.
5. The stacked chip of claim 4, wherein the first programmable gate array component comprises: the plurality of functional modules are interconnected with the programmable routing network through an internal metal layer and connected to the interface routing unit through the programmable routing network.
6. The stacked chip of claim 1, wherein the stacked chip further comprises:
the storage control unit is arranged on the first interface module; or,
the storage control unit is arranged at a position, close to the first interface, of the first programmable gate array component; or,
the storage control unit is arranged on the first storage array component;
the storage control unit controls the first programmable gate array component to carry out storage access on the first storage array component.
7. The stacked chip of any one of claims 1 to 5, further comprising:
the second storage array component is arranged on one side, far away from the first storage array component, of the first programmable gate array component;
the second storage array assembly is provided with a third bonding lead-out area;
the first interface module comprises a fourth bonding leading-out area, and the first programmable gate array component and the second storage array component are connected in a bonding mode through the third bonding leading-out area and the fourth bonding leading-out area.
8. The stacked chip of any one of claims 1 to 5, further comprising:
the second storage array component is arranged on one side of the first storage array component far away from the first programmable gate array component;
the second storage array assembly is provided with a third bonding lead-out area;
the first storage array assembly comprises a fourth bonding leading-out area, and the first storage array assembly and the second storage array assembly are connected in a bonding mode through the fourth bonding leading-out area and the third bonding leading-out area.
9. The stacked chip of claim 7, wherein the stacked chip further comprises:
the storage control unit is arranged on the first interface module;
the memory control unit controls the first programmable gate array component to access the first memory array component and the second memory array component.
10. The stacked chip of claim 7, wherein the stacked chip further comprises:
the first storage control unit is arranged on the first interface module;
the second storage control unit is arranged on the first interface module;
the first storage control unit controls the first programmable gate array component to access the first storage array component, and the second storage control unit controls the first programmable gate array component to access the second storage array component.
CN202122118042.3U 2021-09-02 2021-09-02 Stacking chip Active CN216118778U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202122118042.3U CN216118778U (en) 2021-09-02 2021-09-02 Stacking chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202122118042.3U CN216118778U (en) 2021-09-02 2021-09-02 Stacking chip

Publications (1)

Publication Number Publication Date
CN216118778U true CN216118778U (en) 2022-03-22

Family

ID=80730802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202122118042.3U Active CN216118778U (en) 2021-09-02 2021-09-02 Stacking chip

Country Status (1)

Country Link
CN (1) CN216118778U (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626374A (en) * 2021-09-02 2021-11-09 西安紫光国芯半导体有限公司 Stacking chip
CN117453619A (en) * 2023-10-27 2024-01-26 北京算能科技有限公司 Data processing chip, manufacturing method thereof and data processing system

Legal Events

Date Code Title Description
GR01 Patent grant
CP03 Change of name, title or address

Address after: 710000 floor 4, block a, No. 38, Gaoxin 6th Road, Zhangba street, high tech Zone, Xi'an, Shaanxi

Patentee after: Xi'an Ziguang Guoxin Semiconductor Co.,Ltd.

Country or region after: China

Address before: 710000 floor 4, block a, No. 38, Gaoxin 6th Road, Zhangba street, high tech Zone, Xi'an, Shaanxi

Patentee before: XI'AN UNIIC SEMICONDUCTORS Co.,Ltd.

Country or region before: China
