TW202407579A - Artificial neural network comprising a three-dimensional integrated circuit

Info

Publication number
TW202407579A
TW202407579A TW112108853A TW112108853A TW202407579A TW 202407579 A TW202407579 A TW 202407579A TW 112108853 A TW112108853 A TW 112108853A TW 112108853 A TW112108853 A TW 112108853A TW 202407579 A TW202407579 A TW 202407579A
Authority
TW
Taiwan
Prior art keywords
die
array
memory cells
vector matrix
input
Prior art date
Application number
TW112108853A
Other languages
Chinese (zh)
Inventor
曉萬 陳
恩漢 杜
馬克 萊坦
Original Assignee
美商超捷公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/848,371 (US20230325645A1)
Application filed by 美商超捷公司
Publication of TW202407579A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/065 - Analogue means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Semiconductor Integrated Circuits (AREA)
  • Non-Volatile Memory (AREA)

Abstract

Numerous examples are disclosed of an artificial neural network comprising a three-dimensional integrated circuit. In one embodiment, a three-dimensional integrated circuit for use in an artificial neural network comprises a first die comprising a first vector by matrix multiplication array and a first input multiplexor, the first die located on a first vertical layer; a second die comprising an input circuit, the second die located on a second vertical layer different than the first vertical layer; and one or more vertical interfaces coupling the first die and the second die; wherein during a read operation, the input circuit provides an input signal to the first input multiplexor over at least one of the one or more vertical interfaces, the first input multiplexor applies the input signal to one or more rows in the first vector by matrix multiplication array, and the first vector by matrix multiplication array generates an output.

Description

Artificial neural network comprising a three-dimensional integrated circuit

[Priority claim] This application claims priority to U.S. Provisional Patent Application No. 63/328,126, filed on April 6, 2022 and titled "Artificial Neural Network Comprising a Three-Dimensional Integrated Circuit," and to U.S. Patent Application No. 17/848,371, filed on June 23, 2022 and titled "Artificial Neural Network Comprising a Three-Dimensional Integrated Circuit."

Numerous examples are disclosed of an artificial neural network comprising a three-dimensional integrated circuit.

Artificial neural networks mimic biological neural networks (the central nervous systems of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. Artificial neural networks generally include layers of interconnected "neurons" that exchange messages with each other.

Figure 1 illustrates an artificial neural network, where the circles represent the inputs or layers of neurons. The connections (called synapses) are represented by arrows and have numeric weights that can be tuned based on experience. This makes the neural network adaptive to inputs and capable of learning. Typically, a neural network includes a layer of multiple inputs. There are typically one or more intermediate layers of neurons, and an output layer of neurons that provides the output of the neural network. The neurons at each level individually or collectively make a decision based on the data received from the synapses.

One of the major challenges in the development of artificial neural networks for high-performance information processing is a lack of adequate hardware technology. Indeed, practical neural networks rely on a very large number of synapses, enabling high connectivity between neurons, i.e., very high computational parallelism. In principle, such complexity can be achieved with digital supercomputers or clusters of specialized graphics processing units. However, in addition to high cost, these approaches also suffer from mediocre energy efficiency compared to biological networks, which consume much less energy primarily because they perform low-precision analog computation. CMOS analog circuits have been used for artificial neural networks, but most CMOS-implemented synapses have been too bulky given the large number of neurons and synapses required.

Applicant previously disclosed an artificial (analog) neural network that utilizes one or more non-volatile memory arrays as the synapses in U.S. Patent Application Publication 2017/0337466A1, which is incorporated by reference. The non-volatile memory arrays operate as an analog neural memory and comprise non-volatile memory cells arranged in rows and columns. The neural network includes a first plurality of synapses configured to receive a first plurality of inputs and to generate therefrom a first plurality of outputs, and a first plurality of neurons configured to receive the first plurality of outputs. The first plurality of synapses includes a plurality of memory cells, wherein each of the memory cells includes: spaced-apart source and drain regions formed in a semiconductor substrate, with a channel region extending between them; a floating gate disposed over and insulated from a first portion of the channel region; and a non-floating gate disposed over and insulated from a second portion of the channel region. Each of the plurality of memory cells stores a weight value corresponding to the number of electrons on the floating gate. The plurality of memory cells multiply the first plurality of inputs by the stored weight values to generate the first plurality of outputs.

Non-Volatile Memory Cells

Non-volatile memories are well known. For example, U.S. Patent 5,029,130 (the "'130 patent"), which is incorporated herein by reference, discloses an array of split-gate non-volatile memory cells, which are a type of flash memory cell. Such a memory cell 210 is shown in Figure 2. Each memory cell 210 includes a source region 14 and a drain region 16 formed in a semiconductor substrate 12, with a channel region 18 between them. A floating gate 20 is formed over and insulated from (and controls the conductivity of) a first portion of the channel region 18, and over a portion of the source region 14. A word line terminal 22 (which is typically coupled to a word line) has a first portion that is disposed over and insulated from (and controls the conductivity of) a second portion of the channel region 18, and a second portion that extends up and over the floating gate 20. The floating gate 20 and the word line terminal 22 are insulated from the substrate 12 by a gate oxide. A bit line 24 is coupled to the drain region 16.

Memory cell 210 is erased (electrons are removed from the floating gate) by placing a high positive voltage on the word line terminal 22, which causes electrons on the floating gate 20 to tunnel through the intermediate insulation from the floating gate 20 to the word line terminal 22 via Fowler-Nordheim (FN) tunneling.

Memory cell 210 is programmed (electrons are placed on the floating gate) by source side injection (SSI) with hot electrons, by placing a positive voltage on the word line terminal 22 and a positive voltage on the source region 14. Electron current will flow from the drain region 16 toward the source region 14. The electrons will accelerate and become heated when they reach the gap between the word line terminal 22 and the floating gate 20. Some of the heated electrons will be injected through the gate oxide onto the floating gate 20 due to the attractive electrostatic force from the floating gate 20.

Memory cell 210 is read by placing positive read voltages on the drain region 16 and the word line terminal 22 (which turns on the portion of the channel region 18 under the word line terminal). If the floating gate 20 is positively charged (i.e., erased of electrons), then the portion of the channel region 18 under the floating gate 20 is turned on as well, and current will flow across the channel region 18, which is sensed as the erased or "1" state. If the floating gate 20 is negatively charged (i.e., programmed with electrons), then the portion of the channel region under the floating gate 20 is mostly or entirely turned off, and current will not flow (or there will be little flow) across the channel region 18, which is sensed as the programmed or "0" state.

Table 1 describes typical voltage and current ranges that can be applied to the terminals of memory cell 210 for performing read, erase, and program operations:

Table 1: Operation of Flash Memory Cell 210 of Figure 2

             WL         BL          SL
  Read       2-3V       0.6-2V      0V
  Erase      ~11-13V    0V          0V
  Program    1-2V       10.5-3μA    9-10V

Other split-gate memory cell configurations, which are other types of flash memory cells, are known. For example, Figure 3 depicts a four-gate memory cell 310 comprising a source region 14, a drain region 16, a floating gate 20 over a first portion of the channel region 18, a select gate 22 (typically coupled to a word line, WL) over a second portion of the channel region 18, a control gate 28 over the floating gate 20, and an erase gate 30 over the source region 14. This configuration is described in U.S. Patent 6,747,310, which is incorporated herein by reference for all purposes. Here, all gates are non-floating gates except the floating gate 20, meaning that they are electrically connected or connectable to a voltage source. Programming is performed by heated electrons from the channel region 18 injecting themselves onto the floating gate 20. Erasing is performed by electrons tunneling from the floating gate 20 to the erase gate 30.

Table 2 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 310 for performing read, erase, and program operations:

Table 2: Operation of Flash Memory Cell 310 of Figure 3

             WL/SG       BL         CG        EG        SL
  Read       1.0-2V      0.6-2V     0-2.6V    0-2.6V    0V
  Erase      -0.5V/0V    0V         0V/-8V    8-12V     0V
  Program    1V          0.1-1μA    8-11V     4.5-9V    4.5-5V

Figure 4 depicts a three-gate memory cell 410, which is another type of flash memory cell. Memory cell 410 is identical to memory cell 310 of Figure 3 except that memory cell 410 does not have a separate control gate. The erase operation (in which erasing occurs through use of the erase gate) and the read operation are similar to those of Figure 3, except that no control gate bias is applied. The program operation is also performed without the control gate bias, and as a result a higher voltage is applied to the source line during a program operation to compensate for the lack of control gate bias.

Table 3 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 410 for performing read, erase, and program operations:

Table 3: Operation of Flash Memory Cell 410 of Figure 4

             WL/SG       BL         EG        SL
  Read       0.7-2.2V    0.6-2V     0-2.6V    0V
  Erase      -0.5V/0V    0V         11.5V     0V
  Program    1V          0.2-3μA    4.5V      7-9V

Figure 5 depicts a stacked-gate memory cell 510, which is another type of flash memory cell. Memory cell 510 is similar to memory cell 210 of Figure 2, except that the floating gate 20 extends over the entire channel region 18, and the control gate 22 (which here will be coupled to a word line) extends over the floating gate 20, separated from it by an insulating layer (not shown). Erasing is performed by FN tunneling of electrons from the floating gate (FG) to the substrate. Programming is performed by channel hot electron (CHE) injection at the region between the channel region 18 and the drain region 16, with electrons flowing from the source region 14 toward the drain region 16. The read operation is similar to that of memory cell 210, but with a higher control gate voltage.

Table 4 describes typical voltage ranges that can be applied to the terminals of memory cell 510 and to the substrate 12 for performing read, erase, and program operations:

Table 4: Operation of Flash Memory Cell 510 of Figure 5

             CG               BL        SL     Substrate
  Read       2-5V             0.6-2V    0V     0V
  Erase      -8 to -10V/0V    FLT       FLT    8-10V / 15-20V
  Program    8-12V            3-5V      0V     0V

The methods and means described herein may apply to other non-volatile memory technologies such as, without limitation, FINFET split-gate flash or stacked-gate flash memory, NAND flash, SONOS (silicon-oxide-nitride-oxide-silicon, charge trap in nitride), MONOS (metal-oxide-nitride-oxide-silicon, metal charge trap in nitride), ReRAM (resistive RAM), PCM (phase change memory), MRAM (magnetic RAM), FeRAM (ferroelectric RAM), CT (charge trap) memory, CN (carbon-tube) memory, OTP (bi-level or multi-level one-time programmable), and CeRAM (correlated electron RAM).

To utilize, in an artificial neural network, memory arrays comprising one of the types of non-volatile memory cells described above, two modifications are made. First, the lines are configured so that each memory cell can be individually programmed, erased, and read without adversely affecting the memory state of other memory cells in the array, as further explained below. Second, continuous (analog) programming of the memory cells is provided.

Specifically, the memory state (i.e., the charge on the floating gate) of each memory cell in the array can be continuously changed from a fully erased state to a fully programmed state, and vice versa, independently and with minimal disturbance of other memory cells. This means the cell storage is effectively analog, or at the very least can store one of many discrete values (such as 16 or 64 different values), which allows very precise and individual tuning of all the memory cells in the memory array, and which makes the memory array ideal for storing and making fine-tuning adjustments to the synaptic weights of the neural network.

Neural Networks Employing Non-Volatile Memory Cell Arrays

Figure 6 conceptually illustrates a non-limiting example of a neural network utilizing a non-volatile memory array of the present examples. This example uses the non-volatile memory array neural network for a facial recognition application, but any other appropriate application could be implemented using a non-volatile memory array based neural network.

S0 is the input layer, which for this example is a 32×32 pixel RGB image with 5-bit precision (i.e., three 32×32 pixel arrays, one for each color R, G, and B, each pixel being 5-bit precision). The synapses CB1 going from input layer S0 to layer C1 apply different sets of weights in some instances and shared weights in other instances, and scan the input image with 3×3 pixel overlapping filters (kernels), shifting the filter by 1 pixel (or more than 1 pixel, as dictated by the model). Specifically, the values for 9 pixels in a 3×3 portion of the image (the portion referred to as a filter or kernel) are provided to the synapses CB1, where these 9 input values are multiplied by the appropriate weights and, after summing the outputs of that multiplication, a single output value is determined and provided by a first synapse of CB1 for generating a pixel of one of the feature maps of layer C1. The 3×3 filter is then shifted one pixel to the right within input layer S0 (i.e., adding the column of three pixels on the right and dropping the column of three pixels on the left), whereby the 9 pixel values in this newly positioned filter are provided to the synapses CB1, where they are multiplied by the same weights and a second single output value is determined by the associated synapse. This process continues until the 3×3 filter has scanned across the entire 32×32 pixel image of input layer S0, for all three colors and for all bits (precision values). The process is then repeated using different sets of weights to generate a different feature map of layer C1, until all the feature maps of layer C1 have been calculated.
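The filter scan described above can be sketched as follows. This is a minimal illustration in plain Python, not the hardware implementation: the function name `convolve_valid`, the test image, and the kernel weights are all hypothetical, and the sketch handles a single color plane only.

```python
def convolve_valid(image, kernel, stride=1):
    """Slide a square kernel over a square image; return the map of weighted sums."""
    k = len(kernel)
    n = len(image)
    out = []
    for r in range(0, n - k + 1, stride):
        row = []
        for c in range(0, n - k + 1, stride):
            # Multiply the k*k pixel values under the filter by the weights
            # and sum the products into one output pixel.
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 32x32 plane with 5-bit pixel values (0..31).
image = [[(r + c) % 32 for c in range(32)] for r in range(32)]
# Illustrative weights only; real weights would be stored in the memory cells.
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
feature_map = convolve_valid(image, kernel)
```

With a 3×3 filter shifted by 1 pixel, the 32×32 input yields a 30×30 feature map, which matches the size given for layer C1. Repeating the call with a different kernel corresponds to generating another feature map.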

In the present example, layer C1 has 16 feature maps with 30×30 pixels each. Each pixel is a new feature pixel extracted from multiplying the inputs and the kernel, so each feature map is a two-dimensional array, and thus in this example layer C1 constitutes 16 layers of two-dimensional arrays (keeping in mind that the layers and arrays referenced herein are logical relationships, not necessarily physical relationships; that is, the arrays are not necessarily oriented in physical two-dimensional arrays). Each of the 16 feature maps in layer C1 is generated by one of sixteen different sets of synapse weights applied to the filter scans. The C1 feature maps could all be directed to different aspects of the same image feature, such as boundary identification. For example, the first map (generated using a first weight set, shared for all scans used to generate this first map) could identify circular edges, the second map (generated using a second weight set different from the first weight set) could identify rectangular edges, or the aspect ratio of certain features, and so on.

An activation function P1 (pooling) is applied before going from layer C1 to layer S1, which pools values from consecutive, non-overlapping 2×2 regions in each feature map. The purpose of the pooling function P1 is to average out the nearby locations (a max function can also be used), for example to reduce the dependence on edge location and to reduce the data size before going to the next stage. At layer S1, there are 16 15×15 feature maps (i.e., sixteen different arrays of 15×15 pixels each). The synapses CB2 going from layer S1 to layer C2 scan the maps in layer S1 with 4×4 filters, with a filter shift of 1 pixel. At layer C2, there are 22 12×12 feature maps. An activation function P2 (pooling) is applied before going from layer C2 to layer S2, which pools values from consecutive non-overlapping 2×2 regions in each feature map. At layer S2, there are 22 6×6 feature maps. An activation function (pooling) is applied at the synapses CB3 going from layer S2 to layer C3, where every neuron in layer C3 connects to every map in layer S2 via a respective synapse of CB3. At layer C3, there are 64 neurons. The synapses CB4 going from layer C3 to the output layer S3 fully connect C3 to S3, i.e., every neuron in layer C3 is connected to every neuron in layer S3. The output at S3 includes 10 neurons, where the highest output neuron determines the class. This output could, for example, indicate an identification or classification of the contents of the original image.
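The feature-map sizes quoted for layers C1, S1, C2, and S2 follow from simple arithmetic, which the sketch below checks. The helper names (`conv_size`, `pool_size`, `average_pool`) are illustrative and not taken from the source; averaging is used for the pooling step, although the text notes that a max function may be used instead.

```python
def conv_size(n, k, stride=1):
    """Output width of an n-wide map scanned by a k-wide filter."""
    return (n - k) // stride + 1


def pool_size(n, p=2):
    """Output width after pooling non-overlapping p x p regions."""
    return n // p


def average_pool(fmap, p=2):
    """Average consecutive, non-overlapping p x p regions of a square map."""
    n = len(fmap) // p
    return [
        [
            sum(fmap[r * p + i][c * p + j] for i in range(p) for j in range(p)) / (p * p)
            for c in range(n)
        ]
        for r in range(n)
    ]


c1 = conv_size(32, 3)   # S0 (32x32) scanned by a 3x3 filter
s1 = pool_size(c1)      # P1 pooling over 2x2 regions
c2 = conv_size(s1, 4)   # S1 scanned by a 4x4 filter
s2 = pool_size(c2)      # P2 pooling over 2x2 regions
```

The four values come out as 30, 15, 12, and 6, in agreement with the 30×30, 15×15, 12×12, and 6×6 maps described in the text.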

Each layer of synapses is implemented using an array, or a portion of an array, of non-volatile memory cells.

Figure 7 is a block diagram of an array that can be used for that purpose. Vector-by-matrix multiplication (VMM) array 32 includes non-volatile memory cells and is utilized as the synapses (such as CB1, CB2, CB3, and CB4 in Figure 6) between one layer and the next layer. Specifically, VMM array 32 comprises an array of non-volatile memory cells 33, erase gate and word line gate decoder 34, control gate decoder 35, bit line decoder 36, and source line decoder 37, which decode the respective inputs for the non-volatile memory cell array 33. Input to VMM array 32 can be from the erase gate and word line gate decoder 34 or from the control gate decoder 35. In this example, source line decoder 37 also decodes the output of the non-volatile memory cell array 33. Alternatively, bit line decoder 36 can decode the output of the non-volatile memory cell array 33.

Non-volatile memory cell array 33 serves two purposes. First, it stores the weights that will be used by the VMM array 32. Second, it effectively multiplies the inputs by the weights stored in it and adds the results per output line (source line or bit line) to produce the output, which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, non-volatile memory cell array 33 negates the need for separate multiplication and addition logic circuits, and it is also power efficient due to its in-situ memory computation.
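As a purely numerical illustration of the multiply-and-add behavior described above, the sketch below models the array as a weight matrix in which each input drives one row and each output line collects the products from one column. The function name `vmm` and the row/column assignment are assumptions for illustration; in the hardware the products are cell currents that sum on a source line or bit line.

```python
def vmm(inputs, weights):
    """Vector-by-matrix multiply.

    inputs:  one value per row of the array.
    weights: weights[row][col], the stored weight of each cell.
    Returns one summed value per output line (column).
    """
    n_cols = len(weights[0])
    outputs = [0.0] * n_cols
    for x, w_row in zip(inputs, weights):
        for col, w in enumerate(w_row):
            # Each cell contributes input * weight to its output line.
            outputs[col] += x * w
    return outputs


# Hypothetical 2-row, 2-column array.
result = vmm([1.0, 2.0], [[0.5, 1.0], [0.25, 0.0]])
```

Here the first output line receives 1.0 × 0.5 + 2.0 × 0.25 and the second receives 1.0 × 1.0 + 2.0 × 0.0, so storage and arithmetic happen in the same structure.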

The output of non-volatile memory cell array 33 is supplied to a differential summer 38 (such as a summing op-amp or a summing current mirror), which sums up the outputs of the non-volatile memory cell array 33 to create a single value for that convolution. The differential summer 38 is arranged to perform summation of both positive weight and negative weight inputs.

The summed output values of differential summer 38 are then supplied to an activation function block 39, which rectifies the output. The activation function block 39 may provide a sigmoid, tanh, or ReLU function. The rectified output values of activation function block 39 become an element of a feature map of the next layer (e.g., C1 in Figure 6) and are then applied to the next synapse to produce the next feature map layer or final layer. Therefore, in this example, non-volatile memory cell array 33 constitutes a plurality of synapses (which receive their inputs from the prior layer of neurons or from an input layer such as an image database), and summing op-amp 38 and activation function block 39 constitute a plurality of neurons.
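The neuron stage (differential summer followed by an activation function) can be sketched numerically as below. How positive and negative weights are routed to the summer is not detailed in this passage, so the split into separate `pos_outputs` and `neg_outputs` lists is an assumption made for illustration, as are the function names.

```python
import math


def differential_sum(pos_outputs, neg_outputs):
    """Sum the positive-weight outputs and subtract the negative-weight outputs."""
    return sum(pos_outputs) - sum(neg_outputs)


def activate(x, kind="relu"):
    """Apply one of the activation functions named in the text."""
    if kind == "relu":
        return max(0.0, x)
    if kind == "sigmoid":
        return 1.0 / (1.0 + math.exp(-x))
    if kind == "tanh":
        return math.tanh(x)
    raise ValueError(f"unknown activation: {kind}")


# Hypothetical array outputs for one convolution position.
summed = differential_sum([1.0, 2.0], [0.5])
neuron_output = activate(summed, "relu")
```

The single value returned by `activate` plays the role of one element of the next layer's feature map.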

The input to VMM array 32 in Figure 7 (WLx, EGx, CGx, and optionally BLx and SLx) can be analog levels, binary levels, or digital bits (in which case a DAC is provided to convert the digital bits to the appropriate input analog levels), and the output can be analog levels, binary levels, or digital bits (in which case an output ADC is provided to convert the output analog levels into digital bits).

Figure 8 is a block diagram depicting the usage of numerous layers of VMM arrays 32, here labeled as VMM arrays 32a, 32b, 32c, 32d, and 32e. As shown in Figure 8, the input, denoted Inputx, is converted from digital to analog by a digital-to-analog converter 31 and provided to input VMM array 32a. The converted analog inputs could be voltage or current. The input D/A conversion for the first layer could be done by using a function or a LUT (look-up table) that maps the inputs Inputx to appropriate analog levels for the matrix multiplier of input VMM array 32a. The input conversion could also be done by an analog-to-analog (A/A) converter to convert an external analog input to a mapped analog input to the input VMM array 32a.
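A look-up-table mapping from digital input codes to analog levels might look like the sketch below. The linear spacing, the 0.0 to 1.0 range, and the names `build_lut` and `to_analog` are assumptions for illustration only; the text says only that a function or LUT maps the inputs to appropriate analog levels.

```python
def build_lut(bits, v_min, v_max):
    """Build a table with one analog level per digital code (linear spacing assumed)."""
    levels = 2 ** bits
    step = (v_max - v_min) / (levels - 1)
    return [v_min + code * step for code in range(levels)]


def to_analog(code, lut):
    """Map a digital input code to its analog level via the table."""
    return lut[code]


# 5-bit inputs, as in the 5-bit pixel example; the voltage range is hypothetical.
lut = build_lut(bits=5, v_min=0.0, v_max=1.0)
level = to_analog(16, lut)
```

A non-linear table could be substituted without changing `to_analog`, which is one reason a LUT is a convenient way to express the mapping.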

The output generated by input VMM array 32a is provided as an input to the next VMM array (hidden level 1) 32b, which in turn generates an output that is provided as an input to the next VMM array (hidden level 2) 32c, and so on. The various layers of VMM array 32 function as different layers of synapses and neurons of a convolutional neural network (CNN). Each VMM array 32a, 32b, 32c, 32d, and 32e can be a stand-alone physical non-volatile memory array, or multiple VMM arrays could utilize different portions of the same physical non-volatile memory array, or multiple VMM arrays could utilize overlapping portions of the same physical non-volatile memory array. The example shown in Figure 8 contains five layers (32a, 32b, 32c, 32d, 32e): one input layer (32a), two hidden layers (32b, 32c), and two fully connected layers (32d, 32e). One of ordinary skill in the art will appreciate that this is merely exemplary and that a system could instead comprise more than two hidden layers and more than two fully connected layers.

Vector-by-Matrix Multiplication (VMM) Arrays

Figure 9 depicts neuron VMM array 900, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 900 comprises memory array 901 of non-volatile memory cells and reference array 902 of non-volatile reference memory cells (at the top of the array). Alternatively, another reference array can be placed at the bottom.

In VMM array 900, control gate lines, such as control gate line 903, run in the vertical direction (hence reference array 902, in the row direction, is orthogonal to control gate line 903), and erase gate lines, such as erase gate line 904, run in the horizontal direction. Here, the inputs to VMM array 900 are provided on the control gate lines (CG0, CG1, CG2, CG3), and the output of VMM array 900 emerges on the source lines (SL0, SL1). In one example, only even rows are used, and in another example, only odd rows are used. The current placed on each source line (SL0, SL1, respectively) performs a summing function of all the currents from the memory cells connected to that particular source line.

As described herein for neural networks, the non-volatile memory cells of VMM array 900, that is, memory cells 310 of VMM array 900, can be configured to operate in the sub-threshold region.

The non-volatile reference memory cells and the non-volatile memory cells described herein are biased in weak inversion (the sub-threshold region):

$$I_{ds} = I_o \, e^{(V_g - V_{th})/(n V_t)} = w \, I_o \, e^{V_g/(n V_t)}, \quad \text{where } w = e^{-V_{th}/(n V_t)}$$

where $I_{ds}$ is the drain-to-source current; $V_g$ is the gate voltage on the memory cell; $V_{th}$ is the threshold voltage of the memory cell; $V_t$ is the thermal voltage, $V_t = kT/q$, with $k$ being the Boltzmann constant, $T$ the temperature in kelvins, and $q$ the electronic charge; $n$ is the slope factor, $n = 1 + (C_{dep}/C_{ox})$, with $C_{dep}$ being the capacitance of the depletion layer and $C_{ox}$ the capacitance of the gate oxide layer; and $I_o$ is the memory cell current at a gate voltage equal to the threshold voltage. $I_o$ is proportional to $(W_t/L)\, u \, C_{ox} \, (n-1) \, V_t^2$, where $u$ is the carrier mobility and $W_t$ and $L$ are the width and length, respectively, of the memory cell.

For an I-to-V log converter that uses a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor to convert an input current into an input voltage:

$$V_g = n \, V_t \, \ln\!\left[\frac{I_{ds}}{w_p \, I_o}\right]$$

where $w_p$ is the $w$ of the reference or peripheral memory cell.

For a memory array used as a vector-by-matrix multiplier (VMM) array with a current input, the output current is:

$$I_{out} = w_a \, I_o \, e^{V_g/(n V_t)}, \quad \text{namely} \quad I_{out} = (w_a/w_p) \, I_{in} = W \, I_{in}$$

$$W = e^{(V_{thp} - V_{tha})/(n V_t)}$$

Here, $w_a$ is the $w$ of each memory cell in the memory array, $V_{thp}$ is the effective threshold voltage of the peripheral memory cell, and $V_{tha}$ is the effective threshold voltage of the main (data) memory cell. Note that the threshold voltage of a transistor is a function of the substrate body bias voltage, and the substrate body bias voltage, denoted $V_{sb}$, can be modulated to compensate for various conditions, such as temperature. The threshold voltage $V_{th}$ can be expressed as:

$$V_{th} = V_{th0} + \gamma \left( \sqrt{\left| V_{sb} - 2\phi_F \right|} - \sqrt{\left| 2\phi_F \right|} \right)$$

where $V_{th0}$ is the threshold voltage with zero substrate bias, $\phi_F$ is the surface potential, and $\gamma$ is the body-effect parameter.
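
The sub-threshold relations above can be checked numerically. The following is a minimal sketch, assuming an ideal exponential cell model with the same Io and n for the peripheral and array cells; the function names and all device values (n, Io, the two threshold voltages, the input current) are illustrative assumptions, not figures from the specification.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # electronic charge, C


def thermal_voltage(temp_kelvin):
    """Vt = k * T / q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def cell_current(vg, vth, io, n, vt):
    """Sub-threshold cell current: Ids = Io * exp((Vg - Vth) / (n * Vt))."""
    return io * math.exp((vg - vth) / (n * vt))


def log_converter(iin, vthp, io, n, vt):
    """I-to-V log converter: Vg = n * Vt * ln(Iin / (wp * Io))."""
    wp = math.exp(-vthp / (n * vt))
    return n * vt * math.log(iin / (wp * io))


# Illustrative (assumed) values.
vt = thermal_voltage(300.0)   # about 25.85 mV at 300 K
n, io = 1.5, 1e-7             # slope factor, current at Vg = Vth (A)
vthp, vtha = 0.70, 0.75       # peripheral and array cell thresholds (V)
iin = 2e-9                    # input current (A)

vg = log_converter(iin, vthp, io, n, vt)      # peripheral cell: current -> voltage
iout = cell_current(vg, vtha, io, n, vt)      # array cell: voltage -> current
weight = math.exp((vthp - vtha) / (n * vt))   # W = exp((Vthp - Vtha) / (n * Vt))

# iout matches weight * iin up to floating-point rounding.
print(iout, weight * iin)
```

Because the array cell here has the higher threshold voltage, the resulting weight is below 1; programming the cell to a lower threshold would raise the weight.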

A word line or a control gate can be used as the input of the memory cell for the input voltage.

Alternatively, the flash memory cells of the VMM arrays described herein can be configured to operate in the linear region:

$$I_{ds} = \beta \, (V_{gs} - V_{th}) \, V_{ds}; \quad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})$$

meaning that the weight $W$ in the linear region is proportional to $(V_{gs} - V_{th})$.

A word line, control gate, bit line, or source line can be used as the input of a memory cell operating in the linear region. A bit line or source line can be used as the output of the memory cell.

For an I-to-V linear converter, a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor operating in the linear region can be used to linearly convert an input/output current into an input/output voltage.

Alternatively, the memory cells of the VMM arrays described herein can be configured to operate in the saturation region:

$$I_{ds} = \tfrac{1}{2} \, \beta \, (V_{gs} - V_{th})^2; \quad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})^2$$

meaning that the weight $W$ is proportional to $(V_{gs} - V_{th})^2$.

A word line, control gate, or erase gate can be used as the input of a memory cell operating in the saturation region. A bit line or source line can be used as the output of the output neuron.
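
The linear-region and saturation-region relations can be compared with a short numerical sketch. It assumes the textbook long-channel MOSFET expressions, including the conventional factor of 1/2 in the saturation square law; the function names and device values are illustrative assumptions only.

```python
def beta(u, cox, wt, length):
    """Transconductance parameter: beta = u * Cox * Wt / L."""
    return u * cox * wt / length


def ids_linear(vgs, vth, vds, b):
    """Linear region: Ids = beta * (Vgs - Vth) * Vds."""
    return b * (vgs - vth) * vds


def ids_saturation(vgs, vth, b):
    """Saturation region: Ids = 1/2 * beta * (Vgs - Vth)^2."""
    return 0.5 * b * (vgs - vth) ** 2


# Illustrative (assumed) device values.
b = beta(u=0.04, cox=8e-3, wt=1e-7, length=1e-7)
vth = 0.7

# Doubling the overdrive (Vgs - Vth) from 0.2 V to 0.4 V:
lin_ratio = ids_linear(1.1, vth, 0.1, b) / ids_linear(0.9, vth, 0.1, b)
sat_ratio = ids_saturation(1.1, vth, b) / ids_saturation(0.9, vth, b)
print(lin_ratio, sat_ratio)  # approximately 2 (linear) and 4 (quadratic)
```

The two ratios illustrate why the stored weight scales with the overdrive in the linear region and with its square in the saturation region.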

Alternatively, the memory cells of the VMM arrays described herein can be used in all regions or a combination thereof (sub-threshold, linear, or saturation) for each layer or for multiple layers of a neural network.

Other examples of VMM array 32 of Figure 7 are described in U.S. Patent No. 10,748,630, which is incorporated herein by reference. As described in that application, a source line or a bit line can be used as the neuron output (current summation output).

Figure 10 depicts neuron VMM array 1000, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses between an input layer and the next layer. VMM array 1000 comprises memory array 1003 of non-volatile memory cells, reference array 1001 of first non-volatile reference memory cells, and reference array 1002 of second non-volatile reference memory cells. Reference arrays 1001 and 1002, arranged in the column direction of the array, serve to convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs WL0, WL1, WL2, and WL3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexers 1014 (only partially depicted), with the current inputs flowing into them. The reference cells are tuned (e.g., programmed) to target reference levels. The target reference levels are provided by a reference mini-array matrix (not shown).

Memory array 1003 serves two purposes. First, it stores the weights that will be used by VMM array 1000 on its respective memory cells. Second, memory array 1003 effectively multiplies the inputs (that is, the current inputs provided at terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1001 and 1002 convert into the input voltages supplied to word lines WL0, WL1, WL2, and WL3) by the weights stored in memory array 1003, and then adds all the results (memory cell currents) to produce the output on the respective bit lines (BL0 to BLN), which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, memory array 1003 eliminates the need for separate multiplication and addition logic circuits and is also power efficient. Here, the voltage inputs are provided on word lines WL0, WL1, WL2, and WL3, and the output emerges on the respective bit lines BL0 to BLN during a read (inference) operation. The current placed on each of bit lines BL0 to BLN performs a summing function of the currents from all non-volatile memory cells connected to that particular bit line.
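
The multiply-and-sum behavior described above amounts to a vector-by-matrix product. The sketch below assumes the idealized relation Iout = W * Iin for every cell (as in the sub-threshold case) and ignores non-idealities; `vmm_read`, the weight values, and the input currents are hypothetical names and numbers chosen for illustration.

```python
def vmm_read(input_currents, weights):
    """Return the bit-line currents I_BL[j] = sum_i weights[i][j] * input_currents[i].

    input_currents: one value per word line, i.e. the currents on BLR0..BLR3
                    that the reference arrays convert to word-line voltages.
    weights:        weights[i][j] is the weight stored in the cell at
                    word line i and bit line j.
    """
    if len(weights) != len(input_currents):
        raise ValueError("one row of weights is needed per input")
    n_bit_lines = len(weights[0])
    outputs = [0.0] * n_bit_lines
    for i, iin in enumerate(input_currents):
        for j in range(n_bit_lines):
            # Each cell contributes weight * input; the bit line sums the currents.
            outputs[j] += weights[i][j] * iin
    return outputs


# Four word lines (WL0..WL3) and two bit lines (BL0, BL1); values are assumed.
weights = [
    [0.5, 1.0],
    [0.25, 0.0],
    [1.0, 0.5],
    [0.0, 0.75],
]
inputs_na = [4.0, 8.0, 2.0, 4.0]  # input currents in nA

bit_line_currents = vmm_read(inputs_na, weights)
print(bit_line_currents)  # [6.0, 8.0]
```

In the array itself no explicit loop exists: the multiplications happen in the cells and the addition happens on the shared bit line, which is why separate multiplication and addition logic is unnecessary.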

Table 5 depicts operating voltages and currents for VMM array 1000. The columns in the table indicate the voltages placed on the word lines for selected cells, the word lines for unselected cells, the bit lines for selected cells, the bit lines for unselected cells, the source lines for selected cells, and the source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 5: Operation of VMM array 1000 of Figure 10

| Operation | WL | WL - unselected | BL | BL - unselected | SL | SL - unselected |
|---|---|---|---|---|---|---|
| Read | 1-3.5 V | -0.5 V / 0 V | 0.6-2 V (Ineuron) | 0.6-2 V / 0 V | 0 V | 0 V |
| Erase | ~5-13 V | 0 V | 0 V | 0 V | 0 V | 0 V |
| Program | 1-2 V | -0.5 V / 0 V | 0.1-3 uA | Vinh ~2.5 V | 4-10 V | 0-1 V / FLT |

Figure 11 depicts neuron VMM array 1100, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1100 comprises memory array 1103 of non-volatile memory cells, reference array 1101 of first non-volatile reference memory cells, and reference array 1102 of second non-volatile reference memory cells. Reference arrays 1101 and 1102 run in the row direction of VMM array 1100. VMM array 1100 is similar to VMM array 1000, except that in VMM array 1100 the word lines run in the vertical direction. Here, the inputs are provided on the word lines (WLA0, WLB0, WLA1, WLB1, WLA2, WLB2, WLA3, WLB3), and the output emerges on the source lines (SL0, SL1) during a read operation. The current placed on each source line performs a summing function of all the currents from the memory cells connected to that particular source line.

Table 6 depicts operating voltages and currents for VMM array 1100. The columns in the table indicate the voltages placed on the word lines for selected cells, the word lines for unselected cells, the bit lines for selected cells, the bit lines for unselected cells, the source lines for selected cells, and the source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 6: Operation of VMM array 1100 of Figure 11

| Operation | WL | WL - unselected | BL | BL - unselected | SL | SL - unselected |
|---|---|---|---|---|---|---|
| Read | 1-3.5 V | -0.5 V / 0 V | 0.6-2 V | 0.6-2 V / 0 V | ~0.3-1 V (Ineuron) | 0 V |
| Erase | ~5-13 V | 0 V | 0 V | 0 V | 0 V | SL-inhibit (~4-8 V) |
| Program | 1-2 V | -0.5 V / 0 V | 0.1-3 uA | Vinh ~2.5 V | 4-10 V | 0-1 V / FLT |

Figure 12 depicts neuron VMM array 1200, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1200 comprises memory array 1203 of non-volatile memory cells, reference array 1201 of first non-volatile reference memory cells, and reference array 1202 of second non-volatile reference memory cells. Reference arrays 1201 and 1202 serve to convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs CG0, CG1, CG2, and CG3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexers 1212 (only partially shown), with the current inputs flowing into them through BLR0, BLR1, BLR2, and BLR3. Multiplexers 1212 each include a respective multiplexer 1205 and a cascoding transistor 1204 to ensure a constant voltage on the bit line (such as BLR0) of each of the first and second non-volatile reference memory cells during a read operation. The reference cells are tuned to target reference levels.

Memory array 1203 serves two purposes. First, it stores the weights that will be used by VMM array 1200. Second, memory array 1203 effectively multiplies the inputs (the current inputs provided to terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1201 and 1202 convert into the input voltages supplied to control gates CG0, CG1, CG2, and CG3) by the weights stored in the memory array, and then adds all the results (cell currents) to produce the output, which appears on BL0 to BLN and will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, the memory array eliminates the need for separate multiplication and addition logic circuits and is also power efficient. Here, the inputs are provided on the control gate lines (CG0, CG1, CG2, and CG3), and the output emerges on the bit lines (BL0 to BLN) during a read operation. The current placed on each bit line performs a summing function of all the currents from the memory cells connected to that particular bit line.

VMM array 1200 implements unidirectional tuning for the non-volatile memory cells in memory array 1203. That is, each non-volatile memory cell is erased and then partially programmed until the desired charge on the floating gate is reached. If too much charge is placed on the floating gate (such that the wrong value is stored in the cell), the cell is erased and the sequence of partial programming operations starts over. As shown, two rows sharing the same erase gate (such as EG0 or EG1) are erased together (which is known as a page erase), and thereafter each cell is partially programmed until the desired charge on the floating gate is reached.
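
The unidirectional tuning sequence described above (erase, partially program, and start over after an overshoot) can be sketched as a simple loop. This is a toy model under stated assumptions: the floating-gate charge is an abstract number, every pulse adds a fixed increment, and halving the pulse size after each restart is a choice made here so that the loop converges; none of these details come from the specification.

```python
def tune_unidirectional(target, tolerance, step, max_pulses=10_000):
    """Erase, then apply partial-program pulses until the level is near target.

    Returns (final_level, program_pulses, erase_count).
    """
    level = 0.0   # abstract floating-gate charge right after an erase
    pulses = 0
    erases = 1    # the initial (page) erase
    while pulses < max_pulses:
        if abs(level - target) <= tolerance:
            return level, pulses, erases
        if level > target + tolerance:
            # Too much charge: charge cannot be removed selectively here,
            # so the cell is erased and the programming sequence restarts.
            level = 0.0
            erases += 1
            step /= 2  # assumption: use finer pulses on the retry
            continue
        level += step  # one partial-program pulse
        pulses += 1
    raise RuntimeError("cell did not converge")


result = tune_unidirectional(target=1.0, tolerance=0.0625, step=0.375)
print(result)  # (0.9375, 8, 2)
```

In this run the first pass overshoots after three pulses, so the cell is erased and reaches the target on the second pass after five finer pulses.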

Table 7 depicts operating voltages and currents for VMM array 1200. The columns in the table indicate the voltages placed on the word lines for selected cells, the word lines for unselected cells, the bit lines for selected cells, the bit lines for unselected cells, the control gates for selected cells, the control gates for unselected cells in the same sector as the selected cells, the control gates for unselected cells in a different sector than the selected cells, the erase gates for selected cells, the erase gates for unselected cells, the source lines for selected cells, and the source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 7: Operation of VMM array 1200 of Figure 12

| Operation | WL | WL - unselected | BL | BL - unselected | CG | CG - unselected, same sector | CG - unselected | EG | EG - unselected | SL | SL - unselected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2 V | -0.5 V / 0 V | 0.6-2 V (Ineuron) | 0 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0 V | 0 V |
| Erase | 0 V | 0 V | 0 V | 0 V | 0 V | 0-2.6 V | 0-2.6 V | 5-12 V | 0-2.6 V | 0 V | 0 V |
| Program | 0.7-1 V | -0.5 V / 0 V | 0.1-1 uA | Vinh (1-2 V) | 4-11 V | 0-2.6 V | 0-2.6 V | 4.5-5 V | 0-2.6 V | 4.5-5 V | 0-1 V |

Figure 13 depicts neuron VMM array 1300, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1300 comprises memory array 1303 of non-volatile memory cells, reference array 1301 of first non-volatile reference memory cells, and reference array 1302 of second non-volatile reference memory cells. EG lines EGR0, EG0, EG1, and EGR1 run vertically, while CG lines CG0, CG1, CG2, and CG3 and SL lines WL0, WL1, WL2, and WL3 run horizontally. VMM array 1300 is similar to VMM array 1200, except that VMM array 1300 implements bidirectional tuning, where each individual cell can be completely erased, partially programmed, and partially erased as needed to reach the desired amount of charge on the floating gate, owing to the use of separate EG lines. As shown, reference arrays 1301 and 1302 convert the input currents at terminals BLR0, BLR1, BLR2, and BLR3 into control gate voltages CG0, CG1, CG2, and CG3 (through the action of diode-connected reference cells through multiplexers 1314) to be applied to the memory cells in the row direction. The current outputs (neurons) are on bit lines BL0 to BLN, where each bit line sums all the currents from the non-volatile memory cells connected to that particular bit line.
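
Bidirectional tuning can be sketched in the same toy model used for illustration of tuning loops: because an individual cell can be partially erased as well as partially programmed, an overshoot is corrected with a pulse in the opposite direction rather than a full erase. The abstract charge units, the fixed pulse size, and the halving of the pulse after each change of direction are assumptions made for this example only.

```python
def tune_bidirectional(target, tolerance, step, level=0.0, max_pulses=10_000):
    """Move level toward target with partial-program (+) and partial-erase (-) pulses.

    Returns (final_level, total_pulses).
    """
    pulses = 0
    last_direction = 0  # +1 = partial program, -1 = partial erase, 0 = none yet
    while abs(level - target) > tolerance:
        if pulses >= max_pulses:
            raise RuntimeError("cell did not converge")
        direction = 1 if level < target else -1
        if last_direction and direction != last_direction:
            step /= 2  # assumption: finer pulses after each overshoot
        level += direction * step
        last_direction = direction
        pulses += 1
    return level, pulses


result = tune_bidirectional(target=1.0, tolerance=0.0625, step=0.375)
print(result)  # (0.9375, 4)
```

With the same target, tolerance, and initial pulse size, a scheme that must restart from a full erase after an overshoot needs more pulses, which illustrates the benefit of per-cell partial erase.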

Table 8 depicts operating voltages and currents for VMM array 1300. The columns in the table indicate the voltages placed on the word lines for selected cells, the word lines for unselected cells, the bit lines for selected cells, the bit lines for unselected cells, the control gates for selected cells, the control gates for unselected cells in the same sector as the selected cells, the control gates for unselected cells in a different sector than the selected cells, the erase gates for selected cells, the erase gates for unselected cells, the source lines for selected cells, and the source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 8: Operation of VMM array 1300 of Figure 13

| Operation | WL | WL - unselected | BL | BL - unselected | CG | CG - unselected, same sector | CG - unselected | EG | EG - unselected | SL | SL - unselected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2 V | -0.5 V / 0 V | 0.6-2 V (Ineuron) | 0 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0-2.6 V | 0 V | 0 V |
| Erase | 0 V | 0 V | 0 V | 0 V | 0 V | 4-9 V | 0-2.6 V | 5-12 V | 0-2.6 V | 0 V | 0 V |
| Program | 0.7-1 V | -0.5 V / 0 V | 0.1-1 uA | Vinh (1-2 V) | 4-11 V | 0-2.6 V | 0-2.6 V | 4.5-5 V | 0-2.6 V | 4.5-5 V | 0-1 V |

Figure 22 depicts neuron VMM array 2200, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In VMM array 2200, inputs INPUT0, ..., INPUTN are received on bit lines BL0, ..., BLN, respectively, and outputs OUTPUT1, OUTPUT2, OUTPUT3, and OUTPUT4 are generated on source lines SL0, SL1, SL2, and SL3, respectively.

Figure 23 depicts neuron VMM array 2300, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, INPUT1, INPUT2, and INPUT3 are received on source lines SL0, SL1, SL2, and SL3, respectively, and outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 24 depicts neuron VMM array 2400, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 25 depicts neuron VMM array 2500, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 26 depicts neuron VMM array 2600, which is particularly suited for memory cells 410 as shown in Figure 4, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTN are received on vertical control gate lines CG0, ..., CGN, respectively, and outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

Figure 27 depicts neuron VMM array 2700, which is particularly suited for memory cells 410 as shown in Figure 4, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTN are received on the gates of bit line control gates 2701-1, 2701-2, ..., 2701-(N-1), and 2701-N, respectively, which are coupled to bit lines BL0, ..., BLN, respectively. Example outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

Figure 28 depicts neuron VMM array 2800, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, and outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN, respectively.

Figure 29 depicts neuron VMM array 2900, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM. Outputs OUTPUT0, ..., OUTPUTN are generated on vertical source lines SL0, ..., SLN, respectively, where each source line SLi is coupled to the source lines of all memory cells in column i.

Figure 30 depicts neuron VMM array 3000, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM. Outputs OUTPUT0, ..., OUTPUTN are generated on vertical bit lines BL0, ..., BLN, respectively, where each bit line BLi is coupled to the bit lines of all memory cells in column i.

Long Short-Term Memory

The prior art includes a concept known as long short-term memory (LSTM). LSTM units are often used in neural networks. LSTM allows a neural network to remember information over predetermined arbitrary time intervals and to use that information in subsequent operations. A conventional LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. The three gates regulate the flow of information into and out of the cell and the time interval over which the information is remembered in the LSTM. VMMs are particularly useful in LSTM units.

Figure 14 depicts an example LSTM 1400. LSTM 1400 in this example comprises cells 1401, 1402, 1403, and 1404. Cell 1401 receives input vector x0 and generates output vector h0 and cell state vector c0. Cell 1402 receives input vector x1, the output vector (hidden state) h0 from cell 1401, and the cell state c0 from cell 1401, and generates output vector h1 and cell state vector c1. Cell 1403 receives input vector x2, the output vector (hidden state) h1 from cell 1402, and the cell state c1 from cell 1402, and generates output vector h2 and cell state vector c2. Cell 1404 receives input vector x3, the output vector (hidden state) h2 from cell 1403, and the cell state c2 from cell 1403, and generates output vector h3. Additional cells can be used; an LSTM with four cells is merely an example.

Figure 15 depicts an example implementation of LSTM cell 1500, which can be used for cells 1401, 1402, 1403, and 1404 in Figure 14. LSTM cell 1500 receives input vector x(t), cell state vector c(t-1) from a preceding cell, and output vector h(t-1) from a preceding cell, and generates cell state vector c(t) and output vector h(t).

LSTM cell 1500 comprises sigmoid function devices 1501, 1502, and 1503, each of which applies a number between 0 and 1 to control how much of each component in the input vector is allowed through to the output vector. LSTM cell 1500 also comprises hyperbolic tangent (tanh) devices 1504 and 1505 to apply the hyperbolic tangent function to an input vector, multiplier devices 1506, 1507, and 1508 to multiply two vectors together, and addition device 1509 to add two vectors together. Output vector h(t) can be provided to the next LSTM cell in the system, or it can be accessed for other purposes.
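
The data flow through an LSTM cell of this kind can be sketched with the standard LSTM update equations. This is a minimal sketch rather than the circuit of Figure 15: it assumes that the three sigmoid devices produce the forget, input, and output gates f(t), i(t), and o(t), that one tanh device produces the candidate u(t), that each of them acts on the concatenation of h(t-1) and x(t), and that biases are omitted. The function names and the weight values are hypothetical.

```python
import math


def sigmoid_vec(v):
    """Element-wise sigmoid (the role of sigmoid devices 1501-1503)."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh_vec(v):
    """Element-wise hyperbolic tangent (the role of tanh devices 1504 and 1505)."""
    return [math.tanh(a) for a in v]


def matvec(w, v):
    """Matrix-vector product; in hardware this is the job of a VMM array."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def lstm_cell(x, h_prev, c_prev, w_f, w_i, w_u, w_o):
    """One LSTM step: returns (h, c) given x(t), h(t-1), c(t-1).

    Each weight matrix has one row per state element and one column per
    element of the concatenated vector [h(t-1), x(t)].
    """
    z = h_prev + x                      # list concatenation
    f = sigmoid_vec(matvec(w_f, z))     # forget gate f(t)
    i = sigmoid_vec(matvec(w_i, z))     # input gate i(t)
    o = sigmoid_vec(matvec(w_o, z))     # output gate o(t)
    u = tanh_vec(matvec(w_u, z))        # candidate u(t)
    # Two element-wise multiplications and one addition give the new cell state.
    c = [fk * ck + ik * uk for fk, ck, ik, uk in zip(f, c_prev, i, u)]
    # A third multiplication gates tanh(c(t)) to give the output vector.
    h = [ok * tk for ok, tk in zip(o, tanh_vec(c))]
    return h, c


# Chain of four cells, one scalar state and one scalar input per step.
w_f = w_i = w_o = [[0.5, -0.25]]
w_u = [[1.0, 0.5]]
h, c = [0.0], [0.0]
for x in ([1.0], [0.5], [-1.0], [0.0]):  # x0, x1, x2, x3
    h, c = lstm_cell(x, h, c, w_f, w_i, w_u, w_o)
```

The four `matvec` calls are the operations that a VMM array would perform in the analog domain; the remaining element-wise operations correspond to the activation, multiplier, and addition devices.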

Figure 16 depicts LSTM cell 1600, which is an example of an implementation of LSTM cell 1500. For the reader's convenience, the same numbering from LSTM cell 1500 is used in LSTM cell 1600. Sigmoid function devices 1501, 1502, and 1503 and tanh device 1504 each comprise multiple VMM arrays 1601 and activation function blocks 1602. Thus, it can be seen that VMM arrays are particularly useful in LSTM cells used in certain neural network systems. Multiplier devices 1506, 1507, and 1508 and addition device 1509 are implemented in a digital manner or in an analog manner. Activation function blocks 1602 can be implemented in a digital manner or in an analog manner.

An alternative to LSTM cell 1600 (and another example of an implementation of LSTM cell 1500) is shown in Figure 17. In Figure 17, sigmoid function components 1501, 1502, and 1503 and hyperbolic tangent component 1504 share the same physical hardware (VMM array 1701 and activation function block 1702) in a time-multiplexed fashion. LSTM cell 1700 also comprises multiplier component 1703 to multiply two vectors together, addition component 1708 to add two vectors together, hyperbolic tangent component 1505 (which comprises activation function block 1702), register 1707 to store the value i(t) when i(t) is output from sigmoid function block 1702, register 1704 to store the value f(t) * c(t-1) when that value is output from multiplier component 1703 through multiplexer 1710, register 1705 to store the value i(t) * u(t) when that value is output from multiplier component 1703 through multiplexer 1710, and register 1706 to store the value o(t) * c~(t) when that value is output from multiplier component 1703 through multiplexer 1710 and multiplexer 1709.

LSTM cell 1600 contains multiple sets of VMM arrays 1601 and respective activation function blocks 1602, whereas LSTM cell 1700 contains only one set of VMM arrays 1701 and activation function block 1702, which is used to represent multiple layers in this example of LSTM cell 1700. LSTM cell 1700 will require less space than LSTM cell 1600, because LSTM cell 1700 will require 1/4 as much space for VMMs and activation function blocks as LSTM cell 1600.
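The time-multiplexed arrangement can be illustrated with a short sketch in which a single shared block is used in four sequential time slots and intermediate products are latched, as the registers of Figure 17 do. The class and function names are hypothetical, bias terms are omitted, and passing a different weight matrix in each slot is an assumption made for illustration; how the hardware selects the weights for each slot is not modeled here.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

class SharedVMM:
    """One VMM array plus activation block, reused across time slots."""
    def __init__(self):
        self.passes = 0  # counts how many time slots have used the block

    def run(self, weights, vec, activation):
        self.passes += 1
        return [activation(sum(w * a for w, a in zip(row, vec)))
                for row in weights]

def lstm_step_multiplexed(vmm, x, h_prev, c_prev, W):
    xh = list(h_prev) + list(x)
    # Four sequential time slots on the same shared hardware.
    f = vmm.run(W["f"], xh, sigmoid)
    i = vmm.run(W["i"], xh, sigmoid)
    u = vmm.run(W["u"], xh, math.tanh)
    o = vmm.run(W["o"], xh, sigmoid)
    reg_fc = [a * b for a, b in zip(f, c_prev)]   # latched f(t) * c(t-1)
    reg_iu = [a * b for a, b in zip(i, u)]        # latched i(t) * u(t)
    c = [a + b for a, b in zip(reg_fc, reg_iu)]   # adder output c(t)
    h = [a * math.tanh(b) for a, b in zip(o, c)]  # o(t) * tanh(c(t))
    return c, h
```

One shared block invoked four times per step replaces four separate blocks, which is the trade of time for area behind the 1/4 figure stated above.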

It can be further appreciated that LSTM units will typically comprise multiple VMM arrays, each of which requires functionality provided by certain circuit blocks outside of the VMM arrays, such as summer and activation function blocks and high voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry required outside of the VMM arrays themselves.

Gated Recurrent Units

An analog VMM implementation can be utilized for a gated recurrent unit (GRU) system. GRUs are a gating mechanism in recurrent neural networks. GRUs are similar to LSTMs, except that GRU cells generally contain fewer components than an LSTM cell.

Figure 18 depicts an example GRU 1800. GRU 1800 in this example comprises cells 1801, 1802, 1803, and 1804. Cell 1801 receives input vector x0 and generates output vector h0. Cell 1802 receives input vector x1 and output vector h0 from cell 1801, and generates output vector h1. Cell 1803 receives input vector x2 and output vector (hidden state) h1 from cell 1802, and generates output vector h2. Cell 1804 receives input vector x3 and output vector (hidden state) h2 from cell 1803, and generates output vector h3. Additional cells can be used, and a GRU with four cells is merely an example.

Figure 19 depicts an example implementation of GRU cell 1900, which can be used for cells 1801, 1802, 1803, and 1804 of Figure 18. GRU cell 1900 receives input vector x(t) and output vector h(t-1) from the preceding GRU cell, and generates output vector h(t). GRU cell 1900 comprises sigmoid function components 1901 and 1902, each of which applies a number between 0 and 1 to components from output vector h(t-1) and input vector x(t). GRU cell 1900 also comprises hyperbolic tangent component 1903 to apply a hyperbolic tangent function to an input vector, a plurality of multiplier components 1904, 1905, and 1906 to multiply two vectors together, addition component 1907 to add two vectors together, and complementary component 1908 to subtract an input from 1 to generate an output.
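As a software illustration of the GRU data flow, the sketch below combines two sigmoid gates, one hyperbolic tangent, three element-wise multiplications, one addition, and one "1 minus input" operation. It assumes the gate convention implied by the register contents listed for Figure 21, namely h(t) = h(t-1) * z(t) + h^(t) * (1 - z(t)), and omits bias terms. The function and weight names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    # Vector-matrix multiply, standing in for a VMM array.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def gru_cell(x, h_prev, Wr, Wz, Wh):
    """One GRU step (assumed convention, no bias terms)."""
    hx = list(h_prev) + list(x)
    r = [sigmoid(a) for a in matvec(Wr, hx)]        # reset gate r(t)
    z = [sigmoid(a) for a in matvec(Wz, hx)]        # update gate z(t)
    hr = [hp * ra for hp, ra in zip(h_prev, r)]     # h(t-1) * r(t)
    h_cand = [math.tanh(a) for a in matvec(Wh, hr + list(x))]  # h^(t)
    one_minus_z = [1.0 - za for za in z]            # complementary operation
    # h(t) = h(t-1) * z(t) + h^(t) * (1 - z(t))
    return [hp * za + hc * cz
            for hp, za, hc, cz in zip(h_prev, z, h_cand, one_minus_z)]
```

With zero weights both gates evaluate to 0.5 and the candidate to 0, so the output is half of the previous hidden state.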

Figure 20 depicts GRU cell 2000, which is an example of an implementation of GRU cell 1900. For the reader's convenience, the same numbering from GRU cell 1900 is used in GRU cell 2000. As can be seen in Figure 20, sigmoid function components 1901 and 1902 and hyperbolic tangent component 1903 each comprise multiple VMM arrays 2001 and activation function blocks 2002. Thus, it can be seen that VMM arrays are of particular use in GRU cells used in certain neural network systems. Multiplier components 1904, 1905, and 1906, addition component 1907, and complementary component 1908 are implemented in a digital manner or in an analog manner. Activation function blocks 2002 can be implemented in a digital manner or in an analog manner.

An alternative to GRU cell 2000 (and another example of an implementation of GRU cell 1900) is shown in Figure 21. In Figure 21, GRU cell 2100 utilizes VMM array 2101 and activation function block 2102, which, when configured as a sigmoid function, applies a number between 0 and 1 to control how much of each component of the input vector is allowed through to the output vector. Sigmoid function components 1901 and 1902 and hyperbolic tangent component 1903 share the same physical hardware (VMM array 2101 and activation function block 2102) in a time-multiplexed fashion. GRU cell 2100 also comprises multiplier component 2103 to multiply two vectors together, addition component 2105 to add two vectors together, complementary component 2109 to subtract an input from 1 to generate an output, multiplexer 2104, register 2106 to hold the value h(t-1) * r(t) when that value is output from multiplier component 2103 through multiplexer 2104, register 2107 to hold the value h(t-1) * z(t) when that value is output from multiplier component 2103 through multiplexer 2104, and register 2108 to hold the value h^(t) * (1-z(t)) when that value is output from multiplier component 2103 through multiplexer 2104.

GRU cell 2000 contains multiple sets of VMM arrays 2001 and activation function blocks 2002, whereas GRU cell 2100 contains only one set of VMM arrays 2101 and activation function block 2102, which is used to represent multiple layers in this example of GRU cell 2100. GRU cell 2100 will require less space than GRU cell 2000, because GRU cell 2100 will require 1/3 as much space for VMMs and activation function blocks as GRU cell 2000.

It can be further appreciated that GRU systems will typically comprise multiple VMM arrays, each of which requires functionality provided by certain circuit blocks outside of the VMM arrays, such as summer and activation function blocks and high voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry required outside of the VMM arrays themselves.

The input to the VMM arrays can be an analog level, a binary level, a pulse, a time-modulated pulse, or digital bits (in which case a DAC is needed to convert the digital bits to the appropriate input analog level), and the output can be an analog level, a binary level, a timing pulse, pulses, or digital bits (in which case an output ADC is needed to convert the output analog level into digital bits).

In general, for each memory cell in a VMM array, each weight W can be implemented by a single memory cell, by a differential cell, or by two blended memory cells (the average of two cells). In the differential cell case, two memory cells are needed to implement a weight W as a differential weight (W = W+ - W-). In the two blended memory cell case, two memory cells are needed to implement a weight W as the average of the two cells.

Figure 31 depicts VMM system 3100. In some examples, the weights W stored in a VMM array are stored as differential pairs, W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). In VMM system 3100, half of the bit lines are designated as W+ lines, that is, bit lines connecting to memory cells that will store positive weights W+, and the other half of the bit lines are designated as W- lines, that is, bit lines connecting to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3101 and 3102. The output of a W+ line and the output of a W- line are combined to give effectively W = W+ - W- for each pair of (W+, W-) cells, for all pairs of (W+, W-) lines. While the above has been described with the W- lines interspersed among the W+ lines in an alternating fashion, in other examples the W+ lines and W- lines can be located anywhere in the array.
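The alternating W+/W- arrangement can be illustrated numerically. In the sketch below, each bit line (column) accumulates input times cell weight over all rows, and each pair of adjacent bit lines is subtracted to give a signed result. The even/odd column assignment and the function names are assumptions made for illustration; cell values stand in for stored weights and are not device currents.

```python
def bitline_currents(inputs, cells):
    """cells[row][col] is the stored weight of one memory cell.
    Each bit line (column) sums input * weight over all rows."""
    n_rows, n_cols = len(cells), len(cells[0])
    return [sum(inputs[r] * cells[r][c] for r in range(n_rows))
            for c in range(n_cols)]

def differential_outputs(inputs, cells):
    """Even columns are treated as W+ lines and odd columns as W- lines.
    Each output is I(W+) - I(W-) for one pair of adjacent bit lines."""
    cur = bitline_currents(inputs, cells)
    return [cur[c] - cur[c + 1] for c in range(0, len(cur), 2)]

# Two rows, two differential column pairs (four physical bit lines).
cells = [[0.5, 0.25, 0.0, 0.5],
         [0.25, 0.0, 0.5, 0.25]]
outputs = differential_outputs([1.0, 2.0], cells)
```

Here the effective signed weights are 0.25 and -0.5 in the first row and 0.25 and 0.25 in the second, so `outputs` equals the product of the input vector with that signed matrix, even though every stored cell value is non-negative.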

Figure 32 depicts another example. In VMM system 3210, positive weights W+ are implemented in first array 3211 and negative weights W- are implemented in second array 3212, which is separate from the first array, and the resulting weights are appropriately combined by summation circuits 3213.

Figure 33 depicts VMM system 3300. The weights W stored in a VMM array are stored as differential pairs, W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). VMM system 3300 comprises array 3301 and array 3302. Half of the bit lines in each of arrays 3301 and 3302 are designated as W+ lines, that is, bit lines connecting to memory cells that will store positive weights W+, and the other half of the bit lines in each of arrays 3301 and 3302 are designated as W- lines, that is, bit lines connecting to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3303, 3304, 3305, and 3306. The output of a W+ line and the output of a W- line from each array 3301, 3302 are respectively combined to give effectively W = W+ - W- for each pair of (W+, W-) cells, for all pairs of (W+, W-) lines. In addition, the W values from arrays 3301 and 3302 can be further combined through summation circuits 3307 and 3308, such that each W value is the result of a W value from array 3301 minus a W value from array 3302. The end result from summation circuits 3307 and 3308 is therefore a differential value of two differential values.
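The two-stage combination can be written out as a short calculation. The sketch assumes ideal subtraction at each stage; the function name and argument names are hypothetical.

```python
def two_array_output(a1_plus, a1_minus, a2_plus, a2_minus):
    """Difference of two differential values.

    First stage (per array): W = (W+) - (W-).
    Second stage: W from the first array minus W from the second array.
    """
    w_first = a1_plus - a1_minus     # differential value from first array
    w_second = a2_plus - a2_minus    # differential value from second array
    return w_first - w_second

result = two_array_output(1.0, 0.25, 0.5, 0.5)
```

In this example the first array contributes 0.75 and the second contributes 0, so `result` is 0.75.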

Each non-volatile memory cell used in the analog neural memory system is to be erased and programmed to hold a very specific and precise amount of charge, that is, a specific number of electrons, in the floating gate. For example, each floating gate should hold one of N different values, where N is the number of different weights that can be indicated by each cell. Examples of N include 16, 32, 64, 128, and 256.
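To make the role of N concrete, the sketch below maps a target weight onto the nearest of N levels. Equally spaced levels over a normalized range are an assumption made purely for illustration; the actual level spacing and the programming algorithm are not specified here, and the function name is hypothetical.

```python
def nearest_level(target, n_levels, w_min=0.0, w_max=1.0):
    """Return (index, value) of the closest of n_levels equally spaced
    levels between w_min and w_max. Equal spacing is an assumption."""
    position = (target - w_min) * (n_levels - 1) / (w_max - w_min)
    index = max(0, min(n_levels - 1, round(position)))  # clamp to range
    value = w_min + index * (w_max - w_min) / (n_levels - 1)
    return index, value

index, value = nearest_level(0.26, 16)
```

With N = 16 the spacing is 1/15 of the range, so a target of 0.26 lands on level 4. A larger N narrows the spacing, which is why higher N demands tighter control of the stored charge.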

Figure 34 depicts a block diagram of VMM system 3400. VMM system 3400 comprises VMM array 3401, row decoder 3402, high voltage decoder 3403, column decoder 3404, bit line drivers 3405, input circuit 3406, output circuit 3407, control logic 3408, and bias generator 3409. VMM system 3400 further comprises high voltage generation block 3410, which comprises charge pump 3411, charge pump regulator 3412, and high voltage analog precision level generator 3413. VMM system 3400 further comprises (program/erase, or weight tuning) algorithm controller 3414, analog circuitry 3415, control engine 3416 (which may include, without limitation, special functions such as arithmetic functions, activation functions, and embedded microcontroller logic), and test control logic 3417.

Input circuit 3406 may include circuits such as a digital-to-analog converter (DAC), a digital-to-pulse converter (DPC, a digital-to-time-modulated-pulse converter), an analog-to-analog converter (AAC, such as a current-to-voltage converter or a logarithmic converter), a pulse-to-analog-level converter (PAC), or any other type of converter. Input circuit 3406 may implement one or more of a normalization function, a linear or non-linear up/down scaling function, or an arithmetic function. Input circuit 3406 may implement a temperature compensation function for the input levels. Input circuit 3406 may implement an activation function such as ReLU or sigmoid.

Output circuit 3407 may include circuits such as an analog-to-digital converter (ADC, to convert neuron analog output to digital bits), an analog-to-analog converter (AAC, such as a current-to-voltage converter or a logarithmic converter), an analog-to-pulse converter (APC, an analog-to-time-modulated-pulse converter), or any other type of converter. Output circuit 3407 may implement an activation function such as a rectified linear activation function (ReLU) or sigmoid. Output circuit 3407 may implement one or more of statistical normalization, regularization, up/down scaling or gain functions, statistical rounding, or arithmetic functions (for example, add, subtract, divide, multiply, shift, log) on the neuron outputs. Output circuit 3407 may implement a temperature compensation function for the neuron outputs or array outputs (such as bit line outputs), either to keep the power consumption of the array approximately constant or to improve the precision of the array (neuron) outputs, such as by keeping the I-V slope approximately the same.

As applications of artificial neural networks become more sophisticated, there is a growing need for larger VMM arrays. At the same time, there is a need to use the space within a packaged integrated circuit efficiently and to conserve as much power as possible, while still maintaining accuracy so that each of the N different weights can still be stored and read properly.

Numerous examples are described for providing an artificial neural network system comprising a three-dimensional integrated circuit that includes one or more VMM arrays.

3D VMM System Architecture

Figure 35 depicts 3D VMM system 3500, which comprises a plurality of dies, such as dies 3501, 3502, 3503, 3504, 3505, and 3506, stacked vertically within package 3522 to form a packaged integrated circuit. 3D VMM system 3500 comprises certain functional blocks that are functionally similar to the blocks contained in VMM system 3400 in Figure 34, but those blocks may be located on different dies. Here, the components contained in dies 3501 and 3502 share the components contained in dies 3503, 3504, 3505, and 3506.

In this embodiment, die 3501 contains a respective VMM array 3507 (functionally similar to VMM array 3401 in Figure 34), a respective input multiplexer 3509, a respective row buffer 3523 (which can provide, for example, a sample-and-hold buffered voltage to the array inputs), a respective high voltage decoder 3508 (functionally similar to high voltage decoder 3403 in Figure 34), and a respective neuron circuit 3510 (which can perform, for example and without limitation, a scaling function on the array output current, a minimum/maximum limiting function, differential output conversion, and buffering). Input multiplexer 3509 receives analog input signals and applies them to VMM array 3507, and neuron circuit 3510 receives from VMM array 3507 analog output signals that represent the neuron outputs. The output signals can be sent to other blocks within 3D VMM system 3500.

Die 3502 also contains a respective VMM array 3507, a respective input multiplexer 3509, a respective row buffer 3523, a respective high voltage decoder 3508, and a respective neuron circuit 3510.

In this embodiment, two dies (dies 3501 and 3502) contain respective VMM arrays 3507, but it is to be understood that additional dies containing respective VMM arrays can be included.

Die 3503 contains high voltage generator 3511 (functionally similar to high voltage generation block 3410 in Figure 34), analog circuitry 3512 (functionally similar to analog circuitry 3415 in Figure 34), and temperature compensation circuit 3513. 3D VMM system 3500 may face thermal challenges that 2D VMM system 3400 does not. Because dies 3501, 3502, 3503, 3504, 3505, and 3506 are stacked in a vertical configuration and contain different types of circuits, each die may experience different thermal operating conditions during operation. For example, some dies will become hotter than others, and the rate of temperature increase may vary among the dies. This introduces the possibility of inaccuracies due to thermal changes. Temperature compensation circuit 3513 compensates for the temperature changes experienced among the various dies. Optionally, one or more thermal sensors are located on each respective die to provide temperature data to temperature compensation circuit 3513. Temperature compensation circuit 3513 then changes trim or configuration settings to compensate for any change in temperature. Temperature compensation circuit 3513 also compensates for temperature changes within each die, for example by compensating for changes in cell current over temperature so that the resulting bit line (neuron) current stays approximately the same over temperature. Temperature compensation circuit 3513 is also used for neuron circuit 3510 and the DAC and ADC circuits, so that their operating dynamic ranges (for example, the output range of the DAC, the input range of the ADC, and the output range of neuron circuit 3510) stay approximately the same over temperature.

Die 3504 contains input circuit 3514 (functionally similar to input circuit 3406), which includes address decoding circuit 3524, row registers 3525 (which hold activation input values for the array rows), and digital-to-analog converter (DAC) 3515. DAC 3515 receives digital signals from row registers 3525 and converts them into analog signals.

Die 3505 contains analog-to-digital converter (ADC) 3516. ADC 3516 receives analog signals and converts them into digital signals.

Die 3506 contains digital circuitry 3517, static random access memory (SRAM) 3518, registers 3519, physical I/O connections 3520, digital accelerator 3531, and network-on-chip (NOC) 3715. Die 3506 provides control functions for the other dies. Digital circuitry 3517 may include digital logic, a microcontroller, a single instruction multiple data (SIMD) processor, and a processor. SRAM 3518 and registers 3519 can be used to store system information and configuration information used by digital circuitry 3517 or by other circuits or blocks in 3D VMM system 3500. Physical I/O connections 3520 provide an I/O interface to devices outside of 3D VMM system 3500 (such as an external processing unit) or to another package 3522. Digital accelerator 3531 is used for certain neural networks, or certain layers within a neural network, where extra processing may be needed, for example, and without limitation, when the activation size is small, when the weights stored in the cells are dynamic rather than fixed, or when MAC operations need to be performed. NOC 3715 provides network routing functions within 3D VMM system 3500, for example by generating control signals that route signals from one block to another.

Each of the plurality of dies is connected to one or more other dies of the plurality of dies through vertical interfaces 3521, each of which connects two or more dies together. In one embodiment, vertical interfaces 3521 are implemented as through-silicon vias (TSVs).

During a read operation of 3D VMM system 3500, digital inputs are received by input circuit 3514. The digital inputs enable row registers 3525, which store the activation inputs and, in response to the digital inputs, apply the selected activation inputs to DAC 3515. DAC 3515 converts the digital outputs of row registers 3525 into respective analog signals. The analog signals generated by DAC 3515 are provided by input circuit 3514, through one or more vertical interfaces 3521, to input multiplexer 3509 and row buffer 3523 on one or more of dies 3501 and 3502, which then apply the signals to one or more rows in the respective VMM array 3507, causing an output to be generated by that VMM array 3507. The output from the respective VMM array 3507 is received by the respective neuron circuit 3510, which provides a buffering function to drive the parasitic capacitance of the one or more vertical interfaces 3521 to which it is connected. Neuron circuit 3510 provides analog signals through one or more vertical interfaces 3521 to ADC 3516 on die 3505, which converts the analog signals into digital signals. The output of ADC 3516 is provided through physical I/O 3520 to a device outside of 3D VMM system 3500 (such as a processing unit or a graphics processing unit), or is applied as an input to a respective VMM array 3507 (representing another layer in the artificial neural network). Alternatively, the analog signals from neuron circuit 3510 can bypass ADC 3516, remain in analog form, and be applied as inputs to a respective VMM array 3507.
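The read path above (digital input, DAC, VMM array, neuron circuit, ADC) can be sketched as a chain of simple functions. All of the following are assumptions made for illustration rather than details of the described hardware: the ideal linear DAC and ADC transfer functions, the 8-bit resolution, the normalized reference and full-scale values, and the function names. Vertical interfaces, multiplexing, and buffering are not modeled.

```python
def dac(code, bits, v_ref):
    # Ideal DAC: maps a digital code onto a level between 0 and v_ref.
    return v_ref * code / (2 ** bits - 1)

def vmm(row_inputs, weights):
    # weights[row][col]; each column (bit line) sums row input * weight.
    n_cols = len(weights[0])
    return [sum(v * weights[r][c] for r, v in enumerate(row_inputs))
            for c in range(n_cols)]

def neuron(outputs, scale, limit):
    # Scaling plus min/max limiting of the array outputs.
    return [max(0.0, min(limit, scale * o)) for o in outputs]

def adc(value, bits, full_scale):
    # Ideal ADC: maps a level between 0 and full_scale onto a digital code.
    code = round(value / full_scale * (2 ** bits - 1))
    return max(0, min(2 ** bits - 1, code))

def read(codes, weights, bits=8, v_ref=1.0, scale=1.0, limit=1.0):
    levels = [dac(c, bits, v_ref) for c in codes]        # digital to analog
    outs = neuron(vmm(levels, weights), scale, limit)    # array and neuron
    return [adc(o, bits, limit) for o in outs]           # analog to digital

digital_out = read([255, 0], [[1.0, 0.0], [0.5, 0.5]])
```

The digital codes returned by `read` could be fed back in as the `codes` of a second call with a different weight matrix, which mirrors applying the ADC output as the input to another layer.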

Figure 36 depicts 3D VMM system 3600, which comprises package 3622. 3D VMM system 3600 is similar to 3D VMM system 3500 and contains many of the same components, except for some differences in the placement of components and certain additional components. Items that are the same as in Figure 35 bear the same reference numbers in Figure 36. Here, the components contained in dies 3601 and 3602 share the components contained in dies 3603, 3604, 3605, and 3606.

3D VMM system 3600 comprises a plurality of dies, such as dies 3601, 3602, 3603, 3604, 3605, and 3606, stacked vertically within common package 3522 to form a packaged integrated circuit.

In this embodiment, die 3601 contains a respective VMM array 3507, a respective input multiplexer 3509, a respective register 3524 (which holds activation input values for the array rows), a respective row buffer 3523, a respective high voltage multiplexer 3608, and a respective column multiplexer 3610.

Die 3602 also contains a respective VMM array 3507, a respective input multiplexer 3509, a respective register 3524, a respective row buffer 3523, a respective high voltage multiplexer 3608, and a column multiplexer 3610.

In this embodiment, two dies (dies 3601 and 3602) contain respective VMM arrays 3507, but it is to be understood that additional dies containing respective VMM arrays can be included.

Die 3603 contains high voltage generator 3511, analog circuitry 3512, temperature compensation circuit 3513, and high voltage multiplexer 3608.

Die 3604 contains input circuit 3614, which includes address decoder 3524 and DAC 3515.

Die 3605 contains neuron circuit 3510 and ADC 3516.

Die 3606 contains digital circuitry 3517, SRAM 3518, registers 3519, and physical I/O connections 3520.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies through respective vertical interfaces 3521.

During a read operation of 3D VMM system 3600, input circuit 3614 receives an input and an address. Address decoder 3524 decodes the address and, via one or more vertical interfaces 3521, provides the input to the respective row register 3525 corresponding to the decoded address. The output of the respective row register 3525 is coupled via one or more vertical interfaces 3521 to DAC 3515, which converts the digital output received from row register 3525 into an analog signal and provides the analog signal, via one or more vertical interfaces 3521, to the respective input multiplexer 3509 and row buffer 3523. Row buffer 3523 applies the buffered analog signal to the row input of the respective VMM array 3507 for the selected row (such as an array control gate or word line). Outputs from the respective VMM array 3507 (such as from array bit lines) are received by the respective column multiplexer 3610, whose output is provided via one or more vertical interfaces 3521 to ADC 3516 on die 3605, which converts the analog signal into a digital signal. Alternatively, the signal can bypass ADC 3516 and remain in analog form. The output of ADC 3516 is provided via one or more vertical interfaces 3521 to digital circuit 3517 on die 3606 (which can perform an activation function, a pooling function, or other network functions). The output of digital circuit 3517 can be provided, via respective vertical interfaces 3521 and through physical I/O 3520 on die 3606, to a component external to 3D VMM system 3600 (such as a processing unit or graphics processing unit), or to input circuit 3614 on die 3604 as an input to another respective VMM array 3507 (representing another layer in the artificial neural network).
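The read path described above (digital input, DAC, VMM array, ADC, digital circuit) can be sketched numerically. The following Python model is only an illustration under stated assumptions, not the circuit of the embodiment: the function names, the bit widths, the full-scale values, the signed ADC coding, and the choice of ReLU as the activation function are all hypothetical.

```python
def dac(code, bits=4, v_ref=1.0):
    """Ideal DAC (assumed): map an unsigned digital code to an analog level."""
    return v_ref * code / (2**bits - 1)


def vmm(inputs, weights):
    """Ideal array (assumed): each bit line sums input * weight over all inputs.

    weights[r][c] is the weight stored at input line r and bit line c.
    """
    n_bit_lines = len(weights[0])
    return [
        sum(x * w_row[c] for x, w_row in zip(inputs, weights))
        for c in range(n_bit_lines)
    ]


def adc(value, bits=8, full_scale=4.0):
    """Ideal signed ADC (assumed): clamp to +/- full_scale, then quantize."""
    clamped = min(max(value, -full_scale), full_scale)
    return round(clamped / full_scale * (2 ** (bits - 1) - 1))


def relu(code):
    """Example activation function performed in the digital domain."""
    return max(code, 0)


def read_operation(digital_inputs, weights):
    """Digital inputs -> DAC -> array -> ADC -> activation."""
    analog_inputs = [dac(code) for code in digital_inputs]
    bit_line_outputs = vmm(analog_inputs, weights)
    return [relu(adc(v)) for v in bit_line_outputs]


if __name__ == "__main__":
    example_weights = [[1.0, -1.0], [1.0, -1.0], [2.0, -2.0]]
    print(read_operation([15, 15, 15], example_weights))
```

The list returned by `read_operation` could be fed back in as the digital input of another call, mirroring the way the output of digital circuit 3517 can become the input to another VMM array representing another layer.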

Figure 37A depicts 3D VMM system 3700, which comprises a plurality of dies in two or more vertical stacks. In this embodiment, a first vertical stack comprises dies 3701, 3702, 3703, 3704, 3705, and 3706, and a second vertical stack comprises dies 3707, 3708, 3709, 3710, 3711, and 3712, all contained in common package 3522 to form a single packaged integrated circuit. In one embodiment, the dies in the first vertical stack are physically separate from the dies in the second vertical stack. In another embodiment, the dies in the first vertical stack and the dies in the second vertical stack are the same physical dies (meaning, for example, that die 3701 and die 3707 are the same die). Here, the components contained in dies 3701 and 3702 share the components contained in dies 3703, 3704, 3705, and 3706, and the components contained in dies 3707 and 3708 share the components contained in dies 3709, 3710, 3711, and 3712.

In the embodiment shown, die 3701 and die 3707 each contain a respective VMM array 3507, a respective array input 3729 (which includes a respective input multiplexer 3509, a respective address decoder 3713, a respective row register 3525, and a respective row buffer 3523), a respective high-voltage multiplexer 3608, and a respective neuron circuit 3510.

Die 3702 and die 3708 likewise each contain a respective VMM array 3507, a respective array input 3729 (which includes a respective input multiplexer 3509, a respective address decoder 3713, a respective row register 3525, and a respective row buffer 3523), a respective high-voltage multiplexer 3608, and a respective neuron circuit 3510.

In this embodiment, four dies (dies 3701, 3702, 3707, and 3708) contain respective VMM arrays 3507, but it is to be understood that additional dies containing VMM arrays may be included.

Die 3703 and die 3709 each contain a respective high-voltage decoder 3714, a respective high-voltage generator 3511, respective analog circuitry 3512, and a respective temperature compensation circuit 3513.

Die 3704 and die 3710 each contain a respective input circuit 3514 that includes a respective DAC 3515.

Die 3705 and die 3711 each contain a respective ADC 3516.

Die 3706 and die 3712 each contain respective digital circuits 3517, a respective SRAM 3518, respective registers 3519, respective physical I/O connections 3520, and respective network-on-chip (NOC) connections 3715.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together. In one embodiment, each vertical interface 3521 is a through-silicon via (TSV). In one embodiment, each horizontal interface 3716 is a redistribution layer (RDL) connection.

During a read operation of 3D VMM system 3700, a digital input is received by a respective input circuit 3514, which converts the digital input into an analog signal using its respective DAC 3515. The output of the respective DAC 3515 is coupled, via respective vertical interfaces 3521 and/or horizontal interfaces 3716, to row register 3525 of the respective array input 3729. The output of row register 3525 of the respective array input 3729 is provided to input multiplexer 3509 and row buffer 3523, which then apply the signal to one or more rows in VMM array 3507. The output from the respective VMM array 3507 is received by the respective neuron circuit 3510, which provides a buffering function to drive the parasitic capacitance of the one or more vertical interfaces 3521 or horizontal interfaces 3716 connected to it. Neuron circuit 3510 provides the buffered analog signal, via one or more vertical interfaces 3521 or horizontal interfaces 3716, to the respective ADC 3516, which converts the analog signal into a digital signal. The output of ADC 3516 is provided to the respective digital circuit 3517 (which performs an activation function, a pooling function, or a network function). The output of the respective digital circuit 3517 can be provided via the respective physical I/O 3520 to a component external to 3D VMM system 3700 (such as a processing unit or graphics processing unit), or to a respective input circuit 3514 to be converted by the respective DAC 3515, with the output of that input circuit 3514 coupled to another VMM array 3507 (representing another layer in the artificial neural network) or, via the respective physical I/O 3520, to another package 3522.
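The partitioning of the first vertical stack of 3D VMM system 3700 can be written down as a small data structure, which makes it easy to see how many times a read signal moves between dies. The sketch below is illustrative only: the dictionary layout, the component strings, and the helper names `die_for` and `trace_read_path` are assumptions introduced here, and the read path is a simplified reading of the read operation described above.

```python
# Components per die in the first vertical stack, as listed in the text.
ARRAY_DIE_PARTS = {
    "VMM array 3507",
    "array input 3729",
    "high-voltage multiplexer 3608",
    "neuron circuit 3510",
}

STACK_3700 = {
    3701: set(ARRAY_DIE_PARTS),
    3702: set(ARRAY_DIE_PARTS),
    3703: {
        "high-voltage decoder 3714",
        "high-voltage generator 3511",
        "analog circuitry 3512",
        "temperature compensation circuit 3513",
    },
    3704: {"input circuit 3514"},  # includes DAC 3515
    3705: {"ADC 3516"},
    3706: {
        "digital circuit 3517",
        "SRAM 3518",
        "registers 3519",
        "physical I/O 3520",
        "NOC connection 3715",
    },
}

# Simplified order in which a read signal visits components (assumption).
READ_PATH = [
    "input circuit 3514",
    "array input 3729",
    "VMM array 3507",
    "neuron circuit 3510",
    "ADC 3516",
    "digital circuit 3517",
]


def die_for(component, array_die):
    """Return the die holding a component, preferring the selected array die."""
    if component in STACK_3700[array_die]:
        return array_die
    for die, parts in STACK_3700.items():
        if component in parts:
            return die
    raise KeyError(component)


def trace_read_path(array_die=3701):
    """Return the dies visited in order and the number of die-to-die hops."""
    dies = [die_for(component, array_die) for component in READ_PATH]
    hops = sum(1 for a, b in zip(dies, dies[1:]) if a != b)
    return dies, hops


if __name__ == "__main__":
    print(trace_read_path(3701))
```

Under these assumptions each die-to-die hop corresponds to at least one vertical interface 3521 or horizontal interface 3716 that the signal must cross.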

Figure 37B depicts 3D VMM system 3750, which is similar to 3D VMM system 3700 except that it has another type of VMM array, shown as VMM array 3557 on die 3758, which comprises static RAM cells or dynamic RAM cells.

Figure 38 depicts 3D VMM system 3800, which comprises a plurality of dies in two vertical stacks. More than two stacks are possible. In this embodiment, a first vertical stack comprises dies 3801, 3802, 3803, 3804, 3805, and 3806, and a second vertical stack comprises dies 3807, 3808, 3809, 3810, 3811, and 3812, all contained in common package 3522 to form a single packaged integrated circuit. In one embodiment, the dies in the first vertical stack are physically separate from the dies in the second vertical stack. In another embodiment, the dies in the first vertical stack and the dies in the second vertical stack are the same physical dies (meaning, for example, that die 3801 and die 3807 are the same die). Here, the components contained in dies 3801 and 3802 share the components contained in dies 3803, 3804, 3805, and 3806, and the components contained in dies 3807 and 3808 share the components contained in dies 3809, 3810, 3811, and 3812.

In the embodiment shown, dies 3801, 3802, 3807, and 3808 each contain a respective VMM array 3507, a respective array input 3729 (also referred to as array input circuit 3729), a respective high-voltage multiplexer 3608, and a respective column multiplexer 3610. The respective array input 3729 comprises input multiplexer 3509, address decoder 3713, row register 3524, and row buffer 3523.

In this embodiment, four dies (dies 3801, 3802, 3807, and 3808) contain VMM arrays, but it is to be understood that additional dies containing VMM arrays may be included.

Die 3803 and die 3809 each contain high-voltage decoder 3714, high-voltage generator 3511, analog circuitry 3512, and temperature compensation circuit 3513.

Die 3804 and die 3810 each contain input circuit 3514, which includes DAC 3515.

Die 3805 and die 3811 each contain ADC 3516 and neuron circuit 3510.

Die 3806 and die 3812 each contain digital circuits 3517, SRAM 3518, registers 3519, physical I/O connections 3520, and NOC connections 3715.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together. In one embodiment, the respective vertical interfaces 3521 are implemented as through-silicon vias (TSVs). In one embodiment, the respective horizontal interfaces 3716 are implemented as redistribution layer (RDL) connections.

During a read operation of 3D VMM system 3800, a digital input is received by input circuit 3514, which uses DAC 3515 to convert the digital input into analog form and provides the analog signal, via respective vertical interfaces 3521 and/or horizontal interfaces 3716, to the respective array input circuit 3729. Array input circuit 3729 receives the analog signal from the input circuit, together with an address, and in response to the address applies the analog signal to a selected row in VMM array 3507. The output from VMM array 3507 is received by column multiplexer 3610, which provides the analog signal, via one or more vertical interfaces 3521 or horizontal interfaces 3716, to ADC 3516, which converts the analog signal into a digital signal. Alternatively, the analog signal can bypass ADC 3516 and remain in analog form. The output of ADC 3516 is provided via one or more vertical interfaces 3521 or horizontal interfaces 3716 to digital circuit 3517 (which can perform an activation function, a pooling function, or other network functions). The output of digital circuit 3517 can be provided via physical I/O 3520 to a component external to 3D VMM system 3800 (such as a processing unit or graphics processing unit), to input circuit 3514 of another VMM array 3507 (representing another layer in the artificial neural network), or via physical I/O 3520 to another package 3522.

Figure 39A depicts 3D VMM system 3900, which comprises a plurality of dies in two vertical stacks. In this embodiment, a first vertical stack comprises dies 3901, 3902, 3903, and 3904, and a second vertical stack comprises dies 3905, 3906, 3907, and 3908, all contained in common package 3522 to form a single packaged integrated circuit. In one embodiment, the dies in the first vertical stack are physically separate from the dies in the second vertical stack. In another embodiment, the dies in the first vertical stack and the dies in the second vertical stack are the same physical dies (meaning, for example, that die 3901 and die 3905 are the same die). Here, the components contained in dies 3901 and 3902 share the components contained in dies 3903 and 3904, and the components contained in dies 3905 and 3906 share the components contained in dies 3907 and 3908.

In the embodiment shown, dies 3901, 3902, 3905, and 3906 each contain VMM array 3507, array input 3729, high-voltage multiplexer 3608, and column multiplexer 3610.

In this embodiment, four dies (dies 3901, 3902, 3905, and 3906) contain VMM arrays, but it is to be understood that additional dies containing VMM arrays may be included.

Die 3903 and die 3907 each contain high-voltage decoder 3714, high-voltage generator 3511, analog circuitry 3512, temperature compensation circuit 3513, and input circuit 3514 comprising DAC 3515, neuron circuit 3510, and ADC 3516.

Die 3904 and die 3908 each contain digital circuits 3517, SRAM 3518, registers 3519, physical I/O connections 3520, and NOC connections 3715.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together.

During a read operation of 3D VMM system 3900, a digital input is received by input circuit 3514, which uses DAC 3515 to convert the digital input into analog form and provides the analog signal, via respective vertical interfaces 3521 and/or horizontal interfaces 3716, to the respective array input circuit 3729. Array input circuit 3729 receives the analog signal from the input circuit, together with an address, and in response to the received address applies the analog signal to a selected row in VMM array 3507. The output from VMM array 3507 is received by column multiplexer 3610, which provides the analog signal, via one or more vertical interfaces 3521 or horizontal interfaces 3716, to ADC 3516, which converts the analog signal into a digital signal. Alternatively, the analog signal can bypass ADC 3516 and remain in analog form. The output of ADC 3516 is provided via respective vertical interfaces 3521 and/or horizontal interfaces 3716 to digital circuit 3517, and then via physical I/O 3520 to a component external to 3D VMM system 3900 (such as a processing unit or graphics processing unit), to input circuit 3514 to be applied as an input to a VMM array 3507 (representing another layer in the artificial neural network), or via physical I/O 3520 to another package 3522.

Figure 39B depicts 3D VMM system 3950, which is similar to the 3D VMM system of Figure 39A except that dies 3951, 3952, 3955, and 3956 now contain both input circuit 3514 and array input 3729. 3D VMM system 3950 comprises package 3522 and dies 3951, 3952, 3953, 3954, 3955, 3956, 3957, and 3958. Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together.

Figure 39C depicts 3D VMM system 3980, which comprises a plurality of dies in two vertical stacks. In this embodiment, a first vertical stack comprises dies 3981, 3982, and 3983, and a second vertical stack comprises dies 3984, 3985, and 3986, all contained in common package 3522 to form a single packaged integrated circuit. In one embodiment, the dies in the first vertical stack are physically separate from the dies in the second vertical stack. In another embodiment, the dies in the first vertical stack and the dies in the second vertical stack are the same physical dies (meaning, for example, that die 3981 and die 3984 are the same die). Here, the components contained in dies 3981 and 3982 share the components contained in die 3983, and the components contained in dies 3984 and 3985 share the components contained in die 3986.

In the embodiment shown, dies 3981, 3982, 3984, and 3985 each contain VMM array 3507, high-voltage block 3991, input block 3990, output block 3992, and analog block 3993. Input block 3990 may include input circuit 3514, DAC 3515, and array input circuit 3729. High-voltage block 3991 may include high-voltage multiplexer 3608 and high-voltage decoder 3714. Output block 3992 may include column multiplexer 3610, neuron circuit 3510, and ADC 3516. Analog block 3993 may include high-voltage generator 3511, analog circuitry 3512, and temperature compensation circuitry 3513.

Die 3983 and die 3986 each contain digital circuits 3517, SRAM 3518, registers 3519, physical I/O connections 3520, digital accelerator 3521 (which performs multiply-accumulate (MAC) functions digitally), and NOC connections 3715.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together.

Figure 40 depicts 3D VMM system 4000, which comprises a plurality of dies in two vertical stacks. In this embodiment, a first vertical stack comprises dies 4001, 4002, 4003, and 4004, and a second vertical stack comprises dies 4005, 4006, 4007, and 4008, all contained in common package 3522 to form a single packaged integrated circuit. In one embodiment, the dies in the first vertical stack are physically separate from the dies in the second vertical stack. In another embodiment, the dies in the first vertical stack and the dies in the second vertical stack are the same physical dies (meaning, for example, that die 4001 and die 4005 are the same die). Here, the components contained in dies 4001 and 4002 share the components contained in dies 4003 and 4004, and the components contained in dies 4005 and 4006 share the components contained in dies 4007 and 4008.

In the embodiment shown, dies 4001, 4002, 4005, and 4006 each contain VMM array 3507, array input 4029 (which includes input multiplexer 3509 and/or decoder 3713, not shown), high-voltage multiplexer 3608, and column multiplexer 3610.

In this embodiment, four dies (dies 4001, 4002, 4005, and 4006) contain VMM arrays, but it is to be understood that additional dies containing VMM arrays may be included.

Die 4003 and die 4007 each contain high-voltage decoder 3714, high-voltage generator 3511, analog circuitry 3512, temperature compensation circuit 3513, and neuron circuit 3510.

Die 4004 and die 4008 each contain digital circuits 3517, SRAM 3518, registers 3519, physical I/O connections 3520, and NOC connections 3715.

Unlike VMM system 3900, VMM system 4000 does not contain DAC 3515 or ADC 3516, because its inputs and outputs remain in analog form and are not converted between analog and digital form.

Respective dies of the plurality of dies are connected to one or more other dies of the plurality of dies via one or more vertical interfaces 3521 or horizontal interfaces 3716, each of which connects two or more dies together.

During a read operation of 3D VMM system 4000, an analog input (such as a voltage, a current, or a timing-based entity such as a series of pulses) is received by the respective array input 4029, which then applies the signal to one or more rows in the respective VMM array 3507. The output from the respective VMM array 3507 is received by column multiplexer 3610, which provides an analog signal (such as a voltage, a current, or a timing-based entity), via one or more vertical interfaces 3521 or horizontal interfaces 3716, to the respective neuron circuit 3510. Neuron circuit 3510 provides the buffered signal via physical I/O 3520 to a component external to 3D VMM system 4000 (such as a processing unit or graphics processing unit), to another package 3522, or to array input 4029 of another VMM array 3507 (representing another layer in the artificial neural network).

Figures 41 to 44 depict additional detail regarding example configurations of VMM array 3507. Figures 41 to 44 depict 3D VMM systems 4100, 4200, 4300, and 4400, respectively. Each of 3D VMM systems 4100, 4200, 4300, and 4400 comprises VMM array 3507-1 on a first die and VMM array 3507-2 on a second die in a common package (not shown). VMM arrays 3507-1 and 3507-2 each contain an array of non-volatile memory cells arranged in m+1 rows and n+1 columns. Each row is coupled to one of the control gate lines labeled CG0...CGm, and each column is coupled to one of the bit lines labeled BL0...BLn. A cell is located at each intersection of a bit line and a control gate line. For example, cell 4101mn is located in row m and column n and is coupled to CGm and BLn in VMM array 3507-1, and cell 4102mn is located in row m and column n and is coupled to CGm and BLn in VMM array 3507-2. Because VMM arrays 3507-1 and 3507-2 are located on different dies, they can optionally be manufactured using different semiconductor processes. Regardless of whether the same or different semiconductor processes are used to manufacture VMM arrays 3507-1 and 3507-2, the cells in VMM array 3507-1 can store a different number of bits than the cells in VMM array 3507-2. For example, a cell in VMM array 3507-1, such as cell 4101mn, can store i bits, while a cell in VMM array 3507-2, such as cell 4102mn, can store j bits, where i and j are integers of different values. For example, i can be 3 (meaning that each cell in VMM array 3507-1 stores a 3-bit value) and j can be 5 (meaning that each cell in VMM array 3507-2 stores a 5-bit value).
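To make the i = 3 and j = 5 example concrete, the sketch below counts the distinct values a cell can hold and shows how the precision of a stored weight differs between the two arrays. It is a hypothetical illustration: the text states only the number of bits per cell, so the uniform spacing of levels over a normalized range of 0 to 1 and the function names are assumptions.

```python
def levels_per_cell(bits):
    """Number of distinct values an n-bit cell can store."""
    return 2**bits


def quantize_weight(weight, bits):
    """Snap a normalized weight in [0, 1] to the nearest of 2**bits levels.

    Uniformly spaced levels are an assumption made for illustration.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be normalized to [0, 1]")
    top = levels_per_cell(bits) - 1
    return round(weight * top) / top


if __name__ == "__main__":
    target = 0.4
    for name, bits in (("VMM array 3507-1", 3), ("VMM array 3507-2", 5)):
        stored = quantize_weight(target, bits)
        print(name, levels_per_cell(bits), stored, abs(stored - target))
```

With these assumptions a 3-bit cell offers 8 levels and a 5-bit cell offers 32, so the same target weight is stored with a smaller rounding error in the 5-bit array.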

In Figure 41, inputs are provided to VMM arrays 3507-1 and 3507-2 separately via the control gate lines, and outputs are obtained separately on the bit lines. Figure 41 depicts example cells 4101mn and 4102mn. Optionally, the outputs can be combined elsewhere as needed, such as by using ADC 3516 (shown in previous figures) to convert the outputs into digital form and using digital circuit 3517 (shown in previous figures) to add the outputs.

In Figure 42, inputs are provided to VMM arrays 3507-1 and 3507-2 separately via the control gate lines. The output of VMM array 3507-1 is obtained on a first set of bit lines, and the output of VMM array 3507-2 is obtained on a second set of bit lines, where the first set of bit lines is coupled to the second set of bit lines by respective vertical interfaces 3521, which effectively add the output signals together in analog form. Optionally, the combined analog output can be digitized elsewhere as needed, such as by using ADC 3516 (shown in previous figures) to convert it into digital form.

In Figure 43, inputs are provided to VMM array 3507-1 via a first set of control gate lines and to VMM array 3507-2 via a second set of control gate lines, where the first set of control gate lines is coupled to the second set of control gate lines by respective vertical interfaces 3521, and outputs are obtained separately on the bit lines. Optionally, the outputs can be combined elsewhere as needed, such as by using ADC 3516 (shown in previous figures) to convert the outputs into digital form and using digital circuit 3517 (shown in previous figures) to add the outputs.

In Figure 44, inputs are provided jointly to VMM arrays 3507-1 and 3507-2 via the control gate lines, which are coupled through vertical interfaces 3521. Outputs are obtained on the bit lines and are combined via vertical interfaces 3521, which effectively add the signals together in analog form. Optionally, the combined analog output can be digitized elsewhere as needed, such as by using ADC 3516 (shown in previous figures) to convert it into digital form.
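The four configurations of Figures 41 to 44 differ only in whether the two arrays share inputs and whether their bit-line outputs are summed. A minimal sketch, assuming ideal arrays whose bit-line outputs are sums of input times weight and assuming that coupled bit lines simply add their outputs; the function names and the boolean flags are hypothetical:

```python
def bit_line_outputs(inputs, weights):
    """Ideal array: output on each bit line is the sum of input * weight."""
    return [
        sum(x * row[c] for x, row in zip(inputs, weights))
        for c in range(len(weights[0]))
    ]


def two_array_read(x1, x2, w1, w2, share_inputs, sum_outputs):
    """Model two stacked arrays.

    share_inputs: control gate lines coupled, so array 2 sees array 1's inputs.
    sum_outputs:  bit lines coupled, so the two outputs add per bit line.
    Returns a list holding either two output vectors or one summed vector.
    """
    if share_inputs:
        x2 = x1
    out1 = bit_line_outputs(x1, w1)
    out2 = bit_line_outputs(x2, w2)
    if sum_outputs:
        return [[a + b for a, b in zip(out1, out2)]]
    return [out1, out2]


# (share_inputs, sum_outputs) for each figure, as read from the text.
CONFIGURATIONS = {
    "Figure 41": (False, False),
    "Figure 42": (False, True),
    "Figure 43": (True, False),
    "Figure 44": (True, True),
}

if __name__ == "__main__":
    w1 = [[1, 2], [3, 4]]
    w2 = [[5, 6], [7, 8]]
    for name, (share, total) in CONFIGURATIONS.items():
        print(name, two_array_read([1, 1], [2, 0], w1, w2, share, total))
```

Summing on coupled bit lines doubles the effective number of rows contributing to each output, while coupling the control gate lines lets one set of drivers feed both arrays.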

Figures 45 to 48 depict example structural layout options for the dies, vertical interfaces, and horizontal interfaces.

Figure 45 depicts 3D VMM system 4500, which comprises a first set of dies (dies 4501, 4502, 4503, and 4504) arranged in a vertical configuration and connected by vertical interfaces 3521, and a second set of dies (dies 4505, 4506, 4507, and 4508) arranged in a vertical configuration and connected by vertical interfaces 3521. Here, one die can be connected to two other dies by respective ones of vertical interfaces 3521.

Figure 46 depicts 3D VMM system 4600, which comprises four levels of dies arranged in a vertically staggered configuration, where the first level comprises dies 4601 and 4602, the second level comprises dies 4603, 4604, and 4605, the third level comprises dies 4606 and 4607, and the fourth level comprises dies 4608, 4609, and 4610. Dies in different levels are connected by respective vertical interfaces 3521 to dies in the level above and to dies in the level below. Here, one die can be connected to four other dies by respective ones of vertical interfaces 3521.

Figure 47 depicts 3D VMM system 4700, which comprises four levels of dies arranged in a vertically staggered configuration, where the first level comprises dies 4701 and 4702, the second level comprises dies 4703, 4704, and 4705, the third level comprises dies 4706 and 4707, and the fourth level comprises dies 4708, 4709, and 4710. Dies in different levels are connected by respective vertical interfaces 3521, and dies in the same level are connected by respective horizontal interfaces 3716. Here, one die can be connected to six other dies by respective ones of vertical interfaces 3521 and respective ones of horizontal interfaces 3716.
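The connection counts given for Figures 45 to 47 (two, four, and six other dies) can be reproduced with a simple geometric model. The sketch below is an assumption-laden illustration: the horizontal positions assigned to each die, the unit die width, the half-pitch stagger between adjacent levels, and the function name `neighbors` are all hypothetical and are not taken from the figures.

```python
# Each level maps die number -> assumed horizontal position of the die center.
LEVELS_45 = [{4501: 0.0}, {4502: 0.0}, {4503: 0.0}, {4504: 0.0}]

LEVELS_46 = [
    {4601: 0.5, 4602: 1.5},
    {4603: 0.0, 4604: 1.0, 4605: 2.0},
    {4606: 0.5, 4607: 1.5},
    {4608: 0.0, 4609: 1.0, 4610: 2.0},
]

LEVELS_47 = [
    {4701: 0.5, 4702: 1.5},
    {4703: 0.0, 4704: 1.0, 4705: 2.0},
    {4706: 0.5, 4707: 1.5},
    {4708: 0.0, 4709: 1.0, 4710: 2.0},
]


def neighbors(levels, die, horizontal=False):
    """Return the dies that the given die can connect to.

    Vertical interfaces join dies in adjacent levels that overlap (centers
    less than one die width apart). Horizontal interfaces, if enabled, join
    dies that sit side by side in the same level.
    """
    for index, level in enumerate(levels):
        if die in level:
            break
    else:
        raise KeyError(die)
    x = level[die]
    found = set()
    for adjacent in (index - 1, index + 1):
        if 0 <= adjacent < len(levels):
            found |= {d for d, xd in levels[adjacent].items() if abs(xd - x) < 1.0}
    if horizontal:
        found |= {d for d, xd in level.items() if abs(xd - x) == 1.0}
    return found


if __name__ == "__main__":
    print(sorted(neighbors(LEVELS_45, 4502)))
    print(sorted(neighbors(LEVELS_46, 4604)))
    print(sorted(neighbors(LEVELS_47, 4704, horizontal=True)))
```

Under this model a middle die in a straight stack reaches two neighbors, a middle die in a staggered stack reaches four, and adding horizontal interfaces in the same level raises that to six.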

Figure 48 depicts an example physical layout of 3D VMM system 4800, in which only connectors 4801 are shown. Connectors 4801 are located in the dies and connect to one or more vertical interfaces 3521 and horizontal interfaces 3716. As can be seen, the dies and interfaces can be arranged so that connectors 4801 are positioned in a vertically staggered configuration.

Circuits for Use in 3D VMM Systems

Figures 49 to 55 depict circuits for use in 3D VMM systems such as those previously described.

Figure 49 depicts neuron circuit 4900, which can be used for the previously described neuron circuit 3510. Specifically, neuron circuit 3510 may comprise an instance of neuron circuit 4900 for each bit line in VMM array 3507. Neuron circuit 4900 comprises p-channel metal oxide semiconductor (PMOS) transistor 4901 and operational amplifier 4902, configured as shown. One terminal of PMOS transistor 4901 is attached to a voltage source. The other terminal of PMOS transistor 4901 is attached to the gate of PMOS transistor 4910, to a bit line in VMM array 3507, and to the non-inverting terminal of operational amplifier 4902. The output of operational amplifier 4902 is attached to the inverting input of operational amplifier 4902. The current I-BL drawn by the bit line produces voltage V_IBL, which is output by operational amplifier 4902. Operational amplifier 4902 acts as a buffer, and V_IBL maintains its level regardless of the load to which it is attached. For example, if V_IBL is provided to a vertical interface 3521, that vertical interface 3521 may have parasitic capacitance or parasitic current. Neuron circuit 4900 maintains output voltage V_IBL regardless of changes in the load. This is therefore a neuron current buffer circuit. In another embodiment, the neuron current (bit line current) can be scaled up or down by a current mirror before entering this circuit.
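Why a unity-gain operational amplifier holds its output level regardless of load can be illustrated with a standard textbook model. The sketch below is not derived from the figure: it assumes a DC resistive load as a stand-in for the parasitic loading of a vertical interface, and the open-loop gain, output resistance, and source resistance values are arbitrary illustrative numbers.

```python
def unbuffered_output(v_source, r_source, r_load):
    """Voltage at the load when a source with series resistance drives it
    directly: a plain resistive divider."""
    return v_source * r_load / (r_source + r_load)


def buffered_output(v_in, r_load, gain=1e5, r_out=100.0):
    """Voltage at the load when a unity-gain buffer drives it.

    Model (assumed): the amplifier is a source gain * (v_in - v_out) behind
    r_out, with v_out fed back to the inverting input. Solving
        v_out = gain * (v_in - v_out) * r_load / (r_out + r_load)
    gives the closed-form expression below.
    """
    k = r_load / (r_out + r_load)
    return v_in * gain * k / (1.0 + gain * k)


if __name__ == "__main__":
    for r_load in (1e3, 1e4, 1e6):
        print(
            r_load,
            unbuffered_output(1.0, 10e3, r_load),
            buffered_output(1.0, r_load),
        )
```

In this model the unbuffered voltage changes strongly with the load, while the buffered voltage stays within a small fraction of a percent of the input, which is the behavior the text attributes to the buffer.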

Figure 50 depicts neuron circuit 5000, which can be used in the previously described neuron circuit 3510. Specifically, neuron circuit 3510 can include an instance of neuron circuit 5000 for each bit line in VMM array 3507. Neuron circuit 5000 comprises controlled switch 5001, reference memory cell 5002, and operational amplifier 5003, configured as shown. The current I-BL drawn by the bit line produces a voltage at the non-inverting terminal of operational amplifier 5003. When switch 5001 is closed, the output of operational amplifier 5003 is fed back as VNEUOUT to the control gate terminal of reference memory cell 5002. Owing to the inherent characteristics of an operational amplifier, operational amplifier 5003 adjusts the output voltage VNEUOUT until the voltage on its non-inverting terminal equals the voltage VREF on its inverting terminal. It does so through the feedback to the control gate terminal of reference memory cell 5002. Neuron circuit 5000 maintains the output voltage VNEUOUT regardless of the load that receives the voltage, such as the parasitic capacitance of vertical interface 3521. Neuron circuit 5000 uses memory cell 5002 to convert neuron current I-BL into voltage VNEUOUT. It is therefore a memory-cell-based current-to-voltage converter that uses an operational amplifier with feedback.

Figure 51 depicts force and sense (F/S) driving circuit 5100, which can be used in neuron circuit 3510 or elsewhere to deliver a neuron's voltage output accurately across varying loads. Forcing is performed by operational amplifier 5103, which receives input VIN at its positive terminal. Sensing occurs at the target node (node 5101 or 5102): the voltage at the target node is fed back to the negative terminal of the amplifier, and the action of the amplifier makes that voltage equal to the input voltage VIN. F/S driving circuit 5100 comprises operational amplifier 5103 and controlled switches 5104, 5105, 5106, and 5107, configured as shown. VIN is received at the non-inverting input of operational amplifier 5103, and the output of operational amplifier 5103 is referred to as VOUT. When closed, switches 5104 and 5106 respectively connect the output of operational amplifier 5103 and the inverting input of operational amplifier 5103 (as VOUTFB) to node 5101 in a first die through vertical interface 3521. When closed, switches 5105 and 5107 respectively connect the output of operational amplifier 5103 and the inverting input of operational amplifier 5103 (as VOUTFB) to node 5102 in a second die through vertical interface 3521. Driving circuit 5100 provides an accurate voltage VOUT to node 5101 and node 5102 regardless of any loading effects caused by vertical interface 3521 or by the first and second dies. This is because the sensed node (the driven node), 5101 or 5102, is fed back to the inverting input of operational amplifier 5103.
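The benefit of sensing at the target node can be illustrated with a small numerical sketch. It is not taken from the patent: it assumes an operational amplifier with a large but finite open-loop gain and models the vertical interface as a simple series resistance carrying a load current. The function names, `r_interface`, `i_load`, and the numeric values are illustrative assumptions only.

```python
def node_voltage_with_sense(vin, i_load, r_interface, gain=1e5):
    """Target-node voltage when the node itself is fed back to the inverting input.

    Assumed model: vout = gain * (vin - v_node) and v_node = vout - i_load * r_interface.
    Solving the two equations for v_node gives the expression below.
    """
    return (gain * vin - i_load * r_interface) / (1.0 + gain)


def node_voltage_without_sense(vin, i_load, r_interface, gain=1e5):
    """Target-node voltage when feedback is taken at the amplifier output instead."""
    vout = gain * vin / (1.0 + gain)       # the amplifier regulates only its own output
    return vout - i_load * r_interface     # the interface drop appears at the node


if __name__ == "__main__":
    # Hypothetical values: 1 V input, 1 mA load current, 100 ohm interface resistance.
    print(node_voltage_with_sense(1.0, 1e-3, 100.0))
    print(node_voltage_without_sense(1.0, 1e-3, 100.0))
```

Under these assumptions the sensed node settles very close to VIN, while the unsensed node is lower by roughly the voltage drop across the interface.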

Figure 52 depicts neuron circuit 5200, which can be used in the previously described neuron circuit 3510. Specifically, neuron circuit 3510 can include an instance of differential neuron circuit 5200 for each pair of differential bit lines in VMM array 3507, such as in an example where one column stores W+ values and another column stores W- values, and each pair of W+ and W- values represents a stored weight.

Differential neuron circuit 5200 comprises operational amplifier 5201, variable integrating resistors 5202 and 5203, controlled switches 5204, 5205, 5206, and 5207, and sample-and-hold (and/or integrating) capacitors 5208 and 5209, configured as shown. Differential neuron circuit 5200 receives differential currents BLw+ from the W+ bit line and BLw- from the W- bit line, respectively, and outputs voltages Vout+ and Vout-. The output voltages are Vout+ = (BLw+) * R and Vout- = (BLw-) * R, where variable integrating resistors 5202 and 5203 each have a value equal to R. Capacitors 5208 and 5209 act as respective sample-and-hold (S/H) capacitors: they hold the output voltages once resistors 5202 and 5203 are removed from the circuit by opening controlled switches 5206 and 5207 and the input currents are shut off by opening controlled switches 5204 and 5205. A control circuit (not shown) controls the opening and closing of switches 5204, 5205, 5206, and 5207 to provide the integration time. Optionally, the differential output voltages Vout+ and Vout- can be input to ADC 3516, which converts them into a set of digital output bits Doutx. Optionally, the circuit can use capacitors 5208 and 5209 as integrating capacitors that integrate the neuron current to convert it into a voltage, Vout = time * (neuron current) / capacitance. Where the integrating-capacitor method is used, neuron scaling is provided by variable resistors 5202 and 5203 or by a variable integration time and/or variable capacitance.
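The two current-to-voltage relations stated for the differential neuron circuit can be written out as a short sketch. The function names and the numeric values are illustrative assumptions, and the model is ideal (no switch resistance, leakage, or amplifier limits).

```python
def resistive_outputs(i_blw_pos, i_blw_neg, r):
    """Resistor method: Vout+ = (BLw+) * R and Vout- = (BLw-) * R."""
    return i_blw_pos * r, i_blw_neg * r


def integrated_output(i_neuron, t_integration, capacitance):
    """Integrating-capacitor method: Vout = time * (neuron current) / capacitance."""
    return t_integration * i_neuron / capacitance


if __name__ == "__main__":
    # Hypothetical values: 2 uA and 0.5 uA bit line currents, R = 100 kOhm.
    vout_pos, vout_neg = resistive_outputs(2e-6, 0.5e-6, 1e5)
    print(vout_pos, vout_neg)
    # Hypothetical values: 1 uA integrated for 1 us onto 1 pF.
    print(integrated_output(1e-6, 1e-6, 1e-12))
```

Changing `r`, `t_integration`, or `capacitance` changes the output for the same input current, which corresponds to the neuron scaling described in the text.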

Figure 53 depicts sample-and-hold buffer 5300, which can be used in the previously described input circuit 3514. Specifically, input circuit 3514 can include an instance of sample-and-hold buffer 5300 for each row in VMM array 3507. Sample-and-hold buffer 5300 comprises controlled switch 5301, capacitor 5302, and buffer 5303. Buffer 5303 can be a unity-gain buffer formed by an operational amplifier. During operation, switch 5301 is closed, which allows an analog value (for example, from DAC 3515) to be stored on capacitor 5302. That value can then be output from buffer 5303, which drives a row input of VMM array 3507. Capacitor 5302 can be an actual capacitor, or it can be the intrinsic capacitance found in a wire.
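The sample-and-hold behavior can be sketched as a minimal, idealized model: the capacitor tracks the input while the switch is closed, and the buffer keeps presenting the stored value after the switch opens. The class and attribute names are illustrative assumptions, and effects such as droop, charge injection, and settling time are ignored.

```python
class SampleAndHoldBuffer:
    """Idealized model of a switch, a hold capacitor, and a unity-gain buffer."""

    def __init__(self):
        self.switch_closed = False
        self._held_voltage = 0.0  # voltage stored on the hold capacitor

    def drive(self, v_in):
        """Present an input voltage and return the buffer output."""
        if self.switch_closed:
            self._held_voltage = v_in  # capacitor follows the input
        return self._held_voltage      # buffer output follows the capacitor


if __name__ == "__main__":
    sh = SampleAndHoldBuffer()
    sh.switch_closed = True
    sh.drive(0.7)            # sample a hypothetical DAC level
    sh.switch_closed = False
    print(sh.drive(0.2))     # the held value is still driven to the row input
```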

Figure 54 depicts differential successive approximation register (SAR) analog-to-digital converter (ADC) 5400, which can be used in the previously described ADC 3516.

Differential SAR ADC 5400 converts an analog input, or a differential analog input, into a digital output by using a binary search through all possible quantization levels to identify the appropriate digital output.

Differential SAR ADC 5400 comprises binary capacitive digital-to-analog converter (CDAC) 5401, binary CDAC 5402 (complementary to CDAC 5401), comparator 5403, and SAR logic and register 5404.

Differential SAR ADC 5400 receives differential current inputs Vinp and Vinn. SAR logic and register 5404 cycles through all possible combinations of digital bits, which in turn controls the switches in CDACs 5401 and 5402 to couple voltage sources to the capacitors. When the output of comparator 5403 flips, the combination of digital bits held in SAR logic and register 5404 is output as Digital Output. Optionally, SAR logic and register 5404 generates, within the digital output, an additional 1-bit digital output DMAJ, which is "1" when the majority of the bits in the digital value are "1" and "0" when the majority of the bits in the corresponding digital value are not "1".
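The binary search and the majority bit can be illustrated with a behavioral sketch. This is not the patented circuit: it assumes the conventional MSB-first successive-approximation procedure, an ideal binary-weighted DAC, and a differential input `vinp - vinn` that lies in the range [0, vref). The function names, the `vref` and `n_bits` parameters, and the input range are assumptions made for illustration.

```python
def sar_convert(vinp, vinn, vref, n_bits):
    """Binary search for the code whose DAC level is the largest one not above the input."""
    vin = vinp - vinn
    code = 0
    for bit in range(n_bits - 1, -1, -1):        # test bits from MSB down to LSB
        trial = code | (1 << bit)                 # tentatively set this bit
        v_dac = vref * trial / (1 << n_bits)      # ideal DAC level for the trial code
        if vin >= v_dac:                          # comparator decision
            code = trial                          # keep the bit
    return code


def majority_bit(code, n_bits):
    """DMAJ-style flag: 1 if more than half of the n_bits bits of code are 1, else 0."""
    ones = bin(code).count("1")
    return 1 if ones > n_bits / 2 else 0


if __name__ == "__main__":
    code = sar_convert(0.80, 0.25, 1.0, 4)  # hypothetical 4-bit conversion of 0.55 V
    print(format(code, "04b"), majority_bit(code, 4))
```

An N-bit conversion in this sketch takes N comparator decisions, one per bit.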

Figure 55 depicts reference-current-based SAR ADC circuit 5500, which can be used in the previously described ADC 3516. Specifically, ADC 3516 can include one or more instances of ADC circuit 5500, where ADC circuit 5500 converts current I-BL into a digital value, Digital Output. ADC circuit 5500 comprises binary current block 5501, switch 5502, comparator 5503, and SAR logic and register 5504. Binary current block 5501 provides binary reference currents for comparison against the bit line input current in a binary search fashion, meaning that the search proceeds from the MSB (most significant bit) to the LSB (least significant bit).
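The MSB-to-LSB search against binary reference currents can be traced step by step in a short sketch. It assumes that the reference for a trial code is the sum of binary-weighted multiples of a unit current `i_lsb`, and that a bit is kept when the bit line current is at least the reference. The unit current, the comparator polarity, and all names are illustrative assumptions rather than details given in the patent.

```python
def current_sar_convert(i_bl, i_lsb, n_bits):
    """Return the output code and a trace of (bit, reference current, kept) per step."""
    code = 0
    trace = []
    for bit in range(n_bits - 1, -1, -1):   # search from MSB to LSB
        trial = code | (1 << bit)
        i_ref = trial * i_lsb               # sum of the enabled binary-weighted currents
        keep = i_bl >= i_ref                # comparator decision
        trace.append((bit, i_ref, keep))
        if keep:
            code = trial
    return code, trace


if __name__ == "__main__":
    # Hypothetical values: 5.3 uA bit line current, 1 uA unit current, 3 bits.
    code, trace = current_sar_convert(5.3e-6, 1e-6, 3)
    for bit, i_ref, keep in trace:
        print(bit, i_ref, keep)
    print(code)
```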

It should be noted that, as used herein, the terms "over" and "on" both include "directly on" (with no intermediate materials, elements, or space disposed therebetween) and "indirectly on" (with intermediate materials, elements, or space disposed therebetween). Likewise, the term "adjacent" includes "directly adjacent" (with no intermediate materials, elements, or space disposed therebetween) and "indirectly adjacent" (with intermediate materials, elements, or space disposed therebetween); "mounted to" includes "directly mounted to" (with no intermediate materials, elements, or space disposed therebetween) and "indirectly mounted to" (with intermediate materials, elements, or space disposed therebetween); and "electrically coupled" includes "directly electrically coupled to" (with no intermediate materials or elements therebetween that electrically connect the elements together) and "indirectly electrically coupled to" (with intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element "over a substrate" can include forming the element directly on the substrate with no intermediate materials or elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials or elements therebetween.

12: Semiconductor substrate
14: Source region
16: Drain region
18: Channel region
20: Floating gate
22: Word line terminal/select gate
24, I-BL: Bit line
28: Control gate
30: Erase gate
31, 3515: Digital-to-analog converter
32, 32a, 32b, 32c, 32d, 32e: Vector matrix multiplication array
33: Non-volatile memory cell array
34: Erase gate and word line gate decoder
35: Control gate decoder
36: Bit line decoder
37: Source line decoder
38: Differential summer/summing operational amplifier
39, 1602, 1702, 2002, 2102: Activation function block
210, 310, 410, 510, 710: Memory cell
900, 1000, 1100, 1200, 1300, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000: Neuron VMM array
901, 1003, 1103, 1203, 1303: Memory array
902, 1001, 1002, 1101, 1102, 1201, 1202, 1301, 1302: Reference array
903: Control gate line
904: Erase gate line
1014, 1212, 1314: Diode-connected through multiplexer
1204: Cascoding transistor
1205, 1709, 1710, 2104: Multiplexer
1400: VMM array/LSTM
1401, 1402, 1403, 1404, 1801, 1802, 1803, 1804, 4101mn, 4102mn: Cell
1500, 1600, 1700: LSTM cell
1501, 1502, 1503, 1901, 1902: Sigmoid function device
1504, 1505, 1903: Hyperbolic tangent device
1506, 1507, 1508, 1703, 1904, 1905, 1906, 2103: Multiplier device
1509, 1708, 1907, 2105: Addition device
1601, 1701, 2001, 2101, 3401, 3507, 3557, 3507-1, 3507-2: VMM array
1704, 1705, 1706, 1707, 2106, 2107, 2108: Register
1800: Gated recurrent unit
1900, 2000, 2100: GRU cell
1908, 2109: Complementary device
2701-1, 2701-2, 2701-(N-1), 2701-N: Bit line control gate
3100, 3210, 3300, 3400: VMM system
3101, 3102, 3213, 3303, 3304, 3305, 3306, 3307, 3308: Summing circuit
3211: First array
3212: Second array
3301, 3302: Array
3402: Row decoder
3403, 3508, 3714: High voltage decoder
3404: Column decoder
3405: Bit line driver
3406, 3514: Input circuit
3407: Output circuit
3408: Control logic
3409: Bias generator
3410: High voltage generation block
3411: Charge pump
3412: Charge pump regulator
3413: High voltage analog precision level generator
3414: Algorithm controller
3415, 3512: Analog circuitry
3416: Control engine
3417: Test control logic
3500, 3600, 3700, 3750, 3800, 3900, 3950, 3980, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800: 3D VMM system
3501, 3502, 3503, 3504, 3505, 3506, 3601, 3602, 3603, 3604, 3605, 3606, 3701, 3702, 3703, 3704, 3705, 3706, 3707, 3708, 3709, 3710, 3711, 3712, 3758, 3801, 3802, 3803, 3804, 3805, 3806, 3807, 3808, 3809, 3810, 3811, 3812, 3901, 3902, 3903, 3904, 3905, 3906, 3907, 3908, 3951, 3952, 3953, 3954, 3955, 3956, 3957, 3958, 3981, 3982, 3983, 3984, 3985, 3986, 4001, 4002, 4003, 4004, 4005, 4006, 4007, 4008, 4501, 4502, 4503, 4504, 4505, 4506, 4507, 4508, 4601, 4602, 4603, 4604, 4605, 4606, 4607, 4608, 4609, 4610, 4701, 4702, 4703, 4704, 4705, 4706, 4707, 4708, 4709, 4710: Die
3509: Input multiplexer
3510, 4900, 5000, 5200: Neuron circuit
3511: High voltage generator
3513: Temperature compensation circuit
3516: Analog-to-digital converter
3517: Digital circuit
3518: Static random access memory
3519: Register
3520: Physical I/O connection
3521: Vertical interface/digital accelerator
3522, 3622: Package
3523: Row buffer
3524: Address decoding circuit/address decoder/row register
3525: Row register
3531: Digital accelerator
3608: High voltage multiplexer
3610: Column multiplexer
3713: Address decoding circuit/address decoder
3715: Network on chip
3716: Horizontal interface
3729, 4029: Array input
3990: Input block
3991: High voltage block
3992: Output block
3993: Analog block
4801: Connector
4901: P-channel metal oxide semiconductor transistor
4902, 5003, 5103, 5201: Operational amplifier
5001, 5104, 5105, 5106, 5107, 5204, 5205, 5206, 5207, 5301: Controlled switch
5002: Reference memory cell
5100: Driving circuit
5101, 5102: Node
5202, 5203: Variable integrating resistor
5208, 5209: Sample-and-hold capacitor
5300: Sample-and-hold buffer
5302: Capacitor
5303: Buffer
5400: Differential successive approximation register analog-to-digital converter
5401: Binary capacitive digital-to-analog converter
5402: Binary CDAC
5403, 5503: Comparator
5404, 5504: SAR logic and register
5500: Reference-current-based SAR ADC circuit
5501: Binary current block
5502: Switch
BL0, BL1, BL2, BL3, BLn, BLN: Bit line
BLR0, BLR1, BLR2, BLR3: Terminal
BLw+, BLw-: Differential current
c0, c1, c2, c(t-1), c(t): Cell state vector
C1, C2, C3, S1, S2, S3: Layer
CB1, CB2, CB3, CB4: Synapse
CG0, CG1, CG2, CG3, CGM-1, CGM, CGm, CGn: Voltage input/control gate voltage
DMAJ: Additional 1-bit digital output
Doutx: Set of digital output bits
EG0, EG1: Erase gate/EG line
h0, h1, h2, h3, h(t-1), h(t): Output vector
INPUT0, INPUT1, INPUTN-1, INPUTN, INPUTM-1, INPUTM: Input
OUTPUT0, OUTPUT1, OUTPUT2, OUTPUT3, OUTPUT4, OUTPUTN-1, OUTPUTN: Output
P1, P2: Activation function
S0: Input layer
SL0, SL1, SL2, SL3: Source line
V_IBL, VNEUOUT, VOUT, Vout+, Vout-, VREF: Voltage
VIN: Input voltage
Vinp, Vinn: Differential current input
VOUTFB: Inverting input
Vth: Threshold voltage
WL, WL0, WL1, WL2, WL3, WL4, WL5, WL6, WL7, WLM-1, WLM, WLA0, WLA1, WLA2, WLA3, WLB0, WLB1, WLB2, WLB3: Word line
x0, x1, x2, x3, x(t): Input vector

Figure 1 is a diagram illustrating an artificial neural network.

Figure 2 depicts a prior art split gate flash memory cell.

Figure 3 depicts another prior art split gate flash memory cell.

Figure 4 depicts another prior art split gate flash memory cell.

Figure 5 depicts another prior art split gate flash memory cell.

Figure 6 is a diagram illustrating the different levels of an example artificial neural network utilizing one or more non-volatile memory arrays.

Figure 7 is a block diagram illustrating a VMM system.

Figure 8 is a block diagram illustrating an example artificial neural network utilizing one or more VMM systems.

Figure 9 depicts another example of a VMM system.

Figure 10 depicts another example of a VMM system.

Figure 11 depicts another example of a VMM array.

Figure 12 depicts another example of a VMM system.

Figure 13 depicts another example of a VMM system.

Figure 14 depicts a prior art long short-term memory system.

Figure 15 depicts an example cell for use in a long short-term memory system.

Figure 16 depicts an example implementation of the cell of Figure 15.

Figure 17 depicts another example implementation of the cell of Figure 15.

Figure 18 depicts a prior art gated recurrent unit system.

Figure 19 depicts an example cell for use in a gated recurrent unit system.

Figure 20 depicts an example implementation of the cell of Figure 19.

Figure 21 depicts another example implementation of the cell of Figure 19.

Figure 22 depicts another example of a VMM system.

Figure 23 depicts another example of a VMM system.

Figure 24 depicts another example of a VMM system.

Figure 25 depicts another example of a VMM system.

Figure 26 depicts another example of a VMM system.

Figure 27 depicts another example of a VMM system.

Figure 28 depicts another example of a VMM system.

Figure 29 depicts another example of a VMM system.

Figure 30 depicts another example of a VMM system.

Figure 31 depicts another example of a VMM system.

Figure 32 depicts another example of a VMM system.

Figure 33 depicts another example of a VMM system.

Figure 34 depicts an example of a 2D VMM system.

Figure 35 depicts an example of a 3D VMM system.

Figure 36 depicts an example of a 3D VMM system.

Figures 37A and 37B depict examples of a 3D VMM system.

Figure 38 depicts an example of a 3D VMM system.

Figures 39A, 39B, and 39C depict examples of a 3D VMM system.

Figure 40 depicts an example of a 3D VMM system.

Figure 41 depicts an example of a 3D VMM system.

Figure 42 depicts an example of a 3D VMM system.

Figure 43 depicts an example of a 3D VMM system.

Figure 44 depicts an example of a 3D VMM system.

Figure 45 depicts an example of a 3D VMM system.

Figure 46 depicts an example of a 3D VMM system.

Figure 47 depicts an example of a 3D VMM system.

Figure 48 depicts an example of a 3D VMM system.

Figure 49 depicts an example of a neuron circuit.

Figure 50 depicts an example of a neuron circuit.

Figure 51 depicts an example of a driving circuit.

Figure 52 depicts an example of a differential neuron circuit.

Figure 53 depicts an example of a sample-and-hold buffer.

Figure 54 depicts an example of an ADC circuit.

Figure 55 depicts an example of an ADC circuit.

C1: Layer

C2: Layer

C3: Layer

CB1: Synapse

CB2: Synapse

CB3: Synapse

CB4: Synapse

P1: Activation function

P2: Activation function

S1: Layer

S2: Layer

S3: Layer

Claims (53)

1. A three-dimensional integrated circuit for use in an artificial neural network, comprising: a first die comprising a first vector matrix multiplication array and a first input multiplexer, the first die located on a first vertical layer; a second die comprising an input circuit, the second die located on a second vertical layer different from the first vertical layer; and one or more vertical interfaces coupling the first die and the second die; wherein during a read operation, the input circuit provides an input signal to the first input multiplexer through at least one of the one or more vertical interfaces, the first input multiplexer applies the input signal to one or more rows in the first vector matrix multiplication array, and the first vector matrix multiplication array generates an output.

2. The three-dimensional integrated circuit of claim 1, wherein the second die comprises a digital-to-analog converter for converting a digital input into an analog input provided to the input circuit as the input signal.

3. The three-dimensional integrated circuit of claim 1, wherein the first die comprises a neuron circuit for buffering the output.

4. The three-dimensional integrated circuit of claim 1, wherein the first die comprises a column multiplexer for sending the output to the second die or a third die through at least one of the one or more vertical interfaces.
5. The three-dimensional integrated circuit of claim 1, comprising: a third die comprising an analog-to-digital converter for converting the output from the first die into a digital output, the third die located on a third vertical layer different from the first vertical layer and the second vertical layer.

6. The three-dimensional integrated circuit of claim 5, comprising: a fourth die comprising a high voltage generator, analog circuitry, and a temperature compensation circuit, the fourth die located on a fourth vertical layer different from the first vertical layer, the second vertical layer, and the third vertical layer.

7. The three-dimensional integrated circuit of claim 6, comprising: a fifth die comprising a second vector matrix multiplication array, a second input multiplexer, a high voltage decoder, and a neuron circuit, the fifth die located on a fifth vertical layer different from the first vertical layer, the second vertical layer, the third vertical layer, and the fourth vertical layer.

8. The three-dimensional integrated circuit of claim 1, comprising: a third die comprising a second vector matrix multiplication array and a second input multiplexer, the third die located on the first vertical layer.

9. The three-dimensional integrated circuit of claim 1, wherein the vector matrix multiplication array comprises a plurality of non-volatile memory cells.
10. The three-dimensional integrated circuit of claim 9, wherein the plurality of non-volatile memory cells comprise stacked-gate flash memory cells.

11. The three-dimensional integrated circuit of claim 9, wherein the plurality of non-volatile memory cells comprise split-gate flash memory cells.

12. A method comprising: providing, by an input circuit located on a first die, an input signal to an input multiplexer located on a second die through one or more vertical interfaces; applying, by the input multiplexer, the input signal to one or more rows in a neural network array; and generating an output by the neural network array; wherein the first die and the second die are located on different vertical layers.

13. The method of claim 12, wherein the neural network array comprises a plurality of non-volatile memory cells.

14. The method of claim 13, wherein the plurality of non-volatile memory cells comprise stacked-gate flash memory cells.

15. The method of claim 13, wherein the plurality of non-volatile memory cells comprise split-gate flash memory cells.
16. A device comprising: a first die comprising a first vector matrix multiplication array containing a plurality of non-volatile memory cells arranged in rows and columns, the first die located on a first vertical layer; a second die comprising a second vector matrix multiplication array containing a plurality of non-volatile memory cells arranged in rows and columns, the second die located on a second vertical layer different from the first vertical layer; and one or more vertical interfaces coupling the first die and the second die; wherein during a programming operation, one or more non-volatile memory cells in the first array are capable of storing i bits and one or more non-volatile memory cells in the second array are capable of storing j bits, where i ≠ j.

17. The device of claim 16, wherein the first die is fabricated according to a first semiconductor process and the second die is fabricated according to a second semiconductor process different from the first semiconductor process.

18. The device of claim 16, comprising: a first set of bit lines coupled to the first vector matrix multiplication array; and a second set of bit lines, different from the first set of bit lines, coupled to the second vector matrix multiplication array.
19. The device of claim 18, comprising: a first set of control gate lines coupled to the first vector matrix multiplication array; and a second set of control gate lines, different from the first set of control gate lines, coupled to the second vector matrix multiplication array.

20. The device of claim 19, wherein the first set of control gate lines is coupled to the second set of control gate lines by a respective vertical interface.

21. The device of claim 18, wherein the first set of bit lines is coupled to the second set of bit lines by a respective vertical interface.

22. The device of claim 21, comprising: a first set of control gate lines coupled to the first vector matrix multiplication array; and a second set of control gate lines coupled to the second vector matrix multiplication array.

23. The device of claim 22, wherein the first set of control gate lines is coupled to the second set of control gate lines by a respective vertical interface.

24. The device of claim 16, wherein the plurality of non-volatile memory cells in the first die and the plurality of non-volatile memory cells in the second die respectively comprise stacked-gate flash memory cells.

25. The device of claim 16, wherein the plurality of non-volatile memory cells in the first die and the plurality of non-volatile memory cells in the second die respectively comprise split-gate flash memory cells.

A method comprising:
storing a value comprising i bits in a non-volatile memory cell in a first neural network array in a first die; and
storing a value comprising i bits in a non-volatile memory cell in a second neural network array in a second die, where iy;
wherein the first die and the second die are located on different vertical layers.

A device comprising:
a first die comprising a first vector matrix multiplication array containing a respective plurality of non-volatile memory cells arranged in rows and columns, the first die located on a first vertical layer;
a second die comprising a second vector matrix multiplication array containing a respective plurality of non-volatile memory cells arranged in rows and columns, the second die located on a second vertical layer different from the first layer; and
a third die comprising one or more of a digital-to-analog converter and an analog-to-digital converter.

The device of claim 27, wherein the analog-to-digital converter is a capacitor-based successive approximation register analog-to-digital converter.

The device of claim 27, wherein the analog-to-digital converter is a reference-current-based successive approximation register analog-to-digital converter.

The device of claim 27, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array and the second vector matrix multiplication array comprise stacked-gate flash memory cells.

The device of claim 27, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array and the second vector matrix multiplication array comprise split-gate flash memory cells.

A device comprising:
a first die comprising a first vector matrix multiplication array containing a plurality of non-volatile memory cells arranged in rows and columns, the first die located on a first vertical layer;
a second die comprising a second vector matrix multiplication array containing a plurality of non-volatile memory cells arranged in rows and columns, the second die located on a second vertical layer different from the first vertical layer; and
a third die comprising a neuron circuit.

The device of claim 32, wherein the neuron circuit comprises:
a p-channel metal oxide semiconductor transistor comprising a first terminal coupled to a voltage source, a gate, and a second terminal coupled to the gate and to a neuron; and
an operational amplifier comprising a non-inverting input coupled to the second terminal and the gate of the p-channel metal oxide semiconductor transistor, an inverting input, and an output coupled to the inverting input to generate a voltage output in response to current from the neuron.

The device of claim 32, wherein the neuron circuit comprises:
a switch;
a reference memory cell comprising a bit line terminal coupled to a neuron, a source line terminal, and a control gate terminal; and
an operational amplifier comprising an inverting input coupled to a reference voltage, a non-inverting input coupled to the bit line terminal of the reference memory cell, and an output switchably coupled via the switch to the control gate terminal of the reference memory cell.

The device of claim 32, wherein the neuron circuit comprises:
an operational amplifier comprising an inverting input, a non-inverting input, a first output coupled to a first output node, and a second output coupled to a second output node;
a first variable integrating resistor switchably coupled via a first switch between the first output node and the inverting input of the operational amplifier;
a second variable integrating resistor switchably coupled via a second switch between the second output node and the non-inverting input of the operational amplifier;
a first capacitor switchably coupled via a third switch between a first input current from a bit line and the first output node; and
a second capacitor switchably coupled via a fourth switch between a second input current from a bit line and the second output node.

The device of claim 35, wherein the first input current and the second input current are a differential current signal, and the first output node and the second output node contain a differential voltage signal.

The device of claim 36, wherein the first input current is received from a W+ bit line and the second input current is received from a W- bit line.

The device of claim 36, comprising an analog-to-digital converter to convert the differential voltage signal into a set of digital output bits.

The device of claim 32, wherein the plurality of non-volatile memory cells in the first vector matrix multiplication array and the second vector matrix multiplication array comprise stacked-gate flash memory cells.

The device of claim 32, wherein the plurality of non-volatile memory cells in the first vector matrix multiplication array and the second vector matrix multiplication array comprise split-gate flash memory cells.

A device comprising:
a first die comprising a first vector matrix multiplication array containing a respective plurality of non-volatile memory cells arranged in rows and columns, the first die located on a first vertical layer;
a second die comprising a second vector matrix multiplication array containing a respective plurality of non-volatile memory cells arranged in rows and columns, the second die located on a second vertical layer different from the first vertical layer; and
a third die comprising digital circuitry comprising one or more of a microcontroller, digital logic, or a single instruction multiple data processor.

The device of claim 41, wherein the third die comprises a digital accelerator.

The device of claim 41, wherein the third die comprises a static random access memory.

The device of claim 41, wherein the third die comprises physical input/output connections.

The device of claim 41, wherein the third die comprises registers.

The device of claim 41, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array and the respective plurality of non-volatile memory cells in the second vector matrix multiplication array comprise stacked-gate flash memory cells.

The device of claim 41, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array and the respective plurality of non-volatile memory cells in the second vector matrix multiplication array comprise split-gate flash memory cells.

A device comprising:
a first vertical layer comprising: a first vector matrix multiplication array comprising a respective plurality of non-volatile memory cells arranged in rows and columns; and a second vector matrix multiplication array comprising a respective plurality of non-volatile memory cells arranged in rows and columns;
one or more respective horizontal interfaces coupling the first vector matrix multiplication array and the second vector matrix multiplication array;
a second vertical layer comprising: a third vector matrix multiplication array comprising a respective plurality of non-volatile memory cells arranged in rows and columns; and a fourth vector matrix multiplication array comprising a respective plurality of non-volatile memory cells arranged in rows and columns; and
one or more respective horizontal interfaces coupling the third vector matrix multiplication array and the fourth vector matrix multiplication array.

The device of claim 48, wherein the first vector matrix multiplication array is located on a first die, the second vector matrix multiplication array is located on a second die, the third vector matrix multiplication array is located on a third die, and the fourth vector matrix multiplication array is located on a fourth die.

The device of claim 49, wherein the first die is vertically aligned with the third die, and the second die is vertically aligned with the fourth die.

The device of claim 49, wherein the first die and the second die are vertically staggered with respect to the third die and the fourth die.

The device of claim 48, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array, the second vector matrix multiplication array, the third vector matrix multiplication array, and the fourth vector matrix multiplication array comprise stacked-gate flash memory cells.

The device of claim 48, wherein the respective plurality of non-volatile memory cells in the first vector matrix multiplication array, the second vector matrix multiplication array, the third vector matrix multiplication array, and the fourth vector matrix multiplication array comprise split-gate flash memory cells.
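
As an informal illustration only, and not part of the claims, the following Python sketch models two ideas recited above: a vector matrix multiplication whose weights are read as the difference between a W+ bit line and a W- bit line, and a successive approximation register conversion of an analog value into digital output bits. The function names, the unitless values, and the idealized (noise-free, linear) cell behavior are all assumptions made for the example; nothing here is taken from the patent's actual circuits.

```python
from typing import List


def differential_vmm(inputs: List[float],
                     w_plus: List[List[float]],
                     w_minus: List[List[float]]) -> List[float]:
    """Hypothetical model: one output per column, sum_r x[r] * (W+[r][c] - W-[r][c])."""
    rows, cols = len(w_plus), len(w_plus[0])
    outputs = []
    for c in range(cols):
        # Idealized current summed on the W+ bit line of column c.
        i_plus = sum(inputs[r] * w_plus[r][c] for r in range(rows))
        # Idealized current summed on the W- bit line of column c.
        i_minus = sum(inputs[r] * w_minus[r][c] for r in range(rows))
        outputs.append(i_plus - i_minus)
    return outputs


def sar_adc(value: float, full_scale: float, bits: int) -> int:
    """Hypothetical SAR model: binary search, MSB first, for the largest
    code whose level (code * LSB) does not exceed the input value."""
    lsb = full_scale / (1 << bits)
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)      # tentatively set this bit
        if trial * lsb <= value:       # comparator decision
            code = trial               # keep the bit
    return code
```

For instance, with inputs `[1.0, 2.0]`, `w_plus = [[3.0, 0.0], [1.0, 2.0]]` and `w_minus = [[1.0, 1.0], [0.0, 0.0]]`, the differential outputs are `[4.0, 3.0]`, and `sar_adc(3.0, 8.0, 3)` resolves to code `3` in three comparison steps.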
TW112108853A 2022-04-06 2023-03-10 Artificial neural network comprising a three-dimensional integrated circuit TW202407579A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202263328126P 2022-04-06 2022-04-06
US63/328,126 2022-04-06
US17/848,371 US20230325645A1 (en) 2022-04-06 2022-06-23 Artificial neural network comprising a three-dimensional integrated circuit
US17/848,371 2022-06-23
PCT/US2022/037644 WO2023196001A1 (en) 2022-04-06 2022-07-19 Artificial neural network comprising a three-dimensional integrated circuit
WOPCT/US22/37644 2022-07-19

Publications (1)

Publication Number Publication Date
TW202407579A true TW202407579A (en) 2024-02-16

Family

ID=83689130

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112108853A TW202407579A (en) 2022-04-06 2023-03-10 Artificial neural network comprising a three-dimensional integrated circuit

Country Status (2)

Country Link
TW (1) TW202407579A (en)
WO (1) WO2023196001A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5029130A (en) 1990-01-22 1991-07-02 Silicon Storage Technology, Inc. Single transistor non-volatile electrically alterable semiconductor memory device
US6747310B2 (en) 2002-10-07 2004-06-08 Actrans System Inc. Flash memory cells with separated self-aligned select and erase gates, and process of fabrication
JP6833873B2 (en) 2016-05-17 2021-02-24 シリコン ストーリッジ テクノロージー インコーポレイテッドSilicon Storage Technology, Inc. Deep learning neural network classifier using non-volatile memory array
US10748630B2 (en) 2017-11-29 2020-08-18 Silicon Storage Technology, Inc. High precision and highly efficient tuning mechanisms and algorithms for analog neuromorphic memory in artificial neural networks
US10552510B2 (en) * 2018-01-11 2020-02-04 Mentium Technologies Inc. Vector-by-matrix multiplier modules based on non-volatile 2D and 3D memory arrays

Also Published As

Publication number Publication date
WO2023196001A1 (en) 2023-10-12

Similar Documents

Publication Publication Date Title
TWI780415B (en) Decoding system and physical layout for analog neural memory in deep learning artificial neural network
TWI790551B (en) Analog neural memory array storing synapsis weights in differential cell pairs in artificial neural network
TW202217825A (en) Concurrent write and verify operations in an analog neural memory
TWI809663B (en) Precise data tuning method and apparatus for analog neural memory in an artificial neural network
TWI785574B (en) Analog neural memory array in artificial neural network with source line pulldown mechanism
TWI819298B (en) Analog neural memory array in artificial neural network comprising logical cells and improved programming mechanism
TW202407579A (en) Artificial neural network comprising a three-dimensional integrated circuit
US20230325645A1 (en) Artificial neural network comprising a three-dimensional integrated circuit
TWI834397B (en) Artificial neural network comprising an analog array and a digital array
TWI814383B (en) Output circuit for analog neural memory in a deep learning artificial neural network
US20230048411A1 (en) Input circuitry for analog neural memory in a deep learning artificial neural network
US11989440B2 (en) Hybrid memory system configurable to store neural memory weight data in analog form or digital form
TWI822198B (en) Output circuitry for analog neural memory in a deep learning artificial neural network
US20230325650A1 (en) Vector-by-matrix-multiplication array utilizing analog outputs
JP7493089B2 (en) Adaptive bias decoder for analog neural memory arrays in artificial neural networks.
US20230244903A1 (en) Artificial neural network comprising an analog array and a digital array
TW202343311A (en) Vector-by-matrix-multiplication array utilizing analog outputs
TW202343451A (en) Artificial neural network comprising reference array for i-v slope configuration
TW202312035A (en) Split array architecture for analog neural memory in a deep learning artificial neural network
CN117751406A (en) Hybrid memory system configurable to store neural memory weight data in analog or digital form
CN117716427A (en) Input circuit for simulating neural memory in deep learning artificial neural network
WO2023146567A1 (en) Artificial neural network comprising an analog array and a digital array
TW202336748A (en) Determination of a bias voltage to apply to one or more memory cells in a neural network
TW202416180A (en) Output circuit for artificial neural network array
TW202416179A (en) Input circuit for artificial neural network array