TW202422557A - Verification method and system in artificial neural network array

Info

Publication number: TW202422557A
Application number: TW112131058A
Authority: TW
Inventors: 曉萬陳; 史蒂芬鄭; 史丹利洪; 順武; 德阮; 賢范
Original assignee: 美商超捷公司
Priority date: 2022-09-22
Filing date: 2023-08-18
Publication date: 2024-06-01
Also published as: WO2024063792A1

Abstract

Numerous examples are disclosed of verification circuitry and associated methods in an artificial neural network. In one example, a system comprises a vector-by-matrix multiplication array comprising a plurality of non-volatile memory cells arranged in rows and columns, the non-volatile memory cells respectively capable of storing one of N possible levels corresponding to one of N possible currents, and a plurality of output blocks to receive current from respective columns of the vector-by-matrix multiplication array and generate voltages during a verify operation of the vector-by-matrix multiplication and generate digital outputs during a read operation of the vector-by-matrix multiplication.

Description

Verification method and system in artificial neural network array

This application claims priority to U.S. Patent Application No. 18/080,545, filed on December 13, 2022, and titled "Verification Method and System in Artificial Neural Network Array," and to U.S. Provisional Patent Application No. 63/409,142, filed on September 22, 2022, and titled "Verification Method and System in Artificial Neural Network Array."

Numerous examples of verification circuitry and associated methods in an artificial neural network are disclosed.

Artificial neural networks mimic biological neural networks (the central nervous systems of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. Artificial neural networks generally include layers of interconnected "neurons" that exchange messages with each other.

FIG. 1 illustrates an artificial neural network, where the circles represent the inputs or layers of neurons. The connections (called synapses) are represented by arrows and have numeric weights that can be tuned based on experience. This makes the neural network adaptive to inputs and capable of learning. Typically, a neural network includes a layer of multiple inputs. There are typically one or more intermediate layers of neurons, and an output layer of neurons that provides the output of the neural network. The neurons at each level make decisions, individually or collectively, based on the data received from the synapses.

One of the major challenges in developing artificial neural networks for high-performance information processing is the lack of adequate hardware technology. Practical neural networks rely on a very large number of synapses to enable high connectivity between neurons, that is, very high computational parallelism. In principle, such complexity can be achieved with digital supercomputers or clusters of specialized graphics processing units. However, in addition to high cost, these approaches also suffer from mediocre energy efficiency compared to biological networks, which consume much less energy primarily because they perform low-precision analog computation. CMOS analog circuits have been used for artificial neural networks, but most CMOS-implemented synapses have been too bulky given the large number of neurons and synapses.

Applicant previously disclosed, in U.S. Patent Application Publication No. 2017/0337466A1, which is incorporated herein by reference, an artificial (analog) neural network that utilizes one or more non-volatile memory arrays as the synapses. The non-volatile memory arrays operate as an analog neural memory and comprise non-volatile memory cells arranged in rows and columns. The neural network includes a first plurality of synapses configured to receive a first plurality of inputs and to generate therefrom a first plurality of outputs, and a first plurality of neurons configured to receive the first plurality of outputs. The first plurality of synapses includes a plurality of memory cells, wherein each memory cell includes spaced-apart source and drain regions formed in a semiconductor substrate with a channel region extending therebetween; a floating gate disposed over and insulated from a first portion of the channel region; and a non-floating gate disposed over and insulated from a second portion of the channel region. Each of the plurality of memory cells stores a weight value corresponding to a number of electrons on the floating gate. The plurality of memory cells multiply the first plurality of inputs by the stored weight values to generate the first plurality of outputs.

Non-Volatile Memory Cells

Non-volatile memories are well known. For example, U.S. Patent No. 5,029,130 (the "'130 patent"), which is incorporated herein by reference, discloses an array of split-gate non-volatile memory cells, which are a type of flash memory cell. Such a memory cell 210 is shown in FIG. 2. Each memory cell 210 includes a source region 14 and a drain region 16 formed in a semiconductor substrate 12, with a channel region 18 therebetween. A floating gate 20 is formed over and insulated from (and controls the conductivity of) a first portion of the channel region 18, and over a portion of the source region 14. A word line terminal 22 (which is typically coupled to a word line) has a first portion that is disposed over and insulated from (and controls the conductivity of) a second portion of the channel region 18, and a second portion that extends up and over the floating gate 20. The floating gate 20 and the word line terminal 22 are insulated from the substrate 12 by a gate oxide. A bit line 24 is coupled to the drain region 16.

Memory cell 210 is erased (where electrons are removed from the floating gate) by placing a high positive voltage on the word line terminal 22, which causes electrons on the floating gate 20 to tunnel through the intermediate insulation from the floating gate 20 to the word line terminal 22 via Fowler-Nordheim (FN) tunneling.

Memory cell 210 is programmed (where electrons are placed on the floating gate) by source side injection (SSI) with hot electrons, by placing a positive voltage on the word line terminal 22 and a positive voltage on the source region 14. Electron current flows from the drain region 16 toward the source region 14. The electrons accelerate and become heated when they reach the gap between the word line terminal 22 and the floating gate 20. Some of the heated electrons are injected through the gate oxide onto the floating gate 20 due to the attractive electrostatic force from the floating gate 20.

Memory cell 210 is read by placing positive read voltages on the drain region 16 and the word line terminal 22 (which turns on the portion of the channel region 18 under the word line terminal). If the floating gate 20 is positively charged (i.e., erased of electrons), then the portion of the channel region 18 under the floating gate 20 is turned on as well, and current flows across the channel region 18, which is sensed as the erased or "1" state. If the floating gate 20 is negatively charged (i.e., programmed with electrons), then the portion of the channel region under the floating gate 20 is mostly or entirely turned off, and current does not flow (or flows very little) across the channel region 18, which is sensed as the programmed or "0" state.

Table 1 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 210 for performing read, erase, and program operations:

Table 1: Operation of Flash Memory Cell 210 of FIG. 2

            WL         BL          SL
Read        2-3V       0.6-2V      0V
Erase       ~11-13V    0V          0V
Program     1-2V       10.5-3µA    9-10V

Other split-gate memory cell configurations, which are other types of flash memory cells, are known. For example, FIG. 3 depicts a four-gate memory cell 310 comprising a source region 14, a drain region 16, a floating gate 20 over a first portion of a channel region 18, a select gate 22 (typically coupled to a word line, WL) over a second portion of the channel region 18, a control gate 28 over the floating gate 20, and an erase gate 30 over the source region 14. This configuration is described in U.S. Patent No. 6,747,310, which is incorporated herein by reference for all purposes. Here, all gates are non-floating gates except the floating gate 20, meaning that they are electrically connected or connectable to a voltage source. Programming is performed by heated electrons from the channel region 18 injecting themselves onto the floating gate 20. Erasing is performed by electrons tunneling from the floating gate 20 to the erase gate 30.

Table 2 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 310 for performing read, erase, and program operations:

Table 2: Operation of Flash Memory Cell 310 of FIG. 3

            WL/SG       BL         CG        EG        SL
Read        1.0-2V      0.6-2V     0-2.6V    0-2.6V    0V
Erase       -0.5V/0V    0V         0V/-8V    8-12V     0V
Program     1V          0.1-1µA    8-11V     4.5-9V    4.5-5V

FIG. 4 depicts a three-gate memory cell 410, which is another type of flash memory cell. Memory cell 410 is identical to memory cell 310 of FIG. 3 except that memory cell 410 does not have a separate control gate. The erase operation (whereby erasing occurs through use of the erase gate) and the read operation are similar to those of FIG. 3, except that no control gate bias is applied. The programming operation is also done without the control gate bias, and as a result, a higher voltage must be applied on the source line during a program operation to compensate for the lack of control gate bias.

Table 3 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 410 for performing read, erase, and program operations:

Table 3: Operation of Flash Memory Cell 410 of FIG. 4

            WL/SG       BL         EG        SL
Read        0.7-2.2V    0.6-2V     0-2.6V    0V
Erase       -0.5V/0V    0V         11.5V     0V
Program     1V          0.2-3µA    4.5V      7-9V

FIG. 5 depicts a stacked-gate memory cell 510, which is another type of flash memory cell. Memory cell 510 is similar to memory cell 210 of FIG. 2, except that the floating gate 20 extends over the entire channel region 18, and the control gate 22 (which here is coupled to a word line) extends over the floating gate 20, separated from it by an insulating layer (not shown). Erasing is done by FN tunneling of electrons from the floating gate to the substrate. Programming is done by channel hot electron (CHE) injection at the region between the channel 18 and the drain region 16, with electrons flowing from the source region 14 toward the drain region 16. The read operation is similar to that of memory cell 210, but with a higher control gate voltage.

Table 4 depicts typical voltage ranges that can be applied to the terminals of memory cell 510 and to the substrate 12 for performing read, erase, and program operations:

Table 4: Operation of Flash Memory Cell 510 of FIG. 5

            CG               BL        SL     Substrate
Read        2-5V             0.6-2V    0V     0V
Erase       -8 to -10V/0V    FLT       FLT    8-10V/15-20V
Program     8-12V            3-5V      0V     0V

The methods and means described herein may apply to other non-volatile memory technologies such as, without limitation, FINFET split-gate flash or stacked-gate flash memory, NAND flash, SONOS (silicon-oxide-nitride-oxide-silicon, charge trap in nitride), MONOS (metal-oxide-nitride-oxide-silicon, metal charge trap in nitride), ReRAM (resistive RAM), PCM (phase change memory), MRAM (magnetic RAM), FeRAM (ferroelectric RAM), CT (charge trap) memory, CN (carbon-tube) memory, OTP (bi-level or multi-level one-time programmable), and CeRAM (correlated electron RAM).

To utilize memory arrays comprising one of the types of non-volatile memory cells described above in an artificial neural network, two modifications are made. First, the lines are configured so that each memory cell can be individually programmed, erased, and read without adversely affecting the memory state of other memory cells in the array, as further explained below. Second, continuous (analog) programming of the memory cells is provided.

Specifically, the memory state (i.e., the charge on the floating gate) of each memory cell in the array can be continuously changed from a fully erased state to a fully programmed state, and vice versa, independently and with minimal disturbance of other memory cells. This means that cell storage is effectively analog, or at the very least can store one of many discrete values (such as 16 or 64 different values), which allows very precise and individual tuning of all the memory cells in the memory array, and which makes the memory array well suited for storing and fine-tuning the synaptic weights of the neural network.

Neural Networks Employing Non-Volatile Memory Cell Arrays

FIG. 6 conceptually illustrates a non-limiting example of a neural network utilizing a non-volatile memory array of the present examples. This example uses the non-volatile memory array neural network for a facial recognition application, but any other appropriate application could be implemented using a non-volatile memory array based neural network.

S0 is the input layer, which for this example is a 32×32 pixel RGB image with 5-bit precision (i.e., three 32×32 pixel arrays, one for each color R, G, and B, each pixel having 5-bit precision). The synapses CB1 going from input layer S0 to layer C1 apply different sets of weights in some instances and shared weights in other instances, and scan the input image with 3×3 pixel overlapping filters (kernels), shifting the filter by 1 pixel (or more than 1 pixel, as dictated by the model). Specifically, the values for 9 pixels in a 3×3 portion of the image (referred to as a filter or kernel) are provided to the synapses CB1, where these 9 input values are multiplied by the appropriate weights and, after summing the outputs of that multiplication, a single output value is determined and provided by a first synapse of CB1 for generating a pixel of one of the feature maps of layer C1. The 3×3 filter is then shifted one pixel to the right within input layer S0 (i.e., adding the column of three pixels on the right and dropping the column of three pixels on the left), whereby the 9 pixel values in this newly positioned filter are provided to the synapses CB1, where they are multiplied by the same weights and a second single output value is determined by the associated synapse. This process is continued until the 3×3 filter has scanned the entire 32×32 pixel image of input layer S0, for all three colors and for all bits (precision values). The process is then repeated using different sets of weights to generate a different feature map of layer C1, until all the feature maps of layer C1 have been calculated.
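The scan described above can be illustrated in software. The following is a minimal sketch, assuming a single color plane, a square image, and plain floating-point weights; the function name `scan` and the sample image and kernel values are illustrative and do not come from the disclosure, which performs this operation in a memory array rather than in code.

```python
def scan(image, kernel, stride=1):
    """Slide a square kernel over a square 2-D image.

    Each output pixel is the sum of the kernel-sized patch multiplied
    element-wise by the same (shared) set of weights.
    """
    k = len(kernel)
    size = (len(image) - k) // stride + 1
    feature_map = []
    for r in range(size):
        row = []
        for c in range(size):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical 32x32 single-plane image with 5-bit pixel values (0..31).
image = [[(r + c) % 32 for c in range(32)] for r in range(32)]
# Hypothetical 3x3 weight set that responds to horizontal gradients.
kernel = [[1.0, 0.0, -1.0] for _ in range(3)]

fmap = scan(image, kernel)  # one 30x30 feature map
```

Repeating the call with a different `kernel` would correspond to generating another feature map of layer C1 from a different set of weights.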

In layer C1, in the present example, there are 16 feature maps, with 30×30 pixels each. Each pixel is a new feature pixel extracted from multiplying the inputs and the kernel, so each feature map is a two-dimensional array, and in this example layer C1 constitutes 16 layers of two-dimensional arrays (keeping in mind that the layers and arrays referenced herein are logical relationships, not necessarily physical relationships; that is, the arrays are not necessarily oriented as physical two-dimensional arrays). Each of the 16 feature maps in layer C1 is generated by one of 16 different sets of synapse weights applied to the filter scans. The C1 feature maps could all be directed to different aspects of the same image feature, such as boundary identification. For example, the first map (generated using a first weight set, shared for all scans used to generate this first map) could identify circular edges, the second map (generated using a second weight set different from the first weight set) could identify rectangular edges or the aspect ratio of certain features, and so on.

An activation function P1 (pooling) is applied before going from layer C1 to layer S1, which pools values from consecutive, non-overlapping 2×2 regions in each feature map. The purpose of the pooling function P1 is to average out the nearby locations (a max function can also be used), to reduce the dependence on edge location, for example, and to reduce the data size before going to the next stage. In layer S1, there are 16 15×15 feature maps (i.e., 16 different arrays of 15×15 pixels each). The synapses CB2 going from layer S1 to layer C2 scan the maps in S1 with 4×4 filters, with a filter shift of 1 pixel. In layer C2, there are 22 12×12 feature maps. An activation function P2 (pooling) is applied before going from layer C2 to layer S2, which pools values from consecutive, non-overlapping 2×2 regions in each feature map. In layer S2, there are 22 6×6 feature maps. An activation function (pooling) is applied at the synapses CB3 going from layer S2 to layer C3, where every neuron in layer C3 connects to every map in layer S2 via a respective synapse of CB3. In layer C3, there are 64 neurons. The synapses CB4 going from layer C3 to the output layer S3 fully connect C3 to S3, i.e., every neuron in layer C3 is connected to every neuron in layer S3. The output at S3 includes 10 neurons, where the highest output neuron determines the class. This output could, for example, indicate an identification or classification of the contents of the original image.
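The feature map sizes quoted above follow from simple arithmetic on the filter and pooling window sizes. The sketch below reproduces that arithmetic, assuming unpadded scans and non-overlapping pooling windows; the helper names `conv_out` and `pool_out` are illustrative only.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded scan with a square kernel."""
    return (size - kernel) // stride + 1


def pool_out(size, window=2):
    """Output width after pooling non-overlapping square windows."""
    return size // window


s0 = 32                 # input layer S0: 32x32 pixels
c1 = conv_out(s0, 3)    # CB1: 3x3 filter, 1-pixel shift
s1 = pool_out(c1)       # P1: 2x2 pooling
c2 = conv_out(s1, 4)    # CB2: 4x4 filter, 1-pixel shift
s2 = pool_out(c2)       # P2: 2x2 pooling

print(c1, s1, c2, s2)
```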

Each layer of synapses is implemented using an array, or a portion of an array, of non-volatile memory cells.

FIG. 7 is a block diagram of an array that can be used for that purpose. Vector-by-matrix multiplication (VMM) array 32 includes non-volatile memory cells and is utilized as the synapses (such as CB1, CB2, CB3, and CB4 in FIG. 6) between one layer and the next layer. Specifically, VMM array 32 includes an array of non-volatile memory cells 33, an erase gate and word line gate decoder 34, a control gate decoder 35, a bit line decoder 36, and a source line decoder 37, which decode the respective inputs for the non-volatile memory cell array 33. Input to VMM array 32 can be from the erase gate and word line gate decoder 34 or from the control gate decoder 35. Source line decoder 37 in this example also decodes the output of the non-volatile memory cell array 33. Alternatively, bit line decoder 36 can decode the output of the non-volatile memory cell array 33.

Non-volatile memory cell array 33 serves two purposes. First, it stores the weights that will be used by the VMM array 32. Second, it effectively multiplies the inputs by the weights stored in the array and adds the products up per output line (source line or bit line) to produce the output, which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, the non-volatile memory cell array 33 removes the need for separate multiplication and addition logic circuits, and it is also power efficient because the computation takes place in situ in the memory.
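The multiply-and-add behavior described above is, numerically, a vector-by-matrix product. The following is a minimal software sketch of that arithmetic, assuming ideal cells with no noise or nonlinearity; the function name `vmm` and the sample values are illustrative, and in the disclosed array the sum arises from currents combining on a shared output line rather than from a loop.

```python
def vmm(inputs, weights):
    """Vector-by-matrix multiplication.

    weights[i][j] is the weight stored in the cell that connects
    input line i to output line j. Each output line accumulates the
    products contributed by every cell attached to it.
    """
    n_out = len(weights[0])
    outputs = [0.0] * n_out
    for i, x in enumerate(inputs):
        for j in range(n_out):
            outputs[j] += x * weights[i][j]
    return outputs


# Hypothetical example: three input lines, two output lines.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, 1.0],
    [0.25, 0.0],
    [1.0, 2.0],
]
outputs = vmm(inputs, weights)
```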

The output of the non-volatile memory cell array 33 is supplied to a differential summer 38 (such as a summing operational amplifier or a summing current mirror), which sums up the outputs of the non-volatile memory cell array 33 to create a single value for that convolution. The differential summer 38 is arranged to perform summation of both positive weights and negative weights.

The summed output values of the differential summer 38 are then supplied to an activation function block 39, which rectifies the output. The activation function block 39 may provide a sigmoid, tanh, or ReLU function. The rectified output values of the activation function block 39 become an element of a feature map of the next layer (e.g., C1 in FIG. 6), and are then applied to the next synapse to produce the next feature map layer or the final layer. Therefore, in this example, the non-volatile memory cell array 33 constitutes a plurality of synapses (which receive their inputs from the prior layer of neurons or from an input layer such as an image database), and the summing operational amplifier 38 and the activation function block 39 constitute a plurality of neurons.
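A neuron as described above, a differential sum followed by an activation function, can be sketched as follows. This sketch assumes that positive and negative weights are carried on separate output lines and that the summer subtracts one group from the other; that pairing scheme, the function names, and the sample values are assumptions made for illustration, not details stated in the text.

```python
import math


def differential_sum(pos_outputs, neg_outputs):
    """Sum the positive-weight outputs and subtract the negative-weight ones."""
    return sum(pos_outputs) - sum(neg_outputs)


def activate(x, kind="relu"):
    """Apply one of the activation functions named in the text."""
    if kind == "relu":
        return max(0.0, x)
    if kind == "sigmoid":
        return 1.0 / (1.0 + math.exp(-x))
    if kind == "tanh":
        return math.tanh(x)
    raise ValueError(f"unknown activation: {kind}")


def neuron(pos_outputs, neg_outputs, kind="relu"):
    """Differential summer followed by an activation function block."""
    return activate(differential_sum(pos_outputs, neg_outputs), kind)


# Hypothetical array outputs for one convolution position.
y_on = neuron([1.0, 2.0], [0.5])   # net positive sum passes through ReLU
y_off = neuron([1.0], [3.0])       # net negative sum is clipped to zero
```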

The input to VMM array 32 in FIG. 7 (WLx, EGx, CGx, and optionally BLx and SLx) can be analog levels, binary levels, or digital bits (in which case a DAC is provided to convert the digital bits to the appropriate input analog level), and the output can be analog levels, binary levels, or digital bits (in which case an output ADC is provided to convert the output analog level into digital bits).

FIG. 8 is a block diagram depicting the usage of numerous layers of VMM arrays 32, here labeled as VMM arrays 32a, 32b, 32c, 32d, and 32e. As shown in FIG. 8, the input, denoted Inputx, is converted from digital to analog by a digital-to-analog converter 31 and provided to input VMM array 32a. The converted analog inputs can be voltages or currents. The input D/A conversion for the first layer can be done by using a function or a LUT (look-up table) that maps the inputs Inputx to appropriate analog levels for the matrix multiplier of input VMM array 32a. The input conversion can also be done by an analog-to-analog (A/A) converter that converts an external analog input into a mapped analog input to the input VMM array 32a.

The output generated by input VMM array 32a is provided as an input to the next VMM array (hidden level 1) 32b, which in turn generates an output that is provided as an input to the next VMM array (hidden level 2) 32c, and so on. The various layers of VMM array 32 function as different layers of synapses and neurons of a convolutional neural network (CNN). Each VMM array 32a, 32b, 32c, 32d, and 32e can be a stand-alone physical non-volatile memory array, or multiple VMM arrays could utilize different portions of the same physical non-volatile memory array, or multiple VMM arrays could utilize overlapping portions of the same physical non-volatile memory array. The example shown in FIG. 8 contains five layers (32a, 32b, 32c, 32d, 32e): one input layer (32a), two hidden layers (32b, 32c), and two fully connected layers (32d, 32e). One of ordinary skill in the art will appreciate that this is merely exemplary and that a system could instead comprise more than two hidden layers and more than two fully connected layers.

Vector-by-Matrix Multiplication (VMM) Arrays

FIG. 9 depicts neuron VMM array 900, which is particularly suited for memory cells 310 as shown in FIG. 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 900 comprises a memory array 901 of non-volatile memory cells and a reference array 902 (at the top of the array) of non-volatile reference memory cells. Alternatively, another reference array can be placed at the bottom.

In VMM array 900, control gate lines, such as control gate line 903, run in a vertical direction (hence reference array 902, in the row direction, is orthogonal to control gate line 903), and erase gate lines, such as erase gate line 904, run in a horizontal direction. Here, the inputs to VMM array 900 are provided on the control gate lines (CG0, CG1, CG2, CG3), and the output of VMM array 900 emerges on the source lines (SL0, SL1). In one embodiment, only even rows are used, and in another embodiment, only odd rows are used. The current placed on each source line (SL0, SL1) performs a summing function of all the currents from the memory cells connected to that particular source line.

As described herein for neural networks, the non-volatile memory cells of VMM array 900 (i.e., the memory cells 310 of VMM array 900) are preferably configured to operate in a sub-threshold region.

The non-volatile reference memory cells and the non-volatile memory cells described herein are biased in weak inversion (the sub-threshold region):

$$I_{ds} = I_o \, e^{(V_g - V_{th})/(n V_t)} = w \, I_o \, e^{V_g/(n V_t)}, \qquad \text{where } w = e^{-V_{th}/(n V_t)}$$

where $I_{ds}$ is the drain-to-source current; $V_g$ is the gate voltage on the memory cell; $V_{th}$ is the threshold voltage of the memory cell; $V_t$ is the thermal voltage, $V_t = kT/q$, with $k$ the Boltzmann constant, $T$ the temperature in kelvin, and $q$ the electronic charge; $n$ is the slope factor, $n = 1 + (C_{dep}/C_{ox})$, with $C_{dep}$ the capacitance of the depletion layer and $C_{ox}$ the capacitance of the gate oxide layer; and $I_o$ is the memory cell current at a gate voltage equal to the threshold voltage. $I_o$ is proportional to $(W_t/L) \, u \, C_{ox} \, (n-1) \, V_t^2$, where $u$ is the carrier mobility and $W_t$ and $L$ are the width and length, respectively, of the memory cell.
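To make the sub-threshold relationship concrete, the sketch below evaluates the cell current both directly and in the factored form that uses the weight term w. The function names and the numerical values chosen for Io, n, Vg, and Vth are assumptions for illustration only; they are not device parameters taken from the disclosure.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_k):
    """Vt = k*T/q, in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_current(vg, vth, io, n, temp_k=300.0):
    """Ids = Io * exp((Vg - Vth) / (n * Vt))."""
    vt = thermal_voltage(temp_k)
    return io * math.exp((vg - vth) / (n * vt))


def weight_term(vth, n, temp_k=300.0):
    """w = exp(-Vth / (n * Vt))."""
    vt = thermal_voltage(temp_k)
    return math.exp(-vth / (n * vt))


# Hypothetical operating point.
io, n, vg, vth = 1e-7, 1.5, 0.6, 0.8
vt = thermal_voltage(300.0)

direct = subthreshold_current(vg, vth, io, n)
factored = weight_term(vth, n) * io * math.exp(vg / (n * vt))
```

Both forms give the same current, which is why a cell's stored threshold voltage can be treated as a multiplicative weight on an input-dependent exponential term.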

For an I-to-V log converter that uses a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor to convert an input current into an input voltage:

$$V_g = n V_t \ln\left[\frac{I_{ds}}{w_p I_o}\right]$$

Here, $w_p$ is the $w$ of the reference or peripheral memory cell.

For a memory array used as a vector-by-matrix multiplication (VMM) array with a current input, the output current is:

$$I_{out} = w_a \, I_o \, e^{V_g/(n V_t)}, \quad \text{namely} \quad I_{out} = (w_a/w_p) \, I_{in} = W \, I_{in}$$

$$W = e^{(V_{thp} - V_{tha})/(n V_t)}$$

Here, $w_a$ is the $w$ of each memory cell in the memory array, $V_{thp}$ is the effective threshold voltage of the peripheral memory cell, and $V_{tha}$ is the effective threshold voltage of the main (data) memory cell. Note that the threshold voltage of a transistor is a function of the substrate body bias voltage, and the substrate body bias voltage, denoted $V_{sb}$, can be modulated to compensate for various conditions, such as temperature. The threshold voltage $V_{th}$ can be expressed as:

$$V_{th} = V_{th0} + \gamma \left( \sqrt{|V_{sb} - 2\varphi_F|} - \sqrt{|2\varphi_F|} \right)$$

where $V_{th0}$ is the threshold voltage with zero substrate bias, $\varphi_F$ is the surface potential, and $\gamma$ is the body effect parameter.
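The relationship between the log converter and the array cell can be checked numerically. The following is a minimal sketch, assuming illustrative values for the slope factor, Io, and the two threshold voltages (none of these numbers come from this description). It feeds a current through a diode-connected peripheral cell, applies the resulting gate voltage to an array cell, and compares the result with W*Iin.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """Vt = k*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def subthreshold_current(vg, vth, io, n, vt):
    """Ids = Io * exp((Vg - Vth) / (n*Vt))."""
    return io * math.exp((vg - vth) / (n * vt))


def log_converter_voltage(i_in, vthp, io, n, vt):
    """Vg = n*Vt*ln(Iin / (wp*Io)), with wp = exp(-Vthp / (n*Vt))."""
    wp = math.exp(-vthp / (n * vt))
    return n * vt * math.log(i_in / (wp * io))


# Illustrative values only (assumptions, not taken from the description).
vt = thermal_voltage(300.0)
n, io = 1.5, 1e-7          # slope factor and Io in amperes
vthp, vtha = 0.70, 0.75    # peripheral and array cell thresholds in volts
i_in = 2e-9                # input current in amperes

vg = log_converter_voltage(i_in, vthp, io, n, vt)    # I-to-V log conversion
i_out = subthreshold_current(vg, vtha, io, n, vt)    # array cell current
weight = math.exp((vthp - vtha) / (n * vt))          # W from the formula above
```

With these assumed values, `i_out` agrees with `weight * i_in` to floating-point precision, which is the sense in which the sub-threshold cell multiplies its input current by a stored weight.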

A word line or control gate can be used as the input to the memory cell for the input voltage.

Alternatively, the flash memory cells of the VMM arrays described herein can be configured to operate in the linear region:

$$I_{ds} = \beta \, (V_{gs} - V_{th}) \, V_{ds}, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})$$

meaning that the weight W in the linear region is proportional to (Vgs-Vth).

A word line, control gate, bit line, or source line can be used as the input to a memory cell operated in the linear region. The bit line or source line can be used as the output of the memory cell.

For an I-to-V linear converter, a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor operating in the linear region can be used to linearly convert an input/output current into an input/output voltage.

Alternatively, the memory cells of the VMM arrays described herein can be configured to operate in the saturation region:

$$I_{ds} = \tfrac{1}{2} \, \beta \, (V_{gs} - V_{th})^2, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})^2$$

meaning that the weight W is proportional to (Vgs-Vth)².

A word line, control gate, or erase gate can be used as the input to a memory cell operated in the saturation region. The bit line or source line can be used as the output for the output neuron.

Alternatively, the memory cells of the VMM arrays described herein can be used in all regions or a combination thereof (sub-threshold, linear, or saturation) for each layer or for multiple layers of a neural network.

Other embodiments of VMM array 32 of FIG. 7 are described in U.S. Patent No. 10,748,630, which is incorporated herein by reference. As described in that patent, a source line or a bit line can be used as the neuron output (current summation output).

FIG. 10 depicts neuron VMM array 1000, which is particularly suited for memory cells 210 as shown in FIG. 2 and is utilized as the synapses between an input layer and the next layer. VMM array 1000 comprises memory array 1003 of non-volatile memory cells, reference array 1001 of first non-volatile reference memory cells, and reference array 1002 of second non-volatile reference memory cells. Reference arrays 1001 and 1002, arranged in the column direction of the array, convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs WL0, WL1, WL2, and WL3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexers 1014 (only partially depicted) with current inputs flowing into them. The reference cells are tuned (e.g., programmed) to target reference levels. The target reference levels are provided by a reference mini-array matrix (not shown).

Memory array 1003 serves two purposes. First, it stores, on its individual memory cells, the weights that will be used by VMM array 1000. Second, memory array 1003 effectively multiplies the inputs (i.e., the current inputs provided to terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1001 and 1002 convert into input voltages supplied to word lines WL0, WL1, WL2, and WL3) by the weights stored in memory array 1003, and then adds all the results (memory cell currents) to produce the output on the respective bit lines (BL0-BLN), which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, memory array 1003 removes the need for separate multiplication and addition logic circuits and is also power efficient. Here, the voltage inputs are provided on word lines WL0, WL1, WL2, and WL3, and the output emerges on bit lines BL0-BLN during a read (inference) operation. The current on each of bit lines BL0-BLN is the sum of the currents from all non-volatile memory cells connected to that particular bit line.

Table 5 depicts operating voltages and currents for VMM array 1000. The columns in the table indicate the voltages placed on the word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 5: Operation of VMM Array 1000 of FIG. 10

| | WL | WL-unsel | BL | BL-unsel | SL | SL-unsel |
|---|---|---|---|---|---|---|
| Read | 1-3.5V | -0.5V/0V | 0.6-2V (Ineuron) | 0.6V-2V/0V | 0V | 0V |
| Erase | ~5-13V | 0V | 0V | 0V | 0V | 0V |
| Program | 1-2V | -0.5V/0V | 0.1-3 µA | Vinh ~2.5V | 4-10V | 0-1V/FLT |

FIG. 11 depicts neuron VMM array 1100, which is particularly suited for memory cells 210 as shown in FIG. 2 and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1100 comprises memory array 1103 of non-volatile memory cells, reference array 1101 of first non-volatile reference memory cells, and reference array 1102 of second non-volatile reference memory cells. Reference arrays 1101 and 1102 run in the row direction of VMM array 1100. VMM array 1100 is similar to VMM array 1000, except that in VMM array 1100 the word lines run in the vertical direction. Here, the inputs are provided on the word lines (WLA0, WLB0, WLA1, WLB1, WLA2, WLB2, WLA3, WLB3), and the output emerges on the source lines (SL0, SL1) during a read operation. The current on each source line is the sum of the currents from all memory cells connected to that particular source line.

Table 6 depicts operating voltages and currents for VMM array 1100. The columns in the table indicate the voltages placed on the word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 6: Operation of VMM Array 1100 of FIG. 11

| | WL | WL-unsel | BL | BL-unsel | SL | SL-unsel |
|---|---|---|---|---|---|---|
| Read | 1-3.5V | -0.5V/0V | 0.6-2V | 0.6V-2V/0V | ~0.3-1V (Ineuron) | 0V |
| Erase | ~5-13V | 0V | 0V | 0V | 0V | SL-inhibit (~4-8V) |
| Program | 1-2V | -0.5V/0V | 0.1-3 µA | Vinh ~2.5V | 4-10V | 0-1V/FLT |

FIG. 12 depicts neuron VMM array 1200, which is particularly suited for memory cells 310 as shown in FIG. 3 and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1200 comprises memory array 1203 of non-volatile memory cells, reference array 1201 of first non-volatile reference memory cells, and reference array 1202 of second non-volatile reference memory cells. Reference arrays 1201 and 1202 convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs CG0, CG1, CG2, and CG3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexers 1212 (only partially shown) with current inputs flowing into them through BLR0, BLR1, BLR2, and BLR3. Multiplexers 1212 each include a respective multiplexer 1205 and a cascoding transistor 1204 to ensure a constant voltage on the bit line (such as BLR0) of each of the first and second non-volatile reference memory cells during a read operation. The reference cells are tuned to target reference levels.

Memory array 1203 serves two purposes. First, it stores the weights that will be used by VMM array 1200. Second, memory array 1203 effectively multiplies the inputs (the current inputs provided to terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1201 and 1202 convert into input voltages supplied to control gates CG0, CG1, CG2, and CG3) by the weights stored in the memory array, and then adds all the results (cell currents) to produce the output, which appears on BL0-BLN and will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, the memory array removes the need for separate multiplication and addition logic circuits and is also power efficient. Here, the inputs are provided on the control gate lines (CG0, CG1, CG2, and CG3), and the output emerges on the bit lines (BL0-BLN) during a read operation. The current on each bit line is the sum of the currents from all memory cells connected to that particular bit line.

VMM array 1200 implements uni-directional tuning for the non-volatile memory cells in memory array 1203. That is, each non-volatile memory cell is erased and then partially programmed until the desired charge on the floating gate is reached. If too much charge is placed on the floating gate (such that the wrong value is stored in the cell), the cell is erased and the sequence of partial programming operations starts over. As shown, two rows sharing the same erase gate (such as EG0 or EG1) are erased together (known as a page erase), and thereafter each cell is partially programmed until the desired charge on the floating gate is reached.
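The erase, partial-program, and restart sequence can be sketched as a simple loop. The `ToyCell` class, its abstract `charge` value, and the fixed pulse size are hypothetical stand-ins for a real cell and its verify circuitry. Halving the pulse size after an overshoot is an assumption added so that this deterministic toy converges; the description itself only says that the cell is erased and the sequence restarts. The sketch also erases one cell at a time, whereas VMM array 1200 erases a page (two rows) together.

```python
class ToyCell:
    """Hypothetical cell model; 'charge' is an abstract floating-gate charge."""

    def __init__(self, step):
        self.charge = 0.0
        self.step = step  # charge added by one partial-program pulse

    def erase(self):
        self.charge = 0.0

    def partial_program(self):
        self.charge += self.step


def tune_unidirectional(cell, target, tolerance, max_restarts=5):
    """Erase, then program in small pulses; on overshoot, erase and restart."""
    for _ in range(max_restarts):
        cell.erase()
        while cell.charge < target - tolerance:
            cell.partial_program()
        if cell.charge <= target + tolerance:
            return True          # verified within tolerance
        cell.step /= 2           # assumption: use a finer pulse after overshoot
    return False


cell = ToyCell(step=0.4)
tuned = tune_unidirectional(cell, target=1.0, tolerance=0.05)
```

In this run the first pass overshoots (0.4, 0.8, 1.2), the cell is erased, and the second pass with the finer pulse stops within tolerance of the target.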

Table 7 depicts operating voltages and currents for VMM array 1200. The columns in the table indicate the voltages placed on the word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, control gates for selected cells, control gates for unselected cells in the same sector as the selected cells, control gates for unselected cells in a different sector than the selected cells, erase gates for selected cells, erase gates for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 7: Operation of VMM Array 1200 of FIG. 12

| | WL | WL-unsel | BL | BL-unsel | CG | CG-unsel (same sector) | CG-unsel | EG | EG-unsel | SL | SL-unsel |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2V | -0.5V/0V | 0.6-2V (Ineuron) | 0V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0V | 0V |
| Erase | 0V | 0V | 0V | 0V | 0V | 0-2.6V | 0-2.6V | 5-12V | 0-2.6V | 0V | 0V |
| Program | 0.7-1V | -0.5V/0V | 0.1-1 µA | Vinh (1-2V) | 4-11V | 0-2.6V | 0-2.6V | 4.5-5V | 0-2.6V | 4.5-5V | 0-1V |

FIG. 13 depicts neuron VMM array 1300, which is particularly suited for memory cells 310 as shown in FIG. 3 and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1300 comprises memory array 1303 of non-volatile memory cells, reference array 1301 of first non-volatile reference memory cells, and reference array 1302 of second non-volatile reference memory cells. EG lines EGR0, EG0, EG1, and EGR1 run vertically, while CG lines CG0, CG1, CG2, and CG3 and SL lines WL0, WL1, WL2, and WL3 run horizontally. VMM array 1300 is similar to VMM array 1200, except that VMM array 1300 implements bi-directional tuning, in which each individual cell can be completely erased, partially programmed, and partially erased as needed to reach the desired amount of charge on the floating gate, owing to the use of separate EG lines. As shown, reference arrays 1301 and 1302 convert input currents at terminals BLR0, BLR1, BLR2, and BLR3 into control gate voltages CG0, CG1, CG2, and CG3 (through the action of diode-connected reference cells through multiplexers 1314) to be applied to the memory cells in the row direction. The current outputs (neurons) are on bit lines BL0-BLN, where each bit line sums the currents from all non-volatile memory cells connected to that particular bit line.
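Bi-directional tuning differs from the erase-and-restart approach in that an overshoot can be corrected with a partial erase. The following is a minimal sketch of that idea, using an abstract charge value and fixed program and erase pulse sizes that are assumptions for illustration only.

```python
def tune_bidirectional(charge, target, tolerance,
                       program_step, erase_step, max_pulses=100):
    """Move the abstract charge toward the target from either side.

    A partial-program pulse raises the charge; a partial-erase pulse lowers it.
    Returns the final charge and the number of pulses applied.
    """
    pulses = 0
    while abs(charge - target) > tolerance and pulses < max_pulses:
        if charge < target:
            charge += program_step   # partial program
        else:
            charge -= erase_step     # partial erase corrects an overshoot
        pulses += 1
    return charge, pulses


final_charge, pulse_count = tune_bidirectional(
    charge=0.0, target=1.0, tolerance=0.05,
    program_step=0.4, erase_step=0.1)
```

With these assumed step sizes the cell is programmed three times (overshooting to about 1.2) and then partially erased twice to land on the target, without a full erase in between.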

Table 8 depicts operating voltages and currents for VMM array 1300. The columns in the table indicate the voltages placed on the word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, control gates for selected cells, control gates for unselected cells in the same sector as the selected cells, control gates for unselected cells in a different sector than the selected cells, erase gates for selected cells, erase gates for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 8: Operation of VMM Array 1300 of FIG. 13

| | WL | WL-unsel | BL | BL-unsel | CG | CG-unsel (same sector) | CG-unsel | EG | EG-unsel | SL | SL-unsel |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2V | -0.5V/0V | 0.6-2V (Ineuron) | 0V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0V | 0V |
| Erase | 0V | 0V | 0V | 0V | 0V | 4-9V | 0-2.6V | 5-12V | 0-2.6V | 0V | 0V |
| Program | 0.7-1V | -0.5V/0V | 0.1-1 µA | Vinh (1-2V) | 4-11V | 0-2.6V | 0-2.6V | 4.5-5V | 0-2.6V | 4.5-5V | 0-1V |

FIG. 22 depicts neuron VMM array 2200, which is particularly suited for memory cells 210 as shown in FIG. 2 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In VMM array 2200, the inputs INPUT0, ..., INPUTN are received on bit lines BL0, ..., BLN, respectively, and the outputs OUTPUT1, OUTPUT2, OUTPUT3, and OUTPUT4 are generated on source lines SL0, SL1, SL2, and SL3, respectively.

FIG. 23 depicts neuron VMM array 2300, which is particularly suited for memory cells 210 as shown in FIG. 2 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, INPUT1, INPUT2, and INPUT3 are received on source lines SL0, SL1, SL2, and SL3, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

FIG. 24 depicts neuron VMM array 2400, which is particularly suited for memory cells 210 as shown in FIG. 2 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

FIG. 25 depicts neuron VMM array 2500, which is particularly suited for memory cells 310 as shown in FIG. 3 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

FIG. 26 depicts neuron VMM array 2600, which is particularly suited for memory cells 410 as shown in FIG. 4 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTN are received on vertical control gate lines CG0, ..., CGN, respectively, and the outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

FIG. 27 depicts neuron VMM array 2700, which is particularly suited for memory cells 410 as shown in FIG. 4 and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTN are received on the gates of bit line control gates 2701-1, 2701-2, ..., 2701-(N-1), and 2701-N, respectively, which are coupled to bit lines BL0, ..., BLN, respectively. Exemplary outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

FIG. 28 depicts neuron VMM array 2800, which is particularly suited for memory cells 310 as shown in FIG. 3, memory cells 510 as shown in FIG. 5, and memory cells 710 as shown in FIG. 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN, respectively.

FIG. 29 depicts neuron VMM array 2900, which is particularly suited for memory cells 310 as shown in FIG. 3, memory cells 510 as shown in FIG. 5, and memory cells 710 as shown in FIG. 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM, and the outputs OUTPUT0, ..., OUTPUTN are generated on vertical source lines SL0, ..., SLN, respectively, where each source line SLi is coupled to the source lines of all memory cells in column i.

FIG. 30 depicts neuron VMM array 3000, which is particularly suited for memory cells 310 as shown in FIG. 3, memory cells 510 as shown in FIG. 5, and memory cells 710 as shown in FIG. 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM, and the outputs OUTPUT0, ..., OUTPUTN are generated on vertical bit lines BL0, ..., BLN, respectively, where each bit line BLi is coupled to the bit lines of all memory cells in column i.

Long Short-Term Memory

The prior art includes a concept known as long short-term memory (LSTM). LSTM cells are often used in neural networks. LSTM allows a neural network to remember information over predetermined arbitrary time intervals and to use that information in subsequent operations. A conventional LSTM cell comprises a cell, an input gate, an output gate, and a forget gate. The three gates regulate the flow of information into and out of the cell and the time interval over which the information is remembered in the LSTM. VMMs are particularly useful in LSTM cells.

FIG. 14 depicts an exemplary LSTM 1400. LSTM 1400 in this example comprises cells 1401, 1402, 1403, and 1404. Cell 1401 receives input vector x0 and generates output vector h0 and cell state vector c0. Cell 1402 receives input vector x1, the output vector (hidden state) h0 from cell 1401, and cell state c0 from cell 1401, and generates output vector h1 and cell state vector c1. Cell 1403 receives input vector x2, the output vector (hidden state) h1 from cell 1402, and cell state c1 from cell 1402, and generates output vector h2 and cell state vector c2. Cell 1404 receives input vector x3, the output vector (hidden state) h2 from cell 1403, and cell state c2 from cell 1403, and generates output vector h3. Additional cells can be used, and an LSTM with four cells is merely an example.

FIG. 15 depicts an exemplary implementation of an LSTM cell 1500, which can be used for cells 1401, 1402, 1403, and 1404 in FIG. 14. LSTM cell 1500 receives input vector x(t), cell state vector c(t-1) from a preceding cell, and output vector h(t-1) from a preceding cell, and generates cell state vector c(t) and output vector h(t).

LSTM cell 1500 comprises sigmoid function devices 1501, 1502, and 1503, each of which applies a number between 0 and 1 to control how much of each component in the input vector is allowed through to the output vector. LSTM cell 1500 also comprises tanh devices 1504 and 1505 to apply a hyperbolic tangent function to an input vector, multiplier devices 1506, 1507, and 1508 to multiply two vectors together, and addition device 1509 to add two vectors together. Output vector h(t) can be provided to the next LSTM cell in the system, or it can be accessed for other purposes.
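The arrangement of three sigmoid devices, two tanh devices, three multipliers, and one adder corresponds to the standard LSTM update. The sketch below uses that textbook formulation in plain Python; the weight matrices, the omission of bias terms, and the function names are assumptions for illustration and are not taken from the figures. The names f, i, u, and o follow the quantities f(t), i(t), u(t), and o(t) used in this description.

```python
import math


def sigmoid_vec(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh_vec(v):
    return [math.tanh(x) for x in v]


def matvec(matrix, vector):
    """Matrix-vector product; in hardware this is the job of a VMM array."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lstm_cell(x, h_prev, c_prev, w_f, w_i, w_u, w_o):
    """One LSTM step: returns (h, c) from x(t), h(t-1), c(t-1)."""
    z = h_prev + x                            # concatenate h(t-1) and x(t)
    f = sigmoid_vec(matvec(w_f, z))           # forget gate (sigmoid)
    i = sigmoid_vec(matvec(w_i, z))           # input gate (sigmoid)
    u = tanh_vec(matvec(w_u, z))              # candidate values (tanh)
    o = sigmoid_vec(matvec(w_o, z))           # output gate (sigmoid)
    c = [fc * cp + ic * uc                    # two multipliers and the adder
         for fc, cp, ic, uc in zip(f, c_prev, i, u)]
    h = [oc * tc for oc, tc in zip(o, tanh_vec(c))]  # second tanh, third multiplier
    return h, c


# With all-zero weights every gate outputs 0.5 and the candidate is 0.
zero = [[0.0, 0.0, 0.0]]
h, c = lstm_cell(x=[1.0, -1.0], h_prev=[0.0], c_prev=[2.0],
                 w_f=zero, w_i=zero, w_u=zero, w_o=zero)
```

Each `matvec` call followed by an activation is the part that a VMM array and an activation function block would perform in an analog implementation.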

FIG. 16 depicts an LSTM cell 1600, which is an example of an implementation of LSTM cell 1500. For the reader's convenience, the same numbering from LSTM cell 1500 is used in LSTM cell 1600. Sigmoid function devices 1501, 1502, and 1503 and tanh device 1504 each comprise multiple VMM arrays 1601 and activation function blocks 1602. Thus, it can be seen that VMM arrays are particularly useful in LSTM cells used in certain neural network systems. The multiplier devices 1506, 1507, and 1508 and the addition device 1509 are implemented in a digital manner or in an analog manner. The activation function blocks 1602 can be implemented in a digital manner or in an analog manner.

An alternative to LSTM cell 1600 (and another example of an implementation of LSTM cell 1500) is shown in FIG. 17. In FIG. 17, sigmoid function devices 1501, 1502, and 1503 and tanh device 1504 share the same physical hardware (VMM arrays 1701 and activation function block 1702) in a time-multiplexed fashion. LSTM cell 1700 also comprises: multiplier device 1703 to multiply two vectors together; addition device 1708 to add two vectors together; tanh device 1505 (which comprises activation function block 1702); register 1707 to store the value i(t) when i(t) is output from sigmoid function block 1702; register 1704 to store the value f(t)*c(t-1) when that value is output from multiplier device 1703 through multiplexer 1710; register 1705 to store the value i(t)*u(t) when that value is output from multiplier device 1703 through multiplexer 1710; register 1706 to store the value o(t)*c~(t) when that value is output from multiplier device 1703 through multiplexer 1710; and multiplexer 1709.

Whereas LSTM cell 1600 contains multiple sets of VMM arrays 1601 and respective activation function blocks 1602, LSTM cell 1700 contains only one set of VMM arrays 1701 and activation function block 1702, which are used to represent multiple layers in the embodiment of LSTM cell 1700. LSTM cell 1700 will require less space than LSTM cell 1600, as LSTM cell 1700 requires only 1/4 as much space for VMMs and activation function blocks as LSTM cell 1600.
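The space saving comes from reusing one set of hardware four times in sequence and holding intermediate results in registers. The sketch below illustrates that idea with a hypothetical `SharedVmmBlock` class that counts how often it is used. The order in which the gates are evaluated, the register names, and the all-zero weights are assumptions for illustration, not details taken from FIG. 17.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedVmmBlock:
    """Hypothetical stand-in for one shared VMM array set plus activation block."""

    def __init__(self):
        self.passes = 0  # number of sequential uses of the shared hardware

    def run(self, weights, vector, activation):
        self.passes += 1
        sums = [sum(w * v for w, v in zip(row, vector)) for row in weights]
        return [activation(s) for s in sums]


def lstm_step_time_multiplexed(block, x, h_prev, c_prev, weights):
    """One LSTM step using a single shared block, one gate per time slot."""
    z = h_prev + x                                   # concatenate h(t-1) and x(t)
    registers = {}                                   # intermediate-value storage
    f = block.run(weights["f"], z, sigmoid)
    registers["f*c"] = [a * b for a, b in zip(f, c_prev)]
    registers["i"] = block.run(weights["i"], z, sigmoid)
    u = block.run(weights["u"], z, math.tanh)
    registers["i*u"] = [a * b for a, b in zip(registers["i"], u)]
    o = block.run(weights["o"], z, sigmoid)
    c = [a + b for a, b in zip(registers["f*c"], registers["i*u"])]
    h = [a * math.tanh(b) for a, b in zip(o, c)]
    return h, c


zero = [[0.0, 0.0, 0.0]]
block = SharedVmmBlock()
h, c = lstm_step_time_multiplexed(
    block, x=[1.0, -1.0], h_prev=[0.0], c_prev=[2.0],
    weights={"f": zero, "i": zero, "u": zero, "o": zero})
```

The single block is used four times per step, once per gate, where a fully parallel layout would need four separate sets. That trade of time for area is the reason for the 1/4 figure.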

It can be further appreciated that LSTM cells will typically comprise multiple VMM arrays, each of which requires functionality provided by certain circuit blocks outside of the VMM arrays, such as a summer and activation function block and high voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry required outside of the VMM arrays themselves.

Gated Recurrent Units

An analog VMM implementation can be utilized for a GRU (gated recurrent unit) system. GRUs are a gating mechanism in recurrent neural networks. GRUs are similar to LSTMs, except that GRU cells generally contain fewer components than LSTM cells.

FIG. 18 depicts an exemplary GRU 1800. GRU 1800 in this example comprises cells 1801, 1802, 1803, and 1804. Cell 1801 receives input vector x0 and generates output vector h0. Cell 1802 receives input vector x1 and the output vector h0 from cell 1801, and generates output vector h1. Cell 1803 receives input vector x2 and the output vector (hidden state) h1 from cell 1802, and generates output vector h2. Cell 1804 receives input vector x3 and the output vector (hidden state) h2 from cell 1803, and generates output vector h3. Additional cells can be used, and a GRU with four cells is merely an example.

FIG. 19 depicts an exemplary implementation of a GRU cell 1900, which can be used for cells 1801, 1802, 1803, and 1804 of FIG. 18. GRU cell 1900 receives input vector x(t) and output vector h(t-1) from a preceding GRU cell, and generates output vector h(t). GRU cell 1900 comprises sigmoid function devices 1901 and 1902, each of which applies a number between 0 and 1 to components from output vector h(t-1) and input vector x(t). GRU cell 1900 also comprises a tanh device 1903 to apply a hyperbolic tangent function to an input vector, a plurality of multiplier devices 1904, 1905, and 1906 to multiply two vectors together, an addition device 1907 to add two vectors together, and a complementary device 1908 to subtract an input from 1 to generate an output.

FIG. 20 depicts a GRU cell 2000, which is an example of an implementation of GRU cell 1900. For the reader's convenience, the same numbering from GRU cell 1900 is used in GRU cell 2000. As can be seen in FIG. 20, sigmoid function devices 1901 and 1902 and tanh device 1903 each comprise multiple VMM arrays 2001 and activation function blocks 2002. Thus, it can be seen that VMM arrays are of particular use in GRU cells used in certain neural network systems. The multiplier devices 1904, 1905, and 1906, the addition device 1907, and the complementary device 1908 are implemented in a digital manner or in an analog manner. The activation function blocks 2002 can likewise be implemented in a digital manner or in an analog manner.

FIG. 21 shows an alternative to GRU cell 2000 (and another example of an implementation of GRU cell 1900). In FIG. 21, GRU cell 2100 utilizes VMM arrays 2101 and activation function block 2102, which, when configured as a sigmoid function, applies a number between 0 and 1 to control how much of each component in the input vector is allowed through to the output vector. In FIG. 21, sigmoid function devices 1901 and 1902 and tanh device 1903 share the same physical hardware (VMM arrays 2101 and activation function block 2102) in a time-multiplexed fashion. GRU cell 2100 also comprises: multiplier device 2103 to multiply two vectors together; addition device 2105 to add two vectors together; complementary device 2109 to subtract an input from 1 to generate an output; multiplexor 2104; register 2106 to hold the value h(t-1)*r(t) when that value is output from multiplier device 2103 through multiplexor 2104; register 2107 to hold the value h(t-1)*z(t) when that value is output from multiplier device 2103 through multiplexor 2104; and register 2108 to hold the value h^(t)*(1-z(t)) when that value is output from multiplier device 2103 through multiplexor 2104.
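The data flow through a GRU cell of this kind can be sketched in a few lines of Python. This is a minimal sketch and not the circuit itself: the weight matrices `Wz`, `Wr` and `Wh`, the concatenation of h(t-1) with x(t) ahead of each matrix multiplication, and all function names are assumptions made for illustration. Only the three intermediate products (the values held in registers 2106, 2107 and 2108) and their final addition follow the description.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v; stands in for one VMM pass."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x, h_prev, Wz, Wr, Wh):
    """One GRU time step on plain lists (Wz, Wr, Wh are hypothetical weights)."""
    hx = h_prev + x                                      # concatenate h(t-1) and x(t)
    z = [sigmoid(a) for a in matvec(Wz, hx)]             # gate z(t), values in (0, 1)
    r = [sigmoid(a) for a in matvec(Wr, hx)]             # gate r(t), values in (0, 1)
    hr = [h * g for h, g in zip(h_prev, r)]              # h(t-1)*r(t), cf. register 2106
    h_cand = [math.tanh(a) for a in matvec(Wh, hr + x)]  # candidate h^(t)
    hz = [h * g for h, g in zip(h_prev, z)]              # h(t-1)*z(t), cf. register 2107
    hc = [c * (1.0 - g) for c, g in zip(h_cand, z)]      # h^(t)*(1-z(t)), cf. register 2108
    return [a + b for a, b in zip(hz, hc)]               # h(t)
```

The three `matvec` calls correspond to the three passes that would share one set of VMM hardware in a time-multiplexed arrangement, which is why the intermediate products have to be held somewhere between passes.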

GRU cell 2000 contains multiple sets of VMM arrays 2001 and activation function blocks 2002, whereas GRU cell 2100 contains only one set of VMM arrays 2101 and activation function block 2102, which are used to represent multiple layers in the embodiment of GRU cell 2100. GRU cell 2100 therefore requires less space than GRU cell 2000: it needs only 1/3 as much space for VMMs and activation function blocks.

It can be further appreciated that GRU systems will typically comprise multiple VMM arrays, each of which requires functionality provided by certain circuit blocks outside of the VMM arrays, such as summer and activation function blocks and high voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry required outside of the VMM arrays themselves.

The input to the VMM arrays can be an analog level, a binary level, a pulse, a time-modulated pulse, or digital bits (in which case a DAC is needed to convert the digital bits to the appropriate input analog level). The output can be an analog level, a binary level, a timing pulse, pulses, or digital bits (in which case an output ADC is needed to convert the output analog level into digital bits).

In general, each weight W in a VMM array can be implemented by a single memory cell, by a differential cell, or by two blended memory cells (an average of 2 cells). In the differential cell case, two memory cells are needed to implement a weight W as a differential weight (W = W+ - W-). In the case of two blended memory cells, two memory cells are needed to implement a weight W as the average of the two cells.

FIG. 31 depicts VMM system 3100. In some examples, the weights W stored in a VMM array are stored as differential pairs, W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). In VMM system 3100, half of the bit lines are designated as W+ lines, that is, bit lines connecting to memory cells that will store positive weights W+, and the other half of the bit lines are designated as W- lines, that is, bit lines connecting to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3101 and 3102. The output of a W+ line and the output of a W- line are combined to give effectively W = W+ - W- for each pair of (W+, W-) cells for all pairs of (W+, W-) lines. Although the above has been described with the W- lines interspersed among the W+ lines in an alternating fashion, in other examples the W+ lines and W- lines can be located anywhere in the array.
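The differential arrangement can be illustrated numerically. The sketch below is an assumption-laden model rather than the circuit: it treats each cell as an ideal multiplier, and it maps a signed weight onto a (W+, W-) pair by storing its magnitude on one line and zero on the other, which is only one of many possible mappings. The function names are hypothetical.

```python
def split_weight(w):
    """Map a signed weight to a (W+, W-) pair of non-negative values.
    Assumption: the unused member of the pair is programmed to zero."""
    return (w, 0.0) if w >= 0 else (0.0, -w)

def differential_vmm(inputs, weights):
    """Compute one output per (W+, W-) line pair.
    weights[row][col] is the signed weight for input `row` and output `col`."""
    outputs = []
    for col in range(len(weights[0])):
        i_plus = 0.0    # current summed on the W+ bit line
        i_minus = 0.0   # current summed on the W- bit line
        for x, row in zip(inputs, weights):
            w_plus, w_minus = split_weight(row[col])
            i_plus += x * w_plus
            i_minus += x * w_minus
        outputs.append(i_plus - i_minus)   # subtraction done by the summation circuit
    return outputs
```

For inputs `[1.0, 2.0]` and weights `[[0.5, -1.0], [-0.25, 2.0]]`, the first line pair carries 0.5 on each of its W+ and W- lines and therefore outputs 0.0, while the second pair carries 4.0 and 1.0 and outputs 3.0.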

FIG. 32 depicts another example. In VMM system 3210, positive weights W+ are implemented in first array 3211 and negative weights W- are implemented in second array 3212, which is separate from the first array, and the resulting weights are appropriately combined together by summation circuits 3213.

FIG. 33 depicts VMM system 3300. The weights W stored in a VMM array are stored as differential pairs, W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). VMM system 3300 comprises array 3301 and array 3302. Half of the bit lines in each of arrays 3301 and 3302 are designated as W+ lines, that is, bit lines connecting to memory cells that will store positive weights W+, and the other half of the bit lines in each of arrays 3301 and 3302 are designated as W- lines, that is, bit lines connecting to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3303, 3304, 3305, and 3306. The output of a W+ line and the output of a W- line from each array 3301, 3302 are respectively combined to give effectively W = W+ - W- for each pair of (W+, W-) cells for all pairs of (W+, W-) lines. In addition, the W values from each of arrays 3301 and 3302 can be further combined through summation circuits 3307 and 3308, such that each W value is the result of a W value from array 3301 minus a W value from array 3302, meaning that the end result from summation circuits 3307 and 3308 is a differential value of two differential values.

Each non-volatile memory cell used in the analog neural memory system is to be erased and programmed to hold a very specific and precise amount of charge, i.e., number of electrons, in the floating gate. For example, each floating gate should hold one of N different values, where N is the number of different weights that can be indicated by each cell. Examples of N include 16, 32, 64, 128, and 256.

It is important to be able to accurately verify a programming operation after it has been performed.

Numerous examples are disclosed of verification circuitry and associated methods in an artificial neural network.

VMM System Architecture

FIG. 34 depicts a block diagram of VMM system 3400. VMM system 3400 comprises VMM array 3401, row decoder 3402, high voltage decoder 3403, column decoder 3404, bit line drivers 3405, input circuit 3406, output circuit 3407, control logic 3408, and bias generator 3409. VMM system 3400 further comprises high voltage generation block 3410, which comprises charge pump 3411, charge pump regulator 3412, and high voltage level generator 3413. VMM system 3400 further comprises (program/erase, or weight tuning) algorithm controller 3414, analog circuitry 3415, control engine 3416 (which may include specific functions such as arithmetic functions, activation functions, and embedded microcontroller logic, without limitation), test control logic 3417, and static random access memory (SRAM) block 3418 to store intermediate data, such as data for the input circuits (e.g., activation data) or the output circuits (neuron output data), or input data for programming (e.g., input data for a whole row or for multiple rows).

Input circuit 3406 may include circuits such as a DAC (digital-to-analog converter), DPC (digital-to-pulse converter, digital-to-time-modulated-pulse converter), AAC (analog-to-analog converter, such as a current-to-voltage converter or logarithmic converter), PAC (pulse-to-analog-level converter), or any other type of converter. Input circuit 3406 may implement one or more of normalization, linear or non-linear up/down scaling functions, and arithmetic functions. Input circuit 3406 may implement a temperature compensation function for the input levels. Input circuit 3406 may implement an activation function such as ReLU or sigmoid. Input circuit 3406 may store digital activation data to be applied as, or combined with, an input signal during a program or read operation. The digital activation data can be stored in registers. Input circuit 3406 may comprise circuits to drive the array terminals, such as the CG, WL, EG, and SL lines, which may include sample-and-hold circuits and buffers. A DAC can be used to convert digital activation data into an analog input voltage to be applied to the array.

Output circuit 3407 may include circuits such as an ITV (current-to-voltage circuit), ADC (analog-to-digital converter, to convert neuron analog output to digital bits), AAC (analog-to-analog converter, such as a current-to-voltage converter or logarithmic converter), APC (analog-to-pulse converter, analog-to-time-modulated-pulse converter), or any other type of converter. Output circuit 3407 may convert array outputs into activation data. Output circuit 3407 may implement an activation function such as a rectified linear activation function (ReLU) or sigmoid. Output circuit 3407 may implement one or more of statistical normalization, regularization, up/down scaling/gain functions, statistical rounding, and arithmetic functions (e.g., add, subtract, divide, multiply, shift, log) for neuron outputs. Output circuit 3407 may implement a temperature compensation function for neuron outputs or array outputs (such as bit line outputs), for example to keep the power consumption of the array approximately constant over temperature, or to improve the precision of the array (neuron) outputs by keeping the I-V slope approximately the same over temperature. Output circuit 3407 may comprise registers for storing output data.

FIG. 35A depicts programming method 3500. First, the method starts (step 3501), which typically occurs in response to a program command being received. Next, a mass program operation programs all cells to a "0" state (step 3502). Then a soft erase operation erases all cells to a weakly erased level, such that each cell would draw a current of, for example, approximately 1-5 µA during a read operation (step 3503). This is in contrast to a deeply erased level, where each cell would draw a current of, for example, approximately 20-30 µA during a read operation. Then, a hard program is performed on all unselected cells to a very deep programmed state to add electrons to the floating gates of the cells (step 3504), ensuring that those cells are really "off", meaning that they will draw a negligible amount of current during a read operation.

A coarse programming operation is then performed on the selected cells (step 3505) to program them to a level closer to the target, for example, 2X to 100X the target. This is followed by a precision programming operation on the selected cells (step 3506) to program the precise value desired for each selected cell.

The coarse programming operation (3505) may consist of multiple coarse verify/program cycles. In each coarse verify/program cycle, a verify operation is performed to check whether the cell output meets the coarse target; if not, a program operation is performed again on that cell. The verify/program cycle is repeated until the cell outputs of all target cells meet the coarse target.

The precision programming operation (3506) may consist of multiple precision verify/program cycles. In each precision verify/program cycle, a verify operation is performed to check whether the cell output meets the precision target; if not, a program operation is performed again on that cell. The verify/program cycle is repeated until the cell outputs of all target cells meet the precision target.

FIG. 35B depicts another programming method 3510, which is similar to programming method 3500. However, instead of a program operation that programs all cells to a "0" state as in step 3502 of FIG. 35A, after the method starts (step 3501), an erase operation is used to erase all cells to a "1" state (step 3512). Then a soft program operation (step 3513) is used to program all cells to a weakly programmed state (level), such that each cell would draw a current of, for example, approximately 3-5 µA during a read operation. Afterward, a hard program is performed on all unselected cells to a very deep programmed state (step 3504), followed by coarse and precision programming (3505-3506) as in FIG. 35A. A variation of the example of FIG. 35B would remove the soft program operation (step 3513).

FIG. 36 depicts a first example of coarse programming operation 3505, which is search and execute method 3600. First, a lookup table search is performed to determine a coarse target current value (I_CT) for the selected cell based on the value that is intended to be stored in that selected cell (step 3601). This table is created, for example, by silicon characterization or from calibration during wafer testing. It is assumed that the selected cell can be programmed to store one of N possible values (e.g., 128, 64, 32, without limitation). Each of the N values corresponds to a different desired current value (I_D) that is drawn by the selected cell during a read operation. In one example, the lookup table might contain M possible current values to use as the coarse target current value I_CT for the selected cell during search and execute method 3600, where M is an integer less than N. For example, if N is 8, then M might be 4, meaning that the selected cell can store any of 8 possible values, and one of 4 coarse target current values will be selected as the coarse target for search and execute method 3600. That is, search and execute method 3600 (which, as noted above, is one example of coarse programming operation 3505) is intended to quickly program the selected cell to a value (I_CT) that is somewhat close to the desired value (I_D), and precision programming operation 3506 is then intended to program the selected cell more precisely to the desired value (I_D).

For the simple example of N=8 and M=4, Tables 9 and 10 give examples of cell values, desired current values, and coarse target current values.

Table 9: Example of N desired current values for N=8

Value stored in selected cell | Desired current value (I_D)
000 | 100 pA
001 | 200 pA
010 | 300 pA
011 | 400 pA
100 | 500 pA
101 | 600 pA
110 | 700 pA
111 | 800 pA

Table 10: Example of M target current values for M=4

Coarse target current value (I_CT) | Associated cell values
800 pA + I_CTOFFSET1 | 000, 001
1600 pA + I_CTOFFSET2 | 010, 011
2400 pA + I_CTOFFSET3 | 100, 101
3200 pA + I_CTOFFSET4 | 110, 111

The offset values I_CTOFFSETx are used to prevent overshooting the desired current value during coarse tuning.
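The lookup of step 3601 can be sketched as a pair of dictionaries built from Tables 9 and 10. This is a minimal illustration only: the dictionary and function names are hypothetical, and the offset values in `EXAMPLE_OFFSETS_PA` are placeholders chosen for the example, since the text does not give numeric values for I_CTOFFSET1 through I_CTOFFSET4.

```python
# Table 9: desired read current I_D, in pA, for each 3-bit cell value (N = 8).
DESIRED_PA = {format(k, "03b"): 100 * (k + 1) for k in range(8)}

# Table 10: base coarse target, in pA, for each cell value (M = 4 distinct targets).
COARSE_BASE_PA = {
    "000": 800, "001": 800,
    "010": 1600, "011": 1600,
    "100": 2400, "101": 2400,
    "110": 3200, "111": 3200,
}

# Placeholder offsets I_CTOFFSET1..4 in pA (assumed values, not from the text).
EXAMPLE_OFFSETS_PA = [50, 50, 50, 50]

def coarse_target_pa(value_bits, offsets_pa):
    """Return I_CT = base + I_CTOFFSETx for the group that contains value_bits."""
    group = int(value_bits, 2) // 2      # 0..3: which of the M coarse targets applies
    return COARSE_BASE_PA[value_bits] + offsets_pa[group]
```

Working in integer picoamps keeps the table exact and avoids floating-point rounding in the comparison values.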

Once the coarse target current value I_CT is selected, the selected cell is programmed by applying a voltage v_0 to the appropriate terminal of the selected cell based on the cell architecture type of the selected cell (e.g., memory cell 210, 310, 410, or 510) (step 3602). If the selected cell is of the type of memory cell 310 in FIG. 3, then the voltage v_0 is applied to control gate terminal 28, and v_0 might be 5-7 V depending on the coarse target current value I_CT. The value of v_0 optionally can be determined from a voltage lookup table that stores v_0 versus coarse target current value I_CT.

Next, the selected cell is programmed by applying the voltage v_i = v_(i-1) + v_increment, where i starts at 1 and is incremented each time this step is repeated, and where v_increment is a small voltage that produces a degree of programming appropriate for the granularity of change desired (step 3603). Thus, the first time step 3603 is performed, i=1, and v_1 will be v_0 + v_increment. A verify operation then occurs (step 3604), in which a read operation is performed on the selected cell and the current drawn through the selected cell (I_cell) is compared with the coarse target threshold value I_CT. If I_cell is less than or equal to I_CT (which here is a first threshold value), then search and execute method 3600 is complete and precision programming operation 3506 can begin. If I_cell is not less than or equal to I_CT, then i is incremented and step 3603 is repeated.
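The program/verify loop of steps 3602 to 3604 can be sketched as follows. The loop structure follows the description; everything else is an assumption made so that the example runs: `ToyCell` is a hypothetical cell model in which the read current falls by one decade for every `volts_per_decade` of programming voltage above `v_on`, and the numeric values are illustrative rather than taken from the text.

```python
class ToyCell:
    """Hypothetical cell model, for illustration only."""

    def __init__(self, i_erased=5e-6, v_on=5.0, volts_per_decade=0.5):
        self.current = i_erased
        self.i_erased = i_erased
        self.v_on = v_on
        self.volts_per_decade = volts_per_decade

    def program(self, v):
        over = max(v - self.v_on, 0.0)
        reached = self.i_erased * 10 ** (-over / self.volts_per_decade)
        self.current = min(self.current, reached)   # programming only lowers the current

    def read(self):
        return self.current


def search_and_execute(cell, i_ct, v0, v_increment, max_pulses=100):
    """Coarse programming: step the voltage until I_cell <= I_CT.
    Returns the last voltage applied and the number of incremented pulses."""
    v = v0
    cell.program(v)                        # step 3602: initial voltage v_0
    for pulses in range(1, max_pulses + 1):
        v += v_increment                   # v_i = v_(i-1) + v_increment
        cell.program(v)                    # step 3603
        if cell.read() <= i_ct:            # step 3604: verify against I_CT
            return v, pulses
    raise RuntimeError("coarse target not reached within max_pulses")
```

The returned voltage is the natural starting point for the precision stage, since that stage continues from the last voltage used here. The `max_pulses` guard is an addition for safety in software and is not part of the described method.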

Thus, at the moment when coarse programming operation 3505 ends and precision programming operation 3506 begins, the voltage v_i will be the last voltage used to program the selected cell, and the selected cell will be storing a value associated with the coarse target current value I_CT, where I_cell <= I_CT. Precision programming operation 3506 programs the selected cell to the point where, during a read operation, it draws a current I_D (plus or minus an acceptable amount of deviation, such as +/-30% or less, e.g., +/-50 pA), which is the desired current value associated with the value that is intended to be stored in the selected cell.

FIG. 37 depicts examples of different voltage progressions that can be applied to the control gate of a selected memory cell during coarse programming operation 3505 and/or precision programming operation 3506. Each progression consists of multiple verify/program cycles.

Under a first approach, increasing voltages are applied in progression to the control gate to further program the selected memory cell. The starting point is v_i, which, during precision programming operation 3506, will be the last voltage applied during coarse programming operation 3505. An increment of v_p1 is added to v_i, and the voltage v_i + v_p1 is then used to program the selected cell (indicated by the second pulse from the left in progression 3701). v_p1 is an increment that is smaller than v_increment (the voltage increment used during coarse programming operation 3505). After each programming voltage is applied, a verify operation (similar to step 3604) is performed, in which a determination is made as to whether I_cell is less than or equal to I_PT1 (which is the first precision target current value and here is a second threshold value), where I_PT1 = I_D + I_PT1OFFSET, and I_PT1OFFSET is an offset value added to prevent program overshoot. If it is not, then another increment v_p1 is added to the previously applied programming voltage, and the process is repeated. At the point where I_cell is less than or equal to I_PT1, this portion of the programming sequence stops. Optionally, if I_PT1 is equal to I_D, or nearly equal to I_D with sufficient precision (that is, plus or minus an acceptable amount of deviation), then the selected memory cell has been successfully programmed.

If I_PT1 is not close enough to I_D, that is, not nearly equal to I_D with sufficient precision, then further programming of a smaller granularity can occur. Here, progression 3702 is now used. The starting point for progression 3702 is the last voltage used for programming under progression 3701. An increment of v_p2 (which is smaller than v_p1) is added to that voltage, and the combined voltage is applied to program the selected memory cell. After each programming voltage is applied, a verify operation (similar to step 3604) is performed, in which a determination is made as to whether I_cell is less than or equal to I_PT2 (which is the second precision target current value and here is a third threshold value), where I_PT2 = I_D + I_PT2OFFSET, and I_PT2OFFSET is an offset value added to prevent program overshoot. If it is not, then another increment v_p2 is added to the previously applied programming voltage, and the process is repeated. At the point where I_cell is less than or equal to I_PT2, plus or minus an acceptable amount of deviation, this portion of the programming sequence stops. Here, it is assumed that I_PT2 is equal to I_D, or close enough to I_D, that programming can stop, since the target value has been achieved with sufficient precision. One of ordinary skill in the art can appreciate that additional progressions can be applied with smaller and smaller programming increments. For example, in FIG. 38, three progressions (3801, 3802, and 3803) are applied instead of just two.
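The successive progressions with shrinking increments can be sketched as one loop over (increment, target) stages. This is a hedged sketch: the `program` and `read` callables, the toy cell built by `make_toy_cell`, and all numeric values are assumptions for illustration. One small deviation from the description is that the loop verifies before each pulse rather than after, so no pulse is applied when the cell already meets a stage target.

```python
def make_toy_cell(i_start=1.0e-7, v_ref=5.9, volts_per_decade=0.5):
    """Hypothetical cell: current falls a decade per volts_per_decade above v_ref."""
    state = {"i": i_start}

    def program(v):
        reached = i_start * 10 ** (-max(v - v_ref, 0.0) / volts_per_decade)
        state["i"] = min(state["i"], reached)   # programming only lowers the current

    def read():
        return state["i"]

    return program, read


def precision_program(program, read, v_start, stages, max_pulses=1000):
    """Run successive voltage staircases.
    `stages` is a list of (v_step, i_target) pairs with decreasing v_step,
    for example [(v_p1, I_PT1), (v_p2, I_PT2)]. Returns the last voltage used."""
    v = v_start
    for v_step, i_target in stages:
        pulses = 0
        while read() > i_target:            # verify against this stage's target
            if pulses == max_pulses:
                raise RuntimeError("stage target not reached within max_pulses")
            v += v_step                     # add one more increment
            program(v)
            pulses += 1
    return v
```

With a desired current of 50 nA, a first stage of 20 mV steps toward 60 nA and a second stage of 5 mV steps toward 51 nA, the toy cell ends just under 51 nA. The smaller second step is what limits how far below the target the final pulse can land.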

A second approach is shown in progression 3703 of FIG. 37 and progression 3803 of FIG. 38. Here, instead of increasing the voltage applied during the programming of the selected memory cell, the same voltage is applied for durations of increasing period. That is, an additional increment of time t_p1 is added to the programming pulse, such that each applied pulse is longer than the previously applied pulse by t_p1. After each programming pulse is applied, the same verify operation is performed as described previously for progression 3701. Optionally, additional progressions can be applied in which the additional increment of time added to the programming pulse is of a smaller duration than in the previous progression used. Although only one temporal progression is shown, one of ordinary skill in the art will appreciate that any number of different temporal progressions can be applied.
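For contrast with the voltage staircases, the pulse-width progression can be written as a simple arithmetic sequence. The function name and the nanosecond values in the note below are hypothetical; the text does not specify pulse durations.

```python
def pulse_widths_ns(t_start_ns, t_p1_ns, count):
    """Widths of `count` successive programming pulses at a constant voltage,
    each one t_p1_ns longer than the pulse before it."""
    return [t_start_ns + k * t_p1_ns for k in range(count)]
```

For instance, `pulse_widths_ns(100, 20, 4)` yields pulses of 100, 120, 140 and 160 ns. A follow-on progression of finer granularity would reuse the last width as its starting point with a smaller increment.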

Additional detail will now be provided for three examples of coarse programming operation 3505.

FIG. 39 depicts another example of coarse programming operation 3505, which is adaptive calibration method 3900. The method starts (step 3901). The cell is programmed at a default start value v_0 (step 3902). Unlike in search and execute method 3600, here v_0 is not derived from a lookup table and instead can be a relatively small initial value. The control gate voltage of the cell is measured at a first current value IR1 (e.g., 100 nA) and a second current value IR2 (e.g., 10 nA), and a subthreshold slope (e.g., 360 mV/dec) is determined based on those measurements and stored (step 3903).

A new programming voltage v_i is determined. The first time this step is performed, i=1, and v_1 is determined based on the stored subthreshold slope value and a current target and offset value, using a subthreshold equation such as the following: v_i = v_(i-1) + v_increment, where v_increment is proportional to the slope of Vg, and Vg = n*Vt*log[Ids/(wa*Io)]. Here, wa is the w of a memory cell, and Ids is the current target plus the offset value.

If the stored slope value is relatively steep, a relatively small current offset value can be used. If the stored slope value is relatively flat, a relatively large current offset value can be used. Determining the slope information therefore allows a current offset value to be selected that is customized for the particular cell in question, which ultimately shortens the programming process. When this step is repeated, i is incremented and v_i = v_{i-1} + v_increment. The cell is then programmed using v_i. v_increment can be determined from a lookup table that stores values of v_increment versus target current value.

Next, a verify operation is performed, in which a read operation is performed on the selected cell and the current drawn through the selected cell (I_cell) is compared to the coarse target threshold I_CT (step 3905). Here, I_CT = I_D + I_CTOFFSET, where I_CTOFFSET is an offset value added to prevent program overshoot. If I_cell is less than or equal to I_CT, adaptive calibration method 3900 is complete and precision programming operation 3506 can begin. If I_cell is not less than or equal to I_CT, steps 3904-3905 are repeated and i is incremented. Precision programming operation 3506 then begins with v_i being the last voltage used to program the selected cell.

FIG. 40 depicts aspects of adaptive calibration method 3900. During step 3903, current source 4001 applies current values IR1 and IR2 to the selected cell (here, memory cell 4002), and the voltage at the control gate of memory cell 4002 is then measured (CGR1 for IR1 and CGR2 for IR2). The slope is determined as (CGR1 - CGR2) per decade of current, which is the slope of VCG versus LOG(I).
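As a worked illustration of the slope measurement, the following sketch computes the slope from two (current, control gate voltage) readings. The function name and the sample readings are assumptions chosen to match the 100 nA, 10 nA, and 360 mV/dec figures quoted above, and base-10 logarithms are assumed because the slope is expressed per decade.

```python
import math

def subthreshold_slope(cgr1, cgr2, ir1, ir2):
    """Return the slope of control gate voltage versus log10(current), in V/decade."""
    return (cgr1 - cgr2) / (math.log10(ir1) - math.log10(ir2))

# Illustrative (made-up) readings: 1.50 V at 100 nA and 1.14 V at 10 nA.
slope = subthreshold_slope(1.50, 1.14, 100e-9, 10e-9)
print(f"{slope * 1000:.0f} mV/dec")
```

Because IR1 and IR2 are exactly one decade apart in this example, the slope reduces to the difference of the two control gate voltages.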

FIG. 41 depicts another example of coarse programming operation 3505, which is adaptive calibration method 4100. The method begins (step 4101). The cell is programmed with a default starting value v_0 (step 4102). v_0 is obtained from a lookup table, such as one created from silicon characterization, with the table values offset so that the programmed target is not overshot.

In the next step 4103, an I-V slope parameter is established for use in predicting the next programming voltage. A first control gate read voltage V_CGR1 is applied to the selected cell, and the resulting cell current IR_1 is measured. A second control gate read voltage V_CGR2 is then applied to the selected cell, and the resulting cell current IR_2 is measured. The slope is determined from those measurements and stored, for example according to the following equation for the subthreshold region (the cell operates in the subthreshold region): slope = (V_CGR1 - V_CGR2)/(LOG(IR_1) - LOG(IR_2)) (step 4103). Example values for V_CGR1 and V_CGR2 are 1.5 V and 1.3 V, respectively.

Determining the slope information allows a V_increment value to be selected that is customized for the particular cell in question. This ultimately shortens the programming process.

When step 4104 is repeated, i is incremented, and a new desired programming voltage V_i is determined from the stored slope value, the current target, and an offset value using the following equation: V_i = V_{i-1} + V_increment, where for i-1, V_increment = alpha*slope*(LOG(IR_1) - LOG(I_CT)), I_CT is the target current, and alpha is a predetermined constant less than 1 (a programming offset value), e.g., 0.9, that prevents overshoot. V_i is, for example, VSLP or VCGP, a source line or control gate programming voltage.
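The step 4104 update can be written out as a short calculation. This is a sketch, not the method's definitive implementation: the function and argument names are hypothetical, and base-10 logarithms are assumed because the slope is expressed in volts per decade.

```python
import math

def next_programming_voltage(v_prev, slope, i_measured, i_ct, alpha=0.9):
    """V_i = V_{i-1} + alpha * slope * (log10(i_measured) - log10(i_ct)).

    slope is in volts per decade, so base-10 logarithms are assumed.
    alpha < 1 deliberately shortens the predicted step to avoid overshoot.
    """
    v_increment = alpha * slope * (math.log10(i_measured) - math.log10(i_ct))
    return v_prev + v_increment
```

For instance, with a slope of 0.2 V/dec, a measured current one decade above I_CT, and alpha = 0.9, the programming voltage rises by about 0.18 V.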

The cell is then programmed using V_i (step 4105).

Next, a verify operation is performed, in which a read operation is performed on the selected cell and the current drawn through the selected cell (I_cell) is compared to I_CT (step 4106). Here, I_CT is the coarse target threshold, set as I_CT = I_D + I_CTOFFSET, where I_CTOFFSET is an offset value added to prevent program overshoot. If I_cell is less than or equal to I_CT, the process proceeds to step 4107. If not, the process returns to step 4104 and i is incremented.

In step 4107, I_cell is compared to a threshold value I_CT2 that is smaller than I_CT. The purpose of this comparison is to check whether an overshoot has occurred. That is, although the goal is for I_cell to be below I_CT, if it falls too far below I_CT, an overshoot has occurred and the stored value may actually correspond to the wrong value. If I_cell is not less than or equal to I_CT2, no overshoot has occurred and adaptive calibration method 4100 is complete, at which point the process proceeds to precision programming operation 3506. If I_cell is less than or equal to I_CT2, an overshoot has occurred. The selected cell is then erased (step 4108), and the programming process restarts at step 4102 with i reset to 0. Optionally, if step 4108 is performed more than a predetermined number of times, the selected cell can be deemed a bad cell that should not be used.
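The two comparisons in steps 4106 and 4107 define a window between I_CT2 and I_CT. A minimal sketch of that decision follows; the function name and the returned action labels are hypothetical and are used only for illustration.

```python
def coarse_verify_decision(i_cell, i_ct, i_ct2):
    """Return the next action after a coarse verify read (steps 4106-4107).

    i_ct2 must be below i_ct; the action names are hypothetical labels.
    """
    if i_ct2 >= i_ct:
        raise ValueError("i_ct2 must be less than i_ct")
    if i_cell > i_ct:
        return "program_again"      # step 4106 fails: back to step 4104
    if i_cell <= i_ct2:
        return "erase_and_restart"  # overshoot: step 4108, then step 4102
    return "precision_program"      # inside the window: go to operation 3506
```

A caller could additionally count the "erase_and_restart" results and mark the cell as bad after a predetermined number of them.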

Precision programming operation 3506 consists of multiple verify and program (V/P) cycles, in which either the programming voltage is incremented by a constant fine voltage with a fixed pulse width, or the programming voltage is fixed and the programming pulse width is varied.

Optionally, the step of determining whether the current through the selected non-volatile memory cell during a read or verify operation is less than or equal to the coarse target threshold can be performed by applying fixed biases to the terminals of the non-volatile memory cell; measuring and digitizing the current drawn by the selected non-volatile memory cell to generate digital output bits; and comparing the digital output bits to digital bits representing the first threshold current.

Optionally, the step of determining whether the current through the selected non-volatile memory cell during a read or verify operation is less than or equal to the coarse target threshold can be performed by applying an input to a terminal of the non-volatile memory cell; modulating the current drawn by the selected non-volatile memory cell with an output pulse to generate a modulated output; digitizing the modulated output to generate digital output bits; and comparing the digital output bits to digital bits representing the first threshold current.

FIG. 42 depicts a third example of coarse programming operation 3505, which is absolute calibration method 4200. The method begins (step 4201). The cell is programmed with a default starting value v_0 (step 4202). The control gate voltage of the cell (VCGRx) is measured at a current value I_target and stored (step 4203). A new desired voltage v_1 is determined from the stored control gate voltage and from the current target plus an offset value, I_target + I_offset (step 4204). For example, the new desired voltage v_1 can be calculated as v_1 = v_0 + (VCGBIAS - stored VCGR), where VCGBIAS is the default read control gate voltage at the maximum target current, for example approximately 1.5 V, and the stored VCGR is the read control gate voltage measured in step 4203.

The cell is then programmed using v_i (step 4205). When i = 1, the voltage v_1 from step 4204 is used. When i > 1, the voltage v_i = v_{i-1} + v_increment is used. v_increment can be determined from a lookup table that stores values of v_increment versus target current value. Next, a verify operation is performed, in which a read operation is performed on the selected cell and the current drawn through the selected cell (I_cell) is compared to I_CT (step 4206). If I_cell is less than or equal to I_CT (which here is a threshold value), absolute calibration method 4200 is complete and precision programming operation 3506 can begin. If I_cell is not less than or equal to I_CT, steps 4205-4206 are repeated and i is incremented.
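The voltage sequence used by absolute calibration method 4200 can be sketched as follows. The function name is hypothetical, and a single fixed `v_increment` is assumed for simplicity, whereas the description obtains it from a lookup table indexed by target current.

```python
def absolute_calibration_voltages(v0, vcg_bias, vcgr_stored, v_increment, count):
    """Return [v_1, ..., v_count] for absolute calibration method 4200."""
    if count < 1:
        return []
    v = v0 + (vcg_bias - vcgr_stored)   # v_1 from step 4204
    voltages = [v]
    for _ in range(count - 1):
        v += v_increment                # v_i = v_{i-1} + v_increment for i > 1
        voltages.append(v)
    return voltages
```

In practice the sequence would stop as soon as the verify of step 4206 passes, so `count` only bounds the number of attempts in this sketch.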

A single weight verify method can be used to verify whether a cell has reached its weight target as a result of a programming operation. A memory cell is selected for the verify operation, and the output of the memory cell is then verified by the verification mechanisms described below with reference to FIGS. 43-49.

A differential weight verify method can be used to determine whether a differential cell (formed from two cells, where the stored value is the difference between the values stored in the two cells) has reached its weight target as a result of a programming operation. The two cells associated with the differential weight are selected for the verify operation. The output difference between the cells is then verified by the verification mechanisms described below with reference to FIGS. 43-49. For example, if cell_1 stores the w+ value and cell_2 stores the w- value, then w = (w+) - (w-) is the differential weight, and cell_1 and cell_2 are the two cells selected for the verify operation to determine whether the weight w has reached its target. Alternatively, the w- value of cell_2 is verified first, and the differential weight w is verified afterward. Alternatively, the w+ value of cell_1 is verified first, and the differential weight w is verified afterward. Alternatively, the w+ and w- values of cell_1 and cell_2 are verified in a first operation, and the differential weight w is verified in a second operation.
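A differential weight check can be illustrated in a few lines. The function name and the `tolerance` parameter are assumptions of this sketch, since the description does not state how close to the target the difference must be.

```python
def verify_differential_weight(i_plus, i_minus, i_target, tolerance):
    """Check whether the current difference of a differential cell pair is on target."""
    w = i_plus - i_minus          # w = (w+) - (w-), expressed here as currents
    return abs(w - i_target) <= tolerance
```

For example, with cell currents of 60 nA and 24 nA, a 36 nA target passes with a 1 nA tolerance, while a 30 nA target does not.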

FIG. 43 depicts VMM system 4300. Current-to-voltage converter and analog-to-digital converter block 4301 receives current from VMM array 3401, typically from a bit line or source line in VMM array 3401, and provides its output to verification circuit 4302. Block 4301 and verification circuit 4302 together form an output block coupled to VMM array 3401 that generates voltages during a verify operation of VMM array 3401 and generates digital outputs during a read operation of VMM array 3401. Each current-to-voltage converter in block 4301 converts a current into a voltage. The analog-to-digital converters in block 4301 are reconfigured (for example, in the manner described below with reference to FIG. 44A) for use during a verify operation (also referred to as a read-verify operation). During the verify operation, reference array 4304 is used to generate all N possible current targets (e.g., 32 values between 3 nA and 96 nA in increments of 3 nA). Each of the N possible current targets corresponds to one of N weight targets stored in a respective reference memory cell in reference array 4304. Alternatively, main reference current generator 4305 is used to generate all N current targets. The reference currents from reference array 4304 or main reference current generator 4305 are provided to reference voltage generator 4303, which includes a voltage DAC and uses it to convert the reference currents into reference voltages, with N possible voltages corresponding to the N possible values. For example, for a 5-bit cell there are 32 reference voltages corresponding to 32 current targets. The appropriate one of the N voltage values is selected so that verification circuit 4302 can compare it to the voltage provided by the corresponding current-to-voltage converter in block 4301. This comparison is the verify operation performed by verification circuit 4302. A weight can therefore be verified after it has been programmed into VMM array 3401. In this method, verification is accomplished by comparing the output voltage derived from the memory cell with a reference voltage from reference voltage generator 4303.
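The example of 32 current targets between 3 nA and 96 nA can be enumerated directly. The function name is hypothetical, and the assumption that level k maps to (k + 1) times the step is one natural reading of the example rather than something this description states.

```python
def current_targets_na(n_levels=32, step_na=3):
    """Return the N target currents in nA: step, 2*step, ..., N*step."""
    return [step_na * (level + 1) for level in range(n_levels)]

targets = current_targets_na()
print(targets[0], targets[-1], len(targets))
```

Each of these targets would then be converted to its own reference voltage, giving one reference voltage per storable level.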

Alternatively, the cell current is verified directly using a reference current digital-to-analog converter (IDAC), without a current-to-voltage converter, meaning that the cell current is compared to a reference current. With this approach, delay and variation are typically larger because of the settling time of low-current (e.g., a few nA) circuits.

FIG. 44A depicts neuron output ITV+ADC+verification circuit 4488, which comprises current-to-voltage converter (ITV) 4401, successive approximation register (SAR) analog-to-digital converter (ADC) 4402, and verification circuit 4403. Verification circuit 4403 comprises comparator 4404 (which is also used to generate digital outputs during a read or neuron read operation), reference voltage selection circuit 4406, and verification register 4405. Verification register 4405 is the data output register of SAR ADC 4402; here it is used to perform the verify function, but it can also be used to generate digital outputs during a read or neuron read operation. Current-to-voltage converter 4401 and SAR analog-to-digital converter 4402 are an example implementation of current-to-voltage converter and analog-to-digital converter block 4301 in FIG. 43, and verification circuit 4403 is an example implementation of verification circuit 4302 in FIG. 43.

Current-to-voltage converter 4401 and SAR analog-to-digital converter 4402 can be used during a read or neuron read operation. However, they can also be used during a verify operation, in which a weight (meaning one of N possible weight values) that has been programmed into a non-volatile memory cell in the VMM array is verified.

Current-to-voltage converter 4401 receives current from the VMM array, from a single selected cell, and converts that current into a voltage. The current-to-voltage conversion can be performed by a plurality of ITV resistors (RITV) 4490R or a plurality of ITV capacitors (CITV) 4490C. One of N possible reference voltages is provided to verification circuit 4403 through verification register 4405 and verification reference voltage selection circuit 4406. The verification register can be, for example, an 8-bit register used to select one voltage reference level from among 256 voltage reference levels in verification reference voltage selection circuit 4406. The verification reference voltages input to verification reference voltage selection circuit 4406 (over verification reference voltage lines) are provided by a global verification reference voltage generator such as the one in FIG. 45. Comparator 4404 then compares the voltage from current-to-voltage converter 4401 with the selected one of the N possible reference voltages to indicate whether the cell is storing the correct value. The capacitors and SAR logic in SAR ADC 4402 are not used during the verify operation. In one example, control circuitry closes switch S1A in SAR ADC 4402 so that the positive output Vinp of ITV 4401 is provided directly to the non-inverting input of comparator 4404, and opens one switch (not labeled) and closes another switch (not labeled) so that the output of verification reference voltage selection circuit 4406 is provided to the inverting input of comparator 4404.

ITV+ADC+verification circuit 4488 can be used for single weight verify operations as well as differential weight verify operations. A single weight verify operation requires only one input from one cell. The output voltage of ITV 4401 is proportional to the cell current value and is verified against the reference voltage level provided by verification reference voltage selection circuit 4406. For a differential weight verify operation, two inputs from two cells are the two inputs to ITV+ADC+verification circuit 4488, and the output voltage of ITV 4401 (e.g., Vinp) is proportional to the difference between the two cell currents and is verified against the reference voltage level provided by verification reference voltage selection circuit 4406.

The total offset compensation of ITV+ADC+verification circuit 4488 can be trimmed using the offset trim of comparator 4404. The offset can be trimmed further by trimming resistors 4490R or capacitors 4490C of ITV circuit 4401.

In another example, the total gain compensation of ITV+ADC+verification circuit 4488 can be trimmed by trimming resistors 4490R or capacitors 4490C of ITV circuit 4401.

An alternative offset compensation method can be performed in the time domain by using ITV 4401 with capacitors 4490C. ITV 4401 uses a variable-width pulse, together with a reference current input, to integrate on capacitor 4490C and generate an output voltage. Comparator 4404 compares the output voltage from ITV 4401 to a reference voltage. The parameters of the variable-width pulse are stored, for example, in the digital domain by a counter (not shown) or in the analog domain by a table (not shown) that stores an analog voltage for each ITV (the analog voltage being converted from the variable pulse input). A controller (not shown) uses this information to enable the ITV, by means of an enable signal (not shown), for the verify operation.

FIG. 44B depicts comparator and offset circuit 4490, which can be used in place of comparator 4404 in FIG. 44A to add the function of offset compensation. Comparator and offset circuit 4490 comprises comparator 4491, which compares inputs VINP and VINN (which can be the same signals shown in FIG. 44A) to generate output COMPOUT and its complement COMPOUTB. Calibration circuit 4492 can be adjusted to provide offset voltage VON to comparator 4491, and calibration circuit 4493 can be adjusted to provide offset voltage VOP to comparator 4491.

FIG. 45 depicts reference voltage generator 4500, which is an example implementation of reference voltage generator 4303. In one example, reference array 4304 (not shown) provides a maximum current from one reference cell, where that current represents the highest possible weight (of the N possible weights) that can be stored in a non-volatile memory cell. Current-to-voltage converter 4501 converts the maximum current provided by the reference cell into a high verification reference voltage, which can be done using resistor current-to-voltage converter (RITV) 4511 or capacitor current-to-voltage converter (CITV) 4510.

This voltage is then used, for example by resistor string 4504, to generate, for example, the 32 verification reference voltages for a 5-bit cell; resistor string 4504 here comprises N-1 resistors in series. In another example, the reference array provides a reference current that is converted into a reference voltage; for example, the reference current can be a mid-range value that is appropriately converted into all N reference voltages (for example, by one ITV circuit through a current ratioed mirror, through trimmed resistor values, or through trimmed capacitor values). ITV 4501 as shown uses a differential operational amplifier. This ITV is a replica of the sub-circuit ITV in block 4301 of VMM system 4300, which is also shown as ITV 4401 in ITV+ADC+verification circuit 4488. The differential operational amplifier (op amp) is a replica of local differential op amp 4480 in FIG. 44A, so that the N global references can track the local voltages from the ITV in block 4301 as they vary with PVT (process, supply voltage, or temperature). Alternatively, ITV 4501 can be based on a single-ended op amp. Resistors 4511x and capacitors 4510x are trimmable to adjust the range and to compensate for mismatch or offset variation. Resistor string 4504 is trimmable to adjust the range from VN to V1, to shift the range from VN to V1 up or down, or to adjust the local values VN to V1.

In another example, a constant bias current (e.g., from an IDAC) is used instead of the reference current from the reference array.

The bias current, or the reference current from the reference array, is adjustable to reach a target value. These currents are also compensated for PVT (process, supply voltage, or temperature) variation.

The total global offset and mismatch compensation of ITV+ADC+verification circuit 4488 can be trimmed using the offset and mismatch trims of reference voltage generator 4500. This can be done by adjusting bias current circuit 4512 or by trimming resistors 4511a and 4511b, capacitors 4510a and 4510b, or resistor string 4504.

Buffer 4502 is used to buffer the high verification reference voltage, which represents, for example, the 32nd level (L31) of the 32 reference voltage levels (L0-L31) of a 5-bit cell, and to drive resistor string 4504 at one end. In another example, buffer 4503 is provided to buffer low verification reference voltage VREF2, which corresponds to the first level (L0) of the 32 levels (L0-L31) of a 5-bit cell, and buffer 4503 provides that voltage to one end of resistor string 4504. For example, the high verification reference voltage can be 900 mV and the low verification reference voltage can be 300 mV. Voltage ladder (resistor string) 4504 generates N voltages ranging from V0 to VN, which represent the N possible values that can be stored in the VMM array. Those reference voltages are then used by verification circuit 4403 in FIG. 44A. In another example, the input voltages and/or output voltages of buffers 4502 and 4503 are trimmed to adjust the range and to calibrate out any offset, for example buffer offset. Alternatively, ITV 4501 can drive resistor string 4504 directly to provide the 32 reference levels (L0-L31).
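As a numerical illustration of the resistor string, the sketch below computes the tap voltages between the low and high verification reference voltages. It assumes ideal, equal-valued resistors and ideal buffers, which this description does not require (the string is described as trimmable), and the function name is hypothetical.

```python
def resistor_string_levels(v_low, v_high, n_levels):
    """Tap voltages of a string of n_levels - 1 equal resistors driven at both ends."""
    step = (v_high - v_low) / (n_levels - 1)
    return [v_low + k * step for k in range(n_levels)]

# 32 levels (L0-L31) between 300 mV and 900 mV, as in the example values above.
levels_mv = resistor_string_levels(300, 900, 32)
print(f"step = {levels_mv[1] - levels_mv[0]:.2f} mV")
```

With these example endpoints, adjacent levels are separated by 600/31 mV, or roughly 19.4 mV.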

In one example, K verification reference voltage lines can be used to provide N different voltages. For example, 32 verification reference voltage lines are needed to feed verification reference voltage selection circuit 4406 for a 5-bit cell, and 64 lines are needed for a 6-bit cell. Alternatively, 32 reference voltage lines can each be used twice, by time-multiplexing their use, to provide 64 verification reference voltages, such that the 32 lines provide a first set of 32 voltages during a first verify period and a second set of 32 voltages during a second verify period. This approach can be extended, without limitation, by using four verify periods to provide 128 voltages for a 7-bit cell, eight verify periods to provide 256 voltages for an 8-bit cell, and so on.
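The time-multiplexing arithmetic can be checked with a short sketch. The function names are hypothetical, and the mapping of consecutive blocks of K levels to consecutive verify periods is an assumption made for illustration; the description does not specify which levels are served in which period.

```python
def verify_periods(bits_per_cell, k_lines):
    """Number of verify periods needed to supply 2**bits levels over k_lines lines."""
    n_levels = 2 ** bits_per_cell
    return -(-n_levels // k_lines)   # ceiling division

def line_and_period(level, k_lines):
    """Return (line index, verify period index) for a level, assuming block ordering."""
    period, line = divmod(level, k_lines)
    return line, period
```

With 32 lines, a 6-bit cell needs two verify periods, and under the assumed ordering level 40 would be carried on line 8 during the second period (period index 1).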

In one example, the bias voltages of the array inputs for the verify operation (e.g., the CG bias and EG bias) are generated by the reference array, so that these bias voltages adapt to temperature and keep the array current as constant as possible.

FIGS. 46-50 depict examples of reference arrays that can be used for reference array 4304 of FIG. 43.

FIG. 46 depicts physical array 4600. Physical array 4600 comprises an array of non-volatile memory cells, which optionally comprise stacked-gate flash memory cells or split-gate flash memory cells. Physical array 4600 is divided into two types of arrays: VMM array 3401 (as in FIG. 34) and reference array 4304. In one example, VMM array 3401 and reference array 4304 share the same bit lines. In another example, VMM array 3401 and reference array 4304 use separate sets of bit lines, where the two sets of bit lines are isolated from each other.

FIG. 47 depicts physical array 4700, which is divided into two arrays: VMM array 3401 and reference array 4304. In one example, VMM array 3401 and reference array 4304 share one or more sets of horizontal lines, such as word lines, control gate lines, and erase lines. In another example, VMM array 3401 and reference array 4304 do not share any set of horizontal lines.

FIG. 48 depicts an example in which reference array 4304 and VMM array 3401 are located in separate physical arrays. For example, there can be substrate separation or active diffusion separation between the two arrays. Physical array 4801 contains VMM array 3401, and physical array 4802 contains reference array 4304. VMM array 3401 and reference array 4304 do not share any bit lines, word lines, control gate lines, or erase lines.

圖49描繪參考陣列4304的實例。在此，參考陣列4304包括複數個子參考陣列，例如，子參考陣列4901-0、4901-1、…、4901-(n-1)及4901-n。因此，參考陣列4304包含n+1個不同的子參考陣列。不同的子參考陣列可以具有不同的特性，這使各個子參考陣列具有與其它子參考陣列不同的I-V曲線之特性。例如，各個子參考陣列可以在以下一項或多項的尺寸方面有所不同：(1)各個參考陣列的電晶體之控制閘極線的寬度；(2)各個參考陣列的電晶體之字元線的寬度；(3)各個參考陣列的電晶體之浮動閘極的寬度；(4)各個參考陣列中之非揮發性記憶體單元的總寬度；(5)各個參考陣列內之淺溝槽隔離(STI)間距；(6)其它特性。再者，子參考陣列在一個或多個裝置植入條件或摻雜特性方面各自有所不同(例如，井植入條件、源極植入條件、汲極植入條件，但不限於此)。FIG. 49 depicts an example of reference array 4304. Here, reference array 4304 includes a plurality of sub-reference arrays, for example, sub-reference arrays 4901-0, 4901-1, ..., 4901-(n-1), and 4901-n. Reference array 4304 therefore includes n+1 different sub-reference arrays. Different sub-reference arrays may have different characteristics, which gives each sub-reference array an I-V curve that differs from those of the other sub-reference arrays. For example, the sub-reference arrays may differ in one or more of the following dimensions: (1) the width of the control gate lines of the transistors in each reference array; (2) the width of the word lines of the transistors in each reference array; (3) the width of the floating gates of the transistors in each reference array; (4) the total width of the non-volatile memory cells in each reference array; (5) the shallow trench isolation (STI) spacing within each reference array; and (6) other characteristics. Furthermore, the sub-reference arrays may each differ in one or more device implant conditions or doping characteristics (for example, without limitation, well implant conditions, source implant conditions, and drain implant conditions).

圖50描繪參考陣列4304的另一個實例。在此，參考陣列4304包括複數個子參考陣列，例如，子參考陣列5001-0、5001-1、…、5001-(n-1)及5001-n以及5002-0、5002-1、…、5002-(n-1)及5002-n。因此，參考陣列4304包括2*(n+1)個不同的子參考陣列，這意味著數量是圖49中的兩倍。如同在圖49中，圖50中的不同子參考陣列可以具有不同的特性，這使各個子參考陣列具有與其它子參考陣列不同的I-V曲線之特性。例如，各個子參考陣列可以在以下一項或多項的尺寸方面有所不同：控制閘極寬度、字元線寬度、浮動閘極寬度、陣列中之非揮發性記憶體單元的總寬度、STI間距以及裝置植入條件，但不限於此。FIG. 50 depicts another example of reference array 4304. Here, reference array 4304 includes a plurality of sub-reference arrays, for example, sub-reference arrays 5001-0, 5001-1, ..., 5001-(n-1), and 5001-n, as well as 5002-0, 5002-1, ..., 5002-(n-1), and 5002-n. Reference array 4304 therefore includes 2*(n+1) different sub-reference arrays, twice the number in FIG. 49. As in FIG. 49, the different sub-reference arrays in FIG. 50 may have different characteristics, which gives each sub-reference array an I-V curve that differs from those of the other sub-reference arrays. For example, the sub-reference arrays may differ in one or more of the following, without limitation: control gate width, word line width, floating gate width, total width of the non-volatile memory cells in the array, STI spacing, and device implant conditions.

應當注意，如本文所使用，術語「在…上方」及「在…上」均包含性地包括「直接在…上」(沒有中間材料、元件或空間設置在其間)及「間接在…上」(中間材料、元件或空間設置在其間)。同樣地，術語「相鄰」包括「直接相鄰」(沒有中間材料、元件或空間設置在其間)及「間接相鄰」(中間材料、元件或空間設置在其間)，「安裝至」包括「直接安裝至」(沒有中間材料、元件或空間設置在其間)及「間接安裝至」(中間材料、元件或空間設置在其間)，以及「電耦接至」包括「直接電耦接至」(沒有中間材料或元件在其間將元件電連接在一起)及「間接電耦接至」(中間材料或元件在其間將元件電連接在一起)。例如，「在基板上方」形成元件可以包括在基板上直接形成元件而在其間沒有中間材料/元件，以及在基板上間接形成元件而在其間具有一個或多個中間材料/元件。It should be noted that, as used herein, the terms "above" and "on" both include "directly on" (without intervening materials, elements, or spaces disposed therebetween) and "indirectly on" (with intervening materials, elements, or spaces disposed therebetween). Similarly, the term "adjacent" includes "directly adjacent" (without intervening materials, elements, or spaces disposed therebetween) and "indirectly adjacent" (with intervening materials, elements, or spaces disposed therebetween), "mounted to" includes "directly mounted to" (without intervening materials, elements, or spaces disposed therebetween) and "indirectly mounted to" (with intervening materials, elements, or spaces disposed therebetween), and "electrically coupled to" includes "directly electrically coupled to" (without intervening materials or elements electrically connecting the elements together) and "indirectly electrically coupled to" (with intervening materials or elements electrically connecting the elements together). For example, forming a component "over a substrate" may include forming the component directly on the substrate without intervening materials/components therebetween, as well as forming the component indirectly on the substrate with one or more intervening materials/components therebetween.

12:半導體基板 14:源極區域 16:汲極區域 18:通道區域 20:浮動閘極 22:字元線端子 24:位元線 28:控制閘極 30:抹除閘極 31:數位至類比轉換器 32:向量矩陣乘法(VMM)陣列 32a:VMM陣列 32b:VMM陣列 32c:VMM陣列 32d:VMM陣列 32e:VMM陣列 33:非揮發性記憶體單元陣列 34:抹除閘極及字元線閘極解碼器 35:控制閘極解碼器 36:位元線解碼器 37:源極線解碼器 38:差分加法器 39:激勵函數方塊 210:記憶體單元 310:4-閘極記憶體單元 410:3-閘極記憶體單元 510:堆疊式閘極記憶體單元 710:記憶體單元 900:神經元VMM 陣列 901:非揮發性記憶體單元的記憶體陣列 902:非揮發性參考記憶體單元的參考陣列 903:控制閘極線 904:抹除閘極線 1000:神經元VMM陣列 1001:第一非揮發性參考記憶體單元的參考陣列 1002:第二非揮發性參考記憶體單元的參考陣列 1003:非揮發性記憶體單元的記憶體陣列 1014:多工器 1100:神經元VMM陣列 1101:第一非揮發性參考記憶體單元的參考陣列 1102:第二非揮發性參考記憶體單元的參考陣列 1103:非揮發性記憶體單元的記憶體陣列 1200:神經元VMM陣列 1201:第一非揮發性參考記憶體單元的參考陣列 1202:第二非揮發性參考記憶體單元的參考陣列 1203:非揮發性記憶體單元的記憶體陣列 1204:疊接電晶體 1205:多工器 1212:多工器 1300:神經元VMM陣列 1301:第一非揮發性參考記憶體單元的參考陣列 1302:第二非揮發性參考記憶體單元的參考陣列 1303:非揮發性記憶體單元的記憶體陣列 1314:多工器 1400:LSTM 1401:單元 1402:單元 1403:單元 1404:單元 1500:LSTM單元 1501:sigmoid函數裝置 1502:sigmoid函數裝置 1503:sigmoid函數裝置 1504:tanh裝置 1505:tanh裝置 1506:乘法裝置 1507:乘法裝置 1508:乘法裝置 1509:加法裝置 1600:LSTM單元 1601:VMM陣列 1602:激勵函數區塊 1700:LSTM單元 1701:VMM陣列 1702:激勵函數區塊 1703:乘法裝置 1704:暫存器 1705:暫存器 1706:暫存器 1707:暫存器 1708:加法裝置 1709:多工器 1710:多工器 1800:GRU 1801:單元 1802:單元 1803:單元 1804:單元 1900:GRU單元 1901:sigmoid函數裝置 1902:sigmoid函數裝置 1903:tanh裝置 1904:乘法裝置 1905:乘法裝置 1906:乘法裝置 1907:加法裝置 1908:互補裝置 2000:GRU單元 2001:VMM陣列 2002:激勵函數區塊 2100:GRU單元 2101:VMM陣列 2102:激勵函數區塊 2103:乘法裝置 2104:多工器 2105:加法裝置 2106:暫存器 2107:暫存器 2108:暫存器 2109:互補裝置 2200:神經元VMM陣列 2300:神經元VMM陣列 2400:神經元VMM陣列 2500:神經元VMM陣列 2600:神經元VMM陣列 2700:神經元VMM陣列 2701-1-2701-N:位元線控制閘 2800:神經元VMM陣列 2900:神經元VMM陣列 3000:神經元VMM陣列 3100:VMM系統 3101:求和電路 3102:求和電路 3210:VMM系統 3211:第一陣列 3212:第二陣列 3213:求和電路 3300:VMM系統 3301:陣列 3302:陣列 3303:求和電路 3304:求和電路 3305:求和電路 3306:求和電路 3307:求和電路 3308:求和電路 3400:VMM系統 3401:VMM陣列 3402:列解碼器 3403:高電壓解碼器 3404:行解碼器 3405:位元線驅動器 3406:輸入電路 3407:輸出電路 3408:控制邏輯 3409:偏壓產生器 3410:高電壓產生區塊 3411:電荷泵 3412:電荷泵調節器 3413:高電壓位準產生器 3414:演算法控制器 3415:類比電路 3416:控制引擎 3417:測試控制邏輯 3418:靜態隨機存取記憶體(SRAM)區塊 3701:序列 3702:序列 3703:序列 3801:序列 3802:序列 3803:序列 4001:電流源 4002:記憶體單元 4300:VMM系統 4301:電流至電壓轉換器及類比至數位轉換器區塊 4302:驗證電路 
4303:參考電壓產生器 4304:參考陣列 4305:主參考電流產生器 4401:電流至電壓轉換器(ITV) 4402:連續近似暫存器(SAR)類比至數位轉換器(ADC) 4403:驗證電路 4404:比較器 4405:驗證暫存器 4406:參考電壓選擇電路 4480:區域差分運算放大器 4488:神經元輸出ITV+ADC+驗證電路 4490:比較器及偏移電路 4490C:電容器ITV(CITV) 4490R:電阻器ITV(RITV) 4491:比較器 4492:校準電路 4493:校準電路 4500:參考電壓產生器 4501:電流至電壓轉換器 4502:緩衝器 4503:緩衝器 4504:電阻器串 4510:電容器電流至電壓轉換器(CITV) 4510a:電容器 4510b:電容器 4511:電阻器電流至電壓轉換器(RITV) 4511a:電阻器 4511b:電阻器 4512:偏流電路 4600:實體陣列 4700:實體陣列 4801:實體陣列 4802:實體陣列 4901-0:子參考陣列 4901-1:子參考陣列 4901-(n-1):子參考陣列 4901-n:子參考陣列 5001-0:子參考陣列 5001-1:子參考陣列 5001-(n-1):子參考陣列 5001-n:子參考陣列 5002-0:子參考陣列 5002-1:子參考陣列 5002-(n-1):子參考陣列 5002-n:子參考陣列 BL0-BLN:位元線 BLR0:端子 BLR1:端子 BLR2:端子 BLR3:端子 c ₀:單元狀態向量 c ₁:單元狀態向量 c ₂:單元狀態向量 c(t-1):單元狀態向量 c(t):單元狀態向量 C1:層 C2:層 C3:層 CB1:突觸 CB2:突觸 CB3:突觸 CB4:突觸 CG0:電壓輸入(控制閘極線) CG1:電壓輸入(控制閘極線) CG2:電壓輸入(控制閘極線) CG3:電壓輸入(控制閘極線) CG ₀-CG _M:控制閘極線 COMPOUT:輸出 COMPOUTB:輸出COMPOUT的補數 EG0:EG線 EG1:EG線 EGR0:EG線 EGR1:EG線 h ₀:輸出向量 h ₁:輸出向量 h ₂:輸出向量 h ₃:輸出向量 h(t-1):輸出向量 h(t):輸出向量 INPUT ₀-INPUT _M:輸入 INPUT ₀-INPUT _N:輸入 OUTPUT ₀-OUTPUT _N:輸出 P1:激勵函數 P2:激勵函數 S0:輸入層 S1:層 S2:層 S3:輸出層 S1A:開關 SL0:源極線 SL1:源極線 SL2:源極線 SL3:源極線 VINN:輸入 Vinp:ITV 4401的正輸出 VINP:輸入 VON:偏移電壓 VOP:偏移電壓 VREF2:低驗證參考電壓 WL0:電壓輸入(字元線) WL1:電壓輸入(字元線) WL2:電壓輸入(字元線) WL3:電壓輸入(字元線) WL ₀-WL _M:字元線 WLA0:字元線 WLA1:字元線 WLA2:字元線 WLA3:字元線 WLB0:字元線 WLB1:字元線 WLB2:字元線 WLB3:字元線 x ₀:輸入向量 x ₁:輸入向量 x ₂:輸入向量 x ₃:輸入向量 x(t):輸入向量

12: semiconductor substrate; 14: source region; 16: drain region; 18: channel region; 20: floating gate; 22: word line terminal; 24: bit line; 28: control gate; 30: erase gate; 31: digital-to-analog converter; 32: vector-by-matrix multiplication (VMM) array; 32a: VMM array; 32b: VMM array; 32c: VMM array; 32d: VMM array; 32e: VMM array; 33: non-volatile memory cell array; 34: erase gate and word line gate decoder; 35: control gate decoder; 36: bit line decoder; 37: source line decoder; 38: differential summer; 39: activation function block; 210: memory cell; 310: 4-gate memory cell; 410: 3-gate memory cell; 510: stacked gate memory cell; 710: memory cell; 900: neuron VMM array; 901: memory array of non-volatile memory cells; 902: reference array of non-volatile reference memory cells; 903: control gate line; 904: erase gate line; 1000: neuron VMM array; 1001: reference array of first non-volatile reference memory cells; 1002: reference array of second non-volatile reference memory cells; 1003: memory array of non-volatile memory cells; 1014: multiplexer; 1100: neuron VMM array; 1101: reference array of first non-volatile reference memory cells; 1102: reference array of second non-volatile reference memory cells; 1103: memory array of non-volatile memory cells; 1200: neuron VMM array; 1201: reference array of first non-volatile reference memory cells; 1202: reference array of second non-volatile reference memory cells; 1203: memory array of non-volatile memory cells; 1204: cascoding transistor; 1205: multiplexer; 1212: multiplexer; 1300: neuron VMM array; 1301: reference array of first non-volatile reference memory cells; 1302: reference array of second non-volatile reference memory cells; 1303: memory array of non-volatile memory cells; 1314: multiplexer; 1400: LSTM; 1401: cell; 1402: cell; 1403: cell; 1404: cell; 1500: LSTM cell; 1501: sigmoid function device; 1502: sigmoid function device; 1503: sigmoid function device; 1504: tanh device; 1505: tanh device; 1506: multiplier device; 1507: multiplier device; 1508: multiplier device; 1509: addition device; 1600: LSTM cell; 1601: VMM array; 1602: activation function block; 1700: LSTM cell; 1701: VMM array; 1702: activation function block; 1703: multiplier device; 1704: register; 1705: register; 1706: register; 1707: register; 1708: addition device; 1709: multiplexer; 1710: multiplexer; 1800: GRU; 1801: cell; 1802: cell; 1803: cell; 1804: cell; 1900: GRU cell; 1901: sigmoid function device; 1902: sigmoid function device; 1903: tanh device; 1904: multiplier device; 1905: multiplier device; 1906: multiplier device; 1907: addition device; 1908: complementary device; 2000: GRU cell; 2001: VMM array; 2002: activation function block; 2100: GRU cell; 2101: VMM array; 2102: activation function block; 2103: multiplier device; 2104: multiplexer; 2105: addition device; 2106: register; 2107: register; 2108: register; 2109: complementary device; 2200: neuron VMM array; 2300: neuron VMM array; 2400: neuron VMM array; 2500: neuron VMM array; 2600: neuron VMM array; 2700: neuron VMM array; 2701-1-2701-N: bit line control gate; 2800: neuron VMM array; 2900: neuron VMM array; 3000: neuron VMM array; 3100: VMM system; 3101: summing circuit; 3102: summing circuit; 3210: VMM system; 3211: first array; 3212: second array; 3213: summing circuit; 3300: VMM system; 3301: array; 3302: array; 3303: summing circuit; 3304: summing circuit; 3305: summing circuit; 3306: summing circuit; 3307: summing circuit; 3308: summing circuit; 3400: VMM system; 3401: VMM array; 3402: row decoder; 3403: high voltage decoder; 3404: column decoder; 3405: bit line driver; 3406: input circuit; 3407: output circuit; 3408: control logic; 3409: bias generator; 3410: high voltage generation block; 3411: charge pump; 3412: charge pump regulator; 3413: high voltage level generator; 3414: algorithm controller; 3415: analog circuitry; 3416: control engine; 3417: test control logic; 3418: static random access memory (SRAM) block; 3701: sequence; 3702: sequence; 3703: sequence; 3801: sequence; 3802: sequence; 3803: sequence; 4001: current source; 4002: memory cell; 4300: VMM system; 4301: current-to-voltage converter and analog-to-digital converter block; 4302: verification circuit; 4303: reference voltage generator; 4304: reference array; 4305: main reference current generator; 4401: current-to-voltage converter (ITV); 4402: successive approximation register (SAR) analog-to-digital converter (ADC); 4403: verification circuit; 4404: comparator; 4405: verification register; 4406: reference voltage selection circuit; 4480: regional differential operational amplifier; 4488: neuron output ITV+ADC+verification circuit; 4490: comparator and offset circuit; 4490C: capacitor ITV (CITV); 4490R: resistor ITV (RITV); 4491: comparator; 4492: calibration circuit; 4493: calibration circuit; 4500: reference voltage generator; 4501: current-to-voltage converter; 4502: buffer; 4503: buffer; 4504: resistor string; 4510: capacitor current-to-voltage converter (CITV); 4510a: capacitor; 4510b: capacitor; 4511: resistor current-to-voltage converter (RITV); 4511a: resistor; 4511b: resistor; 4512: bias current circuit; 4600: physical array; 4700: physical array; 4801: physical array; 4802: physical array; 4901-0: sub-reference array; 4901-1: sub-reference array; 4901-(n-1): sub-reference array; 4901-n: sub-reference array; 5001-0: sub-reference array; 5001-1: sub-reference array; 5001-(n-1): sub-reference array; 5001-n: sub-reference array; 5002-0: sub-reference array; 5002-1: sub-reference array; 5002-(n-1): sub-reference array; 5002-n: sub-reference array; BL0-BLN: bit line; BLR0: terminal; BLR1: terminal; BLR2: terminal; BLR3: terminal; c₀: cell state vector; c₁: cell state vector; c₂: cell state vector; c(t-1): cell state vector; c(t): cell state vector; C1: layer; C2: layer; C3: layer; CB1: synapse; CB2: synapse; CB3: synapse; CB4: synapse; CG0: voltage input (control gate line); CG1: voltage input (control gate line); CG2: voltage input (control gate line); CG3: voltage input (control gate line); CG₀-CG_M: control gate line; COMPOUT: output; COMPOUTB: complement of output COMPOUT; EG0: EG line; EG1: EG line; EGR0: EG line; EGR1: EG line; h₀: output vector; h₁: output vector; h₂: output vector; h₃: output vector; h(t-1): output vector; h(t): output vector; INPUT₀-INPUT_M: input; INPUT₀-INPUT_N: input; OUTPUT₀-OUTPUT_N: output; P1: activation function; P2: activation function; S0: input layer; S1: layer; S2: layer; S3: output layer; S1A: switch; SL0: source line; SL1: source line; SL2: source line; SL3: source line; VINN: input; Vinp: positive output of ITV 4401; VINP: input; VON: offset voltage; VOP: offset voltage; VREF2: low verify reference voltage; WL0: voltage input (word line); WL1: voltage input (word line); WL2: voltage input (word line); WL3: voltage input (word line); WL₀-WL_M: word line; WLA0: word line; WLA1: word line; WLA2: word line; WLA3: word line; WLB0: word line; WLB1: word line; WLB2: word line; WLB3: word line; x₀: input vector; x₁: input vector; x₂: input vector; x₃: input vector; x(t): input vector

圖1係說明人工神經網路之示圖。FIG. 1 is a diagram illustrating an artificial neural network.

圖2描繪習知技藝的分離式閘極快閃記憶體單元。FIG. 2 depicts a prior art split gate flash memory cell.

圖3描繪另一個習知技藝的分離式閘極快閃記憶體單元。FIG. 3 depicts another prior art split gate flash memory cell.

圖4描繪另一個習知技藝的分離式閘極快閃記憶體單元。FIG. 4 depicts another prior art split gate flash memory cell.

圖5描繪另一個習知技藝的分離式閘極快閃記憶體單元。FIG. 5 depicts another prior art split gate flash memory cell.

圖6係說明利用一個或多個非揮發性記憶體陣列之示例性人工神經網路的不同層級之示圖。FIG. 6 is a diagram illustrating the different levels of an exemplary artificial neural network utilizing one or more non-volatile memory arrays.

圖7係說明VMM系統的方塊圖。FIG. 7 is a block diagram illustrating a VMM system.

圖8係說明使用一個或多個VMM系統之示例性人工神經網路的方塊圖。FIG. 8 is a block diagram illustrating an exemplary artificial neural network that uses one or more VMM systems.

圖9描繪VMM系統的另一個具體例。FIG. 9 depicts another embodiment of a VMM system.

圖10描繪VMM系統的另一個具體例。FIG. 10 depicts another embodiment of a VMM system.

圖11描繪VMM系統的另一個具體例。FIG. 11 depicts another embodiment of a VMM system.

圖12描繪VMM系統的另一個具體例。FIG. 12 depicts another embodiment of a VMM system.

圖13描繪VMM系統的另一個具體例。FIG. 13 depicts another embodiment of a VMM system.

圖14描繪習知技藝的長短期記憶體系統。FIG. 14 depicts a prior art long short-term memory system.

圖15描繪用於長短期記憶體系統中之一個示例性單元。FIG. 15 depicts an exemplary cell for use in a long short-term memory system.

圖16描繪圖15的單元之一個示例性實施方式。FIG. 16 depicts an exemplary implementation of the cell of FIG. 15.

圖17描繪圖15的單元之另一個示例性實施方式。FIG. 17 depicts another exemplary implementation of the cell of FIG. 15.

圖18描繪習知技藝的閘控遞歸單元系統。FIG. 18 depicts a prior art gated recurrent unit system.

圖19描繪用於閘控遞歸單元系統中之一個示例性單元。FIG. 19 depicts an exemplary cell for use in a gated recurrent unit system.

圖20描繪圖19的單元之一個示例性實施方式。FIG. 20 depicts an exemplary implementation of the cell of FIG. 19.

圖21描繪圖19的單元之另一個示例性實施方式。FIG. 21 depicts another exemplary implementation of the cell of FIG. 19.

圖22描繪VMM系統的另一個實例。FIG. 22 depicts another example of a VMM system.

圖23描繪VMM系統的另一個實例。FIG. 23 depicts another example of a VMM system.

圖24描繪VMM系統的另一個實例。FIG. 24 depicts another example of a VMM system.

圖25描繪VMM系統的另一個實例。FIG. 25 depicts another example of a VMM system.

圖26描繪VMM系統的另一個實例。FIG. 26 depicts another example of a VMM system.

圖27描繪VMM系統的另一個實例。FIG. 27 depicts another example of a VMM system.

圖28描繪VMM系統的另一個實例。FIG. 28 depicts another example of a VMM system.

圖29描繪VMM系統的另一個實例。FIG. 29 depicts another example of a VMM system.

圖30描繪VMM系統的另一個實例。FIG. 30 depicts another example of a VMM system.

圖31描繪VMM系統的另一個實例。FIG. 31 depicts another example of a VMM system.

圖32描繪VMM系統的另一個實例。FIG. 32 depicts another example of a VMM system.

圖33描繪VMM系統的另一個實例。FIG. 33 depicts another example of a VMM system.

圖34描繪VMM系統的另一個實例。FIG. 34 depicts another example of a VMM system.

圖35A及35B描繪個別的程式化方法。FIGS. 35A and 35B depict respective programming methods.

圖36描繪搜尋及執行方法。FIG. 36 depicts a search and execute method.

圖37描繪精確程式化方法。FIG. 37 depicts a precision programming method.

圖38描繪精確程式化方法。FIG. 38 depicts a precision programming method.

圖39描繪可調適校準方法。FIG. 39 depicts an adaptive calibration method.

圖40描繪校準電路。FIG. 40 depicts a calibration circuit.

圖41描繪可調適校準方法。FIG. 41 depicts an adaptive calibration method.

圖42描繪絕對校準方法。FIG. 42 depicts an absolute calibration method.

圖43描繪包括驗證電路的VMM系統。FIG. 43 depicts a VMM system that includes a verification circuit.

圖44A描繪實例驗證電路。FIG. 44A depicts an example verification circuit.

圖44B描繪具有偏移補償的實例比較器電路。FIG. 44B depicts an example comparator circuit with offset compensation.

圖45描繪參考電壓產生器。FIG. 45 depicts a reference voltage generator.

圖46描繪包括參考陣列的實體陣列。FIG. 46 depicts a physical array that includes a reference array.

圖47描繪包括參考陣列的實體陣列。FIG. 47 depicts a physical array that includes a reference array.

圖48描繪包括VMM陣列的實體陣列及包括參考陣列的另一個實體陣列。FIG. 48 depicts a physical array including a VMM array and another physical array including a reference array.

圖49描繪包括複數個參考子陣列的參考陣列。FIG. 49 depicts a reference array comprising a plurality of reference sub-arrays.

圖50描繪包括複數個參考子陣列的參考陣列。FIG. 50 depicts a reference array comprising a plurality of reference sub-arrays.

C1:層 C1: Layer

C2:層 C2: Layer

C3:層 C3: Layer

CB1:突觸 CB1: Synapse

CB2:突觸 CB2: Synapse

CB3:突觸 CB3: Synapse

CB4:突觸 CB4: Synapse

P1:激勵函數 P1: Activation function

P2:激勵函數 P2: Activation function

S1:層 S1: Layer

S2:層 S2: Layer

S3:輸出層 S3: Output layer

Claims

A system comprising: a vector matrix multiplication array including a plurality of non-volatile memory cells arranged in rows and columns, each of which is capable of storing one of N possible levels corresponding to one of N possible currents; and a plurality of output blocks for receiving current from respective columns of the vector matrix multiplication array, generating voltages during a verification operation of the vector matrix multiplication array, and generating digital outputs during a read operation of the vector matrix multiplication array.

A system as claimed in claim 1, wherein the plurality of output blocks convert current from the columns of the array into voltage using a plurality of resistors or a plurality of capacitors.

The system of claim 1, comprising: A reference voltage generator for generating one of N voltages during the verification operation.

The system of claim 3, comprising: A verification circuit for comparing a voltage from the reference voltage generator with a voltage from one of the plurality of output blocks.

A system as claimed in claim 4, wherein the verification circuit generates a digital output indicating a result of the comparison.

The system of claim 3, wherein the reference voltage generator generates one of the N voltages in response to a current received from a reference array.

A system as in claim 3, wherein the reference voltage generator generates one of the N voltages in response to a current received from a main reference current generator.

The system of claim 3, wherein the reference voltage generator comprises: a current-to-voltage converter for converting a maximum current among the N possible currents into a maximum voltage; and a resistor string for generating N voltages ranging between the maximum voltage and a minimum voltage.

A system as claimed in claim 8, wherein the resistor string comprises (N-1) resistors connected in series.

The system of claim 8, comprising a first buffer for providing the maximum voltage to a first end of the resistor string.

The system of claim 10, comprising a second buffer for providing the minimum voltage to a second end of the resistor string.
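The resistor-string reference voltage generator recited above can be illustrated numerically. The following is only a minimal sketch under assumptions that the claims do not state: the (N-1) series resistors are taken to be equal, the buffers are taken to be ideal, and the function name `resistor_string_taps` and the example voltages are hypothetical.

```python
def resistor_string_taps(v_max, v_min, n_levels):
    """Return N tap voltages of an ideal string of (N-1) series resistors.

    Assumption (not from the claims): all resistors are equal, so the
    taps are evenly spaced between v_min and v_max.
    """
    if n_levels < 2:
        raise ValueError("need at least two levels")
    # (N-1) equal resistors divide the span into (N-1) equal steps.
    step = (v_max - v_min) / (n_levels - 1)
    return [v_min + k * step for k in range(n_levels)]


# Hypothetical example: 5 levels between 0 V and 1 V.
taps = resistor_string_taps(v_max=1.0, v_min=0.0, n_levels=5)
print(taps)
```

With unequal resistors the taps would be non-uniformly spaced, but there would still be N taps for (N-1) resistors, counting both ends of the string.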

A system comprising: a current-to-voltage converter for converting a current from a vector matrix array into a voltage; a successive approximation register analog-to-digital converter for receiving the voltage from the current-to-voltage converter and generating a digital output during a read operation; and a verification circuit for receiving the voltage from the current-to-voltage converter and comparing the voltage to a reference voltage during a verification operation.
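The difference between the read path and the verification path recited above can be sketched behaviorally. This is an idealized model, not the disclosed circuit: the functions `sar_adc` and `verify`, the 4-bit resolution, and the comparison polarity (pass when the input is at or above the reference) are all assumptions made for illustration.

```python
def sar_adc(v_in, v_full_scale, bits):
    """Ideal successive approximation: one comparison per bit, MSB first."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Ideal internal DAC voltage for the trial code.
        v_dac = trial * v_full_scale / (1 << bits)
        if v_in >= v_dac:
            code = trial  # keep the bit
    return code


def verify(v_in, v_ref):
    """Single comparison against one reference voltage (assumed polarity)."""
    return v_in >= v_ref


# Hypothetical example values.
read_code = sar_adc(v_in=0.6, v_full_scale=1.0, bits=4)
verify_pass = verify(v_in=0.6, v_ref=0.5)
print(read_code, verify_pass)
```

In this model a read takes one comparison per output bit, whereas a verification takes a single comparison against the selected reference voltage.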

A system as claimed in claim 12, wherein the reference voltage is provided by a reference voltage generator, the reference voltage generator comprising: a current-to-voltage converter for converting a maximum current among N possible currents into a maximum voltage; and a resistor string for generating N voltages ranging between the maximum voltage and a minimum voltage.

A system as claimed in claim 13, wherein the resistor string comprises (N-1) resistors connected in series.

The system of claim 13, comprising a first buffer for providing the maximum voltage to a first end of the resistor string.

The system of claim 15, comprising a second buffer for providing the minimum voltage to a second end of the resistor string.

A system comprising: a vector matrix multiplication array including a plurality of non-volatile memory cells arranged in rows and columns, each of which is capable of storing one of N possible voltages corresponding to one of N possible currents; and a plurality of current-to-voltage converters and a plurality of verification circuits for receiving current from the columns of the array, generating voltages during a verification operation of the vector matrix multiplication array, and generating digital outputs during a read operation of the vector matrix multiplication array.

The system of claim 17, comprising: A reference voltage generator for generating one of N voltages during the verification operation of the vector matrix multiplication array.

The system of claim 18, comprising: A comparator for comparing a voltage from the reference voltage generator with a voltage from one of the plurality of current-to-voltage converters.

A system as claimed in claim 19, wherein the comparator performs offset calibration.

A system as in claim 20, wherein the offset calibration is performed in the time domain.

A system as in claim 17, wherein the current-to-voltage converters use a plurality of resistors or a plurality of capacitors to convert current from the columns of the array into voltages.

A system comprising: a vector matrix multiplication array including a plurality of non-volatile memory cells arranged in rows and columns, each of which is capable of storing one of N possible levels corresponding to one of N possible currents; and a plurality of output blocks for receiving differential currents from the columns of the vector matrix multiplication array and generating voltages during a verification operation of the vector matrix multiplication array.

The system of claim 23, further comprising: a verification circuit for generating an output.

A system as claimed in claim 23, wherein the plurality of output blocks include respective current-to-voltage converters for converting current from the columns of the vector matrix multiplication array into voltages using a plurality of resistors or a plurality of capacitors, respectively.

A system as in claim 25, wherein the plurality of output blocks include respective analog-to-digital converters for converting the voltages from the current-to-voltage converters into digital outputs, and wherein the verification operation utilizes the digital outputs.

A method, comprising: receiving a differential current according to the formula w=(w+)–(w-), wherein (w+) is received from a first column of a vector matrix multiplication array and (w-) is received from a second column of the vector matrix multiplication array; and verifying the differential current against a reference current.
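The differential verification in the method above can be shown with a short numerical sketch. The formula w=(w+)–(w-) is taken from the claim; the tolerance window, the function name `verify_differential`, and the nanoampere example values are assumptions, since the claim does not say how closely the differential current must match the reference current.

```python
def verify_differential(i_plus_na, i_minus_na, i_ref_na, tol_na):
    """Check w = (w+) - (w-) against a reference current.

    All currents are in nanoamperes. The +/- tol_na acceptance window
    is an illustrative assumption, not part of the claim.
    """
    w = i_plus_na - i_minus_na
    return abs(w - i_ref_na) <= tol_na


# Hypothetical example: w+ = 5 nA, w- = 2 nA, target 3 nA, window 0.1 nA.
ok = verify_differential(5.0, 2.0, 3.0, 0.1)
too_low = verify_differential(5.0, 2.0, 4.0, 0.1)
print(ok, too_low)
```

Because only the difference is checked, the same target w can be reached by many (w+, w-) pairs.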

A system comprising: a vector matrix multiplication array including a plurality of non-volatile memory cells arranged in rows and columns, each of which is capable of storing one of N possible levels corresponding to one of N possible currents; and a plurality of reference voltage generators generating K reference voltages on K reference voltage lines, wherein K < N, and wherein the K reference voltage lines are used to verify the N possible currents in a time-multiplexed manner.
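One way to picture the time multiplexing of K reference voltage lines over N levels is as a schedule of time slots. The sketch below is only one possible assignment, assuming that levels are served in consecutive groups of at most K per slot; the claim does not specify the ordering, and the function name `multiplex_schedule` is hypothetical.

```python
def multiplex_schedule(n_levels, k_lines):
    """Group N level indices into time slots of at most K levels each.

    Assumption: levels are assigned to slots in consecutive order.
    """
    if not 0 < k_lines < n_levels:
        raise ValueError("requires 0 < K < N")
    return [
        list(range(start, min(start + k_lines, n_levels)))
        for start in range(0, n_levels, k_lines)
    ]


# Hypothetical example: N = 8 levels shared over K = 3 reference lines.
slots = multiplex_schedule(n_levels=8, k_lines=3)
print(slots)
```

Under this assumption the number of time slots is ceil(N/K), so fewer reference lines are traded for more verification passes.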

The system of claim 28, comprising: a plurality of current-to-voltage converters for receiving current from columns of the vector matrix multiplication array and generating voltages during a verification operation of the vector matrix multiplication array.

A system as in claim 29, wherein the current-to-voltage converters use a plurality of resistors or a plurality of capacitors to convert current from the columns of the vector matrix multiplication array into voltages.

A system as claimed in claim 28, comprising: A verification circuit for generating a comparison output.

A system as claimed in claim 28, comprising: an analog-to-digital converter for generating a comparison output.