TWI793278B - Computing cell for performing xnor operation, neural network and method for performing digital xnor operation - Google Patents
- Publication number
- TWI793278B
- Authority
- TW
- Taiwan
- Prior art keywords
- field effect
- transistor
- ferroelectric field
- line
- effect transistor
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/607—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers number-of-ones counters, i.e. devices for counting the number of input lines set to ONE among a plurality of input lines, also called bit counters or parallel counters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Abstract
Description
The present invention relates generally to neural networks and, more specifically, to a ferroelectric field-effect transistor (FE-FET) based XNOR cell that can be used in neuromorphic computing.
Applications involving deep-learning neural networks (NNs) or neuromorphic computing, such as image recognition, natural language processing and, more generally, various pattern-matching or classification tasks, are rapidly becoming as important as general-purpose computing. The basic computational element of a NN, the neuron, multiplies a set of input signals by a set of weights and sums the products. A neuron therefore performs a vector-matrix product, or multiply-accumulate (MAC), operation. A NN typically includes a large number of interconnected neurons, each of which performs a MAC operation. NN operation is therefore computationally intensive.
The performance of a NN can be improved by improving the efficiency of the MAC operation. It is desirable to store the weights locally, to reduce the power and frequency of dynamic random access memory (DRAM) accesses. It is also desirable to perform the MAC operation digitally, to help reduce noise and process variability. Binarized neurons can meet these goals, and a binarized-weight XNOR network (XNORNet) has therefore been developed.
In a binarized XNOR cell, the weights w are mathematically 1 and -1 but are represented digitally as 1 and 0. Likewise, the signals x are mathematically 1 and -1 but are represented digitally as 1 and 0. The result of the multiplication p_i = w_i * x_i is positive only when x_i and w_i are both 1, or both mathematically -1 (both 0 in the Boolean representation). This is exactly the logical negation of the exclusive-OR operation, i.e. XNOR. The product of an individual weight and signal can therefore be expressed as p_i = XNOR(w_i, x_i). The complete MAC operation for a given neuron is expressed as sum = Σ p_i (i = 1 … n), or, in Boolean form, as sum = 2·Count(XNOR(w, x)) − n. The count operation counts the number of non-zero results of the XNOR expression, and n is the total number of inputs to the neuron. The result is then thresholded against a bias to obtain the high or low output of the neuron. The entire process is digital, so the information loss associated with analog processing is not incurred.
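The equivalence between the ±1 arithmetic and the Boolean sum = 2·Count − n form can be checked directly in software. The following Python sketch is illustrative only and is not part of the patent:

```python
def xnor(a, b):
    # Boolean XNOR on bits in {0, 1}: 1 when the bits agree.
    return 1 if a == b else 0

def mac_pm1(weights, signals):
    # Reference MAC in the mathematical {-1, +1} domain.
    return sum(w * x for w, x in zip(weights, signals))

def mac_xnor(weights, signals):
    # The same MAC computed digitally: map {-1, +1} -> {0, 1},
    # XNOR each weight/signal pair, then apply sum = 2*count - n.
    to_bit = lambda v: 1 if v == 1 else 0
    n = len(weights)
    count = sum(xnor(to_bit(w), to_bit(x))
                for w, x in zip(weights, signals))
    return 2 * count - n

w = [1, -1, -1, 1]
x = [1, 1, -1, -1]
assert mac_xnor(w, x) == mac_pm1(w, x)
```

Because the two routines agree on every input, a hardware popcount over XNOR outputs suffices to reproduce the analog dot product exactly.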
However, the use of a binarized weight representation can itself be a source of information loss. Binarized networks typically need substantially more neurons than analog (or multi-bit digital) networks to reach the same level of overall accuracy. A significant improvement can be achieved if the weights are ternarized rather than binarized. Ternarized weights take the mathematical values -1, 0 and 1. A weight of 0 produces the output -1 (logical 0) for any combination of inputs. The output of a ternarized XNOR gate (also known as a "gated XNOR") is therefore given by: output = XNOR(w, x) when w is non-zero, and logical 0 (mathematical -1) when w = 0.
When the XNOR operation is performed per the above equation, the non-zero weights and all signals are mapped from the {-1, 1} domain to the {0, 1} Boolean domain. The mapping is performed after branching on the mathematical value of the weight.
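The gated-XNOR behavior just described, with the branch on the weight's mathematical value taken before the Boolean mapping, can be sketched as follows (a hedged illustration of the truth table, not the circuit itself):

```python
def gated_xnor(w, x):
    # w in {-1, 0, 1}, x in {-1, 1}; returns the mathematical output.
    if w == 0:
        return -1  # a zero weight yields -1 (logical 0) for any input
    # Non-zero weight: map both operands from {-1, 1} to {0, 1}, then XNOR.
    wb = 1 if w == 1 else 0
    xb = 1 if x == 1 else 0
    return 1 if wb == xb else -1

# Exhaustive check of the gated-XNOR truth table.
assert gated_xnor(0, 1) == -1 and gated_xnor(0, -1) == -1
assert gated_xnor(1, 1) == 1 and gated_xnor(-1, -1) == 1
assert gated_xnor(1, -1) == -1 and gated_xnor(-1, 1) == -1
```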
When the same number of neurons is used, a ternarized network can provide improved accuracy relative to a binarized network. Alternatively, a ternarized network can achieve the same level of accuracy as a binarized network with a smaller number of neurons, yielding savings in area and power and gains in inference throughput and latency. Both binarized and ternarized digital XNOR networks are therefore useful in applications such as NNs. An improved XNOR logic cell is needed to enhance digital binarized and/or ternarized NN operation, or other logic operations.
According to some embodiments, a computing cell for performing an XNOR operation on an input signal and a weight includes: at least one pair of ferroelectric field-effect transistors (FE-FETs) coupled to a plurality of input lines and storing the weight, each pair of the at least one pair of FE-FETs including a first FE-FET that receives the input signal and stores a first weight and a second FE-FET that receives the complement of the input signal and stores a second weight; and a plurality of select transistors coupled to the pair of FE-FETs.
The exemplary embodiments relate to digital computing cells that perform XNOR operations and can be used in a variety of fields, including but not limited to machine learning, artificial intelligence, neuromorphic computing and neural networks. The method and system can be extended to other applications in which logic devices are used. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the exemplary embodiments and to the general principles and features described herein will be readily apparent. The exemplary embodiments are described mainly in terms of particular methods and systems provided in particular implementations; however, the methods and systems will operate effectively in other implementations.
Phrases such as "exemplary embodiment", "one embodiment" and "another embodiment" may refer to the same embodiment, to different embodiments, or to multiple embodiments. The embodiments are described with reference to systems and/or devices having certain components; however, the systems and/or devices may include more or fewer components than shown, and variations in the arrangement and type of the components may be made without departing from the scope of the invention. The exemplary embodiments are also described in the context of particular methods having certain steps; however, the method and system operate effectively for other methods having different and/or additional steps, and steps in different orders, that are not inconsistent with the exemplary embodiments. The present invention is therefore not intended to be limited to the embodiments shown but is to be accorded the widest scope consistent with the principles and features described herein.
Unless otherwise indicated herein or clearly contradicted by context, the terms "a", "an" and "the" and similar referents used in describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural. Unless otherwise noted, the terms "comprising", "including", "having" and "containing" are to be construed as open-ended terms (i.e., meaning "including, but not limited to").
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It should be noted that, unless otherwise specified, the use of any and all examples or exemplary terms provided herein is intended merely to better illuminate the invention and does not limit its scope. Furthermore, unless otherwise defined, terms defined in commonly used dictionaries should not be over-interpreted.
The present invention describes a computing cell and method for performing a digital XNOR of an input signal and a weight. The computing cell includes at least one pair of FE-FETs and a plurality of select transistors. The pair(s) of FE-FETs are coupled to a plurality of input lines and store the weights. Each pair of the at least one pair of FE-FETs includes a first FE-FET that receives the input signal and stores a first weight, and a second FE-FET that receives the complement of the input signal and stores a second weight. The select transistors are coupled to the pair of FE-FETs.
FIG. 1 is a block diagram illustrating an exemplary embodiment of a digital XNOR computing cell 100. For simplicity, only a portion of the XNOR cell 100 is shown. The computing cell 100 digitally performs an XNOR operation on an input signal and a weight, and can therefore be viewed as a neuromorphic computing cell. In addition, the computing cell 100 can perform either a binarized or a ternarized XNOR operation.
The computing cell 100 includes at least two ferroelectric field-effect transistors (FE-FETs) 110 and 120, a select transistor 130 and an optional reset transistor 140. Also shown are input lines 102 and 104, an output line 106 and a select line 108. The input lines 102 and 104 receive the input signal and its complement, respectively, for inference operations. The output line 106 provides the result of the XNOR operation. If, for example, the computing cell 100 is part of a neural network (NN), the select line 108 can be used to select the computing cell 100 for operation.
The computing cell 100 includes at least two FE-FETs 110 and 120. In other embodiments, more than two FE-FETs may be used, at the cost of cell density. In still other embodiments, each computing cell includes only two FE-FETs 110 and 120. Each of the FE-FETs 110 and 120 includes a transistor (e.g. a FET) and a ferroelectric layer (not explicitly shown in FIG. 1) that typically resides between two metal layers, forming a ferroelectric capacitor. In alternative embodiments, the ferroelectric layer may replace the gate oxide. The ferroelectric layer stores the weight through its polarization state. For example, the ferroelectric layer may include at least one of lead zirconate titanate (PbZrTi), hafnium zirconium oxide (HfZrO), barium titanate (BaTiO3), bismuth titanate (Bi12TiO20), germanium telluride (GeTe) and BaxEu1-xTiO3, where x is greater than 0 and not greater than 1. In other embodiments, another and/or an additional ferroelectric material may be used.
In operation, reset-evaluate logic may be used. The storage node 150 may therefore be reset at the start of an inference operation (i.e., an XNOR operation using previously programmed weights). To perform an inference operation, the input x and its complement x_bar are provided to the FE-FETs 110 and 120 via the input lines 102 and 104, respectively. The polarization of the ferroelectric layers within the FE-FETs 110 and 120 varies according to the weights programmed into the FE-FETs 110 and 120. During the inference operation, a selective pull-up is performed on the dynamic storage node 150. The dynamic storage-node voltage can then be output via the output line 106 to evaluate, or provide, the result of the XNOR operation. The FE-FETs 110 and 120 are thus connected such that the output line 106 provides the XNOR of the input signal x and the weight w stored by the FE-FETs 110 and 120. The select transistor 130 selects the XNOR cell 100 for operation. The optional reset transistor 140 can be used to explicitly reset the computing cell 100, for example for ternarized operation; in other embodiments, the reset may be performed in another manner. The computing cell 100 can therefore be used in a binarized or ternarized mode.
The computing cell 100 can implement the XNOR operation efficiently and can be realized in a relatively compact manner. Because the operation is digital, problems associated with analog XNOR operation can be reduced or eliminated. For example, the use of digital weights gives programming robustness to the FE-FETs 110 and 120, and digital operation also induces less noise on the output 106. The use of an analog-to-digital converter (ADC) can be avoided, which also saves power and area. The weights are stored locally in the FE-FETs 110 and 120, which serve as non-volatile memory, making inference operations more efficient and faster. As discussed below, the computing cell 100 can provide a binarized or ternarized XNOR. The computing cell 100 can therefore perform XNOR operations digitally, efficiently and reliably.
FIG. 2 is a block diagram illustrating an exemplary embodiment of a portion 180 of a digital neural network. The portion 180 can be viewed as a neuron. The neuron 180 performs a multiply-accumulate (MAC) operation. The neuron 180 illustrates a possible use of the computing cell 100 and is not intended to be limiting.
The neuron 180 includes a plurality of computing cells 100-1, 100-2, 100-3 and 100-4 (collectively, computing cells 100) and a bit count and sign block 190. In this embodiment, four inputs x1/x1_bar, x2/x2_bar, x3/x3_bar and x4/x4_bar are to be combined with four weights. Four computing cells 100 are therefore used to perform four XNOR operations. In alternative embodiments, another number of computing cells 100 may be used. Each of the computing cells 100-1, 100-2, 100-3 and 100-4 of FIG. 2 operates in a manner similar to the computing cell 100 of FIG. 1. The bit count and sign block 190 counts the number of non-zero results from the four XNOR cells 100 and subtracts 4 (the number of input signals of the neuron 180). The result is then thresholded against a bias to obtain the high or low output of the neuron 180.
Thus, the neuron 180, using the computing cells 100, can perform a MAC operation. Because the neuron 180 uses cells implemented in hardware, it operates efficiently. The MAC operation can be performed digitally, which avoids the problems associated with analog XNOR operation. As described with reference to FIG. 1, the XNOR operations performed by the computing cells 100 can also be compact and efficient and can run in a binarized or ternarized mode. The performance of the neuron 180 can therefore be improved.
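Under the assumption that block 190 realizes the sum = 2·Count − n relation given earlier in the description, the four-input neuron 180 can be modeled as below. The bias value and function names are placeholders for illustration, not taken from the patent:

```python
def neuron(weight_bits, signal_bits, bias):
    # weight_bits, signal_bits: Boolean {0, 1} encodings of the +/-1 values,
    # one per XNOR computing cell (four cells in the FIG. 2 embodiment).
    n = len(signal_bits)
    # XNOR count: number of weight/signal pairs that agree.
    count = sum(1 if w == x else 0 for w, x in zip(weight_bits, signal_bits))
    total = 2 * count - n              # bit count and sign block
    return 1 if total >= bias else 0   # threshold against the bias

# Four XNOR cells combining four inputs with four weights.
out = neuron([1, 0, 1, 0], [1, 1, 0, 0], bias=0)
```

Here two of the four pairs agree, so the signed sum is 2·2 − 4 = 0 and the neuron output depends only on the chosen bias.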
FIG. 3 is a schematic diagram of an exemplary embodiment of a computing cell 100A for performing a digital XNOR operation. The computing cell 100A is similar to the XNOR cell 100 and can be used in the neuron 180 or in other applications. Portions of the computing cell 100A analogous to components of the XNOR cell 100 are labeled similarly. The XNOR cell 100A thus includes input lines 102 and 104, FE-FETs 110A and 120A, select transistors 132 and 134, an output line 106 and a select line 108 that are analogous to the input lines 102 and 104, the FE-FETs 110 and 120, the select transistor 130, the output line 106 and the select line 108, respectively. Also shown are a dynamic output node 150 and program lines 152 and 154. In the embodiment shown, the select transistors 132 and 134 are n-FETs. The FE-FETs 110A and 120A are shown as including FETs 112 and 122, respectively, and ferroelectric capacitors 114 and 124, respectively, each having a ferroelectric layer. For example, FIG. 4 illustrates an exemplary embodiment of a portion of an FE-FET 110A/120A usable in a computing cell that performs a digital XNOR operation. The FE-FET 110A/120A includes a FET 112/122 and a capacitor 114/124. The ferroelectric layer 116/126 may include one or more of PbZrTi, HfZrO, BaTiO3, Bi12TiO20, GeTe and BaxEu1-xTiO3, where x is greater than 0 and not greater than 1. In some embodiments, the ferroelectric layer 116/126 is incorporated into the first metal (M1) layer. In other embodiments, the ferroelectric layer 116/126 may be incorporated into other layers.
The input lines 102 and 104 carry the input signal x and its complement x_bar, respectively, and are connected to the sources of the FE-FETs 110A and 120A, respectively. The gates of the FE-FETs 110A and 120A are connected, via the select transistors 132 and 134, to the program lines 152 and 154, respectively. The program lines 152 and 154 provide the program signal P and its complement P_bar, respectively. The drains of the FE-FETs are coupled together to form the dynamic output node 150. The sources of the select transistors 132 and 134 are connected to the FE-FETs 110A and 120A, respectively; the drains of the select transistors 132 and 134 are connected to the program lines 152 and 154, respectively; and the gates of the select transistors 132 and 134 are connected to the select line 108.
As discussed above, the weights stored in the FE-FETs 110A and 120A are determined by the polarization of the ferroelectric layers 114 and 124. The weights may be trained off-chip. For example, if the intended application of the computing cell 100A is inference only (off-chip training), erase and program operations are performed infrequently: the FE-FETs 110A and 120A may be programmed only when it is desired to change the weights. In some embodiments, for example to account for improvements from off-chip training, such programming may occur only a few times per year. In alternative embodiments, the FE-FETs 110A and 120A may be programmed more or less frequently.
The weights programmed into the FE-FETs 110A and 120A depend on whether the computing cell 100A is intended for use in binarized or ternarized mode. The states stored in the two FE-FETs 110A and 120A may be complements of each other for a non-zero weight, or may be equal for a zero weight (e.g., a high-Vt state set for both). The use of a zero weight arises in ternarized operation.
To program the weights, the computing cell 100A is first erased and then programmed. If the cells are arranged in an array, all computing cells 100 in the entire array may first be erased globally, and the individual non-zero bits may then be programmed. To erase the cell 100A (and all cells in the array), the signals P and P_bar on the program lines 152 and 154 are set low (e.g., to ground) and the inputs x and x_bar on the input lines 102 and 104 are set high. The output line 106 of the computing cell 100A is allowed to float. As a result, throughout the cell 100A and the array, a negative voltage exists across the ferroelectric capacitor 114 in each FE-FET 110A and the ferroelectric capacitor 124 in each FE-FET 120A. At the end of the erase, each FE-FET 110A and 120A has a small, zero or slightly negative voltage on the gate node of its underlying FET 112 or 122, which places all FE-FETs 110A and 120A in a low-conductivity state.
FIG. 5 is a timing diagram 200 illustrating programming of an exemplary embodiment of a computing cell that performs a digital XNOR operation. In the embodiment shown in FIG. 5, the program lines 152 and 154 are pulsed to a moderately high voltage, for example 2.5 to 3 volts (V). Referring to FIGS. 3 and 5, the solid line 202 represents the voltage applied to the program lines 152 and 154, and the dotted line 204 represents the input voltage x or x_bar on the input lines 102 and 104. The dashed line 206 is the voltage at the gates of the FETs 112 and 122 of the FE-FETs 110A and 120A during the erase. The voltages applied to the left of the vertical line 209 are those that erase the FE-FETs 110A and 120A; the erase is complete at line 209. Just to the right of line 209, therefore, the voltage at the gate nodes of the FE-FETs 110A and 120A is small: both FE-FETs 110A and 120A have been erased to a low gate voltage.
After the erase is complete, the FE-FETs 110A and 120A can be programmed. A programming event sets the individual bits that represent the mathematical weights stored by the FE-FETs 110A and 120A. If ternarized operation is desired, only the non-zero weights are programmed. Programming is accomplished by grounding the signal input lines 102 and 104 (x and x_bar low) and applying a high voltage to the program line 152 (P high) or the program line 154 (P_bar high). The select line 108 is turned on for each computing cell 100A being programmed. The high voltage on the program line 152 or 154 places a positive voltage across the ferroelectric capacitor 114 in each FE-FET 110A or the ferroelectric capacitor 124 in each FE-FET 120A, respectively, causing a change in polarization state. The gate node of each programmed FE-FET is now at a high positive voltage, setting the underlying FET to a conducting state. In FIG. 5, the gate voltage of the final programmed state of the FE-FET 110A (which may store the weight) is shown by the dashed line 207, and that of the FE-FET 120A (which may store the weight complement) is shown by the dashed line 208. One or both of the FE-FETs 110A and 120A may thus be programmed; in FIG. 5, the gate voltages 207 and 208 differ. In some embodiments, the final voltage difference between the on-state and off-state FETs is roughly 500 millivolts (mV), which may correspond to a typical 7-nanometer (nm) node FET. The voltage difference can be increased significantly by improving the ratio of ferroelectric to non-ferroelectric polarization of the ferroelectric capacitors 114 and 124, for example by using a thicker ferroelectric capacitor/ferroelectric layer 116/126 with stronger ferroelectric polarization. The weights can thus be programmed into the FE-FETs 110A and 120A using erase-then-program.
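The erase-then-program sequence and the subsequent inference can be captured in a small behavioral model. The class, state variables and method names below are illustrative abstractions of cell 100A, not an electrical simulation:

```python
class FeFetCellModel:
    """Behavioral sketch of an XNOR cell: two FE-FETs storing w and its complement."""

    def __init__(self):
        self.g1 = 0  # gate-node state of FE-FET 110A (0 = erased, low conductivity)
        self.g2 = 0  # gate-node state of FE-FET 120A

    def erase(self):
        # P and P_bar low, x and x_bar high: both ferroelectric capacitors
        # see a negative voltage, leaving both FE-FETs low-conductivity.
        self.g1, self.g2 = 0, 0

    def program(self, weight_bit):
        # Ground x and x_bar, then pulse P (or P_bar) high so that exactly
        # one FE-FET of the pair is set to the conducting state.
        self.erase()
        if weight_bit == 1:
            self.g1 = 1
        else:
            self.g2 = 1

    def infer(self, x_bit):
        # Select line low, gates float; the dynamic node follows whichever
        # FE-FET is driven by a high input. Result is XNOR(x, w).
        return 1 if (x_bit == 1) == (self.g1 == 1) else 0

cell = FeFetCellModel()
cell.program(1)
assert cell.infer(1) == 1 and cell.infer(0) == 0
```

Running the model for both stored weights reproduces the XNOR truth table, which is the check the timing diagrams of FIGS. 5 and 6 make at the circuit level.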
To perform an inference operation, the input x and its complement x_bar are provided on the input lines 102 and 104, respectively, and the select line 108 is driven low. The gates of the FE-FETs 110A and 120A are thus allowed to float, which permits the gate voltages to exceed the supply voltage during the inference operation and provides full output swing. It is desirable to minimize or reduce the ferroelectric voltage difference in order to suppress read disturbance. The output of the computing cell 100A (the XNOR of the input x and the weight) can be developed on the storage node 150, and the inference operation can thus be performed. The time used to perform the inference operation can also be kept small. For example, FIG. 6 is a diagram 210 illustrating the timing of an inference operation of an exemplary embodiment of a computing cell performing a digital XNOR operation. The dashed line 212 indicates the input x on the input line 102 transitioning from low to high. The dashed line 214 and the dotted line 216 indicate the voltages developed on the gates of the FE-FETs 110A and 120A (i.e., the gates of the transistors 112 and 122) during the inference operation. As can be seen in FIG. 6, settling to the final voltage can occur in less than 0.1 nanoseconds (ns).
The computing cell 100A can therefore have improved performance. The computing cell 100A can use a combination of only two FE-FETs 110A and 120A and two select n-type FETs (nFETs) 132 and 134, and can therefore be compact. Because the FE-FETs 110A and 120A can be programmed digitally, programming can be robust. The weights are stored locally through the polarization of the ferroelectric layers 116/126; because the weights need not be fetched from off-chip DRAM, time and power are saved. Because the computing cell 100A can perform the XNOR (inference) operation digitally, the output developed on the storage node 150 can exhibit reduced noise compared with an analog implementation. Furthermore, the inference operation is performed quickly and efficiently. The computing cell 100A can also be robust against read disturbance. The gate node of each FE-FET (the top node of the ferroelectric capacitors 114 and 124) floats during inference, so an inference event asserts very little voltage across the ferroelectric capacitor 114/124 itself. Moreover, this small voltage increment occurs on a time scale much shorter than the ferroelectric polarization response of standard ferroelectric materials. As discussed above with reference to FIG. 6, the inference time can reasonably be kept below 0.1 ns, which can be far shorter than the ferroelectric response time of standard ferroelectric materials. This is desirable because the ferroelectric polarization should not change during the inference operation. This inference time is roughly two orders of magnitude faster than the PbZrTi response and at least several orders of magnitude faster than the HfZrO response. The polarization of the ferroelectric layers 116/126 is therefore not expected to change, and repeated inference events can have essentially no effect on the gate-node voltages in the FE-FETs 110A and 120A, indicating that the polarization state of the ferroelectric layers 116/126 remains unchanged. Inference/read operations thus need not disturb the programmed state of the FE-FETs 110A and 120A.
The computing cell 100A can also be used in ternarized operation, which uses the full weight set {1, 0, -1}. For a zero weight, the computing cell 100A is simply not programmed after the erase described above; in other words, the erase-then-program operation is completed simply by erasing the computing cell 100A. There is, however, a possibility of charge accumulation at the storage node 150 due to repeated inference. This can occur because, when both FE-FETs 110A and 120A are off (which is the case for a zero weight), the natural discharge rate of the dynamic storage node 150 is low compared with the inference rate. To prevent such charge accumulation, an explicit reset is performed before each inference in ternarized operation. In the XNOR-network case, the initial grounded state of x and x_bar on the input lines 102 and 104 is sufficient to discharge the storage node 150 through the FE-FET 110A and/or 120A.
In one embodiment, the computing cell 100A can be used without any additional transistors or interconnects. In such an embodiment, the storage node 150 is discharged through the FE-FETs 110A and 120A. However, the conductivity of the FE-FETs 110A and 120A is raised by applying a high voltage to the program lines 152 and 154, respectively, while the select transistors 132 and 134 are on. The added gate voltage makes the normally-off FE-FETs 110A and 120A temporarily more conductive, enabling the storage node 150 to be discharged quickly. Although this approach works, a high-voltage pulse is applied to the program lines 152 and 154 on every inference, creating increased supply and voltage stress on the select transistors 132 and 134, which are otherwise stressed only very infrequently. Alternatively, a different embodiment of the computing cell may be used.
FIG. 7 illustrates another exemplary embodiment of a computing cell 100B for performing a digital XNOR operation with an explicit reset operation. The computing cell 100B is similar to the XNOR cell 100 and the computing cell 100A, and can therefore be used in the neuron 180 or in other applications. Portions of the computing cell 100B analogous to components of the cells 100/100A are labeled similarly. The computing cell 100B thus includes input lines 102 and 104, FE-FETs 110B and 120B, select transistors 132 and 134, an output line 106 and a select line 108 that are analogous to the input lines 102 and 104, the FE-FETs 110/110A and 120/120A, the select transistors 130/132 and 134, the output line 106 and the select line 108, respectively. Also shown are a dynamic output node 150 and program lines 152 and 154 analogous to those of FIG. 3. The select transistors 132 and 134 are n-FETs. The FE-FETs 110B and 120B include FETs 112 and 122, respectively, and ferroelectric capacitors 114 and 124, respectively, each having a ferroelectric layer. The FETs 112 and 122 and the ferroelectric capacitors 114 and 124 are similar to those of FIG. 3. The ferroelectric layer (not labeled in FIG. 7) may include one or more of PbZrTi, HfZrO, BaTiO3, Bi12TiO20, GeTe and BaxEu1-xTiO3, where x is greater than 0 and not greater than 1. The structure and function of the components 102, 104, 106, 108, 110B, 112, 114, 120B, 122, 124, 132, 134, 150, 152 and 154 are similar to those of the identically numbered components of FIGS. 2 through 4.
The computing cell 100B also includes a reset transistor 140 (which may be an n-FET) and a reset line 142. The gate of the reset transistor 140 is coupled to the reset line 142, and its source is coupled to ground. To erase the FE-FETs 110B and 120B, the reset line 142 is set low, the program lines 152 and 154 are pulsed low, and the input lines 102 and 104 are set high. For an inference/XNOR operation, the reset FET 140 is turned on by energizing the reset line 142 before the inputs x and x_bar are applied on the input lines 102 and 104, respectively. The reset transistor 140 thus discharges the storage node 150. The inputs x and x_bar can then be applied and the inference operation performed. The high voltages described above can therefore be avoided when the computing cell 100B is used in ternarized mode. The choice between applying high voltages to the program lines 152 and 154 with the smaller computing cell 100A, and using the larger computing cell 100B with the reset FET 140 but without high voltages, depends on the goals and technology constraints.
FIG. 8 is a flow chart illustrating an exemplary embodiment of a method 300 for performing an XNOR operation using an exemplary embodiment of a hardware cell. For simplicity, some steps may be omitted, performed in another order and/or combined. The method 300 is also described in the context of the XNOR cells 100/100A/100B; however, the method 300 may be used with another XNOR computing cell.
In step 302, the weights are programmed into the FE-FETs 110/110A/110B and 120/120A/120B. Step 302 may thus be performed as described above; for example, step 302 may include erasing the computing cell 100/100A/100B, followed by a programming step. Although shown as part of the flow 300, step 302 may be carried out long before the remaining steps of the method 300 and may be decoupled from them.
In step 304, the reset line 142 is driven high and then low, as needed, to enable the reset transistor 140. Step 304 is performed for the computing cell 100B; alternatively, the FE-FETs 110A and 120A may be reset by the applied voltages. In step 306, the signal and its complement are received. Step 306 may include receiving x_value and x_value_bar on the input lines 102 and 104, respectively. The inference operation is performed as described above. Then, in step 308, the result of the XNOR operation may be forwarded.
Thus, using the method 300, the XNOR cells 100, 100A, 100B and/or analogous devices may be used, and one or more of the advantages of the XNOR cells 100, 100A, 100B and/or analogous devices may be achieved. A method and system for performing digital XNOR operations using the compact FE-FET computing cell 100/100A/100B in binarized or ternarized mode have been described. The method and system have been described in accordance with the exemplary embodiments shown, and one of ordinary skill in the art will readily recognize that there could be variations to the embodiments, and any variations would be within the spirit and scope of the method and system. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.
100: digital XNOR computing cell / XNOR cell / computing cell 100-1, 100-2, 100-3, 100-4: computing cell 100A: computing cell / XNOR cell / cell / compact FE-FET computing cell 100B: computing cell / XNOR cell / compact FE-FET computing cell 102, 104: input line / signal input line / component 106: output line / output / component 108: select line / component 110, 110A, 120, 120A: FE-FET 110B, 120B: FE-FET / component 112, 122: FET / transistor / component 114, 124: ferroelectric capacitor / capacitor / ferroelectric layer / component 116, 126: dielectric layer / ferroelectric layer 130: select transistor 132, 134: select transistor / select nFET / component 140: reset transistor / reset FET 142: reset line 150: storage node / dynamic storage node / dynamic output node / component 152, 154: program line / component 180: portion of a digital neural network / neuron 190: bit count and sign block 200, 210: timing diagram 202: solid line 204, 216: dotted line 206, 207, 208, 212, 214: dashed line 209: vertical line 300: method / flow 302, 304, 306, 308: step P: program signal / signal P_bar: complement of program signal / signal x: input / input voltage x1, x1_bar, x2, x2_bar, x3, x3_bar, x4, x4_bar: input x_bar: complement of input / input voltage / input
FIG. 1 is a block diagram illustrating an exemplary embodiment of a digital XNOR computing cell. FIG. 2 is a block diagram illustrating an exemplary embodiment of a portion of a neural network that includes a plurality of XNOR computing cells and performs a multiply-accumulate operation. FIG. 3 illustrates an exemplary embodiment of a computing cell for performing a digital XNOR operation. FIG. 4 illustrates an exemplary embodiment of a portion of an FE-FET usable in a computing cell that performs a digital XNOR operation. FIG. 5 is a timing diagram illustrating programming of an exemplary embodiment of a computing cell that performs a digital XNOR operation. FIG. 6 is a diagram illustrating the timing of an inference operation of an exemplary embodiment of a computing cell that performs a digital XNOR operation. FIG. 7 illustrates another exemplary embodiment of a computing cell for performing a digital XNOR operation. FIG. 8 is a flow chart illustrating an exemplary embodiment of a method for performing an XNOR operation using an exemplary embodiment of a computing cell.
Claims (19)
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862640076P | 2018-03-08 | 2018-03-08 | |
US62/640,076 | 2018-03-08 | ||
US201862664102P | 2018-04-28 | 2018-04-28 | |
US62/664,102 | 2018-04-28 | ||
US16/137,227 US10461751B2 (en) | 2018-03-08 | 2018-09-20 | FE-FET-based XNOR cell usable in neuromorphic computing |
US16/137,227 | 2018-09-20 |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202036389A TW202036389A (en) | 2020-10-01 |
TWI793278B true TWI793278B (en) | 2023-02-21 |
Family
ID=67882975
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108107498A TWI793278B (en) | 2018-03-08 | 2019-03-06 | Computing cell for performing xnor operation, neural network and method for performing digital xnor operation |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110245749B (en) |
TW (1) | TWI793278B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11120864B2 (en) | 2019-12-09 | 2021-09-14 | International Business Machines Corporation | Capacitive processing unit |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4541067A (en) * | 1982-05-10 | 1985-09-10 | American Microsystems, Inc. | Combinational logic structure using PASS transistors |
TW201603281A (en) * | 2014-04-24 | 2016-01-16 | 美光科技公司 | Ferroelectric field effect transistors, pluralities of ferroelectric field effect transistors arrayed in row lines and column lines, and methods of forming a plurality of ferroelectric field effect transistors |
TW201703430A (en) * | 2015-04-01 | 2017-01-16 | Japan Science & Tech Agency | Electronic circuit |
CN106463513A (en) * | 2014-05-20 | 2017-02-22 | 美光科技公司 | Polar, chiral, and non-centro-symmetric ferroelectric materials, memory cells including such materials, and related devices and methods |
US20170256552A1 (en) * | 2016-03-01 | 2017-09-07 | Namlab Ggmbh | Application of Antiferroelectric Like Materials in Non-Volatile Memory Devices |
US20180039886A1 (en) * | 2016-08-05 | 2018-02-08 | Xilinx, Inc. | Binary neural networks on progammable integrated circuits |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4713792A (en) * | 1985-06-06 | 1987-12-15 | Altera Corporation | Programmable macrocell using eprom or eeprom transistors for architecture control in programmable logic circuits |
US6356112B1 (en) * | 2000-03-28 | 2002-03-12 | Translogic Technology, Inc. | Exclusive or/nor circuit |
KR100482996B1 (en) * | 2002-08-30 | 2005-04-15 | 주식회사 하이닉스반도체 | Nonvolatile Ferroelectric Memory Device |
-
2019
- 2019-02-28 CN CN201910148042.2A patent/CN110245749B/en active Active
- 2019-03-06 TW TW108107498A patent/TWI793278B/en active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4541067A (en) * | 1982-05-10 | 1985-09-10 | American Microsystems, Inc. | Combinational logic structure using PASS transistors |
TW201603281A (en) * | 2014-04-24 | 2016-01-16 | 美光科技公司 | Ferroelectric field effect transistors, pluralities of ferroelectric field effect transistors arrayed in row lines and column lines, and methods of forming a plurality of ferroelectric field effect transistors |
CN106463513A (en) * | 2014-05-20 | 2017-02-22 | 美光科技公司 | Polar, chiral, and non-centro-symmetric ferroelectric materials, memory cells including such materials, and related devices and methods |
TW201703430A (en) * | 2015-04-01 | 2017-01-16 | Japan Science & Tech Agency | Electronic circuit |
US20170256552A1 (en) * | 2016-03-01 | 2017-09-07 | Namlab Ggmbh | Application of Antiferroelectric Like Materials in Non-Volatile Memory Devices |
US20180039886A1 (en) * | 2016-08-05 | 2018-02-08 | Xilinx, Inc. | Binary neural networks on progammable integrated circuits |
Non-Patent Citations (1)
Title |
---|
Online publication: Borna Obradovic, "A Multi-Bit Neuromorphic Weight Cell using Ferroelectric FETs, suitable for SoC Integration", 2017/10/22, https://arxiv.org/abs/1710.08034.pdf *
Also Published As
Publication number | Publication date |
---|---|
CN110245749A (en) | 2019-09-17 |
CN110245749B (en) | 2024-06-14 |
TW202036389A (en) | 2020-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10461751B2 (en) | FE-FET-based XNOR cell usable in neuromorphic computing | |
US11842770B2 (en) | Circuit methodology for highly linear and symmetric resistive processing unit | |
Sun et al. | Exploiting hybrid precision for training and inference: A 2T-1FeFET based analog synaptic weight cell | |
US11657259B2 (en) | Kernel transformation techniques to reduce power consumption of binary input, binary weight in-memory convolutional neural network inference engine | |
US11290110B2 (en) | Method and system for providing a variation resistant magnetic junction-based XNOR cell usable in neuromorphic computing | |
JPWO2019049741A1 (en) | Neural network arithmetic circuit using non-volatile semiconductor memory device | |
CN113593623B (en) | Analog content addressable memory using three-terminal memory device | |
TWI699711B (en) | Memory devices and manufacturing method thereof | |
CN112447229B (en) | Nonvolatile memory device performing multiply-accumulate operation | |
KR20190133532A (en) | Transposable synaptic weight cell and array thereof | |
Wang et al. | Investigating ferroelectric minor loop dynamics and history effect—Part II: Physical modeling and impact on neural network training | |
US11011216B1 (en) | Compute-in-memory dynamic random access memory | |
CN114974337B (en) | Time domain memory internal computing circuit based on spin magnetic random access memory | |
TWI793278B (en) | Computing cell for performing xnor operation, neural network and method for performing digital xnor operation | |
WO2022134841A1 (en) | USING FERROELECTRIC FIELD-EFFECT TRANSISTORS (FeFETs) AS CAPACITIVE PROCESSING UNITS FOR IN-MEMORY COMPUTING | |
Thunder et al. | Ultra low power 3D-embedded convolutional neural network cube based on α-IGZO nanosheet and bi-layer resistive memory | |
Reis et al. | In-memory computing accelerators for emerging learning paradigms | |
Eslami et al. | A flexible and reliable RRAM-based in-memory computing architecture for data-intensive applications | |
KR20230025401A (en) | Charge-pump-based current-mode neurons for machine learning | |
Gupta et al. | On-chip unsupervised learning using STDP in a spiking neural network | |
Dubreuil et al. | A novel 3D 1T1R RRAM architecture for memory-centric Hyperdimensional Computing | |
CN116670763A (en) | In-memory computation bit cell with capacitively coupled write operation | |
Tseng et al. | An Analog In-Memory-Search Solution based on 3D-NAND Flash Memory for Brain-Inspired Computing | |
Zanotti et al. | Circuit reliability analysis of in-memory inference in binarized neural networks | |
Ma et al. | A binary-activation, multi-level weight RNN and training algorithm for ADC-/DAC-free and noise-resilient processing-in-memory inference with eNVM |