KR102448396B1

KR102448396B1 - Capacitance-based neural network with flexible weight bit-width

Info

Publication number: KR102448396B1
Application number: KR1020190113236A
Authority: KR
Inventors: 유인경; 황현상
Original assignee: 포항공과대학교 산학협력단
Priority date: 2019-09-16
Filing date: 2019-09-16
Publication date: 2022-09-27
Also published as: KR20210032036A

Abstract

A capacitance-based neural network to which a weight bit width can be flexibly applied, according to an embodiment of the present invention, includes a transistor connected to a word line, a bit line, and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor. The bit line is connected to a pulse generator, and the word line is arranged orthogonal to the bit line.

Description

Capacitance-based neural network with flexible weight bit width {CAPACITANCE-BASED NEURAL NETWORK WITH FLEXIBLE WEIGHT BIT-WIDTH}

The present invention relates to a neural network and, more particularly, to a capacitance-based neural network to which a weight bit width can be flexibly applied.

For a mobile neural processor, training is performed on a server or computer, and the training results are stored in the mobile neural processor to perform inference. The values stored as the weights of the neural processor are preferably multi-level, but the number of achievable levels is limited. Therefore, after training, the values are reduced to a small bit-width through processes such as pruning and data compression and are then stored as the weights of the mobile neural processor. These weights may be stored in a non-volatile memory or a volatile memory.

For servers, there is Google's Tensor Processing Unit (TPU), which stores weight values in DRAM, fetches them, and sends them to a matrix multiply unit (MMU). The output of each calculation is sent back to the input of the matrix multiply unit together with new weight values stored in the DRAM, and this cycle repeats until the final output is obtained.

When the weights are stored in a non-volatile memory, inference is fast, but every hidden layer must be fabricated, which increases circuit overhead. In a design such as Google's TPU, the weight information is stored outside the neural network and the same neural network is reused to compute the layers sequentially, so inference is slower but circuit overhead is reduced.

Capacitance-based matrix multiplication uses capacitance as the weight. The weight can be set either by grouping capacitors together or by changing the size of the capacitor.

The above description is provided only to aid understanding of the background of the technical ideas of the present invention, and therefore it should not be construed as prior art known to those skilled in the art to which the present invention pertains.

The present invention concerns the configuration and operating principle of a neural network, relating to a weight adjustment method for carrying out the results of training in artificial intelligence learning.

Because fabricating multi-level weight cells in hardware faces physical limits, hardware cannot match the weight bit-width used in software. For example, a resistive memory material with 16-bit-wide multi-level states, that is, 65,536 resistance levels, is currently difficult to realize. It is therefore necessary to devise a structure and operating method for a matrix multiplication unit that can perform matrix multiplication while accepting weight values as flexibly as software does.
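The 65,536 figure follows directly from the bit width: an n-bit weight needs 2^n distinguishable levels. The short Python sketch below only illustrates that arithmetic; the function name is a hypothetical helper and is not part of the patent.

```python
def levels_for_bit_width(bits: int) -> int:
    """Number of distinct levels a weight of the given bit width must resolve."""
    if bits < 0:
        raise ValueError("bit width must be non-negative")
    return 2 ** bits


# Level counts for a few illustrative bit widths.
for bits in (4, 8, 16):
    print(bits, levels_for_bit_width(bits))
```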

Changing the capacitor size means fabricating capacitors of several sizes and selecting the required one, whereas grouping capacitors means operating several capacitors at the same time. In either case the circuit becomes complicated and is affected by parasitic capacitance, in particular bit line capacitance. Fabricating as many hidden layers as needed is a further problem because it constrains chip size.

To achieve the above object, a capacitance-based neural network capable of flexibly applying a weight bit width according to an embodiment of the present invention includes: a transistor connected to a word line, a bit line, and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor, wherein the bit line is connected to a pulse generator and the word line is arranged orthogonal to the bit line.

The voltage applied to the word line and the bit line is a pulse voltage, and the weight may be adjusted according to the time for which the pulse voltage is applied to the bit line.

The voltage applied to the bit line may be at least twice the voltage constantly applied to the plate connected to the capacitor.

The neural network may further include a diode disposed between the bit line and the pulse generator.

The neural network may further include a selection transistor connected to the output of the bit line.

The neural network may further include a wiring that connects the p-well, which corresponds to the ground of the selection transistor, to a point between the diode and the pulse generator.

A capacitance-based neural network capable of flexibly applying a weight bit width according to an embodiment of the present invention includes: a transistor connected to a word line, a bit line, and a capacitor; and the capacitor connected to the transistor, wherein an input of the bit line is connected to a pulse generator and an output of the bit line is connected to an activation element.

The word line may be arranged orthogonal to the bit line.

The activation element may include a ferroelectric transistor, a bipolar resistive switch, a transistor, and an inverter.

The neural network may further include a selection transistor disposed between the bit line and the activation element.

A capacitance-based neural network capable of flexibly applying a weight bit width according to an embodiment of the present invention includes weight cells, each including: a transistor connected to a word line, a bit line, and a capacitor; and the capacitor connected to the transistor, wherein the bit line is connected to a pulse generator and the word line is arranged orthogonal to the bit line.

Matrix multiplication may be performed sequentially by applying a gate voltage to the word lines of the weight cells one after another.

Matrix multiplication may be performed using the output information of a matrix multiplication as the input information of the next hidden layer.

The neural network may further include a storage medium, external to the neural network, that stores iteration count information, weight information, input information, output information, and hidden layer information for the results of repeatedly performing the matrix multiplication.

The neural network may further include selection transistors connected to the outputs of the bit lines of the weight cells.

A gate voltage may be applied to the selection transistors, with the application time differing from one selection transistor to another.

Matrix multiplication may be performed sequentially by selecting two or more word lines of the weight cells and applying the gate voltage to them simultaneously.

The neural network may further include an activation element connected to the selection transistors.

The capacitance-based neural network capable of flexibly applying a weight bit width according to the embodiments of the present invention can flexibly adjust the weights and the number of hidden layers, reduce circuit overhead, and minimize the chip size of the matrix multiplication unit.

In addition, by storing the weight information, output information, and hidden layer information used by the neural network externally, it is possible to implement a semi-general-purpose processor that can be used for on-chip learning or for mobile-dedicated services.

FIG. 1 is a circuit diagram schematically illustrating a neural network according to an embodiment of the present invention.
FIG. 2 is a circuit diagram schematically illustrating a weight cell of a neural network according to an embodiment of the present invention.
FIG. 3 is a circuit diagram schematically illustrating the operating principle of a weight cell of a neural network according to an embodiment of the present invention.
FIG. 4 is a circuit diagram schematically illustrating the configuration of a neural network according to an embodiment of the present invention.
FIG. 5 is a circuit diagram illustrating the operation of a neural network according to an embodiment of the present invention.

The content described in the background section above is provided only to aid understanding of the background of the technical idea of the present invention, and therefore it should not be construed as prior art known to those skilled in the art to which the present invention pertains.

In the following description, for purposes of explanation, numerous specific details are set forth to aid understanding of the various embodiments. It is evident, however, that the various embodiments may be practiced without these specific details or in one or more equivalent ways. In other instances, well-known structures and devices are shown in block diagram form to avoid unnecessarily obscuring the various embodiments.

In the drawings, the sizes or relative sizes of layers, films, panels, regions, and the like may be exaggerated for clarity. Like reference numerals denote like elements.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another element in between. However, if a part is described as being "directly connected" to another part, this means that there is no other element between them. "At least one of X, Y, and Z" and "at least one selected from the group consisting of X, Y, and Z" are to be understood as X alone, Y alone, Z alone, or any combination of two or more of X, Y, and Z (for example, XYZ, XYY, YZ, ZZ). Here, "and/or" includes any and all combinations of one or more of the listed items.

Although terms such as first and second may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections are not limited by those terms. The terms are used only to distinguish one element, component, region, layer, and/or section from another. Accordingly, a first element, component, region, layer, and/or section in one embodiment may be called a second element, component, region, layer, and/or section in another embodiment.

Spatially relative terms such as "below" and "above" may be used for descriptive purposes to describe the relationship of one element or feature to another element or feature as shown in the drawings. They indicate only the relationship of one component to another in the drawings and do not denote an absolute position. For example, if the device shown in the drawings is turned over, elements described as being "below" other elements or features will be oriented "above" those elements or features. Thus, in an embodiment, the term "below" may encompass both the above and below orientations. The device may also be oriented otherwise (for example, rotated 90 degrees or in other orientations), and the spatially relative terms used herein are to be interpreted accordingly.

The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Unless otherwise defined, the terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the present invention belongs.

FIG. 1 is a circuit diagram schematically illustrating a neural network according to an embodiment of the present invention.

Referring to FIG. 1, a neural network according to an embodiment of the present invention includes input neurons 10, output neurons 20, and weight cells 30. The weight cells 30, which act as synapse elements, may be disposed at the intersections of row lines R extending horizontally from the input neurons 10 and column lines C extending vertically from the output neurons 20. For convenience of explanation, FIG. 1 shows four input neurons 10 and four output neurons 20 by way of example, but the present invention is not limited thereto.

The input neuron 10 may transmit electrical pulses to the weight cell 30 through the row line R in a learning mode, a reset mode, or a correction or reading mode.

The output neuron 20 may receive electrical pulses from the weight cell 30 through the column line C in the learning mode, the reset mode, or the correction or reading mode.

To apply the weight bit width flexibly according to an embodiment of the present invention, the weight is newly defined. Until now, capacitance-based weights have applied Q=CV (charge = capacitance × voltage) and made C multi-level. A linear multi-level capacitance can be expressed with a multiple n as C=nC_O. Here the base (ground) capacitance C_O is the resolution of the weight, and C_OV is the resolution of the charge. Because n can be reinterpreted from a multiple to a count, a scheme that applies C_O n times can be introduced. The weight can therefore be defined as n.
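The equivalence between a multi-level capacitance C=nC_O and applying the unit capacitance C_O n times can be checked numerically. The following Python sketch is only an illustration under ideal assumptions (no parasitic capacitance, full charge transfer on every cycle); the function and parameter names are hypothetical, with `c_unit` standing for C_O.

```python
import math


def charge_multilevel(n: int, c_unit: float, v: float) -> float:
    """Charge stored on one multi-level capacitor C = n * C_O at voltage v (Q = C V)."""
    return (n * c_unit) * v


def charge_repeated(n: int, c_unit: float, v: float) -> float:
    """Charge delivered by charging and discharging the unit capacitor C_O n times."""
    total = 0.0
    for _ in range(n):
        total += c_unit * v  # each ideal cycle transfers C_O * V
    return total


# Assumed example values: a 1 fF unit capacitor, 1 V pulses, weight n = 5.
q_a = charge_multilevel(5, 1e-15, 1.0)
q_b = charge_repeated(5, 1e-15, 1.0)
print(math.isclose(q_a, q_b))
```

Under these assumptions the count n plays the role of the weight, and C_O times V is the smallest charge step that can be resolved.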

FIG. 2 is a circuit diagram schematically illustrating a weight cell of a neural network according to an embodiment of the present invention. FIG. 3 is a circuit diagram schematically illustrating the operating principle of a weight cell of a neural network according to an embodiment of the present invention.

Referring to FIGS. 2 and 3, a weight cell of a neural network according to an embodiment of the present invention includes a transistor connected to a word line (WL), a bit line (BL), and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor, and the word line is arranged orthogonal to the bit line.

According to an embodiment of the present invention, a weight cell composed of a transistor and a capacitor is used to charge and discharge the capacitor repeatedly, and the amount of charge discharged over a fixed time is taken as the output value. Because the fixed charge/discharge time determines the number of charge/discharge cycles, the weight can be defined either as the charge/discharge time or as the number of charge/discharge cycles n. In an embodiment, the weight may correspond to the gate pulse width of the gate voltage pulse applied to the selection transistor (SL).
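The relation between the charge/discharge time and the cycle count n can be sketched as follows. This Python example assumes that every charge/discharge cycle takes one fixed pulse period; the period, the nanosecond units, and the function names are hypothetical and are not specified in the source. Integer nanoseconds are used to avoid floating-point rounding in the division.

```python
def cycles_in_window(window_ns: int, period_ns: int) -> int:
    """Number of complete charge/discharge cycles that fit in a gate window."""
    if period_ns <= 0:
        raise ValueError("pulse period must be positive")
    return window_ns // period_ns


def window_for_weight(n: int, period_ns: int) -> int:
    """Gate pulse width (ns) needed to realise a weight of n cycles."""
    return n * period_ns


# Assumed 10 ns pulse period: a 16-bit weight needs up to 65,535 cycles.
period = 10
max_weight = 2 ** 16 - 1
print(window_for_weight(max_weight, period))
print(cycles_in_window(window_for_weight(max_weight, period), period))
```

In this picture a wider bit width costs a longer gate window rather than more capacitor levels, which is the sense in which the bit width is flexible.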

According to an embodiment of the present invention, when the pulse generator applies voltage pulses to the bit line at the input end, the same voltage pulses generated by the pulse generator may be applied to the selection transistor (SLj) in the direction opposite to that in which the gate voltage (V_G) is applied.

According to an embodiment of the present invention, when the selection transistor is an NMOS transistor, connecting the p-well of the selection transistor to the pulse generator causes the pulse voltage from the generator to turn the selection transistor off, which prevents the pulse voltage from being input directly to the activation element. According to another embodiment, the pulse voltage input to the bit line may be greater than the gate voltage input to the selection transistor.

According to an embodiment of the present invention, the voltage applied to the bit line is a pulse voltage, and the weight may be adjusted according to the time for which the pulse voltage is applied to the bit line. According to another embodiment, the voltage applied to the word line may be a constant voltage, and the voltage applied to the bit line may be a pulse voltage corresponding to the input signal. In an embodiment, the voltage applied to the bit line may be at least twice the voltage constantly applied to the plate connected to the capacitor.

FIG. 4 is a circuit diagram schematically illustrating the configuration of a neural network according to an embodiment of the present invention. FIG. 5 is a circuit diagram illustrating the operation of a neural network according to an embodiment of the present invention.

Referring to FIGS. 4 and 5, a weight cell of a neural network according to an embodiment of the present invention includes a transistor connected to a word line (WL), a bit line (BL), and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor. The input of the bit line is connected to a pulse generator, and the output of the bit line is connected to an activation element. In an embodiment, the word line may be arranged orthogonal to the bit line. In an embodiment, the weight cell may further include a diode disposed between the bit line and the pulse generator. In an embodiment, the weight cell may further include a selection transistor disposed between the activation element and the bit line. In an embodiment, the p-well serving as the ground of the selection transistor may be connected by a wiring to a point between the diode and the pulse generator.

Referring to FIGS. 4 and 5, the neural network according to an embodiment of the present invention arranges weight cells in an array, builds this array as a network layer, and operates the rows of the weight cell array sequentially while using the output values as the input information of the next hidden layer, thereby applying a recurrent or iterative scheme.

According to an embodiment of the present invention, a pulse train of input voltage V_Pj corresponding to the input signal is applied to the bit line of the transistor, and a voltage Vg is applied to the word lines sequentially, row by row, to perform matrix multiplication. In an embodiment, two, three, or more rows may be operated simultaneously so that the amount of discharged charge increases linearly with the number of rows selected. In an embodiment, a diode may be disposed between the pulse generator and the input of the bit line. In an embodiment, a selection transistor may be disposed between the activation element and the bit line.
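One way to picture the row-by-row operation is as a matrix-vector product computed one word line at a time. The Python sketch below is an idealized behavioral model and not the patented circuit: it assumes that activating word line i lets each bit line j contribute a charge n_ij × C_O × V_Pj, and that the contributions of all bit lines are summed for that row. The summation model, the function name, and the parameter names are assumptions introduced for illustration.

```python
from typing import List, Sequence


def sequential_matvec(
    weights: Sequence[Sequence[int]],
    v_in: Sequence[float],
    c_unit: float = 1.0,
) -> List[float]:
    """Accumulate charge row by row, one word line at a time.

    weights[i][j] is the cycle count n for the cell at word line i, bit line j.
    v_in[j] is the pulse amplitude on bit line j.
    """
    outputs = []
    for row in weights:  # activate one word line per step
        if len(row) != len(v_in):
            raise ValueError("row length must match the number of bit lines")
        charge = sum(n * c_unit * v for n, v in zip(row, v_in))
        outputs.append(charge)
    return outputs


# Two word lines, two bit lines, unit capacitance normalised to 1.
print(sequential_matvec([[1, 2], [3, 4]], [1.0, 0.5]))
```

Selecting several rows at once would, in this simplified model, add their row sums together, which matches the statement that the discharged charge grows linearly with the number of rows selected.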

The neural network according to an embodiment of the present invention accumulates the discharged charges and activates when the accumulated value exceeds a certain level. In an embodiment, the activation element may include a ferroelectric transistor, a bipolar resistive switch, a transistor, and an inverter; it integrates the output currents generated in the neural network and, at or above a threshold, passes the activated voltage signal to the next hidden layer or to the final output layer. That is, the neural network performs matrix multiplication using the output information of one matrix multiplication as the input information of the next hidden layer. In an embodiment, a storage medium external to the neural network may be used to store iteration count information, weight information, input information, output information, and hidden layer information for the results of repeatedly performing the matrix multiplication. To operate the ferroelectric transistor, the constant voltage applied to the plate may be greater than the coercive voltage that polarizes the ferroelectric.
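The integrate-then-activate step and the reuse of one array for successive hidden layers can be sketched in software. This Python example is a behavioral illustration only: the step-function activation, the fixed output voltage, and the list of per-layer weight matrices (standing in for the external storage medium) are assumptions, and all names are hypothetical.

```python
from typing import List, Sequence


def activate(charges: Sequence[float], threshold: float, v_out: float = 1.0) -> List[float]:
    """Emit v_out for every accumulated charge at or above the threshold, else 0."""
    return [v_out if q >= threshold else 0.0 for q in charges]


def run_layers(
    layer_weights: Sequence[Sequence[Sequence[int]]],
    v_in: Sequence[float],
    threshold: float,
    c_unit: float = 1.0,
) -> List[float]:
    """Reuse one array for every layer, loading each layer's weights in turn."""
    signal = list(v_in)
    for weights in layer_weights:  # weights fetched from external storage per iteration
        charges = [
            sum(n * c_unit * v for n, v in zip(row, signal))  # integrate discharged charge
            for row in weights
        ]
        signal = activate(charges, threshold)  # output becomes the next layer's input
    return signal


# Two layers: a 2x2 hidden layer followed by a 1x2 output layer.
layers = [[[1, 2], [3, 0]], [[1, 1]]]
print(run_layers(layers, [1.0, 1.0], threshold=2.0))
```

Because the loop reloads weights on every pass, the number of hidden layers is set by the length of the stored list rather than by the amount of fabricated hardware, which reflects the reduced circuit overhead described above.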

In an embodiment, a diode may be disposed between the pulse generator and the bit line so that the discharging charges do not flow back into the pulse generator during the input pulse delay, that is, while the input voltage is 0 V.

In an embodiment, so that the voltage pulses applied from the pulse generator to the bit line are not transmitted directly to the activation element, the same voltage pulses that are input from the pulse generator to the bit line may be applied to each selection transistor, but in the direction opposite to the voltage applied to the gate of the selection transistor. That is, a reverse voltage pulse may be applied.

In an embodiment, the voltage of the reverse pulse applied to the selection transistor may be greater than the voltage applied to the gate of the selection transistor.

In an embodiment, to apply a voltage opposing the gate voltage of the selection transistor automatically, the p-well corresponding to the ground of the selection transistor may be connected by a wiring to a point between the diode and the pulse generator.

According to the embodiments of the present invention described above, a weight cell to which the weight bit width can be flexibly applied can flexibly adjust the weights and the number of hidden layers, reduce circuit overhead, and minimize the chip size of the matrix multiplication unit. In addition, by storing the weight information, output information, and hidden layer information used by the neural network externally, it is possible to implement a semi-general-purpose processor that can be used for on-chip learning or for mobile-dedicated services.

Although the present invention has been described above with reference to specific matters such as particular components, limited embodiments, and drawings, these are provided only to aid a more general understanding of the invention. The invention is not limited to the above embodiments, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Accordingly, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set forth below but also everything equivalent to or equivalently modified from those claims falls within the scope of the spirit of the invention.

10: input neuron 20: output neuron
30: weight cell

Claims

1. A capacitance-based neural network with flexible weight bit width, comprising weight cells, each weight cell including:
a transistor connected to a word line, a bit line, and a capacitor;
the capacitor connected to the transistor; and
a plate connected to the capacitor,
wherein the bit line is connected to a pulse oscillator and the word line is disposed orthogonal to the bit line, and
wherein matrix multiplication is performed sequentially by selecting two or more word lines of the weight cells and simultaneously applying a gate voltage to them.

2. The capacitance-based neural network of claim 1,
wherein the voltage applied to the word line and the bit line is a pulse voltage, and the weight is adjusted according to the time for which the pulse voltage is applied to the bit line.

3. The capacitance-based neural network of claim 1,
wherein the voltage applied to the bit line is at least twice the voltage constantly applied to the plate connected to the capacitor.

4. The capacitance-based neural network of claim 1,
further comprising a diode disposed between the bit line and the pulse oscillator.

5. The capacitance-based neural network of claim 4,
further comprising a selection transistor connected to an output of the bit line.

6. The capacitance-based neural network of claim 5,
further comprising a wiring that connects a p-well corresponding to the ground of the selection transistor to a point between the diode and the pulse oscillator.

7. A capacitance-based neural network with flexible weight bit width, comprising:
a transistor connected to a word line, a bit line, and a capacitor; and
the capacitor connected to the transistor,
wherein an input of the bit line is connected to a pulse oscillator and an output of the bit line is connected to an activation element, and
wherein the activation element includes ferroelectric transistors, bipolar resistive switches, transistors, and inverters.

8. The capacitance-based neural network of claim 7,
wherein the word line is disposed orthogonal to the bit line.

9. (Deleted)

10. The capacitance-based neural network of claim 7,
further comprising a diode disposed between the bit line and the pulse oscillator.

11. The capacitance-based neural network of claim 7,
further comprising a selection transistor disposed between the bit line and the activation element.

12. A capacitance-based neural network with flexible weight bit width, comprising weight cells, each weight cell including:
a transistor connected to a word line, a bit line, and a capacitor; and
the capacitor connected to the transistor,
wherein the bit line is connected to a pulse oscillator and the word line is disposed orthogonal to the bit line, and
wherein matrix multiplication is performed sequentially by selecting two or more word lines of the weight cells and simultaneously applying a gate voltage to them.

13. The capacitance-based neural network of claim 12,
wherein matrix multiplication is performed sequentially by sequentially applying a gate voltage to each word line of the weight cells.

14. The capacitance-based neural network of claim 13,
wherein matrix multiplication is performed by using output information of the matrix multiplication as input information of a next hidden layer.

15. The capacitance-based neural network of claim 14,
further comprising a storage medium, external to the neural network, that stores iteration count information, weight information, input information, output information, and hidden layer information for the results of repeatedly performing the matrix multiplication.

16. The capacitance-based neural network of claim 12,
further comprising a diode disposed between the bit line and the pulse oscillator.

17. The capacitance-based neural network of claim 12,
further comprising selection transistors connected to outputs of the bit lines of the weight cells.

18. The capacitance-based neural network of claim 17,
wherein a gate voltage is applied to the selection transistors with an application time that differs for each of the selection transistors.

19. (Deleted)

20. The capacitance-based neural network of claim 17,
further comprising an activation element connected to the selection transistors.