KR102567160B1

KR102567160B1 - Neural network circuit with non-volatile synaptic array

Info

Publication number: KR102567160B1
Application number: KR1020207024195A
Authority: KR
Inventors: 송승환; 허지혜; 이상수
Original assignee: 아나플래시 인코포레이티드
Priority date: 2018-01-23
Filing date: 2019-01-22
Publication date: 2023-08-16
Also published as: CN111656371B; EP3743857A4; KR20200110701A; TW201937413A; WO2019147522A3; WO2019147522A2; CN111656371A; EP3743857A2; TWI751403B

Abstract

The present invention discloses a synapse circuit of a non-volatile neural network. The synapse includes an input signal line, a reference signal line, an output line, and a cell for generating an output signal. The cell includes an upper select transistor having a gate electrically connected to the input signal line, and a resistive change element having one end connected in series to the upper select transistor and the other end electrically connected to the reference signal line. The value of the resistive change element is programmable to change the magnitude of the output signal. The drain of the upper select transistor is electrically connected to the first output line.

Description

Neural network circuit with non-volatile synaptic array

The present invention relates to a neural network circuit, and more particularly to a neural network circuit having non-volatile synapse arrays that use analog values.

An artificial neural network (ANN) is a neural network that mimics the computational model of the human brain. A neural network can be described as many neurons connected to one another through synapses. The strength of each connection, that is, the weight parameter of each synapse, is a trainable parameter that can be adjusted through a learning process. Recently, artificial intelligence (AI) using ANNs has been applied to various fields such as visual and voice detection/recognition, language translation, games, medical decision-making, financial or weather forecasting, drones, and autonomous vehicles.

Traditionally, neural network computation requires high-performance cloud servers with multiple central processing units (CPUs) and/or graphics processing units (GPUs), because the computational complexity, combined with the limited power and computing resources of mobile devices, prevents AI programs from running locally. Other existing approaches based on an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA), which accelerate neural network computation with dedicated CMOS (Complementary Metal-Oxide-Semiconductor) logic, can be more power-efficient than general CPU- and GPU-based approaches. However, they still waste power and incur latency in transferring data to and from a separate off-chip non-volatile memory (NVM) in which the trained weight parameters are stored. Thus, there is a need for neural network circuits that consume significantly fewer computing resources.

According to another embodiment of the present invention, a synapse circuit of a non-volatile neural network includes an input signal line, a reference signal line, an output line, and a cell for generating an output signal. The cell includes an upper select transistor having a gate electrically connected to the input signal line, and a resistive change element having one end connected in series to the upper select transistor and the other end electrically connected to the reference signal line. The value of the resistive change element is programmable to change the magnitude of the output signal. The drain of the upper select transistor is electrically connected to the output line.

According to another embodiment of the present invention, a synapse circuit includes first and second input signal lines, a reference signal line, first and second output signal lines, first and second cells, and a cross-coupled latch circuit. The cross-coupled latch circuit includes first and second inverters and first and second signal nodes. The input terminal of the first inverter is connected to the output terminal of the second inverter at the first signal node, and the input terminal of the second inverter is connected to the output terminal of the first inverter at the second signal node. Each cell includes a first upper select transistor whose gate is electrically connected to the first input signal line, and a second upper select transistor whose gate is connected to the second input signal line. The source terminals of the first and second upper select transistors are connected to a common node. In the first cell, the drain terminals of the first and second upper select transistors are connected to the first and second output signal lines, respectively. In the second cell, the drain connections are reversed: the first upper select transistor is connected to the second output line and the second upper select transistor is connected to the first output line. The common node of the first cell is connected to the first signal node of the cross-coupled latch circuit, and the common node of the second cell is connected to the second signal node of the cross-coupled latch circuit. The reference signal line is coupled to the first and second inverters of the cross-coupled latch circuit.

The privacy-enhanced neural network of the embodiments may be used in creative personal devices. For example, according to an embodiment of the present invention, new personal tasks, questions, or answers may be generated interactively in a portable educational device or smart toy that uses an on-chip non-volatile neural network. An embodiment of the present invention may be useful for identifying individuals through image or sound recognition while restricting off-chip access. In particular, home care or child care devices need to recognize the voices of only a limited number of people, so they may not require a very complex neural network model. However, such devices may require a high degree of personalization and may have stringent privacy requirements. In addition, because the core neural network layers for this type of application can be executed without off-chip communication of critical information, an on-chip non-volatile neural network according to an embodiment of the present invention can improve the security of military devices or network firewalls.

In another aspect of the present invention, the proposed on-chip non-volatile neural network system can be used in secure personalized vision/motion/voice recognition devices by storing personalization information and computing on it on-chip. For example, because all neural network computation is performed on-chip, the device can recognize a specific person's gestures or voice without sending personally trained neural network parameters off-chip. Such vision/motion/voice recognition neural network devices can replace bulky user interface devices (e.g., the keyboard or mouse of a PC, or the remote controller of a television). For example, a keyboard touch display could be replaced with a neural network engine capable of recognizing the device owner's hand gesture for each text character. By storing personalized information in an on-chip non-volatile neural network, only specific people can interact with the device.

In addition, the proposed on-chip non-volatile neural network can be used to improve the performance and reliability of other SoC building blocks such as the CPU, memory, and sensors. For example, because of transistor aging effects and varying operating conditions such as temperature, the operating voltage and frequency need to be adaptively controlled over the lifetime of the SoC. Manual tuning of these parameters is difficult, and it is a task that a neural network can optimize. However, off-chip neural network accelerators may not meet the performance requirements and may require excessive additional power. A non-volatile neural network can be used to optimize these parameters of other components on its own chip for given performance and power requirements.

Reference will be made to embodiments of the present invention, examples of which may be shown in the accompanying drawings. These drawings are intended to be illustrative rather than restrictive. Although the present invention is generally described in the context of these embodiments, it should be understood that the scope of the present invention is not intended to be limited to these specific embodiments.
FIG. 1 shows a schematic diagram of a neural network according to an embodiment of the present invention.
FIG. 2 shows a schematic diagram of a synapse array according to an embodiment of the present invention.
FIG. 3 shows a schematic diagram of a synapse according to an embodiment of the present invention.
FIG. 4 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 5 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 6 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 7 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 8 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIGS. 9A to 9B compare a conventional method for programming a threshold voltage (VTH) with a method according to an embodiment of the present invention.
FIGS. 10A to 10B show another method for programming the threshold voltage (VTH) of a floating gate node according to an embodiment of the present invention.
FIG. 11 shows a flow diagram of an exemplary process for programming the threshold voltage (VTH) of a floating gate node according to an embodiment of the present invention.
FIGS. 12A to 12C illustrate differential signaling according to an embodiment of the present invention.
FIG. 13 shows a schematic diagram of a chip including a neural network according to an embodiment of the present invention.
FIG. 14 shows a schematic diagram of a neural network including a non-volatile synapse array according to an embodiment of the present invention.
FIG. 15 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 16 tabulates the signals on the input and output lines shown in FIG. 15 for performing binary multiplication according to an embodiment of the present invention.
FIG. 17 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 18 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 19 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 20 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 21 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 22 shows a schematic diagram of another synapse according to an embodiment of the present invention.
FIG. 23 shows a schematic diagram of a neural network system according to the prior art.
FIG. 24 shows a schematic diagram of a layered neural network computing system composed of an SoC including an on-chip non-volatile neural network and an external neural network accelerator device according to an embodiment of the present invention.
FIG. 25 shows a schematic diagram of a distributed neural network system composed of multiple SoCs according to an embodiment of the present invention.
FIG. 26 shows a schematic diagram of a logic-friendly NVM integrated neural network system according to an embodiment of the present invention.
FIG. 27 shows a schematic diagram of another logic-friendly NVM integrated neural network system according to an embodiment of the present invention.

In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced without these details. Those skilled in the art will recognize that the embodiments of the invention described below can be carried out in a variety of ways and using a variety of means. Those skilled in the art will also recognize that additional modifications, applications, and embodiments are within its scope, as are additional fields in which the present invention may provide utility. Accordingly, the embodiments described below are intended to illustrate specific embodiments of the invention and to avoid obscuring the invention.

Reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, characteristic, or function described in connection with the embodiment is included in at least one embodiment of the present invention. The appearances of the phrases "in one embodiment," "in an embodiment," and the like in various places in this specification do not necessarily all refer to the same embodiment.

FIG. 1 shows a schematic diagram of a neural network 100 according to embodiments of the present invention (like reference numerals denote like elements throughout the specification). As shown, the neural network 100 may include five neuron array layers (or simply neuron layers) 110, 130, 150, 170, and 190, and synapse array layers (or simply synapse layers) 120, 140, 160, and 180. Each of the neuron layers (e.g., 110) may include any suitable number of neurons. Only five neuron layers and four synapse layers are shown in FIG. 1. However, it will be apparent to those skilled in the art that the neural network 100 may include any other suitable number of neuron layers and that a synapse layer may be disposed between two adjacent neuron layers.

Each neuron (e.g., 112a) in a neuron layer (e.g., 110) may be connected to one or more neurons (e.g., 132a to 132m) of the next neuron array layer (e.g., 130) through m synapses of the synapse layer (e.g., 120). For example, if each of the n neurons of the neuron layer 110 is electrically connected to all m neurons of the neuron layer 130, the synapse layer 120 may include n × m synapses. In embodiments, each synapse may have a trainable weight parameter (w) representing the strength of the connection between two neurons.

In embodiments, the relationship between the input neuron signal (Ain) and the output neuron signal (Aout) may be expressed by an activation function of the following equation.

[Equation 1]

Aout = f(W × Ain + Bias)

Here, Ain and Aout are matrices representing the input signals to the synapse layer and the output signals from the synapse layer, respectively, W is a matrix representing the weights of the synapse layer, and Bias is a matrix representing the bias signals for Aout. In embodiments, W and Bias may be trainable parameters and may be stored in logic-friendly non-volatile memory (NVM). For example, a training/machine learning process may be used with known data to determine W and Bias. In embodiments, the function f may be a non-linear function such as sigmoid, tanh, ReLU, or leaky ReLU. In embodiments, Aout may be activated when (W × Ain + Bias) is greater than a certain threshold.

As an example, the relationship in Equation 1 can be illustrated with a neuron layer 110 having two neurons, a synapse layer 120, and a neuron layer 130 having three neurons. In this example, Ain, representing the output signals from the neuron array layer 110, can be expressed as a matrix of 2 rows × 1 column; Aout, representing the output signals from the synapse layer 120, can be expressed as a matrix of 3 rows × 1 column; W, representing the weights of the synapse layer 120, can be expressed as a matrix of 3 rows × 2 columns with six weights; and Bias, representing the bias values added to the neuron layer 130, can be expressed as a matrix of 3 rows × 1 column. The non-linear function f applied to each element of (W × Ain + Bias) in Equation 1 determines the final value of each element of Aout. As another example, the neuron array layer 110 may receive input signals from sensors, and the neuron array layer 190 may represent response signals.
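The 2-neuron to 3-neuron example of Equation 1 can be sketched in software as follows. This is only a numerical illustration under stated assumptions, not the circuit described in this specification: the function names `relu` and `layer_forward` and all numeric values are hypothetical, and ReLU is chosen as one of the non-linear functions the text lists.

```python
def relu(x):
    """One possible non-linear function f (ReLU), chosen for illustration."""
    return max(0.0, x)


def layer_forward(W, a_in, bias, f=relu):
    """Compute Aout = f(W x Ain + Bias) for a single synapse layer.

    W    : list of rows, shape (outputs x inputs), here 3 x 2
    a_in : input vector, here 2 x 1
    bias : bias vector, here 3 x 1
    """
    a_out = []
    for row, b in zip(W, bias):
        pre_activation = sum(w * a for w, a in zip(row, a_in)) + b
        a_out.append(f(pre_activation))
    return a_out


# Hypothetical values, not taken from the specification.
W = [[0.5, -1.0],
     [1.0, 2.0],
     [-0.5, 0.25]]       # 3 rows x 2 columns, six weights
a_in = [2.0, 1.0]        # 2 rows x 1 column
bias = [0.5, -1.0, 0.0]  # 3 rows x 1 column

a_out = layer_forward(W, a_in, bias)
print(a_out)  # [0.5, 3.0, 0.0]
```

The third element is clipped to zero by ReLU because its pre-activation value (-0.75) is negative, which shows how f determines the final value of each element of Aout.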

In embodiments, the neural network 100 may contain a large number of neurons and synapses, and the matrix multiplication and summation in Equation 1 may consume a large amount of computing resources. In conventional processing-in-memory computing approaches, a computing device performs matrix multiplication within an NVM cell array using analog electrical values rather than digital logic and arithmetic components. These conventional designs aim to reduce the computational load and power requirements by reducing communication between CMOS logic and NVM components. However, in these conventional approaches the current input signal to each synapse tends to vary widely, because the parasitic resistance along the current input signal path is large in a large-scale NVM cell array. In addition, leakage current through half-selected cells in a large array changes the programmed resistance values, causing unwanted program disturbance and degrading the accuracy of the neural network computation.

Unlike the conventional approaches, in embodiments, a power-efficient neural network may be based on logic-friendly non-volatile synapses with a differential structure, where the differential structure may include a select transistor and a logic-friendly NVM. In embodiments, a fully differential synapse structure can widen the operating range of the synapse circuit as a multiplier. Compared with conventional structures, in embodiments, a slight multiplication error can be beneficial in compensating for a certain level of quantization noise in the trained weight parameters.

As described below, in embodiments, the input signal to each synapse of the synapse layers 120, 140, 160, and 180 is directed to the gate terminal of the select transistor of the synapse, which can suppress multiplication noise. In embodiments, the multiplier current may be approximately the gate terminal voltage multiplied by the resistance level of the variable resistor or NVM.

FIG. 2 shows a schematic diagram of a synapse array 200 according to embodiments of the present invention. As shown, the synapse array 200 may include non-volatile synapses 210 arranged in rows and columns, positive output current lines (bit lines) 266 each electrically connected to a column select transistor 263, and negative output current lines (bit line bar lines) 267 each electrically connected to a column select transistor 268. In embodiments, the drain terminal of the column select transistor 263 may be electrically connected to the positive current port 241 of the sensing circuit 250, and the drain terminal of the column select transistor 268 may be electrically connected to the negative current port 242 of the sensing circuit 250.

In embodiments, each non-volatile synapse 210 may store one positive weight and one negative weight. In embodiments, each non-volatile synapse 210 includes a signal line (or, equivalently, reference signal line) (e.g., SL1) 264 for receiving a reference voltage input 201, a word line (or, equivalently, input signal line) (e.g., WL1) 265 for receiving a signal voltage input 202, a positive output line (e.g., BL1) 266 for outputting a positive current output 203, and a negative output line (e.g., BLB1) 267 for outputting a negative current output 204.

In embodiments, each of the signal voltage input 202 and the reference voltage input 201 may be associated with both the positive and negative weights, the positive current output 203 may be associated with the positive weight, and the negative current output 204 may be associated with the negative weight.

In embodiments, the positive (or negative) weight stored in each non-volatile synapse 210 may be expressed as the reciprocal of a variable resistance value, and the values of the signal voltage input 202 and the reference voltage input 201 may be electrical voltage values. In embodiments, the value of the positive current output 203 may be the product of the positive weight value and the signal voltage input 202, and the value of the negative current output 204 may be the product of the negative weight value and the signal voltage input 202.

As shown in FIG. 2, each row of the non-volatile synapse array 200 may share a reference voltage line (SL) 264 and a signal voltage line (WL) 265. Each SL may provide the reference voltage input 201 to the corresponding row of non-volatile synapses, and each WL may provide the signal voltage input 202 to the corresponding row of non-volatile synapses, so that the non-volatile synapses in a row receive substantially the same signal voltage input and the same reference voltage input.

As described above, each column of the non-volatile synapse array 200 may share a positive output current line (BL) 266 and a negative output current line (BL-Bar) 267. That is, the positive current outputs 203 of the synapses in a column may be collected by the corresponding BL 266, and the negative current outputs 204 of the synapses in a column may be collected by the corresponding BL-Bar line 267. Thus, the electrical current on the BL line 266 may be the sum of the positive output currents 203 from the synapses in that column. Similarly, in embodiments, the electrical current on the BL-Bar line 267 may be the sum of the negative output currents 204 from the synapses in that column.
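The row-shared inputs and column-wise current collection described above can be modeled numerically as follows. This is a minimal sketch under idealized assumptions (each synapse contributes exactly weight × input voltage, with no line resistance or leakage); the name `column_currents`, the conductance tables, and all values are hypothetical and use arbitrary units.

```python
def column_currents(v_in, g_pos, g_neg):
    """Sum per-synapse currents onto each column's BL and BL-Bar lines.

    v_in  : input voltage on each row's word line (one value per row)
    g_pos : positive weights (1/R) stored at [row][column]
    g_neg : negative weights (1/R) stored at [row][column]
    Returns (i_bl, i_blb), one summed current per column.
    """
    n_cols = len(g_pos[0])
    i_bl = [0.0] * n_cols
    i_blb = [0.0] * n_cols
    for v, gp_row, gn_row in zip(v_in, g_pos, g_neg):
        for c in range(n_cols):
            # Every synapse in a row sees the same input voltage v.
            i_bl[c] += v * gp_row[c]
            i_blb[c] += v * gn_row[c]
    return i_bl, i_blb


# Hypothetical 2-row x 2-column array, arbitrary units.
v_in = [1.0, 0.5]
g_pos = [[2.0, 0.0],
         [4.0, 2.0]]
g_neg = [[0.0, 1.0],
         [2.0, 0.0]]

i_bl, i_blb = column_currents(v_in, g_pos, g_neg)
print(i_bl, i_blb)  # [4.0, 1.0] [1.0, 1.0]
```

Each column therefore performs one multiply-accumulate over all rows, which is why the array as a whole can evaluate a matrix-vector product in a single analog step.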

In embodiments, each positive output current line (BL) 266 may be electrically connected to the source terminal of the corresponding column select transistor 263, and each negative output current line (BL-Bar) 267 may be electrically connected to the source terminal of the corresponding column select transistor 268. In embodiments, the column select transistors 263 and 268 of a pair of BL and BL-Bar lines may receive the same column select signal at their gate terminals from an external column select circuit (not shown in FIG. 2). In embodiments, the line at the drain terminal of the column select transistor 263 may be electrically connected to the positive current input 241 of the sensing circuit 250. In embodiments, the line at the drain terminal of the column select transistor 268 may be electrically connected to the negative current input 242.

In embodiments, the electrical current value (IBL) 261 at the positive current port 241 may be the current on the positive output current line BL 266 whose column select transistor 263 receives a column select signal. Likewise, the electrical current value (IBL-bar) 262 at the negative current input 242 may be the current on the negative output current line BL-Bar 267 whose column select transistor 268 receives a column select signal.

In embodiments, one or more rows of synapses 210 may have a fixed input signal voltage on their WLs 265, and the synapses in such a row may store the bias values for their respective columns. In embodiments, the array of synapses may implement the matrix multiplication of Equation 1.

W × Ain + Bias

Here, W may be the synapse array, and the matrix Ain represents the WL inputs.

In embodiments, each non-volatile synapse 210 may have two circuits (or, equivalently, cells) that store a negative weight and a positive weight. In embodiments, as described above, the weights may be expressed as the reciprocals of the respective variable resistances: 1/Rn = W_neg and 1/Rp = W_pos. Each row of synapses in the array 200 may receive an input signal as an electrical voltage, Ain. In response to the input signal, each synapse of the array 200 may generate a positive output current through a BL (e.g., BL0 266) and a negative output current through a BLB (e.g., 267), where the positive output current BLc may be expressed as BLc = Ain × W_pos and the negative output current BLBc may be expressed as BLBc = Ain × W_neg.
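
As a minimal illustrative sketch (not part of the disclosed circuit), the per-synapse relations above can be written in Python. The function name `synapse_outputs` and all numeric values are assumptions chosen for illustration, and the currents are idealized as Ain × (1/R).

```python
def synapse_outputs(a_in, r_p, r_n):
    """Idealized synapse: each weight is the reciprocal of a programmed resistance.

    a_in: input voltage on the word line (volts)
    r_p, r_n: programmed resistances of the positive and negative cells (ohms)
    Returns (BLc, BLBc), the positive and negative output currents (amperes).
    """
    w_pos = 1.0 / r_p   # W_pos = 1/Rp
    w_neg = 1.0 / r_n   # W_neg = 1/Rn
    return a_in * w_pos, a_in * w_neg


# Hypothetical values: 0.5 V input, Rp = 100 kOhm, Rn = 1 MOhm.
bl, blb = synapse_outputs(a_in=0.5, r_p=100e3, r_n=1e6)
print(bl, blb, bl - blb)
```

In this idealization the signed weight seen by a downstream differential stage is proportional to 1/Rp − 1/Rn.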

In embodiments, the weights W for each synapse layer of the neural network 100 may be determined (calculated and adjusted) in a separate training phase. Then, during an inference phase, the input signals Ain may be applied to the neural network 100, and the predetermined weights may be used to generate output values. In embodiments, the weights determined during the training phase may remain unchanged during the inference phase.

In embodiments, as described above, a BL (e.g., BLi) may be electrically connected to all of the positive output lines of the synapses in one column of the synapse array 200, and a BL-bar line (e.g., BLBi) may be electrically connected to all of the negative output lines of the synapses in that column. This configuration may make the current on each BL 266 (or BLB 267) the sum of the individually computed currents of the synapses in the corresponding column of the array 200. In embodiments, the output currents on the BLn and BLBn lines may be expressed as follows.

[Equation 2a]

BLn = Σ(W_pos-row × Ain-row), summed over the rows of column n

[Equation 2b]

BLBn = Σ(W_neg-row × Ain-row), summed over the rows of column n

In embodiments, one or more rows of the array 200 may have a fixed input signal voltage, and the synapses in such a row may store the bias values for their respective columns. In this case, the total electrical current on BLn and BLBn may be expressed as follows.

[Equation 3a]

BLn = Σ(W_pos-row × Ain-row) + bias_pos

[Equation 3b]

BLBn = Σ(W_neg-row × Ain-row) + bias_neg
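
The column sums of Equations 2a/2b and 3a/3b can be sketched in a few lines of Python. This is an idealized model under stated assumptions: the function name `column_currents`, the list-based representation, and the numeric values are hypothetical, and the bias terms are modeled as plain additive currents.

```python
def column_currents(w_pos, w_neg, a_in, bias_pos=0.0, bias_neg=0.0):
    """Sum the per-row currents of one column n.

    w_pos, w_neg: per-row weights of the column, as conductances (siemens)
    a_in: per-row input voltages on the word lines (volts)
    bias_pos, bias_neg: currents contributed by fixed-input bias rows (amperes)
    Returns (BLn, BLBn) in amperes.
    """
    bl = sum(w * a for w, a in zip(w_pos, a_in)) + bias_pos
    blb = sum(w * a for w, a in zip(w_neg, a_in)) + bias_neg
    return bl, blb


# Hypothetical 3-row column; all values are illustrative only.
bl_n, blb_n = column_currents(
    w_pos=[2e-6, 0.0, 1e-6],
    w_neg=[0.0, 3e-6, 0.0],
    a_in=[0.5, 1.0, 0.25],
    bias_pos=1e-7,
)
print(bl_n, blb_n)
```

With the bias arguments left at zero the function reduces to Equations 2a and 2b.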

In embodiments, in the sensing circuit 250, the current input signal from the synapse array (ISig = IBL 261 or IBLB 262) may be converted to a voltage signal (Vsig) using a capacitive trans-impedance amplifier (CTIA) and further processed by an analog-to-digital converter (ADC) to generate a digital signal. In embodiments, the ADC may have a single-slope column ADC architecture that uses an offset-cancelling column comparator and a counter. Such an architecture may use minimal area and power dissipation compared to other ADC architectures, such as pipeline or successive-approximation ADCs.
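
The sensing chain described above can be illustrated with a behavioral Python sketch. It is an assumption-laden model rather than the disclosed circuit: the CTIA is reduced to ideal integration onto a feedback capacitor (V = I × t / C), the single-slope ADC is a counter that runs until a linear ramp crosses Vsig, offset cancellation is not modeled, and every name and parameter value is hypothetical.

```python
def ctia_voltage(i_sig, t_int, c_f):
    """Ideal CTIA: integrate current i_sig (A) for t_int (s) onto capacitor c_f (F)."""
    return i_sig * t_int / c_f


def single_slope_adc(v_sig, v_ref=1.0, bits=8):
    """Count clock cycles until a linear ramp (0..v_ref) crosses v_sig.

    The comparator is assumed ideal; the returned code saturates at 2**bits - 1.
    """
    steps = 1 << bits
    lsb = v_ref / steps
    count = 0
    # Each loop iteration is one clock tick of the column counter.
    while count < steps - 1 and (count + 1) * lsb <= v_sig:
        count += 1
    return count


# Hypothetical operating point: 1 uA integrated for 1 us onto 2 pF.
v_sig = ctia_voltage(i_sig=1e-6, t_int=1e-6, c_f=2e-12)
print(v_sig, single_slope_adc(v_sig))
```

The conversion time of this model grows linearly with the code value, which is the usual trade-off accepted in exchange for the small per-column comparator and counter.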

In embodiments, each synapse layer (e.g., 120) of the neural network 100 may have electrical components (not shown in FIG. 2) that are electrically connected to BL 266 and BLB 267 and that electrically process the output currents on the BL and BLB lines. For example, the electrical components may provide differential sensing, convert the output current signals to voltage signals, further convert them to digital signals, and sum the digital signals in an accumulator. In another example, the electrical components may perform various other processing operations on the summed value, such as normalization and activation, thereby implementing the activation function for Aout of Equation 1. In embodiments, the final Aout may be stored in a data buffer and used to generate the input signals for the next neuron array layer in the neural network 100.
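
A rough Python sketch of the digital post-processing described above follows. It assumes, purely for illustration, that BL and BLB have already been digitized into integer codes, that differential sensing is a subtraction of those codes, and that the activation is a ReLU; the source does not specify any of these choices, and the name `neuron_output` is hypothetical.

```python
def neuron_output(bl_codes, blb_codes, activation=lambda x: max(0, x)):
    """Differentially combine digitized BL/BLB samples, accumulate, then activate.

    bl_codes, blb_codes: sequences of ADC codes for the same column
    activation: assumed activation function (ReLU by default)
    """
    acc = 0
    for bl, blb in zip(bl_codes, blb_codes):
        acc += bl - blb          # differential sensing in the digital domain
    return activation(acc)       # value that would be buffered as Aout


print(neuron_output([10, 20], [5, 3]))
```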

In embodiments, separate circuits (not shown in FIG. 2) may be included in the neural network 100 to perform auxiliary functions such as: (1) a router/controller that maps the logical neuron-synapse structure of the neural network 100 onto the physical address mapping of the synapse array 200, (2) a driving circuit that drives the input signals to the appropriate rows of synapses according to the configuration, (3) a selection circuit that provides column selection for sensing circuits shared among one or more columns of synapses, (4) a voltage generator that generates the reference voltages used to select synapses, and (5) a storage that stores the configuration for the router/controller and the sensing circuit 250.

FIG. 3 shows a schematic diagram of a synapse 300 according to embodiments of the present invention. In embodiments, the synapse 300 may be used as the synapse 210 of FIG. 2. As shown, the synapse 300 may include a pair of input transistors 311 and 312 and a pair of non-volatile resistance change elements R_p 313 and R_n 314 (hereinafter, the terms "non-volatile resistance change element" and "resistor" are used interchangeably). In other words, the synapse 300 may have a pair of 1T-1R (one-transistor, one-resistor) structures. In embodiments, the resistors R_p 313 and R_n 314 may be logic-friendly non-volatile resistance change elements. In embodiments, the synapse 300 may have two cells 332 and 334, where each cell has one input transistor 311 (or 312) and one resistor R_p 313 (or R_n 314).

In embodiments, the logic-friendly non-volatile resistance change element R_p 313 (or R_n 314) may be associated with the positive (or negative) weight parameter that the synapse 300 may remember/store. In embodiments, each resistor may be electrically connected to the source terminal of an input transistor (e.g., 311), and the reference signal line 264 may apply a reference signal to the resistor. In embodiments, the word line (WL) 265 may apply an input signal voltage to the gate terminal of the input transistor (e.g., 311).

In embodiments, the resistance value R (= R_p or R_n) may be programmed into the resistance change element in the training phase. When a synaptic input signal is applied on WL 265, the synaptic output current may approximate the input value Ain from the previous neuron multiplied by the weight (expressed as 1/R), where Ain may be represented by the voltage on WL 265.

In embodiments, the neural network parameters stored in the synapse array 200 may include approximately similar numbers of positive and negative weight parameters. Unused resistive elements in the array 200 may be programmed to a resistance higher than a preset value. The electrical current through each unused resistive element should be substantially zero, so that the cell's output current adds substantially nothing to the output current on the cell's BL (or BLB). As a result, the effect of unused resistive elements on the computation is minimized and power consumption is reduced. Trained weight parameters may be quantized and programmed into the resistance change elements without significantly degrading the accuracy of the neural network computation. When the resistance value R of the resistor R_p 313 (or R_n 314) has been programmed in the training phase and scaled synaptic input signals are applied through WL 265, the synaptic output current IC on BL 266 (or BLB 267) may be expressed by Equations 4 and 5.
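
One possible way to map a trained signed weight onto a resistor pair, consistent with the description above, is sketched below in Python. The mapping is an assumption made for illustration only: the uniform quantizer, the number of levels, the full-scale conductance `g_max`, the value chosen for `R_OFF`, and the function name are all hypothetical and are not taken from the source.

```python
R_OFF = 1e9  # assumed resistance for an unused element; its current is negligible


def weight_to_resistances(w, g_max=1e-5, levels=16):
    """Quantize a signed weight in [-1, 1] and map it to an (R_p, R_n) pair.

    A positive weight is stored in R_p and a negative weight in R_n; the unused
    element of the pair is set to R_OFF so that it adds almost no current.
    """
    w = max(-1.0, min(1.0, w))                        # clip to the assumed range
    q = round(abs(w) * (levels - 1)) / (levels - 1)   # quantized magnitude
    if q == 0:
        return R_OFF, R_OFF
    r = 1.0 / (q * g_max)                             # weight is stored as 1/R
    return (r, R_OFF) if w > 0 else (R_OFF, r)


for w in (0.8, -0.3, 0.0):
    print(w, weight_to_resistances(w))
```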

[Equation 4]

dIC/dWL ≈ gm / (1 + gm × R) ≈ 1/R (when R is sufficiently greater than 1/gm)

Here, gm is the conductance of the input transistor.

[Equation 5]

IC ≈ WL / R ≈ w × Ain (where w = 1/R and Ain = WL)

Here, the product of w and Ain approximately gives IC.

As shown in Equation 5, the output current IC may approximate the product of the input signal (input voltage Ain) and the weight (w). Unlike in conventional systems, the analog multiplication of Equation 5 that takes place in the synapse 300 does not require complex digital logic gates, which significantly reduces the complexity of the synapse structure and the use of computational resources.
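
The quality of the 1/R approximation in Equation 4 can be checked numerically. The short Python sketch below evaluates gm / (1 + gm × R) against 1/R for a few resistances; the value of gm and the list of resistances are assumptions picked for illustration, not device data.

```python
def slope(gm, r):
    """dIC/dWL from Equation 4 for conductance gm (S) and resistance r (ohm)."""
    return gm / (1.0 + gm * r)


gm = 1e-3  # assumed conductance of the input transistor (1 mS)
for r in (1e3, 1e4, 1e5, 1e6):
    exact = slope(gm, r)
    approx = 1.0 / r
    rel_err = (approx - exact) / exact
    print(f"R={r:.0e} ohm  gm*R={gm * r:.0f}  relative error={rel_err:.3%}")
```

Algebraically, the relative error of the 1/R approximation equals 1/(gm × R), so it shrinks as R grows relative to 1/gm, which is the condition stated with Equation 4.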

In embodiments, the input signal Ain may be an output signal from a previous neuron (as shown in FIG. 1) and may be driven to the gate of the input transistor 311 (or 312). Driving the input signal Ain to the gate may minimize the noise generated by the resistive components of a large synapse array, because no static current flows into the gate of the select transistor. In contrast, in conventional systems the input signal is driven to the selector or the resistance change element of a synapse, which is prone to large variations in the current input signal to each synapse because of the large resistive components of a large array and the static current that flows during operation.

In conventional systems, when a resistance change element is programmed, leakage current through half-selected cells in a large array may alter previously programmed resistance values, causing unwanted program disturbance. In contrast, in embodiments, the input transistor 311 (or 312) may be enabled so that a program pulse is driven only to the selected resistor 313 (or 314) in a large array. Thus, in embodiments, unselected synapses may not disturb the programming of selected synapses, where the selected synapses may be programmed by applying suitable bias conditions to the BL (or BLB) and SL nodes.

As a non-limiting example, the synapse array 200 may be located in the synapse layer 120, where the output signals from the previous neurons (e.g., 112a) of the neuron array layer 110 may be input to the synapses 300 of the synapse array 200, and the output signals on BL 266 and BLB 267 may be input to one or more of the next neurons (e.g., 132a to 132m) in the neuron array layer 130.

In embodiments, the resistor 313 (or 314) may be implemented with various circuits (or memories), such as non-volatile MRAM, RRAM, or PRAM, or single-poly embedded flash memory, where the circuit may be programmed to remember (store) an associated parameter that may be expressed as the reciprocal of a resistance. In embodiments, the multiplication may be completed within the synapse using analog values, without digital logic or arithmetic circuits.

FIG. 4 shows a schematic diagram of another synapse 400 according to embodiments of the present invention. In embodiments, the synapse 400 may represent an exemplary implementation of the resistors 313 and 314 of FIG. 3. In other words, in embodiments, the resistor 313 may be implemented by the components in box 452 of FIG. 4.

As shown in FIG. 4, the synapse 400 includes a pair of logic-compatible embedded flash memory cells 432 and 434, where the floating gate nodes FG_p and FG_n of the flash memory cells may be associated with the positive and negative weight parameters, respectively, that the synapse 400 remembers/stores.

In embodiments, the synaptic input signal on WL 420 may be shared between the two branches, which may produce differential synaptic output currents (IBL and IBLB) on BL 406 and BLB 407. In embodiments, a program word line (or simply program line, PWL) 418, a write word line (or simply write line, WWL) 416, and an erase word line (or simply erase line, EWL) 414 may be used to provide additional control signals for the program, write, and erase operations of the logic-compatible embedded flash memory cells 432 and 434.

In embodiments, the memory cells 432 and 434 may include logic transistors, which avoids any additional process overhead beyond a standard logic process. In embodiments, the coupling transistors 422 and 423, which are directly connected to PWL 418, may be upsized for higher coupling of the floating gate nodes (FG_p, FG_n) to the control signal provided through PWL 418. In embodiments, the coupling transistor 422 (or 423) directly connected to PWL 418 may be relatively larger than the write transistor 424 (or 425). When a high program voltage is driven to PWL 418 and WWL 416, the memory cell 432 (or 434) may be selected and programmed by applying 0 V to BL 406 (or BLB 407) and injecting electrons into FG_p. The unselected cell 434 (or 432), in contrast, may be program-inhibited by applying VDD to BLB 407 (or BL 406) and VDD to WL 420, which turns off the select transistor of the unselected cell 434 (or 432). Hereinafter, the term select transistor refers to a transistor having a gate electrically connected to BL 406 or BLB 407.

In embodiments, when a high erase voltage is driven only to WWL 416, the selected WL may be erased by ejecting electrons from the FG. Unselected WLs may not be driven to any voltage higher than VDD during program and erase operations, so there is no disturbance on the unselected WLs. In embodiments, the FG node voltage may be a function of the signals on PWL 418 and WWL 416 and the number of electrons stored in the FG node. The conductance of the read transistor (e.g., 462) electrically connected to the FG may be programmed by controlling the voltages on PWL 418 and WWL 416 and the electrical charge stored in the FG node.

In embodiments, when the threshold voltage of the embedded flash cell 432 (or 434) has been programmed and a scaled synaptic input signal is provided through WL 420, there may be a certain threshold voltage range in which Equation 5 is approximately satisfied. In that range, the cell output currents (IBL and IBLB) are proportional to the programmed weight parameter as well as to the input signal.

In embodiments, the neural network 100 may be robust to random errors or small variations in the weight parameters. In embodiments, when the pre-trained weight parameters W are quantized during the computation of the neural network 100, the neural network performance or inference accuracy may be optimized even with a slight multiplication error from Equation 5, as long as the multiplication error stays within a certain range. In addition, the slight multiplication error from the proposed approximate multiplier may compensate for the quantization noise of the trained weight parameters of the neural network 100. Nevertheless, to avoid severe cell retention errors caused by large cell threshold voltage shifts after repeated training of the neural network, an intentional self-healing current may be applied through WWL 416, because such a current may cure the damaged gate oxide of the devices electrically connected to WWL 416 of the embedded flash memory cells 432 and 434. In embodiments, the self-healing current may not need to be applied for every training or inference, and therefore has minimal impact on performance or power consumption.

In embodiments, each cell (e.g., 432) may include a coupling transistor 422, a write transistor 424, an upper select transistor (or first select transistor) 460, a read transistor 462, and a lower select transistor 464. The single-poly embedded flash memory in the synapse 400 may be used as the resistance change element, and the conductance of the read transistor (e.g., 462) electrically connected to the floating gate (FG) of the flash may act as the resistance change element. In embodiments, the conductance of the read transistor (e.g., 462) may be determined by the threshold voltage VTH of the respective FG node (FG_p or FG_n). The VTH of the FG node (FG_p or FG_n) may first be coarsely programmed using a balanced step pulse programming method, and a subsequent constant pulse programming method with a reduced voltage may then fine-tune the VTH value to accurately program the weight to be stored in the synapse 400. The programming steps are described with reference to FIGS. 10A and 10B.
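
The coarse-then-fine idea described above can be illustrated with a toy program-and-verify loop in Python. This is not the balanced step pulse or constant pulse method itself, whose details are given with FIGS. 10A and 10B: it assumes, only for illustration, that every coarse pulse raises VTH by a fixed `coarse_step` and every fine pulse by a smaller fixed `fine_step`. The function name and all numbers are hypothetical, and a real cell responds nonlinearly.

```python
def program_vth(target, vth=0.0, coarse_step=0.10, fine_step=0.01, max_pulses=1000):
    """Two-phase program-and-verify loop on a toy cell model.

    Returns (final_vth, coarse_pulses, fine_pulses); voltages are in volts.
    """
    coarse = fine = 0
    # Coarse phase: large pulses, stopping before a pulse would overshoot the target.
    while vth + coarse_step <= target and coarse < max_pulses:
        vth += coarse_step
        coarse += 1
    # Fine phase: small pulses until the verify read reaches the target.
    while vth < target and fine < max_pulses:
        vth += fine_step
        fine += 1
    return vth, coarse, fine


final_vth, n_coarse, n_fine = program_vth(target=0.55)
print(final_vth, n_coarse, n_fine)
```

In this model the final overshoot is bounded by one fine step, which is the reason for finishing with smaller pulses.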

FIG. 5 shows a schematic diagram of a synapse 500 according to embodiments of the present invention. In embodiments, the synapse 500 may be used as the synapse 210 of FIG. 2. As shown, the synapse 500 may have three pairs of 1T-1R, where three word lines (WLa, WLb, and WLc) may be electrically connected to the gates of the six transistors. The synapse 500 may have any other suitable number of input transistors and resistors, as well as word lines electrically coupled to the input transistors. For example, in embodiments, the synapse 500 may be modified so that the word line WLa and the components of the 1T-1R units 550 and 551 are removed, that is, so that each cell has two pairs of 1T-1R. As another example, in embodiments, the synapse 500 may be modified so that each cell has four pairs of 1T-1R and four word lines (input signal lines) WL.

In embodiments, SL, BL, and BLB of the synapse 500 may have functions similar to those of SL, BL, and BLB of the synapse 300. The difference between the synapse 300 and the synapse 500 is that the synapse 500 may receive input signals from the previous neurons through three word lines WLa, WLb, and WLc. More specifically, the signal on each WL may be driven to the gate terminal of the corresponding input transistor.

Each synapse 500 may be electrically connected to three word lines WLa, WLb, and WLc, whereas each synapse 210 in FIG. 2 is shown connected to a single word line 265. Accordingly, each word line 265 in FIG. 2 collectively refers to the one or more word lines electrically connected to a synapse that includes one or more input transistors.

In embodiments, the synapse 500 may have two cells 532 and 534, where each cell may have three pairs of 1T-1R (one transistor, one resistor), and each 1T-1R pair may be electrically connected to a WL and to SL.

Each resistor of the synapse 500 may be implemented by various circuits (or memories), such as non-volatile MRAM, RRAM, or PRAM, or single-poly embedded flash memory, where the circuit may be programmed to remember (store) an associated parameter that may be expressed as a resistance. In embodiments, each resistor of the synapse 500 may be implemented by the components in box 452 of FIG. 4, in which case each synapse 500 may be electrically connected to PWL, WWL, and EWL in a manner similar to the synapse 400.

FIG. 6 shows a schematic diagram of another synapse 600 according to embodiments of the present invention. In embodiments, the synapse 600 may be used as the synapse 210 of FIG. 2. As shown, each cell 632, 634 may include two transistors (e.g., 602 and 606) and one resistor (e.g., 613), and may be electrically connected to two input signal (or word) lines, namely a word line (WL) and a word line bar (WLB), and to one reference signal line (SL). Each synapse 600 may be electrically connected to two word lines, whereas each synapse 210 in FIG. 2 is shown connected to a single word line 265. Accordingly, each word line 265 in FIG. 2 collectively refers to the one or more word lines electrically connected to a synapse that includes one or more input transistors.

In embodiments, the synapse resistors R_p 613 and R_n 614, the reference signal line SL, and the output current lines BL and BLB may have functions similar to those of the corresponding components of the synapse 300 of FIG. 3. For example, the input select transistors 602 and 604, which are electrically connected to WL and to the resistors R_p 613 and R_n 614, respectively, may correspond to the input transistors 311 and 312, respectively.

Compared with the synapse 300 of FIG. 3, the synapse 600 may be electrically connected to another input signal line, WLB, where WLB may provide an input signal voltage that is differential with respect to WL. In embodiments, additional input select transistors 606 and 608 may be electrically connected to WLB through their gate terminals. In embodiments, the source terminals of the input select transistors 606 and 608 may be electrically connected to the resistors R_p 613 and R_n 614, respectively. In embodiments, the drain terminal of the transistor 602 may be electrically connected to BL and the drain terminal of the transistor 606 may be electrically connected to BLB. Likewise, the drain terminal of the transistor 604 may be electrically connected to BLB and the drain terminal of the transistor 608 may be electrically connected to BL.

In embodiments, the synapse 600 may receive a differential input signal, where WL provides a positive input signal voltage a_pos (relative to the common-mode reference) and WLB provides a negative input signal voltage a_neg (relative to the common-mode reference). In embodiments, R_p 613 may store a positive weight w_pos and R_n 614 may store a negative weight w_neg. Thus, in embodiments, the output signal current on BL (BLo) may be the sum of the two output signals from the two cells 632 and 634.

[Equation 6]

BLo = a_pos × w_pos + a_neg × w_neg

Similarly, the output signal current BLBo on BLB may be the sum of two output signals from the two cells 532 and 534.

[Equation 7]

BLBo = a_pos × w_neg + a_neg × w_pos

Thus, as shown, some embodiments with differential signaling on WL and WLB may have a greater range of output currents on BL and BLB than other embodiments with single-ended signaling on the WL of the synapse 300 shown in FIG. 3. In addition, as shown, embodiments with differential input signaling may suppress transistor offset noise as well as common-mode noise arising from variations in supply voltage or temperature.
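As a minimal numerical sketch of Equations 6 and 7, the following Python snippet computes BLo and BLBo for one synapse and shows that their difference equals (a_pos - a_neg) × (w_pos - w_neg). The function name and the example values are illustrative assumptions, not part of the disclosure, and the signals are treated as ideal unitless numbers.

```python
def differential_synapse(a_pos, a_neg, w_pos, w_neg):
    """Return (BLo, BLBo) according to Equations 6 and 7 (ideal, unitless)."""
    bl = a_pos * w_pos + a_neg * w_neg    # Equation 6
    blb = a_pos * w_neg + a_neg * w_pos   # Equation 7
    return bl, blb


# Illustrative values only (assumed, not taken from the description).
a_pos, a_neg, w_pos, w_neg = 1.0, 0.25, 0.75, 0.25
bl, blb = differential_synapse(a_pos, a_neg, w_pos, w_neg)

# The differential output factors into (input difference) x (weight difference).
diff = bl - blb
factored = (a_pos - a_neg) * (w_pos - w_neg)
print(bl, blb, diff, factored)
```

Because the difference depends only on a_pos - a_neg and w_pos - w_neg, any term common to both inputs or both weights drops out of the differential result in this idealized model.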

Each resistor of the synapse 600 may be implemented by various circuits (or memories), such as non-volatile MRAM, RRAM, PRAM, or single-poly embedded flash memory, where the circuit may be programmed to remember (store) an associated parameter. FIG. 7 shows a schematic diagram of another synapse 700 according to embodiments of the present invention. In embodiments, the synapse 700 may represent an exemplary implementation of the resistors 613 and 614 of FIG. 6. In other words, the components in box 752 may correspond to the resistor 613 of FIG. 6.

As shown in FIG. 7, the synapse 700 may include two cells 732 and 734. In embodiments, the cell 732 (or 734) may be similar to the cell 432 (or 434) of the synapse 400, with the difference that the cell 732 (or 734) includes an additional upper select transistor 720 (or 722) and an additional input signal line WLB. In embodiments, the gate of the transistor 720 (or 722) may be electrically connected to the input signal line WLB, and the drain of the transistor 720 (or 722) may be electrically connected to the output signal line BLB.

FIG. 8 shows a schematic diagram of another synapse 800 according to embodiments of the present invention. In embodiments, the synapse 800 may be used as the synapse 210 of FIG. 2. As shown, the synapse 800 may include two cells 832 and 834, where each cell may include three resistors and six transistors. The synapse 800 may have a 2T-1R structure, i.e., each cell may include three sets of 2T-1R units 802. The synapse 800 may be electrically connected to six input signal lines: three word lines (WLa, WLb, and WLc) and three word line bars (WLaB, WLbB, and WLcB). Each cell of the synapse 800 may instead include any other suitable number of 2T-1R units 802. In embodiments, each pair of WL and WLB (e.g., WLa and WLaB) may provide a differential input signal to the cells 832 and 834.

In embodiments, the reference signal line SL may provide a reference signal to the cells 832 and 834. In embodiments, each of the output signal lines BL and BLB may collect output signals from the drain terminals of three transistors of the cell 832 and the drain terminals of three transistors of the cell 834. In embodiments, the synapse 800 may receive differential input signals, where each WLi provides a positive input signal voltage a_pos_i and each WLBj provides a negative input signal voltage a_neg_j. In embodiments, each R_p may store a positive weight w_pos_i and each R_n may store a negative weight w_neg_j. In embodiments, the output signal current BLo on BL may be the sum of six output signals from the two cells 832 and 834.

[Equation 8]

BLo = ∑(a_pos_i × w_pos_i) + ∑(a_neg_j × w_neg_j)

Similarly, the output signal current BLBo on BLB may be the sum of six output signals from the two cells 832 and 834.

[Equation 9]

BLBo = ∑(a_pos_i × w_neg_j) + ∑(a_neg_j × w_pos_i)
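The sums of Equations 8 and 9 can be sketched in Python as follows. This sketch assumes that the k-th 2T-1R unit pair combines its own inputs (a_pos_k, a_neg_k) with its own weights (w_pos_k, w_neg_k), i.e., that the indices i and j run in step over the same units; that pairing, the function name, and the example values are assumptions made for illustration.

```python
def multi_input_synapse(a_pos, a_neg, w_pos, w_neg):
    """Return (BLo, BLBo) as sums over per-unit products (index-aligned)."""
    if not (len(a_pos) == len(a_neg) == len(w_pos) == len(w_neg)):
        raise ValueError("all input and weight lists must have the same length")
    # Equation 8: positive inputs with positive weights, negative with negative.
    bl = (sum(ap * wp for ap, wp in zip(a_pos, w_pos))
          + sum(an * wn for an, wn in zip(a_neg, w_neg)))
    # Equation 9: the cross terms collected on BLB.
    blb = (sum(ap * wn for ap, wn in zip(a_pos, w_neg))
           + sum(an * wp for an, wp in zip(a_neg, w_pos)))
    return bl, blb


# Three word-line pairs (WLa/WLaB, WLb/WLbB, WLc/WLcB); values are illustrative.
bl, blb = multi_input_synapse(
    a_pos=[1.0, 0.5, 0.0],
    a_neg=[0.0, 0.5, 1.0],
    w_pos=[0.5, 0.25, 0.75],
    w_neg=[0.25, 0.5, 0.0],
)
print(bl, blb)
```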

Each resistor of the synapse 800 may be implemented by various circuits (or memories), such as non-volatile MRAM, RRAM, PRAM, or single-poly embedded flash memory, where the circuit may be programmed to remember (store) an associated parameter. In embodiments, each resistor of the synapse 800 may be implemented by the components in box 752 of FIG. 7, in which case each synapse 800 may be electrically connected to PWL, WWL, and EWL in a manner similar to the synapse 700.

In general, the conductance of a read transistor (e.g., 462) may be changed by injecting electrons into the floating gate. FIGS. 9A-9B compare two conventional methods (columns 910 and 914) and a method according to embodiments (column 912) for programming the threshold voltage (VTH) of a floating gate node. FIG. 9A shows a table 900 listing the voltage heights and widths of the signals applied to the PWL and WWL terminals during a program operation of the floating gate cell 432, by which electrons are injected into the floating gate. As shown, the table 900 includes three columns 910, 912, and 914, which correspond to three approaches for applying the voltage signals.

Column 910 represents a conventional incremental-step-pulse programming (ISPP) method, in which each subsequent program step increases the program voltage by delta over the previous step while keeping a constant pulse width (T_pulse). Column 912 represents a balanced-step-pulse programming method according to embodiments, in which the first step has a programming pulse width that is longer, by a design parameter m, than in the programming method of column 910. Column 914 represents a conventional constant-pulse programming method, in which every step has the same program voltage and the same program pulse width.
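The three approaches of columns 910, 912, and 914 can be sketched as pulse schedules, each a list of (height, width) pairs. The sketch below assumes that the balanced-step schedule uses the same voltage ramp as ISPP and differs only in its first pulse width (T_pulse × m); the function name, the four-step length, and the use of millivolts and unit pulse widths are illustrative choices. The numbers VPGM = 7.2 V, delta = 0.1 V, and m = 9 are borrowed from the example values given with FIG. 11.

```python
def pulse_schedules(vpgm_mv, delta_mv, t_pulse, m, steps):
    """Return {column: [(height_mV, width), ...]} for the three approaches."""
    # Column 910 (ISPP): the height grows by delta each step, the width is constant.
    ispp = [(vpgm_mv + k * delta_mv, t_pulse) for k in range(steps)]
    # Column 912 (balanced-step): same ramp, but the first pulse is m times wider.
    balanced = [
        (height, t_pulse * m if k == 0 else width)
        for k, (height, width) in enumerate(ispp)
    ]
    # Column 914 (constant-pulse): identical height and width at every step.
    constant = [(vpgm_mv, t_pulse) for _ in range(steps)]
    return {910: ispp, 912: balanced, 914: constant}


schedules = pulse_schedules(vpgm_mv=7200, delta_mv=100, t_pulse=1, m=9, steps=4)
for column, pulses in schedules.items():
    print(column, pulses)
```

Heights are kept as integer millivolts so that the ramp stays exact rather than accumulating floating-point error.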

FIG. 9B shows a plot 950 of the VTH of the floating gate cell 432 or 434 under the three methods of FIG. 9A. In FIG. 9B, the three plots 960, 962, and 964 correspond to the three methods 910, 912, and 914, respectively, and each plot shows the VTH of the floating gate cell 432 or 434 after each step of the corresponding method in FIG. 9A.

Based on the plot 950, the balanced-step-pulse programming method according to embodiments of the present invention may be preferred among these three methods. Because each step increases the VTH by approximately the same amount (delta), the VTH may be programmed accurately, resulting in a narrower VTH variation than with the other methods.

FIGS. 10A-10B show another method for programming the threshold voltage (VTH) of the floating gate cell 432 or 434 according to embodiments of the present invention. FIG. 10A shows a table 1000 listing the voltage heights and widths of the signals applied to the PWL and WWL terminals during a program operation of the floating gate cell 432 or 434, by which electrons are injected into the floating gate. FIG. 10B shows a plot 1050 of the VTH stored in the floating gate cell 432 or 434 at each step of FIG. 10A.

As shown, for the first several steps (here, up to step 4), the balanced-step-pulse programming method (described in conjunction with FIGS. 9A and 9B) may be used to coarsely program the cell VTH to a value that does not exceed the target VTH. In some embodiments, the target VTH may be reached within an acceptable margin in these initial steps (up to step 4). In some other embodiments, more precise programming to the target VTH may be required. In such embodiments, the difference between the current VTH and the target VTH may be smaller than the available increment of VTH per step (delta in FIG. 10B). Subsequent constant-pulse programming steps are then applied to program the VTH accurately.

In embodiments, the subsequent constant-pulse programming steps use a programming pulse height reduced by alpha (see FIG. 10A) but an increased pulse width (T_pulse * n, where n is not less than 1.0) to bring the VTH to the target. As a result, the programming scheme of FIGS. 10A-10B may control the final programmed cell threshold voltage to within less than the available voltage step (= delta), generated from the on-chip voltage reference, of the target VTH.

FIG. 11 shows a flowchart 1100 of an exemplary process for programming the threshold voltage (VTH) of a floating gate node according to embodiments of the present invention. In step 1102, a voltage pulse (e.g., step 1 in FIG. 10A) having a first height (e.g., VPGM) and a first width (T_pulse * m, where m is not less than 1.0) may be applied to the PWL and WWL terminals of the floating gate cell 432 or 434, thereby injecting electrons into the floating gate. In step 1104, a first sequence of voltage pulses (e.g., steps 2 to 4 in FIG. 10A) may be applied to the PWL and WWL terminals while increasing the height of each pulse by a preset value (e.g., delta) over the previous pulse.

In step 1106, it may be determined whether the target VTH has been reached after the first pulse sequence is applied. If the answer is affirmative, the process proceeds to step 1108, where the process stops. Otherwise, in step 1110, a second sequence of voltage pulses (such as steps 5 to 19 in FIG. 10A) may be applied to the PWL and WWL terminals. In embodiments, each pulse of the second pulse sequence may have a width (T_pulse * n, where n is not less than 1.0) that is not narrower than the pulse width (T_pulse) used in the previous steps. In embodiments, the second pulse sequence has a height (VPGM - alpha) that is lower than the first height, and a width (T_pulse * n) that is not narrower than the width (T_pulse) of the pulses in the first sequence. In embodiments, by way of example, the values may be m = 9.0, n = 5.0, alpha = 0.8 V, delta = 0.1 V, and VPGM = 7.2 V.
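The flow of steps 1102 to 1110 can be sketched in Python as below, using the example values m = 9.0, n = 5.0, alpha = 0.8 V, delta = 0.1 V, and VPGM = 7.2 V. Several points are assumptions made only for illustration: the callback interface (apply_pulse, read_vth_mv), verifying the VTH after every pulse of the second sequence, treating "VTH at or above the target" as "target reached", and the ToyCell class, which simply adds a fixed VTH increment per pulse and is not a device model.

```python
def program_vth(apply_pulse, read_vth_mv, target_mv, vpgm_mv=7200, delta_mv=100,
                alpha_mv=800, t_pulse=1.0, m=9.0, n=5.0,
                coarse_steps=4, max_fine_steps=15):
    """Apply the two-stage pulse sequence; return the list of pulses applied."""
    pulses = []

    def fire(height_mv, width):
        apply_pulse(height_mv, width)
        pulses.append((height_mv, width))

    # Step 1102: first pulse with height VPGM and width T_pulse * m.
    fire(vpgm_mv, t_pulse * m)
    # Step 1104: first sequence, raising the height by delta per pulse.
    for k in range(1, coarse_steps):
        fire(vpgm_mv + k * delta_mv, t_pulse)
    # Steps 1106/1108: stop if the target VTH has already been reached.
    if read_vth_mv() >= target_mv:
        return pulses
    # Step 1110: second sequence, lower height (VPGM - alpha), wider pulses.
    for _ in range(max_fine_steps):
        fire(vpgm_mv - alpha_mv, t_pulse * n)
        if read_vth_mv() >= target_mv:  # assumed per-pulse verify
            break
    return pulses


class ToyCell:
    """Toy stand-in for a floating gate cell (assumed; NOT a device model)."""

    def __init__(self, vth_mv=1000):
        self.vth_mv = vth_mv

    def apply_pulse(self, height_mv, width):
        # Coarse pulses move VTH by 100 mV, reduced-height pulses by 10 mV.
        self.vth_mv += 100 if height_mv >= 7200 else 10

    def read(self):
        return self.vth_mv


cell = ToyCell()
pulses = program_vth(cell.apply_pulse, cell.read, target_mv=1435)
print(len(pulses), cell.vth_mv)
```

In this toy run the four coarse pulses leave the cell 35 mV short of the target, and the reduced-height pulses close the gap in steps smaller than delta.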

The methods of FIGS. 9A to 11 for programming the VTH of a floating gate node may be applied to the cells 732 and 734. More specifically, the method associated with column 912 of FIG. 9A and/or the method described in conjunction with FIGS. 10A-10B may be used to program the VTH of the cells 732 and 734.

Each synapse of FIGS. 3 to 8 may generate two output signals through the two output signal lines BL and BLB, where a differential signaling technique may be applied to generate the two output signals. Differential signaling may reduce the sensitivity to transistor offsets and to common-mode noise caused by supply voltage and temperature variations, which can introduce serious errors into the output current in conventional synapse or device designs for weighted-sum computation.

FIGS. 12A-12C illustrate differential signaling according to embodiments of the present invention. As shown in FIG. 12A, the IBL line 1212 and the IBL-Bar line 1214 may be the output currents through the output signal lines BL (e.g., 106) and BLB (e.g., 107) of a synapse, respectively. By way of example, each output current may range from a minimum of 0.5 (A.U.) to a maximum of 1.5 (A.U.), depending on the resistance values of R_p and R_n. In embodiments, the IBL line 1212 may be the sum of a first current signal 1224 and an offset current signal 1220, while the IBL-Bar line 1214 may be the sum of the offset current 1220 and a second current signal 1226. As shown, the offset current 1220 may include transistor offsets and common-mode noise.

As shown in FIG. 12B, by applying the differential signaling technique to the two output signal lines 1212 and 1214, the offset current 1220 may be canceled and the values of the output current signals 1224 and 1226 may be obtained. By way of example, the output current signals 1224 and 1226 may range from 0.0 (A.U.) to 1.0 (A.U.).

Further, in embodiments, the first current signal 1224 may have a polarity opposite to that of the second current signal 1226. As shown in FIG. 12C, by using differential signaling on the two output signals, the difference 1216 between the two signals (IBL - IBL-Bar) may range from a minimum of -1.0 to a maximum of +1.0. That is, the range of the combined signal may be twice the range of a single output.
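The offset cancellation of FIGS. 12A-12C can be illustrated with a short Python sketch. It models each line current as a shared offset plus a signal in the range 0.0 to 1.0 (A.U.), using the 0.5 (A.U.) offset implied by the example ranges above; the function name and the specific signal values are assumptions for illustration.

```python
def line_currents(signal_bl, signal_blb, offset):
    """Return (IBL, IBL-Bar): each line carries its signal plus a shared offset."""
    return signal_bl + offset, signal_blb + offset


OFFSET = 0.5  # shared offset current in A.U. (illustrative)

# Case 1: full-scale signal on BL, none on BLB.
i_bl, i_blb = line_currents(1.0, 0.0, OFFSET)
diff_max = i_bl - i_blb   # the shared offset cancels in the difference

# Case 2: the opposite extreme.
i_bl2, i_blb2 = line_currents(0.0, 1.0, OFFSET)
diff_min = i_bl2 - i_blb2

print((i_bl, i_blb, diff_max), (i_bl2, i_blb2, diff_min))
```

Each single-ended current spans 0.5 to 1.5 (A.U.), a range of 1.0, while the difference spans -1.0 to +1.0, a range of 2.0, independent of the offset value.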

FIG. 13 shows a schematic diagram of a chip 1300 including a neural network according to embodiments of the present invention. As shown, the chip 1300 may have a system-on-chip structure and may include a non-volatile neural network 1316, a CPU 1312 for controlling the elements on the chip 1300, a sensor 1314 for providing input signals to the non-volatile neural network 1316, and a memory 1318. In embodiments, the neural network 1316 may be similar to the neural network 100 of FIG. 1. In embodiments, the chip 1300 may be a silicon chip, and the components 1312 to 1318 may be integrated on the chip 1300.

FIG. 14 shows a schematic diagram of a system 1400 for operating a non-volatile synapse array according to embodiments of the present invention. As shown, the system 1400 may include a non-volatile synapse array 1410, a reference generator 1402, a configuration storage 1404, a row driver 1406 for selecting a row of synapses in the non-volatile synapse array 1410, a router/controller 1408, a column selector 1412 for selecting a column of synapses in the non-volatile synapse array 1410, a sensing circuit 1414, an accumulator 1416 for collecting output values from the non-volatile synapse array 1410, a normalization/activation/pooling function block 1418, and a data buffer 1420 for buffering data from the non-volatile synapse array 1410. In embodiments, the non-volatile synapse array 1410 may be similar to the non-volatile synapse array 200, and the sensing circuit 1414 may be similar to the sensing circuit 250 of FIG. 2.

The reference generator 1402 provides the voltage levels required for the reference signals (e.g., SL of FIGS. 2 to 8) and the input signal lines (e.g., WL of FIGS. 2 to 8) used by the row driver 1406. The configuration storage 1404 stores data for the finite state machine used by the router/controller 1408, the physical mapping of weight parameters to synapse locations in the synapse array 200, and other configuration parameters for the sensing circuit. In embodiments, the configuration storage may be implemented as an on-chip non-volatile memory. The router/controller 1408 implements a finite state machine to control the row selection sequences performed by the row driver 1406. The sensing circuit 1414 includes a voltage regulator and an analog-to-digital converter, and converts the output current signal from the selected column into a voltage signal and then into a digital value. The results from the sensing circuit are summed in the accumulator 1416. The normalization/activation/pooling function block 1418 performs the required signal processing operations on the accumulator value. Multiple dedicated DSPs or embedded CPU cores may be included to perform these numerical operations in parallel.

In some embodiments, a neural network design may binarize the values of the weights and input parameters to 1 or -1. In such embodiments, the synapse 600 may be modified so that a cross-coupled latch circuit is used in place of the pair of non-volatile resistance change elements. FIG. 15 shows a schematic diagram of another synapse 1500 according to embodiments of the present invention. As shown, the synapse 1500 may include a cross-coupled latch circuit 1510, which may include an inverter 1514 whose input terminal is electrically coupled to the output terminal of a second inverter 1518, and vice versa. In embodiments, the cross-coupled latch may store a digital signal at an S node located between the output of 1518 and the input of 1514, and at an SB node located between the output of 1514 and the input of 1518. In embodiments, when the S node holds an electrical signal value, the SB node may hold the complementary signal value, and vice versa, owing to the inverter coupling.

As shown in FIG. 15, each of the cells 1532 and 1534 of the synapse 1500 may include two input selection transistors (e.g., 1502 and 1506) whose gate terminals are electrically connected to two input signal (or word) lines, the word line WL and the word line bar WLB. The source terminals of the input selection transistors may be electrically coupled to a common node, which is further electrically coupled to a node of the cross-coupled latch circuit 1510. The cell 1532 may be electrically coupled to the SB node of the cross-coupled latch 1510, and the cell 1534 may be electrically coupled to the S node of 1510.

In embodiments, the drain terminal of transistor 1502 may be electrically connected to the output line BL and the drain terminal of transistor 1506 may be electrically connected to the output line BLB. Similarly, the drain terminals of transistors 1504 and 1508 may be electrically connected to BLB and BL, respectively.

In embodiments, the reference signal line SL may be electrically connected to each of the inverters 1514 and 1518 of the cross-coupled latch 1510, and the reference voltage input signal 201 may be provided to the inverters 1514 and 1518.

The cross-coupled latch 1510 may be implemented by various circuits (or memories), such as non-volatile components, or, when a power source (e.g., a battery) is available, by volatile memory components.

FIG. 16 shows a table of the relationship among the input voltage values on WL and WLB, the weight values represented by the voltage signals at the S and SB nodes, and the outputs represented by the current values on the BL and BLB lines. For the inputs in the table, (WL = High, WLB = Low) may be 1 and (WL = Low, WLB = High) may be -1. For the weights in the table, (SB = High, S = Low) may be 1 and (SB = Low, S = High) may be -1. For the inputs and weights in the table, a "Low" voltage value is a voltage value lower than the "High" voltage value. For the outputs in the table, (BL = Low, BLB = High) may be 1 and (BL = High, BLB = Low) may be -1. For the outputs in the table, a "Low" current value is a current value lower than the "High" current value.

In the table, the outputs on BL and BLB may represent the multiplication of the input (WL, WLB) by the weight (SB, S), where 1 × 1 = 1, 1 × -1 = -1, -1 × 1 = -1, and -1 × -1 = 1. Thus, the multiplication operation between a binarized input and a binarized weight may yield the arithmetically correct result.
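The encoding described for the table of FIG. 16 can be sketched in Python as a decode, multiply, encode sequence. The sketch models only the logical relationship stated above (which High/Low pairs stand for 1 and -1), not the circuit itself; the function names and the string labels "High" and "Low" are illustrative assumptions.

```python
HIGH, LOW = "High", "Low"


def decode_pair(first, second):
    """Map (High, Low) to 1 and (Low, High) to -1; reject other combinations."""
    if (first, second) == (HIGH, LOW):
        return 1
    if (first, second) == (LOW, HIGH):
        return -1
    raise ValueError("a differential pair must be (High, Low) or (Low, High)")


def binarized_synapse(wl, wlb, sb, s):
    """Return the (BL, BLB) current levels for input (WL, WLB) and weight (SB, S)."""
    product = decode_pair(wl, wlb) * decode_pair(sb, s)
    # Output encoding: (BL = Low, BLB = High) is 1, (BL = High, BLB = Low) is -1.
    return (LOW, HIGH) if product == 1 else (HIGH, LOW)


# Enumerate all four input/weight combinations.
for wl, wlb in ((HIGH, LOW), (LOW, HIGH)):
    for sb, s in ((HIGH, LOW), (LOW, HIGH)):
        print((wl, wlb), (sb, s), binarized_synapse(wl, wlb, sb, s))
```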

FIGS. 17, 18, and 19 show schematic diagrams of synapses 1700, 1800, and 1900, respectively, according to embodiments of the present invention. As shown in FIG. 17, the synapse 1700 may include only a cell 1732, which may correspond to the cell 632 of the synapse 600 shown in FIG. 6. Similarly, FIG. 18 shows a synapse 1800 that includes only a cell 1832, which may correspond to the cell 732 of the synapse 700 shown in FIG. 7. The synapse 1900 shown in FIG. 19 may include only a cell 1932, which may correspond to the cell 832 of the synapse 800 of FIG. 8. In the synapses 1700, 1800, and 1900, the negative weight w_neg may be equal to zero. That is, the negative weight may be removed from each of the synapses 600, 700, and 800. Because the WLB signal may provide a negative input signal to the BLB line, the BLB line may be retained.

In embodiments, the output signal current BLBo for the synapses 1700 and 1800 may be as follows.

[Equation 10]

BLBo = a_neg × w_pos

Similarly, the output signal current BLBo for the synapse 1900 may be as follows.

[Equation 11]

BLBo = ∑(a_neg_j × w_pos_i)

FIG. 20 shows a schematic diagram of a synapse 2000 according to embodiments of the present invention. As shown, the synapse 2000 may be similar to the synapse 300, with the difference that only the cell 2032 holding the positive weight (which may correspond to the cell 332 of FIG. 3) is included in the synapse 2000, and the cell 334 and the BLB line 267 of FIG. 3 may be removed.

FIG. 21 shows a schematic diagram of a synapse 2100 according to embodiments of the present invention. As shown, the synapse 2100 may be similar to the synapse 400, with the difference that only the cell 2132 (which may correspond to the cell 432 of FIG. 4) is used, and the cell 434 and the BLB output line of FIG. 4 may be removed.

FIG. 22 shows a schematic diagram of a synapse 2200 according to embodiments of the present invention. As shown, the synapse 2200 may be similar to the synapse 500 of FIG. 5, with the difference that only the cell 2232 (corresponding to the cell 532 of FIG. 5) is used, and the cell 534 and the BLB output line of FIG. 5 may be removed.

FIGS. 17 to 22 disclose synapses that may be arranged in a two-dimensional array format as shown in FIG. 2. That is, the synapses of FIGS. 17 to 22 may correspond to the synapse 210.

According to the present invention, a logic-friendly NVM in embodiments refers to a non-volatile memory component (with zero standby power) that can be produced with fewer processing steps than conventional NVM components such as split-gate flash memory or EEPROM. Because the NVM of embodiments requires only a few additional processing steps beyond those of the logic components of a CPU or neural network computation engine, it is feasible to embed the NVM of embodiments on the same chip as the CPU or neural network engine. In contrast, embedding conventional NVM components on the same chip as a CPU or neural network engine is not feasible because of the excessive additional processing required to produce such a chip.

Examples of logic-friendly NVMs used in embodiments include STT-MRAM, RRAM, PRAM, and FeFET components, which require only a few more processing steps than logic components. Another example of a logic-friendly NVM in embodiments is single-poly embedded flash memory. Single-poly flash memory requires no additional processing compared to logic components and is particularly suitable for embedding on the same chip as a CPU or neural network engine. Like conventional NVMs, logic-friendly NVMs can retain stored data when the power is off.

In the conventional neural network system shown in FIG. 23, an external NVM chip 2319 is attached separately to a system-on-chip (SoC) 2310 that integrates various circuit blocks, such as a CPU 1312, a sensor 1314, and a neural network computing engine 2320, connected through a system bus 2330.

The CPU 1312 and the sensor 1314 are numbered like the corresponding components of FIG. 13. The neural network weight parameters are stored in the external NVM chip 2319 when the system is powered off. Accessing the external NVM chip 2319 is slow because the performance of the system bus 2330 is limited by the pin count of the SoC 2310.

Accessing the external NVM also consumes a large amount of power because of the capacitance of the external wires. In addition, security becomes an issue when privacy-related neural network parameters are transmitted between the SoC 2310 and the external NVM 2319.

FIG. 24 shows a layered system for a neural network according to the present invention, consisting of the SoC 1300 shown in FIG. 13 and an external neural network accelerator device 2470. In an embodiment, the on-chip non-volatile neural network module 1316 is integrated with the CPU 1312, sensor 1314, and memory 1318 blocks within the SoC 1300 via a high-performance system bus.

In an embodiment, the width of the high-performance system bus 2430 is not limited by the pin count of the SoC 1300. Communication over the high-performance system bus 2430 is therefore much faster than over the prior art system bus 2330 shown in FIG. 23. The external neural network accelerator device 2470 can be connected via an off-chip interconnect 2480, which can be accessed through a local wired connection or remotely. Local wired access methods may include TSV, 3D stacking, wire bonding, or wiring through a PCB. Remote access methods may include LAN, Wi-Fi, and Bluetooth. The external neural network accelerator device may include its own CPU and high-density memory (DRAM, flash memory, SCM, etc.) and may be located in a cloud server.
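To make the bus-width argument concrete, the short Python sketch below compares the peak transfer rate of a narrow, pin-limited bus with that of a wide on-chip bus, using the usual first-order estimate of width times transfer clock. The widths and clock rate are hypothetical values chosen only for illustration; the patent does not specify them.

```python
def peak_bandwidth_bits_per_s(bus_width_bits: int, clock_hz: float) -> float:
    """First-order peak transfer rate of a parallel bus: width times transfers per second."""
    return bus_width_bits * clock_hz

# Hypothetical figures (assumptions, not values from the patent).
off_chip = peak_bandwidth_bits_per_s(bus_width_bits=8, clock_hz=100e6)    # pin-limited external bus
on_chip = peak_bandwidth_bits_per_s(bus_width_bits=512, clock_hz=100e6)   # wide on-chip bus
speedup = on_chip / off_chip
print(off_chip, on_chip, speedup)
```

At the same clock rate, the gain scales directly with the bus width, which is why removing the pin-count limit matters.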

In an embodiment, by dividing the entire neural network between the SoC 1300 and the external neural network accelerator device 2470, certain critical layers can be executed within the SoC 1300 using the non-volatile neural network module 1316, while the remaining layers are executed by the external neural network accelerator device 2470, which may use low-cost, high-density memory such as 3D-NAND. For example, the initial layers of the neural network can be processed on-chip and the remaining layers can be processed by the external neural network accelerator device 2470. Because only the features extracted or coded by the on-chip non-volatile neural network are communicated off-chip, the amount of data communicated externally can be reduced compared with the case where the SoC has no neural network module. Because the parameters required for execution are stored in the on-chip non-volatile neural network 1316, the intermediate results of the on-chip neural network can provide low-latency partial results that may be useful for early prediction of the final result. By communicating off-chip between the SoC 1300 and the external neural network accelerator device 2470 using only coded information, privacy concerns are significantly reduced.
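The layer partitioning described above can be sketched in Python as follows. This is a minimal illustration, assuming a toy fully connected network with ReLU activations; the layer sizes, weights, and function names are hypothetical and do not come from the patent. The point it shows is that only the coded features leave the chip, and that there are fewer of them than raw input values.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]

def dense_relu(x: Vector, w: Matrix) -> Vector:
    """One fully connected layer followed by ReLU; w holds one row per output."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

def run_layers(x: Vector, layers: List[Matrix]) -> Vector:
    """Apply a list of layers in order."""
    for w in layers:
        x = dense_relu(x, w)
    return x

# Hypothetical split: the initial layer runs on-chip, the remaining layer off-chip.
on_chip_layers = [[[0.5] * 8, [-0.25] * 8]]                   # 8 inputs -> 2 coded features
off_chip_layers = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]     # 2 features -> 3 outputs

raw_input = [1.0] * 8                                        # stays on the SoC
coded_features = run_layers(raw_input, on_chip_layers)       # only this crosses the chip boundary
final_output = run_layers(coded_features, off_chip_layers)   # computed by the external accelerator
print(len(raw_input), len(coded_features), final_output)
```

In this toy case two values cross the chip boundary instead of eight, and the raw input itself is never transmitted.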

FIG. 25 illustrates a distributed neural network system composed of multiple dies of SoCs 1300a and 1300b according to the present invention. In an embodiment, the SoCs 1300a and 1300b are similar to the SoC 1300 described in FIGS. 13 and 24. The off-chip interconnect 2480 is similar to that of FIG. 24. By dividing the entire neural network across several SoC devices, neural network computation can be performed in parallel to improve performance. For example, some initial layers can be processed by the on-chip neural network module of one SoC and the remaining layers can be processed by another SoC. Only the features extracted or coded in the first SoC are communicated off-chip. Because the parameters required for execution are stored in each on-chip non-volatile neural network 1316, the intermediate results of the first SoC can provide low-latency partial results that may be useful for early prediction of the final result. By communicating off-chip between the SoCs 1300a and 1300b using only coded information, privacy concerns are significantly reduced.

FIG. 26 shows a system-on-chip according to the present invention in which a logic-friendly NVM 2619 is integrated into an SoC 2600 together with other circuit blocks, such as the CPU 1312, sensor 1314, and neural network computing engine 2320, and connected through the high-performance system bus 2430. Similarly numbered components correspond to the components of FIG. 23. In embodiments, by integrating a moderate-density neural network computing engine with the logic-friendly NVM 2619 in the SoC, energy dissipation and latency overhead can also be improved compared with the prior art design of FIG. 23. Security issues caused by external NVM access are also reduced. The single logic chip solution of the embodiments is economical and is attractive for IoT applications, as it features logic-compatible embedded flash that stores neural network parameters securely.

In an embodiment, the bus width is not limited by the number of pins available on the chip. A wide-I/O, low-latency memory interface can therefore be used for communication between the logic-friendly NVM and the other blocks of the SoC 2600. As a result, the neural network computing engine 2320 can access data in the logic-friendly NVM 2619 faster than prior art systems that use external flash memory.

FIG. 27 shows a neural network system of the present invention in which a logic-friendly NVM 2719 is integrated within a neural network engine 2720 in an SoC 2700. The neural network computing engine 2720 is similar to the neural network computing engine 2620 of FIG. 26. The neural network computing engine 2720 can access the logic-friendly NVM 2719 without CPU intervention, with improved performance and power efficiency compared to the prior art of FIG. 26.

The proposed configurations of the present invention with on-chip non-volatile neural networks, described in FIGS. 24 to 27, have various advantages over the prior art, such as lower power consumption and higher performance. In addition, in the embodiments, privacy concerns are significantly reduced by restricting off-chip access when private user data is used to run the neural network.

Although the present invention is susceptible to various modifications and alternative forms, specific examples are shown in the drawings and described in detail herein. It should be understood, however, that the present invention is not limited to the particular forms disclosed; on the contrary, the present invention covers all modifications, equivalents, and alternatives falling within the scope of the appended claims.

Claims

a first input signal line for providing a first signal;
a second input signal line for providing a second signal;
a reference signal line providing a reference signal;
a first output line for conveying an output signal; and
A cell for generating the output signal;
the cell,
a first upper select transistor having a gate electrically connected to the first input signal line;
a second upper select transistor having a gate electrically connected to the second input signal line;
a first resistive change element having one end connected in series to the first upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the first resistive change element is programmable to change a magnitude of an output signal;
a second resistive change element having one end connected in series to the second upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the second resistive change element is programmable to change the magnitude of an output signal;
A non-volatile synapse circuit, characterized in that the drain of the first upper select transistor and the drain of the second upper select transistor of the cell are electrically connected to the first output line.

According to claim 1,
the non-volatile synaptic circuit further including:
a program line for providing a programming signal;
a write line for providing a write signal; and
an erase line for providing an erase signal,
wherein the first resistive change element includes: a coupling transistor and a write transistor arranged to have a floating gate node, the coupling transistor being electrically connected to the program line and the write transistor being electrically connected to the write line; and a read transistor and a lower select transistor arranged in series with the upper select transistor, the lower select transistor having a source electrically connected to the reference signal line and a gate electrically connected to the erase line, and the read transistor having a gate electrically connected to the floating gate node.

delete

a first input signal line for providing a first signal;
a second input signal line for providing a second signal;
a reference signal line providing a reference signal;
a first output line for conveying an output signal; and
A cell for generating the output signal;
the cell,
a first upper select transistor having a gate electrically connected to the first input signal line;
a second upper select transistor having a gate electrically connected to the second input signal line; and
a first resistive change element having one end connected in series to the first upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the first resistive change element is programmable to change a magnitude of an output signal; including,
The second upper selection transistor has a source electrically connected to the first resistance change element;
A source of the first upper select transistor and a source of the second upper select transistor are directly connected to a first common node;
A drain of a first upper select transistor of the cell is electrically connected to the first output line;
A non-volatile synaptic circuit, characterized in that the drain of the second upper selection transistor of the cell is electrically connected to the second output line.

According to claim 4,
the non-volatile synaptic circuit further including:
a program line for providing a programming signal;
a write line for providing a write signal; and
an erase line for providing an erase signal,
wherein the first resistive change element includes: a coupling transistor and a write transistor arranged to have a floating gate node, the coupling transistor being electrically connected to the program line and the write transistor being electrically connected to the write line; and a read transistor and a lower select transistor arranged in series, the lower select transistor having a source electrically connected to the reference signal line and a gate electrically connected to the erase line, and the read transistor having a gate electrically connected to the floating gate node and a source directly connected to the first common node.

According to claim 4,
a third input signal line for providing a third input signal; and
a fourth input signal line for providing a fourth input signal;
are further included,
the cell,
a third upper select transistor having a gate electrically connected to the third input signal line;
A fourth upper select transistor having a gate electrically connected to the fourth input signal line;
A source of the third upper select transistor and a source of the fourth upper select transistor are directly connected to a second common node;
The second resistance change element has one end connected to the second common node and the other end electrically connected to the reference signal line, and a value of the second resistance change element is programmable to change the magnitude of an output signal;
A drain of the third upper select transistor of the cell is electrically connected to the first output line;
A non-volatile synapse circuit, characterized in that the drain of the fourth upper selection transistor of the cell is electrically connected to the second output line.

a first input signal line for providing a first input signal;
a second input signal line for providing a second input signal;
a reference signal line for providing a reference signal;
first and second output lines for conveying first and second output signals;
a cross-coupled latch circuit for storing an electrical signal, the cross-coupled latch circuit including first and second inverters each having an input terminal and an output terminal, the input terminal of the first inverter being connected to the output terminal of the second inverter at a first signal node, and the output terminal of the first inverter being connected to the input terminal of the second inverter at a second signal node; and
first and second cells for generating the first and second output signals, respectively;
Each first and second cell:
a first upper select transistor having a gate electrically connected to the first input signal line;
a second upper select transistor having a gate electrically connected to the second input signal line; and
a source of a first upper select transistor and a source of a second upper select transistor that are directly connected to a common node;
the common node of the first cell is electrically coupled to the first signal node of the cross-coupled latch circuit and the common node of the second cell is electrically coupled to the second signal node of the cross-coupled latch circuit;
wherein:
The drain of the first upper select transistor of the first cell is electrically connected to the first output line and the drain of the first upper select transistor of the second cell is electrically connected to the second output line;
The drain of the second upper select transistor of the first cell is electrically connected to the second output line and the drain of the second upper select transistor of the second cell is electrically connected to the first output line;
The first and second inverters of the cross-coupled latch circuit are electrically coupled to the reference signal line.

According to claim 7,
A synapse circuit, characterized in that the cross-coupled latch circuit is implemented as a non-volatile memory circuit.

a central processing unit for controlling the elements of the chip;
a non-volatile neural network unit;
a sensor providing an input signal to the non-volatile neural network; and
a memory unit,
The central processing unit, sensor, memory unit and non-volatile neural network unit are electrically coupled,
The non-volatile neural network unit,
a synapse array including a plurality of non-volatile synapses;
a row driver;
a reference generator; and
a sensing circuit;
The synapse array, row driver, reference generator and sensing circuit are electrically coupled;
Each non-volatile synapse,
a first input signal line providing a first input signal from a row driver, wherein the reference generator provides a desired voltage level to the row driver;
a reference signal line for providing a reference signal from the row driver, wherein the reference generator provides the required voltage level to the row driver;
a first output line for conveying an output signal, the output signal being processed by the sensing circuit; and
a first cell for generating an output signal;
each cell,
a first upper select transistor having a gate electrically connected to the first input signal line; and
a first resistive change element having one end connected in series to the first upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the first resistive change element is programmable to change a magnitude of an output signal;
Here, the neural network chip, characterized in that the drain of the first upper select transistor of the first cell is electrically connected to the first output line.

According to claim 9,
the neural network chip, wherein each non-volatile synapse of the synapse array further includes:
a program line for providing a programming signal from the row driver, wherein the reference generator provides the required voltage level to the row driver;
a write line for providing a write signal from the row driver, wherein the reference generator provides the required voltage level to the row driver; and
an erase line for providing an erase signal from the row driver,
and wherein the first resistive change element includes:
a coupling transistor and a write transistor arranged to have a floating gate node, wherein the coupling transistor is electrically coupled to the program line and the write transistor is electrically coupled to the write line; and
a read transistor and a lower select transistor arranged in series with the upper select transistor, the lower select transistor having a source electrically connected to the reference signal line and a gate electrically connected to the erase line, and the read transistor having a gate electrically coupled to the floating gate node.

According to claim 9,
Each non-volatile synapse of the synapse array,
a second input signal line providing a second input signal from the row driver;
the cell,
a second upper select transistor having a gate electrically connected to the second input signal line;
A second resistive change element having one end connected in series to the second upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the second resistive change element is programmable to change a magnitude of an output signal;
wherein the drain of the second upper select transistor of the cell is electrically coupled to the first output line.

According to claim 9,
Each non-volatile synapse of the synapse array,
a second input signal line for providing a second input signal from the row driver, wherein the reference generator provides a desired voltage level to the row driver; and
a second output line for conveying a second output signal, the second output signal being processed by the sensing circuit;
the cell,
a second upper select transistor having a gate electrically coupled to the second input signal line; and
a source of the second upper selection transistor electrically coupled to the first resistance change element, wherein a source of the first upper selection transistor and a source of the second upper selection transistor are directly connected to a first common node;
A neural network chip, wherein the drain of the second upper select transistor of the first cell is electrically coupled to the second output line.

According to claim 12,
the neural network chip, wherein each non-volatile synapse of the synapse array further includes:
a program line for providing a programming signal from the row driver, wherein the reference generator provides the desired voltage level to the row driver;
a write line for providing a write signal from the row driver, wherein the reference generator provides the required voltage level to the row driver; and
an erase line for providing an erase signal from the row driver, wherein the reference generator provides the required voltage level to the row driver,
and wherein the first resistive change element includes:
a coupling transistor and a write transistor arranged to have a floating gate node, wherein the coupling transistor is electrically connected to the program line and the write transistor is electrically connected to the write line; and a read transistor and a lower select transistor arranged in series with the first upper select transistor, wherein the lower select transistor has a source electrically connected to the reference signal line and a gate electrically connected to the erase line, and the read transistor has a gate electrically connected to the floating gate node.

According to claim 12,
each non-volatile synapse in the synapse array further includes:
a third input signal line for providing a third input signal from the row driver, wherein the reference generator provides the desired voltage level to the row driver; and
a fourth input signal line for providing a fourth input signal from the row driver, wherein the reference generator provides the required voltage level to the row driver,
the cell further includes:
a third upper select transistor having a gate electrically connected to the third input signal line;
a fourth upper select transistor having a gate electrically connected to the fourth input signal line, wherein a source of the third upper select transistor and a source of the fourth upper select transistor are directly connected to a second common node; and
a second resistive change element having one end connected to the second common node and the other end electrically coupled to the reference signal line, wherein a value of the second resistive change element is programmable to change the magnitude of an output signal,
A drain of a third upper select transistor of the cell is electrically coupled to the first output line;
The neural network chip, characterized in that the drain of the fourth upper select transistor of the cell is electrically coupled to the second output line.

a central processing unit;
a sensor unit;
a neural network engine; and
a logic-friendly non-volatile memory unit,
The central processing unit, the sensor unit, the logic-friendly non-volatile memory unit, and the neural network engine are connected by a system bus,
The neural network engine,
a synapse array including a plurality of non-volatile synapses;
a row driver;
a reference generator; and
a sensing circuit;
The synapse array, row driver, reference generator and sensing circuit are electrically coupled;
Each non-volatile synapse,
a first input signal line providing a first input signal from a row driver, wherein the reference generator provides a desired voltage level to the row driver;
a reference signal line for providing a reference signal from the row driver, wherein the reference generator provides the required voltage level to the row driver;
a first output line for conveying an output signal, the output signal being processed by the sensing circuit; and
a first cell for generating an output signal;
each cell,
a first upper select transistor having a gate electrically connected to the first input signal line; and
a first resistive change element having one end connected in series to the first upper select transistor and the other end electrically connected to the reference signal line, wherein a value of the first resistive change element is programmable to change a magnitude of an output signal;
Here, the drain of the first upper selection transistor of the first cell is electrically connected to the first output line,
The on-chip neural network system, characterized in that the neural network engine can execute data stored in the logic-friendly non-volatile memory without transmitting it to the outside of the chip.

According to claim 15,
The logic-friendly non-volatile memory unit is embedded in the neural network engine,
The on-chip neural network system, characterized in that the neural network engine can access data stored in the logic-friendly non-volatile memory without intervention of the central processing unit.

a central processing unit for controlling the elements of the chip;
a non-volatile neural network unit;
a sensor for providing an input signal to the non-volatile neural network; and
a memory unit,
The central processing unit, the sensor, the memory unit, and the non-volatile neural network unit are electrically coupled, and the neural network chip is characterized in that the non-volatile neural network unit controls operating parameters, such as voltage and frequency, of the central processing unit, the memory unit, and the sensor.