KR20210051288A

KR20210051288A - Neural network learning device comprising analog batch normalization device

Info

Publication number: KR20210051288A
Application number: KR1020190136351A
Authority: KR
Inventors: 이병근; 기상균; 여인준
Original assignee: 광주과학기술원
Priority date: 2019-10-30
Filing date: 2019-10-30
Publication date: 2021-05-10
Also published as: KR102293820B1

Abstract

Disclosed is a neural network learning device that can reduce the load on hardware by reducing the amount of computation performed by a processor. According to various embodiments, the neural network learning device includes an input unit for inputting training data for training an artificial neural network model, and a learning processor for training the artificial neural network model using the training data. The artificial neural network model includes a plurality of layers that receive a signal corresponding to the data received from the input unit, and at least one batch normalization device interposed between the plurality of layers. The batch normalization device includes a resistive memory array (RRAM array) that performs a convolution operation on a first analog signal received from a first layer of the plurality of layers, and a plurality of active elements that normalize a second analog signal produced by the convolution operation in the resistive memory array and transmit a third analog signal to a second layer.

Description

Neural network learning device including an analog batch normalization device {NEURAL NETWORK LEARNING DEVICE COMPRISING ANALOG BATCH NORMALIZATION DEVICE}

Various embodiments relate to a neural network learning device including an analog batch normalization device. Specifically, the neural network learning device may include a resistive memory array for batch normalization and a circuit arrangement having active elements for normalization.

As technology related to artificial intelligence advances, artificial intelligence is being applied in a variety of industries. Recent improvements in the performance of processing devices such as processors and dedicated accelerators have made it possible to run neural network learning devices and neural network learning algorithms that require a large amount of computation, so that electronic devices applying artificial intelligence can now perform functions that were previously limited.

Object recognition algorithms that use deep learning employ a convolutional neural network (CNN) or deep neural network (DNN) model composed of convolution layers.

A model that uses convolutional layers has the problem that it is strongly affected by changes in the distribution of the input values of each layer.

For example, a change in the distribution of the input values causes a difference in the distribution of the output values of the next layer and increases the amount of error in the result. Because of such changes in distribution, an artificial neural network learning device is at risk of vanishing or exploding weights during training. To address this, artificial neural network learning devices introduce a normalization algorithm.

Because the batch normalization method performs learning while normalizing in units of batches, it can reduce internal covariate shift, and the results also converge quickly, so the method is widely used. The covariate shift problem caused by each internal covariate, and the resulting spread of output values, can be addressed by normalizing the output values of each layer.
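The normalization described above can be illustrated with a minimal numerical sketch. It is not taken from the disclosure: the function name `batch_normalize`, the small constant `eps`, and the sample values are assumptions chosen for illustration, and the learned scale and shift parameters commonly used with batch normalization are omitted.

```python
import math

def batch_normalize(batch, eps=1e-5):
    """Normalize one feature across a mini-batch to zero mean and unit variance."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance over the batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps is an assumed small constant that avoids division by zero.
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

# Hypothetical output values of one layer for a batch of four samples.
outputs = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(outputs)
```

After normalization the values are centered on zero with a variance close to one, regardless of the scale of the original outputs.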

After a convolutional neural network is trained with batch normalization applied from the learning stage, batch normalization can be applied to each output value or input value of each layer of the convolutional neural network hardware.

Because the batch normalization method uses statistics of the data in each batch, it requires many operations and storage space for the computed values. Because of this extensive computation, batch normalization is implemented through approximation. Excessive approximation can degrade the performance of the entire neural network learning system.

A digital batch normalization method requires a digital signal processor (DSP) between layers, so it demands additional hardware, and the signal path may become complicated as a result.

Therefore, even when the batch normalization method is used, a neural network learning device needs a way to reduce excessive approximation and to reduce hardware such as digital signal processors.

A neural network learning device according to various embodiments includes an input unit for inputting training data for training an artificial neural network model, and a learning processor for training the artificial neural network model using the training data. The artificial neural network model includes a plurality of layers that receive a signal corresponding to the data received from the input unit, and at least one batch normalization device disposed between the plurality of layers. The batch normalization device may include a resistive memory array (RRAM array) that performs convolution processing on a first analog signal transmitted from a first layer among the plurality of layers, and a circuit arrangement including a plurality of active elements that normalize a second analog signal convolution-processed by the resistive memory array and transmit a third analog signal to a second layer.
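The signal flow summarized above (first analog signal, resistive memory array, normalized signal) can be sketched numerically. This is only an idealized model under stated assumptions, not the circuit of the disclosure: it assumes a crossbar in which each column current is the sum of row voltage times cell conductance, and it models the normalization stage as a plain zero-mean, unit-variance operation across a batch. The names `crossbar_mac` and `normalize_columns`, the constant `eps`, and all numeric values are hypothetical.

```python
import math

def crossbar_mac(voltages, conductances):
    """Column currents of an idealized crossbar: I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(n_cols)]

def normalize_columns(currents_batch, eps=1e-9):
    """Normalize each column current across the batch (zero mean, unit variance)."""
    n = len(currents_batch)
    n_cols = len(currents_batch[0])
    out = [[0.0] * n_cols for _ in range(n)]
    for j in range(n_cols):
        col = [sample[j] for sample in currents_batch]
        mean = sum(col) / n
        var = sum((c - mean) ** 2 for c in col) / n
        for k in range(n):
            out[k][j] = (col[k] - mean) / math.sqrt(var + eps)
    return out

# Assumed units: conductances in microsiemens, voltages in volts,
# so the column currents come out in microamperes.
G = [[1.0, 2.0],
     [3.0, 1.0]]
batch_voltages = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]]  # "first analog signal"
currents = [crossbar_mac(v, G) for v in batch_voltages]  # "second analog signal"
normalized = normalize_columns(currents)                 # "third analog signal"
```

In this model the weighted sum is a direct consequence of Ohm's law and current summation on each column, which is why a crossbar can carry out the multiply-accumulate step without a digital processor.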

An electronic device according to various embodiments includes a batch normalization device for normalizing an artificial neural network model. The batch normalization device may include a resistive memory array (RRAM array) that performs convolution processing on a first analog signal transmitted from a first layer among a plurality of layers, a circuit arrangement including a plurality of active elements that normalize a second analog signal convolution-processed by the resistive memory array and transmit a third analog signal to a second layer, and a control unit configured to control the circuit arrangement.

The neural network learning device according to various embodiments implements batch normalization with active elements and can therefore output an analog signal, which reduces the use of digital signal processing devices.

The neural network learning device according to various embodiments can reduce the amount of computation of a processor and thereby reduce the load applied to hardware.

The neural network learning device according to various embodiments can implement batch normalization without degrading network performance.

The neural network learning device according to various embodiments can reduce the covariate shift problem caused by internal covariates of the resistive memory.

FIG. 1 is a block diagram of an artificial neural network learning device according to various embodiments.
FIG. 2 is a conceptual diagram of an artificial neural network model constituting a neural network learning device according to various embodiments.
FIGS. 3A and 3B illustrate a normalization operation for reducing the error that occurs.
FIG. 4 is a block diagram of a kernel matrix included in an artificial neural network model according to various embodiments.
FIG. 5 is a conceptual diagram of a resistive memory array implementing a kernel matrix included in an artificial neural network model according to various embodiments.
FIGS. 6A and 6B are circuit diagrams of a batch normalization device according to various embodiments.
FIGS. 7A to 7E illustrate switching operations at each step of a neural network learning device according to various embodiments.
FIG. 8 is a graph comparing deviations of result values obtained through a neural network learning device according to various embodiments.

FIG. 1 is a block diagram of a neural network learning device 100 according to various embodiments.

An artificial neural network may refer to hardware, software, or a combination thereof for providing a generalized output for an input.

For example, an artificial neural network may operate based on an application for simulating a convolutional neural network (CNN), a Markov chain, a binarized neural network (BNN), or the like, and a processor for executing the application.

Referring to FIG. 1, the neural network learning device 100 is a device capable of performing machine learning through training, and may include a device that learns using a model composed of an artificial neural network. For example, the neural network device 100 may be configured to input, output, build a database of, and store information used for data mining, data analysis, and machine learning algorithms (e.g., deep learning algorithms).

The neural network device 100 may transmit and receive data to and from an external electronic device (not shown) through a communication unit (not shown), and may derive a result by analyzing or learning from the data received from the external electronic device. The neural network device 100 may take on part of the computation of the external electronic device in a distributed manner.

The neural network device 100 may be implemented as a server. A plurality of neural network devices 100 may also be configured to form a neural network device set. Each neural network device 100 may process computation in a distributed manner and may derive a result through data analysis and learning based on the distributed data. The neural network device 100 may transmit a result obtained using a machine learning algorithm or the like to an external electronic device or to another neural network device.

According to various embodiments, the neural network device 100 may include an input unit 110, a processor 120, a memory 130, and a learning processor 140.

According to various embodiments, the input unit 110 may obtain input data for deriving an output value through training of an artificial neural network model. The input unit 110 may obtain raw input data. The processor 120 or the learning processor 140 may preprocess the raw input data to generate training data that can be fed into training of the artificial neural network model. The preprocessing may consist of extracting feature points from the input data. As described above, the input unit 110 may receive data through the communication unit (not shown) to obtain the input data, or may preprocess the data.

According to various embodiments, the processor 120 may collect usage history information from the neural network learning device 100 and store it in the memory 130. The processor 120 may determine the best combination for executing a specific function using the stored usage history information and predictive modeling. The processor 120 may receive image information, audio information, data, or user input information from the input unit 110.

According to various embodiments, the processor 120 may collect information in real time, process or classify the information, and store the processed information in the memory 130, a database of the memory 130, or the learning processor 140.

According to various embodiments, when an operation of the neural network learning device 100 is determined based on data analysis and a machine learning algorithm, the processor 120 may control the components of the neural network learning device 100 to execute the determined operation. The processor 120 may then perform the determined operation by controlling the neural network learning device 100 according to a control command.

When a specific operation is performed, the processor 120 may analyze history information indicating execution of the specific operation through data analysis and machine learning algorithms and techniques, and may update previously learned information based on the analyzed information. Together with the learning processor 140, the processor 120 may improve the accuracy and performance of data analysis and machine learning algorithms based on the updated information.

According to various embodiments, the memory 130 may store input data obtained by the input unit 110, learned data, a learning history, and the like. The memory 130 may store the artificial neural network model 131.

According to various embodiments, the artificial neural network model 131 may be stored in a space allocated in the memory 130. The space allocated in the memory 130 stores the artificial neural network model 131 that is being trained or has been trained by the learning processor 140, and when the artificial neural network model 131 is updated through learning, the updated artificial neural network model 131 may be stored. The space allocated in the memory 130 may store the learned model divided into a plurality of versions according to the time of learning, the learning progress, or the like.

According to various embodiments, the memory 130 may include a database capable of storing and classifying the input data obtained by the input unit 110 and the learned data.

According to various embodiments, the learning processor 140 may train the artificial neural network model 131 by directly obtaining the data that the processor 120 produced by preprocessing the input data obtained through the input unit 110, or by obtaining preprocessed input data stored in the database of the memory 130. For example, the learning processor 140 may repeatedly train the artificial neural network model 131 using various learning techniques to obtain optimized parameters of the artificial neural network model 131.

According to various embodiments, the learned model may update the artificial neural network model 131 in the database. The learning processor 140 may be integrated into the neural network learning device 100 or may be implemented in the memory 130. Specifically, the learning processor 140 may be implemented using the memory 130.

According to various embodiments, the learning processor 140 may generally be configured to store data in one or more databases in order to identify, index, categorize, manipulate, store, retrieve, and output data for use in supervised or unsupervised learning, data mining, predictive analytics, or other devices. Here, the database may be implemented using the memory 130, a memory maintained in a cloud computing environment, or another remote memory location accessible by the terminal through a communication method such as a network. The information stored in the learning processor 140 may be used by the processor 120 using any of a variety of different types of data analysis algorithms and machine learning algorithms. Examples of such algorithms include k-nearest neighbor systems, fuzzy logic (e.g., possibility theory), neural networks, Boltzmann machines, vector quantization, pulsed neural networks, support vector machines, maximum margin classifiers, hill climbing, inductive logic systems, Bayesian networks, Petri nets (e.g., finite state machines, Mealy machines, Moore finite state machines), classifier trees (e.g., perceptron trees, support vector trees, Markov trees, decision tree forests, random forests), readout models and systems, artificial fusion, sensor fusion, image fusion, reinforcement learning, augmented reality, pattern recognition, and automated planning.

The external electronic device or the electronic device according to various embodiments disclosed in this document may be any of various types of devices. The electronic device may include, for example, a portable communication device (e.g., a smartphone), a computer device, a portable medical device, a camera, a wearable device, or a home appliance. The electronic device according to the embodiments of this document is not limited to the devices described above.

The various embodiments of this document and the terms used in them are not intended to limit the technical features described in this document to specific embodiments, and should be understood to include various modifications, equivalents, or substitutes of the corresponding embodiments. In connection with the description of the drawings, similar reference numerals may be used for similar or related components. The singular form of a noun corresponding to an item may include one or more of the items unless the relevant context clearly indicates otherwise. In this document, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B, or C", "at least one of A, B, and C", and "at least one of A, B, or C" may include all possible combinations of the items listed together in the corresponding phrase. Terms such as "first" and "second" may be used simply to distinguish one component from another, and do not limit the components in other respects (e.g., importance or order). When a (e.g., first) component is referred to as "coupled" or "connected" to another (e.g., second) component, with or without the terms "functionally" or "communicatively", it means that the component may be connected to the other component directly (e.g., by wire), wirelessly, or via a third component.

The term "module" used in this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally configured component, or a minimum unit of the component, or a part thereof, that performs one or more functions. For example, according to an embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

Various embodiments of this document may be implemented as software (e.g., a program 140) including one or more instructions stored in a storage medium (e.g., an internal memory 136 or an external memory 138) readable by a machine (e.g., an electronic device 101). For example, a processor (e.g., the processor 120) of the machine (e.g., the electronic device 101) may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to be operated to perform at least one function according to the at least one called instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" only means that the storage medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between a case where data is stored semi-permanently in the storage medium and a case where it is stored temporarily.

According to an embodiment, a method according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product may be at least temporarily stored in, or temporarily generated in, a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

According to various embodiments, each of the components described above (e.g., a module or a program) may include a single entity or a plurality of entities. According to various embodiments, one or more of the components or operations described above may be omitted, or one or more other components or operations may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as they were performed by the corresponding component before the integration. According to various embodiments, operations performed by a module, a program, or another component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

FIG. 2 is a conceptual diagram of an artificial neural network model constituting a neural network learning device according to various embodiments.

Referring to FIG. 2, a typical convolutional neural network may be used for feature extraction 11 from input data 210, using a convolutional layer 201 and a pooling layer 202, and for classification 12 of the input data 210, using a fully connected layer 290.

According to various embodiments, the convolutional layer 201 may be a layer that extracts meaningful features of the input data 210 through a convolution operation. For example, the convolutional layer 201 may generate new data to be passed to the next layer by applying a filter of a specific size, or kernel (weight) matrix 230, to the input data 210. The input and output data of such a convolutional layer 201 may be referred to as feature maps.
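As a rough illustration of applying a kernel matrix to input data, the sketch below slides a small kernel over a single-channel input. The function name `conv2d`, the stride of 1, the absence of padding, and the sample values are assumptions, not details from the disclosure. As in common CNN practice, the kernel is applied without flipping (cross-correlation).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 3x3 input and 2x2 kernel (weight) matrix.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d(image, kernel)  # 2x2 output feature map
```

Each output element is a weighted sum over one window of the input, which is the multiply-accumulate pattern that the resistive memory array is described as carrying out in the analog domain.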

When the data input to the convolutional neural network model is an input image that includes a plurality of components, such as RGB components, the input data may consist of a plurality of channels. For example, the input and output data of the convolutional layer 201 may include channels in addition to the space of a two-dimensional image, so that the feature maps of the input and output data may have a three-dimensional form.

According to various embodiments, the pooling layer 202 may reduce the input data through sub-sampling. For example, the pooling layer 202 may reduce the size of the data by sampling it with a pooling technique such as max pooling or average pooling.
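The two pooling techniques named above can be sketched as follows. The helper names `pool2d` and `mean`, the non-overlapping 2x2 window, and the sample feature map are assumptions made for illustration.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[reduce_fn([feature_map[r * size + i][c * size + j]
                        for i in range(size) for j in range(size)])
             for c in range(cols)]
            for r in range(rows)]

def mean(values):
    return sum(values) / len(values)

# Hypothetical 4x4 feature map reduced to 2x2.
feature_map = [[1, 3, 2, 4],
               [5, 7, 6, 8],
               [9, 11, 10, 12],
               [13, 15, 14, 16]]
max_pooled = pool2d(feature_map, 2, max)   # [[7, 8], [15, 16]]
avg_pooled = pool2d(feature_map, 2, mean)  # [[4.0, 5.0], [12.0, 13.0]]
```

Both variants shrink each spatial dimension by the window size; max pooling keeps the strongest response in each window, while average pooling keeps the mean.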

According to various embodiments, the fully connected layer 290 is a layer that classifies data based on the features passed through the convolutional layer 201 and the pooling layer 202, and may receive as input one-dimensional data obtained by flattening the three-dimensional feature map. The one-dimensional data that has passed through the fully connected layer 290 may be converted into an output signal through an activation function. Because a convolutional neural network uses convolutional and pooling layers to preserve the three-dimensional shape of the feature map of the input data (e.g., an input image), it prevents loss of information about the relationships between pixels or channels of the input image and can thereby increase the image recognition rate.

FIGS. 3A and 3B illustrate a normalization operation for reducing the error that occurs.

In a convolutional neural network model, internal covariates can cause unexpected variance in the output values of each layer. During a machine learning operation, the output values of each layer may differ from the output values obtained offline. Because the covariate-shifted output of each layer is used as the input of the next layer, the variance of the output values can cause an error in the final result.

The convolutional neural network can resolve the covariate shift problem caused by internal covariates by applying batch normalization to the result values derived from each layer. After the neural network model is trained with batch normalization applied from the learning stage onward, batch normalization can be applied to the output values of each layer of the convolutional neural network model.

Referring to FIG. 3A, in the offline learning stage, the input layer 310 may receive an input vector matrix and derive a first result value. In the initial model, the first convolutional layer 320 and the second convolutional layer 330 may generate output values without covariate shift. The second convolutional layer 330 may pass values to the fully connected layer 340 based on its output values. The output values obtained from the fully connected layer may then have a consistent distribution.

According to various embodiments, when an inference operation is performed after weight extraction, covariate shift may occur in each layer. For example, the output values of the input vector matrix that has passed through the input layer 310 may be shifted by the convolution process, producing unwanted variance. The output values of the input layer 310 are used as the input values of the first convolutional layer 320, which adds further variance to the output values. As the data passes through each layer, the shift of the output values grows, and the error of the final result may increase further.

FIG. 3B illustrates how a convolutional neural network that includes a batch normalization process reduces error through normalization.

Before normalization is applied, the values input to the input layer 310 have a mean of 0 (μ=0) and a deviation of 1 (σ=1). The output values of the input layer 310, that is, the input values of the first convolutional layer 320, are obtained with a mean of μ' (μ=μ') and a deviation of σ' (σ=σ').

The values input to the first convolutional layer 320 have a mean of μ' (μ=μ') and a deviation of σ' (σ=σ'). The output values of the first convolutional layer 320, that is, the input values of the second convolutional layer 330, are obtained with a mean of μ" (μ=μ") and a deviation of σ" (σ=σ"). Accordingly, an error arises in the fully connected layer 340, whose values are obtained from the result values passed on by the second convolutional layer 330.

When normalization is applied at each layer, the values input to the input layer 310 have a mean of 0 (μ=0) and a deviation of 1 (σ=1). After the normalization operation, the output values of the input layer 310, that is, the input values of the first convolutional layer 320, are obtained with a mean of 0 (μ=0) and a deviation of 1 (σ=1).

The values input to the first convolutional layer 320 have a mean of 0 (μ=0) and a deviation of 1 (σ=1). After the normalization operation, the output values of the first convolutional layer 320, that is, the input values of the second convolutional layer 330, are obtained with a mean of 0 (μ=0) and a deviation of 1 (σ=1). Accordingly, the values of the fully connected layer 340, obtained from the result values passed on by the second convolutional layer 330, can have a uniform distribution without error.
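The effect described for FIG. 3B can be illustrated numerically with a small Python sketch. It assumes the conventional software form of normalization, (x - μ) / σ, and models a layer as a hypothetical affine map 3x + 2; neither the function names nor the sample values come from the disclosure, and σ here denotes the population standard deviation.

```python
import math


def mean_and_deviation(values):
    """Return (mean, population standard deviation) of `values`."""
    mu = sum(values) / len(values)
    variance = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(variance)


def normalize(values):
    """Shift and scale `values` to mean 0 and deviation 1 (sigma must be nonzero)."""
    mu, sigma = mean_and_deviation(values)
    return [(v - mu) / sigma for v in values]


# Inputs with mean 0 and deviation 1.
inputs = [-1.0, 1.0, -1.0, 1.0]

# Hypothetical layer that shifts and scales its inputs.
layer_outputs = [3.0 * x + 2.0 for x in inputs]
shifted_stats = mean_and_deviation(layer_outputs)

# Normalizing the layer output restores mean 0 and deviation 1.
normalized = normalize(layer_outputs)
restored_stats = mean_and_deviation(normalized)
```

Without normalization the next layer would see a mean of 2 and a deviation of 3; with normalization it sees the same statistics as the original input.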

FIG. 4 is a block diagram of a kernel matrix included in an artificial neural network model according to various embodiments.

Referring to FIG. 4, the batch normalization device 400 may include a kernel matrix 430, a normalization circuit array 410, and a control unit 420. The kernel matrix 430 may apply weights to the input data to extract feature values. For example, it may extract features from an input vector 431 received from a first layer 480 through a convolution operation. The kernel matrix 430 may generate first output data 411 destined for a second layer 490. The output data obtained through the kernel matrix 430 may be referred to as a feature map.

According to various embodiments, the kernel matrix 430 may be implemented as a crossbar array of resistive memories (an RRAM crossbar array, or RRAM synapse array). A resistive memory (ReRAM, resistive RAM) can implement synaptic learning by exploiting the property that its resistance changes according to an externally applied voltage pulse. The kernel matrix 430, formed of a crossbar array of resistive memories, may produce output values in response to the applied voltage pulses.
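How a crossbar array produces weighted output values can be sketched with an idealized model: each cell passes a current equal to its conductance times the row voltage (Ohm's law), and each column sums the currents of its cells (Kirchhoff's current law). The Python code below is a sketch under that assumption; the function name, the 2x2 array size, and the numeric values are hypothetical, and wire resistance and device non-linearity are ignored.

```python
def crossbar_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Hypothetical row voltages (the input vector) in volts.
voltages = [1.0, 0.5]

# Hypothetical cell conductances (the stored weights) in siemens;
# conductances[i][j] is the cell at row i, column j.
conductances = [[2.0, 1.0],
                [4.0, 0.0]]

# Each column current is one weighted sum of the inputs.
currents = crossbar_currents(voltages, conductances)
```

In this model the array performs a vector-matrix multiplication in a single step, with the weights encoded as conductances.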

According to various embodiments, an internal covariate shift may occur in the first output data 411 during the convolution operation that applies the weights. To reduce the weight vanishing or weight explosion caused by the change in distribution that results from parameter changes in the first output data 411, the normalization circuit array 410 normalizes the first output data 411, obtains normalized second output data 412, and passes it to the second layer 490.

According to various embodiments, the normalization circuit array 410 may include a circuit array formed of OP AMPs, which are active elements, capacitors, and a plurality of switches. Through switching operations, the circuit array 410 may obtain the mean of the output values, the deviation between each input value and the mean, a multiple of that deviation, and so on, and may implement normalization by combining the obtained values.

According to various embodiments, the kernel matrix formed of the resistive memory array may receive an analog signal (e.g., a voltage) and apply weights to it, and the result may be normalized by the circuit array. For example, a kernel that receives an analog signal may process the data and deliver an analog output signal without any conversion to a digital signal.

According to various embodiments, the control unit 420 may be configured to deliver signals that control the switching operation of the normalization circuit array 410. The control unit 420 may be electrically connected to the processor 120.

According to various embodiments, the processor 120 may deliver a control signal 421, which includes a clock signal and weight signals, to the control unit 420. Based on the control signal 421, the control unit 420 may deliver a plurality of clock signals 422 for controlling the switch operation. The switches operate according to the clock signals, and the normalization circuit array 410 may obtain the mean, the deviation, and the weighted values.

According to various embodiments, the normalization circuit array 410 may normalize the first output data 411 according to the received clock signals 422 to obtain the second output data 412. The second output data 412 may be used as the input data of the second layer 490.

FIG. 5 is a conceptual diagram of a resistive memory array that implements a kernel matrix included in an artificial neural network model, according to various embodiments.

Referring to FIG. 5, the kernel matrix 430 (or resistive memory array) may consist of resistive memories arranged in a grid.

According to various embodiments, the resistive memory 531 can implement synaptic learning by exploiting the property that its resistance changes according to an externally applied voltage pulse. The kernel matrix 430, formed of a crossbar array of resistive memories, may produce output values in response to the applied voltage pulses. The output values may be feature points obtained from the input layer. For example, the output values may be the input values of a convolutional layer (e.g., the convolutional layer 201 of FIG. 2), converted from the input data (e.g., the input data 210 of FIG. 2) through the weighted kernel matrix 430.

According to various embodiments, when the size of the array that can be integrated is limited, a plurality of resistive memory arrays may be interconnected to extend the kernel to the size required by the convolutional neural network.

FIGS. 6A and 6B are circuit diagrams of a batch normalization device according to various embodiments.

Referring to FIGS. 6A and 6B, the batch normalization device 600 may include a kernel matrix 430 formed of a crossbar array of resistive memories 531, an array 610 of integrators 611 connected to one end of the kernel matrix 430, capacitors 620 connected to the output terminals of the integrators 611, and a switch array 630 connected in parallel with the capacitors 620 at the output terminals of the integrators 611.

The normalization operation may proceed as follows. After passing through the array 610 of integrators 611, the voltages at the integrator output terminals may have the values V1, V2, V3, and V4, respectively (S601). To obtain the average of the voltages at the output terminals of the integrators 611, the capacitors 620 may be connected in parallel with one another through a switching operation of the switch array 630. The voltage Vμ may then appear at one end of the capacitors 620, which have the values CL1, CL2, CL3, and CL4 (S602). The values CL1, CL2, CL3, and CL4 may be equal. The average voltage Vμ may be obtained from the following equation.

Vμ = (V1 + V2 + V3 + V4) / 4
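The averaging step S602 can be checked with a short Python sketch based on charge conservation: when charged capacitors are connected in parallel, the total charge is shared, so the common voltage is the capacitance-weighted mean of the initial voltages. The function name shared_voltage and the numeric values are assumptions for illustration; with equal capacitances the result reduces to the arithmetic mean.

```python
def shared_voltage(voltages, capacitances):
    """Common voltage after connecting charged capacitors in parallel.

    Charge is conserved: V = sum(C_i * V_i) / sum(C_i).
    """
    total_charge = sum(c * v for c, v in zip(capacitances, voltages))
    return total_charge / sum(capacitances)


# Hypothetical integrator output voltages V1..V4 in volts.
voltages = [0.2, 0.4, 0.6, 0.8]

# Equal capacitances CL1..CL4 (1 pF each, an assumed value).
equal_caps = [1e-12] * 4
v_mu = shared_voltage(voltages, equal_caps)

# Unequal capacitances would weight the average instead.
weighted = shared_voltage([0.0, 1.0], [1e-12, 3e-12])
```

Here v_mu is approximately 0.5 V, the plain average of the four voltages, while the unequal case gives approximately 0.75 V.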

Using the capacitors and OP AMPs included in the switch array 630, the capacitors may be charged with Vμ and Vi (S603). Through a capacitor connected in parallel between the output terminal and the input terminal of each OP AMP, a voltage Vi' may be produced at the output terminal, and these values may be used as the output values (S604). The weight α applied to Vi - Vμ may be set by adjusting the capacitor values. The voltage Vi' at the OP AMP output terminal in operation S604 may be obtained from the following equation.

Vi' = α(Vi - Vμ)

The batch normalization device of FIG. 6B combines the switch array 630 circuit with the integrator 610 circuit in order to reduce the number of power-hungry OP AMPs.

According to various embodiments, the combined switch array 630' circuit may reduce the number of OP AMPs relative to the circuit of the switch array 630. The detailed operation of the switch array is described below with reference to FIGS. 7A to 7E.

FIGS. 7A to 7E illustrate the switching operation at each stage of a neural network learning device according to various embodiments.

The switch circuit array 630 may receive control signals from the control unit and control its switching operation accordingly. The switch circuit array includes integrators 750a and 750b and capacitors 720a and 720b, and may change the electrical paths between the integrators 750a and 750b and the capacitors 720a and 720b according to the switching operation.

Referring to FIG. 7A, when the switch circuit array 630 receives a first signal for integration from the control unit, it electrically connects the capacitors 720a and 720b to the output terminals of the integrators 750a and 750b, and the plurality of switching circuits 701a and 701b may be disconnected from one another. Specifically, the switches 721a and 721b connected to the output terminals of the integrators 750a and 750b may be closed, and the switching circuits 701a and 701b may be kept separate from one another because the switch 722 is open. The switch circuit array 630 may use the current generated by the resistive memory array (e.g., the kernel matrix 430 of FIG. 4) to charge the capacitor 715a included in the integrators 750a and 750b and the additional capacitor 720a. The capacitor 715a and the additional capacitor 720a may have the same capacitance, so the two may store the same amount of charge.

Referring to FIG. 7B, when the switch circuit array 630 receives a second signal for obtaining the average from the control unit, it electrically connects one end of each of the capacitors 720a and 720b to the input terminals of the integrators 750a and 750b, and the plurality of switching circuits 701a and 701b may be connected in parallel. Specifically, the capacitors 720a and 720b of the respective channels are disconnected from the output terminals of the integrators 750a and 750b and connected to one another. For example, the switches 721a and 721b connected to the output terminals of the integrators 750a and 750b are opened, and the capacitors 720a and 720b of the switching circuits 701a and 701b are connected to one another by closing the switch 722a.

According to various embodiments, the capacitors 720a and 720b are charged to V1, V2, V3, ... in FIG. 7A. After the switching operation of FIG. 7B, because the capacitors 720a and 720b have the same capacitance, they are charged to the average voltage Vμ, while the output terminals of the integrators 750a and 750b remain at V1, V2, V3, ... as in FIG. 7A.

Referring to FIG. 7C, when the switch circuit array 630 receives a third signal for shifting by the average value from the control unit, it may electrically connect one end of each of the capacitors 720a and 720b to the input terminals of the integrators 750a and 750b and ground the other end of the capacitors 720a and 720b.

According to various embodiments, after the operation of FIG. 7B, the capacitors 720a and 720b of each channel hold the average Vμ of the output voltages. The capacitors 720a and 720b holding Vμ may then be connected to the input terminals of the integrators 750a and 750b by closing the switches 723a and 723b. Part of the current flowing into the input terminals of the integrators 750a and 750b flows into the capacitors 720a and 720b, so that the outputs of the integrators 750a and 750b are shifted by the average Vμ and may become V1 - Vμ and V2 - Vμ.

Referring to FIG. 7D, when the switch circuit array 630 receives a fourth signal for sampling from the control unit, the switches electrically connect the other end of each capacitor to the input terminal of the integrator and ground one end of the capacitor, and the plurality of circuits may be disconnected from one another.

According to various embodiments, the capacitors 720a and 720b of each channel are electrically connected to the output terminals of the integrators 750a and 750b, and the plurality of switching circuits 701a and 701b may be disconnected from one another. Specifically, the switches 724a and 724b, connected between the output terminals of the integrators 750a and 750b and one end of the capacitors 720a and 720b, may be closed. The switches 725a and 725b connected to the other end of the capacitors 720a and 720b may be closed so that this end is grounded. The switching circuits 701a and 701b may be kept separate from one another because the switch 722 is open. The capacitors 715a and 715b included in the integrators 750a and 750b and the additional capacitors 720a and 720b may have the same capacitance. Accordingly, the charges stored on the capacitors 720a and 720b may be CL(V1 - Vμ) and CL(V2 - Vμ).

When the switch circuit array 630 receives a fifth signal for amplification from the control unit after the switching operation triggered by the fourth signal, it may electrically connect one end of each of the capacitors 720a and 720b to the input terminal of the integrator and ground the other end of the capacitors 720a and 720b.

According to various embodiments, after the switching operation triggered by the fourth signal, the capacitors 720a and 720b of each channel may hold V1 - Vμ and V2 - Vμ. The switches 724a, 724b, 725a, and 725b that were closed by the fourth signal are opened, and at the same time the capacitors 720a and 720b holding V1 - Vμ and V2 - Vμ may be connected to the input terminals of the integrators 750a and 750b by closing the switches 723a and 723b. Unlike in the switching operation triggered by the fourth signal, the terminals of the capacitors 720a and 720b are connected to ground with the opposite polarity. Part of the current flowing into the input terminals of the integrators 750a and 750b flows into the capacitors 720a and 720b, so that the outputs of the integrators 750a and 750b increase by V1 - Vμ and V2 - Vμ, respectively, and may become 2(V1 - Vμ) and 2(V2 - Vμ).

According to various embodiments, the fourth-signal and fifth-signal operations may be performed repeatedly. After one repetition, the outputs of the integrators 750a and 750b are 2(V1 - Vμ) and 2(V2 - Vμ); after two repetitions, 4(V1 - Vμ) and 4(V2 - Vμ); and after three repetitions, 8(V1 - Vμ) and 8(V2 - Vμ).
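The doubling that results from repeating the sample and amplify steps can be sketched as follows. This is an idealized voltage-level model, not a circuit simulation: it assumes equal capacitances and complete charge transfer, and the function name amplify_by_cycles is hypothetical.

```python
def amplify_by_cycles(centered_voltage, cycles):
    """Output after `cycles` repetitions of the sample/amplify pair.

    Each repetition samples the current output onto the capacitor
    (fourth signal) and adds the sampled value back to the output
    (fifth signal), which doubles the output.
    """
    output = centered_voltage
    for _ in range(cycles):
        sampled = output            # fourth signal: sample the output
        output = output + sampled   # fifth signal: add it back
    return output


# Gain applied to a centered voltage of 1.0 after 0, 1, 2, and 3 repetitions.
gains = [amplify_by_cycles(1.0, n) for n in range(4)]
```

Under these assumptions the gain after n repetitions is 2 to the power n, matching the factors 2, 4, and 8 listed above.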

According to various embodiments, the number of such cycles may be determined by a control signal (e.g., the control signal 421 of FIG. 4) applied from the processor 120. The number of cycles may determine the amplification factor in the normalization process.

Referring to FIG. 7E, when the switch circuit array 630 receives a sixth signal for reset from the control unit, it may ground the capacitors and the output terminals of the integrators.

According to various embodiments, the switches 726a and 726b connected to both ends of the capacitors 720a and 720b may be closed. With both ends grounded, the capacitors 720a and 720b discharge all of their stored charge.

The switches 727a and 727b, each with one end connected to the output of the integrators 750a and 750b, may be closed. Because the other ends of the switches 727a and 727b are grounded, the capacitors 715a and 715b in the integrators 750a and 750b are also discharged. In this way, the input signal can be reset after the normalization operation.
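The sequence of six control signals described with reference to FIGS. 7A to 7E can be summarized in a behavioral Python sketch. The class NormalizerModel, its method names, and the sample voltages are assumptions made for illustration; the model tracks only ideal voltages, assumes equal capacitances, and ignores switch timing, leakage, and OP AMP non-idealities.

```python
class NormalizerModel:
    """Idealized voltage-level model of the six switching phases."""

    def __init__(self, channels):
        self.outputs = [0.0] * channels  # integrator output voltages
        self.caps = [0.0] * channels     # voltages on the additional capacitors

    def integrate(self, voltages):       # first signal
        self.outputs = list(voltages)
        self.caps = list(voltages)

    def average(self):                   # second signal: capacitors share charge
        mean = sum(self.caps) / len(self.caps)
        self.caps = [mean] * len(self.caps)

    def shift(self):                     # third signal: subtract the mean
        self.outputs = [o - c for o, c in zip(self.outputs, self.caps)]

    def sample(self):                    # fourth signal: copy output to capacitor
        self.caps = list(self.outputs)

    def amplify(self):                   # fifth signal: add sampled value back
        self.outputs = [o + c for o, c in zip(self.outputs, self.caps)]

    def reset(self):                     # sixth signal: discharge everything
        self.outputs = [0.0] * len(self.outputs)
        self.caps = [0.0] * len(self.caps)


def run_sequence(voltages, cycles):
    """Run one normalization pass and return (result, model after reset)."""
    model = NormalizerModel(len(voltages))
    model.integrate(voltages)
    model.average()
    model.shift()
    for _ in range(cycles):
        model.sample()
        model.amplify()
    result = list(model.outputs)
    model.reset()
    return result, model


normalized, model_after_reset = run_sequence([1.0, 2.0, 3.0, 6.0], cycles=2)
```

With the assumed inputs the mean is 3.0, the shifted outputs are -2, -1, 0, and 3, and two sample/amplify repetitions scale them by 4 before the reset clears all stored values.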

The neural network learning device according to the various embodiments described above can perform batch normalization using analog signals alone, without a digital processing device. According to various embodiments, the batch normalization device is formed of OP AMPs, a plurality of capacitors, and a plurality of switches, and may be designed so that the switching operations yield the mean, the deviation, and the weighting.

FIG. 8 is a graph comparing the deviation of result values obtained with a neural network learning device according to various embodiments.

The graphs in the first row show the simulated output value distributions of the first layer and the second layer under the assumption that there is no covariate. The graphs in the second row show the simulated output value distributions of the first layer and the second layer when a device-induced covariate is present. Comparing the two rows shows that, when a device-induced covariate is present, the distributions deviate markedly from those in the first row.

The graphs in the third row show the output value distributions of the first layer and the second layer obtained with normalization. Compared with the graphs in the first row, there is almost no deviation, and the frequency of each output value is similar. The simulation thus indicates that the neural network learning device according to various embodiments can produce highly accurate results.

The neural network learning device according to the various embodiments described above includes an input unit for inputting training data for training an artificial neural network model, and a learning processor that trains the artificial neural network model using the training data. The artificial neural network model includes a plurality of layers that receive a signal corresponding to the data received from the input unit, and at least one batch normalization device disposed between the plurality of layers. The batch normalization device may include a resistive memory array (RRAM array) that performs a convolution operation on a first analog signal received from a first layer among the plurality of layers, and a circuit array that includes a plurality of electrical elements and a plurality of switches and that normalizes a second analog signal, produced by the convolution operation of the resistive memory array, to deliver a third analog signal to a second layer.

According to various embodiments, the circuit arrangement may control a voltage value transmitted through each channel of the resistive memory array.

According to various embodiments, the batch normalization device may include a control unit that controls the circuit arrangement so as to normalize the signal applied to each channel of the resistive memory array.

According to various embodiments, the control unit may transmit, to the circuit arrangement, a control signal for operating the switches of the circuit arrangement.

According to various embodiments, the plurality of electrical elements of the circuit arrangement may include a plurality of integrators and a plurality of capacitors, and the switch may change an electrical path between the integrator and the capacitor based on the control signal from the control unit.

According to various embodiments, the circuit arrangement may include a plurality of circuits connected in parallel, each corresponding to a channel that receives the output second analog signal or third analog signal, and the capacitor may be connected to an input terminal or an output terminal of the integrator.

According to various embodiments, when the signal transmitted from the control unit is a first signal, the switch may electrically connect the capacitor to the output terminal of the integrator, and the plurality of circuits may be short-circuited to one another.

According to various embodiments, when the signal transmitted from the control unit is a second signal, the switch may electrically connect one end of the capacitor to the input terminal of the integrator, and the plurality of circuits may be connected in parallel.

According to various embodiments, when the signal transmitted from the control unit is a third signal, the switch may electrically connect one end of the capacitor to the input terminal of the integrator and ground the other end of the capacitor.

According to various embodiments, when the signal transmitted from the control unit is a fourth signal, the switch may electrically connect the other end of the capacitor to the input terminal of the integrator while grounding one end of the capacitor, and the plurality of circuits may be open-circuited from one another.

According to various embodiments, when the signal transmitted from the control unit is a fifth signal, the switch may electrically connect one end of the capacitor to the input terminal of the integrator and ground the other end of the capacitor.

According to various embodiments, when the signal transmitted from the control unit is a sixth signal, the switch may ground the capacitor and the output terminal of the integrator.
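The six control signals described above amount to a lookup from signal to switch configuration. The following is a hedged sketch that only transcribes the stated connections into a table; it does not simulate the circuit. The names `ControlSignal`, `SwitchConfig`, `SWITCH_TABLE`, and `configure`, as well as the string labels, are hypothetical. Where the text does not specify a connection, the field is left as `None`, and where the text says only "the capacitor" without naming an end, recording it under `one_end` is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class ControlSignal(IntEnum):
    """The six control signals named in the text."""
    FIRST = 1
    SECOND = 2
    THIRD = 3
    FOURTH = 4
    FIFTH = 5
    SIXTH = 6


@dataclass(frozen=True)
class SwitchConfig:
    """Connections stated in the text; None means the text does not say."""
    one_end: Optional[str]    # where one end of the capacitor is connected
    other_end: Optional[str]  # where the other end of the capacitor is connected
    channels: Optional[str]   # coupling between the per-channel circuits
    integrator_output_grounded: bool = False


# Hypothetical lookup table transcribing the embodiments above.
SWITCH_TABLE = {
    # Text says only "the capacitor"; using one_end here is an assumption.
    ControlSignal.FIRST: SwitchConfig("integrator_output", None, "shorted"),
    ControlSignal.SECOND: SwitchConfig("integrator_input", None, "parallel"),
    ControlSignal.THIRD: SwitchConfig("integrator_input", "ground", None),
    ControlSignal.FOURTH: SwitchConfig("ground", "integrator_input", "open"),
    ControlSignal.FIFTH: SwitchConfig("integrator_input", "ground", None),
    # Text grounds "the capacitor"; which end is meant is an assumption.
    ControlSignal.SIXTH: SwitchConfig(
        "ground", None, None, integrator_output_grounded=True
    ),
}


def configure(signal: int) -> SwitchConfig:
    """Return the switch configuration for a control signal (1 to 6)."""
    try:
        return SWITCH_TABLE[ControlSignal(signal)]
    except ValueError:
        raise ValueError(f"unknown control signal: {signal}") from None
```

For example, `configure(4)` returns a configuration in which the other end of the capacitor goes to the integrator input, one end is grounded, and the per-channel circuits are open. The third and fifth signals map to the same stated connections in this table.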

According to various embodiments, the artificial neural network model may include a convolutional neural network (CNN) or a binarized neural network (BNN).

The methods according to the embodiments described in the claims or the specification of the present disclosure may be implemented in hardware, software, or a combination of hardware and software.

When implemented in software, a computer-readable storage medium storing one or more programs (software modules) may be provided. The one or more programs stored in the computer-readable storage medium are configured for execution by one or more processors in an electronic device. The one or more programs include instructions that cause the electronic device to execute the methods according to the embodiments described in the claims or the specification of the present disclosure.

These programs (software modules, software) may be stored in random access memory, non-volatile memory including flash memory, read only memory (ROM), electrically erasable programmable read only memory (EEPROM), a magnetic disc storage device, a compact disc ROM (CD-ROM), digital versatile discs (DVDs) or another type of optical storage device, or a magnetic cassette. Alternatively, they may be stored in a memory composed of a combination of some or all of these. A plurality of each constituent memory may also be included.

In addition, the program may be stored in an attachable storage device that can be accessed through a communication network such as the Internet, an intranet, a local area network (LAN), a wide LAN (WLAN), or a storage area network (SAN), or through a communication network composed of a combination thereof. Such a storage device may connect, through an external port, to a device that performs an embodiment of the present disclosure. A separate storage device on the communication network may also connect to the device that performs an embodiment of the present disclosure.

In the specific embodiments of the present disclosure described above, the components included in the disclosure are expressed in the singular or the plural according to the specific embodiment presented. However, the singular or plural expression is chosen to suit the situation presented for convenience of description, and the present disclosure is not limited to singular or plural components. A component expressed in the plural may consist of a single element, and a component expressed in the singular may consist of multiple elements.

Although specific embodiments have been described in the detailed description of the present disclosure, various modifications are of course possible without departing from the scope of the present disclosure. Therefore, the scope of the present disclosure should not be limited to the described embodiments, but should be determined by the claims set out below and by their equivalents.

100: input unit
120: processor
130: memory
131: artificial neural network model
140: learning processor
230: kernel matrix or resistive memory array
410: normalization circuit arrangement
420: control unit

Claims

A neural network learning apparatus comprising:
an input unit for inputting training data for training an artificial neural network model; and
a learning processor that trains the artificial neural network model using the training data,
wherein the artificial neural network model includes a plurality of layers that receive a signal corresponding to the data received from the input unit, and at least one batch normalization device disposed between the plurality of layers, and
wherein the batch normalization device includes a resistive memory array (RRAM array) that performs a convolution operation on a first analog signal transmitted from a first layer among the plurality of layers, and a circuit arrangement, including a plurality of electrical elements and a plurality of switches, that normalizes a second analog signal produced by the convolution operation of the resistive memory array and transmits a third analog signal to a second layer.

The neural network learning apparatus of claim 1,
wherein the circuit arrangement controls a voltage value transmitted through each channel of the resistive memory array.

The neural network learning apparatus of claim 1,
wherein the batch normalization device includes a control unit that controls the circuit arrangement so as to normalize the signal applied to each channel of the resistive memory array.

The neural network learning apparatus of claim 3,
wherein the control unit transmits, to the circuit arrangement, a control signal for operating the switches of the circuit arrangement.

The neural network learning apparatus of claim 4,
wherein the plurality of electrical elements of the circuit arrangement includes a plurality of integrators and a plurality of capacitors, and
wherein the switch changes an electrical path between the integrator and the capacitor based on the control signal from the control unit.

The neural network learning apparatus of claim 5,
wherein the circuit arrangement includes a plurality of circuits connected in parallel, each corresponding to a channel that receives the output second analog signal or third analog signal, and
wherein the capacitor is connected to an input terminal or an output terminal of the integrator.

The neural network learning apparatus of claim 6,
wherein, when the signal transmitted from the control unit is a first signal, the switch electrically connects the capacitor to the output terminal of the integrator, and the plurality of circuits are short-circuited to one another.

The neural network learning apparatus of claim 6,
wherein, when the signal transmitted from the control unit is a second signal, the switch electrically connects one end of the capacitor to the input terminal of the integrator, and the plurality of circuits are connected in parallel.

The neural network learning apparatus of claim 6,
wherein, when the signal transmitted from the control unit is a third signal, the switch electrically connects one end of the capacitor to the input terminal of the integrator and grounds the other end of the capacitor.

The neural network learning apparatus of claim 9,
wherein, when the signal transmitted from the control unit is a fourth signal, the switch electrically connects the other end of the capacitor to the input terminal of the integrator while grounding one end of the capacitor, and the plurality of circuits are open-circuited from one another.

The neural network learning apparatus of claim 10,
wherein, when the signal transmitted from the control unit is a fifth signal, the switch electrically connects one end of the capacitor to the input terminal of the integrator and grounds the other end of the capacitor.

The neural network learning apparatus of claim 6,
wherein, when the signal transmitted from the control unit is a sixth signal, the switch grounds the capacitor and the output terminal of the integrator.

The neural network learning apparatus of claim 1,
wherein the artificial neural network model includes a convolutional neural network (CNN) or a binarized neural network (BNN).