KR20240060522A

KR20240060522A - Neuromorphic device implementing neural network and operation method of the same

Info

Publication number: KR20240060522A
Application number: KR1020240006036A
Authority: KR
Inventors: 이충현
Original assignee: 주식회사 페블스퀘어
Priority date: 2022-10-28
Filing date: 2024-01-15
Publication date: 2024-05-08
Also published as: WO2024090858A1; KR102627460B1

Abstract

A neuromorphic device implementing a neural network comprises: a memory storing at least one program; an on-chip memory including a crossbar array circuit; and at least one processor that drives the neural network by executing the at least one program. The at least one processor receives an audio signal and inputs the audio signal to the neural network, which has been trained on predetermined training data, to output a speech recognition result. The neural network may include a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a mel generation layer that generates a mel spectrogram from the frequency signal, and an output layer that outputs the speech recognition result based on speech features of the mel spectrogram.

Description

Neuromorphic device implementing a neural network and its operating method {NEUROMORPHIC DEVICE IMPLEMENTING NEURAL NETWORK AND OPERATION METHOD OF THE SAME}

The present invention relates to a neuromorphic device implementing a neural network and a method of operating the same and, more specifically, to a system that performs speech recognition using an Edge AI chip without relying on a cloud server or a physical server, and a method of operating the same.

With the development of Internet technology, interaction between people and computing devices is becoming increasingly frequent. When a person interacts with a computing device through speech, it is becoming very important to recognize natural language accurately and determine the user's intent.

However, speech recognition is currently performed in cloud environments. In a cloud-based speech recognition system, the input speech must be transmitted to the cloud, which limits real-time speech recognition. In particular, if the communication link is unstable or fails at the time of recognition, speech recognition becomes impossible altogether.

The background art described above is technical information that the inventor possessed in order to derive the present invention or acquired in the course of deriving it, and is not necessarily known art disclosed to the general public before the filing of the present application.

An object of the present disclosure is to provide a neuromorphic device implementing a neural network and a method of operating the same. The problems to be solved by the present disclosure are not limited to the technical problems mentioned above; other technical problems not mentioned can be clearly understood by those of ordinary skill in the art from the description of the present invention and will be understood more clearly from the embodiments of the present disclosure. It will also be appreciated that the problems to be solved and the advantages of the present disclosure can be realized by the means recited in the claims and combinations thereof.

As a means for solving the technical problem described above, a first aspect of the present disclosure may provide a neuromorphic device comprising: a memory storing at least one program; an on-chip memory including a crossbar array circuit; and at least one processor that drives a neural network by executing the at least one program, wherein the at least one processor receives an audio signal and inputs the audio signal to the neural network, which has been trained on predetermined training data, to output a speech recognition result, and wherein the neural network includes a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a mel generation layer that generates a mel spectrogram from the frequency signal, and an output layer that outputs the speech recognition result based on speech features of the mel spectrogram.

A second aspect of the present disclosure may provide a method of operating a neuromorphic device implementing a neural network, the method comprising: receiving an audio signal; and inputting the audio signal to a neural network trained on predetermined training data to output a speech recognition result, wherein the neural network includes a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a mel generation layer that generates a mel spectrogram from the frequency signal, and an output layer that outputs the speech recognition result based on speech features of the mel spectrogram.
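The three layers recited above can be illustrated with a minimal sketch for a single audio frame. The disclosure does not specify the transform size, the mel formula, the filter shapes, or the form of the output layer, so everything below is an assumption made for illustration: a plain DFT, the commonly used mapping mel = 2595·log10(1 + f/700), triangular filters, and a single sigmoid unit. All function names are hypothetical.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Fourier transform layer (sketch): time-domain frame -> magnitudes of bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def hz_to_mel(hz):
    # Assumed mel mapping; the disclosure does not name one.
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # num_filters + 2 edge points, expressed as fractional DFT bin positions.
    edges = [mel_to_hz(max_mel * i / (num_filters + 1)) / nyquist * (num_bins - 1)
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mel_frame(frame, bank):
    """Mel generation layer (sketch): one log-mel column of a mel spectrogram."""
    power = [m * m for m in dft_magnitudes(frame)]
    return [math.log(1e-10 + sum(w * p for w, p in zip(row, power))) for row in bank]

def output_layer(features, weights, bias):
    """Output layer (sketch): a single sigmoid unit producing a score in (0, 1)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))
```

Calling `mel_frame` on successive frames and stacking the results column by column would give a mel spectrogram; in a trained network, `weights` and `bias` would come from the training data rather than being chosen by hand.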

A third aspect of the present disclosure may provide a computer-readable recording medium on which a program for executing the method of the second aspect on a computer is recorded.

In addition, other methods and other devices for implementing the present invention, and a computer-readable recording medium on which a program for executing the method is recorded, may be further provided.

Other aspects, features, and advantages in addition to those described above will become apparent from the following drawings, claims, and detailed description of the invention.

According to the means for solving the problems of the present disclosure described above, it is possible to provide a neuromorphic device that depends less on servers, reduces communication costs, and improves the security of personal information.

In addition, according to the means for solving the problems of the present disclosure, it is possible to provide a neuromorphic device with improved speed by preventing the bottleneck caused by data transmission and reception.

The effects of the embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description of the present invention.

FIG. 1 is a diagram for explaining the structure of a neuromorphic chip according to an embodiment.
FIG. 2 is a diagram for explaining the architecture of a fully connected neural network (FCNN) according to an embodiment.
FIG. 3 is a block diagram showing the hardware configuration of a neuromorphic device according to an embodiment.
FIGS. 4A and 4B are exemplary diagrams for comparing the von Neumann architecture and the PIM (Processing-In-Memory) architecture.
FIGS. 5A and 5B are diagrams for explaining a method of operating a neural network according to an embodiment.
FIGS. 6A and 6B are diagrams for comparing vector-matrix multiplication with the operations performed in a neural network according to an embodiment.
FIG. 7 is a diagram for explaining an example in which a convolution operation is performed in a neural network according to an embodiment.
FIG. 8 is a diagram showing an example implementation of a neural network according to an embodiment.
FIG. 9 is a diagram showing an example implementation of a crossbar array circuit according to an embodiment.
FIG. 10 is a flowchart of a method of operating a neuromorphic device according to an embodiment.
FIG. 11 is a block diagram of a neuromorphic device according to an embodiment of the present invention.

In describing the present invention, a detailed description of related known technology may be omitted where it is judged that such a description would obscure the gist of the invention. Unless otherwise defined, all terms used in this specification have the same meaning as commonly understood by those of ordinary skill in the art to which the present invention belongs.

Phrases such as "according to an embodiment," "relating to an embodiment," or "according to an implementation of an embodiment" in this specification do not necessarily all refer to the same embodiment.

Because the embodiments may be modified in various ways and may take various forms, some embodiments are illustrated in the drawings and described in detail. This is not intended to limit the embodiments to any particular disclosed form; the embodiments should be understood to include all modifications, equivalents, and substitutes falling within their spirit and technical scope. The terms used in the specification are used only to describe the embodiments and are not intended to limit them.

The terms used in the embodiments have been selected, as far as possible, from general terms in wide current use, taking into account their functions in the embodiments; however, they may vary depending on the intent of those working in the relevant technical field, on precedent, or on the emergence of new technology. In certain cases, terms have been chosen arbitrarily by the applicant, and in such cases their meaning is described in detail in the relevant section. Accordingly, the terms used in the embodiments should be defined based on their meaning and on the overall content of the embodiments, not merely on their names.

Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by any number of hardware and/or software components that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a given function.

Also, for example, the functional blocks of the present disclosure may be implemented in various programming or scripting languages. The functional blocks may be implemented as algorithms running on one or more processors. In addition, the present disclosure may employ conventional techniques for electronic configuration, signal processing, and/or data processing.

Terms such as "database," "element," "means," and "configuration" may be used broadly and are not limited to mechanical and physical components. Terms such as "-unit" and "-module" used in the specification refer to a unit that handles at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

The connecting lines or connecting members between components shown in the drawings merely illustrate functional connections and/or physical or circuit connections. In an actual device, connections between components may be represented by various functional, physical, or circuit connections that can be replaced or added.

Terms including ordinal numbers such as "first" or "second" used in this specification may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

Some components in the drawings may be shown with their size or proportions somewhat exaggerated. In addition, a component shown in one drawing may not be shown in another.

Throughout the specification, "embodiment" is an arbitrary division used to describe the invention more easily, and the embodiments need not be mutually exclusive. For example, configurations disclosed in one embodiment may be applied and/or implemented in another embodiment, and may be modified, applied, and/or implemented without departing from the scope of the present disclosure.

The terms used in the present disclosure are intended to describe the embodiments and not to limit them. In the present disclosure, the singular form also includes the plural unless otherwise stated.

Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them. However, the embodiments of the present disclosure may be implemented in many different forms and are not limited to the embodiments described herein.

On this basis, the present invention is described in detail below with reference to the drawings.

FIG. 1 is a diagram for explaining the structure of a neuromorphic chip according to an embodiment.

Referring to FIG. 1, a neuromorphic chip 1 is hardware that emulates the functions of the human brain by means of circuits that mimic the form of neurons. In other words, the neuromorphic chip 1 is a computer chip that mimics the structure of the nervous system.

Because the neuromorphic chip 1 consists only of the circuits required for neural network computation, it may achieve gains of several hundred times or more in power, area, and speed. Unlike a conventional computer, the human brain does not consume much power even when processing large amounts of data. The neuromorphic chip 1 emulates this mode of operation: the structures linking neurons and synapses are arranged in parallel, and connections are made and broken so that energy is saved while no data is being processed.

For example, a conventional computer based on the von Neumann architecture processes data sequentially as it is input. It is therefore excellent at executing precisely written programs, but it is limited in power consumption and is inefficient at tasks such as pattern recognition and real-time recognition.

In contrast, the neuromorphic chip 1 does not represent data digitally as 0s and 1s but uses analog operation in which a range of states changes gradually. That is, artificial neurons arranged in parallel operate in an event-driven manner without a clock. The chip can therefore efficiently process unstructured text, speech, and video that conventional computers find difficult to recognize intuitively. Specifically, neurons (nerve cells) and synapses (their connections) are implemented in a distributed manner using silicon transistor circuits and memory elements, so that data can be processed in parallel.

In an embodiment, when input data such as an image, video, or speech is input to the neuromorphic chip 1 of FIG. 1, the neuromorphic chip 1 may process the input data through internal operations and output predetermined output data. The predetermined output data may include a speech, image, or video recognition result obtained by classifying features of the input data. For example, when a speech signal is input, the predetermined output data may include a speech recognition result obtained by binary classification of whether the input data contains a preset keyword.

The data input to the neuromorphic chip 1 is not limited to the image, video, or speech described above and may include various other forms of data, such as text.

FIG. 2 is a diagram for explaining the architecture of a fully connected neural network (FCNN) according to an embodiment.

Referring to FIG. 2, a fully connected neural network (FCNN) is a neural network in which every neuron in one layer is connected to every neuron in the next layer. It is used to classify data from a matrix that has been flattened into a one-dimensional array.

In an FCNN, each node holds a value, and each neuron has a weight and a bias. When moving from one layer to the next, each node value is multiplied by a weight and a bias is added, and an activation function is then applied to the computed value. If the computed value satisfies a specific condition, the output of the activation function is passed on as the node value of the next layer and the node is activated; if it does not, the activation function deactivates the node.
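The layer-to-layer computation described here can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names are hypothetical, and ReLU is used only as one example of an activation that passes a value through when it is positive and otherwise outputs zero (deactivating the node).

```python
def relu(z):
    """Example activation: pass positive values through, output 0 otherwise."""
    return z if z > 0.0 else 0.0

def dense_forward(inputs, weights, biases, activation):
    """Compute one fully connected layer.

    weights[j][i] is the weight from input node i to output node j;
    biases[j] is the bias of output node j.
    """
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b  # weighted sum plus bias
        outputs.append(activation(z))
    return outputs
```

For instance, `dense_forward([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.5], relu)` gives a first node whose weighted sum is negative and is therefore deactivated, and a second node that passes its value on.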

Because the output value differs depending on the type of activation function, it is important to choose an appropriate activation function for the task. Representative examples are the ReLU function and the Softmax function.
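To make the difference between the two named functions concrete, the following sketch uses their standard textbook definitions; the disclosure does not say where in the network each would be used, so that choice is left open here.

```python
import math

def relu(z):
    """ReLU: max(0, z), applied to a single value."""
    return max(0.0, z)

def softmax(values):
    """Softmax: turn a list of scores into probabilities that sum to 1."""
    m = max(values)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

ReLU acts on each node independently, whereas Softmax normalizes across all nodes of a layer, which is why it is commonly placed at the output of a classifier.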

Because an FCNN accepts only one-dimensional input, an image stored as a three-dimensional array (height, width, and color channel) must be flattened into one-dimensional data before being fed to the FCNN. The drawback is that the spatial information of the image is discarded, so features contained in its shape cannot be extracted. An FCNN may therefore be most useful when the input is data that can be represented as a one-dimensional array, such as speech data. According to an embodiment, the neural network described later may be an FCNN.
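The flattening step and the resulting loss of spatial structure can be sketched as below. The row-major ordering and the helper names are assumptions chosen for illustration.

```python
def flatten(image):
    """Flatten a height x width x channel nested list into a 1-D list (row-major)."""
    return [value for row in image for pixel in row for value in pixel]

def flat_index(y, x, c, width, channels):
    """Position of element (y, x, c) in the flattened list."""
    return (y * width + x) * channels + c
```

With a width of 2 and 3 channels, the vertically adjacent pixels (0, 0) and (1, 0) end up 6 positions apart in the flat list, which illustrates how neighborhood relations in the image are no longer explicit once the data is one-dimensional.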

FIG. 3 is a block diagram showing the hardware configuration of a neuromorphic device according to an embodiment.

The neuromorphic device 300 may be implemented as various types of devices, such as a personal computer (PC), a server device, a mobile device, or an embedded device. Specific examples include, but are not limited to, smartphones, tablet devices, augmented reality (AR) devices, Internet of Things (IoT) devices, autonomous vehicles, robotics, and medical devices that perform speech recognition, image recognition, image classification, and the like using a neural network. Furthermore, the neuromorphic device 300 may be a dedicated hardware accelerator mounted on such a device, for example a dedicated module for running neural networks such as a neural processing unit (NPU), a Tensor Processing Unit (TPU), or a Neural Engine, but is not limited thereto.

The neuromorphic device 300 may include a processor and a memory. Only the components relevant to the present embodiments are shown in the neuromorphic device 300 of FIG. 3; it will be apparent to those of ordinary skill in the art that the neuromorphic device 300 may further include other general-purpose components in addition to those shown in FIG. 3.

The neuromorphic device 300 according to an embodiment may include an input/output interface 310.

The input/output interface 310 according to an embodiment may transmit data from inside the neuromorphic device 300 to an external destination and receive data from an external source into the neuromorphic device 300. Signals on the input/output interface 310 may be unidirectional or bidirectional, may be single-ended or differential, and may conform to any of various other input/output interface standards.

The input/output interface 310 according to an embodiment may further include an audio receiver, for example an audio ADC or DAC. The audio receiver may digitize speech input from a microphone, may include an amplifier, and may support multiple sampling rates.

In an embodiment, the input/output interface 310 may receive an audio signal from outside the neuromorphic device 300. The input/output interface 310 may also receive the speech recognition result, which is the output of the neural network 330, and transmit it to the operation circuit 350 or to another external unit (not shown).

The input/output interface 310 according to an embodiment may further include GPIO (General-Purpose Input/Output), I2S (Integrated Interchip Sound, also called Inter-IC Sound), I2C (Inter-Integrated Circuit), SPI (Serial Peripheral Interface), UART (Universal Asynchronous Receiver/Transmitter), PWM (Pulse Width Modulation), and the like.

The neuromorphic device 300 according to an embodiment may include analog devices 320 and 340.

The analog devices 320 and 340 may provide functions for stable power supply and for monitoring and supervising the system in the Edge AI chip.

Specifically, the analog devices 320 and 340 are devices that operate, in supervisory control, data acquisition, and automatic control, on parameters expressed as continuous physical quantities such as voltage, resistance, rotation, and pressure. The analog devices 320 and 340 may include an analog display device and an analog input/output device, and may further include an analog-to-digital converter and a digital-to-analog converter.

The analog devices 320 and 340 may also perform power management and supervision for the neuromorphic device 300. The analog devices 320 and 340 may further include a low voltage detector (LVD), which triggers a power-on reset (POR) when the digital supply voltage (VDD) falls below a safe operating level, thereby preventing unstable operation of the neuromorphic device 300.
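The supervision behavior described here can be modeled in software as a simple threshold check over sampled supply voltages. This is only a behavioral sketch: the real detector is an analog circuit, and the function name, the sampled-voltage input, and the 1.5 V threshold in the usage note are hypothetical values not taken from the disclosure.

```python
def reset_events(vdd_samples, threshold_v):
    """Return the sample indices at which a reset would be asserted.

    A reset is asserted once each time VDD crosses from at-or-above the
    threshold to below it; it is not re-asserted while VDD stays low.
    """
    events = []
    in_reset = False
    for i, v in enumerate(vdd_samples):
        below = v < threshold_v
        if below and not in_reset:
            events.append(i)
        in_reset = below
    return events
```

For example, with an assumed threshold of 1.5 V, the sequence 1.8, 1.7, 1.4, 1.3, 1.8, 1.2 would assert a reset at the third and sixth samples.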

The neuromorphic device 300 according to an embodiment may include an operation circuit 350.

The operation circuit 350 may include a microcontroller unit (MCU), a direct memory access (DMA) controller, and the like.

The MCU may perform control and management so that algorithms run efficiently on the PIM architecture. Because the PIM architecture is highly efficient at multiply-and-accumulate (MAC) operations, it may be well suited to neural network computation according to an embodiment of the present invention. The PIM architecture is described later with reference to FIG. 4B.
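To show why MAC efficiency matters for neural networks, the sketch below expresses a vector-matrix multiplication purely as repeated multiply-and-accumulate steps. It is a software illustration with hypothetical function names; it does not model how a PIM array would carry out these steps in memory.

```python
def mac(accumulator, a, b):
    """One multiply-and-accumulate step: accumulator + a * b."""
    return accumulator + a * b

def vector_matrix_multiply(vector, matrix):
    """Multiply a length-n vector by an n x m matrix using only MAC steps."""
    columns = len(matrix[0])
    out = [0.0] * columns
    for i, x in enumerate(vector):
        for j in range(columns):
            out[j] = mac(out[j], x, matrix[i][j])
    return out
```

An n-by-m layer needs n·m MAC steps per input vector, so an architecture that performs these steps efficiently speeds up essentially the whole forward pass of a fully connected layer.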

The DMA may perform high-speed data transfers between peripheral devices and memory, or between memories. Data can be moved quickly by DMA without involving the CPU, leaving the CPU free for other work. The DMA is located inside the SRAM and may help accelerate data movement for the neural network.

The neuromorphic device 300 according to an embodiment may be implemented as an Edge AI chip. Edge AI refers to technology that runs AI algorithms on a hardware device using edge computing, based on data generated in the system. AI processing is mostly carried out in cloud-based data centers, which require enormous computing capacity, and is therefore highly dependent on servers. With Edge AI, by contrast, AI computation is performed locally, which lowers dependence on the cloud (server) and reduces the associated communication costs; privacy is also protected because sensitive personal information is not transmitted to the cloud. Accordingly, configuring the neuromorphic device 300 as an Edge AI chip not only reduces cost and improves security but also allows computation to be processed immediately within the same hardware, making it possible to implement a highly responsive system.

FIGS. 4A and 4B are exemplary diagrams for comparing the von Neumann structure with the PIM (Processing-In-Memory) structure.

Referring to FIG. 4A, the von Neumann structure is a computer architecture proposed by John von Neumann: a stored-program computer architecture with a typical three-part structure consisting of a main memory, a central processing unit, and input/output devices.

The von Neumann structure has the advantage of greatly improved generality: to switch a computing device to a different task, only the software (program) needs to be changed, with no need to rearrange the hardware (wiring and so on). However, because it executes the listed instructions sequentially, and each instruction consists of changing the value of a given memory location, it causes serious problems in the design of high-speed computers. This is called the von Neumann bottleneck.

To address the von Neumann bottleneck, several alternatives have been proposed: the Harvard architecture, which divides memory into a region that stores instructions and a region that stores data; the PIM architecture, in which the memory performs data operations as well as data storage; and neuromorphic computing, an integrated circuit in the form of an artificial neural network that mimics the brain structure of higher animals, in which a very large number of units with integrated computation and memory functions are connected in parallel like a mesh and each unit is operated in an event-driven manner.

Referring to FIG. 4B, the PIM structure consists of a processor and a memory that has a computing function.

In the conventional von Neumann structure, all the data in memory must be moved to the processor for computation. In the PIM structure, by contrast, when an instruction from the processor arrives, the operation is performed within the memory and only the result data is sent to the processor. Because no large volume of data is moved, the von Neumann bottleneck described above can be effectively resolved. Power consumption is also significantly reduced.

Returning to FIG. 3, the neuromorphic device 300 according to an embodiment may include a neural network 330.

In an embodiment, the neural network 330 may produce output data by extracting features from input data using a plurality of layers. For example, the neural network 330 may receive an audio signal and output a voice recognition result. As described above, the neural network 330 may filter or classify the features of the received audio signal, determine the type of the audio signal (for example, whether it contains a specific keyword), and output that determination as the voice recognition result. Here, the neural network 330 is arranged as a matrix of orthogonal rows and columns, with inputs applied along the row direction and outputs generated along the column direction.

In an embodiment, the neural network 330 may be a model trained on predetermined training data. Each layer of the neural network 330 has weight and/or bias values. Because the number of nodes in a layer equals the size of its input tensor, a single layer may contain hundreds or thousands of nodes, and when several such layers are stacked, the number of weights connecting to the nodes of the next layer becomes extremely large. Setting every weight and/or bias value by hand is therefore impractical. Instead, training on a large amount of data can find the layers (that is, the weights and/or biases) best suited to the intended classification.

For example, deep learning, as a training method for the neural network 330, may find and set the most suitable weights of the network through a combination of three elements: a loss function (or cost function), optimization, and backpropagation.
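As a rough illustration of how a loss function, backpropagation, and an optimizer combine, the sketch below trains a single linear neuron by gradient descent. It is a toy example and not the training procedure of the neural network 330; the function names, the mean-squared-error loss, the learning rate, and the sample data are all assumptions made for illustration.

```python
# Toy sketch (hypothetical names): one linear neuron y = w * x + b trained with
# a mean-squared-error loss, chain-rule gradients, and plain gradient descent.

def mse_loss(samples, w, b):
    """Loss function: mean squared error over (x, target) pairs."""
    return sum((w * x + b - t) ** 2 for x, t in samples) / len(samples)

def train(samples, lr=0.05, epochs=500):
    """Return (w, b) after running gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in samples:
            err = (w * x + b) - target   # forward pass, then error
            grad_w += 2.0 * err * x / n  # backpropagation: dLoss/dw
            grad_b += 2.0 * err / n      # backpropagation: dLoss/db
        w -= lr * grad_w                 # optimization: gradient descent step
        b -= lr * grad_b
    return w, b

# Assumed sample data whose targets follow y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(samples)
print(w, b, mse_loss(samples, w, b))
```

In a crossbar implementation, the trained weights would then be written into the synapses as conductance values rather than kept as floating-point numbers.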

The neural network 330 may perform operations using only on-chip memory, without using external memory. For example, by performing the operations of each layer on a PIM basis using only on-chip memory, without external memory (e.g., off-chip memory), the neural network 330 can perform operations without memory updates while an audio signal is being processed. Specifically, the neural network 330 may perform PIM-based operations in which each memory cell is directly connected to the processor.

However, PIM requires high memory bandwidth, and because memory is sensitive to high temperatures, the power that the computing device can consume may be limited. Producing a high-performance PIM chip may therefore require a new hardware structure, which may raise manufacturing costs. For this reason, the PIM structure may be best suited to relatively small or simple computations. To overcome these drawbacks, the neural network 330 may be built from memory capable of multi-bit operation. For example, the neural network 330 may be built from memory capable of 7 bits, that is, 128 analog memory states. Because the neural network 330 is configured with large capacity, it can process vast amounts of data at low power and high performance even over long periods of use, unlike typical PIM chips, which show problems such as heat generation or performance degradation.
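To make the "7 bits, 128 analog memory states" figure concrete, the following sketch maps a real-valued weight onto one of 128 discrete states and back. The uniform spacing, the weight range of -1.0 to 1.0, and the function names are assumptions for illustration; the source does not specify how weights are mapped to cell states.

```python
# Sketch: store a weight in a 7-bit cell with 2**7 = 128 states.
# The linear mapping and the [-1.0, 1.0] range are illustrative assumptions.

BITS = 7
LEVELS = 2 ** BITS  # 128 distinguishable states

def quantize(weight, w_min=-1.0, w_max=1.0, levels=LEVELS):
    """Return the state index (0 .. levels - 1) closest to the weight."""
    clipped = min(max(weight, w_min), w_max)
    step = (w_max - w_min) / (levels - 1)
    return round((clipped - w_min) / step)

def dequantize(state, w_min=-1.0, w_max=1.0, levels=LEVELS):
    """Return the weight value represented by a state index."""
    step = (w_max - w_min) / (levels - 1)
    return w_min + state * step

state = quantize(0.3)
restored = dequantize(state)
print(LEVELS, state, restored)
```

Under this mapping the rounding error is at most half a step, so more states per cell means a trained weight is stored more faithfully in a single device.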

FIGS. 5A and 5B are diagrams for explaining a method of operating a neural network according to an embodiment.

Referring to FIG. 5A, the neural network may include a plurality of cores, and each core may be implemented as an RCA (Resistive Crossbar Memory Array). Specifically, each core may include a plurality of presynaptic neurons 510, a plurality of postsynaptic neurons 520, and synapses 530 that provide the respective connections between the plurality of presynaptic neurons 510 and the plurality of postsynaptic neurons 520.

In an embodiment, the core of the neural network includes four presynaptic neurons 510, four postsynaptic neurons 520, and sixteen synapses 530, but these numbers may be varied. When the number of presynaptic neurons 510 is N (where N is a natural number of 2 or more) and the number of postsynaptic neurons 520 is M (where M is a natural number of 2 or more and may be equal to or different from N), N*M synapses 530 may be arranged in a matrix.

Specifically, wirings 512 connected to the respective presynaptic neurons 510 and extending in a first direction (for example, the horizontal direction) may be provided, together with wirings 522 connected to the respective postsynaptic neurons 520 and extending in a second direction (for example, the vertical direction) that crosses the first direction. Hereinafter, for convenience of description, a wiring 512 extending in the first direction is referred to as a row line, and a wiring 522 extending in the second direction is referred to as a column line. The plurality of synapses 530 may be disposed at each intersection of a row line 512 and a column line 522, connecting the corresponding row line 512 to the corresponding column line 522.

The presynaptic neuron 510 generates a signal, for example a signal corresponding to specific data, and sends it to the row line 512. The postsynaptic neuron 520 receives, through the column line 522, the synaptic signal that has passed through the synapse 530, and processes it. The presynaptic neuron 510 may correspond to an axon, and the postsynaptic neuron 520 may correspond to a neuron. However, whether a neuron is presynaptic or postsynaptic may be determined by its relationship to other neurons. For example, when the presynaptic neuron 510 receives a synaptic signal in its relationship with another neuron, it may function as a postsynaptic neuron. Similarly, when the postsynaptic neuron 520 sends a signal in its relationship with another neuron, it may function as a presynaptic neuron. The presynaptic neuron 510 and the postsynaptic neuron 520 may be implemented with various circuits, such as CMOS.

The presynaptic neuron 510 and the postsynaptic neuron 520 may be connected through the synapse 530. Here, the synapse 530 is an element whose electrical conductance, or weight, changes according to an electrical pulse, for example a voltage or current, applied across it.

The core of the neural network may be built using multi-level memory, such as ReRAM (Resistive RAM), in which the synapse 530 is a variable resistor; FeRAM (Ferroelectric RAM), in which the synapse 530 is a ferroelectric; PRAM (Phase-change RAM), in which the synapse 530 is a chalcogenide glass that changes from an amorphous state to a crystalline state when heated; MRAM (Magnetic RAM), in which the synapse 530 is a magnetic element; or NAND/NOR flash memory.

The synapse 530 may include, for example, a variable resistance element. A variable resistance element can switch between different resistance states according to the voltage or current applied across it. It may have a single-layer or multi-layer structure containing any of various materials that can take multiple resistance states, for example metal oxides such as transition metal oxides and perovskite-based materials, phase-change materials such as chalcogenide-based materials, ferroelectric materials, and ferromagnetic materials.

The synapse 530 of the core may be implemented with several characteristics that distinguish it from a variable resistance element used in memory. For example, it shows no abrupt resistance change in set and reset operations, and it exhibits analog behavior in which conductance changes gradually with the number of input electrical pulses. This is because the characteristics required of a variable resistance element in memory differ from those required of the synapse 530 in the core of the neural network.

Specifically, the neuromorphic chip may apply varied voltages through the analog synapse 530 devices of the core. Accordingly, the resistance value or weight of the synapse 530 may change gradually.

The operation of the neural network described above is explained below with reference to FIG. 5B. For convenience of description, the row lines 512 are referred to, in order from the top, as a first row line 512A, a second row line 512B, a third row line 512C, and a fourth row line 512D, and the column lines 522 are referred to, in order from the left, as a first column line 522A, a second column line 522B, a third column line 522C, and a fourth column line 522D.

Referring to FIG. 5B, in the initial state all of the synapses 530 may be in a state of relatively low conductance, that is, a high-resistance state. If at least some of the plurality of synapses 530 are in a low-resistance state, an additional initialization operation may be needed to bring them into the high-resistance state. Each of the plurality of synapses 530 may have a predetermined threshold required for its resistance and/or conductance to change. More specifically, when a voltage or current smaller than the threshold is applied across a synapse 530, its conductance does not change, and when a voltage or current larger than the threshold is applied, its conductance may change.
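The threshold behavior and the gradual, pulse-by-pulse conductance change described above can be sketched as follows. The threshold value, the step size, the conductance limits, and the rule that a positive pulse raises conductance while a negative pulse lowers it are all assumptions for illustration; a pulse exactly at the threshold is treated here as leaving the conductance unchanged, which the source does not specify.

```python
# Sketch of a thresholded, gradual (analog) synapse update.
# All numeric values and the sign convention are illustrative assumptions.

def apply_pulse(conductance, voltage, threshold=1.0, step=0.1,
                g_min=0.0, g_max=1.0):
    """Return the conductance after one pulse of the given voltage."""
    if abs(voltage) <= threshold:
        return conductance  # sub-threshold pulse: no change
    delta = step if voltage > 0 else -step
    # Supra-threshold pulse: move one step, staying within device limits.
    return min(g_max, max(g_min, conductance + delta))

g = 0.0  # initial high-resistance (low-conductance) state
for v in [0.5, 1.5, 1.5, 0.8, 1.5]:
    g = apply_pulse(g, v)
print(g)
```

Only the three pulses above the threshold move the conductance, so the final state reflects the number of effective pulses rather than a single abrupt switch.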

In this state, to output specific data as the result on a specific column line 522, an input signal corresponding to the specific data may enter the row lines 512 in response to the output of the presynaptic circuit 510. For example, the input signal may take the form of an electrical pulse applied to each of the row lines 512. The column line 522 may also be driven with an appropriate voltage or current for output.

As another example, the column line 522 that is to output the specific data may not be determined in advance. In this case, while electrical pulses corresponding to the specific data are applied to the row lines 512, the current flowing through each of the column lines 522 is measured, and the column line 522 that first reaches a predetermined threshold current, for example the third column line 522C, may become the column line 522 that outputs this specific data.
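One way to picture "the column that first reaches a threshold current" is a simple accumulate-and-compare model, sketched below. The assumption that each column integrates the same per-pulse current until it crosses the threshold is ours, as are the function name and the example voltages and conductances; the source only states that the column currents are measured.

```python
# Sketch: find which column line reaches a threshold first.
# The per-pulse accumulation model is an illustrative assumption.

def first_column_to_fire(voltages, conductances, threshold, max_pulses=1000):
    """Return (column index, pulse count), or (None, max_pulses) if none fires."""
    n_cols = len(conductances[0])
    # Current contributed to each column by one input pulse (sum of V * G).
    per_pulse = [sum(v * row[j] for v, row in zip(voltages, conductances))
                 for j in range(n_cols)]
    accumulated = [0.0] * n_cols
    for pulse in range(1, max_pulses + 1):
        for j in range(n_cols):
            accumulated[j] += per_pulse[j]
        fired = [j for j in range(n_cols) if accumulated[j] >= threshold]
        if fired:
            # If several columns cross in the same pulse, take the largest.
            return max(fired, key=lambda j: accumulated[j]), pulse
    return None, max_pulses

V = [1.0, 1.0, 0.0, 0.0]          # pulses on row lines 512A and 512B only
G = [[0.1, 0.2, 0.9, 0.1],
     [0.2, 0.1, 0.8, 0.3],
     [0.5, 0.5, 0.1, 0.5],
     [0.3, 0.3, 0.2, 0.4]]
winner, pulses = first_column_to_fire(V, G, threshold=5.0)
print(winner, pulses)
```

With these assumed values the third column (index 2) has the largest conductance to the active rows, so it crosses the threshold first.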

In the manner described above, different data may be output on different column lines 522.

FIGS. 6A and 6B are diagrams for comparing vector-matrix multiplication with the operation performed in the neural network according to an embodiment.

Referring first to FIG. 6A, a convolution operation between input data and a kernel may be performed using vector-matrix multiplication. For example, the pixel data of the input data may be expressed as a matrix X (610), and the kernel values may be expressed as a matrix W (611). The pixel data of the output data may be expressed as a matrix Y (612), which is the result of multiplying the matrix X (610) by the matrix W (611).

Referring to FIG. 6B, a vector multiplication operation may be performed using the core of the neural network. In comparison with FIG. 6A, the pixel data of the input data may be received as the input values of the core, and the input values may be voltages 620. The kernel values may be stored in the synapses of the core, that is, in the memory cells, and the kernel values stored in the memory cells may be conductances 621. The output values of the core may therefore be expressed as currents 622, which are the result of multiplying the voltages 620 by the conductances 621.
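The correspondence between FIG. 6A and FIG. 6B follows from Ohm's law (each cell passes a current equal to voltage times conductance) and Kirchhoff's current law (currents on a shared column line add). A minimal sketch, with a hypothetical function name and made-up values, shows that the column currents equal the vector-matrix product:

```python
# Sketch: column currents of an ideal crossbar equal a vector-matrix product.
# Row voltages play the role of X, conductances the role of W, currents of Y.

def crossbar_currents(voltages, conductances):
    """Return one current per column: I[j] = sum over rows i of V[i] * G[i][j]."""
    n_cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(n_cols)]

voltages = [1.0, 2.0]                 # one input voltage per row line
conductances = [[1.0, 2.0],           # G[i][j]: cell at row i, column j
                [3.0, 4.0]]
currents = crossbar_currents(voltages, conductances)
print(currents)  # [7.0, 10.0]
```

All multiplications and additions happen in the array at once, which is why the crossbar is attractive for MAC-heavy workloads. The sketch ignores non-ideal effects such as wire resistance and device variation.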

FIG. 7 is a diagram for explaining an example in which a convolution operation is performed in a neural network according to an embodiment.

The neural network may receive an audio signal 710, and the core 700 of the neural network may be implemented as an RCA (Resistive Crossbar Memory Array). Here, the audio signal 710 is converted from the time domain to the frequency domain by a Fourier transform layer implemented with the core 700 of the neural network, and is converted into a mel spectrogram by a mel generation layer implemented with the core 700 of the neural network, so that a convolution operation can be performed on the pixel data corresponding to the mel spectrogram. Accordingly, in the description of FIG. 7 below, the pixel data of the audio signal 710 may mean the pixel data of the converted frequency domain or the pixel data of the generated mel spectrogram.

In an embodiment, when the core 700 is a matrix of size N×M (where N and M are natural numbers of 2 or more), the number of pixel data items of the audio signal 710 may be less than or equal to the number of columns (M) of the core 700. The pixel data of the audio signal 710 may be parameters in a floating-point format or a fixed-point format.

The neural network may receive pixel data in the form of a digital signal and may convert the received pixel data into a voltage in the form of an analog signal using a DAC (Digital-to-Analog Converter) 720. The pixel data of the audio signal 710 may have various bit resolutions, such as 1-bit, 4-bit, and 8-bit resolution. In an embodiment, the neural network may convert the pixel data into a voltage using the DAC 720 and then receive the voltage as the input value 701 of the core 700.

In addition, trained kernel values may be stored in the core 700 of the neural network. The kernel values may be stored in the memory cells of the core, and the kernel values stored in the memory cells may be conductances 702. The neural network may then calculate an output value by performing a vector multiplication operation between the voltages 701 and the conductances 702, and the output value may be expressed as a current 703. That is, using the core 700, the neural network may output the same result as a convolution operation between the audio signal and the kernel.

Because the current 703 output from the core 700 is an analog signal, the neural network may use an ADC (Analog-to-Digital Converter) 730 so that the current 703 can serve as input data for another core. Using the ADC 730, the neural network may convert the current 703, an analog signal, into a digital signal. In an embodiment, the neural network may use the ADC 730 to convert the current 703 into a digital signal with the same bit resolution as the pixel data of the audio signal 710. For example, when the pixel data of the audio signal 710 has 1-bit resolution, the neural network may use the ADC 730 to convert the current 703 into a digital signal with 1-bit resolution.

The neural network may use an activation unit 740 to apply an activation function to the digital signal converted by the ADC 730. The sigmoid function, the tanh function, and the ReLU (Rectified Linear Unit) function may be used as the activation function, but the activation functions applicable to the digital signal are not limited to these.

The digital signal to which the activation function has been applied may be used as the input value of another core 750. In that case, the process described above may be applied in the same way in the other core 750.
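The chain described for FIG. 7 (DAC 720, core 700, ADC 730, activation unit 740, then another core 750) can be sketched end to end as below. The reference voltage, the full-scale current `i_max`, the linear converters, and the ReLU with a subtracted bias code are assumptions for illustration, as are all function names and numeric values; the ADC output uses the same bit resolution as the input, as the embodiment describes.

```python
# Sketch of one crossbar stage: digital codes -> DAC -> V*G -> ADC -> activation.
# Converter models, bias handling, and values are illustrative assumptions.

def dac(code, bits, v_ref=1.0):
    """Convert an unsigned digital code to a voltage in [0, v_ref]."""
    return v_ref * code / (2 ** bits - 1)

def crossbar(voltages, conductances):
    """Column currents of an ideal crossbar: I[j] = sum_i V[i] * G[i][j]."""
    n_cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(n_cols)]

def adc(current, bits, i_max):
    """Convert a current in [0, i_max] to an unsigned code of the given width."""
    clipped = min(max(current, 0.0), i_max)
    return round(clipped / i_max * (2 ** bits - 1))

def activate(code, bias):
    """ReLU applied in the digital domain after subtracting a bias code."""
    return max(0, code - bias)

def run_core(codes, conductances, bits, i_max, bias):
    voltages = [dac(c, bits) for c in codes]
    currents = crossbar(voltages, conductances)
    return [activate(adc(i, bits, i_max), bias) for i in currents]

# Two stages with 8-bit data: the output codes of one core feed the next.
hidden = run_core([255, 0], [[0.2, 1.0], [0.7, 0.3]], bits=8, i_max=1.0, bias=100)
output = run_core(hidden, [[0.5, 0.1], [0.4, 0.8]], bits=8, i_max=1.0, bias=0)
print(hidden, output)
```

Keeping the same bit width at the ADC output as at the DAC input is what lets the second call reuse `run_core` unchanged.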

The core 700 and the other core 750 need not be physically separate. Rather, they may refer to the same core whose synaptic variable resistance element values have been changed according to the weight and/or bias values of each core 700, 750.

FIG. 8 is a diagram illustrating an example implementation of a neural network according to an embodiment.

Referring to FIG. 8, the neural network may include a Fourier transform layer 810 that converts the received audio signal from the time domain to the frequency domain to generate a frequency signal, a mel generation layer 820 that generates a mel spectrogram from the frequency signal, one or more hidden layers 830, 840 that classify features of the audio signal based on the mel spectrogram, and an output layer 850 that outputs a voice recognition result. The order of the layers may be as shown in FIG. 8, but is not limited thereto.

In an embodiment, the Fourier transform layer 810 may apply a Fourier transform to the time-domain audio signal to generate a frequency-domain frequency signal. As described above, the audio signal may be data obtained when the input/output interface (or an audio receiver included in the input/output interface) digitizes the voice input from a microphone, and is therefore in the time domain. The Fourier transform layer 810 may accordingly convert it into frequency-domain data through a Fourier operation, so that it is suitable as input to the other layers 820 to 850 that classify the audio signal. For example, the Fourier transform layer 810 may generate the frequency signal through a DFT (Discrete Fourier Transform), which is the Fourier transform for discrete (sampled) functions.
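For reference, the DFT of N samples x[t] is X[k] = Σ x[t]·exp(-2πi·k·t/N), which is itself a matrix-vector product and therefore a natural fit for a crossbar layer. The sketch below evaluates the definition directly in software; it is not the layer's hardware implementation, and the test signal is an assumption.

```python
# Sketch: direct evaluation of the DFT definition (O(N^2), no FFT).
import cmath
import math

def dft(samples):
    """Return the DFT coefficients X[0 .. N-1] of a real or complex sequence."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples))
            for k in range(n)]

# Assumed test signal: one cosine cycle across 8 samples.
samples = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(samples)
print([round(abs(c), 6) for c in spectrum])
```

The energy of the cosine appears in bin 1 and its mirror image, bin 7, each with magnitude N/2 = 4; the other bins are zero up to rounding error.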

In an embodiment, the neural network may further include a digital-to-analog converter (DAC) that converts the audio signal, which is a digital signal, into an analog signal. The received audio signal may be converted into an analog signal by the digital-to-analog converter and then input to the Fourier transform layer 810.

In an embodiment, the mel generation layer 820 may receive the frequency signal from the Fourier transform layer 810 and generate a mel spectrogram corresponding to the audio signal.

A spectrogram is a graph that visualizes the spectrum of a voice signal. The x-axis of the spectrogram represents time and the y-axis represents frequency, and the value at each frequency for each point in time may be represented by a color according to its magnitude. Because the result of applying a Fourier transform to the audio signal is complex-valued, taking the absolute value of each complex value discards the phase information and yields a spectrogram that contains only magnitude information.
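A minimal sketch of a magnitude spectrogram follows: the signal is cut into frames (the time axis), each frame is transformed with a DFT (the frequency axis), and the absolute value drops the phase. The frame length, hop size, absence of a window function, and the test signal are assumptions for illustration; the source does not specify framing parameters.

```python
# Sketch: magnitude spectrogram from framed DFTs (rectangular window assumed).
import cmath
import math

def magnitude_spectrogram(samples, frame_len, hop):
    """Return a list of frames, each a list of |X[k]| for k = 0 .. frame_len // 2."""
    spectrogram = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        bins = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / frame_len)
                        for t, x in enumerate(frame))
            bins.append(abs(coeff))          # magnitude only, phase discarded
        spectrogram.append(bins)
    return spectrogram

# Assumed test signal: a tone with a period of 4 samples.
signal = [math.cos(2 * math.pi * t / 4) for t in range(16)]
spec = magnitude_spectrogram(signal, frame_len=8, hop=4)
print(len(spec), len(spec[0]))
```

For this steady tone every frame peaks in the same bin (bin 2 of an 8-sample frame); a changing sound would show the peak moving from frame to frame.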

A mel spectrogram, in turn, is a spectrogram whose frequency spacing has been rescaled to the mel scale. The human auditory system is more sensitive in the low-frequency band than in the high-frequency band, and the mel scale reflects this characteristic by expressing the relationship between physical frequency and the frequency that a person actually perceives. A mel spectrogram may be generated by applying a mel-scale filter bank to the spectrogram.
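The sketch below builds a triangular mel filter bank and applies it to one spectrogram frame. The source does not give a mel formula; the conversion used here, mel = 2595·log10(1 + f/700), is one widely used convention and is an assumption, as are the triangular filter shape, the function names, and the parameter values.

```python
# Sketch: mel-scale triangular filter bank applied to one magnitude frame.
# The 2595 * log10(1 + f / 700) convention is assumed, not taken from the source.
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_bins, sample_rate):
    """Return n_filters rows of n_bins weights covering 0 .. sample_rate / 2."""
    max_mel = hz_to_mel(sample_rate / 2)
    # Filter edges equally spaced in mel, hence increasingly wide in Hz.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bin_hz = [sample_rate / 2 * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def apply_filter_bank(bank, magnitudes):
    """Weighted sum per filter: a matrix-vector product over frequency bins."""
    return [sum(w * m for w, m in zip(row, magnitudes)) for row in bank]

bank = mel_filter_bank(n_filters=10, n_bins=129, sample_rate=16000)
mel_frame = apply_filter_bank(bank, [1.0] * 129)
print(len(bank), len(bank[0]), len(mel_frame))
```

Because applying the bank is a matrix-vector product, the filter weights could in principle be stored as conductances in a crossbar core, which is consistent with implementing the mel generation layer 820 on the core 700.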

In an embodiment, the one or more hidden layers 830, 840 may receive the mel spectrogram corresponding to the audio signal from the mel generation layer 820 and classify the voice features of the audio signal or the mel spectrogram.

The one or more hidden layers 830, 840 may treat the tensor corresponding to the input mel spectrogram as the presynaptic neurons and apply weights to the presynaptic neurons, thereby classifying the mel spectrogram according to predetermined criteria. Accordingly, the more hidden layers there are, the more precisely the input data can be processed, which improves the performance of the algorithm.

In an embodiment, the one or more hidden layers 830, 840 may be located between the mel generation layer 820 and the output layer 850. The one or more hidden layers 830, 840 may thus receive the mel spectrogram from the mel generation layer 820, classify its features, and pass the classification result to the output layer 850.

In one embodiment, the output layer 850 may receive voice features, or the result of classifying voice features, from the Mel generation layer 820 or the one or more hidden layers 830 and 840, and output a voice recognition result. For example, the output layer 850 may output the voice recognition result by determining, from the result of the one or more hidden layers 830 and 840 classifying the voice features, whether the audio signal corresponds to any one of a plurality of preset keywords. Specifically, the neuromorphic device may store the plurality of preset keywords. In this case, the hidden layers 830 and 840 classify the features of the audio signal, and the output layer 850 receives the classification result and determines whether it corresponds to any one of the preset keywords.

For example, the plurality of keywords may be commands consisting of a predetermined number of syllables, such as "불 켜" ("turn on the light"), "불 꺼" ("turn off the light"), "조명 밝게" ("brighten the light"), "조명 어둡게" ("dim the light"), "청광 차단" ("block blue light"), and "따듯한 조명" ("warm light"). Accordingly, the input/output interface receives an audio signal that includes household noise and natural language, the features of the audio signal are classified through the Fourier transform layer 810, the Mel generation layer 820, and the one or more hidden layers 830 and 840, and the output layer 850 can determine from the classification result whether the received audio signal corresponds to any one of the preset keywords. That is, when the received audio signal corresponds to one of the preset keywords, the neuromorphic device outputs that keyword as the voice recognition result through the output layer 850. When the received audio signal corresponds to none of the preset keywords, the device may output an empty value (null), or may produce no output, receive the next audio signal, and repeat the operations of each layer described above. In one embodiment, the number of keywords may be set to about 20, or to 20 or fewer.
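
The keyword-or-null decision described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the output layer scores keywords, so a softmax over per-keyword scores with a confidence threshold is assumed, and `KEYWORDS`, `softmax`, `decide_keyword`, and the 0.6 threshold are hypothetical.

```python
import math

# A few of the example commands from the text, in English for readability.
KEYWORDS = ["turn on the light", "turn off the light", "brighten the light", "dim the light"]

def softmax(scores):
    """Turn raw output-layer scores into probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decide_keyword(scores, keywords=KEYWORDS, threshold=0.6):
    """Return the best-matching keyword, or None (the 'empty value') if no
    keyword is confident enough. The threshold is an assumed design parameter."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return keywords[best] if probs[best] >= threshold else None
```

For instance, `decide_keyword([5.0, 0.0, 0.0, 0.0])` yields the first keyword, while a flat score vector such as `[1.0, 1.0, 1.0, 1.0]` yields `None`, after which the device would simply move on to the next audio signal.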

In one embodiment, when the audio signal corresponds to a negative signal that resembles at least one of the preset keywords, the output layer 850 may output an empty value as the voice recognition result, or may produce no output, receive the next audio signal, and repeat the operations of each layer described above. A negative signal may refer to an utterance that resembles one of the preset keywords closely enough that the neural network is likely to mistake it for that keyword. For example, when the preset keyword is "auto care", the Korean phrase "어떡해" (eotteokhae, "what should I do?"), which sounds similar, may be a negative signal. Negative signals may therefore be preset in order to prevent the neural network from misrecognizing a keyword-like utterance as the keyword.
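
One way to realize the negative-signal behavior is to give each negative signal its own output class and map that class to the empty value. The sketch below assumes this design; the source does not state how negative signals are represented, and `LABELS`, `NEGATIVE_LABELS`, and `recognize` are hypothetical names.

```python
# Output classes: real keywords plus explicitly trained negative signals.
# "eotteokhae" is the romanized negative example from the text.
LABELS = ["auto care", "turn on the light", "eotteokhae"]
NEGATIVE_LABELS = {"eotteokhae"}

def recognize(scores, labels=LABELS, negatives=NEGATIVE_LABELS):
    """Pick the highest-scoring class; report None when it is a negative signal."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    label = labels[best]
    return None if label in negatives else label
```

With this layout the network is trained to separate "auto care" from its sound-alike, and a win by the sound-alike class produces no recognition result rather than a false trigger.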

In one embodiment, the neural network may further include an analog-to-digital converter (ADC) that converts a voice recognition result in the form of an analog signal into a digital signal. The feature classification result or voice recognition result output from the output layer may thus be converted into a digital signal by the analog-to-digital converter and then output through the input/output interface.

In one embodiment, when it is determined, based on the Mel spectrogram output from the Mel generation layer 820, that the audio signal corresponds to none of the preset keywords, the neural network may refrain from inputting the Mel spectrogram to the hidden layers 830 and 840 and instead output an empty value as the voice recognition result, or may produce no output, receive the next audio signal, and repeat the operations of each layer described above. In other words, when an audio signal is received, the neural network by default performs the operations up to the Fourier transform layer 810 and the Mel generation layer 820, but if it is determined from the Mel spectrogram that the audio signal is not a human voice or does not correspond to any of the preset keywords, no data is sent to the hidden layers 830 and 840 and/or the output layer 850. Because the hidden layers 830 and 840 and/or the output layer 850 need not be computed for every audio signal, the amount of computation is reduced, which raises the computational efficiency of the neuromorphic device and, in a PIM structure, also helps prevent excessive heat generation.

Specifically, the neural network receives an audio signal and, if an inference model trained on predetermined training data indicates that the received input signal is not covered by the trained model data, may stop computing and switch to a low-power mode (sleep mode). That is, when the input signal is merely noise, including everyday ambient sound, unnecessary power consumption can be minimized by not performing the hidden-layer computation 830 and 840. Conversely, when the received input signal is covered by the trained model data, the neural network may perform classification and output the voice recognition result through the subsequent hidden layers 830 and 840 and the output layer 850, and then switch to the low-power mode (sleep mode).
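
The early-exit behavior can be sketched as a gate placed between the Mel generation layer and the hidden layers. The source does not specify the gating criterion, so a simple mean-energy threshold on the Mel frames is assumed here purely for illustration; `mean_frame_energy`, `recognize_with_gate`, and `energy_threshold` are hypothetical.

```python
def mean_frame_energy(mel_frames):
    """Average total energy per Mel frame (assumed gating statistic)."""
    if not mel_frames:
        return 0.0
    return sum(sum(frame) for frame in mel_frames) / len(mel_frames)

def recognize_with_gate(mel_frames, hidden_and_output, energy_threshold=1.0):
    """Run the hidden and output layers only when the gate opens.

    Returns (result, ran_hidden_layers). When the gate stays closed the result
    is None and the costly layers are never invoked, after which the device
    could enter its low-power mode.
    """
    if mean_frame_energy(mel_frames) < energy_threshold:
        return None, False
    return hidden_and_output(mel_frames), True
```

A trained detector could replace the energy test without changing the control flow: the point is only that the Fourier and Mel stages always run, while everything after them is conditional.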

Meanwhile, the layers 810 to 850 of the neural network need not be physically separate; each may instead refer to a core whose synaptic variable-resistance element values have been changed according to the weights of that layer. Referring again to FIG. 3 for explanation, each of the layers 810 to 850 may be implemented by the neural network 330.

For example, when an audio signal is received through the input/output interface 310, it is converted into an analog signal by the digital-to-analog converter 320 and input to the neural network 330, whose synaptic variable-resistance element values have been set according to the weights of the Fourier transform layer 810, and a frequency signal is generated in which the audio signal has been converted from the time domain to the frequency domain. The frequency signal may then be converted into a digital signal by the analog-to-digital converter 340 and input to the operation circuit 350. Next, the output of the operation circuit 350 is converted back into an analog signal by the digital-to-analog converter 320, and this frequency signal is input to the neural network 330, whose synaptic variable-resistance element values have now been set according to the weights of the Mel generation layer 820, to generate the Mel spectrogram of the frequency signal. The Mel spectrogram may then be converted into a digital signal by the analog-to-digital converter 340 and input to the operation circuit 350. Likewise, the features of the audio signal are classified and the voice recognition result is output by the neural network 330 with its synaptic variable-resistance element values set according to the weights of each of the one or more hidden layers 830 and 840 and the output layer 850; because this process repeats the one described above, its description is omitted.
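
The loop described above, in which one crossbar is reused layer by layer with a converter on each side, can be sketched as follows. This is a behavioral model under stated assumptions, not the circuit itself: the converters are modeled as a clipper and a uniform quantizer, the operation circuit 350 is assumed to apply a ReLU (the source does not say what it computes), and all function names are hypothetical.

```python
def dac(samples, full_scale=1.0):
    """Digital-to-analog step: clip digital values to the assumed analog range."""
    return [max(-full_scale, min(full_scale, float(s))) for s in samples]

def adc(currents, step=1.0 / 128):
    """Analog-to-digital step: uniform quantization with an assumed step size."""
    return [round(c / step) * step for c in currents]

def crossbar(weights, voltages):
    """Matrix-vector product performed by the array: one weighted sum per output line."""
    return [sum(w * v for w, v in zip(row, voltages)) for row in weights]

def relu(values):
    """Assumed post-processing in the operation circuit."""
    return [max(0.0, v) for v in values]

def run_network(audio, layer_weights, post_process=relu):
    """Reuse a single crossbar for every layer by loading that layer's weights in turn."""
    data = list(audio)
    for weights in layer_weights:        # Fourier, Mel, hidden..., output
        analog_in = dac(data)
        analog_out = crossbar(weights, analog_in)
        digital = adc(analog_out)
        data = post_process(digital)
    return data
```

As a usage example, `run_network([0.25, 0.75], [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]])` passes the input through an identity layer and then an averaging layer.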

In one embodiment, the neural network may have been trained on predetermined training data. For example, the predetermined training data may be audio data generated by combining a plurality of preset keywords with a noise set. In the field of voice recognition, a neural network with good recognition performance is typically built by training repeatedly on data in which the preset keywords are mixed with noise such as household sounds. However, building a network with good keyword recognition performance using only keyword speech recorded in heavy noise may require an enormous amount of training data and many training passes. In contrast, if the neural network is first trained on the preset keyword speech alone and is then trained on a data set generated by combining that keyword speech with noise data, less training data is needed than with the approach described above, which is more efficient and also reduces the cost of producing the training data.
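
The two-stage data preparation described above can be sketched as follows: stage one uses clean keyword clips, and stage two mixes the same clips with noise. The source does not say how keyword speech and noise are combined, so mixing at a target signal-to-noise ratio is assumed; `mix_at_snr`, `measured_snr_db`, `build_stages`, and the 10 dB default are hypothetical.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a clip."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def mix_at_snr(keyword, noise, snr_db):
    """Add noise to a keyword clip, scaled so the mix has the requested SNR in dB.
    The noise clip must not be all zeros."""
    gain = rms(keyword) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    return [k + gain * n for k, n in zip(keyword, noise)]

def measured_snr_db(clean, mixed):
    """SNR of a mix, recovered from the clean clip and the mixed clip."""
    residual = [m - c for c, m in zip(clean, mixed)]
    return 20.0 * math.log10(rms(clean) / rms(residual))

def build_stages(keyword_clips, noise_clips, snr_db=10.0, seed=0):
    """Stage 1: clean (label, clip) pairs. Stage 2: the same clips mixed with noise."""
    rng = random.Random(seed)
    stage1 = list(keyword_clips)
    stage2 = [(label, mix_at_snr(clip, rng.choice(noise_clips), snr_db))
              for label, clip in keyword_clips]
    return stage1, stage2
```

Since the noisy set is synthesized from recordings that already exist, no separate noisy recordings have to be collected, which is consistent with the cost argument in the paragraph above.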

In one embodiment, the one or more hidden layers 830 and 840 may consist of weights adjusted on the basis of the predetermined training data. The computation of the one or more hidden layers 830 and 840 is performed with the resistance value of each variable-resistance element in the synapses of the neural network serving as a weight. Each weight may be adjusted on the basis of the predetermined training data described above, that is, audio data generated by combining the preset keyword speech with a noise set.
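
How resistance values can act as weights may be illustrated with Ohm's law and current summation: each cell contributes a current G·V, and the currents on a shared line add up to a weighted sum. The sketch below additionally assumes a differential pair of conductances per weight so that negative weights can be represented; the source does not describe this mapping, and the conductance range and function names are hypothetical.

```python
def weights_to_conductances(weights, g_min=1e-6, g_max=1e-4):
    """Map signed weights onto two non-negative conductance arrays (in siemens).

    A positive weight raises the 'plus' cell and a negative weight raises the
    'minus' cell; the effective weight is proportional to their difference.
    Returns (g_pos, g_neg, scale), where scale is siemens per unit weight.
    """
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = (g_max - g_min) / w_max
    g_pos = [[g_min + scale * max(w, 0.0) for w in row] for row in weights]
    g_neg = [[g_min + scale * max(-w, 0.0) for w in row] for row in weights]
    return g_pos, g_neg, scale

def crossbar_currents(g_pos, g_neg, voltages):
    """Output current per line: sum over inputs of (G+ - G-) * V."""
    return [sum((gp - gn) * v for gp, gn, v in zip(row_p, row_n, voltages))
            for row_p, row_n in zip(g_pos, g_neg)]
```

Dividing the output currents by `scale` recovers the ordinary weighted sums, so training can adjust numeric weights and the result can then be written into the array as conductances.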

FIG. 9 is a diagram showing an example implementation of a crossbar array circuit according to an embodiment.

Referring to FIG. 9, a crossbar array circuit 900 implementing a neural network may include a plurality of sub-circuits 910, 920, and 930. Each sub-circuit 910, 920, 930 is a circuit made up of a combination of the cores that constitute the crossbar array circuit 900, and the synaptic weights may be set so that each sub-circuit corresponds to one of the layers of the neural network.

In one embodiment, the crossbar array circuit 900 may include a first sub-circuit 910, a second sub-circuit 920, and a third sub-circuit 930. In this case, the first sub-circuit 910 may implement the Fourier transform layer, the second sub-circuit 920 the Mel generation layer, and the third sub-circuit 930 the output layer.

In one embodiment, weights corresponding to the Fourier transform layer may be stored in the synapses of the first sub-circuit 910. For example, the first sub-circuit 910 may generate the frequency signal by receiving the time-domain audio signal as input data through pre-synaptic neurons and transmitting, as output data through post-synaptic neurons, the synaptic signals that have passed through the synapses storing the weights corresponding to the Fourier transform layer.
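
A Fourier transform can be expressed as fixed weights because the discrete Fourier transform is a matrix-vector product. The sketch below builds separate cosine and sine weight matrices, each of which could in principle be stored as synaptic weights. It is an illustration only: the source does not state which transform size or representation is used, and `dft_weights`, `matvec`, and `power_spectrum` are hypothetical names.

```python
import math

def dft_weights(n):
    """Real-valued DFT weights: cos rows give the real part, -sin rows the imaginary part."""
    cos_w = [[math.cos(2.0 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    sin_w = [[-math.sin(2.0 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    return cos_w, sin_w

def matvec(weights, x):
    """Weighted sums, i.e. what a crossbar computes for one input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def power_spectrum(frame):
    """Squared DFT magnitude of one time-domain frame."""
    cos_w, sin_w = dft_weights(len(frame))
    re = matvec(cos_w, frame)
    im = matvec(sin_w, frame)
    return [r * r + i * i for r, i in zip(re, im)]
```

For an 8-sample frame containing exactly two cycles of a cosine, the energy concentrates in bins 2 and 6, which is the time-domain to frequency-domain conversion the paragraph describes.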

In one embodiment, weights corresponding to the Mel generation layer may be stored in the synapses of the second sub-circuit 920. For example, the second sub-circuit 920 may generate the Mel spectrogram by receiving the frequency signal as input data through pre-synaptic neurons and transmitting, as output data through post-synaptic neurons, the synaptic signals that have passed through the synapses storing the weights corresponding to the Mel generation layer.

In one embodiment, weights corresponding to the output layer may be stored in the synapses of the third sub-circuit 930. For example, the third sub-circuit 930 may output the voice recognition result by receiving the Mel spectrogram, or features of the Mel spectrogram, as input data through pre-synaptic neurons and transmitting, as output data through post-synaptic neurons, the synaptic signals that have passed through the synapses storing the weights corresponding to the output layer.

Likewise, in addition to the first to third sub-circuits 910, 920, and 930, the crossbar array circuit 900 may further include another sub-circuit (not shown) that implements a hidden layer. This sub-circuit may be a circuit whose synapses store the weights corresponding to the hidden layer. For example, the sub-circuit (not shown) may classify the voice features of the Mel spectrogram by receiving the Mel spectrogram as input data through pre-synaptic neurons and transmitting, as output data through post-synaptic neurons, the synaptic signals that have passed through the synapses storing the weights corresponding to the hidden layer.

Meanwhile, each sub-circuit 910, 920, 930 may receive its input data not through pre-synaptic neurons but through row lines or column lines connected to the sub-circuit that implements the previous layer. Similarly, each sub-circuit 910, 920, 930 may transmit its output data not through post-synaptic neurons but through row lines or column lines connected to the sub-circuit that implements the next layer. The manner in which each sub-circuit sends and receives input and output data is not limited to these examples. In addition, although FIG. 9 shows the crossbar array circuit 900 as including three sub-circuits 910, 920, and 930, the number of sub-circuits is not limited thereto, nor are the area, synapse configuration, or position of the sub-circuits within the crossbar array circuit 900.

The crossbar array circuit 900 according to the embodiment described above may be included in an on-chip memory and implement the neural network through PIM-based computation. Such a neuromorphic device not only reduces chip volume but also delivers high performance relative to its power consumption, so that voice recognition results for a relatively small number of keywords can be output very efficiently.

FIG. 10 is a flowchart of a method of operating a neuromorphic device according to an embodiment.

In step 1010, the neuromorphic device (hereinafter, the "device") may receive an audio signal.

In step 1020, the device may input the audio signal to a neural network trained on predetermined training data and output a voice recognition result.

In one embodiment, the neural network may include a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a Mel generation layer that generates a Mel spectrogram from the frequency signal, and an output layer that outputs the voice recognition result based on the voice features of the Mel spectrogram.

In one embodiment, the neural network may further include one or more hidden layers that classify the voice features of the Mel spectrogram. In this case, the one or more hidden layers may be located between the Mel generation layer and the output layer.

In one embodiment, the output layer may output the voice recognition result by determining, from the result of the hidden layers classifying the voice features, whether the audio signal corresponds to any one of a plurality of preset keywords.

In one embodiment, the predetermined training data may be audio data generated by combining a plurality of preset keywords with a noise set, and the one or more hidden layers may consist of weights adjusted on the basis of the predetermined training data.

In one embodiment, when the audio signal corresponds to a negative signal that resembles at least one of the preset keywords, the neural network may output an empty value (null) as the voice recognition result.

In one embodiment, the device may be one in which memory cells (or an on-chip memory) and a processor are directly connected to perform PIM (Processing-in-Memory) based computation.

In one embodiment, the device may convert the audio signal, which is a digital signal, into an analog signal and input it to the Fourier transform layer, and may convert the voice recognition result output from the output layer, which is an analog signal, into a digital signal.

In one embodiment, in response to determining, based on the Mel spectrogram, that the audio signal corresponds to none of the preset keywords, the device may refrain from inputting the Mel spectrogram to the hidden layers.

FIG. 11 is a block diagram of a neuromorphic device according to an embodiment of the present invention. In the following description, explanations that duplicate those given above for FIGS. 1 to 10 are omitted.

The processor 1110 controls the overall functions for operating the neuromorphic device 1100. For example, the processor 1110 controls the neuromorphic device 1100 as a whole by executing programs stored in the memory 1120 within the neuromorphic device 1100. The processor 1110 may be implemented as a CPU (central processing unit), GPU (graphics processing unit), AP (application processor), or the like provided in the neuromorphic device 1100, but is not limited thereto.

The memory 1120 is hardware that stores the various data processed within the neuromorphic device 1100; for example, it may store data that has been processed and data that is to be processed by the neuromorphic device 1100. The memory 1120 may also store applications, drivers, and the like to be run by the neuromorphic device 1100. The memory may include RAM (random access memory) such as DRAM (dynamic random access memory) or SRAM (static random access memory), ROM (read-only memory), EEPROM (electrically erasable programmable read-only memory), CD-ROM, Blu-ray or other optical disc storage, an HDD (hard disk drive), an SSD (solid state drive), or flash memory.

Unlike the neural network 1 shown in FIG. 1, the actual neural network run on the neuromorphic device 1100 may be implemented with a more complex architecture. The processor 1110 therefore performs a very large number of operations (operation count), ranging from hundreds of millions to tens of billions, and the frequency with which the processor 1110 accesses the memory 1130 for those operations may increase dramatically as well.

Accordingly, the memory 1120 may be an on-chip memory. The neuromorphic device 1100 according to an embodiment may be provided with the memory 1120 only in on-chip form, so that it can perform computation without accessing an external memory 1130. For example, the memory 1120 may be SRAM implemented as on-chip memory. In this case, contrary to the description above, the types of memory mainly used as the external memory 1130, such as DRAM, ROM, HDD, and SSD, may not be used as the memory 1120.

The processor 1110 may read neural network data, for example voice data (audio signals), from the memory 1120 and write such data to it, and may run the neural network using the data read or written. When the neural network is run, the processor 1110 may repeatedly perform operations on the audio signal in order to generate output data. That is, it may receive an audio signal at a predetermined period and repeat the layer operations described above in each period.

In one embodiment, the neuromorphic device 1100 may be a server. The server may be implemented as a computer device, or a plurality of computer devices, that communicates over a network to provide commands, code, files, content, services, and the like. The server may receive the data needed for voice recognition and perform voice recognition on the basis of the received data.

Meanwhile, embodiments according to the present invention may be implemented in the form of a computer program executable through various components on a computer, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

The computer program may be one specially designed and configured for the present invention, or one known and available to those skilled in the field of computer software. Examples of computer programs include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

According to one embodiment, the methods according to the various embodiments of the present disclosure may be provided as part of a computer program product. A computer program product is a commodity that can be traded between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices. In the case of online distribution, at least part of the computer program product may be at least temporarily stored, or temporarily created, in a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Unless an order is explicitly stated for the steps constituting the method according to the present invention, or there is a statement to the contrary, the steps may be performed in any suitable order. The present invention is not necessarily limited to the order in which the steps are described. All examples and exemplary terms used in the present invention serve only to describe the invention in detail, and the scope of the invention is not limited by those examples or terms except as limited by the claims. Those skilled in the art will also appreciate that the invention may be configured according to design conditions and factors within the scope of the appended claims, with various modifications, combinations, and changes, or their equivalents.

Accordingly, the spirit of the present invention should not be confined to the embodiments described above; not only the claims set out below but also everything equivalent to, or equivalently modified from, those claims falls within the scope of the spirit of the present invention.

Claims

A neuromorphic device implementing a neural network, comprising:
at least one processor that runs the neural network; and
an on-chip memory including a crossbar array circuit that receives instructions from the at least one processor and performs in-memory operations,
wherein the at least one processor
receives an audio signal, and
inputs the audio signal to the neural network, which has been trained on predetermined training data, and outputs a voice recognition result,
wherein the neural network includes
a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a Mel generation layer that generates a Mel spectrogram from the frequency signal, and an output layer that outputs the voice recognition result based on voice features of the Mel spectrogram, and
wherein the neural network further includes
one or more hidden layers that classify the voice features of the Mel spectrogram, the one or more hidden layers being located between the Mel generation layer and the output layer.
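
The layer order recited in the claim above (Fourier transform layer, Mel generation layer, hidden layers, output layer) can be illustrated with a minimal software sketch. Everything specific here is an assumption made for illustration and is not taken from the disclosure: the function names, the naive DFT, the triangular mel filters using the common 2595·log10(1 + f/700) mel formula, the single ReLU hidden layer, the softmax output, and the placeholder weights. The sketch runs on a CPU with the Python standard library and does not model the crossbar array or in-memory operations.

```python
import cmath
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def fourier_layer(frame):
    """Time domain -> frequency domain: power of DFT bins 0..N/2 (naive DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, expressed as fractional DFT bin positions
    edges = [mel_to_hz(top * i / (n_mels + 1)) * n_fft / sample_rate
             for i in range(n_mels + 2)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m:m + 3]
        bank.append([
            (b - lo) / (mid - lo) if lo < b <= mid
            else (hi - b) / (hi - mid) if mid < b < hi
            else 0.0
            for b in range(n_fft // 2 + 1)
        ])
    return bank

def mel_layer(power, bank):
    """One log-mel spectrogram column from one power spectrum."""
    return [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
            for row in bank]

def dense(x, weights, bias):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def recognize(frame, bank, hidden, output):
    """Fourier layer -> Mel layer -> hidden layer -> output layer."""
    mel = mel_layer(fourier_layer(frame), bank)
    h = relu(dense(mel, *hidden))       # hidden layer sits between Mel and output
    return softmax(dense(h, *output))   # one probability per keyword class

# Usage with placeholder (untrained) weights: 64-sample frame, 4 mel bands, 2 classes.
frame = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
bank = mel_filterbank(n_mels=4, n_fft=64, sample_rate=16000)
hidden = ([[0.1] * 4 for _ in range(3)], [0.0] * 3)
output = ([[0.2] * 3 for _ in range(2)], [0.0, 0.1])
probs = recognize(frame, bank, hidden, output)
```

In a trained system the `hidden` and `output` weights would come from training on keyword audio; here they only make the data flow executable.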

The neuromorphic device of claim 1,
wherein the at least one processor
does not input the Mel spectrogram to the hidden layers in response to determining, based on the Mel spectrogram, that the audio signal does not correspond to any of a plurality of preset keywords.

The neuromorphic device of claim 2,
wherein the at least one processor
outputs a null value as the voice recognition result in response to determining that the audio signal does not correspond to any of the plurality of preset keywords.

The neuromorphic device of claim 2,
wherein the at least one processor,
in response to determining that the audio signal does not correspond to any of the plurality of preset keywords, receives a next audio signal without outputting the voice recognition result.

The neuromorphic device of claim 2,
wherein the at least one processor
switches to a low power mode in response to determining that the audio signal does not correspond to any of the plurality of preset keywords.
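
The gating behavior in the dependent claims above (skipping the hidden layers, returning a null result, and entering a low power mode when no preset keyword is detected) can be sketched as a small control wrapper. The claims do not state how the "does not correspond to any keyword" determination is made, so the `precheck` callable below is a hypothetical stand-in supplied by the caller; `KeywordGate`, `PowerMode`, and `hidden_calls` are likewise illustrative names, not terms from the disclosure.

```python
from enum import Enum
from typing import Callable, Optional, Sequence

class PowerMode(Enum):
    ACTIVE = "active"
    LOW_POWER = "low_power"

class KeywordGate:
    """Runs the classifier only when a cheap pre-check on the Mel data passes."""

    def __init__(self,
                 precheck: Callable[[Sequence[float]], bool],
                 classify: Callable[[Sequence[float]], str]):
        self.precheck = precheck      # hypothetical: True if a keyword may be present
        self.classify = classify      # stands in for hidden layers + output layer
        self.mode = PowerMode.ACTIVE
        self.hidden_calls = 0         # how many times the hidden layers were used

    def process(self, mel: Sequence[float]) -> Optional[str]:
        if not self.precheck(mel):
            # No preset keyword: do not feed the hidden layers,
            # return a null result, and drop to low power.
            self.mode = PowerMode.LOW_POWER
            return None
        self.mode = PowerMode.ACTIVE
        self.hidden_calls += 1
        return self.classify(mel)

# Usage with toy stand-ins: an energy threshold as pre-check, a fixed label as classifier.
gate = KeywordGate(precheck=lambda mel: max(mel) > 0.5,
                   classify=lambda mel: "on")
results = [gate.process(mel) for mel in ([0.1, 0.2], [0.9, 0.3], [0.0, 0.1])]
```

A caller that receives `None` can simply move on to the next audio frame, which corresponds to the alternative of receiving a next audio signal without outputting a result.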

The neuromorphic device of claim 1,
wherein the predetermined training data
is audio data generated based on at least one of a plurality of preset keywords and a noise set.

The neuromorphic device of claim 6,
wherein the neural network
undergoes primary training using audio data of the plurality of preset keywords as training data and, after the primary training, undergoes secondary training using, as training data, an audio data set generated by combining the plurality of preset keywords with the noise set.
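
The two training stages described above can be illustrated by how the two data sets might be assembled: clean keyword audio for the first stage, and every keyword combined with every noise clip for the second. The additive mixing, the `noise_gain` parameter, and the function names are assumptions for illustration; the claims say only that the second data set is generated from a combination of the keywords and the noise set.

```python
def mix(clean, noise, noise_gain):
    """Add a noise clip (looped to the keyword length) to a keyword waveform."""
    return [c + noise_gain * noise[i % len(noise)] for i, c in enumerate(clean)]

def build_stage_datasets(keywords, noises, noise_gain=0.5):
    """Return (stage1, stage2) lists of (waveform, label) pairs.

    stage1: clean keyword audio only (first training pass).
    stage2: each keyword combined with each noise clip (second training pass).
    """
    stage1 = [(list(wave), label) for label, wave in keywords.items()]
    stage2 = [(mix(wave, noise, noise_gain), label)
              for label, wave in keywords.items()
              for noise in noises]
    return stage1, stage2

# Usage with tiny toy waveforms (real data would be recorded keyword audio).
keywords = {"on": [1.0, 0.0, -1.0], "off": [0.5, 0.5, 0.5]}
noises = [[0.2, -0.2]]
stage1, stage2 = build_stage_datasets(keywords, noises, noise_gain=0.5)
```

A training loop would then fit the network on `stage1` first and continue fitting on `stage2`, keeping the same labels so that the noisy samples teach robustness rather than new classes.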

The neuromorphic device of claim 1,
wherein the at least one processor,
when the audio signal corresponds to a negative signal similar to at least one of a plurality of preset keywords, outputs a null value as the voice recognition result or receives a next audio signal without outputting a result.

A method of operating a neuromorphic device implementing a neural network, the method comprising:
receiving an audio signal; and
inputting the audio signal to a neural network trained on predetermined training data and outputting a voice recognition result,
wherein the neural network includes
a Fourier transform layer that converts the audio signal from the time domain to the frequency domain to generate a frequency signal, a Mel generation layer that generates a Mel spectrogram from the frequency signal, and an output layer that outputs the voice recognition result based on voice features of the Mel spectrogram, and
wherein the neural network further includes
one or more hidden layers that classify the voice features of the Mel spectrogram, the one or more hidden layers being located between the Mel generation layer and the output layer.

A computer-readable recording medium on which is recorded a program for executing the method of claim 9 on a computer.