KR102491202B1 - Method, system and non-transitory computer-readable recording medium for performing operations of artificial neural network

Info

Publication number: KR102491202B1
Application number: KR1020190112054A
Authority: KR
Inventors: 신동주
Original assignee: 주식회사 모빌린트
Priority date: 2019-09-10
Filing date: 2019-09-10
Publication date: 2023-01-25
Also published as: KR20210030654A; WO2021049829A1

Abstract

According to one aspect of the present invention, there is provided a method for performing an artificial neural network operation, comprising the steps of: (a) acquiring, from an external memory, information on an input layer and on weights associated with the operation of the artificial neural network; (b) causing an operation result, generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information, to be transmitted to and stored in the external memory; and (c) when the operations for a plurality of layers associated with the input layer are completed by repeatedly performing steps (a) and (b), causing an output layer based on the operation result to be transmitted to and stored in the external memory, wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to at least one of the number of times the external memory needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into the plurality of dimensions.

Description

Method, system and non-transitory computer readable recording medium for performing artificial neural network operation

The present invention relates to a method, a system, and a non-transitory computer-readable recording medium for performing artificial neural network operations.

Artificial neural networks are inspired by the neurons of the brain and their connection structure, and owing to their overwhelmingly high performance and versatility they are being applied in many fields. They have brought significant gains in accuracy in application areas such as vision, speech, and language, and have recently shown human-level or better performance in various fields. This performance stems from the ability to extract features through statistical learning on large amounts of data, an approach that differs from conventional algorithms, which typically rely on features or rules devised from human experience and intuition.

However, running such artificial neural network algorithms requires an enormous amount of computation, so it is difficult to meet the performance required by applications that use artificial neural networks with general-purpose processors such as CPUs or GPUs alone. Accordingly, active research is under way on dedicated artificial neural network computing devices that can process artificial neural networks far more efficiently than a CPU or GPU.

In configuring such a computing device, its internal memory capacity is necessarily limited, so an external memory is generally provided. However, accessing the external memory costs far more than accessing the internal memory in terms of power, latency, and the like. In other words, minimizing external memory access during artificial neural network operations is directly tied to improving the performance of the computing device.

In particular, when artificial neural network operations are performed on high-definition image data in automobiles, drones, TVs, and other products where the need for artificial neural networks has recently grown, the amount of data generated during computation becomes enormous, and efficiently managing the resulting external memory accesses is a major challenge.

Accordingly, the present inventor(s) propose a novel and inventive technique that can keep the processing speed at or above a certain level while minimizing access to the external memory during artificial neural network operations.

An object of the present invention is to solve all of the problems of the prior art described above.

Another object of the present invention is to derive an optimal layer division scheme and operation order for artificial neural network operations, thereby keeping the processing speed at or above a certain level while minimizing access to the external memory.

Representative configurations of the present invention for achieving the above objects are as follows.

In addition, according to another aspect of the present invention, there is provided a system for performing an artificial neural network operation, comprising: (a) a buffer unit that acquires, from an external memory, information on an input layer and on weights associated with the operation of the artificial neural network; (b) an operation management unit that causes an operation result, generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information, to be transmitted to and stored in the external memory; and (c) an output layer management unit that, when the operations for a plurality of layers associated with the input layer are completed by repeatedly performing the processes of (a) and (b), causes an output layer based on the operation result to be transmitted to and stored in the external memory, wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to at least one of the number of times the external memory needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into the plurality of dimensions.

In addition, there are further provided other methods and other systems for implementing the present invention, as well as a non-transitory computer-readable recording medium on which a computer program for executing the method is recorded.

According to the present invention, an optimal layer division scheme and operation order are derived for artificial neural network operations, so that the processing speed can be kept at or above a certain level while access to the external memory is minimized.

FIG. 1 is a diagram schematically showing the configuration of an entire system for performing artificial neural network operations according to one embodiment of the present invention.
FIG. 2 is a diagram illustratively showing the internal configuration of an artificial neural network operation system according to one embodiment of the present invention.
FIG. 3 is a diagram illustratively showing a process in which artificial neural network operations are performed using a limited internal memory according to one embodiment of the present invention.

The following detailed description of the present invention refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, are not necessarily mutually exclusive. For example, specific shapes, structures, and characteristics described herein may be modified from one embodiment to another without departing from the spirit and scope of the invention. It should also be understood that the position or arrangement of individual elements within each embodiment may be changed without departing from the spirit and scope of the invention. Therefore, the following detailed description is not to be taken in a limiting sense, and the scope of the invention is to be taken as encompassing the scope claimed by the appended claims and all equivalents thereof. In the drawings, like reference numerals denote the same or similar elements throughout the several views.

In this specification, an artificial neural network (ANN) is a concept that encompasses the statistical learning algorithms in machine learning and cognitive science that are inspired by biological neural networks, and may refer to any model in which a plurality of artificial neurons, networked through synaptic connections, acquire problem-solving ability by changing the strength of those connections (e.g., weights) through learning. For example, such a model may have a hierarchical structure comprising an input layer, a plurality of hidden layers, and an output layer.

Hereinafter, several preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can easily practice it.

Configuration of the entire system

FIG. 1 is a diagram schematically showing the configuration of an entire system for performing artificial neural network operations according to one embodiment of the present invention.

As shown in FIG. 1, the entire system according to one embodiment of the present invention may include a communication network 100, an artificial neural network operation system 200, and an external memory 300.

First, according to one embodiment of the present invention, the communication network 100 may refer to a bus that transmits and receives data within a single system (e.g., a chip or a memory) or in an interface circuit between a plurality of systems (e.g., chip-to-chip, chip-to-memory, or memory-to-memory).

Next, the artificial neural network operation system 200 according to one embodiment of the present invention may communicate with the external memory 300, described later, through the communication network 100, and may perform the functions of: (a) acquiring, from the external memory 300, information on an input layer and on weights associated with the operation of the artificial neural network; (b) causing an operation result, generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information, to be transmitted to and stored in the external memory 300; and (c) when the operations for a plurality of layers associated with the input layer are completed by repeatedly performing processes (a) and (b), causing an output layer based on the operation result to be transmitted to and stored in the external memory 300.

In addition, according to one embodiment of the present invention, the artificial neural network operation system 200 may determine the dimension division level (e.g., the number of dimension divisions) and the operation processing order of the at least one hidden layer based on at least one of the number of times the external memory 300 needs to be accessed (e.g., read or written) and the degree of redundant processing that arises as the at least one hidden layer is divided into a plurality of dimensions.

The functions of the artificial neural network operation system 200 are described in more detail below. Although the artificial neural network operation system 200 has been described as above, this description is illustrative, and at least some of the functions or components required of the artificial neural network operation system 200 may, as needed, be realized in a digital device that has memory means and a microprocessor and thus has computing capability, such as a smartphone or tablet PC, or in an IC chip, or may be included in an external system (not shown).

Next, the external memory 300 according to one embodiment of the present invention may communicate with the artificial neural network operation system 200 through the communication network 100, and may store information including the input data of each input neuron associated with the artificial neural network and the synaptic weights for performing operations from those input neurons. For example, according to one embodiment of the present invention, the external memory 300 may include a volatile memory such as DDR-SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory).

Configuration of the artificial neural network operation system

The internal configuration of the artificial neural network operation system 200, which performs functions important to the implementation of the present invention, and the function of each of its components are described below.

FIG. 2 is a diagram illustratively showing the internal configuration of the artificial neural network operation system 200 according to one embodiment of the present invention.

Referring to FIG. 2, the artificial neural network operation system 200 according to one embodiment of the present invention may include a buffer unit 210, an operation management unit 220, an output layer management unit 230, a communication unit 240, and a control unit 250. According to one embodiment of the present invention, at least some of the buffer unit 210, the operation management unit 220, the output layer management unit 230, the communication unit 240, and the control unit 250 may be program modules that communicate with an external system (not shown). These program modules may be included in the artificial neural network operation system 200 in the form of an operating system, application program modules, and other program modules, and may be physically stored on various known storage devices. They may also be stored in a remote storage device capable of communicating with the artificial neural network operation system 200. These program modules encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types as described later in accordance with the present invention.

First, the buffer unit 210 according to one embodiment of the present invention may acquire, from the external memory 300, information on an input layer and on synaptic weights associated with the operation of the artificial neural network. The buffer unit 210 may also temporarily store information inside the artificial neural network operation system 200, and, depending on its storage capacity, may hold all or part of the information stored in the external memory 300.
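
The statement that the buffer unit holds all or part of the externally stored information, depending on its capacity, can be illustrated with a minimal sketch. The function name `plan_buffer_loads` and the byte-based sizing are hypothetical and are not taken from the patent; the sketch only shows how a transfer larger than the buffer could be broken into capacity-sized pieces.

```python
def plan_buffer_loads(total_bytes, buffer_capacity):
    """Split a transfer of total_bytes into (offset, length) chunks.

    Hypothetical helper: each chunk is at most buffer_capacity bytes, so a
    buffer smaller than the stored data receives it piece by piece.
    """
    if buffer_capacity <= 0:
        raise ValueError("buffer_capacity must be positive")
    chunks = []
    offset = 0
    while offset < total_bytes:
        length = min(buffer_capacity, total_bytes - offset)
        chunks.append((offset, length))
        offset += length
    return chunks
```

When the data fits in the buffer, the plan is a single chunk covering everything; otherwise the number of chunks grows as the buffer shrinks.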

Specifically, according to one embodiment of the present invention, the buffer unit 210 may acquire, from the external memory 300, information on the input layer and on the synaptic weights needed to derive the operation result from that input layer to the next layer of the artificial neural network.

Next, the operation management unit 220 according to one embodiment of the present invention may cause an operation result, generated by dividing (or distributing) at least one hidden layer into a plurality of dimensions and processing it with reference to the information acquired by the buffer unit 210, to be transmitted to and stored in the external memory 300. A dimension according to one embodiment of the present invention may include concepts such as depth and channel used in known artificial neural networks. For example, an RGB image is represented as 224 x 224 x 3 (i.e., width, height, and color channels), and its dimension may be 3. A dimension according to one embodiment of the present invention may also be a concept defined not over the artificial neural network as a whole but over the activation volume, that is, the data corresponding to the input neurons.
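
As a rough illustration of dividing an activation volume along its channel dimension, the sketch below partitions the channel axis of a (height, width, channels) shape into contiguous ranges. The function `split_channels` and the even-split policy are assumptions made for illustration; the patent does not specify how the ranges are chosen.

```python
def split_channels(shape, num_parts):
    """Partition the channel axis of a (height, width, channels) volume.

    Hypothetical helper: returns a list of (height, width, (start, stop))
    tuples whose half-open channel ranges cover the volume without overlap.
    """
    height, width, channels = shape
    if not 1 <= num_parts <= channels:
        raise ValueError("num_parts must be between 1 and the channel count")
    base, extra = divmod(channels, num_parts)
    parts = []
    start = 0
    for index in range(num_parts):
        # The first `extra` parts take one additional channel each.
        size = base + (1 if index < extra else 0)
        parts.append((height, width, (start, start + size)))
        start += size
    return parts
```

For the 224 x 224 x 3 example in the text, splitting into three parts yields one part per color channel.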

In addition, the operation management unit 220 according to one embodiment of the present invention may determine the dimension division level and the operation processing order of the at least one hidden layer based on at least one of the number of times the external memory 300 needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into a plurality of dimensions (for example, this may refer to the amount of overhead incurred for the regions where the divided parts of a layer overlap).
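
The patent does not disclose a formula for weighing external memory accesses against redundant processing, so the following is only a sketch under stated assumptions. It assumes a layer is split by rows, that each part needs `kernel - 1` extra overlapping rows, that each part costs one read and one write to external memory, and that the two criteria are combined as a weighted sum. The names `SplitPlan`, `evaluate_row_split`, `choose_split`, and the weights `access_cost` and `redundancy_cost` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitPlan:
    num_parts: int
    external_accesses: int  # assumed: one read and one write per part
    redundant_rows: int     # rows loaded twice because parts overlap


def evaluate_row_split(height, kernel, num_parts, max_rows):
    """Return a SplitPlan, or None if one part exceeds internal memory."""
    halo = kernel - 1
    rows_per_part = -(-height // num_parts)  # ceiling division
    tile_rows = height if num_parts == 1 else min(height, rows_per_part + halo)
    if tile_rows > max_rows:
        return None
    return SplitPlan(num_parts, 2 * num_parts, (num_parts - 1) * halo)


def choose_split(height, kernel, max_rows,
                 access_cost=10.0, redundancy_cost=1.0, max_parts=64):
    """Pick the feasible split with the lowest assumed weighted cost."""
    best_cost, best_plan = None, None
    for num_parts in range(1, min(height, max_parts) + 1):
        plan = evaluate_row_split(height, kernel, num_parts, max_rows)
        if plan is None:
            continue
        cost = (access_cost * plan.external_accesses
                + redundancy_cost * plan.redundant_rows)
        if best_cost is None or cost < best_cost:
            best_cost, best_plan = cost, plan
    if best_plan is None:
        raise ValueError("no split fits in internal memory")
    return best_plan
```

With a 224-row layer, a 3-row kernel, and room for 64 rows, this model selects four parts: three parts would need 77 rows each, while four need 58. A real device would replace both the feasibility check and the weights with measured costs.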

Next, the output layer management unit 230 according to one embodiment of the present invention may repeat two processes: the process in which the buffer unit 210 acquires and stores information from the external memory 300, and the process in which an operation result, generated by dividing the information provided by the buffer unit 210 into a plurality of hidden layers and processing it, is transmitted to and stored in the external memory 300. When the operations for a plurality of layers associated with the input layer are thereby completed, the output layer management unit 230 may cause an output layer based on the operation result to be transmitted to and stored in the external memory 300. According to one embodiment of the present invention, the level (e.g., depth) of these layers may be set based on at least one of the number of hidden layers and the number of external memory accesses.

Next, according to one embodiment of the present invention, the communication unit 240 may enable data to be transmitted to and received from the buffer unit 210, the operation management unit 220, and the output layer management unit 230.

Finally, according to one embodiment of the present invention, the control unit 250 may control the flow of data among the buffer unit 210, the operation management unit 220, the output layer management unit 230, and the communication unit 240. That is, the control unit 250 according to the present invention may control the data flow to and from the outside of the artificial neural network operation system 200, or the data flow among its components, so that the buffer unit 210, the operation management unit 220, the output layer management unit 230, and the communication unit 240 each perform their own functions.

FIG. 3 is a diagram illustratively showing a process in which artificial neural network operations are performed using a limited internal memory according to one embodiment of the present invention.

First, according to one embodiment of the present invention, information on all of the synaptic weights required for the artificial neural network operation may be transferred to the external memory 300 from a flash memory (not shown) composed of non-volatile memory, such as an SD (Secure Digital) card.

Then, according to one embodiment of the present invention, information on an input layer and on weights associated with the operation of the artificial neural network may be acquired from the external memory 300 (310).

More specifically, according to one embodiment of the present invention, of the information on weights associated with the operation of the artificial neural network, only the weights for the specific layer that is to be computed from the input layer may be acquired.

Then, according to one embodiment of the present invention, an operation result generated by dividing at least one hidden layer into a plurality of dimensions 321 (e.g., channels) and processing it with reference to the acquired information may be transmitted to and stored in the external memory 300.

In this case, according to one embodiment of the present invention, the dimension division level and the operation processing order of the at least one hidden layer may be determined with reference to at least one of the number of times the external memory 300 needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into a plurality of dimensions.

Then, according to one embodiment of the present invention, the process of acquiring information on an input layer and on weights associated with the operation of the artificial neural network from the external memory 300 (330, 350) and the process of transmitting to and storing in the external memory an operation result generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information (340, 360) are performed repeatedly. When the operations for the plurality of layers associated with the input layer are thereby completed, an output layer based on the operation result may be transmitted to and stored in the external memory 300 (370).
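
The repeated fetch, compute, and store sequence of FIG. 3 can be sketched as a toy simulation. This is not the patented implementation: it assumes small fully connected layers with a ReLU activation, models the external memory 300 as a dictionary that counts accesses, and splits each layer's output neurons into chunks. The names `ExternalMemory` and `run_network` and the key layout `("act", i)` / `("weight", i)` are hypothetical.

```python
class ExternalMemory:
    """Toy stand-in for the external memory 300 that counts accesses."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.writes += 1
        self.data[key] = value


def run_network(mem, num_layers, num_parts):
    """Compute num_layers dense ReLU layers, one layer at a time."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    for layer in range(num_layers):
        # (a) Fetch this layer's input and only this layer's weights.
        x = mem.read(("act", layer))
        w = mem.read(("weight", layer))
        # (b) Process the output neurons in at most num_parts chunks.
        step = max(1, -(-len(w) // num_parts))  # ceiling division
        out = []
        for start in range(0, len(w), step):
            for row in w[start:start + step]:
                out.append(max(0.0, sum(a * b for a, b in zip(row, x))))
        # Store the result; the final write is the output layer, step (c).
        mem.write(("act", layer + 1), out)
```

In this toy model each layer costs two reads and one write regardless of `num_parts`; on real hardware the chunk size would additionally determine how much of the layer must be resident in internal memory at once.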

100: communication network
200: artificial neural network operation system
210: buffer unit
220: operation management unit
230: output layer management unit
240: communication unit
250: control unit
300: external memory

Claims

A method for performing an artificial neural network operation, comprising the steps of:
(a) acquiring, from an external memory, information on an input layer and on synaptic weights associated with the operation of the artificial neural network, wherein the external memory stores information including input data of each input neuron associated with the artificial neural network and synaptic weights for performing an operation from the input neuron, and the information on the synaptic weights associated with the operation of the artificial neural network is information on synaptic weights for deriving an operation result from the input layer to the next layer of the artificial neural network;
(b) causing an operation result, generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information, to be transmitted to and stored in the external memory, wherein the dimension is defined in an activation volume corresponding to data for each input neuron; and
(c) when the operations for a plurality of layers associated with the input layer are completed by repeatedly performing steps (a) and (b), causing an output layer based on the operation result to be transmitted to and stored in the external memory,
wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to the number of times the external memory needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into the plurality of dimensions.

A non-transitory computer-readable recording medium storing a computer program for executing the method according to claim 1.

A system for performing an artificial neural network operation, comprising:
a buffer unit that acquires, from an external memory, information on an input layer and on synaptic weights associated with the operation of the artificial neural network, wherein the external memory stores information including input data of each input neuron associated with the artificial neural network and synaptic weights for performing an operation from the input neuron, and the information on the synaptic weights associated with the operation of the artificial neural network is information on synaptic weights for deriving an operation result from the input layer to the next layer of the artificial neural network;
an operation management unit that causes an operation result, generated by dividing at least one hidden layer into a plurality of dimensions and processing it with reference to the acquired information, to be transmitted to and stored in the external memory, wherein the dimension is defined in an activation volume corresponding to data for each input neuron; and
an output layer management unit that, when the operations for a plurality of layers associated with the input layer are completed by repeatedly performing the process performed by the buffer unit and the process performed by the operation management unit, causes an output layer based on the operation result to be transmitted to and stored in the external memory,
wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to the number of times the external memory needs to be accessed and the degree of redundant processing that arises as the at least one hidden layer is divided into the plurality of dimensions.