KR102247896B1

KR102247896B1 - Convolution neural network parameter optimization method, neural network computing method and apparatus

Info

Publication number: KR102247896B1
Application number: KR1020210011925A
Authority: KR
Inventors: 이상헌; 김명겸; 김주혁
Original assignee: 주식회사 디퍼아이
Priority date: 2019-05-18
Filing date: 2021-01-27
Publication date: 2021-05-04
Also published as: KR102247896B9; KR20210015990A

Abstract

The present invention relates to a convolutional neural network parameter optimization method that uses shape transformation of learned parameters, a convolutional neural network computation method, and an apparatus therefor. The neural network parameter optimization method suitable for hardware implementation according to the invention comprises: transforming the existing parameters of a neural network into sign parameters and magnitude parameters having a single value per channel, using a sign parameter conversion unit and a magnitude parameter conversion unit that constitute a parameter optimization unit; and generating optimized parameters by pruning only the magnitude parameters among the transformed parameters, using a parameter pruning unit that constitutes the parameter optimization unit. The shape transformation and pruning may target the parameters of the convolution layers used in the convolution operations of the neural network.
Accordingly, the invention provides a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that effectively optimize the large amount of computation and the many parameters of a convolutional neural network for hardware implementation, achieving minimal accuracy loss and maximal computation speed.

Description

Convolutional neural network parameter optimization method using shape transformation of learned parameters, convolutional neural network calculation method, and apparatus thereof {Convolution neural network parameter optimization method, neural network computing method and apparatus}

The present invention relates to computation techniques for artificial neural networks and, more particularly, to a neural network parameter optimization method suitable for hardware implementation, a neural network computation method, and an apparatus therefor.

With the recent development of artificial intelligence technology, a variety of industries are applying and adopting it.

In line with this trend, demand is also growing for implementing artificial neural networks, such as convolutional neural networks, as real-time hardware.

However, conventional computation methods for artificial neural networks are difficult to implement in practice because of the large number of parameters and the large amount of computation that a convolutional neural network involves.

To address this, parameter optimization methods that reduce the amount of computation relative to accuracy have been proposed, such as omitting unnecessary parameters through training and reducing the number of bits per parameter.

However, conventional parameter optimization methods are limited to software computation methods that do not take hardware implementation into account, so applying them alone is not enough to implement a convolutional neural network as real-time hardware.

Accordingly, there is a pressing need for a practical, applicable computation technique for artificial neural networks that is suited to hardware implementation.

Korean Laid-open Patent Publication No. KR 10-2011-0027916 (published 2011.03.17)

The present invention was devised to solve the above problems. An object of the invention is to provide a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that effectively optimize the large amount of computation and the many parameters of a convolutional neural network for hardware implementation, so as to obtain minimal accuracy loss and maximal computation speed.

Another object is to provide a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that can correct accuracy at a minimal cost in computation speed.

A neural network parameter optimization method according to an embodiment of the present invention includes: transforming the existing parameters of a neural network into sign parameters and magnitude parameters having a single value per channel; and generating optimized parameters by pruning the transformed magnitude parameters.

The sign parameter determines the direction of the elements of each channel of the existing parameters, and the magnitude parameter is obtained by optimizing the weights of each channel of the existing parameters into a single representative value.

In the transforming step, the magnitude parameter may be generated by computing the mean of the absolute values of the elements of each channel of the existing parameters.

In the transforming step, the sign parameter may be generated by setting the corresponding sign parameter element to 0 if a given element of a channel of the existing parameters is less than 0, and to 1 if the element is greater than or equal to 0.
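
The shape transformation described above can be sketched as follows. This is a minimal illustration in Python, assuming one channel's weights are held in a flat list; the function names `transform_channel` and `reconstruct_channel` and the example values are hypothetical and do not come from the patent.

```python
from typing import List, Tuple


def transform_channel(weights: List[float]) -> Tuple[List[int], float]:
    """Split one channel's weights into sign bits and a single magnitude (size) value."""
    # Sign bit: 0 for a negative weight, 1 for a weight that is zero or positive.
    sign_bits = [0 if w < 0 else 1 for w in weights]
    # Magnitude: mean of the absolute values of the channel's weights.
    magnitude = sum(abs(w) for w in weights) / len(weights)
    return sign_bits, magnitude


def reconstruct_channel(sign_bits: List[int], magnitude: float) -> List[float]:
    """Rebuild the approximated weights implied by the sign bits and the magnitude."""
    return [magnitude if b == 1 else -magnitude for b in sign_bits]


# Illustrative channel with four weights (values chosen for this sketch only).
channel = [0.5, -0.25, 0.75, -0.5]
sign_bits, magnitude = transform_channel(channel)
approximated = reconstruct_channel(sign_bits, magnitude)
```

Every weight in the channel is approximated by the same magnitude with its original sign, so only one multi-bit value plus one bit per weight needs to be kept for the channel.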

In the pruning step, a reference value may be calculated from the mean and the distribution of the magnitude parameters for each input channel, or for each input and output channel, and any magnitude parameter smaller than the calculated reference value may be set to 0 so that the convolution operation for that channel is skipped.

When pruning with the mean and distribution of the magnitude parameters of the input channels constituting a given layer, the reference value is calculated as the mean of the input channels' magnitude parameters multiplied by a per-layer constant; if the magnitude parameter of a given input channel is smaller than the reference value, it is pruned by changing it to 0. When pruning with the mean and distribution of the magnitude parameters of the input and output channels constituting a given layer, the reference value is calculated as the mean of the magnitude parameters of the input and output channels multiplied by a per-layer constant; if the magnitude parameter of a given input channel is smaller than the reference value, it is likewise pruned by changing it to 0.

The per-layer constant may be determined according to the distribution of the convolution parameters of each layer.
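
As a hedged sketch of the pruning rule, the following Python function computes the reference value as the mean magnitude multiplied by a per-layer constant and zeroes every magnitude (size) parameter below it. The name `prune_magnitudes`, the example values, and the constant 0.5 are assumptions for illustration; the text does not state how the per-layer constant is derived beyond its dependence on the layer's parameter distribution. Pooling the input channels only, or the input and output channels together, changes only which magnitudes are passed in.

```python
from typing import List


def prune_magnitudes(magnitudes: List[float], layer_constant: float) -> List[float]:
    """Zero the magnitudes that fall below mean(magnitudes) * layer_constant."""
    # Reference value: mean magnitude of the pooled channels times the per-layer constant.
    reference = layer_constant * sum(magnitudes) / len(magnitudes)
    # A channel strictly below the reference is set to 0 so its convolution can be skipped.
    return [0.0 if m < reference else m for m in magnitudes]


# Illustrative magnitudes for four channels; the mean is 0.5, so the reference is 0.25.
magnitudes = [0.5, 0.125, 0.25, 1.125]
pruned = prune_magnitudes(magnitudes, layer_constant=0.5)
```

In this example only the second channel (0.125) is below the reference and is pruned; the third channel equals the reference and is kept, because the rule prunes values smaller than the reference.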

The neural network parameter optimization method may further include transforming the existing parameters of the neural network into scale parameters.

The step of transforming into scale parameters may include variably allocating the bits of the scale parameter, and quantizing the value range and weights of the scale parameter elements according to user selection.

A neural network computation method according to another embodiment of the present invention may include: loading the existing parameters and input channel data of a neural network into memory; transforming the existing parameters into sign parameters and magnitude parameters having a single value per channel, and pruning the transformed magnitude parameters to generate optimized parameters; performing inference by convolving the optimized parameters with the input channel data; correcting the optimized parameters; and updating the existing parameters with the corrected optimized parameters.

The neural network computation method may further include: determining whether learned parameters exist; if no learned parameters exist, generating initial parameters through parameter initialization and generating optimized parameters from the initial parameters; and, if learned parameters exist, loading the existing parameters.

The step of performing inference by convolving the optimized parameters with the input channel data may include: loading the optimized parameters into memory; determining whether the value of a magnitude parameter included in the loaded optimized parameters is 0; and skipping the convolution operation if the value of the magnitude parameter is 0.

The step of performing inference by convolving the optimized parameters with the input channel data may include, if the value of the magnitude parameter is not 0: determining the direction of the data by a bitwise operation on the sign parameter and the input channel data; summing the input channel data over the size of the convolution parameter filter; and performing a single multiplication on the magnitude parameter and the input channel data.
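
The three inference steps for a non-zero channel (sign the inputs, sum them, multiply once) can be sketched in Python as below. This is an illustration under stated assumptions rather than the patented hardware datapath: the conditional negation stands in for the bitwise operation, and the names `optimized_window_output` and `reference_window_output` are hypothetical. The second function is an ordinary multiply-accumulate over the approximated weights, included only to check that both routes agree.

```python
from typing import List


def optimized_window_output(sign_bits: List[int], magnitude: float,
                            window: List[float]) -> float:
    """One filter window of one channel: sign, sum, then a single multiplication."""
    # Step 1: the sign bit sets the direction of each input value (1 keeps it, 0 negates it).
    signed = [x if b == 1 else -x for b, x in zip(sign_bits, window)]
    # Step 2: sum the signed input values over the filter size.
    total = sum(signed)
    # Step 3: one multiplication by the channel's single magnitude value.
    return magnitude * total


def reference_window_output(weights: List[float], window: List[float]) -> float:
    """Ordinary multiply-accumulate with one multiplication per weight."""
    return sum(w * x for w, x in zip(weights, window))


sign_bits = [1, 0, 1, 0]
magnitude = 0.5
window = [1.0, 2.0, 3.0, 4.0]
fast = optimized_window_output(sign_bits, magnitude, window)
slow = reference_window_output([0.5, -0.5, 0.5, -0.5], window)
```

For a filter with n elements, the optimized route uses n sign selections, n - 1 additions, and one multiplication, where the ordinary route uses n multiplications.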

The step of performing inference by convolving the optimized parameters with the input channel data may further include, if a scale parameter exists, reducing the error of the computation result by using the scale parameter to apply differential weights to the magnitude parameter.

A neural network computing apparatus according to yet another embodiment of the present invention may include: a sign parameter conversion unit that transforms the existing parameters into sign parameters; a magnitude parameter conversion unit that transforms the existing parameters into magnitude parameters having a single value per channel; and a parameter pruning unit that generates optimized parameters by pruning the transformed magnitude parameters.

The parameter pruning unit may calculate a reference value from the mean and the distribution of the magnitude parameters for each input channel, or for each input and output channel, and set any magnitude parameter smaller than the calculated reference value to 0 so that the convolution operation for that channel is skipped.

When pruning with the mean and distribution of the magnitude parameters of the input channels constituting a given layer, the parameter pruning unit calculates the reference value as the mean of the input channels' magnitude parameters multiplied by a per-layer constant; if the magnitude parameter of a given input channel is smaller than the reference value, the unit prunes it by changing it to 0. When pruning with the mean and distribution of the magnitude parameters of the input and output channels constituting a given layer, the unit calculates the reference value as the mean of the magnitude parameters of the input and output channels multiplied by a per-layer constant; if the magnitude parameter of a given input channel is smaller than the reference value, the unit likewise prunes it by changing it to 0.

The neural network computing apparatus may further include a scale parameter conversion unit that transforms the existing parameters into scale parameters.

The neural network computing apparatus may further include an inference unit that performs inference by convolving the optimized parameters with the input channel data.

The inference unit may determine whether the value of a magnitude parameter included in the optimized parameters is 0 and, if it is 0, skip the convolution operation.

If a scale parameter exists, the inference unit may reduce the error of the computation result by using the scale parameter to apply differential weights to the magnitude parameter.

As described above, the present invention provides a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that effectively optimize the large amount of computation and the many parameters of a convolutional neural network for hardware implementation, achieving minimal accuracy loss and maximal computation speed.

In addition, by optimizing the parameters of a neural network, the invention can satisfy the low-power, high-performance requirements of hardware that implements a convolutional neural network operating in real time. When the neural network parameter optimization technique according to an embodiment is applied, operations can be skipped channel by channel in the convolution operation according to the magnitude of each channel. Moreover, because the hardware decides whether to skip the operation for each channel as a whole, rather than deciding for each element within a channel, the number of operations is reduced by a multiple of the number of elements per channel.

Furthermore, the invention can correct accuracy at a minimal cost in computation speed. For example, by separating out a scale parameter, different weights can be applied effectively according to the range of values, which efficiently improves performance in a hardware implementation.

FIG. 1 is a diagram defining the convolution parameters used in a neural network convolution operation according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a training process for generating optimized parameters in a convolutional neural network (CNN) according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating in detail the training process of FIG. 2 for generating optimized parameters according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating in detail an inference process according to an embodiment of the present invention;
FIG. 5 is a diagram showing an application example of the convolution parameter optimization process according to an embodiment of the present invention;
FIG. 6 is a diagram comparing an ordinary convolution operation with a convolution operation according to an embodiment of the present invention;
FIG. 7 is a diagram showing the advantage of the convolution operation after parameter optimization according to an embodiment of the present invention;
FIG. 8 is a diagram showing an application example of the convolution parameter optimization process according to another embodiment of the present invention;
FIG. 9 is a diagram comparing a parameter distribution with only the magnitude parameter applied and a parameter distribution with the scale parameter additionally applied, to explain scale parameter optimization;
FIG. 10 is a diagram showing an example of a convolution operation using optimized parameters according to an embodiment of the present invention;
FIG. 11 is a diagram showing the configuration of a neural network computing apparatus according to an embodiment of the present invention;
FIG. 12 is a diagram showing the detailed configuration of the parameter optimization unit of FIG. 11 according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The invention is not, however, limited to the embodiments disclosed below and may be implemented in a variety of different forms. These embodiments are provided only so that the disclosure of the invention is complete and so that a person of ordinary skill in the art to which the invention pertains is fully informed of the scope of the invention, which is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the embodiments and may vary according to the intention or practice of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

Combinations of the blocks of the accompanying block diagrams and the steps of the flowcharts may be carried out by computer program instructions (an execution engine). Because these computer program instructions may be loaded onto the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in each block of the block diagrams or each step of the flowcharts.

These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in that memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagrams or each step of the flowcharts.

The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to produce a computer-executed process; the instructions that run on the equipment can thereby provide steps for executing the functions described in each block of the block diagrams and each step of the flowcharts.

In addition, each block or step may represent a module, segment, or portion of code that contains one or more executable instructions for executing the specified logical functions. It should be noted that in some alternative embodiments the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially concurrently, or may be performed in reverse order, depending on the functions involved.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The embodiments illustrated below may be modified into various other forms, and the scope of the invention is not limited to them. The embodiments are provided to explain the invention more completely to a person of ordinary skill in the art to which the invention pertains.

FIG. 1 is a diagram defining the convolution parameters used in a neural network convolution operation according to an embodiment of the present invention.

Referring to FIG. 1, if the total number of convolution layers in the neural network is l, each layer has k output channels (O₁, …, O_k), and each output channel consists of j input channels (I₁, I₂, …, I_j). Each input channel has i weight parameters (W₁, W₂, …, W_i). The optimization technique of the invention is described mathematically on the basis of this definition of the convolution parameters.
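
To make the indexing concrete, the sketch below lays out one layer's convolution parameters as a nested Python list indexed by output channel, input channel, and weight. The helper `make_layer` and the sizes k = 2, j = 3, i = 9 (for example, a 3x3 filter) are illustrative assumptions, not values from the patent.

```python
from typing import List


def make_layer(k: int, j: int, i: int, fill: float = 0.0) -> List[List[List[float]]]:
    """Build one layer as layer[output_channel][input_channel][weight_index]."""
    return [[[fill for _ in range(i)] for _ in range(j)] for _ in range(k)]


# One layer with k = 2 output channels, j = 3 input channels, i = 9 weights per channel.
layer = make_layer(k=2, j=3, i=9)

# Total number of weights in the layer: k * j * i.
num_weights = sum(len(channel) for output in layer for channel in output)

# With one magnitude (size) value per channel, the layer needs k * j magnitudes.
num_magnitudes = sum(len(output) for output in layer)
```

Under this layout, the shape transformation keeps one sign bit for each of the 54 weights and one magnitude for each of the 6 input channels.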

FIG. 2 is a diagram illustrating a training process for generating optimized parameters in a convolutional neural network (CNN) according to an embodiment of the present invention.

Referring to FIG. 2, the training process may include a step of loading the existing parameters and input data (S200), a parameter optimization step (S210), an inference step (S220), a correction step (S230), and a parameter update step (S240).

More specifically, if the parameter optimization step (S210) is omitted, the process is identical to an ordinary deep learning training process, which means that the invention can be applied easily in an ordinary deep learning training environment.

When existing learned parameters are available, the convolutional neural network optimization technique according to an embodiment of the invention can be carried out simply by adding the parameter optimization step (S210) to the existing training environment. When no learned parameters exist, initial training may first be carried out with the parameter optimization step (S210) omitted, after which the parameter optimization step (S210) is applied. If training is carried out with the parameter optimization step (S210) applied from the outset, the resulting accuracy may be lower for the same reduction in computation. For this reason, the existing learned convolution parameters and the input data are loaded into memory (S200), and the convolution parameter optimization technique proposed by the invention is then applied to the loaded convolution parameters (S210).

Next, inference is performed (S220) by convolving the optimized parameters with the input data, and the existing parameters are updated (S240) after the parameter correction step (S230). The parameter correction step (S230) includes computing backpropagation. After training ends, the convolution parameter optimization technique is applied and the parameters are stored. The parameter optimization step (S210), inference step (S220), parameter correction step (S230), and update step (S240) may be repeated until a target number of iterations is reached.
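
The control flow of steps S210 to S240 can be sketched as a plain Python loop. This is only a skeleton under stated assumptions: `train` and the three toy callables are hypothetical, and `toy_correct` is a placeholder arithmetic update standing in for the backpropagation-based correction, which is not reproduced here.

```python
from typing import Callable, List

Params = List[float]


def train(params: Params, inputs: List[float],
          optimize: Callable[[Params], Params],
          infer: Callable[[Params, List[float]], float],
          correct: Callable[[Params, float], Params],
          target_iterations: int) -> Params:
    """Repeat optimize, infer, correct, update until the target count is reached."""
    for _ in range(target_iterations):
        optimized = optimize(params)               # S210: parameter optimization
        output = infer(optimized, inputs)          # S220: inference
        corrected = correct(optimized, output)     # S230: correction
        params = corrected                         # S240: update existing parameters
    # After training ends, the optimization is applied once more before storing.
    return optimize(params)


def toy_optimize(p: Params) -> Params:
    # Toy stand-in: sign times the mean absolute value, one channel only.
    m = sum(abs(w) for w in p) / len(p)
    return [m if w >= 0 else -m for w in p]


def toy_infer(p: Params, x: List[float]) -> float:
    return sum(w * v for w, v in zip(p, x))


def toy_correct(p: Params, y: float) -> Params:
    # Placeholder update, not real backpropagation.
    return [w - 0.5 * y for w in p]


final = train([0.5, -0.25], [1.0, 0.0], toy_optimize, toy_infer, toy_correct,
              target_iterations=2)
```

Passing the steps in as callables keeps the skeleton independent of any particular framework; dropping the `optimize` call from the loop leaves an ordinary training loop, which mirrors the remark that omitting S210 gives the usual deep learning process.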

도 3은 본 발명의 일 실시 예에 따른 도 2의 최적화 파라미터 생성을 위한 학습 프로세스를 세부화하여 도시한 도면이다.3 is a diagram illustrating a detailed learning process for generating an optimization parameter of FIG. 2 according to an embodiment of the present invention.

도 3을 참조하면, 본 발명의 실시예에 따른 신경망 연산장치는, 먼저, 학습된 파라미터가 존재하는지 판단한다(S300). 다음으로, 학습된 파라미터가 존재하면 이를 메모리에 로드한다(S302). 이어서, 본 발명의 컨볼루션 파라미터 최적화(S303)를 적용할지 여부를 판단한다(S304). 컨볼루션 파라미터 최적화(S303)를 적용하지 않는 경우는 일반적인 딥러닝 학습과정과 동일하다.Referring to FIG. 3, the neural network computing apparatus according to an embodiment of the present invention first determines whether a learned parameter exists (S300). Next, if the learned parameter exists, it is loaded into the memory (S302). Next, it is determined whether to apply the convolution parameter optimization (S303) of the present invention (S304). The case where the convolution parameter optimization (S303) is not applied is the same as the general deep learning learning process.

The convolution parameter optimization step (S303) includes a size parameter generation step (S306) and a sign parameter generation step (S308). It may further include a step of determining whether to apply ratio parameter optimization (S310) and, when it is determined that ratio parameter optimization is to be applied, a step of generating a ratio parameter (S312).

According to an embodiment of the present invention, when the convolution parameter optimization step (S303) is complete, the neural network computing apparatus performs inference with the optimized parameters and the input data in the inference step (S316) and, after the correction step (S318), updates the existing parameters with the optimized and corrected parameters (S320). It then determines whether a preset target number of iterations has been reached (S322); if not, the convolution parameter optimization step (S303), the inference step (S316), the correction step (S318), and the parameter update step (S320) are repeated.

FIG. 4 is a diagram illustrating in detail an inference process according to an embodiment of the present invention.

Referring to FIG. 4, the neural network computing apparatus according to an embodiment of the present invention loads the optimized parameters into memory (S400) and determines whether a size parameter value exists, that is, whether or not the size parameter value is 0 (zero) (S410). If the size parameter value is 0, the subsequent steps are skipped, which reduces the amount of inference computation.

Next, if the size parameter value is not 0, the input data is loaded into memory (S420), and a bit operation is performed on the sign parameter and the input data (S430). A sum operation is then applied across the input data (S460), and a multiplication of the size parameter and the input data is performed (S470). When a ratio parameter is applied (S440) after the bit operation (S430), the process may further include a step (S470) of applying the ratio parameter correspondence constant, looked up from the ratio parameter, to the size parameter.
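The inference flow of FIG. 4 without a ratio parameter can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name and the list-of-channels layout are hypothetical, and the sign flip written with a conditional stands in for the bit operation of step S430.

```python
def infer_output(magnitudes, sign_bits, inputs):
    """Accumulate one output value over input channels, skipping pruned ones."""
    total = 0.0
    computed = 0
    for magnitude, bits, xs in zip(magnitudes, sign_bits, inputs):
        if magnitude == 0:      # S410: pruned channel, skip all later steps
            continue
        # S430: a sign bit of 1 flips the input (stand-in for the bit operation)
        signed = [-x if bit else x for bit, x in zip(bits, xs)]
        # S460: sum the sign-adjusted inputs; S470: one multiply per channel
        total += magnitude * sum(signed)
        computed += 1
    return total, computed
```

The second return value counts the channels that were actually computed, which makes the saving from zero-valued size parameters visible.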

FIG. 5 is a diagram illustrating an application example of a convolution parameter optimization process according to an embodiment of the present invention.

Referring to FIG. 5, the parameter optimization process according to an embodiment consists of two main parts: a shape conversion step (510) and a pruning step (520). The parameter shape conversion (510) is a process of converting the existing convolution parameters into a form that reduces the amount of computation and is easy to prune. For example, as shown in FIG. 5, the existing parameter (50) is separated into a size parameter (Magnitude Parameter) (52) representing magnitude and a sign parameter (Signed Parameter) (54) representing direction. FIG. 5 shows an example in which the existing parameter has three input channels (50-1, 50-2, 50-3). Each input channel (50-1, 50-2, 50-3) consists of a number of parameter elements. The sign parameter (54) determines the direction of the elements in each channel of the existing parameter (50), and the size parameter (52) reduces the weights to a single representative value per channel of the existing parameter (50).

The user may freely set the number of bits of each element of the size parameter (52), which may be represented as an integer, a floating-point number, or the like. The more bits are used, the closer the result is to the original convolution result. Each element of the sign parameter (54) has 1 bit, which indicates only the positive (+) or negative (-) direction. Various methods can be applied to separate the size parameter (52) and the sign parameter (54) from the existing parameter (50). For example, the shape conversion may be performed using the average over each input channel of the convolution layer and the most significant bit of each parameter.

One method of converting to the size parameter (52) is to use the average value of the elements (weight parameters) of an input channel. For example, with i elements W_i, j input channels I_j, k output channels O_k, and size parameter M = {M₁, M₂, … , M_j}, M_j is given by Equation 1.
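Equation 1 itself is not reproduced in this text. A form consistent with the description above and with the worked example given later (an average of absolute values over the i elements of input channel j) would be the following, where the summation index n is a symbol introduced here for illustration:

```latex
M_j = \frac{1}{i} \sum_{n=1}^{i} \lvert W_n \rvert
```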

One method of converting to the sign parameter (54) is to assign the value 0 when an element (weight parameter) of the channel is greater than or equal to 0, and the value 1 when it is less than 0. That is, for sign parameter S = {S₁, S₂, … , S_i}, S_i is given by Equation 2.
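Equation 2 is likewise not reproduced in this text. Written out from the rule stated above, it would take the following form:

```latex
S_i =
\begin{cases}
0, & W_i \ge 0 \\
1, & W_i < 0
\end{cases}
```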

After the shape conversion, a pruning step (520) is performed to remove unnecessary parameters. In the pruning step (520), parameters of low importance are selected and their values are set to 0 (zero) so that the corresponding operations can be skipped, which reduces the total amount of computation. Here, the size parameters are pruned according to their magnitude to generate the optimized parameters. For example, if a given size parameter value is less than a preset reference value, the size parameter of that channel is pruned.

The important point is that this pruning is applied not to all parameters of the convolution layer but only to the shape-converted size parameter (52). Because pruning (520) is applied only to the size parameter (52), each pruned value affects all parameters of the corresponding input channel of the convolution layer, which maximizes the effect of the pruning (520).

Various methods can be applied to omit unnecessary parameters. For example, pruning is performed using the average value and distribution for each input channel and output channel of the convolution layer. Each layer has a constant set by the user, and this constant is chosen according to the distribution of the layer's convolution parameters. As the reference value for omitting a value, either the average of the size parameters of the input channels, or the average of the size parameters of the input and output channels, multiplied by a weight, may be used. The layer constant takes a value from 0 to 1. Assuming that the convolution parameters of a layer follow a continuous uniform distribution or a normal distribution, the layer constant may be 0.5 to 0.7 in order to omit 50% of the size parameters. With this value as a reference, the layer constant is determined in consideration of the actual parameter distribution.

When M_LOI = {M₁₁₁, M₁₁₂, … , M_lkj}, L = {C₁, C₂, … , C_l}, and C is the layer constant, the size parameter pruning method that applies the criterion per input channel is given by Equation 3.
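Equation 3 is not reproduced in this text. A form consistent with the worked example given later, in which the reference value is the layer constant times the average of the size parameters over the input channels of one output channel, would be the following. Here l, k, and j index the layer, output channel, and input channel, and J (the number of input channels) and the primed index are symbols introduced here for illustration:

```latex
M'_{lkj} =
\begin{cases}
0, & M_{lkj} < C_l \cdot \dfrac{1}{J} \sum_{j'=1}^{J} M_{lkj'} \\
M_{lkj}, & \text{otherwise}
\end{cases}
```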

The size parameter pruning method that applies the criterion per output channel is given by Equation 4.
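Equation 4 is not reproduced in this text either. In the worked example given later, the reference value for this case is the layer constant times the average of the size parameters over all input and output channels of the layer. A form consistent with that example would be the following, where J and K (the numbers of input and output channels) and the primed indices are symbols introduced here for illustration:

```latex
M'_{lkj} =
\begin{cases}
0, & M_{lkj} < C_l \cdot \dfrac{1}{K J} \sum_{k'=1}^{K} \sum_{j'=1}^{J} M_{lk'j'} \\
M_{lkj}, & \text{otherwise}
\end{cases}
```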

An example of a convolution parameter optimization calculation applying Equations 1 to 4 is described below.

Assume that the size parameter and the sign parameter are computed for existing convolution parameters with i=4, j=2, and k=2, and that pruning is applied to the size parameter. Assume further that O₁ = {I₁₁, I₁₂} = {[W₁₁₁, W₁₁₂, W₁₁₃, W₁₁₄], [W₁₂₁, W₁₂₂, W₁₂₃, W₁₂₄]} = {[-0.5, 1.0, 0.5, -1.0], [1.0, -1.5, 1.0, -1.5]} and O₂ = {I₂₁, I₂₂} = {[W₂₁₁, W₂₁₂, W₂₁₃, W₂₁₄], [W₂₂₁, W₂₂₂, W₂₂₃, W₂₂₄]} = {[1.5, -2.0, -1.5, 2.0], [-2.0, -2.5, 2.0, 2.5]}.

The size parameter calculated by applying Equation 1 is M = {M₁₁, M₁₂, M₂₁, M₂₂} = {0.75, 1.25, 1.75, 2.25}. That is, M₁₁ = 1/4 (0.5 + 1.0 + 0.5 + 1.0) = 0.75, M₁₂ = 1/4 (1.0 + 1.5 + 1.0 + 1.5) = 1.25, M₂₁ = 1/4 (1.5 + 2.0 + 1.5 + 2.0) = 1.75, and M₂₂ = 1/4 (2.0 + 2.5 + 2.0 + 2.5) = 2.25.

The sign parameter calculated by applying Equation 2 is S = {S₁, S₂, S₃, S₄} = {[1, 0, 0, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]}.
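The size and sign parameters of this example can be recomputed with a short Python sketch. The dictionary layout keyed by (output channel, input channel) and the function names are illustrative choices made here, not part of the specification.

```python
# Example weights, keyed by (output channel k, input channel j)
weights = {
    (1, 1): [-0.5, 1.0, 0.5, -1.0],
    (1, 2): [1.0, -1.5, 1.0, -1.5],
    (2, 1): [1.5, -2.0, -1.5, 2.0],
    (2, 2): [-2.0, -2.5, 2.0, 2.5],
}


def size_parameter(channel):
    """Average of the absolute values of the channel's elements."""
    return sum(abs(w) for w in channel) / len(channel)


def sign_parameter(channel):
    """0 for elements greater than or equal to 0, 1 for negative elements."""
    return [0 if w >= 0 else 1 for w in channel]


M = {key: size_parameter(ch) for key, ch in weights.items()}
S = {key: sign_parameter(ch) for key, ch in weights.items()}
```

Each channel of four weights is thus stored as one magnitude value plus four sign bits.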

An example of applying pruning to the size parameter is as follows. For existing parameters with j=2, k=2, and l=1, let the size parameter separated from the existing parameters be M = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0.75, 1.25, 1.75, 2.25} and the layer constant be L₁ = 0.9. Applying Equation 3 per input channel gives the pruned size parameter M' = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0, 1.25, 0, 2.25}. For example, for M₁₁₁, 0.75 < 1/2(0.75 + 1.25)·0.9 = 0.9, so M₁₁₁ = 0 by Equation 3.

Applying Equation 4 per input and output channel gives the pruned size parameter M' = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0, 0, 1.75, 2.25}. For M₁₁₂, 1.25 < 1/4(0.75 + 1.25 + 1.75 + 2.25)·0.9 = 1.35, so M₁₁₂ = 0 by Equation 4. Whether to apply the criterion per input channel or per input and output channel may be decided according to the distribution.
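Both pruning results can be checked with the Python sketch below. The function names and the nested-list layout (one row per output channel) are assumptions made here; the reference values follow the arithmetic of the example, namely the layer constant times the average over the input channels of one output channel for Equation 3, and times the average over all channels of the layer for Equation 4.

```python
def prune_eq3(M, c):
    """Reference value per output channel: c times the mean over its input channels."""
    pruned = []
    for row in M:                       # one row per output channel
        reference = c * sum(row) / len(row)
        pruned.append([0.0 if m < reference else m for m in row])
    return pruned


def prune_eq4(M, c):
    """Reference value per layer: c times the mean over all input and output channels."""
    values = [m for row in M for m in row]
    reference = c * sum(values) / len(values)
    return [[0.0 if m < reference else m for m in row] for row in M]


M = [[0.75, 1.25], [1.75, 2.25]]        # size parameters of the example
pruned_eq3 = prune_eq3(M, 0.9)          # [[0.0, 1.25], [0.0, 2.25]]
pruned_eq4 = prune_eq4(M, 0.9)          # [[0.0, 0.0], [1.75, 2.25]]
```

With the same layer constant, the per-output-channel reference removes the smaller channel in each row, while the layer-wide reference removes the channels that are small relative to the whole layer.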

FIG. 6 is a diagram comparing an ordinary convolution operation with a convolution operation according to an embodiment of the present invention.

Referring to FIG. 6, an ordinary convolution operation (60) requires as many multiplication operations (600) and sum operations (602) as the size of the convolution parameter filter. In contrast, the convolution operation (62) with parameter optimization according to an embodiment determines the direction of the data through a bit operation (620), sums all the values over the filter size (622), and then produces the result with a single multiplication (624). For example, as shown in FIG. 6, the ordinary convolution operation (60) performs the multiplication (600) 9 times, whereas the convolution operation (62) with parameter optimization according to an embodiment performs it only once. The saving in multiplications therefore scales with the filter size.
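The difference in multiplication count can be illustrated with a 3x3 filter flattened to nine values. The following Python sketch uses hypothetical function names and made-up input values; it assumes a filter whose elements all share one magnitude, so that both forms compute the same result.

```python
def standard_conv(weights, inputs):
    """One multiply and one add per filter element."""
    acc, multiplications = 0.0, 0
    for w, x in zip(weights, inputs):
        acc += w * x
        multiplications += 1
    return acc, multiplications


def optimized_conv(magnitude, signs, inputs):
    """Sign-adjust and add every input, then multiply once."""
    acc = 0.0
    for s, x in zip(signs, inputs):
        acc += -x if s else x
    return magnitude * acc, 1


signs = [0, 1, 0, 1, 0, 1, 0, 1, 0]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
weights = [-0.5 if s else 0.5 for s in signs]   # same magnitude, signs from the bits

print(standard_conv(weights, inputs))           # (2.5, 9)
print(optimized_conv(0.5, signs, inputs))       # (2.5, 1)
```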

In addition, if pruning is applied without shape-converting the convolution parameters, as many comparison operations as the filter size are needed to determine whether an operation should be skipped. The ordinary convolution operation (60) in the example of FIG. 6 requires 9 comparisons in total. With the convolution operation (62) using the parameter optimization technique of the present invention, however, a single comparison per input channel of the convolution layer is enough to decide whether to skip the operation. This reduction in computation requires fewer arithmetic units than the conventional convolution operation (60), so the hardware can be designed efficiently.

FIG. 7 is a diagram illustrating the advantage of the convolution operation after parameter optimization according to an embodiment of the present invention.

Referring to FIG. 7, output data is generated by a convolution operation on the input data, the size parameter, and the sign parameter. When a size parameter value is 0, the operation need not be performed, so the amount of computation can be greatly reduced on a per-channel basis.

FIG. 8 is a diagram illustrating an application example of a convolution parameter optimization process according to another embodiment of the present invention.

The goal of convolution parameter optimization is to obtain the highest possible accuracy with the greatest possible optimization (reduction in computation). Reduction in computation and accuracy generally trade off against each other. In real applications there is a performance level (speed and accuracy) that must be met, and it is not easy for the user to tune this. To address this, the present invention proposes an optimization method using a ratio parameter (Scale Parameter), which adds a ratio parameter to the optimization method described above with reference to FIG. 5.

In the shape conversion process (510), the existing parameter (50) is converted into a size parameter (52), a sign parameter (54), and a ratio parameter (58), and pruning (520) is applied only to the size parameter (52). At inference time, a weight is applied to the size parameter with reference to the ratio parameter (58) so as to reduce the error relative to the original operation result.

The ratio parameter (58) indicates how important the size parameter (52) is. That is, constants are applied differentially by comparing the importance of the parameters within an input channel of the convolution layer. The user can allocate the desired number of bits to each element of the ratio parameter (58) and can thereby adjust the degree of optimization. The ratio parameter (58) is not itself a weight; rather, there are constant values that the ratio parameter (58) refers to. These constants are assigned values that are easy to implement in hardware, so that they can be implemented with bit operations alone. When optimization is performed by applying the weight selected through the ratio parameter (58), rather than applying the value of the size parameter (52) directly, the similarity to the original pre-optimization parameters increases and accuracy improves.

FIG. 9 is a diagram comparing the parameter distribution obtained with only the size parameter against the distribution obtained when the ratio parameter is additionally applied, in order to explain ratio parameter optimization.

Referring to FIGS. 8 and 9, various methods can be applied to generate the ratio parameter (58); in the present invention, the ratio parameter (58) is generated from the distribution of the convolution parameters. If the number of ratio parameter bits is b, there are 2^b user-defined constants, which are called ratio parameter correspondence constants. A ratio parameter correspondence constant is a constant corresponding to a ratio parameter value, and the user may set it freely based on the distribution of the convolution parameters.

An example of computing the ratio parameter (58) is as follows.

With the number of ratio parameter bits b, λ = {λ₁, λ₂, … , λ_t}, t = {1, 2, … , 2^b - 1}, λ_t > λ_t+1, and ratio parameter SC = {SC₁, SC₂, … , SC_i}, SC_i is given by Equation 5.
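Equation 5 is not reproduced in this text. A form consistent with the definitions above and with the worked example that follows would be the one below, where M_j denotes the size parameter of the channel containing W_i. How an element lying exactly on a threshold is assigned cannot be recovered from the text, so the placement of the non-strict inequalities is an assumption made here:

```latex
SC_i =
\begin{cases}
0, & \lvert W_i \rvert \ge M_j \lambda_1 \\
t, & M_j \lambda_t > \lvert W_i \rvert \ge M_j \lambda_{t+1} \\
2^b - 1, & \lvert W_i \rvert < M_j \lambda_{2^b - 1}
\end{cases}
```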

At inference time, the value obtained by multiplying the size parameter by the ratio parameter correspondence constant is used as the actual operand.

An example of a ratio parameter optimization calculation applying Equation 5 is described below.

Assume that the ratio parameter is computed for existing convolution parameters with i=4, j=2, and k=2. Assume that the number of ratio parameter bits is b = 2, t = {1, 2, 3}, and λ = {0.9, 0.7, 0.5}. Assume further that O₁ = {I₁₁, I₁₂} = {W₁₁₁, W₁₁₂, W₁₁₃, W₁₁₄, W₁₂₁, W₁₂₂, W₁₂₃, W₁₂₄} = {-0.5, 1.0, 0.5, -1.0, 1.0, -1.5, 1.0, -1.5} and O₂ = {I₂₁, I₂₂} = {W₂₁₁, W₂₁₂, W₂₁₃, W₂₁₄, W₂₂₁, W₂₂₂, W₂₂₃, W₂₂₄} = {1.5, -2.0, -1.5, 2.0, -2.0, -2.5, 2.0, 2.5}.

By Equation 5, SC₁₁₁ = 2, because 1/4(0.5 + 1.0 + 0.5 + 1.0)·0.7 > |-0.5| > 1/4(0.5 + 1.0 + 0.5 + 1.0)·0.5. The ratio parameter calculated in this way is SC = {SC₁₁₁, SC₁₁₂, SC₁₁₃, SC₁₁₄, SC₁₂₁, SC₁₂₂, SC₁₂₃, SC₁₂₄, SC₂₁₁, SC₂₁₂, SC₂₁₃, SC₂₁₄, SC₂₂₁, SC₂₂₂, SC₂₂₃, SC₂₂₄} = {2, 0, 2, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}.
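The ratio parameters of this example can be reproduced with the Python sketch below. The function name is hypothetical, and the rule used here (count how many of the thresholds M·λ_t lie strictly above the element's absolute value) is one reading that matches every value of the example; it is not taken from the equation itself.

```python
def ratio_parameter(channel, lambdas):
    """Index of the band, bounded by magnitude * lambda, that each element falls in."""
    magnitude = sum(abs(w) for w in channel) / len(channel)
    result = []
    for w in channel:
        # Number of thresholds that lie strictly above |w|
        band = sum(1 for lam in lambdas if abs(w) < magnitude * lam)
        result.append(band)
    return result


lambdas = [0.9, 0.7, 0.5]               # b = 2, so 2^b - 1 = 3 thresholds
channels = [
    [-0.5, 1.0, 0.5, -1.0],
    [1.0, -1.5, 1.0, -1.5],
    [1.5, -2.0, -1.5, 2.0],
    [-2.0, -2.5, 2.0, 2.5],
]
SC = [sc for ch in channels for sc in ratio_parameter(ch, lambdas)]
# SC == [2, 0, 2, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
```

Elements that are large relative to their channel's size parameter receive index 0, and progressively smaller elements receive larger indices.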

FIG. 10 is a diagram illustrating an example of a convolution operation using optimized parameters according to an embodiment of the present invention.

FIG. 10 illustrates the convolution operation performed when a size parameter (M), a sign parameter (S), a ratio parameter (SC), and ratio parameter correspondence constants (C) exist and input data (In) is supplied. The example uses M = {0.75}, S = {0, 0, 1, 1}, SC = {0, 0, 1, 1}, C = {1.0, 0.5}, and input data In = {1.2, -2.0, 0.5, 3.1}.

In the example of FIG. 10, the ratio parameter correspondence constants (C) are set to powers of 2 so that the result can be obtained with shift bit operations instead of general arithmetic operations. This is advantageous for hardware implementation.
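The use of shifts in place of multiplications by the correspondence constants can be sketched in Python with the values of this example. Several assumptions are made here and are not taken from FIG. 10: the inputs are converted to integers with a hypothetical fixed-point scale of 20, the constants C = {1.0, 0.5} are expressed as right-shift amounts {0, 1}, and the constant is applied to each input before the sum, which is equivalent to scaling the size parameter for that element.

```python
SCALE = 20          # hypothetical fixed-point scale; makes the example inputs integers
SHIFTS = [0, 1]     # C = {1.0, 0.5} expressed as right-shift amounts


def shift_conv(magnitude, signs, ratios, inputs):
    """Sign-adjust, shift by the ratio parameter's constant, add, multiply once."""
    acc = 0
    for s, sc, x in zip(signs, ratios, inputs):
        q = round(x * SCALE)        # fixed-point input
        q >>= SHIFTS[sc]            # multiply by 1.0 or 0.5 with a shift
        acc += -q if s else q       # sign bit 1 flips the value
    return magnitude * acc / SCALE  # single multiply, then undo the scale


output = shift_conv(0.75, [0, 0, 1, 1], [0, 0, 1, 1], [1.2, -2.0, 0.5, 3.1])
```

Under these assumptions the sketch yields approximately -1.95, which equals 0.75 × (1.2 - 2.0 - 0.5·0.5 - 0.5·3.1). That value comes from this reading of the operation and is not a number quoted from the figure.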

FIG. 11 is a diagram illustrating the configuration of a neural network computing apparatus according to an embodiment of the present invention.

As shown in the drawing, the neural network computing apparatus (1) according to an embodiment of the present invention may include an input unit (10), a processor (12), a memory (14), and an output unit (16).

The input unit (10) receives data including user commands, and the received data is stored in the memory (14). The processor (12) performs neural network operations, stores the results in the memory (14), and may output them through the output unit (16). Input data and parameters may be loaded into the memory (14). The processor (12) according to an embodiment includes a parameter optimization unit (120), an inference unit (122), a correction unit (124), and an update unit (126).

The parameter optimization unit (120) shape-converts the existing parameters and prunes the size parameter among the shape-converted parameters. The existing parameters may be parameters that have already been learned, and they are loaded into memory for the operation.

The parameter optimization unit (120) according to an embodiment of the present invention converts the existing parameters into a size parameter and a sign parameter, or into a size parameter, a sign parameter, and a ratio parameter. The sign parameter determines the direction of the elements in each channel of the existing parameters, and the size parameter reduces the weights to a single representative value per channel of the existing parameters. The ratio parameter indicates how important the size parameter is, and differential weights can be applied to the size parameter at inference time with reference to the ratio parameter.

In an embodiment of the present invention, the pruning is performed only on the size parameter among the shape-converted parameters. Pruning means selecting size parameters of low importance and setting their values to 0 (zero). For a pruned parameter, the convolution operation can be skipped on a per-channel basis, which reduces the total amount of computation. The important point is that this pruning affects all parameter elements that make up each input channel of the convolution layer, which maximizes the effect of the pruning.

In addition, the inference unit (122) performs inference by a convolution operation on the optimized parameters and the input channel data. The inference unit (122) may determine whether the value of a size parameter included in the optimized parameters is 0 and, if it is 0, skip the convolution operation. If a ratio parameter exists, the inference unit (122) may use it to apply differential weights to the size parameter, thereby reducing the error of the operation result. In an embodiment of the present invention, the correction unit (124) corrects the optimized parameters, and the update unit (126) updates the existing parameters with the corrected optimized parameters.

FIG. 12 is a diagram illustrating the detailed configuration of the parameter optimization unit of FIG. 11 according to an embodiment of the present invention.

Referring to FIGS. 11 and 12, the parameter optimization unit (120) according to an embodiment of the present invention may include a size parameter conversion unit (1200), a sign parameter conversion unit (1202), a parameter pruning unit (1204), and a ratio parameter conversion unit (1206).

The sign parameter conversion unit (1202) converts the existing parameters into a sign parameter. The size parameter conversion unit (1200) converts the existing parameters into a size parameter having a single value per channel. The ratio parameter conversion unit (1206) converts the existing parameters into a ratio parameter.

In addition, the parameter pruning unit (1204) prunes the shape-converted size parameter to generate the optimized parameters. For example, if a given size parameter value is less than a preset reference value, the size parameter of that channel is pruned.

The parameter pruning unit (1204) according to an embodiment of the present invention calculates a reference value using the average value and magnitude distribution of the size parameters per input channel, or the average value and magnitude distribution of the size parameters per input and output channel, and sets any size parameter value smaller than the calculated reference value to 0 so that the convolution operation of that channel is skipped.

In addition, when pruning using the average value and magnitude distribution of the size parameters of the input channels constituting a given layer, the parameter pruning unit (1204) calculates the reference value as the average of the size parameters of the input channels multiplied by a per-layer constant, and if the size parameter value of a given input channel is smaller than the reference value, it may prune that channel by changing its size parameter value to 0.

In addition, when pruning using the average value and magnitude distribution of the size parameters of the input and output channels constituting a given layer, the parameter pruning unit (1204) calculates the reference value as the average of the size parameters of the input channels and output channels multiplied by a per-layer constant, and if the size parameter value of a given input channel is smaller than the reference value, it may prune that channel by changing its size parameter value to 0.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in a descriptive sense rather than a limiting one. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

10: input unit
12: processor
14: memory
16: output unit
120: parameter optimization unit
122: inference unit
124: correction unit
126: update unit
1200: size parameter conversion unit
1202: sign parameter conversion unit
1204: parameter pruning unit
1206: ratio parameter conversion unit

Claims

A neural network parameter optimization method comprising:
converting an existing parameter of a neural network into a sign parameter and a size parameter having a single value per channel, using a sign parameter conversion unit and a size parameter conversion unit constituting a parameter optimization unit; and
generating an optimized parameter by pruning only the size parameter among the shape-converted parameters, using a parameter pruning unit constituting the parameter optimization unit,
wherein the shape conversion and the pruning target parameters of a convolution layer used in a neural network convolution operation.

The method of claim 1, wherein
the sign parameter determines the direction of the elements of each channel of the existing parameter, and
the size parameter optimizes the weights into a single representative value per channel of the existing parameter.

The method of claim 1, wherein the shape conversion step comprises
generating the size parameter by calculating the average of the absolute values of the elements of each channel of the existing parameter.

The method of claim 1, wherein the shape conversion step comprises
generating the sign parameter by setting the corresponding element of the sign parameter to 0 if a given element of a channel of the existing parameter is less than 0, and to 1 if the element is greater than or equal to 0.
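For illustration only, the shape conversion recited in the two preceding claims (size parameter as the mean absolute value per channel, sign parameter as 0 for negative elements and 1 otherwise) can be sketched as below. The function name `to_sign_and_size` and the flat-list representation of one channel's kernel are assumptions of this sketch and do not appear in the specification.

```python
def to_sign_and_size(channel_weights):
    """Convert one channel of an existing parameter into (sign bits, size).

    channel_weights: flat list of the kernel elements of a single channel
                     (hypothetical layout chosen for this sketch).
    """
    # Sign parameter: 0 where the element is negative, 1 where it is >= 0.
    signs = [0 if w < 0 else 1 for w in channel_weights]
    # Size parameter: a single value per channel, the mean absolute value.
    size = sum(abs(w) for w in channel_weights) / len(channel_weights)
    return signs, size


kernel = [0.5, -1.0, 0.0, -0.5]
signs, size = to_sign_and_size(kernel)
print(signs, size)  # [1, 0, 1, 0] 0.5
```

Note that an element equal to 0 receives sign bit 1, following the "greater than or equal to 0" condition.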

The method of claim 1, wherein the pruning step comprises
calculating a reference value using the average value and size distribution of the size parameters per input channel, or the average value and size distribution of the size parameters per input and output channel, and setting to 0 any size parameter whose value is less than the calculated reference value, so that the convolution operation of the corresponding channel is omitted.

The method of claim 1, further comprising
converting the existing parameter of the neural network into a ratio parameter, using a ratio parameter conversion unit constituting the parameter optimization unit.

The method of claim 6, wherein the step of converting into the ratio parameter comprises:
variably allocating bits of the ratio parameter; and
quantizing the range and weight of the values of the ratio parameter elements according to user selection.

A neural network computing method comprising:
loading existing parameters and input channel data of a neural network into a memory, using a processor constituting a neural network computing device;
converting the previously learned parameters of the neural network into a sign parameter and a size parameter having a single value per channel, using a sign parameter conversion unit and a size parameter conversion unit constituting a parameter optimization unit included in the processor, and generating optimized parameters by pruning only the size parameter among the shape-converted parameters, using a parameter pruning unit constituting the parameter optimization unit;
inferring by performing a convolution operation on the optimized parameters and the input channel data, using an inference unit included in the processor;
correcting the optimized parameters, using a correction unit included in the processor; and
updating the existing parameters to the corrected optimized parameters, using an update unit included in the processor,
wherein the shape conversion and the pruning target parameters of a convolution layer used in a neural network convolution operation.

The method of claim 8, further comprising:
determining whether a learned parameter exists;
generating an initial parameter through parameter initialization if no learned parameter exists;
generating optimized parameters for the initial parameter; and
loading the existing parameter if the learned parameter exists.

The method of claim 8, wherein the step of inferring by performing a convolution operation on the optimized parameters and the input channel data using the inference unit comprises:
loading the optimized parameters into a memory;
determining whether the value of a size parameter included in the loaded optimized parameters is 0; and
omitting the convolution operation process if the value of the size parameter is 0.

The method of claim 8, wherein the step of inferring by performing a convolution operation on the optimized parameters and the input channel data using the inference unit comprises:
if the value of the size parameter is not 0, determining the direction of the data by a bitwise operation on the sign parameter and the input channel data;
summing the input channel data over as many elements as the size of the convolution parameter filter; and
performing a single multiplication operation on the size parameter and the input channel data.
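For illustration only, the inference flow recited in the two preceding claims (skip when the size parameter is 0; otherwise apply the sign parameter, sum over the filter window, and multiply once by the size parameter) can be sketched for a single channel and a single filter position. The function name `channel_contribution` is hypothetical, and the conditional negation below is merely a software stand-in for the bitwise sign operation; the actual bit-level operation is not reproduced here.

```python
def channel_contribution(signs, size, window):
    """Contribution of one channel to one output position.

    signs:  sign parameter bits for the filter (0 = negative, 1 = non-negative).
    size:   the single size parameter of the channel.
    window: input channel data under the filter, same length as signs.
    """
    if size == 0:
        # Pruned channel: the convolution for this channel is skipped.
        return 0.0
    # Determine the direction of each input value from its sign bit
    # (assumed stand-in for the bitwise operation), then sum over the filter.
    signed_sum = sum(x if s == 1 else -x for s, x in zip(signs, window))
    # One multiplication per window instead of one per filter element.
    return size * signed_sum


result = channel_contribution([1, 0, 1, 0], 0.5, [2.0, 1.0, 4.0, 3.0])
print(result)  # (2 - 1 + 4 - 3) * 0.5 = 1.0
```

The sketch shows why the approach reduces multiplications: the filter elements contribute only sign flips and additions, and the size parameter is applied once to the accumulated sum.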

The method of claim 10, wherein the step of inferring by performing a convolution operation on the optimized parameters and the input channel data further comprises
reducing the error of the operation result by differentially applying weights to the size parameter using a ratio parameter, if the ratio parameter exists.

A neural network computing device comprising:
a sign parameter conversion unit that converts the shape of an existing parameter into a sign parameter;
a size parameter conversion unit that converts the shape of the existing parameter into a size parameter having a single value per channel; and
a parameter pruning unit that generates an optimized parameter by pruning only the size parameter among the shape-converted parameters,
wherein the shape conversion and the pruning target parameters of a convolution layer used in a neural network convolution operation.

The device of claim 13, wherein the parameter pruning unit
calculates a reference value using the average value and size distribution of the size parameters per input channel, or the average value and size distribution of the size parameters per input and output channel, and sets to 0 any size parameter whose value is less than the calculated reference value, so that the convolution operation of the corresponding channel is omitted.

The device of claim 13, further comprising
a ratio parameter conversion unit that converts the shape of the existing parameter into a ratio parameter.

The device of claim 13, further comprising
an inference unit that infers by performing a convolution operation on the optimized parameters and input channel data.

The device of claim 16, wherein the inference unit
determines whether the value of the size parameter included in the optimized parameters is 0 and, if the value of the size parameter is 0, skips the convolution operation process.

The device of claim 16, wherein the inference unit,
if a ratio parameter exists, reduces the error of the operation result by differentially applying weights to the size parameter using the ratio parameter.