KR101987475B1

KR101987475B1 - Neural network parameter optimization method, neural network computing method and apparatus thereof suitable for hardware implementation

Info

Publication number: KR101987475B1
Application number: KR1020190011516A
Authority: KR
Inventors: 이상헌; 김명겸; 김주혁
Original assignee: 주식회사 디퍼아이
Priority date: 2019-01-29
Filing date: 2019-01-29
Publication date: 2019-06-10

Abstract

The present invention relates to a neural network parameter optimization method suitable for hardware implementation, a neural network computation method, and an apparatus thereof. According to the present invention, the neural network parameter optimization method suitable for hardware implementation comprises the steps of: transforming an existing parameter of a neural network into a sign parameter and a magnitude parameter having a single value per channel; and generating an optimized parameter by pruning the transformed magnitude parameter. The large amount of computation and the many parameters of a convolutional neural network are thereby effectively optimized for hardware implementation, so that accuracy loss is minimized and computation speed is maximized.

Description

Neural network parameter optimization method, neural network computation method, and apparatus thereof suitable for hardware implementation

The present invention relates to computation techniques for artificial neural networks and, more particularly, to a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that are suitable for hardware implementation.

With recent advances in artificial intelligence (AI) technology, AI is being applied and adopted across a variety of industries.

In line with this trend, demand is also growing for implementing artificial neural networks, such as convolutional neural networks (CNNs), in real-time hardware.

However, conventional computation methods for artificial neural networks are difficult to implement in practice because of the large number of parameters and the large amount of computation that a convolutional neural network involves.

To address this, parameter optimization methods that reduce the amount of computation relative to accuracy have been proposed, such as omitting unnecessary parameters through training and reducing the number of bits per parameter.

However, because conventional parameter optimization methods are limited to software computation methods that do not take hardware implementation into account, applying these methods alone is not enough to implement a convolutional neural network in real-time hardware.

Accordingly, there is a pressing need for a practical and applicable computation technique for artificial neural networks that is suitable for hardware implementation.

Korean Laid-open Patent Publication No. KR 10-2011-0027916 (published March 17, 2011)

The present invention has been devised to solve the above problems. An object of the present invention is to provide a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that effectively optimize the large amount of computation and the many parameters of a convolutional neural network for hardware implementation, so as to obtain minimal accuracy loss and maximal computation speed.

Another object is to provide a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that can correct accuracy at a minimal loss of computation speed.

A neural network parameter optimization method according to an embodiment of the present invention includes: transforming existing parameters of a neural network into a sign parameter and a magnitude parameter having a single value per channel; and generating optimized parameters by pruning the transformed magnitude parameter.

The sign parameter determines the direction of the elements of each channel of the existing parameters, and the magnitude parameter is obtained by optimizing the weights into a single representative value per channel of the existing parameters.

In the transforming step, the magnitude parameter may be generated by computing the mean of the absolute values of the elements of each channel of the existing parameters.

In the transforming step, the sign parameter may be generated by setting the corresponding element of the sign parameter to 0 if a given element of a channel of the existing parameters is less than 0, and to 1 if it is greater than or equal to 0.
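The two transformation rules above can be illustrated for a single channel. This is a minimal sketch in plain Python, not the patent's implementation; the function name `transform_channel` and the sample weights are assumptions chosen for illustration.

```python
from typing import List, Tuple


def transform_channel(weights: List[float]) -> Tuple[float, List[int]]:
    """Split one channel's weights into (magnitude, sign bits).

    magnitude: mean of the absolute values of the channel's elements.
    sign bits: 0 where the element is < 0, 1 where it is >= 0.
    """
    if not weights:
        raise ValueError("channel must contain at least one weight")
    magnitude = sum(abs(w) for w in weights) / len(weights)
    sign_bits = [0 if w < 0 else 1 for w in weights]
    return magnitude, sign_bits


# Hypothetical channel with four weights.
channel = [0.5, -0.25, 0.0, -1.25]
magnitude, sign_bits = transform_channel(channel)
print(magnitude, sign_bits)
```

Note that an element equal to 0 receives sign bit 1 under the stated rule, so after transformation it is treated as a positive weight with the channel's magnitude.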

In the pruning step, a reference value may be calculated using the mean and magnitude distribution of the magnitude parameters per input channel, or the mean and magnitude distribution of the magnitude parameters per input and output channel, and any magnitude parameter value smaller than the calculated reference value may be set to 0 so that the convolution operation for the corresponding channel is omitted.

When pruning uses the mean and magnitude distribution of the magnitude parameters of the input channels constituting a given layer, the reference value is calculated as the mean of the magnitude parameters of the input channels multiplied by a per-layer constant, and if the magnitude parameter value of a given input channel is smaller than the reference value, that channel's magnitude parameter value is changed to 0 to prune it. When pruning uses the mean and magnitude distribution of the magnitude parameters of the input and output channels constituting a given layer, the reference value is calculated as the mean of the magnitude parameters of the input channels and output channels multiplied by a per-layer constant, and if the magnitude parameter value of a given input channel is smaller than the reference value, that channel's magnitude parameter value is changed to 0 to prune it.

The per-layer constant may be determined according to the distribution of the convolution parameters of each layer.
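The input-channel variant of the pruning rule can be sketched as follows. The text does not say how the per-layer constant is derived from the parameter distribution, so `layer_constant=0.5` below is an arbitrary illustrative value, and `prune_magnitudes` is a hypothetical helper name.

```python
from typing import List


def prune_magnitudes(magnitudes: List[float], layer_constant: float) -> List[float]:
    """Zero every magnitude below (mean of magnitudes) * layer_constant."""
    if not magnitudes:
        raise ValueError("layer must contain at least one channel")
    reference = (sum(magnitudes) / len(magnitudes)) * layer_constant
    # A channel whose magnitude is set to 0 can be skipped during convolution.
    return [0.0 if m < reference else m for m in magnitudes]


# Hypothetical magnitudes of four input channels in one layer.
layer_magnitudes = [0.5, 0.125, 1.0, 0.375]
pruned = prune_magnitudes(layer_magnitudes, layer_constant=0.5)
print(pruned)
```

Here the mean is 0.5 and the reference value is 0.25, so only the channel with magnitude 0.125 is pruned. The input-and-output-channel variant would differ only in which magnitudes are averaged.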

The neural network parameter optimization method may further include transforming the existing parameters of the neural network into a scale parameter.

The step of transforming into the scale parameter may include variably allocating bits to the scale parameter, and quantizing the value range and weight of each scale parameter element according to a user selection.

A neural network computation method according to another embodiment of the present invention may include: loading existing parameters and input channel data of a neural network into memory; transforming the existing parameters into a sign parameter and a magnitude parameter having a single value per channel, and pruning the transformed magnitude parameter to generate optimized parameters; performing inference by convolving the optimized parameters with the input channel data; correcting the optimized parameters; and updating the existing parameters with the corrected optimized parameters.

The neural network computation method may further include: determining whether trained parameters exist; if no trained parameters exist, generating initial parameters through parameter initialization; generating optimized parameters from the initial parameters; and, if trained parameters exist, loading the existing parameters.

The step of performing inference by convolving the optimized parameters with the input channel data may include: loading the optimized parameters into memory; determining whether the value of the magnitude parameter included in the loaded optimized parameters is 0; and omitting the convolution operation if the value of the magnitude parameter is 0.

The inference step may further include, if the value of the magnitude parameter is not 0: determining the direction of the data by performing a bitwise operation on the sign parameter and the input channel data; summing the input channel data over the convolution parameter filter size; and performing a single multiplication on the magnitude parameter and the input channel data.
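The per-channel inference steps described above (zero check, sign application, summation, single multiplication) can be sketched for one filter window. This is a software illustration under assumptions: the sign "bit operation" is modeled as negating an input when its sign bit is 0, and `channel_response` is a hypothetical name.

```python
from typing import List


def channel_response(inputs: List[float], sign_bits: List[int], magnitude: float) -> float:
    """Contribution of one input channel to one output position."""
    if magnitude == 0.0:
        # Pruned channel: the whole convolution for this channel is skipped.
        return 0.0
    if len(inputs) != len(sign_bits):
        raise ValueError("inputs and sign_bits must have the same length")
    signed_sum = 0.0
    for x, s in zip(inputs, sign_bits):
        # Sign bit 1 keeps the input, sign bit 0 flips its direction.
        signed_sum += x if s == 1 else -x
    # One multiplication per channel instead of one per weight.
    return magnitude * signed_sum


window = [1.0, 2.0, 3.0, 4.0]  # hypothetical input values under the filter
bits = [1, 0, 1, 0]
result = channel_response(window, bits, magnitude=0.5)
skipped = channel_response(window, bits, magnitude=0.0)
print(result, skipped)
```

In hardware the sign step would typically be a sign-bit manipulation rather than a branch; the branch here only mirrors the logic.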

The inference step may further include, if a scale parameter exists, reducing the error of the computation result by using the scale parameter to apply differential weighting to the magnitude parameter.

A neural network computing apparatus according to yet another embodiment of the present invention may include: a sign parameter transformation unit that transforms existing parameters into a sign parameter; a magnitude parameter transformation unit that transforms the existing parameters into a magnitude parameter having a single value per channel; and a parameter pruning unit that prunes the transformed magnitude parameter to generate optimized parameters.

The parameter pruning unit may calculate a reference value using the mean and magnitude distribution of the magnitude parameters per input channel, or the mean and magnitude distribution of the magnitude parameters per input and output channel, and may set any magnitude parameter value smaller than the calculated reference value to 0 so that the convolution operation for the corresponding channel is omitted.

When pruning uses the mean and magnitude distribution of the magnitude parameters of the input channels constituting a given layer, the parameter pruning unit calculates the reference value as the mean of the magnitude parameters of the input channels multiplied by a per-layer constant, and if the magnitude parameter value of a given input channel is smaller than the reference value, changes that channel's magnitude parameter value to 0 to prune it. When pruning uses the mean and magnitude distribution of the magnitude parameters of the input and output channels constituting a given layer, the parameter pruning unit calculates the reference value as the mean of the magnitude parameters of the input channels and output channels multiplied by a per-layer constant, and if the magnitude parameter value of a given input channel is smaller than the reference value, changes that channel's magnitude parameter value to 0 to prune it.

The neural network computing apparatus may further include a scale parameter transformation unit that transforms the existing parameters into a scale parameter.

The neural network computing apparatus may further include an inference unit that performs inference by convolving the optimized parameters with the input channel data.

The inference unit may determine whether the value of the magnitude parameter included in the optimized parameters is 0 and, if it is 0, omit the convolution operation.

If a scale parameter exists, the inference unit may reduce the error of the computation result by using the scale parameter to apply differential weighting to the magnitude parameter.

As described above, the present invention provides a neural network parameter optimization method, a neural network computation method, and an apparatus therefor that effectively optimize the large amount of computation and the many parameters of a convolutional neural network for hardware implementation, obtaining minimal accuracy loss and maximal computation speed.

In addition, by optimizing the parameters of a neural network, the present invention can satisfy the low power consumption and high performance required of hardware implementing a convolutional neural network that operates in real time. Applying the neural network parameter optimization technique according to an embodiment allows computation to be omitted channel by channel according to the magnitude of each channel in the convolution operation. Moreover, because the hardware decides whether to skip computation per channel rather than per element within a channel, the computation is reduced by a multiple of the number of elements in the channel.
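The size of this reduction can be made concrete with a rough multiplication count per output position. The sketch below assumes that an ordinary convolution performs one multiplication per weight and that the optimized form performs one multiplication per non-pruned channel; these counting assumptions, the helper name `multiply_counts`, and the sample values are illustrative and are not figures from the patent.

```python
from typing import List, Tuple


def multiply_counts(elements_per_channel: int, magnitudes: List[float]) -> Tuple[int, int]:
    """Return (ordinary, optimized) multiplications for one output position."""
    ordinary = elements_per_channel * len(magnitudes)
    # Channels with magnitude 0 are skipped; the rest need a single multiply.
    optimized = sum(1 for m in magnitudes if m != 0.0)
    return ordinary, optimized


# Hypothetical layer: four input channels, 3x3 filters (9 elements each),
# one channel already pruned to magnitude 0.
ordinary, optimized = multiply_counts(9, [0.5, 0.0, 1.0, 0.375])
print(ordinary, optimized)
```

Skipping one channel removes all nine element-level operations at once, which is the "multiple of the number of channel elements" effect described above.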

In addition, the present invention can correct accuracy at a minimal loss of computation speed. For example, by separating out a scale parameter, different weights can be applied effectively according to the range of values, which efficiently improves performance in a hardware implementation.

FIG. 1 is a diagram defining the convolution parameters used in a neural network convolution operation according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a learning process for generating optimized parameters in a convolutional neural network (CNN) according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating in detail the learning process of FIG. 2 for generating optimized parameters according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating in detail an inference process according to an embodiment of the present invention;
FIG. 5 is a diagram showing an application example of a convolution parameter optimization process according to an embodiment of the present invention;
FIG. 6 is a diagram comparing a general convolution operation with a convolution operation according to an embodiment of the present invention;
FIG. 7 is a diagram showing the advantages of the convolution operation after parameter optimization according to an embodiment of the present invention;
FIG. 8 is a diagram showing an application example of a convolution parameter optimization process according to another embodiment of the present invention;
FIG. 9 is a diagram comparing, in order to explain scale parameter optimization, a parameter distribution to which only the magnitude parameter is applied with a parameter distribution to which the scale parameter is additionally applied;
FIG. 10 is a diagram showing an example of a convolution operation using optimized parameters according to an embodiment of the present invention;
FIG. 11 is a diagram showing the configuration of a neural network computing apparatus according to an embodiment of the present invention; and
FIG. 12 is a diagram showing the detailed configuration of the parameter optimization unit of FIG. 11 according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of the scope of the invention, and the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or practice of users or operators. Their definitions should therefore be based on the content of this specification as a whole.

Combinations of the blocks of the accompanying block diagrams and the steps of the flowcharts may be performed by computer program instructions (an execution engine). Because these computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, the instructions executed by the processor of the computer or other programmable data processing apparatus create means for performing the functions described in each block of the block diagrams or each step of the flowcharts.

These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing apparatus to implement functions in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagrams or each step of the flowcharts.

The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus, so that a series of operational steps is performed on the computer or other programmable data processing apparatus to produce a computer-executed process. The instructions that run on the computer or other programmable data processing apparatus can thereby provide steps for executing the functions described in each block of the block diagrams and each step of the flowcharts.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for carrying out the specified logical functions, and it should be noted that in some alternative embodiments the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially concurrently, or may be performed in the reverse order of the corresponding functions as needed.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, the embodiments illustrated below may be modified into various other forms, and the scope of the present invention is not limited to the embodiments described below. The embodiments are provided to explain the present invention more fully to those of ordinary skill in the art.

FIG. 1 is a diagram defining the convolution parameters used in a neural network convolution operation according to an embodiment of the present invention.

Referring to FIG. 1, if the total number of convolution layers in the neural network is l, each layer has k output channels (O₁, …, O_k), and each output channel consists of j input channels (I₁, I₂, …, I_j). Each input channel has i weight parameters (W₁, W₂, …, W_i). In the present invention, the optimization technique is described mathematically on the basis of the convolution parameter definition given above.
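The nesting defined for FIG. 1 (k output channels per layer, j input channels per output channel, i weights per input channel) can be pictured as a nested list. This is only one possible in-memory layout, sketched in plain Python; the type aliases and `make_layer` are hypothetical names, and the sizes are arbitrary.

```python
from typing import List

InputChannel = List[float]           # i weight parameters W_1 .. W_i
OutputChannel = List[InputChannel]   # j input channels I_1 .. I_j
Layer = List[OutputChannel]          # k output channels O_1 .. O_k


def make_layer(k: int, j: int, i: int, fill: float = 0.0) -> Layer:
    """Build one convolution layer's parameters as layer[output][input][weight]."""
    return [[[fill for _ in range(i)] for _ in range(j)] for _ in range(k)]


# Hypothetical layer: 2 output channels, 3 input channels, 3x3 filters.
layer = make_layer(k=2, j=3, i=9)
total_weights = sum(len(ch) for out in layer for ch in out)
print(len(layer), len(layer[0]), len(layer[0][0]), total_weights)
```

A full network would then be a list of l such layers.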

FIG. 2 is a diagram illustrating a learning process for generating optimized parameters in a convolutional neural network (CNN) according to an embodiment of the present invention.

Referring to FIG. 2, the learning process may include a step of loading existing parameters and input data (S200), a parameter optimization step (S210), an inference step (S220), a correction step (S230), and a parameter update step (S240).

More specifically, if the parameter optimization step (S210) is omitted, the process is identical to an ordinary deep learning training procedure, which means that the present invention can be readily applied in an ordinary deep learning training environment.

When previously trained parameters exist, the convolutional neural network optimization technique according to an embodiment of the present invention can be carried out by adding only the parameter optimization step (S210) to the existing training environment. When no previously trained parameters exist, initial training may first be carried out with the parameter optimization step (S210) omitted, and the parameter optimization step (S210) may then be applied. Because training with the parameter optimization step (S210) applied from the start may yield lower accuracy for the same reduction in computation, the previously trained convolution parameters and the input data are loaded into memory (S200), and the convolution parameter optimization technique proposed by the present invention is then applied to the loaded convolution parameters (S210).

Next, inference is performed by convolving the optimized parameters with the input data (S220), and the existing parameters are updated (S240) after the parameter correction step (S230). The parameter correction step (S230) includes computing backpropagation. After training ends, the convolution parameter optimization technique is applied and the parameters are stored. The parameter optimization step (S210), inference step (S220), parameter correction step (S230), and update step (S240) may be repeated until a target number of iterations is reached.
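The control flow of steps S210 to S240 can be sketched as a loop that takes each step as a function. This shows the ordering only: the three `toy_*` functions are placeholders invented for illustration (in particular, `toy_correct` performs no backpropagation), and `train` is a hypothetical name.

```python
from typing import Callable, List

Params = List[float]


def train(
    params: Params,
    inputs: List[float],
    optimize: Callable[[Params], Params],
    infer: Callable[[Params, List[float]], List[float]],
    correct: Callable[[Params, List[float]], Params],
    target_iterations: int,
) -> Params:
    """Repeat optimize -> infer -> correct -> update for a fixed number of iterations."""
    for _ in range(target_iterations):
        optimized = optimize(params)             # S210: parameter optimization
        outputs = infer(optimized, inputs)       # S220: inference
        corrected = correct(optimized, outputs)  # S230: correction (backpropagation in practice)
        params = corrected                       # S240: update existing parameters
    # After training ends, optimization is applied once more before storing.
    return optimize(params)


# Placeholder steps, for demonstration only.
def toy_optimize(p: Params) -> Params:
    return [0.0 if abs(w) < 0.25 else w for w in p]


def toy_infer(p: Params, x: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(p, x))]


def toy_correct(p: Params, outputs: List[float]) -> Params:
    return list(p)  # no gradient step; a real system would adjust p here


final_params = train([0.5, 0.125, -1.0], [1.0, 1.0, 1.0],
                     toy_optimize, toy_infer, toy_correct, target_iterations=3)
print(final_params)
```

Passing an identity function as `optimize` reduces the loop to an ordinary training loop, which matches the observation that omitting S210 gives the usual deep learning procedure.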

FIG. 3 is a diagram illustrating in detail the learning process of FIG. 2 for generating optimized parameters according to an embodiment of the present invention.

Referring to FIG. 3, the neural network computing apparatus according to an embodiment of the present invention first determines whether trained parameters exist (S300). If trained parameters exist, they are loaded into memory (S302). The apparatus then determines (S304) whether to apply the convolution parameter optimization (S303) of the present invention. If the convolution parameter optimization (S303) is not applied, the process is identical to an ordinary deep learning training procedure.

The convolution parameter optimization step (S303) includes a magnitude parameter generation step (S306) and a sign parameter generation step (S308), and may further include a step of determining whether to apply scale parameter optimization (S310) and, if it is determined that scale parameter optimization is to be applied, a step of generating the scale parameter (S312).

According to an embodiment of the present invention, once the convolution parameter optimization step (S303) is complete, the neural network computing apparatus performs inference with the optimized parameters and the input data in the inference step (S316) and, after the correction step (S318), updates the existing parameters with the optimized and corrected parameters (S320). It then determines whether a preset target number of iterations has been reached (S322); if not, the parameter optimization step (S303), inference step (S316), correction step (S318), and parameter update step (S320) are repeated.

도 4는 본 발명의 일 실시 예에 따른 추론 프로세스를 세부화하여 도시한 도면이다.4 is a detailed view illustrating a reasoning process according to an embodiment of the present invention.

도 4를 참조하면, 본 발명의 실시예에 따른 신경망 연산장치는 최적화 파라미터를 메모리에 로드(S400)하고, 크기 파라미터 값이 존재하는지, 즉, 크기 파라미터 값이 0(zero)인지 아닌지를 판단한다(S410). 크기 파라미터 값이 0이면 이후 과정이 생략됨에 따라 추론 연산량을 감소시킬 수 있다.Referring to FIG. 4, the neural network computation apparatus according to an embodiment of the present invention loads optimization parameters into a memory (S400) and determines whether or not a size parameter value exists, that is, whether a size parameter value is zero (S410). If the size parameter value is 0, the inference operation amount can be reduced since the subsequent process is omitted.

Next, if the size parameter value is not 0, the input data is loaded into memory (S420), and a bit operation is performed on the sign parameter and the input data (S430). The input data are then summed (S460), and the size parameter is multiplied by the summed input data (S470). After the bit operation (S430), when a ratio parameter is applied (S440), the process may further include a step (S470) of applying the ratio parameter correspondence constant, looked up via the ratio parameter, to the size parameter.
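The steps S410 to S470 for a single input channel can be sketched as follows. This is a minimal sketch rather than the apparatus's actual implementation: the function name `channel_output` is hypothetical, the bit operation is modelled as a sign flip, and the ratio parameter branch (S440) is left out.

```python
def channel_output(size, signs, inputs):
    """Inference for one input channel, following S410 to S470 (illustrative only)."""
    # S410: a pruned channel has size parameter 0, so every later step is skipped.
    if size == 0:
        return 0.0
    # S430: the 1-bit sign parameter (0 = positive, 1 = negative) sets the direction.
    signed = [-x if s else x for s, x in zip(signs, inputs)]
    # S460: sum the sign-adjusted input data.
    total = sum(signed)
    # S470: one multiplication by the channel's single size parameter.
    return size * total
```

For instance, `channel_output(0.5, [0, 1], [3.0, 1.0])` sums 3.0 and -1.0 and multiplies once by 0.5.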

FIG. 5 is a diagram illustrating an application example of a convolution parameter optimization process according to an embodiment of the present invention.

Referring to FIG. 5, the parameter optimization process according to an embodiment consists of two main parts: a shape transformation step (510) and a pruning step (520). The parameter shape transformation (510) converts the existing convolution parameters into a form that reduces the amount of computation and is easy to prune. For example, as shown in FIG. 5, the existing parameter (50) is separated into a size parameter (Magnitude Parameter) (52) representing magnitude and a sign parameter (Signed Parameter) (54) representing direction. FIG. 5 shows an example in which the existing parameter has three input channels (50-1, 50-2, 50-3). Each input channel (50-1, 50-2, 50-3) consists of a plurality of parameter elements. The sign parameter (54) records the direction of each element in each channel of the existing parameter (50), and the size parameter (52) replaces the weights of each channel of the existing parameter (50) with a single representative value per channel.

The user may set the number of bits of each element of the size parameter (52) arbitrarily, and each element may be represented as an integer, a floating-point number, or the like. The more bits are used, the closer the result is to that of the original convolution operation. Each element of the sign parameter (54) has 1 bit, which indicates only the positive (+) or negative (-) direction. Various methods may be applied to separate the size parameter (52) and the sign parameter (54) from the existing parameter (50). For example, the shape transformation may be performed using the average over each input channel of the convolution layer and the most significant bit of each parameter in the existing parameter (50).

One method of shape transformation into the size parameter (52) is to use the average value of the elements (weight parameters) of each input channel. For example, given i elements W_i, j input channels I_j, k output channels O_k, and the size parameter M = {M₁, M₂, …, M_j}, M_j is given by Equation 1.
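Equation 1 is not reproduced in this text. A reconstruction consistent with the worked example later in the description, and with the claim that the size parameter is the average of the absolute values of each channel's elements, might read as follows. The element index n and the double subscript kj (output channel k, input channel j) are notational assumptions introduced here.

```latex
M_{kj} = \frac{1}{i} \sum_{n=1}^{i} \left| W_{kjn} \right|
```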

One method of shape transformation into the sign parameter (54) is to assign the value 0 if an element (weight parameter) of the channel is greater than or equal to 0, and the value 1 if it is less than 0. That is, given the sign parameter S = {S₁, S₂, …, S_i}, S_i is given by Equation 2.
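Equation 2 is likewise not reproduced in this text. Based on the rule just stated, it can be reconstructed as the following piecewise definition; the triple subscript kjn (output channel, input channel, element) is an assumed notation chosen to match the worked example.

```latex
S_{kjn} =
\begin{cases}
0, & W_{kjn} \ge 0 \\
1, & W_{kjn} < 0
\end{cases}
```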

After the shape transformation, a pruning step (520) is performed to remove unnecessary parameters. In the pruning step (520), parameters of low importance are selected and set to 0 (zero) so that the corresponding operations can be skipped, which reduces the total amount of computation. Here, the optimized parameter is generated by pruning the size parameter according to its magnitude. For example, if a given size parameter value is smaller than a preset reference value, the size parameter of that channel is pruned.

The important point is that this pruning is applied not to all parameters of the convolution layer but only to the shape-transformed size parameter (52). Because pruning (520) is applied only to the size parameter (52), each pruned value affects all parameters of the corresponding input channel of the convolution layer, which maximizes the effect of the pruning (520).

Various methods may be applied to omit unnecessary parameters. For example, pruning may be performed using the average value and distribution per input channel and output channel of the convolution layer. Each layer has a constant set by the user, and this constant is chosen according to the distribution of the convolution parameters of that layer. As the reference value for omitting a value, the average of the size parameters of the input channels, or the average of the size parameters of the input and output channels, multiplied by a weight may be used. The layer constant takes a value between 0 and 1; assuming the convolution parameters of a layer follow a continuous uniform distribution or a normal distribution, the layer constant may be 0.5 to 0.7 in order to omit 50% of the size parameters. With this value as a guide, the layer constant is set in consideration of the actual parameter distribution.

When M_LOI = {M₁₁₁, M₁₁₂, …, M_lkj} and L = {C₁, C₂, …, C_l}, where C is a layer constant, the size parameter pruning method that applies a per-input-channel criterion is given by Equation 3.
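Equation 3 is not reproduced in this text. A reconstruction consistent with the stated rule (prune when the size parameter is smaller than the average over the input channels multiplied by the layer constant) and with the worked example might read as follows. Here J denotes the number of input channels and M' the pruned size parameter; both symbols are assumptions introduced for readability.

```latex
M'_{lkj} =
\begin{cases}
0, & M_{lkj} < C_l \cdot \dfrac{1}{J} \displaystyle\sum_{j'=1}^{J} M_{lkj'} \\
M_{lkj}, & \text{otherwise}
\end{cases}
```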

The size parameter pruning method that applies a per-output-channel criterion is given by Equation 4.
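Equation 4 is not reproduced in this text. Judging from the worked example, in which the reference value is the average over all input and output channels of the layer multiplied by the layer constant, it can be reconstructed as follows. J and K (the numbers of input and output channels) and M' (the pruned size parameter) are assumed symbols.

```latex
M'_{lkj} =
\begin{cases}
0, & M_{lkj} < C_l \cdot \dfrac{1}{JK} \displaystyle\sum_{k'=1}^{K} \sum_{j'=1}^{J} M_{lk'j'} \\
M_{lkj}, & \text{otherwise}
\end{cases}
```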

Hereinafter, an example calculation of convolution parameter optimization applying Equations 1 to 4 above is described.

Assume that the size parameter and the sign parameter are calculated for an existing convolution parameter with i = 4, j = 2, k = 2, and that pruning is then applied to the size parameter. Assume that O₁ = {I₁₁, I₁₂} = {[W₁₁₁, W₁₁₂, W₁₁₃, W₁₁₄], [W₁₂₁, W₁₂₂, W₁₂₃, W₁₂₄]} = {[-0.5, 1.0, 0.5, -1.0], [1.0, -1.5, 1.0, -1.5]} and O₂ = {I₂₁, I₂₂} = {[W₂₁₁, W₂₁₂, W₂₁₃, W₂₁₄], [W₂₂₁, W₂₂₂, W₂₂₃, W₂₂₄]} = {[1.5, -2.0, -1.5, 2.0], [-2.0, -2.5, 2.0, 2.5]}.

The size parameter calculated by applying Equation 1 is M = {M₁₁, M₁₂, M₂₁, M₂₂} = {0.75, 1.25, 1.75, 2.25}. That is, M₁₁ = 1/4 (0.5 + 1.0 + 0.5 + 1.0) = 0.75, M₁₂ = 1/4 (1.0 + 1.5 + 1.0 + 1.5) = 1.25, M₂₁ = 1/4 (1.5 + 2.0 + 1.5 + 2.0) = 1.75, and M₂₂ = 1/4 (2.0 + 2.5 + 2.0 + 2.5) = 2.25.

The sign parameter calculated by applying Equation 2 is S = {S₁, S₂, S₃, S₄} = {[1, 0, 0, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]}.
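The shape transformation in this example can be reproduced with a short sketch. The helper name `to_size_and_sign` is hypothetical; it uses the mean of absolute values for the size parameter and the rule "0 if the weight is at least 0, otherwise 1" for the sign parameter, as described above.

```python
def to_size_and_sign(channel):
    """Split one input channel's weights into (size parameter, sign bits)."""
    # Size parameter: mean of the absolute values of the channel's elements.
    size = sum(abs(w) for w in channel) / len(channel)
    # Sign parameter: 0 for weights >= 0, 1 for weights < 0 (one bit per element).
    sign = [1 if w < 0 else 0 for w in channel]
    return size, sign

# Example weights from the description: I11, I12, I21, I22.
weights = [
    [-0.5, 1.0, 0.5, -1.0],
    [1.0, -1.5, 1.0, -1.5],
    [1.5, -2.0, -1.5, 2.0],
    [-2.0, -2.5, 2.0, 2.5],
]
sizes, signs = zip(*(to_size_and_sign(ch) for ch in weights))
print(list(sizes))  # [0.75, 1.25, 1.75, 2.25]
print(list(signs))  # [[1, 0, 0, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
```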

An example of pruning the size parameter is as follows. For an existing parameter with j = 2, k = 2, l = 1, let the size parameter separated from the existing parameter be M = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0.75, 1.25, 1.75, 2.25} and let the layer constant be L₁ = 0.9. Applying Equation 3 per input channel gives the pruned size parameter M' = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0, 1.25, 0, 2.25}. For example, for M₁₁₁, 0.75 < 1/2(0.75 + 1.25)·0.9 = 0.9, so M₁₁₁ = 0 by Equation 3.

Applying Equation 4 per input and output channel gives the pruned size parameter M' = {M₁₁₁, M₁₁₂, M₁₂₁, M₁₂₂} = {0, 0, 1.75, 2.25}. For M₁₁₂, 1.25 < 1/4(0.75 + 1.25 + 1.75 + 2.25)·0.9 = 1.35, so M₁₁₂ = 0 by Equation 4. Whether to apply the criterion per input channel or per input and output channel may be decided according to the parameter distribution.
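Both pruning variants in this example can be checked with a small sketch. The function name `prune` and the flag `per_input_channel` are hypothetical; the thresholds follow the description, namely the average size parameter multiplied by the layer constant, with a strict "smaller than" comparison.

```python
def prune(sizes, layer_constant, per_input_channel=True):
    """Zero out size parameters below a reference value.

    sizes[k][j] is the size parameter for output channel k, input channel j.
    """
    if per_input_channel:
        # Equation 3 style: average over the input channels of each output channel.
        thresholds = [sum(row) / len(row) * layer_constant for row in sizes]
    else:
        # Equation 4 style: average over all input and output channels of the layer.
        flat = [m for row in sizes for m in row]
        thresholds = [sum(flat) / len(flat) * layer_constant] * len(sizes)
    return [[0 if m < t else m for m in row] for row, t in zip(sizes, thresholds)]

example = [[0.75, 1.25], [1.75, 2.25]]
print(prune(example, 0.9))                           # [[0, 1.25], [0, 2.25]]
print(prune(example, 0.9, per_input_channel=False))  # [[0, 0], [1.75, 2.25]]
```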

FIG. 6 is a diagram comparing a general convolution operation with a convolution operation according to an embodiment of the present invention.

Referring to FIG. 6, a general convolution operation (60) requires as many multiplication operations (600) and addition operations (602) as the size of the convolution parameter filter. In contrast, the convolution operation (62) with parameter optimization according to an embodiment determines the direction of the data through a bit operation (620), sums all values over the filter size (622), and then produces the result with a single multiplication (624). For example, as shown in FIG. 6, the general convolution operation (60) performs 9 multiplications (600), whereas the convolution operation (62) with parameter optimization performs only 1. The number of multiplications is thus reduced by a factor equal to the filter size.

In addition, if pruning is applied without shape-transforming the convolution parameters, as many comparison operations as the filter size are needed to decide whether an operation should be skipped. The general convolution operation (60) in the example of FIG. 6 requires a total of 9 comparisons. With the convolution operation (62) using the parameter optimization technique of the present invention, however, a single comparison per input channel of the convolution layer is enough to decide whether to skip the operation. This reduction in computation requires fewer arithmetic units than the conventional convolution operation (60), so the hardware can be designed efficiently.

FIG. 7 is a diagram illustrating the advantage of the convolution operation after parameter optimization according to an embodiment of the present invention.

Referring to FIG. 7, output data is generated by a convolution operation on the input data with the size parameter and the sign parameter. When the size parameter value is 0, the operation need not be performed, so the amount of computation can be reduced substantially on a channel-by-channel basis.

FIG. 8 is a diagram illustrating an application example of a convolution parameter optimization process according to another embodiment of the present invention.

The goal of convolution parameter optimization is to obtain the highest possible accuracy with the greatest possible optimization (reduction in computation). Reduced computation and accuracy generally trade off against each other. In real applications there are performance requirements (speed and accuracy) that must be met, and it is not easy for the user to tune this balance. To address this, the present invention proposes an optimization scheme using a ratio parameter (Scale Parameter), which adds a ratio parameter to the optimization scheme described above with reference to FIG. 5.

In the shape transformation process (510), the existing parameter (50) is transformed into a size parameter (52), a sign parameter (54), and a ratio parameter (58), and pruning (520) is applied only to the size parameter (52). During inference, a weight is applied to the size parameter with reference to the ratio parameter (58) in order to reduce the error relative to the original operation result.

The ratio parameter (58) indicates how important the size parameter (52) is. In other words, it is a scheme that compares the importance among the input channel parameters of the convolution layer and applies constants differentially. The user may allocate any desired number of bits to each element of the ratio parameter (58), and can thereby control the degree of optimization. The ratio parameter (58) itself is not a weight; instead, there are constant values that the ratio parameter (58) refers to. These constants are assigned values that are easy to implement in hardware, so that they can be applied using bit operations alone. When optimization applies a weight via the ratio parameter (58) rather than using the value of the size parameter (52) directly, similarity to the original, pre-optimization parameters increases and accuracy improves.

FIG. 9 is a diagram comparing, for the purpose of explaining ratio parameter optimization, the parameter distribution when only the size parameter is applied with the parameter distribution when the ratio parameter is additionally applied.

Referring to FIGS. 8 and 9, various methods may be applied to generate the ratio parameter (58); in the present invention, the ratio parameter (58) is generated from the distribution of the convolution parameters. If the number of ratio parameter bits is b, there are 2^b user-defined constants, which are called ratio parameter correspondence constants. The ratio parameter correspondence constants are constants corresponding to the ratio parameter and may be set by the user based on the convolution parameter distribution.

An example of computing the ratio parameter (58) is as follows.

When the number of ratio parameter bits is b, λ = {λ₁, λ₂, …, λ_t}, t = {1, 2, …, 2^b - 1}, λ_t > λ_{t+1}, and the ratio parameter is SC = {SC₁, SC₂, …, SC_i}, SC_i is given by Equation 5.

During inference, the size parameter multiplied by the ratio parameter correspondence constant is used as the actual operand.

Hereinafter, an example calculation of ratio parameter optimization applying Equation 5 above is described.

Assume that the ratio parameter is calculated for an existing convolution parameter with i = 4, j = 2, k = 2. Assume that the number of ratio parameter bits is b = 2, t = {1, 2, 3}, and λ = {0.9, 0.7, 0.5}. Assume further that O₁ = {I₁₁, I₁₂} = {W₁₁₁, W₁₁₂, W₁₁₃, W₁₁₄, W₁₂₁, W₁₂₂, W₁₂₃, W₁₂₄} = {-0.5, 1.0, 0.5, -1.0, 1.0, -1.5, 1.0, -1.5} and O₂ = {I₂₁, I₂₂} = {W₂₁₁, W₂₁₂, W₂₁₃, W₂₁₄, W₂₂₁, W₂₂₂, W₂₂₃, W₂₂₄} = {1.5, -2.0, -1.5, 2.0, -2.0, -2.5, 2.0, 2.5}.

By Equation 5, SC₁₁₁ = 2, because 1/4(0.5 + 1.0 + 0.5 + 1.0)·0.7 > |-0.5| > 1/4(0.5 + 1.0 + 0.5 + 1.0)·0.5. The ratio parameter calculated in this way is SC_i = {SC₁₁₁, SC₁₁₂, SC₁₁₃, SC₁₁₄, SC₁₂₁, SC₁₂₂, SC₁₂₃, SC₁₂₄, SC₂₁₁, SC₂₁₂, SC₂₁₃, SC₂₁₄, SC₂₂₁, SC₂₂₂, SC₂₂₃, SC₂₂₄} = {2, 0, 2, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}.
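Equation 5 itself is not reproduced in this text, but the worked example suggests the following rule: the ratio parameter of an element is the number of thresholds M·λ_t that lie above the element's absolute value. The sketch below implements that reading; the function name `ratio_parameter` is hypothetical, and the handling of exact equality with a threshold is an assumption, since the example never hits a boundary.

```python
def ratio_parameter(channel, lambdas):
    """Assign each element a ratio parameter from thresholds size * lambda_t.

    Assumed rule inferred from the worked example: count the thresholds
    that the element's absolute value falls strictly below.
    """
    size = sum(abs(w) for w in channel) / len(channel)  # size parameter of the channel
    return [sum(1 for lam in lambdas if abs(w) < size * lam) for w in channel]

lambdas = [0.9, 0.7, 0.5]  # b = 2 bits gives 2**2 - 1 = 3 thresholds
channels = [
    [-0.5, 1.0, 0.5, -1.0],
    [1.0, -1.5, 1.0, -1.5],
    [1.5, -2.0, -1.5, 2.0],
    [-2.0, -2.5, 2.0, 2.5],
]
sc = [ratio_parameter(ch, lambdas) for ch in channels]
print(sc)  # [[2, 0, 2, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]]
```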

FIG. 10 is a diagram illustrating an example of a convolution operation using optimized parameters according to an embodiment of the present invention.

FIG. 10 shows the convolution operation performed when a size parameter (M), a sign parameter (S), a ratio parameter (SC), and ratio parameter correspondence constants (C) exist and input data (In) is supplied. The example uses M = {0.75}, S = {0, 0, 1, 1}, SC = {0, 0, 1, 1}, C = {1.0, 0.5}, and input data In = {1.2, -2.0, 0.5, 3.1}.

In the example of FIG. 10, the ratio parameter correspondence constants (C) are chosen as powers of 2 (here 1.0 and 0.5), so that the result can be obtained with shift bit operations instead of general arithmetic operations. This is advantageous for hardware implementation.
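The values given for FIG. 10 can be run through a short sketch. It assumes that the correspondence constants are powers of two, as the example values C = {1.0, 0.5} are, so that constant index SC corresponds to a right shift by SC bits; the shift is modelled with `math.ldexp` on floating-point inputs. The function name `optimized_conv` and the order of operations (scale each input, sum, then multiply once by M) are assumptions, chosen to be arithmetically equivalent to multiplying the size parameter by the correspondence constant.

```python
import math

def optimized_conv(size, signs, ratios, inputs):
    """Convolution of one channel with size, sign and ratio parameters (sketch)."""
    total = 0.0
    for s, sc, x in zip(signs, ratios, inputs):
        x = -x if s else x            # sign bit: 1 flips the direction
        total += math.ldexp(x, -sc)   # x * 2**(-sc), standing in for a right shift
    return size * total               # single multiplication by the size parameter

result = optimized_conv(0.75, [0, 0, 1, 1], [0, 0, 1, 1], [1.2, -2.0, 0.5, 3.1])
print(result)  # approximately -1.95
```

Under these assumptions the effective weights are {0.75, 0.75, -0.375, -0.375}, and only one true multiplication is needed for the channel.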

FIG. 11 is a diagram illustrating the configuration of a neural network computing apparatus according to an embodiment of the present invention.

As shown in the figure, the neural network computing apparatus (1) according to an embodiment of the present invention may include an input unit (10), a processor (12), a memory (14), and an output unit (16).

The input unit (10) receives data including user commands, and the received data is stored in the memory (14). The processor (12) performs neural network operations, stores the results in the memory (14), and may output them through the output unit (16). Input data and parameters may be loaded into the memory (14). The processor (12) according to an embodiment includes a parameter optimization unit (120), an inference unit (122), a correction unit (124), and an update unit (126).

The parameter optimization unit (120) shape-transforms the existing parameters and prunes the size parameter among the shape-transformed parameters. The existing parameters may be parameters that have already been trained, and they are loaded into memory for computation.

The parameter optimization unit (120) according to an embodiment of the present invention shape-transforms the existing parameters into a size parameter and a sign parameter, or into a size parameter, a sign parameter, and a ratio parameter. The sign parameter records the direction of each element in each channel of the existing parameters, and the size parameter replaces the weights of each channel of the existing parameters with a single representative value per channel. The ratio parameter indicates how important the size parameter is; during inference, differential weights may be applied to the size parameter with reference to the ratio parameter.

In an embodiment of the present invention, the pruning is performed only on the size parameter among the shape-transformed parameters. Pruning means selecting size parameters of low importance and setting their values to 0 (zero). For a pruned parameter, the convolution operation can be omitted on a channel-by-channel basis, which reduces the total amount of computation. Importantly, because a pruned size parameter affects every parameter element of the corresponding input channel of the convolution layer, the effect of pruning is maximized.

The inference unit (122) performs inference by a convolution operation on the optimized parameters and the input channel data. Here, the inference unit (122) may determine whether the value of the size parameter included in the optimized parameters is 0 and, if it is, omit the convolution operation. In addition, when a ratio parameter exists, the inference unit (122) may use it to apply differential weights to the size parameter and thereby reduce the error of the operation result. In an embodiment of the present invention, the correction unit (124) corrects the optimized parameters, and the update unit (126) updates the existing parameters with the corrected optimized parameters.

FIG. 12 is a diagram illustrating the detailed configuration of the parameter optimization unit of FIG. 11 according to an embodiment of the present invention.

Referring to FIGS. 11 and 12, the parameter optimization unit (120) according to an embodiment of the present invention may include a size parameter conversion unit (1200), a sign parameter conversion unit (1202), a parameter pruning unit (1204), and a ratio parameter conversion unit (1206).

The sign parameter conversion unit (1202) shape-transforms the existing parameters into a sign parameter. The size parameter conversion unit (1200) shape-transforms the existing parameters into a size parameter having a single value per channel. The ratio parameter conversion unit (1206) shape-transforms the existing parameters into a ratio parameter.

The parameter pruning unit (1204) prunes the shape-transformed size parameter to generate the optimized parameter. For example, if a given size parameter value is smaller than a preset reference value, the size parameter of that channel is pruned.

The parameter pruning unit (1204) according to an embodiment of the present invention calculates a reference value using the average value and magnitude distribution of the size parameters per input channel, or the average value and magnitude distribution of the size parameters per input and output channel, and sets any size parameter value smaller than the calculated reference value to 0 so that the convolution operation of that channel is omitted.

When pruning using the average value and magnitude distribution of the size parameters of the input channels constituting a given layer, the parameter pruning unit (1204) calculates the reference value by multiplying the average value of the size parameters of the input channels by a layer-specific constant, and, if the size parameter value of a given input channel is smaller than the reference value, prunes it by changing the size parameter value of that channel to 0.

When pruning using the average value and magnitude distribution of the size parameters of the input and output channels constituting a given layer, the parameter pruning unit (1204) calculates the reference value by multiplying the average value of the size parameters of the input channels and output channels by a layer-specific constant, and, if the size parameter value of a given input channel is smaller than the reference value, prunes it by changing the size parameter value of that channel to 0.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics. Therefore, the disclosed embodiments should be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

10: Input unit
12: Processor
14: Memory
16: Output unit
120: Parameter optimization unit
122: Inference unit
124: Correction unit
126: Update unit
1200: Size parameter conversion unit
1202: Sign parameter conversion unit
1204: Parameter pruning unit
1206: Ratio parameter conversion unit

Claims

A neural network parameter optimization method comprising:
transforming existing parameters of a neural network into a sign parameter and a size parameter having a single value per channel, using a sign parameter conversion unit and a size parameter conversion unit constituting a parameter optimization unit; and
pruning the transformed size parameter using a parameter pruning unit constituting the parameter optimization unit to generate an optimized parameter,
wherein the pruning using the parameter pruning unit calculates a reference value using an average value and a magnitude distribution of size parameters per input channel, or an average value and a magnitude distribution of size parameters per input and output channel, and sets to 0 any size parameter value smaller than the calculated reference value so that the convolution operation of the corresponding channel is omitted,
wherein, when pruning is performed using the average value and the magnitude distribution of the size parameters of the input channels constituting a given layer, the reference value is calculated by multiplying the average value of the size parameters of the input channels by a layer-specific constant, and when the size parameter value of a given input channel is smaller than the reference value, the size parameter value of that channel is changed to 0 and thereby pruned, and
wherein, when pruning is performed using the average value and the magnitude distribution of the size parameters of the input and output channels constituting the given layer, the reference value is calculated by multiplying the average value of the size parameters of the input channels and output channels by a layer-specific constant, and when the size parameter value of a given input channel is smaller than the reference value, the size parameter value of that channel is changed to 0 and thereby pruned.

The method according to claim 1,
Wherein the code parameter determines the direction of each element of each channel of the existing parameters, and
Wherein the size parameter optimizes the weights into a single representative value per channel of the existing parameters.

The method according to claim 1,
Wherein the transforming comprises generating the size parameter by calculating the average of the absolute values of the elements of each channel of the existing parameters.

The method according to claim 1,
Wherein the transforming comprises generating the code parameter by setting the corresponding element value of the code parameter to 0 when a predetermined element constituting a channel of the existing parameters is smaller than 0, and to 1 when the element is equal to or larger than 0.
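The shape transformation described in the claims above can be sketched for a single channel as follows. This is a minimal Python illustration, not the patented implementation: the function name `to_code_and_size` and the sample weights are hypothetical, and the sketch assumes a code bit of 1 for elements that are equal to or larger than 0.

```python
def to_code_and_size(channel):
    """Split one channel's weights into code (sign) bits and one size value."""
    # Code parameter: 0 for negative elements, 1 otherwise (assumed convention).
    code = [0 if w < 0 else 1 for w in channel]
    # Size parameter: average of the absolute values of the channel's elements.
    size = sum(abs(w) for w in channel) / len(channel)
    return code, size

# Hypothetical weights of one channel of an existing parameter.
code, size = to_code_and_size([0.5, -0.25, 0.0, -1.25])
print(code, size)
```

Each weight is then approximated by the size value with the sign given by its code bit, so a channel needs one bit per element plus a single magnitude.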

(Deleted)

The method according to claim 1,
Wherein the constant for each layer is determined according to the distribution of the convolution parameters of each layer.

The method of claim 1,
Further comprising converting the existing parameters of the neural network into a ratio parameter using a ratio parameter conversion unit that constitutes the parameter optimization unit.

The method of claim 8, wherein converting the existing parameters of the neural network into the ratio parameter using the ratio parameter conversion unit comprises:
Variably allocating bits of the ratio parameter; and
Quantizing the range and the weight of the values of the ratio parameter elements according to user selection.

A neural network computing method comprising:
Loading existing parameters and input channel data of a neural network into a memory using a processor constituting a neural network computing apparatus;
Transforming the existing parameters of the neural network into a code parameter and a size parameter having a single value per channel, using a code parameter conversion unit and a size parameter conversion unit constituting a parameter optimization unit included in the processor, and pruning the transformed size parameter using a parameter pruning unit to generate an optimized parameter;
Performing a convolution operation on the optimized parameter and the input channel data using an inference unit included in the processor;
Correcting the optimized parameter using a correction unit included in the processor; and
Updating the existing parameters with the corrected optimized parameter using an update unit included in the processor,
Wherein performing the convolution operation on the optimized parameter and the input channel data using the inference unit comprises: loading the optimized parameter into the memory; determining whether the value of a size parameter included in the loaded optimized parameter is 0; and omitting the convolution operation if the value of the size parameter is 0,
Wherein performing the convolution operation on the optimized parameter and the input channel data using the inference unit further comprises: if the value of the size parameter is not 0, determining the direction of the data by performing a bit operation on the code parameter and the input channel data; summing the input channel data over the convolution parameter filter size; and performing a single multiplication of the size parameter and the input channel data.
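The inference steps recited above (skip on a zero size parameter, apply the sign, sum over the filter window, multiply once) can be sketched for one channel at one output position as follows. The function name `channel_output` and the sample data are hypothetical, a code bit of 1 is assumed to mean a non-negative weight, and plain negation stands in for the hardware bit operation on the sign.

```python
def channel_output(window, code, size):
    """Contribution of one channel at one output position."""
    # Pruned channel: the convolution for this channel is omitted.
    if size == 0:
        return 0.0
    # Determine the direction of each input value from its code bit
    # (assumed: 1 keeps the sign, 0 flips it), then sum over the filter window.
    signed_sum = 0.0
    for x, bit in zip(window, code):
        signed_sum += x if bit == 1 else -x
    # A single multiplication by the channel's size parameter.
    return size * signed_sum

window = [1.0, 2.0, 3.0, 4.0]   # input values under the filter (hypothetical)
code = [1, 0, 1, 0]             # code parameter of the channel
result = channel_output(window, code, size=0.5)   # 0.5 * (1 - 2 + 3 - 4)
skipped = channel_output(window, code, size=0.0)  # pruned channel
print(result, skipped)
```

Compared with an ordinary convolution, which multiplies every input value by its own weight, this form needs only one multiplication per channel and position.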

The method of claim 10, further comprising:
Determining whether a learned parameter exists;
Generating an initial parameter through parameter initialization if no learned parameter exists;
Generating an optimized parameter for the initial parameter; and
Loading the learned parameter if it exists.

(Deleted)

The method of claim 10, wherein performing the convolution operation on the optimized parameter and the input channel data further comprises:
If a ratio parameter is present, reducing the error of the calculation result by applying weights differentially to the size parameter using the ratio parameter generated by a ratio parameter conversion unit constituting the parameter optimization unit.

A neural network computing apparatus comprising:
A code parameter conversion unit for converting existing parameters into a code parameter;
A size parameter conversion unit for converting the existing parameters into a size parameter having a single value per channel; and
A parameter pruning unit for pruning the transformed size parameter to generate an optimized parameter,
Wherein the parameter pruning unit calculates a reference value using an average value and a size distribution of the size parameters for each input channel, or an average value and a size distribution of the size parameters for each input and output channel, and sets any size parameter value smaller than the calculated reference value to 0 so that the convolution operation of the corresponding channel is omitted,
Wherein the parameter pruning unit,
When pruning is performed using the average value and the size distribution of the size parameters of the input channels constituting a predetermined layer, calculates the reference value by multiplying the average value of the size parameters of the input channels by a constant for each layer, and changes the size parameter value of a predetermined input channel to 0 and prunes it when that value is smaller than the reference value, and
When pruning is performed using the average value and the size distribution of the size parameters of the input and output channels constituting a predetermined layer, calculates the reference value by multiplying the average value of the size parameters of the input channels and output channels by a constant for each layer, and changes the size parameter value of a predetermined input channel to 0 and prunes it when that value is smaller than the reference value.

(Deleted)

The apparatus of claim 15, further comprising:
A ratio parameter conversion unit for converting the existing parameters into a ratio parameter.

The apparatus of claim 15, further comprising:
An inference unit for performing a convolution operation on the optimized parameter and input channel data.

The apparatus of claim 19, wherein the inference unit
Determines whether the value of the size parameter included in the optimized parameter is 0, and omits the convolution operation if the value of the size parameter is 0.

The apparatus of claim 19, wherein the inference unit,
When a ratio parameter is present, applies weights differentially to the size parameter using the ratio parameter to reduce the error of the calculation result.