KR20210009584A

KR20210009584A - Quantization apparatus and method using numerical representation with dynamic precision

Info

Publication number: KR20210009584A
Application number: KR1020190086281A
Authority: KR
Inventors: 정기석; 노시동; 박상수; 박경빈; 이민규
Original assignee: 한양대학교 산학협력단
Priority date: 2019-07-17
Filing date: 2019-07-17
Publication date: 2021-01-27
Also published as: KR102243119B1

Abstract

The present invention provides a variable precision quantization device and method that receive input data and quantize it into n-bit variable precision data whose precision varies for each range of the input data. The quantization device comprises: a binary bit section setting unit that divides the n bits of data into a binary bit section and an incremental bit section, and sets the number of bits of the binary bit section among the n bits so that the number of bits of the incremental bit section is adjusted according to the required representation precision and representation range; and a variable precision conversion unit that calculates incremental data by designating the minimum unit of increase or decrease of the binary data held in the binary bit section so as to adjust the quantization range of the input data to be quantized, calculates binary data representing the value corresponding to the input data within the quantization range given by the incremental data, and sequentially arranges the calculated incremental data and binary data to form the variable precision data.

Description

Variable precision quantization apparatus and method {QUANTIZATION APPARATUS AND METHOD USING NUMERICAL REPRESENTATION WITH DYNAMIC PRECISION}

The present invention relates to a quantization apparatus and method, and more particularly to a variable precision quantization apparatus and method whose precision can be varied.

Recently, artificial neural networks, which are configured to process information in a manner similar to the brain by imitating the way the human brain recognizes patterns, have been applied in various fields. Artificial neural networks require training on vast amounts of data, and the training process requires a large number of addition and multiplication operations.

To process such a large number of operations, formats for representing numbers such as fixed point and floating point are used, and various quantization techniques have been proposed to improve computational efficiency. In general, however, when a quantization technique is used, precision and efficiency trade off against each other depending on how each format represents numbers. That is, improving precision requires larger memory capacity, bandwidth, and power consumption, which lowers efficiency, while improving efficiency lowers precision.

Korean Patent Publication No. 10-2019-0043849 (published April 29, 2019)

An object of the present invention is to provide a variable precision quantization apparatus and method capable of quantizing with a precision that differs for each range, based on the distribution of the data by frequency of occurrence.

Another object of the present invention is to provide a variable precision quantization apparatus and method capable of maintaining a required precision while greatly improving computational efficiency by using a variable precision format that can represent numbers with a small data size.

To achieve the above object, a variable precision quantization apparatus according to an embodiment of the present invention receives input data and quantizes it into n-bit variable precision data whose precision varies for each range of the input data. The apparatus includes: a binary bit section setting unit that divides the n bits of data into a binary bit section and an incremental bit section, and sets the number of bits of the binary bit section among the n bits so that the number of bits of the incremental bit section is adjusted according to the required representation precision and representation range; and a variable precision conversion unit that calculates incremental data by designating the minimum unit of increase or decrease of the binary data held in the binary bit section so as to adjust the quantization range of the input data to be quantized, calculates binary data representing the value corresponding to the input data within the quantization range given by the incremental data, and sequentially arranges the calculated incremental data and binary data to convert them into variable precision data.

The variable precision conversion unit may obtain the incremental data by subtracting the number of bits of the binary bit section from the bit position of the most significant bit, excluding the sign bit, of the input data supplied as a binary number. It may then extract as the binary data as many bits as the binary bit section holds, taken in order from the most significant of the bits that remain after the sign bit and that most significant bit are excluded, and take the most significant one of the bits remaining after the extracted binary data as a rounding bit value. Finally, it may place the sign bit as the most significant bit, arrange the obtained incremental data and the extracted binary data after it in sequence, and add the rounding bit value to obtain the variable precision data.

The quantization apparatus may further include a scaling factor setting unit that, if the input data includes digits after the decimal point, converts the input data into integer data and obtains and stores a scaling factor representing the exponent value used to convert the input data into integer data.

The binary bit section setting unit may reduce the number of bits of the binary bit section as the required representation precision becomes finer or the representation range becomes wider.

The binary bit section setting unit may increase the number of bits of the binary bit section in proportion to the variance of the input data.

If the input data includes a sign bit, the binary bit section setting unit may set the most significant bit of the n-bit variable precision data as the sign bit.

To achieve the above object, a variable precision quantization method according to another embodiment of the present invention receives input data and quantizes it into n-bit variable precision data whose precision varies for each range of the input data. The method includes: dividing the n bits of data into a binary bit section and an incremental bit section; setting the number of bits of the binary bit section among the n bits so that the number of bits of the incremental bit section is adjusted according to the required representation precision and representation range; and calculating incremental data by designating the minimum unit of increase or decrease of the binary data held in the binary bit section so as to adjust the quantization range of the input data to be quantized, calculating binary data representing the value corresponding to the input data within the quantization range given by the incremental data, and sequentially arranging the calculated incremental data and binary data to convert them into variable precision data.

Accordingly, the variable precision quantization apparatus and method according to embodiments of the present invention use a format for representing the value of data of a predetermined length that is divided into a binary bit section, which represents a binary value, and an incremental bit section, which controls the increment of that binary value, so that changes in the binary value of the binary bit section are quantized with different step sizes depending on the value of the incremental bit section. In particular, by adjusting the length of the binary bit section and the length of the incremental bit section based on the data distribution, the precision can be set differently for each range when quantizing with data of a specified length, which greatly improves computational efficiency while maintaining the required precision in a specific range.

FIG. 1 shows a schematic structure of an artificial neural network.
FIG. 2 shows a schematic structure of a neural node in the artificial neural network of FIG. 1.
FIG. 3 shows an example of frequency histograms of the input activations and weights of the artificial neural network of FIG. 1.
FIG. 4 is a diagram for explaining a variable precision format according to an embodiment of the present invention.
FIG. 5 shows the result of comparing the variable precision format of FIG. 4 with existing formats.
FIG. 6 is a diagram for explaining a method of converting to the variable precision format of the present embodiment.
FIG. 7 shows a schematic structure of an artificial neural network to which a variable precision quantization apparatus according to an embodiment of the present invention is applied.
FIGS. 8 and 9 are diagrams for explaining an addition algorithm for variable precision format data according to the present embodiment.
FIGS. 10 and 11 are diagrams for explaining a multiplication algorithm for variable precision format data according to the present embodiment.
FIG. 12 shows a variable precision quantization method according to an embodiment of the present invention.

To fully understand the present invention, its operational advantages, and the objects achieved by practicing it, reference should be made to the accompanying drawings, which illustrate preferred embodiments of the present invention, and to the contents described in those drawings.

Hereinafter, the present invention is described in detail by describing preferred embodiments with reference to the accompanying drawings. The present invention may, however, be implemented in various different forms and is not limited to the described embodiments. To describe the present invention clearly, parts irrelevant to the description are omitted, and the same reference numerals in the drawings denote the same members.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, not that other components are excluded, unless specifically stated to the contrary. In addition, terms such as "... unit", "... device", "module", and "block" used in the specification mean a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

FIG. 1 shows a schematic structure of an artificial neural network, and FIG. 2 shows a schematic structure of a neural node in the artificial neural network of FIG. 1.

Referring to FIG. 1, the artificial neural network comprises a plurality of layers (layer₁ to layer_s), each including a plurality of neural nodes. The first layer (layer₁) is an input layer that receives input data, and the last layer (layer_s) functions as an output layer that outputs the output data resulting from the operations the network performs on the input data. At least one layer (layer₂ to layer_s-1) between the input layer (layer₁) and the output layer (layer_s) is a hidden layer, which receives data from the neural nodes of the preceding layer, performs a predetermined operation on each piece of received data, and passes the result to the next layer. Here, the data input to each neural node may be called an input activation, and the data output from each neural node an output activation.

Referring to FIG. 2, each neural node of a hidden layer (layer₂ to layer_s-1) applies weights (w_1j, w_2j, w_3j, ... w_mj) to the input activations (IA) (x₁, x₂, x₃, ... x_m) passed from the neural nodes of the preceding layer to control how strongly each input is reflected, adds the bias (b_i) assigned to the node to the sum of the weighted input activations, applies an activation function (f) that controls the activation level, and outputs the resulting output activation (OA) (y_i) to the neural nodes of the next layer.
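The node computation described above can be sketched in a few lines of Python. This is only an illustration: the function names, the example values, and the choice of ReLU as the activation function f are assumptions and are not taken from the specification.

```python
def relu(s):
    # Assumed activation function; the text only calls it f.
    return max(0.0, s)


def neuron_output(x, w, b, f=relu):
    """Weighted sum of input activations plus the node bias, passed through f."""
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w))
    return f(weighted_sum + b)


# Three input activations, three weights, one bias (illustrative values only).
y = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.1], 0.2)
```

Every output activation costs m multiplications and m additions, which is why the number format used for x and w dominates the cost of training.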

Here, each neural node applies an assigned weight (w_1j, w_2j, w_3j, ... w_mj) to each of its input activations (x₁, x₂, x₃, ... x_m), and the weights (w_1j, w_2j, w_3j, ... w_mj) take the values determined by training the artificial neural network.

The artificial neural network performs the designated operations on a large amount of training data and is trained repeatedly until the result of the operations for each item of training data falls within a predetermined error range. During training with the training data, the weights (w_1j, w_2j, w_3j, ... w_mj) of the artificial neural network are updated by error backpropagation.

That is, during training the artificial neural network repeats the same operations over and over, as many times as the amount of training data and the number of iterations require, while updating the weights (w_1j, w_2j, w_3j, ... w_mj). It is therefore necessary to quantize the input and output activations and the weights to increase computational efficiency.

However, quantization introduces a quantization error, and this error grows as the number of quantization bits decreases. In the deep learning process of an artificial neural network in particular, computational errors accumulate repeatedly as the number of layers increases, so the loss of accuracy caused by quantization error also accumulates and the accuracy of the network drops significantly.
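The relationship between bit width and quantization error can be seen with an ordinary uniform quantizer. The sketch below is not part of the specification: `uniform_quantize`, the range [-1, 1], and the sample grid are assumptions chosen only to show that the worst-case rounding error shrinks as bits are added.

```python
def uniform_quantize(x, bits, lo=-1.0, hi=1.0):
    """Round x to the nearest of 2**bits evenly spaced levels on [lo, hi]."""
    steps = (1 << bits) - 1
    step = (hi - lo) / steps
    index = round((x - lo) / step)
    index = max(0, min(steps, index))  # clamp to the representable range
    return lo + index * step


# Evenly spaced test inputs covering the whole range.
samples = [-1.0 + 2.0 * i / 999 for i in range(1000)]

# Worst-case absolute error for several bit widths.
worst_error = {
    bits: max(abs(x - uniform_quantize(x, bits)) for x in samples)
    for bits in (4, 6, 8)
}
```

For a uniform quantizer the worst-case error is bounded by half a step, so each removed bit roughly doubles it.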

FIG. 3 shows an example of frequency histograms of the input activations and weights of the artificial neural network of FIG. 1.

FIG. 3 shows, as histograms, the distribution of the frequency of occurrence by value of the input activations and weights applied to neural nodes when training is performed on the CIFAR-10 data set with the ResNet-32 model, one type of artificial neural network.

In FIG. 3, (a) to (c) show the input activations and weights of neural nodes in different layers of the ResNet-32 model. For example, in FIG. 3, (a) may be the distribution of input activations and weights in the input layer (layer₁) of FIG. 1, (b) the distribution in a hidden layer (layer₂), and (c) the distribution in the output layer (layer_s).

Referring to FIG. 3, even though the distributions shown are for neural nodes in different layers, the input activations and weights all follow a pattern similar to a normal distribution whose mean is close to 0. When data follows a normal distribution in this way, relatively more accurate operations are possible if the data interval with the highest frequency of occurrence can be represented finely. For data with a low frequency of occurrence, an error has a relatively small effect on the artificial neural network as a whole compared with an error in frequently occurring data.

In addition, as shown in FIG. 3, the distributions of input activations and weights may differ by artificial neural network model and by layer, so accuracy or computational efficiency suffers when every neural node of the network uses the same format, such as a fixed-point or floating-point format.

Therefore, if numbers can be represented in a new format that adapts to the distributions of activation and weight values, which vary from layer to layer, both computational accuracy and efficiency can be improved.

FIG. 4 is a diagram for explaining a variable precision format according to an embodiment of the present invention.

Referring to FIG. 4, in the variable precision format according to this embodiment, data with a length of n bits may be divided into a sign bit representing the sign, a binary bit section representing a binary value, and an incremental bit section that adjusts the minimum unit of increase or decrease of the binary data.

In the n-bit data, the 1-bit sign bit expresses the sign of the data: a sign bit of 0 may indicate that the data is positive, and 1 that it is negative. FIG. 4 includes a sign bit so that the data can represent negative values, but the sign bit may be omitted if the data does not need to express a sign.

The k-bit binary bit section of the n-bit data is the section that expresses the value of the data, as in a conventional format. The binary bit section expresses the value of the binary data (M) as an ordinary binary number. Because the value is expressed in binary, the binary data (M) of the binary bit section increases or decreases in steps of at least 1.

The incremental bit section is what remains of the n-bit data after the 1-bit sign bit and the k-bit binary bit section are excluded, so it has a length of n-1-k bits. For example, if the data is 8 bits long (n = 8) and the binary bit section is 4 bits (k = 4), the incremental bit section is 3 bits long (8 - 1 - 4 = 3).
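The field layout can be made concrete with a small bit-manipulation sketch. The function name `split_fields` is hypothetical; the field order (sign bit, then incremental bit section, then binary bit section) follows the worked example "00101000" given later in the description.

```python
def split_fields(code, n=8, k=4):
    """Split an n-bit code into (sign, N, M): sign bit, incremental data, binary data."""
    inc_bits = n - 1 - k                       # length of the incremental bit section
    sign = (code >> (n - 1)) & 1               # most significant bit
    N = (code >> k) & ((1 << inc_bits) - 1)    # the n-1-k bits after the sign bit
    M = code & ((1 << k) - 1)                  # the low k bits
    return sign, N, M


fields = split_fields(0b00101000)  # sign 0, N = 0b010, M = 0b1000
```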

The incremental data (N) of the incremental bit section adjusts the minimum interval by which the binary data (M) of the k-bit binary bit section increases or decreases. That is, an increase or decrease in the value of the binary data (M) is interpreted as an increase or decrease in units corresponding to the incremental data (N).

The following description uses as an example a variable precision format with 8-bit data and a 4-bit binary bit section (k = 4).

When the incremental data (N) of the incremental bit section is "000" and the value of the binary data (M) increases by 1 from "0000" to "0001", the 8-bit data in the variable precision format of this embodiment increases from decimal 0 to 1. That is, an increase or decrease of 1 in the binary data (M) of the binary bit section is an increase or decrease of 1, exactly as in a conventional format.

However, when the incremental data (N) of the incremental bit section is "001" and the value of the binary data (M) increases by 1 from "0000" to "0001", the value represented by the 8-bit data in the variable precision format increases by 2 in decimal. That is, depending on the incremental data (N) of the incremental bit section, an increase or decrease of 1 in the binary data (M) of the binary bit section is interpreted as an increase or decrease in units of 2.

Likewise, when the incremental data (N) of the incremental bit section is "010", each increase of 1 in the binary data (M) is interpreted as an increase of 4, so the binary data values "0001" and "0010" correspond to increases of decimal 4 and decimal 8, respectively.

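Data in the variable precision format of FIG. 4 may be converted to a decimal number according to Equation 1.

[Equation 1] decimal value = 2^k × (2^N - 1) + 2^N × M

Here k is the number of bits of the binary bit section, N is the value of the incremental data, and M is the value of the binary data.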

According to Equation 1, in the variable precision format with 8-bit data and a 4-bit binary bit section, "00101000" has k = 4, N = 2, and M = 8, and therefore represents decimal 80 (= 2⁴ × (2² - 1) + 2² × 8).
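The conversion can be checked with a short decoder. This is a sketch, not the patented implementation: the function name `decode` is hypothetical, the formula is the one used in the worked example above, and treating the sign bit as sign-magnitude (1 negates the value) is an assumption based on the description of the sign bit.

```python
def decode(code, n=8, k=4):
    """Convert an n-bit variable precision code to its integer value."""
    sign = (code >> (n - 1)) & 1
    N = (code >> k) & ((1 << (n - 1 - k)) - 1)   # incremental data
    M = code & ((1 << k) - 1)                    # binary data
    # 2^k * (2^N - 1) is the base of the range selected by N; 2^N is its step.
    magnitude = (1 << k) * ((1 << N) - 1) + (1 << N) * M
    return -magnitude if sign else magnitude     # assumed sign-magnitude


value = decode(0b00101000)  # k = 4, N = 2, M = 8
```

Here `value` is 80, matching 2⁴ × (2² - 1) + 2² × 8.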

This means that the variable precision format represents numbers at finer intervals the closer they are to 0, and at progressively wider intervals as the incremental data (N) increases.
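The widening of the interval can be verified by listing every magnitude a 7-bit code (sign bit excluded) can take and measuring the gap between neighbours. The helper name `magnitude` and the choice k = 4 are illustrative assumptions; the formula is the one used in the worked example for "00101000".

```python
def magnitude(code, k):
    """Value of a sign-less code whose low k bits are M and whose high bits are N."""
    N = code >> k
    M = code & ((1 << k) - 1)
    return (1 << k) * ((1 << N) - 1) + (1 << N) * M


k = 4
values = [magnitude(code, k) for code in range(1 << 7)]

# Gap between each code and the next one.
gaps = [b - a for a, b in zip(values, values[1:])]

# The gap after a code depends only on that code's incremental data N.
gap_is_power_of_two = all(gaps[c] == 1 << (c >> k) for c in range(len(gaps)))
```

Under this reading the representable values increase monotonically with the code, with step 1 for N = 0, step 2 for N = 1, step 4 for N = 2, and so on, with no jump where N changes.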

The number of bits (k) of the binary bit section can be adjusted as needed. When k is increased or decreased, the length of the incremental bit section (here, n-1-k as an example) decreases or increases correspondingly for data of a given bit length. Because of this change in the length of the incremental bit section, the range of values the variable precision format can represent changes greatly.

Table 1 shows part of how the data value changes with the number of bits (k) of the binary bit section in the variable precision format according to this embodiment.

Table 1 presents the 7 bits of data that remain after excluding the 1-bit sign bit, interpreted according to the variable precision format of FIG. 4, and shows how the data value changes as the number of bits (k) of the binary bit section changes.

Referring to Table 1, when the number of bits (k) of the binary bit section is 3, 4, or 5, "0000000" to "0001000" represent the values 0 to 8 in every case. For "0001001" to "0010000", however, the value still increases by 1 and represents 9 to 16 when k is 4 or 5, whereas when k is 3 the value increases by 2 and represents 10 to 24.

For "0010001" to "0011000", the value still increases by 1 and represents 17 to 24 only when k is 5; when k is 4 it increases by 2 and represents 18 to 32, and when k is 3 it increases by 4 and represents 28 to 56.

For "0011001" to "0100000", the value increases by 1 and represents 25 to 32 when k is 5, and still increases by 2 and represents 34 to 48 when k is 4, whereas when k is 3 it increases by 8 and represents 64 to 120.
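The ranges quoted from Table 1 can be reproduced by decoding the first and last code of each row for k = 3, 4, and 5. The names `decode7`, `rows`, and `table` are hypothetical, and the formula is the one used in the worked example for "00101000"; the table itself is not reproduced here.

```python
def decode7(code, k):
    """Decode a 7-bit sign-less code: high 7-k bits are N, low k bits are M."""
    N = code >> k
    M = code & ((1 << k) - 1)
    return (1 << k) * ((1 << N) - 1) + (1 << N) * M


# First and last code of each row discussed in the text.
rows = [
    (0b0000000, 0b0001000),
    (0b0001001, 0b0010000),
    (0b0010001, 0b0011000),
    (0b0011001, 0b0100000),
]

# For each k, the (first, last) value of every row.
table = {k: [(decode7(lo, k), decode7(hi, k)) for lo, hi in rows] for k in (3, 4, 5)}
```

`table[3]` evaluates to `[(0, 8), (10, 24), (28, 56), (64, 120)]`, in agreement with the values stated for k = 3.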

That is, as shown in Table 1, the variable precision format of FIG. 4 can represent a wider range of numbers than other number formats with the same number of bits. For example, an 8-bit integer can represent the range -128 to 127, whereas the variable precision format with a 3-bit binary bit section (k = 3) can represent, with the same 8 bits and counting positive integers, values up to 491,512 (= 2³ × (2¹⁵ - 1) + 2¹⁵ × 7). This is approximately 3,870 times the range of an 8-bit integer. In addition, in the range "0000000" to "0001000" close to 0, the interval of increase or decrease is the same as in a conventional format, and the interval grows with distance from 0. This means that conventional precision is maintained in the section adjacent to 0.
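The maximum value and the quoted ratio follow directly from setting every bit of the incremental and binary bit sections to 1. The function name `max_value` is a hypothetical helper for this check.

```python
def max_value(n=8, k=3):
    """Largest magnitude of an n-bit code with one sign bit and k binary bits."""
    N = (1 << (n - 1 - k)) - 1   # incremental bit section all ones
    M = (1 << k) - 1             # binary bit section all ones
    return (1 << k) * ((1 << N) - 1) + (1 << N) * M


largest = max_value(8, 3)   # 2^3 * (2^15 - 1) + 2^15 * 7
ratio = largest / 127       # compared with the largest positive 8-bit integer
```

For comparison, the same helper gives 3,952 for k = 4 and 472 for k = 5, which shows how quickly the range shrinks as bits are moved from the incremental bit section to the binary bit section.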

That is, the variable precision format according to this embodiment can represent a very large range of values with data of the same number of bits. By adjusting the number of bits (k) of the binary bit section according to the range of numbers to be represented and their distribution by frequency of occurrence, numbers can be represented in the form required within a limited number of bits. In particular, when the numbers to be represented occur with high frequency in part of the range, high efficiency can be achieved while preserving as much precision as possible.

Conversely, when numbers within a specified range are to be quantized into data of a specified number of bits, the precision can be differentiated per section, so the precision in some sections can be greatly improved. This means that, when quantizing a normal distribution pattern with a high frequency of occurrence in a particular section, high precision can be provided with the same number of bits. Because the number of bits need not be increased to improve precision during quantization, high computational efficiency is maintained.

FIG. 5 shows the result of comparing the variable precision format of FIG. 4 with existing formats.

FIG. 5 shows, for the case where a range designated as 0 to 4 is quantized into 8-bit data, the representable values of the variable precision format and of the existing representations, together with their distribution per section, that is, the number of representable values. In FIG. 5, (a) shows the comparison with the existing fixed-point format and (b) shows the comparison with the floating-point format. Each of (a) and (b) also shows the cases in which the number of bits (k) of the binary bit section is 4 and 5.

Referring to (a) of FIG. 5, the existing fixed-point format yields a uniform distribution of values, so the values are always evenly spaced. This suits data with a uniform value distribution, but, as shown in FIG. 3, it is inefficient for quantization in a system whose values follow a pattern similar to a normal distribution, because fine values are difficult to represent. In the variable precision format according to this embodiment, by contrast, the density of representable values drops sharply with distance from 0, but the number of representable values in the section close to 0 increases greatly. That is, values in the section close to 0 can be represented finely with data of the same number of bits.

Turning to (b), the floating-point format uses an exponent representation, so values very close to 0 can be represented very finely. However, the number of representable values drops exponentially even a short distance from 0. This suits values with very small variance, but in most cases that condition is not met. The format is therefore not suited to representing values whose distributions have varied variances, and it frequently proves inefficient because it requires more bits than necessary.

In the variable precision format according to this embodiment, by contrast, choosing the number of bits (k) of the binary bit section appropriately makes it possible, as shown in (b), to represent numbers with a variety of variance distributions. That is, by determining the number of bits (k) of the binary bit section in consideration of the variance distribution of the operand values, numbers can be represented precisely and efficiently with few bits.

However, the variable precision format according to this embodiment is configured to represent integers, so it cannot represent digits below the decimal point. To represent fractional values, the variable precision value must therefore be multiplied by a separate scaling factor that indicates the position of the decimal point. As in the floating-point format, the scaling factor may be expressed as a negative power of 2 (2^-e).
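
The role of the scaling factor can be illustrated with a short sketch. It assumes a scaling factor of the form 2^-e as described above; the helper names `to_scaled_int` and `to_real` and the sample values are hypothetical, and the integer shown is a plain integer rather than a variable precision code.

```python
def to_scaled_int(x: float, e: int) -> int:
    """Integer that stands for x when the scaling factor is 2^-e."""
    return round(x * 2 ** e)


def to_real(value: int, e: int) -> float:
    """Apply the scaling factor 2^-e to recover the real value."""
    return value * 2.0 ** (-e)


print(to_scaled_int(0.65625, 7))  # 84
print(to_real(84, 7))             # 0.65625
```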

FIG. 6 is a diagram for explaining a method of converting a number into a variable precision number according to this embodiment.

Here, the process of converting a decimal number (A) into a 7-bit variable precision number is described, taking the decimal number 83 (A = 83) as an example. The number of bits (k) of the binary bit section is assumed to be 4.

Referring to FIG. 6, the 7-bit positive integer 83₍₁₀₎ is written in binary as "1010011₍₂₎". With this binary number denoted A, first, as in a), 2^k, corresponding to the number of bits (k) of the binary bit section, is added to it to calculate A' (here A' = 1100011₍₂₎).

Then, as in b), k (= 4) is subtracted from the position of the most significant bit (MSB) of A' to obtain the value of the incremental data (N = 010₍₂₎). Once the incremental data (N) is obtained, as in c), k bits taken in order from the most significant of the 6 bits that remain after the MSB is excluded (1000₍₂₎) are extracted as the binary data (M).

Once the k bits are extracted, 2 bits remain, as in d). The upper of these 2 bits is the rounding bit; adding it to the binary data (M) obtained in c) converts the decimal number (A = 83) into the variable precision number (DP(A)) "0101001".
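
Steps a) to d) can be sketched in code as follows. This is a minimal illustration for non-negative integers, not the patented implementation: the function name `dp_encode`, the zero-based counting of the MSB position, and the handling of a carry out of the binary data after rounding are assumptions that the text does not spell out.

```python
def dp_encode(a: int, k: int, inc_bits: int) -> str:
    """Encode a non-negative integer as incremental data N followed by binary data M."""
    a_prime = a + (1 << k)              # a) A' = A + 2^k
    msb = a_prime.bit_length() - 1      # MSB position of A', counted from 0
    n = msb - k                         # b) incremental data N
    rest = a_prime - (1 << msb)         # bits of A' below the MSB
    drop = msb - k                      # bits left over below the k extracted bits
    m = rest >> drop                    # c) top k bits become binary data M
    if drop > 0:
        m += (rest >> (drop - 1)) & 1   # d) add the rounding bit
    if m >> k:                          # assumed: rounding carry moves to the next N
        m, n = 0, n + 1
    if n >= (1 << inc_bits):
        raise ValueError("value does not fit in the incremental bit section")
    return format(n, f"0{inc_bits}b") + format(m, f"0{k}b")


print(dp_encode(83, k=4, inc_bits=3))  # 0101001
```

For A = 83 the intermediate values are A' = 99, MSB position 6, N = 2, M = 1000₍₂₎ and rounding bit 1, giving "010" followed by "1001".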

FIG. 7 shows a schematic structure of an artificial neural network to which a variable precision quantization apparatus according to an embodiment of the present invention is applied.

Referring to FIG. 7, the artificial neural network 100 according to this embodiment may include a variable precision conversion unit 110, a binary bit section setting unit 120, a neural node unit 130, a node setting unit 140, a format conversion unit 150, and a scaling factor setting unit 160.

In FIG. 7, the variable precision conversion unit 110, the binary bit section setting unit 120, and the scaling factor setting unit 160 serve to quantize the input activation (IA) and the weight (w), that is, the input data, into data in the variable precision format, and can be regarded as the variable precision quantization apparatus of this embodiment.

Hereinafter, the artificial neural network of FIG. 7 is described with reference to FIGS. 1 and 2. As shown in FIG. 1, an artificial neural network comprises a plurality of neural nodes. In practice, however, an artificial neural network is generally implemented as software that uses a general-purpose processor as its hardware. In particular, rather than implementing the many neural nodes of each layer (layer₁ to layer_s) individually, the network can be implemented by having at least one software-implemented neural node operate in parallel or in series while its input activation (IA), weight (w), and activation function are varied.

Accordingly, the artificial neural network of FIG. 7 includes the neural node unit 130 as a computing device that performs the operations required by each of the plurality of neural nodes, and the node setting unit 140, which enables the neural node unit 130 to function as the plurality of neural nodes of the network. The node setting unit 140 stores in advance the neural node settings for the functions the neural nodes must perform, and passes the stored settings to the neural node unit 130 so that the neural node unit 130 can function as the corresponding neural node.

The neural node unit 130 is a computing device that performs multiplication and summation on the input activation (IA) and the weight (w) according to the neural node structure shown in FIG. 2. It may be configured as a variable neural node that changes according to the settings of the node setting unit 140 so that it can perform the operation required in each layer.

The variable precision conversion unit 110 receives the input activation (IA) and the weight (w) to be passed to the neural node unit 130 and converts them into the variable precision format shown in FIG. 4.

When the input activation (IA) and the weight (w) are supplied in an existing format, the variable precision conversion unit 110 converts them into the variable precision format and passes them to the neural node unit 130. In doing so, it adjusts, within data of a predetermined number of bits (n), the number of bits (k) of the binary bit section and the number of bits of the incremental bit section according to the value of k set in the binary bit section setting unit 120. The variable precision conversion unit 110 also determines whether a sign is needed and, if so, sets 1 bit of the data as the sign bit.

That is, when converting the input activation (IA) and the weight (w) into n-bit data, the variable precision conversion unit 110 assigns the bits that remain after the k bits of the binary bit section and the sign bit are excluded (n-1-k) to the incremental bit section, and converts the data into the variable precision format.

When the input activation (IA) and the weight (w) are supplied in a fixed-point or floating-point format, the variable precision conversion unit 110 may convert them into integers, obtain the scaling factor that results from the conversion to integers, and pass it to the scaling factor setting unit 160.

The binary bit section setting unit 120 sets the number of bits (k) of the binary bit section according to the number representation required by the neural node whose function the neural node unit 130 is to perform, and passes it to the variable precision conversion unit 110. The binary bit section setting unit 120 may set k based on the value distribution pattern of the operands, that is, the input activation (IA) and the weight (w). However, determining the value distribution of IA and w requires additional computation and is therefore inefficient.

In an artificial neural network, however, the value distribution of the operands supplied to each neural node in each layer can generally be estimated roughly, either empirically or through simulation. That is, the number of bits (k) of the binary bit section in the variable precision format can be set in advance for each neural node. Accordingly, the binary bit section setting unit 120 may be configured to store in advance the value of k for each of the neural nodes of the network and, when the neural node unit 130 performs the function of a particular neural node according to the setting of the node setting unit 140, to select and pass on the corresponding value of k.

In an artificial neural network, setting the same format for the neural nodes within a layer is also efficient for the subsequent operations of the neural nodes in the next layer. Accordingly, the number of bits (k) of the binary bit section may be set and stored in advance in the binary bit section setting unit 120 on a per-layer basis.

The neural node unit 130 receives the input activation (IA) and the weight (w) converted into the variable precision format by the variable precision conversion unit 110, performs the designated operation, and passes the result to the format conversion unit 150.

The format conversion unit 150 receives the operation data computed by the neural node unit 130, converts it into data of a predetermined format, and outputs the output activation (OA). The format conversion unit 150 provides compatibility with external systems that do not recognize the variable precision format, and is the counterpart of the variable precision conversion unit 110.

In this embodiment, however, when the format conversion unit 150 converts operation data supplied in the variable precision format into the predetermined format, it may receive the scaling factor from the scaling factor setting unit 160 and perform scale conversion at the same time.

As described above, the variable precision format according to this embodiment cannot represent fractional digits, so a separate scaling factor is required to represent them. It would be very inefficient for the neural node unit 130 to multiply every input activation (IA) and weight (w) supplied in the variable precision format by the scaling factor as part of its computation.

Accordingly, the scaling factor setting unit 160 may receive the scaling factors for the input activation (IA) and the weight (w) from the variable precision conversion unit 110 and pass a scaling factor corresponding to the received scaling factors to the format conversion unit 150, so that the format conversion unit 150 performs scale conversion at the same time as it converts data in the variable precision format into the predetermined format.

In this case, the neural node unit 130 performs the summation and multiplication on the input activation (IA) and the weight (w) directly in the integer-valued variable precision format, and the scale value for fractional representation is applied in the format conversion unit 150, which improves the computational efficiency of the neural node unit 130.

As shown in FIG. 4, a neural node performs multiplication and summation on the input activation (IA) and the weight (w). Here, the scaling factor of the neural nodes, like the number of bits (k) of the binary bit section, should be set identically per layer. A sum operation does not change the scale, but a multiplication causes a scale change corresponding to the square of the scaling factor. Moreover, it is inefficient to multiply in the scaling factor at each of the many multiplications of the input activation (IA) and the weight (w).

Accordingly, in view of the multiplication of the input activation (IA) and the weight (w) performed in the neural node unit 130, the scaling factor setting unit 160 may be configured to store in advance an operation scaling factor corresponding to the square of the scaling factor supplied by the variable precision conversion unit 110 and to pass this operation scaling factor to the format conversion unit 150.
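
The effect of deferring the scale to the format conversion step can be illustrated as follows. This is a sketch under simplifying assumptions: plain Python integers stand in for the variable precision operands, the input activations and weights share one scaling factor 2^-e, and the name `mac_rescaled` and the sample values are hypothetical.

```python
from typing import Sequence


def mac_rescaled(ia: Sequence[int], w: Sequence[int], e: int) -> float:
    """Multiply-accumulate on integers, then apply the squared scaling factor once."""
    acc = sum(a * b for a, b in zip(ia, w))  # each product carries scale 2^-2e
    return acc * 2.0 ** (-2 * e)             # (2^-e)^2 applied after the sum


# 84 and 64 stand for 0.65625 and 0.5; 32 and 16 stand for 0.25 and 0.125 (e = 7).
print(mac_rescaled([84, 64], [32, 16], e=7))  # 0.2265625
print(0.65625 * 0.25 + 0.5 * 0.125)           # 0.2265625
```

Because every product carries the same scale, the sum leaves the scale unchanged and a single multiplication by the squared factor suffices.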

As described above, the value distribution of the operands at each of the neural nodes of the artificial neural network can be roughly estimated, so the scaling factors of the input activation (IA) and the weight (w) for each neural node may also be set in advance and stored in the scaling factor setting unit 160.

Meanwhile, as shown in FIG. 1, an artificial neural network has a configuration in which a plurality of layers, each including a plurality of neural nodes, are connected in series. The input layer (layer₁) receives the input activation (IA) from outside in a different format, so conversion into the variable precision format is needed. The output layer (layer_s) must output the output activation (OA) to the outside in a different format, so data in the variable precision format must be converted into data of that format.

For the remaining layers (layer₂ to layer_s), however, it is more efficient to receive and output the input activation (IA) in the variable precision format without format conversion.

Accordingly, the variable precision conversion unit 110 may be configured to perform format conversion only when the neural node unit 130 functions as a neural node of the input layer (layer₁) or when the number of bits (k) of the binary bit section must be changed. Likewise, the format conversion unit 150 may be configured to perform format conversion only when the neural node unit 130 functions as a neural node of the output layer (layer_s).
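
The conditions under which each unit converts can be written as two small predicates. This sketch assumes a per-layer table of k values such as the binary bit section setting unit 120 might store; the function names, the zero-based layer index, and the sample table are hypothetical.

```python
from typing import Sequence


def needs_input_conversion(layer: int, k_per_layer: Sequence[int]) -> bool:
    """True for the input layer, or when k differs from the previous layer."""
    if layer == 0:
        return True
    return k_per_layer[layer] != k_per_layer[layer - 1]


def needs_output_conversion(layer: int, k_per_layer: Sequence[int]) -> bool:
    """True only for the output layer."""
    return layer == len(k_per_layer) - 1


k_per_layer = [4, 4, 5, 5]  # hypothetical per-layer k values
print([needs_input_conversion(i, k_per_layer) for i in range(4)])
# [True, False, True, False]
print([needs_output_conversion(i, k_per_layer) for i in range(4)])
# [False, False, False, True]
```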

FIGS. 8 and 9 are diagrams for explaining the addition algorithm for variable precision format data according to this embodiment. FIG. 8 shows the variable precision format of the operands and the operation data, and FIG. 9 shows the addition algorithm for variable precision format data.

As shown in FIG. 8, when two operands (X₁, X₂) in the n-bit variable precision format with k bits in the binary bit section are added to obtain operation data (X₃), the operation data (X₃) is likewise obtained in the n-bit variable precision format with k bits in the binary bit section.

N₁, N₂, and N₃ denote the incremental data of the operands (X₁, X₂) and the operation data (X₃), respectively, and M₁, M₂, and M₃ denote the binary data of the operands (X₁, X₂) and the operation data (X₃), respectively.

Describing the addition algorithm with reference to FIG. 9, first the incremental data (N₂) of the operand (X₂) is assigned to the incremental data (N₃) of the operation data (X₃) (S12). Then the larger of (2^k + M₁) and ((N₂ - N₁) - (2^k ≫ N₂)) is selected as the binary data change (ΔM). Here, ≫ is an operator that selects the larger value.

Next, the selected binary data change (ΔM) is added to the binary data (M₂) of the operand (X₂) to calculate the binary data (M₃) (S13). It is then determined whether the calculated binary data (M₃) of the operation data (X₃) overflows beyond k bits (S14). If an overflow occurs, 1 is added to the incremental data (N₃) (S15), and the larger of the binary data (M₃) and 1 is selected as the binary data (M₃) of the operation data (X₃).

The operation data (X₃ = {N₃:M₃}) is then obtained from the resulting incremental data (N₃) and binary data (M₃) (S17). If no overflow occurs, the operation data (X₃ = {N₃:M₃}) is obtained from the previously obtained incremental data (N₃) and binary data (M₃) (S17).

FIGS. 10 and 11 are diagrams for explaining the multiplication algorithm for variable precision format data according to this embodiment. As with FIGS. 8 and 9, FIG. 10 shows the variable precision format of the operands and the operation data, and FIG. 11 shows the multiplication algorithm for variable precision format data.

In FIG. 10, X₃ is the addition operation data, identical to the operation data (X₃) of FIG. 8, and X₄ is the multiplication operation data for the two operands (X₁, X₂).

Referring to FIG. 11, to perform the multiplication, first the incremental data (N₁, N₂) of the two operands (X₁, X₂) and k are all added to calculate the temporary incremental data (N₄') (S21). Then the temporary binary data (M₄') is obtained by adding the binary data (M₁, M₂) of the operands (X₁, X₂) and the larger of the product of the binary data (M₁, M₂) and the number of bits (k) of the binary bit section (S22). Once the temporary incremental data (N₄') and the temporary binary data (M₄') are obtained, the temporary multiplication operation data (X₄' = {N₄':M₄'}) is obtained (S23).

Meanwhile, the addition operation data (X₃) for the operands (X₁, X₂) is calculated according to the addition algorithm described with reference to FIG. 9. Here, add() denotes the variable precision format addition operation.

Once the addition operation data (X₃) is calculated, the number of bits (k) of the binary bit section is added to the incremental data (N₃) of the addition operation data (X₃) to obtain the temporary incremental data (N₃') (S25), and the temporary addition operation data (X₃' = {N₃':M₃}) is obtained using the temporary incremental data (N₃') (S26).

Then the temporary multiplication operation data (X₄') is subtracted from the temporary addition operation data (X₃') to obtain the subtraction data (X₄") (S27). Here, sub() denotes the variable precision format subtraction operation.

Thereafter, the subtraction data (X₄") and 2^(2k+1), which represents the digit position according to the number of bits (k) of the binary bit section, are added in the variable precision format to obtain the multiplication operation data (X₄).

FIG. 12 shows a variable precision quantization method according to an embodiment of the present invention. The description assumes that data supplied in a predetermined format is quantized into n-bit variable precision format data.

Describing the variable precision quantization method with reference to FIG. 12, first the sign bit is set (S31). Whether a sign bit is used depends on whether the data to be converted into the variable precision format can take negative values.

In a system that requires quantization, the format of the input data is generally specified in advance, so the sign bit can be set according to the requirements of the system. When a sign bit is needed, it may be assigned to the most significant bit (MSB) of the n bits.

Then, to quantize the data into n-bit variable precision format data, the number of bits (k) of the binary bit section, which represents a binary value within the n bits, is set based on the precision and range of the values to be represented (S32). Each system that requires quantization generally has its required precision and range specified in advance, and k can be determined according to the specified representation precision and representation range. If a rough variance distribution of the data to be converted into the variable precision format is given, k may also be determined based on that distribution.

Once 1 bit for the sign and the number of bits (k) of the binary bit section are set within the n bits, the remaining bits (n-1-k) are assigned to the incremental bit section. If no sign bit is set, the incremental bit section has (n-k) bits.

That is, in the variable precision quantization method according to this embodiment, n-bit data is divided into a sign bit representing the sign, a binary bit section representing a binary value, and an incremental bit section that adjusts the minimum increment or decrement unit of the binary data. With the sign bit as the most significant bit (MSB), the incremental bit section and the binary bit section are arranged in that order to form the variable precision format.
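
The field layout described above (sign bit at the MSB, then the incremental bit section, then the binary bit section) can be sketched as bit packing. The helper names `dp_pack` and `dp_unpack` and the sample field values are hypothetical, and the sketch covers only the signed case with n-1-k incremental bits.

```python
from typing import Tuple


def dp_pack(sign: int, n: int, m: int, n_bits: int, k: int) -> str:
    """Pack sign, incremental data n and binary data m into an n_bits code."""
    inc_bits = n_bits - 1 - k  # bits left after the sign bit and k binary bits
    if sign not in (0, 1) or not 0 <= n < (1 << inc_bits) or not 0 <= m < (1 << k):
        raise ValueError("field out of range")
    word = (sign << (n_bits - 1)) | (n << k) | m
    return format(word, f"0{n_bits}b")


def dp_unpack(bits: str, k: int) -> Tuple[int, int, int]:
    """Split a signed code into (sign, incremental data, binary data)."""
    word, n_bits = int(bits, 2), len(bits)
    inc_bits = n_bits - 1 - k
    sign = word >> (n_bits - 1)
    n = (word >> k) & ((1 << inc_bits) - 1)
    m = word & ((1 << k) - 1)
    return sign, n, m


print(dp_pack(1, 2, 9, n_bits=8, k=4))  # 10101001
print(dp_unpack("10101001", k=4))       # (1, 2, 9)
```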

Thereafter, when input data to be converted into the variable precision format is applied, the input data is converted into a binary integer, and a scaling factor is obtained during this conversion (S33). In this embodiment, data in the variable precision format fundamentally represents integers. Therefore, to represent digits below the radix point, a scaling factor indicating the position of the radix point is obtained separately from the input data. The scaling factor may be obtained as a negative power of 2 (2^-e). If the input data is applied in a floating point format, the exponent part of the floating point format may be used as the scaling factor.
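One way to picture step S33 is sketched below, assuming the input is a finite Python float. The helper name `to_integer_and_exponent` is hypothetical. It relies on the fact that every finite binary float equals an integer divided by a power of two, so the integer and the exponent e of the scaling factor 2^-e can be recovered exactly; the embodiment may obtain e differently, for example directly from the exponent field of the floating point format.

```python
def to_integer_and_exponent(x: float) -> tuple:
    """Split x into (integer, e) such that x == integer * 2**-e exactly."""
    # as_integer_ratio() returns the float as a fraction in lowest terms;
    # for a finite binary float the denominator is always a power of two.
    numerator, denominator = x.as_integer_ratio()
    e = denominator.bit_length() - 1  # denominator == 2**e
    return numerator, e
```

For instance, 0.75 becomes the integer 3 with e = 2, since 3 * 2^-2 = 0.75. The integer is what the later steps quantize, and e is stored separately as the scaling factor.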

Then, as shown in FIG. 6, the incremental data (N) is obtained by subtracting the number of bits (k) of the binary bit section from the position of the most significant bit (position of MSB), excluding the sign bit, of the binary number converted into an integer (S34).

Here, the incremental data (N) represents the minimum interval by which the binary data (M) in the k-bit binary bit section increases or decreases. In other words, the value of the binary data (M) is interpreted as increasing or decreasing in units corresponding to the incremental data (N).
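Step S34 can be sketched in a few lines. Two points are assumptions of this sketch rather than statements from the text: the MSB position is counted from 1 (so it equals Python's `int.bit_length()`), and magnitudes that already fit in k bits are given N = 0 instead of a negative value. The name `incremental_data` is hypothetical.

```python
def incremental_data(magnitude: int, k: int) -> int:
    """N = (position of the most significant set bit) - k, floored at 0."""
    # bit_length() is the 1-based position of the highest set bit (0 for 0).
    msb_position = magnitude.bit_length()
    # Assumption: values that fit in the k-bit binary section use N = 0.
    return max(msb_position - k, 0)
```

With k = 3, the magnitude `0b1011011` has its MSB at position 7, so N = 4, whereas the magnitude 5 fits in three bits and gets N = 0.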

Once the incremental data (N) is calculated, k bits are taken in order, starting from the most significant of the bits that remain after excluding the two most significant bits of the binary number, and are obtained as the binary data (M) (S35). Then the value of the rounding bit is obtained; the rounding bit is the most significant of the bits that remain after excluding the two most significant bits and the k bits of the binary bit section (S36).

Once the value of the rounding bit is obtained, the incremental data (N) and the binary data (M) are arranged in order with the sign bit as the most significant bit, and the value of the rounding bit is added to the arranged value, thereby quantizing the input into data in the variable precision format (S37).
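Steps S34 to S37 can be put together in a short sketch. It is an illustration under stated assumptions, not the embodiment itself: the input is a signed integer handled in sign-magnitude form, the MSB position is counted from 1, magnitudes that fit in k bits are stored directly with N = 0, values too large for the format saturate, and the decoding rule in `dequantize` is inferred from the encoding rather than quoted from the text. The names `quantize` and `dequantize` are hypothetical.

```python
def quantize(value: int, n: int, k: int) -> int:
    """Encode a signed integer as an n-bit word: sign | N | M."""
    sign = 1 if value < 0 else 0
    mag = abs(value)
    p = mag.bit_length()                  # 1-based MSB position (assumption)
    if p <= k:
        body = mag                        # assumption: N = 0, M holds the value
    else:
        N = p - k                         # S34: MSB position minus k
        shift = p - 1 - k                 # number of bits below the M field
        M = (mag >> shift) & ((1 << k) - 1)        # S35: k bits after the MSB
        rounding = (mag >> (shift - 1)) & 1 if shift > 0 else 0   # S36
        body = ((N << k) | M) + rounding  # S37: a carry may ripple into N
    max_body = (1 << (n - 1)) - 1
    body = min(body, max_body)            # assumption: saturate on overflow
    return (sign << (n - 1)) | body


def dequantize(word: int, n: int, k: int) -> int:
    """Inverse mapping assumed for this sketch (not quoted from the text)."""
    sign = (word >> (n - 1)) & 1
    body = word & ((1 << (n - 1)) - 1)
    N, M = body >> k, body & ((1 << k) - 1)
    # N == 0: M is the value itself; otherwise restore the implicit leading 1.
    mag = M if N == 0 else ((1 << k) | M) << (N - 1)
    return -mag if sign else mag
```

With n = 8 and k = 3, the input 91 (`0b1011011`) gives N = 4, M = `011` and a rounding bit of 0, so it is stored as `0 0100 011` and decodes to 88. The input 93 has a rounding bit of 1 and decodes to 96. Adding the rounding bit to the arranged word, rather than to M alone, lets an overflow of M carry into N, so 127 rounds up to 128.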

The method according to the present invention may be implemented as a computer program stored in a medium for execution on a computer. Here, the computer-readable medium may be any available medium that can be accessed by a computer, and may include all computer storage media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data, and may include ROM (read-only memory), RAM (random access memory), CD (compact disc)-ROM, DVD (digital video disc)-ROM, magnetic tape, floppy disks, optical data storage devices, and the like.

The present invention has been described with reference to the embodiments shown in the drawings, but these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible therefrom.

Therefore, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

100: artificial neural network 110: variable precision conversion unit
120: binary bit section setting unit 130: neural node unit
140: node setting unit 150: format conversion unit
160: scaling factor setting unit

Claims

A quantization apparatus that receives input data and quantizes it into n-bit variable precision data whose precision varies for each range of the input data, the apparatus comprising:
a binary bit section setting unit that divides the n-bit data into a binary bit section and an incremental bit section, and sets the number of bits of the binary bit section among the n bits so that the number of bits of the incremental bit section is adjusted according to the required representation precision and representation range; and
a variable precision conversion unit that, in order to adjust the quantization range of the input data to be quantized, calculates incremental data specifying the minimum increment/decrement unit of the binary data placed in the binary bit section, calculates binary data representing the value corresponding to the input data within the quantization range given by the incremental data, and arranges the calculated incremental data and binary data in order to convert them into the variable precision data.

The quantization apparatus of claim 1, wherein the variable precision conversion unit
obtains the incremental data by subtracting the number of bits of the binary bit section from the bit position value of the most significant bit, excluding the sign bit, of the input data applied as a binary number,
sequentially extracts, as the binary data, as many bits as the number of bits of the binary bit section, starting from the most significant of the bits that remain after excluding the sign bit and the next most significant bit,
obtains, as a rounding bit value, the most significant bit of the bits that remain after excluding the bits extracted as the binary data, and
obtains the variable precision data by arranging in order the obtained incremental data and the extracted binary data with the sign bit as the most significant bit, and adding the rounding bit value.

The quantization apparatus of claim 2, further comprising
a scaling factor setting unit that, when the input data includes digits below the radix point, converts the input data into integer data, and obtains and stores a scaling factor representing the exponent value used to convert the input data into integer data.

The quantization apparatus of claim 1, wherein the binary bit section setting unit
reduces the number of bits of the binary bit section as the required representation precision becomes finer or the representation range becomes wider.

The quantization apparatus of claim 1, wherein the binary bit section setting unit
increases the number of bits of the binary bit section in proportion to the variance of the input data.

The quantization apparatus of claim 1, wherein the binary bit section setting unit
sets the most significant bit of the n-bit variable precision data as a sign bit when the input data includes a sign bit.

A quantization method of receiving input data and quantizing it into n-bit variable precision data whose precision varies for each range of the input data, the method comprising:
dividing the n-bit data into a binary bit section and an incremental bit section;
setting the number of bits of the binary bit section among the n bits so that the number of bits of the incremental bit section is adjusted according to the required representation precision and representation range; and
in order to adjust the quantization range of the input data to be quantized, calculating incremental data specifying the minimum increment/decrement unit of the binary data placed in the binary bit section, calculating binary data representing the value corresponding to the input data within the quantization range given by the incremental data, and arranging the calculated incremental data and binary data in order to convert them into the variable precision data.

The quantization method of claim 7, wherein the converting into the variable precision data comprises:
obtaining the incremental data by subtracting the number of bits of the binary bit section from the bit position value of the most significant bit, excluding the sign bit, of the input data applied as a binary number;
sequentially extracting, as the binary data, as many bits as the number of bits of the binary bit section, starting from the most significant of the bits that remain after excluding the sign bit and the next most significant bit;
obtaining, as a rounding bit value, the most significant bit of the bits that remain after excluding the bits extracted as the binary data;
arranging in order the obtained incremental data and the extracted binary data with the sign bit as the most significant bit; and
obtaining the variable precision data by adding the rounding bit value.

The quantization method of claim 8, further comprising,
when the input data includes digits below the radix point, converting the input data into integer data, and obtaining and storing a scaling factor representing the exponent value used to convert the input data into integer data.

The quantization method of claim 7, wherein the setting of the number of bits of the binary bit section comprises
reducing the number of bits of the binary bit section as the required representation precision becomes finer or the representation range becomes wider.

The quantization method of claim 7, wherein the setting of the number of bits of the binary bit section comprises
increasing the number of bits of the binary bit section in proportion to the variance of the input data.

The quantization method of claim 7, further comprising,
when the input data includes a sign bit, setting the most significant bit of the n-bit variable precision data as a sign bit.