JP2020119518A

JP2020119518A - Method and device for transforming cnn layers to optimize cnn parameter quantization to be used for mobile devices or compact networks with high precision via hardware optimization

Info

Publication number: JP2020119518A
Application number: JP2019238383A
Authority: JP
Inventors: ゲヒョンキム; Kye-Hyeon Kim; ヨンジュンキム; Yong-Jun Kim; インスキム; Insu Kim; ハクギョンキム; Hak Kyoung Kim; ウンヒョンナム; Woonhyun Nam; ソクフンブ; Sukhoon Boo; ミョンチョルソン; Myungchul Sung; ドンフンヨ; Donghun Yeo; ウジュリュ; Wooju Ryu; テウンジャン; Taewoong Jang
Original assignee: Stradvision Inc
Current assignee: Stradvision Inc
Priority date: 2019-01-23
Filing date: 2019-12-27
Publication date: 2020-08-06
Anticipated expiration: 2039-12-27
Also published as: KR102349916B1; EP3686808A1; KR20200091785A; EP3686808C0; US10325352B1; EP3686808B1; JP6872264B2; CN111476341B; CN111476341A

Abstract

To provide a method for transforming CNN layers to flatten values included in at least one feature map in order to properly reflect respective values of specific channels including small values on output values.

SOLUTION: A method for transforming convolutional layers of a CNN including m convolutional blocks includes steps of: generating k-th quantization loss values by referring to k-th initial weight values of a k-th initial convolutional layer included in a k-th convolutional block, a (k-1)-th feature map outputted from the (k-1)-th convolutional block, and each of k-th scaling parameters; determining each of k-th optimized scaling parameters by referring to the k-th quantization loss values; generating a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimized scaling parameters; and transforming the k-th initial convolutional layer into a k-th integrated convolutional layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

SELECTED DRAWING: Figure 3b

Description

The present invention relates to a method and a device for transforming convolution layers of a CNN including m convolution blocks, the method comprising: (a) when an input image used to determine scaling parameters is obtained, a computing device generating one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of a k-th initial convolution layer included in a k-th convolution block, (ii) (ii-1) the input image if k is 1, or (ii-2) a (k-1)-th feature map, corresponding to the input image, outputted from a (k-1)-th convolution block if k is an integer from 2 to m, and (iii) (iii-1) each of k-th scaling parameters corresponding to each of the channels included in the input image if k is 1, or (iii-2) each of the k-th scaling parameters corresponding to each of the channels included in the (k-1)-th feature map if k is an integer from 2 to m, where k is an integer from 1 to m; (b) the computing device determining, by referring to the k-th quantization loss values, each of k-th optimal scaling parameters, among the k-th scaling parameters, corresponding to each of the channels included in the (k-1)-th feature map; (c) the computing device generating a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimal scaling parameters; and (d) the computing device (i) if k is 1, transforming the k-th initial convolution layer into a k-th integrated convolution layer by using the k-th scaling layer, and (ii) if k is an integer from 2 to m, transforming the k-th initial convolution layer into the k-th integrated convolution layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

Deep convolutional neural networks (deep CNNs) are at the heart of the remarkable progress in deep learning. CNNs were already used in the 1990s to solve character recognition problems, but they have come into wide use in machine learning only in recent years. For example, a CNN won the 2012 ImageNet Large Scale Visual Recognition Challenge, an image recognition contest, beating the other competitors. Since then, CNNs have become a very useful tool in the field of machine learning.

However, because of the preconception that deep learning algorithms require 32-bit floating point operations, mobile devices were considered unable to run programs that include deep learning algorithms.

Some experiments, however, have shown that 10-bit fixed point operations, which require less computing performance than 32-bit floating point operations, are sufficient for deep learning algorithms. Accordingly, there have been many attempts to provide methods of using 10-bit fixed point operations for deep learning algorithms on resource-constrained devices, that is, mobile devices.

Several successful methods have been presented for quantizing numbers represented in 32-bit floating point into 10-bit fixed point, but an important problem remained. When the values included in a plurality of channels vary widely, the values of one or more specific channels that contain small values may be ignored. This can be seen in FIG. 5.

FIG. 5 shows example values of several channels that differ greatly from one another.

Referring to FIG. 5, the values included in the first channel of the first feature map outputted from the first convolution block 210-1 are (0.64, 0.65, 0.63), and the values included in the second channel are (0.002, 0.001, 0.0019). According to the prior art, when the values of the first channel and the values of the second channel are quantized, the unit value used for the quantization is determined by either the first channel or the second channel.

If the unit value is determined by the first channel, it becomes large in order to represent the values included in the first channel. Because this unit value is far too large compared with the values included in the second channel, the values of the second channel may be quantized to 0. Conversely, if the unit value is determined by the second channel, it becomes small in order to represent the values included in the second channel. In that case, the unit value is too small for the values included in the first channel to be quantized properly.
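The trade-off described above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper `to_levels`, the two unit values, and the choice of signed 10-bit levels are assumptions made for the example and do not come from the specification; only the channel values are taken from FIG. 5.

```python
def to_levels(values, unit, bits=10):
    """Map each value to a signed integer level round(value / unit), clamped to `bits` bits."""
    max_level = 2 ** (bits - 1) - 1
    min_level = -(2 ** (bits - 1))
    return [max(min_level, min(max_level, round(v / unit))) for v in values]

# Channel values quoted from FIG. 5.
channel_1 = [0.64, 0.65, 0.63]
channel_2 = [0.002, 0.001, 0.0019]

coarse_unit = 0.01     # hypothetical unit value sized for channel 1
fine_unit = 0.00002    # hypothetical unit value sized for channel 2

# Unit chosen for channel 1: channel 2 collapses to zero.
print(to_levels(channel_1, coarse_unit))  # [64, 65, 63]
print(to_levels(channel_2, coarse_unit))  # [0, 0, 0]

# Unit chosen for channel 2: channel 1 saturates at the largest level.
print(to_levels(channel_1, fine_unit))    # [511, 511, 511]
print(to_levels(channel_2, fine_unit))    # [100, 50, 95]
```

With a single shared unit value, one of the two channels always loses its information, either by rounding to zero or by saturating.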

If the values of a specific channel are ignored, or are not quantized properly as described above, the output of the CNN may be distorted.

The present invention aims to solve the above-mentioned problems.

It is an object of the present invention to provide a method for transforming CNN layers so that the values included in at least one feature map are flattened, in order that the values of specific channels containing small values are properly reflected in the output values.

The characteristic configurations of the present invention for achieving the above objects and realizing the characteristic effects described later are as follows.

According to one aspect of the present invention, there is provided a method for transforming convolution layers of a CNN including m convolution blocks, comprising: (a) when an input image used to determine scaling parameters is obtained, a computing device generating one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of a k-th initial convolution layer included in a k-th convolution block, (ii) (ii-1) the input image if k is 1, or (ii-2) a (k-1)-th feature map, corresponding to the input image, outputted from a (k-1)-th convolution block if k is an integer from 2 to m, and (iii) (iii-1) each of k-th scaling parameters corresponding to each of the channels included in the input image if k is 1, or (iii-2) each of the k-th scaling parameters corresponding to each of the channels included in the (k-1)-th feature map if k is an integer from 2 to m, where k is an integer from 1 to m; (b) the computing device determining, by referring to the k-th quantization loss values, each of k-th optimal scaling parameters, among the k-th scaling parameters, corresponding to each of the channels included in the (k-1)-th feature map; (c) the computing device generating a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimal scaling parameters; and (d) the computing device (i) if k is 1, transforming the k-th initial convolution layer into a k-th integrated convolution layer by using the k-th scaling layer, and (ii) if k is an integer from 2 to m, transforming the k-th initial convolution layer into the k-th integrated convolution layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

In one embodiment, in step (a), the computing device generates the k-th quantization loss values by further referring to (iv) a BW value, which is the number of bits used to represent, in binary, the weight values included in the CNN and the values included in the feature maps, and (v) a k-th FL value, which is the absolute value of the exponent of the number represented by the LSB of (i) the k-th initial weight values of the k-th initial convolution layer and (ii) the values included in the (k-1)-th feature map if k is an integer from 2 to m, or the values included in the input image if k is 1.
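Read this way, the FL value fixes the size of the least significant bit and the BW value fixes how many levels exist. One common fixed-point convention consistent with these definitions is sketched below; the signed two's-complement range and round-to-nearest are assumptions made for illustration, since the text here does not state them, and the symbol \hat{x} is introduced only for this sketch.

```latex
\mathrm{LSB} = 2^{-FL},
\qquad
\hat{x} = 2^{-FL}\cdot\operatorname{clip}\!\left(\operatorname{round}\!\left(x\cdot 2^{FL}\right),\; -2^{BW-1},\; 2^{BW-1}-1\right)
```

For example, under these assumptions FL = 6 and BW = 10 give an LSB of 1/64 = 0.015625 and a representable range from -8 to 7.984375.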

In one embodiment, in step (a), the k-th quantization loss values ΔL_k are generated by the formula, in which θ_p includes (i) the values of the (k-1)-th feature map and the k-th initial weight values of the k-th initial convolution feature map if k is an integer from 2 to m, and (ii) the values of the input image and the k-th initial weight values of the k-th initial convolution feature map if k is 1; C_ki is a specific k-th scaling parameter among the k-th scaling parameters; FL and BW are the FL value and the BW value, respectively; and the Q operation generates the difference between C_ki θ_i and the quantized value of C_ki θ_i generated by referring to the FL value and the BW value. In step (b), the computing device determines each of the k-th optimal scaling parameters by selecting the C_ki that minimizes ΔL_k.
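The selection in step (b) can be illustrated with a small search over candidate scaling parameters. This is a hedged sketch, not the formula of the specification: `stand_in_loss` is a hypothetical substitute (mean squared quantization error mapped back to the original scale), and the candidate grid, the FL and BW values, and the signed fixed-point convention are all assumptions. Only the Q-style operation (quantized value minus the value) follows the description above.

```python
def q_error(x, fl, bw):
    """Q-style operation: fixed-point quantized value of x minus x itself."""
    step = 2.0 ** -fl                      # value represented by the LSB
    hi = 2 ** (bw - 1) - 1                 # assumed signed range
    lo = -(2 ** (bw - 1))
    level = max(lo, min(hi, round(x / step)))
    return level * step - x

def stand_in_loss(c, thetas, fl, bw):
    """Hypothetical loss: mean squared error of quantizing c * theta, divided back by c."""
    return sum((q_error(c * t, fl, bw) / c) ** 2 for t in thetas) / len(thetas)

def pick_scale(thetas, candidates, fl, bw):
    """Return the candidate scaling parameter with the smallest stand-in loss."""
    return min(candidates, key=lambda c: stand_in_loss(c, thetas, fl, bw))

small_channel = [0.002, 0.001, 0.0019]           # second channel of FIG. 5
candidates = [2.0 ** e for e in range(0, 13)]    # 1, 2, 4, ..., 4096 (illustrative grid)
best = pick_scale(small_channel, candidates, fl=6, bw=10)
```

In this toy setting, a scaling parameter of 1 rounds every value to zero and the largest candidate saturates the first value, so the selected parameter lies strictly between the two extremes.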

In one embodiment, the computing device determines the k-th optimal scaling parameters by selecting the C_ki using the Nesterov Accelerated Gradient method.
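For readers unfamiliar with the Nesterov Accelerated Gradient method, its update rule is sketched below on a toy problem. The objective, the learning rate, the momentum, and the step count are assumptions chosen for the example; the text here does not say how the gradient of the quantization loss with respect to C_ki is obtained, so a simple smooth function stands in for it.

```python
def nesterov_minimize(grad, x0, lr=0.1, momentum=0.9, steps=200):
    """Minimize a one-dimensional function given its gradient, using Nesterov momentum."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        lookahead = x + momentum * velocity               # evaluate the gradient ahead of x
        velocity = momentum * velocity - lr * grad(lookahead)
        x = x + velocity
    return x

# Toy objective f(c) = (c - 3)^2 with gradient 2 * (c - 3); its minimizer is c = 3.
c_opt = nesterov_minimize(lambda c: 2.0 * (c - 3.0), x0=0.0)
```

The only difference from plain momentum is that the gradient is taken at the look-ahead point rather than at the current point.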

In one embodiment, in step (c), the computing device generates the k-th scaling layer, whose components are the respective k-th optimal scaling parameters, and generates the k-th inverse scaling layer, whose components are the respective reciprocals of the k-th optimal scaling parameters.
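A minimal sketch of step (c), assuming each layer simply multiplies every channel by its own factor. The function names, the list-of-lists feature map layout, and the example scaling parameters are hypothetical; the channel values are those of FIG. 5.

```python
def make_scaling_layers(optimal_scales):
    """Return (scaling factors, inverse scaling factors), one entry per channel."""
    scaling = list(optimal_scales)
    inverse = [1.0 / c for c in optimal_scales]
    return scaling, inverse

def apply_per_channel(feature_map, factors):
    """Multiply every value of channel i by factors[i]."""
    return [[v * f for v in channel] for channel, f in zip(feature_map, factors)]

feature_map = [[0.64, 0.65, 0.63], [0.002, 0.001, 0.0019]]
scaling, inverse = make_scaling_layers([1.0, 256.0])   # illustrative parameters

scaled = apply_per_channel(feature_map, scaling)       # channels now have similar magnitude
restored = apply_per_channel(scaled, inverse)          # inverse scaling undoes the scaling
```

Powers of two are used for the example parameters so that the round trip is exact in floating point; with other values the restored map would match only up to rounding error.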

In one embodiment, in step (d), the computing device (1) if k is 1, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the k-th initial convolution layer and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than a threshold, and (2) if k is an integer from 2 to m, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the (k-1)-th inverse scaling layer, the k-th initial convolution layer, and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than the threshold.
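One way to satisfy this condition is to fold the per-channel factors directly into the convolution weights. The sketch below assumes a 1x1 convolution written as a matrix-vector product, with the (k-1)-th inverse scaling acting on input channels and the k-th scaling acting on output channels; the function names, the weights, and the threshold are hypothetical values chosen for the example.

```python
def conv1x1(weights, x):
    """1x1 convolution at a single position: weights[o][i] maps input channel i to output o."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def integrate(weights, prev_scales, scales):
    """Fold inverse scaling of inputs (1 / prev_scales[i]) and scaling of outputs (scales[o]) into the weights."""
    return [[scales[o] * w / prev_scales[i] for i, w in enumerate(row)]
            for o, row in enumerate(weights)]

weights = [[0.5, -0.25], [0.125, 0.75]]   # illustrative initial weights
prev_scales = [1.0, 256.0]                # (k-1)-th scaling parameters; all 1.0 would model k = 1
scales = [4.0, 0.5]                       # k-th scaling parameters
x_scaled = [0.64, 0.512]                  # input that was already scaled by prev_scales

# Reference path: inverse scaling, then the initial convolution, then scaling.
unscaled = [xi / p for xi, p in zip(x_scaled, prev_scales)]
reference = [s * y for s, y in zip(scales, conv1x1(weights, unscaled))]

# Integrated path: one convolution with the folded weights.
merged = conv1x1(integrate(weights, prev_scales, scales), x_scaled)

threshold = 1e-9
max_diff = max(abs(a - b) for a, b in zip(reference, merged))
```

Because the factors are constant per channel, the same folding applies at every kernel position of a larger convolution, so the integrated layer adds no run-time cost over the initial one.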

In one embodiment, the method further comprises: (e) the computing device quantizing the weight values of the k-th integrated convolution layer included in the k-th convolution block, thereby generating k-th quantized weight values as optimized weight values for the CNN operations performed by the k-th convolution block.

According to another aspect of the present invention, there is provided a computing device for transforming convolution layers of a CNN including m convolution blocks, comprising: at least one memory that stores instructions; and at least one processor configured to execute the instructions to perform: (I) a process of generating one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of a k-th initial convolution layer included in a k-th convolution block, (ii) (ii-1) an input image used to determine scaling parameters if k is 1, or (ii-2) a (k-1)-th feature map, corresponding to the input image, outputted from a (k-1)-th convolution block if k is an integer from 2 to m, and (iii) (iii-1) each of k-th scaling parameters corresponding to each of the channels included in the input image if k is 1, or (iii-2) each of the k-th scaling parameters corresponding to each of the channels included in the (k-1)-th feature map if k is an integer from 2 to m, where k is an integer from 1 to m; (II) a process of determining, by referring to the k-th quantization loss values, each of k-th optimal scaling parameters, among the k-th scaling parameters, corresponding to each of the channels included in the (k-1)-th feature map; (III) a process of generating a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimal scaling parameters; and (IV) a process of (i) if k is 1, transforming the k-th initial convolution layer into a k-th integrated convolution layer by using the k-th scaling layer, and (ii) if k is an integer from 2 to m, transforming the k-th initial convolution layer into the k-th integrated convolution layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

In one embodiment, in process (I), the processor generates the k-th quantization loss values by further referring to (iv) a BW value, which is the number of bits used to represent, in binary, the weight values included in the CNN and the values included in the feature maps, and (v) a k-th FL value, which is the absolute value of the exponent of the number represented by the LSB of (1) the k-th initial weight values of the k-th initial convolution layer and (2) the values included in the (k-1)-th feature map if k is an integer from 2 to m, or the values included in the input image if k is 1.

In one embodiment, in process (I), the processor generates the k-th quantization loss values ΔL_k by the formula, in which θ_p includes (i) the values of the (k-1)-th feature map and the k-th initial weight values of the k-th initial convolution feature map if k is an integer from 2 to m, and (ii) the values of the input image and the k-th initial weight values of the k-th initial convolution feature map if k is 1; C_ki is a specific k-th scaling parameter among the k-th scaling parameters; FL and BW are the FL value and the BW value, respectively; and the Q operation generates the difference between C_ki θ_i and the quantized value of C_ki θ_i generated by referring to the FL value and the BW value. In process (II), the processor determines each of the k-th optimal scaling parameters by selecting the C_ki that minimizes ΔL_k.

In one embodiment, the processor determines the k-th optimal scaling parameters by selecting the C_ki using the Nesterov Accelerated Gradient method.

In one embodiment, in process (III), the processor generates the k-th scaling layer, whose components are the respective k-th optimal scaling parameters, and generates the k-th inverse scaling layer, whose components are the respective reciprocals of the k-th optimal scaling parameters.

In one embodiment, in process (IV), the processor (1) if k is 1, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the k-th initial convolution layer and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than a threshold, and (2) if k is an integer from 2 to m, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the (k-1)-th inverse scaling layer, the k-th initial convolution layer, and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than the threshold.

In one embodiment, the processor further performs: (V) a process of quantizing the weight values of the k-th integrated convolution layer included in the k-th convolution block, thereby generating k-th quantized weight values as optimized weight values for the CNN operations performed by the k-th convolution block.

The present invention can provide a method for transforming CNN layers so that the values included in at least one feature map are flattened, in order that the values of specific channels containing small values are properly reflected in the output values, and the method can be used with optimized hardware applicable to mobile devices, compact networks with high precision, and the like.

The following drawings, attached for use in describing the embodiments of the present invention, are only some of the embodiments of the present invention, and a person with ordinary knowledge in the technical field to which the present invention belongs (hereinafter "a person of ordinary skill") can obtain other drawings based on these drawings without inventive work.
FIG. 1 is a diagram showing the configuration of a computing device for performing a method of transforming CNN layers to optimize CNN parameter quantization according to an embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of a CNN including scaling layers and inverse scaling layers according to an embodiment of the present invention.
FIG. 3a is a diagram showing a process of generating an integrated convolution layer by switching the positions of a scaling layer and an inverse scaling layer according to an embodiment of the present invention.
FIG. 3b is a diagram showing a process of generating an integrated convolution layer by switching the positions of a scaling layer and an inverse scaling layer according to an embodiment of the present invention.
FIG. 4 is an exemplary diagram showing values of several different channels that do not vary widely as a result of the scaling method according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram showing values of several different channels that vary widely according to the prior art.

The detailed description of the present invention below refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. It should be understood that the various embodiments of the present invention, although different from one another, are not necessarily mutually exclusive. For example, a particular shape, structure, or characteristic described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the present invention. It should also be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the present invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present invention, if properly described, is limited only by the appended claims, along with the full range of equivalents to which the claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

Throughout the detailed description and the claims of the present invention, the word "include" and its variations are not intended to exclude other technical features, additions, components, or steps. Other objects, advantages, and characteristics of the present invention will become apparent to a person of ordinary skill in part from this specification and in part from the practice of the present invention. The following examples and drawings are provided by way of illustration and are not intended to limit the present invention.

The various images referred to in the present invention may include images related to paved or unpaved roads, in which case objects that may appear in a road environment (for example, automobiles, people, animals, plants, objects, buildings, flying objects such as airplanes or drones, and other obstacles) may be assumed, although the images are not necessarily limited to these. The various images referred to in the present invention may also be images unrelated to roads (for example, images related to unpaved roads, alleys, vacant lots, seas, lakes, rivers, mountains, forests, deserts, the sky, or indoor spaces), in which case objects that may appear in environments such as unpaved roads, alleys, vacant lots, seas, lakes, rivers, mountains, forests, deserts, the sky, or indoor spaces (for example, automobiles, people, animals, plants, objects, buildings, flying objects such as airplanes or drones, and other obstacles) may be assumed, although the images are not necessarily limited to these.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person with ordinary knowledge in the technical field to which the present invention belongs can easily practice the present invention.

図１は、本発明の一実施例によるＣＮＮパラメータ量子化の最適化のためにＣＮＮレイヤを変換する方法を遂行するためのコンピューティング装置１００の構成を示した図面である。また、図２は、本発明の一実施例によるスケーリングレイヤ及び逆スケーリングレイヤが含まれたＣＮＮの構成を示した図面である。 FIG. 1 illustrates a configuration of a computing device 100 for performing a method of transforming a CNN layer for optimizing CNN parameter quantization according to an exemplary embodiment of the present invention. In addition, FIG. 2 is a diagram illustrating a configuration of a CNN including a scaling layer and an inverse scaling layer according to an exemplary embodiment of the present invention.

図１を参照すると、コンピューティング装置１００はＣＮＮ２００を含むことができる。前記ＣＮＮ２００による様々なデータの入力及び出力と各種データ演算の過程は、それぞれ通信部１１０及びプロセッサ１２０によって行われ得る。ところが、図１において、通信部１１０とプロセッサ１２０とがどのように連結されるのについての詳細な説明は省略する。また、コンピューティング装置は、次のプロセスを遂行するためのコンピュータ読取り可能な命令語を格納することができるメモリ１１５をさらに含むことができる。一例として、プロセッサ、メモリ、ミディアム等は、統合プロセッサと統合され得る。 With reference to FIG. 1, the computing device 100 may include a CNN 200. The process of inputting and outputting various data and calculating various data by the CNN 200 may be performed by the communication unit 110 and the processor 120, respectively. However, a detailed description of how the communication unit 110 and the processor 120 are connected in FIG. 1 will be omitted. In addition, the computing device may further include a memory 115 that may store computer readable instructions for performing the following processes. By way of example, a processor, memory, medium, etc. may be integrated with an integrated processor.

The CNN 200 may include one or more convolution blocks. Hereinafter, for convenience, the CNN 200 is assumed to include m convolution blocks, and k is used as a variable denoting an integer from 1 to m. As shown in FIG. 2, the k-th convolution block may include a k-th initial convolution layer 211_k, a k-th activation layer 212_k, and a k-th pooling layer 213_k.

Having reviewed the configuration of the computing device 100 and the CNN 200 it includes, the method of transforming the k-th initial convolution layer 211_k according to an embodiment of the present invention is now outlined.

First, an input image to be used for determining the scaling parameters may be acquired through the communication unit 110. The computing device 100 may then generate one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of the k-th initial convolution layer 211_k included in the k-th convolution block 210_k, (ii) the (k-1)-th feature map to be processed by the k-th convolution block 210_k, and (iii) each k-th scaling parameter of the k-th scaling layer 214_k corresponding to each channel included in the (k-1)-th feature map. When k is 1, the (k-1)-th feature map denotes the input image; the same applies hereinafter.

The computing device 100 may generate the k-th quantization loss values by further referring to (iv) a BW value, which is the number of bits used to represent, in binary, the weight values included in the CNN and the values included in the feature maps, and (v) a k-th FL value, which is the absolute value of the exponent of the number represented by the LSB of the k-th initial weight values of the k-th initial convolution layer and of the values included in the (k-1)-th feature map.
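The roles of the BW and FL values can be illustrated with a minimal fixed-point quantizer. This is a sketch under assumptions that the text does not state: a signed two's-complement integer code, round-to-nearest, and saturation at the ends of the range. The function name `quantize` is hypothetical.

```python
def quantize(x: float, bw: int, fl: int) -> float:
    """Round x to a fixed-point value with bw total bits and an LSB of 2**-fl.

    Assumption: signed codes with saturation; the patent text only defines
    BW (bit count) and FL (absolute exponent of the LSB).
    """
    step = 2.0 ** -fl                      # value represented by the LSB
    q_min = -(2 ** (bw - 1))               # smallest integer code
    q_max = 2 ** (bw - 1) - 1              # largest integer code
    code = round(x / step)                 # nearest integer code
    code = max(q_min, min(q_max, code))    # saturate to the bw-bit range
    return code * step


print(quantize(0.30, bw=8, fl=4))    # 0.3125
print(quantize(100.0, bw=8, fl=4))   # 7.9375 (saturated)
print(quantize(0.01, bw=8, fl=4))    # 0.0 (smaller than half an LSB)
```

The last call shows the problem the description targets: a value much smaller than the LSB is lost entirely.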

The computing device 100 may also generate the k-th quantization loss values according to the following formula.

Here, the formula represents the process of generating the k-th quantization loss values by differentiating the quantization loss.

In the formula, θ_p may include the values of the (k-1)-th feature map and the k-th initial weight values of the k-th initial convolution layer. C_ki may be the specific k-th scaling parameter, among the k-th scaling parameters, that corresponds to the i-th channel included in the (k-1)-th feature map. FL and BW are the FL value and the BW value, respectively, and the Q operation may be an operation that generates the difference between C_ki·θ_i and the quantized value of C_ki·θ_i generated by referring to the FL value and the BW value.

After the k-th quantization loss values are generated as described above, the computing device 100 may determine, among the k-th scaling parameters, each k-th optimal scaling parameter corresponding to each channel included in the (k-1)-th feature map. Specifically, the computing device 100 determines each k-th optimal scaling parameter by selecting the C_ki that minimizes ΔL_k. A Nesterov optimization algorithm may be used to select the C_ki, but the selection is not limited to this algorithm.
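Selecting a scaling parameter that minimizes a quantization loss can be sketched as follows. This is not the patent's ΔL_k or its Nesterov-based search: the loss here is a stand-in (squared error after undoing the scale), and the search is a plain comparison over a hypothetical list of candidate values. The names `quantize`, `loss`, and `best_scale` are assumptions made for the example.

```python
def quantize(x, bw, fl):
    # Assumed signed fixed-point rounding with saturation.
    step = 2.0 ** -fl
    code = max(-(2 ** (bw - 1)), min(2 ** (bw - 1) - 1, round(x / step)))
    return code * step


def loss(c, values, bw, fl):
    # Stand-in loss: squared error between each value and its
    # scaled, quantized, then unscaled counterpart.
    return sum((quantize(c * v, bw, fl) / c - v) ** 2 for v in values)


def best_scale(values, bw, fl, candidates):
    # Pick the candidate scaling parameter with the smallest loss.
    return min(candidates, key=lambda c: loss(c, values, bw, fl))


channel = [0.01, 0.02, 0.015]                       # a channel of small values
candidates = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
c_best = best_scale(channel, 8, 4, candidates)
print(c_best, loss(c_best, channel, 8, 4), loss(1.0, channel, 8, 4))
```

With no scaling (c = 1.0) every value in this channel rounds to zero, so any candidate that lifts the values above the LSB reduces the stand-in loss.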

To apply the Nesterov optimization algorithm, the constraints among the k-th scaling parameters must be determined. To this end, the computing device 100 may topologically sort the layers included in the CNN 200. The constraints on the k-th scaling parameters may then be determined according to the type of each layer. Some of these constraints may be unnecessary, for example duplicated, and such constraints may be removed. Information on the connections between the layers may be referred to when removing constraints.
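A topological sort of the layers can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The layer names and the two-block chain below are hypothetical; the text does not specify how the layer graph is represented or how constraints are derived from it.

```python
from graphlib import TopologicalSorter

# Hypothetical layer graph: each layer maps to the set of layers it depends on.
layers = {
    "conv1": set(),
    "act1": {"conv1"},
    "pool1": {"act1"},
    "conv2": {"pool1"},
    "act2": {"conv2"},
    "pool2": {"act2"},
}

# static_order() yields every layer after all of its predecessors.
order = list(TopologicalSorter(layers).static_order())
print(order)
```

Visiting layers in this order means the scaling parameters feeding a layer have already been considered by the time that layer's constraints are examined.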

The computing device 100 may then repeat forward passes and backward passes through the CNN 200 several times to obtain a 2D histogram for each of the weight values included in each layer, and thereby generate a graph of the k-th quantization loss values. While varying each k-th scaling parameter C_ki, the computing device 100 may determine the k-th optimal scaling parameters as those for which the corresponding k-th quantization loss values are smallest. The parameters may be varied according to the vector movement technique proposed by the Nesterov optimization algorithm.
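For orientation, a generic Nesterov accelerated gradient update looks like the sketch below. It is the textbook update applied to a toy quadratic loss, not the patent's quantization loss or its constraint handling; the learning rate, momentum, and the function `toy_grad` are assumptions chosen for illustration.

```python
def toy_grad(c: float) -> float:
    # Gradient of the toy loss (c - 3)^2, standing in for a real loss gradient.
    return 2.0 * (c - 3.0)


def nesterov_minimize(grad, c0, lr=0.1, momentum=0.9, steps=200):
    c, velocity = c0, 0.0
    for _ in range(steps):
        lookahead = c + momentum * velocity        # evaluate the gradient ahead
        velocity = momentum * velocity - lr * grad(lookahead)
        c = c + velocity                           # move along the updated vector
    return c


c_opt = nesterov_minimize(toy_grad, c0=1.0)
print(c_opt)
```

The distinguishing step is that the gradient is evaluated at the look-ahead point rather than at the current parameter value.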

Once the k-th optimal scaling parameters are determined in this way, the computing device 100 may generate the k-th scaling layer 214_k, whose components are the k-th optimal scaling parameters corresponding to each channel included in the (k-1)-th feature map, and the k-th inverse scaling layer 215_k, whose components are the reciprocals of the k-th optimal scaling parameters corresponding to each channel included in the (k-1)-th feature map.
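The construction of a scaling layer and its inverse can be sketched as two per-channel factor lists. The data layout (a feature map as a list of channels, each a flat list of values) and the names `make_layers` and `apply_per_channel` are assumptions made for the example.

```python
def make_layers(optimal_scales):
    # Scaling layer: one factor per channel; inverse layer: the reciprocals.
    scaling = list(optimal_scales)
    inverse = [1.0 / c for c in optimal_scales]
    return scaling, inverse


def apply_per_channel(feature_map, factors):
    # Multiply every value of channel i by factors[i].
    return [[f * v for v in channel] for channel, f in zip(feature_map, factors)]


scaling, inverse = make_layers([1.0, 16.0, 4.0])
feature_map = [[0.5, 0.75], [0.01, 0.02], [0.125, 0.25]]

scaled = apply_per_channel(feature_map, scaling)
restored = apply_per_channel(scaled, inverse)
print(scaled)
print(restored == feature_map)
```

Because the scales in this example are powers of two, applying the scaling layer and then the inverse scaling layer restores the input exactly in floating point.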

How the k-th scaling layer 214_k and the k-th inverse scaling layer 215_k are inserted into the k-th convolution block 210_k is described below with reference to FIG. 2.

Referring to FIG. 2, the k-th scaling layer 214_k and the k-th inverse scaling layer 215_k may be inserted immediately before and immediately after the k-th activation layer 212_k, respectively. This is possible because the operations performed by the k-th activation layer 212_k, the k-th scaling layer 214_k, and the k-th inverse scaling layer 215_k are commutative. The operation performed by the k-th activation layer 212_k may be a ReLU operation, but is not limited to it.

Expressed mathematically,

Referring to the formula, the term Sc*I.Sc can be inserted into the original expression because Sc, the scaling layer, and I.Sc, the inverse scaling layer, are inverse functions of each other. Because the Sc and I.Sc terms commute with the activation layer, they can be moved to either side of the activation layer.
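The commutation with the activation layer can be checked numerically for ReLU. The sketch below assumes a strictly positive scale factor; the identity does not hold for negative factors. The helper `relu` is defined here for the example.

```python
def relu(x: float) -> float:
    return max(0.0, x)


xs = [-2.0, -0.5, 0.0, 1.5, 3.0]
c = 4.0  # assumed positive scaling parameter

scale_then_relu = [relu(c * x) for x in xs]
relu_then_scale = [c * relu(x) for x in xs]
print(scale_then_relu == relu_then_scale)

# Counterexample for a negative factor: relu(-1 * 2) is 0, but -1 * relu(2) is -2.
print(relu(-1.0 * 2.0), -1.0 * relu(2.0))
```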

Adding the k-th scaling layer 214_k and the k-th inverse scaling layer 215_k to the CNN 200, however, requires more computing resources, which is inefficient. The present invention therefore presents a method of integrating the scaling layer, the initial convolution layer, and the inverse scaling layer, described below with reference to FIGS. 3a and 3b.

FIGS. 3a and 3b show the process of generating an integrated convolution layer by switching the positions of a scaling layer and an inverse scaling layer according to an embodiment of the present invention.

Referring to FIGS. 3a and 3b, the (k-1)-th inverse scaling layer 215_(k-1) included in the (k-1)-th convolution block 210_(k-1) may be moved into the k-th convolution block 210_k. This is possible because the (k-1)-th pooling layer 213_(k-1) itself is not related to the change in values.
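Why an inverse scaling layer can be moved past a pooling layer can be illustrated for max pooling. This sketch assumes one-dimensional, non-overlapping max pooling and a positive scale factor; the text does not specify the pooling type, and `max_pool` is a hypothetical helper.

```python
def max_pool(values, size):
    # Non-overlapping 1-D max pooling over windows of the given size.
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


channel = [0.1, 0.4, 0.3, 0.2]
inv_scale = 0.25  # assumed positive inverse scaling factor for this channel

scale_then_pool = max_pool([inv_scale * v for v in channel], 2)
pool_then_scale = [inv_scale * v for v in max_pool(channel, 2)]
print(scale_then_pool == pool_then_scale)
```

A positive factor preserves the ordering of the values in each window, so the same element is selected whether the factor is applied before or after pooling.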

Referring to FIG. 3b, the (k-1)-th inverse scaling layer 215_(k-1), the k-th initial convolution layer 211_k, and the k-th scaling layer 214_k may be integrated to generate the k-th integrated convolution layer 216_k. The computing device 100 may determine the parameters of the k-th integrated convolution layer 216_k such that the difference between (i) the result generated by applying the operations of the (k-1)-th inverse scaling layer, the k-th initial convolution layer, and the k-th scaling layer to an input value and (ii) the result generated by applying the operation of the k-th integrated convolution layer to the same input value is smaller than a threshold. The integration process described here may include multiplying together the corresponding components of the (k-1)-th inverse scaling layer 215_(k-1), the k-th initial convolution layer 211_k, and the k-th scaling layer 214_k, but is not limited to this.
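The multiplication-based integration can be sketched for a 1x1 convolution, which reduces to a matrix-vector product over channels. Several points are assumptions: the (k-1)-th inverse scaling factors act on the input channels, the k-th scaling factors act on the output channels (as suggested by the scaling layer's position in front of the activation layer in FIG. 2), and biases are ignored. The names `conv1x1` and `fold` are hypothetical.

```python
def conv1x1(weights, x):
    # weights[i][j] connects input channel j to output channel i.
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def fold(weights, prev_scales, scales):
    # Integrated weight: w'[i][j] = scales[i] * w[i][j] / prev_scales[j].
    return [[scales[i] * w / prev_scales[j] for j, w in enumerate(row)]
            for i, row in enumerate(weights)]


weights = [[0.5, -1.0], [2.0, 0.25]]
prev_scales = [4.0, 2.0]   # (k-1)-th optimal scaling parameters (assumed values)
scales = [8.0, 2.0]        # k-th optimal scaling parameters (assumed values)
x = [1.0, 3.0]             # one pixel of the scaled (k-1)-th feature map

# Three separate layers: inverse scaling, convolution, scaling.
unscaled = [v / c for v, c in zip(x, prev_scales)]
separate = [c * y for c, y in zip(scales, conv1x1(weights, unscaled))]

# One integrated layer.
integrated = conv1x1(fold(weights, prev_scales, scales), x)

threshold = 1e-9
print(all(abs(a - b) < threshold for a, b in zip(separate, integrated)))
```

The integrated layer does the work of three layers in a single convolution, which is the resource saving the description refers to.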

The case where k is 1 is not shown in FIG. 3b. Since there is no inverse scaling layer moved from a preceding block, only the first initial convolution layer 211_1 and the first scaling layer 214_1 are used to generate the first integrated convolution layer 216_1.

The process described above generates parameters of the k-th integrated convolution layer 216_k that are optimized for quantization. The quantization process itself is described here independently of the process of generating those parameters. The computing device 100 may quantize the weight values included in the k-th convolution block 210_k to generate k-th quantized weight values as optimized weight values for the CNN operations performed by the k-th convolution block 210_k. This quantization may be performed before, during, or after the process of generating the k-th integrated convolution layer 216_k.

The advantages of the optimized quantized CNN weight values are described with reference to FIG. 4.

FIG. 4 is an exemplary diagram showing the values of several different channels, which do not differ greatly from one another as a result of the scaling method according to an embodiment of the present invention.

First, referring to FIG. 5, which was referred to in describing the prior art, when the method provided by the present invention is not applied, the values of the second channel included in the first feature map are much smaller than the values of the first channel included in the first feature map. In contrast, referring to FIG. 4, the values of the first channel and the values of the second channel are similar. This is due to the first scaling parameters reflected in the weight values of the first integrated convolution layer 216_1. Because the difference between the first-channel values and the second-channel values is not large, both can be properly quantized after the operations performed by the second convolution block 210_2.
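The effect described for FIG. 4 can be reproduced with a small numerical sketch. The channel values, the BW and FL settings, the scale factor of 32, and the signed saturating quantizer are all assumptions chosen for illustration.

```python
def quantize(x, bw, fl):
    # Assumed signed fixed-point rounding with saturation.
    step = 2.0 ** -fl
    code = max(-(2 ** (bw - 1)), min(2 ** (bw - 1) - 1, round(x / step)))
    return code * step


first_channel = [0.5, 0.75]     # comparatively large values
second_channel = [0.01, 0.02]   # much smaller values

# Without scaling, the second channel vanishes under quantization.
plain = [quantize(v, 8, 4) for v in second_channel]

# With a per-channel scale, its values become comparable to the first channel.
c = 32.0
scaled = [quantize(c * v, 8, 4) for v in second_channel]
recovered = [q / c for q in scaled]

print(plain)      # [0.0, 0.0]
print(scaled)     # [0.3125, 0.625]
print(recovered)  # [0.009765625, 0.01953125]
```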

As a person having ordinary skill in the art will understand, the transmission and reception of the image data described above, such as original images, original labels, and additional labels, may be performed by the communication units of the learning device and the test device; the feature maps and the data for performing operations may be held and maintained by the processors (and/or memories) of the learning device and the test device; and the convolution, deconvolution, and loss value operations may be performed mainly by the processors of the learning device and the test device. The present invention, however, is not limited to this.

The embodiments of the present invention described above may be implemented in the form of program instructions executable through various computer components and stored on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions stored on the medium may be specially designed and constructed for the present invention, or may be known to and usable by those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices may be configured to operate as one or more software modules in order to carry out the processing according to the present invention, and vice versa.

The present invention has been described above with reference to specific matters, such as concrete components, and to limited embodiments and drawings. These are provided only to aid a more general understanding of the invention. The invention is not limited to the embodiments described, and a person having ordinary knowledge in the technical field to which the invention belongs can make various modifications and variations based on this description.

Accordingly, the idea of the present invention must not be confined to the embodiments described above. Not only the claims set out below but also everything modified equally or equivalently to those claims falls within the scope of the idea of the present invention.

Claims

1. A method for transforming convolution layers of a CNN including m convolution blocks, comprising:
(a) when an input image to be used for determining scaling parameters is acquired, generating, by a computing device, one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of a k-th initial convolution layer included in a k-th convolution block, (ii) (ii-1) the input image if k is 1, or (ii-2) a (k-1)-th feature map, corresponding to the input image, output from a (k-1)-th convolution block if k is an integer from 2 to m, and (iii) (iii-1) each of k-th scaling parameters corresponding to each of the channels included in the input image if k is 1, or (iii-2) each of the k-th scaling parameters corresponding to each of the channels included in the (k-1)-th feature map if k is an integer from 2 to m, where k is an integer from 1 to m;
(b) determining, by the computing device, each of k-th optimal scaling parameters, corresponding to each of the channels included in the (k-1)-th feature map, among the k-th scaling parameters by referring to the k-th quantization loss values;
(c) generating, by the computing device, a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimal scaling parameters; and
(d) transforming, by the computing device, (i) if k is 1, the k-th initial convolution layer into a k-th integrated convolution layer by using the k-th scaling layer, and (ii) if k is an integer from 2 to m, the k-th initial convolution layer into the k-th integrated convolution layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

2. The method of claim 1, wherein step (a) includes generating, by the computing device, the k-th quantization loss values by further referring to (iv) a BW value, which is the number of bits used to represent, in binary, the weight values included in the CNN and the values included in the feature maps, and (v) a k-th FL value, which is the absolute value of the exponent of the number represented by the LSB of (1) the k-th initial weight values of the k-th initial convolution layer and (2) the values included in the (k-1)-th feature map if k is an integer from 2 to m, or the values included in the input image if k is 1.

3. The method of claim 2, wherein, in step (a), the k-th quantization loss values are generated by a formula in which θ_p includes (i) the values of the (k-1)-th feature map and the k-th initial weight values of the k-th initial convolution layer if k is an integer from 2 to m, and (ii) the values of the input image and the k-th initial weight values of the k-th initial convolution layer if k is 1; C_ki is a specific k-th scaling parameter among the k-th scaling parameters; FL and BW are the FL value and the BW value, respectively; and the Q operation generates the difference between C_ki·θ_i and the quantized value of C_ki·θ_i generated by referring to the FL value and the BW value, and
wherein, in step (b), the computing device determines each of the k-th optimal scaling parameters by selecting the C_ki that minimizes ΔL_k.

4. The method of claim 3, wherein the computing device uses the Nesterov Accelerated Gradient method to select the C_ki and thereby determine the k-th optimal scaling parameters.

5. The method of claim 1, wherein, in step (c), the computing device generates the k-th scaling layer, whose components are determined as each of the k-th optimal scaling parameters, and the k-th inverse scaling layer, whose components are determined as each of the reciprocals of the k-th optimal scaling parameters.

6. The method of claim 1, wherein step (d) includes: (1) if k is 1, transforming the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the k-th initial convolution layer and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than a threshold; and (2) if k is an integer from 2 to m, transforming the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the (k-1)-th inverse scaling layer, the k-th initial convolution layer, and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than the threshold.

7. The method of claim 1, further comprising:
(e) quantizing, by the computing device, the weight values of the k-th integrated convolution layer included in the k-th convolution block, thereby generating k-th quantized weight values as optimized weight values for the CNN operations performed by the k-th convolution block.

8. A computing device for transforming convolution layers of a CNN including m convolution blocks, comprising:
at least one memory that stores instructions; and
at least one processor configured to execute the instructions to perform processes of: (I) generating one or more k-th quantization loss values by referring to (i) one or more k-th initial weight values of a k-th initial convolution layer included in a k-th convolution block, (ii) (ii-1) an input image to be used for determining scaling parameters if k is 1, or (ii-2) a (k-1)-th feature map, corresponding to the input image, output from a (k-1)-th convolution block if k is an integer from 2 to m, and (iii) (iii-1) each of k-th scaling parameters corresponding to each of the channels included in the input image if k is 1, or (iii-2) each of the k-th scaling parameters corresponding to each of the channels included in the (k-1)-th feature map if k is an integer from 2 to m, where k is an integer from 1 to m; (II) determining each of k-th optimal scaling parameters, corresponding to each of the channels included in the (k-1)-th feature map, among the k-th scaling parameters by referring to the k-th quantization loss values; (III) generating a k-th scaling layer and a k-th inverse scaling layer by referring to the k-th optimal scaling parameters; and (IV) transforming (i) if k is 1, the k-th initial convolution layer into a k-th integrated convolution layer by using the k-th scaling layer, and (ii) if k is an integer from 2 to m, the k-th initial convolution layer into the k-th integrated convolution layer by using the k-th scaling layer and the (k-1)-th inverse scaling layer.

9. The computing device of claim 8, wherein, in process (I), the processor generates the k-th quantization loss values by further referring to (iv) a BW value, which is the number of bits used to represent, in binary, the weight values included in the CNN and the values included in the feature maps, and (v) a k-th FL value, which is the absolute value of the exponent of the number represented by the LSB of (1) the k-th initial weight values of the k-th initial convolution layer and (2) the values included in the (k-1)-th feature map if k is an integer from 2 to m, or the values included in the input image if k is 1.

10. The computing device of claim 9, wherein, in process (I), the processor generates the k-th quantization loss values by a formula in which θ_p includes (i) the values of the (k-1)-th feature map and the k-th initial weight values of the k-th initial convolution layer if k is an integer from 2 to m, and (ii) the values of the input image and the k-th initial weight values of the k-th initial convolution layer if k is 1; C_ki is a specific k-th scaling parameter among the k-th scaling parameters; FL and BW are the FL value and the BW value, respectively; and the Q operation generates the difference between C_ki·θ_i and the quantized value of C_ki·θ_i generated by referring to the FL value and the BW value, and
wherein, in process (II), the processor determines each of the k-th optimal scaling parameters by selecting the C_ki that minimizes ΔL_k.

11. The computing device of claim 10, wherein the processor uses the Nesterov Accelerated Gradient method to select the C_ki and thereby determine the k-th optimal scaling parameters.

12. The computing device of claim 8, wherein, in process (III), the processor generates the k-th scaling layer, whose components are determined as each of the k-th optimal scaling parameters, and the k-th inverse scaling layer, whose components are determined as each of the reciprocals of the k-th optimal scaling parameters.

13. The computing device of claim 8, wherein, in process (IV), the processor: (1) if k is 1, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the k-th initial convolution layer and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than a threshold; and (2) if k is an integer from 2 to m, transforms the k-th initial convolution layer into the k-th integrated convolution layer such that the difference between (i) a result generated by applying the operations of the (k-1)-th inverse scaling layer, the k-th initial convolution layer, and the k-th scaling layer to an input value and (ii) a result generated by applying the operation of the k-th integrated convolution layer to the input value is smaller than the threshold.

14. The computing device of claim 8, wherein the processor further performs a process of (V) quantizing the weight values of the k-th integrated convolution layer included in the k-th convolution block, thereby generating k-th quantized weight values as optimized weight values for the CNN operations performed by the k-th convolution block.