KR20220069653A

KR20220069653A - Learning method for ultra light-weight deep learning network

Info

Publication number: KR20220069653A
Application number: KR1020200156990A
Authority: KR
Inventors: 박종희; 이상설; 장성준
Original assignee: 한국전자기술연구원 (Korea Electronics Technology Institute)
Priority date: 2020-11-20
Filing date: 2020-11-20
Publication date: 2022-05-27
Also published as: WO2022107951A1

Abstract

Provided is a learning/quantization method for an ultra-lightweight deep learning network. According to an embodiment of the present invention, the method quantizes the parameters of an LSN (Large Scale Network), propagates the quantization knowledge information generated while performing that quantization to an SSN (Small Scale Network), and quantizes the parameters of the SSN using the propagated quantization knowledge information. Accordingly, training for quantizing a low-complexity base network into an ultra-lightweight network becomes possible.

Description

{Learning method for ultra light-weight deep learning network}

The present invention relates to a method of training a deep learning network, and more particularly, to a method of training and quantizing an ultra-lightweight deep learning network.

In the prior art, an ultra-lightweight network is obtained by training the network in a floating-point number system and then quantizing its parameters.

Methods of quantizing floating-point data fall broadly into two types. The first technique, shown in FIG. 1, quantizes floating-point data after training is complete, using a specific mapping function.

Because converting real-valued data to integers introduces quantization error, this technique has the problem that the accuracy of the ultra-lightweight deep learning network is significantly lower than the floating-point result.
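The quantization error described above can be illustrated with a minimal sketch. The description does not specify the mapping function, so the symmetric uniform mapping below, the function names `quantize_uniform` and `dequantize`, and the sample weights are all assumptions made for illustration only.

```python
def quantize_uniform(values, num_bits=8):
    """Map floats to signed integers with one symmetric scale (assumed mapping)."""
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def dequantize(ints, scale):
    """Map integer levels back to approximate floating-point values."""
    return [q * scale for q in ints]


# Hypothetical trained weights, quantized after training to 4 bits.
weights = [0.82, -0.31, 0.05, -1.27, 0.44]
q, scale = quantize_uniform(weights, num_bits=4)
restored = dequantize(q, scale)

# The per-weight quantization error is what degrades accuracy.
errors = [w - r for w, r in zip(weights, restored)]
```

With this mapping each error is bounded by half a quantization step (`scale / 2`), so fewer bits mean a larger step and a larger error.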

The second technique, shown in FIG. 2, performs quantization during training. It was proposed to compensate for the quantization error of the first technique.

In this technique, a quantization mapping function is inserted into the base network that is to be made ultra-lightweight. Because the hyperparameters used in the quantization mapping function are learned together with the network, it achieves higher accuracy than the first technique.

However, the structure of the base network itself is not changed, so when the base network has low complexity, accuracy drops after quantization.

The present invention has been devised to solve the above problems. An object of the present invention is to provide a method of quantizing the parameters of an SSN by propagating the quantization knowledge information of an LSN, as a way to prevent accuracy from dropping after quantization in an ultra-lightweight network.

Another object of the present invention is to provide a method by which the quantization technique of the SSN can be selected adaptively.

According to an embodiment of the present invention for achieving the above object, a network quantization method includes: a first quantization step of quantizing parameters of a first deep learning network; a propagation step of propagating, to a second deep learning network, quantization knowledge information generated in the process of performing quantization in the first quantization step; and a second quantization step of quantizing parameters of the second deep learning network using the propagated quantization knowledge information.

The first deep learning network may be an LSN (Large Scale Network), and the second deep learning network may be an SSN (Small Scale Network).

The quantization knowledge information may include at least one of data variance, quantization error, and error variance.

In the propagation step, if the number of layers of the first deep learning network and the number of layers of the second deep learning network are not the same, quantization knowledge information about the parameters of all layers as a whole may be propagated.

In the propagation step, if the number of layers of the first deep learning network and the number of layers of the second deep learning network are the same, quantization knowledge information about the parameters may be propagated layer by layer.

In the second quantization step, the parameters of the second deep learning network may be quantized using a quantization technique that maps floating-point values to integers 1:N.

The network quantization method according to an embodiment of the present invention may further include: a first learning step of training the first deep learning network; a step of propagating classification knowledge information obtained through the first learning step to the second deep learning network; and a second learning step of training the second deep learning network using the propagated classification knowledge information.

Meanwhile, according to another embodiment of the present invention, a network quantization system includes: a communication unit that receives quantization knowledge information generated in the process of quantizing parameters of a first deep learning network; and a processor that quantizes parameters of a second deep learning network using the received quantization knowledge information.

As described above, according to embodiments of the present invention, a structure that propagates quantization information from the LSN (teacher network) to the SSN (student network) is introduced, which makes possible training for quantizing a low-complexity base network into an ultra-lightweight network.

In addition, according to embodiments of the present invention, the structure of the base network is searched while being changed simultaneously with quantization, so that the 1:N mapping quantization technique can be selected adaptively.

FIG. 1 illustrates a post-training quantization technique;
FIG. 2 illustrates a quantization-during-training technique;
FIG. 3 illustrates the concept of an ultra-lightweight deep learning network learning/quantization method according to an embodiment of the present invention;
FIG. 4 is a flowchart provided to explain an ultra-lightweight deep learning network learning/quantization method according to an embodiment of the present invention;
FIGS. 5 and 6 are diagrams provided to explain the quantization knowledge propagation method;
FIG. 7 is a diagram illustrating an adaptive quantization technique; and
FIG. 8 is a block diagram of an ultra-lightweight deep learning network learning/quantization system according to another embodiment of the present invention.

Hereinafter, the present invention will be described in more detail with reference to the drawings.

An embodiment of the present invention presents a method of training and quantizing an ultra-lightweight deep learning network.

The basic concept of the framework for training and quantizing an ultra-lightweight deep learning network is shown in FIG. 3. FIG. 3 is a diagram provided to explain an ultra-lightweight deep learning network learning/quantization method according to an embodiment of the present invention.

As shown, first, in order to train the SSN (Small Scale Network) 200, from which the ultra-lightweight network is generated, with higher accuracy, a knowledge distillation technique is used to propagate the classification knowledge information of the LSN (Large Scale Network) 100 to the SSN 200 for training. That is, the LSN 100 functions as a 'teacher network' and the SSN 200 functions as a 'student network'.

Next, the knowledge generated when the LSN 100 is quantized is propagated to the SSN 200, and the SSN 200 is quantized, thereby forming the ultra-lightweight network.

The concept shown in FIG. 3 will be described in detail with reference to FIG. 4. FIG. 4 is a flowchart provided to explain an ultra-lightweight deep learning network learning/quantization method according to an embodiment of the present invention.

As shown, first, the LSN 100 is trained (S310), and the classification knowledge information obtained through step S310 is propagated to the SSN 200 (S320). The SSN 200 is then trained using the classification knowledge information propagated in step S320 (S330).
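Steps S310 to S330 can be sketched with a commonly used form of knowledge distillation. The description names knowledge distillation but does not fix a loss function, so the temperature-softened cross-entropy below, the function names, and the sample logits are assumptions for illustration, not the disclosed method.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional softening temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher (LSN) and student (SSN) outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical class scores: the teacher's outputs act as the propagated
# classification knowledge that the student is trained to match.
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.2, 3.0, 1.0]

loss_close = distillation_loss(teacher_logits, close_student)
loss_far = distillation_loss(teacher_logits, far_student)
```

A student whose outputs resemble the teacher's yields the smaller loss, so minimizing this quantity during step S330 pulls the SSN toward the LSN's class distribution.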

Next, the parameters of the LSN 100 are quantized (S340), and the quantization knowledge information generated in the process of performing quantization in step S340 is propagated to the SSN 200 (S350). The parameters of the SSN 200 are then quantized using the quantization knowledge information propagated in step S350 (S360).

The quantization knowledge information that is propagated in step S350 and used to quantize the parameters of the SSN 200 may include data variance, quantization error, and error variance.
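The three quantities named above can be computed as in the following sketch. The description does not define them precisely, so taking the quantization error as the mean squared error, using a symmetric uniform quantizer, and the function name `quantization_knowledge` are all assumptions for illustration.

```python
from statistics import pvariance


def quantization_knowledge(values, num_bits=8):
    """Collect assumed quantization knowledge for one set of parameters."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Quantize to integer levels, then map back to floats.
    restored = [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]
    errors = [v - r for v, r in zip(values, restored)]
    return {
        "data_variance": pvariance(values),          # spread of the parameters
        "quantization_error": sum(e * e for e in errors) / len(errors),  # MSE
        "error_variance": pvariance(errors),         # spread of the errors
    }


# Hypothetical parameters of one LSN layer, quantized to 4 bits.
layer_weights = [0.82, -0.31, 0.05, -1.27, 0.44, 0.10, -0.66]
knowledge = quantization_knowledge(layer_weights, num_bits=4)
```

The resulting dictionary is the kind of compact summary that could be sent to the SSN in step S350 in place of the LSN's full parameters.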

The quantization methods performed in steps S340 to S360 can be classified into two types.

FIG. 5 is a diagram illustrating the concept of the first quantization method. In this method, quantization knowledge information about the parameters of all layers of the LSN 100 as a whole is propagated to the SSN 200, and the SSN 200 quantizes the parameters of all of its layers on that basis.

This method is suitable when the number of layers of the LSN 100 and the number of layers of the SSN 200 are not the same.

FIG. 6 illustrates the second quantization method. In this method, quantization knowledge information about the parameters of each layer of the LSN 100 is propagated to the corresponding layer of the SSN 200, and the SSN 200 quantizes the parameters of each layer on that basis.

This method is suitable when the number of layers of the LSN 100 and the number of layers of the SSN 200 are the same.
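The choice between the two propagation methods can be sketched as a simple branch on the layer counts. The function name `propagate_knowledge`, the dictionary layout, and the use of a plain average to form the whole-network summary are assumptions for illustration; the description does not state how the whole-network information is aggregated.

```python
def propagate_knowledge(teacher_layer_stats, num_student_layers):
    """Return one knowledge dict per student (SSN) layer.

    teacher_layer_stats: list with one dict per teacher (LSN) layer,
    e.g. {"data_variance": 0.2}.
    """
    if len(teacher_layer_stats) == num_student_layers:
        # Layer counts match: propagate layer by layer (FIG. 6).
        return [dict(stats) for stats in teacher_layer_stats]

    # Layer counts differ: propagate one whole-network summary (FIG. 5).
    # Averaging the per-layer values is an assumed aggregation.
    keys = teacher_layer_stats[0].keys()
    pooled = {
        k: sum(stats[k] for stats in teacher_layer_stats) / len(teacher_layer_stats)
        for k in keys
    }
    return [dict(pooled) for _ in range(num_student_layers)]


# Hypothetical per-layer statistics of a three-layer LSN.
teacher = [{"data_variance": 0.2}, {"data_variance": 0.4}, {"data_variance": 0.6}]

per_layer = propagate_knowledge(teacher, num_student_layers=3)
pooled = propagate_knowledge(teacher, num_student_layers=2)
```

In the matching case each SSN layer receives its counterpart's statistics unchanged; in the mismatched case every SSN layer receives the same pooled values.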

Meanwhile, in the quantization performed in steps S340 to S360, the parameters of the SSN 200 may be quantized using a quantization technique that maps floating-point values to integers 1:N.

Because an ultra-lightweight network has far fewer parameters than the LSN 100, the discriminative power of the resulting network is also reduced. This is because, in general, floating-point values are mapped to integers 1:1 when a base network is quantized.

To make this mapping adaptively selectable, the network learning/quantization method according to an embodiment of the present invention applies 1:N mapping. In this case, however, the parameter N is heuristic, and selecting the optimal N for each data set requires several rounds of retraining.

To avoid this, as shown in FIG. 7, an N-dimensional selection vector, which selects the features actually used among the N mapped candidates, is placed in the middle of the 1:N mapping transformation so that the structure of the ultra-lightweight quantized network can be changed adaptively. Each element of the vector is designed as a sigmoid function that can be selected while training is in progress, so that the features to be used are chosen adaptively.
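A sigmoid-gated selection vector can be sketched as follows. The description does not spell out the form of the 1:N mapping, so this sketch assumes that one floating-point value is mapped to N integer candidates, one per candidate scale, and that a gate above 0.5 marks a candidate as used. The function names, the scales, and the gate values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def one_to_n_quantize(value, scales, gate_logits, qmax=7):
    """Map one float to N integer candidates and gate them with sigmoids."""
    # Selection vector: one sigmoid gate per candidate; in training the
    # gate logits would be learned parameters.
    gates = [sigmoid(g) for g in gate_logits]
    # 1:N mapping (assumed form): one integer level per candidate scale.
    candidates = [max(-qmax, min(qmax, round(value / s))) for s in scales]
    # Gate-weighted reconstruction of the original value.
    approx = sum(g * c * s for g, c, s in zip(gates, candidates, scales)) / sum(gates)
    # Candidates whose gate exceeds 0.5 are treated as selected.
    active = [g > 0.5 for g in gates]
    return candidates, gates, active, approx


# Hypothetical example with N = 3 candidate scales.
candidates, gates, active, approx = one_to_n_quantize(
    value=0.37, scales=[0.05, 0.1, 0.2], gate_logits=[4.0, -4.0, 3.0]
)
```

Because the gates are smooth functions of their logits, the effective number of candidates in use can change during training instead of fixing N in advance and retraining for each choice.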

FIG. 8 is a block diagram of an ultra-lightweight deep learning network learning/quantization system according to another embodiment of the present invention.

As shown, the ultra-lightweight deep learning network learning/quantization system according to an embodiment of the present invention can be implemented as a computing system including a communication unit 410, an output unit 420, a processor 430, an input unit 440, and a storage unit 450.

The communication unit 410 is a communication interface that is communicatively connected to an external network or an external device and transmits and receives data/information. In an embodiment of the present invention, the communication unit 410 connects to the system that trains the LSN 100 and receives the classification knowledge information and the quantization knowledge information of the LSN 100.

The processor 430 trains the SSN 200 using the classification knowledge information received through the communication unit 410, and quantizes the SSN 200 using the quantization knowledge information to generate the ultra-lightweight network.

The input unit 440 is an input means that transmits user commands to the processor 430, and the output unit 420 is an output means that outputs the execution results of the processor 430. The storage unit 450 provides the storage space that the processor 430 needs in order to operate and function.

So far, a method of training and quantizing an ultra-lightweight deep learning network has been described in detail with reference to preferred embodiments.

The above embodiments introduced a structure that propagates quantization information from the LSN (teacher network) 100 to the SSN (student network) 200, and thereby presented a training technique for quantizing a low-complexity base network into an ultra-lightweight network.

The above embodiments also presented a structure in which the 1:N mapping quantization technique can be selected adaptively, by searching the structure of the base network while changing it simultaneously with quantization.

As a result, deep learning technology that runs on expensive GPU devices can be used in a variety of fields, and the network can learn adaptively in each field of application.

Furthermore, because deep learning algorithms can be executed in real time even on devices with low computational performance, such as mobile or edge devices, AI technology becomes applicable to a variety of industries.

In addition, the structure of the network can be changed automatically based on data sets whose characteristics differ from field to field, which enables efficient operation.

Meanwhile, the technical idea of the present invention can of course also be applied to a computer-readable recording medium containing a computer program that performs the functions of the apparatus and method according to the present embodiment. In addition, the technical ideas according to various embodiments of the present invention may be implemented in the form of computer-readable code recorded on a computer-readable recording medium. The computer-readable recording medium may be any data storage device that can be read by a computer and can store data. For example, the computer-readable recording medium may be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, or the like. In addition, the computer-readable code or program stored in the computer-readable recording medium may be transmitted through a network connecting computers.

In addition, although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications can of course be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or perspective of the present invention.

100: LSN (Large Scale Network)
200: SSN (Small Scale Network)
410: communication unit
420: output unit
430: processor
440: input unit
450: storage unit

Claims

A network quantization method comprising:
a first quantization step of quantizing parameters of a first deep learning network;
a propagation step of propagating, to a second deep learning network, quantization knowledge information generated in the process of performing quantization in the first quantization step; and
a second quantization step of quantizing parameters of the second deep learning network using the propagated quantization knowledge information.

The method according to claim 1,
wherein the first deep learning network is an LSN (Large Scale Network), and
the second deep learning network is an SSN (Small Scale Network).

The method according to claim 1,
wherein the quantization knowledge information comprises at least one of data variance, quantization error, and error variance.

The method according to claim 3,
wherein, in the propagation step, when the number of layers of the first deep learning network and the number of layers of the second deep learning network are not the same, quantization knowledge information about the parameters of all layers as a whole is propagated.

The method according to claim 3,
wherein, in the propagation step, when the number of layers of the first deep learning network and the number of layers of the second deep learning network are the same, quantization knowledge information about the parameters is propagated layer by layer.

The method according to claim 1,
wherein, in the second quantization step, the parameters of the second deep learning network are quantized using a quantization technique that maps floating-point values to integers 1:N.

The method according to claim 1, further comprising:
a first learning step of training the first deep learning network;
a step of propagating classification knowledge information obtained through the first learning step to the second deep learning network; and
a second learning step of training the second deep learning network using the propagated classification knowledge information.

A network quantization system comprising:
a communication unit that receives quantization knowledge information generated in the process of quantizing parameters of a first deep learning network; and
a processor that quantizes parameters of a second deep learning network using the received quantization knowledge information.