WO2022107951A1 - Method for training ultra-lightweight deep learning network - Google Patents


Info

Publication number
WO2022107951A1
Authority
WO
WIPO (PCT)
Prior art keywords
quantization
network
deep learning
learning network
knowledge information
Application number
PCT/KR2020/016635
Other languages
French (fr)
Korean (ko)
Inventor
박종희
이상설
장성준
Original Assignee
한국전자기술연구원 (Korea Electronics Technology Institute)
Application filed by 한국전자기술연구원 (Korea Electronics Technology Institute)
Publication of WO2022107951A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods



Abstract

Provided is a method for training and quantizing an ultra-lightweight deep learning network. In the method according to an embodiment of the present invention, the parameters of a Large Scale Network (LSN) are quantized, the quantization knowledge generated in that process is propagated to a Small Scale Network (SSN), and the parameters of the SSN are quantized using the propagated knowledge. This makes it possible to train a low-complexity base network for ultra-lightweight quantization.

Description

Method for training an ultra-lightweight deep learning network
The present invention relates to a method for training a deep learning network and, more particularly, to a method for training and quantizing an ultra-lightweight deep learning network.
In the prior art, an ultra-lightweight network is obtained by training the network in a floating-point number system and then quantizing its parameters.
Methods for quantizing floating-point data fall broadly into two types. The first technique, shown in FIG. 1, quantizes the trained floating-point data through a specific mapping function.
Because converting real-valued parameters to integers introduces quantization error, this approach significantly lowers the accuracy of the ultra-lightweight deep learning network compared with the floating-point result.
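The first technique's mapping function can be sketched as a simple affine quantizer. This is an illustrative example only; the min-max calibration and 8-bit width are assumptions, not part of the disclosure:

```python
import numpy as np

def quantize_affine(w, num_bits=8):
    """Map trained floating-point weights to integers through an affine
    mapping function (post-training quantization, as in FIG. 1)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = int(round(qmin - w.min() / scale))
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Recover approximate floating-point values from the integers."""
    return scale * (q.astype(np.float64) - zero_point)

w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
q, s, z = quantize_affine(w)
# The residual below is exactly the quantization error the text refers to.
error = np.abs(w - dequantize_affine(q, s, z)).max()
```

The error is bounded by the quantization step, which is what degrades accuracy when the trained network is quantized only after training.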
The second technique, shown in FIG. 2, performs quantization during training, and was proposed to compensate for the quantization error of the first technique.
A quantization mapping function is inserted into the middle of the base network to be made ultra-lightweight, and the hyperparameters used by the mapping function are learned together with the network, yielding higher accuracy than the first technique.
However, because the structure of the base network itself is not changed, accuracy after quantization still degrades when the base network has low complexity.
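The second technique can be sketched as a fake-quantization mapping function inserted into the forward pass. The straight-through gradient shown is a common companion to this technique and is assumed here, not taken from the disclosure:

```python
import numpy as np

def fake_quant(x, scale, num_bits=8):
    """Quantization mapping function inserted mid-network: the forward
    pass sees quantized values while parameters stay in floating point,
    so the scale hyperparameter can be tuned during training."""
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def ste_grad(upstream):
    """Straight-through estimator (assumed): rounding has zero gradient
    almost everywhere, so the upstream gradient is passed through."""
    return upstream

x = np.array([0.33, -0.72, 0.0])   # example activations (illustrative)
y = fake_quant(x, scale=0.1)       # quantized forward values
```

Because the mapping function sits inside the network, its hyperparameters (here, `scale`) are exposed to the training loop, which is what gives this technique higher accuracy than post-training quantization.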
The present invention has been devised to solve the above problems. An object of the present invention is to provide a method for preventing the loss of accuracy after quantization in an ultra-lightweight network by propagating the quantization knowledge information of an LSN and using it to quantize the parameters of an SSN.
Another object of the present invention is to provide a method for adaptively selecting the quantization technique of the SSN.
According to an embodiment of the present invention for achieving the above objects, a network quantization method includes: a first quantization step of quantizing parameters of a first deep learning network; a step of propagating quantization knowledge information, generated while performing the first quantization step, to a second deep learning network; and a second quantization step of quantizing parameters of the second deep learning network using the propagated quantization knowledge information.
The first deep learning network may be a Large Scale Network (LSN), and the second deep learning network may be a Small Scale Network (SSN).
The quantization knowledge information may include at least one of the data variance, the quantization error, and the variance of the error.
In the propagating step, if the number of layers of the first deep learning network and the number of layers of the second deep learning network are not the same, quantization knowledge information about the parameters of all layers may be propagated.
In the propagating step, if the number of layers of the first deep learning network and the number of layers of the second deep learning network are the same, quantization knowledge information about the parameters may be propagated layer by layer.
The second quantization step may quantize the parameters of the second deep learning network with a quantization technique that maps floating-point values to integers 1:N.
The network quantization method according to an embodiment of the present invention may further include: a first training step of training the first deep learning network; a step of propagating classification knowledge information obtained in the first training step to the second deep learning network; and a second training step of training the second deep learning network using the propagated classification knowledge information.
Meanwhile, according to another embodiment of the present invention, a network quantization system includes: a communication unit that receives quantization knowledge information generated while quantizing parameters of a first deep learning network; and a processor that quantizes parameters of a second deep learning network using the received quantization knowledge information.
As described above, according to embodiments of the present invention, introducing a structure that propagates quantization information from an LSN (teacher network) to an SSN (student network) makes it possible to train a low-complexity base network for ultra-lightweight quantization.
In addition, according to embodiments of the present invention, by exploring the structure of the base network while changing it simultaneously with quantization, the 1:N mapping quantization technique can be selected adaptively.
FIG. 1 illustrates a post-training quantization technique;
FIG. 2 illustrates a quantization-during-training technique;
FIG. 3 illustrates the concept of an ultra-lightweight deep learning network training/quantization method according to an embodiment of the present invention;
FIG. 4 is a flowchart provided to explain an ultra-lightweight deep learning network training/quantization method according to an embodiment of the present invention;
FIGS. 5 and 6 are diagrams provided to explain quantization knowledge propagation methods;
FIG. 7 is a diagram illustrating an adaptive quantization technique; and
FIG. 8 is a block diagram of an ultra-lightweight deep learning network training/quantization system according to another embodiment of the present invention.
Hereinafter, the present invention is described in more detail with reference to the drawings.
An embodiment of the present invention presents a method for training and quantizing an ultra-lightweight deep learning network.
The basic concept of the framework for training and quantizing an ultra-lightweight deep learning network is shown in FIG. 3, a diagram provided to explain the training/quantization method according to an embodiment of the present invention.
As shown, a Small Scale Network (SSN) (200), from which the ultra-lightweight network will be generated, is first trained with higher accuracy using a knowledge distillation technique: the classification knowledge information of a Large Scale Network (LSN) (100) is propagated to it. That is, the LSN (100) functions as the teacher network and the SSN (200) as the student network.
Next, the knowledge generated when the LSN (100) is quantized is propagated to the SSN (200), and the SSN (200) is quantized to form the ultra-lightweight network.
The concept shown in FIG. 3 is described in detail with reference to FIG. 4, a flowchart provided to explain the training/quantization method according to an embodiment of the present invention.
As shown, the LSN (100) is first trained (S310), and the classification knowledge information obtained in step S310 is propagated to the SSN (200) (S320). The SSN (200) is then trained using the classification knowledge information propagated in step S320 (S330).
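Steps S310 to S330 rely on knowledge distillation; a common formulation, in which softened teacher outputs are blended with the hard-label loss, is sketched below. The temperature `T`, weight `alpha`, and example logits are illustrative assumptions, not values from the disclosure:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that transfers
    the LSN teacher's softened class probabilities to the SSN student."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean()
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * ce + (1 - alpha) * (T ** 2) * kl

teacher = np.array([[4.0, 1.0, 0.0]])   # LSN classification knowledge
student = np.array([[2.5, 0.5, 0.2]])   # SSN outputs before training
loss = distillation_loss(student, teacher, labels=np.array([0]))
```

Minimizing this loss moves the student's outputs toward the teacher's, which is how the classification knowledge of step S320 guides the training of the SSN in step S330.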
Next, the parameters of the LSN (100) are quantized (S340), and the quantization knowledge information generated while performing step S340 is propagated to the SSN (200) (S350). The parameters of the SSN (200) are then quantized using the quantization knowledge information propagated in step S350 (S360).
The quantization knowledge information propagated in step S350 and used to quantize the parameters of the SSN (200) may include the data variance, the quantization error, and the variance of the error.
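The three statistics named above can be computed directly from the teacher's parameters before and after quantization. The symmetric 8-bit quantizer below is an assumption used only to produce example numbers:

```python
import numpy as np

def quantization_knowledge(w_float, w_dequant):
    """Quantization knowledge information propagated from LSN to SSN
    (S350): the variance of the original data, the per-parameter
    quantization error, and the variance of that error."""
    err = w_float - w_dequant
    return {
        "data_variance": float(np.var(w_float)),
        "quantization_error": err,
        "error_variance": float(np.var(err)),
    }

# Illustrative teacher weights and a symmetric 8-bit quantizer (assumed).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=1000)
scale = np.abs(w).max() / 127
w_hat = np.round(w / scale) * scale
info = quantization_knowledge(w, w_hat)
```

The student can use these statistics, rather than re-deriving them from scratch, when quantizing its own parameters in step S360.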
The quantization performed in steps S340 to S360 can be carried out in one of two ways.
FIG. 5 shows the concept of the first method: quantization knowledge information about the parameters of all layers of the LSN (100) is propagated to the SSN (200), which quantizes the parameters of all of its layers based on that information.
This method is suitable when the number of layers of the LSN (100) and the number of layers of the SSN (200) are not the same.
FIG. 6 shows the second method: quantization knowledge information about the parameters of each layer of the LSN (100) is propagated to the corresponding layer of the SSN (200), which quantizes its parameters layer by layer based on that information.
This method is suitable when the number of layers of the LSN (100) and the number of layers of the SSN (200) are the same.
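The choice between the two propagation schemes reduces to comparing layer counts; a minimal sketch, with illustrative function and mode names:

```python
def propagation_mode(lsn_layers: int, ssn_layers: int) -> str:
    """Select the propagation scheme: layer-by-layer (FIG. 6) when the
    LSN and SSN have the same number of layers, whole-network (FIG. 5)
    otherwise. Names here are illustrative, not from the disclosure."""
    return "per_layer" if lsn_layers == ssn_layers else "whole_network"
```

For example, a 50-layer teacher distilling into an 18-layer student would fall back to whole-network propagation, while equal-depth networks can match statistics layer by layer.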
Meanwhile, in steps S340 to S360, the parameters of the SSN (200) can be quantized with a quantization technique that maps floating-point values to integers 1:N.
Because the ultra-lightweight network has far fewer parameters than the LSN (100), the discriminative power of the resulting network is also reduced. This is because a base network is conventionally quantized by mapping floating-point values to integers 1:1.
To select the mapping adaptively, the network training/quantization method according to an embodiment of the present invention applies 1:N mapping. In this case, however, the parameter N is heuristic, and selecting the optimal N for each data set requires several rounds of retraining.
To avoid this, as shown in FIG. 7, an N-dimensional selection vector, which selects the features actually used among the N mapped candidates, is placed in the middle of the 1:N mapping transformation to enable adaptive changes to the structure of the ultra-lightweight quantization network. Each element of the vector is designed as a sigmoid function that can be selected while training is in progress, so that the features to be used can be chosen adaptively.
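A possible sketch of the 1:N mapping with an N-dimensional sigmoid selection vector: each floating-point value maps to N integer candidates, and gated elements of the selection vector determine which mapped features are actually used. The candidate offsets, the 0.5 threshold, and the example logits are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_to_n_quantize(w, scale, selector_logits, offsets):
    """Map each float to N integer candidates (1:N mapping), then gate
    them with a sigmoid selection vector learned during training."""
    base = np.round(w[:, None] / scale)        # shape (M, 1)
    candidates = base + offsets[None, :]       # shape (M, N)
    gates = sigmoid(selector_logits) > 0.5     # which features are used
    return candidates[:, gates], gates

w = np.array([0.31, -0.44])
offsets = np.array([-1, 0, 1])                 # N = 3 candidates (assumed)
selector = np.array([-2.0, 3.0, 1.0])          # example learned logits
selected, gates = one_to_n_quantize(w, 0.1, selector, offsets)
```

Because the gates come from a differentiable sigmoid, which candidates survive can be decided during training itself, avoiding the per-dataset retraining that a hand-picked N would require.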
FIG. 8 is a block diagram of an ultra-lightweight deep learning network training/quantization system according to another embodiment of the present invention.
As shown, the training/quantization system according to an embodiment of the present invention can be implemented as a computing system comprising a communication unit (410), an output unit (420), a processor (430), an input unit (440), and a storage unit (450).
The communication unit (410) is a communication interface connected to an external network or external device to transmit and receive data/information. In this embodiment of the present invention, the communication unit (410) communicates with the system that trains the LSN (100) and receives the classification knowledge information and quantization knowledge information of the LSN (100).
The processor (430) trains the SSN (200) using the classification knowledge information received through the communication unit (410) and quantizes the SSN (200) using the quantization knowledge information to generate the ultra-lightweight network.
The input unit (440) is an input means that delivers user commands to the processor (430); the output unit (420) is an output means that outputs the execution results of the processor (430); and the storage unit (450) provides the storage space the processor (430) needs to operate and function.
A method for training and quantizing an ultra-lightweight deep learning network has thus been described in detail through preferred embodiments.
The embodiments above introduce a structure that propagates quantization information from the LSN (teacher network) (100) to the SSN (student network) (200), presenting a training technique for ultra-lightweight quantization of a low-complexity base network.
The embodiments also present a structure in which the 1:N mapping quantization technique can be selected adaptively by exploring the structure of the base network while changing it simultaneously with quantization.
As a result, deep learning technology that would otherwise run only on expensive GPU devices can be used in various fields, and the network can adapt through training to each field of application.
Furthermore, because deep learning algorithms can run in real time even on devices with low computational performance, such as mobile and edge devices, AI technology becomes applicable across various industries.
In addition, the network structure can be changed automatically based on data sets whose characteristics differ by field, enabling efficient operation.
Meanwhile, the technical idea of the present invention can of course also be applied to a computer-readable recording medium containing a computer program that performs the functions of the apparatus and method according to the present embodiments. The technical ideas according to various embodiments of the present invention may be implemented as computer-readable code recorded on a computer-readable recording medium. The computer-readable recording medium may be any data storage device that can be read by a computer and can store data, for example a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, or hard disk drive. Computer-readable code or programs stored on such a medium may also be transmitted over a network connecting computers.
While preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the art without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or prospect of the present invention.

Claims (8)

  1. A network quantization method comprising:
    a first quantization step of quantizing parameters of a first deep learning network;
    propagating quantization knowledge information, generated while performing the quantization in the first quantization step, to a second deep learning network; and
    a second quantization step of quantizing parameters of the second deep learning network using the propagated quantization knowledge information.
  2. The method according to claim 1, wherein
    the first deep learning network is a large-scale network (LSN), and
    the second deep learning network is a small-scale network (SSN).
  3. The method according to claim 1, wherein
    the quantization knowledge information includes at least one of data variance, quantization error, and variance of the error.
  4. The method according to claim 3, wherein
    in the propagating step, when the number of layers of the first deep learning network and the number of layers of the second deep learning network are not equal, quantization knowledge information on the parameters of all layers is propagated as a whole.
  5. The method according to claim 3, wherein
    in the propagating step, when the number of layers of the first deep learning network and the number of layers of the second deep learning network are equal, quantization knowledge information on the parameters is propagated layer by layer.
  6. The method according to claim 1, wherein
    the second quantization step quantizes the parameters of the second deep learning network with a quantization technique that maps floating-point values to integers in a 1:N manner.
  7. The method according to claim 1, further comprising:
    a first learning step of training the first deep learning network;
    propagating classification knowledge information obtained through the first learning step to the second deep learning network; and
    a second learning step of training the second deep learning network using the propagated classification knowledge information.
  8. A network quantization system comprising:
    a communication unit that receives quantization knowledge information generated in the process of quantizing parameters of a first deep learning network; and
    a processor that quantizes parameters of a second deep learning network using the received quantization knowledge information.
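The quantization flow of claims 1 through 5 can be sketched as follows. This is a minimal illustration, not the patented implementation: the uniform affine quantizer, the `quantize`/`propagate` function names, and the aggregation rule for the unequal-layer-count case are all illustrative assumptions — the publication enumerates the knowledge items (data variance, quantization error, variance of the error) but does not fix concrete formulas.

```python
import numpy as np

def quantize(params, num_bits=8):
    """Quantize a parameter tensor and return (quantized values, knowledge info).

    Uses a uniform affine float-to-integer mapping as a stand-in for the
    claimed 1:N quantization; the knowledge dictionary carries the three
    items listed in claim 3."""
    qmin, qmax = 0, 2 ** num_bits - 1
    span = params.max() - params.min()
    scale = span / (qmax - qmin) if span > 0 else 1.0
    zero_point = qmin - params.min() / scale
    q = np.clip(np.round(params / scale + zero_point), qmin, qmax)
    dequantized = (q - zero_point) * scale
    error = params - dequantized  # per-element quantization error
    knowledge = {
        "data_variance": params.var(),        # data variance
        "quantization_error": error,          # quantization error
        "error_variance": error.var(),        # variance of the error
    }
    return q, knowledge

def propagate(teacher_layers, student_layers):
    """Claims 4-5: propagate per-layer knowledge when the layer counts of
    teacher and student match, otherwise a single network-level summary
    aggregated over all layers (an assumed aggregation, for illustration)."""
    knowledge = [quantize(w)[1] for w in teacher_layers]
    if len(teacher_layers) == len(student_layers):
        return knowledge  # layer-by-layer propagation
    all_err = np.concatenate([k["quantization_error"].ravel() for k in knowledge])
    return [{
        "data_variance": np.mean([k["data_variance"] for k in knowledge]),
        "quantization_error": all_err,
        "error_variance": all_err.var(),
    }]
```

A student-side quantizer (the second quantization step of claim 1) could then, for instance, use the propagated error variance to pick its own bit width or clipping range per layer.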
PCT/KR2020/016635 2020-11-20 2020-11-24 Method for training ultra-lightweight deep learning network WO2022107951A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200156990A KR20220069653A (en) 2020-11-20 2020-11-20 Learning method for ultra light-weight deep learning network
KR10-2020-0156990 2020-11-20

Publications (1)

Publication Number Publication Date
WO2022107951A1 true WO2022107951A1 (en) 2022-05-27

Family

ID=81709227

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/016635 WO2022107951A1 (en) 2020-11-20 2020-11-24 Method for training ultra-lightweight deep learning network

Country Status (2)

Country Link
KR (1) KR20220069653A (en)
WO (1) WO2022107951A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200013710A (en) * 2017-07-07 2020-02-07 미쓰비시덴키 가부시키가이샤 Data processing apparatus, data processing method and storage medium
KR20200063330A (en) * 2018-11-21 2020-06-05 한국과학기술원 Method and system for transfer learning into any target dataset and model structure based on meta-learning
CN111709516A (en) * 2020-06-09 2020-09-25 深圳先进技术研究院 Compression method and compression device of neural network model, storage medium and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EHMANN, Christopher; SAMEK, Wojciech: "Transferring Information Between Neural Networks", 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 15 April 2018, pages 2361-2365, XP033404076, DOI: 10.1109/ICASSP.2018.8461511 *
XIE, Zheng; WEN, Zhiquan; LIU, Jing; LIU, Zhiqiang; WU, Xixian; TAN, Mingkui: "Deep Transferring Quantization", Computer Vision - ECCV 2020: 16th European Conference, Glasgow, UK, 23-28 August 2020, Proceedings, pages 625-642, XP009536880, ISSN: 0302-9743, ISBN: 978-3-030-41298-2, DOI: 10.1007/978-3-030-58598-3_37 *

Also Published As

Publication number Publication date
KR20220069653A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
WO2021096009A1 (en) Method and device for supplementing knowledge on basis of relation network
WO2016159497A1 (en) Method, system, and non-transitory computer-readable recording medium for providing learning information
WO2019066104A1 (en) Process control method and system which use history data-based neural network learning
WO2022146080A1 (en) Algorithm and method for dynamically changing quantization precision of deep-learning network
Waytowich et al. A narration-based reward shaping approach using grounded natural language commands
WO2022107951A1 (en) Method for training ultra-lightweight deep learning network
WO2023113372A1 (en) Apparatus and method for label-based sample extraction for improvement of deep learning classification model performance for imbalanced data
WO2023058969A1 (en) Machine learning model compression using weighted low-rank factorization
WO2023033194A1 (en) Knowledge distillation method and system specialized for pruning-based deep neural network lightening
WO2022270840A1 (en) Deep learning-based word recommendation system for predicting and improving foreign language learner's vocabulary ability
WO2022163996A1 (en) Device for predicting drug-target interaction by using self-attention-based deep neural network model, and method therefor
CN115114927A (en) Model training method and related device
WO2022124449A1 (en) Method for optimizing hyper parameter of lightweight artificial intelligence algorithm by using genetic algorithm
EP4170552A1 (en) Method for generating neural network, and device and computer-readable storage medium
WO2021107231A1 (en) Sentence encoding method and device using hierarchical word information
WO2022107925A1 (en) Deep learning object detection processing device
WO2023080292A1 (en) Apparatus and method for generating adaptive parameter for deep learning acceleration device
WO2023214609A1 (en) Quantum circuit computation method for efficiently computing state vectors
WO2023214608A1 (en) Quantum circuit simulation hardware
WO2022145550A1 (en) Algorithm and method for dynamically varying quantization precision of deep learning network
WO2023090499A1 (en) Sparsity learning-based filter pruning method for deep neural networks
WO2021029563A1 (en) Method, system and non-transitory computer-readable recording medium for providing learning information
WO2020213757A1 (en) Word similarity determination method
WO2022005046A1 (en) Method for transferring knowledge from deep learning network to lightweight deep learning network
WO2022107910A1 (en) Mobile deep learning hardware device capable of retraining

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20962557

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20962557

Country of ref document: EP

Kind code of ref document: A1