WO2024090600A1 - Deep learning model training method and deep learning computation apparatus applied with same - Google Patents

Deep learning model training method and deep learning computation apparatus applied with same Download PDF

Info

Publication number
WO2024090600A1
Authority
WO
WIPO (PCT)
Prior art keywords
deep learning
weights
learning model
pruning
loading
Prior art date
Application number
PCT/KR2022/016397
Other languages
French (fr)
Korean (ko)
Inventor
이상설
장성준
김경호
Original Assignee
한국전자기술연구원 (Korea Electronics Technology Institute)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자기술연구원 (Korea Electronics Technology Institute)
Publication of WO2024090600A1 publication Critical patent/WO2024090600A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present invention relates to image-based deep learning processing and system-on-chip (SoC) technology, and more specifically, to a method of training a deep learning model quickly and with high accuracy on a lightweight deep learning computing device.
  • SoC: System on Chip
  • the present invention was conceived to solve the above problems, and its purpose is to provide a deep learning model training method that can quickly train a deep learning model with additional datasets on a resource-constrained deep learning computing device while maintaining a high level of prediction accuracy, and a deep learning computing device to which the method is applied.
  • a deep learning model training method to achieve the above object includes a first training step of training a deep learning model; a first pruning step of pruning some weights in the trained deep learning model; and a first loading step of loading specific weights into the pruned weights.
  • the first loading step may load the weights of a previously trained deep learning model.
  • the first training step may fine-tune, with a first dataset, the deep learning model to which the weights of the previously trained deep learning model have been transferred.
  • a deep learning model training method may further include a second training step of fine-tuning, with a second dataset, the deep learning model on which the first loading step has been performed; a second pruning step of pruning some weights in the fine-tuned deep learning model; and a second loading step of loading specific weights into the pruned weights.
  • the second loading step may load the weights of a previously trained deep learning model. The weights pruned in the second pruning step may be some of the weights pruned in the first pruning step.
  • the first pruning step and the second pruning step may prune weights on a per-channel basis.
  • the first pruning step and the second pruning step may prune the weights of different channels for each layer.
  • deep learning models can be mounted on lightweight, low-power deep learning computing devices.
  • a deep learning computing device includes an operator that trains a deep learning model, prunes some weights in the trained deep learning model, and loads specific weights into the pruned weights; and a memory that provides the storage space required by the operator.
  • a deep learning model training method includes a first pruning step of pruning some weights in a deep learning model; a first loading step of loading specific weights into the pruned weights; a second pruning step of pruning some weights in the deep learning model on which the first loading step has been performed; and a second loading step of loading specific weights into the pruned weights.
  • a deep learning computing device includes an operator that prunes some weights in a deep learning model, loads specific weights into the pruned weights, prunes some weights in the deep learning model loaded with the specific weights, and loads specific weights into those pruned weights; and a memory that provides the storage space required by the operator.
  • Figure 1 is a diagram conceptually showing a deep learning model training method in a deep learning computing device
  • Figure 2 shows test results for a transfer-learned deep learning model
  • Figures 3 to 5 are diagrams provided to explain a deep learning model training method according to an embodiment of the present invention.
  • Figure 6 is a diagram showing the configuration of a deep learning computing device according to another embodiment of the present invention.
  • Figure 1 is a diagram conceptually showing a deep learning model training method in a deep learning computing device (deep learning accelerator). As shown in the upper part of FIG. 1, a deep learning computing device that cannot train on large training datasets instead proceeds by additionally training, with an extra dataset, the deep learning model that was transfer-learned at the server side, as shown in the lower part of FIG. 1.
  • Figure 2 shows test results for a transfer-learned deep learning model. As shown, when a transfer-learned deep learning model is additionally trained, its learning performance rises faster than that of a deep learning model without transfer learning.
  • FC layer: Fully Connected Layer
  • An embodiment of the present invention presents a deep learning model learning method that can quickly train a deep learning model using an additional dataset in a deep learning computing device with limited resources while maintaining high prediction accuracy.
  • 3 to 5 are diagrams provided to explain a deep learning model learning method according to an embodiment of the present invention.
  • the deep learning model learning method according to an embodiment of the present invention is suitable for learning a deep learning model mounted on a lightweight deep learning accelerator, but is not necessarily limited to this and can also be applied in other environments/methods.
  • weights are transferred to the deep learning model as shown in Figure 3. This is the process of taking the weights of a deep learning model obtained through pre-training on a large training dataset at the server side and loading them into the deep learning model to be trained.
  • the weights shown on the left are the weights of the first layer, and the weights shown on the right are the weights of the second layer.
  • the deep learning model trained in the embodiment of the present invention consists of two layers, but this is only an example for convenience of explanation. There is no limit to the number of layers of a deep learning model to which embodiments of the present invention can be applied.
  • the deep learning model takes multi-channel images as input; the feature maps of the images are likewise generated across multiple channels, and the weights are organized per channel.
  • the deep learning accelerator uses dataset #1 to fine-tune the deep learning model to which the weights have been transferred, and to select weights subject to pruning.
  • weights subject to pruning are those displayed in white.
  • weight pruning is performed on a channel basis. That is, the weights for some channels are pruned and the weights for the remaining channels are left. Meanwhile, weight pruning can prune the weights of different channels for each layer. As shown, the weight pruning target channels in the first layer shown on the left and the weight pruning target channels in the second layer shown on the right are different from each other.
  • the weights of the previously trained deep learning model are loaded into the pruned weight positions.
  • Previously, zeros or randomly generated weights were loaded into the pruned positions.
  • the prediction accuracy of the deep learning model is improved by loading the weights of the previously trained deep learning model into the pruned weight positions.
  • the deep learning accelerator uses dataset #2 to fine-tune the deep learning model trained through the process shown in FIG. 4 and to select weights subject to pruning.
  • weights that were not subject to pruning in FIG. 4 can be excluded from pruning. That is, the new pruning targets are selected from among the weights that were pruned in FIG. 4; weights that were not pruned in FIG. 4 are not selected for pruning in the training with dataset #2 either.
  • weights selected from the fine-tuned deep learning model are pruned.
  • the weights subject to pruning are those displayed in white.
  • weight pruning is performed on a channel basis, and the pruning target channels may differ for each layer.
  • the weights of the previously trained deep learning model are loaded into the pruned weight positions.
  • Figure 6 is a diagram showing the configuration of a deep learning computing device according to another embodiment of the present invention.
  • the deep learning computing device includes a communication interface 110, a deep learning calculator 120, and a memory 130.
  • the communication interface 110 connects to an external host system and receives datasets, the parameters (weights, biases) of previously trained deep learning models, and the like.
  • the deep learning calculator 120 trains the mounted deep learning model using the method shown in FIGS. 3 to 5 described above.
  • the memory 130 provides storage space necessary for the deep learning calculator 120 to perform calculations.
  • the deep learning processing unit does not perform calculations on the pruned weights, allowing high-speed learning with low power while maintaining prediction accuracy at a high level.
  • a computer-readable recording medium can be any data storage device that can be read by a computer and store data.
  • computer-readable recording media can be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, etc.
  • computer-readable codes or programs stored on a computer-readable recording medium may be transmitted through a network connected between computers.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Manipulator (AREA)
  • Image Processing (AREA)
  • Feedback Control In General (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided are a deep learning model training method and a deep learning computation apparatus to which it is applied. The deep learning model training method according to an embodiment of the present invention comprises training a deep learning model, pruning some weights in the trained deep learning model, and loading specific weights into the pruned weight positions. Accordingly, a deep learning computation apparatus with limited resources can train quickly while rapidly improving prediction accuracy, by applying pre-trained weights to the weights pruned during training of the deep learning model with an additional dataset.

Description

Deep learning model training method and deep learning computing device to which it is applied
The present invention relates to image-based deep learning processing and system-on-chip (SoC) technology, and more specifically, to a method of training a deep learning model quickly and with high accuracy on a lightweight deep learning computing device.
The best approach to deep learning is to train the model on a large training dataset. However, for a deep learning computing device (deep learning accelerator) with limited resources, such as an SoC, training on large datasets is impossible.
Accordingly, a widely used approach is to take a transfer-learned deep learning model and train it further using only a small training dataset.
In this case, however, the accuracy of the deep learning model drops, and because of the resource limits even this additional training takes a long time, so the training speed is very slow.
The present invention was conceived to solve the above problems. Its purpose is to provide a deep learning model training method that can quickly train a deep learning model with additional datasets on a resource-constrained deep learning computing device while maintaining a high level of prediction accuracy, and a deep learning computing device to which the method is applied.
A deep learning model training method according to an embodiment of the present invention for achieving this object includes: a first training step of training a deep learning model; a first pruning step of pruning some weights in the trained deep learning model; and a first loading step of loading specific weights into the pruned weights.
The first loading step may load the weights of a previously trained deep learning model. The first training step may fine-tune, with a first dataset, the deep learning model to which the weights of the previously trained deep learning model have been transferred.
A deep learning model training method according to an embodiment of the present invention may further include: a second training step of fine-tuning, with a second dataset, the deep learning model on which the first loading step has been performed; a second pruning step of pruning some weights in the fine-tuned deep learning model; and a second loading step of loading specific weights into the pruned weights.
The second loading step may load the weights of a previously trained deep learning model. The weights pruned in the second pruning step may be some of the weights pruned in the first pruning step.
The first pruning step and the second pruning step may prune weights on a per-channel basis, and may prune the weights of different channels for each layer.
The deep learning model may be mounted on a lightweight, low-power deep learning computing device.
A deep learning computing device according to another embodiment of the present invention includes: an operator that trains a deep learning model, prunes some weights in the trained deep learning model, and loads specific weights into the pruned weights; and a memory that provides the storage space required by the operator.
A deep learning model training method according to another embodiment of the present invention includes: a first pruning step of pruning some weights in a deep learning model; a first loading step of loading specific weights into the pruned weights; a second pruning step of pruning some weights in the deep learning model on which the first loading step has been performed; and a second loading step of loading specific weights into the pruned weights.
A deep learning computing device according to another embodiment of the present invention includes: an operator that prunes some weights in a deep learning model, loads specific weights into the pruned weights, prunes some weights in the deep learning model loaded with the specific weights, and loads specific weights into those pruned weights; and a memory that provides the storage space required by the operator.
As described above, according to embodiments of the present invention, applying pre-trained weights to the weights pruned while training a deep learning model with an additional dataset allows a resource-constrained deep learning computing device to train quickly while rapidly improving prediction accuracy.
Figure 1 is a diagram conceptually showing a deep learning model training method in a deep learning computing device;
Figure 2 shows test results for a transfer-learned deep learning model;
Figures 3 to 5 are diagrams provided to explain a deep learning model training method according to an embodiment of the present invention;
Figure 6 is a diagram showing the configuration of a deep learning computing device according to another embodiment of the present invention.
Hereinafter, the present invention will be described in more detail with reference to the drawings.
Figure 1 is a diagram conceptually showing a deep learning model training method in a deep learning computing device (deep learning accelerator). As shown in the upper part of FIG. 1, a deep learning computing device that cannot train on large training datasets instead proceeds by additionally training, with an extra dataset, the deep learning model that was transfer-learned at the server side, as shown in the lower part of FIG. 1.
Figure 2 shows test results for a transfer-learned deep learning model. As shown, when a transfer-learned deep learning model is additionally trained, its learning performance rises faster than that of a deep learning model without transfer learning.
However, transfer learning suffers from catastrophic forgetting: as the model is successively trained on additional datasets, its accuracy on the earlier datasets degrades.
One way to address this is to attach an independent FC layer (Fully Connected Layer) for each additional dataset. However, this is also hard to apply on a deep learning computing device with limited resources: the number of FC layers grows every time a new dataset is added, and at the same time performance on the previously learned data deteriorates.
An embodiment of the present invention presents a deep learning model training method that can quickly train a deep learning model with an additional dataset on a resource-constrained deep learning computing device while maintaining high prediction accuracy.
Figures 3 to 5 are diagrams provided to explain a deep learning model training method according to an embodiment of the present invention. The method is well suited to training a deep learning model mounted on a lightweight deep learning accelerator, but it is not necessarily limited to this and can also be applied in other environments and in other ways.
To train the deep learning model, weights are first transferred to it, as shown in Figure 3. This is the process of taking the weights of a deep learning model obtained through pre-training on a large training dataset at the server side and loading them into the deep learning model to be trained. A concrete illustration follows.
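As a concrete illustration of this transfer step, the following is a minimal PyTorch-style sketch. The two-layer convolutional model, the channel counts, and the checkpoint file name are all assumptions made for illustration only; the patent does not prescribe a framework, a model shape, or a file format.

```python
# Hedged sketch of the weight-transfer step of Fig. 3 (illustrative only;
# the model shape and the checkpoint file name are assumptions).
import torch
import torch.nn as nn

# Toy two-layer convolutional model, mirroring the two layers of Figs. 3-5.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),   # first layer, 8 output channels
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # second layer, 16 output channels
)

# Weights obtained through server-side pre-training (hypothetical file name).
pretrained_state = torch.load("pretrained.pth")
model.load_state_dict(pretrained_state)          # transfer into the on-device model
```

The later sketches in this description continue from this one and reuse `model` and `pretrained_state`.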
In Figure 3, the weights shown on the left are those of the first layer, and the weights shown on the right are those of the second layer. Accordingly, the deep learning model trained in this embodiment consists of two layers; this is only an example for convenience of explanation, and there is no limit on the number of layers of a deep learning model to which embodiments of the present invention can be applied.
Meanwhile, as shown in Figure 3, the deep learning model takes multi-channel images as input; the feature maps of the images are likewise generated across multiple channels, and the weights are organized per channel.
Next, as shown in the upper part of FIG. 4, the deep learning accelerator fine-tunes the weight-transferred deep learning model using dataset #1 and selects the weights subject to pruning; one plausible selection rule is sketched below.
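One plausible way to realize this selection, continuing the sketch above, is to rank each layer's output channels by the L1 norm of their kernels after fine-tuning and mark the smallest ones for pruning. The L1 criterion and the 50% ratio are assumptions; the patent only requires that pruning be channel-wise and allows the selected channels to differ per layer.

```python
# Hedged sketch of channel selection after fine-tuning on dataset #1.
# The L1-norm criterion and the 0.5 ratio are assumptions, not the patent's rule.
def select_prune_channels(conv: nn.Conv2d, ratio: float = 0.5) -> torch.Tensor:
    # L1 norm of each output channel's kernel -> shape (out_channels,)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    k = int(ratio * norms.numel())
    return torch.argsort(norms)[:k]              # channels with the smallest norms

# Per-layer selection; the chosen channels may differ between layers (Fig. 4).
prune_idx = {name: select_prune_channels(mod)
             for name, mod in model.named_modules()
             if isinstance(mod, nn.Conv2d)}
```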
Then, as shown in the center of Figure 4, some weights in the fine-tuned deep learning model are pruned. In Figure 4, the weights subject to pruning are the ones displayed in white.
As shown, weight pruning is performed on a channel basis: the weights of some channels are pruned while the weights of the remaining channels are left. Weight pruning may also prune the weights of different channels in each layer; as shown, the channels targeted for weight pruning in the first layer (left) and in the second layer (right) differ from each other.
Afterwards, as shown in the lower part of FIG. 4, the weights of the previously trained deep learning model are loaded into the pruned weight positions. Previously, zeros or randomly generated weights were loaded into pruned positions; in this embodiment, loading the weights of the pre-trained deep learning model into the pruned positions improves the model's prediction accuracy. One way to realize this loading step is sketched below.
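Continuing the sketch, this loading step can be realized by overwriting the pruned channels with the corresponding channels of the server-side pre-trained weights rather than with zeros or random values. The state-dict key names follow the toy `nn.Sequential` model assumed earlier.

```python
# Hedged sketch of the loading step of Fig. 4: pruned channels receive the
# pre-trained weights instead of zeros or random values.
with torch.no_grad():
    for name, mod in model.named_modules():
        if isinstance(mod, nn.Conv2d):
            idx = prune_idx[name]                # channels pruned in this layer
            mod.weight[idx] = pretrained_state[f"{name}.weight"][idx]
            if mod.bias is not None:
                mod.bias[idx] = pretrained_state[f"{name}.bias"][idx]
```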
For the deep learning model whose training with dataset #1 is complete, additional training with dataset #2 can be performed; this process is shown in FIG. 5.
First, as shown in the upper part of FIG. 5, the deep learning accelerator uses dataset #2 to fine-tune the deep learning model trained through the process of FIG. 4 and selects the weights subject to pruning.
In this process, the weights that were not pruned in FIG. 4 can be excluded from pruning. That is, the new pruning targets are selected only from among the weights that were pruned in FIG. 4; weights that were not pruned in FIG. 4 are not selected for pruning in the training with dataset #2 either. A sketch of this restriction follows.
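A sketch of this restriction: in the dataset #2 round, candidates are drawn only from the channels pruned in the first round, so channels that survived round one are never pruned later. Ranking by L1 norm again is an assumption, not the patent's rule.

```python
# Hedged sketch: second-round pruning candidates are restricted to the
# channels pruned in the first round (Fig. 5 selection rule).
def select_second_round(conv: nn.Conv2d,
                        first_round: torch.Tensor,
                        ratio: float = 0.5) -> torch.Tensor:
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    k = int(ratio * first_round.numel())
    order = torch.argsort(norms[first_round])    # rank only round-1 pruned channels
    return first_round[order[:k]]                # always a subset of the round-1 set
```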
Furthermore, the weights that were not pruned in FIG. 4 can also be excluded from fine-tuning, so that fine-tuning does not change them. One possible mechanism is sketched below.
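One common mechanism for this exclusion, assumed here for illustration, is to zero the gradients of the surviving channels with a hook so the optimizer never updates them; the patent does not prescribe how the exclusion is implemented.

```python
# Hedged sketch: freeze the channels that survived round 1 during the
# dataset #2 fine-tuning by zeroing their gradients (one possible mechanism).
def freeze_kept_channels(conv: nn.Conv2d, pruned: torch.Tensor) -> None:
    kept = torch.ones(conv.out_channels, dtype=torch.bool)
    kept[pruned] = False                         # True marks channels to freeze
    def zero_kept_grad(grad: torch.Tensor) -> torch.Tensor:
        grad = grad.clone()
        grad[kept] = 0.0                         # no gradient -> no weight update
        return grad
    conv.weight.register_hook(zero_kept_grad)
```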
Next, as shown in the center of FIG. 5, the selected weights are pruned from the fine-tuned deep learning model. In Figure 5, the weights subject to pruning are the ones displayed in white.
As in the training with dataset #1 shown in FIG. 4, in the training with dataset #2 shown in FIG. 5, weight pruning is performed on a channel basis, and the pruned channels may differ from layer to layer.
Afterwards, as shown in the lower part of FIG. 5, the weights of the previously trained deep learning model are loaded into the pruned weight positions.
Figure 6 is a diagram showing the configuration of a deep learning computing device according to another embodiment of the present invention. As shown, the deep learning computing device includes a communication interface 110, a deep learning calculator 120, and a memory 130.
The communication interface 110 connects to an external host system and receives datasets, the parameters (weights, biases) of previously trained deep learning models, and the like. The deep learning calculator 120 trains the mounted deep learning model by the method presented in FIGS. 3 to 5 above. The memory 130 provides the storage space the deep learning calculator 120 needs to perform its computations.
So far, a deep learning model training method and a deep learning computing device to which it is applied have been described in detail through preferred embodiments.
The embodiments above presented a method of applying pre-trained weights to the weights pruned while training a deep learning model with an additional dataset on a resource-constrained deep learning computing device.
As a result, the deep learning computing device performs no computation on the pruned weights, so training proceeds at high speed with low power while prediction accuracy is maintained at a high level.
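Because the pruning is channel-wise, the surviving weights form a smaller dense tensor, which is what lets a hardware accelerator skip the pruned work outright. The standalone snippet below only illustrates the resulting shape reduction in software; the actual skipping is a property of the hardware, and the channel choice shown is arbitrary.

```python
# Hedged illustration of why channel pruning saves computation: the kept
# weights remain a smaller *dense* tensor (shape math only, not the hardware).
import torch.nn as nn

conv = nn.Conv2d(8, 16, kernel_size=3, padding=1)
kept = [c for c in range(16) if c % 2 == 0]      # assume odd channels were pruned
small = nn.Conv2d(8, len(kept), kernel_size=3, padding=1, bias=False)
small.weight.data = conv.weight.data[kept]       # dense conv over kept channels only
print(small.weight.shape)                        # torch.Size([8, 8, 3, 3])
```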
Meanwhile, the technical idea of the present invention can of course also be applied to a computer-readable recording medium containing a computer program that performs the functions of the apparatus and method according to these embodiments. The technical ideas according to various embodiments of the present invention may also be implemented in the form of computer-readable code recorded on a computer-readable recording medium. The computer-readable recording medium can be any data storage device that can be read by a computer and can store data; for example, it can be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, and so on. Computer-readable code or programs stored on a computer-readable recording medium may also be transmitted over a network connecting computers.
In addition, while preferred embodiments of the present invention have been shown and described above, the present invention is not limited to those specific embodiments; various modifications can of course be made by those of ordinary skill in the art without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present invention.

Claims (12)

  1. A deep learning model training method comprising:
    a first training step of training a deep learning model;
    a first pruning step of pruning some weights in the trained deep learning model; and
    a first loading step of loading specific weights into the pruned weights.
  2. The method of claim 1,
    wherein the first loading step loads the weights of a previously trained deep learning model.
  3. The method of claim 2,
    wherein the first training step fine-tunes, with a first dataset, the deep learning model to which the weights of the previously trained deep learning model have been transferred.
  4. The method of claim 3, further comprising:
    a second training step of fine-tuning, with a second dataset, the deep learning model on which the first loading step has been performed;
    a second pruning step of pruning some weights in the fine-tuned deep learning model; and
    a second loading step of loading specific weights into the pruned weights.
  5. The method of claim 4,
    wherein the second loading step loads the weights of a previously trained deep learning model.
  6. The method of claim 4,
    wherein the weights pruned in the second pruning step are some of the weights pruned in the first pruning step.
  7. The method of claim 4,
    wherein the first pruning step and the second pruning step prune weights on a per-channel basis.
  8. The method of claim 7,
    wherein the first pruning step and the second pruning step prune the weights of different channels for each layer.
  9. The method of claim 1,
    wherein the deep learning model is mounted on a lightweight, low-power deep learning computing device.
  10. A deep learning computing device comprising:
    an operator that trains a deep learning model, prunes some weights in the trained deep learning model, and loads specific weights into the pruned weights; and
    a memory that provides the storage space required by the operator.
  11. A deep learning model training method comprising:
    a first pruning step of pruning some weights in a deep learning model;
    a first loading step of loading specific weights into the pruned weights;
    a second pruning step of pruning some weights in the deep learning model on which the first loading step has been performed; and
    a second loading step of loading specific weights into the pruned weights.
  12. A deep learning computing device comprising:
    an operator that prunes some weights in a deep learning model, loads specific weights into the pruned weights, prunes some weights in the deep learning model loaded with the specific weights, and loads specific weights into those pruned weights; and
    a memory that provides the storage space required by the operator.
PCT/KR2022/016397 2022-10-26 2022-10-26 Deep learning model training method and deep learning computation apparatus applied with same WO2024090600A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2022-0138798 2022-10-26
KR1020220138798A KR20240058252A (en) 2022-10-26 2022-10-26 Deep learning model training method and deep learning computing device applying the same

Publications (1)

Publication Number Publication Date
WO2024090600A1 2024-05-02

Family

ID=90831078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/016397 WO2024090600A1 (en) 2022-10-26 2022-10-26 Deep learning model training method and deep learning computation apparatus applied with same

Country Status (2)

Country Link
KR (1) KR20240058252A (en)
WO (1) WO2024090600A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180013674A (en) * 2016-07-28 2018-02-07 삼성전자주식회사 Method for lightening neural network and recognition method and apparatus using the same
KR20210108413A (en) * 2018-12-18 2021-09-02 모비디어스 리미티드 Neural Network Compression
KR20210015990A (en) * 2019-05-18 2021-02-10 주식회사 디퍼아이 Convolution neural network parameter optimization method, neural network computing method and apparatus
KR20220116270A (en) * 2020-02-07 2022-08-22 주식회사 히타치하이테크 Learning processing apparatus and method
KR20220085280A (en) * 2020-12-15 2022-06-22 경희대학교 산학협력단 Method and apparatus processing weight of artificial neural network for super resolution

Also Published As

Publication number Publication date
KR20240058252A (en) 2024-05-03

Similar Documents

Publication Publication Date Title
CN106960219A (en) Image identification method and device, computer equipment and computer-readable medium
WO2021125619A1 (en) Method for inspecting labeling on bounding box by using deep learning model and apparatus using same
CN107391549A (en) News based on artificial intelligence recalls method, apparatus, equipment and storage medium
WO2021118041A1 (en) Method for distributing labeling work according to difficulty thereof and apparatus using same
CN108229535A (en) Relate to yellow image audit method, apparatus, computer equipment and storage medium
WO2024090600A1 (en) Deep learning model training method and deep learning computation apparatus applied with same
WO2022146080A1 (en) Algorithm and method for dynamically changing quantization precision of deep-learning network
CN109815992A (en) A kind of support vector machines accelerates training method and system parallel
WO2022107925A1 (en) Deep learning object detection processing device
WO2023033194A1 (en) Knowledge distillation method and system specialized for pruning-based deep neural network lightening
WO2023085458A1 (en) Method and device for controlling lightweight deep learning training memory
WO2024135867A1 (en) Efficient transfer learning method for small-scale deep learning network
WO2022107927A1 (en) Deep learning apparatus enabling rapid post-processing
WO2022107951A1 (en) Method for training ultra-lightweight deep learning network
WO2024135860A1 (en) Data pruning method for lightweight deep-learning hardware device
WO2022102912A1 (en) Neuromorphic architecture dynamic selection method for modeling on basis of snn model parameter, and recording medium and device for performing same
WO2023095934A1 (en) Method and system for lightening head neural network of object detector
WO2023113450A1 (en) Support sink application method for 3d printing heat dissipation analysis
WO2024091106A1 (en) Method and system for selecting an artificial intelligence (ai) model in neural architecture search (nas)
WO2024135862A1 (en) Data processing and manipulation device supporting unstructured data processing
WO2024135861A1 (en) Deep learning training method applying variable data representation type and mobile device applying same
WO2022107929A1 (en) Deep learning accelerator comprising variable data compressor/decompressor
WO2023080291A1 (en) Pooling device for deep learning accelerator
CN112819022B (en) Image recognition device and image recognition method based on neural network
WO2022005057A1 (en) Matrix index information generation method, matrix processing method using matrix index information, and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22963561

Country of ref document: EP

Kind code of ref document: A1