KR102176111B1

KR102176111B1 - Method and system for providing of naive semi-supervised deep learning using unlabeled data

Info

Publication number: KR102176111B1
Application number: KR1020180124118A
Authority: KR
Inventors: 홍봉희; 최호진; 리준; 고병수
Original assignee: 부산대학교 산학협력단
Priority date: 2018-10-18
Filing date: 2018-10-18
Publication date: 2020-11-09
Also published as: KR20200046173A

Abstract

The present invention relates to a method and a system for providing naive semi-supervised deep learning that improves the performance of a deep learning model using unlabeled data. Even when training data is scarce, it provides semi-supervised deep learning with improved performance by using pseudo-labels.

Description

Method and system for providing naive semi-supervised deep learning using unlabeled data {METHOD AND SYSTEM FOR PROVIDING OF NAIVE SEMI-SUPERVISED DEEP LEARNING USING UNLABELED DATA}

The present invention relates to a method and a system for providing naive semi-supervised deep learning using unlabeled data, and more particularly to a technique for improving the performance of a deep learning model using unlabeled data.

Deep learning models such as convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) deliver state-of-the-art results on many complex tasks. One key factor behind these results is the availability of very large labeled data sets that can be used to optimize tens of millions of parameters. Rapid advances in data collection and storage technology have made it easy to collect large amounts of unlabeled data. Acquiring large amounts of labeled data, however, remains costly and time-consuming.

For example, it took more than seven years (2010 to 2017) to build ImageNet, a well-known data set containing more than 14 million images. In contrast, Facebook users upload hundreds of millions of photos a day. Because acquiring such labeled data requires considerable human and material resources, research has been needed on techniques for obtaining the data required to train a useful model.

Improving the performance of supervised learning tasks by using unlabeled data is one of the core topics of semi-supervised learning. The key idea is to use large amounts of unlabeled data to provide auxiliary training. There have been many studies on semi-supervised learning in the deep learning field.

One existing approach uses unlabeled data and labeled data at the same time, and generally trains the model to minimize a modified loss function that contains both an unsupervised term and a supervised term. Another existing approach uses unlabeled data and labeled data separately: it first pre-trains the model with an unsupervised method to learn features, and then uses those features for the supervised objective. Yet another existing approach performs layer-wise pre-training.

Because the existing approaches described above focus mainly on improving the deep learning model itself, without using other machine learning models, the performance gain they achieve for the deep learning model is limited.

An object of the present invention is to improve the results of supervised learning by using unlabeled data to improve the performance of a deep learning model.

Another object of the present invention is to improve a deep learning model in a semi-supervised learning setting by making use of other machine learning models.

A method for providing naive semi-supervised deep learning using unlabeled data according to an embodiment of the present invention includes: training a classifier using labeled data; predicting pseudo-labels using the output of the classifier for the unlabeled data; pre-training a deep learning model using the pseudo-labeled data obtained from the pseudo-labels; and fine-tuning the deep learning model using the labeled data.

The step of training the classifier may train the classifier of a pseudo-labeling model using the large-scale labeled data.

The step of predicting the pseudo-labels may perform pseudo-labeling, which predicts the pseudo-labels using the output of the pseudo-labeling model for the unlabeled data.

The step of pre-training the deep learning model may use the pseudo-labeled data to pre-train a deep learning model such as a convolutional neural network (CNN) or a long short-term memory network (LSTM).

The step of fine-tuning the deep learning model may fine-tune the deep learning model using the labeled data, so that the unlabeled data can be re-labeled and the pseudo-labeling improved.

The method of providing naive semi-supervised deep learning may train the deep learning model in an alternating iterative manner, by repeating pseudo-labeling (predicting the pseudo-labels), pre-training, and fine-tuning.

The method of providing naive semi-supervised deep learning may be performed as a wrapping algorithm.

The method of providing naive semi-supervised deep learning may use the labeled data and the unlabeled data alternately.

The method of providing naive semi-supervised deep learning may be based on the principle of entropy regularization.

A system for providing naive semi-supervised deep learning using unlabeled data according to an embodiment of the present invention includes: a classifier training unit that trains a classifier using labeled data; a prediction unit that predicts pseudo-labels using the output of the classifier for the unlabeled data; a model pre-training unit that pre-trains a deep learning model using the pseudo-labeled data obtained from the pseudo-labels; and a model fine-tuning unit that fine-tunes the deep learning model using the labeled data.

According to an embodiment of the present invention, the results of supervised learning can be improved by using pseudo-labels even when training data is scarce, so the invention can be applied to a variety of technical environments that use deep learning.

Further, according to an embodiment of the present invention, a classifier is built and its output for the unlabeled data is fed into the deep learning model, so there is no need to design a loss function or to perform layer-wise training.

FIG. 1 is a flowchart of a method for providing naive semi-supervised deep learning according to an embodiment of the present invention.
FIG. 2 and FIG. 3 show experimental results on classification accuracy and model improvement according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the detailed configuration of a system for providing naive semi-supervised deep learning according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The present invention is not, however, restricted or limited by these embodiments. The same reference numerals in each drawing denote the same members.

The terms used in this specification are chosen to properly describe preferred embodiments of the present invention, and they may vary depending on the intention of viewers or operators, or on the conventions of the field to which the present invention belongs. Definitions of these terms should therefore be made on the basis of the content of this specification as a whole.

FIG. 1 is a flowchart of a method for providing naive semi-supervised deep learning according to an embodiment of the present invention.

The method of FIG. 1 is performed by the system for providing naive semi-supervised deep learning according to an embodiment of the present invention shown in FIG. 4.

Referring to FIG. 1, in step 110 a classifier is trained using labeled data.

The classifier may be a pseudo-labeling model, and step 110 may train the pseudo-labeling model using large-scale labeled data.

In step 120, pseudo-labels are predicted using the output of the classifier for the unlabeled data.

Step 120 may perform pseudo-labeling, which predicts pseudo-labels using the output of the pseudo-labeling model for the unlabeled data. Here, a reasonable probability interval may be specified, and the pseudo-labels may be those whose probability falls within that interval. In general, a pseudo-label with a higher probability may indicate higher pseudo-labeling accuracy.
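The probability-interval selection in step 120 can be sketched as follows. This is only an illustration: the source does not state the interval bounds, so the function name `select_pseudo_labels`, the default bounds `low=0.9` and `high=1.0`, and the example probabilities are all assumptions.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probabilities: Sequence[Sequence[float]],
    low: float = 0.9,   # assumed lower bound of the probability interval
    high: float = 1.0,  # assumed upper bound of the probability interval
) -> List[Tuple[int, int, float]]:
    """Keep only predictions whose top class probability lies in [low, high].

    Returns (sample_index, pseudo_label, confidence) triples.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # The pseudo-label is the class with the highest predicted probability.
        label = max(range(len(probs)), key=lambda k: probs[k])
        confidence = probs[label]
        if low <= confidence <= high:
            selected.append((index, label, confidence))
    return selected


# Hypothetical classifier outputs for three unlabeled samples, two classes.
probs = [[0.97, 0.03], [0.55, 0.45], [0.08, 0.92]]
kept = select_pseudo_labels(probs, low=0.9)
```

With these made-up outputs, the second sample is dropped because its top probability (0.55) falls outside the interval.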

More specifically, assume that there is a small labeled data set [symbol not reproduced] and a large unlabeled data set [symbol not reproduced]. In step 110, the method for providing naive semi-supervised deep learning according to an embodiment of the present invention trains a classifier using the labeled data, and in step 120 it uses the classifier to predict pseudo-labels for the unlabeled data. Here, [formula not reproduced].

In the present invention, the classifier is a pseudo-labeling model. A pseudo-label denotes the output of the pseudo-labeling model for an item of unlabeled data, expressed as [formula not reproduced], where [formula not reproduced].

Likewise, the process of predicting this output is called pseudo-labeling.

In step 130, the deep learning model is pre-trained using the pseudo-labeled data obtained from the pseudo-labels.

Step 130 may use the pseudo-labeled data to pre-train a deep learning model such as a convolutional neural network (CNN) or a long short-term memory network (LSTM).

Here, pre-training means training the deep learning model on unlabeled data that carries pseudo-labels (that is, pseudo-labeled data), which may be expressed as [formula not reproduced].

In step 140, the deep learning model is fine-tuned using the labeled data.

Step 140 may fine-tune the deep learning model using the labeled data, so that the unlabeled data can be re-labeled and the pseudo-labeling improved.

Because pre-training the deep learning model with pseudo-labeled data can improve the model's final test accuracy, step 140 of the method may use the fine-tuned model to re-label the unlabeled data and thereby improve the pseudo-labeling.

The method for providing naive semi-supervised deep learning according to an embodiment of the present invention is characterized by an alternating iterative manner that repeats pseudo-labeling (predicting the pseudo-labels), pre-training, and fine-tuning. It is summarized in [Algorithm 1] below, and this way of training a deep learning model is referred to as naive semi-supervised deep learning.

[Algorithm 1]

In [Algorithm 1], the hyperparameter N is the number of iterations. N should be large enough to ensure that the validation accuracy converges. One model [symbol not reproduced] is trained from scratch,

and the other
Is learned by fine tuning method.
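The alternating loop described above (pseudo-labeling, pre-training from scratch, fine-tuning, then re-labeling with the fine-tuned model) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's Algorithm 1: a toy one-dimensional threshold classifier stands in for both the pseudo-labeling model and the deep learning model, and every name here (`ThresholdModel`, `class_midpoint`, `train_from_scratch`, `fine_tune`, `naive_semi_supervised`, the `rate` parameter) is hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label)


class ThresholdModel:
    """Toy stand-in for a model: predicts class 1 when x exceeds the threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def __call__(self, x: float) -> int:
        return int(x > self.threshold)


def class_midpoint(data: Sequence[Sample]) -> float:
    """Midpoint between the two class means (both classes must be present)."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def train_from_scratch(data: Sequence[Sample]) -> ThresholdModel:
    """Fit a fresh model, ignoring any earlier parameters."""
    return ThresholdModel(class_midpoint(data))


def fine_tune(model: ThresholdModel, data: Sequence[Sample], rate: float = 0.5) -> ThresholdModel:
    """Warm start: move the existing threshold part of the way toward the new fit."""
    target = class_midpoint(data)
    return ThresholdModel(model.threshold + rate * (target - model.threshold))


def naive_semi_supervised(
    labeled: Sequence[Sample], unlabeled: Sequence[float], n_iterations: int
) -> ThresholdModel:
    labeler = train_from_scratch(labeled)                # step 110: train the classifier
    model = labeler
    for _ in range(n_iterations):
        pseudo = [(x, labeler(x)) for x in unlabeled]    # step 120: pseudo-labeling
        model = train_from_scratch(pseudo)               # step 130: pre-train on pseudo-labeled data
        model = fine_tune(model, labeled)                # step 140: fine-tune on labeled data
        labeler = model                                  # re-label with the fine-tuned model next round
    return model


labeled = [(0.0, 0), (1.0, 1)]
unlabeled = [-1.0, -0.5, 0.2, 0.4, 1.5, 2.0]
model = naive_semi_supervised(labeled, unlabeled, n_iterations=3)
```

In a real setting the two training functions would be replaced by whatever routine trains the chosen network; the loop itself does not depend on the model's internals, which matches the wrapping character the description attributes to the method.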

The method for providing naive semi-supervised deep learning according to an embodiment of the present invention is based on the principle of entropy regularization, but it differs from standard entropy regularization in two important ways. First, because the method trains the model in an alternating iterative manner, there is no need to choose a balancing hyperparameter [symbol not reproduced]. Second, because the method is a wrapping algorithm, it can be attached to a deep learning model without changing the model's internals.

In addition, because the method uses the labeled data and the unlabeled data alternately, there is no need to choose an appropriate schedule for a trade-off coefficient.
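For context, standard entropy regularization is commonly written as a single objective in which a balancing hyperparameter weights an entropy term on the unlabeled data. The form below is a generic textbook-style sketch, not a formula taken from this document: the symbols $\theta$, $\lambda$, $(x_i, y_i)$, $u_j$, $n$ and $m$ are assumptions introduced for illustration.

```latex
\mathcal{L}(\theta)
  = -\sum_{i=1}^{n} \log p(y_i \mid x_i; \theta)
    + \lambda \sum_{j=1}^{m} H\bigl(p(\cdot \mid u_j; \theta)\bigr),
\qquad
H(p) = -\sum_{k} p_k \log p_k
```

Here the first sum runs over the $n$ labeled samples and the second over the $m$ unlabeled samples. The weight $\lambda$ (or a schedule for it) is the quantity that has to be tuned in such a joint objective, and it is this choice that the alternating scheme described above is said to avoid.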

FIG. 2 and FIG. 3 show experimental results on classification accuracy and model improvement according to an embodiment of the present invention.

More specifically, FIG. 2 shows experimental results comparing the classification accuracy, on the MNIST test set, of a convolutional neural network (CNN) trained with naive semi-supervised deep learning according to an embodiment of the present invention, and FIG. 3 shows experimental results comparing the model improvement and accuracy gain obtained with the alternating iterative manner according to an embodiment of the present invention.

To obtain the results of FIG. 2, the number of iterations was set to 1, each experiment was repeated 30 times, and the average was taken as the final accuracy.

Referring to FIG. 2, the classification accuracy of the CNNs that use pseudo-labeling (CNN with CNN pseudo-labeling, CNN with RF pseudo-labeling, and CNN with SVM pseudo-labeling) is much higher than the classification accuracy of the plain CNN.

This shows that the method and system for providing naive semi-supervised deep learning using unlabeled data according to an embodiment of the present invention improve the performance of the CNN model.

To obtain the results of FIG. 3, the experiment was run with 1 repetition and with 10 repetitions, using 10 samples per class, and the difference in test accuracy was computed. Here, one repetition means one pass through pseudo-labeling, pre-training, and fine-tuning.

Referring to FIG. 3, 10 repetitions improve test accuracy further than 1 repetition.

This shows that repeating the pseudo-labeling, pre-training, and fine-tuning procedure with the method and system for providing naive semi-supervised deep learning using unlabeled data according to an embodiment of the present invention can improve the performance of the model.

FIG. 4 is a block diagram showing the detailed configuration of a system for providing naive semi-supervised deep learning according to an embodiment of the present invention.

Referring to FIG. 4, the system for providing naive semi-supervised deep learning according to an embodiment of the present invention improves the performance of a deep learning model by using unlabeled data.

To this end, the system 400 for providing naive semi-supervised deep learning according to an embodiment of the present invention includes a classifier training unit 410, a prediction unit 420, a model pre-training unit 430, and a model fine-tuning unit 440.

The classifier training unit 410 trains a classifier using labeled data.

The classifier may be a pseudo-labeling model, and the classifier training unit 410 may train the pseudo-labeling model using large-scale labeled data.

The prediction unit 420 predicts pseudo-labels using the output of the classifier for the unlabeled data.

The prediction unit 420 may perform pseudo-labeling, which predicts pseudo-labels using the output of the pseudo-labeling model for the unlabeled data. Here, a reasonable probability interval may be specified, and the pseudo-labels may be those whose probability falls within that interval. In general, a pseudo-label with a higher probability may indicate higher pseudo-labeling accuracy.

The model pre-training unit 430 pre-trains the deep learning model using the pseudo-labeled data obtained from the pseudo-labels.

The model pre-training unit 430 may use the pseudo-labeled data to pre-train a deep learning model such as a convolutional neural network (CNN) or a long short-term memory network (LSTM).

Here, pre-training may mean training the deep learning model on unlabeled data that carries pseudo-labels (that is, pseudo-labeled data).

The model fine-tuning unit 440 fine-tunes the deep learning model using the labeled data.

The model fine-tuning unit 440 may fine-tune the deep learning model using the labeled data, so that the unlabeled data can be re-labeled and the pseudo-labeling improved.

Because pre-training the deep learning model with pseudo-labeled data can improve the model's final test accuracy, the model fine-tuning unit 440 of the system may use the fine-tuned model to re-label the unlabeled data and thereby improve the pseudo-labeling.

The system 400 for providing naive semi-supervised deep learning according to an embodiment of the present invention may train the deep learning model in an alternating iterative manner, by repeating pseudo-labeling (predicting the pseudo-labels), pre-training, and fine-tuning.

The system 400 for providing naive semi-supervised deep learning according to an embodiment of the present invention is based on the principle of entropy regularization, but it differs from standard entropy regularization in two important ways. First, because the system 400 trains the model in an alternating iterative manner, there is no need to choose a balancing hyperparameter [symbol not reproduced]. Second, because the system 400 implements a wrapping algorithm, it can be attached to a deep learning model without changing the model's internals.

In addition, because the system 400 uses the labeled data and the unlabeled data alternately, there is no need to choose an appropriate schedule for a trade-off coefficient.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications executed on that operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will recognize that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, the processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set out below.

Claims

A method of providing naive semi-supervised deep learning using unlabeled data, the method comprising:
training a classifier using labeled data through a classifier learning unit;
predicting pseudo-labels using the output of the classifier for the unlabeled data through a prediction unit;
pre-training a deep learning model using pseudo-labeled data produced from the pseudo-labels through a model pre-training unit; and
fine-tuning the deep learning model using the labeled data through a model fine-tuning unit,
wherein the naive semi-supervised deep learning:
is based on the principle of entropy regularization, so that the deep learning model can be trained without selecting a balancing hyperparameter;
constructs a classifier and feeds the output of the classifier for the unlabeled data into the deep learning model, so that the deep learning model can be trained without designing a loss function or performing layer-wise training;
trains the deep learning model in an alternating iterative manner by repeating the pseudo-labeling that predicts the pseudo-labels, the pre-training, and the fine-tuning;
can be attached to the deep learning model without changing the internals of the deep learning algorithm; and
in the process of training the classifier using the labeled data through the classifier learning unit, predicting the pseudo-labels using the output of the classifier for the unlabeled data through the prediction unit, pre-training the deep learning model using the pseudo-labeled data produced from the predicted pseudo-labels, and fine-tuning the deep learning model using the labeled data through the model fine-tuning unit, uses the labeled data and the unlabeled data sequentially and, by using the alternating iterative manner, does not require scheduling of a trade-off coefficient.

The method of claim 1,
wherein the training of the classifier comprises
training the classifier of a pseudo-labeling model using the labeled data.

The method of claim 2,
wherein the predicting of the pseudo-labels comprises
performing pseudo-labeling that predicts the pseudo-labels using the output of the pseudo-labeling model for the unlabeled data.

The method of claim 3,
wherein the pre-training of the deep learning model comprises
pre-training the deep learning model of convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) using the pseudo-labeled data.

The method of claim 4,
wherein the fine-tuning of the deep learning model comprises
fine-tuning the deep learning model using the labeled data so as to relabel the unlabeled data and thereby improve the pseudo-labeling.
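
The steps recited in the method claims above (train a classifier on labeled data, predict pseudo-labels, pre-train on pseudo-labeled data, fine-tune on labeled data, then relabel and repeat) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the nearest-centroid classifier, the logistic-regression `TinyModel` standing in for a CNN or LSTM, the helper `make_blobs`, and all epoch counts and learning rates are hypothetical choices made only to keep the example small and dependency-free. Having the fine-tuned model itself relabel the unlabeled data in later rounds is also an assumed reading of the claims.

```python
import math
import random


def train_centroid_classifier(X, y):
    """Step 1: fit a nearest-centroid classifier on the labeled data."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def predict_pseudo_labels(centroids, X):
    """Step 2: give each unlabeled sample the class of its nearest centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: dist2(centroids[c], x)) for x in X]


class TinyModel:
    """Binary logistic regression, a hypothetical stand-in for the deep model."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y, epochs, lr):
        """Stochastic gradient descent on cross-entropy; weights persist across calls."""
        for _ in range(epochs):
            for x, t in zip(X, y):
                g = self.prob(x) - t
                self.w = [wi - lr * g * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * g

    def predict(self, X):
        return [1 if self.prob(x) >= 0.5 else 0 for x in X]


def naive_semi_supervised(X_lab, y_lab, X_unl, rounds=3):
    """Alternate pseudo-labeling, pre-training, and fine-tuning (assumed schedule)."""
    centroids = train_centroid_classifier(X_lab, y_lab)   # step 1
    pseudo = predict_pseudo_labels(centroids, X_unl)      # step 2
    model = TinyModel(len(X_lab[0]))
    for _ in range(rounds):
        model.fit(X_unl, pseudo, epochs=5, lr=0.1)        # step 3: pre-train
        model.fit(X_lab, y_lab, epochs=20, lr=0.05)       # step 4: fine-tune
        pseudo = model.predict(X_unl)                     # relabel for the next round
    return model


def make_blobs(n_per_class, seed):
    """Synthetic two-class data: Gaussian blobs centred at (-2, -2) and (2, 2)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y
```

Note that the labeled and pseudo-labeled losses are never summed in this sketch: each phase minimizes one of them in turn, so no weighting coefficient between the two appears anywhere in the loop. A call such as `naive_semi_supervised(*make_blobs(3, seed=0), make_blobs(100, seed=1)[0])` trains on six labeled and two hundred unlabeled synthetic points.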

(Deleted)

A system for providing naive semi-supervised deep learning using unlabeled data, the system comprising:
a classifier learning unit that trains a classifier using labeled data;
a prediction unit that predicts pseudo-labels using the output of the classifier for the unlabeled data;
a model pre-training unit that pre-trains a deep learning model using pseudo-labeled data produced from the pseudo-labels; and
a model fine-tuning unit that fine-tunes the deep learning model using the labeled data,
wherein the system for providing naive semi-supervised deep learning:
is based on the principle of entropy regularization, so that the deep learning model can be trained without selecting a balancing hyperparameter;
constructs a classifier and feeds the output of the classifier for the unlabeled data into the deep learning model, so that the deep learning model can be trained without designing a loss function or performing layer-wise training;
trains the deep learning model in an alternating iterative manner by repeating the pseudo-labeling that predicts the pseudo-labels, the pre-training, and the fine-tuning;
can be attached to the deep learning model without changing the internals of the deep learning algorithm; and
in the process of training the classifier using the labeled data through the classifier learning unit, predicting the pseudo-labels using the output of the classifier for the unlabeled data through the prediction unit, pre-training the deep learning model using the pseudo-labeled data produced from the predicted pseudo-labels, and fine-tuning the deep learning model using the labeled data through the model fine-tuning unit, uses the labeled data and the unlabeled data sequentially and, by using the alternating iterative manner, does not require scheduling of a trade-off coefficient.
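
For context, the trade-off coefficient that the claims say need not be scheduled can be illustrated with the joint objective commonly used in earlier pseudo-label training, where labeled and pseudo-labeled losses are summed in a single loss. The formula below is not taken from these claims; the symbols are assumptions for illustration: $n$ labeled and $n'$ pseudo-labeled samples per batch, $C$ classes, a per-class loss $\ell$, targets $y$ and pseudo-targets $y'$, network outputs $f$ and $f'$, and a time-dependent weight $\alpha(t)$.

```latex
L = \frac{1}{n}\sum_{m=1}^{n}\sum_{i=1}^{C} \ell\left(y_i^{m}, f_i^{m}\right)
  + \alpha(t)\,\frac{1}{n'}\sum_{m=1}^{n'}\sum_{i=1}^{C} \ell\left(y_i^{\prime m}, f_i^{\prime m}\right)
```

In such a joint objective, $\alpha(t)$ has to be chosen and ramped over training. Under the alternating scheme recited above, the two sums are, roughly speaking, minimized in separate phases (the second during pre-training, the first during fine-tuning), which is one way to read why no schedule for $\alpha(t)$ is needed.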