KR20210030063A - System and method for constructing a generative adversarial network model for image classification based on semi-supervised learning

Info

Publication number: KR20210030063A
Application number: KR1020190111542A
Authority: KR
Inventors: 양지훈; 김상록
Original assignee: 서강대학교산학협력단
Priority date: 2019-09-09
Filing date: 2019-09-09
Publication date: 2021-03-17

Abstract

The present invention relates to a method for constructing an adversarial image generative model based on semi-supervised learning. The method for constructing an adversarial image generative model includes the steps of: (a) obtaining a first loss function according to semi-supervised learning for a discriminator and learning to optimize the first loss function; (b) obtaining a second loss function for minimizing an Earth Mover's distance (EM distance) for the discriminator and learning to optimize the second loss function; (c) obtaining a third loss function according to an existing adversarial image generative model for a generator and learning to optimize the third loss function; (d) obtaining a fourth loss function for minimizing the EM distance for the generator and learning to optimize the fourth loss function; and (e) classifying a result of the discriminator using a classifier and providing an output value.

Description

System and method for constructing a generative adversarial network model for image classification based on semi-supervised learning

The present invention relates to a system and method for constructing an adversarial image generative model and, more specifically, to a method of constructing an improved adversarial image generative model for image classification by combining a semi-supervised adversarial image generative model (Semi-supervised GAN; 'SGAN') with a Wasserstein GAN (WGAN) model.

In general, deep-learning-based classification algorithms train a model on a certain amount of training data and then classify which kind of image each test sample is. Machine learning is divided into three types: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning infers a function from training data, where each training example consists of an input value and its target value, or correct answer. Unsupervised learning belongs to the category of problems that seek to discover how the data is structured; its training data contains no target values or correct answers for the inputs. Semi-supervised learning is a category of machine learning that uses both data with target values and data without them for training.

Of these, supervised learning can be expected to give better results when the amount of labeled data is sufficiently large. However, although the rapid development of information technology has led to the accumulation of far more data than was previously possible, labeled data suitable for supervised learning inevitably makes up only a very small share of it, because adding labels requires human labor or additional resources.

Unsupervised learning is one way to address this problem. Because it uses unlabeled data, it has the advantage of being able to draw on a virtually unlimited amount of data. Its disadvantages are that no single learning method is clearly established, that the methods are diverse, and that training is unstable. Most generative models currently being studied use unsupervised learning and generate data by finding the conditional probability p(x|z) given a latent variable z.

A generative model generates data based on a given probability distribution; what is needed to generate input data is to find the joint distribution shown in Equation 1.

Here, z is called the latent variable (latent vector), and the goal of the generative model is to find p(x), the probability distribution of the data. p(x) is therefore estimated by repeating maximum likelihood estimation of the probability distribution while varying the parameter values.
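Equation 1 is not reproduced in this text. As a hedged reminder only, the standard factorization for a latent-variable model is given below; that Equation 1 has this form is an assumption based on the surrounding description, not a quotation of the original.

```latex
p(x, z) = p(x \mid z)\, p(z), \qquad p(x) = \int p(x \mid z)\, p(z)\, dz
```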

A representative generative model is the Generative Adversarial Network ('GAN'). Treating the latent variable z as the features of the input data, a GAN changes z so that the generated data is as similar as possible to the original. A GAN consists of two models that act as a generator and a discriminator. The discriminator learns to judge only real data as true, and the generator learns to produce fake data that the discriminator cannot identify as false. The core idea of the GAN is to reach the optimum by raising the performance of both models at once through this adversarial competition. However, GANs are also difficult and unstable to train, and they are prone to mode collapse (Mode Collapsing), in which the generated images become biased toward particular images that the discriminator finds hard to distinguish. Various models have been proposed to solve these problems.

Meanwhile, semi-supervised learning ('SSL') is an intermediate learning method that compensates for the problems of supervised and unsupervised learning. As described above, semi-supervised learning builds a model that learns from labeled and unlabeled data together. The aim is to solve supervised learning problems such as classification and regression more easily and quickly.

Accordingly, the present invention proposes a system and method that combine adversarial generative models to construct a semi-supervised adversarial generative model with improved performance.

Korean Registered Patent Publication No. 10-1925913
Korean Patent Application Publication No. 10-2019-0078710

To solve the problems described above, an object of the present invention is to provide a system and method that can construct a new type of adversarial generative model with improved performance by combining adversarial generative models on the basis of semi-supervised learning.

To achieve the above technical object, a system for constructing an adversarial generative model according to a first aspect of the present invention comprises: a generator that generates an image from an input random latent variable; a discriminator that receives the image generated by the generator as input and determines whether it is real or fake; and a classifier that classifies the result of the discriminator and provides an output value.

The discriminator is configured by training it, using training data, to optimize both a first loss function of the discriminator according to semi-supervised learning and a second loss function of the discriminator for minimizing the Earth Mover's distance (EM distance). The generator is configured by training it to optimize both a third loss function, which is the generator loss of a basic adversarial generative model, and a fourth loss function of the generator for minimizing the EM distance.

In the system according to the first aspect, the training data of the discriminator preferably consists of labeled real data, unlabeled real data, and fake data generated by the generator.

In the system according to the first aspect, the first loss function of the discriminator is preferably the sum of a loss function on labeled data according to supervised learning and a loss function on unlabeled data according to unsupervised learning.

In the system according to the first aspect, the discriminator and the generator are preferably each composed of a multi-layer convolutional neural network.

In the system according to the first aspect, the classifier is preferably composed of a softmax function and configured to classify the result of the discriminator by class type and by whether the input is fake, and to output that classification.

A method of constructing an adversarial generative model based on semi-supervised learning according to a second aspect of the present invention concerns a model having a generator that generates an image from a random latent variable and a discriminator that receives the image generated by the generator as input and determines whether it is real or fake. The method comprises the steps of: (a) obtaining, for the discriminator, a first loss function according to semi-supervised learning and training so that the first loss function is optimized; (b) obtaining, for the discriminator, a second loss function for minimizing the Earth Mover's distance (EM distance) and training so that the second loss function is optimized; (c) obtaining, for the generator, a third loss function according to an existing adversarial generative model and training so that the third loss function is optimized; (d) obtaining, for the generator, a fourth loss function for minimizing the EM distance and training so that the fourth loss function is optimized; and (e) classifying the result of the discriminator using a classifier and providing an output value.

In the method according to the second aspect, in steps (a) and (b), labeled real data, unlabeled real data, and fake data generated by the generator are preferably input to the discriminator, and the discriminator learns from the input data.

In the method according to the second aspect, the first loss function in step (a) is preferably the sum of a loss function on labeled data and a loss function on unlabeled data.

In the method according to the second aspect, the discriminator and the generator are preferably each composed of a multi-layer convolutional neural network.

In the method according to the second aspect, the classifier that classifies the result of the discriminator in step (e) is preferably composed of a softmax function and configured to classify by class type and by whether the input is fake, and to output that classification.

In the method according to the second aspect, it is preferable to repeat a cycle in which the training of steps (a) and (b) for the discriminator is performed a plurality of times and the training of steps (c) and (d) for the generator is performed once.
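The alternating schedule described above can be sketched as a simple loop. This is a minimal illustration, not the patented algorithm: the names `train_schedule`, `n_critic`, `d_step`, and `g_step` are hypothetical, and the real update steps are replaced by callbacks that only record the order in which they run.

```python
def train_schedule(num_iterations, n_critic, d_step, g_step):
    """Repeat: n_critic discriminator updates, then one generator update."""
    for _ in range(num_iterations):
        for _ in range(n_critic):
            d_step()  # stands in for steps (a) and (b) on the discriminator
        g_step()      # stands in for steps (c) and (d) on the generator


# Record the order of updates instead of training a real model.
calls = []
train_schedule(
    num_iterations=3,
    n_critic=2,
    d_step=lambda: calls.append("D"),
    g_step=lambda: calls.append("G"),
)
order = "".join(calls)
print(order)  # DDGDDGDDG
```

Updating the discriminator several times per generator update keeps the discriminator close to optimal for the current generator, which is the usual motivation for this schedule in WGAN-style training.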

The system and method for constructing an adversarial generative model with improved performance according to the present invention, configured as described above, provide a newly constructed adversarial generative model that combines the SGAN and the WGAN.

Because the system according to the present invention is based on semi-supervised learning, labeled real data helps the discriminator learn, so training is faster than with unsupervised learning.

In addition, by applying the convolutional neural network structure of the Deep Convolution GAN to the SGAN, the system according to the present invention can solve the mode collapse problem of the SGAN.

In addition, by applying the WGAN approach of minimizing the EM distance, the system according to the present invention can improve training stability even as the size of the data increases.

FIG. 1 is a schematic diagram of a basic generative adversarial network (GAN).
FIG. 2 is a schematic diagram of the generator structure of the Deep Convolution GAN.
FIG. 3 is a schematic diagram of the structure of the Semi-Supervised GAN (SGAN).
FIG. 4 shows the algorithm for the training process of the Semi-Supervised GAN (SGAN).
FIG. 5 is a schematic diagram of the structure of a new type of GAN model based on semi-supervised learning according to a preferred embodiment of the present invention.
FIG. 6 is a flowchart showing, in sequence, a method of constructing an adversarial generative model based on semi-supervised learning according to a preferred embodiment of the present invention.
FIG. 7 shows the algorithm for constructing the combined GAN model in the method of constructing an adversarial generative model according to a preferred embodiment of the present invention.

The method and system for constructing a new type of adversarial generative model according to the present invention optimize the loss function of the SGAN and the loss function of the WGAN separately, thereby increasing training speed, solving the mode collapse problem, and at the same time improving training stability.

Hereinafter, a new GAN model system based on semi-supervised learning according to a preferred embodiment of the present invention, and a method of constructing the model, are described in detail with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a basic generative adversarial network (GAN). Referring to FIG. 1, the basic model, the Vanilla GAN (VGAN), consists of two models that act as a generator ('G') and a discriminator ('D'). The generator produces an image from a random latent variable z, and its goal is for its output to be classified as a real image after passing through the discriminator. The discriminator, in contrast, receives the image produced by the generator as input and judges it; its goal is to distinguish accurately whether an input image is real or fake. Equation 2 is the objective function of the GAN.

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$ (Equation 2)

Here, V(D,G) is the value function of the minimax problem over D and G, and E denotes an expectation: $\mathbb{E}_{x \sim p_{data}(x)}$ is the expectation over data x belonging to $p_{data}(x)$. D(x) is the discriminator applied to x, and G(z) is the generator applied to the latent variable z.

Consider the objective function of the GAN from the standpoint of D maximizing V(D,G). $p_{data}(x)$ is the probability distribution of the real data, and x is a sample drawn from it. The discriminator D outputs an estimated probability close to 1 when real data comes in and close to 0 when fake data produced by the generator G comes in. Because of the logarithm, the term $\log D(x)$ takes a value close to its maximum of 0 when D judges real data correctly and diverges toward negative infinity when it does not, so D learns in the direction that maximizes V(D,G). Now consider the right-hand term, the part involving the generator G, from the standpoint of minimizing V(D,G). $p_z(z)$ is an arbitrary noise distribution, usually taken to be a normal distribution, and z is a sample drawn from it. If the discriminator D judges the data that G produces from this input to be real, $D(G(z))$ is 1 and $\log(1 - D(G(z)))$ diverges toward negative infinity; if D judges it to be fake, $D(G(z))$ is 0 and $\log(1 - D(G(z)))$ takes a value close to its maximum of 0. Therefore, with respect to V(D,G), G moves in the direction that minimizes it and D in the direction that maximizes it.
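The behavior of the two logarithmic terms can be checked numerically. The sketch below estimates V(D,G) from lists of discriminator outputs; the function name `value_function` and the sample probabilities are illustrative assumptions, and no real networks are involved.

```python
import math


def value_function(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator is confident and correct: V is close to its maximum of 0.
v_good = value_function(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])

# Generator fools the discriminator (D(G(z)) near 1): V becomes strongly negative.
v_fooled = value_function(d_real=[0.99, 0.98], d_fake=[0.99, 0.98])

# Undecided discriminator outputs 0.5 everywhere: V = 2 * log(0.5).
v_half = value_function(d_real=[0.5], d_fake=[0.5])
```

Comparing `v_good` with `v_fooled` shows the opposing incentives: the discriminator wants the first case, the generator the second.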

Meanwhile, the DCGAN (Deep Convolution GAN) develops the VGAN structure described above by applying a convolutional neural network to the VGAN, and it serves as the basis of later GAN models. DCGAN applies the following five techniques to the VGAN: (1) max pooling layers are removed and the size of the feature maps is adjusted with convolutions; (2) batch normalization is applied; (3) fully connected hidden layers are removed; (4) the generator uses tanh as its output activation function and ReLU in the remaining layers; and (5) the discriminator uses LeakyReLU as its activation function.
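Technique (1) relies on strided convolutions, rather than pooling, to change the feature-map size. As a rough sketch, the standard output-size formulas for a strided convolution and a transposed convolution (no dilation, no output padding) show how this works. The kernel size 4, stride 2, padding 1, and the 4-to-64 spatial sizes are assumptions chosen for illustration, not values taken from this document.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial size after a strided convolution (downsampling in a discriminator)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Spatial size after a transposed convolution (upsampling in a generator)."""
    return (size - 1) * stride - 2 * padding + kernel


# With kernel 4, stride 2, padding 1, each layer doubles or halves the size.
gen_sizes = [4]
for _ in range(4):
    gen_sizes.append(deconv_out(gen_sizes[-1], kernel=4, stride=2, padding=1))

disc_sizes = [64]
for _ in range(4):
    disc_sizes.append(conv_out(disc_sizes[-1], kernel=4, stride=2, padding=1))

print(gen_sizes)   # [4, 8, 16, 32, 64]
print(disc_sizes)  # [64, 32, 16, 8, 4]
```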

FIG. 2 is a schematic diagram of the generator structure of the Deep Convolution GAN. Referring to FIG. 2, the DCGAN generator model is built with the techniques described above. As a result, it can generate much sharper images than the VGAN, and it resolves much of the mode collapse that occurs during training.

The GAN model construction method according to the present invention creates a new model by combining the loss functions of the Semi-Supervised GAN (SGAN), which uses the DCGAN discriminator and generator structures, and of the Wasserstein GAN (WGAN), and trains the discriminator by semi-supervised learning that uses both labeled real data and the unlabeled fake data produced by the generator. The system for constructing an adversarial generative model according to the present invention may be implemented on a high-performance computer or the like in which training data is stored in a database in advance. The method may be stored on a storage device in the form of a program for constructing the adversarial generative model and executed by the computer or server of the system.

The Semi-Supervised GAN (SGAN) is an advanced GAN model based on the structure of the DCGAN. Whereas the VGAN discriminator only distinguishes whether an image produced by the generator is real or fake, the SGAN discriminator performs classification, so that it can determine both the class of the input data and whether it is fake. Because the VGAN discriminator distinguishes only real from fake, its output uses a sigmoid function; because the SGAN must determine the class and whether the input is fake at the same time, its output uses a softmax function. Furthermore, the final goal of the VGAN is to obtain a generator that produces fake images so refined that the discriminator cannot tell them apart, so training focuses on the generator; the ultimate goal of the SGAN is to train a discriminator that can determine the class and whether the input is fake at the same time.

FIG. 3 is a schematic diagram of the structure of the Semi-Supervised GAN (SGAN). Referring to FIG. 3, the SGAN trains the discriminator through semi-supervised learning: the generator produces fake images from the latent variable z, and the discriminator learns from both these fake images and labeled real data. In other words, the SGAN discriminator is trained effectively through semi-supervised learning that uses both labeled and unlabeled data, and the generator in turn is trained through competition with it. The generator and the discriminator both use the DCGAN structure.

In the SGAN, the discriminator must distinguish n classes and at the same time identify images produced by the generator as fake, so it must be able to classify a total of n+1 classes. If x is fake data, the probability of each class for x can be expressed as in Equation 3.

Here, l stands for logit. Because the classifier handles an (n+1)-dimensional vector, the elements of that vector are defined as l1, l2, …, ln+1.
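Equation 3 is not reproduced in this text, but the description matches a softmax over n+1 logits in which the last entry stands for the "fake" class. The sketch below shows that reading under stated assumptions: n = 3, the logit values, and the convention that index n+1 is the fake class are all illustrative choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


n = 3                           # hypothetical number of real classes
logits = [2.0, 0.5, -1.0, 0.0]  # l_1 .. l_n, followed by l_{n+1} for "fake"
probs = softmax(logits)

p_fake = probs[n]               # probability assigned to the extra class
p_real = 1.0 - p_fake           # total probability of the n real classes
# Most likely real class, reported with a 1-based index.
predicted_class = max(range(n), key=lambda k: probs[k]) + 1
```

A single softmax output therefore answers both questions at once: which class the input belongs to, and how likely it is to be generated.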

If x is labeled real data, it must be classified into one of the real classes, so the probability for x can be expressed as in Equation 4.

Accordingly, the discriminator loss functions for the two cases are given by Equation 5 and Equation 6.

The final loss function of the SGAN discriminator (D) is therefore obtained by adding the two loss functions above, as in Equation 7.
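Equations 5 to 7 are not reproduced in this text. The sketch below assumes a common formulation, which may differ from the original equations: a cross-entropy term for labeled real samples against their true class, a cross-entropy term for generated samples against the extra "fake" class, and their sum as the discriminator loss. All function names and logit values are hypothetical.

```python
import math


def log_softmax(logits):
    """Log of the softmax probabilities, computed with the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - lse for l in logits]


def labeled_loss(logits, label):
    """Cross-entropy for a labeled real sample; label is a 0-based index below n."""
    return -log_softmax(logits)[label]


def fake_loss(logits):
    """Cross-entropy for a generated sample against the last ('fake') class."""
    return -log_softmax(logits)[-1]


def discriminator_loss(labeled_batch, fake_batch):
    """Sum of the mean labeled-data loss and the mean generated-data loss."""
    labeled_term = sum(labeled_loss(l, y) for l, y in labeled_batch) / len(labeled_batch)
    fake_term = sum(fake_loss(l) for l in fake_batch) / len(fake_batch)
    return labeled_term + fake_term


labeled = [([3.0, 0.0, 0.0, 0.0], 0)]  # one real sample of class 0, scored correctly
fake_caught = [[0.0, 0.0, 0.0, 3.0]]   # generated sample recognized as fake
fake_missed = [[3.0, 0.0, 0.0, 0.0]]   # generated sample mistaken for class 0

loss_good = discriminator_loss(labeled, fake_caught)
loss_bad = discriminator_loss(labeled, fake_missed)
```

The loss is lower when the generated sample is recognized as fake, which is the behavior the summed objective is meant to reward.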

Meanwhile, the generator (G) of the SGAN uses the same loss function as the VGAN, as in Equation 8.

By setting the loss functions as above and training the discriminator through semi-supervised learning, the discriminator can be trained faster than with the existing GAN while still achieving good results.

FIG. 4 shows the algorithm for the training process of the Semi-Supervised GAN (SGAN).

Meanwhile, the Wasserstein GAN (WGAN) is a model that pursues training stability by redefining the loss function of the VGAN; it is characterized by using the Wasserstein distance in the loss function. Whereas the VGAN generally uses the JS divergence (Jensen-Shannon divergence), the WGAN uses the Wasserstein distance, also called the Earth Mover's distance (EM distance). The name comes from interpreting distance as the minimum cost of reshaping a pile of earth shaped like one probability distribution into the shape of another, where the cost is quantified as the amount of earth multiplied by the distance it is moved. The WGAN uses the EM distance rather than the JS divergence because a sequence of probability distributions that does not converge under the JS divergence may still converge under the EM distance.
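The convergence argument can be illustrated with a toy case. For histograms on a common one-dimensional grid with unit spacing, the EM distance equals the sum of absolute differences between the cumulative sums, which is a known property of the one-dimensional case. The sketch below moves a point mass toward a target: the EM distance shrinks with each step, while the JS divergence stays at log 2 as long as the two distributions do not overlap. The grid size and positions are assumptions chosen for illustration.

```python
import math


def em_distance(p, q):
    """Earth Mover's distance between two histograms on a unit-spaced 1-D grid."""
    total, carried = 0.0, 0.0
    for pi, qi in zip(p, q):
        carried += pi - qi     # earth that still has to be moved past this bin
        total += abs(carried)  # moving it one bin further costs |carried|
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two histograms."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def point_mass(position, size=6):
    """Histogram with all probability mass in a single bin."""
    return [1.0 if i == position else 0.0 for i in range(size)]


target = point_mass(0)
positions = (4, 2, 1)  # the point mass moves step by step toward the target
em_values = [em_distance(point_mass(k), target) for k in positions]
js_values = [js_divergence(point_mass(k), target) for k in positions]

print(em_values)  # [4.0, 2.0, 1.0]
```

Because the EM distance decreases smoothly as the distributions approach each other, it can provide a useful training signal in situations where the JS divergence is flat.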

FIG. 5 is a schematic diagram of the structure of a new type of GAN model based on semi-supervised learning according to a preferred embodiment of the present invention. Referring to FIG. 5, the new adversarial generative model system according to the present invention combines the SGAN and WGAN models. The generator and the discriminator use the DCGAN structure and are composed of five-layer and four-layer convolutional neural networks, respectively. The GAN model (10) according to the present invention comprises a generator (20) and a discriminator (30), each composed of a convolutional neural network; the discriminator further comprises a classifier, which classifies the discriminator's result by class type and by whether the input is fake and outputs that classification.

Hereinafter, a method of constructing a GAN model based on semi-supervised learning according to a preferred embodiment of the present invention is described in detail with reference to FIG. 6. FIG. 6 is a flowchart showing, in sequence, a method of constructing an adversarial generative model based on semi-supervised learning according to a preferred embodiment of the present invention.

To apply semi-supervised learning, the input data for the discriminator of the combined model according to the present invention is constructed by adjusting the value of a label parameter, so that only part of the real data is labeled and the remaining data is used without labels. Accordingly, the input data fed to the discriminator of the combined model consists of labeled real data, unlabeled real data, and fake data generated by the generator.
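The construction of the discriminator's real input data described above can be illustrated with a minimal sketch. The function name `split_by_label_ratio`, the parameter `label_ratio`, and the random selection of which samples keep their labels are assumptions for illustration only; the specification states only that a label parameter controls how much of the real data is labeled.

```python
import random

def split_by_label_ratio(samples, label_ratio, seed=0):
    """Split (image, label) pairs into a labeled part and an unlabeled part.

    `label_ratio` is a hypothetical stand-in for the label parameter:
    the fraction of real samples whose labels are kept.
    """
    if not 0.0 <= label_ratio <= 1.0:
        raise ValueError("label_ratio must be in [0, 1]")
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_labeled = round(label_ratio * len(samples))
    labeled_idx = set(indices[:n_labeled])
    # Labeled real data keeps (image, label); unlabeled real data keeps the image only.
    labeled = [samples[i] for i in sorted(labeled_idx)]
    unlabeled = [samples[i][0] for i in range(len(samples)) if i not in labeled_idx]
    return labeled, unlabeled

# Toy data: integers stand in for images, with 10 classes.
toy = [(i, i % 10) for i in range(100)]
labeled, unlabeled = split_by_label_ratio(toy, label_ratio=0.1)
```

The fake data, the third kind of discriminator input, is produced by the generator during training and is therefore not part of this split.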

Semi-supervised learning uses labeled real data to help train the discriminator, so training is faster than with unsupervised learning. Early in training, the discriminator is poor at distinguishing input images, so the gradients are large and training proceeds well. However, once the discriminator learns faster than the generator and its classification performance rises, mode collapse may occur. To further increase the stability of training in this case, the combined model according to the present invention is characterized by additionally adopting the WGAN approach of minimizing the EM distance.

In the GAN combined model according to the present invention, two kinds of loss functions are optimized, each through a different optimization method. First, the discriminator's loss function under semi-supervised learning is the sum of a loss function for unsupervised learning using unlabeled data, as in Equation 9, and a loss function for supervised learning using labeled data, as in Equation 10. Meanwhile, the loss function of the WGAN discriminator can be expressed as a function that minimizes the EM distance, as in Equation 12. Therefore, in the GAN combined model according to the present invention, the discriminator minimizes both the loss function according to SGAN and the loss function that minimizes the EM distance according to WGAN.
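The two discriminator losses can be sketched as follows. Equations 9, 10 and 12 are not reproduced in this passage, so the sketch assumes the commonly used formulation in which the discriminator emits K class logits plus one extra "fake" logit, and the widely used WGAN critic loss. All function names are hypothetical, and the exact terms may differ from the equations of the specification.

```python
import math

_EPS = 1e-12  # guards against log(0); an implementation detail of this sketch

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def supervised_loss(logits, label):
    """Cross-entropy over the K real classes for one labeled real sample.

    `logits` has K + 1 entries; the last entry is the assumed "fake" class.
    """
    probs = softmax(logits[:-1])  # condition on the sample being real
    return -math.log(max(probs[label], _EPS))

def unsupervised_loss(real_logits, fake_logits):
    """Real samples should not look fake; generated samples should."""
    p_fake_real = softmax(real_logits)[-1]
    p_fake_fake = softmax(fake_logits)[-1]
    return -math.log(max(1.0 - p_fake_real, _EPS)) - math.log(max(p_fake_fake, _EPS))

def sgan_discriminator_loss(labeled_logits, label, unlabeled_logits, fake_logits):
    """Semi-supervised loss: unsupervised part plus supervised part."""
    return unsupervised_loss(unlabeled_logits, fake_logits) + supervised_loss(labeled_logits, label)

def wgan_discriminator_loss(real_scores, fake_scores):
    """Assumed WGAN critic loss: mean score on fakes minus mean score on reals."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(fake_scores) - mean(real_scores)
```

Keeping the two losses as separate functions mirrors the text, in which each loss is minimized by its own optimization step rather than being merged into a single objective.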

In the GAN combined model according to the present invention, the generator minimizes both the loss function according to SGAN and the loss function that minimizes the EM distance according to WGAN. Equation 13 is the generator's loss function according to SGAN, and Equation 14 is the loss function that minimizes the EM distance according to WGAN.
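The two generator losses can be sketched in the same spirit. Equations 13 and 14 are not reproduced in this passage, so the sketch assumes the non-saturating form of the basic GAN generator loss and the usual WGAN generator loss; the function names are hypothetical.

```python
import math

def generator_gan_loss(d_real_probs_on_fakes):
    """Assumed non-saturating GAN generator loss.

    Each entry is the discriminator's probability that a generated
    sample is real; the generator wants these close to 1.
    """
    n = len(d_real_probs_on_fakes)
    return -sum(math.log(max(p, 1e-12)) for p in d_real_probs_on_fakes) / n

def generator_wgan_loss(critic_scores_on_fakes):
    """Assumed WGAN generator loss: negative mean critic score on fakes."""
    return -sum(critic_scores_on_fakes) / len(critic_scores_on_fakes)
```

Both functions take only the discriminator's outputs on generated samples, since real data does not enter the generator's objectives.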

In summary, in the GAN model construction method according to the present invention, a loss function for the labeled case and a loss function for the unlabeled case are each set for the discriminator for semi-supervised learning, the values of the two loss functions are added to obtain the loss function for semi-supervised learning, and an optimization algorithm is applied to it. The Adam optimizer can be used as this optimization algorithm. In addition, a loss function that minimizes the EM distance according to WGAN is set for the discriminator, and an optimization algorithm is applied to it. The RMS optimizer can be used as this optimization algorithm.

For the generator, a loss function according to VGAN, the basic adversarial generative model, is set and an optimization algorithm is applied to it. The Adam optimizer can be used as this optimization algorithm. In addition, a loss function according to WGAN is set for the generator and an optimization algorithm is applied to it. The RMS optimizer can be used as this optimization algorithm.
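For reference, the update rules of the two optimizers can be sketched for a single scalar weight. The sketch assumes that the "RMS optimizer" refers to RMSProp, as used in the original WGAN. The hyperparameter defaults and the dictionary-based `state` are illustrative choices, not values taken from the specification.

```python
import math

def adam_step(w, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; `state` holds the moment estimates m, v and step count t."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def rmsprop_step(w, grad, state, lr=1e-3, rho=0.9, eps=1e-8):
    """One RMSProp update; `state` holds the running mean of squared gradients s."""
    state["s"] = rho * state["s"] + (1 - rho) * grad * grad
    return w - lr * grad / (math.sqrt(state["s"]) + eps)

# Each loss keeps its own optimizer state, even when both update the same weights.
w = 0.0
w = adam_step(w, grad=1.0, state={"m": 0.0, "v": 0.0, "t": 0})
w = rmsprop_step(w, grad=2.0, state={"s": 0.0})
```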

In the GAN model construction method according to the present invention, the generator is trained once each time the discriminator has been trained multiple times. Because the weights are clipped to c in order to satisfy the condition of the WGAN loss function, this allows the discriminator to be trained more, for the sake of training stability.
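The weight clipping mentioned above can be sketched as follows, assuming that clipping "to c" means restricting every discriminator weight to the interval [-c, c], as in the original WGAN. The value c = 0.01 in the usage line is a hypothetical choice.

```python
def clip_weights(weights, c):
    """Clamp each weight to the interval [-c, c]."""
    if c <= 0:
        raise ValueError("c must be positive")
    return [min(max(w, -c), c) for w in weights]

clipped = clip_weights([-0.5, 0.005, 0.2], c=0.01)  # -> [-0.01, 0.005, 0.01]
```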

FIG. 7 shows an algorithm for constructing the GAN model according to the present invention in the method of constructing an adversarial generative model according to a preferred embodiment of the present invention. Referring to FIG. 7, the GAN model described above first receives its parameters as input and then operates. Since the discriminator D must be trained 5 times before the generator G is trained once, the discriminator D is trained first, for t = 0, …, 5. The loss functions in each formula are the same as described above.

In the algorithm, one loss term denotes the SGAN loss function. Its first two parts are the unsupervised loss and its last two parts are the supervised loss, and the loss is the sum of these parts. The other loss term in the algorithm denotes the WGAN loss function.

During training of the discriminator, the discriminator's variables are updated using the Adam optimizer for the SGAN loss function and the RMS optimizer for the WGAN loss function.

Next, the generator G is trained once, in the same manner as the discriminator.
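The order of updates described above can be summarized in a small sketch that only records which network, loss and optimizer are used at each step. The tuple layout, the function name `training_schedule`, and the placement of weight clipping directly after the discriminator's WGAN update are assumptions; the algorithm of FIG. 7 is not reproduced in this passage.

```python
def training_schedule(num_iterations, n_critic=5):
    """List the update steps: n_critic discriminator rounds per generator round."""
    steps = []
    for _ in range(num_iterations):
        for _ in range(n_critic):
            steps.append(("D", "sgan_loss", "adam"))
            steps.append(("D", "wgan_loss", "rms"))
            steps.append(("D", "clip_weights", None))  # assumed position of clipping
        steps.append(("G", "gan_loss", "adam"))
        steps.append(("G", "wgan_loss", "rms"))
    return steps

schedule = training_schedule(num_iterations=2)
```

In a real training loop each tuple would be replaced by a forward pass, a gradient computation and the corresponding optimizer step.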

In the method for constructing a GAN combined model according to the present invention, the discriminator uses a classifier at its final stage, and the softmax function is preferably used as this classifier. By using the softmax function as the classifier, the discriminator can determine and output both the class and whether the input is fake. Therefore, by training through semi-supervised learning, the discriminator of the GAN combined model according to the present invention can distinguish whether data is real or generated by the generator, and, for labeled real data, can classify which label the data belongs to.
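How a softmax output can carry both the class and the real/fake decision can be sketched as follows. The sketch assumes K class logits followed by one extra logit for "fake"; this layout and the function name `classify` are illustrative and are not stated in the specification.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return (predicted_class, is_fake) from K class logits plus one fake logit."""
    probs = softmax(logits)
    fake_index = len(logits) - 1  # assumed position of the "fake" output
    best = max(range(len(probs)), key=probs.__getitem__)
    is_fake = best == fake_index
    return (None if is_fake else best), is_fake

print(classify([0.1, 2.0, 0.3, -1.0]))  # class 1, judged real
print(classify([0.0, 0.0, 5.0]))        # judged fake, no class assigned
```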

The generator also uses a classifier at its final stage, and the sigmoid function is preferably used as this classifier.

The present invention has been described above with reference to its preferred embodiments, but these are only examples and do not limit the present invention. Those of ordinary skill in the field to which the present invention pertains will appreciate that various modifications and applications not exemplified above are possible without departing from the essential characteristics of the present invention. Differences related to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

10: GAN model
20: generator
30: discriminator
40: classifier

Claims

A system for constructing an adversarial generative model based on semi-supervised learning, comprising:
a generator that generates an image from input random latent variables;
a discriminator that receives the image generated by the generator as input and determines whether it is real or fake; and
a classifier that classifies the result of the discriminator and provides an output value,
wherein the discriminator is configured by training on learning data so as to optimize, respectively, a first loss function of the discriminator according to semi-supervised learning and a second loss function of the discriminator for minimizing the Earth Mover's distance (EM distance), and
wherein the generator is configured by training so as to optimize a third loss function of the generator according to the basic adversarial generative model and a fourth loss function of the generator for minimizing the EM distance.

The system of claim 1, wherein the learning data of the discriminator are labeled real data, unlabeled real data, and fake data generated by the generator.

The system of claim 1, wherein the first loss function of the discriminator consists of the sum of a loss function for labeled data according to supervised learning and a loss function for unlabeled data according to unsupervised learning.

The system of claim 1, wherein the discriminator is composed of a multi-layer convolutional neural network, and the generator is composed of a multi-layer convolutional neural network.

The system of claim 1, wherein the classifier is configured with a softmax function and classifies and outputs, for the result of the discriminator, the class type and whether the input is fake.

A method for constructing an adversarial generative model based on semi-supervised learning, the model comprising a generator that generates an image from random latent variables and a discriminator that receives the image generated by the generator as input and determines whether it is real or fake, the method comprising:
(a) obtaining a first loss function for the discriminator according to semi-supervised learning, and training to optimize the first loss function;
(b) obtaining a second loss function for minimizing the Earth Mover's distance (EM distance) for the discriminator, and training to optimize the second loss function;
(c) obtaining a third loss function for the generator according to an existing adversarial generative model, and training to optimize the third loss function;
(d) obtaining a fourth loss function for minimizing the EM distance for the generator, and training to optimize the fourth loss function; and
(e) classifying the result of the discriminator using a classifier and providing an output value.

The method of claim 6, wherein in steps (a) and (b), labeled real data, unlabeled real data, and fake data generated by the generator are input to the discriminator, and the discriminator is trained using the input data.

The method of claim 6, wherein in step (a), the first loss function consists of the sum of a loss function for labeled data and a loss function for unlabeled data.

The method of claim 6, wherein the discriminator and the generator are each composed of a multi-layer convolutional neural network.

The method of claim 6, wherein in step (e), the classifier that classifies the result of the discriminator is configured with a softmax function and classifies and outputs the class type and whether the input is fake.

The method of claim 6, wherein the training of steps (a) and (b) for the discriminator is performed multiple times and the training of steps (c) and (d) for the generator is performed once, and this sequence is repeated.