KR20190078710A

KR20190078710A - Image classification system and method

Info

Publication number: KR20190078710A
Application number: KR1020170173228A
Authority: KR
Inventors: 김한준; 김소현
Original assignee: 서울시립대학교 산학협력단
Priority date: 2017-12-15
Filing date: 2017-12-15
Publication date: 2019-07-05
Also published as: KR102018788B1

Abstract

According to an embodiment of the present invention, an image classification system is provided that comprises: a single class model that classifies whether an input image belongs to a positive class, which is the object of interest; and a support model that generates fake class data, which is negative class data (data not of interest) that is not over-fitted to the positive class data belonging to the positive class. The single class model performs classification learning on the positive class data and the fake class data.

Description

IMAGE CLASSIFICATION SYSTEM AND METHOD

The present disclosure relates to an image classification system and method.

In the era of big data, in which vast amounts of data are collected at high speed, much research applies machine learning to problems large and small. Among these techniques, predicting the category to which an item belongs is called classification. Depending on the number of categories, classification is divided into multiclass classification and binary classification.

Multiclass classification involves three or more categories, whereas binary classification involves only two: positive and negative. Ordinary binary classification requires both positive class data and negative class data. In some situations, however, only positive class data can be obtained as a training set. In a binary setting, predicting the category of an unknown data item using only positive class data as the training set is called one-class classification.

One-class classification is useful for developing personalized image retrieval systems, for example to predict which images a user will be interested in and recommend them to that user. In this situation, logs of user activity that contain the images the user showed interest in can be collected. Treating the images in which the user showed interest as positive class data and the images in which the user has no interest as negative class data, predicting whether the user is interested in a particular image can be implemented as binary classification. Because only positive class data is used as the training set, this binary classification can be regarded as one-class classification.

Several studies on one-class classification for image retrieval systems have already been carried out, and one-class classification can also be used to predict safety and security problems, product quality, and the like.

However, existing one-class classification techniques learn only the distribution of the positive class data and then classify input data as either positive class data or negative class data. In this case the model over-fits the positive class data, so the decision boundary becomes excessively narrow.

An object of the present invention is to provide an image classification system and method, based on a single class model for an image retrieval system, that can prevent the boundary of the positive class from becoming excessively narrow due to over-fitting.

An image classification system according to one aspect of the invention includes a single class model that classifies whether an input image belongs to a positive class of interest, and a support model that generates fake class data, which is negative class data (data not of interest) that is not over-fitted to the positive class data belonging to the positive class. The single class model performs classification learning on the positive class data and the fake class data.

The support model may obtain a classification result from the single class model and update the fake class data through training based on the obtained classification result.

The classification learning of the single class model and the update operation of the support model based on the classification result may be repeated for a predetermined k epochs, so that the fake class data becomes negative data samples whose distribution is close to, but does not coincide with, the distribution of the positive class samples.

The single class model may perform classification learning using a validation set that includes the positive class data and the fake class data generated during the k epochs.

The single class model may include a convolution layer that applies a convolution kernel to its input, a first pooling layer that applies a max pooling kernel to the input whose volume has been converted by the convolution layer, a second pooling layer that applies an average pooling kernel to its input, and a fully-connected (FC) layer that is connected to all elements of the input to which the average pooling kernel has been applied and computes over each element to generate the output of the single class model.

The support model may include a fully-connected (FC) layer that is connected to all elements of an input noise vector and computes over each element to generate an input of a predetermined volume, a convolution layer that applies a convolution kernel to its input, and a layer that applies an upsampling kernel to its input.

The fake data may be generated by a generative adversarial net (GAN) framework.

An image classification method according to another aspect of the invention includes: a step in which a single class model performs learning to classify fake class data and positive class data belonging to a positive class of interest; a step in which a support model generates the fake class data, which is negative class data (data not of interest) that is not over-fitted to the positive class data; and a step of classifying whether an input image belongs to the positive class of interest.

The step of generating the fake class data may include a step in which the support model obtains the classification result of the classification learning step, and a step of updating the fake class data through training based on the obtained classification result.

The classification learning step of the single class model and the update step of the support model based on the classification result may be repeated for a predetermined k epochs.

The step of performing the classification learning may use a validation set that includes the positive class data and the fake class data generated during the k epochs.

The step of generating the fake data may be performed by a generative adversarial net (GAN) framework.

As a single class model for an image retrieval system, the embodiment uses the fake class data generated by the support model under the GAN framework to reduce generalization error, so that the single class model can learn the features of the positive class accurately.

FIG. 1A is a conceptual diagram showing over-fitting in an existing one-class classification technique.
FIG. 1B is a conceptual diagram showing a one-class classification result using randomly generated negative class data, and FIG. 1C is a conceptual diagram showing a one-class classification result using appropriate negative class data.
FIG. 2 is a diagram illustrating an image classification system according to an embodiment.
FIG. 3 is a diagram illustrating the network architecture of a single class model according to an embodiment.
FIG. 4 is a diagram illustrating the network architecture of a support model according to an embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. To describe the invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are denoted by similar reference numerals throughout the specification.

To solve the technical problem, the image classification system according to the embodiment generates negative class data and uses it to build a discriminant function that does not over-fit the positive class data. As a result, an image classification system can be provided that applies a new one-class classification technique in which fake data generated by a generative adversarial net (GAN) framework is used as the negative class data.

Existing one-class classification techniques suffer from over-fitting to the positive class data.

FIG. 1A is a conceptual diagram showing over-fitting in an existing one-class classification technique.

The circular points represent positive class data and the closed curve represents the decision boundary. Because the learning machine learns only the distribution of the positive class data, the decision boundary lies too close to the positive class data. Generating appropriate negative class data therefore helps to produce a discriminant function that is not over-fitted to the positive class data. The negative class data should not be generated at random; it should be generated appropriately, close to the positive class data.

FIG. 1B is a conceptual diagram showing a one-class classification result using randomly generated negative class data, and FIG. 1C is a conceptual diagram showing a one-class classification result using appropriate negative class data. In FIGS. 1B and 1C, the triangular points represent negative class data.

As shown in FIG. 1B, because the negative class data is generated at random, the boundary determined by learning lies inappropriately far from the positive class data.

As shown in FIG. 1C, when learning uses appropriate negative class data, the negative class data is generated suitably near the positive class data, so the boundary determined by learning is neither over-fitted to the positive class data nor excessively wide.

In the GAN framework, the generative model produces fake data that increasingly resembles the positive class data as training proceeds. To solve the problem of existing one-class classification techniques, the embodiment can use the GAN framework to generate negative class data that is not over-fitted to the positive class data. Hereinafter, 'negative class data that is not over-fitted to the positive class data' is referred to as fake class data. The embodiment can generate fake class data close to the positive class data and use it to learn the features of the positive class data.

FIG. 2 is a diagram illustrating an image classification system according to an embodiment.

As shown in FIG. 2, the image classification system 1 includes a support model 10 and a single class model (one-class classification model) 20.

The support model 10 generates fake class data 3 from input data 2 to support the learning of the single class model 20. The input data 2 may be a Gaussian noise vector. In one-class classification learning, the single class model 20 can classify the positive class data 4 and the fake class data 3.

The support model 10 generates fake class data 3 for a validation set. The single class model 20 then trains to classify the fake class data 3 and the positive class data 4. The support model 10 obtains the classification result from the single class model 20 and updates the fake class data 3 through training based on that result. The single class model 20 then trains to classify the updated fake class data 3 from the support model 10 and the positive class data 4.

In this way, the single class model 20 classifies the fake class data 3 generated by the support model 10 and the positive class data 4, and the support model 10 updates the fake class data 3 based on the classification result; this operation is repeated for k epochs. The fake class data 3 can thereby become negative data samples whose distribution is close to, but does not coincide with, that of the positive class samples. The validation set produced in this way may include the positive class data 4 and the fake class data 3 generated during the k epochs.
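
The alternation described above can be sketched as a short loop. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the function `alternate_training`, its three callables, and the one-dimensional toy data are all hypothetical, with a simple threshold standing in for the single class model 20 and a shifting Gaussian standing in for the support model 10.

```python
import random


def alternate_training(positives, k, make_fakes, train_classifier, update_generator):
    """Repeat classify-then-update for k epochs and collect a validation set.

    Returns a list of (sample, label) pairs: label 0 for fake, 1 for positive.
    """
    validation_set = []
    for _ in range(k):
        fakes = make_fakes(len(positives))           # support model output
        result = train_classifier(positives, fakes)  # single class model step
        update_generator(result)                     # support model update
        validation_set.extend((x, 0) for x in fakes)
    validation_set.extend((x, 1) for x in positives)
    return validation_set


# Toy stand-ins (assumptions, for illustration only).
rng = random.Random(0)
state = {"fake_mean": 5.0}
positives = [rng.gauss(0.0, 1.0) for _ in range(20)]


def make_fakes(n):
    # Draw fake samples around the generator's current mean.
    return [rng.gauss(state["fake_mean"], 1.0) for _ in range(n)]


def train_classifier(pos, fakes):
    # "Classifier": a threshold halfway between the two sample means.
    return (sum(pos) / len(pos) + sum(fakes) / len(fakes)) / 2.0


def update_generator(threshold):
    # Move the fake mean 20% of the way toward the current boundary.
    state["fake_mean"] += 0.2 * (threshold - state["fake_mean"])


validation_set = alternate_training(
    positives, 3, make_fakes, train_classifier, update_generator
)
```

With k = 3 and 20 positive samples, the collected set holds 60 fake samples and 20 positive samples. In a real GAN setup the two update steps would be gradient steps on the networks of FIGS. 3 and 4 rather than these scalar stand-ins.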

The single class model 20 can be optimized by learning with the validation set. The single class model 20 optimized in this way transforms an image input to the image classification system and classifies whether it belongs to the positive class.

First, the single class model 20 uses the validation set to train on classifying the fake class data 3 and the positive class data 4. Next, the support model 10 obtains the classification result of the single class model 20 and generates fake class data 3. These operations are repeated until the F1 score on the validation set reaches its highest value, and the single class model 20 at the point of the highest F1 value is chosen as the final single class model.
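
The selection rule in the preceding paragraph amounts to keeping the model snapshot with the highest validation F1. The sketch below assumes one snapshot is saved per epoch; `select_best_model`, the snapshot dictionaries, and the F1 values in them are placeholders invented for illustration and are not results from the embodiment.

```python
def select_best_model(snapshots, f1_on_validation):
    """Return (index, snapshot, score) for the snapshot with the highest F1.

    On ties the earliest snapshot is kept.
    """
    best_index, best_snapshot, best_score = None, None, float("-inf")
    for index, snapshot in enumerate(snapshots):
        score = f1_on_validation(snapshot)
        if score > best_score:
            best_index, best_snapshot, best_score = index, snapshot, score
    return best_index, best_snapshot, best_score


# Hypothetical per-epoch snapshots tagged with made-up validation F1 values.
snapshots = [
    {"epoch": 1, "f1": 0.61},
    {"epoch": 2, "f1": 0.70},
    {"epoch": 3, "f1": 0.68},
]
best_index, best_snapshot, best_score = select_best_model(
    snapshots, lambda s: s["f1"]
)
```

Here the second snapshot is selected because its placeholder score is the largest of the three.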

Through this learning process, the single class model 20 learns the features of the positive class data 4 appropriately: the fake class data 3 generated by the support model 10 keeps the single class model 20 from over-fitting the positive class data 4. According to the embodiment, the single class model trained in the GAN framework can separate positive class data from real negative class data well.

FIG. 3 is a diagram illustrating the network architecture of a single class model according to an embodiment.

As shown in FIG. 3, the input image supplied to the single class model 20 is described as having a volume of 32×32×3. Here '32×32' is the width × height of the image in pixels, and '3' is the number of channels, which may correspond to RGB. This input volume is only an example for explanation, and the invention is not limited to it.

As shown in FIG. 3, the convolution kernel, which holds the weights of the convolution layer according to the embodiment, is a 5×5 kernel, and LeakyReLU may be used as the activation function.

First, the first convolution layer applies the convolution kernel (5,5) to the input image of the single class model 20, converting it into a 32×32×64 volume, and the following pooling layer applies a max pooling kernel (2,2). The second convolution layer then applies the convolution kernel (5,5), so that the 32×32×64 volume becomes a 16×16×128 volume, and the following pooling layer applies the max pooling kernel (2,2). The third convolution layer applies the convolution kernel (5,5), so that the 16×16×128 volume becomes an 8×8×256 volume, and the following pooling layer applies the max pooling kernel (2,2). Finally, the last convolution layer applies the convolution kernel (5,5), so that the 8×8×256 volume becomes a 4×4×1 volume, and the pooling layer applies an average pooling kernel (5,5). The fully-connected (FC) layer is connected to all elements of the 4×4×1 volume to which the average pooling kernel has been applied and computes over each element to generate the output (embedding) of the single class model.
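
The volumes listed above can be checked with a small shape trace. The sketch assumes details the text does not state: convolutions that preserve width and height ("same" padding, stride 1) and 2×2 max pooling with stride 2 between consecutive convolutions. The helper names are hypothetical, and the final average pooling and FC layers are left out because their output sizes are not given.

```python
def conv_same(shape, out_channels):
    """Convolution assumed to keep width/height and set the channel count."""
    height, width, _ = shape
    return (height, width, out_channels)


def max_pool(shape, size=2):
    """Max pooling assumed to divide width and height by `size`."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def trace_classifier(input_shape=(32, 32, 3), channels=(64, 128, 256, 1)):
    """Return the volume after each convolution layer of the single class model."""
    shape = input_shape
    conv_outputs = []
    for i, out_channels in enumerate(channels):
        shape = conv_same(shape, out_channels)
        conv_outputs.append(shape)
        if i < len(channels) - 1:
            # A max pooling layer sits between consecutive convolutions.
            shape = max_pool(shape)
    return conv_outputs


classifier_volumes = trace_classifier()
# [(32, 32, 64), (16, 16, 128), (8, 8, 256), (4, 4, 1)]
```

Under these assumptions the trace reproduces the four volumes named in the description, which suggests that each max pooling step halves the width and height while each convolution only changes the channel count.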

FIG. 4 is a diagram illustrating the network architecture of a support model according to an embodiment.

As shown in FIG. 4, the input data 2 supplied to the support model 10 may be a noise vector. The FC layer is connected to all elements of the noise vector and computes over each element to generate an input with a 4×4×256 volume. A convolution layer applies the convolution kernel (5,5) to the 4×4×256 volume, converting it into a 4×4×128 volume. In the next layer an upsampling kernel (2,2) is applied, and a convolution layer then applies the convolution kernel (5,5), converting the input into an 8×8×128 volume. In the next layer the upsampling kernel (2,2) is applied, and a convolution layer applies the convolution kernel (5,5), converting the input into a 16×16×64 volume. In the layer after that the upsampling kernel (2,2) is applied, and a convolution layer applies the convolution kernel (5,5), converting the input into a 32×32×3 volume, which is output. This is the same volume as the input image of the single class model 20. The output volume of the support model 10 may be determined according to the input image volume.
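
The support model's volumes can be traced in the same way. As before, this is a sketch under assumptions not stated in the text: convolutions that preserve width and height, and a (2,2) upsampling step that doubles both. The helper names are hypothetical.

```python
def conv_same(shape, out_channels):
    """Convolution assumed to keep width/height and set the channel count."""
    height, width, _ = shape
    return (height, width, out_channels)


def upsample(shape, factor=2):
    """Upsampling assumed to multiply width and height by `factor`."""
    height, width, channels = shape
    return (height * factor, width * factor, channels)


def trace_generator(start=(4, 4, 256), channels=(128, 128, 64, 3)):
    """Return the volume after each convolution layer of the support model."""
    shape = start  # volume produced by the FC layer from the noise vector
    conv_outputs = []
    for i, out_channels in enumerate(channels):
        if i > 0:
            # Every convolution after the first is preceded by upsampling.
            shape = upsample(shape)
        shape = conv_same(shape, out_channels)
        conv_outputs.append(shape)
    return conv_outputs


generator_volumes = trace_generator()
# [(4, 4, 128), (8, 8, 128), (16, 16, 64), (32, 32, 3)]
```

The last volume, 32×32×3, equals the input volume of the single class model, which is what allows the generated fake class data to be fed to it directly.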

The network architectures of the support model 10 and the single class model 20 according to the embodiment have been described with reference to FIGS. 3 and 4, but the invention is not limited to them. The convolution kernel, max pooling kernel, upsampling kernel, and so on may be modified, and the layers that make up the network architecture may also be modified.

The classification results of the embodiment can be tested on CIFAR-10 and CIFAR-100. For reference, CIFAR-10 consists of 32×32 color images in 10 classes, of which the test for the embodiment used only five (airplane, automobile, bird, automobile [sic], deer). CIFAR-100 consists of 32×32 color images in 100 classes, of which the test for the embodiment used only five (beaver, dolphin, otter, seal, whale). The image data of each selected class is treated as positive class data. In the test for the embodiment, negative class data can be drawn at random from the nine classes of CIFAR-10 or CIFAR-100 that remain after excluding the selected positive class.

For example, with CIFAR-10, 4,500 positive class samples are used as the training set of the single class model 20, 1,000 positive class samples and 1,000 negative class samples are used as the test set, and 500 positive class samples and 500 fake class samples are used as the validation set.

With CIFAR-100, 450 positive class samples are used as the training set of the single class model 20, 100 positive class samples and 100 negative class samples are used as the test set, and 50 positive class samples and 50 fake class samples are used as the validation set.
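
The split sizes in the two preceding paragraphs are consistent with the standard per-class sizes of the datasets (5,000 training and 1,000 test images per class in CIFAR-10; 500 and 100 in CIFAR-100) if one tenth of a class's training images is set aside for validation. That one-tenth share is inferred from the numbers rather than stated in the text, and `split_counts` is a hypothetical helper used only to make the arithmetic explicit.

```python
def split_counts(train_per_class, test_per_class, validation_share=10):
    """Compute split sizes for one positive class.

    Assumes 1/validation_share of the class's training images go to the
    validation set, matched by an equal number of generated fake samples.
    """
    validation_positive = train_per_class // validation_share
    return {
        "train_positive": train_per_class - validation_positive,
        "validation_positive": validation_positive,
        "validation_fake": validation_positive,  # produced by the support model
        "test_positive": test_per_class,
        "test_negative": test_per_class,         # sampled from the other classes
    }


cifar10_split = split_counts(train_per_class=5000, test_per_class=1000)
cifar100_split = split_counts(train_per_class=500, test_per_class=100)
```

Note that real negative class data appears only in the test set; during training and validation the single class model sees positive and fake class data only.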

First, the single class model 20 uses the CIFAR-10 and CIFAR-100 training sets to carry out the process of generating fake class data for validation. To generate this fake class data, the process may be repeated for an appropriate number of epochs (10, 100, or 200). Next, the training sets and validation sets of CIFAR-10 and CIFAR-100 are used in a process of finding the optimal single class model. This search, which monitors F1, may be repeated for up to 100 epochs. The final single class model is then the model trained at the point where F1 on the validation set is highest. F1 is the F1 score, a widely used measure of classification accuracy that is calculated from precision and recall: F1 = 2 × (precision × recall) / (precision + recall).
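
As a worked illustration of the F1 formula, the sketch below computes precision, recall, and F1 from binary labels, with 1 for the positive class and 0 for the negative or fake class. The function name and the example labels are assumptions made for illustration and are not data from the embodiment.

```python
def f1_score(y_true, y_pred):
    """F1 = 2 * (precision * recall) / (precision + recall) for label 1."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        # Precision or recall is zero (or undefined); report F1 as 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * (precision * recall) / (precision + recall)


# Illustrative labels: 2 true positives, 1 false positive, 1 false negative.
example_f1 = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In this example precision and recall are both 2/3, so F1 is also 2/3.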

The classification performance of the GAN-based single class model according to the embodiment can be compared with that of an existing neural-network-based single class model.

Table 1 shows the test results of the single class model using CIFAR-10 and fake class data generated over 10 epochs for validation.

Table 2 shows the test results of the single class model using CIFAR-10 and fake class data generated over 100 epochs for validation.

Table 3 shows the test results of the single class model using CIFAR-10 and fake class data generated over 200 epochs for validation.

Table 4 shows the test results of the conventional single class model using CIFAR-10.

As Tables 1 to 4 show, the F1 value according to the embodiment is higher than that of the conventional single class model. In particular, as shown in Table 3, the test result of the single class model using CIFAR-10 and fake class data generated over 200 epochs for validation is 0.087 higher than the conventional result.

Table 5 shows the test results of the single class model using CIFAR-100 and fake class data generated over 10 epochs for validation.

Table 6 shows the test results of the single class model using CIFAR-100 and fake class data generated over 100 epochs for validation.

Table 7 shows the test results of the single class model using CIFAR-100 and fake class data generated over 200 epochs for validation.

Table 8 shows the test results of the conventional single class model using CIFAR-100.

As Tables 5 to 8 show, the F1 value according to the embodiment is higher than that of the conventional single class model. In particular, as shown in Table 6, the test result of the single class model using CIFAR-100 and fake class data generated over 100 epochs for validation is 0.084 higher than the conventional result.

As a single class model for an image retrieval system, the embodiment uses the fake class data generated by the support model under the GAN framework to reduce generalization error, so that the single class model can learn the features of the positive class accurately.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them; various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

1: Image classification system
10: Support model
20: Single class model

Claims

An image classification system comprising:
a single class model that classifies whether an input image belongs to a positive class of interest; and
a support model that generates fake class data, which is negative class data that is not of interest and is not over-fitted to positive class data belonging to the positive class,
wherein the single class model performs classification learning on the positive class data and the fake class data.

The image classification system of claim 1,
wherein the support model obtains a classification result from the single class model and updates the fake class data through training based on the obtained classification result.

The image classification system of claim 2,
wherein the classification learning of the single class model and the update operation of the support model based on the classification result are repeated for a predetermined k epochs, and
wherein the fake class data is negative data samples whose distribution is close to, but does not coincide with, the distribution of the positive class samples.

The image classification system of claim 3,
wherein the single class model performs the classification learning using a validation set that includes the positive class data and the fake class data generated during the k epochs.

The image classification system of claim 1,
wherein the single class model includes:
a convolution layer that applies a convolution kernel to its input;
a first pooling layer that applies a max pooling kernel to the input whose volume has been converted by the convolution layer;
a second pooling layer that applies an average pooling kernel to its input; and
a fully-connected (FC) layer that is connected to all elements of the input to which the average pooling kernel has been applied and computes over each element to generate the output of the single class model.

The image classification system of claim 1,
wherein the support model includes:
a fully-connected (FC) layer that is connected to all elements of an input noise vector and computes over each element to generate an input of a predetermined volume;
a convolution layer that applies a convolution kernel to its input; and
a layer that applies an upsampling kernel to its input.

The image classification system of claim 1,
wherein the fake data is generated by a generative adversarial net (GAN) framework.

An image classification method comprising:
performing, by a single class model, learning to classify fake class data and positive class data belonging to a positive class of interest;
generating, by a support model, the fake class data, which is negative class data that is not of interest and is not over-fitted to the positive class data; and
classifying whether an input image belongs to the positive class of interest.

The image classification method of claim 8,
wherein the generating of the fake class data comprises:
obtaining, by the support model, the classification result of the classification learning step; and
updating the fake class data through training based on the obtained classification result.

The image classification method of claim 9,
wherein the classification learning step of the single class model and the update step of the support model based on the classification result are repeated for a predetermined k epochs.

The image classification method of claim 10,
wherein the performing of the classification learning uses a validation set that includes the positive class data and the fake class data generated during the k epochs.

The image classification method of claim 8,
wherein the generating of the fake data is performed by a generative adversarial net (GAN) framework.