KR102049331B1

KR102049331B1 - Apparatus and method for classifying images, and apparatus for training images for classification of images

Info

Publication number: KR102049331B1
Application number: KR1020170106244A
Authority: KR
Inventors: 강명균; 이상철; 이인호; 이은혜; 백상엽
Original assignee: 주식회사 인피닉스
Priority date: 2017-08-22
Filing date: 2017-08-22
Publication date: 2019-11-27
Also published as: KR20190021095A

Abstract

The present invention proposes an image classification apparatus and method that train on images using an error function generated from information on the features contained in an image and information on the features contained in a class within the image, and that classify images based on the training result. The apparatus according to the present invention includes: an image learning unit that generates an error function based on information on first features contained in a reference image and information on second features located within a class formed per object in the reference image, and that trains on input images based on the error function; a threshold setting unit that sets a threshold based on the result of training on the images; and an image classification unit that classifies the images based on the threshold.

Description

Apparatus and method for classifying images, and apparatus for training images for classification of images

The present invention relates to an image learning apparatus that trains a neural network for image classification. The present invention also relates to an image classification apparatus and method that classify images based on the result of image learning.

In general, a convolutional neural network (CNN) is trained by minimizing the softmax error. In face recognition, however, a model trained with the softmax error alone struggles, because the features the model produces are densely clustered, which makes the classes hard to tell apart.

Classifying highly similar images such as faces requires a model whose learned features are not merely separable for classification but are also more widely separated, that is, discriminative.

Korean Laid-Open Patent Publication No. 2014-0096595 (published 2014-08-06)

The present invention has been devised to solve the above problem. Its object is to propose an image classification apparatus and method that train on images using an error function generated from information on the features contained in an image and information on the features contained in a class within the image, and that classify images based on the training result.

Another object of the present invention is to propose an image learning apparatus that trains on images using an error function generated from information on the features contained in an image and information on the features contained in a class within the image.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, the present invention proposes an image classification apparatus comprising: an image learning unit that generates an error function based on information on first features contained in a reference image and information on second features located within a class formed per object in the reference image, and that trains on input images based on the error function; a threshold setting unit that sets a threshold based on the result of training on the images; and an image classification unit that classifies the images based on the threshold.

Preferably, the image learning unit trains on the images using a convolutional neural network.

Preferably, the image learning unit generates the error function based on a first error generated from the mean of the first features, a second error generated from the mean of the second features, and a third error related to cross-entropy.

Preferably, the image learning unit calculates the first error based on first difference values between the mean of the first features and each feature contained in the reference image, and on a first weight. More preferably, the image learning unit calculates a first value by applying the L2 norm to the maximum of the first difference values, calculates second values by applying the L2 norm to each first difference value, and calculates the first error based on the results of comparing 0 with third values obtained by adding the first weight to the first value and then subtracting each second value.

Preferably, the image learning unit calculates the second error based on second difference values between the mean of the second features and each feature contained in the reference image, and on the number of classes contained in the reference image. More preferably, the image learning unit calculates second difference values for each class, calculates fourth values for each class by applying the L2 norm to the second difference values calculated for that class, and calculates the second error based on the sum of the fourth values calculated for each class.
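The second error described above (the per-class term, referred to later in the description as the local center loss) can be sketched as follows. This is a minimal illustration and not the patent's formula: it assumes that features are plain lists of floats, that each class center is the mean of that class's features, that squared L2 distances are used, and that the per-class sums are divided by the number of classes. The function names are hypothetical.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_center_loss(features, labels):
    """Sum of squared distances of each feature to its own class center,
    divided by the number of classes (normalization is an assumption)."""
    by_class = defaultdict(list)
    for feature, label in zip(features, labels):
        by_class[label].append(feature)

    total = 0.0
    for members in by_class.values():
        center = mean_vector(members)  # assumed: class center = class mean
        total += sum(squared_l2(f, center) for f in members)
    return total / len(by_class)
```

The loss is zero only when every feature coincides with its class center, so minimizing it pulls the features of one class together.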

Preferably, the image learning unit calculates a fifth value by multiplying the first error by a second weight, calculates a sixth value by multiplying the second error by a third weight, and generates the error function based on the sum of the fifth value, the sixth value, and the third error.
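The weighted combination described above can be written in a few lines. The sketch below assumes the three errors have already been computed as scalars; the function and parameter names are illustrative and do not come from the patent.

```python
def joint_error(first_error, second_error, third_error,
                second_weight, third_weight):
    """Combine the three errors into a single error function value.

    first_error  : error based on the mean of the first features
    second_error : error based on the mean of the second features
    third_error  : cross-entropy related error
    """
    fifth_value = second_weight * first_error   # weighted first error
    sixth_value = third_weight * second_error   # weighted second error
    return fifth_value + sixth_value + third_error
```

Setting both weights to 0 reduces the result to the cross-entropy term alone, which is the softmax-only baseline discussed in the background.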

Preferably, the image classification unit classifies the images based on whether the features contained in the images are less than or equal to the threshold.

Preferably, when the features contained in a comparison target image are determined to be less than or equal to the threshold, the image classification unit classifies the object in the comparison target image as the same person as the object in the reference image; when the features contained in the comparison target image are determined to exceed the threshold, it classifies the object in the comparison target image as a different person from the object in the reference image.
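The threshold decision can be illustrated as follows. The text does not state which quantity is compared with the threshold; this sketch assumes it is the Euclidean distance between the feature vector of the reference image and that of the comparison target image. The function name is hypothetical.

```python
import math


def is_same_person(reference_feature, candidate_feature, threshold):
    """Return True when the candidate is judged to be the same person.

    Assumption: the compared quantity is the Euclidean distance between
    the two feature vectors; at or below the threshold means "same".
    """
    distance = math.dist(reference_feature, candidate_feature)
    return distance <= threshold
```

For example, `is_same_person([0, 0], [3, 4], 5.0)` compares a distance of 5 with a threshold of 5 and therefore reports the same person, since the boundary case counts as "less than or equal to".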

Preferably, the image classification apparatus further includes an image adjustment unit that, when the images are input, edits each image to a predetermined size around the point where the first features are expected to be concentrated, and the image learning unit trains on the edited images.

The present invention also proposes an image classification method comprising: generating an error function based on information on first features contained in a reference image and information on second features located within a class formed per object in the reference image, and training on input images based on the error function; setting a threshold based on the result of training on the images; and classifying the images based on the threshold.

Preferably, the training step trains on the images using a convolutional neural network.

Preferably, the training step generates the error function based on a first error generated from the mean of the first features, a second error generated from the mean of the second features, and a third error related to cross-entropy.

Preferably, the training step calculates the first error based on first difference values between the mean of the first features and each feature contained in the reference image, and on a first weight. More preferably, the training step calculates a first value by applying the L2 norm to the maximum of the first difference values, calculates second values by applying the L2 norm to each first difference value, and calculates the first error based on the results of comparing 0 with third values obtained by adding the first weight to the first value and then subtracting each second value.

Preferably, the training step calculates the second error based on second difference values between the mean of the second features and each feature contained in the reference image, and on the number of classes contained in the reference image. More preferably, the training step calculates second difference values for each class, calculates fourth values for each class by applying the L2 norm to the second difference values calculated for that class, and calculates the second error based on the sum of the fourth values calculated for each class.

Preferably, the training step calculates a fifth value by multiplying the first error by a second weight, calculates a sixth value by multiplying the second error by a third weight, and generates the error function based on the sum of the fifth value, the sixth value, and the third error.

Preferably, the classifying step classifies the images based on whether the features contained in the images are less than or equal to the threshold.

Preferably, when the features contained in a comparison target image are determined to be less than or equal to the threshold, the classifying step classifies the object in the comparison target image as the same person as the object in the reference image; when the features contained in the comparison target image are determined to exceed the threshold, it classifies the object in the comparison target image as a different person from the object in the reference image.

Preferably, the method further includes, before the training step, editing each image, when the images are input, to a predetermined size around the point where the first features are expected to be concentrated, and the training step trains on the edited images.

The present invention also proposes an image learning apparatus that generates an error function based on the mean of first features contained in a reference image and the mean of second features located within a class formed per object in the reference image, and trains on input images based on the error function.

Through the configurations for achieving the above objects, the present invention can obtain the following effects.

First, neural network training makes it possible to classify images of faces as well.

Second, neural network training need not be repeated each time a new image is input, which improves the efficiency of image classification.

FIG. 1 is a conceptual diagram schematically showing the internal configuration of an image classification system according to an embodiment of the present invention.
FIG. 2 is a distribution diagram of deeply learned features.
FIG. 3 is an exemplary view showing the performance of an error function that incorporates the global center loss proposed in the present invention.
FIG. 4 is a reference diagram showing the performance of the error functions proposed in the present invention.
FIG. 5 is a conceptual diagram schematically showing an image classification apparatus according to a preferred embodiment of the present invention.
FIG. 6 is a flowchart schematically showing an image classification method according to a preferred embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In assigning reference numerals to the components of each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of the invention. Although preferred embodiments are described below, the technical idea of the present invention is not limited to them and may be modified and practiced in various ways by those skilled in the art.

FIG. 1 is a conceptual diagram schematically showing the internal configuration of an image classification system according to an embodiment of the present invention.

Referring to FIG. 1, the image classification system 100 includes an input module 110, an adjustment module 120, a learning module 130, a setting module 140, and a classification module 150.

The input module 110 receives and stores the images to be classified.

When images to be classified are input to the input module 110, the adjustment module 120 edits them according to a predetermined criterion. For example, to classify each image efficiently, the adjustment module 120 may edit each image to a predetermined size around the central region where features are expected to be concentrated. In the present invention, the images edited in this way may be used as a mini-batch.
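One simple way to edit an image to a predetermined size around its central region is a center crop. The sketch below is an assumption about what such editing could look like, not the adjustment module's actual procedure; it treats an image as a list of pixel rows and uses a hypothetical function name.

```python
def center_crop(image, size):
    """Return the size x size region centered in a 2-D image.

    image : list of rows, each row a list of pixel values
    size  : side length of the square crop
    """
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image dimensions")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]
```

Cropping every input to the same size also gives all images in a mini-batch the same shape, which batched training typically requires.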

The learning module 130 trains on each image edited by the adjustment module 120 and extracts features from each image. For image learning, the learning module 130 may use, for example, a convolutional neural network (CNN).

In the present invention, each image is trained using an error function to which a global center loss and a local center loss are added, so that features do not cluster in a particular region and images of faces can also be classified. This is described in detail below.

To classify similar images such as faces, the neural network must derive more separable features. For this purpose, the present invention adds a global center loss and a local center loss to the error function.

The global center loss and the local center loss are proposed to solve the problem that, when the features extracted from images are densely clustered, it becomes hard to find a manifold that separates the classes and classification performance degrades. They force the features not to cluster at a given point.

The errors used in image classification training include the softmax cross-entropy error and an error that reduces the variance of each class. The present invention combines these errors with the errors proposed above for image classification training.

The following describes a technique for improving face recognition through discriminative feature learning, based on an algorithm in which the errors proposed in the present invention are added to the error function.

A convolutional neural network is a type of deep learning built from neural networks. An ordinary neural network processes the image data as is, whereas a convolutional neural network extracts features from the image and then processes the image based on those features.

Experience with large and varied image classification tasks through ILSVRC (ImageNet Large-Scale Visual Recognition Challenge) has made it possible to build convolutional neural networks that are deeper than before. As a result, convolutional neural networks can extract even high-dimensional features from images, and image recognition performance has improved.

On the premise that deeper convolutional neural networks have better feature representations than shallower ones, many techniques have been proposed for designing deeper networks. When the objects to be classified differ in nature from ordinary images, however, there are matters to consider in addition to network depth.

The present invention proposes a model in which the features extracted from an image are pulled closer to the center of their class while being pushed away from the center of all features, so that the convolutional neural network derives more separable features and can classify even similar images such as faces. The present invention defines this way of training, which pushes the classes apart, as the global center loss.

In the present invention, the joint center loss is an error function concept that comprises the global center loss and the local center loss.

A model trained with an error function based on the joint center loss also differs in how it classifies images. A conventional model classifies an image by the label produced after training with softmax. A model trained with an error function based on the joint center loss instead compares the distances between the features extracted from the image and the center of each class, and assigns the image to the class whose center is nearest.
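The nearest-center rule described above can be sketched as follows. The sketch assumes that the class centers are already available as a mapping from class label to center vector and that distance is Euclidean; the function name and data layout are illustrative.

```python
import math


def classify_by_nearest_center(feature, class_centers):
    """Return the label whose center is closest to the feature.

    feature       : feature vector extracted from the image
    class_centers : dict mapping class label -> center vector
    """
    return min(class_centers,
               key=lambda label: math.dist(feature, class_centers[label]))
```

With centers `{"a": [0, 0], "b": [10, 10]}`, the feature `[1, 2]` is assigned to `"a"`. No softmax output layer is consulted, so adding a class only requires adding its center to the mapping.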

A convolutional neural network is trained by minimizing the cross-entropy between its softmax output and the ground-truth label. In face recognition, however, a model trained with the softmax error alone struggles, because the features extracted by the model are densely clustered, which makes the classes hard to tell apart.
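For reference, the softmax cross-entropy error for a single sample can be computed as below. This is the standard textbook definition rather than anything specific to the patent; the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (max-shifted for stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, true_index):
    """Negative log-probability assigned to the ground-truth class."""
    return -math.log(softmax(logits)[true_index])
```

This error depends only on whether the correct class receives high probability. It places no direct constraint on how far apart the features of different classes lie, which is the gap the center-based errors are meant to fill.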

Classifying highly similar images such as faces requires a model whose learned features are not merely separable for classification but are also more widely separated, that is, discriminative.

FIG. 2 is a distribution diagram of deeply learned features. FIG. 2(a) shows the distribution of separable features, and FIG. 2(b) shows the distribution of discriminative features.

Training a model while forcing it to have discriminative features yields a high recognition rate in face recognition. A model forced to have discriminative features also gains the following advantages in classification.

First, because the features extracted from the model are better separated than with softmax alone, classification by measuring the Euclidean distance between features becomes more suitable. It also becomes possible to classify by comparing the extracted features with the center of each class and finding the nearest center.

Second, features can be derived from several images and the Euclidean distances between them compared, so that images can be ranked in order of similarity to a given image. Classifying by extracted features rather than by class outputs is a major practical advantage: when images of a new category are added, the model does not have to be retrained to produce results, but classifies by comparison with the features of the newly added category's images.
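Ranking images by similarity, as described above, amounts to sorting by feature distance. The sketch below assumes that features have already been extracted and stored in a dictionary keyed by image name; the names are hypothetical.

```python
import math


def rank_by_similarity(query_feature, gallery):
    """Return gallery image names ordered from most to least similar.

    query_feature : feature vector of the query image
    gallery       : dict mapping image name -> feature vector
    """
    return sorted(gallery,
                  key=lambda name: math.dist(query_feature, gallery[name]))
```

Adding an image of a previously unseen person only adds an entry to `gallery`; the feature extractor itself is left unchanged, which corresponds to the no-retraining advantage noted in the text.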

The present invention forces the model to have better discriminative features by using an error function that minimizes the distance to each class center and an error function that increases the distance from the center of all features. A model trained under this constraint can improve final classification performance by increasing the distance between features.

Extracting discriminative features from a model requires an error function that forces such features to be extracted. As this error function, the present invention proposes one based on the joint center loss, a concept that comprises the global center loss and the local center loss.

The global center loss is described first, followed by the error function based on the joint center loss.

The reason for training toward discriminative features is to keep the features from clustering too densely. When features are densely clustered, either the function becomes high-order in order to fit every value and overfitting occurs, or the function becomes low-order because it cannot fit every value and underfitting occurs. Either way, the classification performance of the model degrades.

Training a model through supervised learning amounts to finding the manifold of each class. In addition to the clustering problem, as the number of classes to be classified grows, finding a suitable manifold becomes harder than with relatively few classes. It is therefore necessary to make a suitable manifold easier to find by forcing features not to be extracted in regions where clustering is likely and a manifold is hard to find.

Since the region most likely to be densely clustered is generally the center of all features, the present invention assigns a higher error the closer each feature is to the center of all features.

The global center loss, which forces the extracted features not to cluster around the center, is given by Equation 1 below.

Here, c_g denotes the center of all features, that is, the mean of all features, which the present invention defines as the global center.

The global center can be calculated by averaging the coordinates of all features. Ideally it would be derived from all training data at every step, but for performance reasons the present invention obtains it from the class centers within a mini-batch.
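One reading of "obtaining the global center from the class centers within a mini-batch" is to average the per-class means. The sketch below follows that reading; it is an interpretation, since the text does not spell out the computation, and the function names are hypothetical.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def global_center(features, labels):
    """Mean of the class centers found in one mini-batch (assumed reading)."""
    by_class = defaultdict(list)
    for feature, label in zip(features, labels):
        by_class[label].append(feature)
    class_centers = [mean_vector(members) for members in by_class.values()]
    return mean_vector(class_centers)
```

Averaging class centers weights every class equally, so the result can differ from the plain mean of all features when the classes in the batch have different sizes.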

Training is the process of performing backpropagation driven by the neural network's error function. Training on all available data at once (full batch), however, is not feasible in practice. A method has therefore been proposed that trains the network on one portion of the data (e.g., 100 samples) and then on the next portion (e.g., 100 samples). Mini-batch refers to this way of training a neural network on only part of the data at a time.
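Splitting a dataset into mini-batches can be sketched with a small generator. The batch size of 100 mentioned in the text is only an example; the function name is illustrative.

```python
def iter_mini_batches(samples, batch_size):
    """Yield consecutive slices of `samples`, each at most `batch_size` long."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]
```

The last batch may be smaller than `batch_size` when the number of samples is not an exact multiple, so per-batch quantities such as the global center are then computed from fewer samples.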

γ denotes the margin. γ is a hyper-parameter: it is not a learned variable but a value chosen by the user.

x_i denotes the i-th feature extracted from an image, and m denotes the number of classes. In the present invention a class is a label used for supervised learning; for example, in a neural network that distinguishes cats from dogs, "dog" and "cat" are the labels.

r denotes the largest distance, that is, the distance between the global center and the feature located farthest from it. r can be obtained from Equation 2 below.

As shown in Equation 2, the present invention finds, within the mini-batch, the difference between the global center and a feature that has the maximum value and takes it as r.
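The search for the largest distance within a mini-batch can be sketched as below. The function name is hypothetical, and the Euclidean (L2) distance is used because the text later applies the L2 norm to these differences.

```python
import numpy as np

def largest_distance(features, c_g):
    """Largest L2 distance between the global center c_g and any feature."""
    return np.linalg.norm(features - c_g, axis=1).max()

features = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
c_g = np.array([0.0, 0.0])
r = largest_distance(features, c_g)  # the feature (3, 4) is the farthest one
```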

Meanwhile, x_i may also denote the input image itself or the entire output of the preceding layer.

max(A, B) is a function that returns the larger of A and B. Equation 1 uses the form max(0, B) so that the result is never negative.

|| || denotes the L2 norm.
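The equation images are not reproduced in this text. Based only on the symbol definitions given here (c_g, γ, x_i, m, r, max, and the L2 norm), Equations 1 and 2 plausibly take the following form. This is a hedged reconstruction, not the published formulas: the factor 1/2 is an assumption chosen so that the gradient comes out as -(x_i - c_g), which is the value the description gives for Equation 3, and the summation bound m is taken from the text as written.

```latex
% Hedged reconstruction of Equation 1 (global center loss); the 1/2 factor is an assumption.
\[
L_g = \frac{1}{2} \sum_{i=1}^{m} \max\left(0,\; \lVert r \rVert^{2} - \lVert x_i - c_g \rVert^{2} + \gamma\right)
\]
% Hedged reconstruction of Equation 2: r is the difference with the largest norm in the mini-batch.
\[
r = x_{i^{*}} - c_g, \qquad i^{*} = \arg\max_{i} \lVert x_i - c_g \rVert
\]
```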

Within the mini-batch, the global center loss L_g applies a penalty that forces the difference between the first squared value (the square of r) and the second squared value (the square of the difference between x_i and c_g) to become larger than γ. Minimizing L_g therefore enforces the margin γ while training the features to move away from the center. In this way L_g keeps the features extracted from the images from clustering around the center.
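A minimal numerical sketch of this loss follows. It assumes the hinge form max(0, ||r||² - ||x_i - c_g||² + γ) quoted in the description, a leading factor of 1/2 (an assumption), and that r is recomputed from the current mini-batch; the function name is hypothetical.

```python
import numpy as np

def global_center_loss(features, c_g, gamma):
    """Hinge-style penalty on features that sit close to the global center."""
    sq_dist = np.sum((features - c_g) ** 2, axis=1)  # ||x_i - c_g||^2
    r_sq = sq_dist.max()                             # ||r||^2 within the mini-batch
    hinge = np.maximum(0.0, r_sq - sq_dist + gamma)  # max(0, .) keeps terms non-negative
    return 0.5 * hinge.sum()

c_g = np.zeros(2)
clustered = np.array([[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]])
spread = np.array([[3.0, 0.0], [0.0, 3.0], [-3.0, 0.0]])
loss_clustered = global_center_loss(clustered, c_g, gamma=1.0)
loss_spread = global_center_loss(spread, c_g, gamma=1.0)
```

In this toy case the batch whose features all lie at the largest distance receives the smaller loss, which matches the stated goal of pushing features away from the center.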

Meanwhile, to avoid the inefficiency of repeating the training over all of the data, the present invention updates the global center c_g, the largest distance r, and so on using Equation 3, Equation 4, and the like.

Equation 3 gives the gradient of L_g with respect to x_i, as follows.

In Equation 3, the result of ∂L_g/∂x_i is either -(x_i - c_g) or 0 because of the term max(0, ||r||² - ||x_i - c_g||² + γ) in Equation 1.

When x_i lies far from c_g, ||x_i - c_g||² can exceed ||r||², in which case ||r||² - ||x_i - c_g||² + γ can fall below 0. The gradient would then take a negative value and training would not proceed properly. Equations 1 and 3 account for this by ensuring that the result is never negative.
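The two cases of the gradient can be sketched as below. The sketch treats c_g and ||r||² as fixed values supplied from outside (an assumption, consistent with them being updated separately), and the function name is hypothetical.

```python
import numpy as np

def global_center_loss_grad(features, c_g, r_sq, gamma):
    """Gradient of L_g w.r.t. each x_i: -(x_i - c_g) where the hinge is active, else 0."""
    diff = features - c_g
    active = (r_sq - np.sum(diff ** 2, axis=1) + gamma) > 0
    return -diff * active[:, None]

features = np.array([[1.0, 0.0],   # close to c_g: hinge active
                     [5.0, 0.0]])  # beyond r: hinge clipped to 0
grad = global_center_loss_grad(features, np.zeros(2), r_sq=9.0, gamma=1.0)
```

A gradient descent step x_i ← x_i - η·grad moves the first feature away from c_g and leaves the second one unchanged.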

Equation 4 gives the update equation for c_g, as follows.

FIG. 3 is an exemplary view showing the performance of the error function that incorporates the global center loss proposed in the present invention.

In an image of a face, the features are generally concentrated in the center, as shown in FIG. 3(a). When discriminative deep learning is performed on such an image by applying the center expansion technique, which is based on the error function incorporating the global center loss, the image can be transformed so that the distribution of its features is no longer concentrated in the center, as shown in FIG. 3(b).

Next, the error function based on the joint center loss is described.

The simplest way to extract discriminative features from images is to train so that the variance within each class becomes small. As the intra-class variance shrinks, the extracted features become less likely to overlap with those of other classes, which in effect yields discriminative features.

To minimize the intra-class variance, the present invention uses an error function that minimizes the distance between the center of each class and the features belonging to that class. This error function is defined by Equation 5 below.

Here, c_yi denotes the center of class y, that is, the mean of the features of class y. Ideally c_yi would be derived at every training step by comparison against all of the training data, but because that is inefficient, the present invention computes the center of each class within the mini-batch.
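The intra-class term can be sketched as follows. Equation 5 itself is not reproduced in this text, so the squared-distance form with a factor of 1/2 is an assumption (it is the usual form of a center loss), and the function name is hypothetical.

```python
import numpy as np

def center_loss(features, labels, centers):
    """Half the summed squared distance between each feature and its own class center."""
    diff = features - centers[labels]  # x_i - c_{y_i}
    return 0.5 * np.sum(diff ** 2)

features = np.array([[1.0, 1.0], [3.0, 1.0], [0.0, 4.0]])
labels = np.array([0, 0, 1])
centers = np.array([[2.0, 1.0], [0.0, 3.0]])  # c_0 and c_1
loss_c = center_loss(features, labels, centers)
```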

Because of the nature of neural network training, these values are updated incrementally so that they come to fit the entire data set. c_yi and c_g are initialized to random values or to 0 and are then continually updated toward suitable values (values that fit all of the data).

L_g has been described above, so a detailed description is omitted here.

As with the global center loss, for the joint center loss the global center c_g, the largest distance r, and so on are updated using Equation 6, Equation 7, and the like, in order to avoid the inefficiency of repeating the training over all of the data.

Equation 6 gives the gradient of L_c with respect to x_i, as follows.

Equation 7 gives the update equation for c_j, as follows.

Here, δ(y_i = j) is a Dirac function that can yield only 0 or 1: δ is 1 when the condition inside the parentheses is true and 0 when it is false. This ensures that the center of a class is tied only to the features of that same class.

c_j denotes the center of each class. Δc_j corrects the c_j of each class using the differences between the features extracted from the images and the center of the class to which those features belong. When c_j is actually updated with Δc_j, the hyper-parameter α is used to balance the update.

The hyper-parameter α is set so that the class centers are not changed too easily by the training data; it is a value for controlling the imbalance that arises during training. In the present invention α may take a value greater than 0 and may be varied by the optimizer used for training.
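One way to realize such a class-center update is sketched below. Equation 7 is not reproduced in this text, so the normalization by (1 + number of class members) and the rule c_j ← c_j - α·Δc_j are assumptions borrowed from the commonly used center-loss update; the function name is hypothetical.

```python
import numpy as np

def update_centers(features, labels, centers, alpha):
    """Move each class center toward the mini-batch features of its own class."""
    new_centers = centers.copy()
    for j in range(len(centers)):
        mask = labels == j  # delta(y_i = j): only features of class j contribute
        delta = (centers[j] - features[mask]).sum(axis=0) / (1 + mask.sum())
        new_centers[j] = centers[j] - alpha * delta
    return new_centers

features = np.array([[2.0, 0.0], [4.0, 0.0], [0.0, 6.0]])
labels = np.array([0, 0, 1])
centers = np.zeros((3, 2))  # centers initialized to 0; class 2 is absent from the batch
updated = update_centers(features, labels, centers, alpha=0.5)
```

A class with no samples in the mini-batch keeps its center unchanged, and a smaller α makes every center move more cautiously.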

However, the model cannot be trained using L_c and L_g alone.

The final second error function is the sum of L_c, L_g, and the softmax cross-entropy error, as given by Equation 8 below.

Here, W and b denote the parameters to be learned. The superscript T on W denotes the transpose taken for the matrix product, and the subscript j or yi indicates which learned parameter is meant. Specifically, j is the column index for the feature, and yi indicates the class to which the i-th sample belongs.
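Equation 8 is not reproduced in this text. From the description (a sum of L_s, L_c, and L_g, weighted by λ_c and λ_g, with learned parameters W and b and n classes), it plausibly has the form below. This is a hedged reconstruction: the 1/2 factor in L_c and the unspecified summation range over i are assumptions.

```latex
% Hedged reconstruction of Equation 8 (second error function).
\[
L = L_s + \lambda_c L_c + \lambda_g L_g
\]
\[
L_s = -\sum_{i} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}},
\qquad
L_c = \frac{1}{2} \sum_{i} \lVert x_i - c_{y_i} \rVert^{2}
\]
```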

The expression for L_s is the one generally used when training neural networks; it is called the softmax loss, cross-entropy, or negative likelihood. e denotes the exponential.

When computing the error function based on the joint center loss, the present invention can use the gradient descent technique commonly employed in training.

The error due to L_c decreases exponentially as a feature approaches c_yi, and the error due to L_g decreases exponentially as a feature moves away from c_g. This mitigates features ending up too close to the center of the features or abnormally far from it. In other words, with Equation 8 the error grows as a feature being learned moves away from the center of its class, which makes the feature discriminative, while the error also grows for features near c_g, the region where classes are most likely to touch, which forces training to avoid crowding between different classes.

λ_c and λ_g are hyper-parameters: they are not learned variables but values chosen by the user. In the present invention λ_c and λ_g may take values greater than 0 and are used to adjust the influence of the two error functions. As λ_c and λ_g increase, the distribution of the features moves closer to the center of their own class; when they are 0, the result is the same as that of a model trained with softmax alone.
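The weighting by λ_c and λ_g can be illustrated with the sketch below, which combines a softmax cross-entropy term with the two center-based terms. All function names are hypothetical, and the 1/2 factors and the use of sums rather than means are assumptions, since the equations are not reproduced in this text.

```python
import numpy as np

def softmax_loss(x, y, W, b):
    """Softmax cross-entropy L_s; column j of W plays the role of W_j."""
    z = x @ W + b
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(y)), y].sum()

def center_loss(x, y, centers):
    """Intra-class term L_c."""
    return 0.5 * np.sum((x - centers[y]) ** 2)

def global_center_loss(x, c_g, gamma):
    """Global center term L_g."""
    sq = np.sum((x - c_g) ** 2, axis=1)
    return 0.5 * np.maximum(0.0, sq.max() - sq + gamma).sum()

def joint_loss(x, y, W, b, centers, c_g, gamma, lam_c, lam_g):
    """L = L_s + lam_c * L_c + lam_g * L_g."""
    return (softmax_loss(x, y, W, b)
            + lam_c * center_loss(x, y, centers)
            + lam_g * global_center_loss(x, c_g, gamma))

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 2))
y = np.array([0, 0, 1, 1, 2, 2])
W, b = rng.normal(size=(2, 3)), np.zeros(3)
centers = np.stack([x[y == j].mean(axis=0) for j in range(3)])
c_g = centers.mean(axis=0)
plain = joint_loss(x, y, W, b, centers, c_g, 1.0, lam_c=0.0, lam_g=0.0)
full = joint_loss(x, y, W, b, centers, c_g, 1.0, lam_c=0.1, lam_g=0.1)
```

With both weights at 0 the value reduces to the softmax term alone, in line with the statement that the result then matches a softmax-only model.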

As described in Equation 8, the present invention adds two terms to the error function, the difference between c_g and a feature and the difference between c_yi and a feature, so that each feature is trained to reduce the variance within its own class while moving away from the center of all features.

Meanwhile, n denotes the number of classes.

FIG. 4 is a reference diagram showing the performance of the error functions proposed in the present invention. In FIG. 4(a) to (d), each point is a feature produced by the trained neural network, and points of different colors are features belonging to different classes.

FIG. 4(a) shows the distribution of the features extracted as a result of training with the error function that takes the global center loss into account, that is, the first error function. The first error function here reflects the softmax cross-entropy error and the global center loss.

As shown in FIG. 4(a), with the first error function proposed in the present invention the features extracted from the images can be distributed so that they are not concentrated in the center, which makes it possible to classify face images as well.

FIG. 4(b) to (d) show the distributions of the features extracted as a result of training with the error function that takes the joint center loss into account, that is, the second error function. The second error function here reflects the softmax cross-entropy error, the global center loss, and the joint center loss.

Specifically, FIG. 4(b) shows the distribution obtained with the ratio of λ_c to λ_g set to 1:1, FIG. 4(c) the distribution obtained with the ratio set to 0.1:1, and FIG. 4(d) the distribution obtained with the ratio set to 0.01:1.

With the second error function proposed in the present invention, as with the first error function, the features extracted from the images can be distributed so that they are not concentrated in the center, which makes it possible to classify face images as well.

The description continues with reference to FIG. 1.

Once the learning module 130 has finished training the neural network, the setting module 140 sets a threshold value based on the features extracted from each image.

For example, the setting module 140 may set the threshold value in the following order.

First, the setting module 140 derives features by applying a test dataset to the trained neural network.

Next, the setting module 140 provisionally sets an arbitrary value as the threshold.

The setting module 140 then measures the face recognition accuracy. This step may be repeated several times.

Finally, the setting module 140 determines the most suitable threshold through the repeated face recognition measurements.
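The steps above can be sketched as a simple sweep over candidate thresholds. The function name, the use of pairwise distances with same-person labels as the test data, and the evenly spaced candidate grid are all assumptions made for illustration; the text only says that a provisional threshold is tried and the accuracy is measured repeatedly.

```python
import numpy as np

def choose_threshold(distances, same_person, candidates):
    """Return the candidate threshold with the highest verification accuracy."""
    best_d, best_acc = None, -1.0
    for d in candidates:                      # provisional threshold
        predicted_same = distances <= d       # classify each test pair
        acc = np.mean(predicted_same == same_person)
        if acc > best_acc:                    # keep the best threshold seen so far
            best_d, best_acc = d, acc
    return best_d, best_acc

# Toy test pairs: feature distances and whether each pair shows the same person.
distances = np.array([0.2, 0.4, 0.55, 0.9, 1.1, 1.3])
same_person = np.array([True, True, True, False, False, False])
d, acc = choose_threshold(distances, same_person, np.linspace(0.0, 1.5, 16))
```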

The classification module 150 classifies each image based on the threshold set by the setting module 140. For example, the classification module 150 may classify each image in the following order.

First, the classification module 150 determines whether the feature of an image is less than or equal to the threshold. Expressed as a formula, this is:

D(x_i, x_j) ≤ d

Here, D(x_i, x_j) denotes the feature extracted from the images (a value computed from x_i and x_j), and d denotes the threshold value.

If the feature of the image is determined to be less than or equal to the threshold, the classification module 150 classifies the image as showing the same person as the image associated with the threshold. If instead the feature is determined to exceed the threshold, the classification module 150 classifies the image as not showing the same person (that is, as showing a different person).
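The decision rule D(x_i, x_j) ≤ d can be sketched as below. Taking D as the Euclidean distance between two feature vectors is an assumption (the text does not define D further), and the function name is hypothetical.

```python
import numpy as np

def is_same_person(x_i, x_j, d):
    """Classify two feature vectors as the same person when D(x_i, x_j) <= d."""
    return bool(np.linalg.norm(x_i - x_j) <= d)

reference = np.array([0.0, 0.0])
candidate = np.array([3.0, 4.0])  # Euclidean distance 5 from the reference
same = is_same_person(reference, candidate, d=5.0)
different = is_same_person(reference, candidate, d=4.9)
```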

One embodiment of the present invention has been described above with reference to FIGS. 1 to 4. A preferred form of the present invention that can be inferred from this embodiment is described below.

FIG. 5 is a conceptual diagram schematically illustrating an image classification apparatus according to a preferred embodiment of the present invention.

According to FIG. 5, the image classification apparatus 200 includes an image learning unit 210, a threshold setting unit 220, an image classification unit 230, a power supply unit 240, and a main control unit 250.

The power supply unit 240 supplies power to each component of the image classification apparatus 200. The main control unit 250 controls the overall operation of each component of the image classification apparatus 200. The image classification apparatus 200 may be provided in an image processing apparatus in order to classify images. Because such an image processing apparatus also has a power supply unit and a main control unit, the power supply unit 240 and the main control unit 250 may be omitted in this embodiment.

The image learning unit 210 generates an error function based on information about first features included in a reference image and information about second features located within a class formed per object in the reference image. The image learning unit 210 also trains on the input images based on the generated error function. The image learning unit 210 corresponds to the learning module 130 of FIG. 1.

The image learning unit 210 may use a convolutional neural network when training on the images.

The image learning unit 210 may generate the error function based on a first error generated from the average value of the first features, a second error generated from the average value of the second features, and a third error related to cross-entropy. In detail, the image learning unit 210 may compute a fifth value by multiplying the first error by a second weight, compute a sixth value by multiplying the second error by a third weight, and generate the error function from the sum of the fifth value, the sixth value, and the third error. Here the first error corresponds to L_g, the second error to L_c, and the third error to L_s.

When computing the first error, the image learning unit 210 may compute it based on first difference values between the average value of the first features and each feature included in the reference image, and on a first weight. In detail, the image learning unit 210 may compute a first value by applying the L2 norm to the maximum of the first difference values, compute second values by applying the L2 norm to each first difference value, and compute the first error from the results of comparing 0 with the third values obtained by adding the first weight to the first value and subtracting each second value.

When computing the second error, the image learning unit 210 may compute it based on second difference values between the average value of the second features and each feature included in the reference image, and on the number of classes included in the reference image. In detail, the image learning unit 210 may compute the second difference values for each class, compute fourth values for each class by applying the L2 norm to the second difference values of that class, and compute the second error from the sum of the fourth values over the classes.

The threshold setting unit 220 sets a threshold value based on the result obtained by training on the images through the image learning unit 210. The threshold setting unit 220 corresponds to the setting module 140 of FIG. 1.

The image classification unit 230 classifies the images based on the threshold set by the threshold setting unit 220. The image classification unit 230 corresponds to the classification module 150 of FIG. 1.

The image classification unit 230 may classify the images based on whether the features included in the images are less than or equal to the threshold. In detail, if the features included in a comparison target image are determined to be less than or equal to the threshold, the image classification unit 230 classifies the object in the comparison target image as the same person as the object in the reference image. If instead the features included in the comparison target image are determined to exceed the threshold, the image classification unit 230 classifies the object in the comparison target image as a different person from the object in the reference image.

The image classification apparatus 200 may further include an image adjusting unit 260.

When images are input, the image adjusting unit 260 edits each image so that it has a predetermined size relative to the point where the first features are expected to be concentrated. In this case the image learning unit 210 may train on the images edited by the image adjusting unit 260. The image adjusting unit 260 corresponds to the adjusting module 120 of FIG. 1.
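Editing an image to a fixed size around a chosen point can be sketched as a square crop. The function name, the square window, and the clamping at the image border are assumptions; how the point of concentrated features is predicted is not covered here.

```python
import numpy as np

def crop_around(image, center_row, center_col, size):
    """Crop a size x size window centered near a point, clamped to the image bounds."""
    h, w = image.shape[:2]
    if size > h or size > w:
        raise ValueError("crop size exceeds image dimensions")
    top = min(max(center_row - size // 2, 0), h - size)
    left = min(max(center_col - size // 2, 0), w - size)
    return image[top:top + size, left:left + size]

image = np.arange(100).reshape(10, 10)       # stand-in for a 10 x 10 grayscale image
centered = crop_around(image, 5, 5, size=4)  # window fully inside the image
clamped = crop_around(image, 0, 9, size=4)   # window shifted to stay inside the image
```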

Meanwhile, in this embodiment the image learning unit 210 may also be configured independently of the image classification apparatus 200 and implemented as an image learning apparatus.

Next, a method of operating the image classification apparatus 200 is described.

FIG. 6 is a flowchart schematically illustrating an image classification method according to a preferred embodiment of the present invention.

First, the image learning unit 210 generates an error function based on information about first features included in a reference image and information about second features located within a class formed per object in the reference image, and trains on the input images based on this error function (S310).

Next, the threshold setting unit 220 sets a threshold value based on the result obtained by training on the images (S320).

The image classification unit 230 then classifies the images based on the threshold value (S330).

Meanwhile, before step S310, when images are input the image adjusting unit 260 may edit each image so that it has a predetermined size relative to the point where the first features are expected to be concentrated.

Although all of the components constituting the embodiments of the present invention described above are described as being combined into one or as operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the object of the present invention, one or more of the components may be selectively combined and operated. In addition, although each of the components may be implemented as a single independent piece of hardware, some or all of the components may be selectively combined and implemented as a computer program having program modules that perform some or all of the combined functions on one or more pieces of hardware. Such a computer program may be stored on a computer-readable medium such as a USB memory, a CD, or a flash memory, and may be read and executed by a computer to implement the embodiments of the present invention. The recording medium of the computer program may include a magnetic recording medium, an optical recording medium, and the like.

In addition, all terms, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs, unless defined otherwise in the detailed description. Commonly used terms, such as those defined in dictionaries, should be interpreted as consistent with their meaning in the context of the related art and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in the present invention.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention belongs will be able to make various modifications, changes, and substitutions without departing from the essential characteristics of the present invention. Accordingly, the embodiments disclosed herein and the accompanying drawings are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments or drawings. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of the rights of the present invention.

Claims

An image classification apparatus comprising:
an image learning unit that generates an error function based on information about first features included in a reference image and information about second features located in the corresponding class among classes formed per object in the reference image, and that learns input images based on the error function;
a threshold setting unit that sets a threshold value based on a result obtained by learning the input images; and
an image classification unit that classifies the input images based on the threshold value,
wherein the image learning unit calculates the error function based on a first error calculated from an average value of the first features, a second error calculated from an average value of the second features, and a third error associated with cross-entropy, so as to increase the distance between the classes, and
wherein the image learning unit learns the input images by training, with the error function, a model associated with the features included in the reference image, based on a predetermined number of classes to be classified and on the distance from the average value of the first features to the feature that is farthest from that average value among the first features included in the reference image.

The image classification apparatus of claim 1,
wherein the image learning unit learns the input images using a convolutional neural network.

delete

The image classification apparatus of claim 1,
wherein the image learning unit calculates the first error based on first difference values between the average value of the first features and each feature included in the reference image, and on a first weight.

The image classification apparatus of claim 4,
wherein the image learning unit calculates a first value by applying the Euclidean norm to a maximum value selected from the first difference values, calculates second values by applying the Euclidean norm to each first difference value, and calculates the first error based on results obtained by comparing 0 with third values obtained by adding the first weight to the first value and subtracting each of the second values.

The image classification apparatus of claim 1,
wherein the image learner calculates the second error based on the number of classes included in the reference image and on second difference values between the average value of the second features and each feature included in the reference image.

The image classification apparatus of claim 6,
wherein the image learner calculates the second difference values for each class, calculates fourth values for each class by applying the Euclidean norm to the second difference values calculated for that class, and calculates the second error based on a result obtained by summing the fourth values calculated for each class.
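
As a rough illustration of the per-class computation described here, the following Python sketch sums, for each class, the Euclidean norms of the differences between that class's average and its features, and then adds the per-class sums. Dividing by the number of classes is an assumption: the claims only state that the second error is "based on" that number. The function name and the dictionary layout are hypothetical.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def second_error(features_by_class: Dict[str, Sequence[Vector]]) -> float:
    """Sum of per-class norm totals, divided by the class count (assumed)."""
    per_class_totals = []
    for features in features_by_class.values():
        count = len(features)
        dim = len(features[0])
        # Average of the features that fall inside this class.
        mean = [sum(f[d] for f in features) / count for d in range(dim)]
        # 'Fourth values': Euclidean norm of each (feature - class average).
        norms = [
            math.sqrt(sum((x - m) ** 2 for x, m in zip(f, mean)))
            for f in features
        ]
        per_class_totals.append(sum(norms))
    # Normalizing by the number of classes is an assumption of this sketch.
    return sum(per_class_totals) / len(features_by_class)
```

A small second error means the features of each class lie close to that class's average.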

The image classification apparatus of claim 1,
wherein the image learner calculates a fifth value by multiplying the first error by a second weight, calculates a sixth value by multiplying the second error by a third weight, and generates the error function based on a result obtained by summing the fifth value, the sixth value, and the third error.
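
The weighted combination in this claim can be written as a short Python sketch. The function and parameter names are hypothetical, and the three error terms are assumed to have already been computed as scalars.

```python
def total_error(
    first_error: float,
    second_error: float,
    third_error: float,
    second_weight: float,
    third_weight: float,
) -> float:
    """Combine the three error terms as described in the claim."""
    fifth_value = second_weight * first_error   # weighted first error
    sixth_value = third_weight * second_error   # weighted second error
    # The third (cross-entropy related) error enters without a weight.
    return fifth_value + sixth_value + third_error
```

For example, `total_error(2.0, 4.0, 1.0, 0.5, 0.25)` adds 1.0, 1.0 and 1.0.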

The image classification apparatus of claim 1,
wherein the image classifier classifies the input images based on whether the features included in the input images are less than or equal to the threshold value.

The image classification apparatus of claim 9,
wherein the image classifier classifies an object included in a comparison target image as the same person as an object included in the reference image when the features included in the comparison target image are determined to be less than or equal to the threshold value, and classifies the object included in the comparison target image as a different person from the object included in the reference image when the features included in the comparison target image are determined to exceed the threshold value.
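
The decision rule can be sketched in Python as follows. The claim does not say how the features are reduced to a single number for comparison with the threshold; this sketch assumes the Euclidean distance between the feature vectors of the reference image and the comparison target image. The function name is hypothetical.

```python
import math
from typing import Sequence


def is_same_person(
    reference_features: Sequence[float],
    comparison_features: Sequence[float],
    threshold: float,
) -> bool:
    """True when the comparison image is classified as the same person."""
    # Assumed score: Euclidean distance between the two feature vectors.
    distance = math.dist(reference_features, comparison_features)
    # At or below the threshold: same person; above it: different person.
    return distance <= threshold
```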

The image classification apparatus of claim 1, further comprising:
an image adjusting unit that, when the images are input, edits each input image to have a predetermined size based on a point where the first features are expected to be concentrated,
wherein the image learner trains on the edited input images.
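
One way to picture this editing step is a fixed-size crop centered on the point where the features are expected to be concentrated. The claim says only that the image is edited to a predetermined size, so cropping (rather than, say, resizing) is an assumption of this Python sketch, as are the function name and the pixel-coordinate convention.

```python
from typing import Tuple


def crop_box(
    image_width: int,
    image_height: int,
    center_x: int,
    center_y: int,
    crop_width: int,
    crop_height: int,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a fixed-size crop.

    The box is centered on (center_x, center_y) where possible and
    shifted inward when it would otherwise leave the image.
    """
    if crop_width > image_width or crop_height > image_height:
        raise ValueError("crop size must not exceed the image size")
    left = min(max(center_x - crop_width // 2, 0), image_width - crop_width)
    top = min(max(center_y - crop_height // 2, 0), image_height - crop_height)
    return left, top, left + crop_width, top + crop_height
```

Every returned box has exactly the requested size, so all edited images share the same dimensions before training.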

An image classification method comprising:
generating an error function based on information about first features included in a reference image and information about second features located in a corresponding class among classes formed in units of objects in the reference image, and training on input images based on the error function;
setting a threshold value based on a result obtained by training on the input images; and
classifying the input images based on the threshold value,
wherein the training on the input images comprises calculating the error function based on a first error calculated from an average value of the first features, a second error calculated from an average value of the second features, and a third error related to cross-entropy for increasing the distance between the classes, and training on the input images using the error function, based on a predetermined number of classes to be classified and on a distance from the average value of the first features to the feature that is farthest from that average value among the first features included in the reference image, so that a learning model associated with the features included in the reference image has an identifying feature.

(Deleted)

The method of claim 12,
wherein the training comprises calculating the first error based on a first weight and on first difference values between the average value of the first features and each feature included in the reference image.

The method of claim 12,
wherein the training comprises calculating the second error based on the number of classes included in the reference image and on second difference values between the average value of the second features and each feature included in the reference image.

The method of claim 12,
wherein the classifying comprises classifying the input images based on whether the features included in the input images are less than or equal to the threshold value.

The method of claim 12, further comprising:
editing, when the images are input, each input image to have a predetermined size based on a point where the first features are expected to be concentrated,
wherein the training comprises training on the edited input images.

An image learning apparatus that generates an error function based on an average value of first features included in a reference image and an average value of second features located in a corresponding class among classes formed in units of objects in the reference image, and that trains on input images based on the error function,
wherein the apparatus calculates the error function based on a first error calculated from the average value of the first features, a second error calculated from the average value of the second features, and a third error related to cross-entropy for increasing the distance between the classes, and trains on the input images using the error function, based on a predetermined number of classes to be classified and on a distance from the average value of the first features to the feature that is farthest from that average value among the first features included in the reference image, so that a learning model associated with the features included in the reference image has an identifying feature.