KR102138657B1 - Apparatus and method for robust face recognition via hierarchical collaborative representation

Info

Publication number: KR102138657B1
Application number: KR1020180042456A
Authority: KR
Inventors: 이상웅; 보둑미
Original assignee: 가천대학교 산학협력단
Priority date: 2018-04-12
Filing date: 2018-04-12
Publication date: 2020-07-28
Also published as: KR20190123372A

Abstract

The present invention relates to an apparatus and method for robust face recognition via hierarchical collaborative representation-based classification. The hierarchical collaborative representation classification considers, in the collaborative subspace of the training data, both the Euclidean distance between a face image to be recognized and the projection vector of that face image, and the Euclidean distance from the projection vector to the training vectors of the training data, so that face recognition can be performed accurately without being affected by noise or illumination effects.

Description

Apparatus and method for robust face recognition via hierarchical collaborative representation-based classification {APPARATUS AND METHOD FOR ROBUST FACE RECOGNITION VIA HIERARCHICAL COLLABORATIVE REPRESENTATION}

The present invention relates to an apparatus and method for robust face recognition via hierarchical collaborative representation-based classification. More specifically, it relates to an apparatus and method that perform face recognition accurately, without being affected by noise or illumination effects, through a hierarchical collaborative representation classification that considers, in the collaborative subspace of the training data, both the Euclidean distance between a face image to be recognized and the projection vector of that face image, and the Euclidean distance from the projection vector to the training vectors of the training data.

With recent industrial development and rapid advances in security technology, biometric technology based on the human body has become increasingly sophisticated, and biometric techniques that identify a user by face, iris, fingerprint, veins, and the like are replacing conventional methods based on keys or passcodes.

In particular, unlike other biometric techniques based on the iris, fingerprint, or veins, which require the user to perform a specific action, face recognition allows identity to be verified naturally and without contact. Because of this and other advantages, such as low installation and maintenance costs, considerable effort and research are under way to commercialize face recognition technology.

Face recognition technology is being applied in a variety of fields such as security systems and mobile robots. It offers convenience to users by having the system automatically perform difficult tasks that used to require a great deal of human effort, such as security management for apartments, airports, and agencies.

Because conventional face recognition technology is built on machine learning, a large number of training faces is needed for accurate recognition. In practice, however, collecting a large number of training faces is very difficult, so the accuracy of conventional face recognition technology is in fact very low.

To address this problem, sparse representation-based classification (SRC) and collaborative representation-based classification (CRC) techniques have recently been developed.

SRC can represent a facial feature vector as a linear combination of the training vectors of the entire data set, so it achieves high recognition accuracy even with a small number of training faces. However, because it processes the data set as a whole, its computational cost is too high for practical use.

CRC divides the training faces into a plurality of classes and performs face recognition by computing the Euclidean distance between the test face and the approximator of the test face in the collaborative subspace of the training faces. Because CRC relies on the result of minimizing the Euclidean distance between the test face and the approximator, its recognition accuracy drops markedly when each class contains only a few training faces.

Accordingly, the present invention proposes a hierarchical collaborative representation classifier that is combined with a feature extraction model for extracting facial features from face images. The classifier learns the extracted facial features and performs a two-stage recognition process: it first minimizes the Euclidean distance from the test face to the projection vector of the test face in the collaborative subspace of the training faces, and then considers the Euclidean distance between the projection vector and the training vectors. The invention thereby aims to provide an apparatus and method for robust face recognition via hierarchical collaborative representation-based classification that can recognize a user accurately and in real time despite differing poses, expressions, and illumination changes of the user's face.

Next, the prior art in the technical field of the present invention is briefly described, followed by the technical matters by which the present invention differs from that prior art.

First, the non-patent document on a novel kernel collaborative representation method for image classification (2014 IEEE International Conference on Image Processing (ICIP), 2013, pp. 4241-4245) maps nonlinear data into a high-dimensional feature space (kernel space) so that the training data become separable; the new features in the kernel space are then learned by CRC to perform face recognition.

In addition, the non-patent document on multi-scale patch collaborative representation for face recognition through margin distribution optimization (ECCV'12, Springer-Verlag, Berlin, Heidelberg) uses information from different scales of a face image. At each scale, the test image is classified by combining the outputs of its overlapping patches, which allows the face in the test image to be recognized.

Because these prior art approaches are based on CRC, their recognition accuracy drops markedly when the number of training faces is small.

Moreover, the prior art basically discusses only methods of recognizing faces based on CRC. It neither discloses nor suggests the means of the present invention for recognizing a user quickly and accurately through a hierarchical collaborative representation classifier comprising a first process, which minimizes the Euclidean distance from the test face to the projection vector of the test face in the collaborative subspace of the training faces, and a second process, which minimizes the Euclidean distance between the projection vector and the training vectors.

The present invention was devised to solve the above problems. Its object is to provide an apparatus and method for robust face recognition via a hierarchical collaborative representation-based classifier that learns the facial features of training faces through machine learning and performs face recognition in real time by minimizing the Euclidean distance between a test face and the projection vector of that test face, and then minimizing the Euclidean distance from the projection vector to the training faces.

Another object of the present invention is to provide an apparatus and method for robust face recognition in which the hierarchical collaborative representation classifier is combined with a DCNN model or an LTP model for facial feature extraction, so that face recognition can be performed more accurately and quickly regardless of random noise in the face image or changes in illumination.

That is, the present invention selectively applies a DCNN model and an LTP model to the hierarchical collaborative representation classifier so that a user's identity can be verified accurately and in real time despite differing poses, expressions, and illumination changes of the same user's face.

A face recognition apparatus via hierarchical collaborative representation-based classification according to an embodiment of the present invention comprises: a facial feature extraction learning model generation unit that learns a plurality of training data consisting of face images to generate a facial feature extraction learning model for extracting facial features; a first-stage classification learning model generation unit that learns the facial features extracted by the facial feature extraction learning model to generate a first-stage classification learning model for classifying at least one candidate class for a facial feature according to the Euclidean distance between the facial feature and a projection vector of the facial feature; and a second-stage classification learning model generation unit that learns the at least one candidate class classified by the first-stage classification learning model to generate a second-stage classification learning model that reclassifies the candidate classes according to the Euclidean distance between the projection vector and the candidate classes.

The face recognition apparatus further comprises: a facial feature extraction unit that applies a specific face image to the generated facial feature extraction learning model to extract facial features; a first-stage classifier that applies the facial features extracted by the facial feature extraction unit to the first-stage classification learning model to classify at least one candidate class for those facial features; and a second-stage classifier that applies the facial features extracted by the facial feature extraction unit and the at least one candidate class classified by the first-stage classifier to the second-stage classification learning model to reclassify the at least one candidate class, wherein the training data are classified in order to perform face recognition on the specific face image through the hierarchical collaborative representation of the first-stage and second-stage classifiers.

The face recognition apparatus further comprises a face recognition unit that compares the Euclidean distances between the projection vector of the facial features extracted by the facial feature extraction unit and the candidate classes reclassified by the second-stage classifier, and selects the candidate class with the smallest Euclidean distance, thereby performing face recognition on the specific face image.

The facial feature extraction learning model generation unit is configured with either a deep convolutional neural network (DCNN) model or a local ternary patterns (LTP) model.

The DCNN model includes a plurality of convolution layers, a plurality of maxout layers each connected to a respective convolution layer, a plurality of pooling layers, and a softmax layer, and extracts facial features by transforming the unique features of each training datum into a common set.
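The maxout layers mentioned above can be illustrated with a minimal sketch. It assumes the common channel-wise form of maxout, in which consecutive groups of feature maps are reduced to their element-wise maximum; the text does not specify the grouping, so the function name `maxout` and the parameter `group_size` are hypothetical.

```python
def maxout(channels, group_size=2):
    """Element-wise maximum over consecutive groups of feature maps.

    channels: list of equally sized feature maps, each a flat list of numbers.
    group_size: number of maps merged into one output map (an assumption;
    the source does not state how many maps each maxout unit combines).
    """
    if len(channels) % group_size != 0:
        raise ValueError("number of channels must be a multiple of group_size")
    out = []
    for start in range(0, len(channels), group_size):
        group = channels[start:start + group_size]
        # zip(*group) walks the maps position by position
        out.append([max(values) for values in zip(*group)])
    return out


# Four 2-element feature maps are reduced to two maps.
print(maxout([[1, 5], [3, 2], [0, 0], [-1, 4]]))  # [[3, 5], [0, 4]]
```

A maxout unit of this kind halves the number of channels for `group_size=2` while keeping the strongest response at each position.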

The LTP model divides each training image into a plurality of blocks, collects the LTP codes of each block into a histogram, and concatenates the histograms into a combined feature histogram consisting of multiple bins, thereby extracting facial features.
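The block-wise LTP feature described above might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it uses the standard 8-neighbour local ternary pattern split into an "upper" and a "lower" binary code, and the threshold `t`, the block size, the 256-bin histograms, and all function names are hypothetical choices.

```python
# Clockwise 8-neighbourhood, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def ltp_codes(img, t=5):
    """Return (upper, lower) LTP codes for every interior pixel of a 2-D list."""
    upper, lower = [], []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            centre = img[r][c]
            up = lo = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                p = img[r + dr][c + dc]
                if p >= centre + t:      # ternary value +1
                    up |= 1 << bit
                elif p <= centre - t:    # ternary value -1
                    lo |= 1 << bit
                # otherwise ternary value 0: no bit set in either code
            upper.append(up)
            lower.append(lo)
    return upper, lower


def histogram(codes, bins=256):
    """Count how often each code value occurs."""
    h = [0] * bins
    for code in codes:
        h[code] += 1
    return h


def ltp_feature(img, block=4, t=5):
    """Concatenate per-block upper and lower LTP histograms into one vector."""
    feature = []
    for r0 in range(0, len(img) - block + 1, block):
        for c0 in range(0, len(img[0]) - block + 1, block):
            patch = [row[c0:c0 + block] for row in img[r0:r0 + block]]
            up, lo = ltp_codes(patch, t)
            feature += histogram(up) + histogram(lo)
    return feature
```

For an 8x8 image and `block=4`, `ltp_feature` returns a vector of 4 x 512 bins; real face images would use larger blocks, and border pixels of each block are skipped here for brevity.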

A face recognition method via hierarchical collaborative representation-based classification according to an embodiment of the present invention comprises: generating, by a facial feature extraction learning model generation unit, a facial feature extraction learning model for extracting facial features by learning a plurality of training data consisting of face images; generating, by a first-stage classification learning model generation unit, a first-stage classification learning model for classifying at least one candidate class for a facial feature according to the Euclidean distance between the facial feature and a projection vector of the facial feature, by learning the facial features extracted through the facial feature extraction learning model; and generating, by a second-stage classification learning model generation unit, a second-stage classification learning model that reclassifies the candidate classes according to the Euclidean distance between the projection vector and the candidate classes, by learning the at least one candidate class classified through the first-stage classification learning model.

The face recognition method further comprises: extracting, by a facial feature extraction unit, facial features by applying a face image to the generated facial feature extraction learning model; classifying, by a first-stage classifier, at least one candidate class for the facial features by applying the facial features extracted by the facial feature extraction unit to the first-stage classification learning model; and reclassifying, by a second-stage classifier, the at least one candidate class by applying the facial features extracted by the facial feature extraction unit and the at least one candidate class classified by the first-stage classifier to the second-stage classification learning model, wherein the training data are classified in order to perform face recognition on the face image through the hierarchical collaborative representation of the first-stage and second-stage classifiers.

The face recognition method further comprises performing, by a face recognition unit, face recognition on the face image by selecting, from among the candidate classes reclassified by the second-stage classifier, the candidate class with the smallest Euclidean distance to the projection vector of the facial features extracted by the facial feature extraction unit.

As described above, the apparatus and method for robust face recognition via hierarchical collaborative representation-based classification of the present invention have the effect of markedly improving face recognition speed through a hierarchical collaborative representation-based classifier that considers, in the collaborative subspace of the training faces, the Euclidean distance from the test face to the projection vector of the test face and the Euclidean distance from the projection vector to the training faces.

In addition, by selectively combining the hierarchical collaborative representation-based classifier with a DCNN model or an LTP model for extracting feature points from face images, the invention is insensitive to random noise and can recognize faces quickly and accurately even under uncontrolled illumination.

FIG. 1 is a conceptual diagram schematically illustrating an apparatus and method for robust face recognition via hierarchical collaborative representation-based classification according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a problem that may arise from a shortage of training faces according to an embodiment of the present invention.
FIG. 3 is a diagram comparing two typical positions of the projection vector according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of a face recognition apparatus via hierarchical collaborative representation-based classification according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a process in which the hierarchical collaborative representation classifier is combined with a DCNN model to perform face recognition according to an embodiment of the present invention.
FIG. 6 is a diagram showing the structure of a DCNN model according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a process in which the hierarchical collaborative representation classifier is combined with an LTP model to perform face recognition according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an LTP model according to an embodiment of the present invention.
FIG. 9 is a diagram comparing the performance of the hierarchical collaborative representation classifier with that of other facial feature learning models according to an embodiment of the present invention.
FIG. 10 is a diagram comparing recognition rates for face images with random noise on the AR data set according to an embodiment of the present invention.
FIG. 11 is a diagram comparing recognition rates for face images with random occlusion on the AR data set according to an embodiment of the present invention.
FIG. 12 is a diagram comparing face recognition performance on the Extended Yale B data set according to an embodiment of the present invention.
FIG. 13 is a diagram comparing recognition rates for face images with random noise on the Extended Yale B data set according to an embodiment of the present invention.
FIG. 14 is a diagram comparing face recognition rates on the LFW-a data set according to an embodiment of the present invention.
FIG. 15 is a diagram illustrating recognition performance for face images with random noise on the LFW-a data set according to an embodiment of the present invention.
FIG. 16 is a diagram comparing recognition rates for face images with random occlusion on the LFW-a data set according to an embodiment of the present invention.
FIG. 17 is a flowchart illustrating a face recognition procedure according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the apparatus and method for robust face recognition via hierarchical collaborative representation-based classification of the present invention are described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote the same members. Specific structural and functional descriptions of the embodiments are given only for the purpose of describing embodiments according to the present invention. Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted in an idealized or overly formal sense unless expressly so defined in this specification.

FIG. 1 is a conceptual diagram schematically illustrating an apparatus and method for robust face recognition via hierarchical collaborative representation-based classification according to an embodiment of the present invention.

As shown in FIG. 1, a face recognition apparatus via hierarchical collaborative representation-based classification according to an embodiment of the present invention (hereinafter, the face recognition apparatus) 100 machine-learns the facial features of the training data, which consist of face images stored in the training data database 310, and classifies the training faces, so that the face of a specific person can be recognized quickly and accurately.

The classification is performed in two stages. The first-stage classification minimizes the Euclidean distance between the test face image (that is, the face image to be recognized) and the approximator of that test face in the collaborative subspace of the training data, and the second-stage classification minimizes the Euclidean distance from the approximator to the training data of each class. In this way, the face recognition apparatus 100 can markedly improve face recognition accuracy. The first-stage and second-stage classification are described in detail with reference to FIG. 4.
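The two-stage idea can be sketched in a few lines. The following is only an illustration under several assumptions that do not come from the text: the approximator is computed by ridge-regularised least squares over all training vectors, stage 1 ranks classes by the residual between the test vector and each class's share of the approximator, and stage 2 scores each surviving candidate by the distance from the approximator to its nearest training vector. The names `solve`, `hcrc_sketch`, `lam`, and `keep` are hypothetical, and the exact objective used by the invention is the one described with reference to FIG. 4.

```python
from math import sqrt


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dist(u, v):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def hcrc_sketch(train, labels, y, lam=0.01, keep=2):
    """Return (predicted class, stage-1 candidates, approximator of y)."""
    n, d = len(train), len(y)
    # Stage 1: ridge solution alpha of (X^T X + lam I) alpha = X^T y.
    gram = [[sum(p * q for p, q in zip(train[i], train[j]))
             + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(t, y)) for t in train]
    alpha = solve(gram, rhs)

    def part(members):
        # Linear combination of the selected training vectors.
        return [sum(alpha[i] * train[i][k] for i in members) for k in range(d)]

    approximator = part(range(n))
    by_class = {}
    for i, c in enumerate(labels):
        by_class.setdefault(c, []).append(i)
    residual = {c: dist(y, part(idx)) for c, idx in by_class.items()}
    candidates = sorted(by_class, key=residual.get)[:keep]
    # Stage 2: distance from the approximator to each candidate's training data.
    score = {c: min(dist(approximator, train[i]) for i in by_class[c])
             for c in candidates}
    return min(candidates, key=score.get), candidates, approximator


train = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0.1, 0.9, 0], [0, 0, 1]]
labels = ["a", "a", "b", "b", "c"]
predicted, candidates, _ = hcrc_sketch(train, labels, [0.95, 0.05, 0.02])
print(predicted)  # a
```

In this toy run the test vector lies close to the two class "a" vectors, so class "a" has the smallest stage-1 residual and also the nearest training vector in stage 2.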

That is, unlike conventional CRC, which relies solely on minimizing the Euclidean distance between the face image to be recognized (hereinafter, the test face image) and the approximator of that test face image, the face recognition apparatus 100 of the present invention considers both the Euclidean distance between the test face image and the approximator and the Euclidean distance between the approximator and the training data, and can thereby markedly improve the accuracy and speed of face recognition.

The face recognition apparatus 100 also stores the learning model generated as a result of the learning in the learning model database 320.

When a face image for face recognition is input from the user terminal 200, the face recognition apparatus 100 applies the input face image to the stored learning model and thereby verifies the identity of the face image. Here, the user terminal 200 refers to a wireless communication terminal carried by the user, such as a smartphone, PDA, or notebook PC.

When performing face recognition, the user terminal 200 may, as described above, transmit a captured face image to the face recognition apparatus 100 over a network and receive the recognition result from the face recognition apparatus 100. Alternatively, it may download the learning model from the face recognition apparatus 100 and perform face recognition on the user terminal 200 itself, in which case the user terminal 200 serves as the face recognition device.

The face recognition apparatus 100 may also work in conjunction with a security system: it receives face images captured by at least one camera 400, verifies the identity of each face image, and may transmit the verification result to the user terminal 200 that manages the security system. The face recognition apparatus 100 may be linked to the security system over a network, or may be integrated with the security system to perform face recognition locally.

In addition, the face recognition apparatus 100 may be applied to a mobile robot and configured so that face recognition is performed through the mobile robot. That is, the face recognition apparatus 100 may be implemented in various fields that require face recognition in order to confirm a user's identity.

In addition, the database 300 comprises a training data database 310 that stores the training data used for learning, and a learning model database 320 that stores the learning models generated by the face recognition apparatus 100.

Meanwhile, the facial features are extracted from the face images that make up the training data, using a discriminative feature extraction model. The feature extraction model includes a DCNN (deep convolutional neural network) model, which is a machine learning technique, or an LTP (local ternary patterns) model.

That is, the face recognition apparatus 100 is combined with a feature extraction model, performs machine learning on the facial features that the model extracts from the training faces, and classifies the training faces through that machine learning, so that face recognition can be performed accurately and quickly despite differing poses, expressions, noise, and lighting effects in the training faces.

FIG. 2 is a diagram illustrating a problem that can arise from a shortage of training faces, according to an embodiment of the present invention.

As shown in FIG. 2, X = [X₁, X₂, X₃, ..., X_K] denotes the set of K classes of identities, where X_i is the subset belonging to the i-th class. The number of columns of X_i in the data matrix equals the number of training vectors of the i-th class. Learning in the face recognition apparatus 100 also requires the label set L_x of the images in the data matrix X. In addition, the face recognition apparatus 100 finds, according to [Equation 1], a new representation of an arbitrary face feature vector y such that y can be effectively represented by all training vectors in the entire data set.

[Equation 1]

Here, y denotes the face feature vector and α denotes the representation vector. The ideal solution for the vector α can be found by solving the minimization problem of the l₂-norm algorithm. However, the l₂-norm algorithm may fail because the problem is NP-hard (non-deterministic polynomial-time hard) or because it converges very slowly toward a solution.

The general strategy of the conventional CRC technique is to find the closest approximation of y that falls within the face subspace (that is, the collaborative subspace) Ω spanned by the training data. This approximation can be expressed linearly by the face subspace Ω. In other words, the approximating vector is the projection vector of y lying within the subspace Ω.
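The relationship between the test vector y, its representation vector α, and its projection onto Ω can be illustrated numerically. The snippet below is a minimal sketch that assumes the regularized least-squares form commonly used for CRC; the function name `crc_project`, the parameter `tau`, and the toy data are hypothetical and are not taken from the patent. It also computes the distances from the projection to the individual training vectors.

```python
import numpy as np

def crc_project(X, y, tau=1e-2):
    """Return (alpha, y_hat) for a regularized least-squares fit of y by the columns of X.

    X   : (d, n) matrix whose columns are training feature vectors (assumed layout)
    y   : (d,) test feature vector
    tau : ridge regularization strength (hypothetical default)
    """
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + tau * np.eye(n), X.T @ y)
    return alpha, X @ alpha  # y_hat lies in the span of the training vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))   # toy data: 6 training vectors of dimension 50
y = rng.normal(size=50)        # toy test vector

alpha, y_hat = crc_project(X, y)

# Distance between the test vector and its projection.
dist_to_projection = np.linalg.norm(y - y_hat)

# Distances from the projection to each individual training vector.
dist_to_training = np.linalg.norm(X - y_hat[:, None], axis=0)
```

With a small `tau`, `y_hat` is close to the orthogonal projection of `y` onto the column space of `X`; a larger `tau` shrinks the coefficients and pulls `y_hat` toward the origin.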

In most cases, CRC achieves high face recognition accuracy because the test face is represented by an overcomplete training subspace Ω and the projection vector lies entirely within that subspace. As noted above, however, conventional CRC suffers a marked drop in face recognition accuracy in the various biometric systems that use face recognition, owing to the diversity of training samples.

The test face image y usually belongs to a high-dimensional face space that would need a very large variety of training faces to cover. The vector y therefore easily falls outside the collaborative subspace, and its projection vector may lie near the boundary of the collaborative subspace Ω. In this case, the face recognition apparatus 100 of the present invention can compute the Euclidean distance from the projection vector to the training vectors. As shown in FIG. 2, assuming that the test face belongs to class X₁, the projection vector of the test face image falls near the boundary of the collaborative subspace Ω rather than inside it, because the training data are insufficient to represent the test face image completely. Most of the training data of classes X₁ and X₂ are consequently far from the projection vector. In this situation, the conventional CRC technique cannot predict the identity of the test face. The reason is that conventional CRC focuses on minimizing the Euclidean distance between the test face image y and its projection vector in the collaborative subspace Ω, while the Euclidean distance from that projection vector to the training vectors is not taken into account.

Next, two typical positions of the projection vector are compared with reference to FIG. 3.

FIG. 3(a) shows the projection vector falling near the boundary of the collaborative subspace Ω, and FIG. 3(b) shows the projection vector falling near the center of the collaborative subspace Ω.

The position of the projection vector shown in FIG. 3(a) is expressed by the representation vector α₁, and the position of the projection vector shown in FIG. 3(b) is expressed by the representation vector α₂, each with its own squared l₂-norm. The result shows that the squared l₂-norm of α₁ is much larger than that of α₂, and that α₂ is more reliable than α₁ for recognizing a person's face. It also follows that if the squared l₂-norm can be minimized, the projection vector falls closer to the center of the collaborative subspace Ω, and the test face can then be recognized more accurately.

To solve the above problem, the present invention uses a hierarchical collaborative representation-based classifier comprising a first-stage classifier and a second-stage classifier. This classifier not only minimizes the Euclidean distance between the test face image y and the projection vector of that test face, but also minimizes the Euclidean distance from the projection vector to the training classes, so that the projection vector falls as close as possible to the center of the collaborative subspace Ω and of class i. Face recognition accuracy is thereby markedly improved even when only a small number of training faces is available.

In theory, the solution for the representation vector α can be obtained in the present invention from an extended formulation of the l₁-norm minimization problem, given by [Equation 2].

[Equation 2]

Here, w_i (i = 1, ..., K) denote the regularization weights, and ε₁ and ε₂ are small constants. The solution of [Equation 2] is a vector that is not sufficient to classify the test vector y in the collaborative subspace Ω. This optimization problem is subject to two types of constraints: a main constraint and additional constraints (i = 1, ..., K). The main constraint optimizes the vector α within an effective approximation bound ε₁ in order to account for small dense noise in y. With the additional constraints, the present invention proposes minimizing the Euclidean distance from the coding vector Xα to the coding vectors of the training faces of each class X_i within the collaborative subspace Ω.

Overall, the additional constraints (i = 1, ..., K) are approximated within the bound ε₂ using regularization weights w_i that are selected automatically according to prior knowledge and the particulars of each specific problem. The selection of these parameters is described next.

Although a solution to [Equation 2] can be found with an l₁-norm minimization algorithm, that algorithm converges very slowly. The present invention therefore replaces it entirely with an l₂-norm minimization algorithm, which improves face recognition performance and is more robust than the l₁-norm minimization algorithm. The l₂-norm approach has far lower complexity than l₁-norm minimization while delivering nearly the same accuracy.

As a result, the representation of y by X can be formulated as [Equation 3].

[Equation 3]

Here, τ is the regularization parameter. The original collaborative representation is the special case of the present invention in which w_i = 0 for i = 1, ..., K. Choosing a better set of regularization weights w_i is in fact important for bringing the projection vector near the center of the collaborative subspace Ω and close to the class to which the face belongs.
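The role of the weights w_i can be sketched in code. The equation itself is not reproduced in this text, so the snippet below assumes one plausible objective consistent with the description: a squared l₂ reconstruction term, a τ-weighted squared l₂ penalty on α, and, for each class i, a w_i-weighted squared distance between the full reconstruction Xα and the class-specific reconstruction X_i α_i. The function name `weighted_cr`, the argument layout, and the toy values are hypothetical. Under this assumed form the minimizer has a closed-form solution, and setting every w_i to zero recovers the ordinary regularized collaborative representation.

```python
import numpy as np

def weighted_cr(X, labels, y, w, tau=1e-2):
    """Closed-form minimizer of an ASSUMED objective (not quoted from the patent):

        ||y - X a||^2 + tau * ||a||^2 + sum_i w[i] * ||X a - X_i a_i||^2

    X      : (d, n) training matrix, one column per training vector
    labels : (n,) class label of each column
    w      : dict mapping class label -> regularization weight w_i
    """
    n = X.shape[1]
    A = X.T @ X + tau * np.eye(n)
    for i, w_i in w.items():
        # Zeroing the class-i columns gives X_rest with X_rest @ a == X a - X_i a_i.
        X_rest = X * (labels != i)
        A += w_i * (X_rest.T @ X_rest)
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 9))            # toy data: 3 classes, 3 vectors each
labels = np.repeat(np.arange(3), 3)
y = rng.normal(size=40)

# All weights zero: reduces to plain regularized collaborative representation.
alpha_plain = weighted_cr(X, labels, y, w={0: 0.0, 1: 0.0, 2: 0.0})

# Non-zero weights (arbitrary illustrative values) change the solution.
alpha_weighted = weighted_cr(X, labels, y, w={0: 0.5, 1: 2.0, 2: 2.0})
```

Because every added term is a positive semidefinite quadratic form in α, the system matrix stays positive definite for any non-negative weights, so the linear solve is well posed.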

For this reason, the present invention proposes a hierarchical collaborative representation-based classifier comprising a first-stage classifier and a second-stage classifier. That is, the classifier performs face recognition in two stages: the regularization weights w_i are set to 0 in the first-stage classifier and are updated through the second-stage classifier.

The hierarchical collaborative representation-based classifier is described in detail with reference to FIG. 4.

FIG. 4 is a block diagram showing the configuration of a face recognition apparatus using hierarchical collaborative representation-based classification according to an embodiment of the present invention.

As shown in FIG. 4, the face recognition apparatus 100 using hierarchical collaborative representation-based classification according to an embodiment of the present invention comprises: a face image collection unit 110 that collects face images for recognition; a learning model generation unit 120 for facial feature extraction, which learns a plurality of training data composed of face images and generates a learning model for extracting facial features; a first-stage classification learning model generation unit 130, which learns the facial features extracted through the facial feature extraction learning model in order to classify at least one candidate class for each facial feature; a second-stage classification learning model generation unit 140, which learns the at least one candidate class classified by the first-stage classification learning model generation unit and reclassifies the candidate classes for each facial feature; a facial feature extraction unit 150, which extracts facial features by applying a specific face image, collected through the face image collection unit 110 for face recognition, to the facial feature extraction learning model; a hierarchical collaborative representation-based classifier 160, which classifies the training data for face recognition using the generated first-stage and second-stage classification learning models; a face recognition unit 170, which performs face recognition on the specific face image collected through the face image collection unit 110 and provides the recognition result to the user; and a control unit 180 for overall control of the face recognition apparatus 100.

In addition, the face image collection unit 110 collects the face images whose faces are to be recognized (that is, whose identities are to be confirmed) by the face recognition apparatus 100.

The face images may be collected from at least one user terminal 200 or from at least one camera 400, such as a CCTV camera. Because the purpose of the face image collection unit 110 is simply to gather face images for face recognition, images may also be collected through websites in addition to the user terminal 200 and the camera 400. That is, the present invention places no limitation on the method of collecting face images.

In addition, the face image collection unit 110 may periodically collect the face images on which the training data are based, so that the learning models generated by the learning model generation unit 120 for facial feature extraction, the first-stage classification learning model generation unit 130, and the second-stage classification learning model generation unit 140 can be updated. That is, the face image collection unit 110 periodically collects face images from the user terminal 200 or from an organization that provides face images, and reflects them in the training data database 310 so that each learning model can be updated.

In addition, the learning model generation unit 120 for facial feature extraction generates the learning model used to extract facial features from face images. The facial feature extraction unit 150 extracts facial features either from the training data to be learned by the hierarchical collaborative representation-based classifier 160 or from a specific face image collected through the face image collection unit 110 for face recognition.

Meanwhile, the learning model generation unit 120 for facial feature extraction may be configured as a DCNN (deep convolutional neural network) model or an LTP (local ternary patterns) model. The DCNN model and the LTP model are described in detail with reference to FIG. 5 and FIG. 6, respectively.

In addition, the first-stage classification learning model generation unit 130 learns the facial features extracted through the facial feature extraction learning model in order to classify at least one candidate class for each facial feature. This classification is performed according to the Euclidean distance between each facial feature and the projection vector of that facial feature.

In addition, the second-stage classification learning model generation unit 140 learns the at least one candidate class classified by the first-stage classification learning model generation unit 130, and reclassifies those candidate classes according to the Euclidean distance between the projection vector of each facial feature and the candidate classes classified for that facial feature.

In addition, the hierarchical collaborative representation-based classifier 160 comprises a first-stage classifier 161 and a second-stage classifier 162. That is, the classifier 160 classifies the training faces through the hierarchical collaboration of the first-stage classifier 161 and the second-stage classifier 162, so that face recognition can be performed quickly and accurately.

The first-stage classifier 161 applies the facial features extracted by the facial feature extraction unit 150 to the first-stage classification learning model and classifies at least one candidate class for those facial features.

The second-stage classifier 162 applies the facial features extracted by the facial feature extraction unit 150, together with the at least one candidate class that the first-stage classifier 161 classified for those features, to the second-stage classification learning model and reclassifies the candidate classes, so that the face recognition unit 170 can recognize the face corresponding to those features accurately and quickly.

As described with reference to FIG. 3, the first-stage classifier 161 sets the regularization weights w_i to 0. The first-stage classifier 161 then becomes the original CRC (collaborative representation-based classifier), and thereby gains two advantages of CRC.

First, because the test face y is very unlikely to belong to classes whose normalized residual exceeds a threshold θ, the original CRC can be used to quickly filter out most such classes. Second, the first-stage classifier 161 provides all Euclidean distances from the projection vector to the training classes, which are used to update the weights w_i through the second-stage classifier 162.

In practice, the first-stage classifier is a powerful multi-class classifier that selects the few candidate classes to which the test vector y most likely belongs. The second-stage classifier then selects the most suitable candidate among them. This strategy reduces computational complexity and achieves higher recognition accuracy. The threshold θ is chosen based on the ratio of normalized residuals. Specifically, the normalized residual of class i is computed by [Equation 4].

[Equation 4]

The expression uses the coefficient vector of class i. Class i and its training samples are excluded through the second-stage classifier 162 if [Equation 5] is satisfied.

[Equation 5]

Here, r₀ denotes the minimum normalized residual. Because the normalized residual of class i indicates how close the test face y is to class i, the regularization weights are set in proportion to the normalized residuals r_i (i = 1, ..., K). [Equation 3] can accordingly be modified into [Equation 6], and the highest recognition accuracy can be obtained by computing η according to [Equation 7].

[Equation 6]

[Equation 7]

Here, r₀ is the minimum normalized residual. The regularization factor η balances the parameters of [Equation 6], including τ and r_i (i = 1, ..., K). As a result, the first-stage classifier 161 collects a new set of K' classes of identities, X' = [X'₁, X'₂, X'₃, ..., X'_K'], where X'_i is the subset belonging to the i-th class. This small set of K' classes is classified by the second-stage classifier 162. The set of weights w_i established by the first-stage classifier 161 is used to improve the performance of the hierarchical collaborative representation-based classifier 160.

That is, the first-stage classifier 161 uses the first-stage classification learning model generated by the first-stage classification learning model generation unit 130 to select a small number of candidate classes (that is, at least one candidate class) from the training data in order to recognize the face in a specific face image.
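The first-stage filtering can be sketched as follows. Since the equations for the normalized residual and the exclusion rule are not reproduced in this text, the snippet makes two assumptions: the normalized residual of class i is the class-specific reconstruction error divided by the l₂-norm of that class's coefficients (the form commonly used in CRC), and a class is kept when the ratio of its residual to the minimum residual r₀ does not exceed θ. The function name `first_stage`, the default `theta`, and the toy data are hypothetical.

```python
import numpy as np

def first_stage(X, labels, y, theta=2.0, tau=1e-2):
    """Plain CRC (all w_i = 0) followed by residual-ratio filtering (assumed rule).

    Returns (candidates, residuals): the retained class labels and a dict
    mapping each class label to its normalized residual.
    """
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + tau * np.eye(n), X.T @ y)

    residuals = {}
    for i in np.unique(labels):
        mask = labels == i
        error = np.linalg.norm(y - X[:, mask] @ alpha[mask])
        residuals[int(i)] = error / np.linalg.norm(alpha[mask])

    r0 = min(residuals.values())                     # minimum normalized residual
    candidates = [i for i, r in residuals.items() if r / r0 <= theta]
    return candidates, residuals

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 12))                        # toy data: 3 classes, 4 vectors each
labels = np.repeat(np.arange(3), 4)
# Toy test vector built from class 0 plus a little noise.
y = X[:, labels == 0].mean(axis=1) + 0.01 * rng.normal(size=60)

candidates, residuals = first_stage(X, labels, y)
```

The class with the smallest residual always survives the filter because its ratio is exactly 1, so the candidate list is never empty for θ ≥ 1.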

In addition, the second-stage classifier 162 applies an enhanced collaborative representation method that encodes the test vector y over the training set X' using regularized least squares, according to [Equation 8].

[Equation 8]

As in CRC, the solution of [Equation 8] is derived analytically as [Equation 9].

[Equation 9]

The second-stage classifier 162 then computes the normalized residual of each class by [Equation 10].

[Equation 10]

The expression again uses the coefficient vector of class i. The identity of y is determined by finding the minimum normalized reconstruction error, according to [Equation 11].

[Equation 11]

That is, in order to recognize the face in a specific face image, the second-stage classifier 162 reclassifies the at least one candidate class selected by the first-stage classifier 161 to find the most suitable candidate class.
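A minimal sketch of the second stage is given below. It assumes the same weighted regularized least-squares objective as a plausible reading of the description (reconstruction error, a τ-weighted penalty on the coefficients, and per-class terms weighted by w_i), restricted to the candidate classes, followed by a decision on the smallest normalized residual. The function name `second_stage`, the weight values, and the toy data are hypothetical; in the patent the weights are derived from the first-stage residuals and a factor η, whose formula is not reproduced here.

```python
import numpy as np

def second_stage(X, labels, y, candidates, weights, tau=1e-2):
    """Re-encode y over the candidate classes only and pick the best class.

    candidates : list of class labels kept by the first stage
    weights    : dict class label -> w_i (values here are illustrative assumptions)
    Returns (predicted_label, residuals).
    """
    keep = np.isin(labels, candidates)
    Xc, lc = X[:, keep], labels[keep]          # reduced training set X'

    A = Xc.T @ Xc + tau * np.eye(Xc.shape[1])
    for i in candidates:
        X_rest = Xc * (lc != i)                # X_rest @ a == Xc a - X_i a_i
        A += weights[i] * (X_rest.T @ X_rest)
    alpha = np.linalg.solve(A, Xc.T @ y)       # closed-form regularized least squares

    residuals = {}
    for i in candidates:
        mask = lc == i
        error = np.linalg.norm(y - Xc[:, mask] @ alpha[mask])
        residuals[i] = error / np.linalg.norm(alpha[mask])

    return min(residuals, key=residuals.get), residuals

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 12))                  # toy data: 3 classes, 4 vectors each
labels = np.repeat(np.arange(3), 4)
y = X[:, labels == 0].mean(axis=1) + 0.01 * rng.normal(size=60)

# Suppose the first stage kept classes 0 and 1 (hypothetical outcome).
predicted, residuals = second_stage(X, labels, y, candidates=[0, 1],
                                    weights={0: 0.1, 1: 0.1})
```

Solving only over the candidate columns keeps the linear system small, which is the computational benefit the two-stage design aims for.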

In addition, the face recognition unit 170 recognizes a face on the basis of the face image collected through the face image collection unit 110.

That is, when a specific face image for face recognition is received through the face image collection unit 110, the face recognition unit 170 controls the facial feature extraction unit 150, the first-stage classifier 161, and the second-stage classifier 162 so that the facial features of the specific face image are extracted, at least one candidate class is classified for those features, and the candidate classes are reclassified.

The face recognition unit 170 then performs face recognition for the specific face by selecting, from the reclassified candidate classes, the candidate class with the smallest Euclidean distance to the projection vector of the facial features extracted by the facial feature extraction unit 150.

The face recognition unit 170 also provides the recognition result to the user terminal 200 that requested face recognition.

Meanwhile, the extraction model that extracts the facial features according to an embodiment of the present invention may be a DCNN model or an LTP model, and either model is combined with the hierarchical collaborative representation-based classifier 160 so that test faces can be recognized quickly and accurately. The DCNN model and the LTP model are described in detail with reference to FIGS. 5 and 6 and FIGS. 7 and 8, respectively.

In addition, the control unit 180 performs overall control of the face recognition apparatus 100, including the operation of each component of the face recognition apparatus 100 and the movement of data among them.

FIG. 5 shows the process by which the hierarchical collaborative representation classifier according to an embodiment of the present invention performs face recognition in combination with a DCNN model, and FIG. 6 shows the structure of the DCNN model according to an embodiment of the present invention.

As shown in FIG. 5, the hierarchical collaborative representation classifier 160 according to an embodiment of the present invention may be implemented to perform face recognition on a specific face image based on the facial features extracted by the DCNN model.

즉, 얼굴인식 장치(100)의 얼굴특징 추출용 학습모델 생성부(120)는 DCNN 모델로 구성될 수 있으며, 상기 DCNN 모델은 학습데이터를 구성하는 얼굴 이미지를 학습하여 상기 얼굴이미지로터 얼굴특징을 추출하기 위한 얼굴특징 추출용 학습모델을 생성한다. 즉, 상기 DCNN 모델은 상기 얼굴특징 추출용 학습모델 학습모델을 생성하여, 학습데이터에 대한 차별적인 얼굴특징을 공통의 세트로 변환하는 기능을 수행하는 것이다.That is, the learning model generator 120 for extracting facial features of the face recognition apparatus 100 may be configured as a DCNN model, and the DCNN model learns the facial images constituting the training data to acquire the facial features from the facial image rotor. Create a learning model for extracting facial features for extraction. That is, the DCNN model performs a function of generating a learning model learning model for extracting the facial features, and converting the differential facial features for the learning data into a common set.

In addition, as described with reference to FIG. 4, the hierarchical collaborative representation-based classifier 160 performs first-stage and second-stage classification to recognize the face in a specific face image. It does so using a first-stage classification learning model, generated by learning the facial features extracted by the DCNN model, and a second-stage classification learning model, generated by learning the output of the first-stage classification learning model.

In addition, as described with reference to FIG. 4, the face recognition unit 170 performs face recognition on the specific face image by selecting one of the candidate classes finally classified by the hierarchical collaborative representation-based classifier 160.

Meanwhile, in the present invention, the image of each item of training data is resized to 128 x 128 x 1 (although the size is not limited thereto). As shown in FIG. 6, the DCNN includes a plurality of convolutional layers (e.g., eight) and a plurality of maxout layers, each connected to one of the convolutional layers.

Unlike a general maxout network, the maxout layer here is regarded as a layer of maximal feature maps. Specifically, the feature maps of each convolutional layer are randomly divided into n groups. The feature values at the same coordinates across these groups are compared, the maximum is selected, and that maximum is assigned to the same coordinates of the maxout layer. The maxout layer has several important advantages for face recognition performance. First, it plays an important role in minimizing the number of neurons required and the number of network parameters in each layer. Second, it acts as a set of efficient activation functions that is faster than conventional activation functions. Together, these two advantages make the network much faster than other deep convolutional networks. Finally, the maxout layer makes it possible to quickly obtain competitive features that are useful for building a good feature extraction model.
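The max-selection step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes one feature map is taken from each group and that all maps share the same shape, and the names `max_feature_map`, `a`, and `b` are hypothetical.

```python
from typing import List

FeatureMap = List[List[float]]  # one 2-D map: rows of activations


def max_feature_map(groups: List[FeatureMap]) -> FeatureMap:
    """Element-wise maximum over feature maps of identical shape."""
    rows, cols = len(groups[0]), len(groups[0][0])
    return [
        [max(g[r][c] for g in groups) for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical 2x2 feature maps, one from each of n = 2 groups.
a = [[0.1, 0.9], [0.4, 0.2]]
b = [[0.5, 0.3], [0.4, 0.8]]
merged = max_feature_map([a, b])  # keeps the larger value at each coordinate
```

Because the output has one map where the input had n, the layer after it sees fewer channels, which is consistent with the parameter reduction claimed in the text.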

In addition, the DCNN model of the present invention further includes a plurality of pooling layers (e.g., four). Each pooling layer down-samples the feature maps and thereby reduces the number of learned parameters. The DCNN model also includes a dropout layer, which is regarded as a good technique for protecting the model from overfitting, and a softmax layer is added to produce the objective function in the training stage.

As described above, the facial features extracted by the DCNN are used as the input of the hierarchical collaborative representation-based classifier 160.

FIG. 7 is a diagram illustrating a process in which the hierarchical collaborative representation classifier according to an embodiment of the present invention performs face recognition in combination with an LTP model, and FIG. 8 is a diagram for explaining the LTP model according to an embodiment of the present invention.

As illustrated in FIG. 7, the hierarchical collaborative representation classifier 160 according to an embodiment of the present invention may be combined with an LTP model and implemented to perform face recognition on a specific face image based on the facial features extracted by the LTP model.

That is, the learning model generator 120 for facial feature extraction of the face recognition apparatus 100 may be configured as an LTP model.

Accordingly, the hierarchical collaborative representation-based classifier 160 performs first-stage and second-stage classification to recognize the face in a specific face image, using a first-stage classification learning model generated by learning the facial features extracted by the LTP model and a second-stage classification learning model generated by learning the output of the first-stage classification learning model.

Meanwhile, the LTP model greatly reduces the effect of uncontrolled illumination and is insensitive to random noise, and can therefore greatly improve the performance of the hierarchical collaborative representation-based classifier 160.

As shown in FIG. 8, the LTP operator works on 3x3 pixel blocks of the face image, in which the difference between the center pixel and each neighboring pixel is encoded as a ternary code. The LTP code is calculated by [Equation 12] below.

[Equation 12]

Here, l_c denotes the gray level of the center pixel and l_p denotes the gray level of a neighboring pixel. In addition, p has a value of 0 or 1. f(l_p, l_c, th) denotes the threshold function, which is calculated by [Equation 13] below.

[Equation 13]

Here, th denotes the threshold. If th is sufficiently large, a small gray-level change of the center pixel caused by noise cannot change the codes assigned to its neighboring pixels. This is why the LTP model is insensitive to noise in the face image. In the present invention, th may be set to 5. To reduce the feature dimension, the LTP is built with an efficient coding scheme: the LTP code is split into a positive LBP part and a negative LBP part by [Equation 14] and [Equation 15] below.

[Equation 14]

[Equation 15]
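Since Equations 12 to 15 are not reproduced in this text, the following is only a sketch of the thresholding and the positive/negative split. It assumes the commonly used LTP convention (+1 when l_p >= l_c + th, -1 when l_p <= l_c - th, 0 otherwise) and an arbitrary clockwise neighbor order; the function names and the sample block are hypothetical.

```python
from typing import List, Tuple


def threshold(l_p: int, l_c: int, th: int) -> int:
    """Ternary comparison of a neighbor l_p against the center l_c."""
    if l_p >= l_c + th:
        return 1
    if l_p <= l_c - th:
        return -1
    return 0


def ltp_split(block: List[List[int]], th: int = 5) -> Tuple[int, int]:
    """Return (positive, negative) 8-bit LBP codes for one 3x3 block."""
    l_c = block[1][1]
    # Neighbors visited clockwise from the top-left corner (assumed order).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    pos = neg = 0
    for p, (r, c) in enumerate(order):
        t = threshold(block[r][c], l_c, th)
        if t == 1:
            pos |= 1 << p   # bit p of the positive LBP part
        elif t == -1:
            neg |= 1 << p   # bit p of the negative LBP part
    return pos, neg


# Hypothetical gray levels; the center pixel is 58 and th = 5.
block = [[54, 60, 72],
         [48, 58, 65],
         [40, 57, 70]]
pos_code, neg_code = ltp_split(block)
```

Splitting the ternary pattern into two binary codes keeps each code within 256 values instead of 3^8 possible ternary patterns, which matches the dimension reduction the text attributes to the coding scheme.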

Accordingly, the LTP model generates two LBP images for extracting facial features. To preserve local features and retain the spatial layout of the face, the face image is divided into blocks, which yields the high-dimensional feature of the LTP model. In the present invention, the block size is fixed at 8x8 pixels (although it is not limited thereto), and the occurrences of the LTP codes in each block are collected into a histogram. These histograms are concatenated into a combined feature histogram made up of many bins. To avoid the curse of dimensionality, principal component analysis (PCA) is then applied to convert the high-dimensional histogram into a feature vector of much lower dimension; the output of the PCA is the facial feature vector.
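The block-wise histogram step described above can be sketched as below. This is an illustration under stated assumptions: codes are 8-bit values from one LBP image, blocks do not overlap, and the 256-bin histogram size and the name `block_histograms` are hypothetical. The PCA projection that follows in the text is omitted.

```python
from typing import List


def block_histograms(codes: List[List[int]], block: int = 8,
                     bins: int = 256) -> List[int]:
    """Concatenate per-block histograms of an LBP code image."""
    feature: List[int] = []
    for top in range(0, len(codes) - block + 1, block):
        for left in range(0, len(codes[0]) - block + 1, block):
            hist = [0] * bins
            for r in range(top, top + block):
                for c in range(left, left + block):
                    hist[codes[r][c]] += 1
            feature.extend(hist)  # block order preserves spatial layout
    return feature


# A hypothetical 16x16 code image, which yields four 8x8 blocks.
codes = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
feature = block_histograms(codes)
```

In the scheme described in the text, this would be applied to both the positive and the negative LBP image and the two results concatenated before the dimensionality reduction.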

That is, like the output of the DCNN model, the output of the LTP model is used as the input of the hierarchical collaborative representation-based classifier 160, and the LTP model can extract better features for real-time face recognition.

FIG. 9 is a diagram comparing the performance of the hierarchical collaborative representation classifier according to an embodiment of the present invention with that of other facial feature learning models.

FIG. 9(a) compares recognition accuracy on the AR data set, and FIG. 9(b) compares recognition speed on the AR data set.

As shown in FIG. 9(b), the AR database, which is used for comparison in order to evaluate face recognition accuracy under different conditions, consists of the faces of 50 men and 50 women. For this purpose, seven images with different illumination, environment, and expression were collected for training, and the images in this database were resized to 60x43 pixels. In the experiments with the hierarchical collaborative representation classifier 160, τ = α = 1 was used. A maxout network, a VGG (very deep convolutional network), and a center-loss network were used to extract facial features. The hierarchical collaborative representation-based classifier 160 of the present invention was then applied to classify these features, and each combination was compared with the existing deep networks, namely the maxout network, VGG, and the center-loss network. As shown in FIG. 9, the hierarchical collaborative representation-based classifier 160 combined with a feature extraction model performs best.

That is, the accuracy of the hierarchical collaborative representation-based classifier 160 of the present invention is higher than that of SRC and CRC: it is 2% more accurate than CRC and 2.4% more accurate than SRC. This result demonstrates that the hierarchical collaborative representation-based classifier 160 effectively improves on the performance of conventional CRC.

In addition, the hierarchical collaborative representation-based classifier 160 combined with the LTP model achieves 99.9% accuracy and far outperforms the other approaches that do not use a deep learning model. This result shows that the LTP model contributes very significantly to recognition performance by removing noise from the facial features.

In addition, as shown in FIG. 9(b), the hierarchical collaborative representation-based classifier 160 is the most accurate and is faster than the other approaches when it is combined with the LTP model. This shows that it can be applied to real-time face recognition in areas such as surveillance and security systems or mobile robots.

In addition, when the hierarchical collaborative representation-based classifier 160 is combined with the VGG network (VGG-HCRC), it achieves 99.9% accuracy. This is the best among the approaches that use a deep learning model and equals the accuracy obtained when the hierarchical collaborative representation-based classifier 160 is combined with the LTP model (LTP-HCRC). When the classifier 160 is combined with the maxout network (maxout-HCRC), it reaches 99.1% accuracy, slightly lower than VGG-HCRC and than the classifier 160 combined with the center-loss network (centerloss-HCRC). However, because they have many network parameters, VGG-HCRC and centerloss-HCRC are 7.8 times and 2.5 times slower than maxout-HCRC, respectively. maxout-HCRC is faster still when it runs on a more powerful device with GPU support. In general, maxout-HCRC is a promising algorithm for real-time face recognition, while VGG-HCRC is the best choice for face recognition systems in settings that do not require real-time recognition.

FIG. 10 is a diagram comparing the recognition rates for face images with random noise in the AR data set according to an embodiment of the present invention.

As shown in FIG. 10, the recognition rates for face images with random noise in the AR data set show that the hierarchical collaborative representation-based classifier 160 of the present invention is significantly better than MSPCRC and CRC at different noise ratios. In particular, the hierarchical collaborative representation-based classifier 160 improves on MSPCRC by 1.7% to 10.6%. The results in FIG. 10 also show that the LTP model contributes to recognition performance because of its high tolerance to noise. LTP-HCRC thus achieves 99.4% accuracy and is the best method; it also outperforms the other approaches that use a deep learning model.

In addition, the accuracy of maxout-HCRC, VGG-HCRC, and centerloss-HCRC decreases markedly as the random noise in the test faces increases. As is generally the case with deep feature learning models, this problem is caused by overfitting. A DCNN is powerful for classification but is not entirely free from overfitting. In that case it cannot cope with face images that are blurry, very noisy, or of low resolution. As a result, a DCNN model may perform well on the training data but poorly on some evaluation data sets that contain new, previously unseen face images.

FIG. 11 is a diagram comparing the recognition rates for face images with random occlusion in the AR data set according to an embodiment of the present invention.

As shown in FIG. 11, the hierarchical collaborative representation-based classifier 160 of the present invention and other classifiers were evaluated using face images with block occlusion, which makes the face recognition problem more difficult.

Each test image was randomly occluded by a small square block. As shown in FIG. 11, the hierarchical collaborative representation-based classifier 160 outperforms CRC and is at least 4.0% more accurate. LTP-HCRC also achieves a high recognition rate, and its accuracy decreases only somewhat when the occlusion ratio of the test faces increases sharply. LTP-HCRC is thus relatively insensitive to the corruption caused by occlusion. This demonstrates that the LTP model contributes greatly to the recognition rate because of its robustness to image corruption of various kinds, such as occlusion, illumination, noise, and shadow.
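The block-occlusion protocol described above can be sketched as follows. This is an assumption-laden illustration: the text does not state the block contents or how the occlusion ratio maps to block size, so the sketch fills one randomly placed square with a constant value and sizes it to cover roughly the requested fraction of the image area. The name `occlude` and the 60-row by 43-column layout are hypothetical.

```python
import random
from typing import List


def occlude(image: List[List[int]], ratio: float,
            rng: random.Random, fill: int = 0) -> List[List[int]]:
    """Return a copy with one random square covering about `ratio` of the area."""
    h, w = len(image), len(image[0])
    side = min(h, w, round((ratio * h * w) ** 0.5))
    top = rng.randint(0, h - side)    # randint is inclusive on both ends
    left = rng.randint(0, w - side)
    out = [row[:] for row in image]   # leave the input untouched
    for r in range(top, top + side):
        for c in range(left, left + side):
            out[r][c] = fill          # assumed constant fill value
    return out


rng = random.Random(0)
face = [[128] * 43 for _ in range(60)]   # placeholder 60x43 gray image
masked = occlude(face, 0.20, rng)
```

Sweeping `ratio` over several values and re-running the classifier on the masked images would reproduce the kind of occlusion curve the figure reports.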

In addition, maxout-HCRC, VGG-HCRC, and centerloss-HCRC are effective for face recognition under occlusion and are more accurate than maxout, VGG, and centerloss. This can be explained by the fact that good features for recognition are still retained despite the partial occlusion. Therefore, even when the occlusion ratio increases sharply, the accuracy of maxout-HCRC, VGG-HCRC, and centerloss-HCRC remains very high. Among them, VGG-HCRC is the best method, and maxout-HCRC obtains results similar to those of centerloss-HCRC.

FIG. 12 is a diagram comparing face recognition performance on the Extended Yale B data set according to an embodiment of the present invention.

As shown in FIG. 12, the Extended Yale B face database was used to evaluate the accuracy of the hierarchical collaborative representation-based classifier 160 of the present invention and of other approaches under various illumination conditions.

Changes in illumination conditions had the greatest effect on the face recognition results. The Extended Yale B face database consists of 38 identities under 64 illumination conditions. The face images were resized to 32x32 pixels.

As shown in FIG. 12, LTP-HCRC and VGG-HCRC are the best of the approaches included in this evaluation. LTP-HCRC inherits both the advantage of the hierarchical collaborative representation-based classifier 160, which improves on CRC by 0.9%, and the robustness of the LTP model under complex illumination conditions. The maxout network, the VGG network, and the center-loss network also show high recognition accuracy, because they can extract the key facial features from the database. This is also why VGG-HCRC achieves the highest recognition accuracy; the maxout network and the center-loss network are 0.4% less accurate than VGG-HCRC.

FIG. 13 is a diagram comparing the recognition rates for face images with random noise and occlusion in the Extended Yale B data set according to an embodiment of the present invention.

FIG. 13(a) compares the recognition rates for face images with random noise in the Extended Yale B data set according to an embodiment of the present invention.

As shown in FIG. 13(a), the accuracy of the hierarchical collaborative representation-based classifier 160 of the present invention and of other approaches was evaluated using a data set corrupted by random noise.

The face recognition results on the data set corrupted by random noise show that the hierarchical collaborative representation-based classifier 160 of the present invention is superior to CRC. They also show that LTP-HCRC is the best of the approaches included in the evaluation.

In addition, the accuracy of LTP-HCRC decreases only slightly as the noise ratio increases, whereas that of the other approaches degrades considerably. In particular, the accuracy of maxout-HCRC, VGG-HCRC, and centerloss-HCRC drops quickly as the random noise in the test faces increases. Once again, these results demonstrate that deep feature learning models are sensitive to random noise, and that LTP-HCRC remains a state-of-the-art noise-robust approach.

FIG. 13(b) compares the recognition rates for face images with random occlusion in the Extended Yale B data set according to an embodiment of the present invention.

As shown in FIG. 13(b), LTP-HCRC, maxout-HCRC, VGG-HCRC, and centerloss-HCRC maintain high recognition rates under block occlusion.

FIG. 14 is a diagram comparing face recognition rates on the LFW-a data set according to an embodiment of the present invention.

As shown in FIG. 14, the face recognition rates of the hierarchical collaborative representation-based classifier 160 of the present invention and of other approaches were compared using training faces collected from the LFW-a database.

The LFW-a database was created for studying unconstrained face recognition and comprises 158 different people of various races, ages, and genders. For each of these individuals, five training images and two test images were collected. Face recognition performance in an unconstrained environment was evaluated by using different numbers of training faces drawn from these training and test images.

All faces in these images were resized to 32x32 pixels, and the faces of the same individual differ in pose, expression, and illumination. The number of training faces taken from the LFW-a database was set to 1, 2, 3, 4, and 5 in turn.

As shown in FIG. 14, as the number of training faces increases, the hierarchical collaborative representation-based classifier 160 of the present invention is more accurate than MSPCRC and CRC. maxout-HCRC, VGG-HCRC, and centerloss-HCRC are more accurate than the maxout network, the VGG network, and the center-loss network. Among these, VGG-HCRC is the best method, and maxout-HCRC shows a recognition rate similar to that of centerloss-HCRC. This can be explained by the fact that a deep feature learning model can extract more complex feature sets than other machine learning tools. These methods also inherit the advantages of the hierarchical collaborative representation-based classifier 160 and outperform other representation-based methods. LTP-HCRC is slightly less accurate than maxout-HCRC, VGG-HCRC, and centerloss-HCRC.

FIG. 15 is a diagram for explaining recognition performance on face images with random noise in the LFW-a data set according to an embodiment of the present invention.

FIG. 15(a) shows samples of test images with random noise in the LFW-a data set according to an embodiment of the present invention, and FIG. 15(b) compares the recognition rates for face images with random noise in the LFW-a data set according to an embodiment of the present invention.

As shown in FIG. 15(a), to evaluate noise robustness, all test images in the database were further corrupted by adding random noise. A number of pixels in each image were replaced with random values in [0, 255]. The proportion of pixels corrupted by random noise was set to 10%, 20%, 30%, 40%, and 50%, respectively. Five training images and two test images were collected for each individual, and the evaluation results are shown in FIG. 15(b).
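The corruption procedure described above can be sketched as follows. This is a minimal illustration, assuming the corrupted pixel positions are chosen uniformly without replacement and that replacement values are drawn uniformly from the integers 0 to 255; the name `add_random_noise` and the placeholder image are hypothetical.

```python
import random
from typing import List


def add_random_noise(image: List[List[int]], ratio: float,
                     rng: random.Random) -> List[List[int]]:
    """Replace about `ratio` of the pixels with random values in [0, 255]."""
    h, w = len(image), len(image[0])
    count = round(ratio * h * w)
    out = [row[:] for row in image]   # leave the input untouched
    # Pick distinct flat pixel indices, then overwrite each one.
    for idx in rng.sample(range(h * w), count):
        out[idx // w][idx % w] = rng.randint(0, 255)
    return out


rng = random.Random(0)
clean = [[128] * 32 for _ in range(32)]   # placeholder 32x32 gray image
noisy = add_random_noise(clean, 0.30, rng)
```

Calling the function with ratios 0.10 through 0.50 gives the five corruption levels listed in the text. A replaced pixel can by chance receive its original value, so the number of visibly changed pixels may be slightly below the requested count.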

As shown in FIG. 15(b), the hierarchical collaborative representation-based classifier 160 of the present invention is significantly better than MSPCRC and CRC at different noise ratios. The results also show that LTP-HCRC achieves the highest accuracy and far outperforms the other approaches that use a deep learning model. Once again, this result demonstrates that the LTP model is highly robust to noise.

In contrast to LTP-HCRC, the accuracy of maxout-HCRC, VGG-HCRC, and centerloss-HCRC drops sharply as the random noise in the test faces increases. This is due to the overfitting problem that arises when a DCNN model processes low-resolution test images with a high level of noise. The results of this experiment show that deep feature learning models cannot effectively solve the face recognition problem for low-resolution images with a very high noise ratio.

FIG. 16 is a diagram comparing the recognition rates for face images with random occlusion in the LFW-a data set according to an embodiment of the present invention.

As shown in FIG. 16, each of the test images described with reference to FIG. 15 was randomly occluded by a rectangular block of varying size, and the occluded test images were used to evaluate the recognition rates of the hierarchical collaborative representation-based classifier 160 of the present invention and of other approaches.

As a result, the hierarchical collaborative representation-based classifier 160 outperforms CRC and is at least 4.3% more accurate. The results also show that VGG-HCRC achieves remarkable recognition accuracy.

The accuracy of VGG-HCRC decreases only gradually when the occlusion ratio in the test faces increases sharply. LTP-HCRC achieves the second-best recognition rate. maxout-HCRC is less accurate than LTP-HCRC and gives results similar to those of centerloss-HCRC.

FIG. 17 is a flowchart illustrating a face recognition procedure according to an embodiment of the present invention.

As shown in FIG. 17, in the face recognition procedure according to an embodiment of the present invention, the face recognition apparatus 100 first loads the training data stored in the training data database 310 and learns from it to generate a learning model for extracting facial features from a face image (S110). The generated learning model for facial feature extraction is stored in the learning model database 320.

As described above, the learning is performed using a DCNN model or an LTP model. That is, the present invention can extract facial features from a face image through the DCNN model or the LTP model.
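For the LTP option, a block-histogram feature of the kind the claims describe (LTP codes collected per block and concatenated into one combined histogram) might look like the sketch below. The 3x3 neighbourhood, the threshold `t`, the split into "upper" and "lower" binary codes, and the 2x2 block grid follow a commonly used LTP formulation and are assumptions here, not details taken from this document.

```python
import numpy as np

def ltp_codes(gray, t=5):
    """Upper and lower LTP codes for the interior pixels of a 2-D array."""
    g = gray.astype(np.int32)
    h, w = g.shape
    c = g[1:-1, 1:-1]  # centre pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]  # shifted neighbours
        upper |= (n >= c + t).astype(np.int32) << bit  # ternary value +1
        lower |= (n <= c - t).astype(np.int32) << bit  # ternary value -1
    return upper, lower

def ltp_feature(gray, grid=(2, 2), t=5):
    """Concatenate per-block 256-bin histograms of both LTP code maps."""
    hists = []
    for codes in ltp_codes(gray, t):
        for rows in np.array_split(codes, grid[0], axis=0):
            for block in np.array_split(rows, grid[1], axis=1):
                hists.append(np.bincount(block.ravel(), minlength=256))
    return np.concatenate(hists)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(18, 18))
feature = ltp_feature(image)
```

With a 2x2 grid and two code maps, the feature has 2 * 4 * 256 = 2048 bins; a finer grid trades a longer feature for more spatial detail.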

Next, the face recognition apparatus 100 learns the facial features extracted through the learning model for facial feature extraction and generates a first-stage classification learning model for classifying the facial features into at least one candidate class, and a second-stage classification learning model for reclassifying the at least one classified candidate class (S120).

As described above, the classification performed through the first-stage classification learning model is based on the Euclidean distance between the facial feature and the projection vector of the facial feature, and the reclassification performed through the second-stage classification learning model is based on the Euclidean distance between the projection vector and the candidate classes.
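One way to picture the first stage is the sketch below, which projects a query feature onto the subspace spanned by all training features and keeps the classes whose training vectors lie closest to that projection. The ridge-regularised least-squares projection, the parameter `lam`, the candidate count `k`, and the function names are assumptions chosen for illustration; the exact objective used by the invention is defined elsewhere in the specification.

```python
import numpy as np

def collaborative_projection(X, y, lam=0.01):
    """Project y onto the span of the columns of X (ridge-regularised)."""
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    return X @ alpha, alpha

def first_stage_candidates(X, labels, y, k=2, lam=0.01):
    """Return the k classes whose nearest training vector is closest to the projection."""
    y_hat, _ = collaborative_projection(X, y, lam)
    dists = np.linalg.norm(X - y_hat[:, None], axis=0)  # one distance per training vector
    classes = np.unique(labels)
    class_dist = np.array([dists[labels == c].min() for c in classes])
    return classes[np.argsort(class_dist)[:k]], y_hat

# Toy data: columns are training features, two per class (hypothetical values).
X = np.array([[1.0, 0.9, 0.0, 0.1, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9, 0.0, 0.1],
              [0.0, 0.0, 0.0, 0.0, 1.0, 0.9]])
labels = np.array([0, 0, 1, 1, 2, 2])
y = np.array([0.95, 0.05, 0.0])
candidates, y_hat = first_stage_candidates(X, labels, y, k=2)
```

Here the query lies close to the class-0 training vectors, so class 0 is ranked first among the candidates passed on to the second stage.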

Next, when a specific face image for face recognition is input to the face recognition apparatus 100 (S130), the facial feature extraction unit 150 of the face recognition apparatus 100 applies the input face image to the generated learning model for facial feature extraction and extracts the facial features of that face image (S140).

Next, the first-stage classifier 161 of the face recognition apparatus 100 applies the extracted facial features to the generated first-stage classification learning model and classifies them into at least one candidate class (S150).

That is, the input of the first-stage classification learning model is the extracted facial feature, and its output is at least one candidate class for that facial feature.

Next, the second-stage classifier 162 of the face recognition apparatus 100 uses the generated second-stage classification learning model to reclassify the at least one candidate class classified by the first-stage classifier 161 (S160).

That is, the inputs of the second-stage classification learning model are the facial feature extracted by the facial feature extraction unit 150 and the at least one candidate class into which the first-stage classifier 161 classified that facial feature. Its output is at least one of those candidate classes, reclassified according to the Euclidean distance between the projection vector of the facial feature and the classified candidate classes.

Next, the face recognition unit 170 of the face recognition apparatus 100 performs face recognition on the input face image according to the reclassification result (S170).

As described above, the face recognition is performed by selecting, from among the candidate classes reclassified by the second-stage classifier 162, the candidate class with the smallest Euclidean distance to the projection vector of the facial feature extracted by the facial feature extraction unit.
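The final decision over the candidate classes can be illustrated roughly as follows. For each candidate, the sketch measures how far the class-specific reconstruction lies from the projection vector and divides by the norm of that class's coefficients, in the spirit of the normalized reconstruction error mentioned in the claims. The function name `recognise`, the parameters `lam` and `eps`, and the exact form of the normalization are assumptions, not the formula stated in this document.

```python
import numpy as np

def recognise(X, labels, y, candidates, lam=0.01, eps=1e-12):
    """Pick the candidate class with the smallest normalized distance to the projection."""
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    y_hat = X @ alpha  # projection of the query onto the training subspace
    best_class, best_score = None, np.inf
    for c in candidates:
        mask = labels == c
        recon = X[:, mask] @ alpha[mask]  # reconstruction using class c only
        score = np.linalg.norm(y_hat - recon) / (np.linalg.norm(alpha[mask]) + eps)
        if score < best_score:
            best_class, best_score = c, score
    return best_class, best_score

# Toy data: columns are training features, two per class (hypothetical values).
X = np.array([[1.0, 0.9, 0.0, 0.1, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9, 0.0, 0.1],
              [0.0, 0.0, 0.0, 0.0, 1.0, 0.9]])
labels = np.array([0, 0, 1, 1, 2, 2])
y = np.array([0.95, 0.05, 0.0])
identity, score = recognise(X, labels, y, candidates=np.array([0, 1]))
```

Restricting the loop to the first-stage candidates is what makes the scheme hierarchical: classes rejected in the first stage are never scored in the second.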

As described above, the present invention relates to a robust face recognition apparatus and method using hierarchical collaborative representation-based classification. By applying an additional constraint on the Euclidean distance from the projection vector of the face image to be recognized to the learning data, the invention allows the face images constituting the learning data to be classified quickly, which significantly improves face recognition performance.

In addition, by combining with a DCNN model or an LTP model to classify the face images constituting the learning data, the present invention enables fast and accurate face recognition even under noise and varying illumination.

Although the above description focuses on preferred embodiments of the present invention, the technical idea of the invention is not limited to them, and each component of the invention may be changed or modified within the scope of the invention to achieve the same purpose and effect.

Furthermore, although preferred embodiments of the present invention have been shown and described above, the invention is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or prospect of the present invention.

100: face recognition apparatus using hierarchical collaborative representation-based classification
110: face image collection unit
120: learning model generation unit for facial feature extraction
130: first-stage classification learning model generation unit
140: second-stage classification learning model generation unit
150: facial feature extraction unit
160: hierarchical collaborative representation-based classifier
161: first-stage classifier
162: second-stage classifier
170: face recognition unit
200: user terminal
300: database
310: learning data database
320: learning model database
400: camera

Claims

A face recognition apparatus using hierarchical collaborative representation-based classification, comprising:
a learning model generation unit for facial feature extraction that learns learning data including a plurality of face images and generates a learning model for facial feature extraction, which extracts facial features from a specific face image;
a first-stage classification learning model generation unit that learns the facial features extracted through the generated learning model for facial feature extraction and generates a first-stage classification learning model for classifying the extracted facial features into at least one candidate class; and
a second-stage classification learning model generation unit that re-learns the facial features for the candidate classes classified through the generated first-stage classification learning model and generates a second-stage classification learning model for reclassification according to the extracted facial features,
wherein the first-stage classification learning model classifies the extracted facial features into at least one candidate class from the learning data according to the Euclidean distance between at least one facial feature polled in a collaborative subspace composed of the learning data and the extracted facial feature projected onto the collaborative subspace,
wherein the second-stage classification learning model classifies the extracted facial features into the class having the minimum normalized reconstruction error among the at least one candidate class, and
wherein, through the first-stage classification learning model and the second-stage classification learning model, the Euclidean distance between the extracted facial feature and its projection onto the collaborative subspace is minimized, and the Euclidean distance between the projected facial feature and the at least one facial feature polled in the collaborative subspace composed of the learning data is minimized.

The apparatus according to claim 1,
wherein the face recognition apparatus using hierarchical collaborative representation-based classification further comprises:
a facial feature extraction unit that applies the specific face image to the generated learning model for facial feature extraction to extract facial features of the specific face image;
a first-stage classifier that applies the facial features extracted by the facial feature extraction unit to the first-stage classification learning model to classify them into at least one candidate class; and
a second-stage classifier that applies the facial features extracted by the facial feature extraction unit and the at least one candidate class classified by the first-stage classifier to the second-stage classification learning model to reclassify the at least one classified candidate class,
wherein the learning data is classified through the hierarchical collaborative representation of the first-stage classifier and the second-stage classifier so that face recognition is performed on the specific face image.

The apparatus according to claim 2,
wherein the face recognition apparatus using hierarchical collaborative representation-based classification further comprises
a face recognition unit that compares the Euclidean distances between the facial feature extracted by the facial feature extraction unit, as projected onto the collaborative subspace, and the facial features of the candidate classes reclassified by the second-stage classifier, and recognizes the face in the specific face image by selecting the candidate class with the smallest Euclidean distance.

The apparatus according to claim 1,
wherein the learning model generation unit for facial feature extraction
is configured as a DCNN (deep convolutional neural network) model, and
the DCNN model
includes a plurality of convolution layers, a plurality of maxout layers connected to the respective convolution layers, a plurality of pooling layers, and a softmax layer, and extracts facial features by transforming each item of learning data into a common set of features that is unique for that item.

(Deleted)

The apparatus according to claim 1,
wherein the learning model generation unit for facial feature extraction
is configured as an LTP (local ternary patterns) model, and
the LTP model
divides each item of learning data into a plurality of blocks, collects the LTP codes of each block into a histogram, and extracts facial features by concatenating the histograms into a combined feature histogram composed of multiple bins.

A face recognition method using hierarchical collaborative representation-based classification, comprising:
generating, by a learning model generation unit for facial feature extraction, a learning model for facial feature extraction, which extracts facial features from a specific face image, by learning learning data including a plurality of face images;
generating, by a first-stage classification learning model generation unit, a first-stage classification learning model for classifying the extracted facial features into at least one candidate class, by learning the facial features extracted through the generated learning model for facial feature extraction; and
generating, by a second-stage classification learning model generation unit, a second-stage classification learning model for reclassification according to the extracted facial features, by re-learning the facial features for the candidate classes classified through the generated first-stage classification learning model,
wherein the first-stage classification learning model classifies the extracted facial features into at least one candidate class from the learning data according to the Euclidean distance between at least one facial feature polled in a collaborative subspace composed of the learning data and the extracted facial feature projected onto the collaborative subspace,
wherein the second-stage classification learning model classifies the extracted facial features into the class having the minimum normalized reconstruction error among the at least one candidate class, and
wherein, through the first-stage classification learning model and the second-stage classification learning model, the Euclidean distance between the extracted facial feature and its projection onto the collaborative subspace is minimized, and the Euclidean distance between the projected facial feature and the at least one facial feature polled in the collaborative subspace composed of the learning data is minimized.

The method according to claim 7,
wherein the face recognition method using hierarchical collaborative representation-based classification further comprises:
extracting, by a facial feature extraction unit, facial features of the specific face image by applying the specific face image to the generated learning model for facial feature extraction;
classifying, by a first-stage classifier, the facial features extracted by the facial feature extraction unit into at least one candidate class by applying them to the first-stage classification learning model; and
reclassifying, by a second-stage classifier, the at least one classified candidate class by applying the facial features extracted by the facial feature extraction unit and the at least one candidate class classified by the first-stage classifier to the second-stage classification learning model,
wherein the learning data is classified through the hierarchical collaborative representation of the first-stage classifier and the second-stage classifier so that face recognition is performed on the specific face image.

The method according to claim 8,
wherein the face recognition method using hierarchical collaborative representation-based classification further comprises
recognizing, by a face recognition unit, the face in the specific face image by comparing the Euclidean distances between the facial feature extracted by the facial feature extraction unit, as projected onto the collaborative subspace, and the facial features of the candidate classes reclassified by the second-stage classifier, and selecting the candidate class with the smallest Euclidean distance.

The method according to claim 7,
wherein the learning model generation unit for facial feature extraction
is configured as a DCNN (deep convolutional neural network) model, and
the DCNN model
includes a plurality of convolution layers, a plurality of maxout layers connected to the respective convolution layers, a plurality of pooling layers, and a softmax layer, and extracts facial features by transforming each item of learning data into a common set of features that is unique for that item.

(Deleted)

The method according to claim 7,
wherein the learning model generation unit for facial feature extraction
is configured as an LTP (local ternary patterns) model, and
the LTP model
divides each item of learning data into a plurality of blocks, collects the LTP codes of each block into a histogram, and extracts facial features by concatenating the histograms into a combined feature histogram composed of multiple bins.