KR101326691B1

KR101326691B1 - Robust face recognition method through statistical learning of local features

Info

Publication number: KR101326691B1
Application number: KR1020110125412A
Authority: KR
Inventors: 박혜영; 서정인
Original assignee: 경북대학교 산학협력단
Priority date: 2011-11-28
Filing date: 2011-11-28
Publication date: 2013-11-08
Also published as: KR20130059212A

Abstract

A robust face recognition method based on statistical learning of local features is disclosed. The method comprises: (a) dividing a face image to be learned into M images and obtaining a local feature descriptor for each divided image through SIFT feature extraction; (b) calculating the mean and variance of the local feature descriptors obtained in step (a); (c) dividing a face image to be compared into M images and obtaining a local feature descriptor for each divided image through SIFT feature extraction; (d) calculating the distances between the local feature descriptors obtained in step (a) and those obtained in step (c); (e) calculating a weight for each of the M images of the face image to be learned, using the mean and variance calculated in step (b); and (f) combining the distances calculated in step (d) with the weights calculated in step (e) to calculate the distance, over the M divided images, between the face image to be learned and the face image to be compared.

Description

Robust Face Recognition Method Through Statistical Learning of Local Features

The present invention relates to a face recognition method, and more particularly to a face recognition method that achieves robust face recognition by extracting local feature descriptors from face images and statistically learning them.

In general, many kinds of signals can be captured from a person. Among them, face images vary widely from case to case, which makes face recognition one of the most closely watched areas in pattern recognition and machine learning.

Conventional face recognition techniques either extract features from the whole of a face image treated as a single image, or extract features only from specific parts of the face such as the eyes, nose, and mouth, and use the extracted data to produce the recognition result.

Feature extraction methods can be classified by various criteria. A representative criterion is the extent of the region from which features are extracted, which distinguishes global feature extraction methods from local feature extraction methods.

Global feature extraction methods use statistics computed over the entire training data, such as the covariance, and typically apply eigenvalues, eigenvectors, and the like to satisfy a specific objective function. The features obtained from any given training sample (face image) are therefore expressed so as to describe the training data as a whole. Representative methods include PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis).

Local feature extraction methods extract, from a given training sample (face image), local features that describe it well. The features are extracted separately for each training sample, with no consideration of the remaining training samples and no consideration of other regions within the same sample. Representative methods include SIFT (Scale Invariant Feature Transform) and Dense-SIFT.

However, with either the global or the local feature extraction methods described above, it is difficult to extract statistical features that account for the wide variety of changes in face images, such as illumination, expression, pose, and occlusion. Important information in the original image is therefore not represented accurately, robust features (statistical features) are not properly preserved under such variations, and a high recognition rate for face images is difficult to expect.

To solve the above problem, an object of the present invention is to provide a face recognition method that can maximize the face recognition rate by dividing each of a plurality of face images, captured with various facial expressions of a person, into multiple parts, extracting local features from the divided images, and statistically learning those features.

To achieve the above object, the present invention provides a face recognition method comprising: (a) dividing each of a plurality of face images to be learned into M images in the same manner, and obtaining a local feature descriptor for each divided image through local feature descriptor extraction; (b) calculating the mean and variance of the local feature descriptors obtained in step (a); (c) dividing one face image to be compared into M images, and obtaining a local feature descriptor for each divided image through local feature descriptor extraction; (d) calculating the distances between the local feature descriptors obtained in step (a) and those obtained in step (c); (e) calculating a weight for each of the M divided images of each face image to be learned, using the mean and variance calculated in step (b); and (f) calculating the distance between the face image to be learned and the face image to be compared by combining the distances calculated in step (d) with the weights calculated in step (e).

In step (f), the distance between the face image to be learned and the face image to be compared may be calculated by the following equation.

The local feature descriptor extraction in steps (a) and (c) is preferably performed using the SIFT (Scale Invariant Feature Transform) feature extraction method or the Dense-SIFT feature extraction method.

As described above, the present invention combines the advantage of local features, which are robust to changes in illumination, pose, and expression, with statistical learning that makes the features robust to occlusion as well, and can thereby improve the face recognition rate.

FIG. 1 is a flowchart sequentially showing the robust face recognition process through statistical learning of local features according to the present invention.
FIG. 2 shows a single face image (training image) to be learned in the present invention.
FIG. 3 shows the face image of FIG. 2 divided over its whole area in a grid pattern in order to extract local features.
FIG. 4 shows an occluded face image (test image) in which part of the face is covered.
FIG. 5 shows the face image of FIG. 4 divided over its whole area in a grid pattern in order to extract local features.
FIG. 6 is a graph comparing the face recognition rates of the present invention and the conventional PCA and LDA feature extraction methods.
FIG. 7 shows a face image (test image) in which part of the face has been artificially occluded.
FIG. 8 is a graph comparing the face recognition rates of the present invention and the conventional PCA and LDA feature extraction methods.

Hereinafter, the robust face recognition method through statistical learning of local features according to the present invention will be described with reference to the accompanying drawings.

First, a single frontal face image (1) as shown in FIG. 2 is partitioned into a plurality of (M) subregions as shown in FIG. 3.

Because the present invention uses the whole face image (1) rather than features of specific parts such as the eyes, nose, and mouth, the image is preferably divided in a grid pattern so that every part of the face image (1) is split into images of the same size. Each part so divided is hereinafter referred to as a divided image (P1 to PM).
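The grid division described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function name `split_into_grid`, the 11×8 grid, and the reading of the 88×64 image size as an array of shape (88, 64) are all assumptions made for the example.

```python
import numpy as np

def split_into_grid(image, rows, cols):
    """Split an image into rows*cols equally sized patches, in row-major order."""
    height, width = image.shape[:2]
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    ph, pw = height // rows, width // cols
    return [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            for r in range(rows) for c in range(cols)]

# Assumed example: an array of shape (88, 64) cut into an 11x8 grid,
# giving M = 88 patches of 8x8 pixels each.
face = np.arange(88 * 64, dtype=np.float64).reshape(88, 64)
patches = split_into_grid(face, rows=11, cols=8)
```

Because every image is cut with the same grid, the patch at index m covers the same face area in every image, which is what allows per-position statistics to be learned later.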

For each of the divided images (P1 to PM), a local feature descriptor is obtained using SIFT, a local feature extraction method (S1). The process of obtaining the local feature descriptors is as follows.

SIFT generates a set of image features in two computational stages. The first stage determines how to select important points from the whole face image (1); the selected important pixels are called keypoints. The second stage defines an appropriate descriptor for the selected keypoints so that it represents meaningful local properties of the image (1). This descriptor is referred to here as a local feature descriptor.

Each of the M divided images (P1 to PM) of one face image (1) can thus be represented by a set of keypoints, each with its local feature descriptor.

The following briefly describes the local feature descriptor and then explains how it is applied to represent each of the divided images (P1 to PM).

SIFT uses a scale-space DoG (Difference-of-Gaussian) function to detect keypoints within each image. For an input face image I(x,y), the scale space is defined as the function L(x,y,σ) obtained by convolving a variable-scale Gaussian G(x,y,σ) with the face image. The DoG function is accordingly defined as in Equation 1 below.
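The equation itself is not reproduced in this text. In the standard SIFT formulation, which the description above appears to follow, the scale space and the DoG function are written as:

```latex
\begin{aligned}
G(x, y, \sigma) &= \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right), \\
L(x, y, \sigma) &= G(x, y, \sigma) * I(x, y), \\
D(x, y, \sigma) &= \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I(x, y)
                 = L(x, y, k\sigma) - L(x, y, \sigma).
\end{aligned}
```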

Here, x is the x-axis coordinate, y is the y-axis coordinate, σ is the scale, and k is a multiplicative factor.

In this case, the local maxima and minima of the function D(x,y,σ) are determined on the basis of the eight neighboring images that surround one divided image within the current single image (1). When features are extracted with conventional SIFT, keypoints are selected according to their stability and the values of the local feature descriptors, so the number of keypoints and their positions differ from one divided image to another.

In face recognition, however, very few keypoints may be extracted because face images lack texture. In that case the problem can be addressed by using Dense-SIFT, another local feature extraction method, in place of SIFT.

Each local feature descriptor extracted by SIFT is represented as a 128-dimensional vector descriptor made up of four parts: the locus (the position at which the feature was selected), the scale (σ), the orientation, and the gradient. The gradient magnitude m(x,y) and the orientation θ(x,y) of each keypoint located at two-dimensional coordinates (x,y) in each divided image are obtained from Equations 2 and 3 below.
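Equations 2 and 3 are not reproduced in this text. In the standard SIFT formulation, the gradient magnitude and orientation are computed from pixel differences of the smoothed image L at the keypoint's scale:

```latex
\begin{aligned}
m(x, y) &= \sqrt{\bigl(L(x+1, y) - L(x-1, y)\bigr)^{2} + \bigl(L(x, y+1) - L(x, y-1)\bigr)^{2}}, \\
\theta(x, y) &= \tan^{-1}\!\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right).
\end{aligned}
```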

Next, to apply SIFT to the representation of a face image (1), the number of local feature descriptors, M, and their positions on a regular grid are first fixed. Since each local feature descriptor is represented by a descriptor vector κ, one face image I can be represented by a set of M descriptor vectors, as in Equation 4 below.

On this basis, the probability distribution is learned by calculating the mean and variance of the M descriptor vectors κ (S2).

Given training data of face images {I^i}, i = 1, …, N, the M training sets of keypoint descriptors can be expressed as in Equation 5 below.

Each of the M training sets T_m contains the local feature descriptors of the divided image at one specific position, collected from all the face images that have been divided into M regions.

Using the training set T_m, the probability density of the local feature descriptor κ_m for the divided image at a specific position can be estimated. As a simple preliminary approach, a multivariate Gaussian model over the 128-dimensional random vector is used, so the local feature descriptor κ_m can be expressed as in Equation 6 below.
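Equation 6 is not reproduced in this text. For reference, the density of a multivariate Gaussian over a d-dimensional vector (here d = 128), written with a mean μ_m and covariance matrix Σ_m for the m-th position, has the standard form:

```latex
p(\kappa_m) = \frac{1}{(2\pi)^{d/2}\,\lvert \Sigma_m \rvert^{1/2}}
\exp\!\left(-\tfrac{1}{2}\,(\kappa_m - \mu_m)^{\top} \Sigma_m^{-1} (\kappa_m - \mu_m)\right),
\qquad d = 128.
```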

The two model parameters in Equation 6, the mean μ_m and the covariance matrix Σ_m, can be estimated by the sample mean and the sample covariance matrix of the training set T_m, respectively.
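The per-position estimation can be sketched as below. Two simplifications are assumptions of this example and not statements of the patent: only the per-dimension variance (a diagonal covariance) is kept, since a full 128×128 sample covariance is singular when there are fewer training images than dimensions, and a small constant `eps` is added to avoid zero variances. The descriptors here are random placeholders standing in for real SIFT output, and `fit_position_models` is a hypothetical name.

```python
import numpy as np

def fit_position_models(descriptors, eps=1e-6):
    """Estimate a mean and a per-dimension variance for every grid position.

    descriptors: array of shape (N, M, D) holding, for N training images,
    the D-dimensional descriptor at each of M grid positions.
    Returns (mean, var), each of shape (M, D).
    """
    descriptors = np.asarray(descriptors, dtype=np.float64)
    mean = descriptors.mean(axis=0)        # sample mean over the N images
    var = descriptors.var(axis=0) + eps    # diagonal covariance (assumption)
    return mean, var

# Placeholder data: 20 training images, 6 grid positions, 128-dim descriptors.
rng = np.random.default_rng(0)
train = rng.normal(size=(20, 6, 128))
mean, var = fit_position_models(train)
```

Slicing `train[:, m, :]` gives the training set for position m, so each row of `mean` and `var` summarizes one such set.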

As described above, one face image (1) can be represented by a fixed number of local feature descriptors κ_m (m = 1, …, M). For a plurality of face images of the same person captured with various expressions, steps S1 and S2 are carried out to calculate the mean and variance. Learning is likewise carried out through steps S1 and S2 for a plurality of different persons.

Using the probability density estimated in this way, the probability of observing each descriptor at a specific position of a prototype image of human frontal faces can be calculated.

For the face recognition test, a face image (3) of a given person as shown in FIG. 4 (hereinafter the "test image") is selected, the test image is divided into M regions as shown in FIG. 5, and a local feature descriptor is obtained for each region using SIFT feature extraction (S3). The test image shows the same person as the training image (1), with the area below the nose covered by a scarf.

Next, the distances between the local feature descriptors of the training image (1) and those of the test image (3) are calculated (S4). To this end, the obtained local feature descriptors can be used to find the weight of each descriptor.

If the input test image (3) to be compared with the training image (1) is denoted I^tst, a set of local feature descriptors (keypoint descriptors) for the test image (3) can be obtained by applying SIFT feature extraction, as in Equation 7 below.

Then, for the local feature descriptor κ_m^tst of each subregion, the probability density p(κ_m^tst) can be calculated, and a weight w_m for each local feature descriptor κ_m^tst can be obtained (S5). The weight w_m may be expressed as Equation 8 below.
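Equation 8 is not reproduced in this text, so the exact weighting rule is not known here. The sketch below only illustrates the idea that a region whose descriptor is unlikely under the learned model receives a small weight. It assumes a diagonal-covariance Gaussian and assumes that the weights are the densities normalized to sum to one; both choices, and the names `log_density` and `region_weights`, are hypothetical.

```python
import numpy as np

def log_density(x, mean, var):
    """Log of a diagonal-covariance Gaussian density at each grid position.

    x, mean, var: arrays of shape (M, D). Returns an array of shape (M,).
    """
    x, mean, var = (np.asarray(a, dtype=np.float64) for a in (x, mean, var))
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def region_weights(x, mean, var):
    """Assumed rule: weights proportional to the densities, normalized to sum to 1."""
    logp = log_density(x, mean, var)
    logp = logp - logp.max()   # shift for numerical stability before exponentiating
    w = np.exp(logp)
    return w / w.sum()

# Toy model with 3 regions and 4-dim descriptors; the third test descriptor
# lies far from its learned mean, as an occluded region might.
mean = np.zeros((3, 4))
var = np.ones((3, 4))
test = np.zeros((3, 4))
test[2] += 5.0
w = region_weights(test, mean, var)
```

With 128-dimensional descriptors the densities can differ by many orders of magnitude, so a practical weighting would probably need some rescaling of the log densities; that choice is left open here.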

Next, the distance between the two images is calculated by combining the distances between the local feature descriptors of corresponding regions of the test image I^tst and the training image I^i with the weights w_m obtained above (S6). The distance may be calculated using Equation 9 below.
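The equation itself is not reproduced in this text. A form consistent with the description of step S6 and with the symbol definitions that follow would be a weighted sum of per-region distances (the symbol D for the overall distance is an assumption):

```latex
D\bigl(I^{\mathrm{tst}}, I^{i}\bigr) = \sum_{m=1}^{M} w_m \, d\bigl(\kappa_m^{\mathrm{tst}}, \kappa_m^{i}\bigr)
```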

Here, m is the subregion index, w_m is the weight of each subregion, d(·,·) is a distance function such as the L₁ norm or the L₂ norm, and κ_m^tst and κ_m^i are the local feature descriptors of the m-th subregion of the test image and of the i-th training image, respectively.
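A weighted per-region distance of this kind can be sketched as follows. The function names are hypothetical, and choosing the training image with the smallest distance (nearest neighbor) is an assumption of this example; the text describes the distance but does not spell out the final decision rule.

```python
import numpy as np

def weighted_distance(test_desc, train_desc, weights, order=2):
    """Sum over regions of w_m * d(test_m, train_m), with d the L1 or L2 norm.

    test_desc, train_desc: arrays of shape (M, D); weights: array of shape (M,).
    """
    diff = np.asarray(test_desc, dtype=np.float64) - np.asarray(train_desc, dtype=np.float64)
    per_region = np.linalg.norm(diff, ord=order, axis=1)   # one distance per region
    return float(np.dot(weights, per_region))

def nearest_training_image(test_desc, train_set, weights, order=2):
    """Return the index of the closest training image and all distances."""
    dists = [weighted_distance(test_desc, t, weights, order) for t in train_set]
    return int(np.argmin(dists)), dists

# Toy data with M = 2 regions: region 1 of the test image differs strongly
# from training image A (as if occluded), while region 0 matches it exactly.
test_desc = np.array([[0.0, 0.0], [3.0, 4.0]])
train_a = np.array([[0.0, 0.0], [0.0, 0.0]])
train_b = np.array([[1.0, 0.0], [3.0, 4.0]])

best_weighted, d_weighted = nearest_training_image(
    test_desc, [train_a, train_b], np.array([0.9, 0.1]))
best_uniform, d_uniform = nearest_training_image(
    test_desc, [train_a, train_b], np.array([0.5, 0.5]))
```

In this toy case, down-weighting region 1 makes image A the closest match, whereas uniform weights favor image B, which illustrates how the weights can change the outcome when one region is unreliable.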

Because the weight w_m is determined by the m-th local patch of the test image, as represented by the m-th local feature descriptor, the weight expresses how important that local patch is when measuring the distance between the training images and the test image.

Therefore, if the test image contains an occlusion, the local patch containing the occlusion is unlikely to be a patch commonly seen in the previously acquired set of training images, and its weight becomes small. Because the occluded portion thus contributes less to the distance measure, the present invention can provide results that are more robust to local variations.

To verify the robustness of the present invention on face images, comparative experiments were conducted on a sample database with local variations, comparing the present invention with conventional local approaches and conventional statistical methods. The sample database consists of a large number of frontal images (more than 3,600 color images) of the faces of 126 subjects (70 men and 56 women). Specifically, 26 images were captured of each subject, all with different expressions. For each subject, the images were captured in two sessions two weeks apart, and each session consists of 13 images that differ in facial expression, illumination, and occlusion.

In this experiment, 100 of the 126 subjects were selected, and the 13 images captured in the first session were used for each subject. As a preprocessing step, aligned images were obtained manually using the eye positions. After this localization, the face images were morphed and resized to 88×64 pixels.

When the face recognition rates of the present invention and of the conventional PCA and LDA methods were measured on this image data, the present invention achieved a recognition rate close to 90%, as shown in FIG. 6, whereas the conventional PCA and LDA methods both fell short of 60%.

In addition, when artificial occlusion was applied to the test images as shown in FIG. 7 and the recognition rates of the present invention and the prior art were compared, the present invention achieved a recognition rate of 95%, as shown in FIG. 8, whereas the conventional PCA and LDA methods both remained below 70%.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to them. Those of ordinary skill in the art to which the present invention pertains may make various modifications and variations within the technical idea of the invention and the range of equivalents of the claims set out below.

1: training image 3: test image

Claims

A face recognition method comprising:
(a) dividing each of a plurality of face images to be learned into M images in the same manner, and obtaining a local feature descriptor for each divided image through local feature descriptor extraction;
(b) calculating the mean and variance of the local feature descriptors obtained in step (a) for the divided images that correspond to one another across the plurality of face images to be learned;
(c) dividing one face image to be compared into M images, and obtaining a local feature descriptor for each divided image through local feature descriptor extraction;
(d) calculating the distances between the local feature descriptors obtained in step (a) and the local feature descriptors obtained in step (c);
(e) calculating a weight for each of the M divided images of each face image to be learned, using the mean and variance calculated in step (b); and
(f) calculating the distance between the face image to be learned and the face image to be compared by combining the distances calculated in step (d) with the weights calculated in step (e).

The method of claim 1,
wherein, in step (f), the distance between the face image to be learned and the face image to be compared is calculated by the following equation.

Here, I^tst is the face image to be compared, I^i is the i-th face image to be learned, m is the subregion index, w_m is the weight of each subregion, d(·,·) is a distance function such as the L₁ norm or the L₂ norm, and κ_m^tst and κ_m^i are the local feature descriptors of the m-th subregion of the image to be compared and of the image to be learned, respectively.

The method of claim 1,
wherein the local feature descriptor extraction in steps (a) and (c) is performed using the SIFT (Scale Invariant Feature Transform) feature extraction method or the Dense-SIFT feature extraction method.