KR101473991B1

KR101473991B1 - Method and apparatus for detecting face

Info

Publication number: KR101473991B1
Application number: KR1020130062029A
Authority: KR
Inventors: 정지년
Original assignee: 주식회사 에스원
Priority date: 2013-05-30
Filing date: 2013-05-30
Publication date: 2014-12-24
Also published as: KR20140140953A

Abstract

When detecting a face in an input image, an operator built from feature elements for facial feature extraction is applied to the image to obtain a feature value, and the feature value is used to determine whether the scan image in a scan region is a face image or a non-face image. The operator includes a plurality of sub-blocks, and the feature elements include the coordinates, width, and height of a sub-block and its spacing from neighboring sub-blocks.

Description

FIELD OF THE INVENTION [0001]

The present invention relates to face recognition, and more particularly to a method and apparatus for detecting a face in an image.

Face detection is the technology of finding the position and size of a face in an image. It is a core technology for face recognition applications such as face authentication, gender classification, and age estimation.

Various studies have been conducted to improve the accuracy of face detection. This is carried out by converting the shape information of the face region in a given input image into feature quantities, and comparing the feature quantities obtained from the input image with those of registered images to determine the identity of the input face image.

A human face is a three-dimensional object made up of parts such as the forehead, eyebrows, eyes, nose, and mouth. The shape projected onto the imaging plane therefore changes greatly with factors such as lighting, viewing angle (or camera angle), and whether the subject is making a facial expression. The degree of change differs by facial part. For example, because the nose is more three-dimensional than the eyes, its projected shape changes more with the shooting angle. Conversely, at the same shooting angle, when the subject smiles broadly, the shape of the nose image hardly changes compared with a neutral expression, whereas the shapes of the eye and mouth images change greatly.

For computerized identification of acquired face images, filtering that enables quantification in the frequency domain is widely used. Face recognition filters used for this purpose include the Gabor filter, the Haar filter, and LBP (local binary pattern), the last of which includes DLBP (discriminative LBP), ULBP (uniform LBP), and NLBP (number LBP). LBP in particular, because of its robustness to illumination changes, has been refined into variants such as the Locally Assembled Binary (LAB) feature and Multi-Block LBP and used in the development of face detection technology.

However, these existing LBP-based methods lack expressive power in their features, so face detection requires many feature extraction operations. Face detection therefore consumes a large amount of computation and time, which makes it difficult to apply in environments where face recognition must run in real time.

The technical problem to be solved by the present invention is to provide a face detection method and apparatus that can detect faces both more accurately and more quickly.

A face detection method according to one aspect of the present invention includes: generating a face pyramid made up of images for an input image; scanning each image in the face pyramid using a window of a set size; applying to the scan region an operator built from feature elements for facial feature extraction to code the image and obtain a feature value, and determining from the feature value whether the scan image in the scan region is a face image or a non-face image; clustering the scan regions determined to be face images, and extracting final candidate detection regions from among them based on the size of the clustered scan regions; and detecting a face from the final candidate detection regions. The operator includes a plurality of sub-blocks, and the feature elements include the coordinates, width, and height of a sub-block and its spacing from neighboring sub-blocks.

Here, the determining step may include: applying to the scan region a plurality of weak classifiers, each including an operator made up of sub-blocks defined by the feature elements, and coding the scan image of the scan region according to the pixel values of the region corresponding to the central sub-block and the regions corresponding to the surrounding sub-blocks to obtain feature values; calculating, for each scan region, a strong classification value from the feature values obtained by the plurality of weak classifiers; and comparing the strong classification value with a preset threshold to determine whether the image of the scan region is a face image or a non-face image.

At least one of the feature elements constituting the operator of one weak classifier may have a value different from the corresponding feature element of the operator of another weak classifier.

In addition, the coordinates among the feature elements may include an x coordinate and a y coordinate, and the spacing from neighboring sub-blocks may include a spacing in the x direction and a spacing in the y direction.

Meanwhile, the step of extracting the final candidate detection regions may extract them from among the scan regions based on the size of the clustered scan regions and on a confidence calculated from the strong classification values of the scan regions.

The step of extracting the final candidate detection regions may also include: clustering the scan regions determined to be face images into different clusters according to their size; calculating the size of each cluster from the number of its members, that is, the scan regions it contains; extracting the clusters whose size is at or above a preset size; calculating a confidence for each extracted cluster; and selecting only the clusters whose confidence is at or above a set value for use as the final candidate detection regions.

Meanwhile, the face detection method may further include, for the clusters selected as final candidate detection regions, merging the regions contained in each cluster to obtain a face detection region for each cluster.

Here, the step of detecting the face may extract, from among the clusters selected as final candidate detection regions, the clusters whose confidence is at or above a set value, and finally detect the face detection region of each extracted cluster as a face region.

A face detection apparatus according to another aspect of the present invention includes: an image pyramid generation unit that generates a face pyramid made up of images for an input image; an image scanning unit that scans each image in the face pyramid using a window of a set size; a region discrimination unit that applies to the scan region an operator built from feature elements for facial feature extraction (the operator includes a plurality of sub-blocks, and the feature elements include the x and y coordinates, width, and height of a sub-block and its spacing from neighboring sub-blocks in the x and y directions) to code the image and obtain a feature value, and determines from the feature value whether the scan image in the scan region is a face image or a non-face image; a clustering unit that clusters the scan regions determined to be face images and extracts final candidate detection regions from among them based on the size of the clustered scan regions; and a face detection unit that detects a face from the final candidate detection regions.

The region discrimination unit includes: a plurality of weak classifiers, each of which applies to the scan region an operator made up of sub-blocks defined by the feature elements and codes the scan image of the scan region according to the pixel values of the region corresponding to the central sub-block and the regions corresponding to the surrounding sub-blocks to obtain feature values; and a strong classifier that calculates, for each scan region, a strong classification value from the feature values obtained by the plurality of weak classifiers and compares the strong classification value with a preset threshold to determine whether the image of the scan region is a face image or a non-face image. At least one of the feature elements constituting the operator of one weak classifier may have a value different from the corresponding feature element of the operator of another weak classifier.

The clustering unit may include: a first processing unit that clusters the scan regions into a plurality of clusters based on the sizes of the scan regions determined to be face images by the region discrimination unit; a second processing unit that filters the clusters based on the size and confidence of each cluster to obtain the final candidate detection regions; and a merging unit that merges the member regions of each cluster selected as a final candidate detection region to obtain a face detection region for each cluster.

The second processing unit may calculate the size of each cluster from the number of its members, that is, the scan regions it contains, extract the clusters whose size is at or above a preset size, and select from the extracted clusters only those whose confidence is at or above a set value for use as the final candidate detection regions.

The merging unit may calculate the average region of the member regions of a cluster selected as a final candidate detection region and use it as the face detection region of that cluster.
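The filtering and merging described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_and_merge`, the thresholds, the `(x, y, w, h)` box layout, and in particular the choice of summing member scores as the cluster confidence are all assumptions, since the text only says the confidence is derived from the strong classification values.

```python
def select_and_merge(clusters, min_members=3, min_confidence=2.0):
    """Filter clusters by member count and confidence, then average member boxes."""
    results = []
    for members in clusters:  # members: list of ((x, y, w, h), strong_score)
        if len(members) < min_members:
            continue  # cluster size is the number of member scan regions
        # Assumed definition: confidence is the sum of member strong scores.
        confidence = sum(score for _, score in members)
        if confidence < min_confidence:
            continue
        n = len(members)
        mean_box = tuple(sum(box[i] for box, _ in members) / n for i in range(4))
        results.append((mean_box, confidence))
    return results

clusters = [
    [((10, 10, 40, 40), 1.0), ((12, 10, 40, 40), 1.5), ((14, 10, 40, 40), 0.5)],
    [((200, 50, 20, 20), 4.0)],  # only one member, so it is filtered out
]
results = select_and_merge(clusters)
```

In this sketch an isolated detection is dropped even when its own score is high, which matches the idea that a real face tends to trigger several overlapping windows.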

According to an embodiment of the present invention, the expressive power of the feature representation used to describe facial features for face recognition is increased, so faces can be detected with less computation and with higher accuracy. The improved face detection performance can also contribute to the development of application systems such as real-time face authentication, gender classification, and age estimation.

FIG. 1 is a flowchart of a face detection method according to an embodiment of the present invention, and FIG. 2 is an example of an image to which the face detection method is applied.
FIG. 3 is an exemplary diagram illustrating the operator of the face recognition filter LBP according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the feature elements of an operator for facial feature extraction according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the clustering process in the face detection method according to an embodiment of the present invention.
FIG. 6 is a flowchart of a verification process additionally performed during clustering in the face detection method according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the structure of a face detection apparatus according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating the structure of a region discrimination unit according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice the invention. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here.

To describe the present invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification and claims, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

A face detection method and apparatus according to an embodiment of the present invention will now be described.

FIG. 1 is a flowchart of a face detection method according to an embodiment of the present invention, and FIG. 2 is an example of an image to which the face detection method is applied.

In an embodiment of the present invention, as shown in FIG. 1, an image pyramid is generated for the input image, and each image in the pyramid is scanned and classified into face images and non-face images.

The input image is reduced by predetermined ratios to generate an image pyramid (S100). That is, the image pyramid contains copies of the input image reduced by different ratios. For example, as shown in FIG. 2, the input image is reduced by a first ratio, a second ratio, and a third ratio (first ratio > second ratio > third ratio) to generate an image pyramid containing the images I1, I2, and I3. The resolution of each image may differ. For convenience, the images contained in the pyramid are called "images" to distinguish them from the "input image".
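Step S100 can be sketched as below. This is only an illustration under stated assumptions: the ratios (0.75, 0.5, 0.25), the nearest-neighbour resampling, and the name `build_pyramid` are hypothetical, since the text specifies neither the ratios nor the interpolation method. The image is a plain list of rows of grey levels.

```python
def build_pyramid(image, ratios=(0.75, 0.5, 0.25)):
    """Return one nearest-neighbour downscaled copy of `image` per ratio."""
    height, width = len(image), len(image[0])
    pyramid = []
    for r in ratios:
        new_h, new_w = max(1, int(height * r)), max(1, int(width * r))
        scaled = [
            [image[min(height - 1, int(y / r))][min(width - 1, int(x / r))]
             for x in range(new_w)]
            for y in range(new_h)
        ]
        pyramid.append(scaled)
    return pyramid

# Synthetic 100x48 gradient image standing in for the input image.
image = [[(x + y) % 256 for x in range(100)] for y in range(48)]
pyramid = build_pyramid(image)
sizes = [(len(img[0]), len(img)) for img in pyramid]  # (width, height) per level
```

Scanning every level with the same fixed-size window is what lets a single classifier find faces of different sizes.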

Each image in the image pyramid is scanned using a sub-window of a predetermined size (S110), and it is determined whether the scanned region is a face image or a non-face image (S120). For example, as shown in FIG. 2, the image is scanned sequentially with the sub-window W, and the region of the image covered by the sub-window W, that is, the scan region, is judged to be a face image or a non-face image. A classifier according to an embodiment of the present invention is used to determine whether the scanned region corresponds to a face pattern. In particular, the embodiment uses a classifier that takes into account, as elements characterizing a face, the x and y coordinates according to pixel position, the width, the height, and the spacing from neighboring pixels. Face image discrimination using the classifier is described in more detail later.
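The scan in step S110 amounts to sliding a fixed-size window over each pyramid image. A minimal sketch follows; the window size of 24 pixels and the step of 4 pixels are assumed values, and `scan_windows` is a hypothetical helper name.

```python
def scan_windows(width, height, win=24, step=4):
    """Yield (x, y, win, win) for every sub-window position inside the image."""
    for y in range(0, height - win + 1, step):
        for x in range(0, width - win + 1, step):
            yield (x, y, win, win)

# Scan regions for a 40x32 pyramid image.
windows = list(scan_windows(40, 32, win=24, step=4))
```

Each yielded rectangle is one scan region that would be passed to the classifier in step S120.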

Next, the scan regions judged to be face images are clustered and merged to select detection regions (S130). The scan regions that the classifier judges to be face images are likely to contain a very large number of false detections. That is, because the classifier is trained to be robust to small changes in face position and size, it produces multiple overlapping detection results for a single face. A clustering process is therefore needed to turn these into a single detection result. The scan regions judged to be face images are verified through clustering and merging, and the detection regions that are candidates for face detection are selected. The clustering process according to an embodiment of the present invention is described in more detail later.

Then, for the selected candidate detection regions, overlapping detection results are removed based on the confidence and size of each region to obtain the final candidate detection regions for face detection (S150).

A face is detected from the obtained final candidate detection regions (S160). In this case, false detections can be reduced by additionally filtering out logically impossible situations among the final candidate detection regions before detecting the face. When detection results overlap by more than a certain ratio, it is very likely that only one of them is an actual face and the rest are false detections. In such cases, the one with the lower confidence value is discarded, which filters false detections once more.
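The final overlap filter can be sketched as a greedy pass over detections sorted by confidence. The overlap measure used here (intersection area over the smaller box area), the 0.5 limit, and the function names are assumptions for illustration; the text only says that results overlapping "by more than a certain ratio" lose their lower-confidence member.

```python
def overlap_ratio(a, b):
    """Intersection area divided by the smaller box area; boxes are (x, y, w, h)."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return (ix * iy) / min(a[2] * a[3], b[2] * b[3])

def suppress_overlaps(detections, max_overlap=0.5):
    """Keep higher-confidence boxes; drop any box overlapping a kept one too much."""
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(overlap_ratio(box, k) <= max_overlap for k, _ in kept):
            kept.append((box, conf))
    return kept

detections = [
    ((10, 10, 40, 40), 3.2),
    ((14, 12, 40, 40), 1.1),   # overlaps the first box heavily, lower confidence
    ((100, 20, 30, 30), 2.5),
]
kept = suppress_overlaps(detections)
```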

The following describes in more detail how, in the face detection method above, a classifier is used to determine whether a scanned region is a face image or a non-face image.

To extract facial features from the image corresponding to a scan region (hereinafter called the scan image for convenience), a face recognition filter is applied to the scan image to obtain filter response data. The face recognition filter used here is the LBP (local binary pattern) filter, which is robust to illumination changes.

The textures of the scan image are encoded locally by the patterns of the LBP filter. The LBP operator can be represented as shown in FIG. 3.

FIG. 3 is an exemplary diagram illustrating the operator of the face recognition filter LBP according to an embodiment of the present invention.

The LBP operator defines the relationship between one central pixel and its neighboring pixels; for example, it binarizes the brightness of each neighboring pixel against the brightness value of the central pixel. As shown in FIG. 3, a rectangular operator made up of nine sub-blocks (for example, pixels) is applied to the scan image, and LBP codes are obtained by comparing the intensity value of the central sub-block with the intensity values of the neighboring sub-blocks. In facial feature extraction using LBP, the expressive power of LBP can be increased by adding degrees of freedom to the dimensions of the operator.
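The basic 3x3 LBP code can be illustrated with a short sketch. The bit convention (1 when the neighbour is at least as bright as the centre) and the clockwise reading order starting at the top-left neighbour are assumptions; the text does not fix either, and FIG. 3 is not available here.

```python
# Clockwise neighbour offsets (dy, dx), starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """8-bit LBP code of the pixel at (y, x); a bit is 1 when neighbour >= centre."""
    centre = image[y][x]
    code = 0
    for dy, dx in NEIGHBOURS:
        code = (code << 1) | (1 if image[y + dy][x + dx] >= centre else 0)
    return code

patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch, 1, 1)  # bits read clockwise: 1 0 0 0 1 1 1 1
```

Because only the ordering of brightness values matters, adding a constant to every pixel leaves the code unchanged, which is the source of the robustness to illumination mentioned above.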

In an embodiment of the present invention, an operator made up of the feature elements x coordinate, y coordinate, width (w), and height (h) of a sub-block is used to code, in 8 bits, the magnitude relationship between the central sub-block and the neighboring sub-blocks in the scan image. In particular, in addition to these four feature elements with degrees of freedom, the spacing between sub-blocks (Δx, Δy) is also taken into account so that features with higher expressive power are extracted. For convenience, this additionally considered spacing between sub-blocks is referred to below as the "pixel spacing".

FIG. 4 is a diagram illustrating the feature elements of an operator for facial feature extraction according to an embodiment of the present invention.

Of the nine sub-blocks constituting the operator, let the coordinates of the top-right sub-block be (x0, y0), its width w, its height h, and its pixel spacing from neighboring sub-blocks (Δx, Δy). The values of the surrounding sub-blocks defined by these feature elements are compared with the value of the central sub-block, and the results are binarized to obtain an LBP code. For example, the brightness of each sub-block is compared with that of the central sub-block to obtain binarized results, which are then arranged in the arrow direction shown in FIG. 4 to give the LBP code, the binary pattern "01011011".

In an embodiment of the present invention, feature elements covering the x coordinate, y coordinate, width, height, and pixel spacing in the x and y directions are assigned to the sub-blocks constituting the operator, so that the empty space between sub-blocks is reflected in the feature. With an operator whose sub-blocks are defined by these six degrees of freedom, regions that do not matter beyond the main facial features (empty areas of the face) are left out of feature vector extraction, which makes the representation of the detection target more effective. This facial feature extraction method may be called "S-LBP (space-local binary pattern)".
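A sketch of an S-LBP-style operator with the six parameters (x0, y0, w, h, Δx, Δy) is given below. Several points are assumptions made for illustration: (x0, y0) is taken as the top-left corner of the top-left sub-block for simplicity (the text anchors it at a corner sub-block shown in FIG. 4), Δx and Δy are treated as the gap between adjacent sub-blocks, each sub-block is summarized by its pixel sum via an integral image, and the bit order is clockwise from the top-left sub-block.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def block_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h block whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

# Grid positions (row, col) of the 8 outer sub-blocks, clockwise from top-left.
ORDER = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def slbp_code(ii, x0, y0, w, h, dx, dy):
    """8-bit code comparing each outer sub-block sum with the centre sub-block sum."""
    def cell(row, col):
        # Adjacent sub-blocks are separated by a gap of dx (or dy) pixels.
        return block_sum(ii, x0 + col * (w + dx), y0 + row * (h + dy), w, h)
    centre = cell(1, 1)
    code = 0
    for row, col in ORDER:
        code = (code << 1) | (1 if cell(row, col) >= centre else 0)
    return code

patch = [[6, 5, 2], [7, 6, 1], [9, 8, 7]]
# Same values spread out with one filler pixel (99) between sub-blocks.
spaced = [[99] * 5 for _ in range(5)]
for r in range(3):
    for c in range(3):
        spaced[2 * r][2 * c] = patch[r][c]

code_dense = slbp_code(integral_image(patch), 0, 0, 1, 1, 0, 0)
code_spaced = slbp_code(integral_image(spaced), 0, 0, 1, 1, 1, 1)
```

With 1x1 sub-blocks and zero spacing the operator reduces to the ordinary 3x3 LBP, and with a spacing of one pixel the filler pixels between sub-blocks are ignored, which is the "empty space" behaviour described above.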

With S-LBP as described above, 8 bits, that is, 1 byte, of filter response data are obtained, so the filter response data (the feature values) take one of the values 0 to 255. In general, the feature values 0 to 255 do not occur uniformly in an image; a few simple binary patterns dominate. For example, patterns with few 0→1 or 1→0 bit transitions dominate, while patterns with frequent bit transitions occur rarely (such patterns are often noise or are not observed repeatedly).

In an embodiment of the present invention, a feature value obtained for the scan image is assigned its own code value only when its number of bit transitions is at or below a set number (for example, four), and all feature values with more bit transitions than the set number are represented by a single preset code value. The feature value can therefore be expressed in the range 0 to 198 instead of 0 to 255.
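The reduction from 256 codes to the range 0 to 198 can be checked with a small sketch. Counting transitions circularly (the last bit wraps around to the first) and assigning labels in increasing code order are assumptions; the names `transitions` and `build_code_table` are hypothetical. Under these assumptions, 198 of the 256 codes have at most four transitions, and one extra shared label gives 199 values.

```python
def transitions(code):
    """Number of 0->1 or 1->0 changes when the 8 bits are read circularly."""
    rotated = ((code << 1) | (code >> 7)) & 0xFF
    return bin(code ^ rotated).count("1")

def build_code_table(max_transitions=4):
    """Map each 8-bit code to a compact label; rare codes share the last label."""
    table, next_label = {}, 0
    for code in range(256):
        if transitions(code) <= max_transitions:
            table[code] = next_label
            next_label += 1
    shared = next_label  # single label for all remaining codes
    for code in range(256):
        table.setdefault(code, shared)
    return table

table = build_code_table()
num_labels = len(set(table.values()))
```

In this sketch the example pattern "01011011" has six circular transitions, so it would fall into the shared label.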

In an embodiment of the present invention, a classifier that determines whether an image is a face image or a non-face image is implemented using the (discrete) feature values in the set range 0 to 198.

When the classifier is implemented as a weak classifier, it can be built from a likelihood ratio using histograms. Representing the LBP codes of a scan image as a histogram gives a good texture representation of that image. The weak classifier h(x) for a feature value x (LBP code x) can be expressed as follows.
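The expression itself is not reproduced in this text. A form commonly used for histogram-based weak classifiers in RealBoost, and consistent with the definitions of P_pos(x), P_neg(x), and ε given next, would be the following; it is an assumed reconstruction, not confirmed by the source.

```latex
h(x) = \frac{1}{2} \ln \frac{P_{\mathrm{pos}}(x) + \varepsilon}{P_{\mathrm{neg}}(x) + \varepsilon}
```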

Here, P_pos(x) is the probability distribution of the feature value x over face samples, and P_neg(x) is the probability distribution of x over non-face samples. ε is a small constant that prevents division by zero.

Because the feature value x is discrete, the probability distributions P_pos(x) and P_neg(x) can be modeled as histograms with a predetermined number of bins, for example 199.
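A histogram-based weak classifier of this kind can be sketched as a 199-entry lookup table. The half-log-ratio form, the value of `eps`, and the function name `train_weak_classifier` are assumptions; the text only states that the classifier is a likelihood ratio built from histograms with ε guarding against division by zero.

```python
import math

def train_weak_classifier(face_codes, nonface_codes, bins=199, eps=1e-6):
    """Build a lookup table h[x] = 0.5 * ln((P_pos[x] + eps) / (P_neg[x] + eps))."""
    def histogram(codes):
        counts = [0] * bins
        for c in codes:
            counts[c] += 1
        total = max(1, len(codes))
        return [n / total for n in counts]  # normalized to a probability distribution
    p_pos, p_neg = histogram(face_codes), histogram(nonface_codes)
    return [0.5 * math.log((p + eps) / (q + eps)) for p, q in zip(p_pos, p_neg)]

# Toy training data: code 3 is typical of faces, code 7 of non-faces.
h = train_weak_classifier(face_codes=[3, 3, 3, 7], nonface_codes=[7, 7, 7, 3])
```

At detection time the weak classifier is then a single table lookup, `h[x]`, which is part of why this kind of feature is cheap to evaluate.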

Because such a weak classifier h(x) is built from a single S-LBP feature, its classification performance is not high. To classify faces and non-faces accurately, a strong classifier that combines h(x) built from a variety of S-LBP features is required. A strong classifier can be implemented by combining multiple S-LBP-based weak classifiers h(x) whose operator sub-block feature elements (the x and y coordinates according to pixel position, the width, the height, and the pixel spacing from neighboring pixels) take different values. In this case, a boosting algorithm (for example, RealBoost) can be used to select, from the many weak classifiers, those that separate face images from non-face images well.

The strong classifier according to an embodiment of the present invention can be expressed as follows.
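As a sketch only: a form consistent with the later statement that the classifier output is a sum of likelihood ratios. The number of selected weak classifiers T and the per-classifier feature value x_t are notation assumed for this illustration.

```latex
H(x) = \sum_{t=1}^{T} h_t(x_t)
```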

Here, H(x) is a function whose value becomes larger the closer the input feature values are to a face. Whether the image is a face or a non-face is determined by comparing H(x) with a threshold set according to the desired face detection performance (the target performance in the trade-off between the false detection rate and the detection rate).

As described above, in an embodiment of the present invention, whether the image of a scanned region is a face image or a non-face image is determined using a strong classifier that combines classifiers whose operators take different values for the feature elements: the x and y coordinates of the sub-block position, the width, the height, and the pixel interval to neighboring sub-blocks.
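The decision described above can be sketched as a sum of weak classifier outputs compared with a threshold. This is an illustration under assumptions: the function name, the pairing of one feature value with each weak classifier, and the use of `>=` at the threshold are not specified in the text.

```python
def strong_classify(feature_values, weak_classifiers, threshold):
    """Return (H, is_face) for one scan region.

    feature_values[t] is the feature value produced by the operator of
    weak classifier t; weak_classifiers[t] maps that value to a score.
    """
    score = sum(h(x) for h, x in zip(weak_classifiers, feature_values))
    return score, score >= threshold


# Two toy weak classifiers standing in for trained lookup tables.
weak = [
    lambda x: 0.5 if x == 1 else -0.5,
    lambda x: 1.0 if x == 3 else -1.0,
]
score, is_face = strong_classify([1, 3], weak, threshold=0.0)
```

Raising the threshold trades detection rate for fewer false detections, which is the trade-off the text refers to.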

Meanwhile, because the classifier is trained to be robust to small changes in the position and size of a face, a plurality of overlapping detection results are obtained for a single face. The clustering process that turns these detection results into a single detection result is described in more detail below.

FIG. 5 is a flowchart illustrating the clustering process in the face detection method according to an embodiment of the present invention.

In an embodiment of the present invention, initial detection results whose mutual overlap is at or above a certain level are judged to overlap, and the set of overlapping detection results is updated iteratively.

The initial detection results, that is, the regions determined to be face images by applying the strong classifier to the image (the face candidate regions), are sorted in a predetermined order, for example descending order, to obtain the detection result pool D = {d₁, d₂, d₃, …, d_n}. In other words, D can be obtained by sorting the face candidate regions in descending order of their H(x) values (S300).

Clusters are formed from the face candidate regions that make up the detection result pool D. For example, d₁ is assigned to a cluster c₁ to form the cluster c₁ = {d₁}, and the cluster set C = {c₁} is created (S310).

Then, starting from the cluster set C = {c₁}, the face candidate regions are clustered as follows.

For an arbitrary element d_j in the detection result pool D, the size of the region in which it overlaps an arbitrary member d_k of c_i is computed (S320, S330). For example, the size of the region in which d₂ overlaps d₁, a member of the first cluster c₁, is computed.

If the size of the overlapping region is at least a set ratio of the smaller of the two face candidate regions (S340), and the difference in size between the two face candidate regions is within a set difference so that the two are of similar size (S350), the two are judged to have detected the same region and d_j is added to c_i (S360, S370). For example, if the overlap between d₂ and d₁ is at least the set ratio of the smaller of d₂ and d₁, and d₂ and d₁ are of similar size, d₂ is judged to be the same region as d₁ and is added as a member of the first cluster c₁, which contains d₁. The verified face candidate region d_j (for example, d₂) is then removed from the detection result pool D.

On the other hand, if the size of the overlapping region is below the set ratio of the smaller of the two face candidate regions, or if the overlap reaches the set ratio but the difference in size between the two face candidate regions exceeds the set difference so that the two are not of similar size, the two are judged to have detected different regions. If the cluster being compared, that is, the first cluster c₁, has another member (S380), the above process (S330 to S370) is repeated with that member. If the cluster has no other member, it is determined whether another cluster (for example, c_i₊₁) exists (S390); if so, the above process (S330 to S370) is repeated for the members of that cluster. If no other cluster exists, a new cluster is created for the different region (S400). For example, if d_j is judged to be a different region from the members of cluster c_i and there are no other clusters, a new cluster c_i₊₁ corresponding to d_j is created and added to the cluster set C. In this case, the cluster set becomes C = {c₁, c₂}.

When the process described above has been carried out for every element in the detection result pool D, the clustering process ends (S410, S420). Through clustering, the face candidate regions are grouped into separate clusters. These clusters are referred to as candidate detection regions for face detection.
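The clustering steps S300 to S420 can be sketched as follows. This is a minimal illustration under stated assumptions: the `Detection` type, the threshold defaults `min_overlap=0.5` and `max_size_ratio=1.5`, and the use of an area ratio as the "similar size" test are hypothetical choices, since the text gives no concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    x: float
    y: float
    w: float
    h: float
    score: float  # strong classifier output H(x)

    @property
    def area(self):
        return self.w * self.h


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    ox = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    oy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(0.0, ox) * max(0.0, oy)


def same_face(a, b, min_overlap, max_size_ratio):
    """S340/S350: enough overlap relative to the smaller region, similar size."""
    small, large = sorted((a.area, b.area))
    if small <= 0:
        return False
    enough_overlap = overlap_area(a, b) >= min_overlap * small
    similar_size = large / small <= max_size_ratio  # assumed similarity test
    return enough_overlap and similar_size


def cluster_detections(detections, min_overlap=0.5, max_size_ratio=1.5):
    # S300: sort the pool in descending order of classifier output.
    pool = sorted(detections, key=lambda d: d.score, reverse=True)
    clusters = []
    for d in pool:
        for members in clusters:  # S380/S390: try every member of every cluster
            if any(same_face(d, m, min_overlap, max_size_ratio) for m in members):
                members.append(d)  # S360/S370
                break
        else:
            clusters.append([d])  # S310/S400: start a new cluster
    return clusters


d1 = Detection(10, 10, 20, 20, 5.0)
d2 = Detection(12, 11, 20, 20, 4.0)   # nearly the same box as d1
d3 = Detection(100, 100, 20, 20, 3.0)  # far away
d4 = Detection(0, 0, 60, 60, 2.0)      # covers d1 but is much larger
clusters = cluster_detections([d3, d2, d4, d1])
```

Here `d2` joins the cluster of `d1`, while `d3` and `d4` each start their own cluster; `d4` overlaps `d1` fully but fails the size test.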

Besides merging overlapping detection results into one, this clustering process is also useful for face/non-face discrimination based on the cluster size (the number of members in a cluster). A real face tends to produce a large cluster, whereas a false detection tends to produce a very small one. For example, a cluster of size 2 to 3 or more is likely to be a real face, and a cluster of size 1 is in most cases a false detection.

In an embodiment of the present invention, the following verification process is additionally performed on the candidate detection regions obtained by clustering, so that falsely detected regions are removed and only the final candidate detection regions are obtained.

FIG. 6 is a flowchart of the verification process additionally performed in the clustering process of the face detection method according to an embodiment of the present invention.

As shown in FIG. 6, the face candidate regions are grouped into separate clusters through clustering, and the clusters are then filtered by size. The size of each cluster is compared with a set size, and only clusters at least as large as the set size are extracted (S500, S510). For example, a cluster is extracted only if its size, that is, the number of its members, is 2 or more, and is not selected if its size is less than 2.

For each extracted cluster, a reliability (confidence) is obtained by summing the classifier outputs of its members (S520). For example, if a cluster c₁ of size 2 contains the two members d₁ and d₂, the reliability of c₁ is calculated by adding the classifier output H₁(x) for d₁ and the classifier output H₂(x) for d₂.

The reliability of each extracted cluster is then compared with a set value (S530). Extracted clusters whose reliability is lower than the set value are not selected (S540), and only clusters whose reliability is higher than the set value are selected and used as final candidate detection regions (S550).

Because the classifier output is a sum of likelihood ratios, filtering on the sum of the outputs of the overlapping detection results that are the members of a cluster effectively reduces false detections while maintaining the detection rate.
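The verification steps S500 to S550 can be sketched as a two-stage filter. This is an illustration under assumptions: each member is represented as a hypothetical tuple `(x, y, w, h, score)`, the function names are invented for the example, and `>=` is used at the reliability threshold, following the claims' "equal to or greater" wording.

```python
def cluster_reliability(cluster):
    """S520: sum of the strong classifier outputs of the members."""
    return sum(member[4] for member in cluster)


def select_final_candidates(clusters, min_size, min_reliability):
    """Keep clusters that pass the size test and then the reliability test."""
    selected = []
    for cluster in clusters:
        if len(cluster) < min_size:  # S500/S510: too few members
            continue
        if cluster_reliability(cluster) >= min_reliability:  # S530 to S550
            selected.append(cluster)
    return selected


clusters = [
    [(0, 0, 10, 10, 3.0), (1, 1, 10, 10, 4.0)],      # size 2, reliability 7.0
    [(50, 50, 10, 10, 9.0)],                         # size 1
    [(80, 80, 10, 10, 1.0), (81, 80, 10, 10, 1.5)],  # size 2, reliability 2.5
]
final = select_final_candidates(clusters, min_size=2, min_reliability=5.0)
```

Only the first cluster survives: the second has a single member despite its high score, and the third has two members but a low summed output.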

For the clusters corresponding to the final candidate detection regions obtained through this verification process, the members of each cluster are merged to obtain the face detection region to be finally output (S560). For example, if cluster c₁ is selected as a final candidate detection region, its two members d₁ and d₂ are merged to obtain the final detection region for c₁. The final detection region can be calculated as the average of the regions that are the members of the cluster.
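The merging step S560 can be sketched as a component-wise average. This is one possible reading: the text says only that the final region is the average of the member regions, so averaging x, y, width, and height separately, and the tuple layout `(x, y, w, h, ...)`, are assumptions.

```python
def merge_cluster(cluster):
    """Average the (x, y, w, h) of all members of a non-empty cluster."""
    n = len(cluster)
    return tuple(sum(member[i] for member in cluster) / n for i in range(4))


merged = merge_cluster([(10, 10, 20, 20), (12, 14, 24, 22)])
# merged == (11.0, 12.0, 22.0, 21.0)
```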

Through this verification process, which includes clustering and merging, the clusters corresponding to the final candidate detection regions are obtained, and a face detection region is obtained for each cluster. These final candidate detection regions may still contain overlapping results. That is, there may be regions that were not grouped into one cluster but nevertheless overlap each other (for example, several small detection results inside a large detection region).

Such a situation cannot actually occur with real faces. An embodiment of the present invention therefore reduces false detections by filtering out, from the final candidate detection regions, cases that cannot logically occur. Specifically, a cluster whose reliability value is at or below a set reliability is removed and excluded from the final candidate detection regions for face detection.

Regions corresponding to actual faces are detected from the final candidate detection regions obtained through these processes. For example, clusters whose reliability is at or above the set reliability are extracted from the clusters selected as final candidate detection regions, and the face detection region of each extracted cluster is finally detected as a face region.

FIG. 7 is a diagram illustrating the structure of a face detection apparatus according to an embodiment of the present invention.

As shown in FIG. 7, the face detection apparatus 1 according to an embodiment of the present invention includes an image pyramid generation unit 11, an image scanning unit 12, a region determination unit 13, a clustering unit 14, and a face detection unit 15.

The image pyramid generation unit 11 generates an image pyramid containing a plurality of images obtained by reducing the input image at different set ratios.

The image scanning unit 12 scans each image in the image pyramid using a window of a set size.
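The pyramid generation and window scan performed by units 11 and 12 can be sketched without any image library by enumerating the window rectangles only. The window size of 24, the stride, and the scale step are illustrative assumptions; the text states only that the reduction ratios and the window size are set values.

```python
def pyramid_scales(width, height, window=24, scale_step=1.25):
    """Yield (scale, scaled_width, scaled_height) for each pyramid level."""
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if w < window or h < window:  # level too small to hold one window
            break
        yield scale, w, h
        scale *= scale_step


def scan_windows(width, height, window=24, stride=4, scale_step=1.25):
    """Yield scan regions (x, y, size) in input-image coordinates."""
    for scale, w, h in pyramid_scales(width, height, window, scale_step):
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                # Map the fixed-size window back to the original image.
                yield int(x * scale), int(y * scale), int(window * scale)


levels = list(pyramid_scales(48, 48, window=24, scale_step=2.0))
regions = list(scan_windows(48, 48, window=24, stride=12, scale_step=2.0))
```

Scanning a fixed-size window over progressively smaller images is equivalent to scanning progressively larger windows over the original image, which is how faces of different sizes are covered.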

The region determination unit 13 determines whether a scanned region is a face image or a non-face image. In particular, to extract facial features, the region determination unit 13 extracts a feature value from the scan image using an operator composed of sub-blocks defined by feature elements that include the x and y coordinates of the position, the width, the height, and the pixel interval to neighboring blocks, and uses the extracted feature value to determine whether the image of the scanned region is a face image or a non-face image. The region determination unit 13 may also be referred to as a classifier.

The region determination unit 13 determines whether each scan image is a face image or a non-face image, and outputs the scan regions determined to be face images as candidate face regions.

The clustering unit 14 clusters and merges the detection results of the region determination unit 13 to remove falsely detected regions from the scan regions determined to be face images. To this end, the clustering unit 14 includes a first processing unit 141 that clusters the candidate face regions from the region determination unit 13 into a plurality of clusters based on their sizes, a second processing unit 142 that filters on the size and reliability of each cluster to obtain the final candidate detection regions, and a merging unit 143 that merges the members of each cluster selected as a final candidate detection region to obtain a face detection region for each cluster.

The face detection unit 15 filters out, from the final candidate detection regions, those whose reliability is at or below the set reliability, and detects faces in the remaining final candidate detection regions.

Meanwhile, to further improve the performance of the region determination unit 13, which extracts facial features and classifies face images and non-face images on that basis, the region determination unit 13 may have the following structure.

FIG. 8 is a diagram illustrating the structure of the region determination unit according to an embodiment of the present invention.

The region determination unit 13 may include a plurality of weak classifiers 1311 to 131n, each of which classifies a scan image as a face image or a non-face image using an operator whose feature elements take different values, and a strong classifier 132 that, for each scan image, calculates a strong classification value H(x) from the outputs of the weak classifiers and compares it with a preset threshold to finally determine whether the scan image is a face image or a non-face image.

Here, at least one of the feature elements of the operator used by each of the weak classifiers 1311 to 131n has a value different from that of the corresponding feature element of the operator of another weak classifier. For example, at least one of the feature elements of the first weak classifier, namely the coordinates (x coordinate, y coordinate), the width, the height, and the interval (interval in the x direction, interval in the y direction), may differ from the corresponding feature element of the second weak classifier. That is, the height of the sub-blocks of the operator of the first weak classifier may differ from the height of the sub-blocks of the operator of the second weak classifier.

The embodiments of the present invention described above are not implemented only through the apparatus and the method. They may also be implemented through a program that carries out functions corresponding to the configuration of the method and system according to the embodiments, or through a computer-readable recording medium on which that program is recorded. Such an implementation can readily be carried out by a person skilled in the art to which the present invention belongs from the description of the embodiments above.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

A face detection method comprising:
generating a face pyramid comprising images for an input image;
acquiring a scan region by scanning each image included in the face pyramid using a window of a set size;
applying an operator based on feature elements for facial feature extraction to the scan region to obtain a feature value by coding the image, and determining, based on the feature value, whether the scan image in the scan region is a face image or a non-face image;
extracting a final candidate detection region from among all clusters based on the cluster size, which is based on the number of scan regions included in each cluster, and the reliability of each cluster; and
detecting a face from the final candidate detection region,
wherein the operator includes a plurality of sub-blocks, and the feature elements include the coordinates, width, and height of each sub-block and the interval to neighboring sub-blocks.

A face detection method comprising:
generating a face pyramid comprising images for an input image;
acquiring a scan region by scanning each image included in the face pyramid using a window of a set size;
applying an operator based on feature elements for facial feature extraction to the scan region to obtain a feature value by coding the image, and determining, based on the feature value, whether the scan image in the scan region is a face image or a non-face image;
clustering the scan regions determined to be face images, and extracting a final candidate detection region from among those scan regions based on the size of the clustered scan regions; and
detecting a face from the final candidate detection region,
wherein the operator includes a plurality of sub-blocks, and the feature elements include the coordinates, width, and height of each sub-block and the interval to neighboring sub-blocks, and
wherein the determining includes:
applying, in each of a plurality of weak classifiers, an operator composed of sub-blocks according to the feature elements to the scan region, and coding the scan image of the scan region according to the pixel values of the region corresponding to the central sub-block and the regions corresponding to the neighboring sub-blocks, to obtain feature values;
calculating a strong classification value for each scan region based on the feature values obtained by the plurality of weak classifiers; and
comparing the strong classification value with a preset threshold to determine whether the image of the scan region is a face image or a non-face image.

The method according to claim 2, wherein at least one of the feature elements constituting the operator used in each of the plurality of weak classifiers has a value different from that of the corresponding feature element of the operator of another weak classifier.

The method according to claim 2, wherein the coordinates among the feature elements include an x coordinate and a y coordinate, and the interval to neighboring sub-blocks includes an interval in the x direction and an interval in the y direction.

The method according to claim 2, wherein the extracting of the final candidate detection region includes extracting the final candidate detection region from among the scan regions based on the size of the clustered scan regions and a reliability calculated from the strong classification values of the scan regions.

A face detection method comprising:
generating a face pyramid comprising images for an input image;
acquiring a scan region by scanning each image included in the face pyramid using a window of a set size;
applying an operator based on feature elements for facial feature extraction to the scan region to obtain a feature value by coding the image, and determining, based on the feature value, whether the scan image in the scan region is a face image or a non-face image;
clustering the scan regions determined to be face images, and extracting a final candidate detection region from among those scan regions based on the size of the clustered scan regions; and
detecting a face from the final candidate detection region,
wherein the operator includes a plurality of sub-blocks, and the feature elements include the coordinates, width, and height of each sub-block and the interval to neighboring sub-blocks, and
wherein the extracting of the final candidate detection region includes:
clustering the scan regions determined to be face images into different clusters according to their sizes;
calculating the size of each cluster based on the number of members, that is, the scan regions included in the cluster;
extracting, from among the clusters, the clusters whose size is larger than a predetermined size;
calculating a reliability for each of the extracted clusters; and
selecting only the clusters whose reliability is equal to or greater than a set value and using them as the final candidate detection region.

The method of claim 6, further comprising merging, for each cluster selected as the final candidate detection region, the regions included in the cluster to obtain a face detection region for each cluster.

The method of claim 7, wherein the detecting of the face includes extracting, from among the clusters selected as the final candidate detection region, the clusters whose reliability is equal to or greater than a set value, and finally detecting the face detection region of each extracted cluster as a face region.

A face detection apparatus comprising:
an image pyramid generation unit that generates a face pyramid composed of images for an input image;
an image scanning unit that scans each image included in the face pyramid using a window of a set size to acquire a scan region;
a region determination unit that applies, to the scan region, an operator based on feature elements for facial feature extraction to obtain a feature value by coding the image, and determines, based on the feature value, whether the scan image in the scan region is a face image or a non-face image, the operator including a plurality of sub-blocks, the feature elements including the x and y coordinates, the width, and the height of each sub-block and the interval to neighboring sub-blocks, and the interval including an interval in the x direction and an interval in the y direction;
a clustering unit that clusters the scan regions determined to be face images into different clusters, and extracts a final candidate detection region from among all clusters based on the cluster size, which is based on the number of scan regions included in each cluster, and the reliability of each cluster; and
a face detection unit that detects a face from the final candidate detection region.

A face detection apparatus comprising:
an image pyramid generation unit that generates a face pyramid composed of images for an input image;
an image scanning unit that scans each image included in the face pyramid using a window of a set size to acquire a scan region;
a region determination unit that applies, to the scan region, an operator based on feature elements for facial feature extraction to obtain a feature value by coding the image, and determines, based on the feature value, whether the scan image in the scan region is a face image or a non-face image, the operator including a plurality of sub-blocks, the feature elements including the x and y coordinates, the width, and the height of each sub-block and the interval to neighboring sub-blocks, and the interval including an interval in the x direction and an interval in the y direction;
a clustering unit that clusters the scan regions determined to be face images and extracts a final candidate detection region from among those scan regions based on the size of the clustered scan regions; and
a face detection unit that detects a face from the final candidate detection region,
wherein the region determination unit includes:
a plurality of weak classifiers, each of which applies an operator composed of sub-blocks according to the feature elements to the scan region and codes the scan image of the scan region according to the pixel values of the region corresponding to the central sub-block and the regions corresponding to the neighboring sub-blocks to obtain a feature value; and
a strong classifier that calculates a strong classification value for each scan region based on the feature values obtained by the plurality of weak classifiers, and compares the strong classification value with a preset threshold to determine whether the image of the scan region is a face image or a non-face image, and
wherein at least one of the feature elements constituting the operator used in each of the plurality of weak classifiers has a value different from that of the corresponding feature element of the operator of another weak classifier.

A face detection apparatus comprising:
an image pyramid generation unit that generates a face pyramid composed of images for an input image;
an image scanning unit that scans each image included in the face pyramid using a window of a set size to acquire a scan region;
a region determination unit that applies, to the scan region, an operator based on feature elements for facial feature extraction to obtain a feature value by coding the image, and determines, based on the feature value, whether the scan image in the scan region is a face image or a non-face image, the operator including a plurality of sub-blocks, the feature elements including the x and y coordinates, the width, and the height of each sub-block and the interval to neighboring sub-blocks, and the interval including an interval in the x direction and an interval in the y direction;
a clustering unit that clusters the scan regions determined to be face images and extracts a final candidate detection region from among those scan regions based on the size of the clustered scan regions; and
a face detection unit that detects a face from the final candidate detection region,
wherein the clustering unit includes:
a first processing unit that clusters the scan regions determined to be face images by the region determination unit into a plurality of clusters based on the sizes of the scan regions;
a second processing unit that performs filtering based on the size and reliability of each cluster to obtain the final candidate detection region; and
a merging unit that merges the regions that are members of each cluster selected as the final candidate detection region to obtain a face detection region for each cluster.

The apparatus of claim 11, wherein the second processing unit calculates the size of each cluster based on the number of members, that is, the scan regions included in the cluster, extracts the clusters whose size is larger than a preset size, and selects only the clusters whose reliability is higher than a set value for use as the final candidate detection region.

The apparatus of claim 11, wherein the merging unit calculates the average region of the regions included in each cluster selected as the final candidate detection region and uses the average region as the face detection region of that cluster.