KR101198322B1 - Method and system for recognizing facial expressions

Info

Publication number: KR101198322B1
Application number: KR1020110096332A
Authority: KR
Inventors: 김용경
Original assignee: (주) 어펙트로닉스; 김용국
Priority date: 2011-09-23
Filing date: 2011-09-23
Publication date: 2012-11-06
Also published as: WO2013042992A1

Abstract

PURPOSE: A facial expression recognition method and system are provided that strengthen local spatial features and weaken shape information carrying unnecessary information by applying a DoG (Difference of Gaussian) kernel to the image supplied to an AAM (Active Appearance Model). CONSTITUTION: A DoG kernel (514) convolves an input image. An AAM modeling unit (516) extracts a face region from the convolved image, extracts appearance and shape parameters from the face region, transforms a stored AAM based on the extracted facial feature elements, and synthesizes it with the face region. The AAM modeling unit recognizes a facial expression by updating the appearance and shape parameters until the synthesized face image converges to the image forming the input face region. An EFM (Enhanced Fisher Model) classifier (518) processes the appearance and shape parameters with an EFM classification method. [Reference numerals] (510) Camera; (512) Image pre-processing unit; (514) DoG kernel; (516) AAM modeling unit; (518) EFM classifier; (520) Display unit; (522) Database

Description

Facial expression recognition method and system {METHOD AND SYSTEM FOR RECOGNIZING FACIAL EXPRESSIONS}

The present invention relates to a facial expression recognition method and system. More specifically, a DoG (Difference of Gaussian) kernel is applied to the image supplied to an AAM (Active Appearance Model) to improve the visibility of fine detail and reduce noise. This strengthens the features of local regions such as the eyes, nose, and mouth and weakens appearance information that carries repetitive, unnecessary information, such as the cheeks. As a result, unnecessary information can be removed from the image, and important information that would otherwise be lost to lighting can be preserved through object feature extraction.

Various techniques for extracting a user's facial expression from an acquired image have been introduced. Current techniques for facial feature detection include methods using edge information; approaches based on luminance, chrominance, and the geometric appearance and symmetry of the face; principal component analysis (PCA); methods using template matching; approaches using the curvature of the face; and methods using neural networks.

Most methods that use edge information require a binarization step, so under uneven lighting the loss of edge information makes it difficult to produce stable results. Principal component analysis is easy to implement in real time but is sensitive to region segmentation and to the background image. Neural network approaches can operate stably under restricted conditions, but adjusting the network parameters when learning new inputs is difficult and time-consuming.

Facial feature detection has been further extended, and research is actively under way on recognizing a face region in a planar two-dimensional image and rendering it as a three-dimensional image. Most of these studies generate a three-dimensional face model of a specific person by synthesizing two-dimensional images taken from several directions (for example 4, 8, or 16 directions) and using the resulting panoramic face image as a texture image.

A representative model among such continuous facial expression recognition methods is the AAM (Active Appearance Model). The AAM applies principal component analysis (PCA) to face shape vectors and face surface texture vectors, warps the input onto a sample face model built from the face statistics of many people, and minimizes the squared error between the sample face data and the face data of the normalized two-dimensional image. Facial feature points are located using this data. The AAM has the advantages of fast computation and support for tracking.

However, the AAM performs considerably worse in mobile environments where lighting changes are severe. The reason is that the optimized parameters used to update the AAM are computed from the error between the model and the input image; if the input image and the training set images are dissimilar because of, for example, illumination or pose changes, the error grows and the parameters become difficult to compute.

Korean Laid-Open Patent No. 2010-0081874 (July 15, 2010): User-tailored facial expression recognition method and apparatus

The present invention has been devised to solve the problems described above. Its technical object is to provide a facial expression recognition method and system that, by applying a DoG (Difference of Gaussian) kernel to the image supplied to an AAM (Active Appearance Model), improves the visibility of fine detail and reduces noise, thereby preserving important information that would otherwise be lost to lighting.

To achieve the above object, a facial expression recognition method according to the present invention comprises: convolving an input image using a DoG kernel; extracting a face region from the convolved image; extracting an appearance parameter and a shape parameter from the face region; transforming a previously stored statistical face model (AAM; Active Appearance Model) based on the extracted facial feature elements and synthesizing it with the face region; and recognizing a facial expression by updating the appearance and shape parameters until the synthesized face image converges to the image forming the input face region to within a preset mapping value.

Here, convolving the input image using the DoG kernel may include convolving the image with each of two Gaussian kernels having different standard deviations to produce two blurred images, and then computing the difference image of the two.

The method may further include classifying the facial expression by processing the appearance and shape parameters with an EFM (Enhanced Fisher Model) classification method.

To achieve the above object, a facial expression recognition system according to the present invention comprises: a DoG kernel that convolves an input image; and an AAM modeling unit that extracts a face region from the convolved image, extracts an appearance parameter and a shape parameter from the face region, transforms a previously stored statistical face model (AAM; Active Appearance Model) based on the extracted facial feature elements and synthesizes it with the face region, and then recognizes a facial expression by updating the appearance and shape parameters until the synthesized face image converges to the image forming the input face region to within a preset mapping value.

Here, the DoG kernel may convolve the image with each of two Gaussian kernels having different standard deviations to produce two blurred images, compute the difference image of the two, and provide it to the AAM modeling unit.

The system may further include an EFM classifier that classifies the facial expression by processing the appearance and shape parameters with an EFM (Enhanced Fisher Model) classification method.

As described above, the facial expression recognition method and system of the present invention apply a DoG (Difference of Gaussian) kernel to the image supplied to an AAM (Active Appearance Model), which improves the visibility of fine detail and reduces noise. The features of local regions such as the eyes, nose, and mouth are thereby strengthened, and shape information that carries repetitive, unnecessary information, such as the cheeks, is weakened.

In addition, by applying the DoG kernel to the image supplied to the AAM, the method and system remove unnecessary information from the image and then extract object features, so that important information that would otherwise be lost to lighting is preserved.

FIG. 1 is a conceptual diagram of a facial expression recognition method according to an embodiment of the present invention;
FIG. 2 is a control flowchart of a facial expression recognition system according to an embodiment of the present invention;
FIGS. 3 and 4 show comparative experimental results of facial expression recognition by the present invention and the prior art under changing illumination;
FIGS. 5 and 6 show comparative experimental results of facial expression recognition by the present invention and the prior art under changing facial expressions;
FIG. 7 is a control block diagram of a facial expression recognition system according to an embodiment of the present invention;
FIG. 8 is a control flowchart of a facial expression recognition system according to another embodiment of the present invention; and
FIG. 9 is a usage state diagram of a facial expression recognition system according to an embodiment of the present invention.

Hereinafter, a facial expression recognition method and system according to an embodiment of the present invention are described in detail with reference to the accompanying drawings. In describing the present invention, detailed descriptions of related known functions or configurations are omitted where they could unnecessarily obscure the subject matter of the invention.

FIG. 1 is a conceptual diagram of a facial expression recognition method according to an embodiment of the present invention.

As shown in FIG. 1, the present invention convolves an input image (A) with a DoG (Difference of Gaussian) kernel to generate a convolved image (B). AAM (Active Appearance Model) image fitting is then performed on the convolved image (B) to generate an AAM model (C), and a training set is applied to produce an output image (D) in which the facial expression is recognized.

The DoG kernel is an image processing algorithm that removes noise from a gray image and detects features. It convolves the image with each of two Gaussian kernels having different standard deviations to produce two blurred images, and then computes the difference image of the two. The DoG kernel can be defined as in [Equation 1] below.

[Equation 1]

In Equation 1, L(x, y, kσ) and L(x, y, σ) are Gaussian kernels with different standard deviations, kσ and σ.
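The equation itself is not reproduced in this text. As a hedged reconstruction only, the commonly used difference-of-Gaussians form that matches the symbols L(x, y, kσ) and L(x, y, σ) named above is sketched below. The symbols D (DoG response), G (two-dimensional Gaussian) and I (input image) are assumptions introduced here and do not appear in the source; in this common form, L denotes the image after blurring with the Gaussian of the given standard deviation.

```latex
D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma),
\qquad
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),
\qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}
  \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)
```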

The DoG kernel is an algorithm for image feature detection and is useful for enhancing the visibility of edges and other details in digital images. Because the DoG kernel reduces noise through Gaussian filtering, it can remove unnecessary information from the image and, through object feature extraction, preserve important information that would otherwise be removed by lighting. In particular, when the DoG kernel is applied to a face image, the features of local shapes such as the eyes, nose, and mouth are strengthened, and shape information that carries repetitive, unnecessary information, such as the cheeks, is weakened.
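The two-blur-and-subtract procedure described above can be illustrated with a minimal pure-Python sketch. It is not the patent's implementation: the function names, the default values `sigma=1.0` and `k=1.6`, the 3σ kernel radius, and the border replication are all assumptions chosen for illustration, and a practical system would normally use an array or image library instead of nested lists.

```python
import math

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel; the 3*sigma radius is an assumed cutoff."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with a symmetric 1-D kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    result = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for offset, weight in enumerate(kernel):
                xx = min(max(x + offset - radius, 0), width - 1)
                acc += weight * row[xx]
            new_row.append(acc)
        result.append(new_row)
    return result

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(sigma)
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))

def difference_of_gaussians(image, sigma=1.0, k=1.6):
    """Blur with k*sigma and with sigma, then subtract pixel by pixel."""
    wide = gaussian_blur(image, k * sigma)
    narrow = gaussian_blur(image, sigma)
    return [[a - b for a, b in zip(row_w, row_n)]
            for row_w, row_n in zip(wide, narrow)]
```

In this sketch a uniformly lit flat region yields a response near zero, while a step edge yields a non-zero response around the edge, which is the behaviour the text attributes to the DoG kernel for cheeks versus eyes, nose, and mouth.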

In the image (B) convolved with the DoG kernel, the local shapes that carry much of the information in a face image, for example feature parts such as the eyes, nose, and mouth, are strengthened, and the face shape is thereby recognized.

Facial expressions can be recognized by performing AAM fitting on the convolved image (B).

When the DoG kernel is applied to the AAM, the AAM can be defined as in [Equation 2] below.

[Equation 2]

In the above equation, * denotes the image to which the DoG kernel has been applied, that is, the convolved image (B).

The fitting algorithm used in the AAM extracts facial feature elements, transforms the statistical face model based on the extracted elements, and models a synthesized face image that matches the face region. The appearance and shape parameters are then updated iteratively, reducing the error between the model and the image, until the synthesized face image converges to the image forming the input face region to within a preset mapping value.

That is, once the appearance and shape parameters of the input image have been measured, the input image is aligned to the coordinate frame, the current model instance (C) is convolved with the training set, and the error image between it and the image being fitted by the AAM is computed and minimized. The fitting algorithm repeats until the error satisfies the aforementioned threshold or a specified number of iterations is reached, so that the facial expression with the optimized error can be recognized.
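The stopping rule described above (repeat until the error falls within a threshold or an iteration limit is reached) can be sketched on a toy linear model. This is an illustration under stated assumptions, not the AAM fitting procedure of the patent: there is no image warping, the model is simply a mean vector plus weighted variation modes, the update is a plain gradient step, and `synthesize`, `fit_parameters`, `step`, `threshold`, and `max_iterations` are hypothetical names and values.

```python
def synthesize(mean, modes, params):
    """Model instance: mean vector plus a weighted sum of variation modes."""
    return [m + sum(p * mode[i] for p, mode in zip(params, modes))
            for i, m in enumerate(mean)]

def fit_parameters(observed, mean, modes,
                   threshold=1e-6, max_iterations=100, step=0.5):
    """Update parameters until the squared error is within the threshold
    or the iteration limit is reached. Returns the parameters, the error
    at the last evaluation, and the number of updates performed."""
    params = [0.0] * len(modes)
    error = float("inf")
    iterations = 0
    while iterations < max_iterations:
        model = synthesize(mean, modes, params)
        residual = [o - m for o, m in zip(observed, model)]  # error "image"
        error = sum(r * r for r in residual)
        if error <= threshold:
            break
        # Gradient step: project the residual onto each mode.
        params = [p + step * sum(r * mode[i] for i, r in enumerate(residual))
                  for p, mode in zip(params, modes)]
        iterations += 1
    return params, error, iterations
```

With orthonormal modes, each step in this sketch halves the remaining parameter error, so the loop ends on the threshold test well before the iteration limit; with a very small limit it ends on the limit instead.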

By applying the DoG kernel to the AAM in this way, the present invention strengthens the features of local shapes that carry much of the information within the face image object, such as the eyes, nose, and mouth, and weakens shape information that carries unnecessary information, such as the cheeks, before the AAM fitting algorithm is run, thereby improving the performance of the AAM fitting algorithm.

FIG. 2 is a control flowchart of a facial expression recognition system according to an embodiment of the present invention.

The facial expression recognition system according to an embodiment of the present invention uses an AAM model for facial expression recognition, and an EFM (Enhanced Fisher Model) model was trained using the images used to generate the AAM model (S110). The AAM model, in which facial expression data is stored, holds facial expression image sequences of men and women from various ethnic backgrounds. For example, expression images for joy, surprise, anger, disgust, fear, and sadness may be stored. The images used to generate the AAM model can be used for EFM model training. EFM was introduced to improve on the performance of standard FLD (Fisher linear discriminant) based methods; it first applies PCA (Principal Component Analysis) for dimensionality reduction and then performs discrimination in the reduced PCA subspace. In the expression recognition system, the EFM classifies facial expressions by determining the features that distinguish them.
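For orientation, the standard two-class Fisher linear discriminant that EFM is said to improve on can be sketched as follows. This is the plain FLD baseline on two-dimensional feature vectors, not EFM itself and not the patent's classifier: it omits the PCA stage, handles only two classes, and all function names and the midpoint decision rule are assumptions made for illustration.

```python
def mean_vector(samples):
    n = len(samples)
    return [sum(s[d] for s in samples) / n for d in range(2)]

def within_class_scatter(classes):
    """Sum of the 2x2 scatter matrices of each class about its own mean."""
    sw = [[0.0, 0.0], [0.0, 0.0]]
    for samples in classes:
        m = mean_vector(samples)
        for s in samples:
            d = [s[0] - m[0], s[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    sw[i][j] += d[i] * d[j]
    return sw

def fisher_direction(class_a, class_b):
    """w = Sw^-1 (mean_a - mean_b), using the closed-form 2x2 inverse."""
    sw = within_class_scatter([class_a, class_b])
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def classify(sample, class_a, class_b):
    """Project onto w and compare with the midpoint of the projected means."""
    w = fisher_direction(class_a, class_b)
    def project(s):
        return w[0] * s[0] + w[1] * s[1]
    midpoint = 0.5 * (project(mean_vector(class_a)) + project(mean_vector(class_b)))
    return "a" if project(sample) > midpoint else "b"
```

In the setting described in the text, the inputs would be appearance and shape parameter vectors produced by AAM fitting rather than hand-picked two-dimensional points.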

An image containing the face to be recognized is input to the facial expression recognition system (S112).

A face is recognized in the input image (S114). The DoG kernel is applied for this face recognition. Convolving the input image with the DoG kernel increases the visibility of edges and other details in the image, so that the face contained in the image can be recognized accurately.

AAM image fitting is then started on the recognized face image (S116).

Through the AAM image fitting process, the appearance and shape parameters of the face are extracted from the image (S118, S120).

The extracted appearance and shape parameters of the face are processed using the EFM classifier (S122). The EFM classifies facial expressions by determining the features that distinguish them.

When the expression has been classified by the EFM classifier (S124), a training set reflecting the expression and shape is convolved with the input image (S126).

Thereafter, the appearance and shape parameters are updated iteratively until a specific threshold is satisfied, reducing the error between the model and the image. For example, once the current shape parameters have been measured, the input image is aligned to the model coordinate frame, and the error image between the current model instance and the image being fitted by the AAM is computed and minimized. The fitting algorithm repeats until the error satisfies the aforementioned threshold or a specified number of iterations is reached.

FIGS. 3 and 4 are graphs comparing the facial expression recognition results of the present invention and the prior art under changing illumination.

FIG. 3 shows AAM fitting results under changing illumination: the expression recognition results when only the AAM is applied (1-1), when the AAM is applied to an image convolved with the DoG kernel (1-2), and when the AAM is applied to an image processed with the Canny Edge Detector (1-3).

As shown in FIG. 3, AAM fitting is more effective when the AAM is applied to the image convolved with the DoG kernel (1-2) and when it is applied to the image processed with the Canny Edge Detector (1-3) than with the AAM alone (1-1).

FIG. 4 is a graph of the accuracy of the expression recognition results according to the prior art and according to the present invention under changing illumination.

The graph plots the accuracy of the expression recognition results for images captured under changing illumination in three cases: when only the AAM is applied, when the AAM is applied to an image convolved with the DoG kernel, and when the AAM is applied to an image processed with the Canny Edge Detector.

To evaluate the accuracy of each recognition result, the standard deviation and mean error between the fitted shape and the ground truth were calculated using the RMS error. In addition, to measure the performance of each AAM more precisely, the number of images corresponding to each mean error was accumulated.
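The evaluation described above can be sketched as follows. The source does not give the exact formulas, so this is one plausible reading under stated assumptions: the RMS error is taken over point-to-point distances between fitted and ground-truth landmarks, the standard deviation is the population form, and the cumulative count is the number of images whose error is at or below each threshold. All function names are hypothetical.

```python
import math

def rms_point_error(fitted, truth):
    """RMS of point-to-point distances between fitted and ground-truth landmarks."""
    squared = [(fx - tx) ** 2 + (fy - ty) ** 2
               for (fx, fy), (tx, ty) in zip(fitted, truth)]
    return math.sqrt(sum(squared) / len(squared))

def mean_and_std(values):
    """Mean and population standard deviation of per-image errors."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

def cumulative_counts(errors, thresholds):
    """For each threshold, count the images whose error is at or below it."""
    return [sum(1 for e in errors if e <= t) for t in thresholds]
```

Under this reading, a method whose cumulative count rises faster at small thresholds fits more images accurately, which is how the graphs in FIGS. 4 and 6 are interpreted in the text.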

As the graph of FIG. 4 shows, the AAMs to which the DoG kernel and the Canny Edge Detector were applied accumulate more images at smaller error rates. In particular, the AAM with the DoG kernel accumulates the most images in the range where the mean error is small.

FIGS. 5 and 6 are graphs comparing the facial expression recognition results of the present invention and the prior art under changing facial expressions.

FIG. 5 shows AAM fitting results under changing facial expressions: the expression recognition results when only the AAM is applied (2-1), when the AAM is applied to an image convolved with the DoG kernel (2-2), and when the AAM is applied to an image processed with the Canny Edge Detector (2-3).

As shown in FIG. 5, AAM fitting is more effective when the AAM is applied to the image convolved with the DoG kernel (2-2) than when it is applied to the image processed with the Canny Edge Detector (2-3). This is because the image features are expressed more weakly in the model with the Canny Edge Detector than in the model with the DoG kernel, so the facial features are not fitted well.

FIG. 6 is a graph of the accuracy of the expression recognition results according to the prior art and according to the present invention under changing facial expressions.

The graph plots the accuracy of the expression recognition results for images captured while the facial expression changes in three cases: when only the AAM is applied, when the AAM is applied to an image convolved with the DoG kernel, and when the AAM is applied to an image processed with the Canny Edge Detector.

To evaluate the accuracy of the AAM, the standard deviation and mean error between the fitted shape and the ground truth were calculated using the RMS (root mean square) error, and, to measure the performance of the AAM more precisely, the number of images corresponding to each mean error was accumulated.

As the graph of FIG. 6 shows, when the AAM is applied to an image convolved with the DoG kernel, more images accumulate at smaller error rates.

In contrast, the AAM with the Canny Edge Detector performs poorly. This is because the features of the model with the Canny Edge Detector are expressed more weakly than those of the model with the DoG kernel, so the facial features are not fitted well.

As the experimental results above confirm, applying the DoG kernel to the images used for the AAM can ensure expression recognition performance even when the illumination or the facial expression changes.

도 7은 본 발명의 실시예에 따른 얼굴 표정 인식 시스템의 제어 블럭도로써, 스마트폰과 같은 휴대용 단말기에 본 발명의 얼굴 표정 인식 시스템을 적용한 경우를 예시한 것이다.7 is a control block diagram of a facial expression recognition system according to an embodiment of the present invention, illustrating a case where the facial expression recognition system of the present invention is applied to a portable terminal such as a smartphone.

도 7에 도시된 바와 같이, 얼굴 표정 인식 시스템은, 카메라(510), 영상 전처리부(512), DOG 커널(514), AAM 모델링부(516), EFM 분류기(518), 디스플레이부(520), 데이터베이스(522)를 포함한다.As shown in FIG. 7, the facial expression recognition system includes a camera 510, an image preprocessor 512, a DOG kernel 514, an AAM modeling unit 516, an EFM classifier 518, and a display unit 520. Database 522.

카메라(510)는 객체, 예컨대, 특정 사람의 얼굴의 영상을 촬영하여 2차원 영상을 생성한다.The camera 510 generates an 2D image by capturing an image of an object, for example, a face of a specific person.

영상 전처리부(512)는 카메라(510)로부터 제공되는 2차원 영상을 얼굴인식 실행할 수 있는 영상으로 변환 처리한다.The image preprocessor 512 converts the 2D image provided from the camera 510 into an image for face recognition.

DOG 커널(514)은 가우시안 필터링을 통해 입력된 영상의 노이즈를 감소시켜 에지(Edge) 및 기타 다른 디테일의 가시성을 증진하고 반복되는 불필요한 정보를 담고 있는 형상의 정보를 약화시킨다. DOG 커널(514)로 컨벌루션된 영상은 얼굴 영상 중 많은 정보를 담고 있는 국지적인 형상, 예컨대, 눈, 코, 입 등의 특징부분의 형상이 강화되어 얼굴형상이 인식된다. The DOG kernel 514 reduces the noise of the input image through Gaussian filtering to enhance the visibility of edges and other details and to attenuate information in the shape that contains redundant information that is repeated. The image convolved with the DOG kernel 514 is a local shape containing a lot of information of the face image, for example, the shape of the feature portion, such as eyes, nose, mouth, etc. is enhanced to recognize the face shape.

AAM 모델링부(516)는 DOG 커널(514)로 컨벌루션된 영상에서 외형(appearance)과 형상(shape)의 파라미터를 추출하고, 추출한 얼굴특징 요소를 토대로 통계학적 얼굴모델을 변환하여 상기 얼굴영역과 매칭하는 합성 얼굴 영상을 모델링한다. AAM 모델링부(516)는 합성얼굴 영상이 인식 대상 얼굴 영상과 기 설정된 맵핑값 이내로 수렴할 때까지 외형(appearance)과 형상(shape)의 파라미터를 반복적으로 갱신하며 모델과 영상 간의 오차를 줄여나간다. The AAM modeling unit 516 extracts parameters of appearance and shape from the image convolved with the DOG kernel 514 and converts a statistical face model based on the extracted facial feature elements to match the face region. A synthetic face image is modeled. The AAM modeling unit 516 repeatedly updates the parameters of appearance and shape until the synthetic face image converges within the preset mapping value with the face image to reduce the error between the model and the image.

EFM 분류기(518)는 외형(appearance)과 형상(shape)의 파라미터에 기초하여 얼굴 영상의 표정을 인식한다. EFM 분류기(518)의 표정 인식 결과는 AAM 모델링부(516)에 제공되어 합성 얼굴 영상을 모델링하기 위한 외형(appearance)과 형상(shape)의 파라미터를 결정하는데 사용된다.The EFM classifier 518 recognizes an expression of the face image based on parameters of appearance and shape. The facial expression recognition result of the EFM classifier 518 is provided to the AAM modeling unit 516 and used to determine the appearance and shape parameters for modeling the composite face image.

데이터베이스(522)에는 AAM을 수행하기 위한 얼굴 표정 영상 시퀀스가 저장된다. 또한, 얼굴 표정 영상 시퀀스를 트레이닝하여 생성된 EFM 모델이 저장된다.The database 522 stores a facial expression image sequence for performing AAM. In addition, the EFM model generated by training the facial expression image sequence is stored.

The display unit 520 displays the face recognition result determined by the AAM modeling unit 516.

FIG. 8 is a control flowchart of a facial expression recognition system according to another embodiment of the present invention.

The user may input a face image by photographing his or her face with the camera 510 (S612).

The input face image is convolved with the DoG kernel 514 (S614). As a result, the local appearance features contained in the face image are enhanced and the face is recognized (S616).

The AAM modeling unit 516 performs AAM fitting on the image convolved with the DoG kernel 514 to detect the appearance and shape parameters for modeling the face image (S618).

The EFM classifier 518 classifies the facial expression based on the appearance and shape parameters from the AAM modeling unit 516 (S620).

The AAM modeling unit 516 refers to the classification result of the EFM classifier 518 and repeatedly updates the appearance and shape parameters, thereby recognizing the expression of the input face image, and the recognition result is displayed (S622).
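The control flow of steps S614 to S622 can be summarized as a small driver function that takes each stage as a callable. This is a sketch under assumptions: the function name, the `hint` argument used to pass the classification result back to the fitting stage, the `max_rounds` limit, and the rule of stopping once the label no longer changes are all hypothetical, since the source does not state the exact stopping criterion for the feedback loop.

```python
def recognize_expression(image, convolve, fit_aam, classify, max_rounds=10):
    """Filter, fit, classify, then refit with classifier feedback.

    convolve(image) -> filtered image                (S614, S616)
    fit_aam(filtered, hint=label_or_None) -> params  (S618)
    classify(params) -> expression label             (S620)
    """
    filtered = convolve(image)
    params = fit_aam(filtered, hint=None)
    label = classify(params)
    # S622: refit using the classification result until the label is stable.
    for _ in range(max_rounds):
        params = fit_aam(filtered, hint=label)
        new_label = classify(params)
        if new_label == label:
            break
        label = new_label
    return label, params
```

Passing the stages as callables keeps the sketch independent of any particular filter, model, or classifier, so each one can be swapped or stubbed when experimenting.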

FIG. 9 shows the facial expression recognition system according to an embodiment of the present invention in use, illustrating an example in which the system is applied to a portable terminal to provide a service.

The user's face image captured by the camera 510 is input to the facial expression recognition system mounted on the portable terminal, so that the user's current expression can be recognized in real time.

The recognized expression of the user is reflected in real time on the character displayed on the portable terminal.

Accordingly, the user's current facial expression and the expression of the character displayed on the portable terminal are linked, so that emotions such as (a) neutral, (b) joy, (c) surprise, and (d) sadness can be displayed through the character in real time.

Although the above embodiment describes reflecting the user's facial expression on an emoticon character, the expression recognition result of the facial expression recognition system of the present invention can be applied to any graphic image capable of expressing emotion, such as the user's avatar, an image, or an animal or animation character. Various other applications are also possible, such as outputting the expression recognition result as text or using it to change the graphic colors or theme of the display screen.

As described above, the present invention uses the DoG kernel 514 to remove noise from the image while retaining only its features, thereby removing information that the AAM does not need and keeping the information that the fitting algorithm needs. As a result, excellent expression recognition performance can be obtained even for images with illumination and expression changes, and excellent performance can be ensured even for low-quality images acquired by a portable terminal such as a smartphone.

As such, those skilled in the art to which the present invention pertains will understand that the invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

The present invention is applicable to facial expression recognition methods and systems that apply a DoG (Difference of Gaussian) kernel to an image supplied to an AAM (Active Appearance Model). Doing so improves the visibility of fine regions and reduces noise, enhancing the features of local regions such as the eyes, nose, and mouth while attenuating appearance information that carries repetitive, unnecessary content, such as the cheeks. This not only removes unnecessary information from the image but also, through object feature extraction, preserves important information that would otherwise be removed by illumination.

510: camera 512: image pre-processing unit
514: DoG kernel 516: AAM modeling unit
518: EFM classifier 520: display unit
522: database

Claims

Convolving the input image using the DoG kernel;
Extracting a face region from the convolved image;
Extracting an appearance parameter and a shape parameter from the face region;
Classifying a facial expression by processing the appearance parameter and the shape parameter with an Enhanced Fisher Model (EFM) classification method;
Reflecting the classification result, transforming a previously stored statistical face model (AAM; Active Appearance Model) based on the appearance parameter and the shape parameter to synthesize the face region; and
Recognizing a facial expression by updating the appearance and shape parameters until the synthesized face image converges to within a preset mapping value of the image forming the input face region.

The method of claim 1,
wherein convolving the input image using the DoG kernel comprises convolving the image with each of two Gaussian kernels having different standard deviations to create blurred images, and then calculating the difference image between the two images.

(Deleted)

A facial expression recognition system comprising:
A DoG kernel that convolves an input image;
An AAM modeling unit that extracts a face region from the convolved image, extracts an appearance parameter and a shape parameter from the face region, transforms a previously stored statistical face model (AAM; Active Appearance Model) based on the extracted facial feature elements to synthesize the face region, and then recognizes a facial expression by updating the appearance and shape parameters until the synthesized face image converges to within a preset mapping value of the image forming the input face region; and
An EFM classifier that classifies facial expressions by processing the extracted appearance and shape parameters with an Enhanced Fisher Model (EFM) classification method and provides the classification result to the AAM modeling unit.

5. The system of claim 4,
wherein the DoG kernel convolves the image with each of two Gaussian kernels having different standard deviations to create blurred images, calculates the difference image of the two images, and provides the difference image to the AAM modeling unit.

(Deleted)