KR101697476B1

KR101697476B1 - Method for recognizing continuous emotion for robot by analyzing facial expressions, recording medium and device for performing the method

Info

Publication number: KR101697476B1
Application number: KR1020160121858A
Authority: KR
Inventors: 강보영; 이현순; 반상규
Original assignee: 경북대학교 산학협력단
Priority date: 2016-09-23
Filing date: 2016-09-23
Publication date: 2017-01-19
Also published as: KR20160116311A

Abstract

A facial-expression-based continuous emotion recognition method for a robot comprises: building a database that stores images for each emotion; calculating independent variables through principal component analysis (PCA) based on the distribution of the images stored in the database; learning a linear regression equation using the calculated independent variables; and applying an input image to the linear regression equation to calculate the value of an emotional state. Because the robot thereby continuously estimates human emotion from human facial expressions, the robot and the human can share emotions.

Description


The present invention relates to a facial-expression-based continuous emotion recognition method for a robot, and to a recording medium and a device for performing the method. More particularly, it relates to a method by which a robot can continuously estimate emotion from human facial expressions for human-robot interaction, and to a recording medium and a device for performing the method.

In general, the word robot brings to mind an artificial human that resembles a person and works on its own. Until only a few decades ago, however, this was merely fiction from science-fiction films, and actual robots were nothing more than industrial machines performing simple repetitive tasks. As computer technology advanced, research on artificial intelligence became active, and robots became able to perform a wider variety of tasks intelligently and proactively.

Moreover, recent quantitative and qualitative improvements in daily life have made it possible for homes as well as public institutions to accommodate robots, creating demand for emotional robots that are approachable to the general public. To this end, much research has addressed robots that, rather than being unidirectional machines performing tasks fixed by human commands, have an emotion model, express their own emotions, and recognize and respond to human emotions.

Meanwhile, a human's emotional state is expressed through facial expressions, voice, and gestures, and can also be estimated from biosignals such as heart rate, blood pressure, and brain waves. Of these, voice signals are strongly affected by noise and are not independent of speaker and context, while biosignal-based methods are difficult to apply in real robot-human interaction because of problems such as attaching the equipment.

There has also been much research on recognizing emotional state from facial expressions, but most of it classified the type of expression from a single image, and even approaches using continuous image information stored a video of the expression being made and then classified it. In a real interaction, however, when the robot responds to a person's emotional state, recognizing the emotional state from only the expression frame at that instant is vulnerable to face-tracking failures, recognition errors, and the like.

Wilhelm Wundt, "Grundriss der Psychologie [Outlines of psychology]", Leipzig, Engelmann, 1896.
James A. Russell, Albert Mehrabian, "Distinguishing anger and anxiety in terms of emotional response factors", Journal of Consulting and Clinical Psychology, 42, 1974, pp. 79-83.

Accordingly, the technical problem addressed by the present invention arises from the above, and an object of the present invention is to provide a facial-expression-based continuous emotion recognition method for a robot, by which the robot can continuously estimate emotion from human facial expressions for human-robot interaction.

Another object of the present invention is to provide a recording medium on which a computer program for performing the facial-expression-based continuous emotion recognition method for a robot is recorded.

Still another object of the present invention is to provide a device for performing the facial-expression-based continuous emotion recognition method for a robot.

A facial-expression-based continuous emotion recognition method for a robot according to one embodiment for achieving the above object includes: building a database that stores images for each emotion; calculating independent variables through PCA (Principal Component Analysis) based on the distribution of the images stored in the database; learning a linear regression equation using the calculated independent variables; and applying an input image to the linear regression equation to calculate the value of an emotional state.

In an embodiment of the present invention, the learning of the linear regression equation may use the arousal value and the valence value corresponding to each emotion as dependent variables, and the linear regression equation may take a two-dimensional form with an arousal-nonarousal axis (A axis) and a valence (pleasure-displeasure) axis (V axis).

In an embodiment of the present invention, the learning of the linear regression equation may calculate regression coefficients.

In an embodiment of the present invention, the applying of the input image to the linear regression equation to calculate the value of the emotional state may include: projecting the input image onto the eigenvectors obtained from the distribution of the database to calculate independent variables; and substituting the independent variables into the linear regression equation to estimate the emotional state of the input image.

In an embodiment of the present invention, the method may further include preprocessing the images stored in the database before the calculating of the independent variables through PCA.

In an embodiment of the present invention, the preprocessing of the images stored in the database may include: vectorizing grayscale versions of the images stored in the database; normalizing the brightness of the vectorized images; obtaining the mean vector of the normalized images; and obtaining difference vectors, each being the deviation of an image vector from the mean vector.

In an embodiment of the present invention, the calculating of the independent variables through PCA may include: obtaining a covariance matrix from the difference vectors and calculating eigenvectors; and projecting the difference vectors onto the eigenvectors to calculate dimension-reduced PCA values.

In an embodiment of the present invention, the normalizing of the brightness of the vectorized images may be performed using the mean and variance of each vector and the mean and variance of all vectors.

In an embodiment of the present invention, the building of the database may include having an evaluator numerically rate the value of the emotional state of each image and storing the ratings.

In an embodiment of the present invention, the building of the database may further include detecting the face region in the images for each emotion prior to the rating.

In an embodiment of the present invention, the applying of the input image to the linear regression equation to calculate the value of the emotional state may include applying a continuous emotion model to calculate the value of the emotional state for every successive frame of the input image.

A computer-readable storage medium according to one embodiment for achieving another object of the present invention has recorded on it a computer program for performing the facial-expression-based continuous emotion recognition method for a robot.

A facial-expression-based continuous emotion recognition device for a robot according to one embodiment for achieving still another object of the present invention includes: a database that stores images for each emotion; a first PCA unit that calculates independent variables through PCA (Principal Component Analysis) based on the distribution of the images stored in the database; a learning unit that learns a linear regression equation using the calculated independent variables; and a facial expression recognition unit that applies an input image to the linear regression equation to calculate the value of an emotional state.

In an embodiment of the present invention, the learning unit may use the arousal value and the valence value corresponding to each emotion as dependent variables, the linear regression equation may take a two-dimensional form with an arousal-nonarousal axis (A axis) and a valence (pleasure-displeasure) axis (V axis), and the learning unit may learn the linear regression equation to calculate regression coefficients.

In an embodiment of the present invention, the facial expression recognition unit may include: a second PCA unit that projects the input image onto the eigenvectors obtained from the distribution of the database to calculate independent variables; and an emotional state output unit that substitutes the independent variables into the linear regression equation to estimate the emotional state of the input image.

In an embodiment of the present invention, the facial expression recognition unit may further include a face region detection unit that detects the face region of the input image.

In an embodiment of the present invention, the device may further include a preprocessing unit that preprocesses the images stored in the database before the PCA.

In an embodiment of the present invention, the preprocessing unit may include: a vector unit that vectorizes grayscale versions of the images stored in the database; a normalization unit that normalizes the brightness of the vectorized images; an averaging unit that obtains the mean vector of the normalized images; and a difference vector unit that obtains difference vectors, each being the deviation of an image vector from the mean vector.

In an embodiment of the present invention, the database may store the value of the emotional state of each image as numerically rated by an evaluator.

In an embodiment of the present invention, the facial expression recognition unit may calculate the value of the emotional state for every successive frame of the input image.

According to this facial-expression-based continuous emotion recognition method for a robot, PCA and linear regression analysis are used not simply to classify the type of basic emotion but to numerically compute the state value along each axis of the two-dimensional A-V emotion model, so that the robot can respond to the subject's emotion under more finely divided conditions. Because the robot thus continuously estimates emotion from human facial expressions, emotion can be shared between robot and human. The invention can therefore contribute to the development of emotional robots and home appliances.

FIG. 1 is a block diagram of a facial-expression-based continuous emotion recognition device for a robot according to an embodiment of the present invention.
FIG. 2 is a chart showing an example of the main regions of the A-V plane stored in the database of FIG. 1.
FIG. 3 is a chart showing the pre-evaluated emotional-state values stored in the database of FIG. 1.
FIG. 4 is a detailed block diagram of the preprocessing unit of FIG. 1.
FIG. 5 is a diagram illustrating the vectorization of a grayscale image performed by the preprocessing unit of FIG. 4.
FIG. 6 is a diagram showing an example of regression modeling for facial expression recognition.
FIGS. 7 and 8 are flowcharts of a facial-expression-based continuous emotion recognition method for a robot according to an embodiment of the present invention.
FIG. 9 is a graph showing the RMS error of V against the size of the reduced dimension when the dimension is reduced by PCA.

The detailed description of the present invention that follows refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, need not be mutually exclusive. For example, a particular shape, structure, or characteristic described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It should also be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the invention, if properly described, is limited only by the appended claims together with the full range of equivalents to which the claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the drawings.

FIG. 1 is a block diagram of a facial-expression-based continuous emotion recognition device for a robot according to an embodiment of the present invention.

As robots have recently become acceptable in homes, there is demand for emotional robots with which humans can interact. Most existing research on robots capable of such emotional interaction divides or classifies emotion into a few fixed basic emotions, which limits its ability to reproduce continuous human emotion. The present invention therefore proposes an emotion recognition device with which a robot can continuously measure a human's emotional state from the human's facial expression, based on a linear model.

Referring to FIG. 1, the facial-expression-based continuous emotion recognition device 10 for a robot according to the present invention (hereinafter, the device) includes an offline unit 100, which collects images in advance and learns a linear regression equation, and a facial expression recognition unit 300, which estimates the emotion of a new image online.

Software (an application) for performing facial-expression-based continuous emotion recognition for a robot may be installed and executed on the device 10, and the offline unit 100 and the facial expression recognition unit 300 may be controlled by that software running on the device 10.

The device 10 may be a separate terminal or a module of a terminal. The offline unit 100 and the facial expression recognition unit 300 of the device 10 may be formed as an integrated module or may consist of one or more modules. Conversely, each component may be a separate module.

The device 10 may be mobile or fixed. The device 10 may take the form of a server or an engine, and may be referred to by other terms such as device, apparatus, terminal, UE (user equipment), MS (mobile station), wireless device, or handheld device.

Before estimating the emotion of images input in real time, the offline unit 100 first builds a linear model capable of estimating a human's emotional state. To this end, the offline unit 100 includes a database 110, a first PCA unit 150, and a learning unit 170. The offline unit 100 may further include a preprocessing unit 130 that preprocesses the image data input to the first PCA unit 150.

The database 110 stores images for each emotion.

Emotion can be defined as a mental and physiological state associated with various feelings, thoughts, and behaviors; it is a subjective experience generally known to be related to mood, temperament, personality, and so on. There has been much research aimed at classifying general human emotions, but the subjective and abstract nature of emotion has made describing and classifying it difficult, and because psychologists' views differ, no official concept or classification of emotion has been established to date.

The present invention uses the circumplex model of emotion proposed by James Russell, which maps sub-emotions onto a pleasure-displeasure axis and an arousal-nonarousal axis. The arousal-nonarousal axis is generally called Arousal, and the pleasure-displeasure axis Valence. That is, emotion is modeled as a basic two-dimensional Arousal-Valence (hereinafter A-V) space, and an emotion model is applied to each axis. An emotion model is a way of expressing the state of emotion as a differential equation in time.

The emotional state stored in the database 110 is represented in two-dimensional A-V form, that is, on the arousal-nonarousal axis and the pleasure-displeasure axis, and is expressed by the V value and the A value of the emotion. For example, images of the facial expressions corresponding to the emotion at each main value on the emotion plane may be collected and stored.

The database 110 collected in the present invention serves two purposes. The first is to find, by PCA, the eigenvectors that best represent the distribution of the database 110; the second is to learn the regression equation from the resulting PCA values. Because the emotion to be recognized from facial expressions is based on the two-axis A-V emotion model, the database may be collected evenly across nine emotions corresponding to the main regions of the A-V plane. As shown in FIG. 2, the main regions are neutral (0,0), pleasant (1,0), laughing (1,1), surprised (0,1), afraid (-1,1), angry (-1,0), sad (-1,-1), sleepy (0,-1), and relaxed (1,-1).
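The nine main regions listed above can be written down as a small lookup table. The following is a minimal sketch, assuming each pair is ordered (valence, arousal), which the text does not state explicitly; the English labels are translations, and `nearest_region` is a hypothetical helper for labeling a continuous estimate for display, not part of the described method.

```python
# (valence, arousal) anchor points of the nine main regions.
# Assumption: each pair is ordered (V, A); the source does not say so explicitly.
AV_REGIONS = {
    "neutral": (0, 0),
    "pleasant": (1, 0),
    "laughing": (1, 1),
    "surprised": (0, 1),
    "afraid": (-1, 1),
    "angry": (-1, 0),
    "sad": (-1, -1),
    "sleepy": (0, -1),
    "relaxed": (1, -1),
}


def nearest_region(valence, arousal):
    """Hypothetical helper: label whose anchor is closest to (valence, arousal)."""
    return min(
        AV_REGIONS,
        key=lambda name: (AV_REGIONS[name][0] - valence) ** 2
        + (AV_REGIONS[name][1] - arousal) ** 2,
    )
```

Keeping the anchors as data makes it straightforward to check that the database covers all nine regions evenly.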

The database 110 may also store the numerical value of the emotional state of each image as rated by an evaluator. For example, the database 110 may use 10 images per emotion, 90 face images in total, with a single evaluator rating each image's emotional-state values, the V value and the A value, on a scale from 0 to 1. Examples of training images and their rated emotional-state values are shown in FIG. 3.

A face region may be extracted from each image before the images are stored in the database 110. To this end, the device 10 may further include a face region detection unit (not shown) that detects the face region in the images.

In expression analysis, the face is divided into an eye region and a mouth region. A person's expression can be described as a combination of movements in the eye region (eyebrows, eyelids, the area between the eyebrows) and movements in the mouth region (the lips and the muscles around the mouth). Expression analysis is therefore performed with reference to specific regions where changes in expression are pronounced. Because human faces generally share a consistent layout, every face is resized to the same 100X100 pixels; for example, the eye region may be selected as horizontal 11 to 90 and vertical 21 to 50, and the mouth region as horizontal 21 to 80 and vertical 71 to 100.
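The region selection described above can be sketched with array slicing. This is a minimal illustration, assuming the stated ranges are 1-based and inclusive, that "horizontal" means columns and "vertical" means rows, and that the face has already been detected and resized; `crop_regions` is a hypothetical name.

```python
import numpy as np

FACE_SIZE = 100  # faces are resized to 100x100 pixels


def crop_regions(face):
    """Split a 100x100 grayscale face into eye and mouth regions.

    Assumes the ranges in the text are 1-based and inclusive, so
    columns 11..90 become the 0-based slice 10:90, and so on.
    """
    face = np.asarray(face)
    if face.shape != (FACE_SIZE, FACE_SIZE):
        raise ValueError("expected a 100x100 grayscale face image")
    eye = face[20:50, 10:90]     # rows 21..50, columns 11..90
    mouth = face[70:100, 20:80]  # rows 71..100, columns 21..80
    return eye, mouth


# Synthetic image whose pixel value encodes its position (row * 100 + column).
face = np.arange(FACE_SIZE * FACE_SIZE).reshape(FACE_SIZE, FACE_SIZE)
eye, mouth = crop_regions(face)
```

Under these assumptions the eye region is 30 rows by 80 columns and the mouth region is 30 rows by 60 columns.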

The preprocessing unit 130 preprocesses the images stored in the database 110 before the first PCA unit 150 performs PCA.

Referring to FIG. 4, the preprocessing unit 130 includes a vector unit 131, a normalization unit 133, an averaging unit 135, and a difference vector unit 137.

The vector unit 131 vectorizes the grayscale versions of the images stored in the database 110. Grayscale means rendering the image in black and white so that each pixel's value is expressed in a single dimension. A grayscale image is a two-dimensional (horizontal by vertical) array, and a 2X2-pixel image can be vectorized as shown in FIG. 5.
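As a small illustration of the vectorization step, a 2X2 grayscale image can be flattened into a 4-element vector. The sketch below assumes row-major ordering (left to right, then top to bottom); the ordering actually shown in FIG. 5 is not visible in this text, and `vectorize` is a hypothetical name.

```python
import numpy as np


def vectorize(image):
    """Flatten a 2-D grayscale image into a 1-D vector (row-major order assumed)."""
    return np.asarray(image, dtype=float).reshape(-1)


image = [[10, 20],
         [30, 40]]
vec = vectorize(image)
```

Any fixed ordering works for the later steps, provided the same ordering is used for every training image and every input image.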

The normalization unit 133 normalizes the brightness of the vectorized images. Because analysis based on image information is vulnerable to lighting, the brightness of all images must be normalized. The full set of vectorized images can be normalized using the mean and variance of each vector and the mean and variance of all vectors, according to Equation 1 below.

Here, ptr is the vector to be normalized, and DB is the set of all vectors.
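One common way to normalize brightness from these four statistics is to standardize each vector and then rescale it to the global mean and standard deviation. The sketch below uses that form purely as an assumption; it is not taken from Equation 1, which is not reproduced in this text, and only the names `ptr` and `DB` (here `db`) come from the source.

```python
import numpy as np


def normalize_brightness(ptr, db):
    """Rescale vector `ptr` so its mean and standard deviation match those of `db`.

    Assumed formula, not the patent's Equation 1:
        (ptr - mean(ptr)) / std(ptr) * std(db) + mean(db)
    """
    ptr = np.asarray(ptr, dtype=float)
    db = np.asarray(db, dtype=float)
    std = ptr.std()
    if std == 0:
        # A constant image carries no contrast; map it to the global mean.
        return np.full_like(ptr, db.mean())
    return (ptr - ptr.mean()) / std * db.std() + db.mean()


# Two synthetic "images" with the same pattern but different overall brightness.
db = np.array([[10.0, 20.0, 30.0, 40.0],
               [110.0, 120.0, 130.0, 140.0]])
normalized = np.array([normalize_brightness(v, db) for v in db])
```

With this form, two images that differ only by a uniform brightness offset map to the same normalized vector, which is the effect the text attributes to the normalization step.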

The averaging unit 135 computes the mean vector in order to emphasize the differences between vectors, and the difference vector unit 137 computes the difference vectors, that is, the deviation of each image vector from the mean vector. This completes the preprocessing, and the first PCA unit 150 performs PCA on these difference vectors.
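The mean vector and difference vectors can be computed in a couple of lines. This is a minimal sketch, assuming the normalized image vectors are stacked one per row; `difference_vectors` is a hypothetical name.

```python
import numpy as np


def difference_vectors(vectors):
    """Return (mean_vector, difference_vectors) for row-stacked image vectors."""
    vectors = np.asarray(vectors, dtype=float)
    mean_vector = vectors.mean(axis=0)
    return mean_vector, vectors - mean_vector


vectors = [[1.0, 2.0],
           [3.0, 6.0]]
mean_vector, diffs = difference_vectors(vectors)
```

The mean vector should be kept, since an input image must have the same mean subtracted before it is projected onto the eigenvectors.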

The first PCA unit 150 calculates independent variables through PCA (Principal Component Analysis) based on the distribution of the images stored in the database. PCA reduces the dimensionality of a high-dimensional problem modeled with vectors so that it can be solved in a lower dimension.

PCA first finds, for the high-dimensional data, the eigenvectors of the covariance matrix, ordered by how much of the data's variance lies along each direction, and selects several eigenvectors in descending order of eigenvalue. Each difference vector in the database is then projected onto the selected eigenvectors to obtain dimension-reduced PCA values. When m eigenvectors are used, the values are reduced to m dimensions. The computed PCA values serve as the independent variables that describe the features of the expression in the linear regression analysis.
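The PCA steps described above (covariance matrix, eigenvectors sorted by eigenvalue, projection) can be sketched with NumPy. This is an illustration on synthetic data, not the patent's implementation; `fit_pca` and `project` are hypothetical names, and dividing the covariance by the number of samples is an assumption.

```python
import numpy as np


def fit_pca(diffs, m):
    """Return the m eigenvectors of the covariance matrix with the largest eigenvalues.

    `diffs` holds one mean-subtracted image vector per row.
    """
    diffs = np.asarray(diffs, dtype=float)
    cov = diffs.T @ diffs / diffs.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigenvalues)[::-1][:m]        # indices of the m largest
    return eigenvectors[:, order]                    # shape: (n_pixels, m)


def project(diffs, components):
    """Project difference vectors onto the selected eigenvectors."""
    return np.asarray(diffs, dtype=float) @ components


# Synthetic stand-in for 20 image vectors of 6 pixels each.
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 6))
diffs = data - data.mean(axis=0)
components = fit_pca(diffs, m=2)
pca_values = project(diffs, components)  # shape: (20, 2)
```

The same `components` matrix would be reused at recognition time, so that an input image is projected onto the eigenvectors obtained from the database rather than onto newly computed ones.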

The learning unit 170 learns a linear regression equation using the independent variables calculated by the first PCA unit 150.

Regression analysis is a theory for predicting the relationship of a dependent variable to independent variables and analyzing the goodness of fit; it is a statistical method that establishes a mathematical relation based on the data distribution of the independent and dependent variables. Given data for the independent and dependent variables, regression analysis aims to find the relation between the two, evaluate its fit, and predict the dependent variable for newly input independent variables. Equation 2 is expressed in terms of the independent variables, the dependent variable, the coefficients of the regression equation, and the residual.
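For reference, a multiple linear regression is conventionally written in the general form below. The symbols are standard textbook notation assumed here for illustration only; they are not necessarily the symbols used in Equation 2 of the specification.

```latex
y = \beta_0 + \sum_{k=1}^{m} \beta_k x_k + \varepsilon
```

In this assumed notation, $x_k$ are the independent variables, $y$ is the dependent variable, $\beta_0, \dots, \beta_m$ are the coefficients of the regression equation, and $\varepsilon$ is the residual.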

In the present invention, for the two axes of emotion, the PCA values of the eye region and the mouth region are used as independent variables, and the Valence and Arousal values, which represent the emotional state corresponding to the facial expression, are used as dependent variables. The regression model for facial expression recognition can be expressed as shown in FIG. 6.

The independent variables are obtained by performing PCA on the eye region and the mouth region of each image in the database 110. The eye region value and the mouth region value of the i-th facial expression in the database along the k-th PCA axis are each denoted by their own symbol, as are the previously rated Valence value and Arousal value of the i-th facial expression.

The independent variables can be calculated through PCA of each image in the database 110, and the rated values of the dependent variables are already available, so together they form a data set. The data set is modeled as the matrix equation of FIG. 6, and the regression coefficients that satisfy it are found by linear regression analysis. In this way, the learning unit 170 learns the regression equation.
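The fitting step can be sketched as an ordinary least-squares problem. The sketch below is illustrative only: the layout of the design matrix (an intercept column followed by the eye and mouth PCA values) is an assumption, since the matrix equation of FIG. 6 is not reproduced in the text, and the function name `fit_regression` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_regression(eye_pca, mouth_pca, valence, arousal):
    """Least-squares fit of Valence and Arousal on eye and mouth PCA values.

    eye_pca, mouth_pca: arrays of shape (n_images, m).
    valence, arousal: arrays of shape (n_images,) holding the rated values.
    Returns coefficients of shape (2*m + 1, 2); column 0 is V, column 1 is A.
    """
    n = eye_pca.shape[0]
    # Assumed design matrix: intercept, then eye and mouth PCA values.
    X = np.hstack([np.ones((n, 1)), eye_pca, mouth_pca])
    Y = np.column_stack([valence, arousal])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

# Synthetic, noise-free data generated from known coefficients.
rng = np.random.default_rng(0)
eye_pca = rng.normal(size=(30, 2))
mouth_pca = rng.normal(size=(30, 2))
true_coef = rng.normal(size=(5, 2))
X_demo = np.hstack([np.ones((30, 1)), eye_pca, mouth_pca])
ratings = X_demo @ true_coef
coef = fit_regression(eye_pca, mouth_pca, ratings[:, 0], ratings[:, 1])
print(coef.shape)  # (5, 2)
```

Fitting both axes in one call works because least squares treats each column of the target matrix independently, so the result matches two separate regressions for V and A.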

The facial expression recognition unit 300 applies an input image to the linear regression equation to calculate the value of the emotional state in real time. Once the linear regression equation has been learned in the offline unit 100, a newly input facial expression is projected onto the eigenvectors obtained from the distribution of the database 110 to obtain the independent variables, and these are substituted into the linear regression equation to estimate the emotional state on the two A-V axes.

To this end, referring again to FIG. 1, the facial expression recognition unit 300 may include a second PCA unit 330 that calculates the independent variables by projecting the input image onto the eigenvectors obtained from the distribution of the database, that is, by using PCA, and an emotional state output unit 350 that estimates the emotional state of the input image by substituting the independent variables into the linear regression equation.
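The two stages of the recognition unit, projection followed by substitution into the regression equation, can be sketched as below. The `model` dictionary and its keys, the function name `estimate_emotion`, and the toy values are hypothetical and chosen only to make the arithmetic easy to follow; the intercept term is likewise an assumption.

```python
import numpy as np

def estimate_emotion(eye_vec, mouth_vec, model):
    """Return (valence, arousal) for one face.

    model: dict with hypothetical keys 'eye_mean', 'eye_basis',
    'mouth_mean', 'mouth_basis' (from the offline PCA) and 'coef'
    (regression coefficients, intercept in the first row).
    """
    # Project onto the eigenvectors learned from the database.
    eye_pca = (eye_vec - model["eye_mean"]) @ model["eye_basis"]
    mouth_pca = (mouth_vec - model["mouth_mean"]) @ model["mouth_basis"]
    # Substitute the independent variables into the regression equation.
    x = np.concatenate([[1.0], eye_pca, mouth_pca])
    v, a = x @ model["coef"]
    return float(v), float(a)

# Toy model: 4-pixel regions, 2 PCA axes per region.
model = {
    "eye_mean": np.zeros(4),
    "eye_basis": np.eye(4)[:, :2],
    "mouth_mean": np.zeros(4),
    "mouth_basis": np.eye(4)[:, :2],
    # Rows: intercept, eye axis 1, eye axis 2, mouth axis 1, mouth axis 2.
    "coef": np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]),
}
v, a = estimate_emotion(np.array([0.5, 0, 0, 0]), np.array([-0.25, 0, 0, 0]), model)
print((v, a))  # (0.5, -0.25)
```

Calling the same function on the eye and mouth vectors of each successive video frame yields a sequence of (V, A) estimates over time.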

The facial expression recognition unit 300 may further include a face region detection unit 310 that detects the face region of the input image before the PCA. The input image may include photographs, video, and the like captured by a camera.

The facial expression recognition unit 300 can recognize a continuous emotional state by calculating the value of the emotional state for every successive frame of the input image.

As described above, in the present invention the robot can continuously measure a person's emotional state from the person's facial expression on the basis of a linear model. When a facial expression is input, the value of the emotional state is calculated using a linear regression model, and the calculated value is expressed on two coordinate axes, arousal and valence (pleasant/unpleasant).

FIG. 7 is a flowchart of a facial-expression-based continuous emotion recognition method for a robot according to an embodiment of the present invention.

The facial-expression-based continuous emotion recognition method for a robot according to this embodiment may be carried out in substantially the same configuration as the apparatus 10 of FIG. 1. Accordingly, components identical to those of the apparatus 10 of FIG. 1 are given the same reference numerals, and repeated descriptions are omitted. The method may also be executed by software (an application) for performing the facial-expression-based continuous emotion recognition method for a robot.

Referring to FIG. 7, the facial-expression-based continuous emotion recognition method for a robot according to this embodiment builds a database that stores images for each emotion (step S10). The emotional states stored in the database are represented in the two-dimensional A-V form, that is, on an arousal-non-arousal axis and a pleasant-unpleasant axis, and are expressed by the V value and the A value of the emotion. For example, images of the facial expressions corresponding to the emotions at the main values on the emotion plane may be collected and stored.

In the step of building the database (step S10), the face region of each emotion image may be detected prior to the evaluation, and an evaluator may then numerically rate the value of the emotional state for each image, which is stored.

Once the database has been built, the independent variables are calculated through PCA (Principal Component Analysis) based on the distribution of the images stored in the database (step S30).

A preprocessing step may be performed before the step of calculating the independent variables through the PCA (step S30).

Referring to FIG. 8, the preprocessing may include vectorizing the grayscale versions of the images stored in the database (step S21), normalizing the brightness of the vectorized images (step S22), obtaining the mean vector of the normalized images (step S23), and obtaining the difference vectors, that is, the deviation of each image vector from the mean vector (step S24). The step of normalizing the brightness of the vectorized images (step S22) may be performed using the mean and variance of each vector and the mean and variance of all the vectors.
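Steps S21 to S24 can be sketched as follows. The text states only that the normalization uses the per-vector and overall mean and variance, so the specific formula below (standardize each vector, then rescale to the overall mean and standard deviation) is an assumption, as are the function name `preprocess` and the random test images.

```python
import numpy as np

def preprocess(gray_images):
    """Vectorize, brightness-normalize, and mean-center grayscale images.

    gray_images: array of shape (n_images, height, width).
    Returns (diff_vectors, mean_vector).
    """
    n = gray_images.shape[0]
    # S21: flatten each image into one row vector.
    vectors = gray_images.reshape(n, -1).astype(float)
    # S22: assumed normalization, mapping every vector to the overall
    # mean and standard deviation of the whole set.
    global_mean, global_std = vectors.mean(), vectors.std()
    each_mean = vectors.mean(axis=1, keepdims=True)
    each_std = vectors.std(axis=1, keepdims=True)
    each_std[each_std == 0] = 1.0  # guard against flat images
    normalized = (vectors - each_mean) / each_std * global_std + global_mean
    # S23: mean vector of the normalized images.
    mean_vector = normalized.mean(axis=0)
    # S24: difference vectors.
    diff_vectors = normalized - mean_vector
    return diff_vectors, mean_vector

# Hypothetical data: 10 grayscale images of 4 x 5 pixels.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(10, 4, 5))
diff, mean_vec = preprocess(gray)
print(diff.shape, mean_vec.shape)  # (10, 20) (20,)
```

In the claimed method this would be applied separately to the eye region and the mouth region of each face.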

In the step of calculating the independent variables through the PCA (step S30), a covariance matrix is obtained from the difference vectors produced by the preprocessing, the eigenvectors are calculated from it, and the difference vectors are projected onto the eigenvectors to calculate the dimension-reduced PCA values.

Next, the linear regression equation is learned using the calculated PCA values as independent variables (step S50). In the step of learning the linear regression equation (step S50), the arousal value and the valence value corresponding to each emotion may be used as dependent variables. In this case, the linear regression equation has a two-dimensional form consisting of an arousal-non-arousal axis (A axis) and a pleasant (valence)-unpleasant axis (V axis). The regression coefficients are calculated through this learning.

Once the regression equation has been learned, the emotional state can be estimated from images input in real time. To this end, the input image is applied to the linear regression equation to calculate the value of the emotional state (step S70). In this case, a continuous emotion model may be applied so that the value of the emotional state is calculated for every successive frame of the input image, allowing a continuous emotional state to be estimated.

Specifically, the input image is projected onto the eigenvectors obtained from the distribution of the database, that is, PCA is used to calculate the independent variables, and the independent variables are substituted into the linear regression equation to estimate the emotional state of the input image. The input image may include photographs, video, and the like captured by a camera. Before the PCA, the face region of the input image may be detected so that only the detected face region is analyzed.

As described above, in the present invention the robot can continuously measure a person's emotional state from the person's facial expression on the basis of a linear model. When a facial expression is input, the value of the emotional state is calculated using a linear regression model, and the calculated value is expressed on two coordinate axes, arousal-non-arousal (A axis) and pleasant-unpleasant (V axis).

The effect of the facial-expression-based continuous emotion recognition method for a robot according to the present invention is verified below.

Emotion is a highly abstract concept whose state is difficult to define numerically, so there is no ground truth for a subject's emotional state and accurate evaluation is difficult. Likewise, it is difficult to judge the validity of the system parameters identified when a continuous emotion system is assumed.

In the present invention, the accuracy of the quantified emotional state recognition was evaluated by comparing the facial expression analysis results with the previously rated emotional state values. For continuous emotional state estimation, the response characteristics of the continuous emotion model to changes in the emotional state over time were obtained from actual facial expression analysis results, and the results were analyzed.

FIG. 9 is a graph showing the RMS error of V as a function of the reduced dimension when the dimension is reduced by PCA.

The facial expression database used to build the expression analysis system consisted of 90 images of one person: 10 images for each of the expressions corresponding to 9 main emotional states. In the analysis, each expression was treated as one data point and projected onto the m eigenvectors that best describe the distribution of the data, giving an m-dimensional representation. The m-dimensional values of the eye region and the mouth region were used as independent variables, and the previously rated emotional state values were used as dependent variables, to learn the linear regression equation.

For the experiment, one of the 90 images was set aside for testing, PCA and learning of the linear regression equation were performed on the remaining 89 images, and the errors on the V axis and the A axis were calculated for every image. The error is expressed relative to the maximum value of each emotion dimension, taken as 1.
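Read as a leave-one-out procedure, the experiment can be sketched as below. This is a simplified illustration under several assumptions: a single region is used instead of separate eye and mouth regions, the eigenvectors are taken from a singular value decomposition of the centered training vectors (equivalent to the eigenvectors of their covariance matrix), an intercept term is included, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def leave_one_out_errors(vectors, targets, m):
    """Hold out each image in turn, train on the rest, and record the error.

    vectors: (n, d) image vectors; targets: (n, 2) rated [V, A] values.
    Returns an (n, 2) array of prediction errors.
    """
    n = vectors.shape[0]
    errors = np.zeros_like(targets, dtype=float)
    for i in range(n):
        train = np.delete(np.arange(n), i)
        mean = vectors[train].mean(axis=0)
        diff = vectors[train] - mean
        # Leading right-singular vectors are the leading covariance eigenvectors.
        _, _, vt = np.linalg.svd(diff, full_matrices=False)
        basis = vt[:m].T
        X = np.hstack([np.ones((n - 1, 1)), diff @ basis])
        coef, *_ = np.linalg.lstsq(X, targets[train], rcond=None)
        # Project the held-out image and predict its V and A values.
        x_test = np.concatenate([[1.0], (vectors[i] - mean) @ basis])
        errors[i] = x_test @ coef - targets[i]
    return errors

# Synthetic check: targets are an exact linear function of the vectors.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(12, 6))
targets = vectors @ rng.normal(size=(6, 2)) + 0.3
errors = leave_one_out_errors(vectors, targets, m=6)
print(errors.shape)  # (12, 2)
```

With real ratings the errors would not vanish as they do for this noise-free synthetic data; the example only confirms that the train, project, and predict steps are wired together consistently.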

FIG. 9 shows the RMS (Root-Mean-Square) error of V as a function of the reduced dimension when the dimension is reduced by PCA. The accuracy tends to improve as the number of eigenvectors used as independent variables in the linear regression equation increases. The mean RMS error over Valence and Arousal was 0.3335, which the inventors regard as good estimation performance.
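The error measure can be sketched as a relative RMS error, in which each error is divided by the maximum value of the emotion dimension before the root mean square is taken. The function name `relative_rms_error` and the numbers in the example are hypothetical.

```python
import numpy as np

def relative_rms_error(predicted, rated, max_value):
    """RMS error after scaling so that max_value corresponds to 1."""
    err = (np.asarray(predicted, dtype=float) - np.asarray(rated, dtype=float)) / max_value
    return float(np.sqrt(np.mean(err ** 2)))

# Two predictions that are each off by 2 on a scale whose maximum is 4.
rms = relative_rms_error([2.0, 2.0], [0.0, 0.0], max_value=4.0)
print(rms)  # 0.5
```

Averaging this quantity over the V axis and the A axis gives a single figure comparable in kind to the mean RMS error reported above.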

For recognizing emotion from facial expressions, the present invention uses PCA and linear regression analysis and, rather than simply classifying the type of basic emotion, proposes a method of numerically calculating the state value on each axis of the A-V two-dimensional emotion model, so that the robot can respond to the subject's emotion under more finely divided conditions.

In addition, so that the method can be applied to real interaction, where emotion must be recognized in real time and the emotional state checked whenever needed, a continuous emotion model is introduced and a method of estimating the emotional state continuously over time is proposed.

The facial-expression-based continuous emotion recognition method for a robot described above may be implemented as an application, or in the form of program instructions that can be executed by various computer components, and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.

The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention, and vice versa.

Although the invention has been described above with reference to embodiments, those skilled in the art will understand that the present invention can be modified and changed in various ways without departing from the spirit and scope of the invention as set forth in the following claims.

10: Facial-expression-based continuous emotion recognition apparatus for a robot
100: Offline unit 110: Database
130: Preprocessing unit 150: First PCA unit
170: Learning unit 300: Facial expression recognition unit
310: Face region detection unit 330: Second PCA unit
350: Emotional state output unit 131: Vector unit
133: Normalization unit 135: Averaging unit
137: Difference vector unit

Claims

A method of determining an emotional state using an A-V two-axis emotion model based on an A axis (arousal-non-arousal axis) and a V axis (pleasant (valence)-unpleasant axis), the method comprising:
Building a database that stores images for each emotion, the face region of each image, and the emotional state of the facial expression in each image as rated by an evaluator, expressed as an A value corresponding to the A axis and a V value corresponding to the V axis;
Preprocessing the images for each emotion stored in the database;
Calculating independent variables through PCA (Principal Component Analysis) based on the distribution of the preprocessed images for each emotion;
Learning a linear regression equation using the calculated independent variables; and
Applying an input image to the linear regression equation to calculate a value of an emotional state,
Wherein the preprocessing comprises:
Vectorizing grayscale images of the eye region and the mouth region included in the face region of the images for each emotion;
Obtaining a mean vector of the eye region and a mean vector of the mouth region; and
Calculating a plurality of eye region difference vectors, each being the deviation of an eye region vector from the mean vector of the eye region, and a plurality of mouth region difference vectors, each being the deviation of a mouth region vector from the mean vector of the mouth region,
Wherein calculating the independent variables comprises:
Calculating a PCA value for the eye region through the PCA using the plurality of eye region difference vectors, and calculating a PCA value for the mouth region through the PCA using the plurality of mouth region difference vectors,
Wherein learning the linear regression equation comprises:
Constructing a data set having the PCA value for the eye region and the PCA value for the mouth region as independent variables and the A value and the V value as dependent variables;
Calculating regression coefficients by linear regression analysis of the data set; and
Learning the linear regression equation including the regression coefficients,
And wherein applying the input image to the linear regression equation to calculate the value of the emotional state comprises:
Projecting the input image onto eigenvectors obtained from the distribution of the database to calculate independent variables; and
Substituting the independent variables into the linear regression equation to estimate the emotional state of the input image.

The method according to claim 1,
Wherein the linear regression equation has a two-dimensional form consisting of an arousal-non-arousal axis (A axis) and a pleasant (valence)-unpleasant axis (V axis).

The method according to claim 1,
Wherein preprocessing the images stored in the database
Further comprises normalizing the brightness of the vectorized images.

The method according to claim 1,
Wherein calculating the independent variables through the PCA comprises:
Calculating eigenvectors for the eye region by obtaining a covariance matrix from the difference vectors of the eye region, and calculating eigenvectors for the mouth region by obtaining a covariance matrix from the difference vectors of the mouth region; and
Calculating a dimension-reduced PCA value of the eye region by projecting the difference vectors of the eye region onto the eigenvectors for the eye region, and calculating a dimension-reduced PCA value of the mouth region by projecting the difference vectors of the mouth region onto the eigenvectors for the mouth region.

The method according to claim 3, wherein normalizing the brightness of the vectorized images
Is performed using the mean and variance of each vector and the mean and variance of all the vectors.

The method according to claim 1, wherein applying the input image to the linear regression equation to calculate the value of the emotional state
Further comprises applying a continuous emotion model to calculate the value of the emotional state for each successive frame of the input image.

A computer-readable recording medium on which a computer program for performing the method according to any one of claims 1 to 6 is recorded.

An apparatus for determining an emotional state using an A-V two-axis emotion model based on an A axis (arousal-non-arousal axis) and a V axis (pleasant (valence)-unpleasant axis), the apparatus comprising:
An offline unit including a database that stores images for each emotion, the face region of each image, and the emotional state of the facial expression in each image as rated by an evaluator, expressed as an A value corresponding to the A axis and a V value corresponding to the V axis, a first PCA unit that calculates independent variables through PCA (Principal Component Analysis) based on the distribution of the images stored in the database, and a learning unit that learns a linear regression equation using the calculated independent variables; and
A facial expression recognition unit that applies an input image to the linear regression equation to calculate a value of an emotional state,
Wherein the offline unit
Further includes a preprocessing unit that preprocesses the images for each emotion stored in the database before the PCA,
Wherein the preprocessing unit includes:
A vector unit that vectorizes grayscale images of the eye region and the mouth region included in the face region of the images for each emotion, an averaging unit that obtains a mean vector of the eye region and a mean vector of the mouth region, and a difference vector unit that calculates a plurality of eye region difference vectors, each being the deviation of an eye region vector from the mean vector of the eye region, and a plurality of mouth region difference vectors, each being the deviation of a mouth region vector from the mean vector of the mouth region,
Wherein the first PCA unit
Calculates a PCA value for the eye region through the PCA using the plurality of eye region difference vectors, and calculates a PCA value for the mouth region through the PCA using the plurality of mouth region difference vectors,
Wherein the learning unit
Constructs a data set having the PCA value for the eye region and the PCA value for the mouth region as independent variables and the A value and the V value as dependent variables, calculates regression coefficients by linear regression analysis of the data set, and learns the linear regression equation including the regression coefficients,
And wherein the facial expression recognition unit includes:
A second PCA unit that projects the input image onto eigenvectors obtained from the distribution of the database to calculate independent variables; and
An emotional state output unit that substitutes the independent variables into the linear regression equation to estimate the emotional state of the input image.

The apparatus according to claim 8,
Wherein the linear regression equation has a two-dimensional form consisting of an arousal-non-arousal axis (A axis) and a pleasant (valence)-unpleasant axis (V axis).

The apparatus according to claim 8,
Further comprising a face region detection unit that detects a face region of the input image.

The apparatus according to claim 8,
Wherein the preprocessing unit further includes a normalization unit that normalizes the brightness of the vectorized images.

The apparatus according to claim 8,
Wherein the facial expression recognition unit calculates the value of the emotional state for each successive frame of the input image.