WO2018154098A1 - Method and system for recognizing mood by means of image analysis - Google Patents
Method and system for recognizing mood by means of image analysis
- Publication number
- WO2018154098A1 (PCT/EP2018/054622)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- mood
- subject
- facial
- images
- distance
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Abstract
The invention relates to a mood recognition method for recognizing the mood of a subject (1) based on their relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is different from emotion. The manner of transforming the captured images of the subjects (1) into facial movements is customized, by learning the particular form of the facial features of the analyzed subject (1). The invention is based on the analysis of a set of a given number of images, with said number being greater than the number used in standard emotion recognition. A more robust mood recognition method is thereby defined. The method comprises three fundamental steps: defining general previous criteria and data, defining customized resting patterns, and evaluating the mood.
Description
DESCRIPTION
"Method and system for recognizing mood by means of image analysis"
FIELD OF THE INVENTION
The present invention is comprised in the technical field corresponding to the sector of artificial intelligence and facial expression recognition. More specifically, the invention relates to a mood recognition method based on image sequence processing.
BACKGROUND OF THE INVENTION
The recognition of emotions from facial expressions is a very dynamic field today given its various applications in the field of psychology, advertising or marketing, among others. Said recognition is typically performed according to the system known as the Facial Action Coding System (FACS). FACS allows analyzing human facial expressions through facial coding, and it can be used to classify virtually any anatomical facial expression by analyzing the possible movements of the muscles associated with said facial expression. These movements are divided into what is commonly referred to as Action Units (AU), which are the fundamental actions of muscles or individual muscle groups (for example, according to the mentioned classification, AU6 refers to raising cheeks). Terms such as action units, gestures, facial expressions and AU will be used interchangeably herein.
On the other hand, the terms mood and emotion are normally confused in colloquial language and in their formal definitions. There is a general consensus today that establishes at least three main differences between both terms:
- moods last longer than emotions do;
- moods are not outwardly expressed in a direct manner, unlike emotions;
- moods relate to emotions insofar as a person who is in a certain mood tends to experience certain emotions. In other words, by means of noticeable effects produced by emotions, facial expressions or gestures, it is possible to recognize a person's mood.
As mentioned, the applications for mood-based facial recognition may be very useful in various sectors, such as commercial or political marketing, human resources, video games, distance learning, digital signage and human-computer interactions in general.
In the field of facial recognition for recognizing emotions, different analysis technologies are known, such as those disclosed in patents US 8798374 B2, US 8879854 B2 or US 9405962 B2. These patent documents disclose systems focusing on recognizing emotions, not moods (which are different concepts), and their associated methods of analysis therefore focus on the recognition and processing of instantaneous images of the subjects under study. These patent documents primarily disclose the construction of a set of descriptors based on detectable geometric facial features, and a method of classifying AUs based on these descriptors. Suitable descriptors are obtained either by heuristically defining a set of rules or by automatic feature selection in the context of machine learning methods. Therefore, US 8798374 B2 discloses an automatic method for image processing for the detection of AUs, and US 8879854 B2 discloses a method and apparatus for recognizing emotions based on action units. Descriptors constructed in a heuristic manner have very little discriminatory power, particularly in interpersonal (cross-subject) detection. This is why various lines of work have tended to construct more complex descriptors by means of automatic methods for selecting features. For example, US 9405962 B2 discloses a method for determining emotions in a set of images in the presence of a facial artifact (beard, mustache, glasses, etc.), including the detection of action units.
On the other hand, the "Pleasure-Arousal-Dominance" (or PAD) model is also known today as a theoretical framework for mood recognition. The PAD model is a system that allows defining and measuring different moods, emotional traits and personality traits as a function of three orthogonal dimensions: pleasure (P), arousal (A), and dominance (D). The PAD model is a framework that is generally used for defining moods and it allows the interrelation thereof with the facial coding in FACS. In other words, PAD can describe a mood in terms of action units.
In the PAD model, based on the intersection of the pleasure, arousal and dominance axes, eight octants representing the basic categories of moods can be derived (Table 1 ).
Table 1 : Moods, PAD space octants.
It is possible to express emotions in terms of pleasure, arousal and dominance according to a certain correlation (Table 2). Therefore, a mood can give rise to various emotions. For example, the mood "anxious" can manifest itself in emotions such as "confused", "fearful", "worried", "ashamed", etc., which in turn can be related to action units (AUs).
Table 2: Example of emotions represented in PAD space. Particularly, it is possible to define the correspondence between AUs and PAD space octants by means of the PAD model. The main objective of this correspondence is the description of each of the eight moods in AU terms. The Facial Expression Repertoire (or FER) provides this description. In the state of the art, the manner of transforming captured images of people into facial expressions/movements relies on generic methods based on processing instantaneous images of the subjects under analysis. However, these methods entail errors, since the particular form of the facial features of the analyzed subject cannot be "learned" and customized, which would make the emotion recognition method more precise. Additionally, said methods of the state of the art are restricted to the identification of emotions (happiness, sadness, etc.), but they do not allow detecting complex constructs such as moods, the activation of which may comprise, at the same time, different configurations of emotions, sometimes even opposing emotions. For example, an anxious mood can be reflected in both a sad subject and in a happy subject. Therefore, the known solutions of the state of the art are still unable to solve the technical problem of providing a precise mood recognition method.
The present invention proposes a solution to this technical problem by means of a novel facial recognition method for recognizing moods in a set of images, which provides for the customization of the subject to minimize AU detection errors.
BRIEF DISCLOSURE OF THE INVENTION
The main object of the invention relates to a method for recognizing the mood of a subject based on their relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is technically different from emotion. In the method of the invention, the manner of transforming the captured images of the subjects into facial gestures/movements is customized, "learning" the particular form of the facial features of the person analyzed, such that the mood recognition method is more precise than if this customization were not performed. The mentioned object of the invention is achieved by means of a mood recognition method for recognizing the mood of a subject based on facial images of said subject obtained by means of a system comprising a camera suitable for taking said images, and a processor for storing and/or processing said images. Advantageously, said method comprises carrying out the following steps:
a) registering one or more facial images of the subject in a reference mood; b) defining a plurality of characteristic facial points of the subject in one or more of the images associated with the reference mood;
c) defining one or more resting patterns corresponding to the distances between the characteristic facial points of the subject, defined in step b);
d) defining one or more action units (AUs) corresponding to the movement of the facial points with respect to the resting patterns;
e) defining one or more activation rules of each action unit for the mood to be recognized based on threshold values associated with the amount of movement of the characteristic facial points with respect to the resting patterns;
f) defining a standard probability distribution associated with the activation of one or more action units associated with a mood;
g) registering a sequence of facial images of the subject that is associated with the mood to be recognized;
h) obtaining, for each image of the sequence, the activation probability distribution of the action units associated with the mood to be recognized, according to the rules defined in step e);
i) determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f).

A reliable and robust mood recognition method is thereby achieved, where image analysis is performed in sequences captured by the camera, such that said sequences allow dynamically evaluating the contribution of the action units to the mood of the subject.

In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following steps in step f):
- defining a standard probability distribution associated with the activation of one or more action units associated with a mood i, defining to that end a value p_ij between 0 and 1 to designate the contribution of each action unit j, where the value 0 is assigned to the minimum contribution and the value 1 to the maximum;
- constructing with these values p_ij a vector p_i for each mood i, where n is the number of action units involved in determining the moods:
p_i = C_p {p_i1, p_i2, ..., p_in}
where C_p is a normalization constant for imposing the condition that Σ_{j=1}^{n} p_ij = 1.
In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following steps in step h):
- registering a number W of facial images of the subject;
- obtaining, for the set of W images, the activation probability distribution of the action units j associated with the mood i to be recognized, defining to that end a value q_j to designate the contribution of each action unit j, according to the expression:
q_j = C_q (1/W) Σ_{k=1}^{W} s_kj
where k = 1, 2, ..., W; j = 1, 2, ..., n; s_kj is assigned the value s_kj = 1 if the action unit j is activated in image k, and s_kj = 0 if it is not activated; and C_q is a normalization constant for imposing the condition that Σ_{j=1}^{n} q_j = 1.
In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following step in step i):
- determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f) by calculating the Bhattacharyya coefficient, D_i, for each mood i, according to the expression:
D_i = Σ_{j=1}^{n} √(p_ij · q_j)
More preferably, the W facial images of the subject are consecutive in the sequence captured by the camera.
In another preferred embodiment of the method of the invention, the set of n action units involved in determining the mood or moods of the subject are selected from all the action units existing in FACS.
More preferably, the action units involved in determining the mood or moods of the subject are one or more of the following: inner brow raiser; outer brow raiser; brow lowerer; upper lid raiser; cheek raiser; upper lip raiser; lip corner puller; lip corner depressor; lips part; jaw drop; eyes closed.
In another preferred embodiment of the method of the invention, the moods considered are the eight moods of the Pleasure-Arousal-Dominance (PAD) space.
More preferably, the relationship between the eight moods of the PAD space developed by Mehrabian and the action units that are activated in each of them are those defined in the Facial Expression Repertoire (FER).
In another preferred embodiment of the method of the invention, one or more resting patterns corresponding to the distances between the characteristic facial points of the subject are defined, with said distances being one or more of the following: middle right eye-eyebrow distance; inner right eye-eyebrow distance; middle left eye-eyebrow distance; inner left eye-eyebrow distance; open right eye distance; open left eye distance; horizontal mouth distance; upper mouth-nose distance; jaw-nose distance; almost lower mouth-outer mouth distance; left eyebrow-upper lid distance; left eyebrow-lower lid distance; right eyebrow-upper lid distance; right eyebrow-lower lid distance.
In another preferred embodiment of the method of the invention, the mood or moods of the subject are calibrated in a session with known and controlled stimuli, such that one or more action units can be associated with one or more moods i of said subject.
Another object of the invention relates to a mood recognition system for recognizing the mood of a subject through the mood recognition method according to any of the embodiments described herein, comprising:
- a camera suitable for taking facial images of said subject;
- one or more processing means (3) for storing and/or processing the facial images, wherein said processing means (3) are configured by means of hardware and/or software for carrying out a mood recognition method according to any of the embodiments described herein.
In a preferred embodiment of the system of the invention, said system additionally comprises a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera, as a function of results obtained in previous analyses. More preferably, said learning subsystem is locally or remotely connected to the processing means.
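Purely by way of illustration, the following Python sketch (using only NumPy) outlines how steps a) to i) could be organized as a data flow. It is not the claimed implementation: the landmark detector, the distance definitions, the activation rules and the mood pattern vectors are hypothetical placeholders assumed to be supplied externally, for instance by any face-tracking library.

```python
import numpy as np

def recognize_mood(reference_frames, sequence_frames, detect_landmarks,
                   distance_defs, activation_rules, mood_patterns):
    """Hypothetical outline of steps a) to i); every helper passed in is a
    placeholder to be provided by the implementer."""
    # Steps a)-c): customized resting pattern (mean and deviation of each
    # distance parameter P over the subject's reference images)
    rest = np.array([[d(detect_landmarks(img)) for d in distance_defs]
                     for img in reference_frames])
    mu, sigma = rest.mean(axis=0), rest.std(axis=0)

    # Steps d), e), g), h): binary AU activations per image of the sequence,
    # averaged over the W images to obtain the experimental distribution q
    activations = []
    for img in sequence_frames:
        params = np.array([d(detect_landmarks(img)) for d in distance_defs])
        activations.append([rule(params - mu, sigma) for rule in activation_rules])
    q = np.mean(np.array(activations, dtype=float), axis=0)
    q = q / q.sum() if q.sum() > 0 else q  # normalization constant C_q

    # Step i): Bhattacharyya coefficient against each standard distribution p_i
    scores = {mood: float(np.sum(np.sqrt(p * q))) for mood, p in mood_patterns.items()}
    return max(scores, key=scores.get), scores
```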
DESCRIPTION OF THE DRAWINGS
Figure 1 shows a flowchart of the steps of the method of the invention according to a preferred embodiment thereof.
Figure 2 shows the characteristic facial points used in detecting action units of the method of the invention according to a preferred embodiment thereof.
Figure 3 depicts the detection of the activation of an action unit (specifically, AU1 ) in a sequence of images upon comparing the minimum theoretical variation in pixels with the experimental variation of facial parameters with respect to the customized resting pattern parameters (in this case parameter P2).
Figure 4 shows a mood recognition system according to a preferred embodiment of the invention, showing in detail the elements thereof.
DETAILED DISCLOSURE OF THE INVENTION
A detailed description of the method of the invention is provided below in reference to a preferred embodiment thereof based on Figure 1 of the present patent document. Said embodiment is provided for the purpose of illustrating the claimed invention in a non-limiting manner.
One object of the invention relates to a mood recognition method for recognizing the mood of a subject (1) based on their relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is different from emotion. In defining said relationship, the theory relating facial gestures/movements and emotions (FACS coding) and the theory relating emotions and moods (PAD model) are used. In the method of the invention, the manner of transforming the captured images of the subjects (1) into facial gestures/movements is customized, "learning" the particular form of the facial features of the analyzed subject (1), such that the mood recognition method is more precise than if this customization were not performed.
The method of the invention furthermore takes into account the prior history of the sequence of images (i.e., the recognition of expressions in the images preceding the processed image). The invention is therefore based on the analysis of a set of a given number of images, unlike methods based on instantaneous recognition for the identification of emotions.
According to Figure 1 , the method comprises three fundamental steps: defining general previous criteria and data, defining customized resting patterns, and
evaluating the mood. Each of these steps is described below in detail.
1. Defining general previous criteria and data
The method requires basic data, prior to the analysis of the mood of the subject (1):
- Firstly, a subset of n action units (AUs), which are considered sufficient for being able to describe and recognize any mood of the PAD space, must be selected from among all those existing in FACS. For example, Table 3 shows a possible subset with n = 11. Therefore, there is a set of gestures or AUj, the combination of which can describe moods, where j = 1, 2, ..., n.
Table 3: Subset of action units considered in mood recognition.
- Secondly, a previous criterion relating the eight moods of the PAD space developed by Mehrabian with the facial gestures or action units (AUs) that are activated in each of them is required. Table 4 shows all this starting data defined by Russell and Mehrabian and the Facial Expression Repertoire (FER) according to the subset of action units considered.
Mood | Active AUs
---|---
Exuberant | AU5, AU6, AU12, AU25, AU26
Anxious | AU1, AU2, AU4, AU5, AU15, AU25, AU26
Bored | AU1, AU2, AU4, AU15, AU43
Docile | AU1, AU2, AU12, AU43
Hostile | AU4, AU10, AU5, AU15, AU25, AU26
Relaxed | AU6, AU12, AU43
Dependent | AU1, AU2, AU5, AU12, AU25, AU26
Disdainful | AU4, AU15, AU43

Table 4. Active AUs per PAD quadrant
The starting data must also indicate the importance of each gesture or AUj in the corresponding mood. To that end, a number between 0 and 1 is defined to determine the weight of each gesture or AUj. If an AUj is highly determinant, it is assigned the value 1, whereas if it is not important for a certain mood, it is assigned the value 0. A vector is constructed for each mood i with these values. This vector is called p_i, for example:
p_i = C_p {p_ij} = C_p {p_i1, p_i2, ..., p_in} = C_p {1, 1, 1, 0.7, 0, 0, 0, 1, 1, 1, 0}   (Eq. 1)
Each p_ij is a scalar that determines the importance of an AUj in the mood i, and C_p is a normalization constant for imposing the condition that Σ_{j=1}^{n} p_ij = 1 in Eq. 1. Then p_i is a pattern of the mood that relates it with gestures or AUs.
A standard probability distribution associated with the activation of one or more action units associated with a mood is thereby defined.
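As a minimal sketch, and assuming the subset of n = 11 action units of Table 3 and the mood-to-AU assignments of Table 4, the pattern vector p_i of Eq. 1 could be built as follows in Python; the individual weight values used in the example are illustrative, not the actual FER values.

```python
import numpy as np

# Subset of n = 11 action units considered (ordering here is illustrative)
AUS = ["AU1", "AU2", "AU4", "AU5", "AU6", "AU10",
       "AU12", "AU15", "AU25", "AU26", "AU43"]

def mood_pattern(weights):
    """Build the normalized pattern vector p_i of Eq. 1 from per-AU weights
    (0 = not important for the mood, 1 = highly determinant)."""
    p = np.array([weights.get(au, 0.0) for au in AUS])
    return p / p.sum()  # the division plays the role of C_p: sum_j p_ij = 1

# Example: "exuberant" activates AU5, AU6, AU12, AU25 and AU26 (Table 4);
# the weights below are illustrative values, not the FER ones.
p_exuberant = mood_pattern({"AU5": 1.0, "AU6": 1.0, "AU12": 1.0,
                            "AU25": 0.7, "AU26": 0.7})
```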
2. Defining customized resting patterns
In a second step, the method requires defining criteria for activating each AUj, which can be used to determine whether a gesture or AUj has been made by the subject (1) under study when interpreting the image data. To define said criteria, the following steps are carried out:
- Registering one or more facial images of the subject (1 ) in a reference mood.
- Defining a plurality of characteristic facial points of the subject (1) in one or more of the images in the reference mood. For example, as shown in Figure 2, 24 facial points or curves can be taken. These characteristic points are strategically associated with the facial points or curves that are most susceptible to undergoing changes in position upon activating one or more AUj.
- A plurality of distances between those characteristic facial points selected in the preceding step is defined. These distances are called parameters P. As an example, 15 distance parameters that will be used in detecting AUs are defined in Table 5.
Table 5. Distance parameters for detecting AUs.
- Defining one or more resting patterns for each parameter P. The definition of these resting patterns includes the definition of a mean value μ and a maximum deviation σ from the mean value. These resting patterns must be found for each subject (1) subjected to the method of facial recognition analysis. It is a step included in each analysis, not a prior independent calibration.
- Defining one or more rules relating the non-resting measurements with the resting patterns to indicate the activation of each AUj. If the difference of a measurement with respect to the resting pattern is positive, it is an expansion of this facial parameter, ΔP(+); if in contrast the difference is negative, it is a contraction, ΔP(−). Table 6 shows an example of a set of rules for detecting activations of AUs; they describe a threshold value for each variation of the parameters relating to the AUs and are defined as a function of the deviation σ. For example, if in an image ΔP7(+) > 2σ, AU12 will have been activated. A sketch of how such a rule might be evaluated in software is given after Table 6.
Table 6. Rules used in detecting AUs.
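As a sketch of how the resting patterns and a Table 6 style rule could be evaluated, assuming the distance parameters P have already been measured by an external facial landmark detector and that P7 is stored at a known index, the following Python code illustrates the μ/σ estimation and the 2σ rule for AU12. The parameter index used here is an assumption made only for the example.

```python
import numpy as np

def resting_pattern(param_history):
    """Customized resting pattern: mean mu and deviation sigma of each
    distance parameter P, measured on the subject's reference images."""
    h = np.asarray(param_history, dtype=float)  # shape (num_images, num_params)
    return h.mean(axis=0), h.std(axis=0)

def au12_activated(params, mu, sigma, p7_index=6):
    """Example of a Table 6 style rule: AU12 (lip corner puller) is taken as
    active when the horizontal mouth distance P7 expands by more than
    2*sigma over its resting value; the index of P7 is an assumption."""
    delta_p7 = params[p7_index] - mu[p7_index]  # positive value = expansion dP7(+)
    return delta_p7 > 2.0 * sigma[p7_index]
```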
3. Evaluating the mood
With the steps described above, it is possible to determine changes in facial parameters in a consecutive image package, as shown in Figure 3 by way of example. According to the activation rules, the theoretical change required to activate an AU can be compared with the actual changes experienced by the subject (1) throughout a sequence of images. Figure 3 shows as an example the rule for activating AU1 with respect to parameter P2 and the experimental value of parameter P2 in pixels in a sequence of images. Since the theoretical variation fits the experimental variation, AU1 is considered to have been activated in that set of images.
The method of the invention then comprises a comparison in order to carry out the final step of evaluating the mood:
Assume that there is a number W of images, which are preferably consecutive images. If each of those images is compared with the criteria for activating the AUs, it can be determined, for each gesture or AUj, whether it has been activated. By repeating that comparison with all the images, it is possible to determine if it has been activated in one or in several images. In other words, an occurrence or relevance value can be obtained for each gesture or AUj. Each of those occurrence values can be referred to as q_j. Each q_j is calculated with the following expression:
q_j = C_q (1/W) Σ_{k=1}^{W} s_kj   (Eq. 2)
where k = 1, 2, ..., W designates an image and s_kj represents the activation or non-activation of the gesture AUj in image k. If the gesture AUj has been activated, s_kj is assigned the value s_kj = 1, whereas if it has not been activated, s_kj = 0. Finally, C_q is a normalization constant for imposing the condition that Σ_{j=1}^{n} q_j = 1 in Eq. 2.
With the set of resulting scalars, a vector q = {q_j} having the same dimensions as the pattern p_i can be constructed, but this time it denotes the experimental weight or activation probability distribution of each gesture in a set of W images associated with the mood to be recognized.
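A minimal Python sketch of Eq. 2, assuming the per-image activations s_kj have already been obtained with rules such as the one shown above, could read:

```python
import numpy as np

def occurrence_vector(activations):
    """Eq. 2: `activations` is a W x n binary matrix with s_kj = 1 when
    action unit j was detected as active in image k of the sequence.
    Returns the normalized experimental distribution q = {q_j}."""
    s = np.asarray(activations, dtype=float)
    q = s.mean(axis=0)                     # (1/W) * sum_k s_kj for every AU j
    total = q.sum()
    return q / total if total > 0 else q   # C_q enforces sum_j q_j = 1
```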
The final step of the method of the invention consists of comparing the pattern with the experiment. To that end, the Bhattacharyya coefficient, D_i, is used for each mood i:
D_i = Σ_{j=1}^{n} √(p_ij · q_j)
This coefficient gives a value indicating the proximity of the probability distribution of the experiment with respect to the standard probability distribution.
Through these steps it is possible to determine the mood or moods that are closer to the experimental mood of the subject (1 ) under analysis.
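The comparison of this final step can be sketched as follows, assuming the pattern vectors p_i and the experimental vector q are available as NumPy arrays; the ranking helper is illustrative and not part of the claimed method.

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between a mood pattern p_i and the
    experimental distribution q; values close to 1 mean the two
    distributions are very similar."""
    return float(np.sum(np.sqrt(np.asarray(p) * np.asarray(q))))

def rank_moods(mood_patterns, q):
    """Rank the eight PAD moods by their coefficient D_i, highest first."""
    scores = {mood: bhattacharyya(p, q) for mood, p in mood_patterns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```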
In conclusion, this invention considers the use of descriptors of the temporal dynamics of a person's facial expression to determine said person's mood. These descriptors encode the importance of the occurrence of each AU for each mood. The invention uses a method of detecting AUs capable of learning the particular parameters of the appearance of the facial movement in a customized manner in the same analysis session, without a prior learning step. The final system that is provided also allows the possibility of defining a temporal analysis parameter W relating to the set of images to be processed, which allows a robust interpretation of the mood of the person participating in the analysis that is tolerant of partial errors. The analysis process is an iteration the duration of which depends on the number of image sequences.
Alternatively, it is possible to perform a special calibration of the subject (1) in a session with known stimuli, which allows evaluating the degree of response of the subject (1) to standard stimuli, in order to then recognize moods in non-standard stimuli with greater precision.
Another object of the invention relates to a facial recognition system for recognizing the mood of a subject (1 ) through the mood recognition method such as the one described in the preceding embodiment, comprising:
- a camera (2) suitable for taking facial images of said subject (1 );
- one or more processing means (3) for storing and/or processing the facial images, where said processing means (3) are configured by means of hardware and/or software for carrying out a mood recognition method according to any of the embodiments described herein.
The system of the invention can additionally comprise a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera (2), as a function of results obtained in previous analyses. This allows progressively improving system precision and feeding the previously obtained information back into said system, associating certain action units with moods of the subject in a customized manner. The learning subsystem can be locally or remotely connected to the processing means (3).
Claims
1 . A mood recognition method for recognizing the mood of a subject (1 ) based on facial images of said subject (1 ) obtained by means of a system comprising a camera (2) suitable for taking said images, and a processor (3) for storing and/or processing said images; where said method is characterized in that it comprises carrying out the following steps:
a) registering one or more facial images of the subject (1 ) in a reference mood;
b) defining a plurality of characteristic facial points of the subject (1 ) in one or more of the images associated with the reference mood;
c) defining one or more resting patterns corresponding to the distances between the characteristic facial points of the subject (1 ), defined in step b);
d) defining one or more action units (AUs) corresponding to the movement of the facial points with respect to the resting patterns;
e) defining one or more activation rules of each action unit for the mood to be recognized based on threshold values associated with the amount of movement of the characteristic facial points with respect to the resting patterns;
f) defining a standard probability distribution associated with the activation of one or more action units associated with a mood;
g) registering a sequence of facial images of the subject (1 ) that is associated with the mood to be recognized;
h) obtaining, for each image of the sequence, the activation probability distribution of the action units associated with the mood to be recognized, according to the rules defined in step e);
i) determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f).
2. The mood recognition method according to the preceding claim, wherein:
- a standard probability distribution associated with the activation of one or more action units associated with a mood i is defined, defining to that end a value p_ij between 0 and 1 to designate the contribution of each action unit j, where the value 0 is assigned to the minimum contribution and the value 1 is assigned to the maximum contribution;
- a vector p_i is constructed with these values p_ij for each mood i, where n is the number of action units involved in determining the moods:
p_i = C_p {p_i1, p_i2, ..., p_in}
where C_p is a normalization constant for imposing the condition that Σ_{j=1}^{n} p_ij = 1.
3. The mood recognition method according to any of the preceding claims, wherein:
- a number W of facial images of the subject (1 ) are registered;
- the activation probability distribution of the action units j associated with the mood i to be recognized is obtained for the set of W images, defining to that end a value q_j to designate the contribution of each action unit j, according to the expression:
q_j = C_q (1/W) Σ_{k=1}^{W} s_kj
where k = 1, 2, ..., W; j = 1, 2, ..., n; s_kj is assigned the value s_kj = 1 if the action unit j is activated in image k, and s_kj = 0 if it is not activated; and C_q is a normalization constant for imposing the condition that Σ_{j=1}^{n} q_j = 1.
4. The mood recognition method according to any of the preceding claims, wherein the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f) is determined by calculating the Bhattacharyya coefficient, D_i, for each mood i, according to the expression:
D_i = Σ_{j=1}^{n} √(p_ij · q_j)
5. The mood recognition method according to the preceding claim, wherein the W facial images of the subject (1 ) are consecutive images in a sequence captured by the camera (2).
6. The mood recognition method according to any of the preceding claims, wherein the set of n action units involved in determining the mood or moods of the subject (1 ) are selected from all the action units existing in the Facial Action Coding System (FACS).
7. The mood recognition method according to the preceding claim, wherein the action units involved in determining the mood or moods of the subject (1 ) are one or more of the following: inner brow raiser; outer brow raiser; brow lowerer; upper lid raiser; cheek raiser; upper lip raiser; lip corner puller; lip corner depressor; lips part; jaw drop; eyes closed.
8. The mood recognition method according to any of the preceding claims, wherein the moods considered are the eight moods of the PAD space developed by Mehrabian.
9. The mood recognition method according to the preceding claim, wherein the relationship between the eight moods of the PAD space developed by Mehrabian and the action units that are activated in each of them are those defined by Russell and Mehrabian and the Facial Expression Repertoire (FER).
10. The mood recognition method according to any of the preceding claims, wherein one or more resting patterns corresponding to the distances between the characteristic facial points of the subject (1 ) are defined, with said distances being one or more of the following: middle right eye-eyebrow distance; inner right eye-eyebrow distance; middle left eye-eyebrow distance; inner left eye-eyebrow distance; open right eye distance; open left eye distance; horizontal mouth distance; upper mouth-nose distance; jaw-nose distance; almost lower mouth- outer mouth distance; left eyebrow-upper lid distance; left eyebrow-lower lid distance; right eyebrow-upper lid distance; right eyebrow-lower lid distance.
11. The mood recognition method according to any of the preceding claims, wherein the mood or moods of the subject (1) are calibrated in a session with known stimuli.
12. A mood recognition system for recognizing the mood of a subject (1) through a mood recognition method according to any of claims 1 to 11, comprising:
- a camera (2) suitable for taking facial images of said subject (1 );
- one or more processing means (3) for storing and/or processing the facial images, wherein said processing means (3) are configured by means of hardware and/or software for carrying out a mood recognition method according to any of the preceding claims.
13. The mood recognition system for recognizing the mood of a subject (1 ) according to the preceding claim, wherein the images of the registered sequence are consecutive images obtained by the camera (2).
14. The mood recognition system for recognizing the mood of a subject (1) according to any of the preceding claims, additionally comprising a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera (2) as a function of results obtained in previous analyses, wherein said learning subsystem is locally or remotely connected to the processing means (3).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
ES201730259A ES2633152B1 (en) | 2017-02-27 | 2017-02-27 | METHOD AND SYSTEM FOR THE RECOGNITION OF THE STATE OF MOOD THROUGH IMAGE ANALYSIS |
ESP201730259 | 2017-02-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018154098A1 true WO2018154098A1 (en) | 2018-08-30 |
Family
ID=59846800
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2018/054622 WO2018154098A1 (en) | 2017-02-27 | 2018-02-26 | Method and system for recognizing mood by means of image analysis |
Country Status (2)
Country | Link |
---|---|
ES (1) | ES2633152B1 (en) |
WO (1) | WO2018154098A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109523290A (en) * | 2018-09-14 | 2019-03-26 | 平安科技(深圳)有限公司 | Evaluation method, device, equipment and medium are paid attention to the class based on the micro- expression of audience |
CN109961054A (en) * | 2019-03-29 | 2019-07-02 | 山东大学 | It is a kind of based on area-of-interest characteristic point movement anxiety, depression, angry facial expression recognition methods |
CN110889908A (en) * | 2019-12-10 | 2020-03-17 | 吴仁超 | Intelligent sign-in system integrating face recognition and data analysis |
CN112115751A (en) * | 2019-06-21 | 2020-12-22 | 北京百度网讯科技有限公司 | Training method and device for animal mood recognition model |
CN112507959A (en) * | 2020-12-21 | 2021-03-16 | 中国科学院心理研究所 | Method for establishing emotion perception model based on individual face analysis in video |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI756681B (en) * | 2019-05-09 | 2022-03-01 | 李至偉 | Artificial intelligence assisted evaluation method applied to aesthetic medicine and system using the same |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100266213A1 (en) * | 2009-04-16 | 2010-10-21 | Hill Daniel A | Method of assessing people's self-presentation and actions to evaluate personality type, behavioral tendencies, credibility, motivations and other insights through facial muscle activity and expressions |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110263946A1 (en) * | 2010-04-22 | 2011-10-27 | Mit Media Lab | Method and system for real-time and offline analysis, inference, tagging of and responding to person(s) experiences |
-
2017
- 2017-02-27 ES ES201730259A patent/ES2633152B1/en active Active
-
2018
- 2018-02-26 WO PCT/EP2018/054622 patent/WO2018154098A1/en active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100266213A1 (en) * | 2009-04-16 | 2010-10-21 | Hill Daniel A | Method of assessing people's self-presentation and actions to evaluate personality type, behavioral tendencies, credibility, motivations and other insights through facial muscle activity and expressions |
Non-Patent Citations (5)
Title |
---|
ADAMS ANDRA ET AL: "Automated recognition of complex categorical emotions from facial expressions and head motions", 2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), IEEE, 21 September 2015 (2015-09-21), pages 355 - 361, XP032825053, DOI: 10.1109/ACII.2015.7344595 * |
BOUKRICHA H ET AL: "Pleasure-arousal-dominance driven facial expression simulation", AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION AND WORKSHOPS, 2009. ACII 2009. 3RD INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 10 September 2009 (2009-09-10), pages 1 - 7, XP031577868, ISBN: 978-1-4244-4800-5 * |
DIANA DI LORENZA E ARELLANO TAVARA: "Visualization of Affect in Faces based on Context Appraisal", 1 January 2012 (2012-01-01), pages 881 - 905, XP055481766, Retrieved from the Internet <URL:https://www.tdx.cat/bitstream/handle/10803/84078/Tddlat1de1.pdf?sequence=1> [retrieved on 20180606], DOI: 10.1016/j.jm.2004.06.005 * |
EL KALIOUBY R ET AL: "Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures", 20040627; 20040627 - 20040602, 27 June 2004 (2004-06-27), pages 154 - 154, XP010761935 * |
GUNES HATICE ET AL: "Categorical and dimensional affect analysis in continuous input: Current trends and future directions", IMAGE AND VISION COMPUTING, ELSEVIER, GUILDFORD, GB, vol. 31, no. 2, 20 July 2012 (2012-07-20), pages 120 - 136, XP028973723, ISSN: 0262-8856, DOI: 10.1016/J.IMAVIS.2012.06.016 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109523290A (en) * | 2018-09-14 | 2019-03-26 | 平安科技(深圳)有限公司 | Evaluation method, device, equipment and medium are paid attention to the class based on the micro- expression of audience |
CN109961054A (en) * | 2019-03-29 | 2019-07-02 | 山东大学 | It is a kind of based on area-of-interest characteristic point movement anxiety, depression, angry facial expression recognition methods |
CN112115751A (en) * | 2019-06-21 | 2020-12-22 | 北京百度网讯科技有限公司 | Training method and device for animal mood recognition model |
CN110889908A (en) * | 2019-12-10 | 2020-03-17 | 吴仁超 | Intelligent sign-in system integrating face recognition and data analysis |
CN110889908B (en) * | 2019-12-10 | 2020-11-27 | 苏州鱼得水电气科技有限公司 | Intelligent sign-in system integrating face recognition and data analysis |
CN112507959A (en) * | 2020-12-21 | 2021-03-16 | 中国科学院心理研究所 | Method for establishing emotion perception model based on individual face analysis in video |
Also Published As
Publication number | Publication date |
---|---|
ES2633152B1 (en) | 2018-05-03 |
ES2633152A1 (en) | 2017-09-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18710784 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18710784 Country of ref document: EP Kind code of ref document: A1 |