CN111626113A - Facial expression recognition method and device based on facial action unit - Google Patents

Facial expression recognition method and device based on facial action unit Download PDF

Info

Publication number
CN111626113A
Authority
CN
China
Prior art keywords
facial
action unit
face
features
facial expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010312602.6A
Other languages
Chinese (zh)
Inventor
姚辉
芦燕云
陈晓华
孙广宇
李欣
才智
章莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Zhongyunwei Technology Co ltd
Beijing Xicheng District Center School For Mental Retardation
Original Assignee
Chengdu Zhongyunwei Technology Co ltd
Beijing Xicheng District Center School For Mental Retardation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Zhongyunwei Technology Co ltd, Beijing Xicheng District Center School For Mental Retardation filed Critical Chengdu Zhongyunwei Technology Co ltd
Priority to CN202010312602.6A priority Critical patent/CN111626113A/en
Publication of CN111626113A publication Critical patent/CN111626113A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • G06V40/176Dynamic expression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/169Holistic features and representations, i.e. based on the facial image taken as a whole
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a facial expression recognition method and device based on facial action units, wherein the method comprises the following steps: preprocessing collected picture data containing a face region by using a face key point detection technology, and subdividing the face region into a plurality of facial action unit trigger areas; recognizing the facial action units by using a preset three-dimensional convolutional neural network recognition model to obtain the features of the facial action units; performing feature fusion on the obtained global facial features and the features of the facial action units to obtain facial expression fusion features; and inputting the facial expression fusion features into a preset Gaussian process classifier, recognizing the facial expression by using the Gaussian process classifier, and outputting a facial expression recognition result. By adopting the method, expression recognition can be carried out on the basis of facial action units, which is particularly conducive to recognizing micro-expressions and improves the performance and robustness of facial expression recognition.

Description

Facial expression recognition method and device based on facial action unit
Technical Field
The embodiment of the invention relates to the field of computer vision, and in particular to a facial expression recognition method and device based on facial action units. In addition, a related electronic device and storage medium are also disclosed.
Background
In recent years, with the rapid development of the economy and society, people's demand for intelligent living has grown steadily. Artificial intelligence technology continues to advance, and a high-quality human-computer interaction experience is a prerequisite for improving the quality of intelligent life; emotion analysis of the user during human-computer interaction is therefore a key link. Since facial expressions play a very important role in human emotional expression, abundant emotional information can be obtained by analyzing them. In addition, facial expressions can be captured with nothing more than a camera, which makes acquisition convenient and fast. Therefore, facial expression analysis has broad application prospects in current intelligent products.
However, existing facial expression recognition algorithms perform well only on basic, macroscopic expressions; micro-expressions are often not obvious on the face, because their facial features change only slightly compared with macroscopic expressions and the differences between expressions are not significant. Therefore, traditional machine learning methods alone cannot effectively solve the problem of micro-expression recognition. The analysis of micro-expressions is not the exclusive preserve of the artificial intelligence field: anatomy has produced a large body of research results on this subject, among them the facial action unit. Facial action units capture the correlation between facial muscle movement and facial expression, can provide a large amount of prior knowledge for micro-expression analysis, and, when applied to the artificial intelligence field, can effectively assist facial expression recognition. Accordingly, applying facial action units to facial expression recognition research has received wide attention, and facial expression recognition based on facial action units has become an important research subject in the field of computer vision. How to accurately recognize facial expression information based on facial action units is therefore a problem that urgently needs to be solved.
Disclosure of Invention
Therefore, the embodiment of the invention provides a facial expression recognition method based on facial action units, so as to solve the problem that machine learning methods in the prior art cannot effectively recognize micro-expressions.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
according to the facial expression recognition method based on the facial action unit, provided by the embodiment of the invention, the method comprises the following steps: collecting picture data containing a face area; preprocessing picture data containing a face region by using a face key point detection technology, and subdividing the face region into a plurality of face action unit trigger regions; identifying the facial action unit by using a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the facial action unit; performing feature fusion on the obtained global facial features and the features of the facial action units to obtain facial expression fusion features; and inputting the facial expression fusion features into a preset Gaussian process classifier, identifying the facial expressions of the human face by using the Gaussian process classifier, and outputting facial expression identification results.
Further, the preprocessing of the picture data containing the face region by using the face key point detection technology and the subdivision of the face region into a plurality of facial action unit trigger areas specifically include: detecting a preset number of key points on the face region according to the facial anatomical structure and a face key point detection algorithm; cropping the picture data according to the preset number of key points to obtain a target face region, and scaling the target face region to a preset pixel size to normalize the face image; and subdividing the target face region into three facial action unit trigger areas, namely an eye action unit trigger area, a T-zone action unit trigger area and a lip action unit trigger area.
Further, the identifying the facial action unit by using the preset three-dimensional convolutional neural network identification model to obtain the characteristics of the facial action unit specifically includes: selecting a facial action unit having an association relation with the expression, and constructing a corresponding relation between the facial action unit and the three facial action unit trigger areas; extracting local features of the face action unit in a corresponding face action unit trigger area by using the three-dimensional convolutional neural network identification model to obtain local features of the face action unit trigger area; inputting the local features of the trigger area of the facial action unit into a softmax layer for identifying the facial action unit, and obtaining the features of the facial action unit.
Further, the performing feature fusion on the obtained global facial features and the features of the facial action units to obtain facial expression fusion features includes: inputting the cropped target face region into a preset three-dimensional convolutional neural network recognition model to extract features of the whole face, obtaining the global facial features; and performing feature-level fusion on the features of the facial action units and the global facial features to obtain the facial expression fusion features.
Further, the Gaussian process classifier is constrained by the association relationship between the facial action units and the expressions.
Further, the recognizing of the facial expressions by using the Gaussian process classifier and the outputting of facial expression recognition results specifically include: inputting the facial expression fusion features into the Gaussian process classifier to obtain a preliminary facial expression recognition result; and further constraining the preliminary facial expression recognition result by using the features of the facial action units and the association relationship between the facial action units and the facial expressions, and outputting the facial expression recognition result.
Correspondingly, an embodiment of the present application further provides a facial expression recognition apparatus based on a facial action unit, including: the data acquisition unit is used for acquiring picture data containing a face area; the preprocessing unit is used for preprocessing the picture data containing the face area by using a face key point detection technology and subdividing the face area into a plurality of face action unit trigger areas; the face action unit identification unit is used for identifying the face action unit by utilizing a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the face action unit; the feature fusion unit is used for carrying out feature fusion on the obtained global facial features and the features of the facial action unit to obtain facial expression fusion features; and the facial expression recognition unit is used for inputting the facial expression fusion characteristics into a preset Gaussian process classifier, recognizing the facial expression of the human face by using the Gaussian process classifier and outputting a facial expression recognition result.
Further, the preprocessing unit is specifically configured to: detect a preset number of key points on the face region according to the facial anatomical structure and a face key point detection algorithm; crop the picture data according to the preset number of key points to obtain a target face region, and scale the target face region to a preset pixel size to normalize the face image; and subdivide the target face region into three facial action unit trigger areas, wherein the three facial action unit trigger areas comprise an eye action unit trigger area, a T-zone action unit trigger area and a lip action unit trigger area.
Further, the facial action unit recognition unit is specifically configured to: selecting a facial action unit having an association relation with the expression, and constructing a corresponding relation between the facial action unit and the three facial action unit trigger areas; extracting local features of the face action unit in a corresponding face action unit trigger area by using the three-dimensional convolutional neural network identification model to obtain local features of the face action unit trigger area; inputting the local features of the trigger area of the facial action unit into a softmax layer for identifying the facial action unit, and obtaining the features of the facial action unit.
Further, the feature fusion unit is specifically configured to: input the cropped target face region into a preset three-dimensional convolutional neural network recognition model to extract features of the whole face, obtaining the global facial features; and perform feature-level fusion on the features of the facial action units and the global facial features to obtain the facial expression fusion features.
Further, the Gaussian process classifier is constrained by the association relationship between the facial action units and the expressions.
Further, the facial expression recognition unit is specifically configured to: input the facial expression fusion features into the Gaussian process classifier to obtain a preliminary facial expression recognition result; and further constrain the preliminary facial expression recognition result by using the features of the facial action units and the association relationship between the facial action units and the facial expressions, and output the facial expression recognition result.
Correspondingly, the present application also provides an electronic device, comprising: a processor and a memory; the memory is used for storing a program of a facial expression recognition method based on a facial action unit, and the electronic equipment is powered on and executes the program of the facial expression recognition method based on the facial action unit through the processor, so that any one of the facial expression recognition methods based on the facial action unit is executed.
Accordingly, the present application also provides a computer-readable storage medium containing one or more program instructions for executing the facial expression recognition method based on a facial action unit as described in any one of the above by a server.
The facial expression recognition method based on facial action units realizes an end-to-end algorithm from a user portrait picture to a facial expression recognition result and has strong robustness across different test subjects. Because expression recognition is performed on the basis of facial action units, microscopic facial action units are used to constrain the macroscopic expression, which is more conducive to recognizing subtle expressions than macroscopic expression recognition alone.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of a facial expression recognition method based on a facial action unit according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a facial expression recognition apparatus based on a facial action unit according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present invention;
fig. 4 is a schematic diagram of the three-dimensional convolutional neural network recognition model according to an embodiment of the present invention.
Detailed Description
The present invention is described below in terms of particular embodiments, and other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure. It is to be understood that the described embodiments are merely exemplary of the invention and are not intended to limit the invention to the particular embodiments disclosed. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The following describes an embodiment of the facial expression recognition method based on the facial action unit according to the present invention in detail. As shown in fig. 1, which is a flowchart of a facial expression recognition method based on a facial action unit according to an embodiment of the present invention, a specific implementation process includes the following steps:
step S101: and collecting picture data containing the face area.
Step S102: and preprocessing the picture data containing the face region by using a face key point detection technology, and subdividing the face region into a plurality of face action unit trigger regions.
After the image data containing the face region is collected in step S101, the image data containing the face region may be preprocessed by using a face key point detection technique in this step.
In the embodiment of the present invention, the preprocessing of the picture data containing the face region by using the face key point detection technology and the subdivision of the face region into a plurality of facial action unit trigger areas may specifically include: detecting a preset number of key points on the face region according to the facial anatomical structure and a face key point detection algorithm; cropping the picture data according to the preset number of key points to obtain a target face region, and scaling the target face region to a preset pixel size to normalize the face image; and subdividing the target face region into three facial action unit trigger areas, namely an eye action unit trigger area, a T-zone action unit trigger area and a lip action unit trigger area. For example, in a specific implementation, 68 key points on the face can be detected and extracted based on the facial anatomy and topological structure and a face key point (feature point) detection algorithm; the face region is cropped from the original portrait picture according to the 68 key points and scaled to 250 × 250 pixels to normalize the face image; and the face region is subdivided into the three facial action unit trigger areas, namely the eye action unit trigger area, the T-zone action unit trigger area and the lip action unit trigger area. A minimal sketch of this step is given below.
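As an illustrative sketch of this preprocessing step (not the authoritative implementation), the following Python code assumes dlib's 68-point landmark predictor and OpenCV; the horizontal bands that delimit the three trigger areas are assumptions, since the exact region boundaries are not specified here.

import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def preprocess(image):
    # Detect 68 key points, crop and normalize the face to 250 x 250,
    # then split it into the eye / T-zone / lip trigger areas.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    face = detector(gray)[0]  # assumes at least one face is detected
    pts = predictor(gray, face)
    landmarks = np.array([(p.x, p.y) for p in pts.parts()])

    # Crop to the landmark bounding box and normalize to 250 x 250 pixels.
    x0, y0 = np.maximum(landmarks.min(axis=0), 0)
    x1, y1 = landmarks.max(axis=0)
    crop = cv2.resize(image[y0:y1, x0:x1], (250, 250))

    # Illustrative horizontal bands for the three trigger areas.
    eye_area = crop[0:100, :]    # brow and eye region
    t_area   = crop[80:170, :]   # nose / T-zone region
    lip_area = crop[150:250, :]  # mouth and chin region
    return crop, eye_area, t_area, lip_area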
Step S103: and identifying the facial action unit by using a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the facial action unit.
After the face region is subdivided into a plurality of face action unit trigger regions in step S102, in this step, the face action units may be identified by using a preset three-dimensional convolutional neural network identification model, so as to further obtain features of the face action units.
As shown in fig. 4, in the embodiment of the present invention, the recognizing of the facial action units by using the preset three-dimensional convolutional neural network recognition model to obtain the features of the facial action units may include: selecting facial action units having an association relation with expressions, and constructing a correspondence between the facial action units and the three facial action unit trigger areas; extracting local features of each facial action unit in its corresponding trigger area by using the three-dimensional convolutional neural network recognition model to obtain the local features of the facial action unit trigger areas; and inputting the local features of the facial action unit trigger areas into a softmax layer to recognize the facial action units and obtain their features. For example, in a specific implementation, 68 feature points on the face are extracted according to the facial anatomy and topological structure; 13 facial Action Units (AUs) with a strong expression association are selected from the 41 expression-related facial action units (AU1-AU41) according to the association relationship between facial action units and expressions; the trigger areas of the 13 AUs are consolidated to divide the face into three action unit trigger areas, namely the eye action unit trigger area, the T-zone action unit trigger area and the lip action unit trigger area; and the correspondence between the 13 facial action units and the three trigger areas is established. Correspondingly, the local features of the 13 facial action units in their corresponding trigger areas are respectively extracted by 13 three-dimensional convolutional neural network recognition models (3D Convolutional Neural Networks, 3D CNN) to obtain the local features of the 13 action unit trigger areas; these 13 local features are then fed into a softmax layer to recognize the 13 facial action units, yielding their recognition results, that is, the features of the facial action units, which are not specifically limited herein. The three-dimensional convolutional neural network recognition model can thus be divided into 13 parallel 3D CNN networks.
Four networks receive the eye action unit trigger area as input and output the recognition results of AU1 (inner brow raiser), AU2 (outer brow raiser), AU4 (brow lowerer) and AU7 (lid tightener); two networks receive the T-zone action unit trigger area as input and output the recognition results of AU9 (nose wrinkler) and AU17 (chin raiser); and seven networks receive the lip action unit trigger area as input and output the recognition results of AU10 (upper lip raiser), AU12 (lip corner puller), AU15 (lip corner depressor), AU20 (lip stretcher), AU24 (lip pressor), AU25 (lips part) and AU26 (jaw drop).
In addition, it should be noted that, in AU-based facial expression recognition, it is important to select, from among the many AUs, those having a strong correlation with facial expressions; weakly correlated AUs should be excluded during selection to reduce their influence on expression recognition. According to the statistical association between AUs and the 6 basic facial expressions in FACS, the 13 AUs whose association with the 6 basic expressions exceeds 70% (that is, the 13 most strongly associated AUs) are selected. These 13 AUs are mainly distributed in the eyebrow, nose and lip areas. Among them, AU1 (inner brow raiser), AU2 (outer brow raiser), AU4 (brow lowerer) and AU7 (lid tightener) describe the muscle movements of the brow and eye region; AU9 (nose wrinkler) and AU17 (chin raiser) describe the muscle movements of the nose region; and AU10 (upper lip raiser), AU12 (lip corner puller), AU15 (lip corner depressor), AU20 (lip stretcher), AU24 (lip pressor), AU25 (lips part) and AU26 (jaw drop) describe the muscle movements of the lip region.
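For illustration, the correspondence between the 13 AUs and the three trigger areas described above can be encoded as a simple lookup table; the following Python snippet is one hypothetical representation of that grouping.

AU_TRIGGER_AREAS = {
    "eye":    ["AU1", "AU2", "AU4", "AU7"],    # brow/eye muscle movements
    "t_zone": ["AU9", "AU17"],                 # nose-region muscle movements
    "lip":    ["AU10", "AU12", "AU15", "AU20",
               "AU24", "AU25", "AU26"],        # lip-region muscle movements
}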
In computer vision research, two-dimensional convolution is commonly used, but when video data needs to be processed, dynamic information across multiple consecutive frames is required. Performing three-dimensional convolution in the convolution stage of a CNN allows features to be computed in both the spatial and the temporal dimension. Three-dimensional convolution extracts convolutional features from a cube formed by stacking consecutive video frames, using a three-dimensional convolution kernel. This structure connects each feature map in a convolutional layer to multiple consecutive frames in the previous layer, so that motion information in the video can be captured; the implementation details are not repeated here. A sketch of one such per-AU network follows.
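As a minimal sketch of one of the 13 parallel per-AU networks, the following PyTorch module applies three-dimensional convolutions over a stack of consecutive frames of a single trigger area and ends in a classification head fed to softmax; the layer sizes are assumptions, as the architecture is not specified here.

import torch.nn as nn

class AU3DCNN(nn.Module):
    # One of 13 parallel networks: extracts spatio-temporal features from
    # consecutive frames of one trigger area and recognizes whether its
    # facial action unit is activated.
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),  # space + time
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(32, num_classes)  # logits for the softmax layer

    def forward(self, clip):  # clip: (batch, 3, frames, height, width)
        local_feat = self.features(clip).flatten(1)  # local AU feature vector
        return local_feat, self.head(local_feat)     # features + AU logits

Thirteen such modules, one per selected AU, would each be fed the frame stack of the trigger area it corresponds to.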
Step S104: and performing feature fusion on the obtained global facial features and the features of the facial action units to obtain facial expression fusion features.
After the features of the face action unit are obtained in step S103 described above, the obtained global features of the face and the features of the face action unit may be subjected to a feature fusion process in this step.
The feature fusion is performed on the obtained global facial features and the features of the facial action units to obtain the facial expression fusion features, and the specific implementation process may include: inputting the cropped target face region into a preset three-dimensional convolutional neural network recognition model to extract features of the whole face, obtaining the global facial features; and performing feature-level fusion on the features of the facial action units and the global facial features to obtain the facial expression fusion features. Specifically, the normalized face region crop can be fed into another preset three-dimensional convolutional neural network recognition model (3D CNN) to extract the features of the whole face, and feature-level fusion is then performed between the obtained 13 local action unit trigger area features and the whole-face features. In a specific implementation, the normalized face region can be input into an independent 3D CNN network that extracts the whole-face features; the extracted whole-face features are then fused at the feature layer with the local features of the preceding 13 AUs to obtain fusion features combining the whole face and the local action unit trigger areas, as sketched below.
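A minimal sketch of the fusion step follows; concatenation at the feature layer is an assumption, since the text says only that feature-level fusion is performed.

import torch

def fuse_features(global_feat, au_feats):
    # global_feat: (batch, D_global) whole-face features from the
    # independent 3D CNN; au_feats: list of 13 (batch, D_au) local
    # AU feature tensors from the per-AU networks.
    return torch.cat([global_feat] + au_feats, dim=1)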
Step S105: and inputting the facial expression fusion features into a preset Gaussian process classifier, identifying the facial expressions of the human face by using the Gaussian process classifier, and outputting facial expression identification results.
After the facial expression fusion features are obtained in step S104, the gaussian process classifier may be used to identify the facial expressions of the human face in this step, and a facial expression identification result is output.
Wherein the Gaussian process classifier is constrained by the association relationship between facial action units and expressions. Accordingly, the recognizing of the facial expression by using the Gaussian process classifier and the outputting of a facial expression recognition result may include: inputting the facial expression fusion features into the Gaussian process classifier to obtain a preliminary facial expression recognition result; and further constraining the preliminary facial expression recognition result by using the features of the facial action units and the association relationship between facial action units and expressions, and outputting the facial expression recognition result. Specifically, the fusion features obtained in step S104 can be fed into the Gaussian process classifier to obtain a preliminary facial expression recognition result, which is then further constrained by the facial action unit recognition results obtained in step S103 and the association relationship between facial action units and expressions, so as to obtain a more accurate expression recognition result. A sketch of this step follows.
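As an illustrative sketch of this classification step, the following Python code uses scikit-learn's GaussianProcessClassifier on toy data; the AU constraint is modeled here as a simple re-weighting of the classifier's posterior by a hypothetical AU-expression association matrix, since the constraint formula is not given in the text.

import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 448))  # toy fused features (e.g. 32 + 13 * 32)
y_train = np.arange(60) % 6           # labels for the 6 basic expressions

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
gpc.fit(X_train, y_train)

def constrained_predict(x_fused, au_probs, au_expr_prior):
    # x_fused: (448,) fused feature vector; au_probs: (13,) AU activation
    # probabilities from step S103; au_expr_prior: hypothetical (13, 6)
    # AU-expression association matrix acting as the constraint.
    p_expr = gpc.predict_proba(x_fused.reshape(1, -1))[0]  # preliminary result
    au_support = au_probs @ au_expr_prior                  # (6,) AU evidence
    p_final = p_expr * au_support                          # apply constraint
    return p_final / p_final.sum()                         # renormalize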
In a specific embodiment of the invention, a face key point detection technology is applied to the picture data containing the face region, and the picture is then cropped according to the key points and the anatomical layout of the facial action units, so that the face is divided into three parts: the eyes, the nose and the mouth. Multi-task facial action unit recognition: the cropped eye, nose and mouth data are respectively input into 13 deep Convolutional Neural Networks (CNN) covering the different areas for AU recognition. AU-based facial expression recognition: the AU features extracted during AU recognition are fused with the features extracted from the whole facial image to obtain the facial expression features. On the basis of the design of a basic Gaussian process classifier for facial expression recognition, the AU co-occurrence relationship is added as a constraint and the prior knowledge of the AU-expression relationship is utilized; this improves facial expression recognition performance, realizes an end-to-end algorithm from a user portrait picture to a facial expression recognition result, and provides strong robustness across different test subjects.
By adopting the facial expression recognition method based on facial action units, an end-to-end algorithm from a user portrait picture to a facial expression recognition result is realized, with strong robustness across different test subjects. Expression recognition is performed on the basis of facial action units: compared with macroscopic expression recognition, using microscopic facial action units to constrain the macroscopic expression is more conducive to recognizing subtle expressions. At the same time, combining the whole-face features with the local features of the action units yields facial features with greater capacity to represent facial expressions, and exploiting the face topology and the relationship between facial action units and expressions reduces system consumption without affecting the recognition effect, improving facial expression recognition performance.
Corresponding to the facial expression recognition method based on the facial action unit, the invention also provides a facial expression recognition device based on the facial action unit. Since the embodiment of the device is similar to the embodiment of the method, the description is simple, and please refer to the description of the embodiment of the method, and the following description of the embodiment of the facial expression recognition device based on the facial action unit is only illustrative. Fig. 2 is a schematic diagram of a facial expression recognition apparatus based on a facial action unit according to an embodiment of the present invention.
The invention relates to a facial expression recognition device based on a facial action unit, which comprises the following parts:
the data acquisition unit 201 is configured to acquire image data including a face region.
The preprocessing unit 202 is configured to preprocess the image data containing the face region by using a face keypoint detection technology, and subdivide the face region into a plurality of face action unit trigger areas.
And the face action unit identification unit 203 is used for identifying the face action unit by using a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the face action unit.
And the feature fusion unit 204 is configured to perform feature fusion on the obtained global facial features and the features of the facial action unit to obtain facial expression fusion features.
And the facial expression recognition unit 205 is configured to input the facial expression fusion features into a preset gaussian process classifier, recognize facial expressions of the human face by using the gaussian process classifier, and output a facial expression recognition result.
By adopting the facial expression recognition device based on facial action units in cooperation with the facial expression recognition method based on facial action units described above, an end-to-end algorithm from a user portrait picture to a facial expression recognition result is realized, with strong robustness across different test subjects. Expression recognition is performed on the basis of facial action units: compared with macroscopic expression recognition, using microscopic facial action units to constrain the macroscopic expression is more conducive to recognizing subtle expressions. At the same time, combining the whole-face features with the local features of the action units yields facial features with greater capacity to represent facial expressions, and exploiting the face topology and the relationship between facial action units and expressions reduces system consumption without affecting the recognition effect, improving facial expression recognition performance.
Corresponding to the facial expression recognition method based on the facial action unit, the invention also provides electronic equipment. Since the embodiment of the electronic device is similar to the above method embodiment, the description is relatively simple, and please refer to the description of the above method embodiment, and the electronic device described below is only schematic. Fig. 3 is a schematic view of an electronic device according to an embodiment of the present invention.
The electronic device specifically includes: a processor 301 and a memory 302, wherein the memory 302 is used for storing one or more program instructions of the facial expression recognition method based on the facial action unit; the device is powered on and executes the program through the processor 301 so as to perform any one of the facial expression recognition methods based on the facial action unit described above. The electronic device of the present invention may be a server.
Corresponding to the facial expression recognition method based on the facial action unit, the invention also provides a computer storage medium. Since the embodiment of the computer storage medium is similar to the above method embodiment, the description is simple, and please refer to the description of the above method embodiment, and the computer storage medium described below is only schematic.
The computer storage medium contains one or more program instructions for executing the facial action unit-based facial expression recognition method described above by a server.
In an embodiment of the invention, the processor or processor module may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in connection with the embodiments of the present invention may be implemented directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.
The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or which may include both volatile and nonvolatile memory.
The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory.
The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM).
The storage media described in connection with the embodiments of the invention are intended to comprise, without limitation, these and any other suitable types of memory. Those skilled in the art will appreciate that the functionality described in the present invention may be implemented in a combination of hardware and software in one or more of the examples described above. When implemented in software, the corresponding functionality may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (10)

1. A facial expression recognition method based on a facial action unit is characterized by comprising the following steps:
collecting picture data containing a face area;
preprocessing picture data containing a face region by using a face key point detection technology, and subdividing the face region into a plurality of face action unit trigger regions;
identifying the facial action unit by using a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the facial action unit;
performing feature fusion on the obtained global facial features and the features of the facial action units to obtain facial expression fusion features;
and inputting the facial expression fusion features into a preset Gaussian process classifier, identifying the facial expressions of the human face by using the Gaussian process classifier, and outputting facial expression identification results.
2. The method according to claim 1, wherein the preprocessing of the picture data containing the face region by using the face key point detection technology and the subdivision of the face region into a plurality of facial action unit trigger areas specifically comprise:
detecting a preset number of key points on the face region according to the facial anatomical structure and a face key point detection algorithm;
cropping the picture data according to the preset number of key points to obtain a target face region, and scaling the target face region to a preset pixel size to normalize the face image;
and subdividing the target face region into three facial action unit trigger areas, wherein the three facial action unit trigger areas comprise an eye action unit trigger area, a T-zone action unit trigger area and a lip action unit trigger area.
3. The method for recognizing facial expressions based on facial action units according to claim 2, wherein the recognizing facial action units by using a preset three-dimensional convolutional neural network recognition model to obtain the characteristics of the facial action units specifically comprises:
selecting a facial action unit having an association relation with the expression, and constructing a corresponding relation between the facial action unit and the three facial action unit trigger areas;
extracting local features of the face action unit in a corresponding face action unit trigger area by using the three-dimensional convolutional neural network identification model to obtain local features of the face action unit trigger area;
inputting the local features of the trigger area of the facial action unit into a softmax layer for identifying the facial action unit, and obtaining the features of the facial action unit.
4. The facial expression recognition method based on the facial action unit as claimed in claim 3, wherein the feature fusion of the obtained global facial features and the features of the facial action unit to obtain the facial expression fusion features comprises:
the target face area is cut into blocks and input into a preset three-dimensional convolutional neural network recognition model to extract features of a frontal face, and the global features of the face are obtained;
and performing feature level fusion on the features of the facial action unit and the global facial features to obtain facial expression fusion features.
5. A facial action unit based facial expression recognition method as claimed in claim 4, wherein the Gaussian process classifier is based on an associative relationship constraint between facial action units and expressions.
6. The facial expression recognition method based on the facial action unit according to claim 5, wherein the recognizing of the facial expressions by using the Gaussian process classifier and the outputting of facial expression recognition results specifically comprise:
inputting the facial expression fusion features into the Gaussian process classifier to obtain a preliminary facial expression recognition result;
and further constraining the preliminary facial expression recognition result by using the features of the facial action units and the association relationship between the facial action units and the facial expressions, and outputting the facial expression recognition result.
7. A facial expression recognition apparatus based on a facial action unit, comprising:
the data acquisition unit is used for acquiring picture data containing a face area;
the preprocessing unit is used for preprocessing the picture data containing the face area by using a face key point detection technology and subdividing the face area into a plurality of face action unit trigger areas;
the face action unit identification unit is used for identifying the face action unit by utilizing a preset three-dimensional convolutional neural network identification model to obtain the characteristics of the face action unit;
the feature fusion unit is used for carrying out feature fusion on the obtained global facial features and the features of the facial action unit to obtain facial expression fusion features;
and the facial expression recognition unit is used for inputting the facial expression fusion characteristics into a preset Gaussian process classifier, recognizing the facial expression of the human face by using the Gaussian process classifier and outputting a facial expression recognition result.
8. The facial expression recognition device based on the facial action unit according to claim 7, wherein the preprocessing unit is specifically configured to: detect a preset number of key points on the face region according to the facial anatomical structure and a face key point detection algorithm; crop the picture data according to the preset number of key points to obtain a target face region, and scale the target face region to a preset pixel size to normalize the face image; and subdivide the target face region into three facial action unit trigger areas, wherein the three facial action unit trigger areas comprise an eye action unit trigger area, a T-zone action unit trigger area and a lip action unit trigger area.
9. An electronic device, comprising:
a processor; and
a memory for storing a program of a facial expression recognition method based on a facial action unit, the electronic device being powered on and executing the program of the facial expression recognition method based on a facial action unit through the processor to execute the facial expression recognition method based on a facial action unit according to any one of claims 1 to 6.
10. A computer-readable storage medium having embodied therein one or more program instructions for execution by a server of the facial action unit-based facial expression recognition method of any one of claims 1-6.
CN202010312602.6A 2020-04-20 2020-04-20 Facial expression recognition method and device based on facial action unit Pending CN111626113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010312602.6A CN111626113A (en) 2020-04-20 2020-04-20 Facial expression recognition method and device based on facial action unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010312602.6A CN111626113A (en) 2020-04-20 2020-04-20 Facial expression recognition method and device based on facial action unit

Publications (1)

Publication Number Publication Date
CN111626113A true CN111626113A (en) 2020-09-04

Family

ID=72258969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010312602.6A Pending CN111626113A (en) 2020-04-20 2020-04-20 Facial expression recognition method and device based on facial action unit

Country Status (1)

Country Link
CN (1) CN111626113A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991358A (en) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Method for generating style image, method, device, equipment and medium for training model
CN113011386A (en) * 2021-04-13 2021-06-22 重庆大学 Expression recognition method and system based on equally divided characteristic graphs
CN113486867A (en) * 2021-09-07 2021-10-08 北京世纪好未来教育科技有限公司 Face micro-expression recognition method and device, electronic equipment and storage medium
CN113792572A (en) * 2021-06-17 2021-12-14 重庆邮电大学 Facial expression recognition method based on local representation
CN113855019A (en) * 2021-08-25 2021-12-31 杭州回车电子科技有限公司 Expression recognition method and device based on EOG, EMG and piezoelectric signals
CN113920575A (en) * 2021-12-15 2022-01-11 深圳佑驾创新科技有限公司 Facial expression recognition method and device and storage medium
WO2024000233A1 (en) * 2022-06-29 2024-01-04 中国科学院深圳理工大学(筹) Facial expression recognition method and apparatus, and device and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729835A (en) * 2017-10-10 2018-02-23 浙江大学 A kind of expression recognition method based on face key point region traditional characteristic and face global depth Fusion Features
CN110263673A (en) * 2019-05-31 2019-09-20 合肥工业大学 Human facial expression recognition method, apparatus, computer equipment and storage medium
CN110738102A (en) * 2019-09-04 2020-01-31 暗物质(香港)智能科技有限公司 face recognition method and system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729835A (en) * 2017-10-10 2018-02-23 浙江大学 A kind of expression recognition method based on face key point region traditional characteristic and face global depth Fusion Features
CN110263673A (en) * 2019-05-31 2019-09-20 合肥工业大学 Human facial expression recognition method, apparatus, computer equipment and storage medium
CN110738102A (en) * 2019-09-04 2020-01-31 暗物质(香港)智能科技有限公司 face recognition method and system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991358A (en) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Method for generating style image, method, device, equipment and medium for training model
CN113011386A (en) * 2021-04-13 2021-06-22 重庆大学 Expression recognition method and system based on equally divided characteristic graphs
CN113792572A (en) * 2021-06-17 2021-12-14 重庆邮电大学 Facial expression recognition method based on local representation
CN113855019A (en) * 2021-08-25 2021-12-31 杭州回车电子科技有限公司 Expression recognition method and device based on EOG, EMG and piezoelectric signals
CN113855019B (en) * 2021-08-25 2023-12-29 杭州回车电子科技有限公司 Expression recognition method and device based on EOG (Ethernet over coax), EMG (electro-magnetic resonance imaging) and piezoelectric signals
CN113486867A (en) * 2021-09-07 2021-10-08 北京世纪好未来教育科技有限公司 Face micro-expression recognition method and device, electronic equipment and storage medium
CN113486867B (en) * 2021-09-07 2021-12-14 北京世纪好未来教育科技有限公司 Face micro-expression recognition method and device, electronic equipment and storage medium
CN113920575A (en) * 2021-12-15 2022-01-11 深圳佑驾创新科技有限公司 Facial expression recognition method and device and storage medium
WO2024000233A1 (en) * 2022-06-29 2024-01-04 中国科学院深圳理工大学(筹) Facial expression recognition method and apparatus, and device and readable storage medium

Similar Documents

Publication Publication Date Title
CN111626113A (en) Facial expression recognition method and device based on facial action unit
Wu et al. Recent advances in video-based human action recognition using deep learning: A review
CN109961005B (en) Dynamic gesture recognition method and system based on two-dimensional convolutional network
US20190095701A1 (en) Living-body detection method, device and storage medium
WO2019033525A1 (en) Au feature recognition method, device and storage medium
CN111199230B (en) Method, device, electronic equipment and computer readable storage medium for target detection
US11216652B1 (en) Expression recognition method under natural scene
CN109063626B (en) Dynamic face recognition method and device
CN108960076B (en) Ear recognition and tracking method based on convolutional neural network
Anand et al. An improved local binary patterns histograms techniques for face recognition for real time application
JP2010108494A (en) Method and system for determining characteristic of face within image
CN113255630B (en) Moving target recognition training method, moving target recognition method and device
Wu et al. Convolutional LSTM networks for video-based person re-identification
WO2023279799A1 (en) Object identification method and apparatus, and electronic system
CN112949689A (en) Image recognition method and device, electronic equipment and storage medium
CN113298018A (en) False face video detection method and device based on optical flow field and facial muscle movement
Saeed A framework for recognition of facial expression using HOG features
CN115862120B (en) Face action unit identification method and equipment capable of decoupling separable variation from encoder
Vural et al. Multi-view fast object detection by using extended haar filters in uncontrolled environments
Curran et al. The use of neural networks in real-time face detection
CN114387670A (en) Gait recognition method and device based on space-time feature fusion and storage medium
CN112766112B (en) Dynamic expression recognition method and system based on space-time multi-feature fusion
CN113887429A (en) Digital man video generation method and device and electronic equipment
Wang et al. Video-based emotion recognition using face frontalization and deep spatiotemporal feature
CN112200080A (en) Face recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination