CN111325131B - Micro-expression detection method based on adaptive transition frame removal deep network - Google Patents

Micro-expression detection method based on adaptive transition frame removal deep network

Info

Publication number
CN111325131B
CN111325131B
Authority
CN
China
Prior art keywords
frame
micro
network
expression
transition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010092959.8A
Other languages
Chinese (zh)
Other versions
CN111325131A (en)
Inventor
付晓峰
牛力
柳永翔
赵伟华
计忠平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202010092959.8A priority Critical patent/CN111325131B/en
Publication of CN111325131A publication Critical patent/CN111325131A/en
Application granted granted Critical
Publication of CN111325131B publication Critical patent/CN111325131B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G06V40/176 Dynamic expression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a micro-expression detection method based on a deep network with adaptive transition frame removal. The method comprises network construction, network training and micro-expression detection. In network training, the original video is first preprocessed; transition frames are then removed with an adaptive transition frame removal method; finally, the micro-expression frame and neutral frame samples, with transition frames removed, are input into the MesNet network for training. The MesNet constructed by the invention is essentially a binary classification network, and micro-expression frame detection does not depend on the temporal ordering of frames. MesNet can therefore not only detect micro-expression frames from the complete videos of a micro-expression database, but also detect micro-expression frames from any given set of frames, and can judge whether a given single frame is a micro-expression frame.

Description

Micro-expression detection method based on adaptive transition frame removal deep network
Technical Field
The invention belongs to the technical field of computer image processing, and relates to a micro-expression detection method based on a deep network with adaptive transition frame removal.
Background
Unlike conventional facial expressions, which last 0.5 s to 4 s, facial micro-expressions last only 1/25 s to 1/5 s and are instantaneous, involuntary reactions that reveal a person's true emotions. Micro-expression recognition has attracted increasing attention from researchers over the past decade because of its potential applications in emotion monitoring, lie detection, clinical diagnosis, business negotiation and other fields.
Micro-expressions are difficult to elicit, difficult to collect, small in sample size and hard for the human eye to recognize. Early micro-expression recognition was therefore carried out mainly by psychologists and other professionals; advances in computer hardware in recent years have made automatic micro-expression recognition with computer vision and machine learning methods possible.
Micro-expression recognition comprises two steps: micro-expression detection and micro-expression category discrimination. Micro-expression detection is a precondition for discriminating the micro-expression category: for a video containing micro-expressions, the frames over which the micro-expression is distributed must first be detected before the category of the micro-expression can be judged. Existing micro-expression detection methods commonly suffer from low detection accuracy or a narrow scope of application. The databases commonly used for micro-expression detection are CASME II, SMIC-E-HS and CAS(ME)^2; no prior micro-expression detection method has been verified on all three databases simultaneously.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a micro-expression detection method based on a deep network with adaptive transition frame removal, which offers high accuracy and a wide range of application in micro-expression detection.
The invention comprises network construction, network training, and micro-expression detection using the trained network.
The network construction specifically comprises the following steps:
step S1: selecting a pre-trained CNN model on an ImageNet database, and reserving a convolution layer and pre-training parameters.
Step S2: and adding a full connection layer after the CNN model.
Step S3: and adding an output layer and a logistic classifier after the full connection layer.
Specifically, the invention builds the micro-expression detection network on Inception-ResNet-V2 and names it MesNet (micro-expression spotting network).
Specifically, the fully connected layer contains 512 neurons.
Specifically, the MesNet network is a micro-expression frame versus neutral frame binary classification network, and its output layer contains one neuron.
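For concreteness, the following is a minimal sketch of this construction, assuming a TensorFlow/Keras implementation; the input size, the global average pooling and the ReLU activation on the fully connected layer are assumptions not specified in the patent.

```python
import tensorflow as tf

def build_mesnet(input_shape=(299, 299, 3)):
    # Pre-trained Inception-ResNet-V2 convolutional base with ImageNet weights;
    # include_top=False keeps only the convolutional layers and their pre-trained parameters.
    base = tf.keras.applications.InceptionResNetV2(
        include_top=False, weights="imagenet",
        input_shape=input_shape, pooling="avg")

    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs)                                        # Features = f(Input)
    fc = tf.keras.layers.Dense(512, activation="relu")(features)   # fully connected layer, 512 neurons
    output = tf.keras.layers.Dense(1, activation="sigmoid")(fc)    # 1 output neuron, logistic classifier
    return tf.keras.Model(inputs, output, name="MesNet")
```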
The network training specifically comprises the following steps:
Step S1: perform data preprocessing on the original videos of the training set.
Step S2: remove transition frames from the training set using the adaptive transition frame removal method.
Step S3: input the micro-expression frame and neutral frame samples, with transition frames removed, into the MesNet network for training.
Specifically, data preprocessing of the original video comprises face detection, face alignment and micro-expression region cropping.
The adaptive transition frame removal method specifically comprises the following steps:
Step S1: divide the training set into confident samples and transition frame samples to be removed.
Step S2: train the MesNet network on the confident samples to obtain a micro-expression frame versus neutral frame classification model.
Step S3: use the classification model to predict the transition frame samples to be removed, obtaining for each sample the probability that it belongs to the positive-sample micro-expression frame class.
Step S4: adaptively determine the threshold values for screening transition frames from the probability distribution of the transition frame samples to be removed, and thereby remove the transition frames.
Detecting micro-expressions with the trained network specifically comprises the following steps:
Step S1: perform data preprocessing on the original videos of the test set.
Step S2: input the preprocessed samples into the trained MesNet network to obtain the predicted label values. A label of 1 indicates a micro-expression frame and a label of 0 indicates a neutral frame.
Specifically, the input samples to be detected may be a single video segment or the multiple video segments of a complete test set.
Compared with the prior art, the method has the following beneficial effects:
the invention has high micro expression detection precision, and MesNet is arranged in CASME II, SMIC-E-HS and CAS (ME) 2 The database obtains the current optimal result. The invention has wide application range, has no limit on the length of the input video, is not only suitable for short videos of CASME II and SMIC-E-HS, but also suitable for CAS (ME) 2 Of databasesLong video.
Drawings
Fig. 1 shows a MesNet training flowchart.
Fig. 2 shows a CASME II database video clip example.
Fig. 3 shows the adaptive transition frame removal method.
Fig. 4 shows the probability distribution of transition frame samples to be removed.
Fig. 5 (a) shows a certain frame of image in a video.
Fig. 5 (b) shows an extracted rectangular frame of a face.
Fig. 6 shows a face alignment method.
Fig. 7 (a) shows an image after face alignment.
Fig. 7 (b) shows a cropped micro-expression region.
Fig. 8 (a) shows some video frames of the CASME II database.
Fig. 8 (b) is a Dlib face detection diagram corresponding to fig. 8 (a).
Fig. 8 (c) is a diagram obtained by preprocessing fig. 8 (a) by the preprocessing method of the present invention.
Detailed Description
The invention will now be described in detail with reference to the accompanying drawings. It should be noted that the embodiments described are only intended to facilitate an understanding of the invention and do not limit it in any way.
Fig. 1 illustrates the MesNet training procedure, using the video segment numbered 20_ep15_03f in the CASME II database as an example. As shown in equation (1), Input denotes a micro-expression frame or neutral frame sample fed into the MesNet network, and f(Input) denotes the extraction of shape and texture Features from the image using the pre-trained model:
Features = f(Input). (1)
To further extract micro-expression features, as shown in equation (2), the function f_1(Features, N) denotes connecting a fully connected layer containing N neurons after the pre-trained model, taking Features as its input:
FC = f_1(Features, N). (2)
Then, taking FC as input, the output layer Output is constructed as shown in equation (3). Because MesNet is a binary classification network, the output layer contains only 1 neuron:
Output = f_1(FC, 1). (3)
The MesNet network uses a logistic classifier, whose loss function is
J = -(1/m) Σ_{i=1}^{m} [ y^(i) log(ŷ^(i)) + (1 - y^(i)) log(1 - ŷ^(i)) ], (4)
where m denotes the number of samples involved in one iteration, and y^(i) denotes the true label of the i-th training sample, label 1 denoting a positive-sample micro-expression frame and 0 a negative-sample neutral frame. ŷ^(i) denotes the probability with which MesNet predicts that the i-th sample is a positive sample, and is computed as
ŷ^(i) = 1 / (1 + e^(-Output^(i))). (5)
The MesNet network is optimized with the Adam method, whose learning rate is adaptive.
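As a hedged sketch of how the logistic loss of equations (4)-(5) and the Adam optimizer could be combined in the Keras setting assumed above; the learning rate, batch size and epoch count are illustrative assumptions, not values from the patent:

```python
model = build_mesnet()

# Binary cross-entropy is exactly the logistic loss of equation (4);
# the sigmoid output of the last layer realises equation (5).
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed learning rate
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.AUC()])

# x_train: preprocessed frames; y_train: 1 for micro-expression frames, 0 for neutral frames.
# model.fit(x_train, y_train, batch_size=32, epochs=10)      # assumed batch size / epochs
```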
Fig. 2 shows an example video clip from the CASME II database; it is the same video segment as in Fig. 1 and contains 1024 frames over about 5 seconds. According to the CASME II database description document, the 86th frame is the onset frame (Onset Frame), i.e. the frame at which the micro-expression starts; the 129th frame is the apex frame (Apex Frame), i.e. the micro-expression peak frame; and the 181st frame is the offset frame (Offset Frame), i.e. the last frame over which the micro-expression lasts.
In supervised learning, the quality of the labels attached to the training data has an important influence on the learning result. Owing to the way the micro-expression databases are produced, some frames near the onset and offset frames cannot be unambiguously classified as micro-expression or neutral frames, even under a 200 fps high-speed camera. Frames near frame 86 and frame 181 may therefore carry noisy labels which, if placed in the training set, would interfere with model training. The present embodiment accordingly defines frames with noisy labels near the onset and offset frames as transition frames and removes them from the training set.
Taking the video shown in Fig. 2 as an example, to remove transition frames the whole video is divided into four segments with the 86th, 129th and 181st frames as boundaries, and each segment is further divided into two parts, giving a total of 8 parts numbered in the figure. U_1 denotes the sample set of part 1 and L_1 denotes the number of samples in part 1; the remaining 7 parts are denoted analogously.
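A minimal sketch of this partition, assuming 1-based frame indices; how the boundary frames themselves are assigned to the parts is an assumption, since the patent only states that each segment is halved:

```python
def split_into_parts(n_frames, onset, apex, offset):
    """Split frames 1..n_frames into 4 segments at the onset/apex/offset frames,
    then halve each segment, giving the 8 parts U_1..U_8 numbered as in Fig. 2."""
    segments = [(1, onset), (onset, apex), (apex, offset), (offset, n_frames + 1)]
    parts = []
    for start, end in segments:
        mid = (start + end) // 2
        parts.append(list(range(start, mid)))  # first half of the segment
        parts.append(list(range(mid, end)))    # second half of the segment
    return parts

# Example with the video of Fig. 2 (1024 frames, onset 86, apex 129, offset 181):
# u1, u2, u3, u4, u5, u6, u7, u8 = split_into_parts(1024, 86, 129, 181)
```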
Fig. 3 shows a method for adaptively removing transition frames, which specifically comprises the following steps:
step S1: considering that the transition frame is a small number of samples with noise labels, the proportion of the transition frame does not exceed 50% of the total number of training set samples and the transition frame is close to the start frame or the end frame. Then, as shown in FIG. 2, initialize L 1 :L 2 =L 3 :L 4 =L 5 :L 6 =L 7 :L 8 =1:1. In U shape T Representing a transition frame sample set, in U T0 Representing a transition frame sample set to be removed, U T0 =U 2 ∪U 3 ∪U 6 ∪U 7
Step S2: u (U) 4 ∪U 5 U as a micro-expression frame sample 1 ∪U 8 As a neutral frame sample, the MesNet network was trained to obtain model C.
Step S3: predicting U using model C T0 Sample x in (a) (i) Probability P belonging to positive sample micro-expression frame i If P i Near 0, the samples are neutral frames, if P i Near 1, the sample is a micro-expression frame. Then the transition frame discrimination formula is U T ={x i |P1<P i <P2,x i ∈U T0 } (6)
Wherein P1, P2E (0, 1), the specific values of P1, P2 will be discussed below;
step S4: u (U) 2 、U 3 、U 6 、U 7 After removal of the transition frameThe sample sets are U respectively 2- 、U 3- 、U 6- 、U 7- . The set of micro-expression frame samples put into the training set is:
U ME =U 3 -UU 4 UU 5 UU 6 -, (7)
the neutral frame sample set is:
U N =U 1 UU 2 -UU 7 -UU 8 . (8)
Fig. 4 shows the probability distribution of the transition frame samples to be removed. The 24454 samples of U_T0 are input into model C for prediction, yielding 24454 corresponding probability values. To determine the optimal thresholds P1 and P2, the probability distribution shown in Fig. 4 is computed. There are 16616 samples in the interval [0.000, 0.050], for which model C judges the probability of being a micro-expression frame to be no higher than 0.05, i.e. the probability of being a neutral frame is no lower than 0.95. There are 5429 samples in the interval (0.950, 1.000], for which model C judges the probability of being a micro-expression frame to be no lower than 0.95. The closer a sample's predicted probability is to 0.5, the harder it is for model C to judge its class and the less reliable the prediction; such samples are transition frames. From the probability distribution, the number of samples in the interval [0.000, 0.050] is much greater than that in the next interval (0.050, 0.100], while the number in (0.050, 0.100] is not much greater than that in (0.100, 0.150]; therefore P1 is set to 0.05, and P2 is determined to be 0.95 in the same way. Using this adaptive transition frame removal method, 2409 transition frame samples are removed from the 48670 samples of the CASME II database training set, about 4.950% of the total number of training samples.
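A hedged sketch of this adaptive threshold selection, assuming 0.05-wide histogram bins and interpreting "much greater than the next interval" as a simple ratio test; the ratio value of 2.0 is an assumption, not a value given in the patent:

```python
import numpy as np

def adaptive_thresholds(probs, bin_width=0.05, ratio=2.0):
    """Determine P1 and P2 from the histogram of probabilities predicted by model C
    for the transition frame samples to be removed (U_T0).

    P1 is placed where the leftmost bins stop dominating their right neighbour;
    P2 is found symmetrically from the right. Samples with P1 < p < P2 are treated
    as transition frames and removed from the training set.
    """
    probs = np.asarray(probs)
    bins = np.arange(0.0, 1.0 + bin_width, bin_width)
    counts, _ = np.histogram(probs, bins=bins)

    # Walk from the left while each bin is much larger than the next one.
    p1_idx = 0
    while p1_idx < len(counts) - 1 and counts[p1_idx] > ratio * counts[p1_idx + 1]:
        p1_idx += 1
    p1 = bins[p1_idx]

    # Walk from the right symmetrically.
    p2_idx = len(counts)
    while p2_idx > 1 and counts[p2_idx - 1] > ratio * counts[p2_idx - 2]:
        p2_idx -= 1
    p2 = bins[p2_idx]

    keep_mask = ~((probs > p1) & (probs < p2))   # True for samples kept in the training set
    return p1, p2, keep_mask
```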
Fig. 5(a) shows a frame of the video numbered 15_ep03_02 in the CASME II database; the subject's head is tilted at a significant angle and there is much interfering information such as background, hair and earphones. Preprocessing of the original video is divided into three steps: face detection, face alignment and micro-expression region cropping. Fig. 5(b) shows the face bounding rectangle extracted with the Dlib frontal face detector. Next, 68 facial feature points are detected within the rectangle using a residual neural network facial landmark detection model.
Fig. 6 shows the face alignment. The two outer eye corners are feature points 36 and 45, and the face deflection angle can be calculated from the horizontal and vertical coordinates of these two points. Let the coordinates of key points 36 and 45 be (x1, y1) and (x2, y2), respectively. Then the
horizontal difference is:
dx = x2 - x1, (9)
the vertical difference is:
dy = y2 - y1, (10)
and the face deflection angle is:
angle = arctan(dy / dx). (11)
the affine matrix is calculated by angle to carry out affine transformation, so that an image with aligned faces as shown in fig. 7 (a) can be obtained.
As can be seen from Fig. 7(a), the face-aligned image still contains considerable noise, such as the glasses frame and the hair at the four corners of the image; other subjects may also show interference such as clothing collars and earphone cables (see Fig. 8(b)). The intra-class distance caused by such noise can exceed the already small inter-class distance between micro-expression frames and neutral frames. To minimize the intra-class distance, the image needs to be further cropped. Guided by the Facial Action Coding System (FACS) encoding of the relevant micro-expressions, the image is further cropped according to two principles: retain to the greatest extent the action units contained in the CASME II micro-expressions, and reduce noise interference to the greatest extent. The optimal cropping parameters were determined by trial and error, and the final result is shown in Fig. 7(b). All of the more than 320,000 frames in the CASME II, SMIC-E-HS and CAS(ME)^2 databases are preprocessed according to this flow.
Fig. 8(a) shows some original video frames from the CASME II database, Fig. 8(b) the corresponding Dlib face detections, and Fig. 8(c) the result of the preprocessing method of the present invention. Comparing Fig. 8(a) and Fig. 8(c) shows that the preprocessing method of the present invention accurately extracts the facial micro-expression region from the original video and effectively removes most of the noise interference affecting micro-expression detection.
After MesNet finishes training, the test set samples are input to obtain, for each sample, the probability ŷ^(i) that it belongs to the positive-sample micro-expression frame class. If ŷ^(i) ≥ 0.5, the sample is judged to be a micro-expression frame and the output label is 1; if ŷ^(i) < 0.5, it is judged to be a neutral frame and the output label is 0. Based on the true labels of the test set and the labels predicted by MesNet, an ROC curve can be plotted and the AUC value computed. The higher the AUC value, the better the model performance.
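A hedged sketch of this evaluation step, assuming the Keras model from the earlier sketches and scikit-learn for the ROC/AUC computation; x_test and y_test are assumed to be the preprocessed test frames and their ground-truth labels:

```python
from sklearn.metrics import roc_auc_score, roc_curve

probs = model.predict(x_test).ravel()        # probability of being a micro-expression frame
pred_labels = (probs >= 0.5).astype(int)     # 1: micro-expression frame, 0: neutral frame

auc = roc_auc_score(y_test, probs)           # area under the ROC curve
fpr, tpr, _ = roc_curve(y_test, probs)       # points for plotting the ROC curve
print(f"AUC = {auc:.3f}")
```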
Experimental results
To show that the method of the invention achieves a higher micro-expression detection AUC value, it is compared with other methods; the comparison results are shown in the table below. The references for the other methods in the table are as follows:
[1] DAVISON A K, LANSLEY C, NG C C, et al. Objective Micro-Facial Movement Detection Using FACS-Based Regions and Baseline Evaluation[C]//2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018: 642-649.
[2] QU F, WANG S J, YAN W J, et al. CAS(ME)^2: A Database for Spontaneous Macro-expression and Micro-expression Spotting and Recognition[J]. IEEE Transactions on Affective Computing, 2018, 9(4): 424-436.
[3] WANG S J, WU S, QIAN X, et al. A main directional maximal difference analysis for spotting facial movements from long-term videos[J]. Neurocomputing, 2017, 230: 382-389.
[4] LI X, HONG X, MOILANEN A, et al. Towards Reading Hidden Emotions: A Comparative Study of Spontaneous Micro-Expression Spotting and Recognition Methods[J]. IEEE Transactions on Affective Computing, 2018, 9(4): 563-577.
[5] DUQUE C A, ALATA O, EMONET R, et al. Micro-Expression Spotting Using the Riesz Pyramid[C]//2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018: 66-74.
(Table: comparison of micro-expression detection AUC values between MesNet and the methods of references [1]-[5] on the CASME II, SMIC-E-HS and CAS(ME)^2 databases.)
As can be seen from the table, the AUC values of MesNet on the CASME II, SMIC-E-HS and CAS(ME)^2 databases all surpass those of the existing methods. Besides higher accuracy, MesNet also has the advantage of a wide range of application compared with other methods. MesNet places no restriction on the length of the input video: it is suitable not only for the short videos of CASME II and SMIC-E-HS but also for the long videos of the CAS(ME)^2 database. In contrast, the methods proposed in references [1], [4] and [5] are verified only on the short videos of CASME II or SMIC-E-HS, and the methods of references [2] and [3] are verified only on the CAS(ME)^2 long video database.
To demonstrate the effectiveness of the adaptive transition frame removal method, a comparison experiment with and without adaptive transition frame removal is set up; the AUC comparison results are shown in the table below.
(Table: comparison of micro-expression detection AUC values on the three databases with and without the adaptive transition frame removal method.)
As can be seen from the table, adopting the adaptive transition frame removal method effectively improves the micro-expression detection AUC value on all three databases.
While the foregoing has described specific embodiments of the present invention in detail, it will be appreciated by those of ordinary skill in the art that variations and modifications may be made without departing from the scope of the invention as set forth in the appended claims.

Claims (6)

1. A micro-expression detection method based on a deep network with adaptive transition frame removal, comprising network construction, network training and micro-expression detection, characterized in that:
the network construction specifically comprises:
step S1: selecting a pre-trained CNN model on an ImageNet database, and reserving a convolution layer and pre-training parameters;
step S2: adding a full connection layer after the CNN model;
step S3: adding an output layer and a logistic classifier after the fully connected layer, the completed network being named the MesNet network;
the network training specifically comprises the following steps:
step S1: preprocessing data of an original video to remove noise interference affecting microexpressive detection;
step S2: removing transition frames from the training set using an adaptive transition frame removal method;
step S3: inputting the micro-expression frames and neutral frame samples without the transition frames into a MesNet network for training;
the method for adaptively removing the transition frame specifically comprises the following steps:
step S1: dividing the training set into a confidence sample and a transition frame sample to be removed;
step S2: training a MesNet network through a confidence sample to obtain a micro expression frame and a neutral frame classification model;
step S3: predicting transition frame samples to be removed by using a binary classification model to obtain the probability that each sample belongs to a positive sample micro expression frame;
step S4: adaptively determining a threshold value for screening the transition frames through a transition frame sample probability distribution diagram to be removed, so that the transition frames are removed;
the micro expression detection specifically comprises the following steps:
step S1: preprocessing data of the original video of the test set;
step S2: inputting the preprocessed samples into the trained MesNet network to obtain predicted label values.
2. The micro-expression detection method based on a deep network with adaptive transition frame removal according to claim 1, wherein:
and in the network construction stage, a pre-trained acceptance-ResNet-V2 model is used as a basis, a full-connection layer containing 512 neurons and an output layer containing 1 neuron are added, and a micro-expression frame and neutral frame classification network is constructed for detecting micro-expressions from videos.
3. The micro-expression detection method based on a deep network with adaptive transition frame removal according to claim 1, wherein:
the network training stage, the data preprocessing of the original video comprises face detection, face alignment and micro expression region cutting;
the human face detection is to extract a human face rectangular frame by using a Dlib forward human face detector, and detect 68 human face feature points in the rectangular frame by using a residual neural network human face feature point detection model;
the face alignment is that the face deflection angle is determined by calculating the horizontal difference and the vertical difference of the two outer eye angles, and affine transformation is performed by calculating an affine matrix by utilizing the face deflection angle, so that the face alignment is completed.
4. The micro-expression detection method based on a deep network with adaptive transition frame removal according to claim 1, wherein:
the transition frame is provided with a noise label, and the transition frame can be identified and removed by a self-adaptive transition frame removing method.
5. The micro-expression detection method based on a deep network with adaptive transition frame removal according to claim 1, wherein:
after the MesNet network finishes training, inputting the test set samples, and obtaining the probability that each sample belongs to a positive sample micro-expression frame; if the probability is more than or equal to 0.5, judging the frame as a micro-expression frame, and outputting a label as 1; if the probability is less than 0.5, judging the frame as a neutral frame, and outputting a label as 0; according to the real label of the test set and the MesNet network prediction label, making an ROC curve graph and calculating an AUC value; the higher the AUC value, the better the MesNet network performance.
6. The micro-expression detection method based on a deep network with adaptive transition frame removal according to claim 1, wherein:
in the micro-expression detection process, the MesNet network places no restriction on the length of the input video: it is suitable both for short videos of tens of frames and for long videos of thousands of frames.
CN202010092959.8A 2020-02-14 2020-02-14 Micro-expression detection method based on self-adaptive transition frame depth network removal Active CN111325131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010092959.8A CN111325131B (en) 2020-02-14 2020-02-14 Micro-expression detection method based on self-adaptive transition frame depth network removal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010092959.8A CN111325131B (en) 2020-02-14 2020-02-14 Micro-expression detection method based on self-adaptive transition frame depth network removal

Publications (2)

Publication Number Publication Date
CN111325131A CN111325131A (en) 2020-06-23
CN111325131B true CN111325131B (en) 2023-06-23

Family

ID=71171012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010092959.8A Active CN111325131B (en) 2020-02-14 2020-02-14 Micro-expression detection method based on self-adaptive transition frame depth network removal

Country Status (1)

Country Link
CN (1) CN111325131B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530648A (en) * 2013-10-14 2014-01-22 四川空港知觉科技有限公司 Face recognition method based on multi-frame images
CN106803909A (en) * 2017-02-21 2017-06-06 腾讯科技(深圳)有限公司 The generation method and terminal of a kind of video file
CN107679526A (en) * 2017-11-14 2018-02-09 北京科技大学 A kind of micro- expression recognition method of face

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8848068B2 (en) * 2012-05-08 2014-09-30 Oulun Yliopisto Automated recognition algorithm for detecting facial expressions
EP2960905A1 (en) * 2014-06-25 2015-12-30 Thomson Licensing Method and device of displaying a neutral facial expression in a paused video

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530648A (en) * 2013-10-14 2014-01-22 四川空港知觉科技有限公司 Face recognition method based on multi-frame images
CN106803909A (en) * 2017-02-21 2017-06-06 腾讯科技(深圳)有限公司 The generation method and terminal of a kind of video file
CN107679526A (en) * 2017-11-14 2018-02-09 北京科技大学 A kind of micro- expression recognition method of face

Also Published As

Publication number Publication date
CN111325131A (en) 2020-06-23

Similar Documents

Publication Publication Date Title
Yang et al. Deep multimodal representation learning from temporal data
Littlewort et al. Dynamics of facial expression extracted automatically from video
US9530048B2 (en) Automated facial action coding system
CN111797683A (en) Video expression recognition method based on depth residual error attention network
CN109543526A (en) True and false facial paralysis identifying system based on depth difference opposite sex feature
CN112784763A (en) Expression recognition method and system based on local and overall feature adaptive fusion
CN109299690B (en) Method capable of improving video real-time face recognition precision
CN111666845B (en) Small sample deep learning multi-mode sign language recognition method based on key frame sampling
CN112560810A (en) Micro-expression recognition method based on multi-scale space-time characteristic neural network
Khatri et al. Facial expression recognition: A survey
CN111967354B (en) Depression tendency identification method based on multi-mode characteristics of limbs and micro-expressions
Zhao et al. Applying contrast-limited adaptive histogram equalization and integral projection for facial feature enhancement and detection
CN112949560A (en) Method for identifying continuous expression change of long video expression interval under two-channel feature fusion
Bartlett et al. Towards automatic recognition of spontaneous facial actions
CN111680660A (en) Human behavior detection method based on multi-source heterogeneous data stream
Suh et al. Adversarial deep feature extraction network for user independent human activity recognition
Lee et al. Face and facial expressions recognition system for blind people using ResNet50 architecture and CNN
Chang et al. Using gait information for gender recognition
CN111325131B (en) Micro-expression detection method based on self-adaptive transition frame depth network removal
Mao et al. Robust facial expression recognition based on RPCA and AdaBoost
CN106709442B (en) Face recognition method
Khan et al. Traditional features based automated system for human activities recognition
Lee et al. Recognition of facial emotion through face analysis based on quadratic bezier curves
Hema et al. Gait energy image projections based on gender detection using support vector machines
CN113408389A (en) Method for intelligently recognizing drowsiness action of driver

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant