CN113902774B - Facial expression detection method based on dense optical flow features in video - Google Patents

Facial expression detection method based on dense optical flow features in video

Info

Publication number
CN113902774B
CN113902774B (application CN202111171053.6A)
Authority
CN
China
Prior art keywords: face, detected, image, facial expression, src
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111171053.6A
Other languages
Chinese (zh)
Other versions
CN113902774A (en)
Inventor
杨赛
顾全林
曹攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Xishang Bank Co ltd
Original Assignee
Wuxi Xishang Bank Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Xishang Bank Co ltd filed Critical Wuxi Xishang Bank Co ltd
Priority to CN202111171053.6A priority Critical patent/CN113902774B/en
Publication of CN113902774A publication Critical patent/CN113902774A/en
Application granted granted Critical
Publication of CN113902774B publication Critical patent/CN113902774B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T5/70
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/90 - Determination of colour characteristics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face

Abstract

The invention relates to the technical field of expression detection and discloses a facial expression detection method based on dense optical flow features in video, which comprises the following steps: extracting multiple frames of face images to be detected from a face video, processing them to obtain corrected face images, and performing face key point detection to obtain 68 face key points; extracting the dense optical flow between the corrected i-th frame and (i+k)-th frame face images and converting it into a BGR-space image; extracting eyebrow and mouth region images from the BGR-space image according to the 68 face key points to obtain a target image; constructing a facial expression recognition model from the training target images among the target images and their corresponding labels, and inputting the test target images among the target images into the facial expression recognition model to obtain a facial expression recognition result. The facial expression detection method based on dense optical flow features in video can detect micro-/macro-expressions in video.

Description

Facial expression detection method based on dense optical flow features in video
Technical Field
The invention relates to the technical field of expression detection, and in particular to a facial expression detection method based on dense optical flow features in video.
Background
Emotion is one of the three basic psychological processes (alongside cognition and volition); it is grounded in individual wishes and needs and manifests as a person's attitudinal experience of objective things together with the corresponding behavioral responses. Facial expressions are the most prominent behavioral manifestation of emotion and can be divided into macro-expressions and micro-expressions. Micro-expressions are spontaneous expressions that cannot be faked or suppressed and therefore reflect a person's true emotions. In some scenarios micro-expressions convey more, and more credible, information than body language or speech, so research on micro-expression emotion recognition has considerable practical value in applications such as background investigation, interview recruitment, suspect interrogation, and loan review interviews.
Expression research currently falls into two broad directions: expression detection and expression recognition. Expression recognition has been studied for a long time, whereas the expression detection task has only recently begun to attract researchers' attention. Micro-expressions last only 1/25 s to 1/3 s and have very small motion amplitude; when the micro-expressions in a long video are interwoven with ordinary macro-expressions, detecting both micro- and macro-expressions becomes considerably more challenging.
Most current research targets scenes containing only micro-expressions or only macro-expressions, or handles the joint micro-/macro-expression detection task coarsely: the facial expression regions of interest are underused, the methods depend heavily on the scene, detection results are strongly affected by camera shake or head movement, there is heavy reliance on feature extraction and on the model, and the post-processing of the expression detection results in video after model prediction is insufficient.
Disclosure of Invention
The invention aims to overcome the above deficiencies of the prior art and provides a facial expression detection method based on dense optical flow features in video, so as to address the problems that detection results are strongly affected by camera shake or head movement, that there is heavy reliance on feature extraction and on the model, and that the post-processing of expression detection results in video after model prediction is insufficient.
As a first aspect of the present invention, there is provided a facial expression detection method based on dense optical flow features in video, comprising:
Step S1: acquiring a plurality of face videos to be detected, wherein each face video to be detected S comprises a plurality of frames of face images to be detected, S = {src_1, ..., src_i, ..., src_N}, and src_i is the i-th frame face image to be detected in the face video to be detected;
Step S2: extracting the plurality of frames of face images to be detected from each face video to be detected and processing them separately to obtain a corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N}, wherein src'_i is the corrected i-th frame face image to be detected;
Step S3: performing face key point detection on each corrected face image to be detected to obtain 68 face key points;
Step S4: extracting the dense optical flow f_i between the corrected face images to be detected src'_i and src'_{i+k}, and converting the dense optical flow f_i from HSV space into a BGR-space image img_i, wherein the label of the BGR-space image img_i is the label of the corrected i-th frame face image to be detected src'_i;
Step S5: extracting an eyebrow region image and a mouth region image from the BGR-space image img_i according to the 68 face key points, and processing the eyebrow region image and the mouth region image to obtain a final target image;
Step S6: dividing the final target images of the plurality of face videos to be detected into training target images and test target images, constructing a facial expression recognition model from the training target images and their corresponding labels, and inputting the test target images into the facial expression recognition model to obtain a facial expression recognition result.
Further, steps S2 and S3 further comprise:
performing frame-by-frame face detection with RetinaFace on the plurality of frames of face images to be detected in the face video to be detected S = {src_1, ..., src_i, ..., src_N}, to obtain a face bounding box set bbox = {bbox_1, ..., bbox_i, ..., bbox_N} and a 5-point face key point set lmk = {lmk_1, ..., lmk_i, ..., lmk_N}, wherein src_i is the i-th frame face image to be detected in the face video to be detected, bbox_i is the face bounding box of the i-th frame, lmk_i is the 5-point face key points of the i-th frame face image to be detected, and N is the total number of frames of face images to be detected in the face video to be detected;
performing face alignment on the plurality of frames of face images to be detected in the face video to be detected according to the 5-point face key points lmk_i and a transformation matrix M, to obtain a corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N} and a corrected face bounding box set bbox' = {bbox'_1, ..., bbox'_i, ..., bbox'_N}, wherein src'_i is the corrected i-th frame face image to be detected, src'_i ∈ 224×224×3, bbox'_i is the face bounding box of the corrected i-th frame face image to be detected, and the transformation matrix M is computed as shown in formula (1);
performing face key point detection with the 3DDFA_V2 algorithm on the face image within the corrected face bounding box bbox'_i, to obtain a 68-point face key point set lmk' = {lmk'_1, ..., lmk'_i, ..., lmk'_N}, wherein lmk'_i denotes the 68 face key points within the face bounding box bbox'_i of the corrected i-th frame face image to be detected.
Further, step S4 further comprises:
when the corrected i-th frame face image to be detected src'_i falls within a facial expression interval, its label is defined as label_i = 1, otherwise label_i = 0, wherein the value of k is chosen as half the average length of a facial expression in the face video data set to be detected and is computed as shown in formula (2):
k = (1/2) · (1/T) · Σ_{j=1}^{T} n_j    (2)
wherein T is the total number of videos in the face video data set to be detected, and n_j is the number of frames containing a facial expression in the j-th face video to be detected.
Further, step S5 further comprises:
the eyebrow region image ROI_1 is the image of the region delimited by a coordinate frame computed as shown in formula (3), wherein ω_1 is the number of padding pixels;
the mouth region image ROI_2 is the image of the region delimited by a coordinate frame computed as shown in formula (4), wherein ω_2 is the number of padding pixels;
the two region images are normalized to size H×W and then combined to obtain the final target image, wherein H and W are the normalized height and width, respectively.
Further, in step S6 the plurality of training target images form a training set IMG_train and the plurality of test target images form a test set IMG_test, and step S6 further comprises:
Step S61: inputting the test target images of the j-th test set IMG_test·j into the facial expression recognition model to obtain, for each test target image, a predicted label label'_{i·j} ∈ {0, 1} and a confidence value_{i·j} ∈ [0, 1]; computing the facial expression score s_{i·j} of the i-th frame test target image in the j-th test set IMG_test·j according to formula (5); the facial expression score set of the N frames of test target images in the j-th test set IMG_test·j is then S_j = {s_{0·j}, ..., s_{i·j}, ..., s_{N·j}};
s_{i·j} = value_{i·j} * label'_{i·j}    (5)
Step S62: smoothing the facial expression score set S_j of the N frames of test target images in the j-th test set IMG_test·j with Savitzky-Golay convolution so that it becomes a continuous curve S'_j;
Step S63: using a dynamic threshold T as the score threshold of the curve S'_j, wherein T is computed as shown in formula (6), S_mean is the mean of the facial expression scores in the facial expression score set S_j, S_max is the maximum facial expression score in the facial expression score set S_j, and η is a weight coefficient;
T = S_mean + η * (S_max - S_mean)    (6)
Step S64: finding the peak points of the curve S'_j, with the threshold T and the nearest-neighbour distance k as constraints: a peak of the curve S'_j is a target facial expression peak if and only if its value is greater than the threshold T and its distance from adjacent peak points is greater than k; the target peaks satisfying these constraints are grouped by adjacent intervals to obtain the final set of predicted facial expression label intervals;
Step S65: when the overlap (IOU) between a predicted interval in the facial expression label interval set and a real interval is greater than or equal to 0.5, the predicted interval in the facial expression label interval set is judged to be correct.
The facial expression detection method based on dense optical flow features in video has the following advantages: the face is normalized and corrected, useless noise is removed from the expression regions of interest, and micro-/macro-expressions in video are detected by combining the dense optical flow features of the regions of interest with an expression-detection post-processing method.
Drawings
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification, illustrate the invention and together with the description serve to explain, without limitation, the invention.
Fig. 1 is a flowchart of the facial expression detection method based on dense optical flow features in video provided by the present invention.
Fig. 2 is a flowchart of processing a face image to be detected provided by the present invention.
Fig. 3 is a schematic diagram of label division provided in the present invention.
Fig. 4 is a flowchart for detecting facial expression according to the present invention.
Fig. 5 is a schematic diagram of the 68 face key points provided by the present invention.
Detailed Description
In order to further explain the technical means and effects adopted by the invention to achieve its intended purpose, the specific implementation, structure, features and effects of the facial expression detection method based on dense optical flow features in video are described in detail below with reference to the accompanying drawings and preferred embodiments. It is apparent that the described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the invention herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In this embodiment, a facial expression detection method based on dense optical flow features in video is provided; as shown in fig. 1, the method comprises:
Step S1: acquiring a plurality of face videos to be detected, wherein each face video to be detected S comprises a plurality of frames of face images to be detected, S = {src_1, ..., src_i, ..., src_N}, and src_i is the i-th frame face image to be detected in the face video to be detected;
Step S2: extracting the plurality of frames of face images to be detected from each face video to be detected and processing them separately to obtain a corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N}, wherein src'_i is the corrected i-th frame face image to be detected;
Step S3: performing face key point detection on each corrected face image to be detected to obtain 68 face key points;
It should be noted that the 68 face key points are extracted because they are distributed over the key facial expression regions (eyebrows, eyes, mouth, etc.), as shown in fig. 5;
Step S4: converting the corrected face images to be detected src'_i and src'_{i+k} into grayscale images, extracting their dense optical flow f_i (Farneback's Two-Frame Motion Estimation Based on Polynomial Expansion), and converting the dense optical flow f_i from HSV space into a BGR-space image img_i, wherein the label of the BGR-space image img_i is the label of the corrected i-th frame face image to be detected src'_i;
It should be noted that in step S4 the corrected face images to be detected of the i-th frame and the (i+k)-th frame are selected, where k is a sliding-window value;
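A minimal Python/OpenCV sketch of step S4 is given below; it uses OpenCV's Farneback dense optical flow and the usual HSV encoding of flow direction and magnitude. The Farneback parameters are illustrative assumptions, since they are not specified in the patent.

```python
import cv2
import numpy as np

def dense_flow_to_bgr(src_i, src_ik):
    """Step S4: Farneback dense optical flow between two corrected frames, encoded as a BGR image."""
    g1 = cv2.cvtColor(src_i, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(src_ik, cv2.COLOR_BGR2GRAY)
    # pyr_scale=0.5, levels=3, winsize=15, iterations=3, poly_n=5, poly_sigma=1.2 (illustrative)
    flow = cv2.calcOpticalFlowFarneback(g1, g2, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((g1.shape[0], g1.shape[1], 3), dtype=np.uint8)
    hsv[..., 0] = ang * 180 / np.pi / 2                               # hue: flow direction
    hsv[..., 1] = 255                                                 # saturation: constant
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)   # value: flow magnitude
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)                       # img_i, passed to step S5
```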
Step S5: extracting an eyebrow region image and a mouth region image from the BGR-space image img_i according to the 68 face key points, and processing the eyebrow region image and the mouth region image to obtain the final target image;
Step S6: processing the videos in batches, dividing the final target images of the plurality of face videos to be detected into training target images and test target images according to leave-one-out cross-validation, constructing a MobileNet_v2-based facial expression recognition model from the training target images and their corresponding labels, and inputting the test target images into the facial expression recognition model to obtain the facial expression recognition results.
It should be noted that during training the facial expression recognition model applies random data augmentation with probability 0.5 (e.g. flipping, adding noise, contrast adjustment, etc.).
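The following PyTorch sketch illustrates one way to build the MobileNet_v2-based expression/no-expression classifier with roughly 0.5-probability random augmentation; the specific transforms, optimizer and learning rate are assumptions, not values given in the patent.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Random augmentation applied during training; the transforms below stand in for
# "flipping, adding noise, contrast, etc." and are illustrative only.
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomApply([transforms.ColorJitter(brightness=0.2, contrast=0.2)], p=0.5),
    transforms.ToTensor(),
])

def build_expression_model(num_classes=2):
    """Binary (expression / no expression) classifier built on MobileNet_v2."""
    net = models.mobilenet_v2()                               # randomly initialised backbone
    net.classifier[1] = nn.Linear(net.last_channel, num_classes)
    return net

model = build_expression_model()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)     # illustrative optimiser settings
```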
Preferably, as shown in fig. 2, steps S2 and S3 further comprise:
performing frame-by-frame face detection with RetinaFace (Single-stage Dense Face Localisation in the Wild) on the plurality of frames of face images to be detected in the face video to be detected S = {src_1, ..., src_i, ..., src_N}, to obtain the face bounding box set bbox = {bbox_1, ..., bbox_i, ..., bbox_N} and the 5-point face key point set lmk = {lmk_1, ..., lmk_i, ..., lmk_N}, wherein src_i is the i-th frame face image to be detected in the face video to be detected, bbox_i is the face bounding box of the i-th frame, lmk_i is the 5-point face key points of the i-th frame face image to be detected, and N is the total number of frames of face images to be detected in the face video to be detected;
in order to reduce the influence of camera shake or head movement, all face images to be detected are normalized to the same scale and a frontal view: face alignment is performed on the plurality of frames of face images to be detected in the face video to be detected according to the 5-point face key points lmk_i and a transformation matrix M, giving the corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N} and the corrected face bounding box set bbox' = {bbox'_1, ..., bbox'_i, ..., bbox'_N}, wherein src'_i is the corrected i-th frame face image to be detected, src'_i ∈ 224×224×3, bbox'_i is the face bounding box of the corrected i-th frame face image to be detected, and the transformation matrix M is computed as shown in formula (1);
the 3DDFA_V2 algorithm (Towards Fast, Accurate and Stable 3D Dense Face Alignment) is used to perform face key point detection on the face image within the corrected face bounding box bbox'_i, giving the 68-point face key point set lmk' = {lmk'_1, ..., lmk'_i, ..., lmk'_N}, wherein lmk'_i denotes the 68 face key points within the face bounding box bbox'_i of the corrected i-th frame face image to be detected.
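The patent computes the transformation matrix M by formula (1); as a stand-in, the sketch below uses a standard 5-point similarity alignment to a 224×224 canonical template. The ArcFace-style reference points and the use of cv2.estimateAffinePartial2D are assumptions, and the 5 landmarks lmk5 are taken to come from the RetinaFace detection above.

```python
import cv2
import numpy as np

# Canonical 5-point template (eye centres, nose tip, mouth corners) on a 112x112
# face crop, scaled to 224x224; this template is an assumption standing in for formula (1).
REF_5PTS_112 = np.array([[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
                         [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)
REF_5PTS_224 = REF_5PTS_112 * (224.0 / 112.0)

def align_face(frame, lmk5):
    """Warp a frame so that its 5 detected landmarks match the canonical template (224x224x3)."""
    src_pts = np.asarray(lmk5, dtype=np.float32)
    M, _ = cv2.estimateAffinePartial2D(src_pts, REF_5PTS_224, method=cv2.LMEDS)
    return cv2.warpAffine(frame, M, (224, 224), flags=cv2.INTER_LINEAR)
```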
Preferably, step S4 further comprises:
when the corrected i-th frame face image to be detected src'_i falls within a facial expression (micro-/macro-expression) interval (onset to offset), its label is defined as label_i = 1, otherwise label_i = 0, wherein the value of k is chosen as half the average length of a facial expression in the face video data set to be detected and is computed as shown in formula (2):
k = (1/2) · (1/T) · Σ_{j=1}^{T} n_j    (2)
wherein T is the total number of videos in the face video data set to be detected, and n_j is the number of frames containing a facial expression in the j-th face video to be detected.
In this embodiment a label-division schematic is provided, as shown in fig. 3: (1) when the sliding window k = 3, the BGR-space image img_1 obtained from the 1st frame and 4th frame face images to be detected takes the label of the 1st frame face image to be detected; (2) when F_onset ≤ F_i ≤ F_offset, the label of the BGR-space image of the i-th frame F_i is 1, otherwise it is 0, where F_onset is the onset frame and F_offset is the offset frame of the expression interval of the BGR-space image.
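A short sketch of the sliding-window value k of formula (2) and of the per-frame label assignment follows; the helper names are illustrative.

```python
import numpy as np

def sliding_window_k(expression_frame_counts):
    """Formula (2): k is half the average expression length n_j over the T videos of the data set."""
    n = np.asarray(expression_frame_counts, dtype=float)   # n_j for j = 1..T
    return max(1, int(round(n.mean() / 2.0)))

def frame_labels(num_frames, intervals):
    """label_i = 1 when frame i lies inside an annotated [F_onset, F_offset] interval, else 0."""
    labels = np.zeros(num_frames, dtype=np.int64)
    for f_onset, f_offset in intervals:
        labels[f_onset:f_offset + 1] = 1
    return labels

# Example matching fig. 3: with k = 3, the flow image img_1 built from frames 1 and 4
# inherits labels[1], i.e. the label of frame 1.
```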
It should be noted that a facial expression may be a micro-expression or a macro-expression, and the facial expression recognition model is correspondingly a micro-expression recognition model or a macro-expression recognition model.
Preferably, as shown in fig. 2, step S5 further comprises:
the eyebrow region image ROI_1 is the image of the region delimited by a coordinate frame computed as shown in formula (3), wherein ω_1 is the number of padding pixels; since the eye region contains no effective facial action unit and irregular eye blinking would disturb the eyebrow region image ROI_1, the region delimited by the eye key points lmk'_i[37:48] of the 68-point face key point set lmk'_i is filled in;
the mouth region image ROI_2 is the image of the region delimited by a coordinate frame computed as shown in formula (4), wherein ω_2 is the number of padding pixels;
the two region images are normalized to size H×W and then combined to obtain the final target image, wherein H and W are the normalized height and width, respectively; here H = 96 and W = 128 are set.
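The exact coordinate frames are given by formulas (3) and (4); the sketch below instead approximates the eyebrow and mouth regions with padded landmark bounding boxes, where the padding values (standing in for ω_1 and ω_2) and the black fill used for the eye region are assumptions. It produces the 96×128 ROIs and stacks them into the final target image.

```python
import cv2
import numpy as np

H, W = 96, 128   # normalised ROI height and width used in this embodiment

def _padded_box(points, pad, shape):
    """Axis-aligned bounding box of the given landmarks, expanded by `pad` pixels."""
    x1 = max(int(points[:, 0].min()) - pad, 0)
    y1 = max(int(points[:, 1].min()) - pad, 0)
    x2 = min(int(points[:, 0].max()) + pad, shape[1])
    y2 = min(int(points[:, 1].max()) + pad, shape[0])
    return x1, y1, x2, y2

def extract_target_image(img_bgr, lmk68, pad_brow=10, pad_mouth=10):
    """Eyebrow and mouth ROIs of the flow image img_i, resized to H x W and stacked vertically."""
    lmk = np.asarray(lmk68, dtype=np.float32)
    img = img_bgr.copy()
    # Fill the eye region (landmarks 37-48, 1-based) so that blinking does not disturb ROI_1.
    eye_hull = cv2.convexHull(lmk[36:48].astype(np.int32))
    cv2.fillConvexPoly(img, eye_hull, (0, 0, 0))
    bx1, by1, bx2, by2 = _padded_box(lmk[17:27], pad_brow, img.shape)    # eyebrows (pad ~ omega_1)
    mx1, my1, mx2, my2 = _padded_box(lmk[48:68], pad_mouth, img.shape)   # mouth (pad ~ omega_2)
    brow = cv2.resize(img[by1:by2, bx1:bx2], (W, H))
    mouth = cv2.resize(img[my1:my2, mx1:mx2], (W, H))
    return np.vstack([brow, mouth])   # final target image fed to the classifier
```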
Preferably, this embodiment provides a flowchart of the post-processing after expression detection (taking micro-expressions as an example), as shown in fig. 4. In step S6 the plurality of training target images form a training set IMG_train and the plurality of test target images form a test set IMG_test, and step S6 further comprises:
Step S61: inputting the test target images of the j-th test set IMG_test·j into the facial expression recognition model to obtain, for each test target image, a predicted label label'_{i·j} ∈ {0, 1} and a confidence value_{i·j} ∈ [0, 1]; computing the facial expression score s_{i·j} of the i-th frame test target image in the j-th test set IMG_test·j according to formula (5); the facial expression score set of the N frames of test target images in the j-th test set IMG_test·j is then S_j = {s_{0·j}, ..., s_{i·j}, ..., s_{N·j}};
s_{i·j} = value_{i·j} * label'_{i·j}    (5)
Step S62: since the facial expression score set S_j tends to be discrete, in order to eliminate errors caused by the model's classification, the facial expression score set S_j of the N frames of test target images in the j-th test set IMG_test·j is smoothed with Savitzky-Golay convolution so that it becomes a continuous curve S'_j;
Step S63: since the facial expression score sets of different videos differ greatly, a dynamic threshold T is used as the score threshold of the curve S'_j; T is computed as shown in formula (6), where S_mean is the mean of the facial expression scores in the facial expression score set S_j, S_max is the maximum facial expression score in the facial expression score set S_j, and η is a weight coefficient;
T = S_mean + η * (S_max - S_mean)    (6)
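Steps S62 and S63 can be sketched with SciPy as follows; the Savitzky-Golay window length and polynomial order, and the weight η, are illustrative values rather than values taken from the patent.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_scores(scores, window=11, polyorder=3):
    """Step S62: Savitzky-Golay smoothing of the score set S_j into a continuous curve S'_j."""
    s = np.asarray(scores, dtype=float)
    window = min(window, len(s) if len(s) % 2 == 1 else len(s) - 1)   # window length must be odd
    return savgol_filter(s, window_length=window, polyorder=min(polyorder, window - 1))

def dynamic_threshold(scores, eta=0.6):
    """Step S63 / formula (6): T = S_mean + eta * (S_max - S_mean)."""
    s = np.asarray(scores, dtype=float)
    return s.mean() + eta * (s.max() - s.mean())
```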
Step S64: the peak points of the curve S'_j are found, with the threshold T and the nearest-neighbour distance k as constraints: a peak of the curve S'_j is a target facial expression peak if and only if its value is greater than the threshold T and its distance from adjacent peak points is greater than k; the target peaks satisfying these constraints are grouped by adjacent intervals to obtain the final set of predicted facial expression label intervals;
Step S65: when the overlap (IOU) between a predicted interval in the facial expression label interval set and a real interval is greater than or equal to 0.5, the predicted interval in the facial expression label interval set is judged to be correct; the balanced F1-score is used as the evaluation metric, and parameter optimization is performed by grid search.
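A hedged sketch of steps S64 and S65 follows; turning each retained peak into a ±k candidate interval and merging adjacent candidates is one reasonable reading of the grouping rule described above, and the IOU is computed over frame indices.

```python
import numpy as np
from scipy.signal import find_peaks

def predict_intervals(curve, threshold, k):
    """Step S64: peaks of S'_j above T and at least k frames apart, grouped into intervals."""
    peaks, _ = find_peaks(np.asarray(curve, dtype=float), height=threshold, distance=k)
    candidates = [(max(p - k, 0), min(p + k, len(curve) - 1)) for p in peaks]
    merged = []
    for start, end in sorted(candidates):       # merge adjacent / overlapping candidates
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def interval_iou(pred, gt):
    """Step S65: overlap-over-union of a predicted and a ground-truth frame interval."""
    inter = max(0, min(pred[1], gt[1]) - max(pred[0], gt[0]) + 1)
    union = (pred[1] - pred[0] + 1) + (gt[1] - gt[0] + 1) - inter
    return inter / union

# A predicted interval counts as correct (a true positive for the F1-score) when
# interval_iou(pred, gt) >= 0.5 for some ground-truth interval.
```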
The facial expression detection method based on dense optical flow features in video provided by the invention normalizes and corrects the face, removes useless noise from the expression regions of interest, and detects micro-/macro-expressions in video by combining the dense optical flow features of the regions of interest with an expression-detection post-processing method.
The present invention is not limited to the above embodiments; any modifications, equivalent substitutions and improvements made within the spirit and principles of the invention shall fall within the protection scope of the invention.

Claims (5)

1. A facial expression detection method based on dense optical flow features in video, characterized by comprising the following steps:
Step S1: acquiring a plurality of face videos to be detected, wherein each face video to be detected S comprises a plurality of frames of face images to be detected, S = {src_1, ..., src_i, ..., src_N}, and src_i is the i-th frame face image to be detected in the face video to be detected;
Step S2: extracting the plurality of frames of face images to be detected from each face video to be detected and processing them separately to obtain a corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N}, wherein src'_i is the corrected i-th frame face image to be detected;
Step S3: performing face key point detection on each corrected face image to be detected to obtain 68 face key points;
Step S4: extracting the dense optical flow f_i between the corrected face images to be detected src'_i and src'_{i+k}, and converting the dense optical flow f_i from HSV space into a BGR-space image img_i, wherein the label of the BGR-space image img_i is the label of the corrected i-th frame face image to be detected src'_i;
Step S5: extracting an eyebrow region image and a mouth region image from the BGR-space image img_i according to the 68 face key points, and processing the eyebrow region image and the mouth region image to obtain a final target image;
Step S6: dividing the final target images of the plurality of face videos to be detected into training target images and test target images, constructing a facial expression recognition model from the training target images and their corresponding labels, and inputting the test target images into the facial expression recognition model to obtain a facial expression recognition result.
2. The facial expression detection method based on dense optical flow features in video according to claim 1, wherein steps S2 and S3 further comprise:
performing frame-by-frame face detection with RetinaFace on the plurality of frames of face images to be detected in the face video to be detected S = {src_1, ..., src_i, ..., src_N}, to obtain a face bounding box set bbox = {bbox_1, ..., bbox_i, ..., bbox_N} and a 5-point face key point set lmk = {lmk_1, ..., lmk_i, ..., lmk_N}, wherein src_i is the i-th frame face image to be detected in the face video to be detected, bbox_i is the face bounding box of the i-th frame, lmk_i is the 5-point face key points of the i-th frame face image to be detected, and N is the total number of frames of face images to be detected in the face video to be detected;
performing face alignment on the plurality of frames of face images to be detected in the face video to be detected according to the 5-point face key points lmk_i and a transformation matrix M, to obtain a corrected face image set S' = {src'_1, ..., src'_i, ..., src'_N} and a corrected face bounding box set bbox' = {bbox'_1, ..., bbox'_i, ..., bbox'_N}, wherein src'_i is the corrected i-th frame face image to be detected, src'_i ∈ 224×224×3, bbox'_i is the face bounding box of the corrected i-th frame face image to be detected, and the transformation matrix M is computed as shown in formula (1);
performing face key point detection with the 3DDFA_V2 algorithm on the face image within the corrected face bounding box bbox'_i, to obtain a 68-point face key point set lmk' = {lmk'_1, ..., lmk'_i, ..., lmk'_N}, wherein lmk'_i denotes the 68 face key points within the face bounding box bbox'_i of the corrected i-th frame face image to be detected.
3. The facial expression detection method based on dense optical flow features in video according to claim 2, wherein step S4 further comprises:
when the corrected i-th frame face image to be detected src'_i falls within a facial expression interval, its label is defined as label_i = 1, otherwise label_i = 0, wherein the value of k is chosen as half the average length of a facial expression in the face video data set to be detected and is computed as shown in formula (2):
k = (1/2) · (1/T) · Σ_{j=1}^{T} n_j    (2)
wherein T is the total number of videos in the face video data set to be detected, and n_j is the number of frames containing a facial expression in the j-th face video to be detected.
4. The facial expression detection method based on dense optical flow features in video according to claim 3, wherein step S5 further comprises:
the eyebrow region image ROI_1 is the image of the region delimited by a coordinate frame computed as shown in formula (3), wherein ω_1 is the number of padding pixels;
the mouth region image ROI_2 is the image of the region delimited by a coordinate frame computed as shown in formula (4), wherein ω_2 is the number of padding pixels;
the two region images are normalized to size H×W and then combined to obtain the final target image, wherein H and W are the normalized height and width, respectively.
5. The facial expression detection method based on dense optical flow features in video according to claim 4, wherein in step S6 the plurality of training target images form a training set IMG_train and the plurality of test target images form a test set IMG_test, and step S6 further comprises:
Step S61: inputting the test target images of the j-th test set IMG_test·j into the facial expression recognition model to obtain, for each test target image, a predicted label label'_{i·j} ∈ {0, 1} and a confidence value_{i·j} ∈ [0, 1]; computing the facial expression score s_{i·j} of the i-th frame test target image in the j-th test set IMG_test·j according to formula (5); the facial expression score set of the N frames of test target images in the j-th test set IMG_test·j is then S_j = {s_{0·j}, ..., s_{i·j}, ..., s_{N·j}};
s_{i·j} = value_{i·j} * label'_{i·j}    (5)
Step S62: smoothing the facial expression score set S_j of the N frames of test target images in the j-th test set IMG_test·j with Savitzky-Golay convolution so that it becomes a continuous curve S'_j;
Step S63: using a dynamic threshold T as the score threshold of the curve S'_j, wherein T is computed as shown in formula (6), S_mean is the mean of the facial expression scores in the facial expression score set S_j, S_max is the maximum facial expression score in the facial expression score set S_j, and η is a weight coefficient;
T = S_mean + η * (S_max - S_mean)    (6)
Step S64: finding the peak points of the curve S'_j, with the threshold T and the nearest-neighbour distance k as constraints: a peak of the curve S'_j is a target facial expression peak if and only if its value is greater than the threshold T and its distance from adjacent peak points is greater than k; the target peaks satisfying these constraints are grouped by adjacent intervals to obtain the final set of predicted facial expression label intervals;
Step S65: when the overlap (IOU) between a predicted interval in the facial expression label interval set and a real interval is greater than or equal to 0.5, the predicted interval in the facial expression label interval set is judged to be correct.
CN202111171053.6A 2021-10-08 2021-10-08 Facial expression detection method based on dense optical flow features in video Active CN113902774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111171053.6A CN113902774B (en) Facial expression detection method based on dense optical flow features in video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111171053.6A CN113902774B (en) Facial expression detection method based on dense optical flow features in video

Publications (2)

Publication Number Publication Date
CN113902774A CN113902774A (en) 2022-01-07
CN113902774B true CN113902774B (en) 2024-04-02

Family

ID=79190355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111171053.6A Active CN113902774B (en) Facial expression detection method based on dense optical flow features in video

Country Status (1)

Country Link
CN (1) CN113902774B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117153403A (en) * 2023-09-13 2023-12-01 安徽爱学堂教育科技有限公司 Mental health evaluation method based on micro-expressions and physical indexes

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991348A (en) * 2019-12-05 2020-04-10 河北工业大学 Face micro-expression detection method based on optical flow gradient amplitude characteristics
CN112766159A (en) * 2021-01-20 2021-05-07 重庆邮电大学 Cross-database micro-expression identification method based on multi-feature fusion
CN113158978A (en) * 2021-05-14 2021-07-23 无锡锡商银行股份有限公司 Risk early warning method for micro-expression recognition in video auditing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514432B (en) * 2012-06-25 2017-09-01 诺基亚技术有限公司 Face feature extraction method, equipment and computer program product

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991348A (en) * 2019-12-05 2020-04-10 河北工业大学 Face micro-expression detection method based on optical flow gradient amplitude characteristics
CN112766159A (en) * 2021-01-20 2021-05-07 重庆邮电大学 Cross-database micro-expression identification method based on multi-feature fusion
CN113158978A (en) * 2021-05-14 2021-07-23 无锡锡商银行股份有限公司 Risk early warning method for micro-expression recognition in video auditing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of a Facial Expression Recognition System Based on LBP and SVM; 姚丽莎; 张军委; 房波; 张绍雷; 周欢; 赵凤; Journal of Guizhou Normal University (Natural Science Edition); 2020-01-15 (Issue 01); full text *
Micro-expression Recognition Based on the Mean Optical Flow Direction Histogram Descriptor; 马浩原; 安高云; 阮秋琦; Journal of Signal Processing; 2018-03-25 (Issue 03); full text *

Also Published As

Publication number Publication date
CN113902774A (en) 2022-01-07

Similar Documents

Publication Publication Date Title
CN112506342B (en) Man-machine interaction method and system based on dynamic gesture recognition
Najibi et al. G-cnn: an iterative grid based object detector
WO2021043168A1 (en) Person re-identification network training method and person re-identification method and apparatus
CN106778687B (en) Fixation point detection method based on local evaluation and global optimization
CN104616316B (en) Personage's Activity recognition method based on threshold matrix and Fusion Features vision word
CN105718873B (en) Stream of people's analysis method based on binocular vision
CN105740758A (en) Internet video face recognition method based on deep learning
KR20160101973A (en) System and method for identifying faces in unconstrained media
CN112800903B (en) Dynamic expression recognition method and system based on space-time diagram convolutional neural network
MX2013002904A (en) Person image processing apparatus and person image processing method.
CN104700078B (en) A kind of robot scene recognition methods based on scale invariant feature extreme learning machine
CN110889865B (en) Video target tracking method based on local weighted sparse feature selection
CN107862240A (en) A kind of face tracking methods of multi-cam collaboration
CN111028216A (en) Image scoring method and device, storage medium and electronic equipment
CN110705366A (en) Real-time human head detection method based on stair scene
CN113902774B (en) Facial expression detection method based on dense optical flow features in video
Bian et al. Conditional adversarial consistent identity autoencoder for cross-age face synthesis
CN113762009B (en) Crowd counting method based on multi-scale feature fusion and double-attention mechanism
CN111144220B (en) Personnel detection method, device, equipment and medium suitable for big data
CN112487948A (en) Multi-space fusion-based concentration perception method for learner in learning process
CN110766093A (en) Video target re-identification method based on multi-frame feature fusion
Fan et al. A high-precision correction method in non-rigid 3D motion poses reconstruction
Yang et al. Video system for human attribute analysis using compact convolutional neural network
CN107016675A (en) A kind of unsupervised methods of video segmentation learnt based on non local space-time characteristic
CN113901915B (en) Expression detection method of light-weight network and MagFace in video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant