CN112183419A - Micro-expression classification method based on optical flow generation network and reordering - Google Patents

Micro-expression classification method based on optical flow generation network and reordering

Info

Publication number
CN112183419A
CN112183419A (application CN202011070119.8A)
Authority
CN
China
Prior art keywords
optical flow
feature
frame
network
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011070119.8A
Other languages
Chinese (zh)
Other versions
CN112183419B (en)
Inventor
柯逍
林艳
王俊强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority claimed from CN202011070119.8A (granted as CN112183419B)
Publication of CN112183419A
Application granted
Publication of CN112183419B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a micro-expression classification method based on an optical flow generation network and reordering. First, a micro-expression data set is acquired, the start frame and the peak frame are extracted, and the frames are preprocessed; an optical flow generation network is then trained, and optical flow features are generated from all start frames and peak frames; the resulting optical flow images are divided into corresponding training and test sets according to the LOSO principle and input into a residual network for training; finally, the preliminary classification results of the residual network are reordered to obtain a final result with higher precision.

Description

Micro-expression classification method based on optical flow generation network and reordering
Technical Field
The invention relates to the field of pattern recognition and computer vision, in particular to a micro-expression classification method based on optical flow generation network and reordering.
Background
In the field of affective computing, facial expressions are often studied to judge a person's current emotion. Humans, however, sometimes disguise or hide their emotions, and in such cases no useful information can be obtained from the macroscopic facial expression. To mine useful information from disguised facial expressions, Ekman discovered a transient, involuntary, rapid facial emotion, the micro-expression, which appears involuntarily on the face when a person tries to hide a genuine emotion. A typical micro-expression lasts between 1/25 and 1/5 of a second and usually appears only in a specific part of the face.
Micro-expressions hold great promise for national security, criminal interrogation and medical applications, but their subtlety and brevity pose a great challenge to the human eye; in recent years, therefore, much work has been proposed to recognize micro-expressions automatically using computer vision and machine learning algorithms.
Disclosure of Invention
The invention aims to provide a micro-expression classification method based on an optical flow generation network and reordering, which can effectively classify micro-expression images.
In order to achieve the above purpose, the technical solution of the invention is as follows: a micro-expression classification method based on optical flow generation network and reordering, comprising the following steps:
Step S1: acquiring a micro-expression data set, extracting the start frame and the peak frame, and preprocessing;
Step S2: training an optical flow generation network and generating optical flow features from all start frames and peak frames;
Step S3: dividing the obtained optical flow images into a training set and a test set according to the LOSO principle and inputting them into a residual network for training;
Step S4: reordering the classification results obtained by the residual network to obtain a final result with higher precision.
In an embodiment of the present invention, the step S1 specifically includes the following steps:
Step S11: acquiring a micro-expression data set and cropping the images to 224 × 224 after face alignment;
Step S12: for micro-expression data sets annotated with start and peak frames, extracting the start frame and the peak frame directly from the annotations, then executing step S15;
Step S13: for micro-expression data sets without start-frame and peak-frame annotations, extracting the start frame and the peak frame of the video sequence with a frame difference method. The frame difference method is as follows: let P = {p_i}, i = 1, 2, …, denote the input image sequence, where p_i is the i-th input frame; the first frame of the sequence is taken as the start frame, i.e. p_start = p_1. Denote the gray values of corresponding pixels in the first frame and the n-th frame of the video sequence by f1(x, y) and fn(x, y); subtracting the gray values of corresponding pixels of the two frames and taking the absolute value gives the difference image Dn, i.e. Dn(x, y) = |fn(x, y) - f1(x, y)|. The average inter-frame difference Dnavg of the difference image is then computed as:
Dnavg = (Σx Σy Dn(x, y)) / (Dn.shape[0] × Dn.shape[1])
where Dn.shape[0] denotes the height of the difference image Dn and Dn.shape[1] denotes its width. The average inter-frame difference between the start frame and every other frame is computed and sorted; the frame with the largest average inter-frame difference is the peak frame p_apex of the image sequence. After the start frame and peak frame are extracted, step S15 is executed;
Step S15: performing Eulerian motion magnification on the extracted start frame and peak frame, computed as:
I(x, t) = g(x + (1 + α)δ(t))
where I(x, t) denotes the brightness of the image at position x and time t, g(·) is the mapping function of the Eulerian motion magnification process, and δ(t) is the motion displacement; the magnified image is generated by adjusting the motion magnification factor α.
In an embodiment of the present invention, in step S2 the optical flow generation network performs point-to-point pixel-level training using a structure of two sub-networks arranged in parallel: one sub-network performs optical flow estimation for large displacements, the other performs optical flow estimation for small displacements; each sub-network consists of a feature extraction module and an optical flow estimation module, and the optical flow estimates obtained by the two sub-networks are finally fused to obtain the final optical flow image;
S21: for the large-displacement optical flow estimation sub-network, the feature extraction module consists of nine convolution layers; its input is the superposition of the magnified input image pair. Let H(·) be the feature mapping function of the feature extraction module; the computation is:
feature_big = H(p_ls + p_la)
where p_ls denotes the magnified start frame, p_la denotes the magnified peak frame, and feature_big denotes the motion-feature extraction result for large displacements;
the optical flow estimation module of the large-displacement optical flow estimation sub-network consists of up-pooling layers and convolution layers; the feature feature_big output by the feature extraction module is superposed with feature_big4, the feature from layer 5 - 1 = 4 of the feature extraction module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of layer 1 of the optical flow estimation module; the computation is:
feature_Bflow1 = estimate(feature_big + feature_big4)
where feature_Bflow1 denotes the features output by layer 1 of the optical flow estimation module of the large-displacement sub-network, and estimate(·) denotes the mapping function of layer 1 of the optical flow estimation module of the large-displacement sub-network;
then, for the remaining layers 2-4 of the optical flow estimation module, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Bflowi = estimate(feature_big + feature_big(5-i) + feature_Bflow(i-1))
where feature_Bflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the large-displacement sub-network;
S22: for the small-displacement optical flow estimation sub-network, the feature extraction module also consists of nine convolution layers; the first three convolutions extract features separately from the input start frame p_s and peak frame p_a, which are not motion-magnified, and the input of the last six convolutions is the superposition of the outputs of the first three convolutions on the two image frames. Let first(·) be the mapping function of the first three convolutions of the feature extraction module and last(·) the mapping function of the last six convolutions; the computation is:
feature_small = last(first(p_s) + first(p_a))
where feature_small denotes the motion-feature extraction result for small displacements;
the optical flow estimation module of the small-displacement optical flow estimation sub-network consists of up-pooling layers and convolution layers; the feature feature_small output by the feature extraction module is superposed with feature_small5, the feature from layer 6 - 1 = 5 of the feature extraction module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of layer 1 of the optical flow estimation module; the computation is:
feature_Sflow1 = estimate(feature_small + feature_small5)
where feature_Sflow1 denotes the features output by layer 1 of the optical flow estimation module of the small-displacement sub-network, and estimate(·) denotes the mapping function of layer 1 of the optical flow estimation module of the small-displacement sub-network;
then, for the remaining layers 2-5 of the optical flow estimation module, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Sflowi = estimate(feature_small + feature_small(6-i) + feature_Sflow(i-1))
where feature_Sflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the small-displacement sub-network;
S23: the results obtained by the large-displacement and small-displacement optical flow estimation sub-networks are fused to obtain the final output. Let fusion(·) denote the final fusion operation; the computation is:
p_fusion = fusion(feature_Bflow4 + feature_Sflow5).
In an embodiment of the present invention, the step S3 specifically includes the following steps:
Step S31: each optical flow image data set contains multiple subjects, each subject representing one participant, and each subject contains multiple micro-expression sequences, i.e. the micro-expression sequences produced by that participant; following the leave-one-subject-out principle, when the data set is divided, one subject of the data set is taken as the test set and all remaining subjects are combined into the training set, so a data set finally yields Sub_i training-set/test-set splits, where Sub_i denotes the number of subjects in that data set;
Step S32: inputting the divided training set and test set in turn into a residual network for classification to obtain preliminary classification results.
In an embodiment of the present invention, the step S4 specifically includes the following steps:
Step S41: in the preliminary classification results obtained by training the residual network, the probabilities of two classes for the same image may be very close, so the results need to be reordered;
Step S42: for a test image whose top classification probabilities are close, selecting the training set images belonging to those candidate classes and selecting k nearest neighbours among the selected images, computed as follows:
[formulas given as images in the original document]
where e_i denotes the i-th selected training set image and p denotes the test image;
Step S43: computing the probability-based distance D_i between the test image p and each selected image e_i, as follows:
D_i = 1 - probe_max(e_i) + probe_max(p)
where probe_max denotes the largest probability among the classification result probabilities;
Step S44: for each selected training set image e_i, computing the Jaccard distance D_j between it and the test image p, as follows:
[formulas given as images in the original document]
the final distance result is obtained by weighting D_i and D_j;
reclassifying the micro-expression images by reordering reduces misclassification caused by two class probabilities being too close during micro-expression recognition and improves the accuracy of micro-expression recognition.
Compared with the prior art, the invention has the following beneficial effects:
1. The constructed micro-expression classification method based on an optical flow generation network and reordering can effectively classify micro-expression images and improves the classification of micro-expression images.
2. The method generates the optical flow estimate between two frames with a neural network; compared with traditional optical flow generation methods it is more robust, performs better and produces clearer boundaries.
3. To address the difficulty of distinguishing two expression classes in conventional micro-expression recognition, the invention reorders test images whose classification probabilities are similar, which classifies individual micro-expressions better and improves the classification result.
Drawings
Fig. 1 is a schematic diagram of the principle of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in Fig. 1, the present embodiment provides a micro-expression classification method based on an optical flow generation network and reordering, which specifically includes the following steps:
Step S1: acquiring a micro-expression data set, extracting the start frame and the peak frame, and preprocessing;
Step S2: training an optical flow generation network and generating optical flow features from all start frames and peak frames;
Step S3: dividing the obtained optical flow images into a training set and a test set according to the LOSO principle and inputting them into a residual network for training;
Step S4: finally, reordering the classification results obtained by the residual network to obtain a final result with higher precision.
In this embodiment, the step S1 includes the following steps:
step S11: acquiring a micro expression data set, and cutting an image into 224 × 224 images after face alignment;
step S12: for a micro expression data set with initial frame and peak frame labels, directly extracting the initial frame and the peak frame according to the label content;
step S13: extracting the initial frame and the peak frame of the video sequence by using a frame difference method for the micro expression data set which is not marked by the initial frame and the peak frame;
Step S14: the frame difference method is as follows. Let P = {p_i}, i = 1, 2, …, denote the input image sequence, where p_i is the i-th input frame; the first frame of the sequence is taken as the start frame, i.e. p_start = p_1. Denote the gray values of corresponding pixels in the first frame and the n-th frame of the video sequence by f1(x, y) and fn(x, y); subtracting the gray values of corresponding pixels of the two frames and taking the absolute value gives the difference image Dn, i.e. Dn(x, y) = |fn(x, y) - f1(x, y)|. The average inter-frame difference Dnavg of the difference image is then computed as:
Dnavg = (Σx Σy Dn(x, y)) / (Dn.shape[0] × Dn.shape[1])
where Dn.shape[0] denotes the height of the difference image Dn and Dn.shape[1] denotes its width. The average inter-frame difference between the start frame and every other frame is computed and sorted; the frame with the largest average inter-frame difference is the peak frame p_apex of the image sequence.
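As an illustration, the frame-difference selection of the peak frame can be sketched in Python as follows; the function name and the use of OpenCV and NumPy are assumptions of the sketch, not part of the patent.

```python
import cv2
import numpy as np

def find_apex_frame(frames):
    """Pick the peak (apex) frame of a clip by frame differencing against the start frame.
    `frames` is a list of BGR images; the first frame is the start frame p_start = p_1."""
    f1 = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY).astype(np.float32)
    best_idx, best_avg = 1, -1.0
    for n in range(1, len(frames)):
        fn = cv2.cvtColor(frames[n], cv2.COLOR_BGR2GRAY).astype(np.float32)
        dn = np.abs(fn - f1)                               # Dn(x, y) = |fn(x, y) - f1(x, y)|
        avg = dn.sum() / (dn.shape[0] * dn.shape[1])       # average inter-frame difference Dnavg
        if avg > best_avg:
            best_avg, best_idx = avg, n
    return best_idx                                        # index of the peak frame p_apex
```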
Step S15: and carrying out Euler action amplification on the processed initial frame and the processed peak frame, wherein the calculation process is as follows:
I(x,t)=g(x+(1+α)(t))
wherein I (x, t) represents the brightness value of the image at position x and time t, g (-) represents the mapping function of Euler motion amplification process, and (t) represents the motion deviation, the method adjusts the motion amplification coefficient
Figure BDA0002714591380000062
To generate an enlarged image.
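Full Eulerian video magnification involves spatial decomposition and temporal filtering; for a single start/peak frame pair, a crude first-order sketch of the magnification step might look like the following, where amplifying the intensity difference linearly is a simplifying assumption rather than the patent's exact procedure.

```python
import numpy as np

def magnify_pair(start, apex, alpha=2.0):
    """First-order approximation of I(x, t) = g(x + (1 + alpha) * delta(t)) for two frames:
    the change from the start frame to the peak frame is amplified by (1 + alpha)."""
    start = start.astype(np.float32)
    apex = apex.astype(np.float32)
    magnified = start + (1.0 + alpha) * (apex - start)     # amplify the motion-induced change
    return np.clip(magnified, 0, 255).astype(np.uint8)
```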
In this embodiment, step S2 specifically includes the following steps:
Step S21: the optical flow generation network performs point-to-point pixel-level training using a structure of two sub-networks arranged in parallel: one sub-network is dedicated to optical flow estimation for large displacements, the other to optical flow estimation for small displacements; each sub-network consists of a feature extraction module and an optical flow estimation module, and the optical flow estimates obtained by the two sub-networks are finally fused to obtain the final optical flow image;
Step S22: for the sub-network dedicated to optical flow estimation for large displacements, the feature extraction module consists mainly of nine convolution layers; its input is the superposition of the magnified input image pair. Let H(·) be the feature mapping function of this module; the computation is:
feature_big = H(p_ls + p_la)
where p_ls denotes the magnified start frame, p_la denotes the magnified peak frame, and feature_big denotes the feature extraction result for large-displacement motion.
The optical flow estimation module of the large-displacement optical flow estimation sub-network consists mainly of up-pooling layers and convolution layers; the feature feature_big output by the previous module is superposed with feature_big4, the feature from layer 5 - 1 = 4 of that module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of the first layer; the computation is:
feature_Bflow1 = estimate(feature_big + feature_big4)
where feature_Bflow1 denotes the features output by layer 1 of the optical flow estimation module of the large-displacement sub-network, and estimate(·) denotes the mapping function of that layer;
then, for the remaining layers 2-4, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Bflowi = estimate(feature_big + feature_big(5-i) + feature_Bflow(i-1))
where feature_Bflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the large-displacement sub-network;
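For illustration, a rough PyTorch sketch of such a large-displacement branch is given below; the patent specifies only the layer counts and the coarse skip/refinement structure, so the channel width, stride pattern, activation function and bilinear up-sampling used here are assumptions of the sketch, not the patent's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBlock(nn.Module):
    """3x3 convolution + LeakyReLU; stride 2 halves the resolution when downsample=True."""
    def __init__(self, in_ch, out_ch, downsample=False):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=2 if downsample else 1, padding=1)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return self.act(self.conv(x))

class LargeDispBranch(nn.Module):
    """Sketch of the large-displacement sub-network: nine convolution layers extract features
    from the stacked (magnified) frame pair, then four estimation stages up-sample and refine
    the flow while reusing intermediate features."""
    def __init__(self, width=64):
        super().__init__()
        layers = [ConvBlock(6, width, downsample=True)]              # stacked pair: 2 x 3 channels (assumed)
        layers += [ConvBlock(width, width, downsample=(i % 2 == 0)) for i in range(1, 9)]
        self.extract = nn.ModuleList(layers)
        # stage i sees the top feature, the skip feature feature_big(5-i), and the previous flow (2 channels)
        self.estimate = nn.ModuleList(
            nn.Conv2d(2 * width + (0 if i == 0 else 2), 2, 3, padding=1) for i in range(4)
        )

    def forward(self, pair):
        feats, x = [], pair
        for layer in self.extract:
            x = layer(x)
            feats.append(x)                                          # feature_big1 ... feature_big9
        feature_big, flow = feats[-1], None
        for i, head in enumerate(self.estimate):                     # estimation layers 1..4
            skip = feats[3 - i]                                      # feature_big(5-i), zero-based index
            top = F.interpolate(feature_big, size=skip.shape[-2:], mode="bilinear", align_corners=False)
            parts = [top, skip]
            if flow is not None:
                parts.append(F.interpolate(flow, size=skip.shape[-2:], mode="bilinear", align_corners=False))
            flow = head(torch.cat(parts, dim=1))                     # refined optical flow estimate
        return flow                                                  # corresponds to feature_Bflow4
```

A forward pass on a magnified 224 × 224 frame pair, e.g. `LargeDispBranch()(torch.randn(1, 6, 224, 224))`, returns a two-channel flow map.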
Step S23: for the sub-network dedicated to optical flow estimation for small displacements, the feature extraction module consists mainly of nine convolution layers; the first three convolutions extract features separately from the input start frame p_s and peak frame p_a, which are not motion-magnified, and the input of the last six convolutions is the superposition of the outputs of the first three convolutions on the two image frames. Let first(·) be the mapping function of the first three convolutions of this module and last(·) the mapping function of the last six convolutions; the computation is:
feature_small = last(first(p_s) + first(p_a))
where feature_small denotes the feature extraction result for small-displacement motion.
The optical flow estimation module of the small-displacement optical flow estimation sub-network consists mainly of up-pooling layers and convolution layers; the feature feature_small output by the previous module is superposed with feature_small5, the feature from layer 6 - 1 = 5 of that module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of the first layer; the computation is:
feature_Sflow1 = estimate(feature_small + feature_small5)
where feature_Sflow1 denotes the features output by layer 1 of the optical flow estimation module of the small-displacement sub-network, and estimate(·) denotes the mapping function of that layer;
then, for the remaining layers 2-5, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Sflowi = estimate(feature_small + feature_small(6-i) + feature_Sflow(i-1))
where feature_Sflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the small-displacement sub-network;
Step S24: finally, the results obtained by the two sub-networks are fused to obtain the final output. Let fusion(·) denote the final fusion operation; the computation is:
p_fusion = fusion(feature_Bflow4 + feature_Sflow5)
Using a convolutional neural network to model and fuse the optical flow estimates for large and small displacements helps improve the generalization of the model and adapts more reasonably to micro-expression segments whose changes are too large or too small; compared with traditional methods, this implementation reduces the edge blurring that can arise during micro-expression optical flow estimation and makes the optical flow estimate more accurate.
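Continuing the sketch above, the two branches and the fusion step could be wired together roughly as follows (it assumes the LargeDispBranch class from the previous sketch); reusing that class for the small-displacement branch and fusing with a single convolution are simplifications — the patent gives the small-displacement branch separate per-frame convolutions for its first three layers and five estimation stages.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowGenerationNet(nn.Module):
    """Two-branch optical flow generation network: the large-displacement branch sees the
    motion-magnified frame pair, the small-displacement branch the original pair, and a
    convolutional head fuses the two flow estimates into the final optical flow image."""
    def __init__(self):
        super().__init__()
        self.big = LargeDispBranch()       # large-displacement branch (class from the sketch above)
        self.small = LargeDispBranch()     # stand-in for the small-displacement branch
        self.fuse = nn.Conv2d(4, 2, 3, padding=1)    # fusion(.) over the two stacked flows

    def forward(self, pair_magnified, pair_original):
        flow_big = self.big(pair_magnified)          # ~ feature_Bflow4
        flow_small = self.small(pair_original)       # ~ feature_Sflow5
        size = pair_original.shape[-2:]
        flow_big = F.interpolate(flow_big, size=size, mode="bilinear", align_corners=False)
        flow_small = F.interpolate(flow_small, size=size, mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([flow_big, flow_small], dim=1))   # ~ p_fusion
```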
In this embodiment, step S3 specifically includes the following steps:
Step S31: each data set contains multiple subjects, each subject representing one participant, and each subject contains multiple micro-expression sequences, i.e. the micro-expression sequences produced by that participant. Following the leave-one-subject-out principle, when the data set is divided, one subject of the data set is taken as the test set each time and all remaining subjects are combined into the training set; a data set therefore finally yields Sub_i training-set/test-set splits, where Sub_i denotes the number of subjects in that data set.
Step S32: inputting the divided training set and test set in turn into a residual network for classification to obtain preliminary classification results;
in this embodiment, step S4 specifically includes the following steps:
Step S41: in the preliminary classification results there may be cases where the probabilities of two classes for the same image are very close, which requires the results to be reordered;
Step S42: for a test image whose top classification probabilities are close, selecting the training set images belonging to those candidate classes and selecting k nearest neighbours among the selected images, computed as follows:
[formulas given as images in the original document]
where e_i denotes the i-th selected training set image and p denotes the test image.
Step S43: computing the probability-based distance D_i between the test image p and each selected image e_i, as follows:
D_i = 1 - probe_max(e_i) + probe_max(p)
where probe_max denotes the largest probability among the classification result probabilities.
Step S44: for each selected training set image e_i, computing the Jaccard distance D_j between it and the test image p, as follows:
[formulas given as images in the original document]
weighting D_i and D_j then yields the final distance result.
Reclassifying the micro-expression images by reordering greatly reduces misclassification caused by two class probabilities being too close during micro-expression recognition and improves the accuracy of micro-expression recognition.
The above are preferred embodiments of the present invention; all changes made according to the technical solution of the present invention, as long as the functional effects they produce do not go beyond the scope of the technical solution of the present invention, fall within the protection scope of the present invention.

Claims (5)

1. A micro-expression classification method based on optical flow generation network and reordering is characterized by comprising the following steps:
Step S1: acquiring a micro-expression data set, extracting the start frame and the peak frame, and preprocessing;
Step S2: training an optical flow generation network and generating optical flow features from all start frames and peak frames;
Step S3: dividing the obtained optical flow images into a training set and a test set according to the LOSO principle and inputting them into a residual network for training;
Step S4: reordering the classification results obtained by the residual network to obtain a final result with higher precision.
2. The micro-expression classification method based on optical flow generation network and reordering according to claim 1, wherein the step S1 specifically comprises the following steps:
Step S11: acquiring a micro-expression data set and cropping the images to 224 × 224 after face alignment;
Step S12: for micro-expression data sets annotated with start and peak frames, extracting the start frame and the peak frame directly from the annotations, then executing step S15;
Step S13: for micro-expression data sets without start-frame and peak-frame annotations, extracting the start frame and the peak frame of the video sequence with a frame difference method; the frame difference method is as follows: let P = {p_i}, i = 1, 2, …, denote the input image sequence, where p_i is the i-th input frame; the first frame of the sequence is taken as the start frame, i.e. p_start = p_1; denote the gray values of corresponding pixels in the first frame and the n-th frame of the video sequence by f1(x, y) and fn(x, y); subtracting the gray values of corresponding pixels of the two frames and taking the absolute value gives the difference image Dn, i.e. Dn(x, y) = |fn(x, y) - f1(x, y)|; the average inter-frame difference Dnavg of the difference image is computed as:
Dnavg = (Σx Σy Dn(x, y)) / (Dn.shape[0] × Dn.shape[1])
where Dn.shape[0] denotes the height of the difference image Dn and Dn.shape[1] denotes its width; the average inter-frame difference between the start frame and every other frame is computed and sorted, and the frame with the largest average inter-frame difference is the peak frame p_apex of the image sequence; after extracting the start frame and the peak frame, executing step S15;
Step S15: performing Eulerian motion magnification on the extracted start frame and peak frame, computed as:
I(x, t) = g(x + (1 + α)δ(t))
where I(x, t) denotes the brightness of the image at position x and time t, g(·) is the mapping function of the Eulerian motion magnification process, and δ(t) is the motion displacement; the magnified image is generated by adjusting the motion magnification factor α.
3. The micro-expression classification method based on optical flow generation network and reordering according to claim 1, wherein in step S2 the optical flow generation network performs point-to-point pixel-level training using a structure of two sub-networks arranged in parallel: one sub-network performs optical flow estimation for large displacements, the other performs optical flow estimation for small displacements; each sub-network consists of a feature extraction module and an optical flow estimation module, and the optical flow estimates obtained by the two sub-networks are finally fused to obtain the final optical flow image;
S21: for the large-displacement optical flow estimation sub-network, the feature extraction module consists of nine convolution layers; its input is the superposition of the magnified input image pair. Let H(·) be the feature mapping function of the feature extraction module; the computation is:
feature_big = H(p_ls + p_la)
where p_ls denotes the magnified start frame, p_la denotes the magnified peak frame, and feature_big denotes the motion-feature extraction result for large displacements;
the optical flow estimation module of the large-displacement optical flow estimation sub-network consists of up-pooling layers and convolution layers; the feature feature_big output by the feature extraction module is superposed with feature_big4, the feature from layer 5 - 1 = 4 of the feature extraction module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of layer 1 of the optical flow estimation module; the computation is:
feature_Bflow1 = estimate(feature_big + feature_big4)
where feature_Bflow1 denotes the features output by layer 1 of the optical flow estimation module of the large-displacement sub-network, and estimate(·) denotes the mapping function of layer 1 of the optical flow estimation module of the large-displacement sub-network;
then, for the remaining layers 2-4 of the optical flow estimation module, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Bflowi = estimate(feature_big + feature_big(5-i) + feature_Bflow(i-1))
where feature_Bflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the large-displacement sub-network;
S22: for the small-displacement optical flow estimation sub-network, the feature extraction module also consists of nine convolution layers; the first three convolutions extract features separately from the input start frame p_s and peak frame p_a, which are not motion-magnified, and the input of the last six convolutions is the superposition of the outputs of the first three convolutions on the two image frames. Let first(·) be the mapping function of the first three convolutions of the feature extraction module and last(·) the mapping function of the last six convolutions; the computation is:
feature_small = last(first(p_s) + first(p_a))
where feature_small denotes the motion-feature extraction result for small displacements;
the optical flow estimation module of the small-displacement optical flow estimation sub-network consists of up-pooling layers and convolution layers; the feature feature_small output by the feature extraction module is superposed with feature_small5, the feature from layer 6 - 1 = 5 of the feature extraction module, and then up-pooling, optical flow estimation and restoration of the optical flow image resolution are carried out to obtain the result of layer 1 of the optical flow estimation module; the computation is:
feature_Sflow1 = estimate(feature_small + feature_small5)
where feature_Sflow1 denotes the features output by layer 1 of the optical flow estimation module of the small-displacement sub-network, and estimate(·) denotes the mapping function of layer 1 of the optical flow estimation module of the small-displacement sub-network;
then, for the remaining layers 2-5 of the optical flow estimation module, the result computed by the previous layer is added to the input of the next layer; the computation is:
feature_Sflowi = estimate(feature_small + feature_small(6-i) + feature_Sflow(i-1))
where feature_Sflow(i-1) denotes the features output by layer i - 1 of the optical flow estimation module of the small-displacement sub-network;
S23: the results obtained by the large-displacement and small-displacement optical flow estimation sub-networks are fused to obtain the final output. Let fusion(·) denote the final fusion operation; the computation is:
p_fusion = fusion(feature_Bflow4 + feature_Sflow5).
4. The micro-expression classification method based on optical flow generation network and reordering according to claim 1, wherein the step S3 specifically comprises the following steps:
Step S31: each optical flow image data set contains multiple subjects, each subject representing one participant, and each subject contains multiple micro-expression sequences, i.e. the micro-expression sequences produced by that participant; following the leave-one-subject-out principle, when the data set is divided, one subject of the data set is taken as the test set and all remaining subjects are combined into the training set, so a data set finally yields Sub_i training-set/test-set splits, where Sub_i denotes the number of subjects in that data set;
Step S32: inputting the divided training set and test set in turn into a residual network for classification to obtain preliminary classification results.
5. The micro-expression classification method based on optical flow generation network and reordering according to claim 1, wherein the step S4 specifically comprises the following steps:
Step S41: in the preliminary classification results obtained by training the residual network, the probabilities of two classes for the same image may be very close, so the results need to be reordered;
Step S42: for a test image whose top classification probabilities are close, selecting the training set images belonging to those candidate classes and selecting k nearest neighbours among the selected images, computed as follows:
[formulas given as images in the original document]
where e_i denotes the i-th selected training set image and p denotes the test image;
Step S43: computing the probability-based distance D_i between the test image p and each selected image e_i, as follows:
D_i = 1 - probe_max(e_i) + probe_max(p)
where probe_max denotes the largest probability among the classification result probabilities;
Step S44: for each selected training set image e_i, computing the Jaccard distance D_j between it and the test image p, as follows:
[formulas given as images in the original document]
weighting D_i and D_j to obtain the final distance result;
reclassifying the micro-expression images by reordering reduces misclassification caused by two class probabilities being too close during micro-expression recognition and improves the accuracy of micro-expression recognition.
CN202011070119.8A 2020-10-09 2020-10-09 Micro-expression classification method based on optical flow generation network and reordering Active CN112183419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011070119.8A CN112183419B (en) 2020-10-09 2020-10-09 Micro-expression classification method based on optical flow generation network and reordering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011070119.8A CN112183419B (en) 2020-10-09 2020-10-09 Micro-expression classification method based on optical flow generation network and reordering

Publications (2)

Publication Number Publication Date
CN112183419A true CN112183419A (en) 2021-01-05
CN112183419B CN112183419B (en) 2022-06-10

Family

ID=73948334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011070119.8A Active CN112183419B (en) 2020-10-09 2020-10-09 Micro-expression classification method based on optical flow generation network and reordering

Country Status (1)

Country Link
CN (1) CN112183419B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591660A (en) * 2021-07-24 2021-11-02 中国石油大学(华东) Micro-expression recognition method based on meta-learning
CN114005157A (en) * 2021-10-15 2022-02-01 武汉烽火信息集成技术有限公司 Micro-expression recognition method of pixel displacement vector based on convolutional neural network
CN117392727A (en) * 2023-11-02 2024-01-12 长春理工大学 Facial micro-expression recognition method based on contrast learning and feature decoupling

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830223A (en) * 2018-06-19 2018-11-16 山东大学 A kind of micro- expression recognition method based on batch mode Active Learning
US10423773B1 (en) * 2019-04-12 2019-09-24 Coupang, Corp. Computerized systems and methods for determining authenticity using micro expressions
CN110532950A (en) * 2019-08-29 2019-12-03 中国科学院自动化研究所 Video feature extraction method, micro- expression recognition method based on micro- expression video
CN111626179A (en) * 2020-05-24 2020-09-04 中国科学院心理研究所 Micro-expression detection method based on optical flow superposition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830223A (en) * 2018-06-19 2018-11-16 山东大学 A kind of micro- expression recognition method based on batch mode Active Learning
US10423773B1 (en) * 2019-04-12 2019-09-24 Coupang, Corp. Computerized systems and methods for determining authenticity using micro expressions
CN110532950A (en) * 2019-08-29 2019-12-03 中国科学院自动化研究所 Video feature extraction method, micro- expression recognition method based on micro- expression video
CN111626179A (en) * 2020-05-24 2020-09-04 中国科学院心理研究所 Micro-expression detection method based on optical flow superposition

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LI, QY (LI, QIUYU) et al.: "Facial micro-expression recognition based on the fusion of deep learning and enhanced optical flow", Multimedia Tools and Applications *
LI, QY (LI, QIUYU) et al.: "Micro-expression Analysis by Fusing Deep Convolutional Neural Network and Optical Flow", IEEE *
Li Dan et al.: "Micro-expression capture based on optical flow direction information entropy statistics", Chinese Journal of Engineering *
Su Wenchao: "Facial action unit detection and micro-expression analysis", China Master's Theses Full-text Database (Information Science and Technology) *
Su Yuting: "Micro-expression recognition algorithm based on multi-motion feature fusion", Laser & Optoelectronics Progress *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591660A (en) * 2021-07-24 2021-11-02 中国石油大学(华东) Micro-expression recognition method based on meta-learning
CN114005157A (en) * 2021-10-15 2022-02-01 武汉烽火信息集成技术有限公司 Micro-expression recognition method of pixel displacement vector based on convolutional neural network
CN114005157B (en) * 2021-10-15 2024-05-10 武汉烽火信息集成技术有限公司 Micro-expression recognition method for pixel displacement vector based on convolutional neural network
CN117392727A (en) * 2023-11-02 2024-01-12 长春理工大学 Facial micro-expression recognition method based on contrast learning and feature decoupling
CN117392727B (en) * 2023-11-02 2024-04-12 长春理工大学 Facial micro-expression recognition method based on contrast learning and feature decoupling

Also Published As

Publication number Publication date
CN112183419B (en) 2022-06-10

Similar Documents

Publication Publication Date Title
CN112183419B (en) Micro-expression classification method based on optical flow generation network and reordering
JP4743823B2 (en) Image processing apparatus, imaging apparatus, and image processing method
KR101893554B1 (en) Method and apparatus of recognizing facial expression base on multi-modal
CN112507617B (en) Training method of SRFlow super-resolution model and face recognition method
CN112837344B (en) Target tracking method for generating twin network based on condition countermeasure
CN110477907B (en) Modeling method for intelligently assisting in recognizing epileptic seizures
CN105373810B (en) Method and system for establishing motion recognition model
CN113869276B (en) Lie recognition method and system based on micro-expression
CN114219984A (en) Improved YOLOv 3-based micro pest detection system and method
CN114973383A (en) Micro-expression recognition method and device, electronic equipment and storage medium
CN113191216A (en) Multi-person real-time action recognition method and system based on gesture recognition and C3D network
CN114333046A (en) Dance action scoring method, device, equipment and storage medium
CN112200065B (en) Micro-expression classification method based on action amplification and self-adaptive attention area selection
CN111881818B (en) Medical action fine-grained recognition device and computer-readable storage medium
CN113506274A (en) Detection system for human cognitive condition based on visual saliency difference map
CN117576753A (en) Micro-expression recognition method based on attention feature fusion of facial key points
CN117541574A (en) Tongue diagnosis detection method based on AI semantic segmentation and image recognition
CN113221815A (en) Gait identification method based on automatic detection technology of skeletal key points
CN112149613A (en) Motion estimation evaluation method based on improved LSTM model
CN116758621A (en) Self-attention mechanism-based face expression depth convolution identification method for shielding people
CN116343100A (en) Target identification method and system based on self-supervision learning
CN116030516A (en) Micro-expression recognition method and device based on multi-task learning and global circular convolution
CN116416664A (en) Depression recognition system, medium and equipment combined with facial dynamic behavior
CN115424337A (en) Iris image restoration system based on priori guidance
CN114758293A (en) Deep learning population counting method based on auxiliary branch optimization and local density block enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant