CN114005157B - Micro-expression recognition method for pixel displacement vector based on convolutional neural network


Info

Publication number
CN114005157B
Authority
CN
China
Prior art keywords
displacement vector
image
pixel displacement
maximum frame
frame image
Prior art date
Legal status
Active
Application number
CN202111204917.XA
Other languages
Chinese (zh)
Other versions
CN114005157A (en)
Inventor
何双江
项金桥
董喆
方博
鄢浩
赵俭辉
赵慧娟
翟芷君
Current Assignee
Hubei Provincial People's Procuratorate
Wuhan Fiberhome Information Integration Technologies Co ltd
Original Assignee
Hubei Provincial People's Procuratorate
Wuhan Fiberhome Information Integration Technologies Co ltd
Priority date
Filing date
Publication date
Application filed by Hubei Provincial People's Procuratorate and Wuhan Fiberhome Information Integration Technologies Co ltd
Priority to CN202111204917.XA
Publication of CN114005157A
Application granted
Publication of CN114005157B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a micro-expression recognition method for pixel displacement vectors based on a convolutional neural network. An end-to-end micro-expression recognition network is established based on a pixel displacement generation module, and the processing flow based on this network comprises: selecting a maximum frame, wherein during training a frame near the original maximum frame is randomly selected as the maximum frame image; inputting the selected maximum frame image together with the starting frame image into the pixel displacement generation module, which outputs a pixel displacement vector feature map between the two images; calculating the associated loss functions, which comprises first upsampling the generated displacement vector feature map to obtain a displacement feature map, then sampling the starting frame according to it to generate an approximate maximum frame image, and calculating a reconstruction loss and a regularization loss; normalizing the generated pixel displacement vector feature map; and performing feature learning and micro-expression classification, wherein the maximum frame image is concatenated with the normalized pixel displacement vector feature map and input into a classification network to obtain a classification prediction result.

Description

Micro-expression recognition method for pixel displacement vector based on convolutional neural network
Technical Field
The invention belongs to the technical field of micro-expression recognition, and relates to a micro-expression recognition technology based on dynamic feature representation.
Background
Currently, the mainstream deep learning methods for micro-expression recognition are divided into two main categories:
The first major category performs feature extraction on each frame of the image sequence in turn and feeds the results into a temporal neural network, learning spatial distribution and temporal variation features at the same time. For example, the ELRCN network (Document 1) was proposed in recent years; its experimental results indicate that temporal and spatial features play different roles in micro-expression recognition, and that good recognition depends on an effective combination of the two.
The second major category extracts the variation features of the whole expression sequence as a feature map that is input directly into a classification network for prediction, generally classifying by the variation features between the starting frame and the maximum frame of a micro-expression segment. The feature extraction methods have been continuously improved: LBP-TOP (Document 2) was widely used early on to extract the spatiotemporal variation features of micro-expressions and serves as a reference method in the field, and a series of LBP variants were subsequently proposed to improve the quality and robustness of the extracted features. These were gradually replaced by Optical Flow (Document 3), which estimates the change of object position between two frames, characterizes the direction and magnitude of pixel motion, and extracts inter-frame motion information more robustly. Bi-WOOF (Document 4) additionally computes optical strain as a supplement to optical flow. The methods for extracting the variation features of a micro-expression segment also include the Dynamic Imaging method (Document 5) from the action recognition field, which compresses a picture sequence into a single RGB image containing the spatial features and temporal dynamics of the whole sequence.
However, at present the variation features of an expression sequence are extracted during training preprocessing; the extraction is confined to its own processing pipeline, is not fused with the deep learning network used for classification, cannot adjust the generated dynamic features according to feedback on the classification results, and therefore lacks flexibility and adaptability.
Related literature:
[Document 1] H. Khor, J. See, R. C. Phan, W. Lin, "Enriched Long-term Recurrent Convolutional Network for Facial Micro-Expression Recognition," Proceedings of the 2018 International Conference on Automatic Face & Gesture Recognition (FG), 2018, pp. 667-674.
[Document 2] G. Zhao, M. Pietikainen, "Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, pp. 915-928.
[Document 3] D. Fleet, Y. Weiss, "Optical Flow Estimation," Springer US, 2006.
[Document 4] Liong S. T., See J., Wong K., Phan R. C., "Less is more: Micro-expression recognition from video using apex frame," Signal Processing: Image Communication, 2018, 62:82-92.
[Document 5] H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, S. Gould, "Dynamic image networks for action recognition," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3034-3042.
Disclosure of Invention
Aiming at the shortcomings of existing micro-expression recognition methods, the invention provides, on the basis of deep learning, an end-to-end micro-expression recognition network built on a pixel displacement generation module, giving the displacement feature extraction and expression classification modules more room to adjust automatically according to the data, so as to improve the overall fit of the model.
The technical solution of the invention is a micro-expression recognition method for pixel displacement vectors based on a convolutional neural network, which establishes an end-to-end micro-expression recognition network based on a pixel displacement generation module. The processing flow based on the micro-expression recognition network comprises the following steps:
Selecting a maximum frame, wherein during training a frame within a certain range before or after the original maximum frame is randomly selected as the maximum frame image;
Generating a pixel displacement vector feature map, which comprises inputting the selected maximum frame image and the starting frame image together into the pixel displacement generation module and, through the learning and feature fusion of its convolution layers, outputting a pixel displacement vector feature map between the two images;
Calculating the associated loss functions, which comprises first upsampling the generated displacement vector feature map by bilinear interpolation to obtain a displacement feature map of the same size as the maximum frame, then sampling the original starting frame image according to this displacement feature map to generate an approximate maximum frame image, and calculating a reconstruction loss and a regularization loss from the generated approximate maximum frame image and the originally selected maximum frame image;
Normalizing, which comprises normalizing the generated pixel displacement vector feature map;
Performing feature learning and micro-expression classification, wherein the previously selected maximum frame image is concatenated with the normalized pixel displacement vector feature map and input into a classification network to obtain a classification prediction result.
Moreover, during training the maximum frame is selected by a randomization process: a frame within a certain range before or after the original maximum frame is randomly selected, which increases the number of image pairs actually used for training; in the verification or test stage, the original maximum frame image is used directly.
Moreover, the generated pixel displacement vector features are normalized before being input into the classification network: each displacement vector feature map is divided by the mean of its several largest absolute values.
Moreover, for the generated pixel displacement vector feature map, the loss function comprises a reconstruction loss between the original maximum frame and the maximum frame reconstructed from the starting frame and the displacement vectors, and an L1 regularization loss calculated on the displacement vector feature map itself.
Moreover, the selected maximum frame image and the generated pixel displacement vector feature map are input together into the classification network for learning; after the classification prediction result is obtained, the classification loss is calculated as required for use with the related evaluation indices.
Compared with the prior art, the invention has the following advantages and positive effects:
(1) The pixel displacement generation module provided by the invention can be trained end-to-end together with the classification network. The classification loss can be back-propagated to the pixel displacement generation module, so that the module automatically adjusts its parameters according to the classification results and generates displacement features that are easier to classify, while the overall model achieves a better fit.
(2) The random maximum-frame selection operation provided by the invention increases the number of image pairs actually used for training, enhances the robustness of the network and its sensitivity to subtle changes, and improves both the generation of displacement features and the classification results.
(3) The normalization operation provided by the invention effectively shrinks expression displacements of larger amplitude and amplifies those of smaller amplitude, providing adaptive amplitude adjustment; it reduces the influence of amplitude differences between image pairs on the classification network and makes the classification network easier to train.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate and explain the application and, together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of the overall structure of the end-to-end micro-expression recognition network based on a pixel displacement generation module according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the pixel displacement generation module according to an embodiment of the present invention.
Detailed Description
The following describes specific embodiments of the present invention in detail with reference to the drawings. It should be understood that the detailed description is presented by way of illustration or example only, and is not intended to limit the invention.
The invention discloses a fully convolutional pixel Displacement Generation Module (DGM) for generating a pixel displacement vector feature map (displacements) between two frames, replacing the dynamic features produced by traditional Optical Flow or Dynamic Imaging methods. The module is combined with the existing LEARNet classification network to form an end-to-end micro-expression recognition model. A randomization operation on the maximum frame is also disclosed, which increases the number of training sample pairs and the sensitivity of the network to fine expression changes. The invention directly takes the starting frame of an expression sequence and a maximum frame selected by the randomization operation as input, generates a pixel displacement vector feature map with the pixel displacement generation module, normalizes the feature map, concatenates it with the maximum frame image, and inputs the result into the LEARNet classification network for learning and prediction. The model can back-propagate the gradient of the classification loss to the DGM, so that the DGM adjusts its parameters according to the classification results and generates displacement features that are easier to classify.
As shown in Fig. 1, an embodiment of the present invention provides an end-to-end micro-expression recognition method based on a pixel displacement generation module. The end-to-end micro-expression recognition network provided by the invention comprises two neural network parts, one for generating the pixel displacement vector features and one for feature learning and classification; the network model is trained end-to-end, and the dynamic features of a micro-expression are represented by the pixel displacement vectors between the starting frame and the maximum frame generated by the convolutional module DGM (the pixel Displacement Generating Module shown in Fig. 2).
In an embodiment, the main flow based on the micro-expression recognition network is as follows:
(1) Selecting the maximum frame: during training, the maximum frame is selected by a randomization process in which a frame within a certain range before or after the original maximum frame is randomly chosen as the maximum frame image, which increases the number of image pairs actually used for training. In the verification or test stage, the original maximum frame image is used directly.
Let the starting frame index be i, the maximum frame index be j, and the index of the last frame in the image sequence be I_last; the randomly selected frame index I_select is then calculated by the following formula:
I_select = random[ MAX( i+1, round(j - (j-i)*0.2) ), MIN( round(j + (j-i)*0.2), I_last ) ]
where MAX selects the larger of its two arguments and MIN the smaller, ensuring that the selected maximum frame lies after the starting frame and does not exceed the last frame of the sequence, and random[ ] returns a uniformly chosen integer within the given interval.
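As an illustration only, the selection rule above can be expressed as a short Python function; this is a minimal sketch, and the name select_apex_frame and its signature are assumptions made here for clarity rather than part of the patent.

```python
import random

def select_apex_frame(i: int, j: int, i_last: int) -> int:
    """Randomly pick a training maximum (apex) frame near the annotated
    maximum frame j, following the randomization formula above.
    i is the starting frame index, i_last the last frame index."""
    lo = max(i + 1, round(j - (j - i) * 0.2))
    hi = min(round(j + (j - i) * 0.2), i_last)
    return random.randint(lo, hi)  # uniform integer in [lo, hi]
```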
(2) Generating the pixel displacement vector feature map: the selected maximum frame image and the starting frame image are input together into the pixel displacement generation module, which, through the learning and feature fusion of its convolution layers, outputs a pixel displacement vector feature map between the two images.
(3) Calculating the associated loss functions: the generated displacement vector feature map is upsampled by bilinear interpolation to obtain a displacement feature map of the same size as the maximum frame, and the original starting frame image is then sampled according to this displacement feature map to generate an approximate maximum frame image. Where a displacement vector value is not an integer, bilinear interpolation is used to compute the pixel value at the corresponding point. The L_rec reconstruction loss and the L_1 regularization loss are then calculated from the generated approximate maximum frame image and the originally selected maximum frame image.
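For illustration, the upsampling and sampling step can be sketched in PyTorch as follows; the backward-warping convention, tensor shapes and function names are assumptions made here, since the patent does not give the sampling code itself.

```python
import torch
import torch.nn.functional as F

def reconstruct_apex(start: torch.Tensor, disp: torch.Tensor) -> torch.Tensor:
    """Warp the starting frame into an approximate maximum frame.
    start: (B, C, H, W) starting frames; disp: (B, 2, H/2, W/2)
    displacement maps holding fractional X/Y offsets.
    A sketch assuming backward warping with bilinear sampling."""
    b, _, h, w = start.shape
    # Bilinear upsampling of the displacement map to full resolution.
    d = F.interpolate(disp, size=(h, w), mode="bilinear", align_corners=False)
    # Identity sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0)  # (1, H, W, 2)
    # Fractional offsets span the whole image, i.e. 2 units in grid space.
    offset = 2.0 * d.permute(0, 2, 3, 1)  # (B, H, W, 2), x then y
    # Bilinear sampling handles non-integer displacement values.
    return F.grid_sample(start, base + offset, mode="bilinear",
                         align_corners=False)
```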
(4) Normalization operation: the generated pixel displacement vector feature map is normalized. Let the displacement feature map generated by the network be I_f, and let M(I, n) denote the mean of the n largest values of image I. The normalized map I_n is obtained by the following formula:
I_n = I_f / MAX( M(|I_f|, n), 0.0001 )
where the comparison with 0.0001 avoids division by zero.
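A minimal PyTorch sketch of this normalization, assuming n = 100 purely for illustration (the patent does not fix n at this point):

```python
import torch

def normalize_displacement(d: torch.Tensor, n: int = 100) -> torch.Tensor:
    """Divide a displacement feature map d by the mean of its n largest
    absolute values, with 0.0001 as a floor to avoid division by zero."""
    top_n = d.abs().flatten().topk(n).values   # n largest absolute values
    scale = torch.clamp(top_n.mean(), min=1e-4)
    return d / scale
```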
(5) Feature learning and micro-expression classification: the previously selected maximum frame image is concatenated (concat) with the normalized pixel displacement vector feature map and input into the classification network to obtain the classification prediction result (the micro-expressions are divided into three classes: negative, positive and surprise); the classification loss (Softmax loss) and evaluation indices such as UF1 and UAR are calculated as required.
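The concatenation and classification step can be illustrated as follows; the stand-in classifier below is only a placeholder for LEARNet, and the input sizes and channel counts are assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Placeholder classifier standing in for LEARNet: 5 input channels
# (3 for the maximum frame image + 2 for the X/Y displacement map),
# 3 outputs for the negative/positive/surprise classes.
classifier = nn.Sequential(
    nn.Conv2d(5, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3))

apex = torch.randn(8, 3, 112, 112)     # batch of selected maximum frames
disp_n = torch.randn(8, 2, 112, 112)   # normalized displacement maps
logits = classifier(torch.cat([apex, disp_n], dim=1))  # (8, 3) predictions
```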
The invention can thus be regarded as providing an end-to-end micro-expression recognition model based on a pixel displacement generation module, comprising a displacement vector feature generation module, a randomization processing module, a normalization processing module and a classification network module. Wherein:
The displacement vector feature generation module takes the starting frame of the expression sequence and the maximum frame selected by the randomization operation as inputs; the convolutional neural network of the pixel displacement generation module generates a pixel displacement vector feature map (displacements) between the two frames, replacing the dynamic image produced by traditional Optical Flow or Dynamic Imaging methods. The pixel displacement vector features represent the displacement of each pixel of the maximum frame image relative to the starting frame image, with values in the range (-1, 1). To make the network concentrate on the features around each pixel, the generated pixel displacement vector feature map is multiplied by a scaling factor α ∈ (0, 1) to limit the range to (-α, α). The associated loss functions comprise: the reconstruction loss between the original maximum frame and the maximum frame reconstructed from the starting frame and the displacement vectors; and the L1 regularization loss calculated on the displacement vector feature map itself.
In the randomization processing module, because different image sequences both across and within data sets exhibit different expression amplitudes, a frame within a certain range before or after the original maximum frame is randomly selected as the maximum frame when the data is loaded, in order to make the network that generates the displacement vector feature maps more robust.
In the normalization processing module, in order to normalize pixel displacement vector feature maps of different magnitudes, the invention divides each displacement vector feature map by the mean of its n largest absolute values. The mean is used instead of the maximum to reduce interference from occasional large noise points.
The classification network can adopt different existing network structures; the previously obtained pixel displacement vector feature map is normalized and then input into the classification network together with the selected maximum frame image for learning and prediction, so that both the temporal and the spatial features are preserved. The embodiment of the invention selects LEARNet (Verma Monu, Vipparthi Santosh Kumar, Singh Girdhari, Murala Subrahmanyam, "LEARNet: Dynamic Imaging Network for Micro Expression Recognition," IEEE Transactions on Image Processing, 2019) as the classification network; compared with the classical ResNet and VGG structures, this network retains more detail and better learns to distinguish the features of different expression categories.
As shown in Fig. 2, the pixel displacement generation module comprises, in sequence, the layers Conv, Conv1, Conv2, Conv3, Conv4, Up, Conv5 and Conv6, where the output of Up is concatenated (concat) with the output of the Conv1 layer to form the input of Conv5. The module thus contains two downsampling steps (implemented by the Conv1 and Conv3 layers with stride 2) and one upsampling step (implemented by the Up layer, which uses bilinear interpolation); the specific parameter configuration of each convolution layer is given in a table in the original specification (not reproduced here). Each of the convolution layers Conv, Conv1, Conv2, Conv3, Conv4 and Conv5 is followed by a BN (batch normalization) layer and a leaky ReLU activation layer, and the final Conv6 layer is followed by a BN layer and a Tanh activation layer.
For an input image of width w and height h, the final output is a pixel displacement vector feature map with 2 channels and width and height w/2 and h/2, which is used for classification; the first channel represents displacement in the X direction and the second channel displacement in the Y direction. At the same time, the generated pixel displacement vector feature map is upsampled by bilinear interpolation to obtain a displacement feature map of width w and height h for computing the loss function: the starting frame image is first grid-sampled according to the upsampled displacement feature map to generate an approximate maximum frame image, then the L_rec loss between the approximate maximum frame and the originally selected maximum frame is calculated, and the L_1 regularization loss of the displacement feature map is calculated at the same time.
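A hedged PyTorch sketch of the module is given below. The layer order, stride-2 downsampling, bilinear Up layer, skip connection into Conv5, and the BN/leaky-ReLU/Tanh placements follow the description above; the channel widths, kernel sizes and the value of the scaling factor α are assumptions, since the original parameter table is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_lrelu(cin, cout, stride=1):
    """Conv -> BN -> leaky ReLU block used throughout the module."""
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
                         nn.BatchNorm2d(cout), nn.LeakyReLU(0.1))

class DGM(nn.Module):
    """Sketch of the pixel displacement generation module of Fig. 2."""
    def __init__(self, in_ch=6, alpha=0.1):
        super().__init__()
        self.alpha = alpha                              # assumed value
        self.conv = conv_bn_lrelu(in_ch, 32)
        self.conv1 = conv_bn_lrelu(32, 64, stride=2)    # downsample 1
        self.conv2 = conv_bn_lrelu(64, 64)
        self.conv3 = conv_bn_lrelu(64, 128, stride=2)   # downsample 2
        self.conv4 = conv_bn_lrelu(128, 128)
        self.conv5 = conv_bn_lrelu(128 + 64, 64)        # after concat
        self.conv6 = nn.Sequential(                     # BN + Tanh head
            nn.Conv2d(64, 2, 3, padding=1), nn.BatchNorm2d(2), nn.Tanh())

    def forward(self, start, apex):
        x = torch.cat([start, apex], dim=1)             # (B, 6, H, W)
        x1 = self.conv1(self.conv(x))                   # H/2 x W/2
        x = self.conv4(self.conv3(self.conv2(x1)))      # H/4 x W/4
        x = F.interpolate(x, scale_factor=2.0,          # Up layer
                          mode="bilinear", align_corners=False)
        x = self.conv5(torch.cat([x, x1], dim=1))       # skip from Conv1
        return self.alpha * self.conv6(x)               # 2 ch, (-a, a)
```

With two stride-2 downsampling steps and a single upsampling step, the output resolution is w/2 × h/2, matching the output size described above.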
The associated loss functions are set as follows:
(1) Let the original starting frame image be denoted I_s, the selected maximum frame image I_t, and let T(I_s) denote the approximate maximum frame image obtained by sampling the starting frame according to the displacement feature map. The L_rec reconstruction loss is then calculated as:
L_rec = || T(I_s) - I_t ||_1
(2) To further refine the generated displacement features, let T_xy denote the pixel displacement vector at (x, y). The L_1 regularization loss of the pixel displacement vector feature map is calculated as:
L_1 = Σ_{x,y} || T_xy ||_1
(3) The classification loss of the micro-expressions uses the classical cross-entropy loss, denoted L_c. The overall loss of the network is then calculated as:
L = w_1 × L_c + w_2 × L_rec + w_3 × L_1
where w_1, w_2 and w_3 are the weights of the L_c, L_rec and L_1 loss functions respectively. All three loss functions can be back-propagated to the displacement generation module, and the weight coefficients are chosen according to the differences in magnitude between the losses: since the module takes the reconstruction loss L_rec as its main loss, the weights are set so that its gradient has a larger magnitude than those of L_c and L_1. The embodiment preferably takes the experimental values w_1 = 0.0001, w_2 = 1000, w_3 = 1.
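As a worked illustration of this weighting, a PyTorch sketch of the overall loss follows; the mean reductions are assumptions, and the weights are the example values given above.

```python
import torch.nn.functional as F

def total_loss(logits, labels, approx_apex, apex, disp,
               w1=0.0001, w2=1000.0, w3=1.0):
    """Overall loss L = w1*L_c + w2*L_rec + w3*L_1, example weights."""
    l_c = F.cross_entropy(logits, labels)        # classification loss L_c
    l_rec = (approx_apex - apex).abs().mean()    # L1 reconstruction loss
    l_1 = disp.abs().mean()                      # L1 regularization loss
    return w1 * l_c + w2 * l_rec + w3 * l_1
```

With these weights, the reconstruction term dominates the gradient magnitude, matching the weighting rationale above.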
The pixel displacement values are expressed as fractions of the image width and height: if T_xy = (Δ_x, Δ_y) is the pixel displacement vector at (x, y), it indicates that the pixel at (x, y) of the original starting frame has moved to (x + w×Δ_x, y + h×Δ_y) in the approximate maximum frame image. To make the network concentrate on the displacement features around each pixel, the displacement features are multiplied by a scaling factor α ∈ (0, 1), so that the final pixel displacement vector feature map lies in the range [-α, α]; that is, the actual X-direction displacement component is limited to [-w×α, w×α] and the Y-direction component to [-h×α, h×α].
For ease of understanding the technical effects of the present invention, the following experimental results are attached:
Table 1. Ablation experiment results for the network proposed in this patent (table not reproduced here).
Table 2. Comparison of UF1 and UAR results between networks using conventional dynamic feature extraction methods and the network proposed in this patent (table not reproduced here).
In particular, the method of the technical solution of the present invention may be implemented by those skilled in the art as an automated operation flow using computer software technology, and a system or apparatus implementing the method, such as a computer-readable storage medium storing the corresponding computer program of the technical solution of the present invention, or a computer device including and running the corresponding computer program, should also fall within the protection scope of the present invention.
In some possible embodiments, a micro-expression recognition system based on convolutional-neural-network pixel displacement vectors is provided, comprising a processor and a memory, wherein the memory stores program instructions and the processor calls the instructions stored in the memory to execute the above micro-expression recognition method for pixel displacement vectors based on a convolutional neural network.
In some possible embodiments, a micro-expression recognition system based on convolutional-neural-network pixel displacement vectors is provided, comprising a readable storage medium on which a computer program is stored, wherein the computer program, when executed, implements the above micro-expression recognition method for pixel displacement vectors based on a convolutional neural network.
It should be understood that parts of the specification not specifically set forth herein are all prior art.
It should be understood that the foregoing description of specific embodiments is not to be construed as limiting the scope of the invention, and that the appended claims are intended to cover all such alternatives and modifications as fall within the scope of the invention defined by those claims.

Claims (5)

1. A micro-expression recognition method for pixel displacement vectors based on a convolutional neural network, characterized in that: an end-to-end micro-expression recognition network is established based on a pixel displacement generation module, and the processing flow based on the micro-expression recognition network comprises the following steps: selecting a maximum frame, wherein during training a frame within a certain range before or after the original maximum frame is randomly selected as the maximum frame image;
Generating a pixel displacement vector feature map, which comprises inputting the selected maximum frame image and the starting frame image together into the pixel displacement generation module and, through the learning and feature fusion of its convolution layers, outputting a pixel displacement vector feature map between the two images;
Calculating the associated loss functions, which comprises first upsampling the generated displacement vector feature map by bilinear interpolation to obtain a displacement feature map of the same size as the maximum frame, then sampling the original starting frame image according to this displacement feature map to generate an approximate maximum frame image, and calculating a reconstruction loss and a regularization loss from the generated approximate maximum frame image and the originally selected maximum frame image;
Normalizing, which comprises normalizing the generated pixel displacement vector feature map;
Performing feature learning and micro-expression classification, wherein the previously selected maximum frame image is concatenated with the normalized pixel displacement vector feature map and input into a classification network to obtain a classification prediction result.
2. The micro-expression recognition method for pixel displacement vectors based on a convolutional neural network according to claim 1, characterized in that: during training, the maximum frame is selected by a randomization process in which a frame within a certain range before or after the original maximum frame is randomly selected, increasing the number of image pairs actually used for training; in the verification or test stage, the original maximum frame image is used directly.
3. The micro-expression recognition method for pixel displacement vectors based on a convolutional neural network according to claim 1, characterized in that: the generated pixel displacement vector features are normalized before being input into the classification network, each displacement vector feature map being divided by the mean of its several largest absolute values.
4. The micro-expression recognition method for pixel displacement vectors based on a convolutional neural network according to claim 1, characterized in that: for the generated pixel displacement vector feature map, the loss function comprises a reconstruction loss between the original maximum frame and the maximum frame reconstructed from the starting frame and the displacement vectors, and an L1 regularization loss calculated on the displacement vector feature map.
5. The micro-expression recognition method for pixel displacement vectors based on a convolutional neural network according to claim 1, 2, 3 or 4, characterized in that: the selected maximum frame image and the generated pixel displacement vector feature map are input together into a classification network for learning, and after the classification prediction result is obtained, the classification loss is calculated as required for use with the related evaluation indices.
CN202111204917.XA 2021-10-15 2021-10-15 Micro-expression recognition method for pixel displacement vector based on convolutional neural network Active CN114005157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111204917.XA CN114005157B (en) 2021-10-15 2021-10-15 Micro-expression recognition method for pixel displacement vector based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN114005157A CN114005157A (en) 2022-02-01
CN114005157B (en) 2024-05-10

Family

ID=79923097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111204917.XA Active CN114005157B (en) 2021-10-15 2021-10-15 Micro-expression recognition method for pixel displacement vector based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114005157B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627218B (en) * 2022-05-16 2022-08-12 成都市谛视无限科技有限公司 Human face fine expression capturing method and device based on virtual engine


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020037965A1 (en) * 2018-08-21 2020-02-27 北京大学深圳研究生院 Method for multi-motion flow deep convolutional network model for video prediction
CN112183419A (en) * 2020-10-09 2021-01-05 福州大学 Micro-expression classification method based on optical flow generation network and reordering
CN112800891A (en) * 2021-01-18 2021-05-14 南京邮电大学 Discriminative feature learning method and system for micro-expression recognition
CN112766159A (en) * 2021-01-20 2021-05-07 重庆邮电大学 Cross-database micro-expression identification method based on multi-feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wu Jin; Min Yu; Ma Simin; Zhang Weihua. "A micro-expression recognition algorithm based on the combination of CNN and LSTM," Telecommunication Engineering, 2020, (01), full text. *

Also Published As

Publication number Publication date
CN114005157A (en) 2022-02-01

Similar Documents

Publication Publication Date Title
CN111062872B (en) Image super-resolution reconstruction method and system based on edge detection
CN111639692A (en) Shadow detection method based on attention mechanism
CN112507617B (en) Training method of SRFlow super-resolution model and face recognition method
CN113688723A (en) Infrared image pedestrian target detection method based on improved YOLOv5
Cai et al. Residual channel attention generative adversarial network for image super-resolution and noise reduction
CN112149500B (en) Face recognition small sample learning method with partial shielding
CN111047543A (en) Image enhancement method, device and storage medium
CN111291669A (en) Two-channel depression angle human face fusion correction GAN network and human face fusion correction method
CN112507920A (en) Examination abnormal behavior identification method based on time displacement and attention mechanism
CN114022506A (en) Image restoration method with edge prior fusion multi-head attention mechanism
CN114005157B (en) Micro-expression recognition method for pixel displacement vector based on convolutional neural network
CN110570375B (en) Image processing method, device, electronic device and storage medium
CN117351542A (en) Facial expression recognition method and system
CN118212463A (en) Target tracking method based on fractional order hybrid network
CN117893409A (en) Face super-resolution reconstruction method and system based on illumination condition constraint diffusion model
Hua et al. An Efficient Multiscale Spatial Rearrangement MLP Architecture for Image Restoration
CN115860113B (en) Training method and related device for self-countermeasure neural network model
CN114582002B (en) Facial expression recognition method combining attention module and second-order pooling mechanism
CN116977200A (en) Processing method and device of video denoising model, computer equipment and storage medium
CN115797646A (en) Multi-scale feature fusion video denoising method, system, device and storage medium
CN111047537A (en) System for recovering details in image denoising
CN113012072A (en) Image motion deblurring method based on attention network
CN114596609A (en) Audio-visual counterfeit detection method and device
CN114240778A (en) Video denoising method and device and terminal
Maity et al. A survey on super resolution for video enhancement using gan

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant